This article explains what OnionSearch is, how to install it, and how to use it.
OnionSearch is a URL search script for onion domains. It is written in Python 3 and helps researchers scrape URLs from a range of .onion search engines.
The only requirement is Python 3. The following search engines are currently supported:
ahmia
darksearchio
onionland
notevil
darksearchenginer
phobos
onionsearchserver
torgle
onionsearchengine
tordex
tor66
tormax
haystack
multivac
evosearch
deeplink
Install from PyPI:
pip3 install onionsearch
Or install from source:
git clone https://github.com/megadose/OnionSearch.git
cd OnionSearch/
python3 setup.py install
usage: onionsearch [-h] [--proxy PROXY] [--output OUTPUT]
                   [--continuous_write CONTINUOUS_WRITE] [--limit LIMIT]
                   [--engines [ENGINES [ENGINES ...]]]
                   [--exclude [EXCLUDE [EXCLUDE ...]]]
                   [--fields [FIELDS [FIELDS ...]]]
                   [--field_delimiter FIELD_DELIMITER] [--mp_units MP_UNITS]
                   search

positional arguments:
  search                The search string or phrase

optional arguments:
  -h, --help            show this help message and exit
  --proxy PROXY         Set Tor proxy (default: 127.0.0.1:9050)
  --output OUTPUT       Output File (default: output_$SEARCH_$DATE.txt), where
                        $SEARCH is replaced by the first chars of the search
                        string and $DATE is replaced by the datetime
  --continuous_write CONTINUOUS_WRITE
                        Write progressively to output file (default: False)
  --limit LIMIT         Set a max number of pages per engine to load
  --engines [ENGINES [ENGINES ...]]
                        Engines to request (default: full list)
  --exclude [EXCLUDE [EXCLUDE ...]]
                        Engines to exclude (default: none)
  --fields [FIELDS [FIELDS ...]]
                        Fields to output to csv file (default: engine name
                        link), available fields are shown below
  --field_delimiter FIELD_DELIMITER
                        Delimiter for the CSV fields
  --mp_units MP_UNITS   Number of processing units (default: core number minus
                        1)

[...]
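The help text says the default output file name follows the pattern output_$SEARCH_$DATE.txt. The sketch below shows one way such a pattern could be expanded. It is not OnionSearch's own code: the function name default_output_name, the 10-character cut-off, the space-to-underscore replacement, and the timestamp format are all assumptions made for illustration.

```python
from datetime import datetime


def default_output_name(search, when=None, max_chars=10):
    """Expand the output_$SEARCH_$DATE.txt pattern quoted in the help text."""
    when = when or datetime.now()
    # Assumption: keep the first `max_chars` characters and turn spaces into
    # underscores; the real tool may cut or sanitize the string differently.
    search_part = search[:max_chars].replace(" ", "_")
    # Assumption: a compact timestamp; the tool's actual date format may differ.
    date_part = when.strftime("%Y%m%d%H%M%S")
    pattern = "output_$SEARCH_$DATE.txt"
    # Plain str.replace is used because string.Template would read "$SEARCH_"
    # as a single placeholder name (underscores are valid identifier chars).
    return pattern.replace("$DATE", date_part).replace("$SEARCH", search_part)


print(default_output_name("computer parts", datetime(2021, 1, 2, 3, 4, 5)))
```

Passing an explicit file name with --output avoids depending on this naming scheme at all.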
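By default, the script runs with "mp_units = cpu_count() - 1". On a machine with a four-core CPU, for example, it runs three scrapers at the same time. The value of "mp_units" can be set freely, but the default is recommended.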
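The default described above can be reproduced with the standard library. This is a minimal sketch, not the tool's source: the helper name default_mp_units and the optional cores parameter are hypothetical, and the lower bound of one worker is an added assumption so that a single-core host does not end up with zero workers.

```python
from multiprocessing import cpu_count


def default_mp_units(cores=None):
    """Return the documented default: number of CPU cores minus one."""
    # `cores` can be passed explicitly to see the result for a given machine.
    cores = cpu_count() if cores is None else cores
    # Assumption: clamp to at least one worker (not stated in the article).
    return max(cores - 1, 1)


print(default_mp_units(4))  # a four-core machine yields three scrapers
print(default_mp_units())   # value for the machine running this snippet
```

The result is the number one would pass to --mp_units to match the default behaviour explicitly.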
Search for "computer" on all search engines:
onionsearch "computer"
Search for "computer" on all search engines except Ahmia and Candle:
onionsearch "computer" --exclude ahmia candle
Search for "computer" on Tor66, DeepLink, and Phobos only:
onionsearch "computer" --engines tor66 deeplink phobos
Same query as above, but load at most three pages per search engine:
onionsearch "computer" --engines tor66 deeplink phobos --limit 3
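When the tool is driven from another Python script, the command lines above can be assembled programmatically. The sketch below only builds the argument list from the flags shown in the usage text (--engines, --exclude, --limit); the helper name build_command is hypothetical, and nothing is executed, so it runs without Tor or OnionSearch installed.

```python
import shlex


def build_command(search, engines=None, exclude=None, limit=None):
    """Build an onionsearch argument list from the documented flags."""
    cmd = ["onionsearch", search]
    if engines:
        # Restrict the search to the listed engines.
        cmd += ["--engines", *engines]
    if exclude:
        # Query every engine except the listed ones.
        cmd += ["--exclude", *exclude]
    if limit is not None:
        # Maximum number of result pages to load per engine.
        cmd += ["--limit", str(limit)]
    return cmd


cmd = build_command("computer", engines=["tor66", "deeplink", "phobos"], limit=3)
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True), assuming onionsearch is on the PATH and a Tor proxy is listening on the configured address.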
By default, the search results are saved in CSV format with the following columns:
"engine","name of the link","url"
The "--fields" and "--field_delimiter" parameters control what is written to the output file. "--fields" adds, removes, and reorders the output fields, for example:
"engine","name of the link","url","domain"
Or:
"engine","domain"
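The effect of choosing and reordering output fields can be imitated with Python's csv module. This is a rough sketch rather than the tool's implementation: the helper name select_fields and the sample row are made up, the field names engine, name, and link come from the help text, and deriving "domain" from the host part of the link is an assumption.

```python
import csv
import io
from urllib.parse import urlparse


def select_fields(rows, fields, delimiter=","):
    """Write the chosen fields, in the given order, as fully quoted CSV."""
    out = io.StringIO()
    writer = csv.writer(
        out, delimiter=delimiter, quoting=csv.QUOTE_ALL, lineterminator="\n"
    )
    for row in rows:
        record = dict(row)
        # Assumption: "domain" is the host part of the result link.
        record["domain"] = urlparse(record["link"]).netloc
        writer.writerow([record[field] for field in fields])
    return out.getvalue()


# Made-up sample result, for illustration only.
rows = [{"engine": "ahmia", "name": "Example page", "link": "http://example.onion/page"}]
print(select_fields(rows, ["engine", "name", "link", "domain"]), end="")
print(select_fields(rows, ["engine", "domain"]), end="")
```

Passing a different delimiter, such as ";", mirrors what the "--field_delimiter" parameter is described as doing.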