Friday, April 18, 2008

Recursively convert encoding

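# Convert every regular file under the current directory from CP1251 to UTF-8.
# Passing names through -exec keeps spaces and glob characters in file names intact.
find . -type f -exec sh -c '
  for f; do
    # Replace the original only if the conversion succeeded.
    iconv -f CP1251 -t UTF-8 "$f" -o "$f.new" && mv -f "$f.new" "$f"
  done' sh {} +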
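
The loop above converts every file unconditionally, so a second run would re-encode files that are already UTF-8. Below is a minimal sketch of a guard, not a definitive fix. The helper name `to_utf8` is made up for illustration, and it assumes that a file which already decodes cleanly as UTF-8 needs no conversion. That holds for typical CP1251 Cyrillic text, which is almost never valid UTF-8, but it is a heuristic.

```shell
# Hypothetical helper: convert one file from CP1251 to UTF-8 unless it
# already decodes cleanly as UTF-8 (assumption: such files need no work).
to_utf8() {
    f=$1
    # iconv exits non-zero when the input contains bytes that are not
    # valid in the declared source encoding.
    if iconv -f UTF-8 -t UTF-8 "$f" >/dev/null 2>&1; then
        return 0    # already valid UTF-8 (pure ASCII also lands here)
    fi
    # Write to a temporary name and replace the original only on success.
    iconv -f CP1251 -t UTF-8 "$f" >"$f.new" && mv -f "$f.new" "$f"
}

# Possible usage, one call per file:
#   to_utf8 ./docs/readme.txt
```

With the guard in place, calling the helper on an already converted file leaves it untouched.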

Wednesday, February 20, 2008

Robots/Bots by user agent



count | user_agent
-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
16.86 | Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
3.75 | Mozilla/5.0 (compatible; LinksManager.com_bot +http://linksmanager.com/linkchecker.html)
2.69 | msnbot/1.0 (+http://search.msn.com/msnbot.htm)
2.21 | LinkLint-spider/2.3.5
1.99 | Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
1.28 | msnbot-media/1.0 (+http://search.msn.com/msnbot.htm)
0.64 | ia_archiver
0.51 | Mozilla/4.0 (compatible; NaverBot/1.0; http://help.naver.com/delete_main.asp)
0.51 | Speedy Spider (http://www.entireweb.com/about/search_tech/speedy_spider/)
0.43 | Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; http://help.yahoo.com/help/us/ysearch/slurp)
0.43 | LmCrawler
0.41 | Morfeus Fucking Scanner
0.39 | IRLbot/3.0 (compatible; MSIE 6.0; http://irl.cs.tamu.edu/crawler)
0.38 | Smart/Spider 2.61.3 - Mozilla/4.0
0.35 | Gigabot/3.0 (http://www.gigablast.com/spider.html)
0.31 | Speedy Spider (http://www.entireweb.com/about/search_tech/speedyspider/)
0.28 | ichiro/2.0 (http://help.goo.ne.jp/door/crawler.html)
0.27 | Mozilla/4.0 compatible ZyBorg/1.0 Dead Link Checker (wn.dlc@looksmart.net; http://www.WISEnutbot.com)
0.25 | Mozilla/5.0 (compatible; Exabot/3.0; +http://www.exabot.com/go/robot)
0.23 | e-SocietyRobot(http://www.yama.info.waseda.ac.jp/~yamana/es/)
0.22 | Yeti/0.01 (nhn/1noon, yetibot@naver.com, check robots.txt daily and follow it)
0.22 | Baiduspider+(+http://www.baidu.com/search/spider.htm)
0.21 | bot/1.0 (bot; http://; bot@bot.bot)
0.19 | Baiduspider+(+http://www.baidu.com/search/spider_jp.html)
0.19 | CazoodleBot/CazoodleBot-0.1 (CazoodleBot Crawler; http://www.cazoodle.com/cazoodlebot; cazoodlebot@cazoodle.com)
0.17 | WebAlta Crawler/2.0 (http://www.webalta.net/ru/about_webmaster.html) (Windows; U; Windows NT 5.1; ru-RU)
0.16 | msnbot/0.9 (+http://search.msn.com/msnbot.htm)
0.16 | CazoodleBot/Nutch-0.9-dev (CazoodleBot Crawler; http://www.cazoodle.com; mqbot@cazoodle.com)
0.15 | MJ12bot/v1.2.0 (http://majestic12.co.uk/bot.php?+)
0.14 | MJ12bot/v1.0.8 (http://majestic12.co.uk/bot.php?+)
0.14 | Gigabot/2.0 (http://www.gigablast.com/spider.html)
0.14 | Shim-Crawler(Mozilla-compatible; http://www.logos.ic.i.u-tokyo.ac.jp/crawler/; crawl@logos.ic.i.u-tokyo.ac.jp)
0.13 | Exabot/3.0
0.13 | Mozilla/5.0 (Twiceler-0.9 http://www.cuill.com/twiceler/robot.html)
0.12 | Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.1) VoilaBot BETA 1.2 (http://www.voila.com/)
0.12 | libwww-perl/5.805
0.12 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; QihooBot 1.0 qihoobot@qihoo.net)
0.12 | Mozilla/5.0 (compatible; Yahoo! Slurp China; http://misc.yahoo.com.cn/help.html)
0.11 | Acme.Spider
0.11 | MSNBOT_Mobile MSMOBOT Mozilla/2.0 (compatible; MSIE 4.02; Windows CE; Default)
0.11 | Mozilla/4.0 compatible ZyBorg/1.0 (wn-14.zyborg@looksmart.net; http://www.WISEnutbot.com)
0.11 | voyager/1.0
0.10 | Nokia6682/2.0 (3.01.1) SymbianOS/8.0 Series60/2.6 Profile/MIDP-2.0 configuration/CLDC-1.1 UP.Link/6.3.0.0.0 (compatible;YahooSeeker/M1A1-R2D2; http://help.yahoo.com/help/us/ysearch/crawling/crawling-01.html)
0.10 | genieBot wgao@genieknows.com
0.10 | FAST MetaWeb Crawler (helpdesk at fastsearch dot com)
0.10 | ConveraCrawler/0.9e (+http://www.authoritativeweb.com/crawl)
0.09 | : TCL Crawler (crawler@tcllab.org), Java/1.5.0_06
0.08 | sproose/1.0beta (sproose bot; http://www.sproose.com/bot.html; crawler@sproose.com)
0.08 | CazoodleBot/Nutch-0.9-dev (CazoodleBot Crawler; http://www.cazoodle.com/cazoodlebot; cazoodlebot@cazoodle.com)
0.08 | Mozilla/5.0 (compatible; MJ12bot/v1.2.1; http://www.majestic12.co.uk/bot.php?+)
0.08 | Accoona-AI-Agent/1.1.2 (aicrawler at accoonabot dot com)
0.08 | ia_archiver-web.archive.org
0.08 | FAST Enterprise Crawler 6 used by Singapore Press Holdings (crawler@sphsearch.sg)
0.07 | ICC-Crawler(Mozilla-compatible; http://kc.nict.go.jp/icc/crawl.html; icc-crawl(at)ml(dot)nict(dot)go(dot)jp)
0.07 | MQBOT/Nutch-0.9-dev (MQBOT Nutch Crawler; http://vwbot.cs.uiuc.edu; mqbot@cs.uiuc.edu)
0.07 | UbiCrawler/v0.4beta (http://gii.nagaokaut.ac.jp/~ubi/)
0.07 | Mozilla/5.0 (compatible; Gigamega.bot/1.0; +http://www.gigamega.net/bot.html)
0.06 | my-heritrix-crawler(+http://mywebsite.com)
0.06 | Speedy Spider (Entireweb; Beta/1.3; http://www.entireweb.com/about/search_tech/speedyspider/)
0.06 | MSRBOT
0.06 | Mozilla/4.0 (compatible; BOTW Spider; +http://botw.org)
0.05 | DoCoMo/2.0 N902iS(c100;TB;W24H12)(compatible; moba-crawler; http://crawler.dena.jp/)
0.05 | MSRBOT (http://research.microsoft.com/research/sv/msrbot)
0.05 | DoCoMo/1.0/N505i/c20/TB/W20H10 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
0.05 | ConveraCrawler/0.9d (+http://www.authoritativeweb.com/crawl)
0.05 | Snapbot/1.0
0.05 | sogou spider
0.05 | libwww-perl/5.803
0.04 | ICC-Crawler(Mozilla-compatible; http://kc.nict.go.jp/icc/crawl.html; icc-crawl-contact(at)ml(dot)nict(dot)go(dot)jp)
0.04 | Mozilla/5.0 (compatible; heritrix/1.12.0 +http://www.accelobot.com)
0.04 | YahooFeedSeeker/2.0 (compatible; Mozilla 4.0; MSIE 5.5; http://publisher.yahoo.com/rssguide; users 0; views 16)
0.04 | noxtrumbot/1.0 (crawler@noxtrum.com)
0.04 | TurnitinBot/2.1 (http://www.turnitin.com/robot/crawlerinfo.html)
0.04 | Mozilla/5.0 (compatible; BecomeBot/3.0; +http://www.become.com/site_owners.html)
0.04 | Mozilla/5.0 (compatible; Webbot/0.1; http://www.webbot.ru/bot.html)
0.04 | YahooFeedSeeker/2.0 (compatible; Mozilla 4.0; MSIE 5.5; http://publisher.yahoo.com/rssguide)
0.04 | iearthworm/1.0, mailto:iearthworm@yahoo.com.cn
0.04 | Teemer (NetSeer, Inc. is a Los Angeles based Internet startup company.; http://www.netseer.com/crawler.html; crawler@netseer.com)
0.04 | KAIST AITrc Crawler
0.04 | MSRBOT (http://research.microsoft.com/research/sv/msrbot/
0.04 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT; MS Search 4.0 Robot)
0.04 | Mozilla/5.0 (Windows;) NimbleCrawler 2.0.1 obeys UserAgent NimbleCrawler For problems contact: crawler@healthline.com
0.04 | Mozilla/4.0 (compatible; MSIE 4.01; Windows NT; MS Search 5.0 Robot)
0.03 | McBot/5.001 (windows; U; NT4.0; en-us)
0.03 | SurveyBot/2.3 (Whois Source)
0.03 | spider tspyyp@tom.com
0.03 | Snapbot/1.0 (Snap Shots, +http://www.snap.com)
0.03 | genieBot genieBot@genieknows.com
0.03 | Gigabot/1.0
0.03 | ShopWiki/1.0 ( +http://www.shopwiki.com/wiki/Help:Bot)
0.03 | YahooFeedSeeker/2.0 (compatible; Mozilla 4.0; MSIE 5.5; http://publisher.yahoo.com/rssguide; users 1; views 64)
0.03 | Yahoo-MMCrawler/3.x (mms dash mmcrawler dash support at yahoo dash inc dot com)
0.03 | Sensis Web Crawler (search_comments\\at\\sensis\\dot\\com\\dot\\au)
0.03 | disco/Nutch-1.0-dev (experimental crawler; www.discoveryengine.com; disco-crawl@discoveryengine.com)
0.03 | Factbot 1.09
0.03 | DiGi-RSSBot
0.03 | Mozilla/5.0 (compatible; googlebot/2.1; +http://www.google.com/bot.html) googlebot@google.com
0.03 | Mozilla/5.0 (compatible; heritrix/1.8.0 +http://crawlerx51.com)
0.03 | Mozilla/5.0 (compatible; archive.org_bot/1.13.1x +http://crawler.archive.org)
0.02 | Googlebot/2.1 (+http://www.google.com/bot.html)
0.02 | Speedy Spider (Entireweb; Beta/1.2; http://www.entireweb.com/about/search_tech/speedyspider/)
0.02 | NaverBot-1.0 (NHN Corp. / +82-31-784-1989 / nhnbot@naver.com)
0.02 | Gigabot/2.0
0.02 | Snapbot/1.0 (+http://www.snap.com)
0.02 | NIF/1.1 (http://www.newsisfree.com/robot.php)
0.02 | FAST Enterprise Crawler 6 used by fast (jorgent@fast)
0.02 | Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.5 (like Gecko) (Exabot-Thumbnails)
0.02 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; QihooBot 1.0)
0.02 | RedBot/redbot-1.0 (Rediff.com Crawler; redbot at rediff dot com)
0.02 | Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )
0.02 | WebCrawler_1.1 internet@bredband.net
0.02 | Mozilla/5.0 (compatible;YodaoBot-Image/1.0;http://www.yodao.com/help/webmaster/spider/;)
0.02 | LocalcomBot/1.3.0 (+http://www.local.com/bot.htm)
0.02 | ilial/Nutch-0.9 (Ilial, Inc. is a Los Angeles based Internet startup company. For more information please visit http://www.ilial.com/crawler; http://www.ilial.com/crawler; crawl@ilial.com)
0.02 | Mozilla/5.0 (compatible; AnsearchBot/1.0; +http://www.ansearch.com.au/)
0.02 | Nokia6820/2.0 (4.83) Profile/MIDP-1.0 Configuration/CLDC-1.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
0.02 | st7bot/0.03
0.02 | ilial/Nutch-0.9 (Ilial, Inc. is a Los Angeles based Internet startup company.; http://www.ilial.com/crawler; crawl@ilial.com)
0.02 | NetResearchServer/4.0(loopimprovements.com/robot.html)
0.02 | http://www.almaden.ibm.com/cs/crawler [bc23]
0.02 | Googlebot-Image/1.0
0.02 | VMBot/0.9 (VMBot; http://www.verticalmatch.com; vmbot@tradedot.com)
0.02 | Mozilla/4.0 (compatible; LinkMarket-Bot)
0.02 | Mozilla/5.0 (compatible; BecomeJPBot/2.3; MSIE 6.0 compatible; +http://www.become.co.jp/site_owners.html)
0.02 | Gigabot/2.0; http://www.gigablast.com/spider.html
0.02 | Gaisbot/3.0+(robot06@gais.cs.ccu.edu.tw;+http://gais.cs.ccu.edu.tw/robot.php)
0.02 | libwww-perl/5.79
0.02 | Factbot 1.09 (see http://www.factbites.com/webmasters.php)
0.02 | VadixBot
0.01 | Nutch/Nutch-1.0-dev (Experimental webcrawler for personal research; http://lucene.apache.org/nutch/; nutch at h0m3)
0.01 | LapozzBot/1.4 (+http://robot.lapozz.hu)
0.01 | Mozilla/5.0 (compatible; BecomeBot/2.3; MSIE 6.0 compatible; +http://www.become.com/site_owners.html)
0.01 | Teemer (NetSeer, Inc.; http://www.netseer.com/crawler.html; crawler@netseer.com)
0.01 | Mozilla/5.0 compatible WebaltBot/1.00 (i686-pc-linux)
0.01 | Speedy Spider (Entireweb; Beta/1.0; http://www.entireweb.com/about/search_tech/speedyspider/)
0.01 | Speedy Spider (Entireweb; Beta/1.1; http://www.entireweb.com/about/search_tech/speedyspider/)
0.01 | SeznamBot/2.0 (+http://fulltext.seznam.cz/)
0.01 | MSRBOT (http://research.microsoft.com/research/sv/msrbot/)
0.01 | Mozilla/5.0 (compatible; Exabot Test/3.0; +http://www.exabot.com/go/robot)
0.01 | Steeler/3.3 (http://www.tkl.iis.u-tokyo.ac.jp/~crawler/)
0.01 | LinksManager.com_bot
0.01 | libwww-perl/5.65
0.01 | Nusearch Spider (www.nusearch.com)
0.01 | Mozilla/5.0 (compatible; worio bot heritrix/1.10.0 +http://worio.com)
0.01 | RSSMicro.com RSS/Atom Feed Robot
0.01 | Sogou web spider/3.0(+http://www.sogou.com/docs/help/webmasters.htm#07)
0.01 | NPBot/3 (NPBot; http://www.nameprotect.com; npbot@nameprotect.com)
0.01 | Krugle/Krugle,Nutch/0.8+ (Krugle web crawler; http://corp.krugle.com/crawler/info.html; webcrawler@krugle.com)
0.01 | Mozilla/5.0 (compatible; woriobot heritrix/1.10.0 +http://worio.com)
0.01 | Pluggd/Nutch-0.9 (automated crawler; http://www.pluggd.com; support at pluggd dot com)
0.01 | NutchCVS/0.8-dev (Nutch running at UW; http://www.nutch.org/docs/en/bot.html; sycrawl@cs.washington.edu)
0.01 | VMBot/0.7.2 (VMBot; http://www.VerticalMatch.com/; vmbot@tradedot.com)
0.01 | HMSE_Robot
0.01 | genieBot enash@genieknows.com
0.01 | NutchCVS/0.7.2 (Nutch; http://lucene.apache.org/nutch/bot.html; nutch-agent@lucene.apache.org)
0.01 | SeznamBot/1.0 (+http://fulltext2.seznam.cz/)
0.01 | EmeraldShield.com Web Spider (http://www.emeraldshield.com/webbot.aspx)
0.01 | genieBot (wgao@genieknows.com)
0.01 | Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; Girafabot; girafabot at girafa dot com; http://www.girafa.com)
0.01 | ExaBotTest/2.0
0.01 | OutfoxBot/0.5 (for internet experiments; http://; outfoxbot@gmail.com)
0.01 | VWBOT/Nutch-0.9-dev (VWBOT Nutch Crawler; http://vwbot.cs.uiuc.edu; vwbot@cs.uiuc.edu)
0.01 | MJ12bot/v1.1.0 (http://majestic12.co.uk/bot.php?+)
0.01 | Acoon-Robot v3.00 (http://www.acoon.de and http://www.acoon.com)
0.01 | RufusBot (Rufus Web Miner; http://www.webaroo.com/rooSiteOwners.html)
0.01 | woriobot (+http://www.worio.com)
0.01 | ZIBB Crawler (email address / WWW address)
0.01 | psbot/0.1 (+http://www.picsearch.com/bot.html)
0.01 | Twiceler www.cuill.com/twiceler/robot.html
0.01 | LarbinWebCrawler spider@download11.com
0.01 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322; MSIECrawler)
0.01 | msnbot-Products/1.0 (+http://search.msn.com/msnbot.htm)
0.01 | PHP version tracker (http://www.nexen.net/phpversion/bot.php)
0.01 | IRLbot/2.0 (compatible; MSIE 6.0; http://irl.cs.tamu.edu/crawler)
0.01 | NextGenSearchBot 1 (for information visit http://www.zoominfo.com/About/misc/NextGenSearchBot.aspx)
0.01 | VisBot/2.0 (Visvo.com Crawler; http://www.visvo.com/bot.html; bot@visvo.com)
0.01 | nicebot
0.01 | Findexa Crawler (http://www.findexa.no/gulesider/article26548.ece)
0.01 | Gigabot/2.0att
0.01 | sogou develop spider
0.01 | lawinfo-crawler/Nutch-0.9-dev (Crawler for lawinfo.com pages; http://www.lawinfo.com; webmaster@lawinfo.com)
0.01 | Mozilla/5.0 (compatible; AnsearchBot/1.0; +http://www.ansearch.com/)
0.01 | aipbot/2 (aipbot; http://www.aipbot.com; aipbot@aipbot.com)
0.01 | Twiceler-0.9 http://www.cuill.com/twiceler/robot.html
0.00 | Mozilla/4.0 (BejiBot Crawler 1.2a)
0.00 | IlseBot/1.1
0.00 | owsBot/0.2 (owsBot; www.oneworldstreet.com; owsBot)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; http://www.changedetection.com/bot.html )
0.00 | Mozilla/5.0 (compatible; askpeter_jeanie_2008_bot/4.1; +http://www.askpeter.info)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows 98) (via babelfish.yahoo.com)
0.00 | Bigsearch.ca/Nutch-1.0-dev (Bigsearch.ca Internet Spider; http://www.bigsearch.ca/; info@enhancededge.com)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; MRSPUTNIK 1, 8, 0, 17 SW; MRA 4.10 (build 01952); .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; MSIECrawler)
0.00 | Mozilla/4.0 (compatible: FDSE robot)
0.00 | SonyEricssonP910c/R2A SEMC-Browser/Symbian/3.0 Profile/MIDP-2.0 Configuration/CLDC-1.0 (compatible;YahooSeeker/M1A1-R2D2; http://help.yahoo.com/help/us/ysearch/crawling/crawling-01.html)
0.00 | HouxouCrawler/Nutch-0.9 (houxou.com's nutch-based crawler which serves special interest on-line communities; http://www.houxou.com/crawler; crawler at houxou dot com)
0.00 | NUTCHCRAWLER/Nutch-0.9 (anouar@yatinoo.com)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; MSIECrawler)
0.00 | Googlebot/1.0 (Googlebot; http://www.google.com/; googlebot@google.com)
0.00 | ScSpider/0.2
0.00 | 411.info Crawler (http://411.info)
0.00 | KDDI-CA33 UP.Browser/6.2.0.10.4 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; Alexa Toolbar; MSIECrawler)
0.00 | Mozilla/5.0 (compatible; Synoobot/0.9; http://www.synoo.com/search/bot.html)
0.00 | Mozilla/5.0 (compatible; OWPBot/0.3; http://www.openwhitepages.com/)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; FunWebProducts; MSIECrawler)
0.00 | Mozilla/5.0 (compatible; CjLogbot 1.0; +http://www.cjlog.com/bot)
0.00 | ichiro/3.0 (http://help.goo.ne.jp/door/crawler.html)
0.00 | libwww-perl/5.76
0.00 | DataparkSearch/4.46 (+http://dataparksearch.org/bot)
0.00 | Qweery_robot.txt_CheckBot/3.01 (http://qweerybot.qweery.com)
0.00 | ZoomSpider - wrensoft.com [ZSEBOT]
0.00 | Mozilla/5.0 (compatible; TridentSpider/3.1)
0.00 | CazoodleBot/0.1 (CazoodleBot Crawler; http://www.cazoodle.com; mqbot@cazoodle.com)
0.00 | spider (tspyyp@tom.com)
0.00 | genieBot (genieBot@genieknows.com)
0.00 | Cazoodle/Nutch-0.9-dev (Cazoodle Nutch Crawler; http://www.cazoodle.com; mqbot@cazoodle.com)
0.00 | Pete-Spider Light/1.46
0.00 | KAIST AITrc Crawler(\xba\xbb crawler\xb4\xc2 \xbf\xac\xb1\xb8\xbf\xeb\xc0\xd4\xb4\xcf\xb4\xd9. crawling\xc0\xbb \xbf\xf8\xc7\xcf\xc1\xf6 \xbe\xca\xc0\xb8\xbd\xc3\xb8\xe9 \xbf\xac\xb6\xf4\xc1\xd6\xbd\xc3\xb1\xe2 \xb9\xd9\xb6\xf8\xb4\xcf\xb4\xd9.
0.00 | lanshanbot/1.0
0.00 | Mozilla/4.0 (compatible; MSIE is not me; EDI/1.6.6; Edacious & Intelligent Web Robot; Daum Communications Corp., Korea)
0.00 | msnbot/1.0+(+http://search.msn.com/msnbot.htm)
0.00 | Mozilla/3.0 (compatible; ScollSpider; http://www.webwobot.com)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; SIMBAR Enabled; FunWebProducts; SIMBAR={7D736384-0BF3-4924-AFC2-80D3C73DB022}; MSIECrawler)
0.00 | OrangeSpider
0.00 | LTI/LemurProject Nutch Spider/Nutch-1.0-dev (Research spider using Nutch; http://lucene.apache.org/nutch/bot.html; admin@lemurproject.org)
0.00 | Keyword Density Analyzer v1.01 ( http://www.ranks.nl/tools/spider.html )
0.00 | SouFZ-Spider[SouFZ.COM]
0.00 | libwww-perl/5.64
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; MyIE2; .NET CLR 1.1.4322) (via babelfish.yahoo.com)
0.00 | libwww-perl/5.53
0.00 | Feed-Directory/0.1 (+http://www.feed-directory.com/bot.html)
0.00 | QweeryBot/3.02 (http://qweerybot.qweery.nl)
0.00 | blogsearchbot-pumpkin-2
0.00 | GnoZtiK bot/1.0 (http://www.gnoztik.com)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; FunWebProducts; .NET CLR 2.0.50727; MSIECrawler)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; DigExt) (via babelfish.yahoo.com)
0.00 | shaboyi spider
0.00 | NextGenSearchBot 1 (for information visit http://about.zoominfo.com/About/NextGenSearchBot.aspx)
0.00 | Mozilla/5.0 (compatible; Megaglobe Crawler/1.0; http://www.megaglobe.com)
0.00 | SygolBot http://www.sygol.com
0.00 | yacybot (i386 Linux 2.6.20-vs2.2.0-gentoo; java 1.5.0_11; Europe/en) http://yacy.net/yacy/bot.html
0.00 | Exabot-Test/1.0
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) (via babelfish.yahoo.com)
0.00 | Seekbot/1.0 (http://www.seekbot.net/bot.html) RobotsTxtFetcher/1.2
0.00 | Mozilla/5.0 (SnapPreviewBot) Gecko/20061206 Firefox/1.5.0.9
0.00 | Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Voyager; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)
0.00 | Mozilla/4.0 (compatible; MSIE is not me; DAUMOA/1.0.0; DAUM Web Robot; Daum Communications Corp., Korea)
0.00 | Shim-Crawler(Mozilla-compatible; http://www.logos.ic.i.u-tokyo.ac.jp/crawl/; crawl@logos.ic.i.u-tokyo.ac.jp)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; MSIECrawler)
0.00 | Yahoo! Mindset
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; SIMBAR Enabled; .NET CLR 1.1.4322; MSIECrawler)
0.00 | MQbot http://metaquerier.cs.uiuc.edu/crawler
0.00 | Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.1.9) Gecko/20071025 Firefox/2.0.0.9 (via babelfish.yahoo.com)
0.00 | WebarooBot (Webaroo Bot; http://64.124.122.252/feedback.html)
0.00 | ScollSpider/2.0 (+http://www.webwobot.com/ScollSpider.php)
0.00 | yacybot (i386 Linux 2.6.9-023stab033.9-smp; java 1.5.0_10; Europe/de) http://yacy.net/yacy/bot.html
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0), libwww-perl/5.69
0.00 | MJ12bot/v1.1.2 (http://majestic12.co.uk/bot.php?+)
0.00 | zerxbot/Version 0.6 libwww-perl/5.79
0.00 | varsabulur.com [ZSEBOT]
0.00 | Norbert the Spider(Burf.com)
0.00 | Googlebot/2.1 (+http://www.googlebot.com/bot.html; MSIE 6.0; Windows NT 5.1; SV1)
0.00 | Mozilla/5.0 (Yahoo-Test/4.0 mailto:vertical-crawl-support@yahoo-inc.com)
0.00 | gsa-crawler (Enterprise; S4-FUVPASTFQ6AAB; dahe@google.com)
0.00 | ZoomSpider - wrensoft.com
0.00 | owsBot/0.1 (Nutch; www.oneworldstreet.com; nutch-agent@lucene.apache.org)
0.00 | Mozilla/4.0 (compatible; Vagabondo/4.0Beta; webcrawler at wise-guys dot nl; http://webagent.wise-guys.nl/)
0.00 | TerrawizBot/1.0 (+http://www.terrawiz.com/bot.html)
0.00 | holmes/3.11 (http://morfeo.centrum.cz/bot)
0.00 | GigaBot/1.0
0.00 | MOT-V975/81.33.02I MIB/2.2.1 Profile/MIDP-2.0 Configuration/CLDC-1.1 (compatible;YahooSeeker/M1A1-R2D2; http://help.yahoo.com/help/us/ysearch/crawling/crawling-01.html)
0.00 | boitho.com-dc/0.85 ( http://www.boitho.com/dcbot.html )
0.00 | Grub/2.0 (Grub.org crawler; http://www.grub.org/; bot@grub.org)
0.00 | Sogou Orion spider/3.0(+http://www.sogou.com/docs/help/webmasters.htm#07)
0.00 | GurujiBot/1.0 (+http://www.guruji.com/WebmasterFAQ.html)
0.00 | Mozilla/5.0 (Yahoo-Test/4.0; yahoo-ysm-ks@yahoo-inc.com)
0.00 | Sogou head spider/3.0(+http://www.sogou.com/docs/help/webmasters.htm#07)
0.00 | oBot
0.00 | LTI/LemurProject Nutch Spider/Nutch-1.0-dev (Research spider using Nutch; http://www.lemurproject.org; mhoy@cs.cmu.edu)
0.00 | FyberSpider
0.00 | Gigabot/2.0/gigablast.com/spider.html
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; FunWebProducts; .NET CLR 1.1.4322; MSIECrawler)
0.00 | Mozilla/4.0 (compatible; MSIE is not me; DAUMOA/1.0.1; DAUM Web Robot; Daum Communications Corp., Korea)
0.00 | Evaal/0.7.2 (Evaal search engine; http://evaal.coml; bot@evaal.com)
0.00 | Blogobot/Delta
0.00 | k2spider
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322; Girafabot [girafa.com])
0.00 | BilgiBot/1.0(beta) (http://www.bilgi.com/; bilgi at bilgi dot com)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; FunWebProducts; HbTools 4.8.4; MSIECrawler)
0.00 | btfc Obotnxmnuiulrftvu
0.00 | Jyxobot/1
0.00 | pythonic-crawler (suzuki@tkl.iis.u-tokyo.ac.jp)
0.00 | FDF-Bot - Free Domainfinder by www.geld13.de
0.00 | Mozilla/5.0 (compatible; askpeter_jeanie_2008_bot/4.2; +http://www.askpeter.info)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; DigExt; .NET CLR 1.0.3705; Zango 10.0.341.0; MSIECrawler)
0.00 | Mozilla/5.0 (Yahoo-Test/4.0; mailto:vertical-crawl-support@yahoo-inc.com)
0.00 | i1searchbot/2.0 (i1search web crawler; http://www.i1search.com; crawler@i1search.com)
0.00 | Exabot/2.0
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; Googlebot; .NET CLR 1.1.4322; .NET CLR 1.0.3705; Googlebot)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; FunWebProducts; HbTools 4.8.4; SeekmoToolbar 4.8.4; MSIECrawler)
0.00 | Yeti/0.01 (nhn/1noon, yetibot@naver.com, check robots.txt daily and follows it)
0.00 | GurujiBot/1.0 (+http://www.guruji.com/en/WebmasterFAQ.html)
0.00 | Jambot/0.2.1 (Jambot; http://www.jambot.com/blog/static.php?page=webmaster-robot; crawler@jambot.com)
0.00 | libwww-perl/5.48
0.00 | Metaeuro Web Crawler/0.2 (MetaEuro Web Search Clustering Engine; http://www.metaeuro.com; crawlwer at metaeuro dot com)
0.00 | WebarooBot (Webaroo Bot; http://www.webaroo.com/rooSiteOwners.html)
0.00 | kicktooBotV1.1 kictooBot@kictoo.com
0.00 | Novosoft Spider/Nutch-0.9 (http://www.novosoft.net)
0.00 | LBot
0.00 | PCbot/5.3
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; yie6_SBC; YPC 3.0.1; Alexa Toolbar; MSIECrawler)
0.00 | Mozilla/4.0 (compatible; NaverBot/1.0; nhnbot@naver.com)
0.00 | SpiderVex v0.1
0.00 | Mozilla/5.0 (compatible; MojeekBot/2.0; http://www.mojeek.com/bot.html)
0.00 | Nokia6682/2.0 (3.01.1) SymbianOS/8.0 Series60/2.6 Profile/MIDP-2.0 configuration/CLDC-1.1 UP.Link/6.3.0.0.0 (compatible;YahooSeeker/M1A1-R2D2;http://help.yahoo.com/help/us/ysearch/crawling/crawling-01.html)
0.00 | Crawler Mozilla/4.0( compatible; MSIE 6.0; Windows NT 5.1; SV1; Maxthon; Alexa Toolbar)
0.00 | Sosoimagespider+(+http://help.soso.com/soso-image-spider.htm)
0.00 | crawler/10.36
0.00 | focused_crawler (flexo; Linux x86_64; http://ivia.ucr.edu/user_agents.html)
0.00 | Mozilla/5.0 (compatible; NGBot/4.5)
0.00 | Mozilla/4.0 compatible ZyBorg/1.0 (wn-16.zyborg@looksmart.net; http://www.WISEnutbot.com)
0.00 | WeRelateBot/0.9 (WeRelate; http://www.werelate.org/wiki/WeRelate:Bot; dallan@werelate.org)
0.00 | Attentio/Nutch-0.9-dev (Attentio's beta blog crawler; www.attentio.com; info@attentio.com)
0.00 | VanillaZilla/0.1 libwww-perl/5.79
0.00 | hl_ftien_spider_v1.1
0.00 | Mozilla/4.0 (compatible; GPU p2p crawler http://gpu.sourceforge.net/search_engine.php)
0.00 | KiwiStatus (NZS.com)/0.2 (NZS.com KiwiStatus Spider, Local Search New Zealand; http://www.nzs.com; bot-at-nzs dot com)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; DigExt; .NET CLR 1.0.3705; MSIECrawler)
0.00 | boitho.com-dc/0.82 ( http://www.boitho.com/dcbot.html )
0.00 | W3C-checklink/4.3 [4.42] libwww-perl/5.805
0.00 | Mozilla/4.0 (compatible; MyFamilyBot/1.0; http://www.ancestry.com/learn/bot.aspx)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) (via babelfish.yahoo.com)
0.00 | libwww-perl/5.69
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; YPC 3.0.2; SV1; MSIECrawler)
0.00 | masidani_bot_v0.6(www.masidani.com) (masidani@masidani.com)
0.00 | Mozilla/6.0 (MSIE 6.0; Windows NT 5.1; RSSMicro.com RSS/Atom Feed Robot)
0.00 | SGH-Z130 SHP/VPP/R5 SMB3.1 SMM-MMS/1.1.0 profile/MIDP-2.0 configuration/CLDC-1.0 (compatible;YahooSeeker/M1A1-R2D2; http://help.yahoo.com/help/us/ysearch/crawling/crawling-01.html)
0.00 | QweeryBot/3.02 ( http://qweerybot.qweery.nl)
0.00 | TMCrawler
0.00 | BCNutch/Nutch-0.9 (BrightCloud crawler; http://www.brightcloud.com/crawler.asp)
0.00 | LapozzBot/1.5 (+http://robot.lapozz.hu)
0.00 | Mozilla/5.0 (compatible;MAINSEEK_BOT)
0.00 | Googlebot/2.X (http://www.googlebot.com/bot.html) (383; MSIE 7.0; Windows NT 5.1; InfoPath.1; .NET CLR 2.0.50727)
0.00 | AdSenserBot
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; MSIECrawler)
0.00 | yacybot (i386 Mac OS X 10.4.9; java 1.6.0-dp; Europe/de; YaCy 0.514/03593; yacy.net) http://yacy.net/yacy/bot.html
0.00 | Mozilla/5.0 (compatible; Exabot-Images/3.0; +http://www.exabot.com/go/robot)
0.00 | GoldenFeed Spider 1.0 (http://www.goldenfeed.com)
0.00 | Mozilla/5.0 (MrCarlito-0.1 http://www.mrcarlito.com/spider.html)
0.00 | NG-Search/0.9.8 (NG-SearchBot; http://www.ng-search.com)
0.00 | BuildCMS crawler (http://www.buildcms.com/crawler)
0.00 | NetWhatCrawler/0.06-dev (NetWhatCrawler from NetWhat.com; http://www.netwhat.com; support@netwhat.com)
0.00 | MSNBOT_Mobile Mozilla/2.0 (compatible; MSIE 4.02; Windows CE; Default)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Advanced Searchbar 3.25; MSIECrawler)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; SIMBAR={16BC191A-29AA-4259-9FB3-AB3CD3CA7170}; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1; MSIECrawler)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; HbTools 4.7.7; MSIECrawler)
0.00 | URL_Spider_Pro/2.5+(http://www.innerprise.net/usp-spider.asp)
0.00 | iaskspider/2.0(+http://iask.com/help/help_index.html)
0.00 | sdcresearchlabs-testbot/0.8-dev (www.shopping.com/bot.html; http://lucene.apache.org/nutch/bot.html; researchbot@shopping.com)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; FunWebProducts; SV1; MSIECrawler)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; MEGAUPLOAD 1.0; MSIECrawler)
0.00 | IIITBOT/1.1 (Indian Language Web Search Engine; http://webkhoj.iiit.net; pvvpr at iiit dot ac dot in)
0.00 | Nokia6682/2.0 (3.01.1) SymbianOS/8.0 Series60/2.6 Profile/MIDP-2.0 configuration/CLDC-1.1 UP.Link/6.3.0.0.0 (compatible;YahooSeeker/M1A1-R2D2;mobile-search-customer-care AT yahoo-inc dot com)
0.00 | LarbinWebCrawler (spider@download11.com)
0.00 | RwB's spider for atUKplc. Version 0.0.3ish
0.00 | Nokia6610/1.0 (3.09) Profile/MIDP-1.0 Configuration/CLDC-1.0 (compatible;YahooSeeker/M1A1-R2D2; http://help.yahoo.com/help/us/ysearch/crawling/crawling-01.html)
0.00 | asked/Nutch-0.8 (web crawler; http://asked.jp; epicurus at gmail dot com)
0.00 | libwww-perl/5.800
0.00 | libwww-perl/5.801
0.00 | SIE-SX1/05 UP.Browser/7.0.0.1.181 (GUI) MMP/2.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 (compatible;YahooSeeker/M1A1-R2D2; http://help.yahoo.com/help/us/ysearch/crawling/crawling-01.html)
0.00 | YahooFeedSeeker/2.0 (compatible; Mozilla 4.0; MSIE 5.5; http://publisher.yahoo.com/rssguide; users 1; views 16)
0.00 | Seekbot/1.0 (http://www.seekbot.net/bot.html) HTTPFetcher/2.2
0.00 | grbot
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; yie6_SBC; YPC 3.0.1; MSIECrawler)
0.00 | msnbot/1.1 (+http://search.msn.com/msnbot.htm)
0.00 | BilgiBetaBot/0.8-dev/0.8-dev (bilgi.com (Beta); http://lucene.apache.org/nutch/bot.html; nutch-agent@lucene.apache.org)
0.00 | BilgiBetaBot/0.8-dev (bilgi.com (Beta) ; http://lucene.apache.org/nutch/bot.html; nutch-agent@lucene.apache.org)
0.00 | Mozilla/5.0 (compatible; SnapPreviewBot; en-US; rv:1.8.0.9) Gecko/20061206 Firefox/1.5.0.9
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) m_pupil@yahoo.com.cn
0.00 | Oxford_AI_BOT_V1.0 (compatible; MSIE 6.0;)
0.00 | lanshanbot/1.0 (+http://search.msn.com/msnbot.htm)
0.00 | MQBOT/Nutch-0.9-dev (MQBOT Nutch Crawler; http://falcon.cs.uiuc.edu; mqbot@cs.uiuc.edu)
0.00 | MSMOBOT Mozilla/2.0 (compatible; MSIE 4.02; Windows CE; Default)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; FunWebProducts; MSIECrawler)
0.00 | Mozilla/4.0 (compatible; Vagabondo/4.0; webcrawler at wise-guys dot nl; http://webagent.wise-guys.nl/; http://www.wise-guys.nl/)
0.00 | YodaoBot/1.0 (http://www.yodao.com/help/webmaster/spider/; )
0.00 | libwww-perl/5.73
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) (m_pupil@yahoo.com.cn)
0.00 | YahooFeedSeeker/2.0 (compatible; Mozilla 4.0; MSIE 5.5; http://publisher.yahoo.com/rssguide; users 0; views 0)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; ZangoToolbar 4.8.3; HbTools 4.8.4; MSIECrawler)
0.00 | Nokia6680/1.0 ((4.04.07) SymbianOS/8.0 Series60/2.6 Profile/MIDP-2.0 Configuration/CLDC-1.1 (botmobi find.mobi/bot.html) )
0.00 | JAC (Just Another Crawler BETA 1.0)
0.00 | msnbot/0.01 (+http://search.msn.com/msnbot.htm)
0.00 | nrsbot/5.0(loopip.com/robot.html)
0.00 | Spock Alpha Crawler (joseph@corp.spock.com)
0.00 | RSS-SPIDER (http://www.rss-spider.com/submit200709.php)
0.00 | Mozilla/4.0 (compatible; MSIE 5.0; Windows 95) VoilaBot BETA 1.2 (http://www.voila.com/)
0.00 | Mozilla/5.0 (compatible; Googlebot/2.1;+http://www.google.com/bot.html)
0.00 | GSiteCrawler/v1.12 rev. 260 (http://gsitecrawler.com/)
0.00 | SEO[.AG] - Search Engine Optimizer Bot [http://www.seo.ag]
0.00 | NutchCVS/0.06-dev (Nutch; http://www.nutch.org/docs/en/bot.html; nutch-agent@lists.sourceforge.net)
0.00 | Mozilla/5.0 (compatible; BecomeJPBot/2.3; MSIE 6.0 compatible; +http://www.become.co.jp/site_owner.html)
0.00 | NutchCVS/0.7.2 (Nutch; http://lucene.apache.org/nutch/bot.html; west@cis.poly.edu)
0.00 | Steeler/3.2 (http://www.tkl.iis.u-tokyo.ac.jp/~crawler/)
0.00 | VSynCrawler/1.0
0.00 | Mozilla/5.0 (superbot.com; +http://www.super.info)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; FunWebProducts; MEGAUPLOAD 1.0; MSIECrawler)
0.00 | Balihoo/Nutch-1.0-dev (Crawler for Balihoo.com search engine - obeys robots.txt and robots meta tags ; http://balihoo.com/index.aspx; robot at balihoo dot com)
0.00 | sproose/0.1 (sproose bot; http://www.sproose.com/bot.html; crawler@sproose.com)
0.00 | Mozilla/5.0 (compatible; Jeanie_2008_bot/3.2; +http://www.quickengines.com)
0.00 | Mozilla/5.0 (compatible; googlebot/2.1; +http://www.google.com/bot.html) (googlebot@google.com)
0.00 | genieBot (enash@genieknows.com)
0.00 | Mozilla/5.0 (compatible; BuiltWith/0.1; +http://builtwith.com/bot.html)
0.00 | FAST Enterprise Crawler 6 used by Comperio AS (sts@comperio.no)
0.00 | LG-C1500 UP.Browser/6.2.3 (GUI) MMP/1.0 (compatible;YahooSeeker/M1A1-R2D2; http://help.yahoo.com/help/us/ysearch/crawling/crawling-01.html)
0.00 | gsinfobot_1.2.0 (gary@sysadminforhire.com)
0.00 | PCbot/4.3
0.00 | Feedster Crawler/3.0; Feedster, Inc.
0.00 | PhpDig/1.8.8 (+http://www.phpdig.net/robot.php)
0.00 | Mozilla/4.0 (compatible; MSIE enviable; DAUMOA/1.0.1; DAUM Web Robot; Daum Communications Corp., Korea; +http://ws.daum.net/aboutkr.html)
0.00 | IlseBot/1.0
0.00 | Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0; obot)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; MSIECrawler)
0.00 | Mozilla/4.0 (compatible; MSIE 5.01; Windows 95; MSIECrawler)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0; Girafabot; girafabot at girafa dot com; http://www.girafa.com)
0.00 | libwww-perl/5.63
0.00 | googlebot 1.0
0.00 | MOT-V975/81.33.02I MIB/2.2.1 Profile/MIDP-2.0 Configuration/CLDC-1.1 (compatible;YahooSeeker/M1A1-R2D2;mobile-search-customer-care AT yahoo-inc dot com)
0.00 | Acoon-Robot v3.1.0 (http://www.acoon.de and http://www.acoon.com)
0.00 | Mozilla/4.0 (compatible; Vagabondo/2.3; webcrawler at wise-guys dot nl; http://webagent.wise-guys.nl/)
0.00 | yacybot (i386 Linux 2.6.18-xenU; java 1.5.0_11; Etc/en) http://yacy.net/yacy/bot.html
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; FunWebProducts; .NET CLR 1.1.4322; InfoPath.2; MSIECrawler)
0.00 | Yoriwa/0.1 (compatible; Mozilla 4.0; MSIE 5.5; robot@yoriwa.com)
0.00 | pulseBot (pulse Web Miner)
0.00 | Spock Crawler (http://www.spock.com/crawler)
0.00 | CydralSpider/1.9 (Cydral Web Image Search; http://www.cydral.com)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; MSIECrawler)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1; .NET CLR 3.0.04506.30; MSIECrawler)
0.00 | blackspider
0.00 | VadixBot_Exp
0.00 | SiteCrawler/1.0
0.00 | Giant/1.0 (Openmaru bot; robot@openmaru.com)
0.00 | Snapbot/1.0 (Site Search Crawler, +http://www.snap.com)
0.00 | Smart/Spider 2.7.1 - Mozilla/4.0
0.00 | Mozilla/4.0 (compatible; MSIE 6.0 compatible; Asterias Crawler v4; +http://www.singingfish.com/help/spider.html; webmaster@singingfish.com); SpiderThread Revision: 3.11
0.00 | Mozilla/5.0 (compatible; BecomeBot/3.0; MSIE 6.0 compatible; +http://www.become.com/site_owners.html)
0.00 | Mozilla/4.0 (compatible; Vagabondo/4.0Beta; webcrawler at wise-guys dot nl; http://webagent.wise-guys.nl/; http://www.wise-guys.nl/)
0.00 | BilgiBot/1.0(beta) (bilgi.com(beta); http://lucene.apache.org/nutch/bot.html; nutch-agent@lucene.apache.org)
0.00 | http://www.almaden.ibm.com/cs/crawler [bc5]
0.00 | libwww-perl/5.50
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; MSIECrawler)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; DigExt; MSIECrawler)
0.00 | DiamondBot/2.0
0.00 | CydralSpider/2.4 (Cydral Image Search; http://www.cydral.com)
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; (R1 1.5); MSIECrawler)
0.00 | boitho.com-dc/0.86 ( http://www.boitho.com/dcbot.html )
0.00 | Nokia6680/1.0 ((4.04.07) SymbianOS/8.0 Series60/2.6 Profile/MIDP-2.0 Configuration/CLDC-1.1 (for mobile crawler) )
0.00 | NIF/1.1 (http://www.newsisfree.com/robot.php users:)
0.00 | WWWeasel Robot v1.00 (http://wwweasel.de)
0.00 | GeonaBot/1.2; http://www.geona.com/
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; MSIECrawler)
0.00 | Mozilla/5.0 (compatible; nextthing.org/1.0; +http://www.nextthing.org/bot)
0.00 | disco/Nutch-0.9 (experimental crawler ... please email imagine@gmail.com if problems observed; nedrocks@gmail.com)
0.00 | MyApp/0.1 libwww-perl/5.805
0.00 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1) (via babelfish.yahoo.com)
0.00 | Twiceler www.cuill.com/robots.html
0.00 | BrightCrawler (http://www.brightcloud.com/brightcrawler.asp)
0.00 | Mozilla/5.0 (compatible; AboutUsBot/0.9; +http://www.aboutus.org/AboutUsBot)
0.00 | BaiduImagespider+(+http://www.baidu.jp/search/s308.html)
0.00 | hbtronix.spider.2 -- http://hbtronix.de/spider.php
0.00 | gsinfobot_1.2.0 gary@sysadminforhire.com
0.00 | Sogou Pic Spider/3.0(+http://www.sogou.com/docs/help/webmasters.htm#07)
0.00 | LG-C1500 UP.Browser/6.2.3 (GUI) MMP/1.0 (compatible;YahooSeeker/M1A1-R2D2;mobile-search-customer-care AT yahoo-inc dot com)
0.00 | Mozilla/4.0 (compatible; MSIE enviable; DAUMOA 2.0; DAUM Web Robot; Daum Communications Corp., Korea; +http://ws.daum.net/aboutkr.html)
0.00 | SGH-Z130 SHP/VPP/R5 SMB3.1 SMM-MMS/1.1.0 profile/MIDP-2.0 configuration/CLDC-1.0 (compatible;YahooSeeker/M1A1-R2D2;mobile-search-customer-care AT yahoo-inc dot com)
0.00 | libwww-perl/5.808
0.00 | EasyDL/3.04 http://keywen.com/Encyclopedia/Bot
0.00 | RIBAECrawler
0.00 | Mozilla/5.0 (compatible; Yahoo! DE Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
0.00 | woriobot (+http://www.worio.com/)
0.00 | KiwiStatus (NZS.com)/0.2 (NZS.com KiwiStatus Spider, Local Search New Zealand; http://www.nzs.com; bot@nzs.com)
0.00 | Mozilla/5.0 (+http://www.toile.com/) ToileBot/0.1
0.00 | OutfoxMelonBot/0.5 (for internet experiments; http://; outfoxbot@gmail.com)
0.00 | Xaldon WebSpider 2.7.b6
0.00 | kicktooBotV1.1 (kictooBot@kictoo.com)
0.00 | WebCrawler_1.1 (internet@bredband.net)
0.00 | NutchCVS (Nutch; http://lucene.apache.org/nutch/bot.html; nutch-agent@lucene.apache.org)
0.00 | Mozilla/5.0 (compatible; Yahoo! Slurp/si-emb; http://help.yahoo.com/help/us/ysearch/slurp)
0.00 | Googlebot/2.X (+http://www.googlebot.com/bot.html)
0.00 | Mozilla/4.0 (compatible; Vagabondo/4.0; webcrawler at wise-guys dot nl; http://webagent.wise-guys.nl/)
0.00 | Vacobot; (+http://vaco.ws/bot.html)
0.00 | PlantyNet_WebRobot_V1.9 babo@plantynet.com
(492 rows)


Tuesday, February 5, 2008

Comma delimited results

In psql, you can do:

regression=# select * from tenk1 limit 2;
unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4
---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
8800 | 0 | 0 | 0 | 0 | 0 | 0 | 800 | 800 | 3800 | 8800 | 0 | 1 | MAAAAA | AAAAAA | AAAAxx
1891 | 1 | 1 | 3 | 1 | 11 | 91 | 891 | 1891 | 1891 | 1891 | 182 | 183 | TUAAAA | BAAAAA | HHHHxx
(2 rows)

regression=# \a
Output format is unaligned.
regression=# \f ,
Field separator is ','.
regression=# \t
Showing only tuples.
regression=# select * from tenk1 limit 2;
8800,0,0,0,0,0,0,800,800,3800,8800,0,1,MAAAAA,AAAAAA,AAAAxx
1891,1,1,3,1,11,91,891,1891,1891,1891,182,183,TUAAAA,BAAAAA,HHHHxx
regression=#
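
One caveat with this approach: the unaligned format does not quote values, so a field that itself contains a comma yields an ambiguous line. The sketch below illustrates the difference in Python with made-up rows (they are not from tenk1); the helper names `join_unaligned` and `to_csv` are hypothetical and only mimic the two output styles.

```python
import csv
import io

# Hypothetical rows; the second row has a value containing a comma on purpose.
rows = [(8800, "MAAAAA", "AAAAxx"), (1891, "TU,AAA", "HHHHxx")]


def join_unaligned(rows, sep=","):
    """Mimic unaligned output: values joined by the separator, no quoting."""
    return "\n".join(sep.join(str(v) for v in row) for row in rows)


def to_csv(rows):
    """Write the same rows as real CSV, quoting fields only when needed."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue().rstrip("\n")


print(join_unaligned(rows))  # second line splits into four fields
print(to_csv(rows))          # second line keeps "TU,AAA" as one quoted field
```

If the data can contain the separator, a real CSV writer is the safer route; psql 12 and later also offer a csv output format via \pset format csv.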

Wednesday, September 19, 2007

select only 2-level domains.

select url
from d_directories_backlinks
where directory_id in (select entry_id from d_directories where lang = 'ja')
  and length(split_part(replace(get_host(url), 'www.', ''), '.', 3)) = 0;
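
The same test can be sketched outside the database. The Python below assumes that get_host simply returns the host part of the URL (it is not a built-in PostgreSQL function, so this is a guess at its behaviour) and uses urllib.parse.urlsplit as a stand-in; the name `is_two_level` and the sample URLs are made up for illustration.

```python
from urllib.parse import urlsplit


def is_two_level(url):
    """Return True when the host, minus 'www.', has at most two dot-separated labels."""
    # Assumption: get_host(url) in the query behaves like urlsplit(url).hostname.
    host = urlsplit(url).hostname or ""
    # Same as replace(host, 'www.', '') in the query: every 'www.' is removed.
    host = host.replace("www.", "")
    parts = host.split(".")
    # split_part(host, '.', 3) is empty when there is no third label.
    third = parts[2] if len(parts) > 2 else ""
    return len(third) == 0


for url in ["http://www.example.com/page",
            "http://blog.example.com/",
            "http://example.co.jp/"]:
    print(url, is_two_level(url))
```

Note that a host such as example.co.jp has three labels and is therefore filtered out by this check, which matters for a lang='ja' selection.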

Thursday, September 6, 2007

network monitoring tools

http://nethogs.sourceforge.net/

Just found these great tools. nethogs measures bandwidth per process ID, which is really helpful.

Sunday, July 8, 2007

Tuesday, May 29, 2007

Make fonts smooth in Kubuntu Linux

emacs /home/user/.fonts.conf


<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <match target="font">
    <edit name="autohint" mode="assign">
      <bool>true</bool>
    </edit>
  </match>
</fontconfig>