After moving to the new server, we synchronized our logs and found 69 new bots over the last few weeks, which pushed us past the 700-bot milestone!
And without further ado, the highlights:
Next up, YodaoBot/1.0 dropped in for a visit. We couldn't tell much about this bot, as its home site appears to be in Japanese. However, it looks like they honor the robots.txt standard.
We ran across Twiceler in our logs, and visited their site.
Their informational page was brief and to the point, saying: "Twiceler is an experimental robot. Please contact email@example.com if you have any problems. Twiceler should obey robots.txt." I guess that says it all...
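For readers wondering what "obeying robots.txt" looks like from the bot's side, here is a minimal sketch using Python's standard `urllib.robotparser`. The sample rules, the `example.com` URLs, and the helper name `is_allowed` are made up for illustration; we have no idea how Twiceler itself implements the check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, invented for this example.
SAMPLE_ROBOTS = """\
User-agent: Twiceler
Disallow: /private/

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


blocked = is_allowed(SAMPLE_ROBOTS, "Twiceler", "http://example.com/private/page")
open_page = is_allowed(SAMPLE_ROBOTS, "Twiceler", "http://example.com/index.html")
other_bot = is_allowed(SAMPLE_ROBOTS, "OtherBot", "http://example.com/private/page")
```

A well-behaved bot would run a check like this before every request and skip any URL for which it returns False.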
Dries Buytaert sent his personal bot over.
According to his blog, he is a PhD student at the University of Ghent (Belgium), and lead of the
DepSpid/5.07, "The Dependency Spider", dropped in and provided a URL for its site.
Upon visiting, we found out (verbatim) that "DepSpid is a distributed kind of a web crawler. The DepSpid spider visits domains, analyses links and finally calculates scores about the link dependencies between individual domains. Each spider job starts at the main page of a domain and then follows each link on that page retrieving more pages and analysing them, too. The spider stays within one domain. If it finds an external link it only checks if the linked domain is reachable but doesn't continue crawling into the external domain. Every unknown domain will be visited from another spider job at a later time.
The DepSpid spider is currently under devlopment. Once it's running in production mode, the data collected by the spider will be publically available and will give webmasters a new kind of sight into their own or foreign domains."
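The crawl strategy described in that quote can be sketched roughly as follows. This is only our reading of the description, not DepSpid's actual code: the function name `crawl_domain`, the two callbacks, and the toy link table are all hypothetical, and the per-domain link count is a simple stand-in for whatever scoring DepSpid really computes.

```python
from collections import Counter, deque
from urllib.parse import urlparse


def crawl_domain(start_url, fetch_links, is_reachable):
    """Crawl one domain breadth-first; external links are checked, never followed.

    fetch_links(url) returns the links found on a page, and
    is_reachable(domain) reports whether an external domain responds.
    Both are supplied by the caller (hypothetical helpers).
    """
    domain = urlparse(start_url).netloc
    visited = set()
    queue = deque([start_url])
    external_links = Counter()   # stand-in "score": links per external domain
    pending_domains = set()      # domains left for a later spider job

    while queue:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        for link in fetch_links(url):
            link_domain = urlparse(link).netloc
            if link_domain == domain:
                # Internal link: keep crawling within the same domain.
                if link not in visited:
                    queue.append(link)
            elif is_reachable(link_domain):
                # External link: record it, but do not crawl into it.
                external_links[link_domain] += 1
                pending_domains.add(link_domain)

    return visited, external_links, pending_domains


# Toy in-memory "web" so the sketch runs without network access.
TOY_WEB = {
    "http://a.example/": ["http://a.example/about", "http://b.example/"],
    "http://a.example/about": [
        "http://a.example/",
        "http://b.example/x",
        "http://c.example/",
    ],
}

visited, external_links, pending_domains = crawl_domain(
    "http://a.example/",
    fetch_links=lambda url: TOY_WEB.get(url, []),
    is_reachable=lambda d: d != "c.example",  # pretend c.example is down
)
```

In this toy run the spider visits both pages on a.example, counts two links to b.example, ignores the unreachable c.example, and leaves b.example queued for a separate job.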
We now have 51,490 user agents and 720 bots.