SEWATCH: The Big List of Web Robots
(Oct 24th, 16:41:21)

Who sent that web robot, and what is it doing crawling around on your server? Identify and track robots with this list of hundreds of active crawlers, link checkers and other cybercritters.

"In reality, crawlers are relatively simple programs, though they have the power to bring a web site to a standstill. They can also automatically and rapidly fetch material that a site owner may not want anyone to see. For this reason, most crawlers (also called 'robots') abide by the 'robots exclusion protocol,' an informal set of rules that constrains their behavior."
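
As a rough illustration of the robots exclusion protocol mentioned in the quote, the sketch below shows how a well-behaved crawler might check a site's robots.txt rules before fetching a page. It uses Python's standard urllib.robotparser module. The robots.txt contents, the paths, and the agent names "ExampleCrawler" and "BadBot" are hypothetical and do not come from the article.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every robot is kept out of /private/,
# and one robot (the made-up "BadBot") is kept out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: BadBot
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A polite crawler asks before each request and skips anything disallowed.
for agent, path in [
    ("ExampleCrawler", "/index.html"),
    ("ExampleCrawler", "/private/report.html"),
    ("BadBot", "/index.html"),
]:
    verdict = "may fetch" if parser.can_fetch(agent, path) else "must skip"
    print(f"{agent} {verdict} {path}")
```

Compliance is voluntary: the rules only constrain robots that choose to consult them, so a check like this protects a site only from crawlers that cooperate.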


Related Stories:
LinuxWorld: How to save an Apache log file in a PostgreSQL database (Oct 09, 2001)
evolt.org: Using Apache to stop bad robots (Sep 20, 2001)
Apache Guide: Spiders and Robots (Nov 21, 2000)
Apache Guide: Logging with Apache--Understanding Your access_log (Aug 21, 2000)


Printed from Apache Today (https://apachetoday.com).
https://apachetoday.com/news_story.php3?ltsn=2001-10-24-001-06-PS-AD



All times are recorded in UTC.
Copyright 2002 INT Media Group, Incorporated All Rights Reserved.