Archive for February, 2006

Web Spider Traps, webbot and search engine detection, blocking a robot, allowing some bots and blocking others with PHP

When an author does not want his site to be copied or indexed by search engines, he can use:

1. A robots meta tag (well-behaved bots).
2. A robots.txt file which indicates the parts of the site not to be explored (well-behaved bots).
3. An .htaccess file that bans known or detected robots (any webbot).
4. A Java applet, some HTML, or a script written in PHP, JavaScript or any other language (any webbot).
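
As a rough illustration of the fourth option, allowing some bots and blocking others usually comes down to inspecting the User-Agent header of each request. The sketch below is written in Python rather than PHP, and the function name `classify_user_agent` and the two lists are hypothetical examples, not taken from the linked article. Keep in mind that a User-Agent string can be spoofed, so this only filters bots that identify themselves honestly.

```python
# Hypothetical allow and block lists; the entries are illustrative assumptions.
ALLOWED_BOTS = ("googlebot", "slurp", "msnbot")
BLOCKED_BOTS = ("httrack", "wget", "curl")


def classify_user_agent(user_agent):
    """Return 'allow', 'block' or 'unknown' for a User-Agent header value."""
    ua = (user_agent or "").lower()
    if not ua:
        # An empty User-Agent is treated as suspicious in this sketch.
        return "block"
    # Check the block list first so that it always takes precedence.
    if any(name in ua for name in BLOCKED_BOTS):
        return "block"
    if any(name in ua for name in ALLOWED_BOTS):
        return "allow"
    # Ordinary browsers and unrecognized bots end up here.
    return "unknown"


if __name__ == "__main__":
    for header in ("Googlebot/2.1 (+http://www.google.com/bot.html)",
                   "Wget/1.10",
                   "Mozilla/5.0 (Windows; U) Firefox/1.5"):
        print(header, "->", classify_user_agent(header))
```

A server-side script would typically answer a "block" result with an HTTP 403 response and serve the page normally otherwise; what to do with "unknown" visitors is a policy choice left open here.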

Search firms surveyed on privacy | CNET News.com

Verbatim: To find out what kind of information the four major search companies retain about their users, CNET News.com surveyed America Online, Google, Microsoft and Yahoo.
