
WTF!!! Googlebot is killing my site. Aaaargh

Originally posted by: JEDIYoda
Originally posted by: mzkhadir
If you want robots (or spiders) to stop indexing your site, there are a couple of different ways you can go about this. You can also stop them from indexing individual Web pages or subdirectories, and you can stop robots from specific search engines.

You can stop robots in one of two ways: with a txt file or with a meta tag. Let's use Google as an example. One of Google's more popular robots is called "Googlebot". To stop Googlebot from indexing a particular page you would add the meta tag <META NAME="GOOGLEBOT" CONTENT="NOINDEX, NOFOLLOW">. If you want to stop all robots from indexing a page the meta tag would look like this: <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">. If you wanted a robot or robots excluded from an entire site you would have to put this tag on every page. That would be a serious pain.

The best way to exclude a robot or robots from an entire site is to use a robots.txt file. You place the robots.txt file in the root directory of your site and it tells robots what not to index on your web site. You can create this file in Dreamweaver, Notepad, or any application that lets you create a plain text file. Search engines will request the robots.txt file before indexing your site. As an example, if you didn't want Google to index your site, you would put the following into a robots.txt file:

User-agent: Googlebot
Disallow: /
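
One way to sanity-check a rule like this before uploading it is to run it through Python's standard-library robots.txt parser. This is only a sketch: the `can_fetch` helper and the example.com URL are made up for illustration, and it shows how a well-behaved parser reads the file, not how Googlebot itself is implemented.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt from the post above.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""


def can_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URL; substitute a page from your own site.
    page = "http://www.example.com/index.html"
    for bot in ("Googlebot", "msnbot"):
        print(bot, "allowed" if can_fetch(ROBOTS_TXT, bot, page) else "blocked")
```

With this file only Googlebot is shut out; a bot with no matching User-agent entry is still allowed everywhere.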

Good info...noted!!

Good info indeed.
 
New month and the site is up again. This time I WILL block Google.
Just wanted you to see the stats:

Robot            Hits     Bandwidth   Last visit
Googlebot        40649    2.19 GB     31 Oct 2006 - 09:30
Inktomi Slurp    807      18.05 MB    30 Oct 2006 - 15:25
MSNBot           58       1.37 MB     25 Oct 2006 - 20:46

2.19GB, 73% of my bandwidth. WTF!!!
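
For a rough sense of scale, the numbers above can be worked through in a few lines of Python. This is back-of-the-envelope only, and it rests on assumptions not stated in the post: that the 73% is Googlebot's share of the month's total traffic, that the second column counts requests, and that the stats tool uses 1 GB = 1024 MB.

```python
MB_PER_GB = 1024  # assumption: the stats tool reports binary units

# Figures copied from the stats above; bandwidth converted to MB.
bots = {
    "Googlebot": {"hits": 40649, "mb": 2.19 * MB_PER_GB},
    "Inktomi Slurp": {"hits": 807, "mb": 18.05},
    "MSNBot": {"hits": 58, "mb": 1.37},
}

GOOGLEBOT_SHARE = 0.73  # "73% of my bandwidth", read as share of total traffic

# If 2.19 GB is 73% of the total, the total is 2.19 / 0.73.
implied_total_gb = 2.19 / GOOGLEBOT_SHARE

# Average transfer per Googlebot request, in KB.
kb_per_hit = bots["Googlebot"]["mb"] * 1024 / bots["Googlebot"]["hits"]

# How many times more bandwidth Googlebot used than the next bot.
ratio_vs_slurp = bots["Googlebot"]["mb"] / bots["Inktomi Slurp"]["mb"]

print(f"Implied monthly total: about {implied_total_gb:.1f} GB")
print(f"Average per Googlebot request: about {kb_per_hit:.0f} KB")
print(f"Googlebot vs Inktomi Slurp: about {ratio_vs_slurp:.0f}x the bandwidth")
```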
 
That's the funny part. The site only has a few, maybe 10, pages with a banner on top and then basically text and then a CSS file.
The main part is a phpBB forum and a Coppermine gallery with very few pics.

 
Question: I have a meta tag set to instruct robots:

<meta name="revisit-after" content="15 Days">

Am I correct to understand that with this they will only visit once every 15 days and not as often as they want? Not sure if Google and the other bots pay attention to this tag.

 
Originally posted by: QueBert
Question: I have a meta tag set to instruct robots:

<meta name="revisit-after" content="15 Days">

Am I correct to understand that with this they will only visit once every 15 days and not as often as they want? Not sure if Google and the other bots pay attention to this tag.

Proper site bots should follow meta tags...

 
Deny Googlebot access to the rapidly changing sections of your site, such as the forums. That is probably where it keeps going and eating bandwidth. Let it hit the main page so that gets indexed.
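
A partial block along these lines might look like the robots.txt in the sketch below, checked again with Python's standard-library parser. The `/forum/` and `/gallery/` paths are guesses, since the thread never says where the phpBB forum and the gallery are installed, and `allowed_paths` is just a helper name made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical directory names; use the real locations of your forum and gallery.
ROBOTS_TXT_PARTIAL = """\
User-agent: Googlebot
Disallow: /forum/
Disallow: /gallery/
"""


def allowed_paths(robots_txt, user_agent, paths):
    """Map each path to True (crawlable) or False (blocked) for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(user_agent, path) for path in paths}


if __name__ == "__main__":
    checks = allowed_paths(
        ROBOTS_TXT_PARTIAL,
        "Googlebot",
        ["/", "/index.html", "/forum/viewtopic.php?t=1", "/gallery/index.php"],
    )
    for path, ok in checks.items():
        print(f"{path}: {'allowed' if ok else 'blocked'}")
```

The main page stays crawlable while everything under the two disallowed directories is off limits to Googlebot.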
 
Originally posted by: FrustratedUser
Phew.... I think it's working. The traffic dropped dramatically a few days after I uploaded the robots.txt file.

🙂

:thumbsup: robots obey. 🙂
 