WTF!!! Googlebot is killing my site. Aaaargh

Aug 16, 2001
22,505
4
81
This sucks. Googlebot has eaten up 1.79 GB of bandwidth this month. I've got 3 GB in total...
Last month the bot was barely even visiting.

Yeah, I'm a noob at this. What can I do?
 

mzkhadir

Diamond Member
Mar 6, 2003
9,509
1
76
Deny Googlebot from ever crawling your site. Put code on your site so it won't crawl it again.
 

mzkhadir

Diamond Member
Mar 6, 2003
9,509
1
76
If you want robots (or spiders) to stop indexing your site, there are a couple of different ways you can go about this. You can also stop them from indexing individual Web pages or subdirectories, and you can stop robots from specific search engines.

You can stop robots in one of two ways: with either a txt file or a meta tag. Let's use Google as an example. One of Google's more popular robots is called "Googlebot". To stop Googlebot from indexing a particular page you would add the meta tag <META NAME="GOOGLEBOT" CONTENT="NOINDEX, NOFOLLOW">. If you want to stop all robots from indexing a page the meta tag would look like this: <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">. If you wanted a robot or robots excluded from an entire site you would have to put this tag on every page. That would be a serious pain.

The best way to exclude a robot or robots from an entire site is to use a robots.txt file. You place the robots.txt file in the root directory of your site and it tells robots what not to index. You can create this file in Dreamweaver, Notepad, or any application that can save a plain text file. Search engines will request the robots.txt file before indexing your site. As an example, if you didn't want Google to index your site, put the following into a robots.txt file:

User-agent: Googlebot
Disallow: /
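
If you want to sanity-check those two lines before uploading them, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name is_allowed and the example.com URL are placeholders of mine, not anything from your site, and the parser only approximates how a real crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# The exact rules from the post above.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://www.example.com/index.html"  # placeholder URL (assumption)
    for agent in ("Googlebot", "MSNBot"):
        print(agent, "allowed:", is_allowed(ROBOTS_TXT, agent, url))
```

With these rules only Googlebot is shut out; robots that are not named in the file are still allowed everywhere.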
 

mugs

Lifer
Apr 29, 2003
48,920
46
91
robots.txt

But this is to your disadvantage if you want people to come to your site.
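
One possible middle ground, since robots.txt can also block individual subdirectories: keep the pages indexable and shut the bot out of only the heavy stuff. A rough sketch, assuming the bandwidth hogs live in directories like /images/ and /downloads/ (made-up names, swap in your own); build_robots_txt and is_allowed are hypothetical helpers, and the check uses Python's standard urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(user_agent: str, blocked_dirs: list[str]) -> str:
    """Build robots.txt text that blocks one robot from the given directories."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {path}" for path in blocked_dirs]
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical directory names; replace with whatever eats your bandwidth.
    robots_txt = build_robots_txt("Googlebot", ["/images/", "/downloads/"])
    print(robots_txt)
    for path in ("/index.html", "/images/big.jpg", "/downloads/file.zip"):
        url = "http://www.example.com" + path  # placeholder host (assumption)
        print(path, "allowed:", is_allowed(robots_txt, "Googlebot", url))
```

That way the site can still come up in searches while the big files stay off the bot's list.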
 
Aug 16, 2001
22,505
4
81
Thanks for the advice. I was just taken by surprise.
It's weird that Google is suddenly eating all my bandwidth now. It used to be more like 100-200 MB/month, but 1.79 GB... WTF.
 

Shawn

Lifer
Apr 20, 2003
32,236
53
91
Originally posted by: FrustratedUser
WTH is good about that?

It's indexing your site. If you want your site to come up in Google searches, you have to let it index. If you don't care, then use the method someone posted above to block it.
 

Steve

Lifer
May 2, 2004
15,945
11
81
Wallydraigle et al. went through the same thing at TFNN; you might want to drop them a line.
 

InlineFive

Diamond Member
Sep 20, 2003
9,599
2
0
If you had followed the above user's advice about robots.txt then you wouldn't have that problem.
 

SarcasticDwarf

Diamond Member
Jun 8, 2001
9,574
2
76
Originally posted by: InlineFive
If you had followed the above user's advice about robots.txt then you wouldn't have that problem.


QFT, and if that did not work, then it is not Google doing it (possibly a DoS attack).
 
Aug 16, 2001
22,505
4
81
Originally posted by: InlineFive
If you had followed the above user's advice about robots.txt then you wouldn't have that problem.

Yeah I know. I thought I had more time.
The stats said I had 20% BW left and I thought that would be enough until the new month.

I was wrong...:(


Not sure if it's a DoS attack. The stats pointed to Google as the culprit.
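
For anyone who wants to double-check the stats against the raw access log, here is a rough sketch that adds up the bytes served per bot. It assumes the server writes Apache "combined" format logs; the bot list and the function name bandwidth_by_bot are just for illustration. Keep in mind a user-agent string can be faked, so this only shows who claims to be Googlebot.

```python
import re
from collections import defaultdict

# Assumes Apache "combined" log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (?P<size>\d+|-) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# Substrings to look for in the user-agent field (illustrative list).
BOTS = ("Googlebot", "MSNBot", "Slurp")


def bandwidth_by_bot(lines):
    """Sum response bytes per bot, matching bot names case-insensitively."""
    totals = defaultdict(int)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in combined format
        size = match.group("size")
        nbytes = 0 if size == "-" else int(size)  # "-" means no body was sent
        agent = match.group("agent").lower()
        for bot in BOTS:
            if bot.lower() in agent:
                totals[bot] += nbytes
                break
    return dict(totals)


if __name__ == "__main__":
    import sys

    # Usage: python bot_bandwidth.py access.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for bot, nbytes in sorted(bandwidth_by_bot(log).items()):
            print(f"{bot}: {nbytes / (1024 * 1024):.2f} MB")
```

If the total for Googlebot comes out far below what the stats page reports, something else is probably eating the bandwidth.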

 

FeuerFrei

Diamond Member
Mar 30, 2005
9,144
929
126
Robot            Hits         Bandwidth    Last visit
Inktomi Slurp    1975+1202    19.94 MB     31 Oct 2006 - 05:08
MSNBot           847+251      13.08 MB     31 Oct 2006 - 05:00
Googlebot        169+25       1.97 MB      30 Oct 2006 - 07:28

Funny how I get roughly 5 times as many links to my site (below) from Google as from Yahoo or MSN, despite Google ranking third (above) in bot indexing activity.

Links from an Internet Search Engine
- Google    1112    1112
- Yahoo      256     256
- MSN        210     212
 

Goosemaster

Lifer
Apr 10, 2001
48,775
3
81
Originally posted by: FeuerFrei
Robot            Hits         Bandwidth    Last visit
Inktomi Slurp    1975+1202    19.94 MB     31 Oct 2006 - 05:08
MSNBot           847+251      13.08 MB     31 Oct 2006 - 05:00
Googlebot        169+25       1.97 MB      30 Oct 2006 - 07:28

Funny how I get roughly 5 times as many links to my site (below) from Google as from Yahoo or MSN, despite Google ranking third (above) in bot indexing activity.

Links from an Internet Search Engine
- Google    1112    1112
- Yahoo      256     256
- MSN        210     212


Robots/Spiders visitors (Top 25)
4 different robots*                       Hits    Bandwidth    Last visit
Inktomi Slurp                             62      46.74 KB     30 Oct 2006 - 12:45
Googlebot                                 50      143.58 KB    30 Oct 2006 - 23:55
Unknown robot (identified by 'spider')    1       2.12 KB      20 Oct 2006 - 10:23
Alexa (IA Archiver)                       1       2.79 KB      04 Oct 2006 - 00:27


Pretty small amount for me :D
 

DaveSimmons

Elite Member
Aug 12, 2001
40,730
670
126
Sounds like you need to buy some Old Glory Robot Insurance

or (ironing++) use Google to find out about robots.txt
 

JEDIYoda

Lifer
Jul 13, 2005
33,986
3,321
126
Originally posted by: mzkhadir
If you want robots (or spiders) to stop indexing your site, there are a couple of different ways you can go about this. You can also stop them from indexing individual Web pages or subdirectories, and you can stop robots from specific search engines.

You can stop robots in one of two ways: with either a txt file or a meta tag. Let's use Google as an example. One of Google's more popular robots is called "Googlebot". To stop Googlebot from indexing a particular page you would add the meta tag <META NAME="GOOGLEBOT" CONTENT="NOINDEX, NOFOLLOW">. If you want to stop all robots from indexing a page the meta tag would look like this: <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">. If you wanted a robot or robots excluded from an entire site you would have to put this tag on every page. That would be a serious pain.

The best way to exclude a robot or robots from an entire site is to use a robots.txt file. You place the robots.txt file in the root directory of your site and it tells robots what not to index. You can create this file in Dreamweaver, Notepad, or any application that can save a plain text file. Search engines will request the robots.txt file before indexing your site. As an example, if you didn't want Google to index your site, put the following into a robots.txt file:

User-agent: Googlebot
Disallow: /

Good info...noted!!