A. I need my site information changed.
1. My information is outdated.
When you update information on your site, the change does not appear
instantly in Google's index. Rather, Google's index is updated
approximately once a month, after our robots have crawled more than
8 billion web pages. This process is totally automated, so you do not
need to submit updated or outdated links. Changes to your site's
content will be noted by the next crawl. Due to the volume of sites
in our index, we cannot manually update pages on an individual basis.
2. You continue listing an 'old' version of my site.
If we continue to list an 'old' version of your site (e.g. we continue
to list www.my123site.org despite the fact that your site now lives
at www.my456site.org), the links that point to the old site need to
be updated. Since our robots jump from page to page via hyperlinks,
someone must still be linking to the defunct page. Once others correct
their links, we can too. Once your new site is live, you may wish
to place a permanent redirect (using a "301" code in HTTP
headers) on your old site to inform visitors and search engines that
your site has moved.
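As a rough illustration of such a permanent redirect, here is a minimal
sketch using Python's standard http.server module. The new address is
the example name used above; the helper names and the port are
assumptions made for this sketch, and a production site would normally
configure the redirect in its existing web server instead.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Example address from the text above; replace with your real new site.
NEW_SITE = "http://www.my456site.org"


def new_location(path, new_site=NEW_SITE):
    """Map a path requested on the old site to the same path on the new site."""
    return new_site.rstrip("/") + "/" + path.lstrip("/")


class MovedHandler(BaseHTTPRequestHandler):
    """Answer every GET or HEAD request on the old site with a 301."""

    def _redirect(self):
        self.send_response(301)  # 301 = Moved Permanently
        self.send_header("Location", new_location(self.path))
        self.end_headers()

    do_GET = _redirect
    do_HEAD = _redirect


def serve(port=8000):
    """Run the redirecting server (port 8000 is an arbitrary choice)."""
    HTTPServer(("", port), MovedHandler).serve_forever()
```

Calling serve() on the old host would send each visitor and crawler to
the matching path on the new site, so existing links keep working while
other sites update them.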
One way to determine who is linking to the dead site is to try a
link search. You can find instructions on how to do this on our features
page. Please note that this process does not work for all of the
sites in our index.
3. I am changing my URL.
We cannot manually change your listed address at the exact time you
move to your new site. There are steps you can take to make sure that
your transition goes smoothly, however. Google listings are based
in part on our ability to find your site by following links from other
web pages. To preserve your ranking, you will want to inform any sites
that currently link to your pages of your change of address. As long
as the links change as you move your site over to a new location,
your PageRank
should not be adversely affected.
If your site goes unlisted for a time, this does not mean you were
intentionally dropped from our index. Sometimes in these transitions,
we fail to find a site at its new address. Just be sure that others
are linking to you and we should pick you up on our next web crawl.
4. There's no description of my site.
The Google index contains two types of pages--fully indexed and partially
indexed pages. Your page is currently partially indexed, which means
that although we know about your site, our robots have not read all
the content on your page(s) in past crawls. This does not adversely
affect your PageRank or your inclusion in our index. It does mean
that we don't 'know' what to call your page, so it gets listed with
the URL as the title and no description.
We appreciate the frustration this causes webmasters who work hard
to make their sites accessible to users. We are working to increase
the number of fully indexed pages in our search results to alleviate
this problem.
5. The description of my site is wrong in the results.
Site descriptions in Google results are actually quoted from the
web page in question. Google automatically generates different descriptions
based on the search terms used to find the site (these "snippets"
display the search term(s) in the context of the page on which they
appear).
For example, if there is a pet site that deals with cats and dogs,
and someone enters a search for the word 'dog', the site description
on Google will only talk about dogs. If a person searches Google
for 'cat' and the same site is delivered as a result, the description
will be different: it will contain references to the word 'cat' as
it appears on the website.
Google does not display a standard description. We look for the
search terms specified (and in some cases, variations of those terms)
and show snippets of where those terms appear. This is a completely
automated process and editing is not an option. If you alter the
relevant text on the page itself, Google will pick up those changes
during our next crawl in a few weeks.
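To make the idea of a query-dependent description concrete, here is a
toy sketch in Python. It is not Google's snippet algorithm, which is not
described here; the function name, the radius parameter, and the sample
text are all assumptions chosen for illustration.

```python
import re


def snippet(page_text, term, radius=40):
    """Return the text around the first occurrence of `term`, or "" if absent."""
    match = re.search(re.escape(term), page_text, flags=re.IGNORECASE)
    if match is None:
        return ""
    start = max(0, match.start() - radius)
    end = min(len(page_text), match.end() + radius)
    # Mark truncation on either side with an ellipsis.
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(page_text) else ""
    return prefix + page_text[start:end].strip() + suffix


# Hypothetical text from a pet site that covers both cats and dogs.
page = "We groom every dog with care. Our vets also treat each cat gently."
dog_description = snippet(page, "dog", radius=10)
cat_description = snippet(page, "cat", radius=10)
```

The same page yields two different descriptions here, each quoting the
text around the term that was searched for.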
Descriptions of sites in the Google Directory are written by
volunteers who are part of the Open Directory Project. If you'd like
to have your site's description in the directory modified, contact the
ODP by going to the category where it is listed, and filling out the
"update URL" form.
6. I'm in your index, but not listed as a result for keyword "X".
Google does not manually assign keywords to your site, nor do we
manually "boost" the rankings of any site. The ranking process
is completely automated and depends on the relative PageRank
of each result found.
You can read more about our ranking process elsewhere in this FAQ,
but the best way to improve your position in results is to have relevant
content and multiple links from other web sites. If there are certain
keywords you feel are essential to your site's success, you may want
to consider our targeted keyword advertising
program. Google does not sell placement in our results, but we
do have advertising positions available adjacent to them.
B. I need my site information removed.
1. Removing a page from the Google index.
Except in instances involving legal issues or spam, Google's
policy for removing a page from our index requires that we obtain the
permission of that page's webmaster. This prevents competitors from
sabotaging each other's listings. Please have the webmaster for the
page in question contact us with proof that he/she is indeed the webmaster.
This proof must be in the form of a root level page on the site in question,
requesting removal from Google. Once we receive the URL that corresponds
with this root level page, we will remove the offending page from our
index. For more information on this process, please see http://www.google.com/remove.html.
2. I don't want Google to keep a cached version of my page.
Google automatically takes a "snapshot" of each page it crawls and
caches it. This enables us to show the search terms highlighted on
text-heavy pages so users can find relevant information quickly, and
to retrieve pages for users if the site's server temporarily
fails. Users can access the cached version by choosing the "Cached"
link on the search results page. If you do not want your content to
be accessible through Google's cache, you can use the NOARCHIVE meta-tag.
Place this in the <HEAD> section of your documents:
<META NAME="ROBOTS" CONTENT="NOARCHIVE">
This tag will tell robots not to archive the page. Google will continue
to index and follow links from the page, but will not present cached
material to users.
If you want to allow other robots to archive your content, but prevent
Google's robots from caching, you can use the following tag:
<META NAME="GOOGLEBOT" CONTENT="NOARCHIVE">
Note that the change will occur the next time Google crawls the page
containing the NOARCHIVE tag (typically about once a month). If the
change needs to take effect sooner than this, the site owner must
contact us and request immediate removal of archived content. Also,
the NOARCHIVE directive only controls whether the cached page is
shown. To control whether the page is indexed, use the NOINDEX
tag; to control whether links are followed, use the NOFOLLOW tag. See
the Robots Exclusion page for more information.
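If you want to confirm which of these directives a page actually
carries, a short script can read them back. The following is a minimal
sketch using Python's standard html.parser module; the class and
function names are assumptions made for this example, and it only
reports what the tags say, not how any crawler will act on them.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from ROBOTS and GOOGLEBOT META tags."""

    def __init__(self):
        super().__init__()
        self.directives = {}  # tag name -> set of lowercase directives

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        name = attr_map.get("name", "").lower()
        if name in ("robots", "googlebot"):
            content = attr_map.get("content", "")
            values = {v.strip().lower() for v in content.split(",") if v.strip()}
            self.directives.setdefault(name, set()).update(values)


def directives_for_googlebot(html):
    """Return every directive addressed to all robots or to Googlebot."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = parser.directives
    return found.get("robots", set()) | found.get("googlebot", set())
```

For a page whose HEAD contains the first tag shown above,
directives_for_googlebot returns a set containing only "noarchive".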
3. I don't want Google to crawl part or all of my site.
There is a standard method for excluding robot crawlers that involves
a "robots.txt" file. It lets you prevent Googlebot or other crawlers
from visiting part or all of your site. Googlebot has a user-agent of
"Googlebot". In addition, Googlebot understands
some extensions to the robots.txt standard: Disallow
patterns may include * to match any sequence of
characters, and patterns may end in $ to indicate that
the pattern must match the end of a name. For
example, to prevent Googlebot from crawling files
that end in .gif, you may use the
following robots.txt entry:
User-agent: Googlebot
Disallow: /*.gif$
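To see how the * and $ extensions behave, here is a minimal sketch in
Python that translates a Disallow pattern into a regular expression. It
follows only the two rules described above; the function names are
assumptions for this example, and Googlebot's own matching code is not
shown here and may differ in details.

```python
import re


def compile_disallow(pattern):
    """Compile a Disallow pattern with the * and $ extensions into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with "any sequence".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    # \Z pins the match to the very end of the path when the pattern ended in $.
    return re.compile(body + (r"\Z" if anchored else ""))


def is_disallowed(path, pattern):
    """True if `path` is blocked by `pattern`, matched from the start of the path."""
    if not pattern:
        return False  # an empty Disallow value excludes nothing
    return compile_disallow(pattern).match(path) is not None
```

With the entry above, is_disallowed("/images/logo.gif", "/*.gif$") is
true, while "/images/logo.gif?size=2" is not blocked because the path
does not end in .gif.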
There is another standard for telling robots not to index a particular
web page or follow links on it, which may be more helpful, since it
can be used on a page-by-page basis. This method involves placing
a "META" element into a page of HTML.
Remember, changing your server's robots.txt file or changing the
"META" elements on its pages will not cause an immediate
change in what results Google returns. It is likely that it will take
a while for any changes you make to propagate to Google's next index
of the web.
4. Googlebot is asking for robots.txt. Why?
Robots.txt is a standard document that can tell Googlebot not to
download some or all information from your web server. For information
on how to create a robots.txt file, see The Robot Exclusion Standard.
5. Googlebot is trying to download incorrect links from my server.
It is a property of the web that many links will be broken or
outdated at any given time. Whenever anyone mistypes a link that
points to your site, or fails to update their pages to reflect
changes on your server, Googlebot will try to download an incorrect
link from your site. This is also why you may get hits on a machine
that is not even a web server.
6. Googlebot is downloading information from our "secret" web server.
It is almost impossible to keep a web server secret
by not publishing any links to it. As soon as someone follows a link
from your "secret" server to another web server, it is likely
that your "secret" URL is sent in the HTTP Referer header, and it
can be stored and possibly published by the other web server in its
referer log. So, if there is a link to your "secret" web server or
page on the web anywhere, it is likely that Googlebot and other "web
crawlers" will find it.
7. Googlebot isn't obeying my robots.txt file.
In order to save bandwidth, Googlebot only downloads
the robots.txt file once a day or whenever we have fetched many pages
from the server. So, it may take a while for Googlebot to learn of any
changes that might have been made to your robots.txt file. Also, Googlebot
is distributed on several machines. Each of these keeps its own record
of your robots.txt file. Finally, you may want to check that your syntax
is correct against the standard at: http://www.robotstxt.org/wc/norobots.html.
If there still seems to be a problem, please let us know, and we will
correct it.
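One way to sanity-check basic rules locally is Python's standard
urllib.robotparser module. The sketch below is only an approximation:
the rules and URLs are hypothetical, and this parser follows the
original standard, so it may not interpret the * and $ extensions the
way Googlebot does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content used only for this example.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/
"""


def googlebot_may_fetch(robots_txt, url):
    """Parse robots.txt text and report whether Googlebot may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)
```

A call such as googlebot_may_fetch(ROBOTS_TXT,
"http://www.my456site.org/private/notes.html") shows whether the rule
you wrote blocks the page you meant it to block.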
8. I see hits from multiple machines at Google.com, all with user-agent Googlebot.
Googlebot was designed to be distributed on several
machines to improve performance and scale as the web grows. Also, to
cut down on bandwidth usage, we run multiple crawlers on machines that
are close, in network terms, to the sites they are indexing.
9. Googlebot is crawling my site too fast.
Please send an email to googlebot@google.com
with the name of your site and a detailed description of the problem.
Please also include a portion of your web server log that shows
Google accesses, so we can track down the problem more quickly on our
end.
10. I don't want Google to index non-HTML file types on my site.
To disallow a specific file type, simply modify the Disallow command
in your robots.txt file. This works for all of the types of files
Googlebot crawls, including HTML, GIF and .doc files. For example, to
disallow Microsoft Word files with the ".doc" extension, you would
add the following lines to your robots.txt file:
User-agent: Googlebot
Disallow: /*.doc$