Need to remove content from Google's index?
Google considers the comprehensiveness of our search results an extremely important priority.
We're committed to providing thorough and unbiased search results for our users. We stop indexing the pages of a site only at the request of the webmaster responsible for
those pages, when the site is spamming our index, or as required by law. This policy is necessary to ensure that pages
aren't inappropriately removed from our index.
Please select an option below for instructions. Removals will take effect the next time Google crawls your site.
Remove your entire website
If you wish to exclude your entire website from Google's index, you can place a file at the
root of your server called robots.txt. This is the standard protocol that most web crawlers observe for excluding
a web server or directory from an index. More information on robots.txt is available here:
http://www.robotstxt.org/wc/norobots.html.
Please note that Googlebot does not interpret a 401/403 response
("Unauthorized"/"Forbidden") to a robots.txt fetch as a request not to crawl any pages on the site.
To remove your site from search engines and prevent all robots from crawling it in the future,
place the following robots.txt file in your server root:
User-agent: *
Disallow: /
To remove your site from Google only and prevent just Googlebot from crawling your site in the future,
place the following robots.txt file in your server root:
User-agent: Googlebot
Disallow: /
Each port must have its own robots.txt file. In particular, if you serve content via both http and https,
you'll need a separate robots.txt file for each of these protocols. For example, to allow Googlebot to
index all http pages but no https pages, you'd use the robots.txt files below.
For your http protocol (http://yourserver.com/robots.txt):
User-agent: *
Allow: /
For the https protocol (https://yourserver.com/robots.txt):
User-agent: *
Disallow: /
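Before deploying a robots.txt file, you can sanity-check its effect with Python's standard-library robotparser. This is a quick local sketch (the hostname and paths are placeholders) using the Googlebot-only exclusion rules shown above:

```python
from urllib import robotparser

# The Googlebot-only exclusion rules shown above, as a string.
rules = """User-agent: Googlebot
Disallow: /
"""

parser = robotparser.RobotFileParser()
parser.parse(rules.splitlines())

# Googlebot is blocked from every path; other crawlers are unaffected.
print(parser.can_fetch("Googlebot", "http://yourserver.com/any/page.html"))     # False
print(parser.can_fetch("SomeOtherBot", "http://yourserver.com/any/page.html"))  # True
```

Because the rules name only Googlebot and contain no `User-agent: *` record, every other crawler remains allowed.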
Note: If
you believe your request is urgent
and cannot wait until the next time Google crawls your site,
use our automatic
URL removal system. In order for this automated
process to work, the webmaster must first create and
place a robots.txt file on the site in question.
Google will continue to exclude your site
or directories from successive crawls if the robots.txt
file exists in the web server root. If you do not have access
to the root level of your server, you may place a robots.txt
file at the same level as the files you want to remove.
Doing this and submitting via the automatic URL removal
system will cause a temporary, 180-day removal of
your site from the Google index, regardless of whether you remove the robots.txt file
after your request is processed.
(Keeping the robots.txt
file at the same level would require you to return to the
URL removal system every 180 days to reissue the removal.)
Remove part of your website
Option 1: Robots.txt
To remove directories or individual pages of your website, you can place a robots.txt file at the root of your server.
For information on how to create a robots.txt file, see
The Robot Exclusion Standard.
When creating your robots.txt file, please keep the following in mind:
When deciding which pages to crawl on a particular host, Googlebot will obey the first record in
the robots.txt file with a User-agent starting with "Googlebot." If no such entry exists, it will
obey the first entry with a User-agent of "*". Additionally, Google has introduced increased
flexibility to the robots.txt file standard through the use of asterisks. Disallow patterns may include
"*" to match any sequence of characters, and patterns may end in "$" to indicate the end of a name.
To remove all pages under a particular directory (for example, lemurs), you'd use the following robots.txt entry:
User-agent: Googlebot
Disallow: /lemurs
To remove all files of a specific file type (for example, .gif), you'd use the following robots.txt entry:
User-agent: Googlebot
Disallow: /*.gif$
To remove dynamically generated pages, you'd use this robots.txt entry:
User-agent: Googlebot
Disallow: /*?
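These extended patterns go beyond the original robots.txt standard (and beyond what, for example, Python's standard-library robotparser implements). As a rough sketch of the matching semantics described above — "*" matching any sequence of characters and a trailing "$" anchoring the end of the name — the patterns can be translated to regular expressions; the function name here is our own illustration:

```python
import re

def disallow_pattern_matches(pattern: str, path: str) -> bool:
    """Check a Google-extended Disallow pattern against a URL path.

    "*" matches any sequence of characters; a trailing "$" anchors the
    pattern to the end of the name. (Illustrative sketch only.)
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape everything except "*", which becomes ".*"; plain patterns
    # keep their usual prefix-match behavior via re.match.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None

print(disallow_pattern_matches("/lemurs", "/lemurs/ringtail.html"))   # True
print(disallow_pattern_matches("/*.gif$", "/images/dogs.gif"))        # True
print(disallow_pattern_matches("/*.gif$", "/images/dogs.gif.html"))   # False
print(disallow_pattern_matches("/*?", "/results?q=lemurs"))           # True
```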
Option 2: Meta tags
Another standard, which can be more convenient for page-by-page use, involves adding a <META> tag to an HTML page
to tell robots not to index the page. This standard is described at
http://www.robotstxt.org/wc/exclusion.html#meta.
To prevent all robots from indexing a page on your site, you'd place the following meta tag into
the <HEAD> section of your page:
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
To allow other robots to index the page on your site, preventing only Google's robots from indexing the page,
you'd use the following tag:
<META NAME="GOOGLEBOT" CONTENT="NOINDEX, NOFOLLOW">
To allow robots to index the page on your site but instruct them not to follow outgoing links, you'd use the following tag:
<META NAME="ROBOTS" CONTENT="NOFOLLOW">
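To verify which robots meta directives a page actually carries, you can scan its HTML with Python's standard-library HTML parser. This is a small illustrative sketch (the class name is our own):

```python
from html.parser import HTMLParser

class RobotsMetaScanner(HTMLParser):
    """Collect the content of any ROBOTS/GOOGLEBOT meta tags on a page."""

    def __init__(self):
        super().__init__()
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":  # tag names arrive lowercased
            attrs = dict(attrs)
            name = (attrs.get("name") or "").lower()
            if name in ("robots", "googlebot"):
                self.directives[name] = (attrs.get("content") or "").lower()

page = '<html><head><META NAME="GOOGLEBOT" CONTENT="NOINDEX, NOFOLLOW"></head></html>'
scanner = RobotsMetaScanner()
scanner.feed(page)
print(scanner.directives)  # {'googlebot': 'noindex, nofollow'}
```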
Note: If you
believe your request is urgent
and cannot wait until the next time Google crawls your site,
use our automatic
URL removal system. In order for this automated
process to work, the webmaster must first insert the appropriate
meta tags into the page's HTML code.
Doing this and submitting via the automatic URL removal system will cause a temporary,
180-day removal of these pages from the Google index, regardless of whether you remove
the robots.txt file or meta tags after your request is processed.
Remove snippets
A snippet is a text excerpt that appears below a page's title in our search results
and describes the content of the page.
To prevent Google from displaying snippets for your page, place this tag in the <HEAD> section of your page:
<META NAME="GOOGLEBOT" CONTENT="NOSNIPPET">
Note: Removing snippets also removes cached pages.
Note: If you
believe your request is urgent
and cannot wait until the next time Google crawls your site,
use our automatic
URL removal system. In order for this automated
process to work, the webmaster must first insert the appropriate
meta tags into the page's HTML code.
Remove cached pages
Google automatically takes a "snapshot" of each page it crawls and archives it. This "cached"
version allows a webpage to be retrieved for your end users if the original page is ever unavailable
(due to temporary failure of the page's web server). The cached page appears to users exactly as it looked when
Google last crawled it, and we display a message at the top of the page to indicate that it's a cached version.
Users can access the cached version by choosing the "Cached" link on the search results page.
To prevent all search engines from showing a "Cached" link for your site, place this tag in
the <HEAD> section of your page:
<META NAME="ROBOTS" CONTENT="NOARCHIVE">
To allow other search engines to show a "Cached" link, preventing only Google from displaying one,
use the following tag:
<META NAME="GOOGLEBOT" CONTENT="NOARCHIVE">
Note: This tag removes only the "Cached" link for the page. Google will continue to index the
page and display a snippet.
Note: If you
believe your request is urgent
and cannot wait until the next time Google crawls your site,
use our automatic
URL removal system. In order for this automated process
to work, the webmaster must first insert the appropriate
meta tags into the page's HTML code.
Remove an outdated ("dead") link
Google updates its entire index automatically on a regular
basis. When we crawl the web, we find new pages, discard dead links, and
update links automatically. Links that are outdated now will most likely
"fade out" of our index during our next crawl.
Note: If you
believe your request is urgent
and cannot wait until the next time Google crawls your site,
use our automatic
URL removal system. We'll accept your removal request only if the page returns a true 404 error via the http headers.
Please ensure that you return a true 404 error even if you choose to display a more user-friendly body of
the HTML page for your visitors. It won't help to return a page that says "File Not Found" if the http
headers still return a status code of 200 (normal).
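You can confirm that a removed URL really returns a 404 in its HTTP headers, not just a "not found" message in the page body, by inspecting the response status directly. The sketch below runs a throwaway local server that serves a friendly error page with a true 404 status, then checks it with http.client (the path is a placeholder):

```python
import http.client
import http.server
import threading

class GonePageHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # A user-friendly body is fine, as long as the status line says 404.
        body = b"<html><body><h1>File Not Found</h1></body></html>"
        self.send_response(404)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet

server = http.server.HTTPServer(("127.0.0.1", 0), GonePageHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/removed-page.html")
status = conn.getresponse().status
print(status)  # 404
server.shutdown()
```

A page whose headers report 200 would fail this check even if its body says "File Not Found."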
Remove an image from Google's Image Search
To remove an image from Google's image index, add a robots.txt file to the root of the server. (If you can't put it in the
server root, you can put it at directory level.)
Example: If you want Google to exclude the dogs.jpg image that appears on your site at www.yoursite.com/images/dogs.jpg,
create a page at www.yoursite.com/robots.txt and add the following text:
User-agent: Googlebot-Image
Disallow: /images/dogs.jpg
To remove all the images on your site from our
index, place the following robots.txt file in your server root:
User-agent: Googlebot-Image
Disallow: /
This is the standard protocol that most web crawlers
observe for excluding a web server or directory from an index. More information
on robots.txt is available here:
http://www.robotstxt.org/wc/norobots.html.
Additionally, Google has introduced increased flexibility to the robots.txt file standard through the use of asterisks.
Disallow patterns may include "*" to match any sequence of characters, and patterns may end in "$" to indicate the
end of a name. To remove all files of a specific file type (for example, to keep .jpg images but exclude .gif images),
you'd use the following robots.txt entry:
User-agent: Googlebot-Image
Disallow: /*.gif$
Note: If you
believe your request is urgent
and cannot wait until the next time Google crawls your site,
use our automatic
URL removal system. In order for this automated
process to work, the webmaster must first create and place
a robots.txt file on the site in question.
Google will continue to exclude your site
or directories from successive crawls if the robots.txt file
exists in the web server root. If you do not have access to
the root level of your server, you may place a robots.txt
file at the same level as the files you want to remove.
Doing this and submitting via the automatic URL removal system will cause a temporary,
180-day removal of the directories specified in your robots.txt file from the Google index,
regardless of whether you remove the robots.txt file after your request is processed.
(Keeping the robots.txt file at the same
level would require you to return to the URL removal system
every 180 days to reissue the removal.)
Remove a blog from Blog Search
Only blogs with site feeds will be included in Blog Search. If you'd like to prevent your feed
from being crawled, make use of a robots.txt file or meta tags (NOINDEX or NOFOLLOW), as described above.
Please note that if you have a feed that was previously included, the old posts will remain in the index even though new
ones will not be added.
Remove an RSS or Atom feed
When users add your feed to their
Google homepage or Google Reader, Google's Feedfetcher attempts to obtain the content of
the feed in order to display it. Since Feedfetcher requests come from explicit
action by human users, Feedfetcher has been designed to ignore robots.txt
guidelines.
It's not possible for Google to restrict access to a publicly available feed.
If your feed is provided by a blog hosting service, you should work with them to restrict access to your feed.
Check those sites' help content for more information (e.g.,
Blogger,
LiveJournal, or
Typepad).
Google Web Search on mobile phones allows users to search all the content in the Google index for desktop web browsers. Because this content isn't written specifically for mobile phones and devices and thus might not display properly, Google automatically translates (or "transcodes") these pages by analyzing the original HTML code and converting it to a mobile-ready format. To ensure that the highest-quality and most usable web page is displayed on your mobile phone or device, Google may resize, adjust, or convert images, text formatting, and/or certain aspects of web page functionality.
To prevent your web page(s) from being transcoded, please send a removal request to mobile-support@google.com.