Feedfetcher is how Google grabs RSS or Atom feeds when users choose to add them to their Google homepage or Google Reader. Feedfetcher collects and periodically refreshes these user-initiated feeds, but does not index them in Blog Search or Google's other search services (feeds only appear in our search results if they've been crawled by Googlebot). Find answers below to some of the most commonly asked questions about how this user-controlled feed grabber works.
For detailed information about how to prevent Feedfetcher or Googlebot from accessing all or part of your site, please refer to our Removals page.
Frequently Asked Questions
- How do I add my feed to the search results for Google's personalized homepage or Google Reader?
- How do I request that Google not retrieve some or all of my site's feeds?
- How often will Feedfetcher retrieve my feeds?
- Feedfetcher is retrieving my site's feeds too frequently. What can I do?
- Why is Feedfetcher trying to download incorrect links from my server, or from a server that doesn't exist?
- Why is Feedfetcher downloading information from our "secret" web server?
- Why isn't Feedfetcher obeying my robots.txt file?
- Why are there hits from multiple machines at Google.com, all with user-agent Feedfetcher?
- Can you tell me the IP addresses from which Feedfetcher makes requests so that I can filter my logs?
- Why is Feedfetcher downloading the same page on my site multiple times?
- Why don't the feeds from my site that Feedfetcher requested show up in your index?
- What kinds of links does Feedfetcher follow?
- My Feedfetcher question isn't answered here. Where can I get more help?
Answers
1. How do I add my feed to the search results for Google's personalized homepage or Google Reader?
Feedfetcher requests are all user-initiated, so Feedfetcher does not index feeds or add them to search results for Google services. The feeds that appear in search results are those crawled by Googlebot. Googlebot uses feed autodiscovery to find public feeds. Learn how to add autodiscovery tags to your site.
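To check whether a page advertises its feeds, you can look for the autodiscovery tags yourself. The sketch below is only an illustration, not a description of how Googlebot works: it assumes the conventional markup, a link element with rel="alternate" and an RSS or Atom MIME type, and the names FeedLinkFinder, find_feeds, and SAMPLE_PAGE, along with the example.com URL, are hypothetical.

```python
from html.parser import HTMLParser

# Assumption: feeds are advertised with the conventional autodiscovery
# markup, <link rel="alternate" type="application/rss+xml" href="...">.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collects the href of every feed autodiscovery tag in a page."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None (e.g. a bare attribute), so normalize.
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        is_feed_type = attr.get("type", "").lower() in FEED_TYPES
        if "alternate" in rels and is_feed_type and attr.get("href"):
            self.feeds.append(attr["href"])


def find_feeds(html):
    """Return the feed URLs advertised by the given HTML text."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


# Hypothetical page used only for illustration.
SAMPLE_PAGE = """<html><head>
<title>Example blog</title>
<link rel="alternate" type="application/rss+xml" title="Example RSS" href="http://www.example.com/feed.xml">
<link rel="stylesheet" href="/style.css">
</head><body></body></html>"""

if __name__ == "__main__":
    for url in find_feeds(SAMPLE_PAGE):
        print(url)
```

If find_feeds returns an empty list for one of your pages, that page has no autodiscovery tag in the form this sketch looks for.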
2. How do I request that Google not retrieve some or all of my site's feeds?
Because all Feedfetcher requests are user-initiated, Feedfetcher does not follow the robots.txt guidelines that apply to robots. For detailed instructions about how to prevent Feedfetcher from requesting all or part of your site, please see our Removals page.
Note: Feedfetcher is not related to the Blog Search index. If you'd like to exclude your feed from Blog Search, turn off syndication for that feed. Blog Search indexes your feed by pinging a syndication server. Learn more about Blog Search.
3. How often will Feedfetcher retrieve my feeds?
Feedfetcher shouldn't retrieve feeds from most sites more than once every hour on average. Some frequently updated sites may be refreshed more often. Note, however, that due to network delays, it's possible that Feedfetcher may briefly appear to retrieve your feeds more frequently.
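If you want to see how often Feedfetcher actually requests your feeds, you can measure it from your own access log. The following is a rough sketch, assuming a log whose timestamps use the common bracketed form such as [10/Oct/2006:13:00:00 -0700]; the function names are hypothetical, and your server's log format may differ.

```python
import re
from datetime import datetime

# Assumption: each log line carries a timestamp in the common bracketed form,
# e.g. [10/Oct/2006:13:00:00 -0700]. Adjust the pattern for other formats.
TIMESTAMP = re.compile(r"\[([^\]]+)\]")
TIMESTAMP_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def fetch_times(lines, token="Feedfetcher-Google"):
    """Return the sorted timestamps of log lines that mention the token."""
    times = []
    for line in lines:
        if token not in line:
            continue
        match = TIMESTAMP.search(line)
        if match:
            times.append(datetime.strptime(match.group(1), TIMESTAMP_FORMAT))
    return sorted(times)


def average_interval_minutes(times):
    """Average gap between consecutive fetches, in minutes (None if < 2)."""
    if len(times) < 2:
        return None
    span_seconds = (times[-1] - times[0]).total_seconds()
    return span_seconds / (len(times) - 1) / 60
```

An average well under 60 minutes over a long stretch of the log, rather than over a brief burst, is the kind of detail worth including when you report a problem.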
4. Feedfetcher is retrieving my site's feeds too frequently. What can I do?
Please contact us with the URL of your site and a detailed description of the problem. Please also include a portion of the web server access log that shows Google accesses so we can find the problem quickly.
5. Why is Feedfetcher trying to download incorrect links from my server, or from a server that doesn't exist?
Feedfetcher retrieves feeds at the request of users who have added them to their Google homepage. It is possible that a user has requested a feed URL that does not exist.
6. Why is Feedfetcher downloading information from our "secret" web server?
Feedfetcher retrieves feeds at the request of users who have added them to their Google homepage or Google Reader. It is possible that the request came from a user who knows about your "secret" server or typed it in by mistake. If you'd like to prevent Feedfetcher from requesting all or part of your site, please see the detailed instructions on our Removals page.
7. Why isn't Feedfetcher obeying my robots.txt file?
Feedfetcher retrieves feeds only after users have explicitly added them to their Google homepage or Google Reader. Feedfetcher behaves as a direct agent of the human user, not as a robot, so it ignores robots.txt entries. Feedfetcher does have one special advantage, though: because it's acting as the agent of multiple users, it conserves bandwidth by making requests for common feeds only once for all users.
For more information about robots.txt files, please see the Robots FAQ.
8. Why are there hits from multiple machines at Google.com, all with user-agent Feedfetcher?
Feedfetcher was designed to be distributed across several machines to improve performance and to scale as the web grows. To cut down on bandwidth usage, the machines are often located close, in network terms, to the sites they're retrieving.
9. Can you tell me the IP addresses from which Feedfetcher makes requests so that I can filter my logs?
The IP addresses used by Feedfetcher change from time to time. The best way to identify accesses by Feedfetcher is to use its identifiable user-agent: Feedfetcher-Google.
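Because the IP addresses change, filtering on the user-agent is the more dependable approach. Here is a minimal sketch of such a filter, assuming your server writes the "combined" log format in which the user-agent is the last double-quoted field on each line; the function names are hypothetical.

```python
import re

# Assumption: logs use the "combined" format, where the user-agent is the
# last double-quoted field on each line.
UA_FIELD = re.compile(r'"([^"]*)"\s*$')
FEEDFETCHER_TOKEN = "Feedfetcher-Google"


def is_feedfetcher(log_line):
    """True if the line's user-agent field contains the Feedfetcher token."""
    match = UA_FIELD.search(log_line)
    return bool(match) and FEEDFETCHER_TOKEN in match.group(1)


def split_log(lines):
    """Separate log lines into (feedfetcher_lines, other_lines)."""
    feedfetcher, other = [], []
    for line in lines:
        (feedfetcher if is_feedfetcher(line) else other).append(line)
    return feedfetcher, other
```

Matching on the user-agent field rather than anywhere in the line avoids false positives when the token happens to appear in a requested URL or a referrer.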
10. Why is Feedfetcher downloading the same page on my site multiple times?
In general, Feedfetcher should download only one copy of each file from your site during a given feed retrieval. Very occasionally, the machines are stopped and restarted, which may cause Feedfetcher to retrieve pages it has recently visited a second time.
11. Why don't the feeds from my site that Feedfetcher requested show up in your index?
Feedfetcher retrieves feeds only at the request of users who have added the feeds to their Google homepage or Google Reader; it is not retrieving content to be added to Google's search index, so any content it retrieves won't show up there unless it has also been requested by Googlebot.
12. What kinds of links does Feedfetcher follow?
Unlike normal web crawlers, Feedfetcher doesn't follow links at all; instead, it acts on the requests given to it by users of Google's personalized homepage.
13. My Feedfetcher question isn't answered here. Where can I get more help?
If you're still having trouble, feel free to contact us here.