FyberSearch    
What is FyberSpider?

FyberSpider is the FyberSearch web crawler. All of the cool search engines have them, so we decided that FyberSearch needed to have one.

FyberSpider downloads web pages, processes their content so it can appear in our search results (a step known as indexing), extracts the links on those pages, and then repeats the same process for each link.
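The download/index/extract/repeat loop above can be sketched as a simple breadth-first crawl. This is an illustrative sketch only, not FyberSpider's actual code; the `fetch` callable is a stand-in for a real HTTP client, and the indexing step is left as a comment.

```python
from collections import deque
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag seen in a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, fetch, limit=100):
    """Breadth-first crawl: download a page, index it, extract its
    links, then repeat the process for each link.

    `fetch` is any callable mapping a URL to its HTML; a real crawler
    would use an HTTP client here (and send its User-Agent header).
    Returns the set of URLs visited.
    """
    queue = deque([start_url])
    visited = set()
    while queue and len(visited) < limit:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        try:
            page = fetch(url)
        except KeyError:  # page unavailable; skip it
            continue
        # (indexing of `page` for search results would happen here)
        parser = LinkExtractor()
        parser.feed(page)
        queue.extend(parser.links)
    return visited
```

For a quick test, `fetch` can simply be `some_dict.__getitem__` over a dict mapping URLs to HTML strings.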

Would you like to know if your website has been crawled by FyberSpider? Check your server logs for the following User-Agent string: User-Agent: FyberSpider (+http://www.fybersearch.com/fyberspider.php)
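One quick way to check is to filter your access log for that string. A minimal sketch, assuming an Apache-style combined log format (the sample lines below are made up for illustration):

```python
# Sample access-log lines in Apache "combined" style; the IPs, dates,
# and paths are illustrative assumptions, not real log data.
sample_log = [
    '1.2.3.4 - - [06/Apr/2006] "GET / HTTP/1.1" 200 512 '
    '"-" "FyberSpider (+http://www.fybersearch.com/fyberspider.php)"',
    '5.6.7.8 - - [06/Apr/2006] "GET /about HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]

def fyberspider_hits(lines):
    """Return only the log lines produced by FyberSpider."""
    return [line for line in lines if "FyberSpider" in line]

hits = fyberspider_hits(sample_log)
```

In practice you would read the lines from your server's log file rather than a hard-coded list.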


Prevent Content from being Crawled

FyberSpider is the only web crawler we are aware of that lets you keep it from crawling specific portions of a web page. Just place the <fybersearch_ignore> HTML tag before the content you would like FyberSpider to ignore, and the closing </fybersearch_ignore> tag after it.
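For example, a page using the tag might look like this (the surrounding markup is illustrative):

```html
<p>This paragraph will be indexed by FyberSpider.</p>
<fybersearch_ignore>
<p>This navigation menu will not appear in FyberSearch results.</p>
</fybersearch_ignore>
<p>Indexing resumes after the closing tag.</p>
```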

Not sure how this works? First, take a look at this example file, then break the URL down: you will see that only the content not within the <fybersearch_ignore></fybersearch_ignore> tags is displayed. If it is still unclear, try looking at the source code of the example file.


Web Crawling Frequency

It depends on how good your web pages taste, but they should never be eaten too quickly. If FyberSpider eats too much, please contact us.


Prevent Web Pages from being Crawled

You can tell FyberSpider not to crawl specific pages on your website. Although FyberSpider may be a little sad, it will move on.... to the next 20 billion pages.

FyberSpider will obey the Robots Exclusion Protocol.
FyberSpider will not obey the robots META tag at this time.

If you would like to use the robots.txt file to tell FyberSpider (or other robots) not to crawl pages of your website, the first step is to create a file named "robots.txt". You should then follow the examples found on this page. If you would like to block FyberSpider instead of all robots from a page, enter "fyberspider" on the User-agent line instead of "*". Finally, upload the file to your server so that it can be viewed from the root of your domain name. Example: www.yourwebsiteaddress.com/robots.txt.
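Putting those steps together, a minimal robots.txt that blocks only FyberSpider from a hypothetical /private/ directory (the path is an example, not anything specific to FyberSearch) would look like this:

```
User-agent: fyberspider
Disallow: /private/
```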


URLs FyberSpider Will Not Crawl

In case you have not noticed, there are a lot of useless web pages out there. We have found that many of these useless pages can be avoided by having FyberSpider ignore any URLs that contain one of the following characters: *, [, ], ?, &, %, #, @, |, (, ), !, +, and , (the comma).
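The filter described above is easy to sketch: reject any URL containing one of the listed characters. The function name here is ours, for illustration, not FyberSearch's actual code.

```python
# Characters that cause FyberSpider to skip a URL, per the list above.
IGNORED_CHARS = set('*[]?&%#@|()!+,')

def should_crawl(url):
    """Return True if the URL contains none of the ignored characters."""
    return not any(ch in IGNORED_CHARS for ch in url)
```

For example, `should_crawl("http://example.com/page.php?id=3")` returns False (it contains "?"), while `should_crawl("http://example.com/articles/3/")` returns True.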

Some of you may think this makes FyberSearch bad. You are probably the same people who would rather avoid using mod_rewrite to turn all your query strings into directory-style URLs. To be honest, we really do not want to waste resources cataloging a few hundred thousand pages of content dynamically generated by your PHP script. But if you think your content is useful enough that you learn how to turn all those query strings into normal-looking URLs, then FyberSpider will consider downloading all that automatically created text.
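One common way to do that rewriting, assuming an Apache server with mod_rewrite enabled (the paths and parameter names here are illustrative), is a rule like this in your .htaccess file:

```
RewriteEngine On
# Serve /articles/123/ from /article.php?id=123 internally,
# so the public URL contains none of the ignored characters.
RewriteRule ^articles/([0-9]+)/$ article.php?id=$1 [L]
```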
 