robotstxt - Usage Instructions

Usage

Prerequisites

These instructions assume that you have an HST project based on the HST archetype, i.e. a multi-module Maven project consisting of three submodules: cms, site and content.

Installation

  1. Add to the pom.xml of your cms module:
        <dependency>
          <groupId>org.onehippo.forge</groupId>
          <artifactId>robotstxt-addon-repository</artifactId>
          <version>1.01.00</version>
        </dependency>
  2. Add to the pom.xml of your site module:
        <dependency>
          <groupId>org.onehippo.forge</groupId>
          <artifactId>robotstxt-hst-client</artifactId>
          <version>1.01.00</version>
        </dependency>
  3. Copy robotstxt.jsp to src/main/webapp/jsp/templates/ in your site module.
  4. Add the following bean classes to beans-annotated-classes.xml:
    <annotated-class>org.onehippo.forge.robotstxt.annotated.Robotstxt</annotated-class>
    <annotated-class>org.onehippo.forge.robotstxt.annotated.Section</annotated-class>
  5. Rebuild your project using Maven.
  6. In the CMS, create a file robots.txt and add your configuration, for example:
    User-agent: *
    Disallow: /donotindex
  7. In the HST editor, add the configuration for robots.txt:
    • Add a template "robotstxt" with renderpath jsp/templates/robotstxt.jsp
    • Add a Web Page Design "robotstxt" with componentclassname "org.onehippo.forge.robotstxt.components.RobotstxtComponent" and template "robotstxt"
    • Add a URL Design with Level pattern "robots.txt", Content path <path to the robots.txt file>, Template "robotstxt"

Admire the result

Open the robots.txt URL of your site in your browser, for example http://localhost:8085/site/preview/robots.txt. Depending on the rules you configured in the CMS, the output looks similar to this:

User-agent: *
Disallow: /donotindex/
Disallow: /search/

User-agent: Googlebot
Disallow: /hide/for/googlebot/

User-agent: EvilBot
Disallow: /
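To sanity-check output like the above, it can help to see how a crawler might read it. The following is a minimal sketch, not part of the robotstxt plugin: the class RobotsTxtCheck and its methods parse and isDisallowed are hypothetical names introduced here for illustration. It assumes simple prefix matching, exact user-agent names, and one User-agent line per section; real crawlers apply more elaborate rules.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the robotstxt plugin.
class RobotsTxtCheck {

    /** Parses robots.txt text into a map of user-agent name to disallowed path prefixes. */
    static Map<String, List<String>> parse(String text) {
        Map<String, List<String>> sections = new LinkedHashMap<>();
        List<String> current = null;
        for (String rawLine : text.split("\\r?\\n")) {
            String line = rawLine.trim();
            int colon = line.indexOf(':');
            if (line.isEmpty() || line.startsWith("#") || colon < 0) {
                continue; // skip blank lines, comments and malformed lines
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                // Simplification: every User-agent line starts (or reopens) its own section.
                current = sections.computeIfAbsent(value, k -> new ArrayList<>());
            } else if (field.equals("disallow") && current != null && !value.isEmpty()) {
                current.add(value);
            }
        }
        return sections;
    }

    /** Checks a path against the agent's own section, or the "*" section if it has none. */
    static boolean isDisallowed(Map<String, List<String>> sections, String agent, String path) {
        List<String> rules = sections.containsKey(agent) ? sections.get(agent) : sections.get("*");
        if (rules == null) {
            return false; // no applicable section: everything is allowed
        }
        for (String prefix : rules) {
            if (path.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String robots = String.join("\n",
                "User-agent: *",
                "Disallow: /donotindex/",
                "Disallow: /search/",
                "",
                "User-agent: Googlebot",
                "Disallow: /hide/for/googlebot/",
                "",
                "User-agent: EvilBot",
                "Disallow: /");
        Map<String, List<String>> sections = parse(robots);
        System.out.println(sections.keySet());                                    // [*, Googlebot, EvilBot]
        System.out.println(isDisallowed(sections, "SomeBot", "/search/results")); // true
        System.out.println(isDisallowed(sections, "Googlebot", "/search/"));      // false
        System.out.println(isDisallowed(sections, "EvilBot", "/index.html"));     // true
    }
}
```

Note that in this sketch Googlebot is not blocked from /search/: an agent with its own section uses only that section and does not inherit the rules listed under "*".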