Part 3: Anatomy of the pipeline

13 February 05

With Lenya installed, you're itching to figure out how to get around this thing so you can start creating a website. But you want to put it all together the way you're used to. Well, bring out the frozen dinners, because this is not your momma's cooking. Instead of opening up a file, adding our HTML, and throwing in some CSS, we'll be working with pipelines, XML, and XSLT to get the job done.

What are pipelines?

Pipelines aren't a Lenya thing - they're a Cocoon thing. If you want to master Lenya, you'll have to get your head around Cocoon, and that's not an easy thing. Let me remind you that I don't write these articles because I'm an expert in Lenya or Cocoon. Far from it. But I have learned some things along the way that I see being asked over and over, so I think it's important to keep the open-source spirit alive and document what I've learned.

OK, so back to pipelines. Pipelines are one of several kinds of items found in your publication's sitemap, alongside components, views, resources, etc. We'll save all those for another time. A pipeline is a way to match a request coming to your publication and act on it in some way. So, for example, if you are accessing a particular page within your publication using your web browser, a pipeline can find a match for that request and possibly send back a page for you to view.

Minimum Requirements

For a pipeline to work, you'll need to match an incoming request, generate something to be used, and then send it back in a format that is recognizable and can be dealt with easily. Here's an example pipeline:

1. <map:pipeline>
2. <map:match pattern="example">
3. <map:generate type="file" src="example.xml"/>
4. <map:serialize type="xml"/>
5. </map:match>
6. </map:pipeline>

Let's walk through this line by line. Line 1 starts the definition of the pipeline. Everything starts with "map:" because all of this XML is part of the map namespace defined by Cocoon. Line 2 tries to match an incoming request to see if it looks like "example". If it does, we keep going. If not, this pipeline gets skipped over. Line 3 is the generator. In this case, we'll be generating "stuff" from a file, and that file is example.xml. So, what do we do with this stuff inside the file? Well, we need to send it back to the user making the request for "example" in something they (or it) can understand. In line 4, we're doing just that by using a serializer to take all that stuff in example.xml and send it back to the user as XML.

How about something more usable?

OK, admittedly, that was a boring example. The file was already XML, so the only thing the serializer did was probably add the XML declaration at the top of the page and send it back to the user. Let's add in some spice and relate it to web pages:

1. <map:pipeline>
2. <map:match pattern="test.html">
3. <map:generate type="file" src="test.xml"/>
4. <map:transform type="xslt" src="test2html.xsl"/>
5. <map:serialize type="html"/>
6. </map:match>
7. </map:pipeline>

OK, so here, we're trying to match the request for test.html. Now, keep in mind, it could very well be that test.html doesn't exist (and in this case it doesn't). That's OK - we're just matching requests for something, and in return, we can send back whatever we like. Think of it as a virtual link to another file we're creating on the fly.

So, if we do match test.html, we'll grab the contents of the file test.xml, but before we send it back, we'll transform that XML into something else using the transformer (line 4). Using XSLT, we can convert that batch of XML into an HTML page! That's done using the file test2html.xsl. When that's all said and done, off we go to serialize it back to the user, but this time as HTML instead of XML.

I won't have time to show you how the XSL transformation works, but I can throw you over to W3Schools and they'll give you a nice intro.
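
To give you a rough idea anyway, here's a minimal sketch of what the two files could look like. The contents of test.xml are purely an assumption on my part (the page, title, and para elements are made up for illustration); only the file names come from the pipeline above.

```xml
<!-- test.xml: hypothetical input, element names invented for this sketch -->
<page>
  <title>Hello from Cocoon</title>
  <para>This text started life as plain XML.</para>
</page>
```

```xml
<?xml version="1.0"?>
<!-- test2html.xsl: turns the hypothetical page document into HTML -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The page element becomes the HTML skeleton -->
  <xsl:template match="/page">
    <html>
      <head>
        <title><xsl:value-of select="title"/></title>
      </head>
      <body>
        <h1><xsl:value-of select="title"/></h1>
        <xsl:apply-templates select="para"/>
      </body>
    </html>
  </xsl:template>

  <!-- Each para becomes an HTML paragraph -->
  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>

</xsl:stylesheet>
```

With those two files next to the sitemap, a request for test.html should come back as an HTML page with a heading and one paragraph.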

The Lenya pipeline

So now that we know the basics, how does Lenya use the pipeline in creating its pages? Well, it's not too much different. While it looks more complicated, the basics are still there.

Below is the pipeline that is used in the publication-sitemap.xmap file in Lenya's default publication:

1. <map:pipeline>
2. <!-- /lenyabody-{rendertype}/{publication-id}/{area}/{doctype}/{url} -->
3. <map:match pattern="lenyabody-*/*/*/*/**">
4. <map:aggregate element="cmsbody">
5. <map:part src="cocoon://navigation/{2}/{3}/breadcrumb/{5}.xml"/>
6. <map:part src="cocoon://navigation/{2}/{3}/tabs/{5}.xml"/>
7. <map:part src="cocoon://navigation/{2}/{3}/menu/{5}.xml"/>
8. <map:part src="cocoon://navigation/{2}/{3}/search/{5}.xml"/>
9. <map:part src="cocoon:/lenya-document-{1}/{3}/{4}/{page-envelope:document-path}"/>
10. </map:aggregate>
11. <map:transform src="xslt/page2xhtml-{4}.xsl">
12. <map:parameter name="root" value="{page-envelope:context-prefix}/{2}/{3}"/>
13. <map:parameter name="url" value="{5}"/>
14. <map:parameter name="document-id" value="{page-envelope:document-id}"/>
15. <map:parameter name="document-type" value="{page-envelope:document-type}"/>
16. </map:transform>
17. <map:select type="parameter">
18. <map:parameter name="parameter-selector-test" value="{1}"/>
19. <map:when test="view">
20. <map:transform type="link-rewrite"/>
21. </map:when>
22. </map:select>
23. <map:serialize type="xml"/>
24. </map:match>
25. </map:pipeline>

OK, yikes, I know what you're thinking. But seriously, it's not that bad. We still open with a pipeline tag, we match something, we have this aggregation part which I'll explain in a minute, we transform the results, and after another part I'll explain, we serialize the results back to the user. Let's start with the matcher.

The Matcher

So, um, what exactly are we matching? Without going into too much and getting you swamped with terminology, we're basically trying to match a whole bunch of things at once. The comment right above the matcher tries to tell you what each of the asterisks stands for. There's the rendertype (whether you're viewing the page, or editing it), the publication ID (which in this case is "default" for the Default Publication), the area (it could be Authoring, or Live, or Admin, etc.), the document type, and the actual URL of the document.

The document type is pretty interesting. In XML, one document could be for describing a shape, while another could be describing a set of books. It doesn't have to be that way, though. For example, the "homepage" and "xhtml" doctypes provided for you in Lenya are exactly the same, except there's another pipeline that says the top index.html page of the publication will be assigned the doctype of "homepage". It's handy because most homepages have a different design than the secondary or tertiary pages, so you can use a different XSLT file to transform the homepage however you want without having to set up a new pipeline for it. Just think of the possibilities with different doctypes...

The Aggregator

So, after we've matched all of those options (the asterisks mean anything and everything), we get to this aggregate tag. Basically, it's a generator like we saw in the previous examples, except that it aggregates the results of the generation from all these sources together as one.
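
As a rough picture of what comes out of the aggregator: the element attribute names the wrapper, so everything ends up under a single cmsbody root. I haven't listed the real element names each part returns, so the children below are only placeholders.

```xml
<!-- Rough shape of the aggregated document; the children are placeholders -->
<cmsbody>
  <!-- root element returned by the breadcrumb part -->
  <!-- root element returned by the tabs part -->
  <!-- root element returned by the menu part -->
  <!-- root element returned by the search part -->
  <!-- root element of the page content (the lenya-document part) -->
</cmsbody>
```

The transformer that follows the aggregator sees this one combined document, which is why a single XSL file can lay out the navigation and the content together.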

So, Lenya separates out the menu (or navigation of the site), the tabs (all the high-level items in the navigation), the breadcrumb trails on the pages, the search box, and the actual content of the page into separate files. See all those {2}'s, {3}'s, and {5}'s? Each one of those points to the value for that numbered asterisk in the matcher. So, whatever happened to have been in the second asterisk in the matcher, we use that in place of {2}. Simple, no?
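
Here's the substitution worked through for one made-up request, assuming a page at some/where.html in the Authoring area of the default publication. Only the request itself is invented; the rest is the pipeline above with the numbers filled in.

```xml
<!-- Hypothetical request matched by pattern="lenyabody-*/*/*/*/**":
       lenyabody-view/default/authoring/xhtml/some/where.html
     {1} = view    {2} = default    {3} = authoring
     {4} = xhtml   {5} = some/where.html
     A single * stops at a slash, while ** also matches slashes,
     which is how {5} can hold a whole path. -->
<map:aggregate element="cmsbody">
  <map:part src="cocoon://navigation/default/authoring/breadcrumb/some/where.html.xml"/>
  <map:part src="cocoon://navigation/default/authoring/tabs/some/where.html.xml"/>
  <map:part src="cocoon://navigation/default/authoring/menu/some/where.html.xml"/>
  <map:part src="cocoon://navigation/default/authoring/search/some/where.html.xml"/>
  <!-- the page-envelope value is filled in by Lenya, so it is left as-is here -->
  <map:part src="cocoon:/lenya-document-view/authoring/xhtml/{page-envelope:document-path}"/>
</map:aggregate>
```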

The Transformer

The transformer is pretty straightforward. We transform the results of all the aggregated content using the file page2xhtml-{4}.xsl. Except, the {4} is replaced with whatever was in the place of the fourth asterisk, or in this case, the doctype. So, if our doctype was "homepage", we would transform our page using the XSL file page2xhtml-homepage.xsl. If our doctype were "xhtml" (which is what most of the pages are in Lenya), then we would transform it with page2xhtml-xhtml.xsl. See how you can differentiate the design with different transformations and doctypes?

The parameters inside of the transform tag are basically setting up variables to be passed to the XSL file. In this case, we're passing along the root location of the publication (perhaps it's just a /, for example), the URL of the page within the publication (like "some/where.html"), as well as the document ID (the latter half of the URL without the .html extension), and the doctype.
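
On the XSL side, those values arrive as top-level parameters with the same names. The sketch below is not Lenya's actual page2xhtml stylesheet; the parameter names come from the pipeline, but the current-page-link template is something I made up to show a parameter being used.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Names must match the map:parameter names in the sitemap -->
  <xsl:param name="root"/>
  <xsl:param name="url"/>
  <xsl:param name="document-id"/>
  <xsl:param name="document-type"/>

  <!-- Hypothetical helper: builds a link to the current page from root and url -->
  <xsl:template name="current-page-link">
    <a href="{$root}/{$url}">This page</a>
  </xsl:template>

</xsl:stylesheet>
```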

The Selector

We haven't seen this one yet, but think of the selector as an if/else statement. You have to tell the selector what you are testing against, then test it against some value, and do something. In this case, we're testing the rendertype (that's the first asterisk, or {1}). If the rendertype is "view", as in we're viewing the page and not editing it, then go through one more transformation, called link-rewrite.

The link-rewrite transformer basically checks where you are, then goes through the contents of the page and rewrites all links in relation to what area you are in. For example, if I am in the Authoring environment in Lenya, then my links could be rewritten to look like "/lenya/default/authoring/some/where.html". If I am in the Live area, they would be rewritten to look like "/lenya/default/live/some/where.html". That way, you just keep track of the organization of the site using the Site tab within Lenya, and Lenya will rewrite your links according to what area you are in when viewing the page so that it all just works!
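
To make the if/else comparison concrete, here's the same selector with an "else" branch added. Lenya's pipeline above doesn't have one; the map:otherwise part is only there for illustration, and what goes inside it is up to you.

```xml
<map:select type="parameter">
  <!-- the value being tested: the rendertype from the first asterisk -->
  <map:parameter name="parameter-selector-test" value="{1}"/>
  <map:when test="view">
    <!-- the "if" branch: runs only when {1} is "view" -->
    <map:transform type="link-rewrite"/>
  </map:when>
  <map:otherwise>
    <!-- the "else" branch (hypothetical): steps for any other rendertype -->
  </map:otherwise>
</map:select>
```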

The Serializer

In the end, we serialize everything we've done into XML. So, why XML? Because it's later on in the series of pipelines that the results are serialized again into HTML (or XHTML, if you so choose).
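
As a simplified sketch of that later step (not the real thing; Lenya's actual matcher is more involved, and the publication, area, and doctype below are hard-coded assumptions), an outer pipeline could call the lenyabody pipeline and serialize its result as HTML:

```xml
<!-- Hypothetical outer matcher that reuses the lenyabody pipeline -->
<map:match pattern="**.html">
  <!-- cocoon:/ asks the current sitemap to run another of its pipelines -->
  <map:generate src="cocoon:/lenyabody-view/default/live/xhtml/{1}.html"/>
  <!-- this time the result goes out to the browser as HTML -->
  <map:serialize type="html"/>
</map:match>
```

Because the inner pipeline hands back XML, the outer one is free to add more steps before the final serializer.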


So hopefully that gets you cracking on understanding how Lenya is set up to handle pages. There's no doubt I've exposed you to quite a bit that deserves more explanation, and it will certainly come. For now, don't hesitate to post questions.


Pravni - Mar 25, 6:15am

Hi Jon, will you please guide me on how I can access sitetree.xml from any of the XSL files throughout the site? For example, I need access to sitetree.xml from page2xhtml.xsl directly instead of indirect access through tabs.xsl, menu.xsl, and breadcrumb.xsl. Please tell me what changes I have to make to access the nodes of the sitetree from any of the XSL files within the publication. Your guidance will be highly appreciated. Thanks in advance.

Jon Linczak - Mar 26, 11:55pm

Hi Pravni,

Hmm, well, I’ve never tried this myself, but based on the pipeline and how it works, I think you can point directly to the sitetree.xml file as one of the aggregated content pieces. For example, inside of the “map:aggregate” tag, you can add a part that points to the full location of the sitetree.xml file. The problem is how to access it in your XSL file.

You see, in the tabs.xsl and breadcrumbs.xsl files, you can access nodes inside those files by referencing the HTML tag used. I don’t know if you can just access the nodes in the sitetree file by referencing the root node, the “site” tag. That’s something you’ll have to experiment with.

But let me ask you this: why do you need access directly to the sitetree file? Perhaps by knowing the answer to this we can come up with a better solution? Hope my explanation helps some.


Pravni - Mar 28, 7:24am

Hi Jon, thanks for your reply. To provide an interface (i.e. look and feel) that is as flexible as required, I just wanted to know how flexible Apache Lenya is. For the sake of using it, I just wanted to know if we can access a node label (e.g. home) from sitetree.xml in menu2xhtml.xsl, where the workflow state and user are shown. I think by knowing this I will understand how I can use the underlying technology to get the required work done. I have tried the aggregate and part approach, but it doesn't seem to work, or maybe I am doing it the wrong way. Your help will definitely be a plus for me. Thanks in advance.

Pravni - Mar 29, 5:42am

Hi Jon, on this page you have discussed the code for publication-sitemap.xmap. Please help me understand the meaning and usage of the line…..

First of all, what does “cocoon://” mean, and where is it declared? There is no such path as “navigation/pubs/authoring/breadcrumb/secondpage.xml”, so what does this path indicate? And there are no XML files like those indicated in tabs/{5}.xml, menu/{5}.xml, and breadcrumb/{5}.xml, so how is the content from tabs.xsl, menu.xsl, and breadcrumb.xsl gathered? Your valuable comments are appreciated. Thanks in advance.

Pravni - Mar 29, 5:46am

Hi Jon, the line missing from my previous message is “<map:part src='cocoon://navigation/{2}/{3}/breadcrumb/{5}.xml'/>”

smita padole - Mar 31, 2:10am

It’s very nice to read this document. It helped me a lot to understand Lenya’s architecture.

Pravni - Apr 2, 2:31am

Hello Jon,

One more doubt: I am using aggregation whose parts are two XML files that have the same node hierarchy, e.g.
<html>
<body>
</body>
</html>
How can we differentiate, in the XSL, the two different body sections from the two XML files? I mean, can we use something like

Body 1
<xsl:value-of select="html/body[1]"/>
Body 2
<xsl:value-of select="html/body[2]"/>

Hope to get a response soon.
Thanks in advance.

Jon Linczak - Apr 3, 4:37pm

Smita, glad it was helpful for you!

Pravni, I think you might be misunderstanding something about how the navigation works. You can see my document on creating custom navigation for your site, and that might give you some clues on how to go about accessing the nodes of the site.

I noticed that you were asking on the mailing lists for a sitetree, or listing, of all the pages on your site. This isn’t that hard to do, and you don’t have to access the sitetree.xml file directly to do it. That’s what the XSL files are for. Again, see my article on what files are used to get access to all the nodes and their labels in the sitetree. You just might have to be inventive if you want a semantic listing of the pages in the whole site (nested unordered lists).

May I recommend something? I wouldn’t try to find out what each part means and then dig through the source code to find out how and where it is declared. You’ll get lost very quickly. Start with the basics that I have listed in my first few articles in the series (see the archived articles for more information).

The “cocoon://” prefix is, I believe, a pseudo-protocol rather than a file path: it’s a way for Cocoon (the backend to Lenya) to call another pipeline internally and use whatever that pipeline produces. That’s why the URLs don’t actually “exist”, per se: they are generated on the fly by other pipelines rather than stored as files in the publication.

The breadcrumbs.xsl file is located globally in Lenya, so you should be able to find it in /usr/local/tomcat/webapps/lenya/lenya/xslt/navigation/. You can override it with your own file, which I explained in my custom navigation article. The {5} in the map:aggregate parts simply corresponds to the fifth asterisk group when you do the matching. It essentially passes the page you are currently at to the XSL file so that it can determine what page is current. You can use the XSL markup to change the results based on the current page, which is why this is done.

As for your last comment, I’m not really sure where you are coming from. Do you mean that you want to have two pages of content that you want to put together onto one page in your XSL? If so, just differentiate the two with an ID, and refer to them with id = “body1” and id = “body2”.

Amna & Kiran - Apr 5, 6:35am

Actually, we would like your suggestion on this: we are to make a CMS for the website of our university and are required to “customize” Lenya for that. We wish to alter the source files to do this. Is that possible? Or should we simply edit the Lenya default publication to serve our purpose (that is, edit through the WYSIWYG editors provided in version 1.2)?

Jon Linczak - Apr 8, 3:14pm

Hi Amna and Kiran, it really depends on what you want to do in terms of customization. For example, if you wanted to create a quick way to create a photo gallery by choosing an option in the menus in the default publication, you’ll change some files within the default publication, but you won’t change the source code. Anything that has to do with page content or page layouts is all within the Lenya default publication.

The only time you would really want to change the source code itself would be when you want to add functionality to all of Lenya that just isn’t easy to add through some changes in the default publication. For example, I’ve been wanting to figure out an easy way to add pages to the navigation of your publication that link to another location. There’s a quick way of doing this within the publication, but it’s not exactly what I want, and the only way I think I’ll be able to get exactly what I need is to modify how Lenya works with the sitetree that keeps track of all those pages. That’s when changing the source code becomes necessary, but it should always be a last resort.

saurabh - Nov 14, 4:53am

I have seen the documents, and they are great and helpful. Could you please give me a practical example that I can add to my Cocoon setup to understand it better?

Something like some of the XML, XSL, servlet, and xmap files, and please let me know what the URL in Tomcat would be for that.

Best Regards,

Jon Linczak - Nov 21, 1:17am

Hi Saurabh, I’m glad the documents were helpful in some way. As for a practical example, that’s a bit harder to do. I was hoping I could have a way of distributing the sample project I showed off in my recent presentation at HighEdWebDev, but it’s tough to find a place to put Lenya so that everyone can use it. What I’ll try to do is zip up the sample project and post it on the site within the next few days and see if you and others find it at all helpful.

Ahmed - Aug 1, 11:43am

Hi,
I am a newbie to Lenya and need your help in understanding a few things with respect to Lenya 1.4.

1. Firstly, I couldn’t find publication-sitemap.xmap anywhere in Lenya, and I don’t know the reason.

2. In some of the .xmap files, I see Lenya referring to cocoon://fallback. Can you please tell me the idea behind fallback?

I went through your article; it was really very informative, and I would really appreciate it if you could help me understand these things.

Thanks in advance.
