
XML Processing with Java Object Technology

by Scott Ryan
05/04/2004

Introduction

With the growth in the use of XML for everything from deployment descriptors to Web services, most developers face the challenge of turning this XML into Java objects for processing by their business and data layer logic. In the past this has been a challenge since the tools offered were rather primitive. Early on, developers had the choice of SAX or DOM parsers, which allowed them to pick out selected parts of an XML document and load the data into Java objects for later processing. Programming with these technologies was rather tedious and involved a lot of repetitive code. Performance was poor in the beginning, but it improved as parser technology matured. The natural evolution was to create frameworks that minimized this repetitive code.
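To make the repetitive nature of hand-written parsing concrete, here is a minimal sketch that pulls one attribute and one text value out of a document with the JDK's DOM parser. The XML fragment, the class name and the method name are illustrative assumptions; the fragment only mirrors the instantMessage element used later in this article and is not the full resume document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomTediumSketch {

    // Hypothetical fragment, not the article's complete resume XML.
    public static final String XML =
        "<contact>"
      + "<instantMessage service=\"yahoo\">sryan1_2000</instantMessage>"
      + "</contact>";

    /** Returns "service:userId" for the first instantMessage element, or null if there is none. */
    public static String firstInstantMessage(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            // Every element has to be located by name and cast by hand.
            NodeList nodes = doc.getElementsByTagName("instantMessage");
            if (nodes.getLength() == 0) {
                return null;
            }
            Element im = (Element) nodes.item(0);
            return im.getAttribute("service") + ":" + im.getTextContent();
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse XML", ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstInstantMessage(XML));
    }
}
```

Each additional element or attribute needs similar lookup, null-check and cast code, which is the kind of boilerplate the binding frameworks generate for you.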

In this article, we will look at two of the more popular frameworks used to turn XML input into Java objects and Java objects into XML: XMLBeans and Castor. These seem to be the most talked about and most commonly used frameworks in the open source world (a group that also includes the Digester from Apache Commons). Both technologies can be easily downloaded from the Internet. XMLBeans was developed by BEA and is used to increase the productivity of developers working with BEA's products. XMLBeans is part of WebLogic Workshop 8.1 and has been implemented in such a way as to make development of Web services that use XML easy for even the most novice developers. Both technologies have excellent support via online Web sites and multiple user groups.
Download the author's files associated with this article


Here, we will start with a fairly complex XSD-based schema and develop code to read and write that schema using both of the technologies. We will look at the effort required to use the two technologies including ease-of-use and lines of code generated and developed. Finally we will look at the performance of the technologies in the area of memory usage and raw performance.

Why Is this Paper Important?

If you are a Java developer or manager, you are already dealing with XML and will use it more and more as Web services become more pervasive. Object-parsing technologies increase productivity by eliminating the need to write much of the repetitive code involved with XML. This should, in the long run, leave more time to develop new code rather than fixing bugs in older code. You should walk away from this paper with the ability to implement a project in either of the technologies and begin to understand the strengths and weaknesses of each.

Technology Acquisition and Setup

This section will cover where to find the technologies used in this paper and how to download and set them up to begin your development. I use both WebLogic Workshop 8.1 and Eclipse 2.1.2 in my development. In the project I have included a Workshop project you can use to run the samples. In the download section I have provided the sample Java code as well as sample Ant scripts to generate and build the samples. In order to build and use the samples you must download the file at the end of this article.

Once you unzip the file to a directory you should then download the XMLBeans toolkit from one of the sources listed below. You will only need the xbean.jar in the lib directory. You will also need to download the Castor toolkit from one of the sources listed below and install the castor-0.9.5.2-xml.jar in the lib directory. You will also need to download a parsing toolkit.

I used Xerces for this project and you can download it from http://xml.apache.org/xerces2-j/index.html and install the xercesImpl.jar and xml-apis.jar into the lib directory. You will need Ant and a Java compiler, with ANT_HOME, JAVA_HOME and your path set up to access those two tools. If you type ant -version and java -version you can tell whether your environment is set up correctly. Ant can be downloaded from http://ant.apache.org/ and Java can be downloaded from http://Java.sun.com/j2se/index.jsp.

If you have BEA products installed on your system, the Java compiler and xbean.jar are already on your system and just need to be copied to the proper locations. Now you are ready to work with the samples. You then need to change to the bin directory and type ant -projecthelp for a list of the scripts that are available.

XMLBeans

XMLBeans is available from both the BEA dev2dev Web site http://dev2dev.bea.com/technologies/xmlbeans/overview.jsp and the Apache Web site http://xml.apache.org/xmlbeans/. At the time of writing, XMLBeans was still in the incubator stage of the Apache process, so only source code downloads from CVS were available from the Apache site. I prefer to deal with binary downloads so I used the one from the BEA site. The download from the BEA site also includes some functionality that is not currently in the Apache offering, but I am sure that will be resolved in short order. BEA also offers the capability to upload an XSD schema and generate the supporting Java code for that schema.

On the download page there is a button labeled "Upload Schema." I uploaded the sample schema and the system generated the source code for implementing the XMLBeans technology as well as a Jar file containing the compiled code required to process the schema. You can take that jar file and use it to process any XML files that utilize the schema presented below.

I have included both files in the JavaOne/WebGeneratedCode directory. I downloaded the XMLBeans implementation code from BEA and unzipped the file into my JavaOne directory, which loaded all the supporting code into a directory labeled xkit. I then copied the xbean.jar file from the JavaOne/xkit/lib directory to the JavaOne/lib directory to facilitate the use of my Ant scripts.

The download from BEA includes several useful tools in addition to the XMLBeans kit including:

  • Xpretty - which pretty-prints an XML instance document
  • Validate - which validates an XML instance against a schema
  • Xsdtree - which will show the inheritance hierarchy of the schema types in a directory
  • Dumpxsb - which will dump the contents of "xsb" (binary schema metadata) files in a human readable form. The .xsb files contain the compiled metadata resulting from the .xsd files of a type system. They are analogous to the .class files for Java.

You are now ready to develop using XMLBeans.

Castor

Castor is an open source project hosted at http://www.Castor.org/. Castor is a very active project and offers much more than just XML object parsing. Castor supports additional features such as JDO processing frameworks and much more. For this paper I am going to concentrate on the features that assist us in processing XML files.

The Castor.org Web site is very well organized and easy to use. The documentation is voluminous and well written. The support from the contributor group is outstanding. The project makes use of several other open source projects including Jakarta and Apache projects and includes JUnit testing framework support.

I downloaded the XML-only Castor jar (castor-0.9.5.2-xml.jar) and placed it into my JavaOne/lib directory. I also needed to download the latest Xerces jar from Apache at http://www.apache.org/dist/xml/xerces-j/. I downloaded the 2.6.0 version and placed it in the JavaOne/lib directory.

NOTE: When I first started using Castor I received an illegal argument error that read, "the prefix 'XML' is reserved (XML 1.0 Specification) and cannot be declared." After researching this I found that you need to turn off the validation during the generation of the code. I created a castor.properties file in the package org.exolab.castor and set the org.exolab.castor.parser.namespaces property to true. A sample file is in the CastorProperties directory of the project. After that everything went fine.

You are now ready to develop with Castor.

The Schema

Developing the Schema

We tried to use a schema (resume.xsd) that everyone could understand and identify with. The schema is in the JavaOne/schemas directory. We chose a resume/curriculum vitae schema that was a combination of several schemas in the public domain. The schema was developed and edited with the XMLSpy product from Altova (www.altova.com), which is included with WebLogic 8.1 Platform and can be downloaded from the dev2dev site http://dev2dev.bea.com/index.jsp. This tool includes several useful features such as validation and code completion. It also supports generation of XML from a schema or DTD and generation of a schema from an XML test file. This came in very useful in jump-starting the development of both the schema and the test XML file. We initially created an XML file and used XMLSpy to create a baseline schema. We then modified the schema to fit our needs and used it to output a test XML file for our development and testing. Castor also comes with some useful tools in this area.

Schema Components

We avoided a schema that included other external schemas since this presents challenges that are beyond the scope of this paper. We tried to include aspects that we felt would demonstrate complex processing capabilities and validation to see how the frameworks handled these challenges. We use several sequences so that we can easily expand the XML file's size to generate some load during the performance analysis. For example, the resume can contain zero or many job history elements.

The schema includes elements that are required and some that are optional. For example, a resume header is required but the academics elements are not. We have elements with various count requirements as well as some that are limited by the type and values of data allowed. For example e-mail address can contain a minimum of zero and a maximum of two strings. We included both raw elements and attribute data to understand how the frameworks handled those differences. For example, the instant message service attribute is limited to one of three values (Aim, Yahoo and MSN).

We have included a sample XML file that implements this schema in the JavaOne/xml directory. You can use this XML file and the schema to get familiar with the expected structure of each as well as to get familiar with the tools used such as XMLSpy and the tools provided with Castor and XMLBeans.

Implementing XMLBeans

This section will cover how to generate the framework code to support the resume schema as well as how to develop code to read and write the sample XML file from your Java application.

Standalone (with Ant)

The first step in using the XMLBeans technology is to use the toolkit to generate the XML processing code from the schema file. I have included a sample build.xml file for use with Ant in the JavaOne/bin directory to demonstrate how this is done. The target is called generateXMLBeansSource. I created a custom Ant task and fed the schema file to the task. Some of the parameters I used are:

  • Schema - name and path to all schema files involved in the generation. If you have included schema files and they reside in the same directory XMLBeans will automatically use them without specifying them in this attribute.
  • Classpath - the classpath to the xbean.jar file containing the generator.
  • Download - Set to true to permit the compiler to download URLs for imports and includes. Defaults to false, meaning all imports and includes must reside locally.
  • Srconly - If true, only the source files are generated. If false, the source is generated, compiled and placed in a usable jar file defined in the destfile parameter.
  • Srcgendir - The directory to place the generated Java source into. If you use a namespace in your schema XMLBeans will use that namespace to generate the proper package structure.
  • Destfile - The location and name of the jar file containing generated and compiled support code.
  • Classgendir - The directory to compile the support code into prior to placing it into the jar file defined in destfile.


The documentation that comes with the XMLBeans download has an excellent explanation of the options available when using the command.

Once the ant script is run we now have a jar file we can begin to build our project with.

Using WebLogic Workshop

WebLogic Workshop has done an excellent job in supporting XMLBeans and making it easy to use the technology. The process for generating the supporting code is as follows:

1) Create a new application or use an existing application.
2) Create a new schema project or use an existing one.
3) Drag and drop the schema file you wish to use into the schema project.
4) Watch as the magic happens and Workshop generates the required support code to support your development effort. This is truly awesome ease of use.


I have included the WebLogic Workshop project in the JavaOne/Workshop directory. You get Workshop for free when you download the WebLogic 8.1 product from the BEA download center.

Development Using the Frameworks

Now is the time to develop the code that uses the frameworks. The code developed to support this article is in XMLBeansWrittenSource and CastorWrittenSource. See the included Ant targets for easy use of this code.

Reading XML with XMLBeans

The code we will be studying is included in the file ResumeReader.java in the com.soaringeagleco.javaone.xmlbeans package and is driven by the file XMLBeansDriver.java.

The first step in accessing the XML is to read the XML file into the framework. XMLBeans is unique in that it processes the XML file in place via the use of tokens. This has the advantage of not modifying the file during processing. I have experienced some cases using other frameworks including Castor where I read the XML file in and wrote it back out and the input and output files differed. This does not happen with the XMLBeans framework.

There is also an API that allows you to access the tokens directly. This gives you high-speed access to the XML file but at the expense of more difficult development. Initial access to the XML file is fairly simple to code as demonstrated by the code snippet included below:
// Parse the incoming document
ResumeDocument resumeDocument = ResumeDocument.Factory.parse(inputFile);
// Get the resume
Resume resume = resumeDocument.getResume();




The first line takes a file object as input and parses the input XML file into the XMLBeans framework. Unlike the Castor framework, the file is not validated during this step. The next line gets a reference to the resume element from the resume document. You now have access to a Java object representing the resume XML file and can continue to access and process the various elements of this file.
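Because the parse step does not validate, one option is to check a file against its schema before handing it to the framework. The following is only a sketch of that idea using the JDK's javax.xml.validation API, not XMLBeans' own validation support. The inline schema is a hypothetical, cut-down stand-in for resume.xsd, and the class and method names are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class ValidateBeforeParseSketch {

    // Hypothetical schema: a resume must contain exactly one header element.
    public static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"resume\">"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name=\"header\" type=\"xs:string\"/>"
      + "</xs:sequence></xs:complexType>"
      + "</xs:element>"
      + "</xs:schema>";

    /** Returns true if the XML text is well formed and conforms to the schema above. */
    public static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException ex) {
            // Raised for malformed XML or a schema violation
            // (and, in this sketch, for a broken schema as well).
            return false;
        } catch (IOException ex) {
            throw new IllegalStateException("Could not read XML input", ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<resume><header>Scott Ryan</header></resume>"));
        System.out.println(isValid("<resume/>"));
    }
}
```

In this sketch the first document passes and the second is rejected because the required header element is missing.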

Accessing the XML Data via the Java Objects
Now that you have access to the resume object you can access the various components in the XML document. We will look at some of the interesting processing steps but you can look at the complete code to understand how to process the entire document.

Simple Elements

In order to access simple elements you merely call the proper getter method on the object you are processing. If the element is a simple element such as the first name of the header.contact element you can get back its Java type or you can get back its XML type. It is always a good idea to make sure the element is filled in before processing it as it could be an optional element and not have data associated with it.


// Get the name from the header
Name name = header.getName( );
// Get the first name
String firstName = name.getFirstName( );




Complex Elements

If the element is a complex element such as the header element you will get back a reference to an object containing the complex type.

// Get the header
Header header = resume.getHeader();
if (header == null)
{
        throw new MissingElementException("Missing required Header element");
}




Sequences of Elements

If you have sequences of elements either as standalone simple types or as complex types the framework presents these as embedded arrays that can be accessed from the containing object. You merely acquire the array and then move through the array until you locate the value or values you are seeking.

Contact contact = header.getContact( );
// Get the phone array
Phone [] phones = contact.getPhoneArray( );
// Process all the phones
for (int j = 0; j < phones.length; j++)
{
cat.debug("the " + phones[j].getLocation( ).toString( )
+ " phone is " + phones[j].stringValue( )) ;
}




Attributes Versus Elements

One of the nice things about the frameworks is that you can access attributes the same way as you access element data. For example if you have a construct such as:


<instantMessage service="yahoo">sryan1_2000</instantMessage>




You would access the attribute "service" with the following code:

//Process the messenger data service attribute
InstantMessage[] messengers  =  contact.getInstantMessageArray( );
for (int j = 0; j < messengers.length ;  j++ )
{
cat.debug ("The IM service is " + messengers[j].getService( ));
}




To access the element data you would use the following code:

//Process the messenger data element
InstantMessage[] messengers = contact.getInstantMessageArray( );
for (int j = 0; j < messengers.length; j++)
{
cat.debug("The IM user id is " + messengers[j].stringValue( ));
}




Enumerated Values

For elements that have enumerated values, the framework generates a fixed list of enumeration constants, which makes these values easy to process. For example, in our schema we only allow three values for the instant messenger service (Yahoo, MSN, Aim). The enumerations are represented as immutable singleton objects so they can be compared directly with ==. Be careful when using such comparisons, as they do not always work: in a cluster the concept of a singleton is rather loose. It is a better idea to compare the string or integer values to ensure consistency. The objects return string representations as well as integer representations. The code for the enumerations is contained inside the generated document code. For example, the service enumerations for IM are in the InstantMessageDocument.java file.
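The difference between comparing by identity and comparing by value can be sketched with a small hand-written class. The Service class below is a hypothetical stand-in for a generated enumeration, not actual framework output, and the "detached copy" simply simulates a second instance of the same value such as one that originated in another JVM.

```java
public class EnumCompareSketch {

    /** Hypothetical hand-written stand-in for a generated enumeration class. */
    public static final class Service {
        public static final Service AIM = new Service("Aim", 1);
        public static final Service YAHOO = new Service("Yahoo", 2);
        public static final Service MSN = new Service("MSN", 3);

        private final String name;
        private final int code;

        private Service(String name, int code) {
            this.name = name;
            this.code = code;
        }

        public int intValue() {
            return code;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    /** Simulates a second instance carrying the same value as Service.YAHOO. */
    public static Service detachedCopyOfYahoo() {
        return new Service("Yahoo", 2);
    }

    /** Compares by string value instead of by object identity. */
    public static boolean sameService(Service a, Service b) {
        return a != null && b != null && a.toString().equals(b.toString());
    }

    public static void main(String[] args) {
        Service copy = detachedCopyOfYahoo();
        System.out.println("identity: " + (copy == Service.YAHOO));
        System.out.println("by value: " + sameService(copy, Service.YAHOO));
    }
}
```

Here the identity check fails for the copy while the value comparison succeeds, which is the situation the warning above is about.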

Dates

XMLBeans uses the java.util.Calendar object to represent dates. You can use this object for further processing. In order to read a date you would use the following code:

// Write out the end date
cat.debug("The end date is " + period.getEndDate().toString());




Writing XML with XMLBeans

The code we will be studying in this section is included in the file ResumeWriter.java in the com.soaringeagleco.javaone.xmlbeans package. The first step in creating an XML file is to create the base object to hold the document. The following code demonstrates how to create the base object that will contain all the other Java objects that make up the document:

// Create the holder for the document
ResumeDocument resumeDocument = ResumeDocument.Factory.newInstance( );
// Create a resume object
Resume resume = resumeDocument.addNewResume( );




The first line uses a factory object to create the resume document. The next line adds a resume object to the document. You will continue to build up the document by adding either elements and attributes or additional objects. It is usually a good idea to keep the reference to the object you add so that you can add additional simple or complex elements to that object.

Simple Elements

To add a simple element you merely use a setter method for the object or element you wish to add. For example to add an id element to the resume object you use the following code:

// Add an ID to the resume
resume.setId("Scott Ryan");




Complex Elements

To add a complex element to the object chain you use an add method. It is a good idea to keep the reference to the added object so you can add additional elements to it later. For example to add a header complex element to the resume object you use the following code:

// Create a new header object
Header header = resume.addNewHeader( );




You can use the header reference to add more elements to it.

Sequences of Elements

For elements that can have a count of more than one you can use the add method to add those elements. For example to add two phone numbers to the contact object you would use the following code:

// Create the work phone
Phone workPhone = contact.addNewPhone( );
// Set the phone number
workPhone.set("(303) 263-3044");
// Set the phone type attribute
workPhone.setLocation(Phone.Location.WORK);

//Create the home phone
Phone homePhone = contact.addNewPhone( );
// Set the phone number
homePhone.set("(303) 263-20XX ");
// Set the phone type attribute
homePhone.setLocation(Phone.Location.HOME);




Elements Versus Attributes

Just like we observed in reading the XML file elements and attributes are treated in a similar fashion. You create them using setter methods. For example to create the phone type attribute we used the following code:

// Set the phone type attribute
workPhone.setLocation(Phone.Location.WORK);




To create the first name element we used the following code:

// Set the first name element
name.setFirstname("Scott");




Enumerated Values

As we noted above when reading the XML file, fixed values are represented with enumerated values that are immutable singletons. To use them we simply access them through their Java class. To set the phone location to home we use the following:

// Set the phone Location Attribute
homePhone.setLocation(Phone.Location.HOME);




Dates

XMLBeans uses the java.util.Calendar object to represent dates. In order to write a date into the XML file you might use the following code:

// Create a start date (month 10 is November)
period.setStartDate(new GregorianCalendar(2001, 10, 1));




Remember the Gregorian calendar begins counting months with 0 so January is 0 and December is 11.
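The zero-based month is easy to get wrong, so a small standalone check may help. This sketch uses only the JDK; the class and method names are illustrative. Using the Calendar.NOVEMBER constant instead of the literal 10 makes the intent explicit.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

public class CalendarMonthSketch {

    /** Builds 1 November 2001 using the named month constant. */
    public static Calendar startDate() {
        // Calendar.NOVEMBER has the value 10 because months are counted from 0.
        return new GregorianCalendar(2001, Calendar.NOVEMBER, 1);
    }

    /** Converts the zero-based Calendar month into the 1 to 12 number people expect. */
    public static int humanMonth(Calendar date) {
        return date.get(Calendar.MONTH) + 1;
    }

    public static void main(String[] args) {
        Calendar date = startDate();
        System.out.println("Calendar.MONTH = " + date.get(Calendar.MONTH));
        System.out.println("human month    = " + humanMonth(date));
    }
}
```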

Implementing Castor

Stand Alone Bat File

For this project I used a Windows bat file to kick off the Castor source code generation process. The bat file is called from within the ant script. I am sure there is an Ant task for this but could not locate any information on it. The bat file is called Castor.bat and resides in the bin directory of the project. The bat file uses the following flags:

-cp = The class path required to run Castor for XML
-dest = The directory where the generated source code will be stored.
-i = the input schema to be used for the code generation
-package = the package name to place the generated source files into


When I first ran the bat file I received an illegal argument error that read, "the prefix 'XML' is reserved (XML 1.0 specification) and cannot be declared." After researching this I found that you need to turn off the validation during the generation of the code. I created a castor.properties file in the package org.exolab.castor and set the org.exolab.castor.parser.namespaces property to true. A sample file is in the CastorProperties directory of the project. After that everything went fine.

Reading XML with Castor

I tried to follow the same process for all the examples. I have a master driver called CastorDriver to manage the process, a file called ResumeReader to read the XML files, and ResumeWriter to write out the XML files. The first step in reading the XML file with Castor is to unmarshal it into the object tree. You use the following code for that purpose:

// Unmarshal the resume document
try
{
        resume = (Resume) Resume.unmarshal(new FileReader(inputFileName));
}
catch (MarshalException ex)
{
        cat.error("MarshalException thrown while unmarshalling file "
                + inputFileName + " " + ex.getMessage(), ex);
        throw ex;
}
catch (ValidationException ex)
{
        cat.error("ValidationException thrown while unmarshalling file "
                + inputFileName + " " + ex.getMessage(), ex);
        throw ex;
}
catch (FileNotFoundException ex)
{
        cat.error("FileNotFoundException thrown while unmarshalling file "
                + inputFileName + " " + ex.getMessage(), ex);
        throw ex;
}




This code will read (unmarshal) the XML file into your object tree. One of the advantages of this process over the XMLBeans process is that it validates the file while it is being unmarshalled. You can turn off the validation in the properties file if you like. I like knowing if the file is valid before wasting time processing it.

One downside of the unmarshalling process is that the file is read and parsed into objects rather than processed in place. I have run across some instances where I have unmarshalled a file and marshalled it back out without changes, and the file contents have changed. The next step is to process the elements and attributes of the file.

Simple Elements

In order to access simple elements you merely call the proper getter method for the element you want. If the element is a simple element such as the first name of the header.contact element you can get back its Java type. It is always a good idea to make sure the element is filled in before processing it as it could be an optional element and not have data associated with it.

// Get the name object
Name name = header.getName( );
// Get the first name
String firstName = name.getFirstName( );




Complex Elements

If the element is a complex element such as the header element you will get back a reference to an object containing the complex type.

// Get the header object
Header header = resume.getHeader();
if (header == null)
{
        throw new MissingElementException("Missing required Header element");
}




Sequences of Elements

If you have sequences of elements either as standalone simple types or as complex types the framework presents these as embedded arrays that can be accessed from the containing object. You can access the data in multiple ways. You can acquire the array and then move through the array until you locate the value or values you are seeking or you can get an enumeration and process the list that way.

// Get the phones array
Phone[] phones = contact.getPhone();
// Process all the phones
for (int j = 0; j < phones.length; j++)
{
        cat.debug("The " + phones[j].getLocation().toString() + " phone is "
                + phones[j].getContent());
}




OR

// Process all the phones
Enumeration phones = contact.enumeratePhone();
while (phones.hasMoreElements())
{
        Phone phone = (Phone) phones.nextElement();
        cat.debug("The " + phone.getLocation().toString() + " phone is "
                + phone.getContent());
}




Attributes Versus Elements

One of the nice things about the frameworks is that you can access attributes the same way as you access element data. For example if you have a construct such as:

<instantMessage service="yahoo">sryan1_2000</instantMessage>





You would access the service attribute with the following code:

// Process the messenger data
InstantMessage[] messengers = contact.getInstantMessage();
// Process the array for the service attribute
for (int j = 0; j < messengers.length; j++)
{
        cat.debug("The IM service is " + messengers[j].getService());
}




To access the user ID element data you would use the following code:

// Process the messenger data
InstantMessage[] messengers = contact.getInstantMessage();
// Process the array for the name element
for (int j = 0; j < messengers.length; j++)
{
        cat.debug("The IM user id is " + messengers[j].getContent().toString());
}




Enumerated Values

For elements that have enumerated values, the framework generates a fixed list of enumeration constants, which makes these values easy to process. For example, in our schema we only allow three values for the IM service (Yahoo, MSN, Aim). The enumerations are represented as immutable singleton objects so they can be compared directly with ==. Be careful when using such comparisons, as they do not always work: in a cluster the concept of a singleton is rather loose. It is a better idea to compare the string or integer values to ensure consistency. The code for the enumerations is contained in the types directory of the generated code. For example, the service enumerations for IM are in the InstantMessageServiceType.java file.

// Process the messenger data
InstantMessage[] messengers = contact.getInstantMessage();
for (int j = 0; j < messengers.length; j++)
{
        cat.debug("The IM service is " + messengers[j].getService());
}




Dates

Castor has its own date object. You can use the SimpleDateFormat class to create a java.util.Date object for further processing. When reading dates from an XML file you would use the following type of code:

// Write out the end date
cat.debug("The end date is " + period.getEndDate().toString());
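The SimpleDateFormat conversion mentioned above can be sketched as follows. This assumes the date text is in the yyyy-MM-dd form (for example "2001-11-01") used for Castor dates elsewhere in this article; the class and method names are illustrative, and the sketch works on a plain string rather than on a Castor object.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class ParseSchemaDateSketch {

    /** Parses text such as "2001-11-01" into a java.util.Date in the default time zone. */
    public static Date parse(String text) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        // Reject impossible values such as month 13 instead of rolling them over.
        format.setLenient(false);
        try {
            return format.parse(text);
        } catch (ParseException ex) {
            throw new IllegalArgumentException("Not a yyyy-MM-dd date: " + text, ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("2001-11-01"));
    }
}
```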




Writing XML with Castor

The code we will be studying is included in the file ResumeWriter.java in the com.soaringeagleco.javaone.castor package. The first step in creating an XML file is to create the base object to hold the document. The following code demonstrates how to create the base object that will contain all the other Java objects that make up the document:

// Create a new resume object
Resume resume = new Resume( );




To begin the process you instantiate the resume object. You will continue to build up the document by adding either elements and attributes or additional objects. It is usually a good idea to keep the reference to the object you add so that you can add additional simple or complex elements to that object. You will also need the reference to attach the object to the object tree, as this is a separate step, unlike XMLBeans where it is folded into the creation of the object.

Simple Elements

To add a simple element you merely use a setter method for the object you wish to add the element to. For example to add an id element to the resume object you use the following code:

// Add an id to the resume
resume.setId( "Scott.Ryan");




Complex Elements

To add a complex element to the object chain is a two-step process. You first create the object and then add it to the object chain. It is a good idea to keep the reference to the added object so you can add additional elements to it later. For example to add a header complex element to the resume object you use the following code:

// Create a new header object
Header header = new Header( );
// Attach the header to the resume
resume.setHeader(header);




You can use the header reference to add more elements to it.

Sequences of Elements

For elements that can have a count of more than one you can use the add method to add those elements. For example to add two phone numbers to the contact object you would use the following code:

// Create the work phone
Phone workPhone = new Phone( );
// Attach the phone to the contact object
contact.addPhone(workPhone);
// Set the phone number
workPhone.setContent( "(303) 263-3044");
// Set the phone type
workPhone.setLocation(PhoneLocationType.WORK);
// Create the home phone
Phone homePhone = new Phone( );
// Attach the phone to the contact object
contact.addPhone(homePhone);
// Set the phone number
homePhone.setContent("(303) 263-30XX");
// Set the phone location
homePhone.setLocation(PhoneLocationType.HOME);




Elements Versus Attributes

Just like we observed in reading the XML file elements and attributes are treated in a similar fashion. You create them using setter methods. For example to create the phone type attribute we used the following code:

// Set the Phone type attribute
workPhone.setLocation(PhoneLocationType.WORK);




To create the first name element we used the following code:

// Set the first name element
name.setFirstname("Scott");




Enumerated Values

As we noted when reading the XML file, fixed values are represented by enumerated values that are immutable singletons. To use them when creating an XML file, reference the constant on the generated enumeration class, as we did for the phones above. To set the phone location to home, use the following:

// Set the phone type attribute
homePhone.setLocation(PhoneLocationType.HOME);
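The phrase "immutable singletons" refers to the type-safe enumeration pattern. The sketch below is a hand-written illustration of that pattern, not the generated source: the string values "home" and "work" and the valueOf method are assumptions, and the real generated class will differ in detail. The point is that each constant exists exactly once, so constants can be compared with ==.

```java
// Hand-written illustration of a type-safe enumeration (assumption:
// the schema values are "home" and "work").
final class PhoneLocationType {
    public static final PhoneLocationType HOME = new PhoneLocationType("home");
    public static final PhoneLocationType WORK = new PhoneLocationType("work");

    private final String value;

    // Private constructor: no instances beyond the constants above
    private PhoneLocationType(String value) {
        this.value = value;
    }

    // Maps the XML string back to the single shared instance
    public static PhoneLocationType valueOf(String value) {
        if (HOME.value.equals(value)) return HOME;
        if (WORK.value.equals(value)) return WORK;
        throw new IllegalArgumentException("Unknown phone location: " + value);
    }

    // The string written to the XML document
    public String toString() {
        return value;
    }
}

class PhoneLocationDemo {
    public static void main(String[] args) {
        PhoneLocationType parsed = PhoneLocationType.valueOf("home");
        // Identity comparison is safe because each constant is a singleton
        System.out.println(parsed == PhoneLocationType.HOME);
    }
}
```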




Dates

Castor has its own date object. To create one, pass a string in the expected format, or use a SimpleDateFormat object to convert a standard java.util.Date object into that format. Here is how to create a date entry in an XML file:

// Create a start date (fully qualified to avoid a clash with java.util.Date)
org.exolab.castor.types.Date startDate = new org.exolab.castor.types.Date("2001-11-01");
// Attach the date to the period
period.setStartDate(startDate);
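The SimpleDateFormat route mentioned above can be sketched with the standard library alone. This example assumes the expected string format is the yyyy-MM-dd shape shown in the snippet ("2001-11-01"). The class and method names are invented for illustration, and the resulting string is what you would hand to the Castor date constructor.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

class CastorDateStrings {
    // Formats a java.util.Date as yyyy-MM-dd (assumed to be the format
    // the Castor date constructor expects, based on the example above).
    static String toSchemaDate(Date date) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        return format.format(date);
    }

    public static void main(String[] args) {
        // 1 November 2001 in the default time zone
        Date start = new GregorianCalendar(2001, Calendar.NOVEMBER, 1).getTime();
        System.out.println(toSchemaDate(start)); // prints 2001-11-01
    }
}
```

SimpleDateFormat is not thread-safe, which is why the sketch creates a new formatter on each call instead of sharing one in a static field.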




Performance Metrics

We collected several metrics for each framework. To keep the comparison fair, we implemented the same solution with the same level of coding in each one. The resulting code is not the most efficient possible, because we coded to the lowest common denominator. The first metric was the complexity of the implementation. We used an open source tool called JavaNCSS (http://www.kclee.com/clemens/Java/Javancss) to count the lines of code both generated and written. The table below shows the results. The Ant build file in the download includes the countXMLBeansGenerated, countXMLBeansWritten, countCastorGenerated, and countCastorWritten targets used to produce the counts presented in this article.

[Table: lines of code generated and written, XMLBeans versus Castor]

Next, we used the applications to write and then read 1000 resumes. We wrote the same resume 1000 times and then read those same resumes back in. We repeated the 1000-resume pass 100 times and averaged the time per run, which should minimize the effect of startup time. We used a simple performance monitor to measure elapsed time, and the Wiley toolset to see what was happening inside the applications in terms of memory use and CPU load.
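The averaging procedure can be sketched as a small harness. This is not the performance monitor used for the measurements; AverageTimer and the placeholder workload are invented for illustration, and in a real test the Runnable would write or read the 1000 resumes with the framework under test.

```java
class AverageTimer {
    // Runs the task the given number of times and returns the mean
    // wall-clock time per run in milliseconds.
    static double averageMillis(Runnable task, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long totalNanos = 0L;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / (runs * 1000000.0);
    }

    public static void main(String[] args) {
        final int resumesPerRun = 1000;
        // Placeholder workload standing in for "process 1000 resumes"
        Runnable placeholderWork = new Runnable() {
            public void run() {
                StringBuilder sink = new StringBuilder();
                for (int i = 0; i < resumesPerRun; i++) {
                    sink.append("<resume id=\"").append(i).append("\"/>");
                }
                if (sink.length() == 0) {
                    throw new IllegalStateException("no output produced");
                }
            }
        };
        // 100 runs, matching the procedure described above
        System.out.println("average ms per run: "
                + averageMillis(placeholderWork, 100));
    }
}
```

Averaging over many runs spreads one-time costs such as class loading across the whole measurement. Discarding the first few runs before timing would remove them more completely.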

[Table: average time to read and write 1000 resumes, XMLBeans versus Castor]

Summary

We've looked at how to process XML files into Java objects and Java objects into XML files using two open source technologies. Both technologies are very similar in the way they are used and have very similar implementations. There are pros and cons to both technologies.

XMLBeans appears to generate more code but requires the developer to write slightly less. Castor gives you multiple ways to access your sequences of objects, while XMLBeans uses an array to access the data. XMLBeans uses standard date objects for processing dates, and Castor uses its own date object. XMLBeans gives you access to both the Java type representation and the XML type representation, which is useful for new XML types and for those that cannot be cleanly translated to Java types.

XMLBeans also appears to be more efficient at both reading and writing. The difference is much more pronounced when reading, probably because of parsing overhead. I would imagine you could tune this somewhat by using a higher-performance parser library. If you are a user of WebLogic Workshop (which you should be), you will gain incredible productivity by combining XMLBeans and Workshop.

One interesting experiment was to create an illegal number of e-mail elements. The schema limits the number of e-mail addresses to two, and the two frameworks behaved differently when I tried to create a third. With XMLBeans, I could both write and read the illegal number of e-mail addresses. With Castor, the reader caught the error while parsing and threw an exception. On the writing side, the generated Castor code uses a fixed-length array and does not check the bound before adding to it, so when I tried to add the third e-mail address the code threw an IndexOutOfBoundsException and stopped processing.
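The writing-side behavior can be reproduced with a small sketch. EmailHolder is a hypothetical stand-in written to mirror the description above (a fixed-length array with no bounds check); it is not the generated source. The tryAdd helper shows one way a caller might guard against the exception.

```java
// Hypothetical stand-in: a fixed-length array with no bounds check,
// mirroring the behavior described above. Not the generated code.
class EmailHolder {
    private static final int MAX_EMAILS = 2; // schema limit from the article
    private final String[] emails = new String[MAX_EMAILS];
    private int count = 0;

    public void addEmail(String email) {
        // Throws ArrayIndexOutOfBoundsException (a subclass of
        // IndexOutOfBoundsException) on the third add
        emails[count] = email;
        count++;
    }

    public int getEmailCount() { return count; }
}

class EmailLimitDemo {
    // Returns false instead of letting the exception end processing
    static boolean tryAdd(EmailHolder holder, String email) {
        try {
            holder.addEmail(email);
            return true;
        } catch (IndexOutOfBoundsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        EmailHolder holder = new EmailHolder();
        System.out.println(tryAdd(holder, "first@example.com"));
        System.out.println(tryAdd(holder, "second@example.com"));
        System.out.println(tryAdd(holder, "third@example.com"));
    }
}
```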


Copyright © 2003 Soaring Eagle LLC.
