Segmentation Rules eXchange (SRX)

About SRX

Segmentation Rules eXchange (SRX) is the vendor-neutral standard for describing how translation and other language-processing tools segment text. It allows Translation Memory (TM) and other linguistic tools to describe the language-specific rules by which text is broken into segments (usually sentences or paragraphs) for further processing. It was developed after it became clear that leverage from TMX (Translation Memory eXchange) files was sometimes lower than expected because different tools segmented text in different ways, which prevented a direct correlation between the tools' results. When implemented with TMX, SRX allows the segmentation rules that were used when a TM was created to be transmitted with it, so that tools can improve the leverage achieved when deploying TM data. SRX can also be used by any tool that segments text to improve integration with other processes.
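To make the idea of segmentation rules concrete, the following is a minimal Python sketch of rule-based segmentation. It is loosely modeled on the idea of ordered break and no-break rules, each with a pattern for the text before a candidate position and a pattern for the text after it. The tuple format, the function name `segment`, and the two rule sets are assumptions made for illustration; they are not the SRX file format or any particular tool's implementation.

```python
import re

def segment(text, rules):
    """Split text using ordered (is_break, before, after) regex rules.

    At each position between two characters, the first rule whose
    `before` pattern matches the text ending there and whose `after`
    pattern matches the text starting there decides the outcome.
    """
    compiled = [
        (is_break, re.compile("(?:" + before + r")\Z"), re.compile(after))
        for is_break, before, after in rules
    ]
    segments, start = [], 0
    for pos in range(1, len(text)):
        for is_break, before, after in compiled:
            if before.search(text[:pos]) and after.match(text[pos:]):
                if is_break:
                    segments.append(text[start:pos].strip())
                    start = pos
                break  # first matching rule wins, whether break or no-break
    segments.append(text[start:].strip())
    return [s for s in segments if s]

# Two hypothetical tools with different rule sets (illustrative only).
NAIVE_RULES = [(True, r"[.?!]", r"\s")]
ABBREV_RULES = [(False, r"\b(?:Dr|Mr|etc)\.", r"\s")] + NAIVE_RULES

text = "Dr. Smith arrived. He was late."
print(segment(text, NAIVE_RULES))   # ['Dr.', 'Smith arrived.', 'He was late.']
print(segment(text, ABBREV_RULES))  # ['Dr. Smith arrived.', 'He was late.']
```

The same input yields three segments under one rule set and two under the other, which is the kind of mismatch that exchanging segmentation rules is meant to avoid.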

SRX version 2.0 was officially accepted as an OSCAR standard in April 2008.

Why Use SRX?

  • SRX improves the function of TMX to save time and money. SRX was developed as a companion standard to TMX and is best used in situations where leverage from TMX files may be lower than expected. By using SRX, localization tools can adjust their processing to emulate that of the tool that created a TMX file, thus improving leverage. In test cases run by OSCAR members, segmentation differences reduced 100% TM matches by up to 15% in some cases, requiring substantial review of text that should have been accepted as full matches.
  • SRX provides a standard way to describe tool functionality. Even outside of a TMX environment, SRX provides a mechanism for any tool that segments text—word processors, content management systems, etc.—to describe how it segmented text to ensure that it can interoperate with other systems.
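The effect of segmentation differences on leverage can be sketched in a few lines of Python. The function name `exact_match_rate` and the sample segments are hypothetical and chosen only to illustrate the mechanism; they are not taken from the OSCAR test cases.

```python
def exact_match_rate(tm_segments, new_segments):
    """Return the fraction of new segments found verbatim in the TM."""
    if not new_segments:
        return 0.0
    tm = set(tm_segments)
    hits = sum(1 for seg in new_segments if seg in tm)
    return hits / len(new_segments)

# Made-up example: the TM was built by a tool that breaks after "Dr.".
tm_segments = ["Dr.", "Smith arrived.", "He was late."]

# The same source text, segmented with the same rules and with other rules.
same_rules = ["Dr.", "Smith arrived.", "He was late."]
other_rules = ["Dr. Smith arrived.", "He was late."]

print(exact_match_rate(tm_segments, same_rules))   # 1.0
print(exact_match_rate(tm_segments, other_rules))  # 0.5
```

Although the underlying text is identical, the second tool finds an exact match for only one of its two segments because its segment boundaries differ from those stored in the TM.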

History

SRX 1.0 was adopted in April 2004 by OSCAR.

SRX 2.0 was adopted in April 2008 by OSCAR.

SRX Specification

The current files (November 2008) make one minor correction to the schema present in the April 2008 version. The SRX 2.0 (April 2008) specification can be downloaded in the following formats:

  • PDF (89 KB)
  • XHTML (109 KB)
  • ZIP (64 KB): includes PDF and HTML versions of the standard, plus the XML schema as a separate file.

Previous Versions

The SRX 1.0 (April 2004) specification can be downloaded (GZip/WinZip) or viewed online here. Please note that this specification has been replaced by SRX 2.0 and is no longer considered an OSCAR standard. It is provided here for comparison, to aid developers in migrating existing implementations from SRX 1.0 to SRX 2.0.