Part1 - Part2 - Part3 - Part4 - Part5 - Part6 - Part7 - Part8

Medical Image Format FAQ, Part 2/8

From: dclunie@flash.us.com (David A. Clunie) Newsgroups: alt.image.medical,comp.protocols.dicom,sci.data.formats,alt.answers,comp.answers,sci.answers,news.answers Subject: Medical Image Format FAQ, Part 2/8 Followup-To: alt.image.medical Date: 5 Jul 1997 08:05:29 -0400 Organization: Mollard Consultants Expires: 05 Aug 1997 00:00:00 GMT Message-ID: <5plda9$4pc@flash.ksapax> Reply-To: dclunie@flash.us.com (David A. Clunie) Summary: This posting contains answers to the most Frequently Asked Question on alt.image.medical - how do I convert from image format X from vendor Y to something I can use ? In addition it contains information about various standard formats. Archive-name: medical-image-faq/part2 Posting-Frequency: monthly Sat Jul 5 06:20:06 CDT 1997 Version: 3.15 2.2 ACR/NEMA DICOM 3.0 ACR/NEMA Standards Publications No. PS 3.1-1992 <- DICOM 3 - Introduction & Overview No. PS 3.8-1992 <- DICOM 3 - Network Communication Support No. PS 3.2-1993 <- DICOM 3 - Conformance No. PS 3.3-1993 <- DICOM 3 - Information Object Definitions No. PS 3.4-1993 <- DICOM 3 - Service Class Specifications No. PS 3.5-1993 <- DICOM 3 - Data Structures & Encoding No. PS 3.6-1993 <- DICOM 3 - Data Dictionary No. PS 3.7-1993 <- DICOM 3 - Message Exchange No. PS 3.9-1993 <- DICOM 3 - Point-to-Point Communication No. PS 3.10-???? <- DICOM 3 - Media Storage & File Format No. PS 3.11-???? <- DICOM 3 - Media Storage Application Profiles No. PS 3.12-???? <- DICOM 3 - Media Formats & Physical Media DICOM (Digital Imaging and Communications in Medicine) standards are of course the hot topic at every radiological trade show. Unlike previous attempts at developing a standard, this one seems to have the potential to actually achieve its objective, which in a nutshell, is to allow vendors to produce a piece of equipment or software that has a high probability of communicating with devices from other vendors. Where DICOM differs substantially from other attempts, is in defining so called Service-Object Pairs. For instance if a vendor's MR DICOM conformance statement says that it supports an MR Storage Class as a Service Class Provider, and another vendor's workstation says that it supports an MR Storage Class as a Service Class User, and both can connect via TCP/IP over Ethernet, then the two devices will almost certainly be able to talk to each other once they are setup with each others network addresses and so on. The keys to the success of DICOM are the use of standard network facilities for interconnection (TCP/IP and ISO-OSI), a mechanism of association establishment that allows for negotiation of how messages are to be transferred, and an object-oriented specification of Information Objects (ie. data sets) and Service Classes. Of course all this makes for a huge and difficult to read standard, but once the basic concepts are grasped, the standard itself just provides a detailed reference. From the users' and equipment purchasers' points of view the important thing is to be able to read and match up the Conformance Statements from each vendor to see if two pieces of equipment will talk. Just being able to communicate and transfer information is of course not sufficient - these are only tools to help construct a total system with useful functionality. Because a workstation can pull an image off an MRI scanner doesn't mean it knows when to do it, when the image has become available, to which patient it belongs, and where it is subsequently archived, not to mention notifying the Radiology or Hospital Information System (RIS/HIS) when such a task has been performed. In other words DICOM Conformance does not guarantee functionality, it only facilitates connectivity. In otherwords, don't get too carried away with espousing the virtues of DICOM, demanding it from vendors, and expecting it to be the panacea to create a useful multi-vendor environment. Fred Prior (prior@xray.hmc.psu.edu) has come up with the concept of a User Conformance Statement to be generated by purchasers and to be satisfied by vendors. The idea is that one describes what one expects and hence gives the vendor a chance to realistically satisfy the buyer! Of course each such statement must be tailored to the user's needs, and simply stapling a copy of Fred's statement to a Request For Proposals is not going to achieve the desired objective. Caveat empor. To get more information about DICOM: - Purchase the standards from NEMA. - Ftp the final versions of the drafts in electronic form one of the sites described below. - Follow the Usenet group comp.protocols.dicom. - Get a copy of "Understanding DICOM 3.0" $12.50 from Kodak. - Insist that your existing and potential vendors supply you with DICOM conformance statements before you upgrade or purchase, and don't buy until you know what they mean. Don't take no for an answer!!!! What is all this doing in an FAQ about medical image formats you ask ? Well first of all, in many ways DICOM 3.0 will solve future connectivity problems, if not provide functional solutions to common problems. Hence actually getting the images from point A to B is going to be easier if everyone conforms. Furthermore, for those of us with old equipment, interfacing it to new DICOM conforming equipment is going to be a problem. In otherwords old network solutions and file formats are going to have to be transformed if they are going to communicate unidirectionally or bidirectionally with DICOM 3.0 nodes. One is still faced with the same old questions of how does one move the data and how does one interpret it. The specifics of the DICOM message format are very similar to the previous versions of ACR/NEMA on which it is based. The data dictionary is greatly extended, and certain data elements have been "retired" but can be ignored gracefully if present. The message itself can now be transmitted as a byte stream over networks, rather than using a point-to-point paradigm excusively (though the old point-to-point interface is available). This message can be encoded in various different Transfer Syntaxes for transmission. When two devices ("Application Entities" or AE) begin to establish an "Association", they negotiate an appropriate transfer syntax. They may choose an Explicit Big-Endian Transfer Syntax in which integers are encoded as big-endian and where each data element includes a specific field that says "I am an unsigned 16 bit integer" or "I am an ascii floating-point number", or alternatively they can fall back on the default transfer syntax which every AE must support, the Implicit Little-Endian Transfer Syntax which is just the same as an old ACR/NEMA message with the byte order defined once and for all. This is all very well if you are using DICOM as it was originally envisaged - talking over a network, negotiating an association, and determining what Transfer Syntax to use. What if one wants to store a DICOM message in a file though ? Who is to say which transfer syntax one will use to encode it offline ? One approach, used for example by the Central Test Node software produced by Mallinkrodt and used in the RSNA Inforad demonstrations, is just to store it in the default little-endian implicit syntax and be done with it. This is obviously not good enough if one is going to be mailing tapes, floppies and optical disks between sites and vendors though, and hence the DICOM group decided to define a "Media Storage & File Format" part of the standard, the new Parts 10, 11 and 12 which have recently passed their ballot and should be available in final form from NEMA soon. Amongst other things, Part 10 defines a generic DICOM file format that contains a brief header, the "DICOM File Meta Information Header" which contains a 128 byte preamble (that the user can fill with anything), a 4 byte DICOM prefix "DICM", then a short DICOM format message that contains newly defined elements of group 0002 in a specified Transfer Syntax, which uniquely identify the data set as well as specifying the Transfer Syntax for the rest of the file. The rest of the message must specify a single SOP instance. The length of the brief message in the Meta Header is specified in the first data element as usual, the group length. Originally the draft of Part 10 specified the default Implicit Value Representation Little Endian Transfer Syntax as the DICOM File Meta Information Header Transfer Syntax, which is in keeping with the concept that it is the default for all other parts of the standard. The final standard has changed this to Explicit Value Representation Little Endian Transfer Syntax. This seems like an extremely irritating change to me but I guess purists have prevailed. So what choices of Transfer Syntax does one have and why all the fuss ? Well the biggest distinction is between implicit and explicit value representation which allows for multiple possible representations for a single element, in theory at least, and perhaps allows one to make more of an unknown data element than one otherwise could perhaps. Some purists (and Interfile people) would argue that the element should be identified descriptively, and there is nothing to stop someone from defining their own private Transfer Syntax that does just that (what a heretical thought, wash my mouth out with soap). With regard to the little vs. big endian debate I can't see what the fuss is about, as it can't really be a serious performance issue. Perhaps more importantly in the long run, the Transfer Syntax mechanism provides a means for encapsulating compressed data streams, without having to deal with the vagaries and mechanics of compression in the standard itself. For example, if DICOM version 3.0, in addition to the "normal" Transfer Syntaxes, a series are defined to correspond to each of the Joint Photographic Experts Group (JPEG) processes. Each one of these Transfer Syntaxes encodes data elements in the normal way, except for the image pixel data, which is defined to be encoded as a valid and self-contained JPEG byte stream. Both reversible and irreversible processes of various types are provided for, without having to mess with the intricacies of encoding the various tables and parameters that JPEG processes require. Presumably a display application that supports such a Transfer Syntax will just chop out the byte stream, pass it to the relevant JPEG decode, and get an uncompressed image back. Contrast this approach with that taken by those defining the TIFF (Tagged Image File Format) for general imaging and page layout applications. In their version 6.0 standard they attempted to disassemble the JPEG stream into its various components and assign each to a specific tag. Unfortunately this proved to be unworkable after the standard was disseminated and they have gone back to the drawing board. Now one may not like the JPEG standard, but one cannot argue with the fact that the scheme is workable, and a readily available means of reversible compression has been incorporated painlessly. How effective a compression scheme this is remains to be determined, and whether or not the irreversible modes gain wide acceptance will be dictated by the usual medico-legal paranoia that prevails in the United States, but the option is there for those who want to take it up. There is of course no reason why private compression schemes cannot be readily incorporated using this "encapsulation" mechanism, and to preserve bandwidth this will undoubtedly occur. This will not compromise compatibility though, as one can always fall back to a default, uncompressed Transfer Syntax. The DICOM Working Group IV on compression will undoubtedly bring out new possibilities. Currently there is a lot of interest in RLE (Run Length Encoded) compression, and the TIFF PackBits mechanism has been adopted the Ultrasound group. In order to identify all these various syntaxes, information objects, and so on, DICOM has adopted the ISO concept of the Unique Identifier (UID) which is a text string of numbers and periods with a unique root for each organization that is registered with ISO and various organizations that in turn register others in a hierarchical fashion. For example 1.2.840.10008.1.2 is defined as the Implicit VR Little Endian Transfer Syntax. The 1 identifies ISO, the 2 is the ISO member body branch, the 840 is the specific member body country code, in this case ANSI, and the 10008 is registered by ANSI to NEMA for DICOM. UID's are also used to uniqely identify non-DICOM specific things, such as information objects. These are constructed from a prefix registered to the supplier or vendor or site, and a unique suffix that may be generated from say a date and time stamp (which is not to be parsed). For example an instance of a CT information object might have a UID of 1.2.840.123456.002.999999.940623.170717 where a (presumably US) vendor registered 123456, and the modality generated a unique suffix based on its device number, patient hospital id, date and time, which have no other significance other than to create a unique suffix. The other important new concept that DICOM introduced was the concept of Information Objects. In the previous ACR/NEMA standard, though modalities were identified by a specific data element, and though there were rules about which data elements were mandatory, conditional or optional in ceratin settings, the concept was relatively loosely defined. Presumably in order to provide a mechanism to allow conformance to be specified and hence ensure interoperability, various Information Objects are defined that are composed of sets of Modules, each module containing a specific set of data elements that are present or absent according to specific rules. For example, a CT Image Information Object contains amongst others, a Patient module, a General Equipment module, a CT Image module, and an Image Pixel module. An MR Image Information module would contain all of these except the CT Image module which would be replaced by an MR Image module. Clearly one needs descriptive information about a CT image that is different from an MR image, yet the commonality of the image pixel data and the patient information is recognized by this model. Hence, as described earlier, one can define pairs of Information Objects and Services that operate on such objects (Storage, Query/Retrieve, etc.) and one gets SOP classes and instances. All very object oriented and initially confusing perhaps, but it provides a mechanism for specifying conformance. From the point of view of an interpreters of a DICOM compatible data stream this means that for a certain instance of an Information Object, certain information is guaranteed to be in there, which is nice. As a creator of such a data stream, one must ensure that one follows all the rules to make sure that all the data elements from all the necessary modules are present. Having done so one then just throws all the data elements together, sorts them into ascending order by group and element order, and pumps them out. It is a shame that the data stream itself doesn't reflect the underlying order in the Information Objects, but I guess they had to maintain backward compatibility, hence this little bit of ugliness. This gets worse when one considers how to put more than one object in a folder inside another object. At this point I am tempted to include more details of various different modules, data elements and transfer syntaxes, as well as the TCP/IP mechanism for connection. However all this information is in the standard itself, drafts of which are readily available electronically from ftp sites, and in the interests of brevity I will not succumb to temptation at this time. 2.3 Papyrus Papyrus is an image file format based on ACR/NEMA version 2.0. It was developed by the Digital Imaging Unit of the University Hospital of Geneva for the European project on telemedicine (TELEMED project of the RACE program), under the leadership of Dr. Osman Ratib (osman@cih.hcuge.ch). The University Hospital of Geneva uses Papyrus for their hospital-wide PACS. The medical file format component of Papyrus version 2 extended the ACR/NEMA format, particularly in order to reference multiple images by placing folder information referencing ACR-NEMA data sets in a shadow (private) group. Contributing to the development of DICOM 3, the team are updating their format to be compatible with the offline file format provisions of the draft Part 10 of DICOM 3 in Papyrus version 3. The specifications, toolkit and image manipulation software that is Papyrus aware, Osiris, is available for the Mac, Windows, and Unix/X11/Motif by ftp from ftp://expasy.hcuge.ch/pub/Osiris. See also Papyrus and Osiris. Further information is available in printed form. Contact yves@cih.hcuge.ch (Yves Ligier). 2.4 Interfile V3.3 Interfile is a "file format for the exchange of nuclear medicine image data" created I gather to satisfy the needs of the European COST B2 Project for the transfer of images of quality control phantoms, and incorporates the AAPM (American Association of Physicists in Medicine) Report No. 10, and has been subsequently used for clinical work. It specifies a file format composed of ascii "key-value" pairs and a data dictionary of keys. The binary image data may be contained in the same file as the "administrative information", or in a separate file pointed to by a "name of data file" key. Image data may be binary integers, IEEE floating point values, or ascii and the byte order is specified by a key "imagedata byte order". The order of keys is defined by the Interfile syntax which is more sophisticated than a simple list of keys, allowing for groups, conditionals and loops to dictate the order of key-value pairs. Conformance to the Interfile standard is informally described in terms of which types of image data types, pixel types, multiple windows, special Interfile features including curves, and restriction to various maximum recommended limits. Interfile is specifically NOT a communications protocol and strictly deals with offline files. There are efforts to extend Interfile to include modalities other than nuclear medicine, as well as to keep ACR/NEMA and Interfile data dictionaries in some kind of harmony. A sample list of Interfile 3.3 key-value pairs is shown here to give you some idea of the flavor of the format. The example is culled from part of a Static study in the Interfile standard document and is not complete: !INTERFILE := !imaging modality :=nucmed !version of keys :=3.3 data description :=static patient name :=joe doe !patient ID :=12345 patient dob :=1968:08:21 patient sex :=M !study ID :=test exam type :=test data compression :=none !image number :=1 !matrix size [1] :=64 !matrix size [2] :=64 !number format :=signed integer !number of bytes per pixel :=2 !image duration (sec) :=100 image start time :=10:20: 0 total counts :=8512 !END OF INTERFILE := One can see how easy such a format would be to extend, as well as how it is readable and almost useable without reference to any standard document or data dictionary. Undoubtedly ACR/NEMA DICOM 3.0 to Interfile translators will soon proliferate in view of the fact that many Nuclear Medicine vendors supply Interfile translators at present. To get hold of the Interfile 3.3 standard, see the Interfile sources, Interfile information contacts and Interfile mailing list described later in this document. 2.5 Qsh Qsh is a family of programs for manipulating images, and it defines an intermediate file format. The following information was derived with the help of one of the authors maguire@it.kth.se(Chip Maguire): Uses an ASCII key-value-pair (KVP sic.) system, based on the AAPM Report #10 proposal. This format influenced both Interfile and ACR-NEMA (DICOM). The file format is referred to as "IMAGE" in some of their articles (see references). The header and the image data are stored as two separate files with extensions *.qhd and *.qim respectively. Qsh is available by anonymous ftp from the Qsh ftp site. This is a seriously large tar file, including as it does some sample images, and lots of source code, as well as some post-script documents. Subtrees are available as separate tar files. QSH's Motif-based menu system (qmenu) will work with OpenWindows 3.0 if SUN patch number 100444-54 for SUNOS 4.1.3 rev. A is applied. The patch is available from ftp://sunsolve1.sun.com (192.9.9.24). The image access subroutines take the same parameters as the older /usr/image package from UNC, however, the actual subroutines support the qsh KVP and image data files. The frame buffer access subroutines take the same parameters as the Univ. of Utah software (of the mid. 1970s). The design is based on the use of a virtual frame buffer which is then implemented via a library for a specific frame buffer. There exists a version of the the display routines for X11. Conversions are not supported any longer, instead there is a commercial product called InterFormat. InterFormat includes a qsh to Interfile conversion, along with DICOM to qsh, and many others. Information is available from reddy@nucmed.med.nyu.edu (David Reddy) (see InterFormat in the Sources section). [Editorial note: this seems a bit of a shame to me - the old distribution included lots of handy bits of information, particularly on driving tape drives. I am advised however that the conversion stuff was pulled out because people wanted it supported, the authors were worried they were disclosing things perhaps they ought not to be, and NYU had switched to using InterFormat themselves anyway. DAC.] The authors of the qsh package are: - maguire@it.kth.se (Gerald Q. (Chip) Maguire) - noz@nucmed.NYU.EDU (Marilyn E Noz) The following references are helpful in understanding the philosophy behind the file format, and are included in postscript form in the qsh ftp distribution: @Article[noz88b, Key=<noz88b>, Author=<M. E. Noz and G. Q. Maguire Jr.>, Title=<QSH: A Minimal but Highly Portable Image Display and Processing Toolkit>, Journal=<Computer Methods and Programs in Biomedicine>, volume=<27>, month=<November>, Year=<1988>, Pages=<229-240> ] @Article[maguire89e, Key=<maguire>, Author=<G.Q. Maguire Jr., and M.E. Noz>, Title=<Image Formats: Five Years after the AAPM Standard Format for Digital Image Interchange>, Journal=<Medical Physics>, volume=<16>, month=<September/October>, year=<1989>, pages=<818-823>, comment=<Also as CUCS-369-88> ] 2.6 DEFF DEFF (Data Exchange File Format) is a portable image file format designed for the exchange, printing and archiving of ultrasound images. It is written by John Bono of ATL from whom the specification may be obtained. The latest version is 2.5, March 25, 1994. It is based on the TIFF 5.0 specification, though a more recent version, TIFF 6.0 is available. Theorectically, any TIFF reader should be able to read the standard tags from a DEFF image, so long as only 8 bit images are in use, as in the Camera Ready class of DEFF images for instance. Additional support is provided for multi-frame images, and 9 to 16 bit images by extending the TIFF format. Because Aldus only allocates a small number of unique registered tags to each vendor, ATL have defined their own extensive set of additional tags, which are referenced by using one of the registered tags ExtendedTagsOffset. Hence these additional tags will not be visible to a conventional TIFF reader. The next part is part3 - proprietary CT formats.

Part1 - Part2 - Part3 - Part4 - Part5 - Part6 - Part7 - Part8

Send corrections/additions to the FAQ Maintainer:
dclunie@flash.us.com (David A. Clunie)

Last Update August 06 1997 @ 03:04 AM

faq-admin@faqs.org