The Development of the DSM

(Primarily from Blashfield, 1998; DSM-IV-TR, 2000; Scotti and Morris, 2000)

    Initially, the need for a classification system in the United States was driven by the collection of statistical information in the census.  During the 1840 census, the frequency of “idiocy/insanity” was recorded.  By the 1880 census, 7 categories of insanity had been established: mania, melancholia, monomania, paresis, dementia, dipsomania, and epilepsy (DSM-IV-TR, 2000).

    In 1917, the American Psychiatric Association (APA) and National Commission on Mental Hygiene adopted a system similar to Kraepelin’s.  By the end of World War II, there were four competing classification systems: APA’s 1932 revision, the US Army’s system, the US Navy’s system, and the system of the Veterans Administration.

    In addition, the first International Classification of Diseases (ICD) was released in 1900 to provide a standard format for morbidity and mortality statistics.  ICD-6 was the first edition of that series to include mental disorders.  It offered 10 categories for psychoses, 9 for neuroses, and 7 for disorders of character, behavior, and intelligence.  However, ICD-6 lacked organization and was not widely accepted (Blashfield, 1998).

    At this time in the United States, there were 5 competing systems.  To lessen the confusion, APA’s Committee on Nomenclature and Statistics began work on the first edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM).

    The purpose of DSM-I was to create a common nomenclature based on a consensus of the contemporary knowledge about psychiatric disorders.  APA sent questionnaires to 10% of its membership and asked for comments on the proposed categories.  The final version, which assigned categories based on lists of symptoms, was approved by a vote of the membership and published in 1952 (Blashfield, 1998).  DSM-I included 3 categories of psychopathology: organic brain syndromes, functional disorders, and mental deficiency.  These categories contained 106 diagnoses.  Only one diagnosis, Adjustment Reaction of Childhood/Adolescence, could be applied to children.

    DSM-II was published in 1968 to further facilitate communication among professionals.  It had 11 major diagnostic categories.  Increased attention was given to the problems of children and adolescents with the addition of the category Behavior Disorders of Childhood-Adolescence.  This category included Hyperkinetic Reaction, Withdrawing Reaction, Overanxious Reaction, Runaway Reaction, Unsocialized Aggressive Reaction, and Group Delinquent Reaction (Scotti and Morris, 2000).  There were 185 diagnoses in DSM-II.

    DSM-I and DSM-II were widely criticized for a variety of reasons.  Most importantly, the reliability and validity of the first two editions were challenged (Blashfield, 1998; Kirk and Kutchins, 1994).  The diagnostic descriptions were not detailed, leaving considerable room for error.  Additionally, the descriptions had been written by a small number of academics rather than derived from empirical studies.  Many psychiatrists criticized the implicit medical model, stating that it was inappropriate because the cause of most disorders was unknown.

    The most outspoken critic was Thomas Szasz, who became the leader of the anti-psychiatrists.  His book The Myth of Mental Illness (1961) claimed that mental disorders are really “problems in living” and accused psychiatrists of being “moral policemen” (Blashfield, 1998).  Fears about stigmatization and self-fulfilling prophecies were growing.  Rosenhan’s “On Being Sane in Insane Places” (1973) increased anxiety about the reliability and validity of the first two editions of DSM, and about psychiatry and psychology as professions.

    Rosenhan’s study caused Robert Spitzer and others to question the reliability of the DSM-II.  In 1974, Spitzer and Fleiss wrote “A Re-Analysis of the Reliability of Psychiatric Diagnosis.”  Spitzer and Fleiss used kappa, a statistic of inter-rater reliability that corrects for chance agreement, to re-compute the findings of 6 earlier studies.  They concluded:

There are no diagnostic categories for which reliability is uniformly high.  Reliability appears to be satisfactory for three categories: mental deficiency, organic brain syndrome (but not its subtypes), and alcoholism.  The level of reliability is no better than fair for psychosis and schizophrenia and is poor for the remaining categories.
(Spitzer and Fleiss, 1974)
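
    For readers unfamiliar with kappa, its usual two-rater form (Cohen's kappa) is given below.  This is the standard textbook definition, offered as background rather than as the exact computation in Spitzer and Fleiss (1974); the symbols p_o (observed proportion of agreement) and p_e (proportion of agreement expected by chance) are notation introduced here, and the numbers in the worked line are purely illustrative.

```latex
\kappa = \frac{p_o - p_e}{1 - p_e}
\qquad\text{e.g.}\qquad
p_o = 0.70,\; p_e = 0.50
\;\Rightarrow\;
\kappa = \frac{0.70 - 0.50}{1 - 0.50} = 0.40
```

    A kappa of 1 indicates perfect agreement, and a kappa of 0 indicates agreement no better than chance.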

    Spitzer and others began work on the Research Diagnostic Criteria (RDC), a set of research-based criteria with associated structured interviews intended to increase reliability.  Because of the impact of his work on the RDC, Spitzer was named head of the DSM-III Task Force (Kirk and Kutchins, 1994).

    The other major work behind the creation of DSM-III was “Diagnostic Criteria for Use in Psychiatric Research” (Feighner et al., 1972; in Blashfield, 1998).  Focusing on the problem of uniform definitions, Feighner et al. offered explicit criteria for 15 categories.  They also presented a considerable amount of evidence of validity.  Because it seemed that Feighner et al. had found the answer to the DSM’s problems, those criteria were accepted immediately.

    DSM-III was revolutionary in many ways.  Although based on the medical model, it purported to be atheoretical.  It included a multiaxial system for assessment of the client as an individual as well as a family and community member.  Unlike its predecessors, DSM-III was based on scientific evidence.  Its reliability was improved with the addition of explicit diagnostic criteria and structured interviews.  DSM-III stimulated additional research to ensure the adequacy of criteria.  Social and political debates over terminology and diagnoses, such as the use of the word “neurosis” and the removal of homosexuality as a diagnosis, were frequent.  DSM-III was so popular that its revenues led to the formation of the American Psychiatric Press.

    Despite its innovations, DSM-III also presented challenges to professionals.  It contained 482 pages, a tremendous increase over DSM-II’s 92 pages.  DSM-III also contained 265 diagnoses to DSM-II’s 185.  It was not user-friendly in its bulk or in its categories.  A new problem of international communication was becoming apparent.  Although ICD and DSM were similar in terms of criteria, their codes were very different.  Kirk and Kutchins (1994) claimed that there were still issues of reliability in DSM-III.  For example, the reliability statistics were computed across broad classes of diagnoses.  Therefore, if one clinician determined that a patient had Histrionic Personality Disorder, and another clinician determined that the same patient had Borderline Personality Disorder, the two were counted as being in perfect agreement, because both had diagnosed a personality disorder.  Due to new research, field trials, and the problem of coding, APA published DSM-III-R in 1987.
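
    The broad-class objection raised by Kirk and Kutchins can be illustrated with a small sketch.  It assumes the two-rater form of kappa (Cohen's kappa); the function name, the mapping of diagnoses to classes, and the four ratings are all hypothetical and are not data from any DSM field trial.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(rater_a)
    # Proportion of cases on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one and the same label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical mapping from specific diagnoses to broad classes.
BROAD_CLASS = {
    "Histrionic PD": "Personality Disorder",
    "Borderline PD": "Personality Disorder",
    "Major Depression": "Mood Disorder",
    "Dysthymia": "Mood Disorder",
}

# Invented ratings: the clinicians never pick the same specific diagnosis,
# but they always pick diagnoses from the same broad class.
clinician_1 = ["Histrionic PD", "Borderline PD", "Major Depression", "Dysthymia"]
clinician_2 = ["Borderline PD", "Histrionic PD", "Dysthymia", "Major Depression"]

specific_kappa = cohen_kappa(clinician_1, clinician_2)
broad_kappa = cohen_kappa(
    [BROAD_CLASS[d] for d in clinician_1],
    [BROAD_CLASS[d] for d in clinician_2],
)

print(f"kappa on specific diagnoses: {specific_kappa:.2f}")
print(f"kappa on broad classes:      {broad_kappa:.2f}")
```

    In this invented example the clinicians disagree on every specific diagnosis, yet agreement looks perfect once the diagnoses are collapsed into broad classes.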

    DSM-III-R was intended to be a short update to the 3rd edition manual; however, the differences between III and III-R were great.  DSM-III-R saw categories renamed and reorganized, along with significant changes in criteria.  Six categories were deleted while others, such as Trichotillomania, were added.  Additional assurances of reliability were presented in the form of field trials and more diagnostic interviews.  Controversial diagnoses, such as Premenstrual Syndrome, Masochistic Personality Disorder, and Paraphilic Rapism, were considered and discarded due to their social implications (Blashfield, 1998).  Altogether, DSM-III-R contained 297 diagnoses.

    Although widely accepted, DSM-III and DSM-III-R were also widely criticized.  First, the scientific evidence was questioned.  Many of the field trials were conducted by experts in the field, so true objectivity could not be assured.  The multiaxial system prevented efficiency in diagnosis.  Additionally, DSM offered unequal amounts of support and direction for each axis.  While there were 300 pages of description for Axis I and 39 pages for Axis II, Axes IV and V were given only 2 pages each (Blashfield, 1998).  The rating scale format of IV and V was also foreign to many professionals.  The axes themselves were problematic for many practitioners because no one seemed to know how those particular areas were chosen.  Psychoanalysts began to argue for an axis on defense mechanisms, and nurses wanted an axis for level of care.

    DSM-IV was published in 1994 to reflect research conducted since 1987’s DSM-III-R.  DSM-IV was a major undertaking.  It involved a steering committee of 27 people, including 4 psychologists.  The steering committee created 13 work groups of 5-16 members.  Each work group had approximately 20 advisors.  Individuals were chosen for their knowledge in the field as well as to maximize diversity.  The work groups followed a 3-step process.  First, each group conducted an extensive literature review of its diagnoses.  Then, the groups requested data from researchers and conducted analyses to determine which criteria required change.  Finally, they conducted multicenter field trials to be sure that clinical research was relevant to clinical practice (Frances, Mack, Ross, and First, 2000).

    DSM-IV saw the restructuring of several categories, such as the inclusion of Overanxious Disorder of Childhood within Generalized Anxiety Disorder (GAD) (Scotti and Morris, 2000).  The multiaxial system was maintained.  DSM-IV offered detailed information about each disorder, including essential and associated features; prevalence, course, and familial pattern; differential diagnosis; and age, gender, and culture.  Source books, decision trees, glossaries, and alphabetical and numerical listings provided ways to increase the manual’s utility.  DSM-IV included 365 diagnoses, and at 886 pages, it was more than 7 times longer than DSM-II (Blashfield, 1998).

    Like its predecessors, DSM-IV was criticized.  It was accused of leaning toward biological explanations although it purported to be atheoretical.  Comorbidity, symptom overlap, and heterogeneity of presentation were seen as threats to reliability.  It did not solve the problem of PMS and the other controversial disorders; it simply listed them among disorders requiring further study.  Additionally, the axes problem remained unsolved, with 3 candidates (defense mechanisms, interpersonal functioning, and occupational functioning) still in the running (Blashfield, 1998).

    DSM-IV-TR (2000) was released to correct factual errors and to make changes reflecting recent research.  It did not attempt to address any of the problems of DSM-IV.  Rather, the changes were limited to text, with particular emphasis placed on client-centered language.  Phrases such as “a schizophrenic” were removed and replaced with “an individual with Schizophrenia” in an effort to classify disorders, not people (DSM-IV-TR, 2000).

    What will DSM-V be like?  Fogel (in Brendel, 2001) suggests that it might become more descriptive with the removal of all etiological implications, even the phrase “due to a General Medical Condition.”  Medical disorders could continue to appear on Axis III but not on Axis I.  Similarly, Fogel suggests that psychosocial stressors appear only on Axis IV, not Axis I.  However, he concedes that these changes would limit the practitioner’s ability to fully integrate data.  The controversies over social diagnoses and the multiaxial system must also be addressed in DSM-V.