[Figure: Map of Cleveland, Ohio. From the Tiger Mapping Service.]

Cutler/Glaeser/Vigdor Segregation Data:
User's Guide

Table of Contents

  1. How Segregation Is Measured
  2. Area Definitions
  3. Sample Selection Criteria
  4. Supplementary Variable Definitions
  5. Citing This Work
  6. References

How Segregation Is Measured

Segregation is the physical separation of members of one racial, ethnic, social, or economic group from members of another group. This dataset measures the segregation of blacks from non-blacks.

No single summary measure can fully characterize segregation, but several indices attempt to capture important aspects. All segregation indices require a city or similar area to be divided up into smaller geographic subunits. The racial composition of these subunits is then used to derive the various indices. Complete definitions of the indices presented here can be found in Cutler, Glaeser, and Vigdor (1999). For additional discussion of segregation indices, see Massey and Denton (1988).


The Index of Dissimilarity, proposed by Duncan and Duncan (1955), measures the extent to which blacks and non-blacks inhabit different areas of a city. It ranges between zero and one, and can be interpreted as the fraction of blacks that would have to switch areas to achieve an even racial distribution citywide. Zero indicates perfect integration; one indicates perfect segregation, meaning that blacks and non-blacks inhabit completely different areas. A value above 0.6 is generally considered high.
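As a rough illustration of the definition above, the index is commonly written as half the sum, over subunits, of the absolute difference between each subunit's share of the citywide black population and its share of the citywide non-black population. The Python sketch below assumes that form. The function name and the counts are hypothetical and are not taken from the dataset; Cutler, Glaeser, and Vigdor (1999) remains the authority for the exact definition used here.

```python
def dissimilarity(black, nonblack):
    """Index of Dissimilarity from per-subunit counts of blacks and non-blacks.

    Assumed form: 0.5 * sum(|b_i / B - n_i / N|), where B and N are the
    citywide totals of the two groups.
    """
    total_black = sum(black)
    total_nonblack = sum(nonblack)
    if total_black == 0 or total_nonblack == 0:
        raise ValueError("both groups need a positive citywide population")
    return 0.5 * sum(
        abs(b / total_black - n / total_nonblack)
        for b, n in zip(black, nonblack)
    )


# Hypothetical three-subunit cities (illustrative counts only).
even = dissimilarity([10, 20, 30], [100, 200, 300])   # identical shares everywhere
separate = dissimilarity([50, 50, 0], [0, 0, 400])    # no subunit is shared
```

In the first hypothetical city both groups are spread in the same proportions, so the index is zero; in the second the groups share no subunit, so the index is one.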


Even if blacks are concentrated in certain parts of a city, they may have extensive contact with non-blacks within those areas. The Index of Isolation, calculated with slightly different formulas by Bell (1954) and White (1986), measures how little contact blacks have with non-blacks. This index also varies between zero and one, with higher values indicating greater black exposure to other blacks and therefore less contact with non-blacks. Values above 0.3 are considered high.
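The idea of exposure can be sketched as the black share of the subunit in which the average black resident lives. The Python function below computes only that unadjusted quantity, under the assumption that this is the core of the isolation measure. It is not presented as the exact formula of Bell (1954), of White (1986), or of this dataset, which may rescale the value; see Cutler, Glaeser, and Vigdor (1999) for the definition actually used. The function name and the counts are hypothetical.

```python
def black_exposure_to_blacks(black, total):
    """Black-weighted average of the black population share across subunits.

    black[i] is the black population of subunit i and total[i] is its total
    population. Subunits with no residents are skipped.
    """
    total_black = sum(black)
    if total_black == 0:
        raise ValueError("the black population must be positive")
    return sum(
        (b / total_black) * (b / t)
        for b, t in zip(black, total)
        if t > 0
    )


# Hypothetical city: the same 100 blacks arranged two ways across two subunits
# of 100 residents each (illustrative counts only).
concentrated = black_exposure_to_blacks([90, 10], [100, 100])  # roughly 0.82
spread_out = black_exposure_to_blacks([50, 50], [100, 100])    # 0.5
```

The concentrated arrangement yields the higher value, matching the reading that higher values mean greater black exposure to other blacks.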


The Index of Clustering, defined by White (1983), measures the extent to which black neighborhoods abut one another. In this respect, it overcomes a shortcoming of the other indices, which fail to account for the racial composition of the neighborhoods surrounding any one area. Unfortunately, this index requires data on area latitude and longitude that we have only for 1990. The index varies (approximately) between zero and one, with a value of one indicating that all blacks (or non-blacks) live in one large cluster of neighborhoods.


The Index of Concentration measures the extent to which blacks inhabit the physically smallest geographic subunits of a city. It ranges from -1 to 1, with extreme values indicating that blacks inhabit the largest or smallest areas, and a value of zero indicating a relatively even spread of blacks across areas. Calculating this index requires data on unit area, which we have only for 1990.


The Index of Centralization measures the tendency for a black population to be distributed close to a city's central business district. This index also ranges from -1 to 1, with higher values indicating greater centralization, and a value of zero indicating an even spread. Latitude and longitude data are necessary to calculate this index, so we present values only for 1990.

Area Definitions

From 1890 to 1940, the Census published reports for city wards. Wards are political units that vary widely in population size and area across cities; for this reason, they are not the ideal city subunits for cross-city comparisons. Beginning in 1940, the Census published tract reports. Census tracts are geographically compact areas, usually delimited by major streets, city boundaries, or natural features, and containing approximately 4,000 persons. Tracts are designed to be comparable units across cities and over time. Our segregation indices use wards from 1890 to 1940, and tracts from 1940 to 1990.

Ward-based indices tend to fall below tract-based indices simply because wards are larger areas. To correct for this discrepancy, we suggest using a correction factor of 0.152 for the Dissimilarity Index and 0.157 for the Isolation Index. These factors are simply the mean differences between ward and tract indices for the 47 cities that report both in 1940. Note that these correction factors have not been added to the data presented here.
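Because the correction factors are not built into the published values, a user who wants ward-based and tract-based indices on a comparable footing has to add them by hand. The Python sketch below shows one way to do that, assuming the factor is simply added to a ward-based value. The constant and function names are hypothetical, and the sketch does not cap the result at one, since the text does not say how values near the upper bound should be treated.

```python
# Mean ward-to-tract differences quoted in the text (47 cities, 1940).
WARD_CORRECTION = {
    "dissimilarity": 0.152,
    "isolation": 0.157,
}


def tract_equivalent(ward_index, kind):
    """Add the suggested correction factor to a ward-based index value.

    kind is "dissimilarity" or "isolation". Only ward-based values
    (1890-1940) should be adjusted; tract-based values are left alone.
    """
    return ward_index + WARD_CORRECTION[kind]


# Hypothetical ward-based Dissimilarity Index of 0.500.
adjusted = tract_equivalent(0.500, "dissimilarity")  # about 0.652
```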

Through 1950, segregation is defined at the city level. After World War II, rapid suburbanization led to the creation of Metropolitan Statistical Areas (MSAs), groups of counties surrounding a central city. Beginning in 1960, our indices use MSAs rather than cities. For further discussion, see Cutler, Glaeser and Vigdor (1999).

Geographic areas are identified in two ways in these files. Where applicable, each line of data contains a 4-digit Federal Information Processing Standard (FIPS) code. These codes identify metropolitan areas; we use the codes in effect as of June 1988. For MSAs without FIPS codes, and for city-level data, we use an eight-character alphabetical code consisting of the first six letters of the city name followed by the two-letter state postal abbreviation. For example, the name code birminal refers to Birmingham, Alabama. In the case of MSAs, name codes refer to the central city.
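The name-code rule can be sketched in a few lines of Python, which may help when matching these files against other city lists. The function name is hypothetical. Dropping spaces and punctuation before taking the first six letters is an assumption, and the text does not say how city names shorter than six letters are coded, so results for such names should be checked against the data files.

```python
def name_code(city, state_abbrev):
    """Build a name code: first six letters of the city plus the state abbreviation.

    Assumption: characters that are not letters (spaces, periods, hyphens)
    are dropped before the first six letters are taken.
    """
    letters = "".join(ch for ch in city if ch.isalpha())
    return (letters[:6] + state_abbrev).lower()


# The example given in the text.
code = name_code("Birmingham", "AL")  # "birminal"
```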

Sample Selection Criteria

To be included in our sample, a city or MSA must have ward- or tract-level data available in a given year. In 1890, cities with over 25,000 inhabitants had ward reports. From 1900 to 1940, the population cutoff was 50,000. Tract reporting began with a select group of large cities in 1940 and 1950. Beginning in 1960, tracts are reported for most MSAs. Note that the number of MSAs has increased substantially since then.

Because segregation measures are most meaningful when the black population is sizable, we sample only cities or MSAs with at least 1,000 blacks. The following table describes the resulting sample.

Year  Area Used  Geographic Subunit  Sample Size
1890  City       Ward                60
1900  City       Ward                54
1910  City       Ward                72
1920  City       Ward                90
1930  City       Ward                111
1940  City       Ward
1950  City       Tract               76
1960  MSA        Tract               158
1970  MSA        Tract               211
1980  MSA        Tract               284
1990  MSA        Tract               313

Supplementary Variable Definitions

Note: In the actual data, a year identifier takes the place of the underscores (__) in a mnemonic. Also note that supplementary datasets often include values for cities that do not have segregation indices for the given year.

Variable Name                                              Mnemonic  Years Available        Notes
Population                                                 POP__     All                    --
Black Population                                           B__       All                    --
Foreign Born Population                                    FB__      1890-1980              Number of foreign born whites.
Hispanic Population                                        H__       1970-1990              --
Number of Wards                                            WARD__    1890-1940              --
Number of Tracts                                           TRACT__   1940-1990              --
Land Area                                                  AREA__    1900-1940; 1970, 1990  Measured in acres, 1900-1940; square miles, 1970 and 1990.
Person-Weighted Density                                    PWDENS    1990                   Equal to the density (in persons per square kilometer) of the Census tract where the "average" person lives.
Trolley Passengers Per Capita, 1902                        PASSPC    1910                   From 1902 Census of Streetcar Railways.
Black Servant Share, 1910                                  BSVTSH    1910                   Share of employed blacks working as domestic servants.
Growth in Urban Mileage, 1950-1960                         GUM       1970                   Measures the increase in miles of state-maintained roads within urban areas. This is a state-level variable.
Number of Governments, 1962                                NGOV62    1990                   From 1962 Census of Governments; municipal and township governments only.
Percent of Revenue from Intergovernmental Transfers, 1962  REVIG62   1990                   From 1962 Census of Governments.
Median Household Income                                    MEDINC    1990                   From 1990 Census.
Manufacturing Share                                        MANSHR    1990                   Percent of labor force employed in manufacturing.
Income Segregation                                         INCSEG    1990                   Dissimilarity Index using high and low income households.
Black Income Segregation                                   BINCSEG   1990                   Dissimilarity Index using high and low income black households.
Educational Exposure                                       EDUCEX    1990                   Measures black exposure to persons who have attended some college.
Relative Rate of Single Motherhood                         SINGMOM   1990                   Difference between black and nonblack rates. Based on women between the ages of 40 and 60.
Male Education                                             MEDUC     1990                   Difference between black and nonblack share of college attendees. Based on men between the ages of 40 and 60.
Time to Work                                               TTWORK    1990                   Difference between average black and nonblack commuting time, in minutes.
Cost of Living Index                                       CSTINDEX  1990                   Imputed for 59 MSAs, based on income, population, and region.
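The PWDENS entry describes the density of the tract where the "average" person lives. One natural reading, assumed here, is a population-weighted mean of tract densities, in which each tract's density counts in proportion to the number of people who live there. The Python sketch below follows that reading. The function name and the figures are hypothetical, and the dataset's own calculation may differ in detail.

```python
def person_weighted_density(population, area_km2):
    """Population-weighted mean of tract densities (persons per square km).

    population[i] and area_km2[i] describe tract i. Each tract's density
    p_i / a_i is weighted by its share of the total population, p_i / P.
    """
    total = sum(population)
    if total == 0:
        raise ValueError("total population must be positive")
    return sum(
        (p / total) * (p / a)
        for p, a in zip(population, area_km2)
    )


# Hypothetical MSA with two tracts of one square kilometer each.
weighted = person_weighted_density([1000, 3000], [1.0, 1.0])  # 2500.0
conventional = (1000 + 3000) / (1.0 + 1.0)                    # 2000.0
```

The weighted figure exceeds the conventional density because three quarters of the residents live in the denser tract.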

Additional variables found in the ward and tract data:

Variable Name                             Mnemonic  Years Available       Notes
City Indicator                            CITYFLAG  1890-1950             When equal to 1, the line contains city totals rather than individual ward/tract values. No city totals available after 1950.
Native White, Native Parentage            NNP       1890-1930             --
Native White, Foreign or Mixed Parentage  NFP       1890-1930, 1960-1970  --
Native White                              NTVW      1940-1950             --
White Population                          WHITE     1970-1990             --
Other Race                                OTHER     1890-1950             --
New Immigrants                            NEWIMM    1910-1920             Individuals born in Southern/Eastern European countries.
Old Immigrants                            OLDIMM    1910-1920             Individuals born in Northern European countries.
Other Immigrants                          OTHIMM    1910-1920             Individuals born in foreign countries outside Europe.
Tract Area                                TAREA     1990                  Measured in square kilometers.
Tract Longitude                           LONG      1990                  --
Tract Latitude                            LAT       1990                  --

Citing This Work

Users of the historical segregation and supplemental data should cite Cutler, Glaeser and Vigdor (1999). Users of the 1990 supplemental data should cite Cutler and Glaeser (1997). The full bibliographic citations are listed below.

References

Bell, W. (1954) "A Probability Model for the Measurement of Ecological Segregation." Social Forces 32:357-364.

Cutler, D. and E. Glaeser (1997) "Are Ghettos Good or Bad?" Quarterly Journal of Economics 112:827-872.

Cutler, D., E. Glaeser and J. Vigdor (1999) "The Rise and Decline of the American Ghetto." Journal of Political Economy 107:455-506.

Duncan, O. and B. Duncan (1955) "A Methodological Analysis of Segregation Indices." American Sociological Review 20:210-217.

Massey, D. and N. Denton (1988) "The Dimensions of Residential Segregation." Social Forces 67(2):281-315.

White, M. (1986) "Segregation and Diversity: Measures in Population Distribution." Population Index 52:198-221.

White, M. (1983) "The Measurement of Spatial Segregation." American Journal of Sociology 88:1008-1019.

