From Challenges for the New Century
p. 109-122, published 2000
This analysis uses data from the March Supplement of the Current Population Survey (CPS) compiled by Unicon Research Corporation. The CPS is a monthly survey conducted by the U.S. Bureau of the Census to obtain data used in estimating the official unemployment statistics for the U.S. government. Each March the core labor questions are supplemented with questions on income, poverty and geographic mobility. The sample is comprised of approximately 62,500 housing units from 792 geographic regions.(1) People are asked questions about amounts and sources of income from the previous year (i.e., the income data from the March 1999 survey refer to income from 1998).
Definition of Income. The survey provides information on pre-tax cash income from a variety of sources. These sources include (1) money wages or salary; (2) net income from nonfarm self-employment; (3) net income from farm self-employment; (4) Social Security or railroad retirement; (5) Supplemental Security Income; (6) public assistance or welfare payments; (7) interest (on savings or bonds); (8) dividends, income from estates or trusts, or net rental income; (9) veterans’ payment or unemployment and workmen’s compensation; (10) private pensions or government employee pensions; and (11) alimony or child support, regular contributions from persons not living in the household and other periodic income. The data on income do not include post-tax income, income from capital gains or such non-cash benefits as food stamps, school lunches or housing subsidies. In addition, the income measures used in this report have been converted to real 1998 dollars using the CPI-UX1 as the appropriate price deflator.
Definition of Family Income. The level of analysis is the family. Rather than use census-defined family income, this report sums individual incomes for persons included in the same family. Persons are grouped together as one family if they are related and living in the same housing unit. For instance, a married couple and relatives with whom they live, such as an elderly mother or a grown child with a spouse and child, are counted as one family. The incomes of each of these members would be included in the summation of that family’s income. Unrelated individuals living together count as separate families. Examples of this situation are an unmarried couple living together each with their children from previous relationships, and three single, unrelated individuals living together as roommates, which would be counted as two and three families, respectively. The only case in which an unrelated individual is included in a family is that of a child under the age of 18 living with an unrelated family or adult individual.
Although this definition allows for families of various sizes, the income measures have been adjusted to represent a family of four.(2) This is done using poverty-line estimates for the various family sizes. Total family income is divided by the poverty line for that family size and then multiplied by the poverty line for a family of four. This family-size correction reflects economies of scale in consumption enjoyed by people who live in larger families. Therefore families of different sizes with the same income will have different levels of adjusted family income. For example, the US poverty scales imply that a family of four requires about twice, rather than four times, the income of a single individual to be equally well off.
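The family-size adjustment described above can be sketched as follows. The poverty thresholds in this example are hypothetical round numbers, chosen only so that the four-person threshold is twice the one-person threshold. They are not the official figures, and the function name is illustrative.

```python
# Hypothetical poverty thresholds by family size (NOT official figures);
# chosen so that a family of four needs twice a single person's income.
POVERTY_LINE = {1: 8_000, 2: 10_000, 3: 13_000, 4: 16_000, 5: 19_000}


def adjust_to_family_of_four(family_income, family_size, lines=POVERTY_LINE):
    """Divide income by the poverty line for the family's own size,
    then multiply by the poverty line for a family of four."""
    return family_income / lines[family_size] * lines[4]


# Same cash income, different family sizes, different adjusted incomes.
single = adjust_to_family_of_four(20_000, 1)
four = adjust_to_family_of_four(20_000, 4)
```

Under these assumed thresholds, a single person with a given income is treated as better off than a family of four with the same income, which is the economies-of-scale effect the adjustment is meant to capture.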
Treatment of Top-coded Variables. Total personal income is made up of income from earnings and from non-earned sources. In several years these sources of income would be top-coded to protect the privacy of survey respondents and help ensure anonymity. For instance, in 1977 "earnings from wage and salary" was top-coded at $50,000. If a respondent reported a value for this category that was equal to or exceeded $50,000, the reported value would be $50,000. Top-codes have changed over time. Increasing the magnitude of the top-code can increase measured income inequality even when the true underlying distribution of income does not change. Other complications arise from the simple fact that top-coding can mask changes in the measure of income inequality caused by the extremes of the income distribution.(3)
To limit biases in measures of inequality due to changing top codes, the percentage top-coded was standardized across every year. Using total personal income, a person’s income is counted as top-coded if it exceeds the top reported value for "earnings from wage and salary." The top-code value for "earnings from wage and salary" was $50,000 from 1976 to 1981. It changed to $75,000 from 1982 to 1984, $99,999 from 1985 to 1995 and $150,000 from 1996 to 1999. All income values greater than these levels for the appropriate years were counted as top-coded. The percentage of incomes top-coded using this method was calculated for each year of the sample. To ensure a consistent distribution of top-coded incomes for the 20-year period, the highest calculated percentages, which were 1.4 percent at the state level and 1.7 percent for the United States, were used to determine the proportion of incomes to be top-coded in every year. Therefore, the 98.6 income percentile for Kentucky and the 98.3 income percentile for the United States for each year were used as the "cutoff" incomes for that year and all incomes greater than those values were set equal to those values at the respective levels of analysis. These maximum amounts of top-coded incomes both occurred in 1995.
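A minimal sketch of the standardization step, assuming an unweighted sample and a simple nearest-rank percentile rule. The report does not state which percentile rule was used and its estimates rely on survey weights, so both choices are simplifications, and the function names are illustrative.

```python
import math


def percentile_value(values, pct):
    """Value at the given percentile by the nearest-rank rule
    (an assumption; the report does not specify the rule)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def standardize_top_code(incomes, pct):
    """Set every income above the pct-th percentile equal to that cutoff."""
    cutoff = percentile_value(incomes, pct)
    return [min(x, cutoff) for x in incomes], cutoff


# Hypothetical incomes: 100 values from 1,000 to 100,000.
incomes = list(range(1_000, 101_000, 1_000))
capped, cutoff = standardize_top_code(incomes, 98.3)
```

Applying the same percentile cutoff in every year keeps the share of top-coded incomes constant, so a change in the nominal top-code does not by itself alter measured inequality.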
Percentile and Ratio Estimation. Trends in income distribution were analyzed using percentiles and ratios. Incomes at the 10th, 25th, 50th, 75th and 90th percentiles were estimated. The 75-25 ratio measures the relative difference between the two percentile levels and is calculated by dividing the 75th percentile income by the 25th percentile income. An increase in this ratio indicates a widening of the income gap between these two percentile levels, while a decline indicates a narrowing of the gap. A ratio of 3.0 means that income at the 75th percentile was three times income at the 25th percentile for that year. To identify the long-term trend more clearly, three-year moving averages of these ratios were used at the state level to smooth out year-to-year variations resulting primarily from small sample size. Finally, the percentage change in incomes between the beginning and end of the period was analyzed for all five percentile levels.
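The three calculations in this paragraph can be sketched as below. The input values are hypothetical, and the moving average is assumed to be centered with the endpoint years dropped, which the report does not specify.

```python
def ratio_75_25(p75_income, p25_income):
    """75th percentile income divided by 25th percentile income."""
    return p75_income / p25_income


def moving_average_3(values):
    """Centered three-year moving average; the first and last years
    are dropped (an assumption about how endpoints were handled)."""
    return [sum(values[i - 1:i + 2]) / 3 for i in range(1, len(values) - 1)]


def percent_change(start, end):
    """Percentage change between the beginning and end of a period."""
    return (end - start) / start * 100


# Hypothetical yearly 75-25 ratios for one state.
yearly_ratios = [3.0, 3.3, 3.0, 3.6]
smoothed = moving_average_3(yearly_ratios)
```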
Since both dependent variables, access to a computer at home and utilization of network services, are binary (yes or no), we use a multivariate probit model to estimate the effect of the predictor variables of income, education, race, gender, location, and age.
Income. The CPS does not report the precise household family income. Rather, it is reported in 14 broad categories. We divided these 14 categories into quartiles for the analysis: the bottom quartile ($0-$19,999) includes 26.8 percent of the sample; the second quartile ($20,000-$34,999) includes 23 percent; the third quartile ($35,000-$59,999) includes 27 percent; and the fourth quartile ($60,000 and over) includes 23.2 percent. In our analysis the first quartile is omitted from the model and is the reference group. We also included a variable in the model, MISINC, if the income data are missing for a household.
Education. Educational attainment is collected for individuals 15 years and older. There are 16 categories, which range from "less than 1st grade" to doctorate. Given this categorization scheme, references to persons with a high school degree or equivalent do not include persons with higher levels of educational attainment. They refer only to people whose highest level of attainment is a high school degree or equivalent. This holds true for all levels of education used in this analysis, including a bachelor’s degree.
Race and Ethnicity. We use five variables to test the effect of race and ethnicity. They are a series of dichotomous variables: non-Hispanic Whites, non-Hispanic Blacks, Hispanics, Asians, and Native Americans. The variable "white" is left out of the model and is therefore the reference group.
Age. This variable is also modeled as a series of dichotomous variables: 0 to 19, 20 to 39, 40 to 59, and 60 and over. The reference group in the model is the youngest group, 0 to 19 years old.
Gender. This variable is equal to 1 for males and 0 for females.
Location of Residence. This is a dichotomous variable indicating whether the residence is in a metropolitan area. The variable (URBAN) is set to 1 if metropolitan and 0 if nonmetropolitan.
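The dichotomous coding described in the variable definitions above can be sketched as a single function. Only MISINC and URBAN are names taken from the text; every other variable name is hypothetical. Education is left out because the text does not say how its 16 categories enter the model.

```python
def encode_person(income_quartile, race, age, male, metropolitan):
    """Build the dummy variables for one person.

    income_quartile is 1-4, or None when income is missing.
    Reference groups (first quartile, white, age 0-19) get no variable.
    """
    row = {}
    for q in (2, 3, 4):  # first quartile is the omitted reference group
        row[f"INC_Q{q}"] = int(income_quartile == q)
    row["MISINC"] = int(income_quartile is None)
    for group in ("black", "hispanic", "asian", "native_american"):
        row[group.upper()] = int(race == group)  # "white" is the reference
    row["AGE_20_39"] = int(20 <= age <= 39)
    row["AGE_40_59"] = int(40 <= age <= 59)
    row["AGE_60_PLUS"] = int(age >= 60)
    row["MALE"] = int(male)
    row["URBAN"] = int(metropolitan)
    return row


reference_person = encode_person(1, "white", 15, False, False)
other_person = encode_person(None, "hispanic", 45, True, True)
```

A person who falls in every reference group has all variables equal to 0, so each coefficient is read as a difference from that baseline.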
The probit coefficients and standard errors(4) for the two models are presented in Table 1. Both models have excellent predictive power. If we hold all variables at their mean and calculate a predicted probability for the "average" Kentuckian, the computer access model predicts a probability of .437 and the network services model predicts a value of .369. These percentages are virtually the same as the actual percentages of .432 (computer access) and .368 (network services).
We calculate the "net" percentages by holding all variables at their mean values and changing only the variable of interest. So, to estimate the effect of race, we hold all variables constant at their mean except non-Hispanic Blacks, which we assign the value of 1. The predicted probability or "net" percentage is the estimated effect of this one factor while holding all other factors constant.
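A minimal sketch of the "net" percentage calculation, assuming the standard probit form in which the predicted probability is the standard normal CDF of the linear index. The coefficients, intercept and sample means below are made up for illustration and are not the estimates in Table 1.

```python
from statistics import NormalDist


def probit_probability(coefs, intercept, x):
    """Standard normal CDF of the linear index intercept + sum(b_k * x_k)."""
    index = intercept + sum(coefs[name] * x[name] for name in coefs)
    return NormalDist().cdf(index)


# Hypothetical coefficients and sample means (NOT the report's estimates).
coefs = {"BLACK": -0.30, "URBAN": 0.25, "MALE": 0.05}
intercept = -0.20
means = {"BLACK": 0.07, "URBAN": 0.50, "MALE": 0.48}

# Predicted probability for the "average" person: all variables at their means.
baseline = probit_probability(coefs, intercept, means)

# "Net" percentage: hold everything at its mean, set the variable of interest to 1.
net_black = probit_probability(coefs, intercept, {**means, "BLACK": 1})
```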
Table 2 contains the estimated means ("gross" percentages) and the 95 percent confidence intervals for access to a computer at home and utilization of network services. Refer to the table "Estimated Gross and Net Percentages of Kentuckians Who Have Access to a Home Computer and Use Network Services, 1998."
In February and March of 2000 the Division of Driver Licensing generated a list of randomly selected 16- and 17-year-old Kentuckians, comprising 1,500 16-year-olds and 1,500 17-year-olds. The University of Kentucky Survey Research Center administered the survey. The 4-page, 39-question survey was mailed to these 3,000 individuals June 2-8, 2000. The survey was closed on August 29, 2000, with 1,088 total completions included in the data. Of the 3,000 individuals sampled, 85 were considered ineligible, and 1,827 did not answer the survey. The response rate was 37.3 percent (1,088 divided by the 2,915 eligible individuals). Table 3 shows some sample characteristics.
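The response-rate arithmetic can be checked with a short sketch. The function name is illustrative, and the figures are the ones reported above.

```python
def response_rate(mailed, ineligible, completed):
    """Completions as a percentage of eligible sample members."""
    eligible = mailed - ineligible
    return completed / eligible * 100


# 3,000 surveys mailed, 85 ineligible, 1,088 completions.
rate = response_rate(3_000, 85, 1_088)
```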
Table 3: Kentucky High School Students, Sample Characteristics
The dependent variable, whether an individual attends school, is a dichotomous or binary variable (yes or no). There are two questions that we used to determine if one is attending school: Is … attending or enrolled in regular school? (Regular school includes elementary school and schooling which leads to a high school diploma or college, university or professional school degree.); and Excluding regular college courses and on the job training is … taking any business, vocational, technical, secretarial, trade, or correspondence courses?
We used a probit model to estimate the effect of the predictor variables (income, education, race, ethnicity, gender, location, and age) on the probability of attending school. The data are pooled from 1996, 1997, and 1998 to bolster the state-level sample size. The number of observations for analysis of 18 to 44 year olds is 144,299 for the United States and 1,833 for Kentucky. The number of observations for the analysis of 15 to 24 year olds is 48,188 for the United States and 553 for Kentucky.
Income. The CPS does not report precise household family income. Rather, it is reported in 14 broad categories. We divided these 14 categories roughly into quartiles for the analysis. Since the data are pooled from three years, we transformed the data before pooling them. In our analysis the first quartile is omitted from the model and is the reference group. We also included a variable in the model, MISINC, if the income data are missing for a household.
Education. Educational attainment is collected for individuals 15 years old and older. There are 16 categories, which range from "less than 1st grade" to doctorate.
Race and Ethnicity. We use five variables to test the effect of race and ethnicity. They are a series of dichotomous variables: non-Hispanic Whites, non-Hispanic Blacks, Hispanics, Asians, and Native Americans. The variable "white" is left out of the model and is therefore the reference group.
Age. This variable is modeled as a continuous variable.
Gender. This variable is equal to 1 for males and 0 for females.
Location of Residence. This is a dichotomous variable indicating whether the residence is in a metropolitan area. The variable (URBAN) is set to 1 if metropolitan and 0 if nonmetropolitan.
The probit coefficients and standard errors(5) for the four models are presented in Table 5.
Table 5: Probability of Attending School, Probit Estimates
In the fall of 1999 the Administrative Office of the Courts generated, from voter registration and driver’s license lists, a random sample of Kentuckians born before January 1, 1955. Included in the sample were the names and addresses of 2,500 persons age 45 and older. The University of Kentucky Survey Research Center administered a 17-page, 69-question survey to these 2,500 individuals between February 1, 2000 and February 4, 2000. The survey was closed on May 12, 2000, with 962 total completions included in the data. Of the 2,500 individuals sampled, 313 were considered ineligible for various reasons and 1,225 did not answer the survey. The response rate was 44.4 percent (962 divided by 2,187). Table 6 shows some sample characteristics.
Table 6: Kentuckians, Age 45 and Older, Sample Characteristics
(1) Sample sizes have changed since the survey’s inception in 1940 to accommodate shifting population patterns and population growth. The monthly sample consists of approximately 60,000 housing units, but the March Supplement includes an additional 2,500 housing units that have at least one resident of Hispanic origin. For a detailed discussion of the changes that have taken place during this period see U.S. Census Bureau, Current Population Survey: Design and Methodology, Technical Paper 63, March 2000. Electronic version available online at: www.bls.census.gov/cps/tp/tp63.htm.
(2) Other studies that use similar family-size adjustments are Lynn A. Karoly, "Anatomy of the U.S. Income Distribution: Two Decades of Change," Oxford Review of Economic Policy 12.1 (1998): 77-96; Gary Burtless, "Effects of Growing Wage Disparities and Changing Family Composition on the U.S. Income Distribution," Working Paper No. 4, The Brookings Institution, Washington, D.C. (1999).
(3) Since the sum of individual incomes was used to create a new family income, the census-defined family income top-code values were not used.
(4) The analyses were done using individual weights that approximately equal the inverse of the probability of being in the sample and normalized to add up to the sample size. This produces the correct standard errors.
(5) The analyses were done using individual weights that approximately equal the inverse of the probability of being in the sample and normalized to add up to the sample size. This produces the correct standard errors.