METHODOLOGICAL NOTE
Traditional approaches can create difficulties when one is attempting to accurately sample congressional districts. A dual frame alternative can be an effective solution.
Creating representative telephone samples for congressional districts (CDs) can be extremely problematic. CDs are fashioned as part of a political process, and their geographic boundaries are determined based on total population with little or no regard to existing geographic definitions such as ZIP Codes or counties. The resulting geographic definitions are non-standard and do not follow neat geographic boundaries. The result can be the rather bizarre shapes drawn on maps known as gerrymanders. Further complicating the issue, most respondents cannot identify which CD they live in. This makes it nearly impossible to rely on respondents as a source of information accurate enough for sampling purposes.
This Methodological Note will review the problems associated with previous CD sampling methods, and then detail current approaches that provide solutions to these problems. The net result is a sample frame that provides accurate geographic representation of the CD, requires little if any reliance on the respondent for screening purposes, and is cost-effective in the data collection phase of the project.
History of CD Sampling
In the past, commercial approaches to sampling CDs have utilized approximations rather than exact area definitions. This resulted in poor coverage of the CD or, even worse, erroneous coverage of neighboring CDs. Further complicating the issue, biases were also introduced based on the type of sample utilized: random digit dial (RDD) or listed sample frames.
RDD Approach. Sampling CDs using RDD telephone sampling involves approximations of actual CD geographic boundaries. Typically, ZIP Codes (usually only those that serve the CD exclusively) are used as a first approximation. This results in the elimination of an unknown proportion of the voting age population and the likely inclusion of households that are geographically located in a neighboring district. As far as we are aware, the commercially available CD samples do not provide estimates of either over or under coverage. Consequently, no reliable assessments of the potential biases are available.
The RDD approach's major advantage is its ability to include unlisted and unpublished household numbers in the sampling frame.
Listed Household Approach. The listed household approach makes it possible to select samples that have almost 100% congruence with CD boundaries. Most major database suppliers designate households at the Census Tract/Block Group (CT/BG) level, which maps almost exactly to CD boundaries.
The major problem with using the listed household frame is the elimination of unlisted and unpublished numbers from the frame. In some areas, this can mean that 50% or more of the voting age population would be excluded from the sample frame.
There are obviously significant limitations involved when attempting to sample CDs using either of the above approaches. GENESYS Sampling Systems would like to suggest an alternative that utilizes new data sources as well as the best of both of the above methods.
The Dual Frame Alternative
The dual frame approach begins with accurate definitions of the CD boundaries and then utilizes the strengths of both sample frames: the household coverage provided by the RDD sampling approach and the geographic specificity of listed samples. The dual frame sampling approach can achieve near 100% geographic coverage of the CD, virtually eliminate over coverage, minimize exclusion of unlisted households, and remove the need to rely on the respondent for screening purposes. It is estimated that upwards of 95% of all households will be included in the sample frame.
The dual frame approach begins with the construction of the basic sample frame. It is a multistage process, involving a number of diverse data sources:
- The CD is defined at the CT/BG level using Census geography. This provides an accurate geographic definition of the CD.
- Using the detailed exchange/CT/BG files maintained by GENESYS, a summary of CD/exchange coverage is created. These files provide both the CDs served by each individual telephone exchange, and the number of listed telephone households served in each CD. This information is available for every telephone exchange and CD in the US, including Alaska and Hawaii.
- A CD coverage report is created. This report details the number and proportion of households served by each exchange that are located in the CD of interest. The exchanges are ranked from highest to lowest in terms of the proportion of their households residing in the CD, along with the cumulative proportion of the CD represented.
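As a rough illustration of the coverage report described above, the following Python sketch builds one from a flat list of exchange/CD household counts. The input layout, the function name cd_coverage_report, and the exchange and CD labels are hypothetical and chosen only for this example; they do not represent the actual GENESYS file format or software.

```python
from collections import defaultdict


def cd_coverage_report(rows, target_cd):
    """Build a coverage report for one CD.

    rows: iterable of (exchange, cd, listed_households) tuples.
    This layout is an assumption made for illustration only.
    """
    exchange_totals = defaultdict(int)  # listed households per exchange, all CDs
    in_cd = defaultdict(int)            # listed households per exchange, target CD
    for exchange, cd, households in rows:
        exchange_totals[exchange] += households
        if cd == target_cd and households > 0:
            in_cd[exchange] += households

    cd_total = sum(in_cd.values())
    report = [
        {
            "exchange": exchange,
            "hh_in_cd": hh,
            # proportion of the exchange's households located in the CD
            "pct_of_exchange_in_cd": hh / exchange_totals[exchange],
            # proportion of the CD's households served by this exchange
            "share_of_cd": hh / cd_total,
        }
        for exchange, hh in in_cd.items()
    ]

    # Rank from highest to lowest proportion of the exchange inside the CD.
    report.sort(key=lambda r: (-r["pct_of_exchange_in_cd"], -r["hh_in_cd"]))

    # Add the cumulative proportion of the CD represented so far.
    cumulative = 0.0
    for r in report:
        cumulative += r["share_of_cd"]
        r["cum_share_of_cd"] = cumulative
    return report


# Made-up counts for a hypothetical district "CD-07".
example_rows = [
    ("555-201", "CD-07", 4000),
    ("555-202", "CD-07", 2700),
    ("555-202", "CD-08", 300),
    ("555-203", "CD-07", 1000),
    ("555-203", "CD-08", 3000),
]
example_report = cd_coverage_report(example_rows, "CD-07")
for row in example_report:
    print(row)
```

In this made-up data, one exchange lies wholly inside the CD, one is mostly inside, and one is mostly outside, which is the situation the sampling options below are designed to handle.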
From here GENESYS provides a number of different CD sampling options to the researcher, based on how good the "fit" is between the exchange's boundaries and the CD boundary. Using the CD coverage report:
- Select only those exchanges that serve the CD in their entirety (i.e., 100% of the exchange is in the CD). This option is effective if these 100% exchanges also provide good coverage of the CD, so its advisability will depend upon the proportion of the CD households covered by the exchanges.
- Select those exchanges that serve the CD in their entirety plus exchanges that are close to 100% in the CD. This will increase coverage of the target CD, but will result in a "small" proportion of spill-over into neighboring CDs (since some of the exchanges will be less than 100% in the CD).
- Incorporate the full CD methodology, which provides for RDD sampling in the exchanges with 100% or nearly 100% coverage, and a listed household sample in those exchanges where significant proportions of households reside outside the target CD.
This third option is a dual frame approach that assures full geographic coverage while minimizing potential biases that can be caused by sampling only listed telephone households. Moreover, the nominal sample sizes for each frame can be determined directly from the CD coverage report, which indicates the proportion of total CD households served by the exchanges in each frame.
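One way to picture the frame assignment and nominal sample sizes is the minimal Python sketch below. It assumes a coverage report held as a list of dictionaries, a hypothetical cutoff of 95% for "nearly 100%" exchanges, and a simple proportional allocation; the function name allocate_dual_frame, the cutoff, and the figures are illustrative assumptions, not GENESYS specifications.

```python
def allocate_dual_frame(report, total_sample, rdd_threshold=0.95):
    """Split exchanges into RDD and listed frames and allocate sample sizes.

    report: list of dicts with keys "exchange", "pct_of_exchange_in_cd",
            and "share_of_cd" (an assumed layout for this illustration).
    rdd_threshold: hypothetical cutoff for treating an exchange as
            "100% or nearly 100%" inside the CD.
    """
    rdd = [r for r in report if r["pct_of_exchange_in_cd"] >= rdd_threshold]
    listed = [r for r in report if r["pct_of_exchange_in_cd"] < rdd_threshold]

    rdd_share = sum(r["share_of_cd"] for r in rdd)
    listed_share = sum(r["share_of_cd"] for r in listed)
    total_share = rdd_share + listed_share
    if total_share <= 0:
        raise ValueError("coverage report contains no CD households")

    # Nominal sizes proportional to each frame's share of CD households.
    rdd_n = round(total_sample * rdd_share / total_share)
    return {
        "rdd_exchanges": [r["exchange"] for r in rdd],
        "listed_exchanges": [r["exchange"] for r in listed],
        "rdd_share_of_cd": rdd_share / total_share,
        "rdd_n": rdd_n,
        "listed_n": total_sample - rdd_n,
    }


# Made-up coverage report for a hypothetical district.
example_report = [
    {"exchange": "555-201", "pct_of_exchange_in_cd": 1.00, "share_of_cd": 0.5},
    {"exchange": "555-204", "pct_of_exchange_in_cd": 0.97, "share_of_cd": 0.3},
    {"exchange": "555-203", "pct_of_exchange_in_cd": 0.40, "share_of_cd": 0.2},
]
plan = allocate_dual_frame(example_report, total_sample=1000)
print(plan)
```

Setting rdd_threshold to 1.0 isolates the exchanges that lie wholly inside the CD, so the same sketch can be used to see how much of the CD the first option would cover before deciding whether the dual frame is needed. The allocation here is nominal only; an actual design would also have to weigh factors such as unlisted rates and differing costs per frame.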
Which Option is Best?
Each CD will behave differently. Its geographic boundaries must be examined and the CD-exchange coverage report must be reviewed in order to determine which option will be most effective. As with every sampling application, the goals for that particular study have to be taken into account. But, as you can see, CD telephone sampling no longer needs to be a matter of mere approximations and no longer requires coverage compromises.