Individuals in our family-based case-control study were recruited from 2000 to 2006 by the Morris K. Udall PD Research Center of Excellence at Duke University Medical Center to identify genetic and environmental factors that influence the risk of PD. Probands were recruited from Duke University Medical Center clinics, other physician referrals, and self-referrals. Individuals from referring physicians were introduced to the study by posted study information or direct physician contact, while self-referring individuals were introduced to the study through presentations by study staff at local support groups, the Udall Center web site, other individuals, or media coverage of Udall Center activities. Upon enrollment, probands were asked to contact their affected and unaffected relatives and to request their participation. In families with only one individual with PD, siblings, parents, and spouses were recruited. In families with multiple individuals with PD, all relatives with PD were recruited, and the siblings, parents, and spouses of all individuals with PD were recruited along with relatives connecting the branches of the family. Relatives who agreed to participate were contacted by study staff for enrollment into the study. The difficulty of matching referred cases to appropriate controls was addressed by the family-based design of the study, as cases and relative controls were well matched on genetic and demographic factors and were thus taken from the same underlying population. However, the participation rates among cases and controls could not be determined. The referral-based nature of the study prevented a sampling frame from being established for cases. For family-based controls, no restriction was placed on the number collected, but other factors, including geography, willingness to participate, enrollment cost, and age, were used to prioritize enrollment of controls when multiple candidates existed. Therefore, there is no simple way to determine the denominator for calculating a participation rate in controls.
All study protocols were approved by the Institutional Review Board at Duke University, and approved consent forms were signed by all individuals prior to enrollment. The enrollment procedure involved a blood sample collection to use as a DNA source for our genetic studies, a detailed medical history questionnaire, at least a three-generation family history report, a standard cognitive status test (either the Blessed Orientation-Memory Concentration test or the Modified Mini-Mental State examination), and an environmental risk factor questionnaire.
The structured, 30 to 45-minute telephone environmental risk factor questionnaire was administered by trained interviewers to gather detailed environmental risk factor data on demographics, health habits, and pesticide and other chemical exposures (see Additional file 1: Risk factor questions used to assess residences, occupations, and pesticide applications). The initial questionnaire was evaluated for face validity by several researchers in the PD research field. Small-scale pilot testing was then performed, and the questionnaire was accordingly revised for procedural content. The final questionnaire implemented in this study has not been formally evaluated for reliability over time.
All individuals were given a standard in-person clinical examination utilizing the full Unified PD Rating Scale (UPDRS) to determine affection status and symptom severity. Individuals with PD demonstrated at least two cardinal signs of PD (resting tremor, rigidity, and/or bradykinesia), an asymmetry of symptom onset, and no atypical signs during examination by a board-certified neurologist. Individuals with PD self-reported age-at-onset, defined as the age at which the first cardinal sign was noticed. A board-certified neurologist, physician assistant, or registered nurse examined unaffected individuals, who had no signs of PD, and unclear individuals, who had only one cardinal sign, a history of encephalitis, neuroleptic therapy within a year prior to diagnosis, evidence of normal pressure hydrocephalus, or unusual clinical features suggestive of atypical or secondary parkinsonism. Individuals with an unclear diagnosis were referred to movement disorder specialists for further examination and excluded from our analyses to minimize phenotypic misclassification. Families with more than one individual with PD, as identified by clinical examination for sampled members or by family history report for members not sampled, were considered positive history families. Otherwise, families with only one individual with PD were considered negative history families.
Measures of pesticide exposure
In the environmental risk factor telephone questionnaire, direct pesticide application was assessed with the following question: "Have you ever applied pesticides to kill weeds, insects, or fungus at work, in your home, in your garden, or on your lawn?" Individuals provided only a "yes/no" answer for this question, so separation of residential applications from occupational applications for analysis was not possible. If the answer was "yes", individuals were asked to list the names of any pesticides they remembered using. For each pesticide, individuals were asked the number of days it was used per year, whether it was currently being used, the years application started and stopped (if applicable), and whether protective gear, such as a mask, rubber gloves, or rubber boots, was used during application. Application of any pesticide chemical by spreading solid granules, spraying by hand, spraying by tractor, spraying by airplane, putting in irrigation water, or placing pest strips or traps was considered a direct pesticide application. Those who reported a direct application of any pesticide that was initiated prior to the reference age were classified as ever exposed. Otherwise, individuals were classified as never exposed. For cases, reference age equated to age-at-onset, and for controls, reference age equated to age-at-examination (AAE) minus the mean disease duration among cases. This adjustment was necessary to give cases and controls comparable exposure periods. For ever exposed individuals, frequency (days per year) and duration (years prior to reference age) were summed across all reported direct pesticide applications, and cumulative exposure was calculated as the product of frequency and duration. Frequency, duration, and cumulative exposure were divided into tertiles of exposure, such that the high exposure category included values greater than the upper tertile value (the value at or below which 67% of the data fall), the middle exposure category included values between the upper tertile value and the lower tertile value (the value at or below which 33% of the data fall), and the low exposure category included values less than the lower tertile value. The referent level of never being exposed was used as a fourth category.
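The exposure definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: the function names, the `(days_per_year, start_age, stop_age)` tuple layout, the conversion of calendar years to ages, the choice of the "inclusive" quantile method, and the assignment of values that fall exactly on a cut point to the middle category are all assumptions introduced here.

```python
from statistics import quantiles


def reference_age(is_case, age_at_onset, age_at_exam, mean_case_duration):
    """Cases: age-at-onset. Controls: AAE minus mean disease duration among cases."""
    if is_case:
        return age_at_onset
    return age_at_exam - mean_case_duration


def exposure_summary(applications, ref_age):
    """Summarize direct applications begun before the reference age.

    applications: list of (days_per_year, start_age, stop_age) tuples
    (a hypothetical layout). Returns None for never-exposed individuals.
    """
    prior = [a for a in applications if a[1] < ref_age]
    if not prior:
        return None
    # Frequency and duration are each summed across applications first,
    # then multiplied, following the wording of the text.
    frequency = sum(days for days, _, _ in prior)
    duration = sum(min(stop, ref_age) - start for _, start, stop in prior)
    return {"frequency": frequency, "duration": duration,
            "cumulative": frequency * duration}


def tertile_cuts(values):
    """Lower and upper tertile cut points among exposed individuals."""
    lower, upper = quantiles(values, n=3, method="inclusive")
    return lower, upper


def exposure_category(value, lower, upper):
    """Four-level category; None means never exposed (the referent)."""
    if value is None:
        return "never"
    if value > upper:
        return "high"
    if value < lower:
        return "low"
    return "middle"  # assumption: boundary values fall in the middle category
```

For example, a control examined at age 70, with a mean case disease duration of 8 years, has a reference age of 62, and only applications started before that age count toward exposure.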
In order to move beyond a broad assessment of direct pesticide application, the recalled pesticide products were classified into specific functional types (e.g., insecticides, herbicides, and fungicides) and chemical classes. The chemical or trade name of each pesticide product was entered into the Pesticide Action Network pesticides database to obtain its primary function (herbicide, insecticide, fungicide, or multiple uses) and the chemical class of the main active ingredient. This database has been used for classification in other pesticide studies [14, 15]. For each functional type and chemical class, individuals were classified as ever exposed to the relevant class or type, ever exposed to any other pesticide, or never exposed to any pesticide.
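The three-level classification described above can be illustrated with a small Python sketch. The function name and return labels are hypothetical, and the input is assumed to be the set of functional types (or chemical classes) of the products an individual applied before the reference age.

```python
def classify_for_type(applied_types, target):
    """Three-level exposure variable for one functional type or chemical class.

    applied_types: set of types/classes of products applied before the
    reference age (empty if no pesticide was applied).
    """
    if not applied_types:
        return "never"    # never exposed to any pesticide
    if target in applied_types:
        return "target"   # ever exposed to the relevant type or class
    return "other"        # ever exposed only to other pesticides
```

With this coding, someone who applied only herbicides is "target" in the herbicide analysis and "other" in the insecticide analysis.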
In the residential history section of the environmental risk factor questionnaire, individuals were asked to "list the cities, towns, or communities in which you have lived for the majority of each year from childhood to now." For each location, individuals were then questioned on their years of residence, whether they lived on or next to a farm, whether they drank well-water, and if so, the years of drinking well-water. Individuals who reported drinking well-water at any residence prior to their reference age were classified as ever exposed to well-water consumption. Cumulative duration (years prior to reference age) was then determined and categorized into tertiles of exposure (high, middle, and low exposure categories) and a referent level of never being exposed.
In the occupational history section of the questionnaire, individuals were asked to list their full or part-time jobs held for a year or longer and the years worked at each job. Individuals were asked each job title and company name, from which study staff matched the appropriate Standard Occupational Classification job code of the United States Department of Labor Bureau of Labor Statistics. The following job categories (and job codes) were considered farming occupations: farm, ranch, and other agricultural managers (11–9011); farmers/ranchers (11–9012); supervisors/managers of farming, fishing, and forestry workers (45–1011); farm labor contractors (45–1012); agricultural inspectors (45–2011); animal breeders (45–2021); graders/sorters, agricultural products (45–2091); farmworkers/laborers, crop, nursery, and greenhouse (45–2092); farmworkers, farm and ranch animals (45–2093); agricultural workers, all other (45–2099). Individuals whose residence report revealed that they lived on or next to a farm prior to their reference age, or whose occupation report revealed that they worked in a farming occupation prior to their reference age, were classified as ever exposed to farming. Cumulative duration (years of residential or occupational exposure to farming prior to reference age) was calculated and categorized into tertiles of exposure (high, middle, and low exposure categories) and a referent level of never being exposed. History (ever vs. never) and cumulative duration of farming residences and occupations, separately, were also examined.
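As a rough illustration of the farming classification, the following Python sketch flags residential, occupational, and combined farming exposure before the reference age. The job codes are those listed above (written with ASCII hyphens); the function name and the `(start_age, ...)` record layouts are assumptions made for this example.

```python
# SOC job codes treated as farming occupations in the text.
FARMING_SOC_CODES = {
    "11-9011", "11-9012", "45-1011", "45-1012", "45-2011",
    "45-2021", "45-2091", "45-2092", "45-2093", "45-2099",
}


def ever_farming(residences, jobs, ref_age):
    """Ever/never farming exposure prior to the reference age.

    residences: list of (start_age, lived_on_or_next_to_farm) tuples
    jobs: list of (start_age, soc_code) tuples
    Both layouts are hypothetical.
    """
    residential = any(on_farm and start < ref_age
                      for start, on_farm in residences)
    occupational = any(code in FARMING_SOC_CODES and start < ref_age
                       for start, code in jobs)
    return {"residential": residential,
            "occupational": occupational,
            "either": residential or occupational}
```

Keeping the residential and occupational flags separate mirrors the text, which examines the combined measure as well as each source on its own.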
Even though family-based case-control data sets are robust to within-family confounding by ethnicity, bias due to across-family confounding by ethnicity may still exist. To minimize this bias, we stratified families by self-reported race/ethnicity. Only the white families (308 of 322 ascertained families) provided sufficient statistical power, so analyses were performed in this subset only.
Population-averaged generalized estimating equations (GEE) as implemented by SAS version 8e (SAS Institute, Cary, NC) were used to model the associations between pesticide exposure and PD. GEE modeling requires specification of a within-cluster correlation matrix structure, which serves as the basis for deriving the working correlation matrix from the data. Correlations from this working correlation matrix are then used as nuisance parameters to estimate the across-population regression parameters and corresponding robust variance estimates. GEE with an independence correlation matrix, which begins with the assumption of no correlation between relatives but later estimates correlations to appropriately adjust the variance of association tests, is a valid test for association of an environmental risk factor using pedigrees of similar size and structure to the current data set. The independence matrix was specified in our GEE models, but even if this correlation matrix does not accurately fit the data, GEE models are generally robust to misspecification of the correlation matrix.
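For orientation, the textbook form of a logistic GEE under an independence working correlation can be written as follows. The notation is an assumption introduced here and does not appear in the original: families are indexed by $i = 1, \dots, K$, relatives within a family by $j$, $y_{ij}$ is affection status, $x_{ij}$ is the covariate vector, and $X_i$ stacks the $x_{ij}^{\top}$ for family $i$. Details specific to the SAS implementation are not shown.

```latex
\begin{align}
\operatorname{logit}\,\mu_{ij} &= x_{ij}^{\top}\beta,
  \qquad \mu_{ij} = \Pr(y_{ij} = 1 \mid x_{ij}) \\
U(\beta) &= \sum_{i=1}^{K} X_i^{\top}\bigl(y_i - \mu_i(\beta)\bigr) = 0 \\
\widehat{\operatorname{Var}}(\hat\beta) &= A^{-1} B\, A^{-1} \\
A &= \sum_{i=1}^{K} X_i^{\top} W_i X_i,
  \qquad W_i = \operatorname{diag}\bigl\{\hat\mu_{ij}(1 - \hat\mu_{ij})\bigr\} \\
B &= \sum_{i=1}^{K} X_i^{\top}(y_i - \hat\mu_i)(y_i - \hat\mu_i)^{\top} X_i
\end{align}
```

Under independence the point estimates coincide with those of ordinary logistic regression. The within-family correlation enters only through $B$, which is built from the observed residual cross-products within each family. This is why the variance remains valid even if relatives are in fact correlated.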
GEE models using affection status as the outcome assessed associations of history, frequency, duration, and cumulative exposure for broadly defined direct pesticide application, associations of history for direct pesticide application as divided into pesticide functional types and chemical classes, and associations of history and duration for well-water consumption and farming residences/occupations. Individuals never exposed to the relevant measure served as the referent group. Sex and AAE were included in the models as confounders given significant differences between cases and controls. Cigarette smoking and caffeine (coffee, tea, or soft drink) consumption histories (1 = ever, 0 = never) were also included as confounders given significant inverse associations between these environmental factors and PD in this data set.
Two types of GEE models were constructed. GEE model 1 tested for the trend of effects across the exposure categories with an ordinal variable (3 = high, 2 = moderate, 1 = low, 0 = never) for frequency, duration, and cumulative exposure. GEE model 2 tested the effect of each exposure category with indicator variables (ever versus never or high, moderate, and low versus never) for each measure (history, frequency, duration, and cumulative exposure). Two-sided p-values are presented for model 1 to show the significance of the linear trend tests, while the adjusted odds ratios (ORs) and 95% confidence intervals (CIs) are presented for model 2 to show the strength of association for each exposure level. Associations of direct pesticide application, well-water consumption, and farming residences and occupations with PD were evaluated in the overall data and in the data stratified by sex and by family history. These analyses were repeated with only adulthood (≥ 18 years old) exposures considered. Associations of direct pesticide application as divided into functional types and chemical classes were only evaluated in the overall data to maintain sufficient statistical power.
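The difference between the two model codings, and the conversion of a fitted coefficient into an OR with a confidence interval, can be sketched as follows. This is a minimal illustration, not the study's SAS code: the function names are hypothetical, and the interval assumes the usual Wald form based on a normal approximation with a robust standard error.

```python
from math import exp
from statistics import NormalDist

ORDINAL = {"never": 0, "low": 1, "moderate": 2, "high": 3}


def model1_code(category):
    """Model 1: a single ordinal predictor used for the linear trend test."""
    return ORDINAL[category]


def model2_code(category):
    """Model 2: indicators for (low, moderate, high); never is the referent."""
    return tuple(int(category == level)
                 for level in ("low", "moderate", "high"))


def odds_ratio_ci(beta, se, level=0.95):
    """OR and Wald confidence limits from a log-odds coefficient and its SE."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return exp(beta), exp(beta - z * se), exp(beta + z * se)
```

Model 1 spends one degree of freedom on a linear trend across categories, while model 2 estimates a separate OR for each exposure level relative to never-exposed individuals.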
Data sets resembling the actual PD sample were created using the simulation of linkage and association program. Given that the actual PD sample consists mostly of sibling pairs with at least one affected individual, the simulated data sets were created with 300 such sibling pairs to resemble the overall sample, 200 sibling pairs to resemble the negative family history stratum, or 100 sibling pairs to resemble the positive family history stratum. A binary risk factor with 10% within-sibling correlation was specified at a frequency of 0.5, which closely matches the exposure frequencies of childhood and adulthood direct pesticide application, well-water consumption, and farming residences/occupations. Data sets at varying relative risks (RRs) were simulated to determine the minimum detectable RR for 80% power. Statistical power of GEE was based on OR estimates from 1,000 replicates of each simulated data set. The minimum detectable RR was 1.9 when assessing risk factors in 300 sibling pairs, 2.1 in 200 sibling pairs, and 3.0 in 100 sibling pairs.
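To make the simulated risk factor concrete, the sketch below shows what a binary exposure with frequency 0.5 and 10% within-sibling correlation implies for a sibling pair. It is not the algorithm of the simulation program used in the study and does not model disease status or power. It assumes that "correlation" means the Pearson correlation between the two siblings' exposure indicators, and the function names are hypothetical.

```python
import random


def sibling_pair_probs(p=0.5, rho=0.1):
    """Joint exposure probabilities for a sibling pair.

    p: marginal exposure frequency; rho: assumed Pearson correlation
    between the two siblings' binary exposure indicators.
    """
    cov = rho * p * (1 - p)
    return {
        (1, 1): p * p + cov,
        (1, 0): p * (1 - p) - cov,
        (0, 1): p * (1 - p) - cov,
        (0, 0): (1 - p) ** 2 + cov,
    }


def simulate_pairs(n_pairs, p=0.5, rho=0.1, seed=0):
    """Draw exposure indicators for n_pairs sibling pairs."""
    probs = sibling_pair_probs(p, rho)
    outcomes = list(probs)
    rng = random.Random(seed)
    return rng.choices(outcomes,
                       weights=[probs[o] for o in outcomes],
                       k=n_pairs)
```

Under these assumptions, both siblings are exposed with probability 0.275 rather than the 0.25 expected under independence, so exposure is only mildly concordant within pairs.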