Population normative data for the 10/66 Dementia Research Group cognitive test battery from Latin America, India and China: a cross-sectional survey

Background: 1) To report site-specific normative values by age, sex and educational level for four components of the 10/66 Dementia Research Group cognitive test battery; 2) to estimate the main and interactive effects of age, sex, and educational level by site; and 3) to investigate the effect of site by region and by rural or urban location.
Methods: Population-based cross-sectional one-phase catchment area surveys were conducted in Cuba, Dominican Republic, Venezuela, Peru, Mexico, China and India. The protocol included the administration of the Community Screening Instrument for Dementia (CSI 'D', generating the COGSCORE measure of global function), and the Consortium to Establish a Registry for Alzheimer's Disease (CERAD) verbal fluency (VF), word list memory (WLM, immediate recall) and recall (WLR, delayed recall) tests. Only those free of dementia were included in the analysis.
Results: Older people, and those with less education, performed worse on all four tests. The effect of sex was much smaller and less consistent. There was a considerable effect of site after accounting for compositional differences in age, education and sex. Much of this was accounted for by the effect of region, with Chinese participants performing better, and Indian participants worse, than those from Latin America. The effect of region was more prominent for VF and WLM than for COGSCORE and WLR.
Conclusion: Cognitive assessment is a basic element for dementia diagnosis. Age- and education-specific norms are required for this purpose, while the effect of gender can probably be ignored. The basis of cultural effects is poorly understood, but our findings serve to emphasise that normative data may not be safely generalised from one population to another with quite different characteristics. The minimal effects of region on COGSCORE and WLR are reassuring with respect to the cross-cultural validity of the 10/66 dementia diagnosis, which uses only these elements of the 10/66 battery.

Background
Rapid demographic ageing around the world has important implications for health and social care. Cognitive decline and dementia have a high individual impact and are strongly age-associated [1], so their overall prevalence and societal impact are increasing rapidly. A recent consensus report estimated that the number of people with dementia in the world will increase from 24 million to 82 million from 2000 to 2040 [2]. This increase will be particularly marked in low and middle income countries, where epidemiological research into the aetiology and impact of dementia and cognitive decline is limited. The 10/66 Dementia Research Programme was set up to facilitate research in these regions and to provide data that can be used for public health and service planning [1]. Cognitive tests covering multiple domains are an essential component of a definitive dementia diagnostic assessment, needed to establish the criterion of decline in at least two domains of cognitive function, including memory [3]. Normative data are urgently required, given the influence of both education and culture on cognitive test performance [4,5].
The data presented in this paper were drawn from the 10/66 Dementia Research Group's cross-sectional surveys of older people carried out in seven urban and four rural sites in five Latin American countries, China and India. The primary objective was to generate site-specific norms for the cognitive test battery used in the 10/66 studies, comprising tests of general cognitive function, verbal fluency and immediate and delayed verbal recall. Further objectives were: a) to assess the independent influences of age, educational level and gender and their homogeneity across sites, and b) to assess the extent to which variance attributable to site could be attributed to the effects of region and/or rural versus urban residence.

Study design
The design of the 10/66 Dementia Research Group (DRG) baseline population-based studies has been described in detail [6]. Briefly, cross-sectional surveys were carried out, approaching all residents aged 65 and over within purposively selected, geographically defined catchment areas at each site. No over-sampling strategy was applied (e.g. with respect to age groups). Affluent districts were intentionally avoided. A target sample of 2,000 persons aged 65 years and over per country (3,000 in Cuba) was identified by door-knocking throughout the catchment areas. Peru, Mexico, China and India recruited from both rural and urban sites. Interviews followed a comprehensive one-phase design in which all participants received a full assessment including: cognitive and mental health evaluation, an informant interview, a physical and neurological examination, blood assays and genotyping, in addition to questionnaire measures of environmental and behavioural risk exposures, sociodemographic and socioeconomic status, and physical health status. Disability, health service utilisation, care arrangements and impact of providing care were also evaluated.

Measurements
For this analysis we considered the following socio-demographic measures as independent variables: participants' age divided into four groups (65-69 years, 70-74 years, 75-79 years, 80 years and over), sex, and education level divided into five groups (none, some (but did not complete primary), completed primary, completed secondary, and tertiary). Age of participants was formally established during interview from stated age, official documentation, informant report and, in the case of discrepancy, age according to an event calendar.
The 10/66 cognitive assessment battery was drawn principally from the Community Screening Instrument for Dementia (CSI 'D') developed by the Ibadan-Indianapolis study group [7] specifically for use in cross-cultural research, and in low education settings, and from the Consortium to Establish a Registry for Alzheimer's Disease (CERAD) [8]. As such, components of the battery have been very widely used in other population and clinical research. In our large multi-site pilot study [9] we developed and validated a culture- and education-fair algorithm for dementia diagnosis across a wide variety of low and middle income country settings, comprising components of the cognitive test battery in combination with the Geriatric Mental State and the informant section of the CSI 'D'. The analysis described here focussed on the four main tests included in the 10/66 cognitive test battery:
1) Global cognitive function: The Community Screening Instrument for Dementia (CSI 'D') [7] includes a 32 item cognitive test assessing orientation, comprehension, memory, naming and language expression, which is used to generate a global cognitive score (COGSCORE). The CSI 'D' was from the outset intended to be used across cultures with the minimum of necessary adaptation. It was developed and first validated among Cree American Indians [7,10], further validated and used in population-based research (The US-Nigeria Study) among Nigerians in Ibadan and African-Americans in Indianapolis [11], and has also been validated among white Canadians in Winnipeg [12], and in Jamaica in conjunction with the CERAD battery [13]. The CSI 'D' test score distributions among those with dementia and controls, and the degree of discrimination provided, were remarkably consistent across the aforementioned cultural settings [12].
2) Memory: The 10/66 battery includes two elements of the CERAD 10 word list learning test: word list memory (WLM) and word list recall (WLR), testing immediate and delayed recall respectively. WLR has been reported to be of particular value in distinguishing early dementia from normal aging [14]. WLM and WLR are taken from the adapted CERAD ten word list learning task used in the Indo-US Ballabgarh dementia study [15]. Six words (butter, arm, letter, queen, ticket, and grass) were taken from the original CERAD battery English language list [16]. Pole, shore, cabin, and engine were replaced with corner, stone, book and stick, which were deemed more cross-culturally applicable. In the learning phase, the list is read out to the participant from a green card, and the participant is then asked to recall straight away the words that they remember. This process is repeated three times, giving a WLM score out of 30. In the 10/66 protocol, approximately five minutes later, after a series of unrelated CSI 'D' questions (name registration, object naming, object function, repetition), the participant is again asked to recall the 10 words, with the prompt that they were read from a green card, giving a WLR score out of 10.
3) Verbal fluency (VF): the animal naming verbal fluency task [7] from the CERAD is administered as part of the CSI 'D'; however, it is accorded very little weight within the algorithm for calculating the total CSI 'D' score. In the version of the test used in CSI 'D', after a brief practice naming items from another category (clothing), participants are encouraged to name as many different animals as they can in the space of one minute. The instructions read out to the participant stipulate: 'think of any kinds of animal in the air, on land, in the water, in the forest, all the different animals'. If the participant stops before the allotted time has elapsed they are encouraged to continue. The score is one point for each valid name. In the computation of the CSI 'D' cognitive test (COGSCORE) the VF score is divided by 23. These weighted scores generally range between 0 and 1, the same as for a single CSI 'D' orientation item.
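The scoring rules described for the memory and verbal fluency tests can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of the 10/66 protocol or its software; only the score ranges (three trials of 10 words, delayed recall out of 10) and the divisor of 23 come from the text. The full COGSCORE algorithm is not reproduced here.

```python
VF_DIVISOR = 23   # from the text: the VF score is divided by 23 within COGSCORE
LIST_LENGTH = 10  # the adapted CERAD list contains ten words


def wlm_score(trial_recalls):
    """Immediate recall (WLM): words recalled summed over three trials (0-30)."""
    if len(trial_recalls) != 3:
        raise ValueError("expected exactly three learning trials")
    if any(not 0 <= n <= LIST_LENGTH for n in trial_recalls):
        raise ValueError("each trial score must lie between 0 and 10")
    return sum(trial_recalls)


def wlr_score(delayed_recall):
    """Delayed recall (WLR): words recalled after the delay (0-10)."""
    if not 0 <= delayed_recall <= LIST_LENGTH:
        raise ValueError("delayed recall must lie between 0 and 10")
    return delayed_recall


def vf_weighted(animals_named):
    """Weighted VF contribution: one point per valid animal, divided by 23.

    The result can exceed 1 when more than 23 animals are named, which is
    consistent with the text saying scores 'generally' range from 0 to 1.
    """
    if animals_named < 0:
        raise ValueError("animal count cannot be negative")
    return animals_named / VF_DIVISOR
```

For example, a participant recalling 4, 6 and 7 words over the three learning trials would have a WLM score of 17, and a participant naming 23 animals would contribute a weighted VF score of 1.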
The CERAD neuropsychological battery has been adapted for use in India [15], Korea [17], Brazil [18], Nigeria [16] and Jamaica [13], and norms have been provided for black and white persons in the USA, both with dementia [19], and among the general population [20]. While education effects are prominent, cultural or ethnic differences have been less evident [13,17]. CERAD battery components have been found to distinguish reliably between those with dementia and controls across cultures [13,15].

Ethical considerations
The study was carried out in compliance with the Helsinki Declaration and all participants provided informed consent.

Statistical analysis
For this analysis, all participants who had received a diagnosis of dementia according to either DSM-IV [3] or 10/66 dementia criteria [9] were excluded. Participants' age, sex and education data were described by site. Means and standard deviations (SD) for each of the four cognitive tests were calculated by age, sex and education for each of the eleven sites. General linear models were used to determine the unadjusted and independent effects of age, sex and education on cognitive test scores across sites. We then tested formally for effect modification by extending the models used to estimate the main effects of age, education and sex to include site by age, site by education and site by sex interaction terms. Finally, we estimated the proportion of the variance (eta-squared, η²) in each cognitive test accounted for by age, education, gender and site. We further sought to investigate the variance accounted for by site by substituting this variable with two further variables sub-classifying sites into region (Latin America versus (a) China, and (b) India) and rural or urban location. The effect of region (controlling for age, education, gender and rural/urban location) is summarised as adjusted means and mean differences with 95% confidence intervals for the two contrasts: China versus Latin America and India versus Latin America. All analyses were carried out using STATA 9.
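As an illustration of the eta-squared statistic, the following Python sketch computes the proportion of variance in a test score accounted for by a single grouping factor (between-group sum of squares divided by total sum of squares). This is a simplified, unadjusted one-way version written for illustration only; the analyses reported here used multi-factor general linear models in STATA, so adjusted values would differ. The function name and the example data are hypothetical.

```python
from collections import defaultdict


def eta_squared(scores, groups):
    """One-way eta-squared: SS_between / SS_total for one grouping factor."""
    if len(scores) != len(groups) or not scores:
        raise ValueError("scores and groups must be non-empty and equal in length")

    grand_mean = sum(scores) / len(scores)
    ss_total = sum((y - grand_mean) ** 2 for y in scores)
    if ss_total == 0:
        return 0.0  # no variance to explain

    by_group = defaultdict(list)
    for y, g in zip(scores, groups):
        by_group[g].append(y)

    # Between-group sum of squares, weighting each group by its size.
    ss_between = sum(
        len(values) * (sum(values) / len(values) - grand_mean) ** 2
        for values in by_group.values()
    )
    return ss_between / ss_total


# Hypothetical scores for two made-up sites, not data from this study.
example_scores = [10, 12, 14, 20, 22, 24]
example_sites = ["site_a"] * 3 + ["site_b"] * 3
example_eta2 = eta_squared(example_scores, example_sites)
```

In this made-up example the between-site sum of squares is 150 and the total is 166, so site accounts for roughly 90% of the variance; the proportions reported for the real data are far smaller.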

Results
All age groups were well represented. The Venezuelan, rural Chinese and Indian samples had a younger age distribution than other sites. The female/male ratio exceeded 1 in all sites, but with a less striking preponderance of women in rural Peru, China and India. Educational level showed considerable variation across sites, highest in urban Latin America sites (other than the Dominican Republic), and lowest in rural China and in India. Within countries, educational levels were consistently higher in urban compared with rural sites.
Tables 2, 3, 4 and 5 present normative data: stratified means and standard deviations for the four cognitive outcomes. Older age and lower levels of education were consistently associated with poorer cognitive test performance on scores for all four tests, across all sites. The effect of sex on cognitive test performance was smaller and more variable, both between tests and between sites. Men tended to perform marginally better than women on the COGSCORE, and on VF. For WLM and WLR, women performed better than men in Latin American sites, but there was no gender difference in China and India.
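The structure of such stratified norms can be sketched in Python as follows. This is an illustrative sketch rather than the procedure used to produce the tables: the record layout, function names and example values are assumptions, and only the four age bands come from the Measurements section.

```python
from collections import defaultdict
from statistics import mean, stdev


def age_band(age):
    """Map age in years to one of the four age groups used in this analysis."""
    if age < 65:
        raise ValueError("participants were aged 65 years and over")
    if age < 70:
        return "65-69"
    if age < 75:
        return "70-74"
    if age < 80:
        return "75-79"
    return "80+"


def stratified_norms(records):
    """Return {(age band, sex, education): (mean, SD, n)} for one test.

    `records` is an iterable of (age, sex, education, score) tuples
    (a hypothetical layout). SD is None for cells with one participant.
    """
    cells = defaultdict(list)
    for age, sex, education, score in records:
        cells[(age_band(age), sex, education)].append(score)

    norms = {}
    for key, values in cells.items():
        sd = stdev(values) if len(values) > 1 else None
        norms[key] = (mean(values), sd, len(values))
    return norms


# Made-up records for illustration only, not data from this study.
example_norms = stratified_norms([
    (66, "F", "none", 4),
    (68, "F", "none", 6),
    (81, "M", "tertiary", 7),
])
```

Cells with very few participants yield unstable means and SDs, so in practice a minimum cell size would need to be chosen before such values are used as reference norms.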
Tests for interaction indicated that the effects of age, sex and education on cognitive test performance were each significantly modified by site for all four cognitive tests. However, the effects were uniformly very modest in size, generally accounting for between 0.1% and 0.3% of the overall variance. The two largest interaction effects were those for verbal fluency between site and age (0.6% of the variance), and site and education (0.5%).
Table 6 summarises the independent effects of age, sex, education and site on cognitive test performance. Site accounted for the highest proportion of variance for all four scores, followed by education and then age, except for WLR where the effect of age was stronger than education. The contribution of sex to the models was uniformly low. Most of the effect of site could be more parsimoniously accounted for by region (Latin America versus (a) China, and (b) India) and, to a lesser extent, rural versus urban location (with marginally poorer performance on WLM and WLR in rural compared with urban settings).
Controlling for age, education, sex and rural/urban location, performance on all cognitive tests was best among Chinese participants, intermediate among Latin American participants, and worst among Indian participants. Chinese participants scored one point more and Indian participants one and a half points less on the COGSCORE than did participants in Latin American sites. Indian participants generated nearly six fewer animals on verbal fluency than did participants in China and Latin America. Compared with Latin American participants, Chinese participants remembered on average nearly three more words out of 30 on WLM and one more word out of 10 on delayed WLR. Indian participants, on the other hand, recalled on average two and a half words fewer on WLM and half a word fewer on WLR.

Discussion
We have provided normative data by age group, sex and educational level for widely used neuropsychological tests of global cognitive function, verbal fluency and immediate and delayed word recall in seven low or middle income countries. People with any degree of dementia, including questionable dementia, were excluded. These norms have been rigorously generated by applying a standardized testing procedure amongst representative community-dwelling samples. To our knowledge this is the largest study to date on neuropsychological test norms and the first to present direct comparisons between so many culturally diverse countries.
With the exception of rural India, our norms for CERAD WLM and WLR are well aligned with those previously reported from affluent western countries [4,22-24]. Our norms for CERAD VF are comparable to previously determined norms from both European and North American countries [22,23,25,26] and from Latin America [27-29]. We found that older age and lower educational level corresponded to poorer performances in all four tests and across all sites. The influences of age and educational level on test performances were large, and consistent in size and direction with other normative data investigations from western countries [23]. Sex had a much weaker influence and can probably be safely ignored when constructing reference norms. Likewise, while the site by age, education and sex interactions were statistically significant for all cognitive tests, these were very modest effects, and the beta coefficients (Tables 2, 3, 4 and 5) are remarkable mainly for their consistency across sites.
There was a considerable residual effect of site upon cognitive test performance, not accounted for by compositional differences between samples in the distribution of age and education. Further analyses clarified that the between-site difference was most parsimoniously accounted for by the effect of region, with smaller effects of rural versus urban location evident for the two memory tests. We should still be cautious about attributing the effect of region to that of language and culture. First, other compositional differences not directly linked to culture per se, but relevant to cognitive performance and differently distributed across sites, may not have been taken into consideration in our analyses. One such effect may be the quality and nature of education received, which may not be adequately summarised in terms of level of education [30]. Second, while we have included a wide variety of Latin American and Hispanic Caribbean countries and shown fairly consistent norms between them, the norms derived from the Tamil-speaking Indians in Tamil Nadu, and the Mandarin-speaking Chinese in and around Beijing, clearly cannot be generalised to the vast and diverse populations of India and China as a whole.
By design, the two cognitive tests included in the 10/66 dementia diagnosis, the CSI 'D' COGSCORE and the CERAD WLR, were those that showed the smallest cultural influences and the most robust cross-cultural discriminating properties [9]. This finding has now been, in part, replicated in the population-based phase of our study and is reassuring with respect to the cross-cultural validity of that diagnosis. However, in the light of the findings with respect to other tests, it may be necessary in the future to use region-specific norms for immediate recall and verbal fluency when identifying those who meet cognitive impairment criteria (1.5 standard deviations below the age- and education-specific norms for those with no dementia) for DSM-IV dementia [31], and for amnestic and non-amnestic mild cognitive impairment. The general effect of such a change would be to lower still further the already negligible prevalence of DSM-IV dementia in Indian sites, and to increase slightly the prevalence of DSM-IV dementia in Chinese sites.
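The 1.5 standard deviation criterion mentioned above can be illustrated with a short Python sketch. The function names are hypothetical, the norm values in the example are invented rather than taken from the tables, and treating a score strictly below the cutoff as impaired is an assumption made here, since the handling of scores exactly at the cutoff is not specified in the text.

```python
def impairment_cutoff(norm_mean, norm_sd, sd_units=1.5):
    """Score threshold lying `sd_units` SDs below the stratum-specific mean."""
    if norm_sd <= 0:
        raise ValueError("norm SD must be positive")
    return norm_mean - sd_units * norm_sd


def is_impaired(score, norm_mean, norm_sd, sd_units=1.5):
    """True when the score falls strictly below the cutoff (an assumption)."""
    return score < impairment_cutoff(norm_mean, norm_sd, sd_units)


# Invented norm for illustration: stratum mean 5.0, SD 2.0 on delayed recall.
example_cutoff = impairment_cutoff(5.0, 2.0)
```

With these invented values the cutoff is 2.0, so a delayed recall score of 1 would be flagged and a score of 3 would not. Applying the same score to a stratum with a higher mean, as a region-specific norm for Chinese participants would imply, raises the cutoff and flags more people, which is the mechanism behind the prevalence shifts described in the preceding paragraph.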

Conclusion
Cognitive assessment is a basic element for dementia diagnosis. Age- and education-specific norms are required for this purpose, while the effect of gender can probably be ignored. The basis of cultural effects is poorly understood, but our findings serve to emphasise that normative data may not be safely generalised from one population to another with quite different characteristics. The minimal effects of region on COGSCORE and WLR are reassuring with respect to the cross-cultural validity of the 10/66 dementia diagnosis, which uses only these elements of the 10/66 battery.