Validation of the Cognitive Assessment of Later Life Status (CALLS) instrument: a computerized telephonic measure



Brief screening tests have been developed to measure cognitive performance and dementia, yet they measure limited cognitive domains and often lack construct validity. Neuropsychological assessments, while comprehensive, are too costly and time-consuming for epidemiological studies. This study's aim was to develop a psychometrically valid telephone administered test of cognitive function in aging.


Using a sequential hierarchical strategy, each stage of test development did not proceed until specified criteria were met. The 30 minute Cognitive Assessment of Later Life Status (CALLS) measure and a 2.5 hour in-person neuropsychological assessment were conducted with a randomly selected sample of 211 participants 65 years and older that included equivalent distributions of men and women from ethnically diverse populations.


Overall Cronbach's coefficient alpha for the CALLS test was 0.81. A principal component analysis of the CALLS tests yielded five components. The CALLS total score was significantly correlated with four neuropsychological assessment components. Older age and having a high school education or less were significantly correlated with lower CALLS total scores. Females scored better overall than males. There were no score differences based on race.


The CALLS test is a valid measure that provides a unique opportunity to reliably and efficiently study cognitive function in large populations.


The human and economic costs of cognitive decline and dementia [1] point to the need for an improved understanding of age-related cognitive deficits and the prevalence of such deficits in the United States [2, 3]. Distinguishing normal age-related changes in cognition from deficits indicative of incipient dementia is problematic but increasingly relevant in health care research and epidemiological studies.

Current standards of good practice entail that the assessment of cognitive performance to differentiate between age-related neurocognitive deficits and neurodegenerative disorders be conducted by neurologists and by clinical neuropsychologists who use standardized neuropsychological testing [4–6]. Exhaustive neuropsychological batteries, however, tend to be limited to specialized medical centers. While some briefer evaluations (such as the CAMCOG, CAMCOG-R) have been developed that contain multiple cognitive domains, they require in-person administration [7, 8]. As a result, primary care physicians are the most likely observers of cognitive change among older persons. Studies have shown, however, that physicians often underreport or fail to identify problems with cognition [9, 10]. Strategies that improve the availability of cognitive measures that may assess multiple cognitive domains are timely and appropriate.

Although a variety of cognitive screens (in-person and telephonic) have been developed to monitor cognitive decline [11–17], they tend to contain only a few limited measures of cognitive functioning [12]. The Mini-Mental State Exam (MMSE) [18] is the most widely used in-person instrument for assessing cognition among older adults. The MMSE has shown good test-retest reliability (0.89) and inter-rater reliability (0.82) [18]. The MMSE, however, has failed to demonstrate consistent predictive ability among heterogeneous populations. It shows a high rate of false positives among individuals with low socioeconomic status and low education [19–21] and false negatives in mildly impaired persons [22–24].

Telephone screening tests of cognition have also been developed. The most frequently used are the Telephone Interview of Cognitive Status (TICS) [25] and the TICS-modified (TICSm) [26], an adapted version that adds a delayed memory item. The TICS was originally adapted from the MMSE. The TICS and the TICSm show high correlations with the MMSE [25, 27] and equivalent sensitivity and specificity as cognitive screens [28]. While the TICS and TICSm share the MMSE's strength as a general screening measure, these screening tests also share the MMSE's potential for failing to detect subtle cognitive decline, and cannot substitute for neuropsychological assessment in answering questions of cognitive decline [22]. The fact that these screens do not measure many of the cognitive domains used in a full neuropsychological battery is a major limitation. As a result, great caution needs to be exercised in using such screens because they are often unable to provide reliable information about the specific cognitive domains affected [29]. Hence, there is a need for a reliable and more comprehensive tool.

To this end, a psychometrically valid, time-efficient, telephone-administered test of cognitive performance associated with aging, the Cognitive Assessment of Later Life Status (CALLS), was modeled after standardized neuropsychological batteries to overcome the limitations of MMSE-based screening batteries and of in-person administration. The objective of this study was to validate the CALLS instrument.


The study was reviewed and approved by the Institutional Review Board of Kaiser Permanente Southern California.

Instrument development

The CALLS instrument has undergone extensive developmental work. Applying classical psychometric theory, we have followed a sequential, hierarchical strategy for developing this test, where each stage of development does not proceed further until specified criteria are met. A brief explanation of preliminary work to the validation study follows.

Item generation stage

A range of cognitive items were identified as necessary to be included for an effective telephone screen. These comprehensive cognitive items were pilot tested with 43 elderly participants (mean age = 73.2; female = 56 percent; non-white = 41 percent) over six separate telephone testing sessions. These cognitive domains were correlated with a brief battery of in-person neuropsychological tests (Judgment of Line Orientation, Boston Naming Test, Letter Number Sequencing, Trail-Making Test Parts A and B and the California Verbal Learning Test). Four separate focus groups were held with participants to elicit feedback on the comprehension and clarity of the questions and test experience. An Expert Panel, comprising specialists in neuropsychology, psychometry, geriatrics, speech pathology, audiology and epidemiology, was consulted, and a core set of items with acceptable face validity, usability and preliminary convergent validity was identified. Tests of verbal learning and memory, attention and working memory, orientation, processing speed, and executive functioning, along with assessment of auditory discrimination and depression, were considered essential.

Item selection stage

Based on the selected cognitive items, the prototype CALLS test was administered to 101 participants over the age of 64, randomly selected from the membership of Kaiser Permanente Southern California. All participants took part in two 30-minute test sessions over the telephone. The reliability of the subtests was good, with coefficient alphas between 0.72 and 0.87. Principal component analysis was conducted to evaluate construct validity. Resulting components were derived from tests of verbal learning and memory, attention and working memory, executive functioning, and verbal fluency and naming. Coefficient alpha for the final set of items was 0.77. The Expert Panel agreed that items with high factor loadings be retained and the remainder eliminated. A 30-minute interview was created by streamlining instructions and using adaptive questions and skip patterns.

Validation stage

To validate the test, each participant was given the 30-minute telephone CALLS battery and a full 2.5 hour in-person neuropsychological battery of tests. Approximately half of the final sample was given the CALLS interview first (n = 108) and half were given the neuropsychological interview first (n = 103). Every effort was made to ensure that the two tests were administered within a reasonably close time period without fatiguing the participant. The mean time between tests was 16.27 days (Range: from one day to 60 days).

Current study


No participants from previous interview pools were recruited to subsequent interview pools. In the validation study, 908 men and women 65 years and older were randomly selected from the membership of Kaiser Permanente Southern California. Sampling was conducted to maximize the chances of ethnic and racial diversity. Based on geocoding, equal numbers of African-Americans, Whites, Hispanics and Asians were sought. Equal numbers of men and women were also targeted. Once the target criteria for a given ethnic or racial group were met, no more participants were recruited for that group. Each participant was recruited by initially sending a letter which described the nature of the study and provided an opt-out postcard and study brochure. Due to the time commitment and diversity goals, participants were provided with an $80.00 incentive for participation in the CALLS interview and the in-person neuropsychological assessment. A maximum of six calls were made to recruit for the study.

Of the original 908 person sample, 152 were excluded due to ineligibility (125 language barrier; 10 deaths; 6 under 65; 5 illness; 3 each for severe hearing problem and relocation out of area). From an eligible pool of 756, a total of 211 consented to participate (response rate 28 percent) to both the in-person neuropsychological battery of tests and the 30-minute CALLS battery. Each participant signed an Institutional Review Board approved consent form prior to taking the in-person neuropsychological test battery. The breakdown of the final sample selection is described in Table 1.
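The recruitment arithmetic in this paragraph can be checked with a short sketch. The counts are those reported in the text; the variable names are illustrative only.

```python
# Recruitment figures as reported in the text.
sampled = 908
exclusions = {
    "language barrier": 125,
    "death": 10,
    "under 65": 6,
    "illness": 5,
    "severe hearing problem": 3,
    "relocation out of area": 3,
}
consented = 211

ineligible = sum(exclusions.values())   # total excluded as ineligible
eligible = sampled - ineligible         # pool that could have consented
response_rate = consented / eligible    # fraction of the eligible pool

print(f"ineligible={ineligible}, eligible={eligible}, "
      f"response rate={response_rate:.1%}")
```

The exclusion categories sum to the 152 reported, leaving the eligible pool of 756, and 211 of 756 rounds to the stated 28 percent.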

Table 1 Selection of Sample

There were no significant mean age differences between participants (mean = 73.4 years; SD = 5.8) and non-participants (mean = 72.8 years; SD = 6.4). As shown in Table 2, the sample was evenly divided between men (49 percent) and women (51 percent) in both groups. Racial and ethnic composition, however, differed between the groups (p < 0.0001). Hispanics were about twice as likely to be non-participants (36 percent) as participants (19 percent). Asians were slightly more likely to be non-participants (28 percent) than participants (21 percent). Conversely, Whites were about two and a half times as likely to be participants (36 percent) as non-participants (14 percent). The proportion of African-Americans did not differ between the groups.

Table 2 Demographic Characteristics of Sample

As shown in Table 2, the participant group is well represented in terms of age, gender and racial and ethnic groups. The study sample is slightly better educated than the general population in these age ranges, but over one quarter have a high school education or less.

Cognitive measures

Lay interviewers (with at least a bachelor's degree) were trained and supervised by a neuropsychologist to conduct the standardized in-person neuropsychological test battery. Lay interviewers were also trained by a neuropsychologist and supervised by project staff in conducting the standardized CALLS telephone test. Analysis of the reliability of the interviewers' performance across testing sessions revealed correlations in an acceptable range from 0.75 to 0.86.

CALLS telephone test

The CALLS test includes many of the same cognitive items as are used in a neuropsychological battery. It also includes items that measure response time. The CALLS is a computer-assisted test that is standardized with precise scripts and cues for interviewers. The program is designed to not proceed to the next question item until a valid response is entered. Animal Naming, F Words, and Similarities are audio recorded for post-test scoring to ensure that all responses are entered correctly and in the order given.
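The rule that the program does not advance until a valid response is entered can be sketched as follows. This is only an illustration of the idea, not the CALLS software: the function names and the validation rule for a Serial 7 answer are hypothetical.

```python
from typing import Callable, Iterable, Optional


def first_valid_entry(entries: Iterable[str],
                      is_valid: Callable[[str], bool]) -> Optional[str]:
    """Return the first keyed entry that passes validation, or None if none does."""
    for entry in entries:
        cleaned = entry.strip()
        if is_valid(cleaned):
            return cleaned
    return None


def valid_serial_seven(entry: str) -> bool:
    """Hypothetical rule: the keyed answer must be a whole number from 0 to 100."""
    return entry.isdigit() and 0 <= int(entry) <= 100


# Made-up sequence of interviewer keystrokes for one question.
keyed = ["", "ninety-three", "93"]
accepted = first_valid_entry(keyed, valid_serial_seven)
print(accepted)
```

In an interactive program the entries would come from the interviewer's keyboard rather than from a list; the list keeps the sketch self-contained.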

Test items that are similar to existing tests include: Date – Month, day, date, season and year; President/Vice President – name current; Serial Backward 7 – Subtract 7 from 100 up to 5 times; Digit Span Forward – Digits given from 3 to 7 digits; Digit Span Backward – Digits given from 2 to 6 digits; Animal Naming – 30 seconds to name animals; F Words – 30 seconds to name F words.

Other test items are similar to existing tests but include new word lists, naming and similarities. These include: Naming – 4 questions with brief descriptions are asked to identify a number or objects (answers: dozen, umbrella, bed and elephant); Three Trial Wordlist – 12 words across 3 trials with immediate recall (wordlist: brother, steel, day, cousin, month, copper, second, niece, brass, mother, silver, minute); Wordlist Recall – 1) recall all 12 words remembered after a delay filled with other tests; 2) recall when cued (prompted with "Tell me any words from list related to: metals, relatives, and units of time"); 3) recall wordlist with intrusion words (12 incorrect words are added to the correct list, prompted with "Tell me if the word was on the original list of 12 words"); and Similarities – 4 pairs of similarities (prompted with "How are a hammer/saw, skirt and pants, fruits and vegetables, bus/car alike?").

New and unique features and tests include: Volume Configuration – A range of 4 different volume choices are tested and selected by the participant prior to the main CALLS interview; Pitch Discrimination – Participant is given 15 paired tones in a row and must distinguish whether the two tones are the same or different in pitch; Simple Reaction Time – A series of tones are played at random intervals for right and left ears. The participant is first presented with a target tone and must then verbalize "now" as quickly as possible each time the target tone is heard; Choice Reaction Time – Participants listen to a target tone and identify that tone from a series of tones with varying pitch. Participants distinguish tones by responding "now" for the correct tone within 20 specific tone sequences (5 each high and low, 10 medium). All tones are computer-generated and practice tests are performed prior to actual tests. Response time items are recorded and time stamped to the millisecond to ensure accuracy.
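As a rough illustration of how millisecond time stamps translate into reaction times, the sketch below subtracts each tone onset from the matching response time. The function name and the timestamps are invented for the example and do not come from the CALLS software or data.

```python
from statistics import mean


def reaction_times_ms(tone_onsets_ms, response_times_ms):
    """Pair each tone onset with its response and return the latencies in ms."""
    if len(tone_onsets_ms) != len(response_times_ms):
        raise ValueError("each tone needs exactly one response timestamp")
    return [resp - onset
            for onset, resp in zip(tone_onsets_ms, response_times_ms)]


# Made-up timestamps (ms since the start of the trial block).
onsets = [1000, 4200, 9100]
responses = [1450, 4710, 9520]

latencies = reaction_times_ms(onsets, responses)
print(latencies, mean(latencies))
```

A real scoring routine would also need a rule for missed tones and for responses made before the tone, which the text does not describe.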

Non-cognitive tests included a 20-question adaptation of the Center for the Epidemiological Study of Depression (CESD) [30] and a brief hearing survey regarding phone use, use of amplifier, and hearing aids. Interviewers completed a feedback questionnaire to evaluate protocol adherence, hearing assessment, and attitude of interviewee.

In-person neuropsychological battery

The in-person tests chosen to compare with the CALLS are well standardized and have acceptable reliability and validity. Verbal memory was assessed with the paragraph prose recall tests from the Wechsler Memory Scale-III (WMS-III): Logical Memory I, Logical Memory II, Logical Memory Recognition [31] and the California Verbal Learning Test (CVLT) [32]. Nonverbal memory was assessed with the Faces I and II test from the WMS-III [31]. Attention was assessed with the Digit Span Forward test from the WMS-III [30] and with the Trail-Making Test Part A [33]. Working memory was assessed with Letter-Number Sequencing and Digit Span Backward from the WMS-III [31]. Visuospatial perception was tested with the Judgment of Line Orientation (JLO) [34]. The Trail-Making Test Part B [31] and the Controlled Oral Word Association Test (Phonemic Fluency – F-A-S) [35] evaluated executive functioning. Verbal fluency and naming were tested with the Boston Naming Test (BNT) [36] and Animal Naming (Semantic Fluency) [37]. One hundred and ninety-seven participants were given the Mini-Mental State Exam [18].

Additional tests conducted in-person but not analyzed here included the Wechsler Test of Adult Reading (WTAR) [38], Symptom Checklist-90 (SCL-90) [39], Geriatric Depression Scale (GDS) [40], and the Lubben Social Network Scale – Revised (LSNS-R) [41]. Interviewers were trained by an audiologist and administered standard audiology tests to assess hearing during the in-person interviews.

Statistical analyses

Descriptive statistics (t-test, chi squares) were generated for demographic characteristics of the sample (Table 2) and mean scores for CALLS (Table 3) and neuropsychological assessment data were calculated.

Table 3 Descriptive Statistics of CALLS Interview

Internal consistency of the CALLS was evaluated by means of item analysis and measured with Cronbach's coefficient alpha. Validity of the CALLS was assessed with respect to the MMSE (concurrent validity). Concurrent validity was measured with Pearson's r, after verifying the linear relationship between the CALLS and MMSE.
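For readers unfamiliar with the statistic, a minimal sketch of Cronbach's coefficient alpha follows. It uses the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores); the item scores are made up and are not CALLS data.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores holds one list per item,
    each with one score per participant (same participant order)."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score per participant across all items.
    totals = [sum(person) for person in zip(*item_scores)]
    item_var = sum(pvariance(item) for item in item_scores)
    total_var = pvariance(totals)
    return k / (k - 1) * (1 - item_var / total_var)


# Made-up scores: 3 items, 5 participants.
items = [
    [2, 4, 3, 5, 1],
    [3, 5, 3, 4, 2],
    [2, 5, 4, 5, 1],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

Population and sample variances give the same alpha as long as one is used consistently, because the ratio of variances is unchanged.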

To assess construct validity, we conducted a principal component analysis of the CALLS battery, in which the covariance structure of the dependent variables was decomposed into orthogonal components by calculating the eigenvalues and eigenvectors of the data covariance matrix [42]. The eigenvalues were used in decision-making related to the number of orthogonal components used in subsequent analyses. Eigenvectors were used for determining the relationship between the original variables and subsequent components. Principal components were extracted using the roots-greater-than-one criterion (eigenvalues above one) and then rotated with the Varimax procedure. The eigenvectors and eigenvalues transformed the initial variable space into a novel variable set of principal components.
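In symbols, the decomposition and retention rule described above can be written as follows. Here \Sigma stands for the covariance matrix of the p test variables, as in the text; the notation is ours and is only a sketch. The eigenvalue-above-one rule is conventionally applied to standardized variables, in which case \Sigma is the correlation matrix and the eigenvalues sum to p; the text does not state which form was used.

```latex
\Sigma\,\mathbf{v}_k = \lambda_k\,\mathbf{v}_k,
\qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0,
\qquad \mathbf{v}_j^{\top}\mathbf{v}_k = 0 \ \ (j \ne k)

\text{retain component } k \text{ if } \lambda_k > 1,
\qquad
\text{proportion of variance explained by component } k
  = \frac{\lambda_k}{\sum_{j=1}^{p} \lambda_j}
```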

Given the aim of concurrent validation, the same principal component analysis strategy was applied to the neuropsychological battery. Next correlations were calculated between the items in the CALLS battery and the component scores of the neuropsychological battery.


The means for each of the individual CALLS tests are displayed in Table 3. Of a possible 180 points, the CALLS total mean score for all participants was 104.4 (S.D. 19.9; range 50–150). Thirty-nine (18%) scored at least one standard deviation below the mean and 43 (20%) scored at least one standard deviation above the mean. As shown in Table 3, the CALLS scores are normally distributed. The distribution of CALLS total scores does not show ceiling or floor effects.
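The one-standard-deviation bands can be reproduced from the reported mean and S.D. The sketch assumes that the percentages are taken over all 211 participants, which the text implies but does not state.

```python
# Values reported in the text.
mean_score = 104.4
sd = 19.9
n = 211  # assumed denominator: all participants

low_cutoff = round(mean_score - sd, 1)    # one S.D. below the mean
high_cutoff = round(mean_score + sd, 1)   # one S.D. above the mean

low_share = 39 / n    # participants at or below the low cutoff
high_share = 43 / n   # participants at or above the high cutoff

print(low_cutoff, high_cutoff, f"{low_share:.0%}", f"{high_share:.0%}")
```

The cutoffs fall at 84.5 and 124.3 points, and the two shares round to the 18 and 20 percent given in the text.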

Internal consistency

As shown in Table 4, the CALLS showed a high internal consistency, as measured by Cronbach's alpha (0.81). The Cronbach's alpha values for the major factors were as follows: verbal learning and memory 0.88, processing speed 0.73, attention and working memory 0.56, verbal fluency and naming 0.46, and concept formation 0.18.

Table 4 Factor Loadings of CALLS Test

Concurrent validity

The CALLS total score correlated moderately with the MMSE total score (Pearson's correlation, r = 0.60). Additional analyses of the relations between the MMSE total score and each of the CALLS domain factors revealed significant correlations: Verbal Learning and Memory (r = 0.41; p < 0.001); Processing Speed (r = 0.24; p < 0.001); Attention and Working Memory (r = 0.23; p < 0.001); Verbal Fluency and Naming (r = 0.38; p < 0.001); and Concept Formation (r = 0.33; p < 0.001).
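The correlation coefficient used here can be sketched in a few lines. The function below is the textbook Pearson product-moment formula; the paired scores are invented for illustration and are not participant data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Made-up paired totals for five hypothetical participants.
calls_scores = [88, 104, 97, 121, 110]
mmse_scores = [25, 27, 28, 29, 28]

r = pearson_r(calls_scores, mmse_scores)
print(round(r, 2))
```

Pearson's r assumes a linear relationship, which is why the linearity check mentioned in the statistical analyses precedes its use.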

Construct validity

The principal component analysis resulted in five components with eigenvalues above one. These components accounted for 11 percent of the total matrix variance. The loadings are described in Table 4. The components were labeled as: a) verbal learning and memory (0.883); b) processing speed (0.731); c) attention and working memory (0.555); d) verbal fluency and naming (0.457); and e) concept formation (0.179).

Using the same principal component analysis strategy, the neuropsychological test battery yielded six components similar to the CALLS components. As shown in Table 5, the CALLS total score had largely moderate correlations (all statistically significant) with each of the neuropsychological tests. The strongest correlations were with Verbal Learning and Memory and Verbal Fluency and Naming. Weaker correlations tended to be with visuospatial (e.g. JLO) and non-verbal items (e.g. Facial Recognition). Further, the CALLS total score correlated with four neuropsychological testing components: Verbal Learning and Memory (r = 0.42; p < 0.0001), Verbal Fluency and Naming (r = 0.44; p < 0.0001), Episodic Memory for Contextual Information (r = 0.22; p < 0.0016), and Attention and Working Memory (r = 0.29; p < 0.0001). The CALLS did not correlate with visuospatial processing or non-verbal memory.

Table 5 Correlations between CALLS Total Score and Neuropsychological Components

The CALLS battery also produced expected results for age and education. Older age was significantly correlated with lower CALLS scores (r = -0.35; p < 0.0001). Those with the highest education scored better on the total CALLS test than those with lower education.

Women scored higher than men on the CALLS total score (p = 0.0285), perhaps due to the higher proportion of verbal memory on the test. No CALLS score differences were found based on race or ethnicity.

Moderate intercorrelations were found between Simple Reaction Time and Verbal Fluency and Naming items such as Animal Naming (left ear 0.28, p < 0.0001; right ear 0.33, p < 0.001), F Words (left ear 0.21, p < 0.0027; right ear 0.19, p < 0.0048) and Word List Trial 1 (left ear 0.25, p < 0.003; right ear 0.27, p < 0.0001). The mean score for depression as measured by the CESD was 27.2 (S.D. 6.8) with a range of scores from 20 to 61. There were no significant correlations of the CALLS total or individual scores with depression.


The results of the current validation study suggest that the CALLS instrument is a valid measure for assessing cognitive function in an aging population. The linear correlation between the CALLS and the MMSE (Pearson r = 0.60; p < 0.05) revealed a moderate level of concurrent validity, despite different administration modalities (in-person administered versus telephone-administered). Additional analyses of the relations between the MMSE total score and each of the CALLS individual test items revealed significant correlations.

The CALLS total score was found to be significantly related to verbal learning and memory, verbal fluency and naming, attention and working memory, and episodic memory for contextual information. It was not associated with visuospatial or non-verbal factors from the neuropsychological battery. The majority of findings regarding the effect of age and education on the cognitive outcome were consistent with previous screens, and all results were in expected directions. These findings suggest that the CALLS may be effectively used in place of standard in-person neuropsychological evaluations in situations where in-person evaluation would be impractical to administer.

While further analytical work is required to assess the norms and predictive capacity of the CALLS, the potential clinical utility of the CALLS is reflected in its ability to perform as well as other tests or procedures. For example, the CALLS is well suited for assessing aspects measured by the MMSE, as well as some domains not well assessed by the MMSE. Additionally, the CALLS battery's 12-item word list with immediate and delayed conditions is significantly associated with the neuropsychological battery's verbal learning and memory component. Similarly, a strong association exists between the neuropsychological battery's verbal fluency and naming and the CALLS test component that includes semantic (animal naming) and phonemic (F words) fluency. Additionally, there was a noteworthy association between the CALLS version and the neuropsychological battery's version of digit span tests (forward and backward). The fact that these findings reveal such robust associations gives credence to the assertion that the CALLS battery validly measures these cognitive domains.

The CALLS battery provides unique measures of reaction time and processing speed. As a part of the cognitive progression, speed of processing is well documented to decline with age [43, 44]. Moreover, the enhanced accuracy of timing assessment in the CALLS may make it more suitable for identifying deficits, especially when reduced processing speed and reaction time include delayed onset of responses and increased decision making times (i.e., reduced information processing speed). Further, in non-computerized assessments, there are some cases in which uncontrolled error margins between stimulus onset and actual stimulus display may result in the modeling of "noise" rather than veridical information [45]. Our results are preliminary, and while norms for different age groups will need to be established, the accuracy of our test is promising.

There were also moderate intercorrelations of reaction times with verbal learning and memory and verbal fluency and naming items in the CALLS battery and very small intercorrelations with the verbal fluency and naming items in the neuropsychological tests. These findings suggest the possible relationship of processing speed in retrieval of words from memory. They further suggest that failure to remember words in these tests may be more a function of slow speed in recalling words than of loss of verbal memory. Alternatively, this may be an indication that slow processing speed may impede sufficient verbal encoding for delayed word list recall.

Simple reaction time can also be a measure that distinguishes cognitively healthy from dementia groups [46]. The addition of response time choices found in the CALLS battery enhances the complexity of the response time measures and may increase sensitivity to screen for early dementia [46–48]. The addition of the adapted and shortened Center for Epidemiologic Studies Depression scale also provides a screen for depression, which is also known to slow processing speed.

Concept formation including word naming and similarities offers a simple test of concrete thinking and verbal expression. Each of these was correlated with the verbal memory components, and they uniquely address the ability to demonstrate abstract thinking and to identify concepts associated with commonly used words and the ease of retrieval of accurate words.

Although executive functioning did not comprise a unique component, elements of executive functioning are measured in the processing speed component with reaction times and in the individual tests of serial 7s and similarities. Each of these had strong factor loadings in the CALLS test. The failure to identify a specific, valid factor associated with executive functioning highlights the complexity of this construct and the difficulty of using a screening test to uniquely assess this domain [49].

The CALLS battery has a number of limitations. The CALLS battery requires the use of a telephone, so no visuospatial or non-verbal tasks are included. While there were modest yet significant correlations between the CALLS and the Trail Making Test (Parts A and B), as well as between the CALLS and Facial Recognition I and II, there was a lack of association with the full components. Given that visuospatial deficits (problems with drawing, constructions, and orientation in one's own surroundings) are among the earliest manifestations of Alzheimer's disease [50, 51], this is an important limitation of the CALLS battery. On the other hand, the lack of a visuospatial component in the CALLS battery may also be helpful in situations where a neuropsychological evaluation or screen needs to be administered to persons with severe visual deficits and specific physical disabilities.

While the sample had fairly equal representation for gender, age, and ethnicity, there were few with less than a high school education. Participants with lower education are generally more difficult to recruit and tend to have lower scores on cognitive tests. The small numbers in this group may have affected the distribution and results of the CALLS scores. The generalizability of results also can be affected by the relatively low response rate.

A further possible limitation of the CALLS battery is that it is not adapted for subjects whose primary language is not English. This excluded 125 people, or 14 percent of the original sample. While no CALLS score differences were found in the current study among ethnic and racial groups, it is possible that inclusion of these subjects would have altered that finding. Future studies should include a translated version of the CALLS for use with persons whose language is other than English.

An additional limitation is that the current study's data were insufficient to evaluate the validity of the CALLS battery for application to a sample inclusive of individuals with cognitive impairment ranging from mild (mild cognitive impairment) to severe (dementia). Although we did not specifically exclude anyone in our random sample based on cognitive status, we expect that the majority of our sample was cognitively unimpaired. Future studies should examine inclusion of patients affected by mild cognitive impairment, whether progressing or not to dementia. Hence, the CALLS battery should be applied to the study of prodromal cognitive deficits [52].

Despite these limitations, the CALLS battery has a number of strengths. Studies have shown that telephone testing of participants at home is not only reliable [53, 54] but that screening at home rather than in the clinician's office may actually improve the performance of elderly subjects on these cognitive tests [55]. Further, the CALLS test provides a mechanism for the participant to select a hearing level comfortable to them that ensures appropriate volume for the test. One of the best features of the CALLS is its unique ability to measure simple and choice response times for each participant. Moreover, the thirty minutes required for the CALLS battery is far shorter than the time required for most standard in-person neuropsychological evaluations. The two to four hour time period needed for face-to-face administration makes such tests expensive and logistically unsuitable in most clinical and research settings. This is even more apparent with epidemiological studies. In addition to reduction of fatigue and increased accessibility, the CALLS battery reduces the need for expensive professional staff and locations. The utility of this instrument in large epidemiological studies is also likely increased by the fact that the test is administered via telephone with a computer interface, decreasing the need for training and validating test administrators at multiple sites.


In summary, the CALLS battery was found to be a relatively brief, yet comprehensive standardized cognitive assessment tool with robust correlations to the more time-consuming and costly in-person neuropsychological battery. The test was scrupulously pre-tested and hierarchically staged to ensure that each step followed psychometrically valid procedures. These results show that multiple domains of cognitive functioning can be reliably assessed over the telephone. The CALLS instrument is a valid test with unique telephonic and computerized features that provides a unique potential to efficiently study cognitive function in large populations.


  1. Langa KM, Larson EB, Wallace RB, Fendrick AM, Foster NL, Kabeto MU, Weir DR, Willis RJ, Herzog AR: Out-of-pocket health care expenditures among older Americans with dementia. Alzheimer Dis Assoc Disord. 2004, 18: 90-98. 10.1097/01.wad.0000126620.73791.3e.

  2. Fitzpatrick AL, Kuller LH, Ives DG, Lopez OL, Jagust W, Breitner JC, Jones B, Lyketsos C, Dulberg C: Incidence and prevalence of dementia in the Cardiovascular Health Study. J Am Geriatr Soc. 2004, 52: 195-204. 10.1111/j.1532-5415.2004.52058.x.

  3. Kukull WA, Higdon R, Bowen JD, McCormick WC, Teri L, Schellenberg GD, van Belle G, Jolley L, Larson EB: Dementia and Alzheimer disease incidence: a prospective cohort study. Arch Neurol. 2002, 59: 1737-1746. 10.1001/archneur.59.11.1737.

  4. Doody RS, Stevens JC, Beck C, Dubinsky RM, Kaye JA, Gwyther L, Mohs RC, Thal LJ, Whitehouse PJ, DeKosky ST, Cummings JL: Practice parameter: management of dementia (an evidence-based review). Report of the Quality Standards Subcommittee of the American Academy of Neurology. Neurology. 2001, 56: 1154-1166.

  5. Lezak MD: Neuropsychological Assessment. 3rd edition. 1995, New York: Oxford University Press

  6. McKhann G, Drachman D, Folstein M, Katzman R, Price D, Stadlan EM: Clinical diagnosis of Alzheimer's disease: report of the NINCDS-ADRDA Work Group under the auspices of Department of Health and Human Services Task Force on Alzheimer's Disease. Neurology. 1984, 34: 939-944.

  7. Huppert FA, Brayne C, Gill C, Paykel ES, Beardsall L: CAMCOG- A concise neuropsychological test to assist dementia diagnosis: socio-demographic determinants in an elderly population sample. Br J Clin Psychol. 1995, 34: 529-541.

  8. Athey RJ, Walker RW: Demonstration of cognitive decline in Parkinson's disease using the Cambridge Cognitive Assessment (Revised) (CAMCOG-R). Int J Geriatr Psychiatry. 2006, 21: 977-982. 10.1002/gps.1595.

  9. Chodosh J, Petitti DB, Elliott M, Hays RB, Crooks VC, Wenger N: Physician recognition of cognitive impairment: evaluating the need for improvement. J Am Geriatr Soc. 2004, 52: 1051-1059. 10.1111/j.1532-5415.2004.52301.x.

  10. Valcour VG, Masaki KH, Curb JD, Blanchette PL: The detection of dementia in the primary care setting. Arch Intern Med. 2001, 161: 1238-1239. 10.1001/archinte.161.9.1238.

  11. Crooks VC, Petitti DB, Robins SB, Buckwalter JG: Cognitive domains associated with performance on the telephone interview for cognitive status-modified. Am J Alzheimers Dis Other Demen. 2006, 21: 45-53. 10.1177/153331750602100104.

  12. de Jager CA, Budge MM, Clarke R: Utility of TICS-M for the assessment of cognitive function in older adults. Int J Geriatr Psychiatry. 2003, 18: 318-324. 10.1002/gps.830.

  13. Herzog AR, Wallace RB: Measures of cognitive functioning in the AHEAD Study. J Gerontol B Psychol Sci Soc Sci. 1997, 52B: 37-48.

  14. Hill J, McVay JM, Walter-Ginzburg A, Mills CS, Lewis J, Lewis BE, Fillit H: Validation of a brief screen for cognitive impairment (BSCI) administered by telephone for use in the Medicare population. Dis Manag. 2005, 8: 223-234. 10.1089/dis.2005.8.223.

  15. Kawas C, Karagiozis H, Resau L, Corrada M, Brookmeyer R: Reliability of the Blessed Telephone Information-Memory-Concentration Test. J Geriatr Psychiatry Neurol. 1995, 8: 238-242.

  16. Knopman DS, Knudson D, Yoes ME, Weiss DJ: Development and standardization of a new telephonic cognitive screening test: the Minnesota Cognitive Acuity Screen (MCAS). Neuropsychiatry Neuropsychol Behav Neurol. 2000, 13: 286-296.

  17. Roccaforte WH, Burke WJ, Bayer BL, Wengel SP: Reliability and validity of the Short Portable Mental Status Questionnaire administered by telephone. J Geriatr Psychiatry Neurol. 1994, 7: 33-38.

  18. Folstein MF, Folstein SE, McHugh PR: "Mini-Mental State". A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res. 1975, 12: 189-198. 10.1016/0022-3956(75)90026-6.

  19. Fountoulakis KN, Tsolaki M, Mohs RC, Kazis A: Epidemiological dementia index: a screening instrument for Alzheimer's disease and other types of dementia suitable for use in populations with low education level. Dement Geriatr Cogn Disord. 1998, 9: 329-338. 10.1159/000017085.

  20. Lindesay J, Jagger C, Mylnik-Szmid A, Sinorwala A, Peet S, Moledina F: The Mini-Mental State Examination (MMSE) in the elderly immigrant Gujarati population in the United Kingdom. Int J Geriatr Psychiatry. 1997, 12: 1155-1167. 10.1002/(SICI)1099-1166(199712)12:12<1155::AID-GPS705>3.0.CO;2-E.

  21. Tombaugh TN, McIntyre NJ: The Mini-Mental State Examination: A comprehensive review. J Am Geriatr Soc. 1992, 40: 922-935.

  22. Nelson A, Fogel BS, Faust D: Bedside cognitive screening instruments: a critical assessment. J Nerv Ment Dis. 1986, 174: 73-83. 10.1097/00005053-198602000-00002.

  23. Sabe L, Jason L, Juejati M: Sensitivity and specificity of the Mini-Mental State Exam in the diagnosis of dementia. Behav Neurol. 1993, 6: 207-210.

  24. Swirsky ST, Field HL, Mitchell DR: Longitudinal diagnosis of memory disorders. Age Ageing. 1992, 21: 393-397. 10.1093/ageing/21.6.393.

  25. Brandt J, Spencer M, Folstein M: The telephone interview for cognitive status. Neuropsychiatry Neuropsychol Behav Neurol. 1988, 1: 111-117.

  26. Welsh KA, Breitner JCS, Magruder-Habib KM: Detection of dementia in the elderly using telephone screening of cognitive status. Neuropsychiatry Neuropsychol Behav Neurol. 1993, 6: 103-110.

  27. Plassman BL, Newman TT, Welsh KA, Helms M, Breitner JCS: Properties of the telephone Interview for Cognitive Status: application in epidemiological and longitudinal studies. Neuropsychiatry Neuropsychol Behav Neurol. 1994, 7: 235-241.

  28. Ferrucci L, Del Lungo I, Guralnik JM, Bandinelli S, Benvenuti E, Salani B, Lamponi M, Ubezio C, Benvenuti F, Baroni A: Is the telephone interview for cognitive status a valid alternative in persons who cannot be evaluated by the Mini Mental State Examination?. Aging. 1988, 10: 332-338.

  29. Milberg W: Issues in the assessment of cognitive function in dementia. Brain Cogn. 1996, 31: 114-132. 10.1006/brcg.1996.0037.

  30. Ensel WM: Measuring depression: The CES-D scale. Social support, life events, and depression. Edited by: Lin N, Dean A, Ensel W. 1986, New York: Academic Press

  31. The Psychological Corporation: WAIS-III:WMS-III: Technical Manual. 1997, San Antonio: Harcourt Brace

  32. Delis D, Kramer J, Kaplan E, Ober B: California Verbal Learning Test: Adult Version Manual. 1987, San Antonio: Harcourt Brace

  33. War Department Adjutant General's Office: Army Individual Test Battery: Manual of Directions and Scoring. 1944, Washington D.C.: War Department Adjutant General's Office

  34. Benton A, Varney N, Hamsher K: Visuospatial judgment: A clinical test. Arch Neurol. 1978, 35: 364-367.

  35. Spreen O, Benton A: Neurosensory Center Comprehensive Examination for Aphasia (NCCEA). 1977, Victoria, British Columbia: University of Victoria Neuropsychology Laboratory

  36. Kaplan E, Goodglass H, Weintraub S: Boston Naming Test. 1982, Philadelphia: Lea & Febiger

  37. Goodglass H, Kaplan E: The Assessment of Aphasia. 1983, Philadelphia: Lea & Febiger

  38. The Psychological Corporation: Wechsler Test of Adult Reading™ (WTAR™) Third Edition, WMS-III, WTAR Manual. 2001, Toronto, CAN: Harcourt

  39. Derogatis LR: Symptom Checklist-90-Revised. 1996, Minneapolis, Minnesota: Pearson Assessments

  40. Yesavage JA, Brink TL, Lum O, Huang V, Adey M, Leirer VO: Development and validation of a geriatric depression scale: a preliminary report. J Psychiatr Res. 1983, 17: 37-49. 10.1016/0022-3956(82)90033-4.

  41. Lubben J, Gironda M, Lee A: Refinements to the Lubben Social Network Scale: The LSNS-R. The Behavioral Measurements Letter. 2001, 7: 2-11.

  42. Gorsuch RL: Factor Analysis. 1983, Hillsdale, NJ: Erlbaum

  43. Hertzog C, Schear JM: Psychometric considerations in testing the older person. Testing older adults: A reference guide for geropsychological assessments. Edited by: Hunt T, Lindley CJ. 1989, Austin, Texas: Pro-ed.

  44. Salthouse TA, Babcock RL, Shaw RJ: Effects of adult age on structural and operational capacities in working memory. Psychol Aging. 1991, 6: 118-127. 10.1037/0882-7974.6.1.118.

  45. Gur RC, Ragland JD, Moberg PJ, Turner TH, Bilker WB, Kohler C, Siegel SJ, Gur RE: Computerized neurocognitive scanning: I. Methodology and validation in healthy people. Neuropsychopharmacology. 2001, 25: 766-776. 10.1016/S0893-133X(01)00278-0.

  46. Hofman M, Seifritz E, Krauchi K, Hock C, Hampel H, Neugebauer A, Muller-Spahn F: Alzheimer's disease, depression and normal ageing: merit of simple psychomotor and visuospatial tasks. Int J Geriatr Psychiatry. 2000, 15: 31-39. 10.1002/(SICI)1099-1166(200001)15:1<31::AID-GPS72>3.0.CO;2-#.

  47. Ferris S, Crook T, Sathananthan G, Gershon S: Reaction time as a diagnostic measure in senility. J Am Geriatr Soc. 1976, 24: 529-533.

  48. Teng EL, Chui HC, Saperia D: Senile dementia: Performance on a neuropsychological test battery. Recent Advances in Cardiovascular Disease. 1990, 11: 27-34.

  49. Leeds L, Meara RJ, Woods R, Hobson JP: A comparison of the new executive functioning domains of the CAMCOG-R with existing tests of executive function in elderly stroke survivors. Age Ageing. 2001, 30: 251-254. 10.1093/ageing/30.3.251.

  50. Mendez MF, Cummings JL: Dementia: A Clinical Approach. 3rd edition. 2003, Philadelphia, PA: Butterworth-Heinemann (Elsevier)

  51. Smith MZ, Esiri MM, Barnetson L, King E, Nagy Z: Constructional apraxia in Alzheimer's disease: An association with occipital lobe pathology and accelerated cognitive decline. Dement Geriatr Cogn Disord. 2001, 12: 281-288. 10.1159/000051270.

  52. Lines CR, McCarroll KA, Lipton RB, Block GA, Prevention of Alzheimer's In Society's Elderly Study Group: Telephone screening for amnesic mild cognitive impairment. Neurology. 2003, 60: 261-266.

  53. Debling D, Amelang M, Hasselbach P, Sturmer T: Assessment of cognitive status in the elderly using telephone interviews. Z Gerontol Geriatr. 2005, 38: 360-367. 10.1007/s00391-005-0299-5.

  54. Wilson RS, Bennett DA: Assessment of cognitive decline in old age with brief tests amenable to telephone administration. Neuroepidemiology. 2005, 25: 19-25. 10.1159/000085309.

  55. Shievitz AL, Tudiver F, Araujo A, Sanghe P, Boyle E: Do elderly people score better on cognitive tests at home?. Can Fam Physician. 1998, 44: 1652-1656.


The authors acknowledge the invaluable assistance and contributions of Vanessa Kapinus, Aura Barreda and Jesus Gomez.

This study was funded in part by the US Public Health Service, National Institutes of Health (NIA AG19218).

Author information

Corresponding author

Correspondence to Valerie C Crooks.

Additional information

Competing interests

The author(s) declare that they have no competing interests.

Authors' contributions

VCC and JGB contributed to all aspects of the study's design, implementation, analysis and interpretation, and to drafting, revising and critically reviewing the paper.

TDP contributed to the analysis and interpretation of the study, and to drafting, revising and critically reviewing the paper.

All authors have given final approval of this submission.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Crooks, V.C., Parsons, T.D. & Buckwalter, J.G. Validation of the Cognitive Assessment of Later Life Status (CALLS) instrument: a computerized telephonic measure. BMC Neurol 7, 10 (2007).
