We used the Nationwide Inpatient Sample of the 1997 Healthcare Cost and Utilization Project (HCUP), release 6, produced by the Agency for Healthcare Research and Quality [20]. The Nationwide Inpatient Sample (NIS) is a sample of US community hospitals, including nonfederal, short-term, general and specialty hospitals. It excludes all long-term hospitals, psychiatric hospitals, and alcoholism/chemical dependency treatment facilities. The NIS is based on a 20% stratified probability sample of hospitals with sampling probabilities proportional to the total number of US community hospitals in each stratum. The five stratification variables are geographic region (Northeast, Midwest, South, and West), ownership (government nonfederal, private not-for-profit, private investor-owned), location (urban vs. rural), teaching status (teaching vs. non-teaching), and size based on number of beds (small, medium, and large). The database includes information typically available from discharge abstracts for 7,148,420 inpatient stays in 1,012 hospitals across 22 geographically dispersed states (AZ, CA, CO, CT, FL, GA, HI, IL, IA, KS, MD, MA, MO, NJ, NY, OR, PA, SC, TN, UT, WA, WI). Our study was approved by the Committee on Human Research at the University of California San Francisco.
Incidence of hospitalized strokes
HCUP does not distinguish between first and recurrent stroke admissions; therefore, we report the combined incidence of first and recurrent stroke leading to hospitalization. We identified all patients with a primary diagnosis of stroke (International Classification of Diseases, Ninth Revision [ICD-9] codes 430–434, 436). Stroke subtypes were defined using previously reported codes [21–23]: acute ischemic stroke (ICD-9 433.01, 433.11, 433.21, 433.31, 433.81, 433.91, 434.01, 434.11, 434.91, 436), SAH (ICD-9 430), and ICH (ICD-9 431). For standardization of stroke incidence, age groups were defined as 0–19, 20–29, 30–39, 40–49, 50–59, 60–64, 65–74, 75–84, and 85+ years. Gender was classified as male or female. Race-ethnicity, based on self-report, was coded as non-Hispanic white, Black, Hispanic, API, Native American, or other.
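The subtype definitions above can be expressed as a simple lookup. The following is a minimal sketch, not the study's actual code: the function name `classify_stroke` and the assumption that codes arrive as dotted strings (for example "434.91") are illustrative only, and real discharge files may store codes in a different format.

```python
# Illustrative sketch only: maps a primary ICD-9 diagnosis code to the
# stroke subtypes listed in the text. Assumes dotted string codes.
ISCHEMIC_CODES = {
    "433.01", "433.11", "433.21", "433.31", "433.81", "433.91",
    "434.01", "434.11", "434.91", "436",
}


def classify_stroke(icd9_code):
    """Return 'ischemic', 'SAH', 'ICH', or None for any other code."""
    code = icd9_code.strip()
    if code in ISCHEMIC_CODES:
        return "ischemic"
    if code == "430":
        return "SAH"
    if code == "431":
        return "ICH"
    # Codes outside the three subtype definitions are left unclassified.
    return None
```

Codes that fall within the overall stroke range (430–434, 436) but outside these three lists are returned as unclassified in this sketch.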
Unadjusted incidence rates were calculated from the number of cases in the database and the discharge sampling weights specific to each age group, stroke type, and race-ethnicity, with the 1997 non-Hispanic API or white populations from Census Bureau data [24] as the denominators. Age-adjusted and gender-specific incidence rates were calculated by the direct method using the entire US population in 2000 as the standard [25]. The number of exposed individuals in each age category was calculated by dividing the 1997 US population in that category by the corresponding discharge sampling weight. An event rate for each age category was then calculated by dividing the number of cases identified in the database by the number of exposed individuals. The age-adjusted rate equals the sum of the age-specific rates, each multiplied by the corresponding age-specific proportion of the standard US population. Standard errors were computed for each age category and for the overall age-adjusted incidence rate. The 95% confidence interval (CI) was calculated as the adjusted rate ± 1.96 times the standard error.
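The direct standardization described above can be sketched as follows. This is an illustration under stated assumptions rather than the original analysis code: the function name `age_adjusted_rate` and its arguments are hypothetical, and the variance formula assumes Poisson-distributed case counts within each age category, which the text does not specify.

```python
import math


def age_adjusted_rate(cases, weights, population, standard_population):
    """Directly standardized rate, its standard error, and a 95% CI.

    cases, weights, population, and standard_population are parallel
    lists with one entry per age category.
    """
    total_standard = sum(standard_population)
    rate = 0.0
    variance = 0.0
    for c, w, n, s in zip(cases, weights, population, standard_population):
        exposed = n / w                    # population divided by sampling weight
        age_rate = c / exposed             # age-specific event rate
        proportion = s / total_standard    # share of the standard population
        rate += proportion * age_rate
        # Assumption: Poisson counts, so var(age_rate) = c / exposed**2.
        variance += proportion ** 2 * c / exposed ** 2
    se = math.sqrt(variance)
    return rate, se, (rate - 1.96 * se, rate + 1.96 * se)


# Made-up inputs for two age categories, for illustration only.
example = age_adjusted_rate(
    cases=[10, 40],
    weights=[5.0, 5.0],
    population=[100000, 50000],
    standard_population=[600, 400],
)
```

Multiplying the returned rate by 100,000 gives the rate per 100,000 population.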
Estimates were obtained for national as well as regional incidence. The four geographical regions were Northeast, Midwest, South, and West, as defined by the US Census Bureau in 2000 [26]. Point estimates and 95% CIs for incidence rate ratios were calculated using non-Hispanic whites as the reference group. Variances for adjusted incidence rates were calculated, and regional heterogeneity was tested using ANOVA. Pairwise t-tests with Bonferroni correction for multiple comparisons were used to identify statistically significant differences between pairs of regional incidence rates among APIs and non-Hispanic whites.
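One common way to obtain a rate ratio and its 95% CI from two adjusted rates and their standard errors is sketched below. The text does not state which interval method was used, so the log-scale delta-method approximation here is an assumption, and the function name `rate_ratio_ci` is hypothetical.

```python
import math


def rate_ratio_ci(rate, se, ref_rate, ref_se, z=1.96):
    """Rate ratio against a reference group with an approximate CI.

    Assumption: the two rates are independent and the interval is built
    on the log scale using the delta method.
    """
    ratio = rate / ref_rate
    # Approximate standard error of log(ratio).
    se_log = math.sqrt((se / rate) ** 2 + (ref_se / ref_rate) ** 2)
    lower = ratio * math.exp(-z * se_log)
    upper = ratio * math.exp(z * se_log)
    return ratio, (lower, upper)
```

Here `ref_rate` and `ref_se` would be the adjusted rate and standard error for the reference group, non-Hispanic whites.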
Case fatality
Case fatality, defined as in-hospital death from any cause, was calculated for each stroke subtype in non-Hispanic whites and APIs. The Wilcoxon rank-sum test was used to evaluate race-ethnic differences in age, and the chi-squared test was used to test for differences in stroke subtype. Robust logistic regression [27] was used to estimate the independent association between API race and stroke case fatality. The robust method widens CIs to account for clustering of covariates and outcomes by hospital. In multivariable analyses, we adjusted for patient characteristics including age, sex, co-morbidity score, length of stay, and median income for the patient's zip code. Co-morbidity scores were developed using a database version of the Charlson comorbidity index and represent a summary of major secondary diagnoses weighted by severity [28]. We also adjusted for hospital characteristics including geographic region, location (urban vs. rural), teaching status (teaching vs. non-teaching), ownership (government non-federal, private for profit, private non-profit), and size (small, medium, large). To determine whether there was heterogeneity among API populations within the US, we tested for interactions between ethnicity and region using the likelihood-ratio test [29]. The Stata statistical package was used for all analyses (version 7.0; Stata Corporation, College Station, TX).