TMS achieved the highest discrimination performance for identifying stroke mimics in our study. The AUC of TMS in our study (0.75) was similar to that reported in a previously published validation study using TMS to identify stroke mimics (AUC of 0.72) [14]. That validation study also used a telestroke cohort, similar to our current study. The similar performance suggests that TMS provides consistent discrimination for stroke mimics across different stroke populations. However, despite TMS having the highest AUC (0.75), its discrimination performance may be insufficient for the score to be clinically meaningful in the setting of emergency decision making for IVT, where a higher AUC may be needed. FABS and simplified FABS did not discriminate stroke mimics as well in our study as in the original derivation studies [9, 10]. This may be due to differing study inclusion criteria. Our study only considered patients who were administered IVT, while the derivation studies included all patients who presented with ischemic stroke symptoms. Moreover, as our study population was derived from a primary stroke centre utilising telestroke in the presence of a neurologist, some stroke mimics may have already been clinically excluded, which may account for differing cohort characteristics when compared to a tertiary stroke centre. Hence, we can generalise our results only to the clinical setting of decision making for IVT, in the presence of an emergency physician, a neurologist, and a normal NCCT in a telestroke setting. Nevertheless, all of the scores were adequately rated by the study team; hence, the difference in score performance cannot be completely accounted for by the differences in inclusion criteria alone.
Our study was the first to externally validate the Khan score [12]. Although the score had the lowest AUC in our study, it demonstrated the highest specificity (88.5%) and positive predictive value (97.5%) at a cutoff of 2. These findings suggest that the Khan score may be the preferred clinical score when a high level of clinical certainty of stroke mimic is required before any decision is made to withhold IVT. Of note, it has a low sensitivity for stroke mimics (32.1%) and a poor ability to rule out stroke mimics with confidence. Conversely, the TMS has the highest sensitivity for stroke mimics (91.3%) among the four scores. Hence, it may be the score of choice if one would simply like to screen for the clinical possibility of a stroke mimic. Given the above limitations of the Khan score and the TMS, it may be possible for the emergency physician to use both scores for IVT decision making, with the TMS as an initial screening tool followed by the Khan score for confirmation. This will require further evaluation in a larger cohort, as our current study was not sufficiently powered to detect a statistically significant difference.
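For readers less familiar with how the quoted sensitivity, specificity and predictive values relate to one another, the following minimal Python sketch computes them from a 2x2 table. The function name `diagnostic_metrics` and the counts in the usage example are hypothetical and are not taken from our cohort; the "positive" class is simply whichever condition a score is designed to flag.

```python
def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard 2x2 diagnostic metrics for a score at one fixed cutoff.

    tp, fp, fn, tn are true positives, false positives, false negatives
    and true negatives with respect to the condition the score flags.
    """
    return {
        "sensitivity": _ratio(tp, tp + fn),  # flagged cases among all true cases
        "specificity": _ratio(tn, tn + fp),  # unflagged among all true non-cases
        "ppv": _ratio(tp, tp + fp),          # true cases among all flagged
        "npv": _ratio(tn, tn + fn),          # true non-cases among all unflagged
    }


# Hypothetical counts, for illustration only (not study data).
example = diagnostic_metrics(tp=8, fp=2, fn=12, tn=78)
```

With these illustrative counts, a score can combine a high specificity with a low sensitivity, which is the pattern described above for the Khan score: a positive result is informative, but a negative result does not rule the condition out.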
A clinical decision tree is an alternative method of prediction modelling to traditional prediction scores. Clinical decision trees have been utilised in stroke care [15] as well as in other subspecialties [16, 17] to aid physicians in decision making. Similarly, a clinical decision tree may help emergency physicians identify stroke mimics by using readily available clinical information without advanced imaging. We included only clinical variables, without radiological variables, in the clinical decision tree so that it would be applicable to clinicians without access to advanced imaging. Because univariate analysis revealed advanced age, presence of migraine, hypertension and history of psychiatric illness to be significantly different between stroke mimic and acute ischemic stroke patients, we incorporated these variables in our derivation of the decision tree. Moreover, two of these clinical variables (age and hypertension) also appear in the 4 tested prediction scores, suggesting that they are consistent predictors of stroke mimics.
A novel and interesting finding from our clinical decision tree analysis was that the presence of migraine was the most important decision point in the evaluation of stroke mimics. In the absence of migraine, the next most important consideration was the age of the patient. It was surprising to find that older patients (more than 45 years) with migraine were more likely to be stroke mimics. This is an unexpected observation, considering that all of the previous prediction scores were weighted towards younger stroke mimics. The 3 patients in our database who had migraine and were above 45 years old were aged 48, 50 and 56 years. This suggests that although our decision tree cutoff was 45 years, these stroke mimics with migraine were still relatively young. Our decision tree also suggests that the very young, aged 34 years or less, were highly likely (66.7%) to be stroke mimics. More importantly, the proportion of true stroke was very high (96.9%) in patients without migraine, older than 45 years and without a psychiatric history. This suggests that the emergency physician can be relatively certain of the presence of true stroke using just 3 simple clinical factors. Overall, our decision tree had a higher AUC (0.8) than TMS. Prospective validation of the decision tree in a larger external cohort is warranted.
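The branches described in this paragraph can be sketched in code. This is an illustration only, not the fitted tree. It encodes just the splits stated in the text, returns "unspecified" for every path the text does not describe, and assumes that the 34-year threshold sits in the no-migraine branch, which the text does not state explicitly. The names `Patient` and `classify` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    age: int                   # age in years
    migraine: bool             # presence of migraine
    psychiatric_history: bool  # history of psychiatric illness


def classify(p: Patient) -> str:
    """Return a coarse label following only the branches stated in the text."""
    if p.migraine:
        # Migraine is the first decision point; older patients with
        # migraine were more likely to be stroke mimics.
        if p.age > 45:
            return "likely mimic"
        return "unspecified"  # branch not described in the text
    # No migraine: age is the next consideration.
    if p.age <= 34:
        # Assumption: this threshold belongs to the no-migraine branch.
        return "likely mimic"  # 66.7% stroke mimics reported
    if p.age > 45 and not p.psychiatric_history:
        return "likely stroke"  # 96.9% true stroke reported
    return "unspecified"  # remaining paths are not described in the text


label = classify(Patient(age=70, migraine=False, psychiatric_history=False))
```

Returning "unspecified" rather than guessing keeps the sketch faithful to what is reported here; the leaf labels are qualitative summaries of the quoted proportions, not predictions.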
The stroke mimic rate in our study (6.6%) is considerably lower than in other studies, which report rates as high as 26–30% [2, 18]. Although the low stroke mimic rate is likely attributable to our study design, which included only patients who received IVT, stroke mimic rates can be as high as 16.6% even in a randomised controlled trial of thrombolysis [19]. Nevertheless, our finding is comparable with other studies that included only patients who had IVT and reported stroke mimic rates of 3.5–4.1% [5, 20]. The largest proportion of stroke mimics in our study (41.2%) were diagnosed with functional weakness. This is in contrast to seizure or migraine as the most common stroke mimic in previously published series [19, 21,22,23]. However, other studies have similarly found functional mimics to be the most common stroke mimic (14.5–16.7%) [24, 25], with 32% functional stroke mimics reported in a hyperacute unit [26]. This suggests that functional stroke mimics may indeed be very common across different study populations. Our study recorded a single hemorrhagic complication from unwarranted IVT in a stroke mimic. Unfortunately, this patient required urgent surgical evacuation of an epidural hematoma. Although the hemorrhagic complication rate from IVT in stroke mimics has been reported to be low [5, 20], these complications, when they occur, may be life threatening and require urgent surgical intervention. Therefore, preventing unwarranted IVT in stroke mimics is essential and remains an unmet clinical need.
Negative neuroimaging for cerebral ischemia is common after thrombolysis. In well-characterised thrombolysis cohorts, approximately 20% of post-thrombolysis patients [27, 28] had no evidence of ischemia on follow-up neuroimaging. We reported 31 (12.1%) patients with negative neuroimaging in our cohort, which is lower than in the published literature. Although we used a combination of negative neuroimaging findings and persistence of symptoms beyond 24 h to characterise stroke mimics in our study, some of the patients with resolution of symptoms within 24 h may, in fact, have been stroke mimics rather than TIA. Nevertheless, all 14 patients (5.4%) who had complete resolution of neurological symptoms within 24 h of thrombolysis were diagnosed as having a vascular cause by an independent neurologist who was not part of this study. Conversely, there could also be MRI-negative strokes [29] in patients with small lacunar infarctions who were misclassified as stroke mimics. This is possible because 2 of our stroke mimics had negative imaging but no definitive psychiatric diagnosis. This diagnostic uncertainty adds to the unmet clinical need to identify stroke patients with greater certainty.
The main limitation of our study was an under-representation of stroke mimics due to the nature of our study design. The study only included patients who were administered IVT and did not include all patients who presented to the emergency department with stroke symptoms. Because the study was performed via a telestroke network, the neurologist could have excluded stroke mimics on the basis of clinical experience and other clinical factors not recorded in our study. Therefore, there may be an under-representation of stroke mimics. Additionally, because stroke mimic patients were potentially excluded, our results cannot be generalised to all patients who present with stroke-like symptoms to the emergency department. The study results are applicable only in the clinical setting of decision making for IVT, in the presence of an emergency physician, a neurologist, and a normal NCCT in a telestroke setting. However, our results may still serve as the final clinical checkpoint in IVT decision making. Our study was performed at a primary stroke centre using a telestroke system, which is different from a tertiary referral centre with an in-house hyperacute stroke team, where stroke mimic rates may be lower. Moreover, in a centre with readily available advanced imaging, the value of a decision tree using only patient-derived information will be lower. Regardless, in many primary stroke centres or remote hospitals around the world, neuroimaging beyond NCCT brain may not be available. Hence, we specifically excluded further neuroimaging, such as CT angiogram or MRI, from the clinical decision tree. This focuses the tree on clinical assessment, which is still key in clinical stroke diagnosis, and enables it to help physicians around the world without advanced imaging to ascertain the presence of stroke mimics. We did not have long-term functional outcome data on the stroke mimic patients. Such data would have enabled us to understand and prognosticate the long-term effects of IVT in stroke mimics. We did not validate our results with an external stroke cohort, which would have strengthened our prediction model.