Fig. 2 | BMC Neurology

From: Informing the development of an outcome set and banks of items to measure mobility among individuals with acquired brain injury using natural language processing

Fig. 2

The iterative improvement process for the preliminary item bank. The process began with an initial Sentence-BERT model and relied heavily on the ICF ontology to produce a sufficiently good first clustering. At each step, a grid search was performed over a wide range of hyperparameter values, and the best clustering was retained according to automatic heuristics and human evaluation. After each clustering, expert annotations were collected to improve the Sentence-BERT model and yield better clusterings. We report the F1 score of each clustering with respect to the first and second expert annotations, named E_1 and E_2, respectively. Here, E_2 is the more reliable metric, as it associates items with adequate labels, while E_1 associates item pairs with whether or not they belong together. By nature, E_1 penalizes having a large number of clusters, as can be seen in the third clustering's score. Note also that neither E_1 nor E_2 is an exact metric: for instance, the third clustering still required heavy fine-tuning by experts to yield a satisfactory Core Outcome Set despite its near-perfect E_2 score.
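
The caption does not say how the F1 score against the pair annotations E_1 was computed. As a minimal sketch, assuming each annotation is a triple (item, item, belong together or not) and that a pair counts as a predicted positive when both items land in the same cluster, the following may illustrate why a pair-based score tends to drop as the number of clusters grows. The function name `pairwise_f1` and the toy items are hypothetical and are not taken from the article.

```python
def pairwise_f1(clusters, annotated_pairs):
    """F1 of a clustering against pair annotations (hypothetical helper).

    clusters: dict mapping item -> cluster id
    annotated_pairs: list of (item_a, item_b, together) with together a bool
    """
    tp = fp = fn = 0
    for a, b, together in annotated_pairs:
        predicted = clusters[a] == clusters[b]  # same cluster = predicted "together"
        if predicted and together:
            tp += 1
        elif predicted and not together:
            fp += 1
        elif together:
            fn += 1  # experts say "together", but the clustering split them
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy annotations, invented for illustration only.
pairs = [
    ("walk", "stroll", True),
    ("walk", "run", True),
    ("stroll", "run", True),
    ("walk", "grasp", False),
]

# A coarse clustering keeps all three locomotion items together.
coarse = {"walk": 0, "stroll": 0, "run": 0, "grasp": 1}
# A finer clustering splits "run" into its own cluster.
fine = {"walk": 0, "stroll": 0, "run": 1, "grasp": 2}

coarse_f1 = pairwise_f1(coarse, pairs)
fine_f1 = pairwise_f1(fine, pairs)
print(coarse_f1, fine_f1)
```

In this toy case the finer clustering makes no false-positive pairs but misses two of the three "together" pairs, so recall falls and the score with it. That is one plausible reading of why E_1 penalizes clusterings with many clusters, even when each cluster is internally coherent.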
