A total of 30 instruments were included in this review, 18 of which measure capacity, 9 measure perceived performance and 3 measure actual performance.
Although stroke patients and cerebral palsy (CP) patients share most of their clinical symptoms, almost all upper extremity outcome measures have been developed for, tested in and used in only one patient population, i.e. either stroke or CP. To gain more insight into the possibilities of common therapies that transcend diagnosis boundaries, agreement about the choice of common instruments is needed.
Regarding stroke or CP rehabilitation, many instruments are available in the categories capacity and perceived performance. However, instruments assessing actual performance are less abundantly available for these patients. This is in sharp contrast to the importance that patients, clinicians and researchers in the field of stroke and CP rehabilitation attribute to high quality arm-hand skilled performance (AHSP) in real life.
For capacity and perceived performance only a few instruments assess quality of use (QOU) of the affected arm and hand, whereas no instruments assess the amount of use (AOU). Many instruments use other descriptions of upper limb use, such as "how much assistance is needed to perform the task?".
In addition, this systematic review revealed that a large diversity in the content of the instruments exists, making it more difficult to compare different outcome measures with each other.
Agreement about the choice and use of instruments
It can be concluded that a wide range of measurement instruments exists for the categories capacity and perceived performance. This systematic review identified 30 instruments currently available to assess AHSP in patients with stroke or CP, which are reported to be both valid and reliable. More than 155 instruments were excluded, mainly because no information had been published about their psychometric properties in patients with stroke or CP. Other reasons for exclusion were, for example, "instrument does not include items to assess the upper extremity" and "instrument is a classification instrument". Nineteen instruments were excluded because, next to arm-hand items, they also contain items not related to the upper extremity. Although instruments containing only arm-hand items are used most and are most appropriate to assess arm-hand performance, these nineteen excluded instruments might also be of interest in arm-hand assessment. For the sake of completeness, they are listed in Table 8 in Additional file 1. Responsiveness has not been tested in about 60% of the instruments included in this review.
The chance of consistent use of outcome measures between studies decreases as the range of available instruments increases. The use of different outcome measures makes it more difficult to compare similar studies with each other. It is therefore important that agreement about the choice and use of common instruments is reached. This may facilitate comparison between studies, may result in more powerful meta-analyses, and may enable the use of published data for group size calculations for new studies.
This systematic review demonstrated that only 3 out of 30 instruments were used in both patients with stroke and patients with CP (i.e. the Assessment of Motor and Process Skills, the Goal Attainment Scale and the Canadian Occupational Performance Measure). In addition, for 2 instruments separate versions exist for adults and children (i.e. the MAL/pMAL and the Abilhand/Abilhand-kids). The versions differ, for example, in age-dependent item content. Dobkin stated that the mechanisms of motor control, cognitive control and neural adaptation that accompany training and learning depend less on the underlying disease than on the spared nodes within neural networks. Indeed, clinical practice paradigms to improve, for instance, arm-hand function do not tend to differ much between patient populations. To gain more insight into the possibilities of common therapies for different patient populations with similar clinical characteristics, it is important that the same outcome measures are used. Only then is a good comparison between studies assessing the same therapies, applied in different patient populations, possible and worthwhile.
It is important to investigate whether an outcome measure that can be used in several patient populations is valid, reliable and responsive in each of the populations in which it is used. One reason is that the course of improvement of AHSP, during and after rehabilitation, may differ between patient populations. This may be caused, for example, by the fact that stroke patients can rely on learned motor patterns which they have developed during their life, whereas in children with CP these motor patterns may not be present.
Capacity and perceived performance
For the category capacity, about 6 times more instruments are available than for the category actual performance. For the category perceived performance, 3 times as many instruments are available as for actual performance.
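These ratios follow directly from the instrument counts reported in this review (18 capacity, 9 perceived performance, 3 actual performance). As a minimal sketch, the arithmetic can be reproduced as follows; the names `counts`, `ratio_to_actual` and `share_of_total` are illustrative only and are not part of the review.

```python
# Instrument counts per category, as reported in this review.
counts = {
    "capacity": 18,
    "perceived performance": 9,
    "actual performance": 3,
}


def ratio_to_actual(category: str) -> float:
    """Number of instruments in `category` per actual-performance instrument."""
    return counts[category] / counts["actual performance"]


def share_of_total(category: str) -> float:
    """Percentage of all included instruments that fall in `category`."""
    return 100 * counts[category] / sum(counts.values())


if __name__ == "__main__":
    for name in counts:
        print(name, ratio_to_actual(name), share_of_total(name))
```

Expressed as shares of the 30 included instruments, only one in ten measures actual performance.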
Although information about the highest level of functioning (capacity) may be very useful, it does not reveal valid information about the functioning of a patient in daily life (performance). It is known that a large difference may exist between capacity and performance. This difference may be caused, among other things, by the learned non-use phenomenon, developmental disregard, changes in the role of the patient at home and in society, and the fact that capacity measures the highest possible level of functioning during a short period of time (i.e. the time of testing). The latter does not mimic real life situations, where performance is continuous and, for instance, fatigue plays a role.
Patients, clinicians and researchers may have questions on both aspects of AHSP: capacity and performance. Depending on the information needed, the outcome measure should be chosen accordingly.
For the assessment of performance in stroke and CP, most instruments currently available evaluate perceived performance, whereas only 3 instruments assess actual performance. The questionnaires used to assess perceived performance take the perspective of the patient into account, which may be desirable but also has disadvantages. These questionnaires rely on recall and valid reporting by the patient. The cognitive problems that stroke patients may have might influence recall. In addition, the Hawthorne effect may play a role, i.e. the overestimation of arm-hand performance by the patient because of his/her desire to improve or to please the examiner. Furthermore, many children with CP are not able to fill in the questionnaire themselves and have to rely on parents and caregivers to do so, leading to a different perspective, which may render the questionnaire invalid.
Capacity and perceived performance are both relevant for the assessment of AHSP, but actual performance should equally be taken into account, since this reflects the real functioning of a patient in daily life. Actual performance is measured objectively. One example is video observation, in which a patient is unobtrusively monitored while performing activities of daily living. A disadvantage of video observation is that the video material has to be assessed by (multiple) experts, which makes this method potentially subjective and very time consuming. Other disadvantages are the possible intrusion on the patient's privacy and the problems of installing the system in a patient's home.
Although the video-based AHA instrument is not applied in the home situation, it was classified as a measure of actual performance, because the spontaneous use of the affected arm during a 15-minute free play session is determined using video observation.
Another method to assess actual performance is accelerometry, measuring the actual AOU of the arm-hand in daily life. Wearing accelerometers is unobtrusive and data can be collected for several consecutive days. Because data collection is done during the whole day, the registered activity will include specific task-related movements, but also non-functional movements and unintentional arm activity. Accelerometry does not provide information about the QOU of the affected arm-hand. The latter is especially of interest for patients, clinicians and researchers. The QOU of the affected arm and hand is associated with the ability to use the affected arm and hand in the home situation, performing activities of daily living.
Currently, several promising new instruments to assess actual performance have not yet been tested as to their psychometric properties and were therefore not included in this systematic review. Two examples of such instruments are the Strathclyde Upper Limb Activity Monitor (SULAM) and the Stroke Upper Limb Activity Monitor (Stroke-ULAM). The SULAM uses a pressure transducer and electrohydraulic activity sensor to determine the vertical displacement of the wrist relative to the shoulder. The Stroke-ULAM consists of 5 accelerometers and 2 electrogoniometers, measuring the actual upper limb usage of both limbs and the percentage of activity of the affected limb compared to the unaffected limb.
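To illustrate what a measure such as "percentage of activity of the affected limb compared to the unaffected limb" could look like, the sketch below computes such a ratio from per-epoch activity counts. It is not the published Stroke-ULAM algorithm: the epoch-based counts, the activity threshold and the function names `active_epochs` and `activity_ratio_percent` are all assumptions made for illustration.

```python
from typing import Sequence


def active_epochs(epoch_counts: Sequence[float], threshold: float = 0.0) -> int:
    """Count epochs whose activity count exceeds the (assumed) threshold."""
    return sum(1 for c in epoch_counts if c > threshold)


def activity_ratio_percent(
    affected: Sequence[float],
    unaffected: Sequence[float],
    threshold: float = 0.0,
) -> float:
    """Active time of the affected limb as a percentage of the unaffected limb.

    Hypothetical definition: both inputs are activity counts per epoch,
    recorded over the same period with one sensor per limb.
    """
    unaffected_active = active_epochs(unaffected, threshold)
    if unaffected_active == 0:
        raise ValueError("unaffected limb shows no activity; ratio is undefined")
    return 100.0 * active_epochs(affected, threshold) / unaffected_active


if __name__ == "__main__":
    # Made-up example data: eight epochs per limb.
    affected = [0, 3, 0, 5, 0, 0, 2, 0]
    unaffected = [4, 6, 0, 7, 2, 0, 5, 1]
    print(activity_ratio_percent(affected, unaffected))
```

A ratio of this kind summarises AOU only. It cannot distinguish task-related from non-functional movement and says nothing about QOU, which is the limitation of accelerometry noted above.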
In order to assess actual performance in stroke and CP, it is important that the systems under development are tested more extensively to determine their utility and psychometric properties in both patient populations. In addition, measurement instruments for the assessment of actual performance have to be (further) developed so that they also assess other aspects of AHSP, such as QOU or the type of activity performed. Considering the importance of instruments transcending diagnosis boundaries, such instruments should be usable in different patient populations.
Content of the instruments
A large diversity in the content of the instruments to assess AHSP in patients with stroke or CP exists. About half of the instruments included in the category capacity solely measure at ICF activity level, whereas the other half cover more ICF levels. About 89% of the instruments included in the category perceived performance and 100% of the instruments in the category actual performance solely measure at ICF activity level. If the aim of a study is to measure at ICF activity level, instruments assessing solely at ICF activity level are to be preferred. Whenever more ICF levels are included, the interpretation of the results becomes more difficult, especially when the outcome consists of a total score covering the different ICF levels.
The inclusion of unimanual and/or bimanual items differs among instruments. To determine the capacity of the affected arm-hand, unimanual tasks are useful because these tasks force the use of the affected arm-hand, which can then be assessed. However, in daily life many tasks are bimanual, requiring both hands to perform them. Moreover, in daily life the affected arm-hand is rarely used for unimanual tasks. Therefore, if the aim is to assess AHSP in daily life, bimanual items should be included.
Although there are some differences in the inclusion of basic and/or extended activities of daily living, the majority of the instruments included both basic and extended activities of daily living.
Some considerations can be made regarding this systematic review. Based on the definitions stated earlier, some instruments which were classified as activity measures in other studies were excluded from this systematic review, for example the nine hole peg test and the box and block test. However, the definitions were formulated in order to distinguish between instruments including tasks which are meaningful in daily life and tasks which are not. Instruments that contain activities of daily living in their items but measure at function level (e.g. kinematics) were also excluded.
Instruments used as classification instruments rather than assessment tools for AHSP were excluded. Examples of such instruments are the Manual Ability Classification System (MACS) and the House classification.
Some instruments, such as the COPM and the GAS, can be used to assess individual goals of patients. These instruments can be used to assess AHSP (whenever the individual goals are arm-hand activities) and have been demonstrated to be valuable in the assessment of AHSP. Therefore, these instruments were included in this review, in contrast to other reviews [69, 74]. This gives a more complete overview of all instruments available. Moreover, individual goal setting instruments are valuable since they reflect the improvement of AHSP on the tasks which are most important for the patient.
This systematic review has some limitations that have to be addressed. One limitation might be the fact that the articles retrieved from the search strategy were divided among four reviewers. However, strict a priori rules were applied in the selection and evaluation of articles and instruments, and in case of even the slightest doubt, the article was reviewed by another reviewer and if needed discussed among all four reviewers.
A second limitation might be that instruments were included in this systematic review whenever they were reported to be valid and reliable. No criteria were applied to determine the methodological quality of the studies describing the psychometrics. However, the aim of this review was to identify and evaluate instruments available for assessing AHSP in patients with stroke or CP, rather than to give an extended overview of the psychometric properties of these instruments. The latter is an important next step.