Multiple Errands Test (MET)

Overview

A literature search was conducted to identify publications on the psychometric properties of the Multiple Errands Test (MET) relevant to a population of patients with stroke. Of the 10 studies reviewed, 8 included a mixed population of patients with acquired brain injury including stroke. Studies have reviewed psychometric properties of the original MET, Hospital Version (MET-HV), Simplified Version (MET-SV), Baycrest MET (BMET) and Virtual MET (VMET), as indicated in the summaries below. While research indicates that the MET demonstrates adequate validity and reliability in populations with acquired brain injury including stroke, further research regarding responsiveness of the measure is warranted.

Floor/Ceiling Effects

No studies have reported on floor/ceiling effects of the MET with a stroke population.

Reliability

Internal consistency:
Knight, Alderman & Burgess (2002) calculated internal consistency of the MET-HV in a sample of 20 patients with chronic acquired brain injury (traumatic brain injury, n=12; stroke, n=5; both TBI and stroke, n=3) and 20 healthy control subjects matched for gender, age and IQ, using Cronbach’s alpha. Internal consistency was adequate (α=0.77).
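Cronbach’s alpha itself is a simple function of item and total-score variances. As an illustrative sketch only (the data shown in the test are invented, not Knight et al.’s), it can be computed as:

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_subjects, n_items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)      # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of subjects' total scores
    return n_items / (n_items - 1) * (1 - item_vars.sum() / total_var)
```

With perfectly consistent items the statistic reaches 1.0; values around 0.77, as reported here, are conventionally interpreted as adequate internal consistency.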

Test-retest:
No studies have reported on the test-retest reliability of the MET.

Intra-rater:
No studies have reported on the intra-rater reliability of the MET.

Inter-rater:
Knight, Alderman & Burgess (2002) calculated inter-rater reliability of the MET-HV error categories in a sample of 20 patients with chronic acquired brain injury (traumatic brain injury, n=12; stroke, n=5; both TBI and stroke, n=3) and 20 healthy control subjects matched for gender, age and IQ, using intraclass correlation coefficients. Participants were scored by 2 assessors. Inter-rater reliability was excellent (ICCs ranging from 0.81 to 1.00). The ‘rule breaks’ error category demonstrated the strongest inter-rater reliability (ICC=1.00).

Dawson, Anderson, Burgess, Cooper, Krpan and Stuss (2009) examined inter-rater reliability of the BMET with clients with stroke (n=14) or traumatic brain injury (n=13) and healthy matched controls (n=25), using intraclass correlation coefficients with a two-way random effects model. Participants were scored by two assessors. Inter-rater reliability was adequate to excellent for the five summary measures used: mean number of tasks completed accurately (ICC = 0.80), mean number of rules adhered to (ICC = 0.71), mean number of total errors (ICC = 0.82), mean number of total rules broken (ICC = 0.88) and mean number of requests for help (ICC = 0.71).
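A two-way random effects ICC of the kind used in such inter-rater analyses can be computed from the ANOVA mean squares of a subjects-by-raters matrix. The following is a generic sketch of ICC(2,1) for absolute agreement with a single rater, not Dawson et al.’s actual analysis code:

```python
import numpy as np

def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is an (n_subjects, n_raters) matrix of scores.
    """
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand = ratings.mean()
    row_means = ratings.mean(axis=1)   # per-subject means
    col_means = ratings.mean(axis=0)   # per-rater means
    # ANOVA mean squares
    msr = k * ((row_means - grand) ** 2).sum() / (n - 1)   # between subjects
    msc = n * ((col_means - grand) ** 2).sum() / (k - 1)   # between raters
    sse = (((ratings - grand) ** 2).sum()
           - k * ((row_means - grand) ** 2).sum()
           - n * ((col_means - grand) ** 2).sum())
    mse = sse / ((n - 1) * (k - 1))                        # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

Two raters who agree exactly yield an ICC of 1.0; increasing disagreement pulls the coefficient toward 0.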

Validity

Content:
Shallice & Burgess (1991) evaluated the MET in a sample of 3 patients with traumatic brain injury (TBI) who demonstrated above-average performance on measures of general ability and normal or near-normal performance on frontal lobe tests, and 9 age- and IQ-matched controls. Participants were monitored by two observers and were scored according to number of errors (inefficiencies, rule breaks, interpretation failures, task failures and total score) and qualitative observation. The patients demonstrated qualitatively and quantitatively impaired performance, particularly relating to rule breaks and inefficiencies. The most difficult subtest was the least sensitive part of the procedure and presented difficulties for both patients and control subjects.

Criterion:
Concurrent:
No studies have reported on the concurrent validity of the MET in a stroke population.

Predictive:
Maier, Krauss & Katz (2011) examined predictive validity of the MET-HV in relation to community participation with a sample of 30 patients with acquired brain injury including stroke (n=19). Community participation was measured using the Mayo-Portland Adaptability Inventory (MPAI-4) Participation Index (M2PI), completed by the participant and a significant other. The MET-HV was administered 1 week prior to discharge from rehabilitation and the M2PI was administered at 3 months post-discharge. Analyses were performed using Pearson correlation analysis and partial correlation controlling for cognitive status using FIM Cognitive scores. Predictably, higher MET-HV error scores correlated with more restrictions in community participation. There were adequate correlations between participants’ and significant others’ M2PI total scores and MET-HV total error score (r = 0.403 and 0.510 respectively), inefficiencies (r = 0.353 and 0.524) and rule breaks (r = 0.361 and 0.449). The ability of the MET-HV total error score to predict the significant others’ M2PI score remained significant but poor after partial correlation controlling for cognitive status (r = 0.212).
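A first-order partial correlation of the kind described above (a bivariate correlation adjusted for a single covariate) has a standard closed form. This sketch is generic, and the argument names are illustrative rather than taken from the study:

```python
import numpy as np

def partial_corr(x, y, z):
    """Pearson correlation of x and y, controlling for covariate z."""
    r_xy = np.corrcoef(x, y)[0, 1]
    r_xz = np.corrcoef(x, z)[0, 1]
    r_yz = np.corrcoef(y, z)[0, 1]
    # Remove the variance x and y each share with z
    return (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))
```

When the covariate is unrelated to both variables the partial correlation equals the raw correlation; shared variance with the covariate (here, cognitive status) shrinks it, as seen in the drop from r = 0.510 to r = 0.212.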

Sensitivity/Specificity:
Knight, Alderman & Burgess (2002) investigated sensitivity and specificity of the MET-HV in a sample of 20 patients with chronic acquired brain injury (traumatic brain injury, n=12; stroke, n=5; both TBI and stroke, n=3) and 20 healthy control subjects matched for gender, age and IQ*. A cut-off score of ≥ 7 errors (i.e. the 5th percentile of total errors among control subjects) correctly identified 85% of participants with acquired brain injury (85% sensitivity, 95% specificity).
*Note: IQ was measured using the National Adult Reading Test – Revised Full Scale Intelligence Quotient (NART-R FSIQ).
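Sensitivity and specificity at an error cut-off follow directly from the two groups’ error counts. The sketch below uses invented counts for illustration, not the study data:

```python
def sensitivity_specificity(patient_errors, control_errors, cutoff):
    """Classify as 'impaired' when total errors >= cutoff.

    Sensitivity: proportion of patients correctly flagged.
    Specificity: proportion of controls correctly cleared.
    """
    sens = sum(e >= cutoff for e in patient_errors) / len(patient_errors)
    spec = sum(e < cutoff for e in control_errors) / len(control_errors)
    return sens, spec
```

Choosing the cut-off at the 5th percentile of the control distribution, as Knight et al. did, fixes specificity near 95% by construction; sensitivity then depends on how far the patient distribution sits above that point.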

Alderman et al. (2003) reported on sensitivity and specificity of the MET-SV with 46 individuals with no history of neurological disease and 50 clients with brain injury including stroke (n=9). Using a cutoff score of ≥ 12 errors (i.e. the 5th percentile of controls) resulted in 44% sensitivity (i.e. correct classification of clients with brain injury) and 95.3% specificity (i.e. correct classification of healthy individuals). The authors caution that deriving a single measure based only on number of errors fails to consider between-group qualitative differences in performance. Accordingly, error scores were recalculated to reflect the “normality” of each error type, with errors weighted according to their prevalence in the healthy control group (acceptable errors seen in up to 95% of healthy controls = 1; errors demonstrated by ≥ 5% of healthy controls = 2; errors unique to the patient group = 3). Using the same cutoff score of ≥ 12 errors (5th percentile of controls), the weighted scores resulted in 82% sensitivity and 95.3% specificity. The MET-SV was more sensitive than traditional tests of executive function (Cognitive Estimates, FAS Verbal Fluency, MWCST), and MET-SV error category scores were highly predictive of ratings of executive symptoms in patients who passed traditional executive function tests but failed the MET-SV shopping task.

Construct:
Convergent/Discriminant:
Knight, Alderman & Burgess (2002)* examined convergent validity of the MET-HV by comparison with tests of IQ and cognitive functioning, traditional frontal lobe tests and ecologically sensitive executive function tests, in a sample of 20 patients with chronic acquired brain injury (traumatic brain injury, n=12; stroke, n=5; both TBI and stroke, n=3). Tests of IQ and cognitive functioning included the National Adult Reading Test – Revised Full Scale Intelligence Quotient (NART-R FSIQ), Wechsler Adult Intelligence Scale – Revised Full Scale Intelligence Quotient (WAIS-R FSIQ), Adult Memory and Information Processing Battery (AMIPB), Rivermead Behavioural Memory Test (RBMT) and Visual Objects and Space Perception battery (VOSP). Frontal lobe tests included verbal fluency, the Cognitive Estimates Test (CET), Modified Card Sorting Test (MCST), Tower of London Test (TOLT) and versions of the hand manipulation and hand alternation tests. Ecologically sensitive executive function tests included the Behavioural Assessment of the Dysexecutive Syndrome battery (BADS) and the Test of Everyday Attention (TEA) Map Search and Visual Elevator tasks. The Dysexecutive (DEX) questionnaire was also used, although proxy reports were used rather than self-reports due to the identified lack of insight of individuals with brain injury. Following Bonferroni adjustment, there were excellent correlations between the MCST percentage of perseverative errors and MET-HV rule breaks (r=0.66) and total errors (r=0.67). There were excellent correlations between the BADS Profile score and MET-HV task failures (r = -0.58), interpretation failures (r = 0.64) and total errors (r = -0.57). There was an excellent correlation between the DEX intentionality factor and MET-HV task failures (r = 0.70).
In addition, the relationship between the MET-HV and DEX was re-evaluated to control for possible confounding effects; controlling for age, familiarity and memory function with respect to MET-HV task failures resulted in excellent correlations with the DEX total score (r = 0.79) and the DEX inhibition (r = 0.69), intentionality (r = 0.76) and executive memory (r = 0.67) factors. There was an adequate correlation between the RBMT Profile Score and MET-HV task failures (r=-0.57). There were no significant correlations between the MET-HV and the other tests of IQ and cognitive functioning (NART-R FSIQ, WAIS-R FSIQ, AMIPB, VOSP), the other frontal lobe tests (verbal fluency, CET, TOLT, hand manipulation and hand alternation tests), the other ecologically sensitive executive function tests (TEA Map Search and Visual Elevator tasks) or the other DEX factors (positive affect, negative affect).
Note: Initial correlations were measured using Pearson correlation coefficient and significance levels were subsequently adjusted by Bonferroni adjustment to account for multiple comparisons; results reported indicate significant correlations following Bonferroni adjustment.
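A Bonferroni adjustment of the kind noted above simply divides the significance threshold by the number of comparisons (equivalently, multiplies each p-value by that number, capped at 1). A minimal illustration:

```python
def bonferroni_adjust(pvalues, alpha=0.05):
    """Return (Bonferroni-adjusted p-values, per-test significance threshold)."""
    m = len(pvalues)
    adjusted = [min(p * m, 1.0) for p in pvalues]  # cap adjusted p at 1
    return adjusted, alpha / m
```

With many comparisons, as in Knight et al.’s correlation matrix, the per-test threshold becomes strict, which is why only a subset of the raw correlations survive adjustment.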

Rand, Rukan, Weiss & Katz (2009a)* examined convergent validity of the MET-HV by comparison with measures of executive function and IADLs with a sample of 9 patients with subacute or chronic stroke, using Spearman correlation coefficients. Executive function was measured using the BADS Zoo Map test and IADLs were measured using the IADL questionnaire. There were excellent negative correlations between the BADS Zoo Map and MET-HV outcome measures of total number of mistakes (r = -0.93), partial mistakes in completing tasks (r = -0.80), non-efficiency mistakes (r = -0.86) and time to complete the MET (r = -0.79). There were excellent correlations between the IADL questionnaire and MET-HV rule break mistakes (r = 0.80) and total number of mistakes (r = -0.76).

Maier, Krauss & Katz (2011)* examined convergent validity of the MET-HV by comparison with the FIM Cognitive score with a sample of 30 patients with acquired brain injury including stroke (n=19), using Pearson correlation analysis. There was an excellent negative correlation between MET-HV total errors score and FIM Cognitive score (r = -0.67).

Alderman, Burgess, Knight and Henman (2003)* examined convergent validity of the MET-SV by comparison with tests of IQ, executive function and everyday executive abilities with 50 clients with brain injury including stroke (n=9). Neuropsychological tests included the WAIS-R FSIQ, BADS, Cognitive Estimates Test, FAS verbal fluency test, a modified version of the Wisconsin Card Sorting Test (MWCST) and the DEX. There were adequate correlations between MET-SV task failure errors and WAIS-R FSIQ (r = -0.32), MWCST perseverative errors (r = 0.39), BADS profile score (r = -0.46) and Zoo Map (r = -0.46) and Six Element Test (r = -0.41) subtests. There were adequate negative correlations between MET-SV social rule breaks and the Cognitive Estimates Test (r = -0.33), and between MET-SV task rule breaks, social rule breaks and total rule breaks and the BADS Action Program subtest (r = -0.42, -0.40, -0.43 respectively). There were poor to adequate negative correlations between the DEX and MET-SV rule breaks (r = -0.30), task failures (r = -0.25) and total errors (r = -0.37).

In a subgroup analysis of individuals with brain injury who passed traditional executive function tests but failed the MET-SV (n=17), there were adequate to excellent correlations between MET-SV inefficiencies and DEX factors of intentionality and negative affect (r = 0.59, -0.76); MET-SV interpretation failures and DEX inhibition and total (r = -0.67, -0.57); MET-SV total and actual rule breaks and DEX inhibition (r = -0.70, 0.66), intentionality (r = 0.60, 0.64) and total (r = -0.57, 0.59); MET-SV social rule breaks and DEX positive and negative affect (r = 0.79, -0.59); MET-SV task failures and DEX inhibition and positive affect (r = -0.58, -0.52), and MET-SV total errors and DEX intentionality (r = 0.67).

Dawson et al. (2009)* examined convergent validity of the BMET by comparison with other measures of IADL and everyday function with 14 clients with stroke, using Pearson correlation. Other measures included the DEX (significant other report), Stroke Impact Profile (SIP), Assessment of Motor and Process Skills (AMPS) and Mayo Portland Adaptability Inventory (MPAI) (significant other report). There were excellent correlations between the BMET number of rules broken and the SIP – Physical (r = 0.78) and Affective behavior (r = 0.64) scores and the AMPS motor score (r = -0.75). There was an adequate correlation between the BMET time to completion and SIP physical score (r = 0.54).

Rand et al. (2009a)* examined convergent validity of the VMET by comparison with the BADS Zoo Map test and IADL questionnaire with the same sample of 9 patients with subacute or chronic stroke, using Spearman correlation coefficients. There was an excellent negative correlation between the BADS Zoo Map and VMET outcome measure of non-efficiency mistakes (r = -0.87), and between the IADL and VMET total number of mistakes (r = -0.82).

Rand et al. (2009a) also examined the relationships between the scores of the VMET and those of the MET-HV using Spearman and Pearson correlation coefficients. Among patients with stroke, there were excellent correlations between MET-HV and VMET outcomes for the total number of mistakes (r = 0.70), partial mistakes in completing tasks (r = 0.88) and non-efficiency mistakes (r = 0.73). Analysis of the whole population indicated adequate to excellent correlations between MET-HV and VMET outcomes for the total number of mistakes (r = 0.77), complete mistakes in completing tasks (r = 0.63), partial mistakes in completing tasks (r = 0.80), non-efficiency mistakes (r = 0.72) and use of strategies (r = 0.44), but not for rule break mistakes.

Raspelli et al. (2010) examined convergent validity of the VMET by comparison with neuropsychological tests, with 6 clients with stroke and 14 healthy subjects. VMET outcome measures included time, searched item in the correct area, sustained attention, maintained sequence and no perseveration. Neuropsychological tests included the Trail Making Test, Corsi spatial memory supra-span test, Street’s Completion Test, Semantic Fluencies and Tower of London test. There were excellent correlations between the VMET variable ‘time’ and the Semantic Fluencies test (r = -0.87) and the Tower of London test (r = -0.82); between the VMET variable ‘searched item in the correct area’ and the Trail Making Test (r = 0.96); and between the VMET variables ‘sustained attention’, ‘maintained sequence’ and ‘no perseveration’ and Corsi’s supra-span test (r = 0.84) and Street’s Completion Test (r = -0.86).

Raspelli et al. (2012) examined convergent validity of the VMET by comparison with the Test of Attentional Performance (TEA) with 9 clients with stroke. VMET outcome measures included time, errors, inefficiencies, rule breaks, strategies, interpretation failures and partial-task failures. Authors reported excellent correlations between the VMET outcomes time, inefficiencies and total errors and TEA tests (range r = -0.67 to 0.81).
Note: Other neuropsychological tests were administered but correlations are not reported (Mini Mental Status Examination (MMSE), Beck Depression Inventory (BDI), State and Trait Anxiety Index (STAI), Behavioural Inattention Test (BIT) – Star Cancellation Test, Brief Neuropsychological Examination (ENB) – Token Test, Street’s Completion Test, Stroop Colour-Word Test, Iowa Gambling Task, DEX and ADL/IADL Tests).

*Note: The correlations between the MET and other measures of everyday executive functioning and IADLs also provide support for the ecological validity of the MET (as reported by the authors of these articles).

Known Group:
Knight, Alderman & Burgess (2002) examined known-group validity of the MET-HV in a sample of 20 patients with chronic acquired brain injury (traumatic brain injury, n=12; stroke, n=5; both TBI and stroke, n=3) and 20 healthy control subjects (hospital staff members) matched for gender, age and IQ*. Clients with brain injury made significantly more rule breaks (p=0.002) and total errors (p<0.001), and achieved significantly fewer tasks (p<0.001) than control subjects. Clients with brain injury used significantly more strategies, such as looking at a map (p=0.008) and reading signs (p=0.006), although use of strategies had little effect on test performance. The test was able to discriminate between individuals with acquired brain injury and healthy controls.
*Note: IQ was measured using the National Adult Reading Test – Revised Full Scale Intelligence Quotient (NART-R FSIQ).

Rand et al. (2009a) examined known group validity of the MET-HV with 9 patients with subacute or chronic stroke, 20 healthy young adults and 20 healthy older adults, using the Kruskal-Wallis H test. Patients with stroke made more mistakes than older adults on MET-HV outcomes of total mistakes, mistakes in completing tasks, partial mistakes in completing tasks and non-efficiency mistakes, but not rule break mistakes or use of strategies mistakes. Older adults made more mistakes than younger adults on MET-HV outcomes of total mistakes, partial mistakes in completing tasks and non-efficiency mistakes, but not mistakes in completing tasks, rule break mistakes or use of strategies mistakes.

Alderman et al. (2003) examined known group validity of the MET-SV with 46 individuals with no history of neurological disease (hospital staff members) and 50 clients with brain injury including stroke (n=9), using a series of t-tests. Clients with brain injury made significantly more rule breaks (t = 4.03), task failures (t = 10.10), total errors (t = 7.18), and social rule breaks (χ² = 4.3) than individuals with no history of neurological disease. Results regarding errors were preserved when group comparisons were repeated using age, familiarity and cognitive ability (measured by the NART-R FSIQ) as covariates (F = 11.79, 40.82, 27.92 respectively). There was a significant difference in task failures between groups after covarying for age, IQ (measured by the WAIS-R FSIQ) and familiarity with the shopping centre (F = 11.57). Clients with brain injury made approximately three times as many errors as healthy individuals. For both groups, rule breaks and task failures were the most common errors.

Dawson et al. (2009) examined known group validity of the BMET with 14 clients with stroke and 13 healthy matched controls, using a series of t-tests. Clients with stroke performed significantly worse on number of tasks completed accurately (d = 0.84, p<0.05), rule breaks (d = 0.92, p<0.05) and total failures (d = 1.05, p<0.01). The proportion of group members who completed fewer than 40% of tasks (< 5 tasks) satisfactorily was also significantly different between the two groups (28% of clients with stroke vs. 0% of healthy matched controls, p<0.05).
Note: d is the effect size; effect sizes ≥ 0.7 are considered large.
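Cohen’s d expresses the difference between group means in standard-deviation units. The sketch below assumes the common pooled-SD formulation, which the study does not spell out, so treat it as a generic illustration:

```python
import numpy as np

def cohens_d(group1, group2):
    """Cohen's d: standardized mean difference using the pooled SD."""
    g1, g2 = np.asarray(group1, float), np.asarray(group2, float)
    n1, n2 = len(g1), len(g2)
    # Pooled variance weights each group's variance by its degrees of freedom
    pooled_var = ((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2)
    return (g1.mean() - g2.mean()) / np.sqrt(pooled_var)
```

Under the convention in the note above, the reported values (d = 0.84 to 1.05) all represent large between-group effects.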

Rand et al. (2009a) examined known group validity of the VMET with a sample of 9 patients with subacute or chronic stroke, 20 healthy young adults and 20 healthy older adults, using the Kruskal-Wallis H test. Patients with stroke made more mistakes than older adults on all VMET outcomes except for rule break mistakes. Older adults made more mistakes than young adults on all VMET outcomes except for the use of strategies mistakes.

Raspelli et al. (2010) examined known group validity of the VMET with 6 clients with stroke and 14 healthy subjects. There were significant differences between groups in time taken to execute the task (higher for healthy subjects) and in the partial error ‘Maintained task objective to completion’.

Raspelli et al. (2012) examined known group validity of the VMET with 9 clients with stroke, 10 healthy young adults and 10 healthy older adults, using Kruskal-Wallis procedures. Results showed that clients with stroke scored lower in VMET time and errors than older adults, and that older adults scored lower in VMET time and errors than young adults.

Responsiveness

Two studies have used versions of the MET (the MET-HV and VMET, and a modified version of the MET-HV and MET-SV) to measure change following intervention.

Novakovic-Agopian et al. (2011) developed a modified version of the MET-HV and MET-SV to be used in local hospital settings. They developed 3 alternate forms that were used in a pilot study examining the effect of goal-oriented attentional self-regulation training with a sample of 16 patients with chronic brain injury including stroke or cerebral hemorrhage (n=3). A pseudo-random crossover design was used. During the first 5 weeks, one group (Group A) completed goal-oriented attentional self-regulation training while the other group (Group B) only received a 2-hour educational instructional session. In the subsequent phase, conditions were switched such that participants in Group B received goals training for 5 weeks while those in Group A received educational instruction. At week 5 the group that received goal training first demonstrated a significant reduction in task failures (p<0.01), whereas the group that received the educational session demonstrated no significant improvement in MET scores. From week 5 to week 10 there were no significant changes in MET scores in either group.

Rand, Weiss and Katz (2009b) used the MET-HV and VMET to detect change in multi-tasking skills of 4 clients with subacute stroke following virtual reality intervention using the VMall virtual supermarket. Clients demonstrated improved performance on both measures following 3 weeks of multi-tasking training using the VMall virtual supermarket.

References
  • Alderman, N., Burgess, P.W., Knight, C., & Henman, C. (2003). Ecological validity of a simplified version of the multiple errands shopping test. Journal of the International Neuropsychological Society, 9, 31-44.
  • Dawson, D.R., Anderson, N.D., Burgess, P., Cooper, E., Krpan, K.M., & Stuss, D.T. (2009). Further development of the Multiple Errands Test: Standardized scoring, reliability, and ecological validity for the Baycrest version. Archives of Physical Medicine and Rehabilitation, 90, S41-51.
  • Knight, C., Alderman, N., & Burgess, P.W. (2002). Development of a simplified version of the Multiple Errands Test for use in hospital settings. Neuropsychological Rehabilitation, 12(3), 231-255.
  • Maier, A., Krauss, S., & Katz, N. (2011). Ecological validity of the Multiple Errands Test (MET) on discharge from neurorehabilitation hospital. Occupational Therapy Journal of Research: Occupation, Participation and Health, 31(1), S38-S46.
  • Novakovic-Agopian, T., Chen, A.J.W., Rome, S., Abrams, G., Castelli, H., Rossi, A., McKim, R., Hills, N., & D’Esposito, M. (2011). Rehabilitation of executive functioning with training in attention regulation applied to individually defined goals: A pilot study bridging theory, assessment, and treatment. Journal of Head Trauma Rehabilitation, 26(5), 325-338.
  • Novakovic-Agopian, T., Chen, A.J., Rome, S., Rossi, A., Abrams, G., D’Esposito, M., Turner, G., McKim, R., Muir, J., Hills, N., Kennedy, C., Garfinkle, J., Murphy, M., Binder, D., & Castelli, H. (2012). Assessment of subcomponents of executive functioning in ecologically valid settings: The Goal Processing Scale. Journal of Head Trauma Rehabilitation, 2012 Oct 16. [Epub ahead of print]
  • Rand, D., Rukan, S., Weiss, P.L., & Katz, N. (2009a). Validation of the Virtual MET as an assessment tool for executive functions. Neuropsychological Rehabilitation, 19(4), 583-602.
  • Rand, D., Weiss, P., & Katz, N. (2009b). Training multitasking in a virtual supermarket: A novel intervention after stroke. American Journal of Occupational Therapy, 63, 535-542.
  • Raspelli, S., Carelli, L., Morganti, F., Poletti, B., Corra, B., Silani, V., & Riva, G. (2010). Implementation of the Multiple Errands Test in a NeuroVR-supermarket: A possible approach. Studies in Health Technology and Informatics, 154, 115-119.
  • Raspelli, S., Pallavicini, F., Carelli, L., Morganti, F., Pedroli, E., Cipresso, P., Poletti, B., Corra, B., Sangalli, D., Silani, V., & Riva, G. (2012). Validating the Neuro VR-based virtual version of the Multiple Errands Test: Preliminary results. Presence, 21(1), 31-42.
  • Shallice, T. & Burgess, P.W. (1991). Deficits in strategy application following frontal lobe damage in man. Brain, 114, 727-741.