Multiple Errands Test (MET)

Overview

A literature search was conducted to identify publications on the psychometric properties of the Multiple Errands Test (MET) relevant to a population of patients with stroke. Of the 10 studies reviewed, 8 included a mixed population of patients with acquired brain injury including stroke. Studies have reviewed psychometric properties of the original MET, Hospital Version (MET-HV), Simplified Version (MET-SV), Baycrest MET (BMET) and Virtual MET (VMET), as indicated in the summaries below. While research indicates that the MET demonstrates adequate validity and reliability in populations with acquired brain injury including stroke, further research regarding responsiveness of the measure is warranted.

Floor/Ceiling Effects

No studies have reported on floor/ceiling effects of the MET with a stroke population.

Reliability

Internal consistency
Knight, Alderman & Burgess (2002) calculated internal consistency of the MET-HV in a sample of 20 patients with chronic acquired brain injury (traumatic brain injury, n=12; stroke, n=5, both TBI and stroke, n=3) and 20 healthy control subjects matched for gender, age and IQ, using Cronbach’s alpha. Internal consistency was adequate (α=0.77).
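
For readers unfamiliar with the statistic, the following is a minimal sketch of how Cronbach’s alpha can be computed from item-level scores. The function name and the data layout (one row per participant, one column per item) are illustrative assumptions; the source does not describe how the authors carried out the calculation.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of participant rows (one score per item).

    Hypothetical helper for illustration only; not the authors' procedure.
    """
    k = len(item_scores[0])                      # number of items
    columns = list(zip(*item_scores))            # scores regrouped by item
    sum_item_var = sum(variance(col) for col in columns)
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum_item_var / total_var)

# Made-up scores for three participants on two items (not study data).
example = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(example), 3))
```

Alpha rises toward 1 as the items vary together and falls as they behave independently of one another.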

Inter-rater reliability
Knight, Alderman & Burgess (2002) calculated inter-rater reliability of the MET-HV error categories in a sample of 20 patients with chronic acquired brain injury (traumatic brain injury, n=12; stroke, n=5, both TBI and stroke, n=3) and 20 healthy control subjects matched for gender, age and IQ, using intraclass correlation coefficients. Participants were scored by 2 assessors. Inter-rater reliability was excellent (ICC ranging from 0.81-1.00). The ‘rule breaks’ error category demonstrated the strongest inter-rater reliability (ICC=1.00).

Dawson, Anderson, Burgess, Cooper, Krpan and Stuss (2009) examined inter-rater reliability of the BMET with clients with stroke (n=14) or traumatic brain injury (n=13) and healthy matched controls (n=25), using Intraclass Correlation Coefficients and 2-way random effects models. Participants were scored by 2 assessors. Inter-rater reliability was adequate to excellent for the 5 summary measures used: mean number of tasks completed accurately (ICC = 0.80), mean number of rules adhered to (ICC = 0.71), mean number of total errors (ICC = 0.82), mean number of total rules broken (ICC = 0.88) and mean number of requests for help (ICC = 0.71).
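
As an illustration of a two-way random effects intraclass correlation, the sketch below computes the single-rater, absolute-agreement form, often written ICC(2,1). The source states only that 2-way random effects models were used, so the choice of this particular form, the function name and the example ratings are all assumptions.

```python
def icc_2_1(ratings):
    """ICC(2,1) for rows of participants, each with one score per rater.

    Assumed form: two-way random effects, absolute agreement, single rater.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between participants
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Made-up ratings: the second rater always scores one point higher.
print(round(icc_2_1([[1, 2], [2, 3], [3, 4]]), 3))
```

Because this form measures absolute agreement, a constant offset between raters lowers the coefficient even when their rankings of participants are identical.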

Intra-rater reliability
No studies have reported on the intra-rater reliability of the MET.

Test-retest reliability
No studies have reported on the test-retest reliability of the MET.

Validity

Content

Shallice & Burgess (1991) evaluated the MET in a sample of 3 patients with traumatic brain injury (TBI) who demonstrated above-average performance on measures of general ability and normal or near-normal performance on frontal lobe tests, and 9 age- and IQ-matched controls. Participants were monitored by 2 observers and were scored according to number of errors (inefficiencies, rule breaks, interpretation failures, task failures and total score) and qualitative observation. The patients demonstrated qualitatively and quantitatively impaired performance, particularly relating to rule breaks and inefficiencies. The most difficult subtest was the least sensitive part of the procedure and presented difficulties for both patients and control subjects.

Criterion

Predictive Validity
Maier, Krauss & Katz (2011) examined the relationship between the MET and the Mayo-Portland Adaptability Inventory. Following partial correlation controlling for cognitive status using FIM Cognitive scores, the correlation was r = 0.212.

Sensitivity/Specificity
Knight, Alderman & Burgess (2002) investigated sensitivity and specificity of the MET-HV in a sample of 20 patients with chronic acquired brain injury (traumatic brain injury, n=12; stroke, n=5, both TBI and stroke, n=3) and 20 healthy control subjects matched for gender, age and IQ*. A cut-off score ≥ 7 errors (i.e. 5th percentile of total errors of control subjects) resulted in correct identification of 85% of participants with acquired brain injury (85% sensitivity, 95% specificity).

* IQ was measured using the National Adult Reading Test – Revised Full Scale Intelligence Quotient (NART-R FSIQ).
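
The sketch below shows how sensitivity and specificity follow from an error cut-off, assuming the cut-off classifies anyone with 7 or more total errors as impaired. With 20 participants per group, the reported 85% and 95% correspond to 17 of 20 patients and 19 of 20 controls. The individual error scores used here are invented placeholders chosen only to reproduce those counts; they are not study data, and the function name is hypothetical.

```python
def sensitivity_specificity(patient_errors, control_errors, cutoff):
    """Return (sensitivity, specificity) for an 'errors >= cutoff' rule."""
    true_positives = sum(1 for e in patient_errors if e >= cutoff)
    true_negatives = sum(1 for e in control_errors if e < cutoff)
    return (true_positives / len(patient_errors),
            true_negatives / len(control_errors))

# Placeholder scores: 17 of 20 patients at or above the cut-off,
# 19 of 20 controls below it (illustrative values, not study data).
patients = [9] * 17 + [4] * 3
controls = [3] * 19 + [8]

sens, spec = sensitivity_specificity(patients, controls, cutoff=7)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Setting the cut-off at the 5th percentile of control scores fixes specificity near 95% by construction, so sensitivity is the figure that distinguishes one version of the test from another.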

Alderman et al. (2003) reported on sensitivity and specificity of the MET-SV with 46 individuals with no history of neurological disease and 50 clients with brain injury including stroke (n=9). A cutoff score ≥ 12 errors (i.e. 5th percentile of controls) resulted in 44% sensitivity (i.e. correct classification of clients with brain injury) and 95.3% specificity (i.e. correct classification of healthy individuals). The authors caution that deriving a single measure based only on number of errors fails to consider between-group qualitative differences in performance. Accordingly, error scores were recalculated to reflect “normality” of the error type, with weighting of errors according to prevalence in the healthy control group (acceptable errors seen in up to 95% of healthy controls = 1; errors demonstrated by ≤ 5% of healthy controls = 2; errors unique to the patient group = 3). With weighted scores, a cutoff score ≥ 12 errors (5th percentile of controls) resulted in 82% sensitivity and 95.3% specificity. The MET-SV was more sensitive than traditional tests of executive function (Cognitive Estimates, FAS Verbal Fluency, MWCST), and MET-SV error category scores were highly predictive of ratings of executive symptoms of patients who passed traditional executive function tests but failed the MET-SV shopping task.
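
The weighting scheme described above can be sketched as follows. The tier boundaries are an interpretation: errors never seen in controls score 3, errors seen in 5% or fewer of controls score 2, and all more common errors score 1. The error labels, prevalence values and function names are hypothetical and are not taken from Alderman et al. (2003).

```python
def error_weight(control_prevalence):
    """Weight one error type by how common it was among healthy controls."""
    if control_prevalence == 0:
        return 3          # unique to the patient group
    if control_prevalence <= 0.05:
        return 2          # rare among controls
    return 1              # acceptable, commonly seen error

def weighted_error_score(observed_errors, prevalence_by_error):
    """Sum the weights of every error a participant made.

    Error types absent from the prevalence table are treated as never
    having been seen in controls (an assumption of this sketch).
    """
    return sum(error_weight(prevalence_by_error.get(err, 0.0))
               for err in observed_errors)

# Hypothetical control prevalences for illustration only.
prevalence = {"left_by_wrong_exit": 0.40, "spoke_to_examiner": 0.04}
errors = ["left_by_wrong_exit", "spoke_to_examiner", "entered_staff_area"]
print(weighted_error_score(errors, prevalence))  # 1 + 2 + 3 = 6
```

Weighting in this way lets two participants with the same raw error count receive different scores when one of them makes errors that healthy individuals rarely or never make.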

Concurrent Validity
No studies have reported on the concurrent validity of the MET in a stroke population.

Construct

Known Group Validity
Knight, Alderman & Burgess (2002) examined known-group validity of the MET-HV in a sample of 20 patients with chronic acquired brain injury (traumatic brain injury, n=12; stroke, n=5, both TBI and stroke, n=3) and 20 healthy control subjects (hospital staff members) matched for gender, age and IQ*. Clients with brain injury made significantly more rule breaks (p=0.002) and total errors (p<0.001), and achieved significantly fewer tasks (p<0.001) than control subjects. Clients with brain injury used significantly more strategies such as looking at a map (p=0.008), reading signs (p=0.006), although use of strategies had little effect on test performance. The test was able to discriminate between individuals with acquired brain injury and healthy controls.

* IQ was measured using the National Adult Reading Test – Revised Full Scale Intelligence Quotient (NART-R FSIQ).

Rand et al. (2009a) examined known group validity of the MET-HV with 9 patients with subacute or chronic stroke, 20 healthy young adults and 20 healthy older adults, using Kruskal Wallis H. Patients with stroke made more mistakes than older adults on MET-HV outcomes of total mistakes, mistakes in completing tasks, partial mistakes in completing tasks and non-efficiency mistakes, but not rule break mistakes or use of strategies mistakes. Older adults made more mistakes than younger adults on MET-HV outcomes of total mistakes, partial mistakes in completing tasks and non-efficiency mistakes, but not mistakes in completing tasks, rule break mistakes or use of strategies mistakes.

Alderman et al. (2003) examined known group validity of the MET-SV with 46 individuals with no history of neurological disease (hospital staff members) and 50 clients with brain injury including stroke (n=9), using a series of t-tests. Clients with brain injury made significantly more rule breaks (t = 4.03), task failures (t = 10.10), total errors (t = 7.18), and social rule breaks (chi square = 4.3) than individuals with no history of neurological disease. Results regarding errors were preserved when group comparisons were repeated using age, familiarity and cognitive ability (measured by the National Adult Reading Test – Revised) as covariates (F = 11.79, 40.82, 27.92 respectively). There was a significant difference in task failures between groups after covarying for age, IQ (measured by the WAIS-R FSIQ) and familiarity with the shopping centre (F = 11.57). Clients with brain injury made approximately three times as many errors as healthy individuals. For both groups, rule breaks and task failures were the most common errors.

Dawson et al. (2009) examined known group validity of the BMET with 14 clients with stroke and 13 healthy matched controls, using a series of t-tests. Clients with stroke performed significantly worse on number of tasks completed accurately (d = 0.84, p<0.05), rule breaks (d = 0.92, p<0.05) and total failures (d = 1.05, p<0.01). The proportion of group members who completed fewer than 40% (< 5) tasks satisfactorily was also significantly different between the two groups (28% of clients with stroke vs. 0% of healthy matched controls, p<0.05).

Note: d is the effect size; effect sizes ≥ 0.7 are considered large.
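
As a brief illustration of the effect size d, the sketch below computes Cohen’s d using a pooled standard deviation. The source does not state which formula Dawson et al. (2009) applied, so the pooled-variance form is an assumption, and the two score lists are made-up values rather than study data.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardised mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) +
                  (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Made-up error counts for two small groups (illustration only).
print(cohens_d([2, 4, 6], [1, 3, 5]))
```

Because d expresses the group difference in standard deviation units, it can be compared across outcome measures that are scored on different scales.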

Rand et al. (2009a) examined known group validity of the VMET with a sample of 9 patients with subacute or chronic stroke, 20 healthy young adults and 20 healthy older adults, using Kruskal Wallis H. Patients with stroke made more mistakes than older adults on all VMET outcomes except for rule break mistakes. Older adults made more mistakes than young adults on all VMET outcomes except for the use of strategies mistakes.

Raspelli et al. (2010) examined known group validity of the VMET with 6 clients with stroke and 14 healthy subjects. There were significant differences between groups in time taken to execute the task (higher for healthy subjects) and in the partial error ‘Maintained task objective to completion’.

Raspelli et al. (2012) examined known group validity of the VMET with 9 clients with stroke, 10 healthy young adults and 10 healthy older adults, using Kruskal-Wallis procedures. Results showed that clients with stroke scored lower in VMET time and errors than older adults, and that older adults scored lower in VMET time and errors than young adults.

Convergent/Discriminant Validity

Knight, Alderman & Burgess (2002)* examined the relationship between the MET-HV and other measures, including the Dysexecutive Questionnaire (DEX), and reported a correlation between the RBMT Profile Score and the MET-HV number of task failures (r=-0.57). There were no significant correlations between the MET and other tests of IQ and cognitive functioning (MET-HV, NART-R FSIQ, WAIS-R FSIQ, AMIPB, VOSP), other frontal lobe tests (verbal fluency, CET, TOLT, hand manipulation and hand alternation tests), other ecologically sensitive executive function tests (TEA Map Search and Visual Elevator tasks) or other DEX factors (positive affect, negative affect).

Note: Initial correlations were measured using Pearson correlation coefficient and significance levels were subsequently adjusted by Bonferroni adjustment to account for multiple comparisons; results reported indicate significant correlations following Bonferroni adjustment.
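
The Bonferroni adjustment mentioned in the note can be sketched as follows: the overall significance level is divided by the number of comparisons, and each p-value is tested against that stricter threshold. The overall alpha of 0.05 and the p-values shown are illustrative assumptions, not values from Knight, Alderman & Burgess (2002).

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag which p-values remain significant after Bonferroni adjustment."""
    threshold = alpha / len(p_values)   # stricter per-comparison level
    return [p <= threshold for p in p_values]

# Three hypothetical comparisons: the threshold becomes 0.05 / 3.
print(bonferroni_significant([0.001, 0.02, 0.04]))  # [True, False, False]
```

The adjustment becomes more conservative as the number of comparisons grows, which is why a correlation that is significant when tested alone may no longer be reported as significant.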

Rand, Rukan, Weiss & Katz (2009a)* examined convergent validity of the MET-HV by comparison with measures of executive function and IADLs with a sample of 9 patients with subacute or chronic stroke, using Spearman correlation coefficients. Executive function was measured using the BADS Zoo Map test and IADLs were measured using the IADL questionnaire. There were excellent negative correlations between the BADS Zoo Map and MET-HV outcome measures of total number of mistakes (r = -0.93), partial mistakes in completing tasks (r = -0.80), non-efficiency mistakes (r = -0.86) and time to complete the MET (r = -0.79). There were excellent correlations between the IADL questionnaire and the MET-HV number of mistakes of rule breaks (r = 0.80) and total number of mistakes (r = -0.76).

Maier, Krauss & Katz (2011)* examined convergent validity of the MET-HV by comparison with the FIM Cognitive score with a sample of 30 patients with acquired brain injury including stroke (n=19), using Pearson correlation analysis. There was an excellent negative correlation between MET-HV total errors score and FIM Cognitive score (r = -0.67).

Alderman, Burgess, Knight and Henman (2003)* examined the relationship between the MET-SV and the DEX, and reported negative correlations between the DEX and MET-SV rule breaks (r = -0.30), task failures (r = -0.25) and total errors (r = -0.37).

In a subgroup analysis of individuals with brain injury who passed traditional executive function tests but failed the MET-SV (n=17), there were adequate to strong correlations between MET-SV inefficiencies and DEX factors of intentionality and negative affect (r = 0.59, -0.76); MET-SV interpretation failures and DEX inhibition and total (r = -0.67, -0.57); MET-SV total and actual rule breaks and DEX inhibition (r = -0.70, 0.66), intentionality (r = 0.60, 0.64) and total (r = -0.57, 0.59); MET-SV social rule breaks and DEX positive and negative affect (r = 0.79, -0.59); MET-SV task failures and DEX inhibition and positive affect (r = -0.58, -0.52), and MET-SV total errors and DEX intentionality (r = 0.67).

Dawson et al. (2009)* examined the relationship between the BMET and other measures, including the Mayo-Portland Adaptability Inventory, and reported a correlation between the BMET time to completion and the SIP physical score (r = 0.54).

Rand et al. (2009a)* reported negative correlations between the BADS Zoo Map and the VMET outcome measure of non-efficiency mistakes (r = -0.87), and between the IADL questionnaire and the VMET total number of mistakes (r = -0.82).

Rand et al. (2009a) also examined the relationships between the scores of the VMET and those of the MET-HV using Spearman and Pearson correlation coefficients. Among patients with stroke, there were excellent correlations between MET-HV and VMET outcomes for the total number of mistakes (r = 0.70), partial mistakes in completing tasks (r = 0.88) and non-efficiency mistakes (r = 0.73). Analysis of the whole population indicated adequate to excellent correlations between MET-HV and VMET outcomes for the total number of mistakes (r = 0.77), complete mistakes of completing a task (r = 0.63), partial mistakes in completing tasks (r = 0.80), non-efficiency mistakes (r = 0.72) and use of strategies (r = 0.44), but not for rule break mistakes.

Raspelli et al. (2010) reported correlations between the VMET variable ‘time’ and the Semantic Fluencies test (r = -0.87) and the Tower of London test (r = -0.82); between the VMET variable ‘searched item in the correct area’ and the Trail Making Test (r = 0.96); and between the VMET variables ‘sustained attention’, ‘maintained sequence’ and ‘no perseveration’ and Corsi’s supra-span test (r = 0.84) and Street’s Completion Test (r = -0.86).

Raspelli et al. (2012) reported correlations between the VMET outcomes of time, inefficiencies and total errors and tests from the Test of Attentional Performance (TEA) (range r = -0.67 to 0.81).

Note: Other neuropsychological tests were administered but correlations are not reported (Mini Mental Status Examination (MMSE), Beck Depression Inventory (BDI), State and Trait Anxiety Index (STAI), Behavioural Inattention Test (BIT) Star Cancellation Test, Brief Neuropsychological Examination (ENB) Token Test, Street’s Completion Test, Stroop Colour-Word Test, Iowa Gambling Task, DEX and ADL/IADL Tests).

*NOTE: The correlations between the MET and other measures of everyday executive functioning and IADLs also provide support for the ecological validity of the MET (as reported by the authors of these articles).

Responsiveness

Two studies used the MET (MET-HV, VMET and modified version of the MET-HV & MET-SV) to measure change following intervention.

Novakovic-Agopian et al. (2011) developed a modified version of the MET-HV and MET-SV to be used in local hospital settings. They developed 3 alternate forms that were used in a pilot study examining the effect of goal-oriented attentional self-regulation training with a sample of 16 patients with chronic brain injury including stroke or cerebral hemorrhage (n=3). A pseudo-random crossover design was used. During the first 5 weeks, one group (Group A) completed goal-oriented attentional self-regulation training while the other group (Group B) only received a 2-hour educational instructional session. In the subsequent phase, conditions were switched such that participants in Group B received goals training for 5 weeks while those in Group A received educational instruction. At week 5 the group that received goal training first demonstrated a significant reduction in task failures (p<0.01), whereas the group that received the educational session demonstrated no significant improvement in MET scores. From week 5 to week 10 there were no significant changes in MET scores in either group.

Rand, Weiss and Katz (2009b) used the MET-HV and VMET to detect change in multi-tasking skills of 4 clients with subacute stroke following virtual reality intervention using the VMall virtual supermarket. Clients demonstrated improved performance on both measures following 3 weeks of multi-tasking training using the VMall virtual supermarket.

References
  • Alderman, N., Burgess, P.W., Knight, C., & Henman, C. (2003). Ecological validity of a simplified version of the multiple errands shopping test. Journal of the International Neuropsychological Society, 9, 31-44.
  • Dawson, D.R., Anderson, N.D., Burgess, P., Cooper, E., Krpan, K.M., & Stuss, D.T. (2009). Further development of the Multiple Errands Test: Standardized scoring, reliability, and ecological validity for the Baycrest version. Archives of Physical Medicine and Rehabilitation, 90, S41-51.
  • Knight, C., Alderman, N., & Burgess, P.W. (2002). Development of a simplified version of the Multiple Errands Test for use in hospital settings. Neuropsychological Rehabilitation, 12(3), 231-255.
  • Maier, A., Krauss, S., & Katz, N. (2011). Ecological validity of the Multiple Errands Test (MET) on discharge from neurorehabilitation hospital. Occupational Therapy Journal of Research: Occupation, Participation and Health, 31(1), S38-46.
  • Novakovic-Agopian, T., Chen, A.J.W., Rome, S., Abrams, G., Castelli, H., Rossi, A., McKim, R., Hills, N., & D’Esposito, M. (2011). Rehabilitation of executive functioning with training in attention regulation applied to individually defined goals: A pilot study bridging theory, assessment, and treatment. The Journal of Head Trauma Rehabilitation, 26(5), 325-338.
  • Novakovic-Agopian, T., Chen, A.J., Rome, S., Rossi, A., Abrams, G., D’Esposito, M., Turner, G., McKim, R., Muir, J., Hills, N., Kennedy, C., Garfinkle, J., Murphy, M., Binder, D., & Castelli, H. (2012). Assessment of subcomponents of executive functioning in ecologically valid settings: The Goal Processing Scale. The Journal of Head Trauma Rehabilitation, 2012 Oct 16. [Epub ahead of print]
  • Rand, D., Rukan, S., Weiss, P.L., & Katz, N. (2009a). Validation of the Virtual MET as an assessment tool for executive functions. Neuropsychological Rehabilitation, 19(4), 583-602.
  • Rand, D., Weiss, P., & Katz, N. (2009b). Training multitasking in a virtual supermarket: A novel intervention after stroke. American Journal of Occupational Therapy, 63, 535-542.
  • Raspelli, S., Carelli, L., Morganti, F., Poletti, B., Corra, B., Silani, V., & Riva, G. (2010). Implementation of the Multiple Errands Test in a NeuroVR-supermarket: A possible approach. Studies in Health Technology and Informatics, 154, 115-119.
  • Raspelli, S., Pallavicini, F., Carelli, L., Morganti, F., Pedroli, E., Cipresso, P., Poletti, B., Corra, B., Sangalli, D., Silani, V., & Riva, G. (2012). Validating the Neuro VR-based virtual version of the Multiple Errands Test: Preliminary results. Presence, 21(1), 31-42.
  • Shallice, T. & Burgess, P.W. (1991). Deficits in strategy application following frontal lobe damage in man. Brain, 114, 727-741.