Timed Up and Go (TUG)

Overview

There is a paucity of literature published on the reliability and validity of the TUG in patients with stroke. For the purposes of this review, we conducted a literature search to identify all relevant publications on the psychometric properties of the TUG.

Floor and Ceiling Effects

Rockwood et al. (2000) found that in frail elderly individuals with cognitive impairment, 35.5% were unable to physically perform the test. This may be indicative of the presence of a large floor effect with the TUG.

Reliability
Internal consistency:
Not reported

Test-retest reliability:
Podsiadlo and Richardson (1991) reported excellent test-retest reliability of the TUG in frail elderly patients (ICC = 0.99).

Steffen et al. (2002) administered the TUG to 97 community-dwelling older adults. The test-retest reliability of the TUG was found to be excellent in this population (ICC = 0.97).

Thompson and Medley (1995) examined the test-retest reliability of the TUG in elderly individuals without any health problems and found excellent test-retest correlations ranging from 0.81 to 0.99.

Rockwood et al. (2000) examined the test-retest reliability of the TUG as part of the Canadian Study of Health and Aging. Adequate test-retest reliability was reported for all participants (ICC = 0.56), for individuals without cognitive impairment alone (ICC = 0.50), and for those with cognitive impairment alone (ICC = 0.56). The results of this study are substantially lower than the results of previous studies examining the test-retest reliability of the TUG in elderly patients. The authors suggest this may be due to the fact that unlike other similar studies, they did not exclude medically unstable patients in their study, and further, they did not control for certain factors (e.g. the time and setting in which the TUG was readministered).

Morris et al. (2001) examined the test-retest reliability of the TUG in 12 patients with Parkinson’s disease and 12 subjects without Parkinson’s disease. Patients were videotaped and timed by 2 experienced raters. Three experienced clinicians and 3 inexperienced clinicians later rated the videotape. The test-retest reliability of the TUG was found to be excellent (ranging from r = 0.87 to r = 0.99).

Flansbjer, Holmback, Downham, Patten, and Lexell (2005) assessed the test-retest reliability of the TUG in 50 patients with chronic mild to moderate post-stroke hemiparesis. The patients performed the TUG twice, with 7 days between each evaluation. The test-retest reliability of the TUG was found to be excellent (ICC = 0.96).

Ng and Hui-Chan (2005) administered the TUG to 10 healthy elderly subjects and 10 patients with chronic stroke twice, at the same time of day, on different days within one week. The results showed excellent test-retest reliability for both healthy elderly subjects (ICC = 0.97) and patients with stroke (ICC = 0.95). The results of this study and the previous study by Flansbjer et al. (2005) suggest that the TUG is a reliable measure in patients with stroke.

Inter-rater reliability:
 
Podsiadlo and Richardson (1991) compared the inter-rater reliability of the TUG, the TUG Manual and the TUG Cognitive using same day comparisons of three raters. Excellent inter-rater reliabilities were found for the TUG (ICC = 0.98), the TUG Manual (ICC = 0.99), and the TUG Cognitive (ICC = 0.99).

Siggeirsdottir, Jonsson, Jonsson, and Iwarsson (2002) examined the inter-rater reliability of the TUG in 31 elderly individuals in a retirement home. No significant difference was found between the two raters (mean difference = 0.04 s), which suggests that the TUG has high inter-rater reliability.

Norén, Bogren, Bolin, and Stenstrom (2001) examined the inter-rater reliability of the TUG in patients with peripheral arthritis. The inter-rater reliability among three physiotherapists was found to be excellent (ICC = 0.97).

Schoppen et al. (1999) examined the inter-rater reliability of the TUG in elderly patients with a lower-extremity amputation. The test was performed for two different observers at different times of the same day. An excellent Spearman correlation was found between the scores of the two observers (r = 0.96), demonstrating the excellent inter-rater reliability of the TUG.
Note: Caution should be taken in interpreting these findings as the Spearman correlation is not the preferred method of assessing inter-rater reliability and may have produced higher reliability coefficients than a more appropriate analysis.
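
The concern raised in the note above can be illustrated with a minimal sketch. The timings below are hypothetical and are not data from Schoppen et al. (1999); the ICC form used here (two-way random effects, single measure, absolute agreement) is one common choice and is an assumption, not necessarily the form used in the studies reviewed. When one rater records systematically longer times than the other, the rank order of patients is unchanged, so the Spearman correlation stays perfect while the ICC drops.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks; assumes no tied values for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman_rho(x, y):
    """Spearman rank correlation using the shortcut formula (valid without ties)."""
    n = len(x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

def icc_absolute_agreement(x, y):
    """Two-way random-effects, single-measure ICC (absolute agreement), two raters."""
    n, k = len(x), 2
    grand = mean(x + y)
    row_means = [(a + b) / 2 for a, b in zip(x, y)]   # one mean per patient
    col_means = [mean(x), mean(y)]                    # one mean per rater
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for v in x + y)
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

# Hypothetical TUG times in seconds: rater B is always 3 s slower than rater A.
rater_a = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
rater_b = [t + 3.0 for t in rater_a]

rho = spearman_rho(rater_a, rater_b)
icc = icc_absolute_agreement(rater_a, rater_b)
print(f"Spearman rho = {rho:.2f}, ICC = {icc:.2f}")
```

With these hypothetical timings the Spearman correlation is 1.00 while the ICC is approximately 0.76, because the ICC penalizes the constant 3-second disagreement between raters and the rank correlation does not.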

Morris et al. (2001) examined the inter-rater reliability of the TUG using three experienced raters and three inexperienced raters. Each rater viewed the sequence of performances for 12 patients with Parkinson’s disease and 12 comparison subjects from videotape. Raters viewed the videotapes independently at least one week after testing. Reliability coefficients were excellent for both experienced and inexperienced raters, ranging from 0.87 to 0.99. The results of this study demonstrate the excellent inter-rater reliability of the TUG in patients with Parkinson’s disease.

Intra-rater reliability:
Podsiadlo and Richardson (1991) found that the TUG demonstrated excellent intra-rater reliability in frail elderly individuals (ICC = 0.99).

Schoppen et al. (1999) examined the intra-rater reliability of the TUG in elderly patients with a lower-extremity amputation. Patients performed the TUG for one observer at two different occasions with an interval of two weeks. An excellent Spearman correlation was observed between scores obtained by the same rater on two consecutive visits (r = 0.93).
Note: Caution should be taken in interpreting these findings as the Spearman correlation is not the preferred method of assessing intra-rater reliability and may have produced higher reliability coefficients than a more appropriate analysis.

Validity

Content:

Not available.

Criterion:

No gold standard exists.

Construct:

Convergent/Discriminant:
Rockwood et al. (2000) examined the convergent and discriminant validity of the TUG using Phase 2 data from the Canadian Study of Health and Aging. Both convergent and discriminant validity were assessed using Spearman correlations between the TUG and other functional assessments: the Older Americans Resources and Services Instrumental Activities of Daily Living Scale (OARS IADL) and OARS Activities of Daily Living (OARS ADL) (Fillenbaum & Smyer, 1981), the Cumulative Illness Rating Scale (CIRS) (Linn, Linn, & Gurel, 1968), and the Frailty Scale (developed for the Canadian Study of Health and Aging-2). The TUG demonstrated excellent correlations with the OARS ADL for all participants and for participants with cognitive impairment alone (r = -0.69 and r = -0.72, respectively). The OARS IADL also had excellent correlations with the TUG for all participants and for participants with cognitive impairment alone (r = -0.70 and r = -0.70, respectively). The TUG also had an excellent correlation with the Frailty Scale for all participants (r = 0.60). Some correlations are negative because a high score on the TUG indicates abnormal functioning, whereas a high score on some other measures indicates better performance. The TUG correlated poorly with the CIRS (ranging from r = 0.22 to r = 0.26).

Berg, Maki, Williams, Holliday, and Wood-Dauphinee (1992) compared scores from clinical measures and laboratory tests of balance and mobility in 31 elderly subjects. An adequate correlation between the TUG and the Barthel Index was reported (r = -0.48). Excellent correlations between the TUG and the Berg Balance Scale (r = -0.76) and between the TUG and the Tinetti Balance Scale (Tinetti, 1986) (r = 0.74) were also observed (some correlations are negative because a high score on the TUG indicates abnormal functioning, whereas a high score on other measures indicates better health).

Podsiadlo and Richardson (1991) examined the convergent validity of the TUG in frail elderly individuals and reported an excellent correlation between the TUG and the Berg Balance Scale (r = -0.72), and adequate correlations between the TUG and gait speed (r = -0.55) and between the TUG and the Barthel Index (r = -0.51).

Brooks, Davis, and Naglie (2006) examined the construct validity of the TUG and two other measures of physical performance in 52 frail older individuals. Correlations between the TUG and the Functional Independence Measure (Keith, Granger, Hamilton, & Sherwin, 1987) were adequate both at admission (r = -0.59) and at discharge (r = -0.42). Correlations are negative because a high score on the TUG indicates abnormal functioning, whereas a high score on the Functional Independence Measure indicates functional independence.

Schoppen et al. (1999) examined the validity of the TUG by comparing it to the Sickness Impact Profile-68 item scale (de Bruin, Diederiks, de Witte, Stevens, & Philipsen, 1994) and the Groningen Activity Restriction Scale (GARS) (Kempen, Doeglas, & Suurmeijer, 1993) in 32 patients over the age of 60 with unilateral transtibial or transfemoral amputation because of peripheral vascular disease. An adequate Spearman correlation was reported between the TUG and the GARS (r = 0.39). The TUG also correlated adequately with the total score of the Sickness Impact Profile (r = 0.40) and with its mobility control and mobility range subscales (r = 0.46 and r = 0.36, respectively). Poor correlations were found between the TUG and the Sickness Impact Profile subscales “psychic autonomy and communication” (r = 0.31), “social behavior” (r = 0.19), and “emotional stability” (r = -0.04). These findings suggest that the TUG does not reflect mental functioning.

Norén et al. (2001) administered various assessments of balance to 65 patients with peripheral arthritis and found that the Berg Balance Scale (Berg, Wood-Dauphinee, Williams, & Maki, 1989) and the TUG had an excellent correlation (Spearman’s rho = -0.83). The correlation is negative because a high score on the Berg Balance Scale indicates normal balance, whereas a high score on the TUG indicates abnormal functioning.

Ng and Hui-Chan (2005) administered the TUG to 10 healthy elderly participants and 11 patients with chronic stroke. Spearman correlation analyses were conducted to examine the convergent and discriminant validity of the TUG with various measures. No significant associations were observed between the TUG and spasticity of the ankle plantarflexors of either the affected or the unaffected leg. An excellent correlation between the TUG and the peak plantarflexion torque generated by maximum isometric voluntary contraction (MIVC) of the affected plantarflexors was reported (r = -0.86); however, the TUG did not correlate with the other MIVC parameters measured. Excellent negative correlations were found between the TUG and gait velocity in both healthy participants and patients with stroke (r = -0.98 and r = -0.99, respectively). For the other gait parameters, the step lengths of both the affected and unaffected legs had excellent correlations with the TUG (ranging from r = -0.67 to r = -0.80). An excellent correlation was found between the distance covered during the 6-Minute Walk Test (6MWT) (Guyatt et al., 1985) and the TUG (r = -0.96). Some correlations are negative because a high score on the TUG indicates abnormal functioning, whereas a high score on other measures indicates a high level of performance.

Flansbjer et al. (2005) examined six gait performance tests in patients with mild to moderate post-stroke hemiparesis (the TUG, Comfortable Gait Speed, Fast Gait Speed, Stair Climbing Ascend, Stair Climbing Descend, and the 6-Minute Walk Test). The tests were administered twice, 7 days apart. Excellent correlations were found between the TUG and the other gait performance measures, ranging from r = -0.84 to r = -0.92 (these correlations are negative because a high score on the TUG indicates abnormal functioning, whereas a high score on other gait measures indicates normal performance). Taken together with the results from the study by Ng and Hui-Chan (2005), the TUG appears to be a valid measure for use in patients with stroke.

Known groups:
Brooks et al. (2006) examined the construct validity of the TUG in 52 frail older individuals. They found that the TUG could distinguish patients using different ambulatory aids. Berg et al. (1992) found that the TUG was able to distinguish between elderly individuals who walked with an aid (cane or walker) versus those who did not use any walking aid (effect size = 1.02).

Rockwood et al. (2000) examined the validity of the TUG using Phase 2 data from the Canadian Study of Health and Aging. They reported that cognitively unimpaired clients could perform the TUG faster than cognitively impaired clients (12 seconds versus 15 seconds, on average).

Morris et al. (2001) found that the TUG could distinguish between patients with Parkinson’s disease who were taking levodopa and those who were not, when compared to individuals without Parkinson’s disease.

Ng and Hui-Chan (2005) found that the TUG was able to distinguish healthy elderly individuals from patients with stroke (mean time to complete the TUG was 9.1 seconds for healthy individuals and 22.6 seconds for patients with stroke).

Predictive:
Nikolaus, Bach, Oster, and Schlierf (1996) examined predictors of death, nursing home placement and hospital admission in 135 patients admitted to a geriatric hospital and discharged home. In a logistic regression analysis, baseline TUG scores were found to be an independent predictor for nursing home placement.

Schwartz et al. (1999) found that in a sample of elderly Mexican-American women, those with the best and worst performance on the TUG were more likely to fall than those with moderate performance.

Whitney, Marchetti, Schade, and Wrisley (2004) found that patients with vestibular disorders and a history of falls who scored > 11.1 seconds on the TUG were five times more likely to have reported a fall in the previous 6 months.

Sensitivity and Specificity

Shumway-Cook, Brauer, and Woollacott (2000) examined the sensitivity and specificity of the TUG in predicting falls in community-dwelling older adults. The TUG correctly classified 13/15 fallers (87% sensitivity) and 13/15 nonfallers (87% specificity). These results suggest that the TUG is a sensitive and specific measure for identifying elderly individuals who are prone to falls.
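
The percentages above follow directly from the reported counts. As a minimal sketch of the arithmetic (the function names are illustrative only), sensitivity is the proportion of fallers correctly classified and specificity is the proportion of nonfallers correctly classified:

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of people with the condition (fallers) correctly identified."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """Proportion of people without the condition (nonfallers) correctly identified."""
    return true_negatives / (true_negatives + false_positives)

# Counts as reported: 13 of 15 fallers and 13 of 15 nonfallers correctly classified.
sens = sensitivity(true_positives=13, false_negatives=2)
spec = specificity(true_negatives=13, false_positives=2)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

Both values come to 13/15, which rounds to 87%.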

Whitney, Marchetti, Schade, and Wrisley (2004) examined the sensitivity and specificity of the TUG using the charts of 103 patients with vestibular disorders and a history of falls. Sensitivity (80%) and specificity (56%) were calculated for TUG scores of > 11.1 seconds.

Responsiveness

Brooks, Davis, and Naglie (2006) examined the responsiveness of the TUG in 52 frail older individuals. The TUG demonstrated a large responsiveness to an intervention that occurred between admission and discharge with a standardized response mean (SRM) of 1.1.
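
For readers unfamiliar with the SRM, it is usually computed as the mean change score divided by the standard deviation of the change scores. The sketch below uses hypothetical admission and discharge times, not data from Brooks et al. (2006), and the direction of subtraction (admission minus discharge, so that faster times at discharge give a positive value) is an assumption made for illustration.

```python
from statistics import mean, stdev

def standardized_response_mean(before, after):
    """SRM = mean of the change scores / standard deviation of the change scores."""
    changes = [b - a for b, a in zip(before, after)]  # positive = faster at follow-up
    return mean(changes) / stdev(changes)

# Hypothetical TUG times in seconds for five patients.
admission = [30.0, 25.0, 40.0, 22.0, 35.0]
discharge = [22.0, 20.0, 28.0, 19.0, 24.0]

srm = standardized_response_mean(admission, discharge)
print(f"SRM = {srm:.2f}")
```

By a commonly used convention borrowed from effect-size benchmarks, values of about 0.2, 0.5 and 0.8 are read as small, moderate and large, which is consistent with describing an SRM of 1.1 as large.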

Flansbjer et al. (2005) examined the responsiveness of the TUG in 50 individuals with stroke. The smallest real difference (SRD), representing the smallest change that indicates a real (clinical) improvement, was small (SRD = 23%). In other words, the TUG can be used to detect clinically relevant small changes.
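
A common way to obtain the SRD is from the standard error of measurement (SEM): SRD = 1.96 × √2 × SEM, expressed as a percentage of the mean score. The sketch below assumes this formulation and estimates the SEM as SD × √(1 − ICC); both are assumptions for illustration and may differ from the exact computation in Flansbjer et al. (2005). The mean and standard deviation are hypothetical; only the ICC of 0.96 is taken from the reliability results reported above.

```python
import math

def standard_error_of_measurement(sd, icc):
    """SEM estimated from the between-subject SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)

def smallest_real_difference(sem, z=1.96):
    """SRD for a difference between two measurements (hence the sqrt(2) factor)."""
    return z * math.sqrt(2.0) * sem

# Hypothetical group mean and SD of TUG times (seconds); ICC as reported above.
mean_tug = 20.0
sd_tug = 8.0
icc = 0.96

sem = standard_error_of_measurement(sd_tug, icc)
srd = smallest_real_difference(sem)
srd_percent = 100.0 * srd / mean_tug
print(f"SEM = {sem:.2f} s, SRD = {srd:.2f} s, SRD% = {srd_percent:.1f}%")
```

Under this formulation, an individual's TUG time would need to change by more than the SRD percentage of the mean before the change could be distinguished from measurement error.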

Salbach and colleagues (2001) compared the responsiveness of gait speed and other disability measures in 50 post-stroke patients with gait deficits. The TUG demonstrated significant change from 8 to 38 days post-stroke (SRM = 0.73). However, there were significant difficulties in obtaining scores since not all patients could complete the test at both times. The SRM reported reflects scores for only those subjects who were able to perform the test. The responsiveness of the TUG also varied depending on the group of patients tested. In the moderate group, the TUG was rated the third most responsive tool after the 5-meter Walk Test (5mWT) (maximum pace) and the 5mWT (comfortable pace). In the fast group, the TUG was rated the second most responsive tool after the 5mWT.

References
  • Berg, K. O., Maki, B. E., Williams, J. I., Holliday, P. J., Wood-Dauphinee, S. L. (1992). Clinical and laboratory measures of postural balance in an elderly population. Archives of Physical Medicine and Rehabilitation, 73, 1073-1080.

  • Berg, K., Wood-Dauphinee, S. L., Williams, J. I., Maki, B. E. (1992). Measuring balance in the elderly: Validation of an instrument. Canadian Journal of Public Health, 83(S2), S7-11.

  • Berg, K.O., Wood-Dauphinee, S., Williams, J. L., Maki, B. (1989). Measuring balance in the elderly: Validation of an instrument. Physiotherapy Canada, 41(6), 304-311.

  • Benaim, C., Pérennou, D. A., Villy, J., Rousseaux, M., Pelissier, J. Y. (1999). Validation of a standardized assessment of postural control in stroke patients. Stroke, 30, 1862-1868.

  • Bourbonnais, D., Bilodeau, S., Lepage, Y., Beaudoin, N., Gravel, D., Forget, R. (2002). Effect of force-feedback treatments in patients with chronic motor deficits after a stroke. American Journal of Physical Medicine and Rehabilitation, 81, 890-897.

  • Brooks, D., Davis, A., Naglie, G. (2006). Validity of 3 physical performance measures in inpatient geriatric rehabilitation. Arch Phys Med Rehabil, 87, 105-110.

  • Cattaneo, D., Regola, A., Meotti, M. (2006). Validity of six balance disorders scales in persons with multiple sclerosis. Disability & Rehabilitation, 28(12), 789-795.

  • de Bruin, A. F., Diederiks, J. P. M., de Witte, L. P., Stevens, F. C. J., Philipsen, H. (1994). The development of a short generic version of the Sickness Impact Profile. J Clin Epidemiol, 47, 407-418.

  • Fillenbaum, G. G., Smyer, M. A. (1981). The development, validity, and reliability of the OARS multidimensional functional assessment questionnaire. J Gerontol, 36, 428-434.

  • Finch, E., Brooks, D., Stratford, P. W., Mayo, N. E. (2002). Physical Rehabilitations Outcome Measures. A Guide to Enhanced Clinical Decision-Making (second ed.), Canadian Physiotherapy Association, Toronto.

  • Flansbjer, U., Holmback, A. M., Downham, D., Patten, C., Lexell, J. (2005). Reliability of gait performance tests in men and women with hemiparesis after stroke. J Rehabil Med, 37, 75-82.

  • Guyatt, G. H., Sullivan, M. J., Thompson, P. J., Fallen, E. L., Pugsley, S. O., Taylor, D. W., Berman, L. B. (1985). The 6-minute walk: a new measure of exercise capacity in patients with chronic heart failure. Can Med Assoc J, 132, 919-923.

  • Hayes, K., Johnson, M. (2003). Measures of Adult General Performance Tests. Arthritis & Rheumatism (Arthritis Care & Research), 49(5S), S28-S42.

  • Keith, R. A., Granger, C. V., Hamilton, B. B., Sherwin, F. S. (1987). The functional independence measure: A new tool for rehabilitation. Adv Clin Rehabil, 1, 6-18.

  • Kempen, M., Doeglas, D. M., Suurmeijer, M. (1993). Het meten van problemen met zelfredzaamheid op verzorgend en huishoudelijk gebied met de Groningen Activiteiten Restrictie Schaal (GARS): een handleiding. Groningen, The Netherlands: Noordelijk Centrum voor Gezondheidsvraagstukken, NCG.

  • Linn, B. S., Linn, M. W., Gurel, L. (1968). Cumulative Illness Rating Scale. J Am Geriatr Soc, 16, 622-626.

  • Lundin-Olsson, L., Nyberg, L., Gustafson, Y. (1998). Attention, frailty, and falls: The effect of a manual task on basic mobility. Journal of the American Geriatrics Society, 46, 758-761.

  • Mathias, S., Nayak, U. S., Isaacs, B. (1986). Balance in elderly patients: The “Get-Up and Go” test. Arch Phys Med Rehabil, 67, 387-389.

  • Morris, S., Morris, M. E., Iansek, R. (2001). Reliability of measurements obtained with the Timed “Up & Go” test in people with Parkinson disease. Physical Therapy, 81(2), 810-818.

  • Ng, S. S., Hui-Chan, C. W. (2005). The Timed Up & Go test: its reliability and association with lower-limb impairments and locomotor capacities in people with chronic stroke. Arch Phys Med Rehabil, 86(8), 1641-1647.

  • Nikolaus, T., Bach, M., Oster, P., Schlierf, G. (1996). Prospective value of self-report and performance-based tests of functional status for 18-month outcomes in elderly patients. Aging Clin Exp Res, 8, 271-276.

  • Norén, A. M., Bogren, U., Bolin, J., Stenstrom, C. (2001). Balance assessment in patients with peripheral arthritis: Applicability and reliability of some clinical assessments. Physiother Res Int, 6, 193-204.

  • Podsiadlo, D., Richardson, S. (1991). The Timed “Up & Go”: A test of basic functional mobility for frail elderly persons. J Am Geriatr Soc, 39, 142-148.

  • Rockwood, K., Awalt, E., Carver, D., MacKnight, C. (2000). Feasibility and measurement properties of the functional reach and the Timed Up and Go tests in the Canadian Study of Health and Aging. J Gerontol A Biol Sci Med Sci, 55A, M70-73.

  • Salbach, N., Mayo, N., Higgins, J., Ahmed, S., Finch, L., Richards, C. (2001). Responsiveness and predictability of gait speed and other disability measures in acute stroke. Arch Phys Med Rehabil, 82, 1204-1212.

  • Schoppen, T., Boonstra, A., Groothoff, J. W., de Vries, J., Goeken, L. N., Eisma, W. H. (1999). The Timed “Up and Go” test: reliability and validity in persons with unilateral lower limb amputation. Arch Phys Med Rehabil, 80, 825-828.

  • Schoppen, T., Boonstra, A., Groothoff, J. W., de Vries, J., Goeken, L. N., Eisma, W. H. (2003). Physical, mental, and social predictors of functional outcome in unilateral lower-limb amputees. Archives of Physical Medicine and Rehabilitation, 84, 803-811.

  • Schwartz, A. V., Villa, M. L., Prill, M., Kelsey, J. A., Galinus, J. A., Delay, R. R., Nevitt, M. C., Bloch, D. A., Marcus, R., Kelsey, J. L. (1999). Falls in older Mexican-American women. J Am Geriatr Soc, 47, 1371-1378.

  • Siggeirsdottir, K., Jonsson, B. Y., Jonsson, H., Iwarsson, S. (2002). The “Timed Up & Go” is dependent on chair type. Clinical Rehabilitation, 16(6), 609-616.

  • Shumway-Cook, A., Brauer, S., Woollacott, M. (2000). Predicting the probability for falls in community-dwelling older adults using the Timed Up & Go Test. Phys Ther, 80, 896-903.

  • Shumway-Cook, A., Woollacott, M. H. (2001). Motor Control: Theory and Practical Applications (second ed.). Lippincott Williams & Wilkins, pp. 272-273.

  • Steffen, T. M., Hacker, T. A., Mollinger, L. (2002). Age- and gender-related test performance in community-dwelling elderly people: Six-Minute Walk Test, Berg Balance Scale, Timed Up & Go Test, and gait speeds. Phys Ther, 82(2), 128-137.

  • Thompson, M., Medley, A. (1995). Performance of community dwelling elderly on the Timed Up and Go test. Physical and Occupational Therapy in Geriatrics, 13, 17-30.

  • Tinetti, M. E. (1986). Performance-oriented assessment of mobility problems in elderly patients. J Am Geriatr Soc, 34, 119-126.

  • Tremblay, L. E., Savard, J., Casimiro, L., Tremblay, M. (2004). Repertoire des Outils d’Evaluation en Francais pour la Readaptation, Regroupement des intervenantes et intervenants francophones en sante et en services sociaux de l’Ontario, Ottawa.

  • Whitney, S. L., Marchetti, G. F., Schade, A., Wrisley, D. M. (2004). The sensitivity and specificity of the Timed “Up & Go” and the dynamic gait index for self-reported falls in persons with vestibular disorders. Journal of Vestibular Research, 14(5), 397-409.