Assessing general cognitive and adaptive abilities in adults with Down syndrome: a systematic review

Background: Measures of general cognitive and adaptive ability in adults with Down syndrome (DS) used by previous studies vary substantially. This review summarises the different ability measures used previously, focusing on tests of intelligence quotient (IQ) and adaptive behaviour (AB), and where possible examines floor effects and differences between DS subpopulations. We aimed to use information regarding existing measures to provide recommendations for individual researchers and the DS research community.
Results: Nineteen studies reporting IQ test data met inclusion for this review, with 17 different IQ tests used. Twelve of these IQ tests were used in only one study while five were used in two different studies. Eleven studies reporting AB test data met inclusion for this review, with seven different AB tests used. The only AB scales to be used by more than one study were the Vineland Adaptive Behaviour Scale (VABS; used by three studies) and the Vineland Adaptive Behavior Scale 2nd Edition (VABS-II; used by two studies). A variety of additional factors were identified which make comparison of test scores between studies problematic, including different score types provided between studies (e.g. raw scores compared to age-equivalent scores) and different participant inclusion criteria (e.g. whether individuals with cognitive decline were excluded). Floor effects were common for IQ tests (particularly for standardised test scores). Existing data suggest that floor effects may be minimised by the use of raw test scores rather than standardised test scores. Raw scores may, therefore, be particularly useful in longitudinal studies to track change in cognitive ability over time.
Conclusions: Studies assessing general ability in adults with DS are likely to benefit from the use of both IQ and AB scales. The DS research community may benefit from the development of reporting standards for IQ and AB data, and from the sharing of raw study data enabling further in-depth investigation of issues highlighted by this review.


Background
Down syndrome (DS) is the most common genetic cause of intellectual disability (ID), with an incidence of around 1 in 650-1000 live births worldwide [1]. DS occurs due to an extra copy of chromosome 21 (trisomy 21), typically in its entirety and in all cells. However, in rarer cases of DS, only some cells have an extra copy of chromosome 21 (mosaicism) or only part of chromosome 21 is triplicated by translocation (partial trisomy). People with DS may have significant cognitive impairments and typically have an intelligence quotient (IQ) ranging from 30 to 70, although IQs both above and below this range occur [2]. Cognitive domains that are particularly impaired in individuals with DS include language (especially expressive language), memory, executive function, and motor coordination. These impairments can vary substantially among individuals with DS and also within individuals due to advanced adult age and/or the development of dementia (for which people with DS are at an ultra-high risk [lifetime prevalence of dementia is estimated to be as high as 90% [3]], although considerable variability is present in terms of age at dementia onset and clinical presentation, as reviewed by Zigman and Lott [4]).
In addition to impairments in general cognitive ability, individuals with DS also have considerable limitations in adaptive behaviour (AB). Adaptive skills are defined as "the effectiveness with which the individual copes with the natural and social demands of his environment" [5]. Although they reflect distinct domains of functioning, adaptive skills/abilities are associated with general cognitive ability measured with IQ [2], suggesting AB scales may be used as an alternative for estimating the severity of ID in individuals when IQ assessment results are unavailable.
Due to the unique cognitive profile found in people with DS (see [6]), it is necessary to understand how useful and applicable different IQ tests and AB scales are for this population as an index of general abilities. Understanding the relationship between IQ and AB scores across the lifespan is also of importance as there is decline in both IQ and AB scores as people with DS age [7,8]. This is thought to be associated with the development of Alzheimer's disease (AD). However, other conditions such as untreated hypothyroidism or emergent neuropsychiatric symptoms, such as the development of depression, may also impact capabilities and performance during assessments. Cohort effects, such as improvements in healthcare and education (including the phasing out of institutions), are also important considerations for cross-sectional studies [9].
IQ tests and AB scales are commonly used in DS studies to describe and compare participant samples, establish the impact of interventions/treatments or comorbidities, and track cognitive change with development and ageing. Such assessments may be particularly important in clinical trials of treatments to improve cognitive outcomes or to track the trajectory of decline due to advanced age or dementia. However, assessment of general ability in individuals with DS is complicated by floor-effects for many neuropsychological tests that were developed for use within the typically developing (TD) population [10][11][12]. In addition, a relative weakness in language domains is often present for people with DS, which may complicate interpretation of performance on verbal tests and those with a large verbal component [13].
The aim of this systematic literature review is to summarise currently available literature on the different IQ and AB tests used previously with adults with DS, with a particular focus on direct comparisons between tests as well as differences in performance between participant groups (younger adults and older adults with and without dementia) in order to make recommendations for future studies assessing general cognitive abilities in adults with DS, and also for the wider DS research community (see Table 3).
Full articles were then read in detail to identify which met the following additional inclusion criteria. We included tests of vocabulary, as these are often viewed as tests of general ability due to their strong correlation with IQ. Statistical data (including at least one of the following: mean, median, standard deviation, range, floor effects) from a named IQ or general ability test was provided. Where not all individuals in the study were 16 years or older or not all participants had a diagnosis of DS, papers were only included where separate statistical data (at least one of the following: mean, median, standard deviation, range, floor effects) was provided for participants with DS aged 16 years or older. This brought the total number of eligible studies down to 14. If the same or overlapping participants were used in multiple studies, we selected the main report for inclusion, discarding a further four papers. Additionally, reference lists of identified articles were examined to identify other relevant studies, adding five papers, and a further four papers were included due to knowledge of the research area. This resulted in a total of 19 relevant papers.

Systematic review methods
All available data regarding sample size, age of participants, IQ test used, performance on tests, floor effects (if available), and whether the study reported raw and/or standardised test scores was extracted from included papers.

Search strategy and selection criteria: AB scales
For AB scales in DS, the same database (PubMed) was searched using the search terms ("Down syndrome" [MeSH Major Topic]) AND ("Adaptive Behavior Scales" [All fields] OR "Vineland" [All fields] OR "Adaptive Behavior Assessment System" [All fields] OR "Diagnostic Adaptive Behavior Scale" [All fields] OR "Adaptive Behavior" [All fields] OR "Every day abilities" [All fields] OR "Scales of Independent Behavior" [All fields] OR "Barthel Index" [All fields] OR "Wessex Behaviour Scale" [All fields]) on 23 September 2018, identifying a total of 69 papers. AB papers were included in the review using the same criteria as for the IQ papers detailed above; 36 papers remained after screening titles and abstracts. After reading the full articles, 28 of these papers were discarded. Reference lists of identified articles were examined to identify other relevant studies, adding two papers. One additional paper was included due to knowledge of the research area. This left a total of 11 relevant papers.
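The selection counts reported for the IQ and AB searches can be checked with a short tally. The sketch below is only an illustrative cross-check of the figures given in the text; the function name and step labels are ours and are not part of the review's methods.

```python
# Tally of the study-selection steps as stated in the text.
# Each step is (description, signed change in the number of papers).

def tally(start, steps):
    """Apply signed adjustments to a starting count and return the final count."""
    total = start
    for _description, change in steps:
        total += change
    return total

# IQ papers: 14 remained after applying the full-text inclusion criteria.
iq_total = tally(14, [
    ("same or overlapping participants", -4),
    ("found via reference lists", +5),
    ("included due to knowledge of the research area", +4),
])

# AB papers: 36 remained after title and abstract screening (69 identified).
ab_total = tally(36, [
    ("discarded after full-text reading", -28),
    ("found via reference lists", +2),
    ("included due to knowledge of the research area", +1),
])

print(iq_total, ab_total)  # 19 11
```

Both totals agree with the 19 IQ papers and 11 AB papers reported.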
All available data regarding sample size, age of participants, AB scale used, test performance, floor effects (if available), and whether the study reported raw and/or standardised test scores was extracted from included papers.

Tests
Nineteen studies, comprising 1455 participants (range 26-305 participants), meeting inclusion criteria that reported IQ or general ability test scores are shown in Table 1. A wide range of ages are included in this review, with the oldest participant being 71 years old. A brief description of all tests identified within this review is provided in the Appendix Table 4.
In total, 17 different IQ or general ability tests were used across the 19 identified studies. Twelve of these IQ tests were used in only one study while five were used in two different studies. These five tests were the Kaufman Brief Intelligence Test 2nd edition (KBIT-2) [11,16], the Wechsler Intelligence Scale for Children-Revised (WISC-R) [24,25], Raven's Coloured Progressive Matrices (RCPM) [13,22], the British Picture Vocabulary Scale 2nd edition (BPVS-II) [12,23], and the Peabody Picture Vocabulary Test 4th edition (PPVT-IV) [14,26].
In addition to this, different versions of the same test were used by a number of studies. These included the Peabody Picture Vocabulary Test-Revised (PPVT-R) and the Peabody Picture Vocabulary Test 3rd edition (PPVT-III) [7,13], in addition to the Leiter International Performance Scale-Revised (Leiter-R) [10] and a brief version of this test [12]. The Wechsler Adult Intelligence Scale-III (WAIS-III; Portuguese version) and Wechsler Adult Intelligence Scale Revised were also each used once. Furthermore, de Sola et al. [17] used the Spanish version of the KBIT.
Five tests were used in only one study and also had no alternative versions used. These included the Prudhoe Cognitive Function Test (PCFT), the Woodcock-Johnson Tests of Cognitive Ability-Revised (WJTCA-R), the Matrix Analogies Test-Expanded Form (MAT), the Wechsler Preschool and Primary Scale of Intelligence-Revised (WPPSI-R), and the Stanford Binet 5th edition.

Participant samples
Although some studies have used the same or different versions of the same test, comparison between studies is complicated by differing participant inclusion criteria. For example, some studies grouped participants by dementia status and provide separate test results for each group [11,16], or only include individuals without a diagnosis of dementia or noticeable decline [8,19,20,23-26], while in other studies these participants are included in the overall sample [7,13]. Different criteria to define and/or detect dementia were also used between studies.
Furthermore, some studies restricted inclusion to more able participants. For example, "participants were required to have sufficient verbal ability to be interviewed" [20], participants were required to have "verbal oral language skills" [18], inclusion of participants with mild-moderate ID only [24], inclusion criteria of IQ > 30 [25], inclusion criteria of a mental age above 2.5 years in addition to at least minimal verbal communication [14], inclusion criteria of receptive language > 3 years [26], and the inclusion of individuals not at floor only [13]. All such studies were still included in this review, despite differing individual inclusion criteria. Such differing criteria will substantially skew floor effects between studies and make comparison between studies problematic.

Floor effects
Nine of the 19 studies reported data on floor effects for the IQ or general ability tests they used [10-12, 14, 16, 17, 23-25]. Additionally, floor effects were alluded to by Das et al. [7], who indicated the MAT was "too difficult for most participants". Of the remaining studies, five studies did not report data on floor effects [15,19,21,22,26], two studies only included individuals who were able to provide a verbal response [18,20], and one study only included individuals above floor levels [13]. Studies using standardised test scores reported particularly large floor effects. These were as high as 61% for the Leiter-R [10]. Glenn and Cunningham [12] also reported large floor effects for the brief Leiter-R (the "majority" of test scores were at floor). For the KBIT (Spanish version), de Sola et al. [17] reported floor effects of 41.9% for standardised IQ scores.
When examining KBIT-2 IQ subscales independently, Startin et al. [16] reported floor effects of 66.7% for verbal IQ and 39.4% for non-verbal IQ (adults aged 36+ without a clinical diagnosis of dementia).
Table 1 Summary of studies using intelligence tests in adults with DS. Tests are arranged into those not specifically designed for children and adolescents and those that are. AB tests not shown (see Table 2). Ages and age-equivalents given in years; where given in months in original papers these have been converted. Ages and scores given as mean (SD; range). NR indicates "not reported"
For studies reporting IQ test raw scores, using the WISC-R, Kittler et al. [24] reported 40% and 48% of participants scored 0 or 1 on the first administration of the picture arrangement subtest and the similarities subtest, respectively. Devenny et al. [25] also reported high floor effects for these same subscales (52% and 66%, respectively). In contrast to this, when analysing KBIT-2 raw scores, two studies [11,16] found no or limited floor effects for the verbal subscale (based on receptive language rather than expressive language).
The KBIT-2 non-verbal subscale had moderate floor effects across both younger (YA) and older adults (OA), and these increased substantially in participants with dementia (see Fig. 1). Raw scores were also used by Strydom et al. [23] on the BPVS-II, with moderate floor effects (9.4%) reported.

Comparison between IQ test scores
Age-equivalent scores
Two IQ tests were identified for which age-equivalent scores were reported by more than one study. Using the BPVS-II (which provides an estimate of receptive language), Glenn and Cunningham [12] reported a mean age-equivalent score of 6.5 years for their sample of younger adults with DS (age range 16-24 years). Strydom et al. [23] reported BPVS-II mean age-equivalent scores separately for participants with mild, moderate, and severe ID (7.8 years, 4.7 years, and 2.0 years, respectively). Interestingly, Glenn and Cunningham [12] also provided non-verbal age-equivalent scores for their participants, using the Brief Leiter-R (mean 5.2 years). Although the higher mean verbal age-equivalent score in this study compared to mean non-verbal (6.5 vs 5.2 years) is not consistent with the cognitive profile associated with DS, the difference is small and SD scores overlap.
Using the PPVT-IV (which also provides a measure of receptive language), Hartley et al. [14] reported a mean age-equivalent score of 8.1 years for their sample of adults with DS aged 30 years or older. Using the same test, Lao et al. [26] reported a mean age-equivalent score of 8.2 years for their sample of adults with DS aged 30 years or older. A lower mean receptive vocabulary age-equivalent score was reported by Iacono et al. using the PPVT-III (5.2 years) [13]. However, 18% of this sample were reported to have diagnosed or suspected dementia, and so comparison between these and the above studies (which did not include people with dementia) is problematic.
It is important to note that the lowest full IQ score obtainable on the Leiter-R is 36, and the Stanford Binet 4th Ed supports calculation of IQ scores lower than 40, whereas the lowest full IQ scores for the KBIT and WAIS-III are 40 and 45, respectively. It is therefore possible the results reported here are influenced by differing floor levels between tests. Furthermore, floor effects may substantially influence mean test scores. Apart from the high floor effects in standardised IQ tests, it is also worthwhile noting that standardised scoring may result in inflated estimates of true abilities near floor levels, which may differ between tests [27].
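The way a test's floor can inflate a group mean can be illustrated with a minimal sketch. It assumes that any score below the lowest obtainable value is simply recorded at that value; the "true" scores are hypothetical and are not data from any reviewed study. Only the floor values (36, 40 and 45) are taken from the text above.

```python
import statistics

def observed_scores(true_scores, floor):
    """Record each score at the test floor if it would fall below it."""
    return [max(score, floor) for score in true_scores]

def summarise(true_scores, floor):
    """Return (mean of observed scores, number of participants at floor)."""
    scores = observed_scores(true_scores, floor)
    return statistics.mean(scores), sum(s == floor for s in scores)

# Hypothetical "true" standardised scores for a small sample (not real data).
true_scores = [28, 33, 38, 42, 47, 55, 63]

# Lowest obtainable full IQ scores quoted in the text.
for floor in (36, 40, 45):
    mean, n_at_floor = summarise(true_scores, floor)
    print(f"floor {floor}: mean {mean:.1f}, {n_at_floor}/{len(true_scores)} at floor")
```

In this toy sample the same participants yield a higher mean and more scores at floor as the floor rises, which is why standardised means from tests with different floors are hard to compare directly.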

Raw test scores
Raw test scores are only useful to compare between studies when the same test has been used. The KBIT-2 has been used in more than one study [11,16].
These two papers are published by one group and it should be noted that although there is no overlap in data, there is some overlap between participants (31 individuals from Sinai et al. were later recruited by Startin et al.).
Both studies found a wide range of raw scores for both subscales of the KBIT-2 (see Fig. 2). When examining scores across participant groups (younger adults (YA), older adults without dementia (OA-ND), and older adults with dementia (OA-D)), verbal and non-verbal subscale means and ranges reported by Startin et al. [16] appear relatively similar between YA and OA-ND but were lower in OA-D. Sinai et al. [11] also reported similar reductions in verbal and non-verbal mean scores and ranges between OA-ND and OA-D (YA not included in this study). Overall, these studies demonstrate that raw KBIT-2 scores can be obtained from a range of individuals with DS, including many individuals with dementia.
Raw scores from the WISC-R have been used in two studies [24,25]. Neither study split participants by age, and both included only individuals with no decline; therefore, raw test scores between groups cannot be compared. However, Kittler et al. [24] used these scores to explore sex differences in DS and reported females performed significantly better than males on the coding subtest (part of the non-verbal IQ subscale).

Tests
Eleven studies using AB scales in DS were identified for inclusion in this review (see Table 2). A total of 848 participants took part in the studies, ranging from 16 to 71 years old. The only AB scales to be used by more than one study were the Vineland Adaptive Behaviour Scale (VABS) and the second edition of this scale (VABS-II).

Floor effects
Two studies reported floor effect data for the AB scale used. Using raw scores, Startin et al. [16] reported total SABS scores had no floor effects in any group of participants investigated (YA, OA-ND, OA-D). However, when split into its 3 subscales, small floor effects were found. In participants aged 36+ without dementia, floor effects were found in the personal self-sufficiency and community self-sufficiency domains (0.9% for both). For participants aged 36+ with dementia, floor effects were also found in the same two domains (2.3% for both). Kishnani et al. [29] reported no participants were at floor on the VABS. Participants in this study were aged 18-38 and did not have dementia.

Comparisons between AB scales
Two studies reporting age-equivalent scores from different versions of the VABS found similar mean age-equivalent scores. These were 8.5 years and 7.3 years for Witts and Elders [30] and Dressler et al. [22], respectively. Minimum age-equivalent scores between these two studies were also similar (3.1 years and 3.7 years); however, maximum scores differed (10.0 years and 18.5 years). Kishnani et al. [29] reported mean Composite Supplemental Norm Score from the VABS. The results of this study are therefore not comparable to those of Witts and Elders [30] and Dressler et al. [22]. The three studies using the VABS-II reported mean Adaptive Behavior Composite scores of 51.86 [28], 52.6 [15], and 183.67 [14]. It is likely the latter of these scores is greater because for this study participants were required to have a mental age of above 2.5 in addition to at least minimal verbal communication, whereas the former two studies had no such inclusion criteria.
Two identified studies reported raw scores of different versions of the ABAS. de Sola et al. [17] found an overall mean test score of 436 (91 SD) and a range from 220 to 627 using the ABAS-II. In contrast, Strydom et al. [23] reported a mean raw ABAS score of 377 (140 SD) and a range of 98-589 (for further details see Table 2).

Comparisons between IQ tests and AB scales
de Sola et al. [17] analysed the association between IQ and AB using standardised KBIT IQ scores and raw ABAS-II scores. A significant difference was found between participants with an IQ above and below 40 for most functional skill areas assessed by the ABAS-II, in addition to ABAS-II total score (mean group difference for total ABAS-II score 63.4; p = 0.001). This suggests that participants with DS with a higher IQ may have greater competence in daily living and also demonstrates a potential relationship between IQ and AB scales in adults with DS. AB scales may also correlate with performance on other tests of IQ. Using raw scores of the PCFT and the ABS, Kay et al. [19] noted a highly significant correlation between these two tests (r = 0.87; p < 0.001). This study provides further evidence that AB and IQ may be related in adults with DS.
In contrast to the findings of these two studies, Dressler et al. [22] found no association between AB and IQ. In this study, IQ tests (either RCPM or the Leiter-R) were used to classify participants by level of ID (mild, moderate, or severe), and VABS scores (Italian version) were compared between groups. No statistically significant differences in VABS scores were observed between groups. However, it is of note that VABS raw scores were not used in this analysis. Instead, VABS raw scores were categorised on an individual basis as above average, average, or below average, relative to mean VABS score for each group. It is possible this approach to analysis prevented the detection of a significant difference in AB scores between groups.
Table 2 Summary of studies using adaptive ability tests in adults with DS. NR indicates "not reported." IQ test results not shown (see Table 1). *Only T1 data used in this review. **Only baseline data used in this review

Discussion
We aimed to provide a systematic review of the literature regarding tests of IQ and AB used in adults with DS, in order to make recommendations regarding the use of such tests with this population. Overall a wide variety of different IQ tests and AB scales were identified, with a wide range of differing score types provided (including raw scores, age-equivalent scores, full IQ scores, verbal IQ scores, and non-verbal/performance IQ scores). Studies largely differed in criteria for participant inclusion (e.g. only those able to complete tests) and in the reporting of test results by sub-populations. There was also little overlap in the tests used between studies. Together, these factors make the comparison of tests between studies problematic. Where reported, floor effects for IQ tests were particularly high for standardised test scores. Floor effects for raw total BPVS-II scores and raw WISC-R sub-scores were moderate (around 9%) and high (around 50%), respectively. In contrast, floor effects reported for KBIT-2 raw scores were minimal. Floor effects for verbal raw KBIT-2 scores were particularly low (including for participants with dementia). The number of participants at floor for AB scales was only reported by one study [16]. This study found no floor effects using total raw SABS scores (including for individuals with dementia); however, when subdomains of this test were examined, small floor effects were seen for two out of three subdomains. Further, although Kay et al. [19] did not explicitly report floor effects using the ABS, the authors noted that floor effects were less marked on this scale compared to the IQ test used in the study (the PCFT). Together, these findings indicate that raw KBIT-2 scores and raw AB scores may be particularly suited to tracking longitudinal change in adults with DS, due to minimal floor effects on these measures prior to the onset of cognitive decline.
Although it appears that raw scores may benefit from reduced floor effects compared to standardised scores, it should be noted that the use of raw scores has various limitations. These include the inability to directly compare level of functioning to that of the TD population (in contrast to the use of standardised or age-equivalent IQ scores, through which this is inherently possible). Additionally, the clinical significance of differences in raw score values both between and within individuals over time has not yet been established.
In some studies, child versions of IQ tests (e.g. WISC instead of WAIS) have been used in adults with DS [21,24,25]. While this might limit floor effects and should therefore be more sensitive to differences in performance, age-adjusted IQ norms are only available for children, and therefore only age-equivalent or raw scores can be used in adults. Age appropriateness could also be an issue. The generalisability of IQ tests and AB scales in general is an important issue that warrants further investigation. Specifically, the tests identified here were developed in Western populations, and most were developed for use in TD individuals.
Many IQ tests identified in this review are dependent on language. Significant relative weaknesses in language are a characteristic feature of the cognitive profile for individuals with DS [38,39] (see review by Silverman [6]). The use of language-based IQ tests in this population is therefore problematic as specific deficits in language may mask the true level of individuals' general ability and skew group test results. Accordingly, some studies identified in this review excluded participants without sufficient verbal skills. Non-verbal/performance subscales on IQ tests are less likely to be substantially influenced by language and so may be more appropriate for use in this population. However, studies utilising these subscales have reported higher floor effects compared to verbal subscales [11,16]. It is also worthwhile noting that IQ tests with language as an integral component require substantial translation and subsequent revalidation for use in different language-speaking populations. For larger international studies, translation into different languages is a particular barrier, and so non-verbal/performance tests may be preferable to verbal tests. Future research could explore the use of simple non-verbal/performance tests that could be used in people with DS with lower floor effects, though it will need to be established if this would over-estimate IQ.
In this review, "floor" refers to the lowest possible score obtainable on a particular test. However, it may be more appropriate to discuss floor effects in reference to the lowest score below which a decline of significance cannot be detected, for example, two standard errors of the mean (SEM) above the lowest score. Floor effects discussed in this review may therefore be underestimated.
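The difference between the two definitions of floor can be sketched in a few lines. This is an illustration under stated assumptions rather than a method used by the review: the raw scores and the SEM value are hypothetical, the function and parameter names are ours, and counting scores at or below the threshold (rather than strictly below it) is a choice the text does not specify.

```python
def percent_at_floor(scores, min_score, sem=0.0, n_sem=0):
    """Percentage of scores at or below min_score + n_sem * sem.

    With n_sem=0 this is the strict definition (lowest obtainable score);
    with n_sem=2 it is the broader threshold of two SEM above that score.
    """
    threshold = min_score + n_sem * sem
    at_floor = sum(score <= threshold for score in scores)
    return 100.0 * at_floor / len(scores)

# Hypothetical raw scores on a test whose lowest obtainable score is 0,
# with an assumed SEM of 1.5 raw-score points (both illustrative only).
raw_scores = [0, 0, 1, 2, 3, 5, 8, 12, 15, 20]

strict = percent_at_floor(raw_scores, min_score=0)
broad = percent_at_floor(raw_scores, min_score=0, sem=1.5, n_sem=2)
print(strict, broad)  # 20.0 50.0
```

In this toy sample the broader definition more than doubles the proportion of participants counted as being at floor, which shows the direction in which the floor effects summarised in this review may be underestimated.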
It is likely other IQ tests exist that may be suitable for adults with DS but were not utilised by any studies identified in this review. In particular, d'Ardhuy et al. [10] suggested the Leiter-III may be a more appropriate standardised IQ test for use in people with DS compared to the Leiter-R. This is based on a clinical trial of 180 individuals with DS (ClinicalTrials.gov identifier NCT01920633) which reported a floor effect of only 1% with this IQ test [10]. Furthermore, the potential utility of the Stanford-Binet IQ test in individuals with DS has been highlighted by other studies that did not meet inclusion criteria of this review. For example, Silverman et al. [40] demonstrated a strong linear correlation between IQ score measured on the Stanford-Binet and the WAIS (r = .818) in individuals with an ID (70.3% with DS), confirming that the two scales measured the same underlying construct(s). However, IQ estimates using the WAIS were consistently higher in this study, and more than 85% of individuals with DS had IQ scores that were more than 10 points higher on the WAIS compared to the Stanford-Binet, indicating direct comparison of standardised IQ scores between these two tests requires further validation. Future research should further explore the use of these IQ tests in adults with DS.
With regards to AB measures, three measures were commonly used: the Vineland Adaptive Behavior Scales, the Adaptive Behavior Scale (ABS, including its short form), and the Adaptive Behaviour Assessment System (2nd edition; ABAS-II). These measures did not have significant floor effects and in two identified studies showed a correlation with IQ test scores [17,19]. This suggests that AB measures are a useful addition to research studies of cognitive abilities in individuals with DS alongside IQ testing and may allow for an assessment of general ability in individuals with DS who cannot engage with IQ tests or who are at floor for IQ tests.
AB measures may represent a broader construct compared to IQ and are likely to be influenced by an individual's physical abilities as well as their training and support to maintain independence. Nevertheless, the studies reviewed here demonstrated that AB measures can be useful in tracking change in general abilities over time, and showed significant differences in scores between groups defined by age or dementia status. Further research is required to demonstrate the relationship between different subscales of AB measures such as the VABS and IQ scores, and between different AB scales. The particular strengths and weaknesses of AB domains in DS should be established, and the development of shorter versions of AB measures will be desirable.

Conclusions
Recommendations following this review have been summarised in Table 3. The main recommendations are that the use of raw scores for certain IQ tests such as the KBIT-2 can minimise floor effects and may therefore be particularly useful in longitudinal studies, though it must be acknowledged that the significance of changes in raw scores is currently uncertain. The use of more common IQ tests (e.g. KBIT, BPVS, WISC-R, RCPM) and AB tests (e.g. VABS, ABS, ABAS) should be encouraged more broadly in both research and clinical settings while the use of non-verbal/performance IQ tests may be preferable in multi-site international studies involving populations speaking different languages. Finally, studies may benefit from the use of both IQ and AB scales, particularly if participants include individuals with a broad range of abilities.
It is also apparent from this review that there is likely a wealth of raw IQ and AB test data that has not been included in the studies identified here. Furthermore, it is apparent that a potential limitation of the current research field is that many studies do not exclude (or analyse separately) individuals with cognitive decline or dementia, or individuals with a non-trisomy 21 form of DS. The research community may therefore benefit from an effort to share such data in order to make full and valid comparisons between scales and between different subpopulations of individuals with DS. Such information is likely to be of benefit to both clinicians and researchers.
Recommendations for individual studies of adults with DS
1. The use of raw scores for certain IQ tests, particularly the KBIT-2, can minimise floor effects and may therefore be particularly useful in longitudinal studies to track change in cognitive ability over time.
2. Non-verbal/performance IQ tests may be useful in multi-site international studies involving populations speaking different languages.
3. The use of more common IQ tests (e.g. KBIT, BPVS, WISC-R, RCPM) and AB tests (e.g. VABS, VABS-II, ABS, ABAS) should be encouraged more broadly in both research and clinical settings. Wider use of a shared set of tests would make results easier to compare between studies.
4. Studies may benefit from the use of both IQ and AB scales, particularly if participants include individuals with a broad range of abilities.
Recommendations for the DS research community
1. The development of reporting standards would increase the ability of different study findings to be compared, for example reporting both raw and standardised scores, full floor effects, and separately reported results for individual DS subpopulations.
2. Sharing of data from published studies would allow comprehensive comparison between different IQ tests and between different AB tests, in addition to correlations between these two measures for different DS subpopulations.