Phenotypic characteristics and variability in CHARGE syndrome: a PRISMA compliant systematic review and meta-analysis

CHARGE syndrome (OMIM #214800) is a phenotypically complex genetic condition characterised by multi-system, multi-sensory impairments. Behavioural, psychological, cognitive and sleep difficulties are not well delineated and are likely associated with biopsychosocial factors. This meta-analysis investigated the prevalence of clinical features, physical characteristics and conditions, behavioural, psychological, cognitive and sleep characteristics in CHARGE syndrome, and statistically evaluated directional associations between these characteristics. Pooled prevalence estimates were calculated using reliable, prespecified quality weighting criteria, and meta-regression was conducted to identify associations between characteristics. Of the 42 eligible studies, data could be extracted for 1675 participants. Prevalence estimates were highest for developmental delay (84%), intellectual disability (64%), aggressive behaviour (48%), self-injurious behaviour (44%) and sleep difficulties (45%). Meta-regression indicated significant associations between intellectual disability and choanal atresia, intellectual disability and inner ear anomalies, sleep difficulties and growth deficiency, and sleep difficulties and gross motor difficulties. Our comprehensive review of clinical features, behavioural, psychological, cognitive and physical characteristics, conditions and comorbidities in CHARGE syndrome provides an empirically based foundation to further research and practice.


Thomas et al. Journal of Neurodevelopmental Disorders (2022) 14:49. Open Access. *Correspondence: att644@student.bham.ac.uk

Background
CHARGE syndrome (CS) is a highly variable multisystemic condition, with an estimated prevalence of 1 in 8500 live births [1]. The acronym refers to the prominent congenital malformations first used to delineate the syndrome: Coloboma, Heart defects, Atresia choanae, Retardation of growth and development, Genital abnormalities and Ear anomalies [2] (see Table 1).
Heterozygous variants in the chromodomain helicase DNA binding protein 7 (CHD7) cause CS [6]. Mechanistically, CHD7 is essential for the differentiation of gene expression at thousands of sites in the human genome [7]. The prevailing hypothesis is that the dynamic role of CHD7 during gene expression and neural crest development may account for the pleiotropic signs and symptoms of CS [7]. Prospective investigation of genotype-phenotype correlations has been performed [8][9][10] with an association between truncating CHD7 variants and more severe heart defects being identified [10]. However, given the rarity of CS and the spectrum of clinical findings, better delineation of genotype-phenotype associations requires pooling of data across data sets [11].
CS is associated with many disparate physical conditions requiring health monitoring throughout life [12]. Trider et al. [12] developed a comprehensive checklist for proactive monitoring of common or critical physical conditions and characteristics. These conditions will likely have a deleterious impact on emotional and psychological wellbeing. Identifying and understanding these impacts can help build resilience and early support strategies utilising multidisciplinary practices.
While physical health in CS has been extensively documented [12], research on development and behaviour is sparse. Developmental delay (DD) and intellectual disability (ID) have received the most attention and feature in all diagnostic algorithms (Table 1). Direct cognitive assessments are rarely appropriate as performance requires adequate communication and minimal sensory impairment [13]. Consequently, ID is often based on informant measures of adaptive behaviour that might not correlate well with cognitive performance [14][15][16].
Moreover, sleep problems, anxiety, emotional dysregulation, aggression, self-injurious behaviour and tactile defensiveness are reported in adolescents and adults with CS [17]. Psychiatric diagnoses in children and adults include anxiety, obsessive-compulsive disorder, attention deficit disorder, and autism [17][18][19]. Data reporting cognitive, behavioural, and psychiatric features in CS warrant synthesis to definitively describe the behavioural phenotype in the condition.
Diagnostic criteria have been revised several times to accommodate new insights (e.g. [3][4][5]; see Table 1). Before the identification of the molecular etiology of CS in 2004, individuals were diagnosed solely based on clinical characteristics. Around 90% of individuals who meet clinical criteria for CS have an identifiable CHD7 variant [7]. However, there remains substantial heterogeneity in phenotypic presentation and CHD7 variants. A meta-analytic strategy would be informative to generate pooled prevalence estimates for clinical features based on a diagnosis of CHARGE syndrome (see Table 1). This would enable quantification of phenotypic characteristics and variability between individuals and further evaluation of moderating and co-occurring characteristics to assist in the exploration of potential subgroups within the clinically diagnosed CHARGE syndrome phenotype.

Table 1 Diagnostic criteria for CHARGE syndrome
Rhombencephalic dysfunction; Hypothalamo-hypophyseal dysfunction; Abnormal external or internal ear; Malformation of mediastinal organs; Mental retardation a
Cranial nerve dysfunction (including hearing loss); Dysphagia or feeding difficulties; Structural brain anomalies; Developmental delay, intellectual disability, or autism; Hypothalamo-hypophyseal dysfunction, genital anomalies; Heart or oesophageal malformation; Renal anomalies; skeletal or limb anomalies

Occasional findings
Renal anomalies; Spinal anomalies; Hand anomalies; Neck/shoulder anomalies

Inclusion rule
Four criteria, including one major criterion
Four major criteria, or three major and three minor criteria
Typical CHARGE: Three major, or two major and two minor criteria. Partial CHARGE: Two major and one minor criteria. Atypical CHARGE: Two major but no minor, or one major and two minor criteria
Two major and any minor criteria

a "Mental retardation" is an archaism superseded by DSM-5 intellectual disability/intellectual developmental disorder or ICD-11 disorders of intellectual development
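In this study, we apply meta-analytic techniques to synthesise prevalence estimates across published studies. Given the challenges of assessing behavioural, cognitive and sleep characteristics in CS and potential for varying methodological quality, studies are quality weighted prior to meta-analysis. Pooled prevalence estimates facilitate subgroup meta-regression analyses to elucidate and quantify interrelated characteristics. The aims of this study are to: (1) estimate the prevalence of clinical features of CS; (2) estimate the prevalence of physical characteristics and conditions; (3) generate quality-weighted prevalence estimates for behavioural, psychological, cognitive and sleep characteristics; (4) explore co-occurring characteristics; and (5) evaluate evidence for genotype-phenotype associations.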

Comprehensive search strategy
The reporting of this systematic review was guided by the standards of the Preferred Reporting Items for Systematic Review and Meta-Analysis (PRISMA) [20] (see Appendix 1 of Supplementary materials 1; S1 [21]). The databases PubMed, Ovid MEDLINE, PsycINFO and Embase were searched from inception until January 12, 2021, using search terms for CS generated from OMIM. Search terms included MeSH terms and "All Fields" advanced searches for: CHARGE syndrome; CHARGE association; coloboma, heart anomaly, choanal atresia, genital anomalies and ear anomalies; Hall Hittner syndrome; CHD7; and SEMA3A. Details of search syntax are available in Appendix 2 (S1 [21]). Manual searches of reference lists from recent review articles [12,22,23], gene review knowledge bases (GeneReviews®, UniProtKB) and contents pages of key journals (American Journal of Medical Genetics Part A (1979-2021), B (2003-2021) and C (2003-2021)) were also conducted to facilitate a comprehensive investigation. Details of manual searches are available in Appendix 3 (S1 [21]).

Selection criteria
Study selection was completed by the first author. Inclusion criteria permitted any peer-reviewed study reporting on the prevalence of behavioural, psychological, cognitive or sleep characteristics in a sample of participants with a clinical diagnosis of CS. Studies with fewer than five participants and case-series reports were excluded (details are available in Appendix 4, S1 [21]).

Data extraction
The first author independently extracted all data. Participant-level data were extracted for year of publication, recruitment of sample and sample size, participant age and gender, clinical features, CHD7 status and classification of CHD7 variant, enduring or recurrent physical characteristics and conditions, and behavioural, psychological, cognitive and sleep characteristics.

Quality appraisal
The quality framework used (see Table 2) was adapted from Richards et al. [24] and Surtees et al. [25] to control for the risk of methodological bias between individual studies included in the meta-analysis. Good inter-rater reliability was obtained for the quality framework, using a 25% random sample of the eligible studies. Details are available in Appendix 5 (S1 [21]). In summary, scores ranging from 1 (poor) to 4 (excellent) were awarded based on sample identification, confirmation of syndrome and assessment of behaviour, cognition or sleep.

Data synthesis
The effect size index for meta-analysis was derived from raw proportions and corresponding standard errors. The raw proportion (PR) is given by
$$PR = \frac{\text{Exp.case}}{\text{Exp.sample}}$$
where Exp.case is the number of individuals experiencing the characteristic of interest, and Exp.sample is the number of individuals sampled. The standard error (SE) of the raw proportion is given by:
$$SE = \sqrt{\frac{PR\,(1 - PR)}{\text{Exp.sample}}}$$
Given the anticipated small sample sizes typical of rare syndrome research, a pragmatic decision was made to exclude studies with fewer than five participants or an effect size of zero, under the assumption that the sample size would not afford accurate estimation of the true event rate.
Where multiple measures of the same construct were reported across multiple subgroups, data were combined into one quantitative outcome. When computationally appropriate (i.e. where five or more study effects could be synthesised) characteristics were subdivided.
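The effect size calculation described above can be sketched in a few lines. The authors' analysis was run in R; the Python below is only an illustrative sketch, and the function names `raw_proportion` and `is_eligible` are hypothetical, not taken from the supplementary code. It assumes the standard binomial standard error for a raw proportion.

```python
import math
from typing import Tuple


def raw_proportion(exp_case: int, exp_sample: int) -> Tuple[float, float]:
    """Return (PR, SE) for one study.

    exp_case   -- number of individuals with the characteristic (Exp.case)
    exp_sample -- number of individuals sampled (Exp.sample)
    """
    if exp_sample <= 0 or not 0 <= exp_case <= exp_sample:
        raise ValueError("need 0 <= exp_case <= exp_sample and exp_sample > 0")
    pr = exp_case / exp_sample
    # Assumed binomial standard error of a raw proportion.
    se = math.sqrt(pr * (1.0 - pr) / exp_sample)
    return pr, se


def is_eligible(exp_case: int, exp_sample: int, min_n: int = 5) -> bool:
    """Exclusion rule from the text: drop studies with fewer than five
    participants or an effect size of zero."""
    return exp_sample >= min_n and exp_case > 0


if __name__ == "__main__":
    # Hypothetical study: 9 of 20 participants show the characteristic.
    pr, se = raw_proportion(9, 20)
    print(f"PR = {pr:.3f}, SE = {se:.3f}")
```

Note that a study reporting a proportion of 100% yields SE = 0 under this formula, so any inverse-variance weighting built on it would need to handle that case separately.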

Meta-analysis
Analyses were conducted in R version 3.6.2. R code is available in Supplementary material 2 [21]. Pooled prevalence estimates and 95% confidence intervals (CIs) were calculated using the inverse variance method [32] assuming a random-effects model (REM). The assumption is that the synthesised studies vary randomly under a common distribution and the REM estimates the mean of the assumed distribution [32,33]. Quantile-quantile (QQ) plots were inspected to assess the distribution of study effects for each REM. Where study effect sizes followed an approximately normal distribution, the DerSimonian-Laird (DL) estimator [33] was used to calculate the between-study variance (tau²). Where QQ plots suggested a non-Gaussian distribution, the restricted maximum-likelihood (ReML) estimator was used. ReML avoids over-fitting, providing an efficient estimator of tau² when effects are not normally distributed [34].
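As a rough illustration of inverse-variance pooling under a random-effects model with the DerSimonian-Laird estimator, the sketch below works directly on proportions and their standard errors. It is written in Python rather than R, is not the code in Supplementary material 2, and makes several assumptions: the function name `dersimonian_laird` is hypothetical, the 95% CI is a simple Wald interval using 1.96, and the quality weighting described in this paper is not included.

```python
import math
from typing import Dict, Sequence


def dersimonian_laird(effects: Sequence[float], ses: Sequence[float]) -> Dict[str, float]:
    """Random-effects pooled estimate using the DerSimonian-Laird tau^2."""
    k = len(effects)
    if k < 2 or k != len(ses):
        raise ValueError("need at least two studies with matching SEs")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in ses]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the DL moment estimate of between-study variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))

    return {
        "pooled": pooled,
        "se": se_pooled,
        "tau2": tau2,
        "ci_low": pooled - 1.96 * se_pooled,
        "ci_high": pooled + 1.96 * se_pooled,
    }


if __name__ == "__main__":
    # Hypothetical study proportions and standard errors.
    result = dersimonian_laird([0.45, 0.60, 0.30, 0.52, 0.41],
                               [0.11, 0.08, 0.10, 0.06, 0.09])
    print(result)
```

A Wald interval on the raw proportion scale can extend below 0 or above 1 for extreme prevalences; transformed scales are a common alternative, but the source does not state that one was used.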
To address methodological differences, quality weightings were used to extend the random-effects model (QEM; quality weighted random effects model). REMs were fitted only where five or more study effects could be synthesised. This threshold is the minimum number of studies (k) that allows exact permutation testing to reach statistical significance (i.e., p ≤ .05). The test permutes the effect size outcome and calculates a (two-sided) p value which is equal to the proportion of times that the absolute value of the test statistic under the permuted data is as extreme or more extreme than the observed data [35]. Where REMs were not statistically significant (p > .05) following permutation testing, pooled prevalence estimates were reported using a fixed effect model (FEM). To prevent bias, no studies were included more than once in a single meta-analysis. Potential sources of heterogeneity were investigated using the I-squared (I²) [36] statistic. Values of the I² index of 25%, 50%, and 75% were considered respectively as low, medium and high degrees of heterogeneity. Sensitivity was evaluated using the funnel plot, Baujat plot [37], fail-safe N [38] and leave-one-out procedures, and the impact of varying methodological quality was further investigated through a series of subgroup analyses. Details are provided in Appendix 6 (S1 [21]).
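The heterogeneity and leave-one-out checks mentioned above can be illustrated with a short sketch. It is Python rather than the authors' R code, the function names are hypothetical, and it assumes the usual definition I² = max(0, (Q − df) / Q) × 100 with Q computed from inverse-variance (fixed-effect) weights. The leave-one-out helper simply recomputes the pooled estimate and I² with each study omitted in turn.

```python
from typing import List, Sequence, Tuple


def _pooled_and_q(effects: Sequence[float], ses: Sequence[float]) -> Tuple[float, float]:
    """Inverse-variance pooled estimate and Cochran's Q."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return pooled, q


def i_squared(effects: Sequence[float], ses: Sequence[float]) -> float:
    """I^2 as a percentage: share of variation attributed to heterogeneity."""
    k = len(effects)
    if k < 2 or k != len(ses):
        raise ValueError("need at least two studies with matching SEs")
    _, q = _pooled_and_q(effects, ses)
    if q <= 0:
        return 0.0
    return max(0.0, (q - (k - 1)) / q) * 100.0


def leave_one_out(effects: Sequence[float], ses: Sequence[float]) -> List[Tuple[int, float, float]]:
    """For each study index, return (index omitted, pooled estimate, I^2)."""
    effects, ses = list(effects), list(ses)
    if len(effects) < 3:
        raise ValueError("need at least three studies for leave-one-out")
    rows = []
    for i in range(len(effects)):
        e = effects[:i] + effects[i + 1:]
        s = ses[:i] + ses[i + 1:]
        pooled, _ = _pooled_and_q(e, s)
        rows.append((i, pooled, i_squared(e, s)))
    return rows


if __name__ == "__main__":
    # Hypothetical study proportions and standard errors.
    ys = [0.45, 0.60, 0.30, 0.52, 0.41]
    ss = [0.11, 0.08, 0.10, 0.06, 0.09]
    print(f"I^2 = {i_squared(ys, ss):.1f}%")
    for idx, pooled, i2 in leave_one_out(ys, ss):
        print(f"omit study {idx}: pooled = {pooled:.3f}, I^2 = {i2:.1f}%")
```

A large shift in the pooled estimate or in I² when a single study is omitted flags that study as influential, which is the purpose of the leave-one-out procedure.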
Rainforest plots were used to visualise statistically amalgamated studies. The rainforest plot is a variation of the traditional forest plot proposed by Schild and Voracek [39]. This alternative plot visually emphasises larger studies with short confidence intervals (CIs), so that small studies with wider CIs are less visually dominant. The rainforest plot therefore enhances the interpretability of the traditional forest plot.

Meta-regression
Associations were appraised systematically within and between behavioural, psychological, cognitive and sleep characteristics, and between these features and each clinical feature and physical characteristic and condition included in the meta-analysis. Meta-regressions were also conducted to explore genotype-phenotype correlations (associations between CHD7 positive status and characteristics included in the meta-analysis, and truncating CHD7 mutations and characteristics included in the meta-analysis). Meta-regressions were conducted when ≥ 5 study effects could be analysed. Due to the high number of regressions, the Benjamini-Hochberg adjustment for multiple comparisons was used in the first instance [40] followed by permutation testing for studies with a p value of 0.05 or above.
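The Benjamini-Hochberg adjustment referred to above controls the false discovery rate across a family of tests. The sketch below shows the standard step-up computation of adjusted p values in Python; it is not the authors' R code, the function name `benjamini_hochberg` is hypothetical, and the p values in the example are invented for illustration rather than taken from this study.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    if any(not 0.0 <= p <= 1.0 for p in p_values):
        raise ValueError("p values must lie in [0, 1]")

    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m

    # Walk from the largest p value down, enforcing monotonicity with a
    # running minimum of p * m / rank.
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical unadjusted p values from a set of meta-regressions.
    raw = [0.010, 0.040, 0.030, 0.005]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"p = {p:.3f} -> adjusted p = {q:.3f}")
```

An association is then treated as significant when its adjusted p value falls below the chosen threshold (0.05 in the text).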

Comprehensive literature search
A PRISMA flowchart summarising study selection is presented in Fig. 1. The search yielded 7761 citations and a further 29 studies were identified through manual searches. A total of 42 studies were eligible for meta-analysis. Table 3 presents the descriptive data and clinical features for CS extracted from eligible studies. In the total sample, 1556 participants were reported to have typical or atypical CS, and 362 diagnoses were confirmed genetically. Studies were published between 1979 and 2020. The mean age of participants was 9.5 years (range < 1 to 53 years) and 51% of participants were male.
The following studies had overlapping datasets: Smith et al. [67] and Issekutz et al. [1], Johansson et al. [58] and Strömland et al. [70], Hartshorne et al. [17] and Salem-Hartshorne and Jacob [64], and Wincent et al. [74] and Strömland et al. [70]. In these cases, earlier data sets were given precedence for meta-synthesis, with later studies contributing only original, previously unpublished data. Six participants from Davenport et al. [47] who had previously been reported as a familial case series were omitted. Four participants from Hale et al. [5], one participant from Jongmans et al. [9] and five participants from Wessels et al. [73] were excluded because they did not meet clinical criteria for CS.
Figure 2 presents the pooled prevalence estimates and 95% CIs for clinical features of CS drawn from the eligible studies. For comparison, these estimates are presented alongside the largest and most recent review of individuals with a clinical diagnosis by Hale et al. [5]. The prevalence of coloboma was higher and the prevalence of ear anomalies, anosmia, genital hypoplasia, facial clefts and tracheoesophageal fistula was lower in the present study than in Hale et al. [5]. Sufficient data were also available to calculate subcategories of coloboma, choanal atresia, heart defects and structural brain anomalies (see Appendix 7, S1 [21]).
[...] and a 32% (CI = 13-51%) prevalence of laryngeal anomalies. Sufficient data were available to explore types of skeletal anomalies: spinal anomalies (including scoliosis, 28% [CI = 19-36%]) and hand anomalies (15% [CI = 11-20%]) (see Appendix 8 and 10, S1 [21]).
Table 4 presents data on the study-level prevalence of behavioural, psychological, cognitive and sleep characteristics including: DD, ID, autism, aggression, self-injurious behaviour, obsessive or compulsive behaviour, tactile defensiveness and sleep problems.
Two studies used a whole population sample [1,14], with 25 studies (60%) using single restricted or non-random samples, and 15 studies (36%) using multiple restricted or non-random samples. Seven studies (17%) reported details of clinical diagnosis and genetic testing, with 3 studies (7%) confirming these findings at the time of data collection. Of the remaining 32 articles, 16 (38%) reported which clinical diagnosis participants were assessed against, and 16 (38%) did not. Assessment methods were typically 'poor' (64%), with 22% rated as 'adequate', 11% as 'good' and 4% as 'excellent'.

Behavioural, psychological, cognitive and sleep characteristics
We conducted quality weighted random effects meta-analyses for cognitive, behavioural, psychological and sleep characteristics. Results are summarised in Fig. 4 and detailed in Appendix 10 and 11 (S1 [21]). [...] identified for studies rated 'good' or 'excellent' and studies rated 'poor' or 'adequate' for confirmation of syndrome in the meta-analysis of ID. Full details are included in Appendix 10 (S1 [21]).

Meta-regression
Statistically significant associations included more ID and less choanal atresia (p = 0.014), more ID and fewer inner ear anomalies (p = 0.014), more sleep problems and more growth deficiency (p = 0.001) and more sleep difficulties and more gross motor difficulties (p = 0.033). Details are provided in Appendix 12 (S1 [21]). A summary of results is presented in Fig. 5.
A series of meta-regressions were calculated to evaluate evidence for genotype-phenotype associations. There were no statistically significant associations in this series of meta-regressions (details provided in Appendix 13, S1 [21]). Finally, to supplement our findings, we ran a meta-regression for each characteristic using year of publication as a moderator variable. Significantly more inner ear anomalies (p = 0.018) and atrial septal defects were reported over the years (details provided in Appendix 14).

Discussion
In this comprehensive systematic review and meta-analysis, results indicate that cognitive, behavioural, psychological and sleep difficulties are prevalent in CS. There is substantial variability in the quality of studies and significant differences between study estimates. These prevalence estimates have enabled investigation of relationships between characteristics, facilitating a comprehensive method for describing CS.

Clinical features
Pooled prevalence estimates of clinical features were largely consistent with previous reports by Hale et al. [5], with the greatest discrepancy being lower estimates for inner ear anomalies. The Hale et al. [5] estimates were drawn from studies published between 2005 and 2016, whereas our meta-analysis incorporated research from 1979 to 2020. As inner ear anomalies were not recognised as a clinical feature of CS until 2001 [75], it is possible that the prevalence estimate is conservative. However, there was no significant difference between prevalence rates reported before or after 2001 in our analysis (p = 0.111). Therefore, in line with the first aim of the study, we present our results as an up-to-date prevalence estimate of clinical features for future clinical and research practice.

Physical characteristics and conditions
Of the physical conditions identified in accordance with the second aim of the study, the highest prevalence estimates were for otitis media (74%) and gastroesophageal reflux (58%). There is an established causal link between otitis media and gastroesophageal reflux in typical development [76], but there were too few reports in the current review to assess this association. Gastroesophageal reflux is associated with failure to thrive and with significant mortality in young children with CS [1,44,77]. In Cornelia de Lange Syndrome, behavioural indicators of gastroesophageal reflux include night-time agitation, hyperactivity and self-injurious behaviour [78]. Research is required to prospectively identify specific behavioural markers of reflux in CS.
Micrognathia was the only feature of the characteristic CHARGE face described by Blake et al. [3] that was frequently reported as an independent observation. Similarly, almost one third of individuals with CS were estimated to have laryngeal anomalies. These presentations are worthy of consideration in clinical contexts as laryngeal anomalies and micrognathia are known to increase the burden of respiratory difficulties, and therefore sleep-disordered breathing, for example in 22q11.2 deletion [79], Treacher Collins and Nager syndromes [80].

Behavioural, psychological, cognitive and sleep characteristics
In addressing the third aim of the study, we calculated quality-adjusted prevalence estimates for behavioural, psychological, cognitive and sleep characteristics. The estimated prevalence rates for aggression and self-injurious behaviour are concerning, given the likely impact on parenting stress [16] and quality of life [17]. Once present, self-injury and aggression often persist [81,82]. Comparable incidence rates of self-injury and aggression have been reported in fragile X (51% and 52%) and Prader-Willi (52% and 43%) [83], and gaps between service need and service provision are reported, despite the availability of evidence-based treatment [84]. This is an area that requires careful monitoring in the CS community. Future research should aim to determine the intensity, frequency and duration of aggression and self-injurious behaviour, through comparison with different genetic syndromes that have shared characteristics. Such research provides the groundwork for tailored interventions based on the specific strengths and difficulties of the individual with CS.
Obsessive-compulsive behaviour was detailed in one of eight studies in which it was reported, despite these behaviours being described as a pervasive manifestation in CS [85]. Given the salience of obsessive-compulsive behaviour, a pragmatic decision was made to include both a clinical diagnosis of obsessive-compulsive disorder [18,52] and observations reported as obsessive-compulsive behaviour [1,17,42,48]. While estimates did not significantly differ between obsessive-compulsive behaviour and obsessive-compulsive disorder (p = 0.414), the quality of assessments was poor for all but one study and estimates ranged from 3 to 72%.
Study estimates for a clinical diagnosis of autism were variable, ranging from 6 to 50%, and this may reflect the range of assessment strategies. Four studies included in the meta-analysis assessed autism, each with a different measure or combination of measures. Autism could not be reliably assessed in 8-12% of participants in two of these studies due to severe sensory impairment and ID [67,70]. This is concerning, given the increased likelihood of autistic behaviour in this sub-group [58]. The evidence indicates a need for the development of assessments and interventions for autism in CS that are sensitive to the spectrum of reported autistic behaviours.
Given the detrimental effect of poor sleep on learning, behaviour regulation, physical, psychological, and social wellbeing [86], the estimated 45% prevalence of sleep difficulties in CS should not be overlooked. A more nuanced understanding of the characteristics and aetiology of sleep difficulties is required to develop proactive assessment and timely interventions.
The quality weighted pooled prevalence estimate for DD was 84%, with a 64% pooled estimate for ID and an estimated 28% of people with CS experiencing severe or profound ID. While prevalence estimates were characterised by wide CIs, they do suggest greater potential for cognitive development than has been described in previous reviews [8,22].

Table 4 (excerpt): Sleep problems, Hsu et al. [56]: 100% (20/20)

Exploration of co-occurring characteristics
A series of exploratory meta-regression analyses were conducted to explore co-occurring characteristics in accordance with the fourth aim of the study. Meta-regression analysis revealed associations between sleep problems and gross motor difficulties, and sleep problems and growth deficiency. The association between sleep problems and growth deficiency in CS is likely to be multifaceted. For example, growth can be limited by feeding difficulties and chronic illness that may cause pain or necessitate overnight monitoring, compromising sleep [12]. There is also an association between obstructive sleep apnoea and growth failure [87]. Where this condition is due to enlarged tonsils and adenoids, improvement in growth has been reported following adenotonsillectomy [87]. Growth hormone deficiency is also associated with CS [88] and monitoring is recommended as part of multidisciplinary care [12]. Disordered growth hormone secretion can be a consequence of disordered sleep because most growth hormone secretion is triggered by the onset of slow-wave sleep [89,90]. As such, pain and discomfort, obstructive sleep apnoea and a sleep-disorder-related growth hormone deficit are worthy of consideration in the workup and management of growth deficiency in CS. With reference to the association between sleep problems and gross motor difficulties, it is notable that sleep-disordered breathing has been shown to have a negative impact on motor development in Down syndrome [91]. Furthermore, children with more gross motor difficulties are likely to walk at a later age. A later age of walking in CS is associated with more 'challenging behaviour' [52], 'autistic behaviour' [53] and adaptive functioning limitations [64]. This evidence suggests that gross motor development could be a key intervention target for multidisciplinary assessment, including otolaryngology, occupational therapy and developmental paediatrics.
In summary, the bidirectional association between sleep and gross motor difficulties, potentially predicted by a later age of walking, warrants further investigation.
The relationship between ID and less choanal atresia seemed unlikely given links between early psychomotor delay and severe respiratory distress [61]. However, as reported in Tellier et al. [71], 48% of infants with bilateral choanal atresia died in the first year of life, before ID could be assessed. Therefore, the association between ID and choanal atresia may simply be an artefact of the data.

Exploration of genotype-phenotype associations
Consistent with previous reports [7], an estimated 84% of study participants who received genetic testing had an identifiable CHD7 variant. To address aim five, a series of meta-analyses and meta-regressions were run with these CHD7-positive participants to evaluate evidence for genotype-phenotype associations.
There is some evidence to suggest that truncating mutations are associated with a more severe CS phenotype [8,10]. We identified no such association. Based on our findings, and the available literature, we can make no inference about genetically mediated sub-groups within the clinically diagnosed CHARGE syndrome population. However, given the pleiotropic nature of CHD7, it is conceivable that the available data were not detailed enough to capture genotype-phenotype interactions. Further exploration of CHD7 function through gene expression studies may advance our understanding of genotype-phenotype associations and the pathogenesis of CHARGE syndrome.

Limitations
This study has some limitations. First, we excluded participants with CHD7 disorder who did not fulfil the clinical criteria for CS. We may therefore have excluded participants with milder CS phenotypes. Conversely, including CS participants for whom clinical features were not reported may have led to the inclusion of non-CS participants. As such, our findings should be treated as preliminary. However, there was no statistical difference between studies that did or did not detail the CS diagnosis. Second, synthesis of the CS literature was hampered by the large number of idiosyncratic descriptions used. For example, the 60% prevalence of 'increased levels of stress and anxiety' [49] and 35% incidence of 'often seemed anxious' [69] could not be reliably pooled with the 37% and 45% prevalence of anxiety diagnosis reported by Blake et al. [18] and Hartshorne et al. [17] respectively. Anxiety is a multifaceted construct that requires a fine-grained appraisal to facilitate meta-synthesis. Edwards et al. (unpublished results) have used such an approach and report a 37% (95% CI 10-64%; k = 2) quality weighted pooled prevalence estimate for anxiety in CHARGE syndrome. A statement on the use of specific, explicit, and appropriate definitions for behaviour in CS should be developed through multidisciplinary collaboration to enable data sharing and pooling. Availability of such data, particularly longitudinal data, would allow researchers to go beyond co-occurring characteristics to understand varying developmental trajectories. Lastly, meta-analytic estimates were limited by the paucity of available data and the wide CIs for the pooled prevalence estimates were not fully explained by meta-regression or subgroup analysis. As such, our findings and recommendations should be considered as preliminary.
Similarly, the use of univariate analysis to understand causal pathways in a heterogeneous syndrome such as CHARGE is less than adequate. However, multivariate analysis was precluded by the paucity of data. It is feasible that co-occurring characteristics may arise independently, and we emphasise that our findings and recommendations should be interpreted with caution.

Conclusion
Cognitive, behavioural, psychological and sleep difficulties are highly prevalent in CHARGE syndrome. Future research should address the conceptualisation and description of behaviour in CHARGE syndrome, the development of valid and reliable instruments for psychological diagnosis, and an observational checklist for behavioural signs of gastroesophageal reflux. Future research should use cross-syndrome comparison to understand the clinical presentation of CS. The data from this systematic review and meta-analysis support the ongoing efforts of family support groups, researchers, and practitioners to strengthen understanding and develop appropriate interventions and supports for individuals with CHARGE syndrome, their families and professionals involved in their care.
Additional file 2: R code used for data analysis.