Open Access

Novel method for combined linkage and genome-wide association analysis finds evidence of distinct genetic architecture for two subtypes of autism

  • Veronica J. Vieland (1),
  • Joachim Hallmayer (2),
  • Yungui Huang (1),
  • Alistair T. Pagnamenta (3),
  • Dalila Pinto (4),
  • Hameed Khan (4),
  • Anthony P. Monaco (3),
  • Andrew D. Paterson (4),
  • Stephen W. Scherer (4),
  • James S. Sutcliffe (5),
  • Peter Szatmari (6) and
  • The Autism Genome Project (AGP)
Journal of Neurodevelopmental Disorders 2011, 3:9072

https://doi.org/10.1007/s11689-011-9072-9

Received: 20 October 2010

Accepted: 4 January 2011

Published: 19 January 2011

Abstract

The Autism Genome Project has assembled two large datasets originally designed for linkage analysis and genome-wide association analysis, respectively: 1,069 multiplex families genotyped on the Affymetrix 10 K platform, and 1,129 autism trios genotyped on the Illumina 1 M platform. We set out to exploit this unique pair of resources by analyzing the combined data with a novel statistical method, based on the PPL statistical framework, simultaneously searching for linkage and association to loci involved in autism spectrum disorders (ASD). Our analysis also allowed for potential differences in genetic architecture for ASD in the presence or absence of lower IQ, an important clinical indicator of ASD subtypes. We found strong evidence of multiple linked loci; however, association evidence implicating specific genes was low even under the linkage peaks. Distinct loci were found in the lower IQ families, and these families showed stronger and more numerous linkage peaks, while the normal IQ group yielded the strongest association evidence. It appears that presence/absence of lower IQ (LIQ) demarcates more genetically homogeneous subgroups of ASD patients, with not just different sets of loci acting in the two groups, but possibly distinct genetic architecture between them, such that the LIQ group involves more major gene effects (amenable to linkage mapping), while the normal IQ group potentially involves more common alleles with lower penetrances. The possibility of distinct genetic architecture across subtypes of ASD has implications for further research and perhaps for research approaches to other complex disorders as well.

Keywords

Autism Linkage analysis Genome-wide association PPL PPLD IQ

Introduction

Autism spectrum disorders (ASDs) are heritable, genetically complex neurodevelopmental conditions. In this paper, we search for ASD genes through combined analysis of two datasets collected by the Autism Genome Project (AGP). The AGP previously reported linkage to 11p12–p13 along with notable copy number variations in the largest collection of multiplex ASD families analyzed to date (Szatmari et al. 2007; see also Liu et al. 2008), followed by copy-number variation (CNV) and association analysis in a large cohort of trios (Anney et al. 2010; Pinto et al. 2010). Here, we use a distinct statistical method, based on the PPL framework (Vieland 1998, 2006; Yang et al. 2005; Vieland et al. 2008; Wratten et al. 2009; Huang and Vieland 2010), to reconsider the combined multiplex and trio data sets.

The PPL statistical framework has three principal advantages in this context. (1) It handles genetic heterogeneity via “sequential updating” across data subsets (Vieland et al. 2001; Huang and Vieland 2001; Govil and Vieland 2008). The posterior evidence from previously analyzed data is carried forward as prior evidence as new data subsets are analyzed, with underlying genetic parameters (allele frequencies, penetrances, and levels of heterogeneity) allowed to vary between subsets. This can be a powerful method for discovering genetic signals arising from even very small subsets of the data, provided only that we have classification variables allowing the division of families (or cases) into relatively more homogeneous groups (Govil and Vieland 2008). (2) The PPL accumulates evidence against genetic hypotheses as well as in favor of them. Thus inspection of subset-specific contributions to the omnibus signal can distinguish among subsets that are supporting the hypothesis and subsets that are actually contributing evidence against it. (3) The PPL permits analysis of multiplex families and trios in a unified manner. The multiplex families provide linkage information, while the trios provide information on allelic associations. Here, we introduce a novel method for genome-wide analysis based on simultaneous use of linkage and association information from two different sets of data.

It is widely accepted that ASD is genetically heterogeneous, but less clear whether clinical features can be used to demarcate more homogeneous subclasses. Familial concordances for specific ASD symptoms are not strong, and there is generally high intrafamilial variability. However, familiality for nonverbal IQ has been reported in several studies (Le Couteur et al. 1996; Silverman et al. 2002; Szatmari et al. 2008; MacLean et al. 1999). This is in line with more recent family and twin studies suggesting that IQ is the most heritable component of the ASD phenotype (Szatmari et al. 2008). Furthermore, subgrouping ASD patients on the basis of IQ has provided the most consistent method for distinguishing patients on a number of dimensions. At the lower end of the IQ range, there is considerable overlap between autistic features and chromosomal syndromes (Xu et al. 2004); epilepsy is more prevalent, and the ratio of females to males approaches unity in contrast to the preponderance of males among higher IQ cases (Amiet et al. 2008). Moreover, there is compelling emerging evidence of considerable etiologic overlap between the clinical classification of intellectual disability, various mental retardation syndromes, and ASD in terms of rare de novo and inherited CNVs (Guilmatre et al. 2009; Bijlsma et al. 2009; Marshall et al. 2008). Indeed, the distinction of “high” and “low” functioning autism is often based on IQ, and indicates groups that differ with respect to associated brain dysfunction, outcome, and response to treatment (Lotspeich et al. 2004; Allen et al. 2001; Stevens et al. 2000).

Here, we accumulate the total, or “omnibus,” evidence across subsets of families characterized by the presence or absence of lower IQ autistic individuals while allowing for the fact that the subsets may differ substantially from one another. We find compelling evidence that, indeed, the lower IQ group appears to be genetically distinct.

Methods

Participants

Multiplex families (N = 1,069), each containing at least 2 individuals diagnosed with autism by the Autism Diagnostic Interview (ADI; Le Couteur et al. 1989) and clinical best estimate, were contributed by 10 sites. (See Szatmari et al. (2007) for additional details. Note that some “sites”, or research groups, covered multiple data collection locations; however, sample sizes precluded further subdivision of the data.) IQ was recorded as a dichotomous trait. Families were classified as lower IQ (LIQ, N = 255) if at least one ASD individual had performance IQ ≤ 50 or was coded as “missing due to low functioning”; as normal IQ (NIQ, N = 580) if all ASD individuals had IQ > 50; and as missing IQ (MIQ, N = 234) if there were no lower IQ individuals and at least one affected individual missing IQ information.

Trios (N = 1,129) were contributed by 8 sites. (See Anney et al. (2010) for additional details. Again, some sites covered multiple data collection locations). Children met criteria for either autism or ASD based on ADI and ADOS criteria. A trio was classified as LIQ (N = 285) if the child had performance IQ ≤ 70 or was coded as “missing due to low functioning”, as NIQ (N = 394) if the child had IQ > 70, and as MIQ (N = 450) if the IQ information was missing. Changes in IQ classification by the AGP over time have led to the slight difference in IQ classification compared to the multiplex families. However, IQ is used only to subdivide the sample into relatively more homogeneous subsets, not as an outcome variable in its own right, and this is therefore unlikely to appreciably affect the results. Note too that the proportion of LIQ families is similar (25% vs. 24%) in the trios and multiplex families, respectively, suggesting that the change in cutoff might actually appropriately compensate for differences between the two datasets. Of the trios, 283 overlapped with the multiplex families, but only a single case from each overlapping family was used in the LD analyses; thus there is no overlap in the information extracted from the overlapping samples. All trios were of European ancestry (Anney et al. 2010). All sites had Institutional Review Board approval for this study, and the research was conducted in accordance with the World Medical Association Declaration of Helsinki (2000). Written informed consent was obtained from all subjects after the study had been fully explained.
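The classification rules above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the AGP's actual pipeline: the function name, the `LOW_FUNCTIONING` sentinel, and the representation of missing IQ as `None` are all assumptions made for the example.

```python
# Hypothetical sentinel for the code "missing due to low functioning".
LOW_FUNCTIONING = "missing_low_functioning"

# Cutoffs as stated in the text: performance IQ <= 50 for multiplex
# families, performance IQ <= 70 for trios.
MULTIPLEX_CUTOFF = 50
TRIO_CUTOFF = 70


def classify_iq_group(iq_values, cutoff):
    """Classify a family (or a trio's child) as LIQ, NIQ, or MIQ.

    iq_values holds one entry per affected individual: a number,
    None (IQ missing), or LOW_FUNCTIONING.
    """
    def is_low(value):
        if value == LOW_FUNCTIONING:
            return True
        return isinstance(value, (int, float)) and value <= cutoff

    # LIQ: at least one affected individual is lower IQ.
    if any(is_low(v) for v in iq_values):
        return "LIQ"
    # MIQ: no lower IQ individuals, but at least one missing value.
    if any(v is None for v in iq_values):
        return "MIQ"
    # NIQ: every affected individual has IQ above the cutoff.
    return "NIQ"


# A multiplex family with two affected siblings, and a single trio child.
print(classify_iq_group([45, 90], MULTIPLEX_CUTOFF))
print(classify_iq_group([60], TRIO_CUTOFF))
```

Note that the same IQ value (for example 60) falls in different groups depending on which cutoff applies, which is the difference in classification between the two datasets discussed above.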

Genotyping and data cleaning

Details of genotyping methods are given in Szatmari et al. (2007) and Anney et al. (2010). In preparation for linkage analysis, marker data were cleaned for family structure problems and Mendelian inconsistencies. Merlin (Abecasis et al. 2002) was run to detect and remove unlikely double recombinants, and to cluster any SNPs in LD groups. (Most parents were genotyped and LD in the marker map proved not to affect the results.) In preparation for LD analyses, marker data were additionally cleaned for marker missingness (>5%), sample missingness (>5%), and excess Mendel errors both by SNP and individual. Markers with minor allele frequency <1% were dropped, as were SNPs with a Hardy–Weinberg (HW) p value < 1 × 10^−10 in at least one data subset or HW p value < 0.05 in at least three subsets. After cleaning, 749,933 SNPs remained in the analyses.
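As a rough illustration of the SNP-level filters described above, the following Python sketch applies the stated thresholds to one marker. The function name and argument layout are hypothetical, and the Mendel-error and sample-missingness filters are left out because the text does not give enough detail to encode them.

```python
def keep_snp(missing_rate, maf, hw_pvalues):
    """Return True if a SNP passes the marker-level filters.

    missing_rate: fraction of missing genotypes for the SNP.
    maf: minor allele frequency.
    hw_pvalues: Hardy-Weinberg p values, one per data subset.
    """
    if missing_rate > 0.05:   # marker missingness > 5%
        return False
    if maf < 0.01:            # minor allele frequency < 1%
        return False
    # Extreme HW departure in at least one subset.
    if any(p < 1e-10 for p in hw_pvalues):
        return False
    # Nominal HW departure in at least three subsets.
    if sum(p < 0.05 for p in hw_pvalues) >= 3:
        return False
    return True


# A SNP with nominal HW departures in only two subsets is retained.
print(keep_snp(0.01, 0.20, [0.04, 0.03, 0.50, 0.80]))
```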

Statistical methods

All analyses were conducted using the software package Kelvin (Huang et al. 2006), which implements the PPL class of models for measuring the strength of genetic evidence (Vieland 1998, 2006). The two specific statistics employed were the PPL itself (posterior probability of linkage) and the posterior probability of trait-marker linkage disequilibrium (PPLD). Linkage analyses utilized LOD scores computed in Merlin (Abecasis et al. 2002; Lander and Green 1987) as input to PPL calculations (Vieland 1998). The genetic map is based on http://compgen.rutgers.edu/mapopmat (Matise et al. 2007; release 10/09/06).

The PPL as applied here is parameterized as a dichotomous trait model with parameters α (the admixture parameter of Smith (1963), representing the proportion of ‘linked’ pedigrees), p (the disease allele frequency), and the penetrance vector f_i, representing the probability that an individual with genotype i develops disease, for i = 1, 2, 3. All trait parameters are integrated out of the final statistic, using uniform prior distributions, implicitly allowing for dominant, recessive, and additive models along with intra-subset heterogeneity. This provides a robust approximation for mapping complex traits in terms of the marginal model at each locus, and because the parameters are integrated out, no specific assumptions regarding their values are required. The likelihood also contains two location parameters: the recombination fraction θ and the standardized LD parameter D′, representing trait–marker association due to physical proximity.

The PPL framework accumulates evidence across data subsets by integrating the trait parameters out of the likelihood separately for each subset, using Bayesian sequential updating to combine the marginal information regarding θ and D′ across subsets. This procedure allows for genetic differences among data subsets, and is far more robust in retaining true signals originating from individual subsamples than analyses that simply combine subsets for a single analysis (Vieland et al. 2001; Huang and Vieland 2001; Govil and Vieland 2008). Here, we have subdivided the data and sequentially updated across IQ groups. Because the AGP families have been contributed by multiple research groups, we also sequentially update over “site.” Sites can vary with respect to the populations from which they recruit, ascertainment strategies and criteria, and subtle differences in clinical evaluations; simple sampling variability can also lead to inter-site differences. While not usually considered as a separate source of variation in genetic studies, the importance of allowing for site effects has been long appreciated in other settings, such as clinical trials. After dividing by IQ and site, subset sizes ranged from N = 20–148 (mean = 62) for the multiplex families and N = 20–169 (mean = 71) for the trios.
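The logic of sequential updating can be illustrated with a deliberately simplified Python sketch. It is not Kelvin's implementation. It assumes that each subset has already been reduced to a marginal likelihood ratio on a small grid of recombination fractions θ < 0.5 (trait parameters integrated out separately within that subset), that the likelihood ratio is taken relative to θ = 0.5 (no linkage), and that the prior mass for linkage is spread uniformly over the grid. All numbers are invented for illustration.

```python
def sequential_ppl(subset_lrs, prior_linkage=0.02):
    """Update the probability of linkage across data subsets.

    subset_lrs: one list per subset, giving that subset's marginal
    likelihood ratio at each theta on a shared grid (theta < 0.5),
    relative to theta = 0.5.
    """
    n_grid = len(subset_lrs[0])
    # Prior: mass prior_linkage spread over the linked thetas,
    # the remainder at theta = 0.5 (no linkage).
    linked = [prior_linkage / n_grid] * n_grid
    unlinked = 1.0 - prior_linkage
    for lr in subset_lrs:
        # The posterior from the previous subsets is the prior here.
        # The LR at theta = 0.5 is 1 by construction, so the unlinked
        # mass is only rescaled by the normalization.
        linked = [w * r for w, r in zip(linked, lr)]
        total = sum(linked) + unlinked
        linked = [w / total for w in linked]
        unlinked = unlinked / total
    return sum(linked)


# Invented likelihood ratios: one subset supports linkage, one opposes it.
supporting = [8.0, 4.0, 2.0, 1.2]
against = [0.1, 0.3, 0.6, 0.9]

print(sequential_ppl([supporting]))
print(sequential_ppl([against]))
print(sequential_ppl([supporting, against]))
```

In this sketch a subset whose likelihood ratios are below 1 pulls the result under the 2% prior, which mirrors how individual subsets can contribute evidence against linkage, and the final value does not depend on the order in which subsets are processed.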

The PPL is on the probability scale, and its interpretation is therefore straightforward: e.g., PPL = 40% means that there is a 40% probability of a trait gene at the given location based on these data. Based on earlier calculations (Elston and Lange 1975), the prior probability at each location is set to 2%, so that PPLs > 2% indicate (some degree of) evidence in favor of a trait gene at that locus, while PPLs < 2% represent evidence against the location. The prior probability of LD given linkage (L) is also set to 0.02, so that in the absence of prior linkage information P(L&LD) = 0.0004 (see also Wellcome Trust Case Control Consortium 2007 for justification of a comparable figure).

Novel here is a mathematically rigorous method for using linkage information from the multiplex families to inform the association analyses, based on the fact that PPLD = PP(LD|L) × PPL (see Huang and Vieland 2010 for additional details). We interpolated the PPL results onto the physical map, and inserted the measured PPL into this equation. Thus, PPLs < 2% will depress PPLDs, and PPLs > 2% will increase PPLDs, by increasing the prior probability of LD under a linkage peak, up to a maximum of 2% prior probability of LD when PPL = 1 (see Roeder et al. (2007) for a related approach). This assumes that at least some ASD genes are etiologically relevant to both the multiplex and trio sets.
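The arithmetic behind this relationship is simple enough to write out. The Python sketch below only encodes the product PPLD = PP(LD|L) × PPL and the 2% priors stated in the text; the function names are made up for the example, and the PPL values passed in are illustrative inputs rather than a reanalysis.

```python
PRIOR_L = 0.02            # prior probability of linkage
PRIOR_LD_GIVEN_L = 0.02   # prior probability of LD given linkage


def prior_l_and_ld(ppl=PRIOR_L):
    """Prior probability of linkage and LD at a SNP, given the PPL there.

    With no linkage information, ppl defaults to the 2% prior.
    """
    return PRIOR_LD_GIVEN_L * ppl


def combine_ppld(pp_ld_given_l, ppl):
    """PPLD = PP(LD|L) x PPL, as stated in the text."""
    return pp_ld_given_l * ppl


# Prior with no linkage information, under a strong linkage peak,
# and at the theoretical maximum PPL = 1.
for ppl in (PRIOR_L, 0.87, 1.0):
    print(ppl, prior_l_and_ld(ppl))

# The same trio-based evidence PP(LD|L) counts for more under a peak.
print(combine_ppld(0.5, 0.01), combine_ppld(0.5, 0.87))
```

The design point is that the trio data determine PP(LD|L) while the multiplex families determine the PPL, so a SNP in a region with PPL below 2% starts from a lower prior than the default 0.0004, and a SNP under a linkage peak starts from a higher one, capped at 0.02.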

The PPL and PPLD are measures of statistical evidence, not decision-making procedures; therefore, there are no “significance levels” associated with them and they are not interpreted in terms of associated error probabilities (Royall 1997; Vieland and Hodge 1998). By the same token, no multiple testing corrections are applied to the PPL or PPLD, just as one would not “correct” a measure of the temperature made in one location for readings taken at different locations (Vieland 2006). Nevertheless, it may assist readers to have some sense of scale relative to more familiar frequentist test statistics. In simulations of 10,000 replicates of sets of 1,000 affected sib-pairs under the null hypothesis (no linkage), PPLs of 5%, 25%, and 80% were associated with type 1 error probabilities of 0.00128, 0.00002, and <0.00001, respectively. In 10,000 null (no linkage, no LD) replicates of sets of 1,000 trios, no PPLDs > 1% were observed, while PPLD > 0.1% occurred in just 0.04% of replicates. At a locus with PPL < 2%, this represents a PPLD < 0.1%; while at a locus with PPL = 80% this would still only correspond to a PPLD of 3.9%.

It is also of interest to consider “power” in the trio sample in particular. For relative risk (RR) of 1.3–1.7, our ability to detect association in regions lacking evidence of linkage is low; e.g., for RR = 1.7, PPLDs > 10% occur just 10% of the time. However, LD under linkage peaks is expected to be considerably stronger. For RR = 2.0, with PPL = 80%, 91% of PPLDs are > 29% and 59% of PPLDs are >82%; with RR = 2.5 99.6% of PPLDs are >82%. (Here we generated data with disease and marker minor allele frequencies of 0.1, varying D′ and the penetrances to achieve different RRs; actual power can obviously deviate from these results.) Thus, we are unable to draw definitive conclusions regarding absence of LD in unlinked regions of the genome based on the current sample size. However, the sample size appears adequate for detection of moderate allelic effects under linkage peaks.

Results

Omnibus linkage analysis

Figure 1a shows genome-wide PPL results for the omnibus (all groups) analysis. 92.6% of the genome showed evidence against linkage, 97.4% of the genome had PPLs < 5%, and 98.7% of PPLs were <10% (99.6% ignoring chromosome 11, which shows several broad peaks). Against this backdrop, several peaks stand out. Two peaks on chromosome 11 coincide with locations reported in the two previous AGP analyses of this dataset (PPL = 60%@11p13; PPL = 93%@11p15.2). Also noteworthy is the very high PPL = 87% on 16q21, as well as several other peaks including: 2p25 (PPL = 12%), 4q31 (PPL = 33%), 6q14 (PPL = 11%), 18q22 (PPL = 18%), and possibly two additional peaks on 11p15 and 11q14, which are more moderate in size although still salient relative to the background. We note that the detection of multiple loci in this dataset is attributable largely to the PPL’s use of sequential updating. For instance, if we simply “pool” all sites and IQ groups together for a single analysis, on 16q21, the PPL at the peak is just 4%, compared to 87% based on sequential updating.
Fig. 1

Genome-wide linkage analyses in (a) omnibus, (b) LIQ, (c) MIQ, and (d) NIQ groups. The PPL (posterior probability of linkage) represents the probability of an ASD gene at each position. The x-axis represents chromosomes 1–23 (X) on the Kosambi cM scale; the y-axis is on the probability scale. The horizontal line at PPL = 0.02 corresponds to the prior probability of linkage. Values below this line represent evidence against linkage, while values above the line represent evidence for linkage, at the given position

Linkage analysis by IQ group

Plotting the IQ groups separately (Fig. 1b–d), we see that the linkage plots suggest substantially different genetic profiles, with peaks occurring at different positions and more peaks in the LIQ group than the NIQ group. Notably, in several cases in which one IQ group gives evidence in favor of linkage, the other IQ group is actually giving evidence against linkage across the region. For example, the NIQ group gives PPL < 2% across the entire region surrounding the peak on 16q21 in which the LIQ group is giving evidence for linkage.

In this context, the MIQ group serves as a kind of control. Combining data from two genetically distinct groups in a single analysis tends to attenuate linkage signals (Govil and Vieland 2008). Thus, if the LIQ and NIQ groups differ in their underlying genetic etiology, then the MIQ group, presumably composed of a mixture of LIQ and NIQ families, should produce smaller linkage signals overall. On the other hand, if the appearance of two distinct genomic patterns comparing the LIQ and NIQ groups were the result of random variations rather than true genetic differences, the larger MIQ group would be expected to yield larger linkage peaks, perhaps in separate locations. The observed pattern in the MIQ group corroborates the interpretation of these graphs as indicating that IQ is indeed demarcating genetically different subsets of the data.

The linkage signals on 1q31.3, 13q22.1, 14q24.2, and 16q21 are clearly driven by one IQ group in particular (with the other giving evidence against linkage), and in three of the four cases it is the LIQ group that is driving the signal. The peaks on 11p13, 11p15.2 are more difficult to parse: on the one hand, the omnibus PPLs are higher than the PPLs from either the LIQ or the MIQ subset; on the other hand, Fig. 1 strongly suggests that there are multiple loci on this chromosome, and possibly distinct genes operating in the two IQ groups (see below), which is consistent with the absence of appreciable signals from the MIQ group. Note too the small but visible omnibus signal on 15q11.2 (PPL = 4%), which rises to PPL = 14% in the NIQ group. This signal is directly over the known Prader–Willi ASD locus (van der Zwaag et al. 2010; Vorstman et al. 2006).

Omnibus combined linkage and association analysis

Figure 2a shows omnibus PPLD results. Against a very clean background, two modest peaks stand out. These occur at rs11603469 (11p15.2, PPLD = 26%) and rs10221112 (16q21, PPLD = 15%). In both cases, surrounding SNPs are also giving PPLDs elevated above the baseline (prior) probability of LD. On 11p15.2, rs11603469 is one of a small cluster of SNPs showing some LD evidence and overlapping the gene FAR1 (rs11603469 itself is 10 kb from the FAR1 start site); on 16q21 the SNP falls 351 kb from the nearest annotated gene (GOT2). A third, smaller, LD signal stands out on 4q31.23 (PPLD = 6% at rs7668351, which falls in BC031092). In each case, these SNPs fall directly under corresponding linkage peaks (Fig. 3). It is noteworthy that in each case, multiple data subsets (sites) support LD, but also, multiple sites give evidence against LD, and some are merely neutral. In situations where allelic effects may vary across strata, pooling data across strata will tend to wash out these types of signals.
Fig. 2

Genome-wide combined linkage and association results from (a) omnibus, (b) LIQ, and (c) NIQ analyses. The PPLD (posterior probability of LD) represents the probability of allelic association with ASD due to LD for each SNP in turn, and utilizes both linkage information from the multiplex families and association information from the trios. The x-axis represents the physical map for chromosomes 1–23 (X); the y-axis is on the probability scale. An additional 151 markers from the pseudoautosomal region of X are not shown on the graph; none had PPLD exceeding the prior probability of LD

Fig. 3

Omnibus PPL and PPLD for chromosomes (a) 4, (b) 11, (c) 16. Units on the x-axis are in cM

Combined linkage and association analysis by IQ group

Because the linkage results strongly suggest distinct etiology in the LIQ and NIQ groups, it is also of interest to consider the two groups separately in the association analyses. As expected, different SNPs are salient in the two groups (Fig. 2b–c). In general, the LIQ plot is slightly noisier than the NIQ plot, with smaller maximum peak and more “chatter” at the bottom of the plot. In part, this is consistent with smaller sample size. However, “power” is not merely a function of sample size, but also of the underlying genetic model. The LIQ multiplex family dataset is also smaller than the multiplex NIQ dataset, yet the linkage signals are more numerous and higher in the LIQ group. The different pattern observed for the PPLD analyses may therefore be revealing real differences in the underlying genetic architecture, and not just reflecting relative sample sizes. We return to this point below.

Table 1 shows all PPLDs ≥ 10% from the separate LIQ and NIQ analyses. Compared with the omnibus results, on 11p15.2, the omnibus signal in FAR1 is driven by the NIQ group (maximum PPLD = 32%). On 16q21, the omnibus signal is driven by the LIQ group, which on its own gives a PPLD = 7%, bolstered by a small signal from the MIQ group (not shown); none of these SNPs falls in an annotated gene. Some additional signals also appear in the subgroup analyses that were not salient in the omnibus results (see Fig. 4; this figure also shows the distinct genetic linkage patterns on chromosome 11). On 8q21.12 (LIQ, not in an annotated gene), a pair of SNPs is showing evidence of LD in a region not showing evidence of linkage (the second SNP, rs7007634, has PPLD = 8%). Additional association signals from the separate analyses are found on 3p12.1 (NIQ) and Xq13.1 (NIQ, with no clear difference between males and females) and 16p13.2 (LIQ).
Table 1

All PPLDs ≥ 10% from the LIQ, NIQ analyses

| IQ group | Chromosome | SNP name | cM position | Physical position (a) | PPLD | Gene |
|---|---|---|---|---|---|---|
| NIQ | 11p15.2 | rs11603469 | 27.41 | 13636952 | 0.32 | 8 kb from FAR1 |
| NIQ | 11p15.2 | rs10500796 | 27.41 | 13655409 | 0.27 | FAR1 |
| LIQ | 8q21.12 | rs3885022 | 92.28 | 79890601 | 0.19 | 17.6 kb from IL7 |
| NIQ | Xq13.1 | rs12556351 | 59.34 | 71509736 | 0.12 | HDAC8 |
| LIQ | 16p13.2 | rs17722735 | 17.10 | 6660421 | 0.11 | A2BP1 |
| NIQ | 3p12.1 | rs9812103 | 117.58 | 86069410 | 0.1 | CADM2 |

(a) Physical positions are based on NCBI build 36.1.

Fig. 4

PPL and PPLD for LIQ, NIQ groups respectively, for chromosomes (a) 3, (b) 8, (c) 11, (d) 16, (e) X containing SNPs shown in Table 1. Units on the x-axis are in cM

Discussion

These analyses represent an examination of the AGP data from a unique statistical perspective. In contrast to the original analyses of the multiplex families (Szatmari et al. 2007), we have found multiple strong linkage signals. Disappointingly, however, PPLD analysis failed to find strong evidence of allelic effects under the linkage peaks, which would point us to the individual genes driving the linkage results. (We note, however, that follow-up molecular work focused on one of the linkage peaks has established a strong prima facie case for involvement of the gene CDH8 (Pagnamenta et al. 2011).) The apparent absence of allelic effects could reflect a genuine absence of LD under the peaks, or limitations in 1 M coverage of the peaks for LD mapping purposes. Another possibility is that there is sufficient heterogeneity that the trios are simply too dissimilar to the multiplex families to be informative at the same genes. The absence of dense SNP array data in the majority of the multiplex families makes direct evaluation of this possibility difficult.

It is also important to keep in mind that the trio sample is still relatively small, and in particular, the LIQ and NIQ groups individually may be too small to provide strong evidence on their own. The AGP is currently completing a second phase of trio data collection and genotyping, which will effectively double the sample size, and sequentially updating with the new dataset will provide better differentiation between SNPs truly supporting LD and SNPs with evidence against LD.

However, the overall pattern of results might reflect heterogeneity between the IQ groups rather than sample size. The linkage analysis finds more loci in the LIQ analyses than the NIQ analyses, despite the fact that the multiplex NIQ sample is 2.3 times the size of the multiplex LIQ sample; while in the LD analyses, where the sample sizes are better matched (the NIQ trio sample is just 1.4 times as large as the LIQ trio sample), the strongest signals are found in the NIQ group. Linkage analysis is powerful for identifying relatively major effects, that is, those in which mutations at a single locus greatly increase disease risk, even if only in a small subset of cases or against specific genetic and environmental backgrounds. Association analysis is particularly powerful for detecting alleles that individually confer small effects on disease risk, but do so in a relatively homogeneous manner across the study population. Thus, the two sets of results can be interpreted as telling a complementary story. The LIQ families may represent more strongly “genetic” forms of disease, in which a single gene or a small number of genes cause the disorder in any given individual, with sufficient overlap in causal genes across families to permit linkage mapping. The NIQ families, on the other hand, could involve more of a spectrum of conditions, possibly more highly influenced by the accumulation of variants in multiple genes each of smaller effect, or perhaps simply involving even higher levels of heterogeneity and/or many private mutations.

Of course until more data are available, this remains highly speculative. Further work to fully characterize the distinction between the LIQ and NIQ groups, combined with additional genetic analyses, will be needed to refine and test this hypothesis. But the results obtained thus far require us to at least consider the possibility that subtypes of autism have distinct genetic architectures. This means that no single study design or experimental approach is likely to be optimal for all subtypes, and that we must be prepared for disparate results across different types of studies, or across data sets comprising different mixtures of subtypes. This point almost certainly applies to other complex disorders as well.

Finally, it is interesting to note that the association signal on 16p13.2, which does not fall under a PPL linkage peak, does fall within a linkage interval previously reported in a subset of AGP families, using a very different approach to untangling clinical heterogeneity based on latent class modeling (Bureau et al. 2008). Thus, the PPLD may be indicating a true association, but at a locus that our linkage analysis lacked power to detect, given the particular phenotypic classifications used here.

Declarations

Acknowledgments

This work was funded in part by NIH grant MH086117 (VJV) and NS042165 to the Autism Genetics Collaborative. The authors gratefully acknowledge the families participating in the study and the main funders of the AGP: Autism Speaks (USA), the Health Research Board (HRB, Ireland), The Medical Research Council (MRC, UK), Genome Canada/Ontario Genomics Institute, and the Hilibrand Foundation (USA). Additional support for individual groups was provided by the US National Institutes of Health (NIH grants: HD055751, HD055782, HD055784, HD35465, MH52708, MH55284, MH061009, MH06359, MH066673, MH080647, MH081754, MH66766, NS026630, NS049261), the Canadian Institutes for Health Research (CIHR), AP-HP Autism Speaks UK, Canada Foundation for Innovation/Ontario Innovation Trust, Deutsche Forschungsgemeinschaft (grant: Po 255/17-4) (Germany), EC Sixth FP AUTISM MOLGEN, Fundação Calouste Gulbenkian (Portugal), Fondation de France, Fondation FondaMental (France), Fondation Orange (France), Fondation pour la Recherche Médicale (France), Fundação para a Ciência e Tecnologia (Portugal), GlaxoSmithKline-CIHR Pathfinder Chair (Canada), the Hospital for Sick Children Foundation and University of Toronto (Canada), INSERM (France), Institut Pasteur (France), the Italian Ministry of Health convention 181 of 19.10.2001, the John P Hussman Foundation (USA), McLaughlin Centre (Canada), Netherlands Organization for Scientific Research (Rubicon 825.06.031), Ontario Ministry of Research and Innovation (Canada), Royal Netherlands Academy of Arts and Sciences (TMF/DA/5801), the Seaver Foundation (USA), the Swedish Science Council, The Centre for Applied Genomics (Canada) and the Utah Autism Foundation (USA).

Conflict of interest

The authors have no conflicts of interest to report.

Open Access

This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

Authors’ Affiliations

(1)
Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children’s Hospital and The Ohio State University
(2)
Department of Psychiatry, Division of Child and Adolescent Psychiatry and Child Development, Stanford University School of Medicine
(3)
Wellcome Trust Centre for Human Genetics, University of Oxford
(4)
The Centre for Applied Genomics and Program in Genetics and Genomic Biology, The Hospital for Sick Children and Department of Molecular Genetics, University of Toronto
(5)
Department of Molecular Physiology and Biophysics, Vanderbilt Kennedy Center, and Centers for Human Genetics Research and Molecular Neuroscience, Vanderbilt University
(6)
Department of Psychiatry and Behavioural Neurosciences, McMaster University

References

  1. Abecasis GR, Cherny SS, Cookson WO, Cardon LR. Merlin-rapid analysis of dense genetic maps using sparse gene flow trees. Nat Genet. 2002;30(1):97–101.
  2. Allen DA, Steinberg M, Dunn M, Fein D, Feinstein C, Waterhouse L, et al. Autistic disorder versus other pervasive developmental disorders in young children: same or different? Eur Child Adolesc Psychiatry. 2001;10(1):67–78.
  3. Amiet C, Gourfinkel-An I, Bouzamondo A, Tordjman S, Baulac M, Lechat P, et al. Epilepsy in autism is associated with intellectual disability and gender: evidence from a meta-analysis. Biol Psychiatry. 2008;64(7):577–82.
  4. Anney R, Klei L, Pinto D, Regan R, Conroy J, Magalhaes TR, et al. A genome-wide scan for common alleles affecting risk for autism. Hum Mol Genet. 2010;19(20):4072–82.
  5. Bijlsma EK, Gijsbers AC, Schuurs-Hoeijmakers JH, van Haeringen A, van de Putte DE Fransen, Anderlid BM, et al. Extending the phenotype of recurrent rearrangements of 16p11.2: deletions in mentally retarded patients without autism and in normal individuals. Eur J Med Genet. 2009;52(2–3):77–87.
  6. Bureau A, Labbe A, Croteau J, Merette C. Using disease symptoms to improve detection of linkage under genetic heterogeneity. Genet Epidemiol. 2008;32(5):476–86.
  7. Elston RC, Lange K. The prior probability of autosomal linkage. Ann Hum Genet. 1975;38(3):341–50.
  8. Govil M, Vieland VJ. Practical considerations for dividing data into subsets prior to PPL analysis. Hum Hered. 2008;66:223–37. PMID: 18612207.
  9. Guilmatre A, Dubourg C, Mosca AL, Legallic S, Goldenberg A, Drouin-Garraud V, et al. Recurrent rearrangements in synaptic and neurodevelopmental genes and shared biologic pathways in schizophrenia, autism, and mental retardation. Arch Gen Psychiatry. 2009;66(9):947–56.
  10. Huang J, Vieland VJ. Comparison of ‘model-free’ and ‘model-based’ linkage statistics in the presence of locus heterogeneity: single data set and multiple data set applications. Hum Hered. 2001;51(4):217–25. PMID: 11287743.
  11. Huang Y, Vieland VJ. Association statistics under the PPL framework. Genet Epidemiol. 2010;34:835–45.
  12. Huang Y, Segre A, O’Connell J, Wang H, Vieland VJ. KELVIN: a 2nd generation distributed multiprocessor linkage and linkage disequilibrium analysis program. American Society of Human Genetics; 2006; ASHG 56th annual meeting.
  13. Lander ES, Green P. Construction of multilocus genetic linkage maps in humans. Proc Natl Acad Sci USA. 1987;84(8):2363–7.
  14. Le Couteur A, Rutter M, Lord C. Autism diagnostic interview: a semi-structure interview for parents and caregivers of autistic persons. J Autism Dev Disord. 1989;19:363–87.
  15. Le Couteur A, Bailey A, Goode S, Pickles A, Robertson S, Gottesman I, et al. A broader phenotype of autism: the clinical spectrum in twins. J Child Psychol Psychiatry Allied Discipl. 1996;37(7):785–801.
  16. Liu XQ, Paterson AD, Szatmari P, et al. Genome-wide linkage analyses of quantitative and categorical autism subphenotypes. Biol Psychiatry. 2008;64(7):561–70.
  17. Lotspeich LJ, Kwon H, Schumann CM, Fryer SL, Goodlin-Jones BL, Buonocore MH, et al. Investigation of neuroanatomical differences between autism and Asperger syndrome. Arch Gen Psychiatry. 2004;61(3):291–8.
  18. MacLean JE, Szatmari P, Jones MB, Bryson SE, Mahoney WJ, Bartolucci G, et al. Familial factors influence level of functioning in pervasive developmental disorder. J Am Acad Child Adolesc Psychiatry. 1999;38(6):746–53.
  19. Marshall CR, Noor A, Vincent JB, Lionel AC, Feuk L, Skaug J, et al. Structural variation of chromosomes in autism spectrum disorder. Am J Hum Genet. 2008;82:477–88.
  20. Matise TC, Chen F, Chen W, De La Vega FM, Hansen M, He C, et al. A second-generation combined linkage physical map of the human genome. Genome Res. 2007;17(12):1783–6.
  21. Pagnamenta A, Khan H, Walker S, Gerrelli D, Wing K, Bonaglia MC, et al. Rare familial 16q21 microdeletions under a linkage peak implicate cadherin 8 (CDH8) in susceptibility to autism and learning disability. J Med Genet. 2011;48:48–54.
  22. Pinto D, Pagnamenta AT, Klei L, Anney R, Merico D, Regan R, et al. Functional impact of global rare copy number variation in autism spectrum disorders. Nature. 2010;466(7304):368–72.
  23. Roeder K, Devlin B, Wasserman L. Improving power in genome-wide association studies: weights tip the scale. Genet Epidemiol. 2007;31(7):741–7.
  24. Royall R. Statistical evidence: a likelihood paradigm. London: Chapman & Hall; 1997.
  25. Silverman JM, Smith CJ, Schmeidler J, Hollander E, Lawlor BA, Fitzgerald M, et al. Symptom domains in autism and related conditions: evidence for familiality. Am J Med Genet. 2002;114(1):64–73.
  26. Smith CAB. Testing for heterogeneity of recombination fraction values in human genetics. Ann Hum Genet. 1963;27:175–82.
  27. Stevens MC, Fein DA, Dunn M, Allen D, Waterhouse LH, Feinstein C, et al. Subgroups of children with autism by cluster analysis: a longitudinal examination. J Am Acad Child Adolesc Psychiatry. 2000;39(3):346–52.
  28. Szatmari P, Paterson AD, Zwaigenbaum L, Roberts W, Brian J, Liu XQ, et al. Mapping autism risk loci using genetic linkage and chromosomal rearrangements. Nat Genet. 2007;39(3):319–28.
  29. Szatmari P, Merette C, Emond C, Zwaigenbaum L, Jones MB, Maziade M, et al. Decomposing the autism phenotype into familial dimensions. Am J Med Genet B Neuropsychiatr Genet. 2008;147B(1):3–9.
  30. van der Zwaag B, Staal WG, Hochstenbach R, Poot M, Spierenburg HA, de Jonge MV, et al. A co-segregating microduplication of chromosome 15q11.2 pinpoints two risk genes for autism spectrum disorder. Am J Med Genet B Neuropsychiatr Genet. 2010;153B(4):960–6.
  31. Vieland VJ. Bayesian linkage analysis, or: how I learned to stop worrying and love the posterior probability of linkage. Am J Hum Genet. 1998;63(4):947–54. PMID: 9758634.
  32. Vieland VJ. Thermometers: something for statistical geneticists to think about. Hum Hered. 2006;61(3):144–56. PMID: 16770079.
  33. Vieland VJ, Hodge SE. Review of statistical evidence: a likelihood paradigm. Am J Hum Genet. 1998;63:283–9.
  34. Vieland VJ, Wang K, Huang J. Power to detect linkage based on multiple sets of data in the presence of locus heterogeneity: comparative evaluation of model-based linkage methods for affected sib pair data. Hum Hered. 2001;51(4):199–208. PMID: 11287741.
  35. Vieland VJ, Huang Y, Bartlett C, Davies TF, Tomer Y. A multilocus model of the genetic architecture of autoimmune thyroid disorder, with clinical implications. Am J Hum Genet. 2008;82(6):1349–56. PMID: 18485327.
  36. Vorstman JA, Staal WG, van Daalen E, van Engeland H, Hochstenbach PF, Franke L. Identification of novel autism candidate regions through analysis of reported cytogenetic abnormalities associated with autism. Mol Psychiatry. 2006;11(1):18–28.
  37. Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007;447(7145):661–78. PMID: 17554300.
  38. Wratten NS, Memoli H, Huang Y, Dulencin AM, Matteson PG, Cornacchia MA, et al. Identification of a schizophrenia-associated functional noncoding variant in NOS1AP. Am J Psychiatry. 2009;166(4):434–41.
  39. Xu J, Zwaigenbaum L, Szatmari P, Scherer SW. Molecular cytogenetics of autism. Curr Genomics. 2004;5(4):347–64.
  40. Yang X, Huang J, Logue MW, Vieland VJ. The posterior probability of linkage allowing for linkage disequilibrium and a new estimate of disequilibrium between a trait and a marker. Hum Hered. 2005;59:210–9. PMID: 16015031.

Copyright

© The Author(s) 2011

This article is published under license to BioMed Central Ltd.
