Novel copy number variants in children with autism and additional developmental anomalies

Autism is a neurodevelopmental disorder characterized by three core symptom domains: ritualistic-repetitive behaviors, impaired social interaction, and impaired communication and language development. Recent studies have highlighted etiologically relevant recurrent copy number changes in autism, such as 16p11.2 deletions and duplications, as well as a significant role for unique, novel variants. We used Affymetrix 250K GeneChip Microarray technology (either NspI or StyI) to detect microdeletions and duplications in a subset of children from the Autism Genetic Resource Exchange (AGRE). In order to enrich our sample for potentially pathogenic CNVs we selected children with autism who had additional features suggestive of chromosomal loss associated with developmental disturbance (positive criteria filter) but who had normal cytogenetic testing (negative criteria filter). We identified families with the following features: at least one child with autism who also had facial dysmorphology, limb or digit abnormalities, or ocular abnormalities. To detect changes in copy number we used a publicly available program, Copy Number Analyser for GeneChip® (CNAG) Ver. 2.0. We identified novel deletions and duplications on chromosomes 1q24.2, 3p26.2, 4q34.2, and 6q24.3. Several of these deletions and duplications include new and interesting candidate genes for autism such as syntaxin binding protein 5 (STXBP5 also known as tomosyn) and leucine rich repeat neuronal 1 (LRRN1 also known as NLRR1). Lastly, our data suggest that rare and potentially pathogenic microdeletions and duplications may have a substantially higher prevalence in children with autism and additional developmental anomalies than in children with autism alone. Electronic supplementary material The online version of this article (doi:10.1007/s11689-009-9013-z) contains supplementary material, which is available to authorized users.

has since been found to be both common and relevant to human diseases, including autism [14,28,34,41,42].
Recent studies have also suggested a significant role for CNVs, both de novo and inherited, in the etiology of autism [3,28,30,44,50]. De novo copy number changes appear to be particularly enriched in sporadic (simplex) autism, though some recurrent CNVs that are primarily de novo, such as 16p11.2 deletions and duplications, are also occasionally inherited from an unaffected parent, suggesting incomplete penetrance for some autism-associated CNVs [3,28,50]. Several of the recently highlighted mutations in autism, such as SHANK3 [7], NRXN1 [8,19] and contactin 4 [38] have also been identified in parents, some of whom exhibit features of a broader autism phenotype.
When interrogating a sample for CNVs, it is important to note that enrichment for de novo mutations in singleton families is not specific to CNVs and that these families are likely to be enriched for other types of sporadic mutations as well. Chromosomal abnormalities (whether de novo or inherited) have historically been associated with syndromic developmental delay such as 15q11-q13 syndrome, or the 22q11 group of deletion disorders. Indeed, one of the first studies to identify copy number variants in mental retardation selected for mental retardation with syndromic presentation [15].
In order to enrich our sample for potentially pathogenic chromosomal imbalances we applied a positive and negative criteria filter to the study sample. We hypothesized that children with autism who had additional phenotypic features or diagnoses suggestive of a developmental disturbance (positive filter) but who had normal cytogenetic testing (negative filter) would be more likely to harbor novel CNVs. We used this model for two reasons: 1) children with chromosomal abnormalities often have multiple or syndromic developmental anomalies and 2) we recently reported a pathogenic microdeletion in a child with autism and eye abnormalities, stimulating our interest in investigating other individuals with autism and additional developmental disturbances [5]. While the AGRE sample has been heavily studied, it is important to note that many of the families included in this study have been flagged as "possible non-idiopathic autism" by AGRE due to their associated phenotypes, and therefore have been generally excluded from previous linkage, association, and CNV studies. In addition, we screened a sample of children with autism and no dysmorphology from the AGRE cohort, as well as a sample of unselected controls.

Patient ascertainment and sample
Families included in this study came from the Autism Genetic Resource Exchange (AGRE). AGRE houses DNA and phenotypic data on hundreds of simplex and multiplex families of children with autism and makes these materials available to researchers investigating autism. The diagnoses were confirmed in the affected individuals through the Autism Diagnostic Interview (ADI) and in some cases the Autism Diagnostic Observation Schedule-General [6,25,26]. AGRE also provides family medical and psychological histories, physical exam records, cytogenetic data, and findings from neurological exams.
From the AGRE repository of 814 multiplex and simplex families, we identified a total of 17 children from 15 families with the following features: at least one child with autism who also had an uncommon cranio-facial dysmorphology, limb or digit abnormality, or ocular abnormality that was suggestive of a possible chromosomal imbalance. These children were considered the "syndromic autism" sample and are referred to as such throughout this paper. Included among these syndromic features were microphthalmia, iris coloboma, trigonocephaly, Sotos syndrome, cleft lip and palate, alopecia areata, fused baby teeth, fused ribs, metatarsal full syndactyly of toes 2, 3 and 4, and adducted thumbs (Table 1). In families AU0590 and AU0720 both affected siblings showed the same syndromic features. In the majority of cases, only one affected sibling was included in the initial syndromic autism group. In these cases, the affected sibling with syndromic features was analyzed first and additional family members (affected and unaffected) were studied as a follow-up to compelling results. It should be noted that the AGRE physical exam records indicate a number of common and mild birth defects or abnormalities (e.g., strabismus, simple cutaneous syndactyly). These more common features were not used as inclusion criteria in our study. We also selected at random 19 subjects with autism and no associated syndromic features from the AGRE cohort as a comparison for the primary study sample.
In addition, we screened an unselected control population of 716 individuals ascertained as part of a study of age-related eye disease at the University of Iowa under the direction of Dr. Edwin Stone. These individuals were analyzed with either the 250K NspI and 250K StyI or the 5.0 Affymetrix SNP array platforms. The 250K array data were also analyzed with CNAG, while the 5.0 array data were analyzed with dChip [24].
Affymetrix GeneChip® human mapping 250K microarray
DNA from each individual was analyzed with one of the Affymetrix 250K GeneChip microarrays (either NspI or StyI). It is well documented that SNP genotyping arrays can be used successfully for copy number detection [11,31,48]. The DNA was hybridized to the array according to the manufacturer's instructions. Briefly, the assay uses 250 ng of genomic DNA digested with NspI or StyI restriction enzyme (New England Biolabs, Boston, MA), ligated to an adaptor using T4 DNA ligase (New England Biolabs), and amplified by PCR using Titanium Taq (Clontech). PCR products were then purified from excess primer and salts with a DNA amplification cleanup kit (Clontech), and a 90 µg aliquot was fragmented using DNaseI. An aliquot of the fragmented DNA was separated and visualized in a 3% agarose gel in 1× TBE buffer to ensure that the bulk of the product had been properly fragmented. The fragmented samples were end-labeled with biotin using terminal deoxynucleotidyl transferase before each sample was hybridized to the NspI or StyI arrays for 16 h at 49°C. After hybridization, the arrays were washed and stained using an Affymetrix Fluidics Station 450. The most stringent wash was 0.6× SSPE, 0.01% Tween-20 at 45°C, and the samples were stained with R-phycoerythrin (Molecular Probes). Imaging of the microarrays was performed using a GCS3000 (Affymetrix) high-resolution scanner.

Detection of copy number variants
To detect changes in copy number we used a publicly available program, Copy Number Analyser for GeneChip® (CNAG) Ver. 2.0, developed at The University of Tokyo [33]. CNAG was designed specifically for work with high-density oligonucleotide arrays and uses a Hidden Markov Model (HMM) to identify statistically significant deviations in signal intensity between SNPs represented on the array. Instead of using a standard reference panel, CNAG uses a panel of "best fit" references. The reference arrays were drawn from a pool of over 500 arrays of the same type, processed with the methodology above and run on individuals with autism and their family members. Within this pool of reference arrays the signal intensity standard deviation values are ranked, and the arrays with the lowest standard deviation values are used for the analysis of each new test array. Each array was referenced to at least 6 other arrays of unrelated individuals. To determine CNV size we relied primarily on boundaries defined by CNAG, though for some deletions (Supplementary Table 1) we used loss of heterozygosity (LOH), as determined by genotype data, both to validate the CNV and to better define its boundaries. However, it should be noted that LOH is limited by the presence of naturally occurring homozygous SNPs.
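The "best fit" reference selection described above can be sketched in a few lines. This is only an illustration of the ranking idea, not CNAG's own code: the `ReferenceArray` record, its field names, and the function `pick_best_fit_references` are hypothetical, and it assumes that a signal-intensity standard deviation has already been computed for each candidate reference against the test array.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReferenceArray:
    """Hypothetical record for one candidate reference array."""
    array_id: str
    family_id: str      # used to exclude relatives of the test individual
    signal_sd: float    # SD of signal intensity vs. the test array (assumed precomputed)


def pick_best_fit_references(test_family_id: str,
                             candidates: List[ReferenceArray],
                             n_refs: int = 6) -> List[ReferenceArray]:
    """Return the n_refs unrelated arrays with the lowest standard deviation."""
    # Keep only arrays from individuals unrelated to the test sample.
    unrelated = [c for c in candidates if c.family_id != test_family_id]
    if len(unrelated) < n_refs:
        raise ValueError("not enough unrelated reference arrays in the pool")
    # Rank by standard deviation; a lower value means a better fit.
    ranked = sorted(unrelated, key=lambda c: c.signal_sd)
    return ranked[:n_refs]
```

The default of six references mirrors the minimum stated in the text; a real pipeline could pass a larger `n_refs` when more low-variance arrays are available.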

Identification and confirmation of CNVs of interest
All CNVs identified by CNAG in children with syndromic autism as well as autism alone are described in supplementary materials (control data not included). To identify a CNV of interest for follow-up validation and study, we first ruled out common CNVs by comparison with the Database of Genomic Variants (DGV) CNV track on the May 2004 UCSC Genome Browser build [4,13,14,29,35,42,43,46]. In addition, we used the 716 eye disorder controls as a local comparison group. CNVs with more than one record in DGV or any occurrence in our own control sample were not considered for follow-up. The decision to set the follow-up threshold at one CNV occurrence in DGV control populations (instead of zero) was based on previous reports suggesting that autism susceptibility CNVs may be present at a very low frequency in a control population [50]. Among the CNVs that met these criteria, we identified and validated by qPCR deletions and duplications that influenced brain-expressed genes of potential relevance to autism. We then tested for segregation of the CNV with disease in all available members of the proband's family using microarrays or qPCR. For quantitative real-time PCR (qPCR), we selected an amplicon within the center of the putative CNV. We used an assay targeted for G6PD on the X chromosome as an internal control for gene dosage and an assay targeted for GAPDH to normalize signal between DNA samples. As the possibility for copy number variation exists for any given region of the genome, we relied on information obtained from our arrays as well as our gender prediction within the qPCR experiment to support the use of GAPDH as a normalization control for validation of copy number variants. The reactions were performed in mixtures containing 12.5 µl of 2× QuantiTect SYBR Green PCR Master Mix (QIAGEN), 12 µl of genomic DNA (1 ng/µl), and 0.25 µl of each primer (10 pmol/µl), in a total volume of 25 µl.
The PCR amplification and detection were carried out on an ABI 7700, with an initial activation step of 15 min at 95°C followed by 42 cycles of 15 s at 94°C, 30 s at 55°C, and 30 s at 72°C. Each experiment was performed twice with three replicates per experiment. To exclude the presence of non-specific products, a melting curve analysis was performed. Relative copy number was calculated from threshold cycle (C_T) values using the comparative C_T method. C_T was determined using the thermocycler software and an average of the three replicates was calculated. The fold change from normal samples was set at 1 and the ratio of the normalized fold change in autism compared to that of control samples was calculated.
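As a rough illustration of the comparative threshold-cycle calculation, the sketch below averages replicate threshold-cycle values, normalizes the target amplicon to the reference assay (GAPDH in this study), and expresses the test sample relative to a control sample set at 1. The function name `relative_copy_number` and its parameters are hypothetical, and the base of 2 assumes near-perfect amplification efficiency, which the text does not state.

```python
from statistics import mean
from typing import Sequence


def relative_copy_number(target_ct_sample: Sequence[float],
                         ref_ct_sample: Sequence[float],
                         target_ct_control: Sequence[float],
                         ref_ct_control: Sequence[float]) -> float:
    """Fold change of the target amplicon in a test sample vs. a control sample.

    Each argument holds the replicate threshold-cycle values for one assay
    in one DNA sample. Assumes ~100% PCR efficiency (doubling per cycle).
    """
    # Delta-CT: target normalized to the reference assay within each sample.
    delta_sample = mean(target_ct_sample) - mean(ref_ct_sample)
    delta_control = mean(target_ct_control) - mean(ref_ct_control)
    # Delta-delta-CT: test sample relative to the control sample (control = 1).
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# A target that crosses threshold one cycle later than in the control,
# with identical reference values, gives a ratio of 0.5.
ratio = relative_copy_number([26.0, 26.0, 26.0], [20.0, 20.0, 20.0],
                             [25.0, 25.0, 25.0], [20.0, 20.0, 20.0])
```

Under these assumptions a ratio near 0.5 is what a heterozygous deletion of an autosomal amplicon would be expected to produce, and a ratio near 1.5 would correspond to a single-copy duplication.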

Direct sequencing
Direct sequencing was used for identification of possible compound heterozygous mutations in one family in which a CNV affecting a single gene (STXBP5) was identified. Additionally, direct sequencing was used to rule out a PAX6 mutation in a child with coloboma of the iris and autism. All STXBP5 and PAX6 exons were forward and reverse sequenced in families AU0677 and AU1376, respectively. The sequence data were analyzed using the Sequencher gene analysis computer program (Gene Codes, Ann Arbor, MI).

Results
In our sample of 17 children with syndromic autism, we detected an average of 3.1 CNVs per child, including 31 deletions and 21 duplications (Table 2), a rate similar to that from previous studies of autism using similar methodologies [28]. Nine CNVs were novel, with five being of particular interest (Table 3). The pedigrees of the families highlighted in the results are shown in Fig. 1. In the sample of 19 children with non-syndromic autism we detected an average of 3.4 CNVs per child, including 33 deletions and 32 duplications (Table 2). Two CNVs were novel and one was of high interest (Table 3). The two groups did not differ in their average number of CNVs per individual. A Fisher's exact test of the number of novel CNVs out of the total number of CNVs in each group yields a one-tailed p-value of 0.04, suggesting that there are significantly more novel CNVs in the sample of children with syndromic autism. However, the number of individual carriers of novel CNVs did not differ significantly between the groups, as there were two individuals in the syndromic autism group who each carried two novel CNVs.
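For readers who want to see how a one-tailed Fisher's exact test on a 2×2 table is computed, a minimal standard-library sketch follows. The function name `fisher_exact_greater` is hypothetical, and the counts in the usage line are illustrative placeholders rather than the study's data; the exact tallies that entered the authors' test would have to be taken from their tables.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-tailed Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are the two groups; columns are (novel, not novel). Returns the
    probability of seeing a or more novel events in the first group when
    all row and column totals are held fixed (hypergeometric upper tail).
    """
    row1 = a + b           # total events in group 1
    col1 = a + c           # total novel events across both groups
    n = a + b + c + d      # grand total
    upper = min(row1, col1)
    total_tables = comb(n, row1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, upper + 1))
    return tail / total_tables


# Illustrative placeholder counts only: 3 of 4 vs. 1 of 4.
p_example = fisher_exact_greater(3, 1, 1, 3)
```

The test conditions on the margins, so it remains valid for the small counts typical of rare-variant comparisons, where a chi-square approximation would be unreliable.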

CNVs of interest identified in children with syndromic autism
Pedigrees for families with CNVs of interest are provided in Fig. 1 and CNV diagrams are provided in Fig. 2. We identified a 260 kb deletion on chromosome 6q24 in a family with two affected boys and one unaffected girl. The oldest affected boy, who carried the deletion, was diagnosed with autism, mental retardation and seizures. The deletion was also detected in the mother, who has severe bipolar disorder with episodes of psychosis and suicidal ideation. This deletion contained coding sequence from only one gene on chromosome 6q24, syntaxin binding protein 5 (STXBP5). This deletion was not present in the mildly affected brother or unaffected sister. STXBP5 was also sequenced in this family to rule out the possibility of a compound heterozygous mutation, with no new SNPs or mutations being identified.
We also detected a 317 kb deletion on chromosome 1q24.2 in two siblings with autism spectrum disorders, one of whom had an additional phenotype of microphthalmia. The deletion was transmitted from the father and was not present in an unaffected paternal half-brother. The father transmitting the deletion has been diagnosed with depression, anxiety disorder, and ADHD. This deletion is novel and contains three genes, dermatopontin (DPT), chemokine (C motif) ligand 1 (XCL1), and chemokine (C motif) ligand 2 (XCL2).
A 166 kb deletion on chromosome 4q34.2 was identified in two of three trizygotic triplets. The two children carrying the 4q34.2 deletion have autism and malformations of the limbs. One triplet has finger clinodactyly on both hands while the other has metatarsal syndactyly of toes 2, 3 and 4 on both feet. This deletion was paternally inherited and not present in the third, unaffected triplet. The father reported learning disabilities as a child and his sister has mild mental retardation. This deletion includes three known genes, WD repeat domain 17 (WDR17), spermatogenesis associated 4 (SPATA4), and ankyrin repeat and SOCS box containing 5 (ASB5).
Deletions in this chromosome region have previously been associated with a number of features including mild mental retardation, velo-cardio-facial (VCF) syndrome-like features, and finger clinodactyly [16,45,47]. While there are no clear features of VCF in this family, the affected children do display the digit abnormalities characteristic of 4q34.2 deletions.
In family AU0053 we identified two duplications in close proximity to each other on chromosome 3p26.2-p26.1 in a family with two affected brothers. This family has been recently published in Christian et al. [3] and here we add to their findings and further discuss the candidate interval. The father, who transmitted the duplications, endorsed symptoms of obsessive-compulsive disorder (OCD) and attention deficit-hyperactivity disorder (ADHD) and is a self-described loner. The duplication that brought this family to our attention is located on 3p26.1, is 137 kb in length, contains no genes, and is entirely novel. The second duplication on this chromosome lies on 3p26.2 and is approximately 2 Mb telomeric to the first duplication. CNVs overlapping with this duplication on 3p26.2 have been identified in three unselected controls [18,34]. The duplication on 3p26.2 is 336 kb in length and contains a portion of only one validated gene, leucine rich repeat neuronal 1 (LRRN1). It is worth noting that, according to the microarray analysis, the breakpoints for this CNV, which are novel to DGV, fall within the LRRN1 gene itself, suggesting possible disruption of the transcriptional unit.
Finally, a 177 kb deletion on chromosome 7q35 was detected in one male child with autism and no related dysmorphology, but was not identified in his younger affected brother. This deletion encompasses a portion of the first intron of CNTNAP2 and was previously identified in [1]. The deletion was paternally inherited, and while the father did not report any learning disorders, he did report a lack of empathy and considers himself "eccentric and a loner".

Discussion
Fig. 2 (caption; opening words missing) ... [17] in individuals with syndromic autism (http://genome.ucsc.edu). a Deletion identified in two affected siblings (one with microphthalmia), AU1334303 and AU1334302. The deleted region on chromosome 1q24.2 includes the genes XCL2 and DPT. b Deletion identified in one affected proband (with microphthalmia), AU027505. As no genes lie within the deletion region, this CNV did not meet our criteria for continued study and additional family members were not screened. The deleted region on chromosome 2p22.1 is non-genic; however, it is notable that the region appears to contain conserved elements and is within close proximity of the gene SLC8A1. c Duplications were identified in two affected siblings (one with alopecia), AU005303 and AU005304. The duplicated regions on chromosome 3p26.2 and 3p26.1 overlap with LRRN1 and lie close to CNTN4 and GRM7. d Duplication identified in affected siblings (one with syndactyly, the other with clinodactyly), AU010903 and AU010904. The duplicated region on chromosome 4q34.2 encompasses two genes, WDR17 and ASB5. e Deletion identified in one affected sibling (with adducted thumbs), AU067703. The deleted region on chromosome 6q24.3 includes STXBP5. f Second duplication identified in individual AU1334302. The duplicated region on chromosome 22q11.21 includes a number of genes and overlapping regions of common variation; however, the breakpoints identified in AU1334302 are unique to the Database of Genomic Variants and include genes such as SNAP29 that do not show evidence of common variation.
This study was designed to identify CNVs that may play a role in the etiology of autism. We applied positive and negative filter criteria to AGRE families in order to enrich our sample for such CNVs. We theorized that children with autism and additional developmental anomalies would be more likely than children with idiopathic autism to harbor pathogenic microdeletions and duplications.
We did not find significant enrichment of de novo CNVs; however, we did identify significantly more rare inherited CNVs in our sample of syndromic autism compared to a sample of non-syndromic autism. It is worth mentioning that almost all of the rare, inherited CNVs were transmitted from parents reporting a range of learning disabilities and psychiatric disturbance. We highlight novel deletions and duplications on chromosomes 1q24.2, 3p26.2, 4q34.2, 6q24.3, and 7q35. It is important to note that with increased array density, it is possible that some of the highlighted novel CNVs will be identified in control populations. The "novel" status of a CNV is a moving target, and even during the preparation of this manuscript, the status of some CNVs had to be changed. While these data are preliminary and should be interpreted with caution, several of the deletions and duplications reported here include new and interesting biological candidate genes for autism such as syntaxin binding protein 5 (STXBP5, also known as tomosyn) and leucine rich repeat neuronal 1 (LRRN1, also known as NLRR1).
Perhaps the most compelling of these is STXBP5, which plays a key role in neuronal guidance and in regulation of synaptic transmission at the presynaptic cleft [51]. The protein forms a stable t-SNARE complex with SNAP-25 and a ROCK/Rho phosphorylated form of syntaxin-1 (isoforms A and B) at the presynaptic nerve terminal. Its interaction with SNAP-25 and syntaxin-1 ultimately blocks synaptobrevin (part of the v-SNARE machinery) from joining the SNARE complex, resulting in inhibition of vesicle exocytosis [10,12,51]. This synaptic vesicle inhibition is particularly important in early development during extension and retraction of neurites. Phosphorylated syntaxin-1 and STXBP5 have been shown to co-localize to the palm of the growing neurite, thereby inhibiting synaptic formation along the palm and encouraging synaptic vesicle release toward the growth cone of the emerging dendrite or axon [40,51]. Ours is the first mutation reported of STXBP5, present in a mother with severe bipolar disorder and her son with autism, mental retardation and seizures. A recent paper has suggested syntaxin-1A as a possible autism candidate based on SNP genotype association with autism in a set of AGRE trios as well as mRNA expression data that suggest syntaxin-1A is expressed at a higher level in children with high functioning autism than in age and gender matched controls [32]. It is also important to note that one child with autism spectrum disorder in this family did not carry the STXBP5 deletion. This phenomenon has been noted in previous studies of autism susceptibility genes [28,50]. It is possible that the STXBP5 deletion is not causative but influences autism severity, that there is a secondary cause for autism in the sibling, or that STXBP5 has an additional maternal mRNA contribution during fetal development that resulted in a more severe form of autism in the sibling who was also carrying the deletion.
Duplications flanking the neuronal cell adhesion molecule LRRN1 were detected in a second family. One of the duplications has unique breakpoints disrupting the only intron of LRRN1. It is unclear whether these duplications are functionally connected or arose independently, but the possibility remains that they may act in concert in this family. Leucine rich repeat neuronal genes are heavily expressed during embryonic development of the cortex and are thought to be involved in neuronal outgrowth. However, their role in synaptic cell adhesion is still unclear [20,21]. Mutations in LRRN1 have not been associated with any other disorder, though mutations in the LRRN gene family have been associated with other disorders such as Parkinson's disease and schizophrenia [9,27]. Additionally, larger deletions and duplications of chromosome 3p26 have been found in children with Prader-Willi syndrome, mental retardation, and social cognition deficits [2,37]. Recently, deletions in another cell adhesion molecule, contactin 4, were identified in three children with autism [3].
One limitation of our study is the possibility that our findings may be associated with the additional phenotype and not autism in these individuals. We believe this to be unlikely, however, as the chromosomal abnormalities that we identified disrupt compelling biological candidates that are heavily expressed in the brain and known to function at the neuronal synapse. We believe it is more likely that these CNVs have a pleiotropic effect based on gene dosage or position of the chromosomal disruption. In addition we were unable to entirely rule out cell line effects; however, as the CNVs highlighted in the results were all inherited, they are unlikely to represent such artifacts.
Lastly, our data suggest that rare microdeletions and duplications may occur significantly more often in children with autism and additional developmental anomalies and that these children ought not to be excluded from studies of copy number variants in autism. Here we present evidence from such a sample, implicating new genes such as STXBP5 and LRRN1 that may play a role in the development of autism. While the results presented here require further investigation in larger samples, it may also be prudent to consider clinical high-density microarray testing for children who present with autism and related developmental disturbances.