Participants
Participants were males and females with a DNA-confirmed FMR1 full mutation, aged 12 to 50 years (adolescent/adult study) or 5 to 11 years (child study). The age ranges are those preferred by FDA for developmental disorders, with the 5–11 age group representing predominantly prepubertal children with FXS and the 12–50-year-old group representing adolescents and adults with FXS, for whom behavioral issues are overall very similar throughout the age range. Up to three concomitant psychoactive medications (including antiepileptic drugs) that were FDA-approved for the condition or symptom being treated were permitted, but use of vigabatrin, tiagabine, riluzole, and racemic baclofen was prohibited because of their GABAergic mechanisms. Also, participants could not be taking medications with anxiolytic properties (including selective serotonin reuptake inhibitors (SSRIs), tricyclic antidepressants, venlafaxine, buspirone, benzodiazepines that were administered on a regular daily schedule, and propranolol). It was not considered feasible in FXS, a rare condition with severe behavioral dysfunction, to enroll a cohort in the specified age ranges that was free of psychoactive medication and of sufficient size for appropriate power to detect a drug effect. Further, allowing standard-of-care background therapy would permit identification of effects of arbaclofen beyond those obtained from standard care. Pharmacological treatment regimens were required to be stable for 4 weeks, and educational, behavioral, and other treatments stable for 2 months, prior to screening and for the duration of the study. Subjects with any previous seizure were required either to be on anticonvulsant medication and seizure-free for 6 months, or to be seizure-free for 3 years off anticonvulsants. A score of 8 or greater on the parent-rated ABC-C Lethargy/Social Withdrawal subscale was required at the screening visit and at visit 1 at the beginning of the treatment period.
This cutoff was used because it was the median value observed in the prior phase 2 arbaclofen trial in FXS and because, in that trial’s post hoc analyses, it defined the group that demonstrated significant improvement on numerous measures, including the ABC-CFX Social Avoidance subscale. Caregivers watched a training video explaining how to rate the ABC-C before performing the rating at screening. Female subjects of childbearing potential were tested and excluded if they were pregnant. Female patients were required to follow an acceptable method of birth control throughout the study. The major exclusion criteria included, but were not limited to, impairment of renal function, evidence or history of malignancy, or any significant hematological, endocrine, cardiovascular, respiratory, hepatic, or gastrointestinal disease, and illicit drug use or alcohol abuse. Informed consent was obtained from the participant or a legal guardian or legally acceptable representative in all cases, and participants were enrolled if they met all inclusion criteria. The studies (clinicaltrials.gov identifiers NCT01282268 for adolescent/adult study, NCT01325220 for child study) were approved by the Institutional Review Boards governing each site.
Study design
The studies were phase 3 randomized, double-blind, placebo-controlled, multisite, parallel group trials in adolescents/adults (209FX301, NCT01282268) and children (209FX302, NCT01325220) with FXS, conducted at 23 sites between May 2011 and December 2012 (adolescent/adult study) and 25 sites between June 2011 and June 2013 (child study) in the USA (Fig. 1 shows design of both studies). Study design complied with FDA GCP requirements and followed the standard elements in the CONSORT checklist guidelines. In the adolescent/adult study, drug was flexibly titrated every 7 days, starting at 5 mg BID, and then 10 mg BID, 10 mg TID, and 15 mg TID, until the maximal tolerated dose (MTD) was established. In the child study, participants were assigned in a ratio of 1:1:1:1 to one of the following four fixed dose treatment arms: arbaclofen 5 mg BID, 10 mg BID, 10 mg TID, or placebo. Dosing was chosen based on FDA’s requirement that three parallel dose groups be enrolled in a placebo-controlled trial (with a fourth group receiving placebo). In addition, the dose that demonstrated efficacy in the phase 2 trial post hoc analyses was chosen as the middle dose, and doses 50% lower and 50% higher were selected for the other two dose groups. Participants allocated to an arbaclofen arm initiated therapy with 5 mg daily, and the dose was up-titrated every 7 days in steps (5 mg BID, 10 mg BID, then 10 mg TID) until the target dose was reached. Down-titration for dose adjustment was not allowed due to FDA’s preference for the most stringent assessment of tolerability; patients unable to tolerate their assigned dose were discontinued. Randomization was stratified based upon the use of antipsychotic medication. The total length of the double-blind treatment period was 8 weeks for both studies, including up-titration and then stable dosing for at least the final 4 weeks at the MTD (adolescent/adult study) or assigned fixed dose (child study).
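As a rough illustration of the dosing arithmetic described above, the following Python sketch encodes the adolescent/adult titration steps and the child-study arms. The dose steps and the 7-day interval come from the text; the function names, the 1-based day convention, and the way a tolerability ceiling is represented are assumptions made only for this example.

```python
# Illustrative sketch only: dose steps are taken from the text above;
# all function and variable names are hypothetical.

# (mg per dose, doses per day) for each weekly titration step
ADULT_STEPS = [(5, 2), (10, 2), (10, 3), (15, 3)]  # 5 mg BID up to 15 mg TID
CHILD_ARMS = {"5 mg BID": (5, 2), "10 mg BID": (10, 2), "10 mg TID": (10, 3)}
DAYS_PER_STEP = 7


def total_daily_dose(step):
    """Total mg per day for a (mg per dose, doses per day) step."""
    mg, times = step
    return mg * times


def adult_step_on_day(day, max_tolerated_index=len(ADULT_STEPS) - 1):
    """Return the titration step in effect on a given treatment day (1-based).

    Assumes titration advances one step every 7 days and stops at the
    highest tolerated step (an index into ADULT_STEPS).
    """
    if day < 1:
        raise ValueError("day must be >= 1")
    index = min((day - 1) // DAYS_PER_STEP, max_tolerated_index)
    return ADULT_STEPS[index]


if __name__ == "__main__":
    # Weekly schedule for a participant who tolerates every step
    for week in range(1, 9):
        mg, times = adult_step_on_day((week - 1) * DAYS_PER_STEP + 1)
        print(f"week {week}: {mg} mg x {times}/day = {mg * times} mg/day")

    # Child arms relative to the middle dose (10 mg BID)
    middle = total_daily_dose(CHILD_ARMS["10 mg BID"])
    for name, step in CHILD_ARMS.items():
        print(name, total_daily_dose(step) / middle)
```

Under these assumptions, a participant who tolerates every step reaches 15 mg TID on day 22 and remains there through week 8. The child arms total 10, 20, and 30 mg per day, that is, 50% below and 50% above the middle dose.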
Subjects returned for evaluations 2, 4, and 8 weeks after initiating double-blind treatment. After the 8-week treatment period, participants entered a withdrawal period, during which study drug was tapered weekly until off, according to the reverse of the up-titration schedules noted above, over 0–3 weeks (adolescent/adult study) or 1–3 weeks (child study). Phone calls occurred every 3–4 days during the first 29 days after randomization, when the drug was being titrated upwards; then every 2 weeks during the stable dosing period; and then every 4–6 days during down-titration at the end of the placebo-controlled dosing period. Participants returned for a close-out visit within 3 days of the last dose of study medication (adolescent/adult study) or at 11 weeks when off study medication (child study). Subjects who completed the 8-week double-blind treatment period or who discontinued due to intolerability to their assigned dose were then eligible for enrollment in an open-label extension study in which subjects could be titrated to and treated with arbaclofen at the best tolerated dose ranging from 5 mg BID to 15 mg TID (209FX303, NCT01555333). Results from the long-term open-label study will be reported separately.
Efficacy assessments were performed at baseline and treatment weeks 2, 4, and 8, as well as on phone calls to the primary caregiver at treatment weeks 1 and 3. Efficacy assessments on phone calls included only the CGI-I and CGI-S, which were obtained via an interview with the caregivers done by the site investigators. Safety and tolerability assessments were performed at baseline, treatment weeks 2, 4, 8 and the follow-up visit (which occurred 3 weeks after the end of treatment at week 8), as well as on all phone calls, and at 4 weeks after the follow-up visit for participants not entering the open-label extension. Families were queried about concomitant treatments and any changes in medications at every visit and phone call, in order to identify any emergent medical issues and ensure psychoactive medications were not being changed.
Study drug and matching placebo were provided as 5 and/or 10 mg orally disintegrating tablets in color-coded blister packs. The orally disintegrating formulation, which showed pharmacokinetics similar to racemic baclofen prior to use in 209FX301 and 209FX302, was developed for the studies to accommodate patients who could not swallow pills. Blinding was maintained across the different doses by requiring all subjects to take the same number of tablets three times a day, each of which could be either drug or placebo. Subjects were assigned to treatment linked to a set of blister packs according to a centrally generated randomization list. Treatment compliance was monitored with a dosing form, which guardians completed on a daily basis.
Assessments
Efficacy assessments
All efficacy outcomes were assessed as change from baseline after 8 weeks of treatment. The primary efficacy outcome for both studies was the parent or caregiver-rated Aberrant Behavior Checklist-Community Edition (ABC-C) refactored for FXS (ABC-CFX) Social Avoidance score. The key secondary outcome was the Clinical Global Impression-Improvement (CGI-I). The ABC-CFX Social Avoidance score was chosen as primary endpoint because this measure showed improvement in the full intent-to-treat (ITT) cohort in the phase 2 study and because the FDA required a primary outcome in one behavioral domain. Because the drug reversed molecular, electrophysiological and synaptic phenotypes in the animal model [11, 17], it was postulated that it should help all aspects of FXS. The company therefore requested to nominate a measure of global function as the primary endpoint, but this was not allowed by FDA. The FDA instead recommended that a global outcome measure be implemented as a key secondary rather than primary endpoint; hence, the CGI-I was chosen as the key secondary measure. Furthermore, there was regulatory precedent for using the CGI-I (secondary) and the ABC-C Irritability subscale (primary), in addition to a responder analysis, for approval of atypical antipsychotics for irritable behavior in ASD [18]. Other secondary outcomes were the Clinical Global Impression-Severity (CGI-S), visual analog scale (VAS) for disruptive and anxiety behaviors, and Vineland Adaptive Behavior Scales, Second Edition (Vineland-II) - Socialization domain raw and standard scores (Survey Interview Form with parent/caregiver/Legally Authorized Representative (LAR)).
Exploratory outcomes were a responder analysis (CGI-I score of 1 or 2, and 10, 20, 25, 30, 40, 50, and 60% improvement on the ABC-CFX Social Avoidance subscale), the other ABC-CFX Subscales (including Irritability, Hyperactivity, Stereotypic Behavior, Lethargy, and Abnormal Speech), Parenting Stress Index (PSI) – Short Form, Vineland-II Maladaptive Behavior Index and other domain raw scores, Vineland-II - Communication domain raw and standard scores, and Total Score and Daytime sleepiness subscale of the Children’s Sleep Habits Questionnaire (CSHQ). The ABC-CFX was performed at baseline and weeks 4 and 8 of the treatment period. The CGI-I was performed at baseline and weeks 1, 2, 3, 4, and 8. The CGI-S and VAS were performed at baseline and weeks 2, 4, and 8. The Vineland-II, PSI, and CSHQ were performed at baseline and at 8 weeks.
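The responder definitions above can be sketched in Python as follows. The thresholds and the CGI-I criterion are taken from the text. Several things are assumptions made for illustration rather than details from the protocol: the function names, the treatment of the ABC-CFX and CGI-I criteria as separate flags, the use of "meets or exceeds" at each threshold, and the rejection of a zero baseline.

```python
# Illustrative sketch; names are hypothetical and the handling of a zero
# baseline is an assumption, not taken from the protocol.

THRESHOLDS = (10, 20, 25, 30, 40, 50, 60)  # percent improvement cut-offs from the text


def percent_improvement(baseline, week8):
    """Percent reduction from baseline (lower ABC-CFX scores mean fewer problems)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive to compute a percentage")
    return 100.0 * (baseline - week8) / baseline


def abc_responder_flags(baseline, week8):
    """Map each threshold to True if the improvement meets or exceeds it."""
    change = percent_improvement(baseline, week8)
    return {threshold: change >= threshold for threshold in THRESHOLDS}


def cgi_i_responder(cgi_i):
    """A CGI-I of 1 (very much improved) or 2 (much improved) counts as a response."""
    return cgi_i in (1, 2)


if __name__ == "__main__":
    # Hypothetical participant: Social Avoidance score falls from 8 to 5
    print(percent_improvement(8, 5))
    print(abc_responder_flags(8, 5))
    print(cgi_i_responder(2))
```

For the hypothetical participant shown, a fall from 8 to 5 is a 37.5% improvement, so the participant counts as a responder at the 10, 20, 25, and 30% thresholds but not at 40% or above.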
Safety assessments
Safety and tolerability of STX209 were determined by adverse events (AEs; all visits and calls), physical examination, vital signs and weight (all visits), laboratory tests including complete blood count (CBC), chemistry panel, and urinalysis (UA) (baseline and weeks 4 and 8), electrocardiogram (ECG; baseline and week 8), and a suicidality assessment (three-question interview with subject and parent/caregiver/LAR, all visits, as required by FDA guidelines). When the patient could not provide meaningful answers due to inadequate language or cognitive function, the family was asked if there was any sign of suicidality.
Pharmacokinetics
Four blood samples for analysis of plasma STX209 were obtained from each subject at four defined post-dose time points. Samples were to be used for population-based pharmacokinetic (PK) analyses, to confirm accurate randomization, and to confirm compliance with use of study medication. The goal was to perform an integrated population PK analysis with data from the phase 2 and 3 and open-label extension trials; however, these analyses were not completed prior to the wind-down of Seaside Therapeutics.
Description of assessments
DSM-IV-TR
Diagnostic and Statistical Manual of Mental Disorders IV – Text Revision (DSM-IV-TR) (APA 2000) criteria for Pervasive Developmental Disorders (PDD) were used by an Investigator on the study to determine at the screening visit whether the subject had a PDD in addition to FXS. These criteria include severe and pervasive impairment in several areas of development: reciprocal social interaction skills, communication skills, and the presence of stereotyped behavior, interests, and activities.
ABC-C
The ABC-C is a 58-item parent-rated global behavior checklist implemented for the measurement of drug and other treatment effects in individuals with intellectual disability, and utilized in registration studies for drug efficacy in autism spectrum disorder [18]. In its original validation, five empirically derived dimensions were identified: Irritability, Lethargy/Social Withdrawal, Inappropriate Speech, Hyperactivity, and Stereotypic Behavior. The Lethargy/Social Withdrawal scale includes questions about social indifference, social avoidance, and physical lethargy [19]. A recent factor analysis of the ABC specifically in FXS (ABC-CFX) [16] generated a six-factor structure modifying items mapping to most subscales and identified a “Social Avoidance” factor that is related to the original “Lethargy/Social Withdrawal” scale, but which does not include the items assessing social indifference or physical lethargy. These items are now in a new subscale labeled Socially Unresponsive/Lethargic.
CGI-S
The CGI-S is a clinician-rated measure used to assess the impairment of neurobehavioral function in study subjects. The clinician should consider all aspects of that function, including but not limited to, internalizing and externalizing problems, and social engagement. The clinician’s score utilized the following 7-point Likert scale: normal (not at all impaired), borderline, mild, moderate, marked, severe, or extreme.
CGI-I
The CGI-I is a well-validated clinician-rated measure commonly used in drug studies [20] because it allows the clinician to integrate all sources of information, including the parent/caregiver history, observations in the clinic, and reports from other sources, into a single rating of improvement during treatment. For these studies, the investigator considered all aspects of the subject’s neurobehavioral function, including but not limited to internalizing problems, externalizing problems, and social engagement, and rated the scale employing a 7-point Likert scale: very much improved, much improved, minimally improved, no change, minimally worse, much worse, very much worse. In these studies, a single clinician-investigator at the site rated the CGI for each subject. This individual was not blinded to side effects or results of other measures, as it was not thought a priori that these would be unblinding based on the side effect profile from the phase 2 trial. Investigators were trained on CGI-I and CGI-S rating to standardize the rating of participants, including rating practice cases prior to performing the measure in the trials.
Vineland-II
The Vineland-II [21] is designed to assess the personal and social functioning of handicapped and non-handicapped persons. It is a gold standard for the assessment of adaptive functioning, and, with IQ testing, comprises one of two pillars for the assessment and diagnosis of intellectual disability. The “Survey Interview Form” of the Vineland-II was administered by a qualified psychologist or experienced rater to a parent or caregiver using a semi-structured interview format. Only the Communication domain, Socialization domain, and Maladaptive Behavior Index were completed (not the Daily Living Skills and Motor Skills domains).
VAS-Anxiety and Disruptive Behaviors
This methodology has been utilized in autism [18, 22]. The parent/caregiver/LAR is asked about the severity of anxiety and disruptive troublesome behaviors and is given examples of several of each of these types of target behaviors and then rates changes in severity of the anxiety and disruptive behavior target symptom on separate visual analog scales (VASs). The VAS is a 10-cm line, with troublesome behaviors anchored on one end with the description “worst ever” and on the other end with “no problem at all”. This scale showed good reliability in a prior study of subjects with FXS [23].
PSI – Short Form
The PSI [24] provides a measure of parental stress and is widely used in the assessment of family function for families with children who have special needs. The PSI was normed on over 2500 parents, and the 36-item short form provides a well-validated estimate of the overall stress faced by parents.
CSHQ
The CSHQ is a 35-item questionnaire designed for children aged 4 through 12 years, to screen for the most common sleep problems in that age group. Reliability and validity data have been collected on a sample of 495 elementary school children and on a clinical sample from a pediatric sleep clinic [25].
Suicidality assessment
This semi-structured interview of the parent/caregiver/LAR and subject was completed by a physician or clinical psychologist. A targeted set of questions was asked to assess potential suicidality. At a minimum, the clinician asked the subject the following questions: Do you ever wish you were dead? Have you done anything to hurt yourself? Then the clinician asked the parent/caregiver/LAR the following question: Has (subject’s name) done anything to hurt himself/herself (other than stereotyped self-injurious behaviors)?
Statistical analysis
For both studies, the ABC-CFX Social Avoidance score was designated the primary endpoint based on the phase 2 study results [15]. All data collected in these studies were documented using summary tables, figures, and subject data listings. All efficacy analyses were based on an intent-to-treat (ITT) population defined as all randomized subjects who were assigned to study medication, received at least one dose of double-blind study medication, and had post-baseline efficacy data available. For the primary efficacy variable of the ABC-CFX Social Avoidance score, a per protocol (PP) population was also defined as those ITT subjects who fulfilled the entrance criteria and substantially adhered to the protocol for the duration of the study. Differences in efficacy variables from baseline to the end of 8 weeks of double-blind treatment in the arbaclofen treatment group and placebo treatment group were assessed using Restricted Maximum Likelihood (REML)-based Analysis of Covariance (ANCOVA) techniques for continuous variables and chi-square techniques for categorical variables as appropriate. Baseline scores and age were covariates in the analyses. For all comparisons, a nominal p value of 0.05 or less was required to declare significance, and no adjustments for multiplicity were made. For the adolescent/adult study, there was one primary comparison of active versus placebo. For the child study, the level of significance for the primary efficacy comparison was protected using a closed testing procedure (that allows simultaneous testing of several hypotheses). In addition, only one key secondary efficacy variable was declared for each study.
The Safety Populations comprised all subjects who took at least one dose of study medication. Clinical safety was addressed by calculating the incidence of AEs in the two treatment groups and by descriptively summarizing laboratory assessments, physical examinations, ECG assessments, and vital signs.
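For orientation, an ANCOVA of the kind described above, with baseline score and age as covariates, can be written schematically for the single active-versus-placebo comparison of the adolescent/adult study. The notation is introduced here for illustration only. The exact model specification used in the trials (for example, any additional terms or the handling of repeated visits) is not given in the text and may have differed.

```latex
% Schematic ANCOVA for change from baseline; all symbols are illustrative notation.
\[
  \Delta_i = \beta_0 + \beta_1 T_i + \beta_2 B_i + \beta_3 A_i + \varepsilon_i,
  \qquad \varepsilon_i \sim N(0, \sigma^2)
\]
% \Delta_i : change in the outcome from baseline to week 8 for subject i
% T_i      : treatment indicator (1 = arbaclofen, 0 = placebo)
% B_i      : baseline score,  A_i : age
% The treatment effect is tested via H_0 : \beta_1 = 0 at the nominal 0.05 level.
```

In this form, the coefficient on the treatment indicator is the covariate-adjusted difference in mean change between arbaclofen and placebo.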
Sample size
Study group sizes were based upon formal power analyses that identified the minimum number of subjects required to detect a medium effect size. Specifically, it did not appear that smaller trials would be worth running, as smaller group sizes would not be sufficiently powered to assess efficacy. For the primary outcome, the ABC-CFX Social Avoidance subscale, the adolescent/adult study was designed to have at least 80% power to detect a treatment effect of size 0.55 at a p level of 0.05 with a sample size of n = 60 subjects per group. The child study was likewise designed to have at least 80% power to detect a medium treatment effect of size 0.57 at a p level of 0.05 with a sample size of n = 50 participants in each of the four dose arms.
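The figures above can be sanity-checked with a simple normal-approximation power calculation, sketched below in Python using only the standard library. This is not the authors' power analysis. It assumes that the stated effect sizes are standardized mean differences, that the test is a two-sided, two-group comparison, and that each child dose arm is compared with placebo separately. The function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def two_sample_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    Uses the normal approximation; effect_size is treated as a
    standardized mean difference (an assumption, not stated in the text).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def n_per_group_for_power(effect_size, power=0.80, alpha=0.05):
    """Smallest per-group n reaching the target power (normal approximation)."""
    z = NormalDist()
    n = 2 * ((z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) / effect_size) ** 2
    return ceil(n)


if __name__ == "__main__":
    print(two_sample_power(0.55, 60))   # adolescent/adult study design
    print(two_sample_power(0.57, 50))   # child study, one dose arm vs placebo
    print(n_per_group_for_power(0.55), n_per_group_for_power(0.57))
```

Under these assumptions, the approximate power is about 0.85 for the adolescent/adult design and about 0.81 for a single child-study dose arm against placebo, consistent with the stated "at least 80%". A t-based calculation would give slightly lower values.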