  • Research article
  • Open access

Autism, spectrum or clusters? An EEG coherence study



Autism prevalence continues to grow, yet a universally agreed upon etiology is lacking despite manifold evidence of abnormalities especially in terms of genetics and epigenetics. The authors postulate that the broad definition of an omnibus ‘spectrum disorder’ may inhibit delineation of meaningful clinical correlations. This paper presents evidence that an objectively defined, EEG based brain measure may be helpful in illuminating the autism spectrum versus subgroups (clusters) question.


Forty objectively defined EEG coherence factors created in prior studies demonstrated reliable separation of neuro-typical controls from subjects with autism, and reliable separation of subjects with Asperger’s syndrome from all other subjects within the autism spectrum and from neurotypical controls. In the current study, these forty previously defined EEG coherence factors were used prospectively within a large (N = 430) population of subjects with autism in order to determine quantitatively the potential existence of separate clusters within this population.


By use of a recently published software package, NbClust, the current investigation determined that the 40 EEG coherence factors reliably identified two distinct clusters within the larger population of subjects with autism. These two clusters demonstrated highly significant differences. Of interest, many more subjects with Asperger’s syndrome fell into one rather than the other cluster.


EEG coherence factors provide evidence of two highly significant separate clusters within the subject population with autism. The establishment of a unitary “Autism Spectrum Disorder” does a disservice to patients and clinicians, hinders much needed scientific exploration, and likely leads to less than optimal educational and/or interventional efforts.



The DSM-5 [1] summarizes that individuals on the autism spectrum exhibit problems involving interaction and communication with other individuals, show repetitive behaviors and restricted interests, and manifest behavior issues interfering with school, work, and/or multiple other life endeavors. The move from DSM-3 [2] to DSM-4 [3] and most recently DSM-5 [1] as diagnostic standard reflects a gradual condensation of a number of autism-related clinical entities under the rubric of Autism Spectrum Disorder (ASD). These include infantile autism, atypical autism, pervasive developmental disorder not otherwise specified or PDD-nos, and most recently Asperger’s syndrome [4]. This diagnostic “simplification” was welcomed by some yet quite concerning to others, as previously reviewed [5]. As suggested by Kienle et al. [6], “…the issue of an empirically reproducible and clinically feasible differentiation into subgroups must still be raised.” Indeed, in 2016, Pruett and Povinelli [7] published a wistful ‘Commentary’ in which they hypothesized that the usual rapid and automatic recognition of individuals on the autism spectrum resulted from our human “evolved sensitivity for species-typical ranges of social relating”. The authors further postulated that “social spacing”, “quality of eye contact”, and “timing of communicative exchange” constitute three primary variables and that they may form a recognizable set of “clusters” within the realm of human behavior.

A review of the literature from 1994 to 2018 reveals nine publications using cluster analysis to demonstrate quantitatively defined groupings within subjects diagnosed with autism spectrum disorder (ASD) [6, 8,9,10,11,12,13,14,15]. Eight papers used structured neurobehavioral assessments of various types [2, 3, 16,17,18,19,20,21,22,23,24] and one relied upon MRI data. Four studies reported solutions involving two clusters, one reported both two and three cluster solutions, one study reported a three cluster solution, two reported a four cluster solution, and one reported a five cluster solution.

It is notable that all studies summarized above succeeded in identifying clusters. However, the number of clusters identified varied, although the two-cluster solution was reported most often. Taken together, the studies suggest that ASD may well comprise a varying number of discrete sub-populations rather than exist on a continuum.

The varying number of clusters reported in these different studies may reflect the unique characteristics of the population under study, differing choices of variables selected to represent subjects, and/or differing cluster methodologies utilized. The two most commonly used cluster methods, hierarchical and K-means algorithms, rely upon apparent ‘satisfactory’ cluster separation by means of the clinical/neuropsychological difference between or among subjects within differing clusters. K-means clustering, hierarchical clustering, and combinations of these two fundamental methods fail to determine quantitatively the optimal cluster number. For instance, the K-means approach to clustering “…requires users to specify the number of clusters to be generated. One fundamental question is: How to choose the right number of clusters” [25], p 39. Similarly, “…one of the problems with hierarchical clustering is that it does not tell us how many clusters there are…” [25], p 74.

The current study employed one of the first comprehensive software approaches to objectively establish cluster number, namely NbClust [25]. It is specifically designed to provide an objective means, i.e. independent of investigator choice, to identify the ‘optimal’ cluster number within a population.

Thus, the main goal of the current study was to determine the feasibility of delineating objective, EEG-based, clusters among children diagnosed with ASD. The underlying hope is that successful clustering might ultimately lead to better clinical, cost effective, specific diagnoses of subtypes of Autism and the creation of more specific interventions as well as a method to test the effectiveness of such interventions, be they pharmacological, behavioral or other. The multiplicity of potential approaches to clustering and their complexities have been succinctly reviewed by Jain et al. [26] and more recently by Charrad et al. [25]. Key issues that arise when employing cluster analysis are as follows: (1) What clustering technique to use (typical choices are K-means and hierarchical); (2) How many clusters to form (typically required prior to analysis initiation); and (3) How to determine the relative “significance” of resulting cluster configurations (internally by statistics and/or externally by association with one or more symptom complexes). The software package NbClust [25] addresses these three issues and was used in this study to elucidate the cluster structure within a large group of ASD subjects each represented by 40 previously derived [27] EEG-based coherence factors.

The main hypothesis of the study thus was that there are definable subgroups (clusters) of children within autism. It was hypothesized that children within a cluster would be more similar to one another than to children in other clusters. A secondary hypothesis was that EEG coherence is a productive means for the establishment of such stable clusters of children within the autism population.



  1. Utilize for clustering 430 subjects with ASD, who had been studied previously for a different purpose [27].

  2. Utilize as variables for clustering all 40 EEG coherence factors objectively generated previously in the differentiation of subjects with ASD and neurotypical control (CON) group subjects [27].

  3. Determine the ASD cluster number within the 430 previously studied subjects with ASD by use of the recently developed NbClust software package [25, 28].

  4. Compare NbClust results with independently used hierarchical and K-means clustering techniques.

  5. For both hierarchical and K-means clustering algorithms use initial default parameters, and then initiate by use of NbClust all thirty available methods in order to determine the optimal cluster number, assessing up to 15 possible cluster outcomes by an objective ‘voting process’, which is part of NbClust [25].

  6. Utilize the ‘voted’ as best outcome cluster configuration to identify each ASD subject’s cluster association/identity.

  7. Evaluate the internal validity of the resulting clusters within multivariate factor space by using univariate and multivariate statistics (discriminant function analysis - DFA with jackknifing and split-half replication).

  8. Explore by use of multigroup DFA the relationship among derived clusters and a neurotypical control (CON) population (not used for clustering).

  9. In order to explore external cluster meaning, evaluate the relationship, i.e., multivariate positioning, of previously studied ASP subjects [5], who were not part of the ASD population used to form EEG-based factors and were not included in the clustering process.

Subjects previously studied

The EEG coherence factor data used for this study were derived from a population of 984 previously studied 2 to 12 year old subjects [27]. Of these subjects, 430 were representatives of the Autism Spectrum Disorder group (ASD) and 554 constituted the neurotypical control group (CON).

As previously detailed [27], the ASD population had EEGs to rule out epilepsy, seen in up to 30% of certain ASD patients [12, 29]. ASD referrals for the prior study came from pediatric psychologists, psychiatrists, or neurologists at Boston Children’s Hospital (BCH) or from another Harvard associated teaching hospital. Diagnosis of ASD relied upon the DSM [1, 30] and/or ADOS [31, 32] criteria confirmed by clinical histories and evaluation. ASD exclusion criteria included: (1) Coexistent neurologic syndromes with autistic-like features, (2) Seizure disorders or epileptic encephalopathy (infrequent and/or isolated spikes did not cause exclusion); (3) Primary diagnoses of global developmental delay or dysphasia; (4) Clinical uncertainty as to the diagnosis of ASD; (5) Medication being taken at the time of study; (6) Any processes that might alter EEG change such as hydrocephalus, hemiparesis, or other syndromes often associated with abnormal brain development.

As also previously outlined [27], the CON population was selected from an extensive study pool archived by the BCH Developmental Neurophysiology Laboratory (DNL). CON subjects had been utilized as controls for numerous research projects over many years. CON subjects constituted a comparison group of children selected to be normally functioning while avoiding creation of a ‘super-normal’ population. All CON group subjects were living at home, were considered normal by their parents, and were identified as functioning within the normal range on standardized assessments from the respective research studies. Previously delineated CON exclusion criteria included: (1) Diagnosed with or suspected of psychiatric or neurologic illness; (2) Abnormal neurological examination; (3) Seizure disorder (rare EEG spikes were permitted); (4) Noted at study time to manifest autistic features; (5) Receiving medications.

Summary of the EEG data collection protocol, the analytic methods previously utilized, and the prior study results

Data for all subjects were digitally recorded at BCH in the resting awake state following placement of 24 gold cup electrodes (Fig. 1), with EEG filtered from 1 to 100 Hz at a 256 Hz sampling rate. More recent data were recorded at a higher spatial density (128 channels) and temporal sampling rate (512 Hz). These data were software down-sampled to conform to the earlier recorded data as previously detailed [27]. From 8 to 20 min of artifact-free waking data were collected. As EEGs had been primarily collected to rule out epilepsy, these records usually contained additional time for the appearance of drowsiness and/or sleep, as epileptiform discharges are often more frequent during these periods [33]. No subjects were included if EEG records were deemed diagnostic of or consistent with an underlying seizure disorder. Coherence analyses were restricted to waking epochs. Segments of EEG containing obvious artifacts were eliminated by visual inspection. Remaining eye blink and eye movement artifacts, often prominent even during the eyes-closed state, were removed by means of a source component technique [34, 35] implemented by the BESA™ software package. EEG data were analyzed in Laplacian montage [36,37,38] with coherence calculated [36] between all pairs of 24 electrodes (Fig. 1) in 16 two-Hz spectral bands from 1 to 30 Hz, resulting in 4416 unique spectral values per subject. The impact of any remaining eye blink and muscle artifact upon these coherence measures was removed by multiple regression, using frontal slow delta and high frequency frontal-temporal EEG as indicators (used as independent variables in multiple regression) of residual eye and muscle artifact, respectively [27, 39, 40].
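The figure of 4416 coherence values per subject can be checked from the number of unordered electrode pairs multiplied by the number of spectral bands. The following is a minimal arithmetic sketch; the variable names are illustrative and are not taken from the original analysis pipeline.

```python
from math import comb

N_ELECTRODES = 24   # electrodes shown in Fig. 1
N_BANDS = 16        # two-Hz spectral bands

# Coherence is symmetric, so each unordered electrode pair is counted once.
n_pairs = comb(N_ELECTRODES, 2)
n_coherence_values = n_pairs * N_BANDS

print(n_pairs, n_coherence_values)
```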

Fig. 1

Standard EEG Electrode Names and Positions. Legend: Head in vertex view, nose above, left ear to left. EEG electrodes: Z: Midline: FZ: Midline Frontal; CZ: Midline Central; PZ: Midline Parietal; OZ: Midline Occipital. Even numbers, right hemisphere locations; odd numbers, left hemisphere locations: Fp: Frontopolar; F: Frontal; C: Central; T: Temporal; P: Parietal; O: Occipital. The standard 19, 10–20 electrodes are shown as black circles. An additional subset of five, 10–10 electrodes are shown as open circles. This figure was first published in a 2012 autism manuscript by the current authors [27] and is shown with permission of these authors and publisher, BMC Medicine

Reduction of the 4416 coherence variables was managed by Principal Components Analysis (PCA). The first 40 factors accounted for 50.87% of the total variance. Age effects were removed from the 40 coherence variables generated on the total sample by regression of age at study. Factors remained statistically uncorrelated after age regression. These 40 factors were used to separate the CON from ASD groups by discriminant function analysis (DFA); results were highly significant (p < 0.0001) [27]. More importantly, when DFA was used in 10 randomly formed split half replications, the average ASD group classification success was 86.0% and for the CON group, 88.5%. For each split half replication, classification success was also highly significant (p < 0.0001). It was concluded that “…consistent differences exist between the CON and ASD groups” [27].
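The removal of age effects by regression can be illustrated as follows. This is a sketch only: it assumes a simple linear term in age (the functional form is not stated here), uses made-up toy values, and the helper name `regress_out` is hypothetical.

```python
def regress_out(values, covariate):
    """Return residuals of `values` after a least-squares linear fit on `covariate`."""
    n = len(values)
    mean_x = sum(covariate) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(covariate, values)]

# Toy data (hypothetical): age in years and one factor score per subject.
ages = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
factor_scores = [0.9, 0.4, 0.5, -0.1, -0.3, -0.8]

residuals = regress_out(factor_scores, ages)

# The residuals sum to zero and have zero covariance with age,
# i.e. the linear age effect has been removed from the factor.
mean_age = sum(ages) / len(ages)
cov_with_age = sum((a - mean_age) * r for a, r in zip(ages, residuals))
print(abs(cov_with_age) < 1e-9)
```

In the study this step would be applied to each of the 40 factors in turn; the residuals then replace the original factor scores.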

Cluster analysis, current study

Clustering is a technique of “unsupervised learning” that partitions subjects/objects into groupings or “clusters” such that the subjects/objects within a cluster are more similar to others within the cluster than to subjects in other clusters. The NbClust cluster analysis program [25], within the extensive “R” analytic and display software packages [28, 41], was selected for the purpose of objective, unbiased estimation of the optimal number of clusters within a data set, a primary issue when performing cluster analysis. NbClust, a recent addition to the R programming and analyses software packages, provides 30 indices to determine the “best” number of clusters in a data set by objective, data-driven, “majority vote”. NbClust also provides [28] both hierarchical and K-means clustering as options. The 40 EEG coherence factors developed and described in a prior study [27] and the data from the prior ASD group were utilized as variables for cluster analysis in the current study.
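The “majority vote” idea can be sketched in a few lines. NbClust itself is an R package; the Python fragment below does not call it and does not compute any validity index. It only shows how a set of per-index recommendations might be tallied, using hypothetical index names and toy values.

```python
from collections import Counter

def majority_vote(index_recommendations):
    """Return (best_k, votes): the cluster number proposed by the most indices."""
    tally = Counter(index_recommendations.values())
    best_k, votes = tally.most_common(1)[0]
    return best_k, votes

# Hypothetical recommendations from five validity indices (toy values only).
recommendations = {
    "index_a": 2,
    "index_b": 2,
    "index_c": 3,
    "index_d": 2,
    "index_e": 5,
}

best_k, votes = majority_vote(recommendations)
print(best_k, votes)
```

A tie-breaking rule is not addressed in this sketch; `Counter.most_common` simply returns the first of the tied values it encountered.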

Internal validation of clusters, once delineated, was assessed by three criteria: first, that a majority of the 40 factor values differed between clusters by t-test; second, that the clusters differed among/between themselves by two-group discriminant function analysis (DFA - see below); and third, that when the CON group was added to the newly defined ASD clusters, the CON group and the created ASD clusters remained separate by multi-group DFA. External cluster validation, in the absence of available, consistent neuropsychological and/or other domain-derived variables for the ASD subjects, was limited in the current study to the passive localization of previously evaluated [5] EEG coherence factor data from 26 ASP subjects within the multivariate space of the CON and ASD subject clusters.

Discriminant function analysis (DFA) and other statistical procedures

All statistical analyses, aside from PCA and cluster analyses, utilized the BMDP2007™ software package [42]. Program 7M (P7M) was used for the two- and three-group stepwise discriminant function analyses (DFA). P7M creates new canonical variables for maximal subject group separation. For a two-group analysis one discriminant function is produced, and for a three-group analysis two discriminant functions are produced. DFA defines the significance of a group separation, summarizes the classification of each participant, and provides an approach for the prospective classification of individuals not involved in creation of the discriminant rule [43, 44]. In order to estimate prospective classification success, the jackknifing technique, also referred to as the leaving-one-out process, was utilized. By this method, a discriminant function is formed on all individuals but one. The left-out individual is subsequently classified. This initial left-out individual is then folded back into the group (hence ‘jackknifing’), and a different individual is left out, a process which is repeated until all individuals have been left out and classified by a classification rule created on the remaining subjects. The measure of classification success is then based upon a tally of the correct classifications of the left-out individuals. An alternative technique to estimate prospective classification success was also utilized, namely split-half replication. By means of a random number generator, internal to P7M, the entire population was randomly divided into a training-set and a test-set. Classification rules were generated on the training-set and evaluated in terms of classification on the corresponding test-set. Such split-half replication was repeated five times.
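The leaving-one-out procedure can be sketched as below. This is not the BMDP P7M stepwise DFA: a simple nearest-centroid classifier is substituted purely to keep the example short, and the data are hypothetical toy values. Only the leave-one-out loop itself mirrors the description above.

```python
def centroid(rows):
    """Mean vector of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def nearest_centroid_label(x, centroids):
    """Label of the centroid with the smallest squared Euclidean distance to x."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def jackknife_accuracy(samples, labels):
    """Leave one subject out, fit on the rest, classify the left-out subject."""
    correct = 0
    for i, (x, true_label) in enumerate(zip(samples, labels)):
        # Training set: every subject except the one currently left out.
        train = [(s, l) for j, (s, l) in enumerate(zip(samples, labels)) if j != i]
        centroids = {
            label: centroid([s for s, l in train if l == label])
            for label in {l for _, l in train}
        }
        if nearest_centroid_label(x, centroids) == true_label:
            correct += 1
    return correct / len(samples)

# Toy two-group data (hypothetical), two features per subject.
samples = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [2.0, 2.1], [2.2, 1.9], [1.8, 2.0]]
labels = ["C1", "C1", "C1", "C2", "C2", "C2"]

accuracy = jackknife_accuracy(samples, labels)
print(accuracy)
```

Split-half replication follows the same pattern, except that a random half of the subjects is held out at once rather than one subject at a time.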


Cluster creation

NbClust was performed on 430 ASD subjects represented by 40 factor variables [27] using the hierarchical clustering method and asking for the development of up to 15 clusters. Results are shown in histogram form, see Fig. 2. Note that 17 of the 30 assessments identified a 2-cluster solution and by the majority rule of the hierarchical approach this was determined the optimal cluster configuration. NbClust was repeated now using K-means clustering with results shown in Fig. 3. NbClust once more indicated as optimal the 2-cluster-configuration; 10 of the 30 assessments ‘voted’ for this outcome. Since both clustering methods chose the 2 cluster solution as optimal, this configuration was taken as most representative of the full ASD population. Moreover, as hierarchical clustering produced the more definitive 17 of 30 ‘vote’, the ‘best’ two cluster solution as formed from hierarchical clustering, was accepted. The first cluster, referred to as Cluster 1 or C1, comprised 169 subjects and the second cluster, referred to as Cluster 2 or C2, contained 261 subjects.

Fig. 2

Optimal Cluster Number by Hierarchical Clustering and Program NbClust. Legend: NbClust produced histogram of up to 15 possible cluster groupings formed by Hierarchical clustering. Atop each vertical bar is the total number of the 30 indices used to estimate the optimal cluster grouping. Note that 17 of the 30 indices indicate the two cluster configuration as “optimal”. Cluster configurations never selected are omitted from the X axis as their frequency would be zero

Fig. 3

Optimal Cluster Number by K-means Clustering and Program NbClust. Legend: An NbClust produced histogram of up to 15 possible cluster groupings formed by K-means clustering. Atop each vertical bar is the total number of the 30 indices used to estimate the optimal cluster grouping. Note that 10 of the 30 indices indicate the two cluster configuration as “optimal”. Cluster configurations never selected are omitted from the X axis as their frequency would be zero

Separation between clusters; factors and demographic variables

Table 1 shows a two group t-test for each of the 40 factor variables between clusters C1 and C2. Of the 40 factors, 13 achieved highly significant cluster differences of p < 0.0001, and 11 achieved significant differences with p values ranging from p ≤ 0.0262 to ≤0.0002. Sixteen tests showed insignificant p values. Thus, 60% of the factors manifested significant between-clusters differences, with 32.5% being highly significant. As shown in Table 2, there were no statistically significant differences between the two clusters on the basis of gender, as tested by Chi-square, handedness, also tested by Chi-square, or age in years at time of study, tested by t-test.
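The percentages quoted above follow directly from the factor counts. A short arithmetic check, with illustrative variable names:

```python
N_FACTORS = 40
highly_significant = 13   # factors with p < 0.0001
significant = 11          # remaining factors reported as significant
not_significant = 16      # factors with insignificant p values

# The three counts should account for all 40 factors.
total = highly_significant + significant + not_significant

pct_any_significant = 100 * (highly_significant + significant) / N_FACTORS
pct_highly_significant = 100 * highly_significant / N_FACTORS

print(total, pct_any_significant, pct_highly_significant)
```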

Table 1 T-test Between Clusters 1 (C1) and 2 (C2)
Table 2 Demographic Differences Between Clusters 1 and 2

Separation of two ASD clusters by two-group DFA

Stepwise DFA (P7M) was performed between ASD clusters C1 and C2 on the initial basis of all 40 Factors as potential discriminating variables. Nineteen factors were selected (Table 3). Coherence loadings on the 19 DFA selected factors are illustrated in Fig. 4. In order to establish the sign, plus for positive, minus for negative, of the differential coherence loading for coherences associated within a given factor (red = positive or blue = negative), three analytic steps were considered for each factor: (1) The sign of loading of coherence upon a given factor at time of factor creation; (2) The sign of the factor loading upon the generated discriminant function variable; and (3) The sign of the C1 and C2 group outcome positions along the discriminant function axis. Note that the C1 - C2 difference involved factors showing both increased (12 red) and decreased (7 blue) coherences. No factor showed a combination of both increased and decreased coherence. Two-group separation by Wilks’ lambda was highly significant (0.342; F = 41.4; DF 19,410; p < 0.00001). Overall subject classification success was 95.8% directly and 94.7% by jackknifing. Graphic separation between the two cluster groups by the resulting discriminant function is shown in Fig. 5.
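For two groups, Wilks’ lambda converts to an F statistic through the standard relation F = ((1 − Λ)/Λ) · (df2/df1), with df1 equal to the number of selected variables p and df2 = N − p − 1. The sketch below applies this textbook relation to the reported values as a consistency check; it is not a reproduction of the P7M computation, and the function name is hypothetical.

```python
def wilks_to_f_two_groups(wilks_lambda, n_subjects, n_variables):
    """Convert a two-group Wilks' lambda to an F statistic and its degrees of freedom."""
    df1 = n_variables
    df2 = n_subjects - n_variables - 1
    f_value = (1.0 - wilks_lambda) / wilks_lambda * df2 / df1
    return f_value, df1, df2

# Reported values: lambda = 0.342, 430 ASD subjects, 19 selected factors.
f_value, df1, df2 = wilks_to_f_two_groups(0.342, 430, 19)
print(round(f_value, 1), df1, df2)
```

The degrees of freedom reproduce the reported 19 and 410, and the resulting F lies close to the reported 41.4; the small difference is plausibly due to lambda being quoted to three decimal places.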

Table 3 Separation of ASD Clusters 1 and 2 by Discriminant Function Analysis (DFA)
Fig. 4

Graphic Representation of 19 Coherence Factor Loadings Used in Separating Clusters 1 and 2. Legend: EEG coherence factor loadings. Heads in top view, scalp left to image left, nose above; Factor number is above heads to left and peak frequency for factor in Hz is above to right. Lines indicate top approximate 15% coherence loadings per factor: Red Lines = increased coherence in Cluster 1; Blue Lines = decreased coherence in Cluster 1. Involved electrodes are shown as white circles. Uninvolved electrodes are not shown; they are blackened-out within the superior scalp area and greened-out for scalp electrodes. Factors are shown in numerical order. See text for factor selection order in discriminant analysis

Fig. 5

C1 and C2 Cluster Groups Along 2-Group DFA by Discriminant Score. Legend: C1 and C2 histograms (red = C1, blue/green = C2) with X-axis the 2 group discriminant score. Note minimal overlap. Separation by Wilks’ lambda is significant (p < 0.00001) and overall individual subject classification is approximately 95% correct by jackknifing (see text)

Five split half replications were performed by DFA between clusters C1 and C2. The population split into test set and training set was performed by means of a random number generator. Results are shown in Table 4. Note that average correct classification of the left out ‘Test Set’ C1 group was 86.36%, and the left out ‘Test Set’ C2 group was 91.79%. Thus, by both jackknifing and by five split half replications there was strong evidence for successful prospective C1/C2 group classification.

Table 4 C1 vs. C2, Five Split-Half Replications

Separation among the two ASD clusters and the CON Group by three-group DFA

Stepwise DFA (P7M) was performed among ASD clusters C1, C2 and control group CON on the initial basis of all 40 Factors as potential discriminating variables (Table 5). Thirty-one factors were selected for use by DFA. Overall subject classification success among the three groups was 87.4% directly and 85.3% by jackknifing. Overall classification and the three separate pair comparisons were all statistically highly significant at p < 0.00001 for each of the four analyses. Graphic separation among the three groups (CON, C1, C2) that resulted from this three group DFA is illustrated in Fig. 6. The two discriminant functions served as X and Y axes.

Table 5 Separation of ASD Clusters 1 and 2 and the Control Group (CON) by 3 Group DFA
Fig. 6

C1, C2 and CON Groups Along 3-Group DFA by Discriminant Score. Legend: C1, C2, and CON group population distributions (red circle = C1, green triangle = C2, blue + = CON) with X and Y axes the two 3-group discriminant function scores. Note minimal population overlap. Overall and among-group separations are significant. Three-group subject classification is highly significant (see text). Note that hierarchical clustering results, upon which this figure is based, tend to illustrate linear group boundaries (whereas K-means clustering tends to produce more circular or ovoid boundaries [32])

Passive classification of subjects with Asperger’s syndrome (ASP)

Stepwise DFA was repeated among the three groups C1, C2, and CON on the basis of all 40 Factors. To this three-group population a fourth group of 26 previously studied subjects with Asperger’s syndrome (ASP) [5] was added as input data and set to be passively classified by the resulting C1-C2-CON based discriminant functions. The ASP subjects did not participate in the creation of the two discriminant variables. Results demonstrated that 19 of the 26 ASP subjects were passively classified as belonging within the C2 cluster and six within the C1 cluster; of these six subjects two fell into the C1-C2 cluster border zone. One fell within the CON group. These results are illustrated in Fig. 7.

Fig. 7

C1, C2, CON, and ASP Groups Along 3-Group DFA by Discriminant Score. Legend: C1, C2, and CON group population distributions as for Fig. 6. Now with passive classification of 26 Asperger (ASP) subjects. Note ASP population mostly overlaps with C2 ASD group and nearby regions of C1 group. (red circle = C1, green square = C2, blue x = CON, black square = ASP)


Results show that 430 subjects diagnosed as being on the autism “spectrum” and represented by 40 EEG coherence factors [27] fell into two distinct clusters. These two ‘autism clusters’ statistically differed from one another and, in turn, statistically differed from the 554 neurotypical control group subjects, who were not involved in the clustering process. Notably, the 40 utilized EEG coherence factor variables had been objectively derived [27] and a completely objective data-driven variable selection was applied. Furthermore, choice of the optimal cluster number was also objectively determined by use of a relatively recent software package, NbClust [25]. This program was instructed to form up to 15 clusters and to establish the optimal cluster configuration on the basis of the 30 methods [25] included in the program.

Finally, NbClust was run twice, first utilizing the hierarchical clustering technique and second utilizing the other commonly used K-means technique. Both techniques ‘voted’ the two cluster configuration as optimal; the choice was more definitive when hierarchical clustering was used. Thus, the optimal two cluster solution was selected on the basis of objectively derived EEG measures of brain connectivity.

In order to explore the potential clinical significance of the two autism clusters, advantage was taken of a prior study [5] that contrasted the control, ASD, and ASP populations and that had shown that ASP subjects were closer to the ASD population than the neurotypical control population, and also that ASP subjects were statistically separable and distinctly different from the ASD population. For the current study, these previously studied 26 ASP subjects were represented by the 40 EEG coherence factor variables and were utilized to determine whether these ASP subjects would passively fall within one or the other of the newly formed clusters. Notably these ASP subjects had not been utilized in the original cluster formation. As Fig. 7 shows, the majority of ASP subjects were within or close to the Cluster 2 domain, which suggests that C2 may be primarily associated with those subjects manifesting Asperger-like behavioral characteristics. It is of note that despite the multiple variable types and differing methods for clustering in the literature, a prominent two cluster distribution of autistic characteristics has been observed repeatedly by others [6, 8, 11, 14, 15].

A significant limitation of the study is the lack of external validation by similarly extensive subject data from other relevant domains, such as neuro-psychological evaluation as well as autistic specific evaluations such as the ADOS [32] or ADI-R [45], MRI [9, 46,47,48,49], genetic/epigenetic testing [50,51,52,53,54,55,56,57], and prolonged sleep EEG recordings for detection of epileptiform activity [29, 58,59,60,61].

The future establishment of correlations among such additional brain-based data with the EEG coherence-based findings described in this manuscript should be very helpful in further validating the EEG-based subgroups, and should facilitate interpretation of the coherence data by clinicians and scientists alike. The absence of correlative data in the current study does not invalidate the future use of the current data and findings for such correlative studies. For example, it was possible in the current study to insert data from 26 subjects with Asperger’s syndrome into the three-group discriminant analysis described here. As also previously reported, autistic patients could be additionally classified as having attention issues by using a discriminant function developed on a different population with attention disorders. Thus it was possible to explore attention disability within autism [40] by means of EEG data. Future studies utilizing the current study’s results would only require that subjects have waking EEG data. Such studies are anticipated.

It is also important to clarify that the current demonstration of two neurophysiological clusters within the autism spectrum does not preclude the possibility of further, relevant autism subdivisions. For example, in the future, as the neurodevelopmental characteristics of Clusters 1 and 2 are explored, there may prove to be additional clinically relevant sub-populations within or even across these two clusters.


Objective brain derived EEG coherence factor data strongly support the proposition that the autism disorder should not be seen as a continuous spectrum [1] but likely is formed from at least two distinct subpopulations. This is important since the ‘spectrum versus clusters’ issue goes beyond academic taxonomy and has a number of real world consequences: (1) For example, moving Asperger’s syndrome subjects into the autism spectrum disorder allows some US schools to develop and offer a single autism educational program that is oriented towards management and teaching of the ‘typical’ autistic child of limited verbal ability, who may present with behavioral issues. This may leave out or otherwise disadvantage children with Asperger’s Syndrome, who present with specific and often different behavioral and educational issues altogether and typically profit from more individually-tailored education. (2) On the other hand, in clinical autism research (e.g., neuro-behavioral evaluations, MRI based studies) it is often much easier to recruit and successfully study subjects with Asperger’s Syndrome or others who are high-functioning. However, results of such studies may be inappropriately generalized as findings characteristic of the entire autism spectrum. (3) The multiple different findings resultant from genetic and epigenetic studies of autism [54, 62,63,64,65] also contradict a unitary perspective on autism. Important correlational findings may be lost when all autism is treated as a single entity. It prevents identification of distinct subgroups based upon clinical insights, and/or neurobehavioral parameters, and/or direct brain parameters (as from MRI and EEG).

In addition, EEG, a classic and relatively inexpensive non-invasive test, is reasonably well tolerated by children with various forms of ASD and should be considered for inclusion in future studies of autism and of other neurobehavioral disorders [5, 27, 40, 66, 67].

As previously discussed [27], the authors believe that in clinical practice the diagnosis of ASD should follow the DSM-5 criteria and should be made by clinicians with special training in this area (e.g., neurologists, psychiatrists, psychologists) by use of readily available assessments such as the ADI-R [45]. EEG coherence study data may best serve as adjunctive, confirmatory, and/or exploratory information. At this point they are especially useful for the discovery of clinically relevant autism sub-populations.



Abbreviations

ADD: Attention deficit disorder
ADHD: Attention deficit hyperactivity disorder
ADI-R: Autism diagnostic interview - revised
ANOVA: Analysis of variance
Autism spectrum disorder checklist
ASD: Autism spectrum disorder
Asperger’s syndrome
BCH: Boston Children’s Hospital, an HMS-affiliated teaching hospital
Cluster one
Cluster two
Neurotypical control group
DFA: Discriminant function analysis
Developmental neurophysiology laboratory at BCH
DSM: Diagnostic and statistical manual (of mental disorders)
EEG: Electroencephalogram, electroencephalography
FFT: Fast Fourier transform, used for spectral analysis
HMS: Harvard Medical School (Boston, Massachusetts, USA)
IRB: Institutional review board of BCH
PCA: Principal components analysis
PDD-nos: Pervasive developmental disorders – not otherwise specified


  1. American Psychiatric Association. Diagnostic and statistical manual of mental disorders, fifth edition (DSM-5). Washington, DC: American Psychiatric Publishing, Incorporated; 2013.

  2. American Psychiatric Association. Diagnostic and statistical manual of mental disorders (3rd edition). Washington, DC: American Psychiatric Association; 1980.

  3. American Psychiatric Association. Diagnostic and statistical manual of mental disorders, DSM-IV. 4th ed. Washington, DC: American Psychiatric Association; 1994.

  4. Mattila ML, Kielinen M, Linna SL, Jussila K, Ebeling H, Bloigu R, Joseph RM, Moilanen I. Autism spectrum disorders according to DSM-IV-TR and comparison with DSM-5 draft criteria: an epidemiological study. J Am Acad Child Adolesc Psychiatry. 2011;50(6):583–92 e511.

  5. Duffy F, Shankardass A, McAnulty G, Als H. The relationship of Asperger's syndrome to autism: a preliminary EEG coherence study. BMC Med. 2013;11:175.

  6. Kienle X, Freiberger V, Greulich H, Blank R. Autism spectrum disorder and DSM-5: spectrum or cluster? Prax Kinderpsychol Kinderpsychiatr. 2015;64(6):412–28.

  7. Pruett JR, Povinelli DJ. Commentary - autism spectrum disorder: spectrum or cluster? Autism Res. 2016;9(12):1237–40.

  8. Eaves LC, Ho HH, Eaves DM. Subtypes of autism by cluster analysis. J Autism Dev Disord. 1994;24(1):3–22.

  9. Hrdlicka M, Dudova I, Beranova I, Lisy J, Belsan T, Neuwirth J, Komarek V, Faladova L, Havlovicova M, Sedlacek Z, et al. Subtypes of autism by cluster analysis based on structural MRI data. Eur Child Adolesc Psychiatry. 2005;14(3):138–44.

  10. Bitsika V, Sharpley CF, Orapeleng S. An exploratory analysis of the use of cognitive, adaptive and behavioural indices for cluster analysis of ASD subgroups. J Intellect Disabil Res. 2008;52(11):973–85.

  11. Ji NY, Capone GT, Kaufmann WE. Autism spectrum disorder in Down syndrome: cluster analysis of aberrant behaviour checklist data supports diagnosis. J Intellect Disabil Res. 2011;55(11):1064–77.

  12. Cuccaro ML, Tuchman RF, Hamilton KL, Wright HH, Abramson RK, Haines JL, Gilbert JR, Pericak-Vance M. Exploring the relationship between autism spectrum disorder and epilepsy using latent class cluster analysis. J Autism Dev Disord. 2012;42(8):1630–41.

  13. Palmer CJ, Paton B, Enticott PG, Hohwy J. ‘Subtypes’ in the presentation of autistic traits in the general adult population. J Autism Dev Disord. 2015;45(5):1291–301.

  14. Kitazoe N, Fujita N, Izumoto Y, Terada SI, Hatakenaka Y. Whether the autism spectrum quotient consists of two different subgroups? Cluster analysis of the autism spectrum quotient in general population. Autism. 2017;21(3):323–32.

  15. Tanaka S, Oi M, Fujino H, Kikuchi M, Yoshimura Y, Miura Y, Tsujii M, Ohoka H. Characteristics of communication among Japanese children with autism spectrum disorder: a cluster analysis using the Children's communication checklist-2. Clin Linguist Phon. 2017;31(3):234–49.

  16. Rutter M, Schopler E. Autism: a reappraisal of concepts and treatment. New York: Plenum Press; 1978.

  17. Sparrow S, Cicchetti D, Balla DA. Vineland-II, Vineland adaptive behavior scales, second edition. Minneapolis, MN: Pearson Assessments; 2005.

  18. Aman MG, Burrow WH, Wolford PL. The aberrant behavior checklist-community: factor validity and effect of subject variables for adults in group homes. Am J Ment Retard. 1995;100(3):283–92.

  19. Schopler E, Reichler RJ, Renner BR. The childhood autism rating scale manual, 8th printing edn. Irvington, NY: Western Psychological Services; 1999.

  20. Myles BS, Bock SJ, Simpson RL. Asperger syndrome diagnostic scale. Austin, TX: PRO-ED; 2001.

  21. Bolte S, Poustka F. Fragebogen zur Sozialen Kommunikation (FSK) [Questionnaire Regarding Social Communication]. Bern: Huber; 2006.

  22. Baron-Cohen S, Wheelwright S, Skinner R, Martin J, Clubley E. The autism-spectrum quotient (AQ): evidence from Asperger syndrome/high-functioning autism, males and females, scientists and mathematicians. J Autism Dev Disord. 2001;31(1):5–17.

  23. Volden J, Phillips L. Measuring pragmatic language in speakers with autism spectrum disorders: comparing the Children’s communication checklist—2 and the test of pragmatic language. Am J Speech-Lang Pathol. 2010;19:204–12.

  24. Bishop DVM. Children's communication checklist (2nd ed., US ed.). San Antonio, TX: Psychological Corporation; 2006.

  25. Charrad M, Ghazzali N, Boiteau V, Niknafs A. NbClust: an R package for determining the relevant number of clusters in a data set. J Stat Softw. 2014;61(6):1–36.

  26. Jain AK, Murty MN, Flynn PJ. Data clustering: a review. ACM Comput Surv. 1999;31(3):264–323.

  27. Duffy FH, Als H. A stable pattern of EEG spectral coherence distinguishes children with autism from neuro-typical controls - a large case control study. BMC Med. 2012;10(1):64.

  28. Kassambara A. Practical guide to cluster analysis in R - unsupervised machine learning. San Bernardino, CA: STHDA; 2017.

  29. Spence SJ, Schneider MT. The role of epilepsy and epileptiform EEGs in autism spectrum disorders. Pediatr Res. 2009;65(6):599–606.

  30. American Psychiatric Association. Diagnostic and statistical manual of mental disorders, fourth edition, text revision (DSM-IV-TR). Washington, DC: American Psychiatric Publishing, Inc.; 2000.

  31. Lord C, Risi S, Lambrecht L, Cook EH, Leventhal BL, DiLavore PC, Pickles A, Rutter M. The autism diagnostic observation schedule-generic: a standard measure of social and communication deficits associated with the spectrum of autism. J Autism Dev Disord. 2000;30:205–23.

  32. Lord C, Rutter M, DiLavore PC, Risi S, Gotham K. Autism diagnostic observation schedule - second edition (ADOS-2). Torrance, CA: Western Psychological Services; 2012.

  33. Hughes JR. EEG in clinical practice. Boston: Butterworth; 1982.

  34. Berg P, Scherg M. Dipole modeling of eye activity and its application to the removal of eye artifacts from EEG and MEG. Clin Phys Physiol Meas. 1991;12(Suppl A):49–54.

  35. Berg P, Scherg M. A multiple source approach to the correction of eye artifacts. Electroencephalogr Clin Neurophysiol. 1994;90:229–41.

  36. van Drongelen W. Signal processing for neuroscientists: an introduction to the analysis of physiological signals vol. 5. Oxford: Elsevier; 2011.

  37. Nunez PL, Srinivasan R. Electric fields of the brain: the neurophysics of EEG. 2nd ed. New York: Oxford University Press; 2006.

  38. Nunez PL, Srinivasan R, Westdorp AF, Wijesinghe RS, Tucker DM, Silberstein RB, Cadusch PJ. EEG coherency, 1: statistics, reference electrode, volume conduction, Laplacians, cortical imaging, and interpretation at multiple scales. Electroencephalogr Clin Neurophysiol. 1997;103(5):499–515.

  39. Semlitsch HV, Anderer P, Schuster P, Presslich O. A solution for reliable and valid reduction of ocular artifacts, applied to the P300 ERP. Psychophysiology. 1986;23(6):695–703.

  40. Duffy FH, Shankardass A, McAnulty GB, Als H. A unique pattern of cortical connectivity characterizes patients with attention deficit disorders: a large electroencephalographic coherence study. BMC Med. 2017;15(1):51.

  41. Wickham H, Grolemund G. R for data science. Sebastopol, CA: O'Reilly; 2016.

  42. Dixon WJ. BMDP statistical software (revised edition). Berkeley: University of California Press; 1985.

  43. Lachenbruch P, Mickey RM. Estimation of error rates in discriminant analysis. Technometrics. 1968;10:1–11.

  44. Lachenbruch PA. Discriminant analysis. New York: Hafner Press; 1975.

  45. Kim SH, Thurm A, Shumway S, Lord C. Multisite study of new autism diagnostic interview-revised (ADI-R) algorithms for toddlers and young preschoolers. J Autism Dev Disord. 2013;43(7):1527–38.

  46. Chen R, Jiao Y, Herskovits EH. Structural MRI in autism spectrum disorder. Pediatr Res. 2011;69:63R–8R.

  47. He Q, Karsch K, Duan Y. Abnormalities in MRI traits of corpus callosum in autism subtype. Conf Proc. 2008;1:3900–3.

  48. Just M, Cherkassky V, Keller T, Kana R, Minshew N. Functional and anatomical cortical underconnectivity in autism: evidence from an fMRI study of an executive function task and corpus callosum morphometry. Cereb Cortex. 2007;17:951–61.

  49. Yu KK, Cheung C, Chua SE, McAlonan GM. Can Asperger syndrome be distinguished from autism? An anatomic likelihood meta-analysis of MRI studies. J Psychiatry Neurosci. 2011;36(6):412–21.

  50. Bailey A, Le Couteur A, Gottesman I, Bolton P, Simonoff E, Yuzda E, Rutter M. Autism as a strongly genetic disorder: evidence from a British twin study. Psychol Med. 1995;25(1):63–77.

  51. Benvenuto A, Moavero R, Alessandrelli R, Manzi B, Curatolo P. Syndromic autism: causes and pathogenetic pathways. World J Pediatr. 2009;5(3):169–76.

  52. Constantino JN, Zhang Y, Frazier T, Abbacchi AM, Law P. Sibling recurrence and the genetic epidemiology of autism. Am J Psychiatry. 2010;167(11):1349–56.

  53. Folstein S, Rutter M. Infantile autism: a genetic study of 21 twin pairs. J Child Psychol Psychiatry. 1977;18(4):297–321.

  54. Forsberg SL, Ilieva M, Maria Michel T. Epigenetics and cerebral organoids: promising directions in autism spectrum disorders. Transl Psychiatry. 2018;8(1):14.

  55. Hallmayer J, Cleveland S, Torres A, Phillips J, Cohen B, Torigoe T, Miller J, Fedele A, Collins J, Smith K, et al. Genetic heritability and shared environmental factors among twin pairs with autism. Arch Gen Psychiatry. 2011;68(11):1095–102.

  56. Happé F, Ronald A. The ‘fractional autism triad’: a review of evidence from behavioral, genetic, cognitive, and neural research. Neuropsychol Rev. 2008;18:287–304.

  57. Muhle R, Trentacoste SV, Rapin I. The genetics of autism. Pediatrics. 2004;113(5):e472–86.

  58. Chez MG, Chang M, Krasne V, Coughlan C, Kominsky M, Schwartz A. Frequency of epileptiform EEG abnormalities in a sequential screening of autistic patients with no known clinical epilepsy from 1996 to 2005. Epilepsy Behav. 2006;8(1):267–71.

  59. Mulligan CK, Trauner DA. Incidence and behavioral correlates of epileptiform abnormalities in autism spectrum disorders. J Autism Dev Disord. 2014;44(2):452–8.

  60. El Achkar CM, Spence SJ. Clinical characteristics of children and young adults with co-occurring autism spectrum disorder and epilepsy. Epilepsy Behav. 2015;47:183–90.

  61. Valvo G, Baldini S, Retico A, Rossi G, Tancredi R, Ferrari AR, Calderoni S, Apicella F, Muratori F, Santorelli FM, et al. Temporal lobe connects regression and macrocephaly to autism spectrum disorders. Eur Child Adolesc Psychiatry. 2016;25(4):421–9.

  62. Miller DT, Adam MP, Aradhya S, Biesecker LG, Brothman AR, Carter NP, Church DM, Crolla JA, Eichler EE, Epstein CJ, et al. Consensus statement: chromosomal microarray is a first-tier clinical diagnostic test for individuals with developmental disabilities or congenital anomalies. Am J Hum Genet. 2010;86(5):749–64.

  63. Sanders SJ, He X, Willsey AJ, Ercan-Sencicek AG, Samocha KE, Cicek AE, Murtha MT, Bal VH, Bishop SL, Dong S, et al. Insights into autism spectrum disorder genomic architecture and biology from 71 risk loci. Neuron. 2015;87(6):1215–33.

  64. Green Snyder L, D'Angelo D, Chen Q, Bernier R, Goin-Kochel RP, Wallace AS, Gerdts J, Kanne S, Berry L, Blaskey L, et al. Autism spectrum disorder, developmental and psychiatric features in 16p11.2 duplication. J Autism Dev Disord. 2016;46(8):2734–48.

  65. Sandin S, Lichtenstein P, Kuja-Halkola R, Hultman C, Larsson H, Reichenberg A. The heritability of autism spectrum disorder. J Am Med Assoc. 2017;318(12):1182–4.

  66. Duffy FH, D'Angelo E, Rotenberg A, Gonzalez-Heydrich J. Neurophysiological differences between patients clinically at high risk for schizophrenia and neurotypical controls - first steps in development of a biomarker. BMC Med. 2015;13:276.

  67. Duffy FH, McAnulty GB, McCreary MC, Cuchural GJ, Komaroff AL. EEG spectral coherence data distinguish chronic fatigue syndrome patients from healthy controls and depressed patients - a case control study. BMC Neurol. 2011;11:82.


The authors thank the children and their families who participated in subject data acquisition. The authors furthermore thank registered EEG technologists Herman Edwards, Jack Connolly, and Sheryl Manganaro for the quality of their work and their consistent efforts over the years. Author FHD expresses gratitude to Caterina Stamoulis, PhD for her assistance with installing, and instruction in using, the R software package and the NbClust program. FHD expresses special gratitude to the late Peter Bartels, PhD of the University of Arizona for his past mentoring in the complex art and science of cluster analysis. FHD also thanks April Kim for preliminary cluster analyses utilizing the program Pindex. FHD additionally thanks Aditi Shankardass, PhD for her advice and assistance in project planning.


This work was supported in part by grants from the John Leopold Weil and Geraldine Rickard Weil Memorial Charitable Foundation, Newton, MA, USA and the Irving Harris Foundation, Chicago, IL, USA as well as the Buehler Family, Mill Valley, CA, USA (to author HA) and the Intellectual and Developmental Disabilities Research Center grant HD018655 (to S. Pomeroy). All funding sources are aware of, but were not involved in, the experimental design, execution, and results of the current analyses. No funding agency requires pre-publication review. The BCH Departments of Neurology and Psychiatry provide general support for the facilities utilized by this project.

Availability of data and materials

Laboratory policy restricts availability of the raw data and materials to the investigators due to the inherent complexity and volume of raw EEG data and the unique, complex file structures for such data storage and associated data analyses.

Author information


FHD and HA contributed equally to the study’s concepts and design, selection of patients and subjects, and interpretation of results. Additionally, HA translated the Kienle et al. paper [6], which was published in German. FHD contributed to the acquisition and preparation of neurophysiologic data and to the statistical analyses. FHD had full access to all the data in the study and takes responsibility for all aspects of the study, including the integrity and accuracy of the data and of the data analyses. Both authors collaborated in writing and editing the paper and approved the final manuscript.

Corresponding author

Correspondence to Frank H. Duffy.

Ethics declarations

Authors’ information

FHD is a physician, child neurologist, clinical electroencephalographer and research neurophysiologist with degrees in electrical engineering and mathematics. Current research interests are in neuro-developmental disorders and epilepsy, including the development and utilization of specialized analytic techniques to support related investigations. As a clinician FHD has evaluated and managed many patients on the ‘Autism Spectrum’ and has evaluated and officially reported a large number of clinical EEGs for the BCH Division of Epilepsy and Clinical Neurophysiology. HA is a research psychologist and licensed clinical psychologist with research interests and considerable clinical expertise in newborn, infant and child neuro-development, including generation of early predictors of later outcome from behavioral, MRI, and neurophysiological data.

Ethics approval and consent to participate

All control subjects, as appropriate, and/or their families or guardians gave written informed consent in accordance with protocols approved by the Institutional Review Board (IRB) of Boston Children’s Hospital, Office of Clinical Investigation, which is in keeping with the Declaration of Helsinki, a statement of ethical principles for medical research involving human subjects. Consent was provided by the parents or legally appointed representatives of all minors included in the research presented in this manuscript. The approved protocol is in full compliance with the Declaration of Helsinki. All previous clinical EEG studies of subjects with autism were separately approved for research analysis and subsequent publications by the above IRB with the condition that all data be de-identified. This protocol is also in full compliance with the Declaration of Helsinki. All data for this project were de-identified prior to analysis.

Consent for publication

Not applicable (see paragraph above).

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Duffy, F.H., Als, H. Autism, spectrum or clusters? An EEG coherence study. BMC Neurol 19, 27 (2019).
