Vol. 25, No. 1, 2019, pages 13-22

Virtual Reality-based Assessment and Rating Scales in ADHD Diagnosis

[La evaluación mediante realidad virtual y las escalas de valoración en el diagnóstico del TDAH]

Aitziber Zulueta (a), Unai Díaz-Orueta (b), Nerea Crespo-Eguilaz (c), and Fermín Torrano (a)

(a) Universidad Internacional de La Rioja, Logroño, Spain; (b) Maynooth University, Maynooth, Ireland; (c) Navarra University Clinic, Pamplona, Spain

Received 24 January 2018, Accepted 8 June 2018


The diagnosis of Attention Deficit Hyperactivity Disorder (ADHD) is mainly based on structured scales, such as Conners’ ADHD Rating Scale (EDAH in the Spanish version), and DSM interviews. The use of technologies in neuropsychological assessment, such as the AULA virtual reality-based test, leads to more accurate diagnosis. The current study presents findings from analyzing the external validity of AULA and its contribution to the diagnosis of ADHD. Four hundred and seven children (272 boys and 135 girls) from 6 to 16 years old (213 with an ADHD diagnosis, of whom 105 were inattentive and 108 combined-type, and 194 controls) were evaluated. First, a factor analysis of AULA variables was conducted in order to reduce the data to factors; 5 factors or components that account for 82.37% of the total variance were obtained from the 407 subjects, namely, sustained attention, impulsivity control, processing speed, response variability, and control of motor activity. Second, a discriminant analysis was performed on participants’ scores on the five factors, showing that AULA presents moderate levels of both specificity and sensitivity. Finally, in order to study whether AULA adds relevant information in the diagnosis of ADHD, a cluster analysis was carried out, yielding 4 clusters in the control group and 6 clusters in the ADHD group. In summary, the AULA test shows adequate external validity, allows correct classification of children with and without attentional problems, and confirms and provides additional ADHD diagnostic information that is essential for the design of interventions.


El diagnóstico del trastorno por déficit de atención y/o hiperactividad (TDAH) se basa principalmente en escalas estructuradas, como la escala de Conners (validada en España con el nombre de EDAH), y entrevistas basadas en los criterios recogidos en el DSM. El uso de las nuevas tecnologías en el proceso de evaluación neuropsicológica, como el Test de Atención de Realidad Virtual AULA, proporciona información rigurosa que conduce a diagnósticos más precisos. En este contexto, el objetivo del presente estudio es proporcionar evidencias acerca de la validez externa del test AULA, un test que mejora la precisión diagnóstica del TDAH. La muestra estuvo formada por 407 niños (272 niños y 135 niñas) de entre 6 y 16 años del norte de España (213 niños tenían un diagnóstico de TDAH, 105 desatentos, 108 de tipo  combinado y 194 fueron niños controles). En primer lugar se realizó un análisis factorial exploratorio, que puso de manifiesto una estructura formada por cinco factores (atención sostenida, control de la impulsividad, velocidad de procesamiento, variabilidad de respuesta y control de la actividad motora), que son capaces de explicar el 82.37% de la varianza. En segundo lugar, se realizó un análisis discriminante, de acuerdo con el rendimiento obtenido por los participantes en los cinco factores obtenidos en el test AULA, que mostró niveles moderados de especificidad y sensibilidad. Finalmente, se llevó a cabo un análisis de conglomerados con el grupo control y seis grupos de clúster dentro del grupo de niños con TDAH. Los resultados muestran una adecuada validez externa del instrumento AULA, que permite realizar clasificaciones correctas de sujetos sanos y patológicos, confirmándose que proporciona información adicional para el diagnóstico de TDAH, esencial para el posterior diseño de intervenciones clínicas eficaces.

Palabras clave

TDAH, Escala de Valoración del TDAH de Conners, Escala de TDAH para Profesores, Test de Atención de Realidad Virtual AULA, Escalas de valoración, Sensibilidad, Especificidad, Diagnóstico


Keywords

ADHD, Conners’ ADHD Parent Rating Scale, EDAH Teacher Rating Scale, AULA Attention Virtual Reality Test, Rating scales, Sensitivity, Specificity, Diagnosis

Cite this article as: Zulueta, A., Díaz-Orueta, U., Crespo-Eguilaz, N., & Torrano, F. (2018). Virtual reality-based assessment and rating scales in ADHD diagnosis. Revista Psicología Educativa, 25, 13-22. 

Correspondence: (A. Zulueta).


In recent years, research on the diagnosis of Attention Deficit Hyperactivity Disorder (ADHD) has certainly been ‘in vogue’. Many popular books have been published, several companies are marketing new drugs for ADHD, and the term is commonly used among parents and teachers. Specialized services, including extra time for standardized assessment and psychoeducational intervention, are afforded to children and adolescents diagnosed with ADHD, a fact that improves the likelihood of a more accurate ADHD diagnosis (Gualtieri & Johnson, 2005).

ADHD is defined by the DSM-5 as a persistent pattern of inattention and/or hyperactivity-impulsivity that interferes with functioning or development, presents symptoms in two or more settings (e.g., at home, school, or work; with friends or relatives; in other activities), and has a direct negative impact on social, academic, or occupational functioning. Several symptoms must be present before the age of 12. Depending on the diagnostic criteria and procedures that are used, the percentage of children affected by ADHD worldwide varies considerably. According to the meta-analysis performed by Willcutt (2012), the overall prevalence of ADHD ranged from 4% to 13.3% depending on the specific procedures used to integrate information from multiple raters and to measure functional impairment. These results clearly illustrate how sensitive prevalence estimates are to methodological differences. Moreover, the lack of clear guidelines or specific directives for making an ADHD diagnosis may also explain these differences in prevalence, not only between countries but also between individual clinicians.

Diagnostic criteria are properly defined and specified both in the latest versions of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV, IV-TR, and 5) (APA, 1996, 2000, 2013) and in the International Classification of Diseases (ICD-10) (World Health Organization, 1992). In general terms, the diagnosis of ADHD is mainly based on clinical interviews with the patient and on third parties’ observations collected by means of structured scales such as the ADHD Rating Scale-IV (DuPaul, Power, Anastopoulos, & Reid, 1998) or EDAH for teachers (Farré & Narbona, 2010), as well as DSM-based interviews with parents. Rating scales are systematic and quantitative but, unfortunately, relying on reported symptoms or third-party observation is by definition not fully objective. Thus, while rating scales are seen as a necessary component of the diagnostic process, they are not sufficient for an accurate diagnosis. Faced with an ambiguous clinical picture with mixed ratings from parents and teachers, the practitioner is often tempted to refer children to specialists, usually clinical child psychologists who are well trained to diagnose ADHD and provide a full cognitive assessment (Gualtieri & Johnson, 2005).

Rating Scales

Clinicians may not be able to perform daily life behavioral observations such as those collected by means of validated questionnaires. Hence, that information is not directly available to them and has to be provided by third parties who have this kind of interaction with the child. Thus, parents and teachers are the main sources of information on the child’s behavioral disorders, and their reports are necessary to establish an accurate ADHD diagnosis. Rating scales and clinical interviews are complementary instruments for ADHD diagnosis, and their joint use allows the frequency and intensity of symptoms to be assessed.

However, studies about the usefulness of the information provided by these informants about subjects with and without ADHD are not conclusive. The variability of these results seems to be related to the specific diagnostic criteria, the measures, and even the geographical area in which the study was carried out (Polanczyk, De Lima, Horta, Biederman, & Rohde, 2007). Other factors associated with these differences are the type of sample (community-dwelling vs. clinical sample), sociocultural features (Blázquez-Almería et al., 2005), and the information source (parents, teachers, or children).

With regard to the type of sample and the ADHD subtype or clinical presentation, the inattentive presentation has reportedly been more frequent in community samples, while the combined presentation is more frequent in clinical samples. According to Capdevila-Brophy, Navarro-Pastor, Artigas-Pallarés, and Obiols-Llandrich (2007), boys are overrepresented in both subtypes, while girls, though a minority, are better represented in the inattentive group. Additionally, children in the combined group are identified and treated earlier than those in the inattentive group. In this sense, a gender-based misdiagnosis bias has been reported (Bruchmüller, Margraf, & Schneider, 2012). Because both over- and underdiagnosis of ADHD can have harmful consequences for the patient, it is important to strive for an accurate diagnosis that is based on accepted diagnostic criteria and not biased by other factors. There is an ongoing scientific and public debate about the potential overdiagnosis of ADHD in children. Only a few empirical studies have addressed this important issue and, broadly, they have found a trend toward overdiagnosis. It appears that not all therapists adhere strictly to the DSM or ICD requirement to base their diagnosis on a thorough evaluation of the relevant diagnostic criteria (Bruchmüller et al., 2012).

The problem of over- or underdiagnosis is compounded by the fact that mental disorders are often associated with negative perceptions, and these perceptions may have pernicious consequences (Eisenberg & Schneider, 2007; Narad et al., 2015). Consequences of stigma may include lower self-esteem, increased feelings of hopelessness and isolation, and a reduced likelihood of seeking and obtaining care. Negative perceptions are also a known concern in the specific context of ADHD in childhood. Children’s self-esteem and self-confidence have been shown to be highly sensitive to the perceptions of the disorder held by family, teachers, and peers (Roe, 1998; Wheeler & Carlson, 1994). Furthermore, parents’ and teachers’ perceptions and expectations affect their interactions with other children, which can in turn affect children’s behavior and academic success (Chi & Hinshaw, 2002). Several studies reveal that teachers’ and parents’ expectations may create a self-fulfilling prophecy and even affect the level of post-secondary education a child attains (Madon, Guyll, & Spoth, 2004).

In addition, discrepancies between information sources may add more confusion to the achievement of a proper ADHD diagnosis. For example, as Eisenberg and Schneider (2007) suggest, there is a variety of non-mutually exclusive reasons that explain why parents’ perceptions of children diagnosed with ADHD are more negative, on average, than teachers’ judgments. Parents may be less familiar with ADHD, which may lead them to hold genuinely more negative views about the disorder. Alternatively, children with a diagnosis of ADHD may behave more disruptively and perform assignments more poorly at home than at school, especially when medication effects are restricted to the time they spend in the educational environment. Finally, teachers are probably less likely to be aware of ADHD diagnoses (Cáceres & Herrero, 2011; Wolraich et al., 2004), although they tend to identify the disorder more often than parents (Sandberg, 2002).

However, each reporter provides unique and valid clinical information related to ADHD symptom presentation that should not be ignored (Narad et al., 2015). For that reason, the clinical utility of behavioral rating scales, in contrast to structured interviews, relies on a dimensional methodology that provides an assessment of behavior and emotional difficulties along a continuum from normal to abnormal, without a clear delimitation between the presence or absence of the disorder (Achenbach & McConaughy, 1996).

Neuropsychological Testing

Neuropsychological evaluation has progressively become a part of the protocol for an efficient approach to the understanding of ADHD (Holmes et al., 2010). Conventional “psychological” tests can indeed capture the elements of inattention and impulsivity that characterize patients with ADHD. This kind of testing also identifies cognitive disabilities that are present and make the management of the disorder more complex. However, there are two major limitations of neuropsychological testing as a routine approach to ADHD diagnosis: one concerns cost, and the other is that ordinary “psychoeducational” batteries provide only an indirect measure of ADHD. Attention deficits, locomotor hyperactivity, and cognitive impulsivity may or may not be evidenced when a child is in a small room, face to face with a single adult (Gualtieri & Johnson, 2005). As a consequence, evaluation by means of conventional neuropsychological batteries has received considerable criticism for its low predictive value, which leads to numerous false negatives (i.e., patients with ADHD wrongly discarded as not suffering the disorder, thus preventing them from receiving appropriate treatment), a lack of correlation between ADHD symptoms and neuropsychological deficits (Bolea-Almagnac et al., 2014; Epstein et al., 2003), a limited clinical use confined to individual descriptions (Lange et al., 2014), and a lack of ecological validity (Bolea-Almagnac et al., 2014; Gualtieri & Johnson, 2005; Parsons, 2016).

These criticisms have opened the door to new neuropsychological evaluation methods, such as computerized tests of attention. According to several authors (Bioulac et al., 2012; Díaz-Orueta et al., 2014; Etchepareborda, Paiva-Barón, & Abad, 2009), computerized tests increase a patient’s interest and motivation for the task, making the assessment more entertaining and enjoyable. They allow more precise control of time (regarding both stimulus presentation and response), and the “experimenter effect” is reduced to a minimum. Moreover, they save time, significantly improve the accuracy and speed of scoring, and allow registering a series of variables and scores that exceed human capacities (Conners, Epstein, Angold, & Klaric, 2003). All these features also tend to increase the opportunities for research (Etchepareborda et al., 2009).

In this sense, as many researchers remark (Epstein et al., 2011; Negut, Jurma, & David, 2017; Nolin et al., 2016; Parsey & Schmitter-Edgecombe, 2013), the clinical application of Virtual Reality (VR) certainly provides new opportunities for assessment, such as customization for target populations, specific cognitive domains, and unique settings. VR scenarios allow for measurement of simulated everyday tasks in a safe and controlled environment and offer the opportunity to assess the influence of environmental stimuli (e.g., distractions, interruptions) on cognitive performance, which may provide a more ecologically valid assessment of everyday skills (Díaz-Orueta, 2017; Parsey & Schmitter-Edgecombe, 2013). With regard to ecological validity, there is a need to move beyond the limited generalizability of results found in many established neuropsychological batteries to measures that more closely approach real-world functioning (Parsons, Carlew, Magtoto, & Stonecipher, 2017; Parsons, 2016). As these authors suggest, a more ecological approach to neuropsychological assessment implies moving from construct-driven assessments to tests that are really “representative” of real-world “functions” and provide results that are “generalizable” for predicting functional performance across a range of situations (function-led approach), that is, tests that capture the complexity of the response required in many multistep tasks found in everyday activities.

When it comes to VR applied to the Continuous Performance Test (CPT), considered the most accurate tool to measure attentional processes (Epstein et al., 2011; Negut et al., 2017; Negut, Matu, Sava, & David, 2017; Nolin et al., 2016; Tarantino, Cutini, Mogentale, & Bisiacchi, 2013), VR platforms promote the collection of additional cognitive and behavioral information about neuropsychological testing performance beyond the data obtained through traditional assessments. CPTs conducted through VR seem to be better at detecting problems experienced by users because they emphasize the ability to be representative of people’s functioning in everyday situations (Nolin et al., 2016). However, a recent meta-analysis conducted by Negut et al. (2017) pointed out that VR measures are more complex and difficult because they replicate conditions similar to everyday life and, as a consequence, performance on these tests is usually poorer than on classical measures. Thus, tasks embedded in VR may have an increased level of complexity and difficulty that requires additional cognitive resources. This could be one of the reasons for the resistance to adopting additional measures that add technology components to existing tests, rating scales, and questionnaires (Parsey & Schmitter-Edgecombe, 2013). The field of neuropsychology has made less progress than would be expected in comparison to other fields in terms of adopting technology and, hence, the potential for an effective use of technology has not yet been realized. Greater awareness in this regard would imply recognizing that an adequate adoption of new technologies could result in a more comprehensive assessment of cognitive dysfunction and, in turn, in better informed diagnoses and treatments (Díaz-Orueta, 2017).


The goal of the current study is twofold: 1) to test the external validity of the AULA test by identifying its core factors and determining whether it provides valid measures of inattention, impulsivity, and hyperactivity, and 2) to analyze the accuracy of the AULA test in supporting the diagnosis of ADHD in addition to subjective reports provided by third parties (i.e., parents and teachers). First, it is hypothesized that AULA will show clear factors associated with measures of inattention, impulsivity, processing speed, sustained attention or vigilance, and hyperactivity. Second, it is expected that the information provided by AULA will prove more accurate in correctly identifying ADHD cases and discarding non-ADHD individuals than subjective reports provided by third parties.


Sample and Diagnostic Procedures

The final study sample comprised 407 children (272 boys and 135 girls) between 6 and 16 years old. Among them, 213 had an ADHD diagnosis: 105 were inattentive (73 boys and 32 girls) and 108 were combined-type (83 boys and 25 girls), while 194 were typically developing controls (116 boys and 78 girls). They were recruited from two private primary and secondary schools, a neuropsychology clinic, and a pediatric neurology clinic in northern Spain (see Table 1).

Table 1

Descriptive Statistics and Significant Differences between Subtypes of ADHD

Note. ADHD-C = combined subtype; ADHD-I = inattentive subtype; ns = no significant differences. (1) In items 1, 3, and 5; (2) in items 6, 15, and 16.

The clinical sample comprised 213 patients with an ADHD diagnosis who fulfilled the following inclusion criteria: a) attending outpatient services in a neuropediatric unit or at the psychopedagogical services, where a neuropsychological assessment is performed as part of the diagnostic process; b) showing an IQ within normal limits (IQ > 80); and c) consenting to participate in this study. Ethics approval was obtained for the study, and parents provided written informed consent on behalf of their children.

ADHD diagnosis was made by a clinical diagnostic team who considered data from both parents’ ratings (DSM-IV-TR criteria) and teachers’ ratings (EDAH for teachers) (Farré & Narbona, 2010), as well as clinical interviews with children and their parents. No significant difference was found between the information gathered from parents and the information from teachers (McNemar = .125, p = .05). Children were considered to meet the criteria for a symptom domain or clinical presentation (inattention and/or hyperactivity/impulsivity) if the parent and teacher rating scales reported at least 6 non-overlapping symptoms in that domain. Children who met this criterion for both domains were assigned to the combined-type group, while children who met it only for inattention were assigned to the inattentive group. Children with ADHD who were taking stimulant medication were included in the study but were required to be off medication for 48 hours prior to testing.

There were no statistically significant differences according to age in the symptomatology observed by parents (χ² = 16.546, p = .085) or teachers (χ² = 15.437, p = .117). Moreover, there were no differences according to gender in the symptomatology reported either by parents (χ² = 1.452, p = .228) or teachers (χ² = .499, p = .480). Hence, no analysis was performed using age or gender as covariates.
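As a rough consistency check on the gender comparisons, the p value of a chi-square statistic with one degree of freedom can be recovered from the complementary error function. The sketch below assumes df = 1 (a 2 × 2 table of gender by symptom presence), which the text does not state, and reads the teachers’ statistic as .499; the function name is illustrative.

```python
import math


def chi2_p_value_df1(chi2_stat: float) -> float:
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom.

    For df = 1 the statistic is the square of a standard normal deviate,
    so P(X > x) = erfc(sqrt(x / 2)).
    """
    if chi2_stat < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.erfc(math.sqrt(chi2_stat / 2.0))


# Gender comparisons reported in the text (df = 1 is an assumption).
for informant, stat in [("parents", 1.452), ("teachers", 0.499)]:
    print(f"{informant}: chi2 = {stat:.3f}, p = {chi2_p_value_df1(stat):.3f}")
```

Under that assumption, both reported p values (.228 and .480) are reproduced to three decimals.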

The ratio of boys to girls in the ADHD group is almost 3:1 (more specifically, 2.7 boys per girl). Epidemiological studies conducted in schools using gender- and age-validated questionnaires have shown a similar incidence (Cardó, Servera-Barceló, & Llobera, 2007).

Controls showed minimal symptoms of ADHD on the same parent and teacher rating scales and did not meet criteria for any other behavioral disorder. Participants with a full-scale IQ below 70 on the Wechsler Intelligence Scale for Children (WISC-IV) were excluded from the current analyses (Table 2).

Table 2

Descriptive Statistics of IQ and Subscales (Group with ADHD)

To carry out the second objective of the study, only those subjects in the sample whose EDAH results were registered item by item were considered. Thus, 227 children were included, 72 with ADHD diagnoses (37 inattentive and 35 combined-type) and 155 controls.


All children were administered the AULA neuropsychological test, designed for a comprehensive evaluation of attention processes; parents completed the ADHD Rating Scale-IV (DuPaul et al., 1998), and teachers filled out the EDAH rating scale (Farré & Narbona, 2010). For clinical subjects, a standard assessment procedure consisting of an open clinical interview with parents and a more extensive and thorough cognitive evaluation was also conducted.

ADHD Rating Scale. The ADHD Rating Scale-IV was administered for diagnostic purposes to all parents of children in the sample. This measure includes the 18 symptom criteria for a diagnosis of ADHD according to the DSM-IV-TR (DuPaul et al., 1998).

Parents were asked about the frequency of each of the child’s symptoms on a 0 to 3 Likert-type rating scale: 0 = never, 1 = sometimes, 2 = often, and 3 = very often. To fulfill the criteria for a symptomatic ADHD diagnosis, six of nine symptoms must occur often or very often in one or both subscales.
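The six-of-nine rule can be written down compactly. The following is a minimal sketch, assuming the nine ratings of each subscale are passed as separate lists; the function names and presentation labels are hypothetical, and it covers only the symptom count, not the full clinical diagnosis.

```python
from typing import Sequence

SYMPTOM_CUTOFF = 2       # ratings of 2 (often) or 3 (very often) count as a symptom
SYMPTOMS_REQUIRED = 6    # six of the nine items in a subscale


def count_symptoms(ratings: Sequence[int]) -> int:
    """Number of items in one nine-item subscale rated often or very often."""
    if len(ratings) != 9 or any(r not in (0, 1, 2, 3) for r in ratings):
        raise ValueError("expected nine ratings, each between 0 and 3")
    return sum(1 for r in ratings if r >= SYMPTOM_CUTOFF)


def symptomatic_presentation(inattention: Sequence[int],
                             hyperactivity_impulsivity: Sequence[int]) -> str:
    """Apply the six-of-nine rule to each subscale (labels are illustrative)."""
    inattentive = count_symptoms(inattention) >= SYMPTOMS_REQUIRED
    hyperactive = count_symptoms(hyperactivity_impulsivity) >= SYMPTOMS_REQUIRED
    if inattentive and hyperactive:
        return "combined"
    if inattentive:
        return "inattentive"
    if hyperactive:
        return "hyperactive-impulsive"
    return "below threshold"


# Hypothetical parent ratings for one child.
inattention_ratings = [3, 2, 2, 3, 2, 2, 1, 0, 1]    # six items rated >= 2
hyperactivity_ratings = [1, 0, 2, 1, 0, 0, 1, 2, 0]  # two items rated >= 2
print(symptomatic_presentation(inattention_ratings, hyperactivity_ratings))
```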

EDAH Rating Scale. EDAH is a revised Spanish version of the Conners’ Teacher Rating Scale-Revised. It is standardized for primary school children (6-12 years) and has proven to be very helpful in the assessment of ADHD. This rating scale shows excellent indicators of reliability and validity. EDAH allows collecting information on the child’s typical behavior at school. It provides a structured observation method for the teacher, comprising 20 items (derived from DSM criteria) that are easy to understand and require minimal time to complete. Each item is scored from 0 to 3, with higher scores indicative of the presence of symptoms.

EDAH comprises three subscales: inattention, hyperactivity/impulsivity, and behavioral problems. However, since AULA only measures attention and hyperactivity and tests that can potentially quantify EDAH items related to behavior problems could not be found, EDAH scale items related to behavior problems were not considered in the present study. Items that were finally analyzed and subareas to which they belong are listed below:

Hyperactivity (H): item 1, “excessive motor activity”, item 3, “frequently annoys other children”, item 5, “demands immediate satisfaction”, item 13, “moves constantly, uneasy”, and item 17, “impulsive and irritable”.

Inattention (I): item 2, “has academic learning problems”, item 4, “distracts easily, shows poor attention”, item 7, “has head up in the clouds”, item 8, “does not end what he/she starts”, and item 19, “frustrates his efforts easily, inconstant”.

Teachers rated each item on a Likert scale according to the degree to which the child shows the behavior described: 1 = none, 2 = little, 3 = quite, and 4 = a lot. For the statistical analyses, scores were grouped into the following categories: 0 = shows no symptoms (ratings 1 and 2) and 1 = shows symptoms (ratings 3 and 4).
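The recoding into presence or absence of each symptom can be sketched as follows. This is an illustration only: the function names and the example ratings are hypothetical, while the item numbers per subscale are those listed above.

```python
from typing import Dict

# EDAH items analyzed in the study, grouped by subscale.
ANALYZED_ITEMS = {
    "hyperactivity": (1, 3, 5, 13, 17),
    "inattention": (2, 4, 7, 8, 19),
}


def dichotomize(rating: int) -> int:
    """Recode a 1-4 teacher rating: 1 or 2 -> 0 (no symptom), 3 or 4 -> 1 (symptom)."""
    if rating not in (1, 2, 3, 4):
        raise ValueError("rating must be 1, 2, 3, or 4")
    return 1 if rating >= 3 else 0


def recode_edah(ratings: Dict[int, int]) -> Dict[str, Dict[int, int]]:
    """Return the dichotomized rating of each analyzed item, by subscale."""
    return {
        subscale: {item: dichotomize(ratings[item]) for item in items}
        for subscale, items in ANALYZED_ITEMS.items()
    }


# Hypothetical ratings for one child, keyed by EDAH item number.
teacher_ratings = {1: 4, 2: 2, 3: 1, 4: 3, 5: 2, 7: 3, 8: 4, 13: 3, 17: 1, 19: 2}
print(recode_edah(teacher_ratings))
```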

Table 3

Pearson Correlations between the 5 Factors of AULA and Attention Deficit, Hyperactivity, and Behavioral Problems Variables with EDAH in the Study Sample (N = 407)

AULA Virtual Reality Test. AULA (classroom in Spanish) is a VR-based neuropsychological test designed to comprehensively assess attention processes (Climent & Banterla, 2011) and support an ADHD diagnosis in children between 6 and 16 years old (Díaz-Orueta et al., 2014; Zulueta, Iriarte, Díaz-Orueta, & Climent, 2013), with high test-retest reliability (Fernandez-Fernandez, Morillo-Rojas, & Alonso-Romero, 2012; Iriarte et al., 2016), sensitivity, and specificity (Rufo-Campos, Cueto, Iriarte, & Rufo-Muñoz, 2012).

It is a CPT paradigm based test with different tasks and distracting conditions, presented in the virtual scenario of a school classroom (Climent & Banterla, 2011). AULA is composed of two main exercises: a non-X paradigm based exercise (“press the button when you DO NOT see or hear apple”) and an X paradigm based exercise (“press the button whenever you DO see or hear seven”). Stimuli are presented both on a visual basis (on the classroom’s blackboard) and on an auditory basis (the patient listens to them with the earphones). At the same time, visual, auditory, or combined distractors of ecological nature (i.e., equal to those that may appear in a real classroom environment, such as an ambulance going by, a child coughing, or the teacher walking through the classroom) are presented.

The main variables measured by AULA are as follows:

  • Omission errors (inattention): patients do NOT press the button when they should.
  • Commission errors (impulsivity): patients press the button when they should NOT.
  • Reaction time (processing speed): measured for correct answers and commission errors.
  • Variability (standard deviation) in reaction time (sustained attention): changes in reaction time patterns during the test.
  • Motor activity (hyperactivity): head movement, tracked with a movement sensor placed in the 3D glasses.
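To make the first four variables concrete, the sketch below derives them from a list of trial records. It is a generic CPT scoring illustration, not AULA's actual scoring: the trial structure and field names are hypothetical, reaction time is summarized for correct responses only, and motor activity is omitted because it depends on the head-tracking data.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import List, Optional


@dataclass
class Trial:
    response_required: bool   # True if the button should be pressed
    responded: bool           # True if the button was pressed
    rt_ms: Optional[float]    # reaction time in ms, None if no response


@dataclass
class CptSummary:
    omissions: int
    commissions: int
    mean_rt_correct_ms: Optional[float]
    sd_rt_correct_ms: Optional[float]


def summarize(trials: List[Trial]) -> CptSummary:
    """Count errors and summarize reaction time for correct responses."""
    omissions = sum(1 for t in trials if t.response_required and not t.responded)
    commissions = sum(1 for t in trials if not t.response_required and t.responded)
    correct_rts = [t.rt_ms for t in trials
                   if t.response_required and t.responded and t.rt_ms is not None]
    return CptSummary(
        omissions=omissions,
        commissions=commissions,
        mean_rt_correct_ms=mean(correct_rts) if correct_rts else None,
        # The sample standard deviation needs at least two reaction times.
        sd_rt_correct_ms=stdev(correct_rts) if len(correct_rts) >= 2 else None,
    )


# Hypothetical session of six trials.
session = [
    Trial(True, True, 420.0),    # correct response
    Trial(True, False, None),    # omission error
    Trial(False, True, 310.0),   # commission error
    Trial(False, False, None),   # correct withholding
    Trial(True, True, 480.0),    # correct response
    Trial(True, True, 450.0),    # correct response
]
print(summarize(session))
```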

The resulting scores include total scores, scores comparing the non-X versus the X task, scores comparing visual versus auditory performance (divided attention), and scores with versus without distractors (interference of external distractors). There is also a final score on the quality of attention focus that compares the number of errors made when the blackboard is in the viewing angle with the number made when it is not.

Data Analyses

Statistical analyses were performed with SPSS version 21.0 and correspond to the two major objectives of this study: to test the external validity of AULA and to estimate whether the AULA test contributes to the diagnosis of ADHD by providing additional information.

To address the first study goal, the external validity of AULA, a factor analysis of the cognitive variables contained in AULA was performed and, subsequently, the factors obtained were correlated with the behavioral variables in EDAH (using Pearson correlation). Moreover, a study of AULA’s sensitivity and specificity was conducted.

To address the second goal (i.e., to provide valuable and complementary information to clinical diagnosis by means of AULA), performance on the 5 AULA factors was compared (Mann-Whitney’s U) according to the items of the EDAH Inattention and Hyperactivity scales. Finally, a k-means cluster (conglomerate) analysis was used to identify the subgroups into which the study sample may be classified according to performance in AULA.


External Validity

First, a factor analysis of AULA variables was conducted in order to reduce the data to factors. The Kaiser-Meyer-Olkin measure of sampling adequacy (KMO = .903) and the measure of sampling adequacy for each variable (MSA) were greater than .50, indicating a good fit of the data to a factor analysis model. Thus, 5 factors or components that account for 82.37% of the total variance were obtained from the 407 subjects and were labeled sustained attention, impulsivity control, processing speed, response variability, and control of motor activity.

Then, the correlation between variables of inattention and hyperactivity of EDAH with the five factors obtained with AULA was calculated by obtaining Pearson correlation coefficients and, as expected, a significant correlation between all analyzed variables was found (see Table 3).
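For readers less familiar with the statistic, a Pearson coefficient between a factor score and a rating-scale score can be computed as in the sketch below. The data are invented for illustration and do not come from Table 3; the function name is hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equally long samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Invented scores for five children: one AULA factor and one EDAH subscale.
factor_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
edah_scores = [2.0, 1.0, 4.0, 3.0, 5.0]
print(round(pearson_r(factor_scores, edah_scores), 3))
```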

A discriminant analysis was then performed on participants’ scores on the five factors obtained in AULA. In this way, it could be seen how well AULA classifies the study participants according to their membership in either the group with or without ADHD. AULA presents a moderate degree of specificity, correctly identifying 75.3% of healthy participants, and a sensitivity of 68.1% of individuals with ADHD correctly identified (Table 4).
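Sensitivity and specificity follow directly from the classification counts. In the sketch below, the counts are illustrative values chosen to be consistent with the reported percentages and group sizes (213 children with ADHD, 194 controls); they are an assumption and are not taken from Table 4.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Proportion of children with ADHD classified as ADHD."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Proportion of controls classified as controls."""
    return true_negatives / (true_negatives + false_positives)


# Assumed counts, back-calculated from the reported percentages.
adhd_correct, adhd_missed = 145, 68          # 213 children with ADHD
control_correct, control_flagged = 146, 48   # 194 controls

print(f"sensitivity = {100 * sensitivity(adhd_correct, adhd_missed):.1f}%")
print(f"specificity = {100 * specificity(control_correct, control_flagged):.1f}%")
```

With these assumed counts, the complements of the two rates correspond to the false negative and false positive percentages discussed in the text.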

Table 4

Discriminant Analysis of AULA: Classification of all Participants (N = 407) according to Performance in the Factors in their Group Membership (Group with ADHD and Control Group)

Note. (1) Specificity; (2) sensitivity.

The current results provide support for the external validity of AULA, which correctly classifies study participants according to their membership in either the group with or without ADHD. The 24.7% of false positives (controls whom AULA does not discard as non-ADHD) may arise because AULA detects attention difficulties that are also linked to disorders other than ADHD. In other words, AULA’s ability to detect attention difficulties may not be specific to ADHD but may extend to other learning or neurodevelopmental disorders involving attention deficits. As for the 31.9% of false negatives (subjects who perform well in AULA despite their clinical diagnosis of ADHD), they may belong to a subgroup of ADHD patients who perform correctly in controlled or structured environments.

Table 5a

Mean Differences using Mann-Whitney’s U on Performance with AULA Based on the Presence/Absence of Each Item of EDAH Inattention Subscale (N = 227)

Table 5b

Mean Differences using Mann-Whitney’s U on Performance with AULA Based on the Presence/Absence of Each Item of EDAH Hyperactivity Subscale (N = 227)

Contribution of AULA to the Diagnosis of ADHD

The second objective of the study was to investigate what AULA contributes to the diagnosis of ADHD in comparison with subjective reports provided by third parties. This analysis used item-by-item EDAH scores, so only the part of the sample for which these data were available (227 subjects, 72 with ADHD and 155 controls) was included.

First, for each individual item of the EDAH attention deficit and hyperactivity subscales, a pair of groups was created based on the presence or absence of the symptom, and the two groups were compared on the five AULA factors from the previously presented factor analysis (inattention, impulsivity, processing speed, variability in reaction time, and motor activity) (Tables 5a and 5b). Because the variables studied were not normally distributed, the Mann-Whitney U test was applied to analyze these differences (m = mean ranks).
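The comparison described above can be illustrated with a minimal plain-Python sketch of the Mann-Whitney U statistic and the mean ranks (m) of two groups. The function names and the toy scores are hypothetical and are not study data. The sketch does not compute a p-value; a library routine such as SciPy's `scipy.stats.mannwhitneyu` would normally be used for that.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return the mean rank of each group and the U statistic (smaller U)."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    rank_sum_b = sum(ranks[n_a:])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return {
        "mean_rank_a": rank_sum_a / n_a,
        "mean_rank_b": rank_sum_b / n_b,
        "U": min(u_a, u_b),
    }


# Hypothetical toy factor scores (NOT study data) for children with and
# without a given EDAH symptom.
symptom_present = [62, 58, 71, 66]
symptom_absent = [49, 52, 47, 55, 50]
print(mann_whitney_u(symptom_present, symptom_absent))
```

Because the test works on ranks rather than raw values, it does not require the normality assumption that the studied variables failed to meet.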

Regarding the presence or absence of the symptom described by each EDAH item and performance on the AULA tasks measuring similar variables, the findings were as follows. The items of the EDAH inattention subscale showed significant differences in all variables except processing speed (Table 5a). For the items of the EDAH hyperactivity subscale, Table 5b indicates those in which differences were observed. For example, item 1, “excessive motor activity”, showed significant differences in all variables except processing speed, and item 13, “moves constantly, uneasy”, showed significant differences in impulsivity, variability in reaction time, and motor activity.

To study whether AULA adds relevant information to the diagnosis of ADHD, a cluster analysis was performed for both groups, based on performance in the five AULA variables. For this analysis the total sample (N = 407; 194 controls and 213 with ADHD) was again used. No statistically significant differences by gender were found in performance on the five factors in any of the groups.

Based on performance in the five AULA factors, the cluster analysis (analysis of conglomerates) of the control group yielded 4 clusters (Table 6a). For example, healthy children with good performance form cluster 2, whereas subjects performing below the mean form cluster 4. Subjects with ADHD are distributed across 6 clusters (Table 6b): for example, cluster 3 comprises children with normal performance; cluster 6 is an extreme subgroup with deficits across all variables; cluster 1 is formed by children with impulsivity and hyperactivity; and clusters 2 and 5 show low performance in impulsivity. Tables 6c and 6d give the results for the combined and inattentive ADHD subtypes, respectively.
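The cluster tables report factor performance as T scores (M = 50, SD = 10). A minimal sketch of that scaling is given below, assuming a raw score and a normative mean and standard deviation are available; the function name `to_t_score` and the numeric values are hypothetical, and the actual normative data are not reproduced here.

```python
def to_t_score(raw, norm_mean, norm_sd):
    """Convert a raw score to a T score (mean 50, standard deviation 10)."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw - norm_mean) / norm_sd  # distance from the norm mean in SD units
    return 50 + 10 * z


# Hypothetical values (NOT study data): a raw score half a standard
# deviation above the normative mean.
t = to_t_score(raw=12, norm_mean=10, norm_sd=4)
print(t)
```

On this scale, a difference of 10 points between a cluster mean and 50 corresponds to one standard deviation from the normative mean.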

Table 6a

Results from the Analysis of Conglomerates. Grouping of Control Group Subjects (n = 194) in 4 Clusters according to Performance in AULA Factors (T scores, M = 50, SD = 10)

Note. Cluster 1 = mild inattention and quick responses.

Cluster 2 = average performance.

Cluster 3 = mild impulsiveness.

Cluster 4 = mild inattention and impulsiveness; quick responses; moderate hyperactivity.

Table 6b

Results from the Analysis of Conglomerates. Grouping of ADHD Subjects (n = 213) in 6 Clusters according to Performance in AULA Factors (T scores, M = 50, SD = 10)

Note. Cluster 1 = inattention and moderate hyperactivity; quick responses.

Cluster 2 = mild impulsiveness and slowness.

Cluster 3 = average performance.

Cluster 4 = mild inattention and quick responses.

Cluster 5 = inattention, impulsiveness and mild hyperactivity.

Cluster 6 = mild inattention; slowness; impulsiveness and moderate hyperactivity.

Table 6c

Results from the Analysis of Conglomerates. Grouping of ADHD Subjects, Combined Subtype (n = 108) in 4 Clusters according to Performance in AULA Factors (T scores, M = 50, SD = 10)

Note. Cluster 1 = moderate inattention and hyperactivity; mild impulsiveness; quick responses.

Cluster 2 = average performance.

Cluster 3 = mild inattention and impulsiveness; moderate hyperactivity.

Cluster 4 = mild impulsiveness and slowness.

Table 6d

Results from the Analysis of Conglomerates. Grouping of ADHD Subjects, Inattentive Subtype (n = 105) in 4 Clusters according to Performance in AULA Factors (T scores, M = 50, SD = 10)

Note. Cluster 1 = mild inattention and impulsiveness; moderate hyperactivity; quick responses.

Cluster 2 = average performance.

Cluster 3 = mild inattention and impulsiveness; slowness; moderate hyperactivity.

Cluster 4 = mild inattention.


Accurate ADHD diagnosis is crucial to appropriate treatment. In this section, we first discuss the implications of the current results for the external validity of AULA and the complementary diagnostic information that it provides. We then examine the broader clinical and theoretical implications of these findings, consider possible limitations, and conclude by highlighting several key directions for future research on ADHD.

First, in relation to its external validity, AULA generally performs well in classifying study participants according to whether they belonged to the ADHD or the non-ADHD group, showing a moderate degree of specificity and sensitivity.

The false negatives identified (those clinically diagnosed with ADHD who perform well in AULA) may belong to a subgroup of ADHD patients who perform adequately in controlled or structured environments. This does not invalidate the diagnosis, as they constitute a group of children who potentially self-regulate better despite their ADHD. Hence, AULA can contribute to the diagnosis of ADHD but can also identify other types of attention difficulties that may be linked to different developmental or learning conditions. These results contradict those obtained by Rufo et al. (2012), in which the sensitivity and specificity of AULA were significantly higher. Other factors or potential confounding variables should therefore be considered as reasons for this discrepancy, such as the influence of IQ (especially for children with an IQ above 130) and comorbidities (coexisting disorders that were absent in the study of Rufo et al., 2012, but were not conclusively ruled out in this study). In addition, as stated above, the behavioral subscales and items of the teachers’ EDAH and Conners’ parent scales do not track cognition as AULA does, and AULA only tracks hyperactivity in terms of movement (not in terms of disruption of the home or class environment).

Second, it can be established that the presentation of behavioral symptoms (at moderate intensity) correlates with low or medium performance on cognitive tasks that measure similar variables. Thus, the presence or absence of a symptom of inattention or impulsivity/hyperactivity significantly correlates with low to moderate performance in cognitive tasks. This seems to hold even for executive variables other than those affected in ADHD, as is the case with all the inattention subscale items; for example, item 5 (“demands immediate satisfaction”) correlates with impulsivity control and control of motor activity, which suggests that they are measuring the same thing. Therefore, in agreement with previous research (Areces, Rodríguez, García, Cueli, & González-Castro, in press), it can be established that AULA confirms the diagnosis of ADHD and even anticipates difficulties of different etiology that are not specific to individuals with ADHD. These findings are consistent with other studies that highlight the importance of relying on rating scales for the assessment of ADHD (Charach, Chen, Hogg-Johnson, & Schachar, 2009; Parker & Corkum, 2016).

Regarding the complementary information that AULA provides for the diagnosis of ADHD and for planning a cognitive behavioral intervention, AULA highlights dimensions related to how children behave, cope with tasks, and face and solve problems, information that is not provided by the behavioral diagnosis obtained with DSM scales and EDAH. As observed in the clusters obtained, groups are differentiated on the basis of the presence of hyperactivity and/or impulsivity. Thus, AULA differentiates subgroups efficiently; that is, it specifies executive profiles not previously detected by means of traditional diagnostic tools or procedures. In summary, AULA provides information about performance in different domains and discriminates between impulsivity and hyperactivity.

In this study, children who fall below the symptom threshold for a clinical diagnosis of ADHD but show inattention and hyperactivity problems that call for intervention were detected. In the control group, 15 children without an ADHD diagnosis showed low performance in all variables and would therefore require some kind of treatment. In the group with a diagnosis of ADHD, the data obtained with AULA allow subjects to be divided into one group that presents impulsivity and another that shows hyperactivity, for which the target intervention will be different. In this sense, these findings will allow intervention proposals or designs to be improved and made more specific. The finding of a subgroup of children diagnosed with ADHD who performed well despite their diagnosis should also be noted. For this subgroup, formed by children who may be able to control themselves in highly structured environments and tasks, the prognosis seems positive.

In summary, AULA can reasonably confirm the diagnosis of ADHD and complement the information obtained by means of observation scales with regard to different ADHD subtypes or clinical presentations. Finally, the current study, in agreement with numerous previous studies (Areces et al., in press; Bioulac et al., 2012; Epstein et al., 2011; Etchepareborda et al., 2009; Negut et al., 2017; Nolin et al., 2016; Parsey & Schmitter-Edgecombe, 2013), also provides support for the value and ecological validity of technology-based measures in neuropsychological assessment.

These results must be interpreted in the context of the current study’s limitations. Some limitations relate to the sample. First, the ADHD group included only two of the three ADHD subtypes: the predominantly hyperactive-impulsive subtype was not included separately but merged with the mixed or combined subtype. Consequently, conclusions cannot be drawn separately for each of the three subtypes.

Other limitations refer to the control group. In this study, we only considered typically developing children and did not include a group with psychopathology. As a result, we cannot assert whether the intergroup differences are due only to the ADHD condition or, at least partially, to other learning or neurodevelopmental conditions not previously identified in the individuals in the control group.

For the second objective of the study, only those subjects in the sample whose EDAH results were recorded item by item were included. This resulted in a substantial decrease in the number of subjects in this analysis and, consequently, in its general representativeness.

Limitations in the extent to which virtual reality simulations reflect the real experiences of individuals, as well as limitations related to psychological concerns (influence of VR on cognitive performance, internal and ecological validity of tools, utility for all populations, lack of normative data properties for different clinical populations), must also be taken into account.

Finally, although AULA may be a useful clinical tool for ADHD diagnosis and for identifying subtypes of attentional disorders in children, it does not seem to be vastly superior to other instruments. Its sensitivity and specificity are similar to those of other measures of attention, such as the Continuous Performance Test (Negut et al., 2017). In this sense, the AULA VR test becomes another effective option for evaluating attentional processes.

Technology development, including VR, continues to grow, and new applications are emerging in the field of neuropsychology. VR technology offers a great opportunity to work in real-world simulation scenarios by means of safe, attractive, and controlled environments (Nolin et al., 2016; Tarantino et al., 2013). Neuropsychologists should continue to search for new strategies to assess cognitive abilities that allow them to offer patients effective, high-quality diagnostic services.

Future research may try to obtain different neuropsychological profiles for ADHD and other developmental disorders in order to help increase accuracy in differential diagnosis, not only between different ADHD subtypes, or between ADHD and healthy controls, but also between ADHD and other learning, behavioral or neurodevelopmental disorders that sometimes may coexist with ADHD and mask actual results by increasing or decreasing performance. More specifically, the role and impact of high and very high IQs, other learning disorders (such as dyslexia), and conditions related to processing speed that may be inappropriately diagnosed as ADHD will have to be included in future studies.

Cite this article as: Zulueta, A., Díaz-Orueta, U., Crespo-Eguilaz, N., & Torrano, F. (2018). Virtual reality-based assessment and rating scales in ADHD diagnosis. Psicología Educativa, 25, 13-22.



Correspondence: (A. Zulueta).

Copyright © 2024. Colegio Oficial de la Psicología de Madrid
