“Autism Spectrum Disorder (ASD) is a neurodevelopmental condition defined by
impairments across the areas of reciprocal social interaction and verbal and
non-verbal communication, alongside repetitive and stereotyped behaviors”
(Bölte, 2010). Children with ASD also show a lack of eye contact (Wimpory,
2000), and this poor eye contact directly affects their social interactions. Research
suggests that even before spoken responses have begun to mature, eye contact
serves an important social function for young children (Mirenda, 1983; Stern, 1985).
Impairment in the use of eye contact for non-verbal communication has been
described as a major characteristic of autism (American Psychiatric Association,
1994). It has been argued that poor eye contact, because of its direct
relationship with the ability to perceive and carry out teacher and
instructional requests, will negatively influence the educational gains of
autistic children (Greer, 2007).
No treatment is currently available for ASD, but there is a large body of
research and promising technology-based practice targeting the core
problems of autistic children in communication and socialization. Individuals
with ASD have strengths in visual processing, which facilitates the use of
technology in their rehabilitation programs, and they also have other
learning characteristics, such as a desire for sameness and an interest in inanimate
objects, that are well suited to technology (Cafiero, 2001). One
technology with a promising effect is Virtual Reality (VR).
The Cai research team designed a Virtual Dolphinarium, which aimed to help
autistic children learn communication by using hand gestures to interact with
virtual dolphins (Cai, 2013). Also, a game named “Astrojumper”, designed at the
University of North Carolina, assists autistic children in developing dexterity
and motor skills by having them maintain their physical balance while avoiding
virtual space objects flying towards them (Finkelstein, 2010).
Despite various attempts to rehabilitate children with ASD by means of VR,
comparatively little research has been done on their eye contact ability. In one
project, researchers from Vanderbilt University tried to condition eye contact
in autistic children by creating a virtual storyteller to guide their focus to
the communicator (Lahiri, 2011). Previous projects have all used external
signals, cues and prompts to draw the attention of the child with ASD to the face of
the communicator. Some have used fixed visual prompts, while other, more advanced
techniques have focused on prompt-fading approaches. While all these studies
are important for advancing our understanding
of autism and its possible future therapies, they have all aimed to
condition eye contact by presenting an external prompt that grabs the child’s attention
and holds it for a while on a fixed position. One could argue that
conditioning eye gaze with an external prompt, without any change in the
individual’s implicit cognitive representations of self and others, is
insufficient and narrow. Presenting prompts to hold the attention of
autistic children over a long period seems nearly impossible because of the costs
and difficulties of carrying around specialized glasses and related equipment.
Therapy should therefore target the more fundamental mechanisms underlying the
social communication difficulties of children with ASD, rather than just conditioning them
with prompts that will not be presented to them in real-life situations.
In this proposal, it is suggested that eye contact conditioning techniques
performed in a synchronous mode might provide a new way of activating eye
contact in children with ASD that is more profound, and thus longer-lasting, than the
techniques currently in use.
Synchrony means a simultaneous action or occurrence. In psychology,
interactional synchrony relates to the timing and pattern of an interaction. Research
has suggested that synchrony between real hand movements and virtual hand
movements creates or increases the illusion that the virtual hand is a part of
the person’s own body, the so-called “virtual hand illusion” (Ma & Hommel,
2013). Other research indicates that watching the face of another person while that
face and one’s own face are touched synchronously induces the illusion of
“owning” the other face, named the enfacement illusion (Tsakiris, 2013;
Tsakiris, 2008). The “enfacement illusion” is defined by the illusory sensation
of being in control of the facial movements of the person in the video and
of perceiving a similarity between the virtual face and one’s own face when the two
move in synchrony. Ownership illusions are even stronger when
multiple informational sources are combined (Ma & Hommel, 2015).
An account of why these illusory sensations arise when multiple informational
sources are combined synchronously is offered by the Theory of Event Coding (TEC).
TEC assumes that perceived and produced events (i.e., perceptions and actions)
are cognitively represented in a common way, as integrated networks of
sensorimotor feature codes or event files. TEC does not distinguish between
social and nonsocial events, which implies that people represent themselves and
others, be they other individuals or objects, in basically the same way.
Based on this
theory, Ma, Sellaro, Lippelt and Hommel (2016) investigated whether
perceiving ownership of another face is followed by adopting the emotions that
this other face is expressing. Their results showed that even a short time of
being in synchrony with a virtual face that smiles at you can make you feel
happier. However, no such effect occurred when the faces were not in
synchrony. The question that arises is this: if mood migration can
happen because of synchrony, can attention migration happen as well? The
goal is to investigate whether eye contact ability will pass to the child with ASD
from a virtual face presented in a virtual reality environment, provided
that the movements of both faces are in synchrony while the eye contact training
is going on. The experiment is intended to use synchrony between the face
movements of the two agents to help the child perform eye contact by perceiving it in
the virtual twin face, which is theoretically assumed to be under the child’s
illusory ownership.
It is hypothesized in this proposal that eye contact, theoretically defined
as the state in which two people are aware of looking directly into one
another’s eyes, will be enhanced between two agents when they are in head movement
synchrony compared with a situation in which there is no synchrony between them.
Synchrony is defined as the relation that exists when things occur at the same
time, and in this research it will be created by means of Kinect, a
device that can provide full-body 3D motion capture. The eye contact techniques
will be designed by combining previous research on eye contact
training, and the virtual agent’s eye gaze on target objects will be used as a cue
to guide the child’s attention to the intended targets.
The aims of this research are to examine (1)
the effects of synchrony of movement between virtual agent and participant on
the eye contact the participant with ASD makes with the virtual agent, and (2)
eye contact training efficacy when synchrony is present in
comparison with non-synchronous situations. The null
hypothesis is that synchrony of movement between agents has no effect on the eye
contact quality of autistic children.
The experimental design of the current project will be a classic pretest-posttest
experiment. Participants will be children clinically
diagnosed as autistic, who will be selected randomly, based on their account
number, from Tehran University-related mental health clinics and then
randomly assigned to either the control or the experimental group. They must have
previously received a diagnosis from an independent clinician according to
standard criteria for ASD. Participants in the experimental
group will go through a three-level interaction with a virtual agent
in which they practice eye contact with a virtual agent whose head and face movements
are synchronous with their own, while the control group will practice the same
eye contact exercises without movement synchrony. The groups must be
matched on age and IQ. For IQ assessment, Raven’s Standard Progressive
Matrices can be used. This IQ test contains multiple-choice
questions assessing abstract reasoning and
is widely used in research settings.
As a measure of eye contact, participants will complete the Brief Observation of Social
Communication Change (BOSCC), a 12-minute interaction between clinician and
child. The BOSCC is divided into three segments. First, there is a five-minute
play interaction with a set of standardized toys. The second segment is a
two-minute conversation about any topic. The final segment is a second
five-minute play interaction with a second set of standardized toys. The BOSCC
will be recorded with a handheld camcorder as well as pivot-head
glasses. Pivot-head glasses have a camera embedded between the eyes.
Because the camera is positioned between the eyes, it is possible to tell when
the child is making eye contact with the clinician by whether the child is looking
directly at the camera.
The participant’s facial movements will be monitored by means of a Kinect system.
The Kinect sensor is a horizontal bar connected to a small base with a
motorized pivot and is designed to be positioned lengthwise above or below the
video display. The device can provide full-body 3D motion capture, facial
recognition and voice recognition capabilities (Totilo, 2010). The Kinect’s
various sensors output video at a frame rate of approximately 9 Hz to 30 Hz depending on
resolution. Virtual faces can be constructed and controlled by means of virtual reality
environment software. Hommel and Ma (2016) used an integration of Kinect, an
InterSense orientation tracker, FAAST and Vizard (virtual reality environment
software) to allow participants to freely move or rotate their own face to
control the movement or rotation of the virtual face. This method, with some modifications
based on the available equipment, can be used in this experiment as well.
Eye-tracking techniques will be used to track the participant’s eye movements on screen,
and this will serve as the basis for rewarding participants as their gaze
gets closer to the virtual face’s eyes. For this purpose the screen will be
partitioned, and fixation on any given region of the screen will result in a specific
level of smile on the virtual agent’s face.
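The gaze-contingent reward described above could be sketched as follows. This is only an illustration under stated assumptions: the screen is partitioned into concentric rings around a target point (the virtual face's eyes), and the function name `smile_level`, the ring width of 100 pixels and the five smile levels are hypothetical choices, not values from this proposal.

```python
import math


def smile_level(gaze_xy, target_xy, ring_width_px=100.0, n_levels=5):
    """Map a fixation point to a discrete smile level.

    The screen is treated as concentric rings around target_xy.
    Level n_levels - 1 is the biggest smile (gaze inside the innermost ring);
    level 0 is the least positive expression (gaze far from the target).
    ring_width_px and n_levels are illustrative assumptions.
    """
    dx = gaze_xy[0] - target_xy[0]
    dy = gaze_xy[1] - target_xy[1]
    distance = math.hypot(dx, dy)           # Euclidean distance in pixels
    ring = int(distance // ring_width_px)   # 0 = innermost ring
    return max(n_levels - 1 - ring, 0)      # clamp so far-away gaze gives level 0


# Example: the target (the virtual face's eyes) is at pixel (500, 400).
print(smile_level((500, 400), (500, 400)))  # gaze on target
print(smile_level((650, 400), (500, 400)))  # gaze 150 px away
```

A discrete mapping like this keeps the feedback easy to verify during piloting; a continuous mapping from distance to smile intensity would be an equally plausible design.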
The training procedure will be divided into three main trials, in each of which a new element
will be added to the virtual environment of interaction. This is done
because a novel stimulus is significantly more likely to be recognized than a
non-novel stimulus, and novelty enhances the participant’s attention to the
task, so with every trial the procedure becomes more interactive and involves
the participant more. In addition, this division into trials helps the experimenters
evaluate whether the previous trial was completed successfully.
In the first
trial, synchrony will be applied between the head movements of the virtual face
and those of the participant by means of Kinect. Evidence shows that four minutes
of synchronous head movement interaction between participant and virtual face
is enough to produce a better mood in the participant if the virtual face is
smiling (Ma, Sellaro, Lippelt, & Hommel, 2016). As the
evidence suggests, this synchrony can lead to an ownership illusion that contributes to
feature migration and a tendency toward mimicry, which will be used as a medium to
implicitly convey the basic principles of eye contact to the autistic child in
the second trial.
Figure 1 – The virtual faces used in Hommel and Ma’s mood migration study
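One way to picture the difference between the synchronous condition and a non-synchronous control is sketched below. It is an assumption-laden illustration: the proposal does not say how the non-synchronous condition is produced, so the fixed frame delay used here is hypothetical, as are the class name `VirtualHead` and the representation of a head pose as a (yaw, pitch, roll) tuple.

```python
from collections import deque


class VirtualHead:
    """Replays the participant's tracked head pose on the virtual face.

    delay_frames = 0 gives the synchronous condition (current pose is shown).
    delay_frames > 0 gives a delayed, non-synchronous replay; using a delay
    for the control condition is an assumption made for illustration.
    """

    def __init__(self, delay_frames=0):
        self.delay_frames = delay_frames
        # Holds the most recent delay_frames + 1 poses.
        self._buffer = deque(maxlen=delay_frames + 1)

    def update(self, participant_pose):
        """Store one tracked pose and return the pose to render this frame."""
        self._buffer.append(participant_pose)
        # The oldest buffered pose: the current one when delay_frames is 0,
        # otherwise the pose from up to delay_frames frames ago. Until the
        # buffer has filled, the earliest recorded pose is shown.
        return self._buffer[0]


sync_head = VirtualHead(delay_frames=0)
lagged_head = VirtualHead(delay_frames=2)
for frame, pose in enumerate([(0, 0, 0), (5, 0, 0), (10, 0, 0), (15, 0, 0)]):
    print(frame, sync_head.update(pose), lagged_head.update(pose))
```

At the 30 Hz frame rate mentioned for the Kinect, a delay of 90 frames would correspond to three seconds; the actual value would have to be chosen by the experimenters.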
In the second
trial, an eye tracker will be added to the first trial’s setup to follow the participant’s eye
movements on the screen. The screen will be divided into
notional concentric circular regions centered on the virtual face. As the participant’s gaze
gets closer to the center of the virtual face, the virtual face’s smile will grow
bigger, and as the participant’s gaze moves away from the center, the
face will look sadder. Proper eye contact will thus be associated with a big
smile, which will act as reinforcement. Research has shown that enfacing
a smile in a synchronous interaction leads to better divergent thinking
(Ma, Sellaro, Lippelt, & Hommel, 2016), and the link between smiling and
divergent thinking, mediated by good mood, is thought to lie in a
phasic increase of dopamine in the brain (Ashby et al., 1999), which is an
intrinsic source of reinforcement that can further increase the learning rate.
In the third
trial, objects will be added to the virtual environment, appearing between the
virtual face and the participant’s face. While the synchrony between their head
movements continues, the eyes of the virtual face will at certain moments
fixate on one of those objects, and when the participant’s gaze fixates
on the same object, the participant will be reinforced for detecting
the intention of the virtual face.
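The reinforcement rule of the third trial could be sketched as a simple check of whether the participant's fixation falls on the object the virtual face is currently looking at. The function names, the object names and the use of rectangular screen regions given as (left, top, right, bottom) pixel coordinates are all hypothetical choices for illustration.

```python
def gaze_target(gaze_xy, objects):
    """Return the name of the object whose region contains the gaze point.

    objects maps a name to a (left, top, right, bottom) rectangle in pixels.
    Returns None when the gaze is not on any object.
    """
    x, y = gaze_xy
    for name, (left, top, right, bottom) in objects.items():
        if left <= x <= right and top <= y <= bottom:
            return name
    return None


def should_reinforce(participant_gaze_xy, agent_target, objects):
    """True when the participant fixates the object the virtual face looks at."""
    if agent_target is None:  # the virtual face is not cueing any object
        return False
    return gaze_target(participant_gaze_xy, objects) == agent_target


# Illustrative layout with two objects on screen.
objects = {"ball": (100, 100, 200, 200), "cup": (600, 100, 700, 200)}
print(should_reinforce((150, 150), "ball", objects))
print(should_reinforce((650, 150), "ball", objects))
```

In practice a minimum fixation duration would probably be required before reinforcement is given, which this sketch leaves out.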
Because
the participants’ mean pretest and posttest scores on the Brief Observation of Social
Communication Change (BOSCC) will be compared before and after the intervention, a
paired-samples t-test best suits this experimental design. The paired-samples t-test is a
statistical procedure used to determine whether the mean difference between two
sets of observations is zero. In a paired-samples t-test, each subject or entity
is measured twice, resulting in pairs of observations. The null hypothesis
assumes that the true mean difference between the paired samples is zero. Under
this model, all observable differences are explained by random variation.
Conversely, the alternative hypothesis assumes that the true mean difference
between the paired samples is not equal to zero. We expect that the difference
between the two groups will not be zero, and that the group that has gone through
synchronous eye contact training will show higher performance on
mean BOSCC scores than the group that has only practiced eye
contact without synchrony.
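As a minimal sketch of the paired-samples computation described above, the test statistic is the mean of the pre/post differences divided by its standard error, t = mean(d) / (sd(d) / sqrt(n)), with n - 1 degrees of freedom. The function name `paired_t` and the scores below are made up for illustration and are not data from this project; the sketch covers only the pretest-posttest comparison within one group.

```python
import math
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t statistic, degrees of freedom) for paired observations.

    pre and post hold one score per participant, in the same order.
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical pretest and posttest scores for four participants.
pre_scores = [10, 12, 9, 11]
post_scores = [12, 15, 10, 14]
t_stat, df = paired_t(pre_scores, post_scores)
print(f"t = {t_stat:.3f}, df = {df}")
```

The resulting t value would then be compared against the t distribution with the given degrees of freedom to obtain a p-value.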
American Psychiatric Association. (1994).
Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC.
Ashby, F. G., Isen, A. M., & Turken,
A. U. (1999). A neuropsychological theory of positive affect and its influence
on cognition. Psychological Review, 106, 529–550.
Bölte, S., & Hallmayer, J. (2010). Autism Spectrum
Conditions: FAQs on Autism, Asperger Syndrome, and Atypical Autism Answered by
International Experts. Cambridge, MA: Hogrefe Publishing.
Cafiero, J. (2001). The effect of an
augmentative communication intervention on the communication, behavior, and
academic program of an adolescent with autism. Focus on Autism and Other
Developmental Disabilities, 16(3), 179-189.
Cai, Y., Chia, N. K. H., Thalmann, D.,
Kee, N. K. N., Zheng, J., & Thalmann, N. M. (2013). Design and development of a
Virtual Dolphinarium for children with autism. IEEE Transactions on Neural Systems
and Rehabilitation Engineering, 21(2), 208–217.
Finkelstein, S. L., Nickel, A., Barnes,
T., & Suma, E. A. (2010). Astrojumper: Designing a virtual reality exergame to motivate
children with autism to exercise. Presented at the 2010 IEEE Virtual Reality
Conference (VR), pp. 267–268.
Greer, D. R., & Ross, D. E. (2007). Verbal
behavior analysis. New York, NY: Pearson Education.
Ma, K., Sellaro, R., Lippelt, D., &
Hommel, B. (2016). Mood migration: How enfacing a smile makes you happier.
Cognition, 151, 52-62. http://dx.doi.org/10.1016/j.cognition.2016.02.018
Hommel, B., Müsseler, J., Aschersleben,
G., & Prinz, W. (2001). The theory of event coding (TEC): A framework for
perception and action planning. Behavioral & Brain Sciences, 24, 849–878.
Hommel, B. (2004). Event files: Feature
binding in and across perception and action. Trends in Cognitive Sciences, 8,
494–500.
Memelink, J., & Hommel, B. (2013). Intentional weighting: A basic
principle in cognitive control. Psychological Research, 77, 249–259.
Keysers, C., & Gazzola, V. (2007). Integrating
simulation and theory of mind: From self to social cognition. Trends in Cognitive
Sciences, 11, 194–196. PubMed: 17344090
Kim, D., & Hommel, B. (2015). An
event-based account of conformity. Psychological Science, 26, 484–489.
Lahiri, U., Warren, Z., & Sarkar, N. (2011).
Design of a gaze-sensitive virtual social interactive system for children with
autism. IEEE Transactions on Neural Systems and Rehabilitation Engineering,
19(4), 443–452.
Ma, K., & Hommel, B. (2015b).
Body-ownership for actively operated non-corporeal objects. Consciousness and
Cognition, 36, 75–86.
Mirenda, P., Donnellan, A., & Yoder, D. (1983). Gaze
behavior: A new look at an old problem. Journal of Autism and Developmental
Disorders, 13, 397–409.
Totilo, S. (2010, January 7).
“Natal Recognizes 31 Body Parts, Uses Tenth of Xbox 360 ‘Computing
Resources’”. Kotaku, Gawker Media. Retrieved November 25, 2010.
Tsakiris, M. (2008). Looking for myself: Current multisensory input alters
self-face recognition. PLoS One, 3, e4040.
Schroeder CE, et al. Dynamics of
active sensing and
Wimpory, D. C., Hobson, R. P., Williams,
M. G., & Nash, S. (2000). Are infants with autism socially engaged? A study of recent
retrospective parental reports. Journal of Autism and Developmental Disorders,
30, 525–536.