Context of the acquisitions

This corpus was recorded by the LIG laboratory (Laboratoire d’Informatique de Grenoble, UMR 5217 CNRS/UGA) thanks to the VocADom project, funded by the French National Research Agency (Agence Nationale de la Recherche/ANR-16-CE33-0006). The authors would like to thank the participants who agreed to take part in the experiments.

This corpus is composed of audio and home automation data acquired in a real smart home with French speakers. The recording campaign was conducted within the VocADom project, which aims to design a new smart home system based on audio technology. The system provides assistance via natural human-machine interaction (voice and tactile commands) and security reassurance by detecting distress situations, so that the person can manage their environment from anywhere in the house, at any time, in the most natural way possible.

Home automation Smart Home of the LIG Laboratory

The Amiqual4Home smart apartment is part of the experimentation platform of the LIG laboratory and is dedicated to research projects. Depending on the research project, experiments are conducted with users performing scenarios of daily housework and leisure. Multimodal corpora are produced, synchronized and analyzed in order to evaluate and validate the concept or system under study.


The Amiqual4Home apartment used to collect this dataset has the following layout with (a) the ground floor and (b) the first floor:

[Figure: floor plans of the apartment, (a) ground floor and (b) first floor]


Amiqual4Home is fully functional and equipped with sensors (energy and water consumption, hygrometry, temperature) and with actuators that control lighting, shutters and multimedia diffusion, distributed across the kitchen, the bedroom, the office and the bathroom. Observation instrumentation, with cameras, microphones and activity tracking systems, allows the experiment to be controlled and supervised from a control room connected to Amiqual4Home. The flat is also equipped with 16 microphones (4 arrays of 4 microphones each) set into the ceiling, which can be recorded in real time by dedicated software that records all audio channels simultaneously.

Short description of sensors: place and data type
Room         Binary  Integer  Real Number  Categorical  Microphone area
Entrance          3        1            2            3  0
Kitchen          13       21           18            0  1
Living room      16        6            8            7  1
Staircase         3        0            0            0  0
Walkway           9        0            1            0  0
Bathroom          9        6            8            3  1
Bedroom          17        4            6            7  1
ALL              70       40           43           20  4 (16 channels)

*: room not used



  1. Phase 1 – Graphic-based instructions to elicit spontaneous voice commands (interaction with the home)
  2. Phase 2 – Two-inhabitant scenario enacting a visit by a friend (interaction with the home)
  3. Phase 3 – Voice commands in noisy domestic environment (reading of voice commands in the home)

Grammar of the voice command

set/check an actuator: key initiateCommand object
    (e.g., KEYWORD turn off the light)
    (e.g., KEYWORD is the light on?)
emergency call: key emergencyCommand
    (e.g., KEYWORD help)
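The two grammar rules can be illustrated with a small matcher. This is only a sketch under stated assumptions: the keyword list comes from the participants' table of this document, but the command vocabularies (`INITIATE_COMMANDS`, `EMERGENCY_COMMANDS`), the `VoiceCommand` type and the `parse_command` function are hypothetical names introduced for illustration, and the English phrases simply mirror the examples above (the recorded commands are in French).

```python
from dataclasses import dataclass
from typing import Optional

# Keywords chosen by the participants (see the participants' table).
KEYWORDS = ("vocadom", "hé cirrus", "ulysse", "téraphim",
            "allo cirrus", "ichefix", "minouche", "hestia")

# Hypothetical vocabularies, for illustration only: the actual lexicon
# of the corpus is not given in this description.
EMERGENCY_COMMANDS = ("help",)
INITIATE_COMMANDS = ("turn on", "turn off", "is")


@dataclass
class VoiceCommand:
    keyword: str
    kind: str                  # "actuator" or "emergency"
    command: str
    obj: Optional[str] = None  # words following the command, if any


def parse_command(utterance: str) -> Optional[VoiceCommand]:
    """Return the parsed command, or None if neither rule matches."""
    text = utterance.lower().strip().rstrip("?!.").strip()
    # Both rules start with the keyword.
    keyword = next((k for k in KEYWORDS
                    if text == k or text.startswith(k + " ")), None)
    if keyword is None:
        return None
    rest = text[len(keyword):].strip()
    # Rule "emergency call": key emergencyCommand
    if rest in EMERGENCY_COMMANDS:
        return VoiceCommand(keyword, "emergency", rest)
    # Rule "set/check an actuator": key initiateCommand object
    for command in INITIATE_COMMANDS:
        if rest.startswith(command + " "):
            return VoiceCommand(keyword, "actuator", command,
                                rest[len(command):].strip())
    return None
```

For instance, `parse_command("ulysse turn off the light")` yields an "actuator" command with object "the light", whereas an utterance without a keyword is rejected.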


Data type                    Format
Audio                        Transcriber
Localization and activities  ELAN
NLU                          Home-made


Participants’ recordings (duration in the format hour:minute:second)

Participant  Age group (years)  Gender  Duration  Chosen keyword
S00          20-23              M       01:03:54  vocadom
S01          20-23              M       08:48:53  vocadom
S02          20-23              M       01:12:26  hé cirrus
S03          20-23              M       01:11:52  ulysse
S04          23-25              F       01:04:46  téraphim
S05          <20                F       01:22:59  allo cirrus
S06          23-25              M       00:55:54  ulysse
S07          25-28              M       01:03:54  ichefix
S08          23-25              M       01:13:01  ulysse
S09          23-25              F       01:20:06  minouche
S10          23-25              F       01:11:03  hestia
All (mean)   23-25              4F/7M   12:28:45  8 keywords
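Durations in the hour:minute:second format can be added up with a few lines of standard-library Python. The helper names below (`parse_duration`, `format_duration`, `total_duration`) are hypothetical and only sketch one possible way to do the arithmetic.

```python
from datetime import timedelta


def parse_duration(text: str) -> timedelta:
    """Convert an 'hh:mm:ss' string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def format_duration(delta: timedelta) -> str:
    """Convert a timedelta back into an 'hh:mm:ss' string."""
    total = int(delta.total_seconds())
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def total_duration(durations) -> timedelta:
    """Sum an iterable of 'hh:mm:ss' strings."""
    return sum((parse_duration(d) for d in durations), timedelta())
```

As a usage example, `format_duration(total_duration(["01:03:54", "01:12:26"]))` (the S00 and S02 rows) returns `"02:16:20"`.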

Data structure

Under the record/ directory, each record is organized as follows:

S<??>/
    openhab_log/        (log of the home automation network)                   publicly available
    activity/           (annotation of the participants’ location and activity)
    mic_array/          (microphone array recordings)                          available upon request
    mic_headset/        (headset microphone recordings)
    speech_transcript/  (transcription of the participants’ speech)
    NLU/                (semantic annotation of the voice commands)
    video/              (video records for annotation purpose)                 restricted
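Since not every sub-directory is distributed with every release, it can be handy to check which parts of a record are present locally. The sketch below assumes the layout listed above; the function name `missing_subdirs` and the example path are hypothetical.

```python
from pathlib import Path

# Sub-directories listed in the record layout above.
EXPECTED_SUBDIRS = ("openhab_log", "activity", "mic_array", "mic_headset",
                    "speech_transcript", "NLU", "video")


def missing_subdirs(record_dir):
    """Return the expected sub-directories that are absent from record_dir."""
    record_dir = Path(record_dir)
    return [name for name in EXPECTED_SUBDIRS
            if not (record_dir / name).is_dir()]
```

For example, `missing_subdirs("record/S00")` returns an empty list when the full record is available, and the names of the absent parts otherwise.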
openHAB logs
  1. S??_change.csv.log: the timestamped log of sensor value changes (e.g., door moving from CLOSED to ON). The log covers more than the experiment duration, so that the initial value of each sensor can be recovered.
  2. S??_wizard.csv.log: the timestamped log of commands sent through the openHAB network by the Wizards.
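Because the change log starts before the experiment, the value of every sensor at a given instant can be recovered by replaying the changes up to that instant. The following is a minimal sketch of that idea. The column layout (timestamp, sensor name, new value), the ISO 8601 timestamp format and the function names are assumptions made for illustration; the actual layout of S??_change.csv.log is not described here and should be checked against the files.

```python
import csv
from datetime import datetime


def read_change_log(path):
    """Read rows of an ASSUMED (timestamp, sensor, value) CSV layout."""
    with open(path, newline="", encoding="utf-8") as handle:
        return [tuple(row) for row in csv.reader(handle) if len(row) == 3]


def state_at(rows, when):
    """Replay change events up to `when`; return {sensor: last known value}.

    `rows` holds (timestamp, sensor, value) tuples whose timestamps are
    assumed to be ISO 8601 strings; `when` is a datetime.
    """
    events = sorted((datetime.fromisoformat(timestamp), sensor, value)
                    for timestamp, sensor, value in rows)
    state = {}
    for timestamp, sensor, value in events:
        if timestamp <= when:
            state[sensor] = value  # later changes overwrite earlier ones
    return state
```

Calling `state_at(read_change_log(path), start)` with `start` set to the beginning of the experiment would then give the initial value of each sensor that changed at least once before that time.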

The full list of used sensors can be found here while the list of commands can be found here.


  1. Home Automation only corpus: only composed of S00 until the publication of the paper related to the corpus
  2. Full corpus: only upon request to Francois.Portet@imag.fr