LISTEN

“Hands-free Voice-enabled Interface to Web Applications for Smart Home Environments”, 2015-2020.

Funded by: European Commission, H2020-MSCA-RISE-2014.

Partners: FORTH-ICS (Coordinator, Ath. Mouchtaris), RWTH Aachen University, European Media Laboratory GmbH, Cedat 85 Srl.

Funding: €414,000 (FORTH-ICS: €180,000)

SPL Contact Person: Prof. Athanasios Mouchtaris.

Summary: The central objective of LISTEN is to design and implement a complete system, including both software and hardware components, that enables robust, hands-free, large-vocabulary voice-based access to Internet applications in smart homes. This would allow users to control web-enabled smart-home functionalities naturally, using their voice (e.g., turning web-enabled “smart” appliances on or off), and also to access specific Internet applications (e.g., web search, email dictation, access to social networks). Truly hands-free operation of the voice interface is equally important: users will not have to turn towards a microphone or other device, or wear a headset.

Therefore, LISTEN will develop (a) a robust hands-free speech capture system operating as a wireless acoustic sensor network (WASN), specifically designed for the smart home, and (b) a large-vocabulary automatic speech recognition system optimised for accessing web applications and controlling web-enabled smart home automation functionalities. LISTEN pushes the boundaries of the current state of the art by bridging the gap between the acoustic front-end and automatic speech recognition research communities, with the common goal of developing a smart-home-specific natural voice interface to web services.

The role of SPL: 

  • Develop innovative algorithms for robust speech acquisition using microphone arrays, for distant speech recognition in the smart home.
  • Develop the software and hardware components of a Wireless Acoustic Sensor Network for the smart home, as the front-end for a hands-free large-vocabulary speech recognition system.

Related Publications: 

Caetano, Marcello; Kafentzis, George; Mouchtaris, Athanasios, “Adaptive Modeling of Synthetic Nonstationary Sinusoids”, to be presented at the 18th International Conference on Digital Audio Effects (DAFx-15), Trondheim, Norway, November 30 – December 3, 2015.

Morfi, Veronica; Degottex, Gilles; Mouchtaris, Athanasios, “Speech Analysis and Synthesis with a Computationally Efficient Adaptive Harmonic Model”, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 23 (11), pp. 1950-1962, 2015, ISSN: 2329-9290.

Alexandridis, Anastasios; Mouchtaris, Athanasios, “Multiple Sound Source Location Estimation and Counting in a Wireless Acoustic Sensor Network”, in Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), October 18-21, 2015.

Achievements: 

  • A first version of the large-vocabulary speech recognition system is already available in four languages (English, German, Italian, Greek).
  • A first version of the acoustic sensor network system is already available.
  • A first version of the robust speech processing front-end is already available.