Lip-Reading Using Deep Learning Methods

Workshop website

Paper submission details

To be announced

Workshop Information

The aim of the workshop is to gather researchers working on Visual Speech Recognition (also known as Lip-Reading) to disseminate their work and to exchange views on the new potential created in the field by the advent of Deep Learning methods. Lip-Reading is of major importance for a wide range of applications, such as silent dictation, speech recognition in noisy environments, improved hearing aids, and biometrics. Moreover, it lies at the intersection of Computer Vision and Speech Recognition, the two fields that pioneered Deep Learning methods.

Topics of interest include, but are not limited to:

  • Deep Learning methods for Lip-Reading
  • Audio-Visual Speech Recognition and fusion methods
  • Combinations of probabilistic and Deep Learning methods for Lip-Reading
  • Visual units for Lip-Reading
  • Tracking methods for Lip-Reading
  • Multi-View Lip-Reading
  • Lip-Reading and Audio-Visual Databases

Important dates

Workshop paper submission deadline:    July 10th 2017 (extended to July 13th 2017)

Paper acceptance notification:         July 18th 2017

Camera-ready submission:               July 25th 2017


List of presentations:

  [1] M Kubokawa and T Saitoh, “Intensity Correction Effect for Lip Reading” 

  [2] H L Bear, “Visual gesture variability between talkers in continuous visual speech”

  [3] K Thangthai, H L Bear and R Harvey, “Comparing phonemes and visemes with DNN-based lipreading”

  [4] H L Bear and S Taylor, “Visual speech recognition: aligning terminologies for improved understanding”

Extended Abstracts

  [5] J-C Hou, S-S Wang, Y Tsao and H-M Wang, “Audio-Visual Speech Enhancement based on Multimodal Deep Convolutional Neural Network”

  [6] L Liu, G Feng and D Beautemps, “Inner lips features extraction based on CLNF with hybrid dynamic template for cued speech”

  [7] C Wright and D Stewart, “Real-World DataSets for Lip-Based Research”

  [8] G Sterpu and N Harte, “Lipreading Sentences with Sequence to Sequence Models: Preliminary Results”

  [9] T Stafylakis and G Tzimiropoulos, “Visual word recognition using Residual Networks and LSTMs”

  [10] S Petridis, Y Wang, Z Li and M Pantic, “End-to-end multi-view lipreading”