Welcome to the AV Digits Database!


Silent speech interfaces have recently been proposed as a way to enable communication when the acoustic signal is not available. This introduces the need to build visual speech recognition systems for silent and whispered speech. However, almost all recently proposed systems have been trained on vocalised data only, despite evidence in the literature that lip movements change depending on the speech mode.


We introduce a new audiovisual database which contains normal, whispered and silent speech. We recorded 53 participants from three different views (frontal, 45° and profile) pronouncing digits and phrases in three speech modes. To the best of our knowledge, this is the first publicly available audiovisual database that contains all three speech modes.


The database consists of two parts: digits and short phrases. In the first part, participants were asked to read the 10 digits, from 0 to 9, in English in random order five times. For non-native English speakers, this part was also repeated in the participant’s native language. In total, 53 participants (41 males and 12 females) from 16 nationalities were recorded, with a mean age of 26.7 years and a standard deviation of 4.3 years.

 

In the second part, participants were asked to read 10 short phrases. The phrases are the same as the ones used in the OuluVS2 database: “Excuse me”, “Goodbye”, “Hello”, “How are you”, “Nice to meet you”, “See you”, “I am sorry”, “Thank you”, “Have a good time”, “You are welcome”. Again, each phrase was repeated five times in each of the three speech modes: normal, whispered and silent speech. Thirty-nine participants (32 males and 7 females) were recorded for this part, with a mean age of 26.3 years and a standard deviation of 3.8 years.
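As a rough illustration of the recording design for the phrases part, the following sketch enumerates the nominal phrase / speech mode / repetition combinations for one participant. The label strings, variable names and function name are hypothetical and do not reflect the actual file naming or layout of the database. The counts are nominal figures derived from the description above and assume that no recordings are missing.

```python
from itertools import product

# The ten phrases listed above (shared with the OuluVS2 database).
PHRASES = [
    "Excuse me", "Goodbye", "Hello", "How are you", "Nice to meet you",
    "See you", "I am sorry", "Thank you", "Have a good time", "You are welcome",
]
# Illustrative labels only; the database may name the modes differently.
MODES = ["normal", "whispered", "silent"]
REPETITIONS = 5            # each phrase is repeated five times per mode
PHRASE_PARTICIPANTS = 39   # participants recorded for the phrases part


def nominal_phrase_utterances():
    """Return every (phrase, mode, repetition) combination for one participant."""
    return list(product(PHRASES, MODES, range(1, REPETITIONS + 1)))


utterances = nominal_phrase_utterances()
print(len(utterances))                        # nominal utterances per participant
print(len(utterances) * PHRASE_PARTICIPANTS)  # nominal utterances over all participants
```

Under these assumptions, each participant contributes 10 x 3 x 5 = 150 phrase utterances, for a nominal 5,850 phrase utterances over the 39 participants, before accounting for the three camera views.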


If you use the database, please cite the following paper:

S. Petridis, J. Shen, D. Cetin, M. Pantic, Visual-Only Recognition of Normal, Whispered And Silent Speech, submitted to IEEE ICASSP 2018

More details about the training / validation / test partition can be found on the About page.