Automatic phonetic transcription - Lëtzebuergesch


The task

Transcribe any Luxembourgish word into its canonical phonetic form in the International Phonetic Alphabet (IPA). To test, type: Den Duerchschnëttsalter läit bei 46 Joer (do not press the Enter key). Transcriptions are generated by two algorithms, Sequitur g2p [broken] and the more recent Gramophone, so that the differences between the two methods can also be evaluated.
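One way to evaluate the differences between two transcription methods is to run both on the same word list and collect the words on which they disagree. The following is a minimal sketch of that idea, not the code behind this page: the names compare_transcribers and disagreement_rate are hypothetical, and the two engines are passed in as plain functions, so no assumption is made about how Sequitur g2p or Gramophone are actually called.

```python
from typing import Callable, Dict, List, Union

# A transcriber is any function that maps one word to one transcription string.
# How a real engine is invoked is outside the scope of this sketch.
Transcriber = Callable[[str], str]
Row = Dict[str, Union[str, bool]]


def compare_transcribers(words: List[str], first: Transcriber, second: Transcriber) -> List[Row]:
    """Run both transcribers on every word and record whether they agree."""
    rows: List[Row] = []
    for word in words:
        out_first = first(word)
        out_second = second(word)
        rows.append({
            "word": word,
            "first": out_first,
            "second": out_second,
            "agree": out_first == out_second,
        })
    return rows


def disagreement_rate(rows: List[Row]) -> float:
    """Share of words with differing outputs (0.0 for an empty list)."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if not row["agree"]) / len(rows)


if __name__ == "__main__":
    # Stand-in functions for illustration only; their output is NOT IPA.
    def stub_a(word: str) -> str:
        return word.lower()

    def stub_b(word: str) -> str:
        return word.lower().replace("ë", "e")

    for row in compare_transcribers(["Den", "Duerchschnëttsalter", "läit"], stub_a, stub_b):
        print(row)
```

Words on which the two outputs differ are natural candidates for manual checking.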

How it works

Enter Luxembourgish text and wait a moment. The application takes words or sentences as input and transcribes them with a model trained on some 7000 words. The phonetic transcription is a canonical one; regional and style-related pronunciation variants are not considered. Because the approach is automatic and statistical, the programme will also generate incorrect transcriptions, especially for names, numbers, abbreviations and loans from French, German or English. The system is still experimental. If you receive erroneous transcriptions, please let me know and I will try to re-train the model accordingly. The transcription conventions follow Gilles, Peter and Jürgen Trouvain. 2013. [Illustration of the IPA] Luxembourgish. Journal of the International Phonetic Association 43.67–74. doi:10.1017/S0025100312000278.
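Since the input may be a whole sentence while the model works on words, some splitting step has to come first. The sketch below shows one plausible way to do it and to mark tokens of the kinds named above as error-prone. It is an illustration under assumptions, not the application's actual preprocessing: tokenize and classify are hypothetical names, the rules are simple heuristics, and names and loanwords cannot be detected this way.

```python
import re
import unicodedata
from typing import List

# Runs of letters and digits; underscores and punctuation act as separators.
TOKEN_RE = re.compile(r"[^\W_]+")


def tokenize(text: str) -> List[str]:
    """Split input text into word-like tokens.

    The text is normalised to NFC first, so that a letter typed as
    base character plus combining diaeresis stays inside its word.
    """
    normalised = unicodedata.normalize("NFC", text)
    return TOKEN_RE.findall(normalised)


def classify(token: str) -> str:
    """Rough heuristic label: 'number', 'abbreviation' or 'word'."""
    if any(ch.isdigit() for ch in token):
        return "number"
    if len(token) > 1 and token.isupper():
        return "abbreviation"
    return "word"


if __name__ == "__main__":
    sentence = "Den Duerchschnëttsalter läit bei 46 Joer"
    for token in tokenize(sentence):
        print(token, classify(token))
```

For the test sentence, only the token 46 is labelled as a number; such tokens could be shown with a warning or spelled out before transcription.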

© Feb. 2016 Peter Gilles, using Sequitur g2p, Gramophone and RPC functions from gr0w.