Florian Schiel (München):
Automatic analysis of speech with "MAUS"

Thursday, 12:00

This paper describes a method for automatically labeling and segmenting large speech corpora into broad phonetic segments within the framework of the 'MAUS' project ([1]). 'MAUS' stands for 'Munich Automatic Segmentation System'; it is a general-purpose tool that automatically labels and segments read and spontaneous German speech into phonetic/phonological segments. The output of MAUS can be used to build probabilistic models of the pronunciation of fluent German as reflected in the analyzed corpus. These models can serve as the basis for phonetic/phonological investigations or can be incorporated into classic speech recognition algorithms.

The paper is organized as follows. Section 1 gives a very short introduction to the main processing principle of MAUS and shows some examples of MAUS output for utterances from the Verbmobil corpus.

Section 2 deals very briefly with the problem of how to evaluate such output. The method presented first compares the performance of three human transcribers with each other and then compares the performance of MAUS with that of each transcriber.
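
The comparison scheme can be illustrated with a minimal sketch. The abstract does not say which agreement measure is used, so the measure below (one minus the normalized edit distance between two label sequences) is an assumption, as are all function names and the example labels.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between two label sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # delete x
                            curr[j - 1] + 1,           # insert y
                            prev[j - 1] + (x != y)))   # substitute or match
        prev = curr
    return prev[-1]


def agreement(a, b):
    """Assumed measure: 1 - edit distance / length of the longer sequence."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def compare(human, system):
    """Pairwise human-human agreement, then system agreement with each human."""
    names = sorted(human)
    inter_human = {(p, q): agreement(human[p], human[q])
                   for p, q in combinations(names, 2)}
    system_vs_human = {p: agreement(system, human[p]) for p in names}
    return inter_human, system_vs_human


# Invented SAMPA-like labels for one word; not taken from the Verbmobil corpus.
human = {
    "A": ["h", "a:", "b", "@", "n"],
    "B": ["h", "a:", "b", "m"],
    "C": ["h", "a:", "m"],
}
system = ["h", "a:", "b", "m"]

inter_human, system_vs_human = compare(human, system)
```

Reading the two result tables side by side shows whether the system's agreement with each transcriber falls within the range of agreement among the transcribers themselves.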

As an example, Section 3 describes our method for deriving probabilistic pronunciation dictionaries from the MAUS output and gives some interesting examples from the Verbmobil domain.
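
One simple way to picture such a dictionary is to estimate the probability of each pronunciation variant by its relative frequency in the labeled corpus. The sketch below assumes this estimator; the abstract does not specify how the probabilities are derived. The function name, the `min_count` pruning parameter, and the example data are hypothetical.

```python
from collections import Counter, defaultdict


def build_pronunciation_dictionary(observations, min_count=1):
    """Map each word to its observed variants with relative-frequency estimates.

    observations: iterable of (word, list_of_phone_labels) pairs.
    min_count: variants seen fewer times than this are dropped (assumed pruning).
    """
    counts = defaultdict(Counter)
    for word, variant in observations:
        counts[word][tuple(variant)] += 1

    dictionary = {}
    for word, variants in counts.items():
        kept = {v: c for v, c in variants.items() if c >= min_count}
        total = sum(kept.values())
        if total:
            # Renormalize over the variants that survived pruning.
            dictionary[word] = {" ".join(v): c / total for v, c in kept.items()}
    return dictionary


# Invented observations in SAMPA-like notation; not real MAUS output.
observations = [
    ("haben", ["h", "a:", "b", "@", "n"]),
    ("haben", ["h", "a:", "b", "m"]),
    ("haben", ["h", "a:", "b", "m"]),
    ("haben", ["h", "a:", "m"]),
]

pron_dict = build_pronunciation_dictionary(observations)
```

Raising `min_count` trades coverage of rare variants for a smaller dictionary with more reliable estimates.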

The fourth and last section outlines some new approaches to incorporating these models into a new automatic speech recognition (ASR) approach that combines phonetically 'sharper' acoustic models with the probabilistic modeling of pronunciation.

References:
