Using Part-Of-Speech Tags for Predicting Phrase Breaks

Read, I. and Cox, S. J. (2004) Using Part-Of-Speech Tags for Predicting Phrase Breaks. In: 8th International Conference on Spoken Language Processing (Interspeech 2004), 2004-10-04 - 2004-10-08.

Full text not available from this repository.

Abstract

Predicting the location of phrase breaks within an utterance is an important task in text-to-speech synthesis, and can be done with reasonable accuracy using part-of-speech (POS) tags as features. However, it seems unlikely that the 40 or more different tags used by most taggers all contribute to this task, and in fact many may contribute noise. In this paper, we present an algorithm for reducing the standard Penn Treebank POS tag set for use in predicting phrase breaks. Using a best-first search approach, the algorithm considers possible groupings of tags, searching for the groupings that yield the highest overall performance. The reduced tag sets were evaluated using an n-gram model trained on POS sequences along with their associated junctures (break/non-break). The reduced tag set raised the model's performance on junctures correct from 90.38% to 92.43%, and reduced insertions from 2.89% to 1.83%.
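
The search over tag groupings described in the abstract can be illustrated with a minimal sketch. The full text is not available here, so this is not the authors' implementation: the function names (merge_successors, best_first_grouping, make_toy_scorer), the four-tag subset, and the tiny data sets are all hypothetical. In particular, the scorer is a per-group majority vote checked against held-out examples, swapped in for the n-gram juncture model the paper uses, purely to keep the example self-contained.

```python
import heapq
from itertools import combinations, count


def merge_successors(partition):
    """Yield every partition obtained by merging two groups of `partition`."""
    groups = list(partition)
    for a, b in combinations(groups, 2):
        yield frozenset(g for g in groups if g not in (a, b)) | {a | b}


def best_first_grouping(tags, score, max_expansions=200):
    """Best-first search over tag groupings.

    Always expands the highest-scoring grouping on the frontier and
    returns (best_score, best_partition) seen during the search.
    """
    start = frozenset(frozenset([t]) for t in tags)
    tie = count()  # tie-breaker so the heap never compares partitions
    start_score = score(start)
    frontier = [(-start_score, next(tie), start)]  # max-heap via negation
    seen = {start}
    best_score, best = start_score, start
    for _step in range(max_expansions):
        if not frontier:
            break
        neg_score, _order, state = heapq.heappop(frontier)
        if -neg_score > best_score:
            best_score, best = -neg_score, state
        for nxt in merge_successors(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (-score(nxt), next(tie), nxt))
    return best_score, best


def make_toy_scorer(train, held_out):
    """Hypothetical stand-in for the paper's n-gram model.

    Predicts a break after a word if the word's tag group was followed
    by more breaks than non-breaks in `train`; returns held-out accuracy.
    """
    def score(partition):
        group_of = {tag: group for group in partition for tag in group}
        votes = {}
        for tag, is_break in train:
            group = group_of[tag]
            votes[group] = votes.get(group, 0) + (1 if is_break else -1)
        correct = sum(
            (votes.get(group_of[tag], 0) > 0) == is_break
            for tag, is_break in held_out
        )
        return correct / len(held_out)
    return score


# Invented toy data: (POS tag of a word, whether a break follows it).
TAGS = ["NN", "NNS", "DT", "JJ"]
train = [("NN", True), ("NN", True), ("DT", False), ("DT", False), ("JJ", False)]
held_out = [("NNS", True), ("NN", True), ("DT", False), ("JJ", False)]

score = make_toy_scorer(train, held_out)
best_score, best = best_first_grouping(TAGS, score)
print(best_score, sorted(sorted(group) for group in best))
```

In this toy setting NNS never appears in the training data, so the ungrouped tag set mispredicts it, while any grouping that places NNS with NN recovers the break. That is the kind of gain a reduced tag set is meant to provide, although the paper's actual search space, scoring model, and stopping criterion may differ from this sketch.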

Item Type: Conference or Workshop Item (Paper)
Faculty \ School: Faculty of Science > School of Computing Sciences
Depositing User: Vishal Gautam
Date Deposited: 21 Jul 2011 14:02
Last Modified: 24 Jul 2019 12:27
URI: https://ueaeprints.uea.ac.uk/id/eprint/21646
