A Rule-Based Phrase Parser for Real-Time Text-To-Speech Synthesis

Joan Bachenko, Eileen Fitzpatrick

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

Text-to-speech systems are currently designed to work on complete sentences and paragraphs, thereby allowing front end processors access to large amounts of linguistic context. Problems with this design arise when applications require text to be synthesized in near real time, as it is being typed. How does the system decide which incoming words should be collected and synthesized as a group when prior and subsequent word groups are unknown? We describe a rule-based parser that uses a three cell buffer and phrasing rules to identify break points for incoming text. Words up to the break point are synthesized as new text is moved into the buffer; no hierarchical structure is built beyond the lexical level. The parser was developed for use in a system that synthesizes written telecommunications by Deaf and hard of hearing people. These are texts written entirely in upper case, with little or no punctuation, and using a nonstandard variety of English (e.g. WHEN DO I WILL CALL BACK YOU). The parser performed well in a three month field trial utilizing tens of thousands of texts. Laboratory tests indicate that the parser exhibited a low error rate when compared with a human reader.
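
The buffering scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not list the phrasing rules, so the cue-word set BREAK_BEFORE, the rule function is_break, and the generator phrase_stream are all hypothetical names and assumptions made for illustration. The sketch only shows how a three-cell window over incoming words can release a word group for synthesis as soon as a break point is found, without building any structure above the word level.

```python
from collections import deque
from itertools import chain
from typing import Deque, Iterable, Iterator, List, Optional

# Hypothetical cue words standing in for the paper's phrasing rules,
# which the abstract does not list.
BREAK_BEFORE = {"AND", "BUT", "BECAUSE", "IF", "SO", "WHEN"}


def is_break(left: str, middle: Optional[str], right: Optional[str]) -> bool:
    """Assumed rule: break between `left` and `middle` when `middle` is a
    cue word and at least one more word (`right`) follows it."""
    return middle in BREAK_BEFORE and right is not None


def phrase_stream(words: Iterable[str]) -> Iterator[List[str]]:
    """Yield word groups as soon as a break point is seen in a 3-cell buffer."""
    buffer: Deque[Optional[str]] = deque()
    phrase: List[str] = []
    # Two None sentinels push the last real words through the buffer.
    for word in chain(words, [None, None]):
        buffer.append(word)
        if len(buffer) < 3:
            continue
        left, middle, right = buffer
        buffer.popleft()
        phrase.append(left)  # the oldest word leaves the buffer
        if is_break(left, middle, right):
            yield phrase  # words up to the break point go to the synthesizer
            phrase = []
    if phrase:
        yield phrase
```

Under these assumed cue words, list(phrase_stream("I WILL CALL YOU BACK WHEN I GET HOME".split())) gives two groups, I WILL CALL YOU BACK and WHEN I GET HOME. A real rule set would replace is_break; the buffer mechanics stay the same.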

Original language: English
Pages (from-to): 191-212
Number of pages: 22
Journal: Natural Language Engineering
Volume: 1
Issue number: 2
DOI: 10.1017/S1351324900000140
State: Published - 1 Jan 1995

Fingerprint

Speech synthesis
Audition
Linguistics
Telecommunication
time
Group

Cite this

@article{cea582c67d4b42b997337c292003da72,
title = "A Rule-Based Phrase Parser for Real-Time Text-To-Speech Synthesis",
abstract = "Text-to-speech systems are currently designed to work on complete sentences and paragraphs, thereby allowing front end processors access to large amounts of linguistic context. Problems with this design arise when applications require text to be synthesized in near real time, as it is being typed. How does the system decide which incoming words should be collected and synthesized as a group when prior and subsequent word groups are unknown? We describe a rule-based parser that uses a three cell buffer and phrasing rules to identify break points for incoming text. Words up to the break point are synthesized as new text is moved into the buffer; no hierarchical structure is built beyond the lexical level. The parser was developed for use in a system that synthesizes written telecommunications by Deaf and hard of hearing people. These are texts written entirely in upper case, with little or no punctuation, and using a nonstandard variety of English (e.g. WHEN DO I WILL CALL BACK YOU). The parser performed well in a three month field trial utilizing tens of thousands of texts. Laboratory tests indicate that the parser exhibited a low error rate when compared with a human reader.",
author = "Joan Bachenko and Eileen Fitzpatrick",
year = "1995",
month = "1",
day = "1",
doi = "10.1017/S1351324900000140",
language = "English",
volume = "1",
pages = "191--212",
journal = "Natural Language Engineering",
issn = "1351-3249",
publisher = "Cambridge University Press",
number = "2"
}

A Rule-Based Phrase Parser for Real-Time Text-To-Speech Synthesis. / Bachenko, Joan; Fitzpatrick, Eileen.

In: Natural Language Engineering, Vol. 1, No. 2, 01.01.1995, p. 191-212.

TY - JOUR

T1 - A Rule-Based Phrase Parser for Real-Time Text-To-Speech Synthesis

AU - Bachenko, Joan

AU - Fitzpatrick, Eileen

PY - 1995/1/1

Y1 - 1995/1/1

AB - Text-to-speech systems are currently designed to work on complete sentences and paragraphs, thereby allowing front end processors access to large amounts of linguistic context. Problems with this design arise when applications require text to be synthesized in near real time, as it is being typed. How does the system decide which incoming words should be collected and synthesized as a group when prior and subsequent word groups are unknown? We describe a rule-based parser that uses a three cell buffer and phrasing rules to identify break points for incoming text. Words up to the break point are synthesized as new text is moved into the buffer; no hierarchical structure is built beyond the lexical level. The parser was developed for use in a system that synthesizes written telecommunications by Deaf and hard of hearing people. These are texts written entirely in upper case, with little or no punctuation, and using a nonstandard variety of English (e.g. WHEN DO I WILL CALL BACK YOU). The parser performed well in a three month field trial utilizing tens of thousands of texts. Laboratory tests indicate that the parser exhibited a low error rate when compared with a human reader.

UR - http://www.scopus.com/inward/record.url?scp=84974251983&partnerID=8YFLogxK

U2 - 10.1017/S1351324900000140

DO - 10.1017/S1351324900000140

M3 - Article

AN - SCOPUS:84974251983

VL - 1

SP - 191

EP - 212

JO - Natural Language Engineering

JF - Natural Language Engineering

SN - 1351-3249

IS - 2

ER -