TY - JOUR
T1 - Reproducible Speech Research With the Artificial Intelligence–Ready PERCEPT Corpora
AU - Benway, Nina R.
AU - Preston, Jonathan L.
AU - Hitchcock, Elaine
AU - Rose, Yvan
AU - Salekin, Asif
AU - Liang, Wendy
AU - McAllister, Tara
N1 - Funding Information:
Funding for corpus compilation was provided by the National Institute on Deafness and Other Communication Disorders under Grant R01DC017476-S2 (principal investigator: Tara McAllister). This research was supported in part through computational resources provided by Syracuse University under National Science Foundation Grants ACI-1341006 and ACI-1541396. The authors wish to thank the participants and their families, as well as the many research speech-language pathologists (most notably, Megan Leece) whose dedication, time, and ingenuity have generated the data for this corpus. The authors are also appreciative of the research assistants involved in corpus curation, including Kelly Garcia, Allison Corsetti, and Michela Eivers (annotation, verification), as well as Felicia Pace (Google Speech Data). Much gratitude is also given to Elizabeth Roepke, who provided thought-provoking insight related to the use of Perceptual Error Rating for the Clinical Evaluation of Phonetic Targets in an academic context.
Funding Information:
The early development of Phon was funded by grants from the Social Sciences and Humanities Research Council of Canada and the Canada Foundation for Innovation as well as by a Petro-Canada Young Innovator Award. Since 2006, the development of Phon and PhonBank has been funded primarily through grants from the National Institutes of Health (HD051698, R01 HD051698-06A1, R01 HD051698-11, and R01 HD051698-16).
Publisher Copyright:
© 2023, American Speech-Language-Hearing Association. All rights reserved.
PY - 2023/6
Y1 - 2023/6
N2 - Background: Publicly available speech corpora facilitate reproducible research by providing open-access data for participants who have consented/assented to data sharing among different research teams. Such corpora can also support clinical education, including perceptual training and training in the use of speech analysis tools. Purpose: In this research note, we introduce the PERCEPT (Perceptual Error Rating for the Clinical Evaluation of Phonetic Targets) corpora, PERCEPT-R (Rhotics) and PERCEPT-GFTA (Goldman-Fristoe Test of Articulation), which together contain over 36 hr of speech audio (> 125,000 syllable, word, and phrase utterances) from children, adolescents, and young adults aged 6–24 years with speech sound disorder (primarily residual speech sound disorders impacting /ɹ/) and age-matched peers. We highlight PhonBank as the repository for the corpora and demonstrate use of the associated speech analysis software, Phon, to query PERCEPT-R. A worked example of research with PERCEPT-R, suitable for clinical education and research training, is included as an appendix. Support for end users and information/descriptive statistics for future releases of the PERCEPT corpora can be found in a dedicated Slack channel. Finally, we discuss the potential for PERCEPT corpora to support the training of artificial intelligence clinical speech technology appropriate for use with children with speech sound disorders, the development of which has historically been constrained by the limited representation of either children or individuals with speech impairments in publicly available training corpora. Conclusions: We demonstrate the use of PERCEPT corpora, PhonBank, and Phon for clinical training and research questions appropriate to child citation speech. Increased use of these tools has the potential to enhance reproducibility in the study of speech development and disorders.
AB - Background: Publicly available speech corpora facilitate reproducible research by providing open-access data for participants who have consented/assented to data sharing among different research teams. Such corpora can also support clinical education, including perceptual training and training in the use of speech analysis tools. Purpose: In this research note, we introduce the PERCEPT (Perceptual Error Rating for the Clinical Evaluation of Phonetic Targets) corpora, PERCEPT-R (Rhotics) and PERCEPT-GFTA (Goldman-Fristoe Test of Articulation), which together contain over 36 hr of speech audio (> 125,000 syllable, word, and phrase utterances) from children, adolescents, and young adults aged 6–24 years with speech sound disorder (primarily residual speech sound disorders impacting /ɹ/) and age-matched peers. We highlight PhonBank as the repository for the corpora and demonstrate use of the associated speech analysis software, Phon, to query PERCEPT-R. A worked example of research with PERCEPT-R, suitable for clinical education and research training, is included as an appendix. Support for end users and information/descriptive statistics for future releases of the PERCEPT corpora can be found in a dedicated Slack channel. Finally, we discuss the potential for PERCEPT corpora to support the training of artificial intelligence clinical speech technology appropriate for use with children with speech sound disorders, the development of which has historically been constrained by the limited representation of either children or individuals with speech impairments in publicly available training corpora. Conclusions: We demonstrate the use of PERCEPT corpora, PhonBank, and Phon for clinical training and research questions appropriate to child citation speech. Increased use of these tools has the potential to enhance reproducibility in the study of speech development and disorders.
UR - http://www.scopus.com/inward/record.url?scp=85163922989&partnerID=8YFLogxK
U2 - 10.1044/2023_JSLHR-22-00343
DO - 10.1044/2023_JSLHR-22-00343
M3 - Article
C2 - 37319018
AN - SCOPUS:85163922989
SN - 1092-4388
VL - 66
SP - 1986
EP - 2009
JO - Journal of Speech, Language, and Hearing Research
JF - Journal of Speech, Language, and Hearing Research
IS - 6
ER -