Validating and optimizing a crowdsourced method for gradient measures of child speech

Tara Mc Allister Byun, Elaine Hitchcock, Daphna Harel

Research output: Contribution to journal › Conference article

1 Citation (Scopus)

Abstract

There is broad consensus that speech sound development is a gradual process, with acoustic measures frequently revealing covert contrast between sounds perceived as identical. Well-constructed perceptual tasks using Visual Analog Scaling (VAS) can draw out these gradient differences. However, this method has not seen widespread uptake in speech acquisition research, possibly due to the time-intensive nature of VAS data collection. This project tested the validity of streamlined VAS data collection via crowdsourcing. It also addressed a methodological question that would be challenging to answer through conventional data collection: when collecting ratings of speech samples elicited from multiple individuals, should those samples be presented in fully random order, or grouped by speaker? One hundred naïve listeners recruited through Amazon Mechanical Turk provided VAS ratings for 120 /r/ words produced by 4 children before, during, and after intervention. Fifty listeners rated the stimuli in fully randomized order and fifty in grouped-by-speaker order. Mean click location was compared against an acoustic standard, and standard error of click location was used to index variability. In both conditions, mean click location was highly correlated with the acoustic measure, supporting the validity of speech ratings obtained via crowdsourcing. Lower variability was observed in the grouped presentation condition.
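
The validity and variability computations described in the abstract can be sketched in outline. Below is a minimal Python illustration using simulated data: the names (ratings, acoustic_score), the simulation parameters, and the generic per-token acoustic score standing in for the paper's acoustic standard are all illustrative assumptions, not values or code from the study.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_listeners, n_stimuli = 50, 120   # one presentation condition: 50 listeners x 120 /r/ tokens

# Simulated VAS click locations on a 0-1 scale (hypothetical data).
true_quality = rng.uniform(0.0, 1.0, n_stimuli)              # latent /r/ quality per token
noise = rng.normal(0.0, 0.15, (n_listeners, n_stimuli))      # listener-level rating noise
ratings = np.clip(true_quality + noise, 0.0, 1.0)            # listeners x stimuli matrix

# Hypothetical acoustic standard for each token (e.g., a normalized
# formant-based measure); here it is simulated to track true quality.
acoustic_score = true_quality + rng.normal(0.0, 0.05, n_stimuli)

# Validity: correlate mean click location per token with the acoustic measure.
mean_click = ratings.mean(axis=0)
r, p = stats.pearsonr(mean_click, acoustic_score)

# Variability: standard error of click location per token, the quantity
# compared between randomized and grouped-by-speaker presentation.
se_click = stats.sem(ratings, axis=0)

print(f"Pearson r = {r:.2f} (p = {p:.3g})")
print(f"median SE of click location = {np.median(se_click):.3f}")

Under this setup, a high Pearson correlation between mean click location and the acoustic score mirrors the paper's validity check, while the per-token standard error corresponds to the variability index compared across the two presentation conditions.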

Original language: English
Pages (from-to): 2834-2838
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2015-January
State: Published - 1 Jan 2015
Event: 16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015 - Dresden, Germany
Duration: 6 Sep 2015 - 10 Sep 2015


Keywords

  • Acquisition and disorders
  • Covert contrast
  • Crowdsourcing
  • Perceptual rating

Cite this

@article{49227dc59e98453c862da0ad08c2833c,
title = "Validating and optimizing a crowdsourced method for gradient measures of child speech",
abstract = "There is broad consensus that speech sound development is a gradual process, with acoustic measures frequently revealing covert contrast between sounds perceived as identical. Well-constructed perceptual tasks using Visual Analog Scaling (VAS) can draw out these gradient differences. However, this method has not seen widespread uptake in speech acquisition research, possibly due to the time-intensive character of VAS data collection. This project tested the validity of streamlined VAS data collection via crowdsourcing. It also addressed a methodological question that would be challenging to answer through conventional data collection: when collecting ratings of speech samples elicited from multiple individuals, should those samples be presented in fully random order, or grouped by speaker? 100 na{\"i}ve listeners recruited through Amazon Mechanical Turk provided VAS ratings for 120 /r/ words produced by 4 children before, during, and after intervention. 50 listeners rated the stimuli in fully randomized order and 50 in grouped-by-speaker order. Mean click location was compared against an acoustic standard, and standard error of click location was used to index variability. In both conditions, mean click location was highly correlated with the acoustic measure, supporting the validity of speech ratings obtained via crowdsourcing. Lower variability was observed in the grouped presentation condition.",
keywords = "Acquisition and disorders, Covert contrast, Crowdsourcing, Perceptual rating",
author = "Byun, {Tara Mc Allister} and Elaine Hitchcock and Daphna Harel",
year = "2015",
month = "1",
day = "1",
language = "English",
volume = "2015-January",
pages = "2834--2838",
journal = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",
issn = "2308-457X",

}

Validating and optimizing a crowdsourced method for gradient measures of child speech. / Byun, Tara Mc Allister; Hitchcock, Elaine; Harel, Daphna.

In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, Vol. 2015-January, 01.01.2015, p. 2834-2838.

Research output: Contribution to journal › Conference article

TY - JOUR

T1 - Validating and optimizing a crowdsourced method for gradient measures of child speech

AU - Byun, Tara Mc Allister

AU - Hitchcock, Elaine

AU - Harel, Daphna

PY - 2015/1/1

Y1 - 2015/1/1

KW - Acquisition and disorders

KW - Covert contrast

KW - Crowdsourcing

KW - Perceptual rating

UR - http://www.scopus.com/inward/record.url?scp=84959113429&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:84959113429

VL - 2015-January

SP - 2834

EP - 2838

JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

SN - 2308-457X

ER -