Quantifying the speaking voice: Generating a speaker code as a means of speaker identification using a simple code-matching technique

Peter Popolo, Richard W. Sanders, Ingo R. Titze

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

This paper examines a methodology for quantifying the speaking voice, in which temporal and spectral features of the voice are extracted and processed to create a numeric code that identifies a speaker, so that speakers can be searched in a database much like fingerprints. The parameters studied include: (1) the average fundamental frequency (F0) of the speech signal over time, (2) the standard deviation of the F0, (3) the slope and (4) sign of the F0 contour, (5) the average energy, (6) the standard deviation of the energy, (7) the spectral energy contained from 50 Hz to 1,000 Hz, (8) the spectral energy from 1,000 Hz to 5,000 Hz, (9) the Alpha Ratio, (10) the average speaking rate, and (11) the total duration of the spoken sentence.
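The abstract does not specify how the features are turned into a code or how codes are compared, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes per-frame F0 and energy values and the two band energies have already been measured, quantizes each feature into one of ten bins, and scores two codes by the fraction of matching digits. All function names, bin ranges, and the scoring rule are hypothetical, the Alpha Ratio is taken as the high-band to low-band energy ratio in dB (one common convention), and speaking rate is omitted because it needs a syllable count.

```python
import math
from statistics import mean, pstdev


def f0_slope(f0, frame_step_s):
    """Least-squares slope of the F0 contour, in Hz per second."""
    if len(f0) < 2:
        raise ValueError("need at least two F0 frames")
    t = [i * frame_step_s for i in range(len(f0))]
    t_bar, f_bar = mean(t), mean(f0)
    num = sum((ti - t_bar) * (fi - f_bar) for ti, fi in zip(t, f0))
    den = sum((ti - t_bar) ** 2 for ti in t)
    return num / den


def alpha_ratio_db(low_band_energy, high_band_energy):
    """Assumed convention: 1-5 kHz energy over 50 Hz-1 kHz energy, in dB."""
    return 10.0 * math.log10(high_band_energy / low_band_energy)


def quantize(value, lo, hi, levels=10):
    """Map a value in [lo, hi] to a bin 0..levels-1, clamping outliers."""
    if value <= lo:
        return 0
    if value >= hi:
        return levels - 1
    return int((value - lo) / (hi - lo) * levels)


def speaker_code(f0, energy, frame_step_s, low_band_energy, high_band_energy):
    """Build a tuple of digits from the features; bin ranges are illustrative."""
    slope = f0_slope(f0, frame_step_s)
    return (
        quantize(mean(f0), 50.0, 400.0),           # (1) average F0, Hz
        quantize(pstdev(f0), 0.0, 100.0),          # (2) F0 standard deviation
        quantize(abs(slope), 0.0, 100.0),          # (3) slope magnitude, Hz/s
        1 if slope >= 0 else 0,                    # (4) sign of the slope
        quantize(mean(energy), 0.0, 1.0),          # (5) average energy
        quantize(pstdev(energy), 0.0, 0.5),        # (6) energy standard deviation
        quantize(alpha_ratio_db(low_band_energy, high_band_energy), -40.0, 0.0),
        quantize(len(f0) * frame_step_s, 0.0, 10.0),  # (11) duration, s
    )


def match_score(code_a, code_b):
    """Fraction of positions at which two codes agree."""
    return sum(a == b for a, b in zip(code_a, code_b)) / len(code_a)


def best_match(query, database):
    """Return the name of the stored code that best matches the query."""
    return max(database, key=lambda name: match_score(query, database[name]))


# Hypothetical measurements for two enrolled speakers and one query utterance.
database = {
    "speaker_a": speaker_code([120, 122, 124, 126, 128],
                              [0.2, 0.3, 0.25, 0.3, 0.2], 0.01, 1.0, 0.1),
    "speaker_b": speaker_code([220, 215, 210, 205, 200],
                              [0.5, 0.6, 0.55, 0.6, 0.5], 0.01, 1.0, 0.01),
}
query = speaker_code([121, 123, 124, 127, 129],
                     [0.22, 0.28, 0.25, 0.31, 0.2], 0.01, 1.0, 0.1)
print(best_match(query, database))
```

Coarse bins make the code tolerant of small measurement differences between recordings of the same speaker, at the cost of more collisions between different speakers; a real system would need bin ranges tuned on data.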

Original language: English
Title of host publication: Audio Engineering Society - 123rd Audio Engineering Society Convention 2007
Pages: 456-468
Number of pages: 13
Volume: 1
State: Published - 1 Dec 2007
Event: 123rd Audio Engineering Society Convention 2007 - New York, NY, United States
Duration: 5 Oct 2007 to 8 Oct 2007

Fingerprint

Speaker Identification
Energy
Standard deviation
Sentences
Fundamental Frequency
Speech Signal
Fingerprint
Numerics
Slope
Voice
Methodology

Cite this

Popolo, P., Sanders, R. W., & Titze, I. R. (2007). Quantifying the speaking voice: Generating a speaker code as a means of speaker identification using a simple code-matching technique. In Audio Engineering Society - 123rd Audio Engineering Society Convention 2007 (Vol. 1, pp. 456-468)
@inproceedings{87bed64431de4077b0a0b8b5e546fc02,
title = "Quantifying the speaking voice: Generating a speaker code as a means of speaker identification using a simple code-matching technique",
abstract = "This paper looks at a methodology of quantifying the speaking voice, by which temporal and spectral features of the voice are extracted and processed to create a numeric code that identifies speakers, so those speakers can be searched in a database much like fingerprints. The parameters studied include: (1) average fundamental frequency (F0) of the speech signal over time, (2) standard deviation of the F0, (3) the slope and (4) sign of the F0 contour, (5) the average energy, (6) the standard deviation of the energy, (7) the spectral energy contained from 50 Hz to 1,000 Hz, (8) the spectral energy from 1,000 Hz to 5,000 Hz, (9) the Alpha Ratio, (10) the average speaking rate, and (11) the total duration of the spoken sentence.",
author = "Peter Popolo and Sanders, {Richard W.} and Titze, {Ingo R.}",
year = "2007",
month = "12",
day = "1",
language = "English",
isbn = "9781604239027",
volume = "1",
pages = "456--468",
booktitle = "Audio Engineering Society - 123rd Audio Engineering Society Convention 2007",

}

TY - GEN
T1 - Quantifying the speaking voice
T2 - Generating a speaker code as a means of speaker identification using a simple code-matching technique
AU - Popolo, Peter
AU - Sanders, Richard W.
AU - Titze, Ingo R.
PY - 2007/12/1
Y1 - 2007/12/1
AB - This paper looks at a methodology of quantifying the speaking voice, by which temporal and spectral features of the voice are extracted and processed to create a numeric code that identifies speakers, so those speakers can be searched in a database much like fingerprints. The parameters studied include: (1) average fundamental frequency (F0) of the speech signal over time, (2) standard deviation of the F0, (3) the slope and (4) sign of the F0 contour, (5) the average energy, (6) the standard deviation of the energy, (7) the spectral energy contained from 50 Hz to 1,000 Hz, (8) the spectral energy from 1,000 Hz to 5,000 Hz, (9) the Alpha Ratio, (10) the average speaking rate, and (11) the total duration of the spoken sentence.
UR - http://www.scopus.com/inward/record.url?scp=84866495115&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84866495115
SN - 9781604239027
VL - 1
SP - 456
EP - 468
BT - Audio Engineering Society - 123rd Audio Engineering Society Convention 2007
ER -
