FluencyBank English IISRP Corpus
Nicoline Ambrose
Hearing and Speech Sciences
University of Illinois
nambrose@illinois.edu

Ehud Yairi
Hearing and Speech Sciences
University of Illinois
Participants: 88 children who stutter (CWS), 40 controls
Type of Study: clinical
Location: USA
Media type: audio
DOI: doi:10.21415/T5KX2D
This corpus exists in three forms:
- The IISRP-new corpus provides
fully fluency-coded, checked, and linked versions of the recordings from the first wave of sampling at Time 1.
- The version of the corpus available from the above links on the current page was created by ASR on the basis of
the audio extracted from the videos provided by Yairi and Ambrose. Improvement of this
version to include fluency coding and correction of ASR errors is currently underway.
- We also provide a link to the original transcriptions in the IISRP-orig.zip file that is
inside the downloadable version of the ASR corpus. Those transcriptions were produced by conversion from SALT.
They did not include utterances produced by the examiners, and they were not time linked.
Citation information
Users of this corpus should cite the following publication:
Yairi, E. & Ambrose, N. (2005). Early childhood stuttering. Austin: Pro Ed.
For a full bibliography, please click here.
Project Description
This corpus includes:
- 440 samples from 88 CWS, 5 samples each, 6 months apart
- 67 additional samples from the same children, an additional year later
- samples from 37 of the same children, those seen close to onset, taken 3 months
after the initial visit (between visits 1 and 2)
- approximately 160 samples from children who do not stutter (CWNS), 4 samples each, one year apart
The general objective of the Illinois International Stuttering Research Project is to study
the onset and subsequent developmental course of stuttering in children under age six.
The specific objectives are to:
- determine variations in onset and their possible effect on the future course of the disorder,
- describe and quantify changes that occur in stuttering over time,
- obtain data on the timing and magnitude of natural recovery,
- identify factors that can influence the recovery and those that lead to chronic stuttering, and
- identify means for making early prediction of the likelihood of chronic vs. transient stuttering.
The general investigation involves a wide range of testing of young
children as close as possible to the onset of stuttering. This includes
tests of speech, language, hearing, motor skills, intellectual
functioning, and emotional reactions, as well as audio and video
recordings, thorough case histories, and familial pedigrees. After
initial testing we proceed with follow-ups every six months for a period
of several years. Through this close monitoring, we document what
happens to children who begin stuttering and generate criteria for
risks. The National Institutes of Health, National Institute on Deafness
and Other Communication Disorders funded this project for more than 12
years.
For a full list of researchers involved in this project, click here.
Codes
Stuttering-like disfluencies are coded as:
- PW part-word repetition of a sound or syllable; the number refers to how many extra times the unit was produced
- WW single syllable whole word repetition
- DP disrhythmic phonation, blocks or prolongations (like all disfluencies, these were reliably
identified by two experienced listeners, although they are the hardest to code reliably)
"Normal"-like disfluencies are coded as:
- M multiple syllable word repetition
- P phrase repetition
- I interjection, um, er, uh etc., NOT "real" words such as "well" or "like"
- R revision
Here are two examples:
(block)I wa-wa-waaaaant w-want some um um cookies cookies
[DP]I [DP][PW3][WW1]want some [I2] [M1]cookies.
Ca-ca-can I play play play so-so-some some of those toys?
[PW2]CAN I [WW2]PLAY [PW2][WW1]SOME OF THOSE TOY/S?
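The bracketed codes in the coded lines above can be tallied mechanically. The Python sketch below is illustrative only and is not part of the corpus tooling: the names `parse_codes` and `summarize` are hypothetical, and it assumes that every code is written as uppercase letters in square brackets with an optional trailing number, and that a code without a number (such as [DP]) carries the number 1.

```python
import re
from collections import Counter

# Code families as listed in the Codes section above.
SLD_CODES = {"PW", "WW", "DP"}        # stuttering-like disfluencies
OTHER_CODES = {"M", "P", "I", "R"}    # "normal"-like disfluencies

# Assumed shape of a code: [LETTERS] or [LETTERSdigits], e.g. [DP] or [PW3].
CODE_PATTERN = re.compile(r"\[([A-Z]+)(\d*)\]")


def parse_codes(coded_line):
    """Return (label, number) pairs for each recognized code in a coded line.

    A code written without a number is assumed to carry the number 1.
    """
    pairs = []
    for label, digits in CODE_PATTERN.findall(coded_line):
        if label in SLD_CODES or label in OTHER_CODES:
            pairs.append((label, int(digits) if digits else 1))
    return pairs


def summarize(coded_line):
    """Count coded instances per family, ignoring the repetition numbers."""
    labels = Counter(label for label, _ in parse_codes(coded_line))
    return {
        "SLD": sum(n for label, n in labels.items() if label in SLD_CODES),
        "other": sum(n for label, n in labels.items() if label in OTHER_CODES),
    }


example = "[DP]I [DP][PW3][WW1]want some [I2] [M1]cookies."
print(parse_codes(example))
print(summarize(example))
```

For the first example line, this counts four stuttering-like instances (two DP, one PW, one WW) and two other disfluencies (one I, one M). Counting instances rather than summing the numbers is a design choice of this sketch; the numbers returned by `parse_codes` can be summed instead if repetition units are of interest.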
Acknowledgements
Andrew Yankes reformatted this corpus to accord with current versions of CHAT.