On this page you will find the experimental details for the two word naming corpora that we have collected. The word naming data are freely available for anyone to download. Our goal is to make a large dataset available for addressing a variety of questions about word naming performance. In making the data available, we have only two requests:
1. Any use of these data should be acknowledged via the appropriate citations (of course). For the naming data from young adults, the appropriate citation is:
Spieler, D. H., & Balota, D. A. (1997). Bringing computational models of word naming down to the item level. Psychological Science, 8, 411-416.
For the older adult data, the appropriate citation is:
Balota, D. A., & Spieler, D. H. (1998). The utility of item level analyses in model evaluation: A reply to Seidenberg & Plaut (1998). Psychological Science.
2. Distribution of the naming data should be exclusively from this site. In other words, do not distribute the data second hand; rather, direct requests for these data to this site. This is primarily to ensure accurate citations and to ensure that the corpora are distributed in their entirety rather than in parts.
Please direct any questions or comments to either of us:
David Balota
Department of Psychology
Washington University
St. Louis, MO 63130
dbalota@artsci.wustl.edu
Daniel Spieler
Department of Psychology
Jordan Hall, Bldg. 420
Stanford University
Stanford, CA 94305-2130
spieler@psych.stanford.edu
Below we report the Methods for the data collection. Similar Methods, for the young subjects only, are reported in the Spieler and Balota (1997) paper.
Methods
Thirty-one younger adults were recruited from the undergraduate student population at Washington University. Twenty-nine older adults were recruited from the Aging and Development Subject Pool in the Department of Psychology at Washington University. All individuals were paid $20 for their participation. The young participants had a mean age of 22.6 years (SD = 5.0) and a mean of 14.8 years of education (SD = 2.0), and scored a mean of 35.1 (SD = 2.7) on the Shipley vocabulary subtest (Western Psychological Services, 1967).
Apparatus
An IBM-compatible Compudyne 486 computer was used to control the display of stimuli and to collect response latencies to the nearest ms. The stimuli were displayed in white on a blue background on a NEC4G 14-inch color VGA monitor in 40-column mode. The naming latency for each word was measured using a Gerbrands Model G1341T voice-operated relay interfaced with the computer.
Materials
The stimuli consisted of 2870 single-syllable words appearing in the training corpora of the PMSP and SM89 models. These words ranged in frequency from 68246 to 0 counts per million according to Francis & Kucera (1982), and from two to seven letters in length.
Notice: The word-frequency norms that are listed are based on a restricted data base version of the Francis & Kucera (1982) norms. Although the listed estimates are highly correlated with the estimates from the full data base (.97 based on log frequency values), there are some deviations. The full Kucera & Francis (1967) listing should be available in the data base by 9/15/98. Please contact spieler@psych.stanford.edu if you have any questions.
Procedure
Each individual participated in two separate experimental sessions. In each session, participants named 1435 words. Words were presented in a different random order for each participant. At the beginning of each of the two experimental sessions, individuals were seated in front of the computer and given the instructions for the experiment. Participants were told that they would be shown single words at the center of the computer screen and that their task was to name the words aloud as quickly and as accurately as possible. They were told to avoid making any extraneous noises which might trigger the voicekey and they were also told not to precede any of their responses with vocalized pauses such as “um” or “err”. Participants were told that some of the words were very common while others were quite rare. Each trial consisted of the following sequence of events: a) a fixation consisting of three plus signs (“+ + +”) appeared in the center of the computer screen for 400 ms, b) the screen went blank for 200 ms, c) the word appeared at the position of the fixation and remained on the screen until 200 ms after the initial triggering of the voicekey. After each naming response, participants pressed a button on a mouse to go on to the next word. If there was an error or an extraneous sound triggered the voicekey, participants were told to press the right button on the mouse. If everything appeared to have worked properly on that trial, subjects were told to press the left button on the mouse. Pressing the mouse button initiated a 1200 ms intertrial interval.
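For readers who want to reason about trial timing, the sequence of events above can be written down as a small table. This is only an illustrative sketch: the `TrialEvent` class, the labels, and the `fixed_time_ms` helper are hypothetical names introduced here, not part of the original experiment software. Durations that depend on the participant (naming latency, time to press the mouse button) are left as `None`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TrialEvent:
    label: str
    duration_ms: Optional[int]  # None = depends on the participant


# Event order and fixed durations as described in the Procedure.
TRIAL_SEQUENCE: List[TrialEvent] = [
    TrialEvent("fixation (+ + +)", 400),
    TrialEvent("blank screen", 200),
    TrialEvent("word displayed until voicekey triggers", None),
    TrialEvent("word remains after voicekey trigger", 200),
    TrialEvent("wait for mouse button (left = ok, right = error)", None),
    TrialEvent("intertrial interval", 1200),
]


def fixed_time_ms(events: List[TrialEvent]) -> int:
    """Sum only the durations that are fixed by the procedure."""
    return sum(e.duration_ms for e in events if e.duration_ms is not None)


print(fixed_time_ms(TRIAL_SEQUENCE))  # 2000
```

Under this reading, each trial carries 2000 ms of fixed time, plus the naming latency and the time taken to press the mouse button.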
Participants were given breaks after every 150 trials. Two buffer trials consisting of filler words not appearing in the training corpora were inserted at the beginning of each block of trials. In addition, at the beginning of each session, subjects were given 20 practice trials to familiarize them with the task. Each experimental session lasted for approximately 60 minutes.
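The block structure of a session can be worked out with a little arithmetic. The sketch below makes two assumptions that the description does not settle: that the 150 trials between breaks count experimental words only (buffer trials being extra), and that the final block simply holds the leftover words. The constant and function names are hypothetical.

```python
import math

WORDS_PER_SESSION = 1435   # from the Procedure
WORDS_PER_BLOCK = 150      # assumed to count experimental words only
BUFFERS_PER_BLOCK = 2      # filler words at the start of each block
PRACTICE_TRIALS = 20       # once per session


def block_sizes(n_words=WORDS_PER_SESSION, per_block=WORDS_PER_BLOCK):
    """Number of experimental words in each block of one session."""
    n_blocks = math.ceil(n_words / per_block)
    sizes = [per_block] * (n_blocks - 1)
    sizes.append(n_words - per_block * (n_blocks - 1))  # leftover words
    return sizes


def total_trials():
    """Practice + buffer + experimental trials in one session."""
    sizes = block_sizes()
    return PRACTICE_TRIALS + BUFFERS_PER_BLOCK * len(sizes) + sum(sizes)


print(block_sizes())   # nine blocks of 150 words and a final block of 85
print(total_trials())  # 1475
```

Under these assumptions a session has 10 blocks and 1475 trials in total, of which 1435 contribute naming data.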
Results
Response latencies for trials that participants marked as errors, and response latencies faster than 200 ms or slower than 1500 ms, were excluded from all analyses. In addition, responses that fell more than 2.5 standard deviations beyond each subject’s mean response latency were dropped from these analyses. These criteria eliminated 4.8% of the observations in younger adults. An identical screening method was applied for the older adults, except that the upper limit on naming latency was increased to 2000 ms. These criteria eliminated 4.9% of the naming responses in the older adults. In addition, 10 words were eliminated because their colloquial connotations were such as to make it unlikely that their naming latencies are meaningful. Mean latencies were then computed for each item across subjects, separately for each group.
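For anyone wishing to apply a comparable screening to their own trial-level data, the steps might be sketched as follows. This is not the original analysis code. It assumes that each subject's mean and standard deviation are computed after the error and absolute-cutoff exclusions, and that the sample standard deviation is used; the description above does not specify either point. The function names, the `(word, rt_ms, marked_error)` tuple layout, and the `excluded_words` parameter are hypothetical.

```python
import statistics
from collections import defaultdict


def screen_subject(trials, lower_ms=200, upper_ms=1500, sd_cutoff=2.5):
    """Screen one subject's trials.

    trials: list of (word, rt_ms, marked_error) tuples (assumed layout).
    Returns the surviving (word, rt_ms) pairs.
    """
    # Step 1: drop trials marked as errors and latencies outside the limits.
    kept = [(w, rt) for (w, rt, err) in trials
            if not err and lower_ms <= rt <= upper_ms]
    if len(kept) < 2:
        return kept  # too few trials to estimate a standard deviation
    # Step 2: drop latencies more than sd_cutoff SDs from the subject mean
    # (assumption: mean and SD are taken over the trials kept in step 1).
    rts = [rt for _, rt in kept]
    mean, sd = statistics.mean(rts), statistics.stdev(rts)
    return [(w, rt) for (w, rt) in kept if abs(rt - mean) <= sd_cutoff * sd]


def item_means(subjects, upper_ms=1500, excluded_words=frozenset()):
    """Mean latency per word across subjects within one age group."""
    by_word = defaultdict(list)
    for trials in subjects:
        for word, rt in screen_subject(trials, upper_ms=upper_ms):
            if word not in excluded_words:
                by_word[word].append(rt)
    return {w: statistics.mean(rts) for w, rts in by_word.items()}
```

With this sketch, the younger group would be processed with `upper_ms=1500` and the older group with `upper_ms=2000`, each group yielding its own dictionary of item means.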
Updated: July 22, 1998.