Objectives. Comparing languages
systematically is the most accurate way to firm up our
understanding of language diversity. It tells us about earlier
relationships of peoples. In a century during which many endangered
languages and cultures are likely to disappear forever, that's
To make accurate comparisons on a
broad scale before the data pass out of reach, we must bring
linguistics and information technology together. Then we can go
beyond what individual scholars are now able to accomplish in a
lifetime of using traditional data management practices. Wordcorr
is one such partnership between linguistics and technology.
In addition it increases the
possibilities for collaboration among linguists. The
"advances in language-information technology, such as
documentation and comparison of linguistic diversity" that the
National Science Foundation's Information Technology initiative
calls for have an impact on the way linguists organize and conduct
their research. At the same time Wordcorr broadens
the possibilities for education in linguistic science. It's
a good teaching tool.
Significance. Grimes has been in
touch with enough comparative endeavors to respect the enormous
amount of work done by dedicated practitioners. But he also knows
something about the gaps and guesswork that still need to be filled
in by solid research. He explored quantification of comparative
results in Grimes
and Agard 1959, and was Consulting Editor for the Ethnologue
from 1974 to 2000 (Grimes
1995b), compiler of the Ethnologue Language Family Index
for 1993, 1996, and 2000, one of the Language Identification
Editors for the 1992 and 2002 editions of the Oxford International
Encyclopedia of Linguistics, a contributor to the Comparative
Austronesian Dictionary (1995c), and a member of
the Linguistic Society of America's
Committee on Endangered Languages and their Preservation from 1997
Aware of the snail's pace at which
good comparative research often proceeds, he got the idea of
sorting out differences between
- the analytical judgments that
linguists usually make quite rapidly, and
- the meticulous bookkeeping they
have to do to keep track of the implications of those
judgments.
From that he put together
relational data structures suitable for an information technology
application -- Wordcorr. It has enough capacity that teams of
scholars anywhere in the world can work with it over the Internet
to tackle language families of any size. Computing time appears to
be linear in the number of speech varieties being compared, rather
than running into a combinatorial explosion.
Now that we have such an infrastructure
for research, we can:
- replace conjectures about
language relationships with demonstrations backed by detailed
evidence.
- speed up the rate at which teams of
linguists can document language relationships accurately,
including the relationships of endangered languages with
- test conflicting hypotheses
about how language families may have developed, following out
each hypothesis simultaneously without confusion.
- circulate research results to
scholars and to the public at large as soon as the
investigators reach closure on their analysis.
- stimulate dialogue by allowing
collaborating groups of comparative linguists to share
information and discuss it collegially via the Internet.
- help teachers of linguistics to
teach the principles and practices of comparative linguistics
by giving their students hands-on experience with real data.
- attract smart high school and
college students into linguistics by letting them discover for
themselves how interesting language is.
- lead informed citizens to
discover for themselves the intricacy and beauty of languages
they have been taught to regard as "inferior."
- contribute to the shared data
archives of the worldwide linguistics community, including
archives of endangered and recently extinct languages.
Relation to long term goals.
Grimes's interest in language comparison was launched in 1954. He
published some of the regular correspondences between Huichol and
Cora, neighboring Uto-Aztecan languages in Mexico, from field data.
Through extended personal contact with comparative linguists such
as Morris Swadesh, Robert E. Longacre, Charles F. Hockett,
Frederick B. Agard, and later the Austronesian Circle at the
University of Hawai`i, he saw the field develop through the latter
half of the twentieth century. He also branched out into
investigating inherent intelligibility among speech varieties (1974), even though his
major scholarly interests were focused on discourse and the
He was one of the first linguists
to use computers in connection with field work, beginning in
1960, and he soon became aware of their potential for managing
comparative linguistic data. In the early 1970s he initiated a
project that rounded up 668 word lists that linguists had collected
but never gotten around to processing, having the greatest success
in Africa but getting some from Asia and the Americas. This
eventually became the Cornell-SIL-Hawai`i archive (CSH), which
Maria Faehndrich of the University of Hawai`i has now transformed
into a set of Wordcorr collections.
But collecting data is only one
step towards doing science. Some linguists contributed word lists
to the CSH archive because they realized that they themselves had
little hope of exploiting, in their own lifetime, the data they had
sweated to collect. It took too long to tabulate
everything before they could begin to put together solid
Grimes couldn't help them with
tabulation at that stage either. But later he was able to work out
data structures that could be used to automate the frustrating
parts of the process and allow linguists to focus on the
comparisons, rather than on finding mislaid file slips or
recreating long forgotten hypotheses.
The long term goal is to
demonstrate what the linguistic relationships are within all
the world's language families. Whether all languages can be
integrated into a single family, as some think, or whether the
evidence fuzzes out well this side of total coverage, as others
believe, depends upon a lot of scholars doing a lot of very
detailed work, preferably in a much shorter time than the three
centuries since the Dutch started pointing out regular differences
among Malay varieties in the East Indies. Now that Wordcorr is
available internationally as a vehicle for team-based research, the
unthinkable just might become doable.
Relation to present knowledge.
Comparative and historical linguists look at much more than
comparative phonology -- they also examine evidence for
morphological, syntactic, and semantic change. Nevertheless,
comparative phonology is where most scholars begin, and some spend
most of their time on it.
That is because the techniques of
analysis are most precise there, training begins there, the
results are the easiest to explain to nonspecialists, and arguments
based on detailed handling of masses of phonological data are
easier to assess than some of the arguments from the other
areas. The Wordcorr Project is concerned directly with data and
analysis management for comparative phonology.
In that context, Wordcorr
facilitates the best practices of comparative linguistics:
extensive, detailed tabulation of sets of correspondences among
phonological segments and the relationship of each set to partially
similar sets. Without a fairly sophisticated tool to manage that
kind of complexity, it is easy to lose sight of part of the data.
This project helps practicing linguists to concentrate on the
patterns that help explain how the data got that way.
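
The kind of tabulation described above can be illustrated with a minimal Python sketch. It is not Wordcorr's actual data model or algorithm: the variety names, glosses, forms, and the function names tabulate and partially_similar are all invented for illustration, and the forms are assumed to be already aligned segment by segment by the linguist.

```python
from collections import defaultdict

# Hypothetical, hand-aligned data: one entry per gloss, and each variety's
# form is already split into segments lined up column by column.
# Variety names and forms are invented for illustration only.
ALIGNED = {
    "stone": {"A": ["t", "a", "k"], "B": ["t", "a", "h"], "C": ["d", "a", "k"]},
    "water": {"A": ["k", "a", "t"], "B": ["h", "a", "t"], "C": ["k", "a", "d"]},
    "fire":  {"A": ["p", "i", "t"], "B": ["p", "i", "t"], "C": ["p", "i", "t"]},
}
VARIETIES = ["A", "B", "C"]


def tabulate(aligned, varieties):
    """Map each correspondence set (one segment per variety) to its witnesses."""
    table = defaultdict(list)
    for gloss, forms in aligned.items():
        length = len(forms[varieties[0]])
        for pos in range(length):
            # One lookup per variety per aligned column.
            corr_set = tuple(forms[v][pos] for v in varieties)
            table[corr_set].append((gloss, pos))
    return dict(table)


def partially_similar(table):
    """Pair up distinct sets that share a segment in at least one variety."""
    sets = sorted(table)
    pairs = []
    for i, a in enumerate(sets):
        for b in sets[i + 1:]:
            if any(x == y for x, y in zip(a, b)):
                pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    table = tabulate(ALIGNED, VARIETIES)
    for corr_set, witnesses in sorted(table.items()):
        print(" : ".join(corr_set), witnesses)
    print(partially_similar(table))
```

In this toy data the set t : t : d is attested in "stone" and "water", while t : t : t appears only in "fire". The second function flags the two as partially similar, which is the point at which a linguist would ask whether a conditioning environment separates them. The bookkeeping is mechanical; the judgment stays with the analyst.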