Languages of the World

Introduction

Languages may be classified either genetically or typologically. A genetic classification assumes that certain languages are related in that they have evolved from a common ancestral language. This form of classification employs ancient records (such as those for Latin) as well as hypothetical reconstructions of the earlier forms of languages, called protolanguages. Because information on the genetic affiliations of languages is sufficiently extensive, world surveys of languages are necessarily oriented toward genetic classification--sometimes exclusively so and sometimes in conjunction with typological classifications. Typological classification is based on similarities in language structure. Individual frames of reference in language typology are not known well enough to permit a worldwide typological classification.

Before the conclusive demonstration that unwritten languages could be classified genetically, they were often relegated to a typological classification, which at one time was denigrated by scholars. Since 1917, however, the prestige of some kinds of typology has risen--in particular, that of grammatical typology. The best-known typological frame of reference represents the grammar of a language, either as a whole or as a subsystem. Once a genetic classification has been established, typological classification may be superimposed on it in order to show change of language type--as from a predominantly inflectional language (such as Proto-Germanic) to a predominantly isolating one (such as modern English)--or to show features that are shared by languages in neighbouring branches in the same family (e.g., Celtic and Germanic in Indo-European). The ultimate grammatical typology is that which treats subsystems that are, in some sense, universal to all human languages.

Lexical typologies, based on similarities in vocabulary structure, have been used in cognitive anthropology and psycholinguistics (e.g., perception of colours and use of colour terms). The sociolinguistic frame of reference in typology provides classifications for varieties of language in terms of their functions and their ways of identifying social groups and cultural spaces; in addition, it brings order and integration to problems concerning national standards that are faced by new nations that have many nonstandard and unwritten languages as well as languages that make use of writing.

A few points of terminology should be explained before the world's languages are discussed further. Language family is the label often used for a conservative genetic classification, one that can be attested only when an abundance of cognates (related words) is available. Phylum is the label for a liberal genetic classification that is attested with fewer cognates; it encompasses language families. Although a given phylum will have greater extension than any of the families included in it, only fragments of phonology will be reconstructible in the protolanguage. In actual linguistic usage, however, the term family is often employed to refer to a group that is technically a phylum--e.g., the Afro-Asiatic (Hamito-Semitic) family, the Sino-Tibetan family.

The label language isolate is used for a language that is the only representative of a language family, such as Basque or the extinct Sumerian language; the presumptive but unknown sister languages of isolates are dead and unrecorded. A language isolate may be classified, along with ordinary language families, under the rubric of an extensive phylum (e.g., Korean is sometimes classified as a member of a hypothetical Ural-Altaic phylum) or left wholly unclassified (e.g., the Ainu language of Japan). The label pidgin-creole is used for a language that has had so much vocabulary change that cognates for reconstructing the protolanguage from which it descended cannot be found. A pidgin is a contact language used for communication between groups having different native languages. When a pidgin becomes the native language of a community, it is customarily called a creole.

This article begins with a survey of world languages based on geographic regions of unequal size: huge and sprawling areas for the peripheral regions of Africa, Oceania, and the Americas but relatively compact areas for the focal regions within Eurasia. Nine regions--six in Eurasia, in all of which writing and standard languages are widespread--constitute a convenient basis for comparison and contrast. The larger part of the article consists of more detailed examinations of the languages of the world arranged by genetic affinities.

Facets of the subject of language and human communication are treated in a variety of articles. For a full account of the theory and methods of linguistic science, see the article LINGUISTICS. For information on such subjects as the characteristics of language, language variants (slang, jargon), speech production, and the acquisition of language, see the article LANGUAGE. For a full account of phonetics and the pathology of speech, see the article SPEECH. For information on written languages and writing systems, see the article WRITING.

For coverage of related topics, see SPECTRUM, section 514, and the Index.

