The advent of generative grammar with the publication of Chomsky’s monograph Syntactic Structures in 1957 was a scientific revolution. From the beginning, this revolution was directed almost exclusively against the taxonomic conception of syntax. Since the generative revolution was essentially syntactic, our starting point must be syntax. We therefore begin with a brief examination of taxonomic syntax, indicating also some of the problems that it is inadequate to solve.
The Constituent Structure of Sentences
We will take the word as a basic unit of syntactic structure. By doing this we are assuming that there is a set of procedures which, when applied to sequences of morphemes, will define a class of constituents consisting of one or more morphemes which may suitably be named ‘words’. In other words, the word is not an intuitively given unit, but a procedurally defined classificational construct.
We will make the further assumption that there is a set of procedures through which words can be classified into such well-known classes as ‘nouns’, ‘verbs’, ‘adjectives’, ‘adverbs’, ‘pronouns’, ‘conjunctions’, etc.
At this point, let us introduce some technical terms from the generative theory of syntax. The terms are not used in taxonomic grammars, but the notions underlying them are implicit in the analysis of the constituent structure. Consider the following abstract tree-diagram.
The dots in the diagram are called nodes. The capital letters are node labels. The lower-case letters represent words. The relationship between the higher and lower nodes in the tree is one of domination. One node immediately dominates another if there is no intervening node between them. In the tree, A immediately dominates B and C; it dominates all other nodes. C immediately dominates F and G; it dominates z and q. There is no relation of domination between C and B, between K and G, etc. Conversely, if we move up the tree, the relationship is one of “is a”. Thus, m “is a” K, K + L “is a” B, etc. Besides these relations, the tree specifies the relation of precedence. For example, B precedes C, F precedes G, etc.
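The relations just described can be made concrete with a small Python sketch. It is an illustration only, not part of any taxonomic or generative formalism. The tree is a partial reconstruction from the relations stated above: it assumes that F dominates z and that G dominates q, and it leaves L without a child because the word under L is not named in the text. The names CHILDREN, immediately_dominates, dominates, preorder and precedes are hypothetical.

```python
# Partial reconstruction of the abstract tree, based only on the relations
# stated in the text. Assumptions: F dominates z and G dominates q; the word
# under L is not named, so L is left without a child here.
CHILDREN = {
    "A": ["B", "C"],
    "B": ["K", "L"],
    "C": ["F", "G"],
    "K": ["m"],
    "F": ["z"],
    "G": ["q"],
}


def immediately_dominates(upper, lower):
    """True if no node intervenes between upper and lower."""
    return lower in CHILDREN.get(upper, [])


def dominates(upper, lower):
    """True if lower can be reached by moving down the tree from upper."""
    return any(
        child == lower or dominates(child, lower)
        for child in CHILDREN.get(upper, [])
    )


def preorder(node="A"):
    """List the nodes top-down and left-to-right."""
    nodes = [node]
    for child in CHILDREN.get(node, []):
        nodes.extend(preorder(child))
    return nodes


def precedes(first, second):
    """True if first is left of second and neither dominates the other."""
    order = preorder()
    return (
        order.index(first) < order.index(second)
        and not dominates(first, second)
        and not dominates(second, first)
    )


print(immediately_dominates("A", "B"))  # A immediately dominates B
print(dominates("A", "q"))              # A dominates every other node
print(dominates("C", "B"))              # no domination between C and B
print(precedes("B", "C"))               # B precedes C
```

In this sketch precedence is defined only for pairs of nodes that do not stand in a domination relation, which matches the examples given above (B and C, F and G).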
Immediate Constituent Analysis
To study the structure of a sentence, the structural linguists divided it into its immediate constituents (or ICs). The principle involved was that of cutting a sentence into two, cutting each of these two parts into two again, and continuing the segmentation until the smallest unit, the morpheme, was reached. The concept of constituent structure is based on the observation that units which occur next to each other tend to belong together, in the sense that they are closely related structurally.
Consider the following sentence:
(1) The nice scouts who were camping in the wood have gone home
This sentence consists of 12 words. These form the ultimate constituents of the sentence. As the first step in our analysis of the constituent structure of (1), we attempt to group the words in pairs. Likely candidates are ‘nice’ and ‘scouts’, ‘were’ and ‘camping’, ‘the’ and ‘wood’, and ‘have’ and ‘gone’. Once we have grouped these words, they are to be considered as functional units or constituents. An operational test for the correctness of the analysis is substitution: if the groups are indeed constituents, it should be possible to substitute single words for them without affecting the basic syntactic pattern of the sentence.
(2) The women (nice scouts) who worked (were camping) in there (the wood) went (have gone) home.
We proceed like this through the sentence until all words have been paired off with a constituent. Let us assume that the final result of the analysis can be represented in terms of the following bracketed string:
(3) (((The (nice scouts)) (who ((were camping) (in (the wood))))) ((have gone) home))
Observe that we might proceed the other way. The first step, then, would be to segment or cut the sentence into two parts, the next to segment each of these into two parts, and so on, until the rank of the word is reached. Again, the analysis would have to be controlled by environmentally determined substitution tests, studies of the distributional range of the units established by segmentation, etc.
We can now illustrate the procedure in the following way, with 0 indicating where the first cut was made, 1 where the second cuts were made, etc.
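The numbering of cuts can also be derived mechanically. The following Python sketch is an illustration under stated assumptions: TREE is a strictly binary analysis of sentence (1) written as nested pairs, and the function name cuts is hypothetical. Each boundary between two ICs receives the depth at which the cut is made, so the first cut is 0, the second cuts are 1, and so on.

```python
# Assumed binary analysis of sentence (1), written as nested pairs.
TREE = (
    (("The", ("nice", "scouts")),
     ("who", (("were", "camping"), ("in", ("the", "wood"))))),
    (("have", "gone"), "home"),
)


def cuts(tree, depth=0):
    """Return the words of the tree interleaved with the cut numbers."""
    if isinstance(tree, str):       # a single word: nothing left to cut
        return [tree]
    left, right = tree              # every construction has exactly two ICs
    return cuts(left, depth + 1) + [depth] + cuts(right, depth + 1)


print(" ".join(str(item) for item in cuts(TREE)))
```

The single 0 falls between ‘wood’ and ‘have’, that is, between the subject and the rest of the sentence.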
Let us now proceed to convert (3) and (4) into a tree-diagram.
There are 11 nodes in the tree. Each of them immediately dominates two constituents, and these two constituents are the immediate constituents or ICs of the construction represented by the immediately dominating node. The tree shows a hierarchical layering of structures: the principle of togetherness-by-ranks. In other words, the syntactic structure is not solely a matter of linearity, but also a matter of depth.
We said that substitution was one of the checks on the analysis. The two encircled nodes represent the last possibilities of substitution. At those nodes we can substitute, say:
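The node count can be checked with a short Python sketch. It is offered as an illustration only: BRACKETING is assumed to be a strictly binary bracketing of sentence (1), and the helper names parse, constructions and words are hypothetical.

```python
def parse(bracketed):
    """Turn a bracketed string into nested tuples of words."""
    tokens = bracketed.replace("(", " ( ").replace(")", " ) ").split()
    stack = [[]]
    for token in tokens:
        if token == "(":
            stack.append([])                 # open a new construction
        elif token == ")":
            finished = tuple(stack.pop())    # close the current construction
            stack[-1].append(finished)
        else:
            stack[-1].append(token)          # a word
    (tree,) = stack[0]                       # exactly one top-level bracket
    return tree


def constructions(tree):
    """List every construction (non-terminal node) in the tree."""
    if isinstance(tree, str):
        return []
    found = [tree]
    for part in tree:
        found.extend(constructions(part))
    return found


def words(tree):
    """List the ultimate constituents (words) from left to right."""
    if isinstance(tree, str):
        return [tree]
    return [word for part in tree for word in words(part)]


# Assumed binary bracketing of sentence (1).
BRACKETING = ("(((The (nice scouts)) (who ((were camping) (in (the wood)))))"
              " ((have gone) home))")
tree = parse(BRACKETING)

print(len(words(tree)))                              # number of words
print(len(constructions(tree)))                      # number of nodes
print(all(len(c) == 2 for c in constructions(tree))) # two ICs per node?
```

A binary analysis of a sentence of n words always contains n - 1 constructions, which is why 12 words yield 11 nodes.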
(6) Jack left
The sentence, then, can be viewed as successive expansions of this basic structure.
We may now try to omit some of the constituents to see what happens in terms of grammaticality (ungrammatical sentences are marked by *):
(7) The scouts who were camping in the wood have gone home
(8) * The nice who were camping in the wood have gone home
(9) * The nice scouts were camping in the wood have gone home
(10) * The nice scouts who have gone home
(11) The nice scouts have gone home
(12) * The nice scouts who were camping the wood have gone home
(13) * The nice scouts who were camping in have gone home
(14) * The nice scouts who were camping in the wood home
(15) The nice scouts who were camping in the wood have gone
The omission of ‘nice’ in (7) does not make the sentence ungrammatical. In (8) ‘scouts’ has been omitted and this results in ungrammaticality. Consequently, ‘scouts’ must be syntactically more important in the sentence than ‘nice’. (9), (10) and (11) show that ‘who’ and ‘were camping in the wood’ must either both be present in the sentence or both be omitted. (12) and (13) show that ‘in’ and ‘the wood’ are interdependent in the same way. Finally, (14) and (15) reveal that ‘have gone’ is more important in the structure of the sentence than ‘home’.
The pattern of (un)grammaticality revealed by (7) through (15) suggests that we are dealing with two fundamentally different types of construction. In one type, one of the ICs is the head or centre of the construction. In the other type, neither of the ICs constitutes the head: both are equally important.
Constructions which have a head are referred to as endocentric, whereas those which have no such head are termed exocentric. It will be apparent that the defining criterion of endocentricity and exocentricity is distribution: a construction is endocentric only if one of the ICs has the same (or roughly the same) distribution as the whole construction, whereas, in an exocentric construction, neither of the ICs has the same distribution as the whole construction.
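The distinction can be sketched in Python using the omission judgements discussed above as a rough stand-in for the distributional criterion. This is a simplification: keeping one IC and omitting its partner only approximates the question of whether that IC has the same distribution as the whole construction. The function name classify and the table JUDGEMENTS are hypothetical.

```python
def classify(ics, survives_alone):
    """Classify a two-part construction from omission judgements.

    ics            -- the two immediate constituents, left then right
    survives_alone -- for each IC, True if the sentence stays grammatical
                      when that IC is kept and its partner is omitted
    """
    heads = [ic for ic, ok in zip(ics, survives_alone) if ok]
    kind = "endocentric" if heads else "exocentric"
    return kind, heads


# Judgements read off the omission tests discussed in the text.
JUDGEMENTS = [
    (("nice", "scouts"), (False, True)),
    (("who", "were camping in the wood"), (False, False)),
    (("in", "the wood"), (False, False)),
    (("have gone", "home"), (True, False)),
]

for ics, survives_alone in JUDGEMENTS:
    kind, heads = classify(ics, survives_alone)
    print(ics, kind, heads)
```

On these judgements ‘nice scouts’ and ‘have gone home’ come out as endocentric, with ‘scouts’ and ‘have gone’ as heads, while ‘who were camping in the wood’ and ‘in the wood’ come out as exocentric.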
Each node is related to the two symbols on the branches immediately below. If one arrow points towards the node, the other away from the node, the structure is endocentric. If both arrows point away from the node, the structure is exocentric.
Each integer from 13 to 23 specifies a construction. The two lines branching from each node represent the function of the ICs of the construction. Each of the constructions of which the sentence is made up, and ultimately each word, is a member of a form class. If we adopt the following notational convention: F1 = function of left-branching constituent from a given node, F2 = function of right-branching constituent, and C = class/construction, we can assign the following structural description to (16):
A tree-diagram with labelled nodes is called a phrase-marker, abbreviated P-marker.