(Modal) (Have) (Be)
which means that each part is optional, but modals always precede Have and Be, and Have always precedes Be. Here are some examples, using see as the main verb:
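The ordering constraint alone can be sketched as a small Prolog DCG. This is only an illustration: the predicate names (aux_seq, modal_opt, have_opt, be_opt, modal, have_form, be_form) and the word lists are assumptions made for this sketch, and it checks order only, not agreement in form between neighbouring auxiliaries.

```prolog
% Sketch (hypothetical names): each slot is optional, and the slots
% can only occur in the order Modal, Have, Be.
aux_seq --> modal_opt, have_opt, be_opt.

modal_opt --> [].
modal_opt --> [M], { modal(M) }.

have_opt --> [].
have_opt --> [H], { have_form(H) }.

be_opt --> [].
be_opt --> [B], { be_form(B) }.

% Assumed word lists, for illustration only.
modal(can).
modal(could).
have_form(have).
have_form(has).
have_form(had).
be_form(be).
be_form(is).
be_form(was).
be_form(been).
```

Under these assumptions, phrase(aux_seq, [could, have, been]) and phrase(aux_seq, [have, been]) succeed, while phrase(aux_seq, [have, could]) and phrase(aux_seq, [been, have]) fail.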
To enforce agreement in form among the auxiliaries and their main verb, the COMPFORM attribute is added to their entries in the lexicon, and then enforced in their entries in the grammar. Here is an example lexicon entry for the auxiliary verb could:
(could (aux (modal +) (root COULD1) (vform (? v pres past)) (agr ?a) (COMPFORM bare)))
This says that could is a modal, can be used with a verb in the present or past tense (e.g., could see or could have seen), requires person and number agreement, and has as complement the base (bare) form of the main verb when used alone with it. A grammar rule that incorporates auxiliaries is illustrated below:
VP --> (AUX COMPFORM ?v) (VP VFORM ?v)
which is encoded as follows for the Allen interpreter:
((vp) -2> (head (aux (compform ?v))) (vp (vform ?v)))
This allows a verb phrase to be formed out of an auxiliary verb (like could) and a main verb, provided that the COMPFORM attribute of the auxiliary is the same as the VFORM of the main verb (like bare, in the verb phrase could be).
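The COMPFORM/VFORM check can also be sketched in Prolog. This is a simplified rendering, not the Allen interpreter's behaviour: the predicates aux/3 and verb/2 and the entries for the main verb see are assumptions introduced here, and agreement (AGR) is ignored.

```prolog
% aux(Word, VForm, CompForm): a few entries mirroring the Lisp lexicon.
aux(could, pres, bare).
aux(could, past, bare).
aux(have, bare, pastprt).
aux(has, pres, pastprt).
aux(had, past, pastprt).

% verb(Word, VForm): assumed main-verb entries, for illustration only.
verb(see, bare).
verb(sees, pres).
verb(seen, pastprt).

% A verb phrase is a main verb, or an auxiliary followed by a verb
% phrase whose VFORM equals the auxiliary's COMPFORM (rule -2>).
% The VFORM of the whole phrase comes from its head, the auxiliary.
vp(VForm) --> [W], { verb(W, VForm) }.
vp(VForm) --> [A], { aux(A, VForm, CompForm) }, vp(CompForm).
```

With these entries, phrase(vp(_), [could, see]) and phrase(vp(_), [could, have, seen]) succeed, while phrase(vp(_), [could, seen]) and phrase(vp(_), [has, see]) fail because the complement form does not match.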
More examples of auxiliary verb coding and use in the lexicon and grammar are given in Allen, pp. 123-127. Here is a listing of the lexicon for auxiliary verbs, given as the file chapt5 in the jallen/Parser1.1 directory.
(setq *lexicon5-2*
  '((can (aux (modal +) (root CAN1) (vform pres) (agr ?a) (COMPFORM bare)) can1)
    (could (aux (modal +) (root COULD1) (vform (? v pres past)) (agr ?a) (COMPFORM bare)))
    (do (aux (modal +) (root DO1) (vform pres) (agr (? a 1s 2s 1p 2p 3p)) (COMPFORM bare)))
    (does (aux (modal +) (root DO1) (vform pres) (agr 3s) (COMPFORM bare)))
    (did (aux (modal +) (root DO1) (vform past) (agr ?a) (COMPFORM bare)))
    (have (aux (vform bare) (root HAVE-AUX) (COMPFORM pastprt)))
    (have (aux (vform pres) (root HAVE-AUX) (agr (? a 1s 2s 1p 2p 3p)) (COMPFORM pastprt)))
    (has (aux (vform pres) (root HAVE-AUX) (agr 3s) (COMPFORM pastprt)))
    (had (aux (vform past) (root HAVE-AUX) (agr ?a) (COMPFORM pastprt)))
    (having (aux (vform ing) (root HAVE-AUX) (COMPFORM pastprt)))
    (be (aux (root BE-AUX) (VFORM bare) (COMPFORM -)))
    (is (aux (root BE-AUX) (VFORM pres) (COMPFORM -) (AGR 3s)))
    (am (aux (root BE-AUX) (VFORM pres) (COMPFORM -) (AGR 1s)))
    (are (aux (root BE-AUX) (VFORM pres) (COMPFORM -) (AGR (? a 2s 1p 2p 3p))))
    (was (aux (root BE-AUX) (VFORM past) (AGR (? a 1s 3s)) (COMPFORM -)))
    (were (aux (root BE-AUX) (VFORM past) (AGR (? a 2s 1p 2p 3p)) (COMPFORM -)))
    (been (aux (root BE-AUX) (VFORM pastprt) (COMPFORM -)))
    (being (aux (root BE-AUX) (VFORM ing) (COMPFORM -)))))
In this discussion, a finite verb is one that forms a complete verb phrase, like "halts," "halted," "writes a program," "is halting," or "has been halting." A nonfinite verb is a verb in its base form, such as "halt." We also distinguish infinitival forms, like "to halt," present participles, like "halting," and past participles, like "halted." The lexicon for verbs can then be encoded in the following way:
iv(Form) --> [IV], {iv(IV, Form)}.
iv(halts, finite).
iv(halt, nonfinite).
iv(halting, present_participle).
iv(halted, past_participle).
aux(Form) --> [Aux], {aux(Aux, Form)}.
aux(could, finite/nonfinite).
aux(have, nonfinite/past_participle).
aux(has, finite/past_participle).
aux(been, past_participle/present_participle).
aux(be, nonfinite/present_participle).
Now verb phrases can be formed with or without auxiliaries, using the following rules:
vp(Form) --> iv(Form).
vp(Form) --> tv(Form), np.
vp(Form) --> aux(Form/Require), vp(Require).
The above rules cover various kinds of verb phrases. For instance, the auxiliary verb been is a past participle and combines with a present participle, and have is nonfinite and takes a past participle when used as an auxiliary.
Thus, "have been halting" is a verb phrase with the auxiliary Form = nonfinite (have) and Require = past_participle (been halting). Taking the next step, the verb phrase "been halting" matches the same grammar rule, with the auxiliary Form = past_participle (been) and Require = present_participle (halting). Finally, the verb phrase "halting" is an intransitive verb (using the first rule) with Form = present_participle.
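To try this derivation, the lexicon and rules above can be collected into one file. The text does not define tv or np, so the entries for them below (the verb writes and the noun phrase "a program") are assumptions added only to make the sketch loadable.

```prolog
% Lexicon and rules as given in the text.
iv(Form) --> [IV], { iv(IV, Form) }.
iv(halts, finite).
iv(halt, nonfinite).
iv(halting, present_participle).
iv(halted, past_participle).

aux(Form) --> [Aux], { aux(Aux, Form) }.
aux(could, finite/nonfinite).
aux(have, nonfinite/past_participle).
aux(has, finite/past_participle).
aux(been, past_participle/present_participle).
aux(be, nonfinite/present_participle).

% Assumed for illustration: one transitive verb and one noun phrase.
tv(Form) --> [TV], { tv(TV, Form) }.
tv(writes, finite).
np --> [a, program].

vp(Form) --> iv(Form).
vp(Form) --> tv(Form), np.
vp(Form) --> aux(Form/Require), vp(Require).
```

The query phrase(vp(F), [have, been, halting]) binds F = nonfinite, following the three steps described above. Likewise phrase(vp(finite), [could, have, been, halting]) succeeds, while phrase(vp(_), [has, halting]) fails, because has requires a past participle.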
A sentence like
Jack can see the dog.
has a passive form in sentences like the following:
The dog was seen.
To account for passive forms, grammars use the notion of a "passive gap", which is just a placeholder for the object that would normally complement a transitive verb. To explain passive verb phrases like the one above, the grammar is augmented with rules like the following:
VP[+pass] --> AUX[be] VP[pastprt, main, +passgap]
VP[+passgap, +main] --> V[_np]
That is, a passive verb phrase can be formed from the auxiliary be followed by a verb phrase whose main verb is a past participle and whose object is a passive gap. A main-verb phrase with a passive gap can be any transitive verb (i.e., any verb that has the feature _np) appearing without its object. These two rules are encoded as follows:
((vp (PASS +)) -5>
 (head (aux (root BE-AUX))) (vp (vform pastprt) (MAIN +) (PASSGAP +)))
((vp (PASSGAP +) (MAIN +)) -8>
 (head (v (subcat _np))))
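The same idea can be sketched as a Prolog DCG. This is a rough analogue of rules -5>, -7> and -8>, not a translation of the Allen grammar: all predicate names (be_aux, be_form, v, main_vp, passive_vp) and the tiny lexicon are assumptions for illustration, and agreement is not checked.

```prolog
% Assumed lexicon, for illustration only.
be_aux --> [W], { be_form(W) }.
be_form(is).
be_form(was).
be_form(were).

% v(Word, VForm, Subcat)
v(see, bare, np).
v(saw, past, np).
v(seen, pastprt, np).

np --> [the, dog].
np --> [jack].

% Like rule -7>: a transitive verb with its object, no passive gap.
main_vp(Form, nopassgap) --> [V], { v(V, Form, np) }, np.
% Like rule -8>: a transitive verb alone, leaving a passive gap.
main_vp(Form, passgap) --> [V], { v(V, Form, np) }.
% Like rule -5>: be plus a past-participle verb phrase with a passive gap.
passive_vp --> be_aux, main_vp(pastprt, passgap).

s --> np, passive_vp.
s --> np, main_vp(past, nopassgap).
```

Here phrase(s, [the, dog, was, seen]) and phrase(s, [jack, saw, the, dog]) succeed, while phrase(s, [the, dog, was, saw]) fails because saw is not a past participle, and phrase(s, [the, dog, was, seen, jack]) fails because the passive verb phrase leaves no room for an object.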
Here is an encoding of the complete grammar shown in Figure 5.3 (page 127) of Allen, including rules like the ones discussed above. This grammar defines several different verb phrase structures, each allowing different combinations of auxiliary verbs and passive forms.
(setq *grammar5-3*
  '((headfeatures
     (s vform agr)
     (vp vform agr)
     (np agr))
    ((s (inv -))
     -1>
     (np (agr ?a))
     (head (vp (vform (? v pres past)) (agr ?a))))
    ((vp)
     -2>
     (head (aux (compform ?v))) (vp (vform ?v)))
    ((vp)
     -3>
     (head (aux (root BE-AUX))) (vp (vform ing) (MAIN +)))
    ((vp)
     -4>
     (head (aux (root BE-AUX))) (vp (vform ing) (PASS +)))
    ((vp (PASS +))
     -5>
     (head (aux (root BE-AUX))) (vp (vform pastprt) (MAIN +) (PASSGAP +)))
    ((vp (PASSGAP -) (MAIN +))
     -6>
     (head (v (subcat _none))))
    ((vp (PASSGAP -) (MAIN +))
     -7>
     (head (v (subcat _np))) (np))
    ((vp (PASSGAP +) (MAIN +))
     -8>
     (head (v (subcat _np))))
    ((np)
     -9>
     (art (agr ?a)) (head (n (agr ?a))))
    ((np)
     -10>
     (head (name)))
    ((np)
     -11>
     (head (pro)))))
Below is a parse of the sentence "The dog was seen" which shows the roles of the various grammar rules for handling passive gaps. A simplified tree diagram of this parse is shown in Figure 5.4 of Allen (page 128).
REL181:<REL ((GAP -) (1 S180))> from 0 to 3 from rule -R5>
  S180:<S ((GAP <NP ((SEM ?SEM177) (AGR ?AGR176))>) (WH -) (INV -) (VFORM PAST) (AGR 3S) (1 NP175) (2 VP179))> from 0 to 3 from rule -5-8-1>
    NP175:<NP ((GAP -) (WH -) (AGR 3S) (1 DET173) (2 CNP174))> from 0 to 2 from rule -5-7-2>
      DET173:<DET ((GAP -) (AGR 3S) (1 ART167))> from 0 to 1 from rule -5-7-5>
        ART167:<ART ((LEX THE) (ROOT THE1) (AGR (? A5 3P 3S)))> from 0 to 1 from rule NIL
      CNP174:<CNP ((GAP -) (AGR 3S) (1 N168))> from 1 to 2 from rule -5-7-3>
        N168:<N ((LEX DOG) (ROOT DOG1) (AGR 3S))> from 1 to 2 from rule NIL
    VP179:<VP ((GAP <NP ((SEM ?SEM177) (AGR ?AGR176))>) (VFORM PAST) (AGR 3S) (1 V170) (2 GAP178))> from 2 to 3 from rule -5-8-7>
      V170:<V ((LEX WAS) (ROOT BE1) (VFORM PAST) (AGR (? A7 3S 1S)) (SUBCAT _NP))> from 2 to 3 from rule NIL
      GAP178:<NP ((EMPTY +) (GAP <NP ((SEM ?SEM177) (AGR ?AGR176))>) (SEM ?SEM177) (AGR ?AGR176))> from 3 to 3 from rule NP-GAP-INTRO
V191:<V ((VFORM PASTPRT) (ROOT SEE1) (SUBCAT _NP) (1 V171) (2 +EN172))> from 3 to 5 from rule -LEX5>
  V171:<V ((LEX SEE) (ROOT SEE1) (VFORM BARE) (SUBCAT _NP) (IRREG-PAST +) (EN-PASTPRT +))> from 3 to 4 from rule NIL
  +EN172:<+EN ((LEX +EN))> from 4 to 5 from rule NIL
REL196:<REL ((GAP -) (1 VP195))> from 3 to 5 from rule -R6>
  VP195:<VP ((GAP <NP ((SEM ?SEM193) (AGR ?AGR192))>) (VFORM PASTPRT) (AGR -) (1 V191) (2 GAP194))> from 3 to 5 from rule -5-8-7>
    V191:<V ((VFORM PASTPRT) (ROOT SEE1) (SUBCAT _NP) (1 V171) (2 +EN172))> from 3 to 5 from rule -LEX5>
      V171:<V ((LEX SEE) (ROOT SEE1) (VFORM BARE) (SUBCAT _NP) (IRREG-PAST +) (EN-PASTPRT +))> from 3 to 4 from rule NIL
      +EN172:<+EN ((LEX +EN))> from 4 to 5 from rule NIL
    GAP194:<NP ((EMPTY +) (GAP <NP ((SEM ?SEM193) (AGR ?AGR192))>) (SEM ?SEM193) (AGR ?AGR192))> from 5 to 5 from rule NP-GAP-INTRO
Wh-movement: move a wh-term to the beginning of a sentence to form a wh-question. E.g., "Which dogs did he see?"
Topicalization: move a constituent to the beginning of a sentence for emphasis. E.g., "That dog he never liked."
Adverb preposing: move an adverb to the beginning of a sentence. E.g., "Tomorrow, he will see the dog."
Extraposition: move certain NP complements to the end of the sentence. E.g., "A book was written about evolution."
Which dogs did he see?
Here, the gap follows the verb "see", and the word "Which" is sometimes called a "filler" for the gap (that is, a word that licenses the existence of a gap following a transitive verb). Words such as which (e.g., who, what, and where; sometimes called the "wh-words") are often also used at the head of relative clauses, as in
The dogs which he saw returned.
So the coding of words like which in the lexicon must allow these different uses. The feature WH is used for this purpose. Here is an encoding of the word which in the lexicon that distinguishes its use in a question from its use in a relative clause (see Allen Figure 5.6, page 135 for more discussion of these examples).
(which (qdet (WH q) (root WHICH) (agr (? a 3s 3p))))
(which (pro (WH r) (root WHICH) (agr (? a 3s 3p))))
The first encoding says that which can be used to introduce wh-questions, and the second says that it can be used to introduce relative clauses. Some corresponding grammar rules that can be used with the first of these two uses are as follows (the entire grammar is given in the file lisp/jallen/Parser1.1/Grams/chap5):
((s) -5-8-3>
 (np (wh q) (gap -) (agr ?a))
 (head (s (inv +) (gap (% np (agr ?a))))))
((s (inv +) (wh ?w) (gap ?g)) -5-8-2>
 (head (aux (compform ?s) (agr ?a)))
 (np (wh ?w) (agr ?a) (gap -))
 (vp (vform ?s) (gap ?g)))
((np (wh ?w)) -5-7-2>
 (det (wh ?w) (agr ?a))
 (head (cnp (agr ?a))))
((det (wh ?w)) -5-7-7>
 (head (qdet (wh ?w))))
The first rule says that a sentence can be constructed from a noun phrase of the WH variety (e.g., "which dogs") followed by a head that is an inverted s (e.g., "did he see"). The second rule shows how an inverted s can be defined with a gapped vp. The third and fourth rules say more about the structure of an np of the WH variety: it can be a det of the qdet variety (e.g., which) followed by a common noun phrase, or cnp ("dogs"). Agreement also appears in the appropriate places, as does the location of the gap (e.g., following the transitive verb "see").
A full parse of the sentence "Which dogs did he see" appears below. This corresponds to the chart parse shown and discussed on page 141 of Allen.
S218:<S ((VFORM PAST) (AGR 3S) (1 NP210) (2 S217))> from 0 to 6 from rule -5-8-3>
  NP210:<NP ((GAP -) (WH Q) (AGR 3P) (1 DET204) (2 CNP209))> from 0 to 3 from rule -5-7-2>
    DET204:<DET ((GAP -) (WH Q) (AGR 3P) (1 QDET198))> from 0 to 1 from rule -5-7-7>
      QDET198:<QDET ((LEX WHICH) (WH Q) (ROOT WHICH) (AGR (? A23 3P 3S)))> from 0 to 1 from rule NIL
    CNP209:<CNP ((GAP -) (AGR 3P) (1 N208))> from 1 to 3 from rule -5-7-3>
      N208:<N ((AGR 3P) (ROOT DOG1) (1 N199) (2 +S200))> from 1 to 3 from rule -LEX7>
        N199:<N ((LEX DOG) (ROOT DOG1) (AGR 3S))> from 1 to 2 from rule NIL
        +S200:<+S ((LEX +S))> from 2 to 3 from rule NIL
  S217:<S ((GAP <NP ((SEM ?SEM214) (AGR 3P))>) (WH -) (INV +) (VFORM PAST) (AGR 3S) (1 AUX201) (2 NP211) (3 VP216))> from 3 to 6 from rule -5-8-2>
    AUX201:<AUX ((LEX DID) (MODAL +) (ROOT DO1) (VFORM PAST) (AGR 3S) (COMPFORM BARE))> from 3 to 4 from rule NIL
    NP211:<NP ((GAP -) (WH -) (POSS -) (AGR 3S) (1 PRO202))> from 4 to 5 from rule -5-7-1>
      PRO202:<PRO ((LEX HE) (ROOT HE1) (AGR 3S))> from 4 to 5 from rule NIL
    VP216:<VP ((GAP <NP ((SEM ?SEM214) (AGR 3P))>) (VFORM BARE) (AGR -) (1 V203) (2 GAP215))> from 5 to 6 from rule -5-8-7>
      V203:<V ((LEX SEE) (ROOT SEE1) (VFORM BARE) (SUBCAT _NP) (IRREG-PAST +) (EN-PASTPRT +))> from 5 to 6 from rule NIL
      GAP215:<NP ((EMPTY +) (GAP <NP ((SEM ?SEM214) (AGR 3P))>) (SEM ?SEM214) (AGR 3P))> from 6 to 6 from rule NP-GAP-INTRO
sinv --> aux(finite/Required), np, vp(Required).
That is, an inverted sentence is formed by a finite auxiliary verb, followed by a noun phrase and a verb phrase whose form is the one the auxiliary requires (its Required part).
Recall that a gap is part of a phrase missing from its usual location, and a filler is another phrase that stands for the missing one. For instance, in "terry read every book that bertrand wrote", the filler is "that" and the gap occurs after the verb "wrote", which normally takes a noun phrase as an object. In Prolog, a gap is realized by omitting a noun phrase:
np(gap(np)) --> [].
Now a verb phrase that admits a gap can be formed from a transitive verb and a possibly-missing noun phrase:
vp(GapInfo) --> tv, np(GapInfo).
s(GapInfo) --> np(nogap), vp(GapInfo).
rel --> relpron, s(gap(np)).
That is, a relative clause is a relative pronoun followed by a sentence with a gap, as in "that bertrand wrote."
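The gap rules can be exercised with a small lexicon. The four grammar rules below are the ones from the text; the noun phrase rules, det, n, tv and relpron entries are assumptions added here so that the example sentence can be parsed.

```prolog
% Noun phrases: the gap rule from the text, plus assumed ordinary ones.
np(gap(np)) --> [].
np(nogap) --> [terry].
np(nogap) --> [bertrand].
np(nogap) --> det, n.
np(nogap) --> det, n, rel.

% Rules from the text.
vp(GapInfo) --> tv, np(GapInfo).
s(GapInfo) --> np(nogap), vp(GapInfo).
rel --> relpron, s(gap(np)).

% Assumed lexicon, for illustration only.
det --> [every].
n --> [book].
tv --> [read].
tv --> [wrote].
relpron --> [that].
```

With this sketch, phrase(rel, [that, bertrand, wrote]) succeeds, and so does phrase(s(nogap), [terry, read, every, book, that, bertrand, wrote]). By contrast, phrase(rel, [that, bertrand, wrote, every, book]) fails, because the sentence inside a relative clause must contain the gap rather than an overt object.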
Wh-questions can be handled in Prolog using similar strategies. Questions like "who loves mary" and "who does mary love" are handled using the following rules, respectively:
q --> whpron, vp(nogap).
q --> whpron, sinv(gap(np)).
sinv(GapInfo) --> aux, np(nogap), vp(GapInfo).
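A runnable sketch of these question rules follows. The three rules for q and sinv are taken from the text; the vp and np rules are repeated from the gap discussion, and the lexicon (who, does, mary, loves, love) is an assumption for illustration.

```prolog
% Rules from the text.
q --> whpron, vp(nogap).
q --> whpron, sinv(gap(np)).
sinv(GapInfo) --> aux, np(nogap), vp(GapInfo).

% Gap handling, as in the relative clause rules.
vp(GapInfo) --> tv, np(GapInfo).
np(gap(np)) --> [].
np(nogap) --> [mary].

% Assumed lexicon, for illustration only.
whpron --> [who].
aux --> [does].
tv --> [loves].
tv --> [love].
```

Both phrase(q, [who, loves, mary]) and phrase(q, [who, does, mary, love]) succeed, and phrase(q, [who, does, mary, love, mary]) fails because the inverted sentence must contain a gap. Since verb form is not tracked in these rules, the sketch also accepts strings such as "who love mary"; adding the Form/Require arguments from the auxiliary rules would rule these out.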
CLE performance in 1992: a sample of 1000 sentences was taken at random from the Lancaster-Oslo/Bergen corpus of printed British English. Of these, 634 were analyzed successfully by the CLE; that is, they were parsed and produced at least one logical form for meaning. 67% of the 634 were estimated to be valid meaning representations.
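As a rough check on what these figures imply for the whole sample, and assuming the reported 67% applies to all 634 analyzed sentences, the number of sentences with a valid meaning representation is approximately:

```latex
\[
\frac{634}{1000} = 63.4\%, \qquad
0.67 \times 634 \approx 425, \qquad
\frac{425}{1000} = 42.5\%
\]
```

So, under that assumption, somewhat over 40% of the 1000 sampled sentences received a valid meaning representation.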