Semantics - Meaning Representation in NLP

The entire purpose of a natural language is to facilitate the exchange of ideas among people about the world in which they live.  The ideas conveyed by an utterance or text (a series of sentences) constitute its "meaning," or semantics.  A fully adequate natural language semantics would require a complete theory of how people think and communicate ideas.  Since such a theory is not immediately available, natural language researchers have designed more modest and pragmatic approaches to natural language semantics.

One such approach uses the so-called "logical form," which is a representation of meaning based on the familiar predicate and lambda calculi.  In this section, we present this approach to meaning and explore the degree to which it can represent ideas expressed in natural language sentences.  We use Prolog as a practical medium for demonstrating the viability of this approach.  We use the lexicon and syntactic structures parsed in the previous sections as a basis for testing the strengths and limitations of logical forms for meaning representation.

Logic and Logical Forms

 

The Meaning of Simple Objects and Events

Quantifiers and the Meaning of Determiners

The Meaning of Modifiers

Adjectives
Relative Clauses
Plurals, Cardinality, and Mass Nouns
Negation

Question-Answering

Yes/No
How Many and Which
Who and What

Discourse Referents, Anaphora, Definite Reference (the)

Word Sense Disambiguation

Ontological Methods
Statistical Methods

The semantics, or meaning, of an expression in natural language can be abstractly represented as a logical form.  Once an expression has been fully parsed and its syntactic ambiguities resolved, its meaning should be uniquely represented in logical form.  Conversely, a logical form may have several equivalent syntactic representations.  Semantic analysis of natural language expressions and generation of their logical forms is the subject of this chapter.

Consider the sentence "The ball is red."  Its logical form can be represented by red(ball101).  This same logical form simultaneously represents a variety of syntactic expressions of the same idea, such as "Red is the ball." and "La balle est rouge."

The most challenging issue that stands between a syntactic utterance and its logical form is ambiguity: lexical ambiguity, syntactic ambiguity, and semantic ambiguity.  Lexical ambiguity arises with words that have many senses.  For example, the word "go" has at least the following senses, or meanings:

move
depart
pass
vanish
reach
extend
set out

Compounding the situation, a word may have different senses in different parts of speech.  The word "flies" has at least two senses as a noun (insects, fly balls) and at least two more as a verb (goes fast, goes through the air).

To assist with the categorization and representation of word meanings, an ontology is often used.  An ontology is a broad classification scheme for objects, originally proposed by Aristotle.  Many of the original classifications are still in use today; here is a typical list:

action        position
event         state
substance     affection
quantity      ideas
quality       concepts
relation      plans
place         situation
time
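
One simple way to make such a classification available to a program is as a set of Prolog facts associating word senses with categories.  This is only an illustrative sketch; the predicate name and the particular senses below are assumptions, not part of a standard lexicon:

% assumed facts pairing word senses with ontological categories
category(dog1, substance).
category(red1, quality).
category(break1, action).
category(kitchen1, place).

A query such as category(break1, C) then yields C = action.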

Case grammars have been proposed to further clarify relations between actions and objects.  So-called case roles can be defined to link certain kinds of verbs and various objects.  These roles include:

agent
theme
instrument

For example, in "John broke the window with the hammer," a case grammar would identify John as the agent, the window as the theme, and the hammer as the instrument.  More examples of case roles and their use are given in Allen, p 248-9.
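
As an illustrative sketch (the term structure and sense names here are assumptions, not notation from Allen), such a case-role analysis could be written down in Prolog as a term that names each role explicitly:

% "John broke the window with the hammer"
event(broke1, [agent(john1), theme(window1), instrument(hammer1)]).

The flat predicate notation introduced below, broke1(john1, window1, hammer1), carries the same information but identifies the roles only by argument position.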

Logical Forms and Lambda Calculus

The language of logical forms is a combination of the first order predicate calculus (FOPC) and the lambda calculus.  The predicate calculus includes unary, binary, and n-ary predicates, such as:

Lisp notation                     Prolog notation                     Sentence represented
(dog1 fido1)                      dog1(fido1)                         "Fido is a dog"
(loves1 sue1 jack1)               loves1(sue1, jack1)                 "Sue loves Jack"
(broke1 john1 window1 hammer1)    broke1(john1, window1, hammer1)     "John broke the window with a hammer"

The third example shows how the semantic information transmitted in a case grammar can be represented as a predicate.

Affixing a numeral to the items in these predicates indicates that the semantic representation refers to a particular sense, or interpretation, of an action or object.  For instance, loves1 denotes one particular interpretation of "love."

Logical forms can be constructed from predicates and other logical forms using the operators & (and), => (implies), and the quantifiers all and exists.  In English, there are other useful quantifiers beyond these two, such as many, a few, most, and some.  For example, the sentence "Most dogs bark" has the logical form

(most d1: (& (dog1 d1) (barks1 d1)))

Finally, the lambda calculus is useful in the semantic representation of natural language ideas.  If p(x) is a logical form, then the expression \x.p(x) defines a function with bound variable x.  Beta-reduction is the formal notion of applying a function to an argument.  For instance, (\x.p(x))a applies the function \x.p(x) to the argument a, leaving p(a).

Semantic Rules for Context Free Grammars

One way to generate semantic representations for sentences is to associate with each grammar rule a step that defines the logical form for the syntactic category on its left-hand side.  Consider a simple grammar with the categories S, NP, VP, and TV.

1.  If the grammar rule is S --> NP VP and the logical forms for NP and VP are NP' and VP' respectively, then the logical form S' for S is VP'(NP').  For example, in the sentence "bertrand wrote principia" suppose that

NP' = bertrand and VP' = \x.wrote(x, principia)

Then the logical form S' is the result of Beta reduction:

(\x.wrote(x, principia))bertrand = wrote(bertrand, principia)

2.  If the grammar rule is VP --> TV NP and the logical forms for TV and NP are TV' and NP' respectively, then the logical form VP' for VP is TV'(NP').  For example, in the phrase "wrote principia," if

TV' = \y.\x.wrote(x, y) and NP' = principia

then the logical form VP' = \x.wrote(x, principia) after Beta reduction.
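
Putting the two rules together for the whole sentence "bertrand wrote principia", the logical form is built from the inside out:

TV' = \y.\x.wrote(x, y)
NP' = principia                                      (object)
VP' = TV'(NP') = (\y.\x.wrote(x, y))principia = \x.wrote(x, principia)
NP' = bertrand                                       (subject)
S'  = VP'(NP') = (\x.wrote(x, principia))bertrand = wrote(bertrand, principia)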

Prolog Representation

To accommodate the limitations of the ASCII character set, the following conventions are used in Prolog to represent logical forms and lambda expressions.

Expression            Prolog convention
(forall x: p(x))      all(X, p(X))             (recall that Prolog variables are capitalized)
(exists x: p(x))      exists(X, p(X))
and                   &
implies               =>
\x.p(x)               X^p(X)
(\x.p(x)) y           reduce(X^p(X), Y, LF)

Beta reduction is carried out by the predicate reduce, which is defined by the single Prolog clause reduce(Arg^Exp, Arg, Exp).  For example,

reduce(X^halts(X), shrdlu, LF)

gives the answer LF = halts(shrdlu).
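
The same clause also handles the two-argument lambda expressions used below for transitive verbs: applying reduce twice supplies first the object and then the subject.  A small illustrative query (output shown roughly as a Prolog system would report it):

reduce(Arg^Exp, Arg, Exp).     % the clause defined above

?- reduce(Y^X^wrote(X, Y), principia, VP), reduce(VP, bertrand, S).
Y = principia,
X = bertrand,
VP = bertrand^wrote(bertrand, principia),
S = wrote(bertrand, principia).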

Semantics of a Simple Grammar

Using these ideas, we can write Prolog rules with semantics as follows:

s(S) --> np(NP), vp(VP),   {reduce(VP, NP, S)}.
np(NP) --> det(D), n(N),   {reduce(D, N, NP)}.
np(NP) --> n(NP).
vp(VP) --> tv(TV), np(NP), {reduce(TV, NP, VP)}.
vp(VP) --> iv(VP).

In the first rule, VP is bound to a lambda expression (the meaning of the verb phrase) and NP to the meaning of the subject.  In fact, the references to reduce can be removed from these rules, and their effects can be folded directly into the arguments of the nonterminals.  That is, the following set of rules

s(S) --> np(NP), vp(NP^S).
np(NP) --> det(N^NP), n(N).
np(NP) --> n(NP).
vp(VP) --> tv(NP^VP), np(NP).
vp(VP) --> iv(VP).

captures the same semantics as the original set.  This is called partial execution.
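
To see why the two sets are equivalent, consider the first rule.  The goal reduce(VP, NP, S) succeeds exactly when VP unifies with NP^S, so the call can be eliminated by writing NP^S directly in place of VP:

s(S) --> np(NP), vp(VP), {reduce(VP, NP, S)}.     % original rule: reduce forces VP = NP^S
s(S) --> np(NP), vp(NP^S).                        % the same rule after partial execution

The remaining rules are transformed in the same way.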

We encode lexical entries for nouns and verbs as follows:

n(shrdlu) --> [shrdlu].              % proper nouns represent themselves
n(terry) --> [terry].
n(X^ `program(X)) --> [program].     % this encoding produces the "is-a" property
tv(Y^X^ `wrote(X, Y)) --> [wrote].
iv(X^ `halts(X)) --> [halts].

Note that nouns which represent classes of objects are encoded with a logical form that permits specific proper nouns to become members of that class.  For instance, the logical form program(shrdlu) may result from the parsing of "shrdlu is a program."  Transitive verbs, like wrote, generate 2-variable lambda expressions, while intransitive verbs generate 1-variable lambda expressions.  With this grammar, we can parse sentences and directly generate encodings of logical forms.  For example, the call

s(LF, [terry, wrote, shrdlu], [])

yields the result LF = wrote(terry, shrdlu), in a sequence of steps shown by the following partial trace (failing steps are omitted from this trace):

?- s(LF, [terry, wrote, shrdlu], []).
   Call:  (  7) s(_G276, [terry, wrote, shrdlu], []) ? creep
   Call:  (  8)   np(_L131, [terry, wrote, shrdlu], _L132) ? creep
   Call:  (  9)       n(_L131, [terry, wrote, shrdlu], _L132) ? creep
   Exit:  (  9)       n(terry, [terry, wrote, shrdlu], [wrote, shrdlu]) ? creep
   Exit:  (  8)    np(terry, [terry, wrote, shrdlu], [wrote, shrdlu]) ? creep
   Call:  (  8)    vp(terry^_G276, [wrote, shrdlu], []) ? creep
   Call:  (  9)       tv(_G382^terry^_G276, [wrote, shrdlu], _L171) ? creep
   Exit:  (  9)       tv(_G382^terry^wrote(terry, _G382), [wrote, shrdlu], [shrdlu]) ? creep
   Call:  (  9)       np(_G382, [shrdlu], []) ? creep
   Call:  ( 10)           n(_G382, [shrdlu], []) ? creep
   Exit:  ( 10)           n(shrdlu, [shrdlu], []) ? creep
   Exit:  (  9)       np(shrdlu, [shrdlu], []) ? creep
   Exit:  (  8)    vp(terry^wrote(terry, shrdlu), [wrote, shrdlu], []) ? creep
   Exit:  (  7) s(wrote(terry, shrdlu), [terry, wrote, shrdlu], []) ? creep

LF = wrote(terry, shrdlu)
 

Quantified Noun Phrases

The logical forms that represent quantified noun phrases, like "every program" in the sentence "every program halts," should carry appropriate quantifiers.  To accomplish this, determiners like "every" should be encoded in the lexicon in a way that anticipates the addition of a quantifier to the logical form being constructed.  Here is a definition for the determiner "every."
det((X^P)^(X^Q)^all(X, P=>Q)) --> [every].

The Prolog logical form here is an encoding of the lambda expression \p.\q.all(x, p(x) => q(x)).  Note that the variable x is bound in each of the expressions p(x) and q(x).

This definition says that when "every" is used as a determiner, its meaning takes two properties (P and Q) of an object X and produces the statement that P(X) implies Q(X) for every X.  For instance, in "every program halts," P and Q are instantiated with program(X) and halts(X), so we have the following logical form as a meaning representation:

(\p.\q.all(x, p(x) => q(x))) (\z.program(z)) (\y.halts(y))
      =  (\q.all(x, program(x) => q(x))) (\y.halts(y))
      =  all(x, program(x) => (\y.halts(y)) (x))
      =  all(x, program(x) => halts(x))

Here is a partial trace for this parse.

?- s(LF, [every, program, halts], []).
   Call:  (  7) s(_G276, [every, program, halts], [])
   Call:  (  8)    np(_L131, [every, program, halts], _L132)
   Call:  (  9)       det(_G379^_G380, [every, program, halts], _L146)
   Call:  ( 10)          'C'([every, program, halts], _L159, _L160)
   Exit:  ( 10)          'C'([every, program, halts], every, [program, halts])
   Call:  ( 10)           det(every, _G379^_G380)
   Exit:  ( 10)           det(every,  (_G382^_G383)^ (_G382^_G389)^all(_G382, _G383=>_G389))
   Call:  ( 10)           _L146=[program, halts]
   Exit:  ( 10)           [program, halts]=[program, halts]
   Exit:  (  9)        det( (_G382^_G383)^ (_G382^_G389)^all(_G382, _G383=>_G389),
                                                                                   [every, program, halts], [program, halts])
   Call:  (  9)        n(_G382^_G383, [program, halts], _L132)
   Exit:  (  9)         n(_G382^ `program(_G382), [program, halts], [halts])
   Exit:  (  8)    np( (_G382^_G389)^all(_G382, `program(_G382)=>_G389),
                                                                                   [every, program, halts], [halts])
   Call:  (  8)    vp( ( (_G382^_G389)^all(_G382, `program(_G382)=>_G389))^_G276, [halts], [])
   Call:  (  9)        iv( ( (_G382^_G389)^all(_G382, `program(_G382)=>_G389))^_G276, [halts], [])
   Exit:  (  9)        iv( ( (_G382^_G389)^all(_G382, `program(_G382)=>_G389))^
                                                `halts( (_G382^_G389)^all(_G382, `program(_G382)=>_G389)),
                                                                                   [halts], [])
   Exit:  (  8)    vp( ( (_G382^_G389)^all(_G382, `program(_G382)=>_G389))^
                                                `halts( (_G382^_G389)^all(_G382, `program(_G382)=>_G389)),
                                                                                   [halts], [])
   Exit:  (  7) s(`halts( (_G382^_G389)^all(_G382, `program(_G382)=>_G389)),
                                                                                   [every, program, halts], [])

LF = `halts( (_G382^_G389)^all(_G382, `program(_G382)=>_G389))

Note that the logical form derived in this parse is not the same as the one developed above.  To see that they are equivalent, note that LF is of the form `halts(\z.\x.all(x, program(x) => z(x))), and `halts has the logical form \y.halts(y).
Beta-reducing these two, we get:

(\z.\x.all(x, program(x) => z(x))) (\y.halts(y))
 =  \x.all(x, program(x) => (\y.halts(y)) (x))
 =  \x.all(x, program(x) => halts(x))

Semantics of Filler-Gap Dependencies

Gaps are analyzed like proper nouns, except that the entity they contribute is a variable, which is also recorded in the gap information passed as the second argument.

np((X^S)^S, gap(np, X)) --> [].

A relative clause meaning is a property, which is conjoined with the noun meaning (another property):

optrel((X^S1) ^ (X ^ (S1&S2))) --> relpron, vp(X^S2, nogap).
optrel((X^S1) ^ (X ^ (S1&S2))) --> relpron, s(S2, gap(np, X)).

For example, the phrases "professor who wrote principia" and "that bertrand wrote" are represented by the following logical forms:

M^(professor(M) & wrote(M, principia))
B^wrote(bertrand, B)

A question can be interpreted as a property that is true of the answer to the question.  For yes-no questions, the property should hold of the answer "yes" exactly when the condition expressed by the inverted sentence is satisfied:

sinv(S, GapInfo) --> aux, np(VP^S, nogap), vp(VP, GapInfo).
q(yes^S) --> sinv(S, nogap).

For complement questions, we want the property to be the one given by the VP, or, when the wh-word fills a gap, the one obtained by abstracting over the gap variable in the inverted sentence:

q(VP) --> whpron, vp(VP, nogap).
q(X^S) --> whpron, sinv(S, gap(np, X)).
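
These rules presuppose lexical entries for relative pronouns, wh-pronouns, and auxiliaries, which are not given above.  A minimal set of assumed entries (illustrative only; a fuller grammar would also have to deal with verb forms and agreement) might be:

relpron --> [who].
relpron --> [that].
whpron  --> [who].
whpron  --> [what].
aux     --> [did].
aux     --> [does].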
 

Exercises

  1. Add to the lexicon an appropriate encoding for the determiner "a", so that it can be used in sentences like "terry wrote a program".  Hand-trace the application of the Prolog rules given in this section with this sentence and show the intermediate logical forms that lead to its logical form representation, exists(X, program(X) & wrote(terry, X)).
  2. Assuming the grammatical rules found in this section, find appropriate semantic representations for the following statements:
     a.  terry wrote a program that halts
     b.  a program halts
     c.  terry wrote every program that halts
  3. Give an example of a yes-no question and a complement question to which the rules in the last section can apply.  For each example, show the intermediate steps in deriving the logical form for the question.  Assume there are sufficient definitions in the lexicon for common words, like "who", "did", and so forth.
  4. Look at program 4.2 on p 102 of Pereira.  Using a trace, show the intermediate steps in the parse of the sentence "every student wrote a program."

References

Allen, James.  Natural Language Understanding, chapter 9.
Pereira, Fernando C. N., and Stuart M. Shieber.  Prolog and Natural-Language Analysis, chapter 4.