A brief survey of classic work on compositionality

A quick journey through classic theories of compositionality and their latent features.

A lot of people seem to be discussing compositionality right now, especially in the context of building AI systems. It seems to be a good time to come back to some ‘classic’ linguistic work on the topic. My aim in this post is to highlight some of the philosophical and psychological issues in the history of the concept.

There are two principles under the heading ‘compositionality’, both (possibly incorrectly) attributed to Frege; see Pelletier’s “Did Frege believe Frege’s principle?” (2001).

Bottom-up, the ‘compositionality principle’: “… an important general principle which we shall discuss later under the name Frege’s Principle, that the meaning of the whole sentence is a function of the meanings of its parts.” (Cresswell, 1973)
Top-down, the ‘context principle’: “[I]t’s a famous Fregean view that words have meaning only as constituents of (hence, presumably, only in virtue of their use in) sentences.” (Fodor and LePore, 1992)

Obviously, the two principles are at odds. So what exactly is compositionality? Partee’s answer (1984): “Given the extreme theory-dependence of the compositionality principle and the diversity of existing (pieces of) theories, it would be hopeless to try to enumerate all its possible versions.”

So it may be impossible to clearly define compositionality, but Partee claims that we can at least attempt to pinpoint the latent features of the phenomenon. I’ll focus here on a) its relation to syntax; b) its relation to cognition; c) the nature of meaning itself; d) the nature of ‘context’.

The first use of the term compositionality, as far as I know, is in Katz and Fodor (1963): “As a rule, the meaning of a word is a compositional function of the meanings of its parts, and we would like to be able to capture this compositionality.”

In that paper, Katz and Fodor try to adapt Chomsky’s notion of competence to the area of semantics. They’re concerned with the bottom-up principle, which they call ‘the Projection Problem’. They propose that semantics follows grammar rule-by-rule, according to an ‘amalgamation’ process.

Katz and Fodor’s ‘amalgamation’ is “the joining of elements from different sets of paths under a given grammatical marker if these elements satisfy the appropriate selection restrictions”. The appropriate rules are encoded in the lexicon, and mastery of the lexicon is a psychological competence.

The notion of selectional restriction in K&F follows the notion of Chomskian competence. They want a competent speaker to be able to deal with ambiguities that are unresolved by syntax, as well as semantic felicity: “The (restaurant’s / bird’s) bill is large / The paint is silent.”
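To make the amalgamation idea concrete, here is a minimal sketch in Python. It is my own simplification, not K&F's formalism: the lexicon, the marker names and the `selects` field are all invented for illustration. Each reading of a word carries a set of semantic markers, and a reading may require certain markers of the word it combines with.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sense:
    markers: frozenset                 # semantic markers carried by this reading
    selects: frozenset = frozenset()   # markers it requires of the word it combines with

# Hypothetical toy lexicon: markers and restrictions are invented for illustration.
LEXICON = {
    "bill": [Sense(frozenset({"Physical", "BodyPart"})),
             Sense(frozenset({"Abstract", "Document"}))],
    "bird's": [Sense(frozenset({"Animal"}), selects=frozenset({"BodyPart"}))],
    "restaurant's": [Sense(frozenset({"Business"}), selects=frozenset({"Document"}))],
    "large": [Sense(frozenset({"Size"}))],
    "paint": [Sense(frozenset({"Physical", "Liquid"}))],
    "silent": [Sense(frozenset({"NoSound"}), selects=frozenset({"SoundSource"}))],
}

def amalgamate(dependent, head):
    """Join each reading of `dependent` with each reading of `head`
    whose markers satisfy the dependent's selection restriction."""
    return [Sense(d.markers | h.markers)
            for d in LEXICON[dependent]
            for h in LEXICON[head]
            if d.selects <= h.markers]

# Number of surviving readings: more than one means ambiguous, zero means anomalous.
for dep, head in [("large", "bill"), ("bird's", "bill"), ("silent", "paint")]:
    print(dep, head, len(amalgamate(dep, head)))
```

In this toy setup, a pair with several surviving readings stays ambiguous, a pair with exactly one has been disambiguated by the restriction, and a pair with none is semantically anomalous.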

For K&F, semantics – like syntax – is a competence which does not use “information about setting” and is “independent of individual differences between speakers”. However, they also readily accept a weak version of context as discourse dependence, in the spirit of Harris and the distributionalists.

The problem with the Chomskian account (and by extension Katz and Fodor’s) is that it gets into trouble when encountering quantifiers. “Every candidate voted for every candidate” is not equivalent to “Every candidate voted for him/herself.” But that’s what transformational grammar would predict.
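The non-equivalence is easy to check on a toy situation. The sketch below is only an illustration: the candidate names and the voting relation are invented, and each sentence is evaluated directly over sets.

```python
# Hypothetical situation: two candidates, each of whom voted only for themselves.
candidates = {"ann", "bob"}
voted_for = {("ann", "ann"), ("bob", "bob")}

def every_voted_for_every(cands, votes):
    """'Every candidate voted for every candidate.'"""
    return all((x, y) in votes for x in cands for y in cands)

def every_voted_for_self(cands, votes):
    """'Every candidate voted for him/herself.'"""
    return all((x, x) in votes for x in cands)

print(every_voted_for_every(candidates, voted_for))
print(every_voted_for_self(candidates, voted_for))
```

The two sentences come apart in this situation, so they cannot have the same meaning, whatever the syntactic derivation relating them.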

Enter Montague. Like K&F, Montague believes in a homomorphism between syntax and semantics, but the actual implementation of that homomorphism is to be done via model theory. Suddenly, sentences have truth values and parts of sentences have extensions.

An important side note on truth from Partee (2011): “semantics itself is in the first instance concerned with truth-conditions (not actual truth, as is sometimes mistakenly asserted) and entailment relations, not with internal representations.”

That is, the Montagovian account is not a cognitive account. Also, truth does not have anything to do with the real world. If you want ideas sleeping (furiously or not), you intersect the set of ideas and the set of sleeping things. Done.

Montague posits an infinity of possible worlds, with sentences being true in a set of worlds and constituents having intensions that map worlds to extensions. The account seems essentially bottom-up. And still, the notion of context does appear in his work…
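As a rough sketch of this setup (with only two invented worlds rather than an infinity, and hypothetical predicate names), an intension can be written as a function from worlds to extensions, and a proposition as the set of worlds where the sentence comes out true:

```python
# Hypothetical model: each possible world assigns an extension (a set) to each predicate.
WORLDS = {
    "w_actual":  {"idea": {"i1", "i2"}, "sleep": {"cat"}},
    "w_strange": {"idea": {"i1", "i2"}, "sleep": {"cat", "i1"}},
}

def intension(predicate):
    """The intension of a predicate: a function from worlds to extensions."""
    return lambda world: WORLDS[world][predicate]

def some(noun, verb):
    """Proposition expressed by 'some NOUN VERBs': the set of worlds
    in which the two extensions have a non-empty intersection."""
    n, v = intension(noun), intension(verb)
    return {w for w in WORLDS if n(w) & v(w)}

print(some("idea", "sleep"))
```

Nothing in the machinery cares whether sleeping ideas are plausible: the sentence is simply true in whichever worlds the intersection is non-empty.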

In a 1970 essay on pragmatics, Montague introduces contexts of use: who utters the sentence, where and when; the state of the world; the surrounding discourse, etc. Context is formalised as a finite tuple of indexicals. In this setup, intensions are functions from indices to extensions.
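A minimal sketch of the idea, assuming an index made of just a world, a speaker and a time (the field names and the facts are invented, and Montague's own indices are richer):

```python
from typing import NamedTuple

class Index(NamedTuple):
    """A hypothetical context of use: a finite tuple of coordinates."""
    world: str
    speaker: str
    time: int

def intension_of_I(index):
    """Intension of the indexical 'I': maps an index to its speaker."""
    return index.speaker

# Invented facts: who sleeps in which world at which time.
SLEEPERS = {("w1", 1): {"ann"}, ("w1", 2): {"bob"}}

def i_sleep(index):
    """Truth value of 'I sleep' at an index."""
    sleepers = SLEEPERS.get((index.world, index.time), set())
    return intension_of_I(index) in sleepers

print(i_sleep(Index("w1", "ann", 1)))
print(i_sleep(Index("w1", "ann", 2)))
```

The same sentence gets different truth values at different indices, even within a single world.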

One problem with the Montagovian account is its anti-psychologism. Partee (1979, 2014) highlights the difficulties of reconciling truth theory with the notion of competence, in particular when it comes to propositional attitudes.

I also note that logicians take possible worlds for granted and never explain where exactly they come from. Here’s where Fillmore comes in.

Initially, Fillmore is involved in syntactic work in the spirit of transformational grammar (see Fillmore, 1982). He develops ‘case grammar’ to encode selectional restrictions and Katz and Fodor’s projection rules. This is to become frame semantics.

For Fillmore, a frame is a conceptual representation of a situation: “I thought of each case frame as characterizing a small abstract ‘scene’ or ‘situation’, so that to understand the semantic structure of the verb it was necessary to understand the properties of such schematized scenes.”

Fillmore is interested in explaining why a speaker chose the specific words they uttered. He formalises this idea in the notion of ‘U-semantics’ (semantics of understanding). His semantics is perhaps the first to try and explicitly encode the paradox of the two principles of compositionality: …

“The U-semantics account is compositional in that its operation depends on knowledge of the meanings of individual lexical items […], but it is also ‘non-compositional’ in that the construction process is not guided by purely symbolic operations from bottom to top.” (Fillmore, 1985)

As in Katz & Fodor, Fillmore’s account is cognitive, but it does not cleanly distinguish syntax from semantics from pragmatics. As in Montague, truth is not metaphysical truth; for Fillmore, it is about the way speakers perceive the real world. But his account does not provide a notion of extension.

Frame semantics does not have a proper account of things, worlds, and quantification over things/worlds. In order to find a cognitive account of the way truth values can be computed over possible worlds, we have to jump to generative probabilistic models (e.g. Goodman and Lassiter, 2015).
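To give a flavour of what such a model looks like, here is a minimal sketch. It is not Goodman and Lassiter's model: the birds, the 0.9 prior and the function names are all made up. A generative process samples possible worlds, and the probability of a sentence is estimated as the fraction of sampled worlds in which it is true.

```python
import random

BIRDS = ["b1", "b2", "b3"]   # hypothetical individuals
P_FLIES = 0.9                # made-up prior; in a real model this would be learned

def sample_world(rng):
    """Generate one possible world: decide independently which birds fly."""
    return {"flies": {b for b in BIRDS if rng.random() < P_FLIES}}

def every_bird_flies(world):
    """Truth value of 'every bird flies' in a given world."""
    return set(BIRDS) <= world["flies"]

def probability(sentence, n_samples=10_000, seed=0):
    """Monte Carlo estimate of the probability that `sentence` is true."""
    rng = random.Random(seed)
    hits = sum(sentence(sample_world(rng)) for _ in range(n_samples))
    return hits / n_samples

print(probability(every_bird_flies))
```

With independent birds, the estimate should land close to 0.9 ** 3. The point is only that truth in a world and the distribution over worlds are kept as two separate components.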

Where do generative models stand? Against Fillmore, they draw a clear boundary between the world creation process and semantics. Against Montague, they anchor their possible worlds in probabilities learned from the real world. But there is more to language than the real world.

Talking of concept combination, Hampton (1991) shows people will try to make sense of anything (contra Chomsky). What is a fish that is a vehicle? “Some subjects put a saddle on the back of the fish […] while still others surgically implanted a pressurized compartment within the fish.”

So when making sense of unattested combinations, people are able to generate the possible world that would make the sentence true / the extension non-empty. If required, that world can be very far from reality.

I could continue, but this post would turn into an exposition of mental models. Let’s just note that theories overlap and disagree on the topics of cognition, the autonomy of semantics with respect to syntax and pragmatics, the idea of acceptability, and the focus on (lexical) types vs (world) objects.

The message is: compositionality remains a mystery because it is difficult to isolate the phenomenon. It interacts with our notions of syntax and pragmatics, with our working definitions of meaning and worlds, and with epistemological questions about linguistics (do we care about the mind?).

So there may not be an ultimate definition of compositionality in sight. Still, we can develop working definitions that will be useful to individual subfields. A great example is Bender et al. (2015) in relation to parsing, which explicitly addresses the compositionality vs context paradox.

Let’s just make sure we are aware of all aspects of the question. It is a rather large one.

References

Bender, E. M., Flickinger, D., Oepen, S., Packard, W., & Copestake, A. (2015). Layers of interpretation: On grammar and compositionality. In Proceedings of the 11th International Conference on Computational Semantics (IWCS 2015), 239-249.
Cresswell, M. (1973). Logics and Languages, London: Methuen.
Fillmore, C. J. (1982). Frame semantics. In ‘Linguistics in the Morning Calm’, Linguistic Society of Korea (ed.), 111–137. Seoul: Hanshin Publishing Company. Reprinted in ‘Cognitive linguistics: Basic readings’ (2008), 34, 373-400.
Fillmore, C. J. (1985). Frames and the semantics of understanding. Quaderni di semantica, 6(2), 222-254.
Fodor, J., & LePore, E. (1992). Holism: A Shopper’s Guide. Oxford: Blackwell.
Goodman, N. D., & Lassiter, D. (2015). Probabilistic Semantics and Pragmatics. In ‘The handbook of contemporary semantic theory’, eds. S. Lappin and C. Fox. Wiley-Blackwell.
Hampton, J. A. (1991). The combination of prototype concepts. In ‘The psychology of word meanings’, ed. P. Schwanenflugel, 91-116. Lawrence Erlbaum Associates.
Katz, J. J., & Fodor, J. A. (1963). The structure of a semantic theory. Language, 39(2), 170-210.
Montague, R. (1970). Pragmatics and intensional logic. Synthese, 22(1-2), 68-94.
Partee, B. (1979). Semantics—mathematics or psychology? In ‘Semantics from different points of view’ (pp. 1-14). Springer, Berlin, Heidelberg.
Partee, B. (1984). Compositionality. Reprinted in ‘Compositionality in formal semantics: Selected papers’, 2008. John Wiley & Sons.
Partee, B. (2011). Formal semantics: Origins, issues, early impact. Baltic International Yearbook of Cognition, Logic and Communication, 6(1), 13.
Partee, B. (2014). The History of Formal Semantics: Changing Notions of Linguistic Competence. Harvard, 9th Annual Joshua and Verona Whatmough Lecture, April 28, 2014.
Pelletier, F. J. (2001). Did Frege believe Frege’s principle? Journal of Logic, Language and Information, 10(1), 87-114.