Left: Multiple inheritance Up: DATR techniques Right: Encoding DAGs

Representing ambiguity and alternation

  DATR is a language that allows the lexicon writer to define sets of partial functions from sequences of atoms to sequences of atoms. That is actually all that it allows the lexicon writer to do. Because DATR deals in functions, it does not embody any notion of disjunction, nor any possibility of multiple values being associated with a single node/path pair. It might seem, at first glance, as if such a language would be quite inappropriate for a domain such as the lexicon, where ambiguities are common. In practice, however, this turns out not to be the case. Consider the homonymy of bank:

    Bank1:
        <> == NOUN
        <mor root> == bank
        <sem gloss> == side of river.
    Bank2:
        <> == NOUN
        <mor root> == bank
        <sem gloss> == financial institution.
This is simply the traditional analysis of homonymy, encoded in DATR: there are two entirely distinct lexemes with unrelated meanings that happen both to be nouns and to have indistinguishable morphological roots.
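
The point that ambiguity lives in the choice of node, not in the value of any node/path pair, can be illustrated with a small Python sketch. This is not DATR and not part of any DATR implementation: the dictionary layout and the helper name `index_by_root` are assumptions made purely for illustration, and inheritance from NOUN is omitted.

```python
from collections import defaultdict

# Hypothetical flattened view of the two lexemes: every (node, path)
# pair has exactly one value, as in a functional description.
LEXICON = {
    "Bank1": {"mor root": "bank", "sem gloss": "side of river"},
    "Bank2": {"mor root": "bank", "sem gloss": "financial institution"},
}


def index_by_root(lexicon):
    """Map each morphological root to the list of nodes that share it."""
    index = defaultdict(list)
    for node, paths in lexicon.items():
        index[paths["mor root"]].append(node)
    return dict(index)


roots = index_by_root(LEXICON)
print(roots["bank"])                    # ['Bank1', 'Bank2']
print(LEXICON["Bank2"]["sem gloss"])    # financial institution
```

The root bank leads to two nodes, but each node still returns a single gloss, so no disjunctive value is ever needed.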

Or consider the polysemy of cherry. The example is due to Kilgarriff (1995), who shows that the kind of polysemy exhibited by cherry applies generally to fruit trees and can thus be specified at a higher node in the lexical network. This removes the need for stipulation (as in our example) at the Cherry node, the Apple node, and so on. Kilgarriff & Gazdar (1995) also present an extended example showing how DATR can be used to encode the regular and subregular polysemy associated with the crop, fibre, yarn, fabric and garment senses of words like cotton and silk.

    Cherry:
        <> == NOUN
        <mor root> == cherry
        <sem gloss 1> == sweet red berry with pip
        <sem gloss 2> == tree bearing <sem gloss 1>
        <sem gloss 3> == wood from <sem gloss 2>.
Again, this is a rather traditional analysis. There are (at least) three distinct but related senses. For perspicuity, we give these in DATR-augmented English here, but a serious treatment could just as well give them in, say, a DATR encoding of the lambda calculus (as used in Cahill & Evans 1990, for example). The three senses are not freely interchangeable alternative values for a single attribute or path. Instead, DATR allows their relatedness of meaning to be captured by using the definition of one in the definition of another.
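
How one gloss is built from another can be mimicked with a toy evaluator in Python. This is only a sketch under stated assumptions: the `CHERRY` dictionary and the `evaluate` function are hypothetical names, paths are looked up by exact match, and DATR's default inheritance (including the inheritance from NOUN) is ignored entirely.

```python
# Toy model of the Cherry node. Each path is a tuple of atoms; each value
# is a list whose items are either atoms (str) or references to another
# path of the same node (tuple).
CHERRY = {
    ("mor", "root"): ["cherry"],
    ("sem", "gloss", "1"): ["sweet", "red", "berry", "with", "pip"],
    ("sem", "gloss", "2"): ["tree", "bearing", ("sem", "gloss", "1")],
    ("sem", "gloss", "3"): ["wood", "from", ("sem", "gloss", "2")],
}


def evaluate(node, path):
    """Return the atom sequence for `path`, expanding embedded path references."""
    result = []
    for item in node[path]:
        if isinstance(item, tuple):
            # A reference to another path: splice in its evaluated value.
            result.extend(evaluate(node, item))
        else:
            result.append(item)
    return result


print(" ".join(evaluate(CHERRY, ("sem", "gloss", "3"))))
# wood from tree bearing sweet red berry with pip
```

Each path still has exactly one value; the relatedness of the senses shows up only in the way the third value is computed from the second, and the second from the first.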

A very few words in English have alternative morphological forms for the same syntactic specification. An example noted by Fraser & Hudson (1990, 62) is the plural of hoof which, for many English speakers, can appear as either hoofs or hooves (see also the dreamt/dreamed verb class discussed by Russell et al. 1992, 330-331). Because a theorem set such as the following assigns two different values to the single node/path pair Word7:<mor form>, DATR does not permit it to be derived from a consistent description:

    Word7:
        <syn number> = plural
        <mor form> = hoof s
        <mor form> = hoove s.
But it is quite straightforward to define a description that will lead to the following theorem set:

    Word7:
        <syn number> = plural
        <mor form> = hoof s
        <mor form alternant> = hoove s.
Or something like this:

    Word7:
        <syn number> = plural
        <mor forms> = hoof s | hoove s .
Or this:

    Word7:
        <syn number> = plural
        <mor forms> = { hoof s , hoove s }.
Of course, as far as DATR is concerned, { hoof s , hoove s } is just a sequence of seven atoms. It is up to some component external to DATR that makes use of such a complex value to interpret it as a two-member set of alternative forms. Likewise, if we have some good reason for wanting to put together the various senses of cherry into a value returned by a single path, then we can write something like this:
    Cherry:
        ...
        <sem glosses> == { <sem gloss 1> , <sem gloss 2> , <sem gloss 3> }.
which will then provide this theorem:
    Cherry:
        <sem glosses> = { sweet red berry with pip ,
                          tree bearing sweet red berry with pip ,
                          wood from tree bearing sweet red berry with pip }.
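
The external interpretation of such brace-delimited values can be sketched in Python. This is a minimal illustration, not part of DATR: the function name `split_alternatives` is hypothetical, and the convention that "{", "," and "}" delimit alternatives is an assumption of the consuming component.

```python
def split_alternatives(atoms):
    """Interpret a flat DATR value such as '{ hoof s , hoove s }' as a list
    of alternatives, each of which is itself a list of atoms.

    The brace/comma convention is assumed by this external component;
    to DATR these symbols are ordinary atoms.
    """
    if len(atoms) < 2 or atoms[0] != "{" or atoms[-1] != "}":
        # Not in set notation: treat the whole sequence as one alternative.
        return [list(atoms)]
    alternatives, current = [], []
    for atom in atoms[1:-1]:
        if atom == ",":
            # A comma closes the current alternative and starts a new one.
            alternatives.append(current)
            current = []
        else:
            current.append(atom)
    alternatives.append(current)
    return alternatives


value = "{ hoof s , hoove s }".split()
print(len(value))                   # 7
print(split_alternatives(value))    # [['hoof', 's'], ['hoove', 's']]
```

The same function applied to the <sem glosses> theorem would yield three alternatives, one per sense of cherry.
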
Also relevant here are the various techniques for reducing lexical disjunction discussed in Pulman (1996).

---------------------------------------------------------

Copyright © Roger Evans, Gerald Gazdar & Bill Keller, Tuesday 10 November 1998