previous up contents next
Left: Boolean logic Up: DATR techniques Right: Representing lists

Finite state transduction

  Perhaps surprisingly, DATR turns out to be an excellent language for defining finite state transducers (FSTs). A path can be used as the input tape and a value as the output tape (recall that the DATR default mechanism means that extensions to left-hand-side paths are automatically carried over as extensions to right-hand-side paths, as discussed in Section 3.3, above). Nodes can be used for states, or else states can be encoded in attributes that are prefixed to the current input path. Here, for example, is a very simple DATR FST that will transduce a path such as <subj 1 sg futr obj 2 sg like> into the value ni ta ku penda (Swahili for I will like you). For clarity, this FST does not exploit default inheritance to capture the 50% overlap between the subject and object pronoun paradigms. See Gazdar (1992) for a version that does.

        <subj 1 sg> == ni S2:<>
        <subj 2 sg> == u  S2:<>
        <subj 3 sg> == a  S2:<>
        <subj 1 pl> == tu S2:<>
        <subj 2 pl> == m  S2:<>
        <subj 3 pl> == wa S2:<>.
        <past> == li S3:<>
        <futr> == ta S3:<>.
        <obj 1 sg> == ni S4:<>
        <obj 2 sg> == ku S4:<>
        <obj 3 sg> == m  S4:<>
        <obj 1 pl> == tu S4:<>
        <obj 2 pl> == wa S4:<>
        <obj 3 pl> == wa S4:<>.
        <like> == penda.
Although the example is trivial, the technique is both powerful and useful. Gibbon (1990), for example, has made notably effective use of it in his treatments of African tone systems. Much of the computational phonology and morphology of the last decade and a half has depended on FSTs. Potential lexical applications come readily to mind - for example, the orthographic spelling rules for English suffixation (cf. sky/skies). We give below a small fragment of such an FST in which + is used as a morpheme boundary marker. Note the role of DATR variables in giving concise expression to the rules:

    # vars $abc: a b c d e f g h i j k l m n o p q r s t u v w x y z.
    # vars $vow: a e i o u.
        <> ==
        <+> == <>
        <$abc> == $abc <>
        <e + $vow> == $vow <>.
These axioms then give rise to theorems such as these:
        <l o v e>           = l o v e
        <l o v e + s>       = l o v e s
        <l o v e + e d>     = l o v e d
        <l o v e + e r>     = l o v e r
        <l o v e + l y>     = l o v e l y
        <l o v e + i n g>   = l o v i n g
        <l o v e + a b l e> = l o v a b l e.


previous up contents next
Left: Boolean logic Up: DATR techniques Right: Representing lists
Copyright © Roger Evans, Gerald Gazdar & Bill Keller, Tuesday 10 November 1998