Using Selectional Profile Distance To Detect Verb Alternations
Abstract
We propose a new method for detecting verb alternations, by comparing the probability distributions over WordNet classes occurring in two potentially alternating argument positions. Existing distance measures compute only the distributional distance, and do not take into account the semantic similarity between WordNet senses across the distributions. Our method compares two probability distributions over WordNet by measuring the semantic distance of the component nodes, weighted by their probability. To incorporate semantic similarity, we calculate the (dis)similarity between two probability distributions as a weighted distance “travelled” from one to the other through the WordNet hierarchy. We evaluate the measure on the causative alternation, and find that overall it outperforms existing distance measures.
1 Detecting Verb Alternations
Although patterns of verb alternations, as in (1) and (2), may appear to be “mere” syntactic variation, the ability of a verb to alternate has been shown to be highly related to its semantic properties.
1. The sun melted the snow./The snow melted.
2. Kiva ate his lunch./Kiva ate./*His lunch ate.
For example, melt in (1) undergoes a causative alternation in which the transitive form is related to the intransitive by the introduction of a Causal Agent (the sun) into the event structure. The verb eat in (2), like melt, allows both transitive and intransitive forms, but these are related by the unspecified object alternation, as opposed to causativization. Based largely on the influence of Levin (1993), it has become widely accepted that alternations such as these can serve as a basis for the formation of semantic classes of verbs. Correspondingly, the relation between alternation patterns and meaning is a key focus in the computational study of the lexical semantics of verbs (e.g., Allen, 1997; Dang et al., 2000; Dorr and Jones, 2000; Merlo and Stevenson, 2001; Schulte im Walde and Brew, 2002; Tsang et al., 2002).
Furthermore, we note that recent work indicates that verb alternations may also play a role in automatic processing of language for applied tasks, such as question-answering (Katz et al., 2001), detection of text relations (Teufel, 1999), and determination of verb-particle constructions (Bannard, 2002). The theoretical and practical implications of alternations mean that it is important to identify verbs which undergo an alternation, and to discover the range of alternations. Manual annotation of verbs is labour intensive, and new verbs (or new uses of known verbs) may be encountered in any given domain. In response, some researchers have begun to investigate ways to detect alternations automatically in a corpus. Some of this work has focused on subcategorization patterns as the clear syntactic cue to an alternation (Lapata, 1999; Lapata and Brew, 1999; Schulte im Walde and Brew, 2002). Other work has observed, however, that detecting an alternation involves more than observing the use of particular subcategorizations: it must also be determined whether the semantic arguments are mapped to the appropriate positions.¹ To address this issue, it has been suggested that, if a verb participates in an alternation, then there should be similarity in the kinds of nouns that show up in the syntactic positions (or slots) that alternate, such as snow occurring as intransitive subject and transitive object in the causative alternation in (1) (Merlo and Stevenson, 2001; McCarthy, 2000).
¹ For example, melt (as in (1) above) undergoes a causative alternation because the Theme argument that surfaces as subject of the intransitive surfaces as object of the transitive, with the addition of a Causal Agent as the subject of the latter. It is not the case that any optionally intransitive verb undergoes this alternation, as shown by eat in (2).
As a cue to this alternation, Merlo and Stevenson (2001) create a bag of head nouns for each of the two potentially alternating slots, and compare them. In contrast to comparing head nouns directly, McCarthy (2000) instead compares the selectional preferences for each of the two slots (captured by a probability distribution over WordNet). This approach thereby generalizes over the compared nouns, increasing performance over a method similar to that of Merlo and Stevenson. In our work, we have developed a new method for comparing WordNet probability distributions, called “selectional profile distance” (SPD), which combines the benefits of each of the above approaches for detecting alternations. The method used by Merlo and Stevenson (2001) has the advantage of directly capturing similarity between slots (in terms of use of identical nouns [lemmas]), but fails to generalize over the nouns, lending itself to sparse data problems. The approach of McCarthy (2000), on the other hand, addresses the generalization problem by comparing probability distributions over WordNet. However, her comparison measure abstracts over distances between nodes (classes of nouns) in WordNet: it rewards probability mass that occurs in the same subtree across two distributions, but does not take into account the distance between the classes that carry the probability mass. Thus, this approach only captures similarity among the noun arguments across slots at a very coarse level. Our new SPD method integrates a comparison of probability distributions over WordNet with a node similarity measure, successfully capturing both of the advantageous properties of generalization and word (class) similarity. SPD thus enables us to calculate a meaningful similarity measure over the patterns of classes of nouns across two syntactic slots. Our evaluation of the SPD measure for alternation detection also covers some interesting experimental conditions that have not been explored previously. 
For comparison to previous methods, we investigate these issues in the context of classifying verbs according to whether they undergo the causative alternation. We experiment with randomly selected verbs, for both our alternating and non-alternating (filler) classes, and use both relatively homogeneous and heterogeneous sets of filler verbs. We find that our method performs about the same on each set, indicating that it is insensitive to variation in the filler verbs. Moreover, we experiment with equal numbers of verbs in different frequency bands, and show that splitting verbs into high and low frequency (of slot occurrence) can improve performance. By classifying the high and low frequency verbs separately, our method achieves an accuracy of 70% overall on unseen test verbs, in a task with a baseline of 50%. (For comparison, McCarthy (2000) achieves 73% on her set of hand-selected verbs, but our implementation of her method yields much lower performance on our randomly selected test verbs.)
In the next section, we present background work on capturing selectional preferences in WordNet, and on using them to detect alternations. In Section 3, we describe our new SPD measure, and show how it captures both the general differences between WordNet probability distributions, as well as the fine-grained semantic distances between the nodes that comprise them. Section 4 presents our corpus methodology and experimental set-up. In Section 5, we compare SPD to a range of distance measures, and evaluate the different effects of our experimental factors, such as the precise distance functions we use in SPD and the division of our verbs into frequency bands. We summarize our findings in Section 6 and point to directions in our on-going work.
2 The Use of Selectional Preferences
Selectional preference refers to the general notion of how much a verb favours (or disfavours) a particular noun as a semantic argument.
For example, informally we would say that eat has a strong selectional preference for nouns of type food as its Theme argument. Formalization of this notion has been difficult, but several computational methods have now been proposed that capture selectional preference of a verb as a probability distribution over the WordNet hierarchy (Resnik, 1993; Li and Abe, 1998; Clark and Weir, 2002).² The key task that each of these proposals addresses is how to generalize appropriately from counts of observed nouns in the relevant verb argument position (in a corpus), to a probabilistic representation of selectional strength over classes. We will refer in the remainder of the paper to such a probability distribution over WordNet as a “selectional profile.” As mentioned above, McCarthy (2000) suggested the use of selectional profiles to capture generalizations over argument slots, so that two argument slots could be effectively compared for detecting alternations. After extracting the argument heads of the target slots of each verb (e.g., the intransitive subject and the transitive object for the causative alternation), she then determined their selectional profiles using a minimum description length tree cut model (Li and Abe, 1998).³ The two slot profiles were compared using skew divergence (a variant of [...]
² Resnik’s proposed measure is not actually a probability distribution, but a difference between probability distributions.
³ A tree cut for tree T is a set of nodes C in T such that every leaf node of T has exactly one member of C on a path between it and the root. As a selectional profile, a tree cut will have a non-zero probability associated with every node in C, and a zero probability for all other nodes in T. Figure 1 below has examples of two tree cuts.
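The tree cut definition and the idea of comparing two slot profiles can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration: the toy hierarchy PARENT stands in for WordNet, the probabilities are invented, and the function names are hypothetical. The skew divergence follows the usual formulation, the KL divergence of r from a mixture of q and r, with a smoothing weight alpha of 0.99 chosen here only as an example; the text above does not spell out the formula or the parameter. The sketch is not the SPD measure.

```python
import math

# Hypothetical toy hierarchy (child -> parent); a stand-in for WordNet.
PARENT = {
    "entity": None,
    "food": "entity",
    "artifact": "entity",
    "fruit": "food",
    "meat": "food",
    "tool": "artifact",
}

def leaves(parent):
    """Return the nodes that are not the parent of any other node."""
    internal = {p for p in parent.values() if p is not None}
    return [n for n in parent if n not in internal]

def path_to_root(node, parent):
    """Return the node itself followed by each ancestor up to the root."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path

def is_tree_cut(cut, parent):
    """True if every leaf has exactly one member of `cut` on its path to the root."""
    cut = set(cut)
    return all(
        sum(1 for n in path_to_root(leaf, parent) if n in cut) == 1
        for leaf in leaves(parent)
    )

def skew_divergence(q, r, alpha=0.99):
    """KL(r || alpha*q + (1-alpha)*r) over node -> probability dicts.

    alpha is an assumed smoothing weight; mixing in r keeps the result
    finite when q assigns zero probability to a node that r uses.
    """
    total = 0.0
    for node, r_p in r.items():
        if r_p > 0.0:
            mix = alpha * q.get(node, 0.0) + (1.0 - alpha) * r_p
            total += r_p * math.log(r_p / mix)
    return total

# Invented profiles for two slots, each defined on a valid tree cut.
subj_profile = {"food": 0.7, "artifact": 0.3}
obj_profile = {"fruit": 0.4, "meat": 0.4, "artifact": 0.2}
print(is_tree_cut(subj_profile, PARENT), is_tree_cut(obj_profile, PARENT))
print(skew_divergence(subj_profile, obj_profile))
```

Note that this sketch treats nodes as unrelated dictionary keys: probability on fruit and meat is counted as entirely different from probability on food, even though they are neighbours in the hierarchy. Ignoring distances between nodes in this way is the kind of limitation that the SPD measure is designed to address.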
Publication date: 2004