Book Review: Syntax-Based Collocation Extraction
Abstract
Collocation is a common language phenomenon that has attracted the interest of researchers in many subfields of both theoretical and computational linguistics. Although there is no commonly accepted and precise definition of this phenomenon, collocations are generally understood as complex lexical items, often characterized as unpredictable, idiosyncratic, holistic, mutually selective, and so forth. Together with other types of multiword expressions (or phraseological units, such as compound nouns, phrasal verbs, and idioms), collocations form a borderline phenomenon positioned between lexis and grammar: on the one hand, they are unpredictable and must be learned as whole units, in the same way as single words; on the other hand, they often have internal syntactic structure, and their components must then adhere to grammatical rules. Collocations play an important role in applications involving text production (e.g., machine translation and language generation), text analysis (e.g., parsing and word sense disambiguation), and other related tasks (such as information extraction and text classification). The book Syntax-Based Collocation Extraction by Violeta Seretan is based on her doctoral dissertation, defended in 2008 at the Department of Linguistics, University of Geneva, under the supervision of Eric Wehrli, and refers to a number of their previous publications. The main text is divided into six chapters (amounting to 128 pages) and six appendices (70 pages). The first chapter can be regarded as the motivation for the whole work. It introduces the notion of collocation, explains its relevance and importance for natural language processing, specifies the aims of the work, and, most importantly, presents arguments for syntax-based collocation extraction as a more appropriate alternative to the traditional syntax-free n-gram and window-based techniques.
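The contrast between window-based and syntax-based candidate selection can be illustrated with a toy sketch. This is not the extraction method described in the book; the function names, the example sentence, the hand-written dependency triples, and the relation labels (`obj`, `amod`) are all assumptions made for illustration only.

```python
def window_pairs(tokens, window=2):
    """Syntax-free candidates: every pair of tokens at most `window` positions apart."""
    pairs = []
    for i, left in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            pairs.append((left, tokens[j]))
    return pairs


def syntactic_pairs(dependencies, relations=("obj", "amod")):
    """Syntax-based candidates: (head, dependent) pairs linked by a chosen relation."""
    return [(head, dep) for head, rel, dep in dependencies if rel in relations]


# Hypothetical example sentence and a hand-written (not parser-produced) analysis.
tokens = ["she", "made", "a", "very", "difficult", "decision"]
dependencies = [
    ("made", "nsubj", "she"),
    ("made", "obj", "decision"),
    ("decision", "det", "a"),
    ("decision", "amod", "difficult"),
    ("difficult", "advmod", "very"),
]

print(window_pairs(tokens, window=2))
print(syntactic_pairs(dependencies))
```

In this toy case the verb and its object are four positions apart, so a window of two misses the pair ("made", "decision") while still producing pairs such as ("she", "a"); the relation-based filter keeps only the verb-object and adjective-noun pairs. A real system would obtain the dependency triples from a parser rather than by hand.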
Similar resources
GRASP: Grammar- and Syntax-based Pattern-Finder for Collocation and Phrase Learning
We introduce a method for learning to find the representative syntax-based context of a given collocation/phrase. In our approach, grammatical patterns are extracted for query terms aimed at accelerating lexicographers’ and language learners’ navigation through the word usage and learning process. The method involves automatically lemmatizing, part-of-speech tagging and shallowly parsing the se...
Construction of Semantic Collocation Bank Based on Semantic Dependency Parsing
Collocation has always been an important issue in language research, especially in Chinese language research. Chinese is an isolating language, which lacks morphological changes. Establishing a relatively complete dictionary of Chinese collocation will be a great contribution to Chinese study and research. Collocation plays a significant supporting role in many fields of NLP, such as informatio...
Inducing Discourse Connectives from Parallel Texts
Discourse connectives (e.g. however, because) are terms that explicitly express discourse relations in a coherent text. While a list of discourse connectives is useful for both theoretical and empirical research on discourse relations, few languages currently possess such a resource. In this article, we propose a new method that exploits parallel corpora and collocation extraction techniques to...
The Construction of a Chinese Collocational Knowledge Resource and Its Application for Second Language Acquisition
The appropriate use of collocations is a challenge for second language acquisition. However, high-quality and easily accessible Chinese collocation resources are not available to either teachers or students. This paper presents the design and construction of a large-scale resource of Chinese collocational knowledge, and a web-based application (OCCA, Online Chinese Collocation Assistant) which ...
Sentence Compression for Target-Polarity Word Collocation Extraction
Target-polarity word (T-P) collocation extraction, a basic sentiment analysis task, relies primarily on syntactic features to identify the relationships between targets and polarity words. A major problem of current research is that this task focuses on customer reviews, which are natural or spontaneous, thus posing a challenge to syntactic parsers. We address this problem by proposing a framew...