Search results for: corpus analysis

Number of results: 2874375

2002
Susanne Salmon-Alt Renata Vieira

This paper presents the results of a multilingual corpus study on definite descriptions and demonstrative noun phrases. The analysis of a parallel corpus (French and Portuguese) reinforces previous findings regarding the predominance of non-anaphoric uses of definite descriptions in English corpora. It is also shown that the use of demonstrative noun phrases, on the other hand, is more regu...

2017
Giulia Donato Patrizia Paggio

In this paper we present an annotated corpus created with the aim of analyzing the informative behaviour of emoji, an issue of importance for sentiment analysis and natural language processing. The corpus consists of 2475 tweets, all containing at least one emoji, each annotated with one of three possible classes: Redundant, Non Redundant, and Non Redundant + POS. We explain how ...

2017
Mazhar Ali Asim Imdad Wagan

A text corpus is important for assessing language features and analyzing variation. Machine learning techniques identify language terms, features, text structures and sentiment from a linguistic corpus. Sindhi is one of the oldest languages in the world, with a proper script and a complete grammar. Yet Sindhi has remained a computationally less-resourced language even in this digital era. Vie...

2003
Hitoshi Isahara

The “Spontaneous Speech: Corpus and Processing Technology” project has three major parts: (1) compilation of a large spontaneous speech corpus, (2) establishment of spoken language engineering based on the corpus, and (3) development of a prototype of a spoken language summarization system. This paper describes how we help to develop this large corpus, i.e., (1), using technology developed a...

2015
Jan Van Balen John Ashley Burgoyne Dimitrios Bountouridis Daniel Müllensiefen Remco C. Veltkamp

Compared to studies with symbolic music data, advances in music description from audio have overwhelmingly focused on ground truth reconstruction and maximizing prediction accuracy, with only a small fraction of studies using audio description to gain insight into musical data. We present a strategy for the corpus analysis of audio data that is inspired by the FANTASTIC toolbox and optimized fo...

2002
Kiyotaka Uchimoto Chikashi Nobata Atsushi Yamada Satoshi Sekine Hitoshi Isahara

This paper describes a project tagging a spontaneous speech corpus with morphological information such as word segmentation and parts-of-speech. We use a morphological analysis system based on a maximum entropy model, which is independent of the domain of corpora. In this paper we show the tagging accuracy achieved by using the model and discuss problems in tagging the spontaneous speech corpus....

2009
Emmerich Kelih

The focus of this paper is on a detailed description of a newly developed parallel corpus of Slavic languages. It consists of 11 Slavic translations of the well-known Russian socialist realist novel “Kak zakaljalas’ stal’/How the steel was tempered” (KZS), written by N.A. Ostrovskij in the years 1932-34. The KZS contains the Slovene, Croatian, Serbian (ekavian), Macedonian, Bulgarian, Ukrainian,...

2007
A. Ageno H. Rodriguez

Several experiments have been developed around a bidirectional island-driven chart parser. The system basically follows the approach of Stock, Satta and Corazza, and the experiments have been designed and performed with the purpose of examining several ways of improvement: basic strategy of the algorithm (pure island-driven vs mixed island-driven/bottom-up approaches), strategies for extending...

2004
Nathalie Hernandez Josiane Mothe

Using ontologies for IR is one of the key issues for indexing. It is not always easy to decide which ontology to use for the corpus to be indexed. We propose to define measures reflecting the adequacy of an ontology to a corpus. The goal of these measures is to evaluate whether an ontology suits a corpus, but also to compare the adequacy of two or more ontologies to the same corpus. The measures...

2015
Nils Blomqvist Gintarė Grigonytė Simon Clematide Andrius Utka Martin Volk

The task-oriented and format-driven development of corpus query systems has led to the creation of numerous corpus query languages (QLs) that vary strongly in expressiveness and syntax. This is a severe impediment for the interoperability of corpus analysis systems, which lack a common protocol. In this paper, we present KoralQuery, a JSON-LD based general corpus query protocol, aiming to be in...
