corpus analysis

Search results for: corpus analysis

Number of results: 2874375

Nominal Expressions in Multilingual Corpora: Definites and Demonstratives

2002

Susanne Salmon-Alt Renata Vieira

This paper presents the results of a multilingual corpus study on definite descriptions and demonstrative noun phrases. The analysis of a parallel corpus (French and Portuguese) reinforces previous findings on the predominance of non-anaphoric uses of definite descriptions in English corpora. It is also shown that the use of demonstrative noun phrases, on the other hand, is more regu...

Investigating Redundancy in Emoji Use: Study on a Twitter Based Corpus

2017

Giulia Donato Patrizia Paggio

In this paper we present an annotated corpus created with the aim of analyzing the informative behaviour of emoji – an issue of importance for sentiment analysis and natural language processing. The corpus consists of 2475 tweets, all containing at least one emoji, which has been annotated with one of three possible classes: Redundant, Non Redundant, and Non Redundant + POS. We explain how ...

Sentiment Summerization and Analysis of Sindhi Text

2017

Mazhar Ali Asim Imdad Wagan

A text corpus is important for the assessment of language features and for variation analysis. Machine learning techniques identify language terms, features, text structures and sentiment from a linguistic corpus. Sindhi is one of the oldest languages in the world, with a proper script and a complete grammar. Computationally, Sindhi has remained a less-resourced language even in this digital era. Vie...

Corpus and Text Analysis of Spontaneous Japanese

2003

Hitoshi Isahara

The “Spontaneous Speech: Corpus and Processing Technology” project has three major parts: (1) compiling a large spontaneous speech corpus, (2) establishing spoken language engineering based on the corpus, and (3) developing a prototype of a spoken language summarization system. This paper describes how we help to develop this large corpus, i.e., (1), using technology developed a...

Corpus Analysis Tools for Computational Hook Discovery

2015

Jan Van Balen John Ashley Burgoyne Dimitrios Bountouridis Daniel Müllensiefen Remco C. Veltkamp

Compared to studies with symbolic music data, advances in music description from audio have overwhelmingly focused on ground truth reconstruction and maximizing prediction accuracy, with only a small fraction of studies using audio description to gain insight into musical data. We present a strategy for the corpus analysis of audio data that is inspired by the FANTASTIC toolbox and optimized fo...

Morphological Analysis of the Spontaneous Speech Corpus

2002

Kiyotaka Uchimoto Chikashi Nobata Atsushi Yamada Satoshi Sekine Hitoshi Isahara

This paper describes a project tagging a spontaneous speech corpus with morphological information such as word segmentation and parts of speech. We use a morphological analysis system based on a maximum entropy model, which is independent of the domain of the corpora. In this paper we show the tagging accuracy achieved by using the model and discuss problems in tagging the spontaneous speech corpus....

Preliminary Analysis of a Slavic Parallel Corpus

2009

Emmerich Kelih

The focus of this paper is on a detailed description of a newly developed parallel corpus of Slavic languages. It consists of 11 Slavic translations of the well-known Russian socialist realist novel “Kak zakaljalas’ stal’/How the steel was tempered” (KZS), written by N.A. Ostrovskij in the years 1932-34. The KZS contains the Slovene, Croatian, Serbian (ekavian), Macedonian, Bulgarian, Ukrainian,...

Using Bidirectional Chart Parsing for Corpus Analysis

2007

A. Ageno H. Rodriguez

Several experiments have been developed around a bidirectional island-driven chart parser. The system basically follows the approach of Stock, Satta and Corazza, and the experiments have been designed and performed with the purpose of examining several ways of improvement: basic strategy of the algorithm (pure island-driven vs mixed island-driven/bottom-up approaches), strategies for extending...

An Approach to Evaluate Existing Ontologies for Indexing a Document Corpus

2004

Nathalie Hernandez Josiane Mothe

Using ontologies for IR is one of the key issues for indexing. It is not always easy to decide which ontology to use according to the corpus to index. We propose to define measures reflecting the adequacy of an ontology to a corpus. The goal of these measures is to evaluate whether an ontology suits a corpus, but also to compare the adequacy of two or more ontologies to the same corpus. The measures...

Proceedings of the Workshop on Innovative Corpus

2015

Nils Blomqvist Gintarė Grigonytė Simon Clematide Andrius Utka Martin Volk

The task-oriented and format-driven development of corpus query systems has led to the creation of numerous corpus query languages (QLs) that vary strongly in expressiveness and syntax. This is a severe impediment for the interoperability of corpus analysis systems, which lack a common protocol. In this paper, we present KoralQuery, a JSON-LD based general corpus query protocol, aiming to be in...
