Construindo corpora bilíngues quimbundo-português-quimbundo / Building Kimbundu-Portuguese-Kimbundu bilingual corpora

Similar resources

Portuguese Corpora at CLUL

The Corpus de Referência do Português Contemporâneo (CRPC) has been under development at the Centro de Linguística da Universidade de Lisboa (CLUL) since 1988, with the aim of enlarging research data so that concepts and hypotheses can be verified without relying solely on intuitive data. The intention of creating this open corpus is to establish an on-line representative sample collec...

Building bilingual terminologies from comparable corpora: the TTC TermSuite

In this paper, we exploit domain-specific comparable corpora to build bilingual terminologies. We present the monolingual term extraction and the bilingual alignment that allow us to identify and translate highly specialised terminology. We stress the importance of taking into account both simple and complex terms in a multilingual environment. Such linguistic diversity implies to combi...

Building Comparable Corpora Based on Bilingual LDA Model

Comparable corpora are important basic resources in cross-language information processing. However, existing methods of building comparable corpora, which rely on mutually translatable words and related features, cannot evaluate the topical relation between document pairs. This paper adopts the bilingual LDA model to predict the topical structures of the documents and proposes three algorithms of doc...

Building Carefully Tagged Bilingual Corpora to Cope with Linguistic Idiosyncrasy

We illustrate the effectiveness of a medium-sized, carefully tagged bilingual core corpus, which we call “semantic typology patterns”, and give examples as concrete evidence of its usefulness. The most important characteristic of these semantic typology patterns is the bridging mechanism between two languages, which is based on sequences of syntactic codes and semantic codes. This...

Bilingual Lexicon Extraction from Comparable Corpora Enhanced with Parallel Corpora

In this article, we present a simple and effective approach for extracting a bilingual lexicon from comparable corpora enhanced with parallel corpora. We make use of structural characteristics of the documents comprising the comparable corpus to extract high-quality parallel sentences. We then use state-of-the-art techniques to build a specialized bilingual lexicon from these sen...

Journal

Journal title: REVISTA DE ESTUDOS DA LINGUAGEM

Year: 2021

ISSN: 2237-2083,0104-0588

DOI: 10.17851/2237-2083.29.2.771-803