FaDA: Fast Document Aligner using Word Embedding



Similar resources

FaDA: Fast Document Aligner using Word Embedding

FaDA is a free/open-source tool for aligning multilingual documents. It employs a novel crosslingual information retrieval (CLIR)-based document-alignment algorithm involving the distances between embedded word vectors in combination with the word overlap between the source-language and the target-language documents. In this approach, we initially construct a pseudo-query from a source-languag...


Acronym Disambiguation Using Word Embedding

According to the website AcronymFinder.com which is one of the world's largest and most comprehensive dictionaries of acronyms, an average of 37 new human-edited acronym definitions are added every day. There are 379,918 acronyms with 4,766,899 definitions on that site up to now, and each acronym has 12.5 definitions on average. It is a very important research topic to identify what exactly an ...


A New Document Embedding Method for News Classification

Text classification is one of the main tasks of natural language processing (NLP). In this task, documents are classified into pre-defined categories. There is a lot of news spreading on the web. A text classifier can categorize news automatically, which facilitates and accelerates access to the news. The first step in text classification is to represent documents in a suitable way t...


Designing an Improved Discriminative Word Aligner

The quality of statistical machine translation systems depends on the quality of the word alignments computed during the translation model training phase. IBM generative alignment models, despite their poor quality compared to a gold standard, perform well in practice. In this paper, we propose an improved word aligner based on a maximum entropy alignment combination model, which employs better...


NATools - A statistical Word Aligner Workbench

This document presents the TerminUM project and the work done in its statistical word aligner workbench (NATools). It shows a variety of alignment methods for parallel corpora and discusses the resulting terminological dictionaries and their use: evaluation of sentence translations; construction of a multi-level navigation system for linguistic studies or statistical translations.



Journal

Journal title: The Prague Bulletin of Mathematical Linguistics

Year: 2016

ISSN: 1804-0462

DOI: 10.1515/pralin-2016-0016