A Language Modelling Tool for Statistical NLP

Authors

  • Daniel Bastos Pereira
  • Ivandré Paraboni
Abstract

In recent years the use of statistical language models (SLMs) has become widespread in most NLP fields. In this work we introduce jNina, a basic language modelling tool to aid the development of Machine Translation systems and many other text-generating applications. The tool allows for the quick comparison of multiple text outputs (e.g., alternative translations of a single source) based on a given SLM, and enables the user to build and evaluate her own SLMs from any corpora provided.
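As a rough illustration of the kind of comparison the abstract describes, the sketch below trains a small bigram model on a corpus and ranks alternative outputs by their score under it. It is not jNina's implementation or API: the function names, the whitespace tokenisation, the add-one smoothing and the use of perplexity as the evaluation measure are all assumptions made for this example.

```python
import math
from collections import Counter


def train_bigram_counts(sentences):
    """Collect unigram and bigram counts from raw sentences (hypothetical helper)."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        # Naive whitespace tokenisation with sentence boundary markers.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a sentence."""
    vocab_size = len(unigrams)
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        # Unseen bigrams (and unseen words) fall back to the smoothed floor.
        total += math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))
    return total


def perplexity(sentence, unigrams, bigrams):
    """Per-transition perplexity, so outputs of different lengths are comparable."""
    n_transitions = len(sentence.split()) + 1
    return math.exp(-log_prob(sentence, unigrams, bigrams) / n_transitions)


def rank_candidates(candidates, unigrams, bigrams):
    """Order alternative outputs from most to least fluent under the model."""
    return sorted(candidates, key=lambda c: perplexity(c, unigrams, bigrams))


if __name__ == "__main__":
    corpus = ["the cat sat on the mat", "the dog sat on the rug"]
    uni, bi = train_bigram_counts(corpus)
    for candidate in rank_candidates(
        ["mat the on sat cat the", "the cat sat on the mat"], uni, bi
    ):
        print(f"{perplexity(candidate, uni, bi):8.2f}  {candidate}")
```

Ranking by perplexity rather than raw log-probability is a design choice made here so that longer candidates are not penalised merely for their length; a real tool would typically also use higher-order n-grams and a stronger smoothing method.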

Similar resources

NLP-SIR: A Natural Language Approach for Spreadsheet Information Retrieval

Spreadsheets are a ubiquitous software tool, used for a wide variety of tasks such as financial modelling, statistical analysis and inventory management. Extracting meaningful information from such data can be a difficult task, especially for novice users unfamiliar with the advanced data processing features of many spreadsheet applications. We believe that through the use of Natural Language P...

Statistical Markovian Data Modeling for Natural Language Processing

Markov chain theory is a popular statistical tool in applied probability that is quite useful in modelling real-world computing applications. Over the past years, there has been growing interest in employing Markov chain theory in statistical learning of temporal (i.e. time series) data. A wide range of applications have been found to utilize Markov concepts; such applications include computational linguists,...

The MultiTal NLP tool infrastructure

This paper gives an overview of the MultiTal project, which aims to create a research infrastructure that ensures long-term distribution of NLP tools descriptions. The goal is to make NLP tools more accessible and usable to end-users of different disciplines. The infrastructure is built on a meta-data scheme modelling and standardising multilingual NLP tools documentation. The model is conceptu...

Trameur: A Framework for Annotated Text Corpora Exploration

Corpus resources with complex linguistic annotations are becoming increasingly important in the work of language specialists. They often need to perform extensive corpus research, including Natural Language Processing (NLP), statistical modelling and data visualisation. Our software system, called Trameur, aims at making these analyses possible within a single graphical user interface. It relie...

Deep Unsupervised Feature Learning for Natural Language Processing

Statistical natural language processing (NLP) builds models of language based on statistical features extracted from the input text. We investigate deep learning methods for unsupervised feature learning for NLP tasks. Recent results indicate that features learned using deep learning methods are not a silver bullet and do not always lead to improved results. In this work we hypothesise that thi...

Publication date: 2007