Feature based Entailment Recognition for Malayalam Language Texts

Abstract

Textual entailment is a relationship between two text fragments, namely, the text/premise and the hypothesis. It has applications in question answering systems, multi-document summarization, information retrieval, and social network analysis. In the era of the digital world, recognizing semantic variability is important in understanding inferences in texts. The texts are either in the form of sentences, posts, tweets, or user experiences. Hence, understanding inferences from customer experiences helps companies in segmentation. The availability of textual data is ever-growing in almost all languages, including low-resource languages. This work deals with various machine learning approaches applied to the recognition of textual entailment, or natural language inference, for Malayalam, a South Indian language. A performance-based analysis using machine learning classification techniques such as Logistic Regression, Decision Tree, Support Vector Machine, Random Forest, AdaBoost, and Naive Bayes is done on the MaNLI (Malayalam Natural Language Inference) dataset. Different lexical and surface-level features are used for this binary and multiclass classification. With the increasing size of the dataset, there is a drop in the performance of feature-based classification; a comparison of feature-based models with deep learning models highlights this inference. The main focus here is the 14 different features and their comparison, which is essential in any NLP problem.
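As a rough illustration of what lexical, surface-level features for a premise/hypothesis pair can look like, the sketch below computes three simple measures in plain Python. The abstract does not list the 14 features used in the paper, so the feature names (`overlap_ratio`, `jaccard`, `length_diff`), the helper functions, and the English example pair are all assumptions made for illustration, not the authors' feature set.

```python
# Illustrative surface-level features for a premise/hypothesis pair.
# ASSUMPTION: these three features are hypothetical examples; the paper's
# actual 14 features are not named in the abstract.

def tokenize(text):
    # Lowercase and split on whitespace; no language-specific processing.
    return text.lower().split()

def surface_features(premise, hypothesis):
    p_tokens = tokenize(premise)
    h_tokens = tokenize(hypothesis)
    p_set, h_set = set(p_tokens), set(h_tokens)
    shared = p_set & h_set
    union = p_set | h_set
    return {
        # Fraction of distinct hypothesis words that also occur in the premise.
        "overlap_ratio": len(shared) / len(h_set) if h_set else 0.0,
        # Jaccard similarity between the two vocabularies.
        "jaccard": len(shared) / len(union) if union else 0.0,
        # Absolute difference in token counts.
        "length_diff": abs(len(p_tokens) - len(h_tokens)),
    }

# English placeholder pair; a Malayalam pair would be passed the same way.
features = surface_features(
    "a man is playing a guitar",
    "a man is playing music",
)
```

A feature dictionary like this would typically be turned into a numeric vector and passed to a classifier such as those named in the abstract.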

Similar resources

A Subjective Feature Extraction for Sentiment Analysis in Malayalam Language

In recent days, sentiment analysis has become an active research area in NLP, which analyzes people's opinions, sentiments, evaluations, attitudes, and emotions from written language. The growing importance of sentiment analysis coincides with the growth of social media such as reviews, forum discussions, blogs, and social networks. In this paper, sentiment analysis of Malayalam film reviews is carried...

Feature Extraction Techniques for Recognition of Malayalam Handwritten Characters: Review

Character recognition is one of the most important areas in the field of pattern recognition. Recently, Indian handwritten character recognition has been getting much more attention, and researchers are contributing a lot to this field. But Malayalam, a South Indian language, has seen very little work in this area and needs further attention. Malayalam OCR is a complex task owing to the various character ...

Developing a pattern based on speech acts and language functions for developing materials for the course “The Study of Islamic Texts Translation”

The aim of the present study is to propose a model based on speech acts and language functions for developing materials for the course "The Study of Islamic Texts Translation". In the new model, to develop better and more engaging materials, and unlike existing books, Austin's (1962) model of speech act levels, Searle's (1976) classification of speech acts, and Halliday's (1978) language functions are used. For this purpose, 57 holy verses were randomly selected from different parts of the Quran...

A Wavelet Based Recognition System for Printed Malayalam Characters

This paper specifies an OCR system for printed Malayalam characters. Malayalam is the principal language of the South Indian state of Kerala. It belongs to the Dravidian language family. The input to the system is the scanned image of a page of text and the output is a machine-editable file. Malayalam character recognition is a complex task because of the presence of two scripts; old scri...

Subspace-Based Feature Representation and Learning for Language Recognition

This paper presents a novel subspace-based approach for phonotactic language recognition. The whole framework is divided into two parts: the speech feature representation and the subspace-based learning algorithm. First, the phonetic information as well as the contextual relationship possessed by spoken utterances are more abundantly retrieved by likelihood computation and feature concatenatio...

Journal

Journal title: International Journal of Advanced Computer Science and Applications

Year: 2022

ISSN: 2158-107X, 2156-5570

DOI: https://doi.org/10.14569/ijacsa.2022.0130283