TANA: The amalgam neural architecture for sarcasm detection in Indian indigenous language combining LSTM and SVM with word-emoji embeddings
Abstract
Sentiment analysis is a difficult task owing to the playful language mannerisms, altered vocabulary and speak-text used on online forums. Humans tend to use words and phrases in ways that are incomprehensible to those who are not involved in the discourse. Sarcastic remarks in conversations are often utilized to mock others by saying something that isn't pleasant; sardonic or humorous statements and tones insult others and make them appear puerile. Automated sarcasm detection is considered one of the key tasks for tweaking sentiment analysis, and extending it to the morphologically rich, free-word-order, dominant indigenous Indian language Hindi is a further challenge. This research puts forward 'The Amalgam Neural Architecture', TANA, to detect sarcasm in Hindi tweets. The architecture is trained using two embeddings, namely word and emoji embeddings, and combines an LSTM with an SVM loss function in the last layer for detection. We use the Sarc-H dataset, which is built by scraping tweets and manually annotating them based on the hashtags pronounced 'kataaksh' (which means sarcasm in Hindi) and 'vyangya' used by tweeters. The results are evaluated with various classification performance metrics; TANA achieves an F-score of 0.9675, outperforming a last-layer softmax as well as existing works.
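The core design described in the abstract, an LSTM encoder over concatenated word and emoji embeddings whose final layer is trained with an SVM-style hinge loss instead of softmax, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the vocabulary sizes, embedding dimensions, sequence length and layer widths are placeholders.

```python
# Minimal sketch (assumptions, not the paper's implementation) of an LSTM
# classifier whose output layer behaves like a linear SVM: a linear Dense
# unit with L2 weight regularisation trained with hinge loss on {-1, +1}
# labels (sarcastic vs. non-sarcastic), instead of a softmax layer.
import tensorflow as tf
from tensorflow.keras import layers, Model

VOCAB_SIZE, EMOJI_VOCAB = 20000, 500        # assumed vocabulary sizes
WORD_DIM, EMOJI_DIM, SEQ_LEN = 300, 50, 40  # assumed dims / tweet length

word_in = layers.Input(shape=(SEQ_LEN,), name="word_ids")
emoji_in = layers.Input(shape=(SEQ_LEN,), name="emoji_ids")

word_emb = layers.Embedding(VOCAB_SIZE, WORD_DIM)(word_in)
emoji_emb = layers.Embedding(EMOJI_VOCAB, EMOJI_DIM)(emoji_in)

# Combine the word and emoji representations token-wise, then encode the
# sequence with an LSTM.
x = layers.Concatenate(axis=-1)([word_emb, emoji_emb])
x = layers.LSTM(128)(x)

# Linear output + L2 regularisation + hinge loss = soft-margin SVM head.
out = layers.Dense(1, activation="linear",
                   kernel_regularizer=tf.keras.regularizers.l2(1e-3))(x)

model = Model(inputs=[word_in, emoji_in], outputs=out)
model.compile(optimizer="adam", loss="hinge", metrics=["accuracy"])
model.summary()
```

Swapping the hinge-loss head for a `Dense(2, activation="softmax")` layer with categorical cross-entropy gives the softmax baseline the abstract compares against.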
Similar resources
learners’ attitudes toward the effectiveness of mobile-assisted language learning (mall) in vocabulary acquisition in the iranian efl context: the case of word lists, audiobooks and dictionary use
The explosive growth of technology has placed exciting new educational opportunities before learners and educators. Today, in order to stay up to date in language teaching, teachers must adopt methods in which technology is used to assist second and additional language learning. Considering the developments taking place in the field of language teaching, now is a suitable time to evaluate existing attitudes toward new technologies...
Character-Word LSTM Language Models
We present a Character-Word Long Short-Term Memory Language Model which both reduces the perplexity with respect to a baseline word-level language model and reduces the number of parameters of the model. Character information can reveal structural (dis)similarities between words and can even be used when a word is out-of-vocabulary, thus improving the modeling of infrequent and unknown words. By...
AutoExtend: Combining Word Embeddings with Semantic Resources
We present AutoExtend, a system that combines word embeddings with semantic resources by learning embeddings for non-word objects like synsets and entities and learning word embeddings which incorporate the semantic information from the resource. The method is based on encoding and decoding the word embeddings and is flexible in that it can take any word embeddings as input and does not need an...
Actionable and Political Text Classification using Word Embeddings and LSTM
In this work, we apply word embeddings and neural networks with Long Short-Term Memory (LSTM) to text classification problems, where the classification criteria are decided by the context of the application. We examine two applications in particular. The first is that of Actionability, where we build models to classify social media messages from customers of service providers as Actionable or N...
Convolutional Neural Network with Word Embeddings for Chinese Word Segmentation
Character-based sequence labeling framework is flexible and efficient for Chinese word segmentation (CWS). Recently, many character-based neural models have been applied to CWS. While they obtain good performance, they have two obvious weaknesses. The first is that they heavily rely on manually designed bigram feature, i.e. they are not good at capturing n-gram features automatically. The secon...
Journal
Journal title: Pattern Recognition Letters
Year: 2022
ISSN: 1872-7344, 0167-8655
DOI: https://doi.org/10.1016/j.patrec.2022.05.026