The feature extraction for classifying words on social media with the Naïve Bayes algorithm
Authors
Abstract
Classification with the Naïve Bayes classifier (NBC) requires prior pre-processing and feature extraction. Generally, pre-processing eliminates unnecessary words, while feature extraction processes the words that remain. This paper focuses on feature extraction, in which calculations and searches are performed by applying word2vec and by weighting word frequency with term frequency-inverse document frequency (TF-IDF). The classification uses 1734 tweets from Twitter, which are defined as the documents for the TF-IDF weight calculation: for words that appear often across tweets the TF-IDF value decreases, and vice versa. After the word weights were obtained, classification was carried out on the test data, yielding an accuracy of 88.8% for the slack-word category and 78.79% for the verb category. It can be concluded that data in the form of words available on Twitter can be classified, and that words referring to slack words and verbs are classified with a fairly good level of accuracy. This reflects the habits of social media users.
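The TF-IDF weighting and Naïve Bayes classification described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy tweets, the labels "slang" and "standard", the function `tf_idf`, and the class `NaiveBayes` are all hypothetical, and the word2vec step is omitted. The abstract does not say how the TF-IDF weights are fed into the classifier, so the classifier here is trained on raw word counts with add-one smoothing, which is an assumption.

```python
import math
from collections import Counter, defaultdict


def tf_idf(docs):
    """Return one {word: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        # Term frequency times the log of the inverse document frequency.
        weights.append({
            word: (count / len(doc)) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return weights


class NaiveBayes:
    """Multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc)
        self.vocab = {word for doc in docs for word in doc}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n / total_docs)  # log prior
            for word in doc:
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                likelihood = (self.word_counts[label][word] + 1) / (
                    total_words + len(self.vocab)
                )
                score += math.log(likelihood)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data: each tweet is a tokenized document.
tweets = [
    ["gonna", "watch", "movie"],
    ["wanna", "eat", "now"],
    ["watch", "the", "game"],
    ["eat", "the", "cake"],
]
labels = ["slang", "slang", "standard", "standard"]

weights = tf_idf(tweets)
model = NaiveBayes().fit(tweets, labels)
prediction = model.predict(["gonna", "eat"])
```

In this toy corpus "watch" occurs in two tweets and "gonna" in only one, so "gonna" receives the larger TF-IDF weight in the first tweet, which matches the abstract's remark that the value decreases for words that come up often.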
Similar resources
The Algorithm for Solving the Inverse Numerical Range Problem
We denote the numerical range of a square matrix A by W(A) and define it as W(A) = {x*Ax : x ∈ S1}, where S1 is the unit sphere. In 2009, Russell Carden posed the inverse numerical range problem as follows: for a point z ∈ W(A), find a vector x ∈ S1 such that z = x*Ax. In this thesis, we present an algorithm for solving the inverse numerical range problem.
The Impact of Feature Extraction on the Performance of a Classifier: kNN, Naïve Bayes and C4.5
“The curse of dimensionality” is pertinent to many learning algorithms, and it denotes the drastic rise of computational complexity and classification error in high dimensions. In this paper, different feature extraction techniques as means of (1) dimensionality reduction, and (2) constructive induction are analyzed with respect to the performance of a classifier. Three commonly used class...
A Naïve Bayes Approach to Classifying Topics in Suicide Notes
The authors present a system developed for the 2011 i2b2 Challenge on Sentiment Classification, whose aim was to automatically classify sentences in suicide notes using a scheme of 15 topics, mostly emotions. The system combines machine learning with a rule-based methodology. The features used to represent a problem were based on lexico-semantic properties of individual words in addition to reg...
The Effect of Risk Aversion on the Demand for Life Insurance: The Case of Iranian Life Insurance Market
Abstract: About 60% of the insurance industry's total premium worldwide pertains to life policies, while in Iran life insurance accounted for less than 6% of the industry's total premium in 2008 (Sigma, No 3/2009). Among the reasons that discourage the life insurance industry is the problem of adverse selection. Adverse selection theory describes a situation where the inf...
Bug Classification: Feature Extraction and Comparison of Event Model using Naïve Bayes Approach
In software industries, individuals at different levels from customer to an engineer apply diverse mechanisms to detect to which class a particular bug should be allocated. Sometimes while a simple search in Internet might help, in many other cases a lot of effort is spent in analyzing the bug report to classify the bug. So there is a great need of a structured mining algorithm where given a cr...
Journal
Journal title: IAES International Journal of Artificial Intelligence
Year: 2022
ISSN: 2089-4872, 2252-8938
DOI: https://doi.org/10.11591/ijai.v11.i3.pp1041-1048