Search results for: term frequency and inverse document frequency tf idf

Number of results: 16977020

Journal: CoRR 2015
Xinyu Fu Eugene Ch'ng Uwe Aickelin Lanyun Zhang

Novelty detection in news events has long been a difficult problem. A number of models performed well on specific data streams but certain issues are far from being solved, particularly in large data streams from the WWW where unpredictability of new terms requires adaptation in the vector space model. We present a novel event detection system based on the Incremental Term Frequency-Inverse Doc...

2016
Ryan Wong Hyun Sik Kim

Our project attempts to simplify the search process for selecting multiple venues for a single outing using Yelp. We have developed a machine learning model that recommends a complementary venue (such as a café) based on a restaurant searched by a user. Using a binary classifier, complementary venues were scored (great venue or mediocre / poor venue) based on unigrams and bigrams in review text...

2012
Ketut Eddy Purnama

This paper aims to classify texts in the Indonesian language into emotion expression classes. The data were taken from 6 basic emotion classes whose training documents and test documents were obtained from articles in www.kompas.com, www.suaramerdeka.com, and www.detik.com. The text weighting was processed by using the TF-IDF method, which is an integration of Term Frequency (TF) and Inverse Document Frequ...
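
As a rough illustration of the weighting this abstract describes, the following Python sketch multiplies a term's frequency in a document by its inverse document frequency. The raw-count TF, the natural-log IDF, the function name `tf_idf_weights`, and the toy Indonesian tokens are all assumptions made for this example; the snippet does not say which variant the paper uses.

```python
import math
from collections import Counter


def tf_idf_weights(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: weight = raw term count * log(N / df),
    where N is the number of documents and df is the number
    of documents containing the term.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document

    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {term: count * math.log(n_docs / doc_freq[term])
             for term, count in counts.items()}
        )
    return weights


# Hypothetical toy corpus (already tokenized), not data from the paper.
toy_docs = [
    ["senang", "sekali", "senang"],
    ["sedih", "sekali"],
    ["marah"],
]
toy_weights = tf_idf_weights(toy_docs)
```

In this toy corpus "senang" occurs twice in the first document and nowhere else, so it receives a higher weight than "sekali", which is shared by two documents.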

Journal: International journal of medical informatics 2002
Patrick Ruch Robert H. Baud Antoine Geissbühler

Unlike journal corpora, which are supposed to be carefully reviewed before being published, the quality of documents in a patient record is often corrupted by misspelled words and conventional graphies or abbreviations. After a survey of the domain, the paper focuses on evaluating the effect of such corruption on an information retrieval (IR) engine. The IR system uses a classical bag of words ...

2008
Muath Alzghool Diana Inkpen

In this paper we present a new method for combining the results of different models in order to improve the performance on a difficult task: Information Retrieval from spontaneous speech. Our technique is based on clustering the training topics according to their tf-idf (term frequency-inverse document frequency) properties, and selecting the best models for each cluster. When the system runs o...

Journal: ACM Transactions on Asian and Low-Resource Language Information Processing 2023

Hate speech and Offensive Posts (OP) detection on Smart Multimedia Internet of Things (MIoT) has been an active issue for researchers. MIoT media texts in non-native English-speaking countries are often code-mixed or script mixed/switched. This paper proposes an ensemble-based Deep Learning (DL) framework comprising a Convolutional Neural Network (CNN) and a Dense Neural Network (DNN) for identifying hate and OP in Malayalam Cod...

Journal: Sustainability 2021

With an increase in ethical awareness, people have begun to criticize the unethical issues associated with the use of animal materials. This study focused on the transition of global consumers’ awareness toward vegan materials and the relationship between interest subjects such as animals, environment, [...]. For this purpose, posts about fur/fake fur and leather/fake leather uploaded to Google and Twitter from 2008 to 2019 were ...

2000
Kevin Prey James C. French Allison L. Powell Charles L. Viles

INTRODUCTION: Full text searching over a database of moderate size often uses the inverse document frequency, idf = log(N/df), as a component in term weighting functions used for document indexing and retrieval. However, in very large databases (e.g. internet search engines), there is the potential that the collection size (N) could dominate the idf value, decreasing the usefulness of idf as a t...
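
To make the concern in this abstract concrete, here is a small Python sketch of idf = log(N/df) evaluated at growing collection sizes. The document frequencies (10 and 1000), the collection sizes, and the natural-log base are hypothetical choices for illustration, not figures from the paper.

```python
import math


def idf(n_docs: int, doc_freq: int) -> float:
    """Inverse document frequency, idf = log(N / df), as quoted in the abstract."""
    if doc_freq <= 0 or doc_freq > n_docs:
        raise ValueError("doc_freq must satisfy 0 < doc_freq <= n_docs")
    return math.log(n_docs / doc_freq)


# Hypothetical document frequencies for a rare and a more common term.
RARE_DF, COMMON_DF = 10, 1000

# Ratio of the two idf values at each (hypothetical) collection size N.
idf_ratios = {}
for n_docs in (10_000, 1_000_000, 1_000_000_000):
    rare, common = idf(n_docs, RARE_DF), idf(n_docs, COMMON_DF)
    idf_ratios[n_docs] = rare / common
    print(f"N={n_docs}: idf(rare)={rare:.2f}, idf(common)={common:.2f}, "
          f"ratio={idf_ratios[n_docs]:.2f}")
```

Under these assumptions the difference between the two idf values stays fixed at log(100), while their ratio shrinks toward 1 as N grows. That is one way to read the statement that N can come to dominate the idf value.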

Journal: Buildings 2023

Construction accident investigation reports contain critical information, but extracting useful insights from the voluminous Chinese text is challenging. Traditional methods rely on expert judgment, which leads to time-consuming and potentially inaccurate results. To overcome this problem, we propose a novel approach that combines mining techniques with latent Dirichlet allocation (LDA) models analy...

Thesis: Ministry of Science, Research and Technology - Graduate University of Advanced Technology, Kerman - Faculty of Electrical and Computer Engineering 1390

A phase-locked loop (PLL) based frequency synthesizer is an important circuit that is used in many applications, especially in communication systems such as Ethernet receivers, disk drive read/write channels, digital mobile receivers, high-speed memory interfaces, system clock recovery and wireless communication systems. Other than requiring good signal purity such as low phase noise and low spu...
