Search results for: term frequency and inverse document frequency tf idf

Number of results: 16977020

Journal: :Computers, materials & continua 2022

Nowadays, people use online resources such as educational videos and courses. However, such courses are mostly long; thus, summarizing them will be valuable. The video contents (visual, audio, subtitles) could be analyzed to generate textual summaries, i.e., notes. Videos’ subtitles contain significant information. Therefore, it is effective to concentrate on the necessary details. Most of existing studies us...

Journal: :Advances in multimedia 2022

TF-IDF (term frequency-inverse document frequency) is one of the traditional text similarity calculation methods based on statistics. Because TF-IDF does not consider the semantic information of words, it cannot accurately reflect the similarity between texts, and enhanced methods distinguish documents poorly because extended vectors with similar terms aggravate the curse of dimensionality. Aiming at this problem, this paper advances a hybrid...
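The limitation described in this abstract can be illustrated with a minimal sketch. It is not the paper's hybrid method, only a plain TF-IDF baseline with cosine similarity. The function names `tfidf_vectors` and `cosine`, the toy sentences, and the specific weighting (relative term frequency times `log(N / df)`) are assumptions chosen for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting: (count / document length) * log(N / document frequency).
    """
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Toy corpus (hypothetical): documents 0 and 1 mean roughly the same thing
# but share no content words; documents 0 and 2 share the word "car".
docs = [
    "the car is fast".split(),
    "the automobile is quick".split(),
    "the car is red".split(),
]
vecs = tfidf_vectors(docs)
sim_synonyms = cosine(vecs[0], vecs[1])  # synonyms only, no shared content terms
sim_shared = cosine(vecs[0], vecs[2])    # one shared content term
```

In this toy corpus the two paraphrased sentences get a similarity of zero, because the only words they share occur in every document and so receive zero weight, while the sentences that share "car" get a positive score. This is the sense in which purely statistical TF-IDF ignores word meaning.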

Journal: :Nucleation and Atmospheric Aerosols 2021

"Fake News (FNs) is defined as a made-up story to deceive or mislead." The problem of FNs has spread widely in recent years, especially on social media such as Facebook and Twitter, and on other sources like websites and blogs. It has become a significant problem in society as a result of changing people's ideas and opinions about the direction of this news. In this paper, FN detection is proposed using Term Frequency-Inverse Document Frequenc...

Journal: :Inf. Sci. 2004
Verayuth Lertnattee Thanaruk Theeramunkong

Most of traditional text categorization approaches utilize term frequency (tf) and inverse document frequency (idf) for representing importance of words and/or terms in classifying a text document. This paper describes an approach to apply term distributions, in addition to tf and idf, to improve performance of centroid-based text categorization. Three types of term distributions, called inter-...

Journal: :Indonesian Journal of Electrical Engineering and Computer Science 2022

Increased advancement in a variety of study subjects and information technologies has increased the number of published research articles. However, researchers face difficulties and must devote a significant amount of time to locating scientific publications relevant to their domain of expertise. In this article, an approach to document classification is presented to cluster the text documents of research articles into expressive g...

Journal: :CoRR 2014
Michael Stewart

Specificity is important for extracting collocations, keyphrases, multi-word and index terms [Newman et al. 2012]. It is also useful for tagging, ontology construction [Ryu and Choi 2006], and automatic summarization of documents [Louis and Nenkova 2011, Chali and Hassan 2012]. Term frequency and inverse-document frequency (TF-IDF) are typically used to do this, but fail to take advantage of th...

2014
Puneet Goswami Vidya Kamath

The tf-idf is an algorithm which is generally used where massive data processing is done. Tf-idf is the weight given to a particular term within a document and it is proportional to the importance of the term. This paper aims to use the idea behind the tf-idf algorithm to design the df-icf algorithm which finds the importance of a particular document within the given corpus. General Terms DF-IC...
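For reference, the term weight described in this abstract is often written in the following textbook form. This is one common variant, not necessarily the one used in the paper. The symbols are assumptions introduced here: $f_{t,d}$ is the number of times term $t$ occurs in document $d$, $N$ is the number of documents in the corpus, and $n_t$ is the number of documents that contain $t$.

```latex
\mathrm{tfidf}(t, d) = \mathrm{tf}(t, d) \cdot \mathrm{idf}(t), \qquad
\mathrm{tf}(t, d) = \frac{f_{t,d}}{\sum_{t'} f_{t',d}}, \qquad
\mathrm{idf}(t) = \log \frac{N}{n_t}
```

Under this form, a term that occurs in every document has $\mathrm{idf}(t) = \log 1 = 0$ and therefore zero weight, while a term that is frequent in one document but rare across the corpus receives a high weight.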

Journal: :IEEE Access 2021

Sentiment classification is increasingly used to automatically identify a positive or negative sentiment in a text review. In sentiment classification, feature selection has always been a critical and challenging problem. Most of the related feature selection techniques are unable to overcome problems in evaluating significant features, which reduces classification performance. This paper proposes an enhanced hybrid feature selection technique to improve sentiment classification based on m...

Journal: :Applied sciences 2023

Construction accidents can lead to serious consequences. To reduce the occurrence of such accidents and strengthen execution capabilities in on-site safety management, managers must analyze accident report texts in depth and extract valuable information from them. However, accident reports are usually presented in unstructured or semi-structured forms; analyzing these texts manually requires a lot of time and effort, and it is difficult to cope with ...

2011
Garnett Carl Wilson Rodolphe Devillers Orland Hoeber

This work describes a novel fuzzy logic system designed to meet the real world demand of providing intelligent ranking to large repositories of documents previously encoded with non-fuzzy (crisp) metadata. The fuzzy logic prototype was tested in practice to complement the GeoConnections Discovery Portal, which is a web portal for specialized search and retrieval of Canadian geographic data reso...
