Search results for: latent semantic analysis
Number of results: 2,942,209
We have developed a novel approach to determining the similarity of documents using probabilistic latent semantic indexing. For each document, a probability vector of latent factors is estimated which takes into account, on the one hand, the distribution of words in the text and, on the other hand, the distribution of category values. The emphasis can be freely shifted between the two aspects and therefo...
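The snippet leaves the actual comparison step unspecified. A minimal sketch of the general idea, assuming each document has already been reduced to an estimated probability vector over latent factors (the vectors, the divergence choice, and the function name below are illustrative placeholders, not the authors' method):

```python
import numpy as np

def jensen_shannon_similarity(p, q):
    """Similarity between two probability vectors over latent factors.

    Uses 1 - Jensen-Shannon divergence (base 2, so the result lies in [0, 1]);
    any symmetric comparison of distributions could be substituted.
    """
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 1.0 - 0.5 * (kl(p, m) + kl(q, m))

# Hypothetical latent-factor distributions for two documents (each sums to 1).
doc_a = [0.6, 0.3, 0.1]
doc_b = [0.5, 0.4, 0.1]
print(jensen_shannon_similarity(doc_a, doc_b))
```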
Semantic publishing can enable richer documents with clearer, computationally interpretable properties. For this vision to become reality, however, authors must benefit from this process, so that they are incentivised to add these semantics. Moreover, the publication process that generates final content must allow and enable this semantic content. Here we focus on author-led or “grey” literatur...
Topic models are widely used in document modeling as well as in speech recognition, information retrieval, and text mining systems, where they effectively capture the semantic and statistical properties of documents and words. Most topic models, such as probabilistic latent semantic analysis and latent Dirichlet allocation, describe the relationship between documents and words through a set of latent topic probability distributions and use these to capture the latent semantic information of documents. However, traditional topic models are limited by the bag-of-words assumption: their latent topics can only capture semantic information between individual words. Although individual words convey topical information, they sometimes lack precise semantic knowledge of the text, which can lead to misjudging documents and lowers retrieval quality. To address these shortcomings, this paper proposes a novel semantic association topic model (semantic associ...
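The proposed semantic-association model itself is cut off in the snippet. As a point of reference only, a minimal sketch of the standard bag-of-words LDA baseline it contrasts with, using scikit-learn; the corpus and the number of topics are placeholders:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "latent semantic analysis maps words and documents to topics",
    "speech recognition systems use statistical language models",
    "topic models capture co-occurrence statistics of words",
]

# Bag-of-words term counts: exactly the assumption the abstract criticizes.
counts = CountVectorizer().fit_transform(docs)

# Two latent topics, purely illustrative.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topic = lda.fit_transform(counts)  # per-document topic distributions
print(doc_topic.round(3))
```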
The lsemantica command, presented in this paper, implements Latent Semantic Analysis in Stata. Latent Semantic Analysis is a machine learning algorithm for word and text similarity comparison. Latent Semantic Analysis uses Truncated Singular Value Decomposition to derive the hidden semantic relationships between words and texts. lsemantica provides a simple command for Latent Semantic Analysis ...
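The lsemantica command itself runs inside Stata. For readers outside Stata, a minimal Python sketch of the same pipeline described in the abstract (a weighted term-document matrix followed by truncated SVD); the documents and the number of retained dimensions are arbitrary placeholders, not part of lsemantica:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "latent semantic analysis compares word and text similarity",
    "singular value decomposition reveals hidden semantic structure",
    "stata users can run lsemantica for text comparison",
]

# Term-document weighting followed by rank-k truncated SVD (k = 2 is arbitrary).
tfidf = TfidfVectorizer().fit_transform(docs)
lsa = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)

# Documents are now dense low-rank vectors; compare them with cosine similarity.
print(cosine_similarity(lsa).round(3))
```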
We present results from using Random Indexing for Latent Semantic Analysis to handle Singular Value Decomposition tractability issues. We compare Latent Semantic Analysis, Random Indexing and Latent Semantic Analysis on Random Indexing reduced matrices. In this study we use a corpus comprising 1003 documents from the MEDLINE-corpus. Our results show that Latent Semantic Analysis on Random Index...
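A minimal sketch of the Random Indexing step the abstract refers to: each term receives a sparse random index vector, and a document vector is the sum of the index vectors of its terms, reducing dimensionality before any SVD. The vector width and sparsity below are arbitrary choices, not the authors' configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, NONZERO = 512, 8  # arbitrary: index-vector width and number of +/-1 entries

def index_vector():
    """Sparse ternary random vector with a few +1/-1 entries."""
    v = np.zeros(DIM)
    pos = rng.choice(DIM, size=NONZERO, replace=False)
    v[pos] = rng.choice([-1.0, 1.0], size=NONZERO)
    return v

def document_vector(tokens, index_vectors):
    """Sum of the index vectors of the document's tokens."""
    for t in tokens:
        if t not in index_vectors:
            index_vectors[t] = index_vector()
    return sum(index_vectors[t] for t in tokens)

index_vectors = {}
doc = "random indexing reduces the term document matrix".split()
print(document_vector(doc, index_vectors).shape)  # (512,)
```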
In this paper, a comparison between Cosine Similarity and the k-Nearest Neighbors algorithm within the Latent Semantic Analysis method for automatically scoring Arabic essays is presented. It also improves Latent Semantic Analysis by preprocessing the entered text: unifying the form of letters, removing formatting, replacing synonyms, stemming, and deleting stop words. The results showed that the use of Cos...
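A minimal sketch of the cosine-similarity scoring step, assuming a reference answer and the student essays have already been preprocessed and projected into an LSA space; the vectors and names below are placeholders:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two LSA-space vectors."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical LSA vectors: one reference answer and two student essays.
reference = [0.8, 0.1, 0.3]
essays = {"essay_1": [0.7, 0.2, 0.3], "essay_2": [0.1, 0.9, 0.0]}

# Score each essay by its similarity to the reference answer.
for name, vec in essays.items():
    print(name, round(cosine(reference, vec), 3))
```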
Probabilistic Latent Semantic Analysis is a novel statistical technique for the analysis of two-mode and co-occurrence data, which has applications in information retrieval and filtering, natural language processing, machine learning from text, and in related areas. Compared to standard Latent Semantic Analysis, which stems from linear algebra and performs a Singular Value Decomposition of co-occu...
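For reference, the aspect model underlying Probabilistic Latent Semantic Analysis factors the word-document co-occurrence probability through latent classes z, in contrast to the SVD used by standard Latent Semantic Analysis:

```latex
P(w, d) \;=\; \sum_{z} P(z)\, P(w \mid z)\, P(d \mid z)
       \;=\; P(d) \sum_{z} P(z \mid d)\, P(w \mid z)
```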
We will see that the number of eigenvalues is n for an n × n matrix. Regarding eigenvectors, if x is an eigenvector then so is ax for any nonzero scalar a. However, if we consider only one eigenvector from each such ax family, then there is a one-to-one correspondence between such eigenvectors and eigenvalues. Typically, we consider eigenvectors of unit length. Diagonal matrices are simple: the eigenvalues are the entries...
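A small numerical check of the statements above (n eigenvalues for an n × n matrix, eigenvectors closed under nonzero scaling, diagonal matrices having their entries as eigenvalues), using NumPy:

```python
import numpy as np

# A 3 x 3 diagonal matrix: its eigenvalues are simply the diagonal entries.
D = np.diag([2.0, 5.0, -1.0])
values, vectors = np.linalg.eig(D)
print(sorted(values))  # [-1.0, 2.0, 5.0] -- three eigenvalues for a 3 x 3 matrix

# If x is an eigenvector, so is a*x for any nonzero scalar a.
x = vectors[:, 0]  # eigenvector for values[0]
a = 7.0
print(np.allclose(D @ (a * x), values[0] * (a * x)))  # True

# eig returns eigenvectors normalized to unit length.
print(np.isclose(np.linalg.norm(x), 1.0))  # True
```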
Latent Semantic Analysis (LSA) is a technique for comparing texts using a vector-based representation that is learned from a corpus. This article begins with a description of the history of LSA and its basic functionality. LSA enjoys both theoretical support and empirical results that show how it matches human behavior. A number of the experiments that compare LSA with humans are described here...