How Latent is Latent Semantic Analysis?

نویسنده

  • Peter M. Wiemer-Hastings
چکیده

Latent Semantic Analysis (LSA) is a statistical, corpus-based text comparison mechanism that was originally developed for the task of information retrieval, but in recent years has produced remarkably human-like abilities in a variety of language tasks. LSA has taken the Test of English as a Foreign Language and performed as well as non-native English speakers who were successful college applicants. It has shown an ability to learn words at a rate similar to humans. It has even graded papers as reliably as human graders. We have used LSA as a mechanism for evaluating the quality of student responses in an intelligent tutoring system, and its performance equals that of human raters with intermediate domain knowledge. It has been claimed that LSA’s text-comparison abilities stem primarily from its use of a statistical technique called singular value decomposition (SVD) which compresses a large amount of term and document co-occurrence information into a smaller space. This compression is said to capture the semantic information that is latent in the corpus itself. We test this claim by comparing LSA to a version of LSA without SVD, as well as a simple keyword matching model.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Query expansion based on relevance feedback and latent semantic analysis

Web search engines are one of the most popular tools on the Internet which are widely-used by expert and novice users. Constructing an adequate query which represents the best specification of users’ information need to the search engine is an important concern of web users. Query expansion is a way to reduce this concern and increase user satisfaction. In this paper, a new method of query expa...

متن کامل

How Important Is Size? An Investigation of Corpus Size and Meaning in Both Latent Semantic Analysis and Latent Dirichlet Allocation

This study examines how differences in corpus size influence the accuracy of Latent Semantic Analysis (LSA) spaces and Latent Dirichlet Allocation (LDA) spaces in two tasks: a word association task and a vocabulary definition test. Specific optimizations were considered in building each semantic model. Initial results indicate that larger corpora lead to greater accuracy and that LDA probabilis...

متن کامل

Workshop Organization Programme Chairs Programme Committee Integrating Semantic Web Services and Matchmaking in Ebxml Registry 69 an Interest-based Offer Evaluation System for Semantic Matchmakers . . . 99 Probabilistic Methods for Service Clustering

This paper focuses on service clustering and uses service descriptions to construct probabilistic models for service clustering. We discuss how service descriptions can be enriched with machine-interpretable semantics and then we investigate how these service descriptions can be grouped in clusters in order to make discovery, ranking, and recommendation faster and more effective. We propose usi...

متن کامل

How Does Latent Semantic Analysis Work? A Visualisation Approach

By using a small example, an analogy to photographic compression, and a simple visualization using heatmaps, we show that latent semantic analysis (LSA) is able to extract what appears to be semantic meaning of words from a set of documents by blurring the distinctions

متن کامل

Probabilistic Methods for Service Clustering

This paper focuses on service clustering and uses service descriptions to construct probabilistic models for service clustering. We discuss how service descriptions can be enriched with machine-interpretable semantics and then we investigate how these service descriptions can be grouped in clusters in order to make discovery, ranking, and recommendation faster and more effective. We propose usi...

متن کامل

An ethnographic study of latent thermal pleasure in the tradition of desert climate architecture

The architectural tradition of the desert climate has both hidden and hidden teachings. The creation of rich sensory spaces is one of the gifts that the architecture of this context offers to its inhabitants. A considerable part of human sensory experiences in different spaces is formed by heat. Heat, like other sensory stimuli, can contribute to the richness of human perception of the environm...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1999