Semantic Text Indexing
Similar resources
Kaleta SEMANTIC TEXT INDEXING
The following article presents a specific issue of semantic analysis of texts in natural language – text indexing and describes one field of its application (web browsing). The main part of this article describes a computer system assigning a set of semantic indexes (similar to keywords) to a particular text. The indexing algorithm employs a semantic dictionary to find specific words in a text ...
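As a rough illustration of dictionary-based indexing of the kind the abstract describes, the sketch below looks up each word of a text in a small semantic dictionary and returns the most frequent concept labels as indexes. The dictionary contents, the concept labels, the function name `assign_indexes`, and the `top_k` cutoff are all hypothetical and are not taken from the article; the article's actual algorithm is not shown in this excerpt.

```python
from collections import Counter

# Hypothetical semantic dictionary: surface word -> semantic index (concept label).
# These entries are invented for illustration only.
SEMANTIC_DICT = {
    "car": "VEHICLE", "truck": "VEHICLE", "bus": "VEHICLE",
    "doctor": "MEDICINE", "hospital": "MEDICINE", "nurse": "MEDICINE",
}

def assign_indexes(text, dictionary=SEMANTIC_DICT, top_k=2):
    """Return up to top_k concept labels, most frequent first."""
    # Toy tokenizer: split on whitespace, strip common punctuation, lowercase.
    tokens = (tok.strip(".,;:!?()\"'").lower() for tok in text.split())
    # Count how often each concept is triggered by a dictionary word.
    counts = Counter(dictionary[t] for t in tokens if t in dictionary)
    # Sort by descending count, breaking ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [label for label, _ in ranked[:top_k]]

indexes = assign_indexes("The doctor drove a car to the hospital, then met a nurse.")
```

Here three words map to the hypothetical MEDICINE concept and one to VEHICLE, so MEDICINE is ranked first. A real system would need morphological normalization and disambiguation, which this sketch omits.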
Role of semantic indexing for text classification
The Vector Space Model (VSM) of text representation suffers a number of limitations for text classification. Firstly, the VSM is based on the Bag-Of-Words (BOW) assumption where terms from the indexing vocabulary are treated independently of one another. However, the expressiveness of natural language means that lexically different terms often have related or even identical meanings. Thus, fail...
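The bag-of-words limitation mentioned above can be seen in a minimal sketch: two sentences with related meaning but no shared terms get a cosine similarity of zero, because each term is its own independent dimension. The tokenizer, the helper names `bow_vector` and `cosine`, and the example sentences are assumptions made for illustration, not material from the paper.

```python
import math
from collections import Counter

def bow_vector(text):
    """Count lowercase whitespace-separated tokens (a toy tokenizer)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

doc_a = bow_vector("the physician examined the patient")
doc_b = bow_vector("a doctor saw an invalid")        # related meaning, no shared terms
doc_c = bow_vector("the physician saw the patient")  # shares most terms with doc_a

sim_ab = cosine(doc_a, doc_b)  # 0.0: synonyms are unrelated dimensions under BOW
sim_ac = cosine(doc_a, doc_c)  # 6/7: high, driven purely by lexical overlap
```

Semantic indexing approaches aim to close this gap by mapping lexically different but related terms to shared dimensions.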
LRLW-LSI: An Improved Latent Semantic Indexing (LSI) Text Classifier
The task of Text Classification (TC) is to automatically assign thematic categories from a predefined category set to natural language texts. Latent Semantic Indexing (LSI) is a well-known technique in Information Retrieval, especially for dealing with polysemy (one word can have different meanings) and synonymy (different words are used to describe the same concept), but it is not an opti...
Performance Analysis of Semantic Indexing in Text Retrieval
We developed a new indexing formalism that considers not only the terms in a document, but also the concepts to represent the semantic content of a document. In this approach, concept clusters are defined and a concept vector space model is proposed to represent the semantic importance of words and concepts within a document. Through experiments on the TREC-2 collection, we show that the propos...
Information Retrieval and Text Categorization with Semantic Indexing
In this paper, we present the effect of the semantic indexing using WordNet senses on the Information Retrieval (IR) and Text Categorization (TC) tasks. The documents have been sense-tagged using a Word Sense Disambiguation (WSD) system based on Specialized Hidden Markov Models (SHMMs). The preliminary results showed that a small improvement of the performance was obtained only in the TC task. ...
Journal
Journal title: Computer Science
Year: 2014
ISSN: 1508-2806
DOI: 10.7494/csci.2014.15.1.19