Search results for: word clustering

Number of results: 205729

2008
Niladri Chatterjee Shiwali Mohan

Random Indexing is a novel technique for dimensionality reduction while creating a Word Space model from a given text. This paper explores the possible application of Random Indexing in discovering word senses from the text. The words appearing in the text are plotted onto a multi-dimensional Word Space using Random Indexing. The geometric distance between words is used as an indicator of their ...
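To make the idea in this abstract concrete, here is a minimal Python sketch of Random Indexing on a toy corpus. It is not the authors' implementation: the function names, the vector dimension, the number of non-zero entries, the window size and the example sentences are all assumptions chosen for illustration. Each word gets a sparse random index vector, and a word's context vector is the sum of the index vectors of its neighbours, so words used in similar contexts end up close together.

```python
import math
import random
from collections import defaultdict


def index_vector(dim, nonzeros, rng):
    """Sparse ternary index vector: a few +1/-1 entries, zeros elsewhere."""
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzeros):
        vec[pos] = rng.choice((-1, 1))
    return vec


def random_indexing(sentences, dim=300, nonzeros=6, window=2, seed=0):
    """Build context vectors by summing the index vectors of nearby words.

    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    index = {}
    for tokens in sentences:
        for word in tokens:
            if word not in index:
                index[word] = index_vector(dim, nonzeros, rng)

    context = defaultdict(lambda: [0] * dim)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            stop = min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j == i:
                    continue
                target = context[word]
                for k, value in enumerate(index[tokens[j]]):
                    target[k] += value
    return dict(context)


def cosine(a, b):
    """Cosine similarity between two dense vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy corpus (hypothetical): "cat" and "dog" occur in identical contexts.
corpus = [
    ["the", "cat", "sleeps"],
    ["the", "dog", "sleeps"],
    ["stocks", "fell", "sharply"],
]
vectors = random_indexing(corpus)
print(cosine(vectors["cat"], vectors["dog"]))
print(cosine(vectors["cat"], vectors["fell"]))
```

In this toy corpus "cat" and "dog" have exactly the same neighbours, so their context vectors coincide. A sense-discovery step would then cluster such vectors, which this sketch does not attempt.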

2012
Neetu Sharma

In the Natural Language Processing (NLP) community, Word Sense Disambiguation (WSD) has been described as the task of selecting the appropriate meaning (sense) of a given word in a text or discourse where this meaning is distinguishable from other senses potentially attributable to that word. These senses could be seen as the target labels of a classification problem. Clustering and classifica...

2010
Zhao Liu Xipeng Qiu Xuanjing Huang

This paper describes the implementation of our system at the CLP 2010 bakeoff of Chinese word sense induction. We first extract the triplets for the target word in each sentence, then use the intersection of all related words of these triplets from the Internet. We use the related words to construct feature vectors for the sentence. Finally, we discriminate the word senses by clustering the sentences...

2014
Thien Huu Nguyen Ralph Grishman

Relation extraction suffers from a performance loss when a model is applied to out-of-domain data. This has fostered the development of domain adaptation techniques for relation extraction. This paper evaluates word embeddings and clustering on adapting feature-based relation extraction systems. We systematically explore various ways to apply word embeddings and show the best adaptation improve...

1998
Myung Gyu Song Hoi In Jung Kab-Jong Shim Hyung Soon Kim

In speech recognition for real-world applications, the performance degradation due to the mismatch between training and testing environments should be overcome. In this paper, to reduce this mismatch, we provide a hybrid method of spectral subtraction and residual noise masking. We also employ a multiple-model approach to obtain improved robustness over various noise environments. In t...

1999
S. E. Johnson

The problem of labelling speaker turns by automatically segmenting and clustering a continuous audio stream is addressed. A new clustering scheme is presented and evaluated using a clustering efficiency score which treats both agglomerative and divisive clustering strategies equally. Results show an efficiency of 70% can be obtained on both manually and automatically derived segments on the 1996 Hu...

2003
Li Li Feng Liu Wu Chou

In this paper, an information theoretic approach for using word clusters in natural language call routing (NLCR) is proposed. This approach utilizes an automatic word class clustering algorithm to generate word classes from the word based training corpus. In our approach, the information gain (IG) based term selection is used to combine both word term and word class information in NLCR. A joint...

Journal: :ISPRS international journal of geo-information 2022

The discrete representation of resources in geospatial catalogues affects their information retrieval performance. The performance could be improved by using automatically generated clusters of related resources, which we name quasi-spatial dataset series. This work evaluates whether a clustering process can create these series using only textual information from metadata elements. We assess the combination of different kinds t...

Faramarz Soheili Hamid Maleki Mahmoud Ekrami Somaye Rajabzade

Background: Co-word analysis is one of the content analysis methods used in scientometric studies and in mapping the scientific structure of various fields. The purpose of the present research is to map the structure of distance education using co-word analysis. Methods: The research method is content analysis using co-word analysis. The research population is 31607 documents indexed in the...

Journal: :Bioinformatics 2016
Jie Ren Kai Song Minghua Deng Gesine Reinert Charles H. Cannon Fengzhu Sun

MOTIVATION Next-generation sequencing (NGS) technologies generate large amounts of short read data for many different organisms. The fact that NGS reads are generally short makes it challenging to assemble the reads and reconstruct the original genome sequence. For clustering genomes using such NGS data, word-count based alignment-free sequence comparison is a promising approach, but for this a...
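As a rough illustration of what "word-count based alignment-free sequence comparison" means, the Python sketch below counts overlapping words (k-mers) in two sequences and compares the count vectors. It is an assumption-laden sketch, not the method of the paper: the function names, the choice of k, the example sequences, and the use of the plain inner product of word counts (often called D2) together with a cosine dissimilarity are all illustrative choices.

```python
import math
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping word of length k in the sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(counts_a, counts_b):
    """Inner product of two word-count vectors (uncentred D2 statistic)."""
    return sum(count * counts_b[word] for word, count in counts_a.items())


def cosine_dissimilarity(counts_a, counts_b):
    """1 - cosine similarity of the count vectors; 1.0 if either is empty."""
    norm_a = math.sqrt(d2(counts_a, counts_a))
    norm_b = math.sqrt(d2(counts_b, counts_b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - d2(counts_a, counts_b) / (norm_a * norm_b)


# Hypothetical short reads; real NGS data would have many reads per genome.
read_a = "ACGTACGTAC"
read_b = "ACGTTCGTAC"
read_c = "TTTTTTTTTT"

k = 3  # word length, chosen arbitrarily for the example
print(cosine_dissimilarity(kmer_counts(read_a, k), kmer_counts(read_b, k)))
print(cosine_dissimilarity(kmer_counts(read_a, k), kmer_counts(read_c, k)))
```

A matrix of such pairwise dissimilarities could then be passed to any standard clustering routine. Measures used in practice typically also correct the counts for a background sequence model, which this sketch leaves out.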
