word clustering

Search results for: word clustering

Number of results: 205729

A New Word Clustering Method for Building N-Gram Language Models in Continuous Speech Recognition Systems

2008

Mohammad Bahrani, Hossein Sameti, Nazila Hafezi, Saeedeh Momtazi

In this paper, a new method for automatic word clustering is presented. We used this method to build n-gram language models for Persian continuous speech recognition (CSR) systems. In this method, each word is represented by a feature vector that captures the statistics of the parts of speech (POS) of that word. The feature vectors are clustered by the k-means algorithm. Using this method causes a r...
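As a rough illustration of the idea in this abstract (not the authors' implementation), the sketch below turns per-word POS-tag counts into relative-frequency vectors and groups them with a plain k-means loop. The tag set, the counts, and the function names are hypothetical toy values chosen only for the example.

```python
import random

def pos_feature_vector(tag_counts, tagset):
    """Turn raw POS-tag counts for one word into a relative-frequency vector."""
    total = sum(tag_counts.get(t, 0) for t in tagset)
    return [tag_counts.get(t, 0) / total if total else 0.0 for t in tagset]

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd-style k-means; returns one cluster index per input vector."""
    rng = random.Random(seed)
    # Initialise centroids with k distinct input vectors.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment

# Hypothetical toy data: how often each word was tagged with each POS.
TAGSET = ["NOUN", "VERB", "ADJ"]
COUNTS = {
    "book": {"NOUN": 90, "VERB": 10},
    "table": {"NOUN": 95, "VERB": 5},
    "run": {"NOUN": 20, "VERB": 80},
    "eat": {"VERB": 100},
}

words = list(COUNTS)
vectors = [pos_feature_vector(COUNTS[w], TAGSET) for w in words]
labels = dict(zip(words, kmeans(vectors, k=2)))
print(labels)
```

On this toy input the noun-like words and the verb-like words end up in separate clusters; the cluster indices themselves carry no meaning.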

Evidence of semantic clustering in letter-cued word retrieval

Journal: Journal of Clinical and Experimental Neuropsychology 2013

Automatic Induction of Synsets from a Graph of Synonyms

2017

Dmitry Ustalov, Alexander Panchenko, Christian Biemann

This paper presents a new graph-based approach that induces synsets using synonymy dictionaries and word embeddings. First, we build a weighted graph of synonyms extracted from commonly available resources, such as Wiktionary. Second, we apply word sense induction to deal with ambiguous words. Finally, we cluster the disambiguated version of the ambiguous input graph into synsets. Our meta-clus...

Parallel Web Text Clustering with a Modular Self-Organizing Map System

2007

Lean Yu, Shouyang Wang, Kin Keung Lai

In this study, a multistage modular self-organizing map (SOM) model is proposed for parallel web text clustering. In the first stage, the large textual datasets are divided into several small disjoint datasets (i.e., task decomposition). In the second stage, each small dataset is fed into a different unitary SOM model to produce a word clustering map (i.e., modularization learning). In this stage, differ...

Comparison Clustering using Cosine and Fuzzy set based Similarity Measures of Text Documents

Journal: CoRR 2015

Manan Mohan Goyal, Neha Agrawal, Manoj Kumar Sarma, Nayan Jyoti Kalita

Given the high demand for clustering, this paper focuses on understanding and implementing K-means clustering using two different similarity measures. We cluster the documents using these two measures rather than Euclidean distance. A comparison is also drawn between the fuzzy and cosine similarity measures based on clustering accuracy. The ...
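To make the cosine measure mentioned in this abstract concrete, here is a minimal sketch that computes cosine similarity between bag-of-words term counts and assigns a document to its most similar centroid, as one step of a cosine-based K-means would. The function names and sample texts are assumptions for illustration; the fuzzy measure is not shown because the truncated abstract does not define it.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count mappings."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty document is treated as similar to nothing.
    return dot / (norm_a * norm_b)

def nearest_centroid(doc, centroids):
    """Index of the centroid with the highest cosine similarity to doc."""
    return max(range(len(centroids)),
               key=lambda i: cosine_similarity(doc, centroids[i]))

# Hypothetical toy documents and centroids.
doc_a = Counter("word clustering word".split())
doc_b = Counter("speech recognition".split())
centroids = [Counter({"clustering": 1, "word": 1}), Counter({"speech": 1})]

print(cosine_similarity(doc_a, doc_b))
print(nearest_centroid(Counter("speech recognition system".split()), centroids))
```

Because cosine similarity ignores vector length, a long and a short document with the same term proportions are treated as identical, which is the usual reason for preferring it over Euclidean distance on text.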

Algorithms for bigram and trigram word clustering

Journal: Speech Communication 1995

Sven C. Martin, Jörg Liermann, Hermann Ney

Lehrstuhl für Informatik VI, RWTH Aachen, University of Technology, D-52056 Aachen, Germany. Abstract: This paper presents and analyzes improved algorithms for clustering bigram and trigram word equivalence classes, and their respective results: 1) We give a detailed time complexity analysis of bigram clustering algorithms. 2) We present an ...

Network Structure Influences Speech Production

Journal: Cognitive Science 2010

Kit Ying Chan, Michael S. Vitevitch

Network science provides a new way to look at old questions in cognitive science by examining the structure of a complex system, and how that structure might influence processing. In the context of psycholinguistics, the clustering coefficient (a common measure in network science) refers to the extent to which phonological neighbors of a target word are also neighbors of each other. The influence of ...
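The clustering coefficient described here can be sketched as the fraction of pairs of a word's neighbors that are themselves neighbors. The small phonological network below is a hypothetical toy example, not data from the study, and the function name is an assumption.

```python
from itertools import combinations

def clustering_coefficient(graph, node):
    """Fraction of pairs of node's neighbors that are linked to each other."""
    neighbors = graph[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # With fewer than two neighbors there are no pairs to check.
    linked = sum(1 for u, v in combinations(neighbors, 2) if v in graph[u])
    return linked / (k * (k - 1) / 2)

# Hypothetical undirected neighbor network, stored as symmetric adjacency sets.
NEIGHBORS = {
    "cat": {"bat", "hat", "cut"},
    "bat": {"cat", "hat"},
    "hat": {"cat", "bat"},
    "cut": {"cat"},
}

for word in NEIGHBORS:
    print(word, clustering_coefficient(NEIGHBORS, word))
```

In this toy network, "cat" has three neighbors but only one of the three possible neighbor pairs ("bat" and "hat") is connected, so its coefficient is one third.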

String Vector based AHC as Approach to Word Clustering

2016

Taeho Jo

In this research, we propose the string vector based AHC (Agglomerative Hierarchical Clustering) algorithm as an approach to word clustering. In previous work on text clustering, encoding texts into string vectors successfully improved clustering performance, which motivated this research. In this research, we encode words into string vectors,...

Determining Gains Acquired from Word Embedding Quantitatively Using Discrete Distribution Clustering

2017

Jianbo Ye, Yanran Li, Zhaohui Wu, James Zijun Wang, Wenjie Li, Jia Li

Word embeddings have become widely used in document analysis. While a large number of models for mapping words to vector spaces have been developed, it remains undetermined how much net gain can be achieved over traditional approaches based on bag-of-words. In this paper, we propose a new document clustering approach by combining any word embedding with a state-of-the-art algorithm for clusterin...

Word clustering and disambiguation based on co-occurrence data

Journal: Natural Language Engineering 2002

