Search results for: word clustering

Number of results: 205729

2014
Haipeng Wang Tan Lee Cheung-Chi Leung Bin Ma Haizhou Li

This paper describes a new approach to unsupervised acoustic modeling, that is, building acoustic models for phoneme-like sub-word units from untranscribed speech data. The proposed approach is based on Gaussian component clustering. Initially, a large set of Gaussian components is estimated from the untranscribed data. Then clustering is performed to group these Gaussian components into differe...

2013
Kashyap Popat

In this report, we present the literature survey done for our work with SA and other NLP applications. The road map of this report is as follows. In Section-1, we introduce clustering process and describe a few existing word clustering techniques. Section-2 talks about the smoothing process followed by why clustering is better for our work in Section-3. Finally in Section-4, we talk about the r...

2005
Stefano Faini Simone Marinai Emanuele Marino Giovanni Soda

In this paper we discuss some applications of word image clustering (based on Self Organizing Maps, SOM) for tasks related to document image retrieval. Two main applications are discussed: document retrieval and word retrieval. In document retrieval a document representation based on the vector model is obtained by computing the occurrences of words belonging to the SOM clusters in each documen...

2014
Cem Akkaya Janyce Wiebe Rada Mihalcea

Subjectivity word sense disambiguation (SWSD) is a supervised, application-specific word sense disambiguation task that distinguishes between subjective and objective senses of a word. Not surprisingly, SWSD suffers from the knowledge acquisition bottleneck. In this work, we use a “cluster and label” strategy to generate labeled data for SWSD semi-automatically. We define a new algorithm called It...

2008
Joy Deep Nath Monojit Choudhury Animesh Mukherjee Christian Biemann Niloy Ganguly

We present a study of the word interaction networks of Bengali in the framework of complex networks. The topological properties of these networks reveal interesting insights into the morpho-syntax of the language, whereas clustering helps in the induction of the natural word classes leading to a principled way of designing POS tagsets. We compare different network construction techniques and cl...

2010
Stanley F. Chen Stephen M. Chu

Model M is a superior class-based n-gram model that has shown improvements on a variety of tasks and domains. In previous work with Model M, bigram mutual information clustering has been used to derive word classes. In this paper, we introduce a new word classing method designed to closely match with Model M. The proposed classing technique achieves gains in speech recognition word-error rate o...

Journal: Pattern Recognition Letters 2006
William-Chandra Tjhi Lihui Chen

In this paper, a new algorithm, fuzzy co-clustering with Ruspini's condition (FCR), is proposed for co-clustering documents and words. Compared to most existing fuzzy co-clustering algorithms, FCR is able to generate fuzzy word clusters that capture the natural distribution of words, which may be beneficial for information retrieval. We discuss the principle behind the algorithm through some theo...

2006
Simone Marinai Stefano Faini Emanuele Marino Giovanni Soda

We propose an approach for efficient word retrieval from printed documents belonging to Digital Libraries. The approach combines word image clustering (based on Self Organizing Maps, SOM) with Principal Component Analysis. The combination of these methods allows us to efficiently retrieve the matching words from large document collections without the need for a direct comparison of the query w...

Journal: EURASIP J. Audio, Speech and Music Processing 2013
Yongzhe Shi Weiqiang Zhang Jia Liu Michael T. Johnson

The recurrent neural network language model (RNNLM) has shown significant promise for statistical language modeling. In this work, a new class-based output layer method is introduced to further improve the RNNLM. In this method, word class information is incorporated into the output layer by utilizing the Brown clustering algorithm to estimate a class-based language model. Experimental results ...

Background: Finding the right word is a necessity in communication, and its evaluation has always been a challenging clinical issue, suggesting the need for valid and reliable measurements. The Homophone Meaning Generation Test (HMGT) can measure the ability to switch between verbal concepts, which is required in word retrieval. The purpose of this study was to adapt and validate the Persian ve...
