word clustering

Search results for: word clustering

Number of results: 205729

Word Sense Induction Using Lexical Chain based Hypergraph Model

2014

Tao Qian Dong-Hong Ji Mingyao Zhang Chong Teng Congling Xia

Word Sense Induction is the task of automatically finding word senses from large-scale texts. It is generally considered an unsupervised clustering problem. This paper introduces a hypergraph model in which nodes represent instances of contexts where a target word occurs and hyperedges represent higher-order semantic relatedness among instances. A lexical chain based method is used for discove...

Utilizing the One-Sense-per-Discourse Constraint for Fully Unsupervised Word Sense Induction and Disambiguation

2004

Reinhard Rapp

Recent advances in word sense induction rely on clustering related words. In this paper, instead of using a clustering algorithm, we suggest performing a Singular Value Decomposition (SVD), which can be guaranteed to always find a global optimum. However, in order to apply this method to the problem of word sense induction, a semantic interpretation of the dimensions computed by the SVD is requi...

Automatic Chinese Summarization Method Based on the HowNet and Clustering Algorithm

2007

Gang Bai Dongmei Wang Zongyao Ding Yi Zhu

To solve the problems in traditional automatic Chinese summarization, a new method based on word concepts and clustering is presented in this paper. Unlike the usual statistical method, concepts are used as features instead of words. Also, instead of word frequency statistics, word concept frequency statistics (WCFS) are used in our approach. For each paragraph, a conceptual vector space...

The Statistical Model of Chinese Word Contours Based on Fuzzy Clustering Method

2005

Jianhua Tao Lianhong Cai Yuzuo Zhong

With the aim of constructing a set of prosodic rules that enable the generation of high-quality synthetic Chinese speech, tone concatenation features were investigated for Chinese words. A statistical model is developed for Chinese word pitch contours based on a fuzzy clustering and analysis method. The clustering results show that word contours depend not only on the different combination of t...

WHU-BioNLP CHEMDNER System with Mixed Conditional Random Fields and Word Clustering

2013

Yanan Lu Xiaoyuan Yao Xiaomei Wei Donghong Ji Xiaohui Liang

Our team participated in the Chemical Compound and Drug Name Recognition task of BioCreative IV. We used mixed conditional random fields with word clustering to fulfill this task. On the one hand, we generate the word feature by word clustering and train the corpus with the word feature to get one model. On the other hand, the training corpus is transformed to a new one in the reversed order of ...

Developing the Persian Version of the Homophone Meaning Generation Test

Journal: Medical Journal of Islamic Republic of Iran

Mona Ebrahimipour (Department of Speech Therapy, School of Rehabilitation, Iran University of Medical Sciences, Tehran, Iran); Mohammad Reza Motamed (Department of Neurology, Iran University of Medical Sciences, Tehran, Iran); Hassan Ashayeri (Department of Basic Sciences in Rehabilitation, School of Rehabilitation, Iran University of Medical Sciences, Tehran, Iran); Yahya Modarresi (Department of Linguistics, Human Sciences and Cultural Education Institute, Tehran, Iran); Mohammad Kamali (Department of Basic Sciences in Rehabilitation, Iran University of Medical Sciences, School of Rehabilitation Sciences, Tehran, Iran)

Background: Finding the right word is a necessity in communication, and its evaluation has always been a challenging clinical issue, suggesting the need for valid and reliable measurements. The Homophone Meaning Generation Test (HMGT) can measure the ability to switch between verbal concepts, which is required in word retrieval. The purpose of this study was to adapt and validate the Persian ve...

Co-clustering Documents and Words by Minimizing the Normalized Cut Objective Function

Journal: J. Math. Model. Algorithms 2010

Charles-Edmond Bichot

This paper follows a word-document co-clustering model independently introduced in 2001 by several authors such as I.S. Dhillon, H. Zha and C. Ding. This model creates a bipartite graph based on word frequencies in documents, whose vertices are both documents and words. The created bipartite graph is then partitioned in a way that minimizes the normalized cut objective function...
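
As a rough illustration of the objective described in this abstract, the following Python sketch builds a word-document bipartite graph from word counts and evaluates the normalized cut of a candidate two-way partition. The toy documents, the function names (`build_bipartite_graph`, `normalized_cut`), and the use of raw counts as edge weights are assumptions made for illustration. The paper's own partitioning algorithm is not given in the excerpt and is not reproduced here; the sketch only scores partitions that are supplied by hand.

```python
from collections import Counter

def build_bipartite_graph(docs):
    """Return {(doc_vertex, word_vertex): weight}, using raw word counts as weights."""
    edges = {}
    for i, text in enumerate(docs):
        for word, count in Counter(text.lower().split()).items():
            edges[(("doc", i), ("word", word))] = count
    return edges

def normalized_cut(edges, part_a):
    """Two-way normalized cut: cut/vol(A) + cut/vol(B).

    part_a is the set of vertices on side A; every other vertex is on side B.
    """
    cut = vol_a = vol_b = 0
    for (u, v), w in edges.items():
        u_in_a, v_in_a = u in part_a, v in part_a
        # Each endpoint adds the edge weight to the volume of its own side.
        vol_a += w * (u_in_a + v_in_a)
        vol_b += w * ((not u_in_a) + (not v_in_a))
        if u_in_a != v_in_a:
            cut += w
    if vol_a == 0 or vol_b == 0:
        raise ValueError("both sides of the partition must touch at least one edge")
    return cut / vol_a + cut / vol_b

# Hypothetical toy corpus (assumption, not from the paper).
docs = [
    "graph cut graph partition",
    "graph partition vertex",
    "word sense clustering",
    "word clustering sense sense",
]
edges = build_bipartite_graph(docs)

# Candidate co-cluster: the first two documents together with the words they use.
part_a = {("doc", 0), ("doc", 1)} | {
    ("word", w) for w in ("graph", "cut", "partition", "vertex")
}
ncut_good = normalized_cut(edges, part_a)

# A worse candidate: the word "graph" is moved to the other side.
part_b = part_a - {("word", "graph")}
ncut_bad = normalized_cut(edges, part_b)
```

In this toy corpus the first candidate separates two groups of documents that share no words, so no edge is cut; moving a single word across the boundary cuts its edges and raises the score.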

A Part-of-Speech Tag Clustering for a Word Prediction System in Portuguese Language

Journal: Procesamiento del Lenguaje Natural 2011

Daniel Cruz Cavalieri Teodiano Freire Bastos Filho Mário Sarcinelli Filho Sira E. Palazuelos-Cagigas Javier Macías Guarasa José Luis Martín Sánchez

This paper presents an automatic method for reducing the part-of-speech tagset to be considered by a word prediction system in Portuguese. The method is based on a similarity measure applied to an association matrix, generated by employing an odds ratio association measure in the bigrams of parts-of-speech (bipos) probability distribution in a corpus. The results reported in this paper show that ...
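
To make the association measure concrete, here is a small Python sketch that computes an odds ratio for each part-of-speech bigram from a 2x2 contingency table of bigram counts. The tag sequence, the function name `bigram_odds_ratios`, and the 0.5 smoothing constant are illustrative assumptions; the excerpt does not specify the paper's exact formulation or the similarity measure applied afterwards, so neither is reproduced here.

```python
from collections import Counter

def bigram_odds_ratios(tags, smoothing=0.5):
    """Odds ratio for each observed tag bigram (a, b).

    Contingency table over all bigrams (x, y):
      n11: x == a and y == b      n12: x == a and y != b
      n21: x != a and y == b      n22: x != a and y != b
    `smoothing` is added to every cell to avoid division by zero (an assumption).
    """
    bigrams = list(zip(tags, tags[1:]))
    total = len(bigrams)
    pair_counts = Counter(bigrams)
    first_counts = Counter(a for a, _ in bigrams)
    second_counts = Counter(b for _, b in bigrams)

    ratios = {}
    for (a, b), n11 in pair_counts.items():
        n12 = first_counts[a] - n11
        n21 = second_counts[b] - n11
        n22 = total - n11 - n12 - n21
        ratios[(a, b)] = ((n11 + smoothing) * (n22 + smoothing)) / (
            (n12 + smoothing) * (n21 + smoothing)
        )
    return ratios

# Hypothetical tag sequence, not taken from any corpus.
tags = ["DET", "NOUN", "VERB", "DET", "ADJ", "NOUN", "VERB", "DET", "NOUN"]
ratios = bigram_odds_ratios(tags)
```

A ratio well above 1 indicates that the second tag follows the first more often than it follows other tags. The rows of such an association matrix could then be compared to decide which tags behave alike, though how the paper does this is not stated in the excerpt.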

A Mixture Model with Sharing for Lexical Semantics

2010

Joseph Reisinger Raymond J. Mooney

We introduce tiered clustering, a mixture model capable of accounting for varying degrees of shared (context-independent) feature structure, and demonstrate its applicability to inferring distributed representations of word meaning. Common tasks in lexical semantics such as word relatedness or selectional preference can benefit from modeling such structure: Polysemous word usage is often govern...

Word Image Matching as a Technique for Degraded Text Recognition

1998

Jonathan J. Hull Siamak Khoubyari Tin Kam Ho

A technique is presented that determines equivalences between word images in a passage of text. A clustering procedure is applied to group visually similar words. Initial hypotheses for the identities of words are then generated by matching the word groups to language statistics that predict the frequency at which certain words will occur. This is followed by a recognition step that assigns ide...
