نتایج جستجو برای: word clustering

تعداد نتایج: 205729  

2016
Ali Javed Byung Suk Lee

We enhance the accuracy of the currently available semantic hashtag clustering method, which leverages hashtag semantics extracted from dictionaries such as Wordnet and Wikipedia. While immune to the uncontrolled and often sparse usage of hashtags, the current method distinguishes hashtag semantics only at the word level. Unfortunately, a word can have multiple senses representing the exact sem...

2013
Yujing Si Zhen Zhang Ta Li Jielin Pan Yonghong Yan

Recurrent Neural Network Language Model (RNNLM) has recently been shown to outperform conventional N-gram LM as well as many other competing advanced language model techniques. However, the computation complexity of RNNLM is much higher than the conventional N-gram LM. As a result, the Class-based RNNLM (CRNNLM) is usually employed to speed up both the training and testing phase of RNNLM. In pr...

2003
Eneko Agirre Oier Lopez de Lacalle

This paper presents the results of a set of methods to cluster WordNet word senses. The methods rely on different information sources: confusion matrixes from Senseval-2 Word Sense Disambiguation systems, translation similarities, hand-tagged examples of the target word senses and examples obtained automatically from the web for the target word senses. The clustering results have been evaluated...

1999
Hirofumi Yamamoto Yoshinori Sagisaka

A new word-clustering technique is proposed to efficiently build statistically salient class 2-grams from language corpora. By splitting word neighboring characteristics into word-preceding and following directions, multiple (two-dimensional) word classes are assigned to each word. In each side, word classes are merged into larger clusters independently according to preceding or following word ...

Journal: :Proceedings of the ... AAAI Conference on Artificial Intelligence 2023

The pre-trained language models, e.g., ELMo and BERT, have recently achieved promising performance improvement in a wide range of NLP tasks, because they can output strong contextualized embedded features words. Inspired by their great success, this paper we target at fine-tuning them to effectively handle the text clustering task, i.e., classic fundamental challenge machine learning. According...

Journal: :Jurnal Edukasi dan Penelitian Informatika (JEPIN) 2019

Journal: :CoRR 2016
Halid Ziya Yerebakan Fitsum A. Reda Yiqiang Zhan Yoshihisa Shinagawa

This paper presents a new Bayesian non-parametric model by extending the usage of Hierarchical Latent Dirichlet Allocation to extract tree structured word clusters from text data. The inference algorithm of the model collects words in a cluster if they share similar distribution over documents. In our experiments, we observed meaningful hierarchical structures on NIPS corpus and radiology repor...

2016
Jon Dehdari Liling Tan Josef van Genabith

Word clusters improve performance in many NLP tasks including training neural network language models, but current increases in datasets are outpacing the ability of word clusterers to handle them. In this paper we present a novel bidirectional, interpolated, refining, and alternating (BIRA) predictive exchange algorithm and introduce ClusterCat, a clusterer based on this algorithm. We show tha...

2006
Lichi Yuan

Category-based statistic language model is an important method to solve the problem of sparse data. But there are two bottlenecks about this model: (1) the problem of word clustering, it is hard to find a suitable clustering method that has good performance and not large amount of computation. (2) class-based method always loses some prediction ability to adapt the text of different domain. The...

نمودار تعداد نتایج جستجو در هر سال

با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید