semantic clustering

Effects of Creativity and Cluster Tightness on Short Text Clustering Performance

2016

Catherine Finegan-Dollak Reed Coke Rui Zhang Xiangyi Ye Dragomir R. Radev

Properties of corpora, such as the diversity of vocabulary and how tightly related texts cluster together, impact the best way to cluster short texts. We examine several such properties in a variety of corpora and track their effects on various combinations of similarity metrics and clustering algorithms. We show that semantic similarity metrics outperform traditional n-gram and dependency simi...

Developing the Persian Version of the Homophone Meaning Generation Test

Journal: Medical Journal of the Islamic Republic of Iran

Mona Ebrahimipour, Department of Speech Therapy, School of Rehabilitation, Iran University of Medical Sciences, Tehran, Iran. Mohammad Reza Motamed, Department of Neurology, Iran University of Medical Sciences, Tehran, Iran. Hassan Ashayeri, Department of Basic Sciences in Rehabilitation, School of Rehabilitation, Iran University of Medical Sciences, Tehran, Iran. Yahya Modarresi, Department of Linguistics, Human Sciences and Cultural Education Institute, Tehran, Iran. Mohammad Kamali, Department of Basic Sciences in Rehabilitation, Iran University of Medical Sciences, School of Rehabilitation Sciences, Tehran, Iran.

Background: Finding the right word is a necessity in communication, and its evaluation has always been a challenging clinical issue, suggesting the need for valid and reliable measurements. The Homophone Meaning Generation Test (HMGT) can measure the ability to switch between verbal concepts, which is required in word retrieval. The purpose of this study was to adapt and validate the Persian ve...

Web Image Semantic Clustering

2005

Zhiguo Gong Leong Hou U Chan Wa Cheang

This paper provides a novel Web image clustering methodology based on the images' associated texts. In our approach, the semantics of Web images are first represented as vectors of term-weight pairs. In order to correctly correlate terms to a Web image, the associated text of the Web image is partitioned into semantic blocks according to the semantic structure of the text with respect to the Web ...
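The idea of representing an item's associated text as a vector of term-weight pairs can be sketched as follows. This is only an illustration, not the authors' method: the weighting scheme (length-normalized term frequency), the whitespace tokenizer, and the function names `term_weights` and `cosine` are all assumptions, and the semantic-block partitioning mentioned in the abstract is not modeled.

```python
import math
from collections import Counter


def term_weights(text):
    """Map a text to a {term: weight} dict.

    Assumption: weight = term frequency divided by the number of tokens.
    The abstract does not specify the actual weighting scheme.
    """
    tokens = text.lower().split()
    if not tokens:
        return {}
    counts = Counter(tokens)
    return {term: count / len(tokens) for term, count in counts.items()}


def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical associated texts for three Web images.
beach = term_weights("sunny beach with palm trees")
coast = term_weights("palm trees on a sunny coast")
chip = term_weights("silicon chip circuit board")

similar = cosine(beach, coast)    # shares several terms
unrelated = cosine(beach, chip)   # shares no terms
```

Texts that share terms score above zero, while texts with disjoint vocabularies score exactly zero, which is the property a text-based image clusterer would rely on.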

Semantic Website Clustering

2007

I-Hsuan Yang Yu-tsun Huang Yen-Ling Huang

We propose a new approach to clustering web pages. Utilizing an iterative reinforced algorithm, the model extracts semantic feature vectors from user click-through data. We then use LSA (Latent Semantic Analysis) to reduce the feature dimension and the K-means algorithm to cluster documents. Compared to the traditional way of feature extraction (lexical binomial model), our new model has better pu...
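The final K-means step of such a pipeline can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the click-through feature extraction and the LSA reduction are omitted, the input is assumed to be already-reduced feature vectors, centroids are initialized from the first `k` points for determinism, and the names `sq_dist` and `kmeans` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Basic Lloyd-style K-means.

    Assumption: the first k points serve as initial centroids, which keeps
    the sketch deterministic; real systems usually use random or k-means++
    initialization.
    """
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


# Hypothetical 2-D feature vectors, e.g. documents after dimension reduction.
docs = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centers = kmeans(docs, k=2)
```

With these toy vectors the two nearby pairs end up in separate clusters. In practice the input would be the reduced feature vectors rather than hand-written points.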

Toward Semantic XML Clustering

2006

Andrea Tagarelli Sergio Greco

The increasing availability of heterogeneous XML informative sources has raised a number of issues concerning how to represent and manage semistructured data. Although XML sources can exhibit proper structures and contents, differently annotated XML documents may in principle encode related semantics due to subjective definitions of markup tags. Discovering knowledge to infer semantic organizat...

Semantical Clustering of Morphologically Related Chinese Words

2014

Chia-Ling Lee Ya-Ning Chang Chao-Lin Liu Chia-Ying Lee Jane Yung-jen Hsu

A Chinese character embedded in different compound words may carry different meanings. In this paper, we aim at semantic clustering of a given family of morphologically related Chinese words. In Experiment 1, we employed linguistic features at the word, syntactic, semantic, and contextual levels in aggregated computational linguistics methods to handle the clustering task. In Experiment 2, we r...

Java source-code clustering: Unifying syntactic and semantic features

Journal: ACM SIGSOFT Software Engineering Notes 2013

Identifying Bengali Multiword Expressions using Semantic Clustering

Journal: CoRR 2013

Tanmoy Chakraborty Dipankar Das Sivaji Bandyopadhyay

One of the key issues in both natural language understanding and generation is the appropriate processing of Multiword Expressions (MWEs). MWEs pose a major problem for precise language processing due to their idiosyncratic nature and diversity in lexical, syntactic and semantic properties. The semantics of an MWE cannot be derived by combining the semantics of its constituents. Therefore...

Semantic Clustering and Convolutional Neural Network for Short Text Categorization

2015

Peng Wang Jiaming Xu Bo Xu Cheng-Lin Liu Heng Zhang Fangyuan Wang Hongwei Hao

Short texts usually suffer from data sparsity and ambiguity in their representations because they lack context. In this paper, we propose a novel method to model short texts based on semantic clustering and convolutional neural network. Particularly, we first discover semantic cliques in embedding spaces by a fast clustering algorithm. Then, multi-scale semantic units are detected under the su...

Algorithm for Semantic Based Similarity Measure

2013

Sapna Chauhan Pridhi Arora

A document representation model, the Semantic-Based Similarity Measure (SBSM), is proposed. This model combines phrase analysis and word analysis, using PropBank notation as background knowledge, to explore better ways of representing documents for clustering. The SBSM assigns semantic weights to both document words and phrases. The new weights reflect the semantic relatedne...
