Search results for: cosine similarity

Number of results: 118,563

2005
Pradeep Kumar, M. Venkateswara Rao, P. Radha Krishna, Raju S. Bapi

With the enormous growth of data that exhibit sequentiality, it has become important to investigate the impact of the sequential information embedded within the data, and efficient classification of sequential data is needed. k-Nearest Neighbor (kNN) has been used, and proved to be, an efficient classification technique for two-class problems. This paper...
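The kNN classification the snippet refers to can be sketched with cosine similarity as the neighbourhood measure. This is an illustrative example only, not the paper's method for sequential data; the function names are hypothetical:

```python
import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between two dense vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def knn_predict(X_train, y_train, x, k=3):
    # Classify x by majority vote among the k training vectors
    # most cosine-similar to it.
    sims = [cosine_sim(x, xi) for xi in X_train]
    top = np.argsort(sims)[-k:]
    labels = [y_train[i] for i in top]
    return max(set(labels), key=labels.count)
```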

Journal: JASIS, 1999
Jin Zhang, Robert R. Korfhage

This article presents a distance-and-angle similarity measure. The integrated similarity measure takes into account the strengths of both the distance and the direction between measured documents. The article analyzes the features of the similarity measure by comparing it with the traditional distance-based similarity measure and the cosine measure, providing the iso-similarity contour, investigating t...
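A measure integrating distance and angle could look like the following sketch. The snippet does not give the article's actual formula, so this weighted blend (and the `alpha` parameter) is purely an assumption for illustration:

```python
import numpy as np

def integrated_similarity(a, b, alpha=0.5):
    # Hypothetical blend: alpha weights the angular (cosine) component,
    # (1 - alpha) weights a distance-based component mapped into (0, 1].
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    dist_sim = 1.0 / (1.0 + np.linalg.norm(a - b))
    return float(alpha * cos + (1 - alpha) * dist_sim)
```

With `alpha=1` this reduces to plain cosine; with `alpha=0` it depends only on Euclidean distance.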

2017
Delphine Charlet, Géraldine Damnati

This paper describes the SimBow system submitted to SemEval-2017 Task 3, for the question-question similarity subtask B. The proposed approach is a supervised combination of different unsupervised textual similarities. These textual similarities rely on the introduction of a relation matrix into the classical cosine similarity between bags of words, so as to obtain a soft-cosine that takes into account ...

2010
Michal Kompan

Information overload is one of the biggest problems nowadays. It appears in various domains, including business, and especially in the news. It is most significant for news portals, where the quality of a portal is commonly measured by the amount of news added to the site; the most renowned news portals therefore add hundreds of new articles daily. The classical solution usua...

2015
Faruk Karaaslan

Abstract. In this paper, we propose three similarity measure methods for single-valued neutrosophic refined sets and interval neutrosophic refined sets, based on the Jaccard, Dice and Cosine similarity measures of single-valued neutrosophic sets and interval neutrosophic sets. Furthermore, we suggest two multi-criteria decision-making methods under a single-valued neutrosophic refined environment and i...
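The neutrosophic measures in the snippet build on the standard vector forms of Jaccard, Dice and cosine similarity. As background only (the paper's neutrosophic extensions are not shown here), the three base measures for real vectors are:

```python
import numpy as np

def jaccard_sim(x, y):
    # Vector Jaccard: x.y / (|x|^2 + |y|^2 - x.y)
    d = np.dot(x, y)
    return float(d / (np.dot(x, x) + np.dot(y, y) - d))

def dice_sim(x, y):
    # Dice: 2 x.y / (|x|^2 + |y|^2)
    return float(2 * np.dot(x, y) / (np.dot(x, x) + np.dot(y, y)))

def cosine_sim(x, y):
    # Cosine: x.y / (|x| |y|)
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))
```

All three equal 1 for identical vectors and 0 for orthogonal ones; they differ in how they penalise magnitude mismatch.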

2011
Kanae Amino, Takashi Osanai

This paper reports the different realisations of the prosodic structure of Japanese telephone numbers in native and non-native Japanese speech. In Japanese, spoken telephone numbers have a structured prosody called bipodic template; and their accentuation is determined to occur every two digits. Pitch contours of the spoken telephone numbers were analysed and compared among speakers of Japanese...

2002
Viktor Pekar, Steffen Staab

The paper examines different ways to take advantage of the taxonomic organization of a thesaurus to improve the accuracy of classifying new words into its classes. The results of the study demonstrate that taxonomic similarity between nearest neighbors, in addition to their distributional similarity to the new word, may be useful evidence on which a classification decision can be based.

2015
Dmitriy Kirsh, Alexander V. Kupriyanov

The problem of the identification of three-dimensional crystal lattices is considered in the article. Two matching methods based on estimation of unit cell parameters were developed to solve this problem. The first method estimates and compares main parameters of Bravais unit cells. The second method estimates and compares volumes of Wigner-Seitz unit cells. Both methods include normalised simi...

Journal: CoRR, 2017
Ivan Vulic, Nikola Mrksic

We present LEAR (Lexical Entailment Attract-Repel), a novel post-processing method that transforms any input word vector space to emphasise the asymmetric relation of lexical entailment (LE), also known as the IS-A or hyponymy-hypernymy relation. By injecting external linguistic constraints (e.g., WordNet links) into the initial vector space, the LE specialisation procedure brings true hyponymy...

1999
Michael Fuller, Marcin Kaszkiel, Sam Kimberley, Corinna Ng, Ross Wilkinson, Mingfang Wu, Justin Zobel

The constants k1, k3 and b were set to 1.2, 1000 and 0.75 respectively, as recommended by the City University group [13]. W_d is the length of document d in bytes and avr W_d is the average document length in the entire collection. N is the total number of documents in the collection, f_t is the number of documents in which term t occurs, and f_{x,t} is the frequency of term t in either a documen...
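The constants and symbols above parameterise an Okapi BM25 weighting. A sketch of one term's contribution under the common Okapi formulation (the snippet does not show the exact variant used, so treat the formula's precise shape as an assumption):

```python
import math

def bm25_weight(f_dt, f_qt, N, f_t, W_d, avr_W_d, k1=1.2, k3=1000.0, b=0.75):
    # One query term's BM25 contribution.
    # f_dt: frequency of term t in document d; f_qt: frequency in the query;
    # N: total documents; f_t: documents containing t;
    # W_d: document length; avr_W_d: average document length.
    idf = math.log((N - f_t + 0.5) / (f_t + 0.5))
    K = k1 * ((1 - b) + b * W_d / avr_W_d)
    doc_part = ((k1 + 1) * f_dt) / (K + f_dt)
    query_part = ((k3 + 1) * f_qt) / (k3 + f_qt)
    return idf * doc_part * query_part
```

A document's score for a query is the sum of this weight over the query's terms. With k3 = 1000 the query-side saturation is nearly linear for short queries.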

[Chart: number of search results per year; click the chart to filter results by publication year]