Search results for: wikipedia mining

Number of results: 92181

2010
Julian Szymanski

The paper concerns the problem of automatically creating a category system for a set of documents connected by references. The presented approach has been evaluated on the Polish Wikipedia, where two graphs, the Wikipedia category graph and the article graph, have been analyzed. The links between Wikipedia articles have been used to create a new category graph with weighted edges. We compare the created ca...

2011
F. Gediz Aksit

Wikipedia, despite having a very small budget, has been among the top ten most visited websites for over half a decade. This visibility has also generated the problem of ill-intentioned people modifying Wikipedia in a destructive manner. VandalSense is an experimental tool programmed by F. Gediz Aksit to automatically identify vandalism on Wikipedia through the use of machine learning and text mining...

2014
Michele Filannino, Goran Nenadic

Discovering temporal information is key to organising knowledge, and the task of extracting and representing temporal information from texts has therefore received increasing interest. In this paper we focus on the discovery of temporal footprints from encyclopaedic descriptions. Temporal footprints are time-line periods associated with the existence of specific concepts. Our approac...

2015
Silvana Hartmann, György Szarvas

Collecting the specialized vocabulary of a particular domain (terminology) is an important initial step in creating formalized domain knowledge representations (ontologies). Terminology Extraction (TE) aims to automate this process by collecting the relevant domain vocabulary from existing lexical resources or collections of domain texts. In this chapter, the authors address the extrac...

2010
Peter Nabende

This paper describes the use of a pair Hidden Markov Model (pair HMM) system for mining transliteration pairs from noisy Wikipedia data. A pair HMM variant that uses nine transition parameters and emission parameters associated with single-character mappings between source and target language alphabets is identified and used to estimate transliteration similarity. The system resulted in a pre...

2011
Mike Chen, Razvan Bunescu

The dynamic and continuously growing category structure of Wikipedia has been used in numerous ontology extraction methods. We present a dataset of category subgraphs automatically extracted from Wikipedia and manually annotated for is-a and instance-of relations, in order to enable a more comprehensive evaluation of taxonomy mining approaches. We also show how the new dataset can be used w...

Journal: Informatica (Slovenia), 2007
Abhijit Bhole, Blaz Fortuna, Marko Grobelnik, Dunja Mladenic

This paper presents an approach to mining information relating people, places, organizations and events extracted from Wikipedia and linking them on a time scale. The approach consists of two phases: (1) identifying relevant pages by categorizing the articles as containing people, places or organizations; (2) generating a timeline linking named entities and extracting events and their time frames. We...

2015
Xiaojie Liu, Jian-Yun Nie

Concepts are often used in Medical Information Retrieval. In any concept-based method one has to extract concepts from texts (queries or documents). MetaMap is often used for this task. However, if the query is issued by a layperson, it may not contain the appropriate concept expressions and MetaMap will fail to extract the correct concepts. In this situation we need to explore other resources t...

Journal: CoRR, 2014
Kalpit V. Desai, Roopesh Ranjan

The Wikimedia Foundation has recently observed that newly joining editors on Wikipedia are increasingly failing to integrate into the Wikipedia editors’ community, i.e. the community is becoming increasingly hard to penetrate [1]. To sustain healthy growth of the community, the Wikimedia Foundation aims to quantitatively understand the factors that determine editing behavior, and explain ...

2012
Andias Wira-Alam, Brigitte Mathiak

In this paper, we discuss aspects of mining links and text snippets from Wikipedia as a new knowledge base. Current knowledge bases, e.g. DBPedia [1], cover mainly the structured part of Wikipedia, but not the content as a whole. Acting as a complement, we focus on extracting information from the text of the articles. We extract a database of the hyperlinks between Wikipedia articles and pop...

[Chart: number of search results per year]