text domain

Domain Specific Automatic Question Generation from Text

2017

Katira Soleymanzadeh

The goal of my doctoral thesis is to automatically generate interrogative sentences from descriptive sentences of Turkish biology text. We employ syntactic and semantic approaches to parse descriptive sentences. The syntactic and semantic approaches use syntactic (constituent or dependency) parsing and semantic role labeling systems, respectively. After the parsing step, question statements whose an...

One-Class Clustering in the Text Domain

2008

Ron Bekkerman, Koby Crammer

Having seen a news title “Alba denies wedding reports”, how do we infer that it is primarily about Jessica Alba, rather than about weddings or reports? We probably realize that, in a randomly driven sentence, the word “Alba” is less anticipated than “wedding” or “reports”, which adds value to the word “Alba” if used. Such anticipation can be modeled as a ratio between an empirical probability o...

Domain Based Classification of Punjabi Text Documents

2012

Nidhi Krail, Vishal Gupta

The dramatic increase in the amount of content available in digital form gives rise to the problem of managing this online textual data. As a result, it has become necessary to classify large texts (documents) into specific classes. Text classification is a text mining technique used to classify text documents into predefined classes. Most text classification techniques wor...

Unsupervised Domain Adaptation based on Text Relatedness

2011

Georgios Petasis

In this paper an unsupervised approach to domain adaptation is presented, which exploits external knowledge sources in order to port a classification model into a new thematic domain. Our approach extracts a new feature set from documents of the target domain, and tries to align the new features to the original ones, by exploiting text relatedness from external knowledge sources, such as WordNe...

Domain Structure, Rhetorical Structure, And Text Structure

1993

Penelope Sibun

It is generally agreed that text has structure (at least, coherent text does). Therefore, an understanding and appreciation of text structure must play some role in building computational systems that are capable of using text as people do. What is less clear is what are necessary and sufficient sources of structure for a text-using system, and further, what such a system needs to know about an...

Domain Ontology Construction from Biomedical Text

2007

Saurav Sahay, Baoli Li, Ernest V. Garcia, Eugene Agichtein, Ashwin Ram

NLM's Unified Medical Language System (UMLS) is a very large ontology of biomedical and health data. In order to be used effectively for knowledge processing, it needs to be customized to a specific domain. In this paper, we present techniques to automatically discover domain-specific concepts, discover relationships between these concepts, build a context map from these relationships, link the...

Compression-Domain Text Indexing and Retrieval

1997

Tzi-cker Chiueh, Srinidhi Varadarajan

Keyword-based text retrieval engines have been and will continue to be essential to text-based information access systems because they serve as the basic building blocks to high-level text analysis systems. Traditionally, text compression and text retrieval are treated as independent problems. Text files are compressed and indexed separately. To answer a keyword-based query, text files are first uncom...

Crystal: Learning Domain-specific Text Analysis Rules

1996

Stephen G. Soderland

An enormous amount of knowledge is needed to infer the meaning of unrestricted natural language. The problem can be reduced to a manageable size by restricting attention to a predefined set of concepts in a specific domain. Two widely different domains are used to illustrate this domain-specific approach. One domain is a collection of Wall Street Journal articles in which the target concept is mana...

A CAPTCHA in the Text Domain

2006

Pablo Ximenes, André L. M. dos Santos, Marcial Fernandez, Joaquim Celestino

Research on CAPTCHA has led CAPTCHA design to adopt almost exclusively graphical implementations that deal mostly with character recognition. This has reached an exhaustion point, where new approaches are vital to the survival of the technique. This paper discusses the early stages of research that intends to solve the open problem of a CAPTCHA in the text domain, offering, this way, inno...

Supervised Ranking in Open-Domain Text Summarization

2002

Tadashi Nomoto, Yuji Matsumoto

The paper proposes and empirically motivates an integration of supervised learning with unsupervised learning to deal with human biases in summarization. In particular, we explore the use of a probabilistic decision tree within the clustering framework to account for the variation as well as the regularity in human-created summaries. The corpus of human-created extracts is created from a newspaper co...
