Search results for: dataset generation

Number of results: 446131

2013
Xiaowei Zhan Dajiang J. Liu

Summary: We develop TaSer (TabAnno and SeqMiner), a toolkit for annotating and querying next-generation sequence (NGS) datasets in tab-delimited files. TabAnno is a powerful and efficient command-line tool designed to pre-process sequence data, annotate variations and generate an indexed feature-enriched project file that can integrate multiple sources of information. Using the project file gene...

2014
Mario Mezzanzanica Roberto Boselli Mirko Cesarini Fabio Mercorio

Research on data quality is growing in importance in both industrial and academic communities, as it aims at deriving knowledge (and then value) from data. Information Systems generate a lot of data useful for studying the dynamics of subjects’ behaviours or phenomena over time, making the quality of data a crucial aspect for guaranteeing the believability of the overall knowledge discovery pro...

2004
Viral Parekh Jin-Ping Gwo Timothy W. Finin

In the geoscience domain, large amounts of data are accessible; however, they vary in format and are stored at various organizations, leading to problems of data discovery, data interoperability and usability. In this paper, we propose a new semantic metadata paradigm based on ontologies and the use of Semantic Web languages. Our suggested data model ontology is used to guide the generation of metad...

2011
Mehmet HACIBEYOĞLU Fatih BAŞÇİFTÇİ Şirzat KAHRAMANLI

The goal of attribute reduction is to find a minimal subset (MS) R of the condition attribute set C of a dataset such that R has the same classification power as C. It was proved that the number of MSs for a dataset with n attributes may be as large as $\binom{n}{n/2}$ and the generation of all of them is an NP-hard problem. The main reason for this is the intractable space complexity of the conversion o...

2014
Dhinaharan Nagamalai Shampa Sengupta Asit Kumar Das

In today’s changing world, a huge amount of data is generated and transferred frequently. Although the data is sometimes static, most commonly it is dynamic and transactional. Newly generated data is constantly added to the old/existing data. To discover the knowledge from this incremental data, one approach is to run the algorithm repeatedly for the modified data sets which...

2017
Luowei Zhou Chenliang Xu Jason J. Corso

Learning from instructional video is a promising direction that may help ground the vision and language problem. To move toward this goal, we collect a large-scale cooking video dataset, called YouCookII, with 2000 videos downloaded from YouTube. All the videos are untrimmed, recorded in unconstrained environments and from a third-person viewpoint. They represent a more challenging visual problem than exis...

2016
Osea Giuntella

This study examines the birth weight of second- and third-generation Hispanics born in California and Florida, two of the major immigrant destination states in the US. I exploit a unique dataset of linked birth records for two generations of children born in California and Florida (1970-2009) and linear probability models to investigate the generational decline in the birth outcomes of Hispanics...

2015
Weina Ma Kamran Sartipi

User behavior pattern mining has drawn great attention in business and security areas. Realistic and accurate datasets are required for evaluating various user behavior pattern mining approaches, their implementations and optimization results. Synthetic datasets are crucial due to restricted access to production datasets, security and privacy issues, meeting specific needs of consumers, or the ...

Journal: CoRR 2018
Jaro Milan Zink

This master's thesis addresses the subject of automatically generating a dataset for image recognition, which takes a lot of time when done manually. As the thesis was written with motivation from the context of the biodiversity workgroup at the City University of Applied Sciences Bremen, the classification of taxonomic entries was chosen as an exemplary use case. In order to automate the d...

2017
Jiwei Tan Xiaojun Wan Jianguo Xiao

Headline generation is an abstractive text summarization task that has previously suffered from the immaturity of natural language generation techniques. The recent success of neural sentence summarization models shows the capacity of generating informative, fluent headlines conditioned on selected recapitulative sentences. In this paper, we investigate the extension of sentence summarization models t...
