Search results for: dataset generation

Number of results: 446131

2008
Cameron W. Potter Debra Lew Jim McCaa Sam Cheng Scott Eichelberger Eric Grimit

The Western Wind and Solar Integration Study (WWSIS) is one of the world’s largest regional integration studies to date. This paper discusses the creation of the wind dataset that will be the basis for assessing the operating impacts and mitigation options due to the variability and uncertainty of wind power on the utility grids. The dataset is based on output from a mesoscale numerical weather...

Journal: :Transactions of the Association for Computational Linguistics 2023

Abstract The recognition of dataset names is a critical task for automatic information extraction in scientific literature, enabling researchers to understand and identify research opportunities. However, existing corpora for dataset mention detection are limited in size and naming diversity. In this paper, we introduce the Dataset Mentions Detection (DMDD), the largest publicly available corpus for this task. DMDD consists m...

2013
Nikita Bhatt Amit Thakkar Amit Ganatra Nirav Bhatt

Classification is a machine learning technique used to categorize input patterns into different classes. Selecting the best classifier for a given dataset is one of the critical issues in classification. Using a cross-validation approach, it is possible to apply candidate algorithms to a given dataset, and the best classifier is selected by considering various evaluation measure...

Journal: :IJIIDS 2010
James O'Shea Zuhair Bandar Keeley A. Crockett David McLean

Short Text Semantic Similarity measurement is a new and rapidly growing field of research. “Short texts” are typically sentence length but are not required to be grammatically correct. There is great potential for applying these measures in fields such as Information Retrieval, Dialogue Management and Question Answering. A dataset of 65 sentence pairs, with similarity ratings, produced in 2006 ...

2017
Yu Guo Bowen Yao Yue Liu

Automatically generating video captions with natural language remains a challenge for both the field of natural language processing and computer vision. Recurrent Neural Networks (RNNs), which model sequence dynamics, have proved to be effective in visual interpretation. Based on a recent sequence to sequence model for video captioning, which is designed to learn the temporal structure of the se...

Journal: :CoRR 2016
Xiaohang Ren Kai Chen Jun Sun

Scene text recognition plays an important role in many computer vision applications. The small size of publicly available scene text datasets is the main challenge when training a text recognition CNN model. In this paper, we propose a CNN based Chinese text recognition algorithm. To enlarge the dataset for training the CNN model, we design a synthetic data engine for Chinese scene char...

2013

This paper introduces a non-parametric data synthesizing algorithm to generate privacy-safe “realistic but not real” synthetic health data. The proposed algorithm synthesizes artificial records while preserving the statistical characteristics of the original data to the extent possible. The risk from “database linking attack” is quantified by an l-diversified data generation process. Moreover it...

2016
Yusuke Watanabe Kazuma Hashimoto Yoshimasa Tsuruoka

We propose a simple domain adaptation method for neural networks in a supervised setting. Supervised domain adaptation is a way of improving the generalization performance on the target domain by using the source domain dataset, assuming that both of the datasets are labeled. Recently, recurrent neural networks have been shown to be successful on a variety of NLP tasks such as caption generatio...

Journal: :The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 2019

2009
Yu Lin Hiroshi Tarui Peter Simons

We develop an OWL ontology, OGI (Ontology for Genetic Interval), for the formalization of genomic elements by defining them as a Genetic Interval. Based on OGI’s definition of Genetic Interval Relations, which are derived from the Allen interval calculus, we attempt to represent the relationships among contigs and sequence data from next generation sequencing. A real dataset generated from the b...
