COVER: conformational oversampling as data augmentation for molecules
نویسندگان
چکیده
منابع مشابه
Dropout as data augmentation
Dropout is typically interpreted as bagging a large number of models sharing parameters. We show that using dropout in a network can also be interpreted as a kind of data augmentation in the input space without domain knowledge. We present an approach to projecting the dropout noise within a network back into the input space, thereby generating augmented versions of the training data, and we sh...
متن کاملSMILES Enumeration as Data Augmentation for Neural Network Modeling of Molecules
Simplified Molecular Input Line Entry System (SMILES) is a single line text representation of a unique molecule. One molecule can however have multiple SMILES strings, which is a reason that canonical SMILES have been defined, which ensures a one to one correspondence between SMILES string and molecule. Here the fact that multiple SMILES represent the same molecule is explored as a technique fo...
متن کاملAdaptive Oversampling for Imbalanced Data Classification
Data imbalance is known to significantly hinder the generalization performance of supervised learning algorithms. A common strategy to overcome this challenge is synthetic oversampling, where synthetic minority class examples are generated to balance the distribution between the examples of the majority and minority classes. We present a novel adaptive oversampling algorithm, VIRTUAL, that comb...
متن کاملData augmentation for diffusions
The problem of formal likelihood-based (either classical or Bayesian) inference for discretely observed multi-dimensional diffusions is particularly challenging. In principle this involves data-augmentation of the observation data to give representations of the entire diffusion trajectory. Most currently proposed methodology splits broadly into two classes: either through the discretisation of ...
متن کاملPerformance Analysis of Oversampling Data Recovery Circuit∗
In this paper an analysis on the oversampling data recovery circuit is presented. The input waveform is assumed to be non-return-zero (NRZ) binary signals. A finite Markov chain model is used to evaluate the steady-state phase jitter performance. Theoretical analysis enables us to predict the input signal-to-noise ratio (SNR) versus bit error rate (BER) of the oversampling data recovery circuit...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Journal of Cheminformatics
سال: 2020
ISSN: 1758-2946
DOI: 10.1186/s13321-020-00420-z