Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers

Authors

  • Ji Gao
  • Jack Lanchantin
  • Mary Lou Soffa
  • Yanjun Qi
Abstract

Although various techniques have been proposed to generate adversarial samples for white-box attacks on text, little attention has been paid to black-box attacks, which are more realistic scenarios. In this paper, we present a novel algorithm, DeepWordBug, to effectively generate small text perturbations in a black-box setting that force a deep-learning classifier to misclassify a text input. We introduce novel scoring strategies to find the most important tokens to modify such that the classifier will make a wrong prediction. Simple character-level transformations are applied to the highest-ranked tokens in order to minimize the edit distance of the perturbation, yet change the original classification. We evaluated DeepWordBug on eight real-world text datasets covering text classification, sentiment analysis, and spam detection. We compare the results of DeepWordBug with two baselines: Random (black-box) and Gradient (white-box). Our experimental results indicate that DeepWordBug reduces the original classification accuracy by up to 63% on average for a Word-LSTM model and up to 46% on average for a Char-CNN model, both of which are state-of-the-art deep-learning models.
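
The abstract outlines the two core steps of DeepWordBug: a black-box scoring pass that ranks tokens by their influence on the prediction, followed by small character-level transformations of the top-ranked tokens. The Python sketch below illustrates one plausible instantiation of this loop; the predict_proba interface, the leave-one-out scoring, and the max_edits budget are illustrative assumptions for this sketch, not the paper's exact formulation.

    import random
    import string

    def perturb(token):
        """One random character-level transformation (swap, substitution,
        deletion, or insertion), keeping the edit distance small."""
        if len(token) < 3:
            return token
        i = random.randrange(1, len(token) - 1)  # leave first/last characters intact
        op = random.choice(("swap", "sub", "del", "ins"))
        if op == "swap":
            return token[:i] + token[i + 1] + token[i] + token[i + 2:]
        if op == "sub":
            return token[:i] + random.choice(string.ascii_lowercase) + token[i + 1:]
        if op == "del":
            return token[:i] + token[i + 1:]
        return token[:i] + random.choice(string.ascii_lowercase) + token[i:]  # "ins"

    def attack(text, predict_proba, max_edits=5):
        """Greedy black-box attack sketch. predict_proba is an assumed
        black-box callable: text -> {label: probability}. Tokens are ranked
        by a leave-one-out score, then perturbed one by one until the
        predicted label flips or the edit budget is exhausted."""
        tokens = text.split()
        probs = predict_proba(text)
        orig_label = max(probs, key=probs.get)

        def score(i):
            # Confidence drop in the original label when token i is removed.
            reduced = " ".join(tokens[:i] + tokens[i + 1:])
            return probs[orig_label] - predict_proba(reduced)[orig_label]

        ranked = sorted(range(len(tokens)), key=score, reverse=True)
        for i in ranked[:max_edits]:
            tokens[i] = perturb(tokens[i])
            adversarial = " ".join(tokens)
            new_probs = predict_proba(adversarial)
            if max(new_probs, key=new_probs.get) != orig_label:
                return adversarial  # misclassification achieved
        return None  # no adversarial example found within the budget

Intuitively, a one-character edit such as changing "terrible" to "terrib1e" keeps the text readable to a human but typically maps the token outside a word-level model's vocabulary, which is why such small perturbations can flip a Word-LSTM's prediction while keeping the edit distance low.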

Similar resources

DANCin SEQ2SEQ: Fooling Text Classifiers with Adversarial Text Example Generation

Machine learning models are powerful but fallible. Generating adversarial examples (inputs deliberately crafted to cause model misclassification or other errors) can yield important insight into model assumptions and vulnerabilities. Despite significant recent work on adversarial example generation targeting image classifiers, relatively little work exists exploring adversarial example generation...

Improvement of generative adversarial networks for automatic text-to-image generation

This research concerns the use of deep learning tools and image processing technology for the automatic generation of images from text. Previous research has used a single sentence to produce an image. In this research, a memory-based hierarchical model is presented that uses three different descriptions, given as sentences, to produce and refine the image. The proposed ...

Generic Black-Box End-to-End Attack Against State of the Art API Call Based Malware Classifiers

Deep neural networks (DNNs) are used to solve complex classification problems for which other machine learning classifiers, such as SVMs, fall short. Recurrent neural networks (RNNs) have been used for tasks that involve sequential inputs, such as speech-to-text. In the cyber security domain, RNNs based on API calls have been used effectively to classify previously unencountered malware. In t...

Blocking Transferability of Adversarial Examples in Black-Box Learning Systems

Advances in Machine Learning (ML) have led to its adoption as an integral component in many applications, including banking, medical diagnosis, and driverless cars. To further broaden the use of ML models, cloud providers such as Microsoft, Amazon, and Google have developed ML-as-a-service tools offered as black-box systems. However, ML classifiers are vulnerable to adversarial examples...

Cascade Adversarial Machine Learning Regularized with a Unified Embedding

Deep neural network classifiers are vulnerable to small input perturbations carefully generated by adversaries. Injecting adversarial inputs during training, known as adversarial training, can improve robustness against one-step attacks, but not against unknown iterative attacks. To address this challenge, we propose to utilize the embedding space for both classification and low-level (pixel-level)...

Journal:
  • CoRR

Volume: abs/1801.04354  Issue: -

Pages: -

Publication date: 2018