Character-level Convolutional Networks for Text Classification
Abstract
This article offers an empirical exploration on the use of character-level convolutional networks (ConvNets) for text classification. We constructed several large-scale datasets to show that character-level convolutional networks could achieve state-of-the-art or competitive results. Comparisons are offered against traditional models such as bag of words, n-grams and their TFIDF variants, and deep learning models such as word-based ConvNets and recurrent neural networks.
Similar resources
Character-level Convolutional Network for Text Classification Applied to Chinese Corpus
Compared with word-level and sentence-level convolutional neural networks (ConvNets), the character-level ConvNets has a better applicability for misspellings and typos input. Due to this, recent researches for text classification mainly focus on character-level ConvNets. However, while the majority of these researches employ English corpus for the character-level text classification, few resea...
Very Deep Convolutional Networks for Text Classification
The dominant approach for many NLP tasks are recurrent neural networks, in particular LSTMs, and convolutional neural networks. However, these architectures are rather shallow in comparison to the deep convolutional networks which are very successful in computer vision. We present a new architecture for text processing which operates directly on the character level and uses only small convoluti...
Which Encoding is the Best for Text Classification in Chinese, English, Japanese and Korean?
This article offers an empirical study on the different ways of encoding Chinese, Japanese, Korean (CJK) and English languages for text classification. Different encoding levels are studied, including UTF-8 bytes, characters, words, romanized characters and romanized words. For all encoding levels, whenever applicable, we provide comparisons with linear models, fastText (Joulin et al., 2016) an...
Convolutional Neural Networks for Text Categorization: Shallow Word-level vs. Deep Character-level
This paper reports the performances of shallow word-level convolutional neural networks (CNN), our earlier work (2015) [3, 4], on the eight datasets with relatively large training data that were used for testing the very deep character-level CNN in Conneau et al. (2016) [1]. Our findings are as follows. The shallow word-level CNNs achieve better error rates than the error rates reported in [1] t...
A New Method to Improve Automated Classification of Heart Sound Signals: Filter Bank Learning in Convolutional Neural Networks
Introduction: Recent studies have acknowledged the potential of convolutional neural networks (CNNs) in distinguishing healthy and morbid samples by using heart sound analyses. Unfortunately the performance of CNNs is highly dependent on the filtering procedure which is applied to signal in their convolutional layer. The present study aimed to address this problem by a...
Publication date: 2015