BERT Models for Arabic Text Classification: A Systematic Review
Authors
Abstract
Bidirectional Encoder Representations from Transformers (BERT) has gained increasing attention from researchers and practitioners as it has proven to be an invaluable technique in natural language processing. This is mainly due to its unique features, including its ability to predict words conditioned on both the left and the right context, and its ability to be pretrained using the plain text corpus that is enormously available on the web. As BERT gained more interest, more BERT models were introduced to support different languages, including Arabic. The current state of knowledge and practice in applying BERT models to Arabic text classification is limited. In an attempt to begin remedying this gap, this review synthesizes the different Arabic BERT models that have been applied to text classification. It investigates the differences between them and compares their performance. It also examines how effective they are compared to the original English BERT models. It concludes by offering insight into aspects that need further improvements and future work.
Similar resources
Arabic Text Watermarking: A Review
The use of the internet, with its technologies and applications, has increased rapidly, so protecting text from illegal use is much needed. Text watermarking is used for this purpose. Arabic text has many characteristics, such as the existence of diacritics, kashida (extension character), and points above or under its letters. Each Arabic letter can take different shapes with different Un...
A Comparative Study on Arabic Text Classification
This paper focuses on automatic Arabic text classification. Arabic is a highly inflectional and derivational language, which makes text mining a complex task. Many experimental results have been published on classifying Arabic text. Since these results come from different datasets, authors, and evaluation metrics, we cannot compare the performance of the classifiers tested. In this pape...
Text Summarization as Feature Selection for Arabic Text Classification
Text classification (TC), or text categorization, is the task of assigning a document to one or more predefined classes or categories. A common problem in TC is the high number of terms or features in the document(s) to be classified (the curse of dimensionality). This problem can be solved by selecting the most important terms. In this study, automatic text summarization is used for feature selection. S...
High capacity steganography tool for Arabic text using 'Kashida'
Steganography is the ability to hide secret information in cover media such as sound, pictures, and text. A new approach is proposed to hide a secret in Arabic text cover media using "Kashida", an Arabic extension character. The proposed approach is an attempt to maximize the use of "Kashida" to hide more information in Arabic text cover media. To approach this, some algorithms have been des...
Adaptive models of Arabic text
The main aim of this thesis is to build adaptive language models of Arabic text that can achieve the best compression performance over existing models. Prediction by partial matching (PPM) language models have been the best performing among adaptive language models over the past three decades in terms of compression performance. In order to get such performance for Arabic text, the ri...
Journal
Journal title: Applied Sciences
Year: 2022
ISSN: 2076-3417
DOI: https://doi.org/10.3390/app12115720