Fine-tuning and multilingual pre-training for abstractive summarization task for the Arabic language

Abstract

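The main task of our research is to train various abstractive summarization models for the Arabic language. Work on abstractive Arabic text summarization has hardly begun so far, due to the unavailability of the datasets needed for that. In our previous research, we created the first monolingual corpus in the Arabic language for abstractive text summarization. Based on this corpus, we fine-tuned various transformer models. We tested the PreSumm and multilingual BART models and achieved a “state of the art” result in this area with the PreSumm method. The present study continues the same series of research. We extended our corpus “AraSum” and managed to reach up to 50 thousand items, each consisting of an article and its corresponding lead. In addition, we pretrained our own trilingual models and fine-tuned them, in addition to the mT5 model, for abstractive text summarization for the same language, using the AraSum corpus. While there is room for improvement in the resources and the infrastructure we possess, the results clearly demonstrate that most of our models surpassed XL-Sum, which is considered to be the state of the art for abstractive Arabic text summarization so far. Our corpus “AraSum” will be released to facilitate future work on abstractive Arabic text summarization.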

Similar resources

Multilingual Natural Language Generation within Abstractive Summarization

With the tremendous amount of textual data available on the Internet, techniques for abstractive text summarization are becoming increasingly valued. In this paper, we present work in progress that tackles the problem of multilingual text summarization using semantic representations. Our system is based on abstract linguistic structures obtained from an analysis pipeline of disambiguation, synta...

Task Knowledge in Abstractive Summarization

This paper discusses the path towards abstractive summarization and proposes a new knowledge-based methodology called KBABS as a step forward on this path. We propose to use both world knowledge, to identify useful content, and task knowledge, to filter out unreliable content, to generate more accurate summaries. This approach was implemented for guided summarization. The evaluation shows that,...

The Search for the Self in Beckett's Theatre: Waiting for Godot and Endgame

This thesis is based upon the works of Samuel Beckett, one of the greatest writers of contemporary literature. Here, I have tried to focus on one of the main themes in Beckett's works: the search for the real "me" or the real self, which is a problem to be solved not only for Beckett's man but also for each of us. I have tried to show Beckett's techniques in approaching this unattainable goal, base...

Developing a pattern based on speech acts and language functions for developing materials for the course “The Study of Islamic Texts Translation”

The aim of the present study is to propose a pattern based on speech acts and language functions for developing the materials of the course "The Study of Islamic Texts Translation". In the new pattern, in order to develop better and more engaging materials, and unlike the existing books, Austin's (1962) levels of speech acts, Searle's (1976) classification of speech acts, and Halliday's (1978) language functions were used. For this purpose, 57 verses were randomly selected from different parts of the Quran...

Actor-Critic based Training Framework for Abstractive Summarization

We present a training framework for neural abstractive summarization based on actor-critic approaches from reinforcement learning. In traditional neural network based methods, the objective is only to maximize the likelihood of the predicted summaries; no other assessment constraints are considered, which may generate low-quality summaries or even incorrect sentences. To alleviate this prob...

Journal

Journal title: Az Eszterházy Károly Tanárképző Főiskola tudományos közleményei

Year: 2023

ISSN: 1216-6014, 1787-6117, 1787-5021, 1589-6498

DOI: https://doi.org/10.33039/ami.2022.11.002