Search results for: arabic

Number of results: 95942

2012
Rabih Zbib Erika Malchiodi Jacob Devlin David Stallard Spyridon Matsoukas Richard M. Schwartz John Makhoul Omar Zaidan Chris Callison-Burch

Arabic dialects present many challenges for machine translation, not least of which is the lack of data resources. We use crowdsourcing to cheaply and quickly build Levantine-English and Egyptian-English parallel corpora, consisting of 1.1M words and 380k words, respectively. The dialectal sentences are selected from a large corpus of Arabic web text, and translated using Amazon’s Mechanical Tur...

2017
Taha Zerrouki Amar Balla

Arabic diacritics are often omitted in Arabic script. This omission is a handicap for new learners reading Arabic, for text-to-speech conversion systems, and for the reading and semantic analysis of Arabic texts. Automatic diacritization systems are the best solution to this issue, but such automation needs resources such as diacritized texts to train and evaluate these systems. In this paper, we describe ...

2004
Ahmed Abdelali James Cowie Hamdy S. Soliman

Language Engineering, including Information Retrieval, Machine Translation and other Natural Language-related disciplines, has shown growing interest in the Arabic language in recent years. Suitable resources for Arabic are becoming a vital necessity for the progress of this research. Until recently, only two Arabic corpora were commonly available for researchers: the AFP Arabic newswire from LDC...

2016
Waseem Alromima Ibrahim F. Moawad Rania Elgohary Mostafa Aref

Semantic resources are important components of Information Retrieval (IR) applications such as search engines and Question Answering (QA); these resources should be available, readable and understandable. In the semantic web, ontology plays a central role in information retrieval, where it is used to retrieve more relevant information from unstructured information. This paper presents a semantic-based ...

2011
Mahmoud El-Haj Udo Kruschwitz Chris Fox

We present the results of our Arabic and English runs at the TAC 2011 Multilingual summarisation (MultiLing) task. We participated with centroid-based clustering for multidocument summarisation. The automatically generated Arabic and English summaries were evaluated by human participants and by two automatic evaluation metrics, ROUGE and AutoSummENG. The results are compared with the other syst...

2006
Areej Al-Wabil Panayiotis Zaphiris Stephanie Wilson

This paper reports results of a workshop on the design of electronic content for users with Specific Learning Difficulties (SpLD), particularly Arabic dyslexics. First we shed some light on the nature of the Arabic language and discuss features that account for the unique needs of Arabic users with reading disorders. Then we present recommendations for accessible web design for Arabic content i...

2003
David Stallard John Makhoul Fred Choi Ehry MacRostie Premkumar Natarajan Richard M. Schwartz Bushra Zawaydeh

We present a limited speech translation system for English and colloquial Levantine Arabic, which we are currently developing as part of the DARPA Babylon program. The system is intended for question/answer communication between an English-speaking operator and an Arabic-speaking subject. It uses speech recognition to convert a spoken English question into text, and plays out a pre-recorded spe...

2014
Mona Alshehri Stephen M. Watt

Typeface technology has become quite complex over the years. There have been several attempts to use Arabic calligraphic styles in computer typography. These proved to be useful, but they had their shortcomings and drawbacks. Computational time cost and lack of Arabic script documentation were the most crucial issues with that work. In a few studies, the accuracy of results obtained also was an...

2011
Omar Zaidan Chris Callison-Burch

The written form of Arabic, Modern Standard Arabic (MSA), differs quite a bit from the spoken dialects of Arabic, which are the true “native” languages of Arabic speakers used in daily life. However, due to MSA’s prevalence in written form, almost all Arabic datasets have predominantly MSA content. We present the Arabic Online Commentary Dataset, a 52M-word monolingual dataset rich in dialectal...

2014
Ryan Cotterell Chris Callison-Burch

This paper presents a multi-dialect, multi-genre, human annotated corpus of dialectal Arabic with data obtained from both online newspaper commentary and Twitter. Most Arabic corpora are small and focus on Modern Standard Arabic (MSA). There has been recent interest, however, in the construction of dialectal Arabic corpora (Zaidan and Callison-Burch, 2011a; Al-Sabbagh and Girju, 2012). This wor...
