Search results for: modern arabic

Number of results: 279409

2014
Chu-Cheng Lin Waleed Ammar Lori S. Levin Chris Dyer

We describe the CMU submission for the 2014 shared task on language identification in code-switched data. We participated in all four language pairs: Spanish–English, Mandarin–English, Nepali–English, and Modern Standard Arabic–Arabic dialects. After describing our CRF-based baseline system, we discuss three extensions for learning from unlabeled data: semi-supervised learning, word embeddings,...

Journal: Scientific American 1891

2012
Rabih Zbib Erika Malchiodi Jacob Devlin David Stallard Spyridon Matsoukas Richard M. Schwartz John Makhoul Omar Zaidan Chris Callison-Burch

Arabic Dialects present many challenges for machine translation, not least of which is the lack of data resources. We use crowdsourcing to cheaply and quickly build Levantine-English and Egyptian-English parallel corpora, consisting of 1.1M words and 380k words, respectively. The dialectal sentences are selected from a large corpus of Arabic web text, and translated using Amazon’s Mechanical Tur...

2017
Taha Zerrouki Amar Balla

Arabic diacritics are often omitted in Arabic script. Their absence is a handicap for new learners reading Arabic, for text-to-speech conversion systems, and for reading and semantic analysis of Arabic texts. Automatic diacritization systems are the best solution to this issue, but such automation needs resources such as diacritized texts to train and evaluate these systems. In this paper, we describe ...

2004
Ahmed Abdelali James Cowie Hamdy S. Soliman

Language Engineering, including Information Retrieval, Machine Translation and other Natural Language-related disciplines, has in recent years shown growing interest in the Arabic language. Suitable resources for Arabic are becoming a vital necessity for the progress of this research. Until recently, only two Arabic corpora were commonly available for researchers: the AFP Arabic newswire from LDC...

2011
Omar Zaidan Chris Callison-Burch

The written form of Arabic, Modern Standard Arabic (MSA), differs quite a bit from the spoken dialects of Arabic, which are the true “native” languages of Arabic speakers used in daily life. However, due to MSA’s prevalence in written form, almost all Arabic datasets have predominantly MSA content. We present the Arabic Online Commentary Dataset, a 52M-word monolingual dataset rich in dialectal...

2014
Ryan Cotterell Chris Callison-Burch

This paper presents a multi-dialect, multi-genre, human annotated corpus of dialectal Arabic with data obtained from both online newspaper commentary and Twitter. Most Arabic corpora are small and focus on Modern Standard Arabic (MSA). There has been recent interest, however, in the construction of dialectal Arabic corpora (Zaidan and Callison-Burch, 2011a; Al-Sabbagh and Girju, 2012). This wor...

2012
Nizar Habash Mona T. Diab Owen Rambow

Dialectal Arabic (DA) refers to the day-to-day vernaculars spoken in the Arab world. DA lives side-by-side with the official language, Modern Standard Arabic (MSA). DA differs from MSA on all levels of linguistic representation, from phonology and morphology to lexicon and syntax. Unlike MSA, DA has no standard orthography since there are no Arabic dialect academies, nor is there a large edited...

Journal: Int. Arab J. Inf. Technol. 2012
Ahmad T. Al-Taani Mohammed M. Msallam Sana A. Wedian

Parsing of Arabic sentences is a necessary mechanism for many natural language processing applications such as machine translation, question answering, knowledge extraction and information retrieval. In this study, we present a top-down chart parser for parsing simple Arabic sentences, including nominal and verbal sentences, within a domain-specific Arabic grammar. We used the Context Free Grammar...
