A corpus study on topic shifting discourse macro markers in TED talks
Similar resources
An Arabic-Hebrew parallel corpus of TED talks
We describe an Arabic-Hebrew parallel corpus of TED talks built upon WIT, the Web inventory that repurposes the original content of the TED website in a way which is more convenient for MT researchers. The benchmark consists of about 2,000 talks, whose subtitles in Arabic and Hebrew have been accurately aligned and rearranged in sentences, for a total of about 3.5M tokens per language. Talks ha...
Enhancing the TED-LIUM Corpus with Selected Data for Language Modeling and More TED Talks
In this paper, we present improvements made to the TED-LIUM corpus we released in 2012. These enhancements fall into two categories. First, we describe how we filtered publicly available monolingual data and used it to estimate well-suited language models (LMs), using open-source tools. Then, we describe the process of selection we applied to new acoustic data from TED talks, providing addition...
Corpus Annotation of Macro Discourse Structures
We present our discourse annotation project, ANNODIS, which aims to make available a diversified French corpus annotated with discourse information, along with a set of tools for annotation and corpus exploitation. An original aspect of the project is that it combines two theoretically and methodologically different points of view on discourse: bottom-up and top-down. In the bottom-up perspecti...
Building a Chinese discourse topic corpus with a micro-topic scheme based on theme-rheme theory
Background: How to build a suitable discourse topic structure is an important issue in discourse topic analysis, which is the core of natural language understanding. Not only is it the key ba...
Investigation of (Meta)discourse Markers in ELT Coursebooks
The present study aimed to investigate the representation of discourse markers and metadiscourse markers in conversations and readings of general ELT coursebook series used in the language centers of Iran. To this aim, four ELT coursebooks popularly taught in language centers of this country were analyzed based on Fung and Carter’s (2007) framework regarding discourse markers and Hyland’s (2005) fr...
Journal
Journal title: e-Kafkas Eğitim Araştırmaları Dergisi
Year: 2020
ISSN: 2148-8940
DOI: 10.30900/kafkasegt.742904