Journal of the Text Encoding Initiative, Issue 1 | 2011
Abstract
This paper formulates a proposal for standardising spoken language transcription, as practised in conversation analysis, sociolinguistics, dialectology and related fields, with the help of the TEI guidelines. Two areas relevant to standardisation are identified and discussed: first, the macro structure of transcriptions, as embodied in the data models and file formats of transcription tools such as ELAN, Praat or EXMARaLDA; second, the micro structure of transcriptions, as embodied in transcription conventions such as CA, HIAT or GAT. A two-step process is described in which first the macro structure is represented in a generic TEI format based on elements defined in the P5 version of the Guidelines. In the second step, character data in this representation is parsed according to the regularities of a transcription convention, resulting in a more fine-grained TEI markup which is also based on P5. It is argued that this two-step process can, on the one hand, map idiosyncratic differences in tool formats and transcription conventions onto a unified representation. On the other hand, differences motivated by different theoretical decisions can be retained in a manner which still allows a common processing of data from different sources. In order to make the standard usable in practice, a conversion tool, TEI Drop, is presented which uses XSL transformations to carry out the conversion between different tool formats (CHAT, ELAN, EXMARaLDA, FOLKER and Transcriber) and the TEI representation of transcription macro structure (and vice versa) and which also provides methods for parsing the micro structure of transcriptions according to two different transcription conventions (HIAT and cGAT). Using this tool, transcribers can continue to work with software they are familiar with while still producing TEI-conformant transcription files. The paper concludes with a discussion of the work needed in order to establish the proposed standard. It is argued that both tool formats and the TEI guidelines are in a sufficiently mature state to serve as a basis for standardisation. Most work consequently remains in analysing and standardising differences between different transcription conventions.
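The second step described in the abstract, parsing the character data of an utterance according to a transcription convention, can be illustrated with a minimal Python sketch. It is not the TEI Drop implementation, which the abstract says is based on XSL transformations. The convention used here is a deliberately simplified, hypothetical one (a timed pause written as "(0.5)", a micro pause as "(.)", everything else treated as whitespace-separated words), not HIAT or cGAT. The function name `parse_utterance`, the speaker ID `SPK0` and the attribute values are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, simplified convention (not HIAT or cGAT):
#   "(0.5)" -> timed pause in seconds
#   "(.)"   -> micro pause
#   any other run of non-space, non-parenthesis characters -> word
TOKEN = re.compile(
    r"\((?P<timed>\d+(?:\.\d+)?)\)"   # timed pause, e.g. (0.5)
    r"|(?P<micro>\(\.\))"             # micro pause: (.)
    r"|(?P<word>[^\s()]+)"            # plain word
)


def parse_utterance(speaker_id, text):
    """Turn the character data of one utterance into finer-grained markup.

    Returns a <u> element whose children are <w> and <pause> elements.
    Characters that match none of the patterns (stray parentheses, for
    example) are skipped silently in this sketch.
    """
    u = ET.Element("u", {"who": "#" + speaker_id})
    for match in TOKEN.finditer(text):
        if match.group("timed") is not None:
            # Pause length expressed as an ISO 8601 duration (assumed encoding).
            ET.SubElement(u, "pause", {"dur": "PT" + match.group("timed") + "S"})
        elif match.group("micro") is not None:
            # The value "micro" is an illustrative label, not a prescribed one.
            ET.SubElement(u, "pause", {"type": "micro"})
        else:
            word = ET.SubElement(u, "w")
            word.text = match.group("word")
    return u


if __name__ == "__main__":
    utterance = parse_utterance("SPK0", "well (0.5) i think (.) so")
    print(ET.tostring(utterance, encoding="unicode"))
```

The input string stands for the unparsed character data produced by the first step; the output shows how a convention-specific parser could add word and pause markup without changing the enclosing utterance element.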
Similar resources
Music Encoding Initiative (MEI) DTD
This paper provides a technical introduction to the Music Encoding Initiative (MEI) DTD currently under development by the author. It is consciously modeled on the highly successful Text Encoding Initiative (TEI) DTD. The primary purpose of the MEI DTD is the creation of a comprehensive yet extensible standard for the encoding and transmission of music documents in electronic form.
The Music Encoding Initiative (MEI)
This paper draws parallels between the Text Encoding Initiative (TEI) and the proposed Music Encoding Initiative (MEI), reviews existing design principles for music representations, and describes an eXtensible Markup Language (XML) document type definition (DTD) for modeling music notation which attempts to incorporate those principles.
Encoding models for scholarly literature
In this chapter, the authors examine the issue of digital formats for document encoding, archiving and publishing, through the specific example of “born-digital” scholarly journal articles. This small area of electronic publishing represents a microcosm of the state of the art, and provides a good basis for this discussion. The authors will begin by looking at the traditional workflow of journa...
Journal of the Text Encoding Initiative, Issue 4 | 2013
This paper presents the Register of Early Modern Slovenian Manuscripts, which includes manuscripts from the 17th and 18th centuries that have been overlooked by scholars focused on printed books from the same era. The Register attempts to address this gap in Slovenian manuscript studies by describing these unknown and forgotten early modern manuscripts with facsimiles, an index of b...
Text Encoding Initiative Semantic Modeling. A Conceptual Workflow Proposal
In this paper we present a proposal for the XML TEI semantic enhancement, through an ontological modelization based on a three level approach: an ontological generalization of the TEI schema; an intensional semantics of TEI elements; an extensional semantics of the markup content. A possible TEI semantic enhancement will be the result of these three levels dialogue and combination. We conclude ...
Publication date: 2016