Tracking Typological Traits of Uralic Languages in Distributed Language Representations
Abstract
Although linguistic typology has a long history, computational approaches have only recently gained popularity. The use of distributed representations in computational linguistics has also become increasingly popular. A recent development is to learn distributed representations of language, such that typologically similar languages are spatially close to one another. Although empirical successes have been shown for such language representations, they have not been subjected to much typological probing. In this paper, we first look at whether this type of language representation is empirically useful for model transfer between Uralic languages in deep neural networks. We then investigate which typological features are encoded in these representations by attempting to predict features in the World Atlas of Language Structures, at various stages of fine-tuning of the representations. We focus on Uralic languages, and find that some typological traits can be automatically inferred with accuracies well above a strong baseline.
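The probing idea in the abstract (predict a typological feature from a language's vector and compare against a baseline) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the language names, the 3-dimensional vectors, and the binary feature values are made up, and the nearest-neighbour classifier and majority-class baseline stand in for whatever classifier and baseline the paper actually uses.

```python
import math
from collections import Counter

# Hypothetical toy data: made-up "language vectors" and a made-up binary
# typological feature. These are NOT real embeddings or real WALS values.
VECTORS = {
    "lang_a": [0.9, 0.1, 0.0],
    "lang_b": [0.8, 0.2, 0.1],
    "lang_c": [0.7, 0.0, 0.2],
    "lang_d": [0.1, 0.9, 0.8],
    "lang_e": [0.0, 0.8, 0.9],
}
FEATURE = {"lang_a": "X", "lang_b": "X", "lang_c": "X",
           "lang_d": "Y", "lang_e": "Y"}


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def predict_nearest(target, vectors, labels):
    """Predict the feature of `target` from its nearest other language."""
    others = [lang for lang in vectors if lang != target]
    nearest = min(others, key=lambda lang: euclidean(vectors[target], vectors[lang]))
    return labels[nearest]


def predict_majority(target, labels):
    """Baseline: most frequent feature value among the other languages."""
    counts = Counter(value for lang, value in labels.items() if lang != target)
    return counts.most_common(1)[0][0]


def leave_one_out_accuracy(predict):
    """Hold out each language in turn and score the prediction for it."""
    hits = sum(predict(lang) == FEATURE[lang] for lang in FEATURE)
    return hits / len(FEATURE)


nn_acc = leave_one_out_accuracy(lambda lang: predict_nearest(lang, VECTORS, FEATURE))
base_acc = leave_one_out_accuracy(lambda lang: predict_majority(lang, FEATURE))
print(f"nearest-neighbour accuracy: {nn_acc:.2f}, majority baseline: {base_acc:.2f}")
```

In this toy setup the vectors were constructed so that languages sharing a feature value sit close together, which is why the nearest-neighbour probe beats the baseline; with real representations that gap is the empirical question.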
Similar resources
Matti Miestamo (Helsinki). Polar Interrogatives in Uralic Languages: A Typological Perspective
The paper surveys the domain of polar interrogation in the Uralic language family in a typological perspective. An overview of the ways in which polar interrogation is marked in the world’s languages is presented and the encoding of the domain in Uralic languages is examined against this background. All the major types of polar interrogative marking are found in the family. Polar interrogatives...
Mirror Neurons and (Inter)subjectivity: Typological Evidence from East Asian Languages
Language is primarily constituted by action and interaction based on sensorimotor information. This paper demonstrates the nature of subjectivity and intersubjectivity through the neural mechanism and typological evidence of sentence-final particles from East Asian languages and extends to the discussion of the relationship between them. I propose that intersubjectivity is a kind of embedded or ...
From Phonology to Syntax: Unsupervised Linguistic Typology at Different Levels with Language Embeddings
A core part of linguistic typology is the classification of languages according to linguistic properties, such as those detailed in the World Atlas of Language Structures (WALS). Doing this manually is prohibitively time-consuming, which is in part evidenced by the fact that only 100 out of over 7,000 languages spoken in the world are fully covered in WALS. We learn distributed language represen...
Language, Emotion and Metapragmatics: A Theory Based on Typological Evidence
Humans are equipped with some universal or language-specific abilities to recognize emotions. However, because of the different emotional contents in diverse languages and the relevant cultural differences, humans with different cultural backgrounds own different metapragmatical abilities to recognize and express emotions. A hypothesis concerning emotional effects about intonation and particle ...
Adjectives Describing Surface Texture: Towards Lexical Typology. Category: oral. Theme session: Lexical typology of qualitative concepts
This paper deals with adjectives describing surface texture (‘slippery’, ‘smooth’, ‘level’, ‘rough’, etc.). The language sample comprises Russian, English, Chinese, Spanish, Korean, and a set of the Uralic languages (Finnish, Estonian, Erzya, Mari, Komi, Udmurt, Hungarian, Khanty, Nenets, Selqup). Their in-depth study was aimed at exploring the dependence between the genetic proximity of languag...
Journal title:
- CoRR
Volume abs/1711.05468
Publication date 2017