Tracking Typological Traits of Uralic Languages in Distributed Language Representations

Authors

  • Johannes Bjerva
  • Isabelle Augenstein
Abstract

Although linguistic typology has a long history, computational approaches have only recently gained popularity. The use of distributed representations in computational linguistics has also become increasingly popular. A recent development is to learn distributed representations of language, such that typologically similar languages are spatially close to one another. Although empirical successes have been shown for such language representations, they have not been subjected to much typological probing. In this paper, we first look at whether language representations of this type are empirically useful for model transfer between Uralic languages in deep neural networks. We then investigate which typological features are encoded in these representations by attempting to predict features in the World Atlas of Language Structures at various stages of fine-tuning of the representations. We focus on Uralic languages, and find that some typological traits can be automatically inferred with accuracies well above a strong baseline.

Similar resources

Polar Interrogatives in Uralic Languages: A Typological Perspective (Matti Miestamo, Helsinki)

The paper surveys the domain of polar interrogation in the Uralic language family in a typological perspective. An overview of the ways in which polar interrogation is marked in the world’s languages is presented and the encoding of the domain in Uralic languages is examined against this background. All the major types of polar interrogative marking are found in the family. Polar interrogatives...

Mirror Neurons and (Inter)subjectivity: Typological Evidence from East Asian Languages

Language is primarily constituted by action and interaction based on sensorimotor information. This paper demonstrates the nature of subjectivity and intersubjectivity through the neural mechanism and typological evidence of sentence-final particles from East Asian languages and extends to the discussion of the relationship between them. I propose that intersubjectivity is a kind of embedded or ...

From Phonology to Syntax: Unsupervised Linguistic Typology at Different Levels with Language Embeddings

A core part of linguistic typology is the classification of languages according to linguistic properties, such as those detailed in the World Atlas of Language Structures (WALS). Doing this manually is prohibitively time-consuming, which is in part evidenced by the fact that only 100 out of over 7,000 languages spoken in the world are fully covered in WALS. We learn distributed language represen...

Language, Emotion and Metapragmatics: A Theory Based on Typological Evidence

Humans are equipped with some universal or language-specific abilities to recognize emotions. However, because emotional content differs across languages and cultures, people with different cultural backgrounds have different metapragmatic abilities to recognize and express emotions. A hypothesis concerning emotional effects about intonation and particle ...

Adjectives Describing Surface Texture: Towards Lexical Typology (oral presentation; theme session: Lexical typology of qualitative concepts)

This paper deals with adjectives describing surface texture (‘slippery’, ‘smooth’, ‘level’, ‘rough’, etc.). The language sample comprises Russian, English, Chinese, Spanish, Korean, and a set of the Uralic languages (Finnish, Estonian, Erzya, Mari, Komi, Udmurt, Hungarian, Khanty, Nenets, Selqup). Their in-depth study was aimed at exploring the dependence between the genetic proximity of languag...

Journal title:
  • CoRR

Volume abs/1711.05468

Publication date: 2017