Incorporating structural topic modeling into short text analysis
نویسندگان
چکیده
Abstract The past few decades have seen the rapid development of topic modeling. So far, research has been more concerned with determining ideal number topics or meaningful clustering words than applying modeling techniques to evaluate linguistic theories. This study proposes Structural Topic Model (STM)-led framework facilitate interpretation results and standardize text analysis. STM encompasses various model training mechanisms, thereby requiring systematic designs properly combine language studies. “Structural” in refers inclusion metadata structure. Unlike corpus-based keyness approach, can capture contextual cues meta-information for topical results. Besides, make cross-corpora comparisons via contrast, a challenging task corpus-driven related models such as Biterm (BTM). Stylistic variations song lyrics are taken an illustration show how use suggested delve into theory proposed by Pennebaker (2013) . iterable paradigm clarify pronouns affect style distinction. We believe STM-led shed light on analysis conducting reproducible comparison short texts.
منابع مشابه
Incorporating Word Correlation Knowledge into Topic Modeling
This paper studies how to incorporate the external word correlation knowledge to improve the coherence of topic modeling. Existing topic models assume words are generated independently and lack the mechanism to utilize the rich similarity relationships among words to learn coherent topics. To solve this problem, we build a Markov Random Field (MRF) regularized Latent Dirichlet Allocation (LDA) ...
متن کاملTopic Modeling and Classification of Cyberspace Papers Using Text Mining
The global cyberspace networks provide individuals with platforms to can interact, exchange ideas, share information, provide social support, conduct business, create artistic media, play games, engage in political discussions, and many more. The term cyberspace has become a conventional means to describe anything associated with the Internet and the diverse Internet culture. In fact, cyberspac...
متن کاملIncorporating topic information into semantic analysis models
This paper reports experiments in classifying texts based upon their favorability towards the subject of the text using a feature set enriched with topic information on a small dataset of music reviews hand-annotated for topic. The results of these experiments suggest ways in which incorporating topic information into such models may yield improvement over models which do not use topic informat...
متن کاملIncorporating Metadata into Dynamic Topic Analysis
Everyday millions of blogs and micro-blogs are posted on the Internet These posts usually come with useful metadata, such as tags, authors, locations, etc. Much of these data are highly specific or personalized. Tracking the evolution of these data helps us to discover trending topics and users’ interests, which are key factors in recommendation and advertisement placement systems. In this pape...
متن کاملTopic Modeling over Short Texts by Incorporating Word Embeddings
Inferring topics from the overwhelming amount of short texts becomes a critical but challenging task for many content analysis tasks, such as content charactering, user interest profiling, and emerging topic detecting. Existing methods such as probabilistic latent semantic analysis (PLSA) and latent Dirichlet allocation (LDA) cannot solve this problem very well since only very limited word co-o...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Concentric? Studies in Linguistics
سال: 2023
ISSN: ['1810-7478', '2589-5230']
DOI: https://doi.org/10.1075/consl.22026.wan