Two-Level Transformer and Auxiliary Coherence Modeling for Improved Text Segmentation
نویسندگان
چکیده
منابع مشابه
Text Segmentation with Topic Modeling and Entity Coherence
This paper describes a system which uses entity and topic coherence for improved Text Segmentation (TS) accuracy. First, Linear Dirichlet Allocation (LDA) algorithm was used to obtain topics for sentences in the document. We then performed entity mapping across a window in order to discover the transition of entities within sentences. We used the information obtained to support our LDA-based bo...
متن کاملText segmentation with character-level text embeddings
Learning word representations has recently seen much success in computational linguistics. However, assuming sequences of word tokens as input to linguistic analysis is often unjustified. For many languages word segmentation is a non-trivial task and naturally occurring text is sometimes a mixture of natural language strings and other character data. We propose to learn text representations dir...
متن کاملMultiple text segmentation for statistical language modeling
In this article we deal with the text segmentation problem in statistical language modeling for under-resourced languages with a writing system without word boundary delimiters. While the lack of text resources has a negative impact on the performance of language models, the errors introduced by the automatic word segmentation makes those data even less usable. To better exploit the text resour...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Proceedings of the AAAI Conference on Artificial Intelligence
سال: 2020
ISSN: 2374-3468,2159-5399
DOI: 10.1609/aaai.v34i05.6284