Developing unsupervised knowledge-enhanced models to reduce the semantic gap in information retrieval
Abstract
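In this thesis we tackle the semantic gap, a long-standing problem in Information Retrieval (IR). The semantic gap can be described as the mismatch between users' queries and the way retrieval models answer such queries. Two main lines of work have emerged over the years to bridge the semantic gap: (i) the use of external knowledge resources to enhance the bag-of-words representations used by lexical models, and (ii) the use of semantic models to perform matching between the latent representations of queries and documents. To deal with this issue, we first perform an in-depth evaluation of lexical and semantic models through different analyses [Marchesin et al., 2019]. The objective of this evaluation is to understand what features lexical and semantic models share, if their signals are complementary, and how they can be combined to effectively address the semantic gap. In particular, the evaluation focuses on (semantic) neural models and their critical aspects. Each analysis brings a different perspective to the understanding of semantic models and their relation with lexical models. The outcomes of this evaluation highlight the differences between lexical and semantic signals, and the need to combine them at the early stages of the IR pipeline to effectively address the semantic gap. Then, we build on the insights of this evaluation to develop lexical and semantic models addressing the semantic gap. Specifically, we develop unsupervised models that integrate knowledge from external resources, and we evaluate them for the medical domain, a domain with a high social value, where the semantic gap is prominent, and where the large presence of authoritative knowledge resources allows us to explore effective ways to address it. For lexical models, we investigate how, and to what extent, concepts and relations stored within knowledge resources can be integrated in query representations to improve the effectiveness of lexical models. Thus, we propose and evaluate several knowledge-based query expansion and reduction techniques [Agosti et al., 2018, 2019; Di Nunzio et al., 2019]. These query reformulations are used to increase the probability of retrieving relevant documents by adding to or removing from the original query highly specific terms. The experimental analyses on different test collections for Precision Medicine, a particular use case of Clinical Decision Support (CDS), show the effectiveness of the proposed query reformulations. In particular, a specific subset of query reformulations allows lexical models to achieve top performing results in all the considered collections. Regarding semantic models, we first analyze the limitations of the knowledge-enhanced neural models presented in the literature. Then, to overcome these limitations, we propose SAFIR [Agosti et al., 2020], an unsupervised knowledge-enhanced neural framework for IR. SAFIR integrates external knowledge in the learning process of neural IR models and it does not require labeled data for training. Thus, the representations learned within this framework are optimized for IR and encode linguistic features that are relevant to address the semantic gap. The evaluation on different test collections for CDS demonstrates the effectiveness of SAFIR when used to perform retrieval over the entire document collection or to retrieve documents for Pseudo Relevance Feedback (PRF) methods, that is, when it is used at the early stages of the IR pipeline. In particular, the quantitative and qualitative analyses highlight the ability of SAFIR to retrieve relevant documents affected by the semantic gap, as well as the effectiveness of combining lexical and semantic models at the early stages of the IR pipeline, where the complementary signals they provide can be used to obtain better answers to semantically hard queries.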
Similar resources
Events Retrieval Using Enhanced Semantic Web Knowledge
In this article, we present an experimental end-user application to query the DeRiVE 2011 challenge dataset in an innovative and intuitive manner. After enriching the dataset with external sources of information, it is indexed in a way that enables users to submit queries combining keywords, location and temporal anchor, in a single search field. The goal is to ease event retrieval providing a simp...
Knowledge Sharing by Information Retrieval in the Semantic Web
Effective and efficient information retrieval, knowledge sharing and combining have become an essential part of more and more professional tasks and workflows in different kinds of projects. Our aim is to investigate the use of emerging Semantic Web technologies, tools, and standards in the support of effective information retrieval in real multi-disciplinary activities, such as innovative prod...
Semantic Associative Topic Models for Information Retrieval
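Topic models are widely applied in document modeling and in speech recognition, information retrieval, and text mining systems, where they effectively capture the semantic and statistical information of documents and words. Most topic models, such as probabilistic latent semantic analysis and latent Dirichlet allocation, describe the relationship between documents and words mainly through a set of latent topic probability distributions, which are used to capture the latent semantic information of documents. However, traditional topic models are limited by the bag-of-words assumption, so their latent topics can only capture semantic information among individual words. Although individual words can convey topic information, they sometimes lack the precise semantic knowledge of the text, which easily leads to misjudged documents and lower retrieval quality. To address this shortcoming of topic models, this paper proposes a novel semantic associative topic model (semantic associ...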
A Frame Semantic Approach to the Study of Translating Cultural Scripts in Salinger's Franny and Zooey
The frame semantic theory is a nascent approach in the area of translation studies which goes beyond the linguistic barriers and helps us to incorporate cognitive and cultural factors into the study of translation. Based on Rojo's analytical model (2002b), which centered on the frames or knowledge structures activated in the text, the present research explores the various translation problems that...
Bridging the Semantic Gap in Image Retrieval
The emergence of multimedia technology and the rapidly expanding image and video collections on the internet have attracted significant research efforts in providing tools for effective retrieval and management of visual data. Image retrieval is based on the availability of a representation scheme of image content. Image content descriptors may be visual features such as color, texture, shape, ...
Journal
Journal title: SIGIR Forum
Year: 2021
ISSN: 0163-5840, 1558-0229
DOI: https://doi.org/10.1145/3476415.3476433