Search results for: captioning order

Number of results: 908879

Journal: Proceedings of the AAAI Conference on Artificial Intelligence 2019

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence 2021

Image captioning is a challenging computer vision task that aims to generate a natural language description of an image. Most recent research follows the encoder-decoder framework, which depends heavily on previously generated words for the current prediction. Such methods cannot effectively take advantage of future predicted information to learn complete semantics. In this paper, we propose Context-Aware Auxil...

Journal: IET Image Processing 2022

The image captioning task has attracted great attention from many researchers, and significant progress has been made in the past few years. Existing models, which mainly apply an attention-based encoder-decoder architecture, achieve developments in captioning. These models, however, are limited in caption generation due to potential errors resulting from inaccurate detection of objects or incorrect objects. To alleviate l...

2002
Yiqing Liang Bede Liu

We integrated a practical digital video database system based on language and image analysis with components from digital video processing, still image search, information retrieval, and closed captioning processing. The aim is to utilize the multiple modalities of information in video and implement data fusion among them. Keyframes are extracted to represent shots based on vi...

2015
Ádam Varga Balázs Tarján Zoltán Tobler György Szaszák Tibor Fegyó Csaba Bordás Péter Mihajlik

In this paper, the application of LVCSR (Large Vocabulary Continuous Speech Recognition) technology is investigated for real-time, resource-limited broadcast closed captioning. The work focuses on transcribing live broadcast conversational speech to make such programs accessible to deaf viewers. Due to computational limitations, real-time factor (RTF) and memory requirements are kept low during de...

Journal: CoRR 2018
Yun Chen Yang Liu Victor O. K. Li

While end-to-end neural machine translation (NMT) has achieved notable success in the past years in translating a handful of resource-rich language pairs, it still suffers from the data scarcity problem for low-resource language pairs and domains. To tackle this problem, we propose an interactive multimodal framework for zero-resource neural machine translation. Instead of being passively expos...

Journal: CoRR 2018
Dong-Jin Kim Jinsoo Choi Tae-Hyun Oh Youngjin Yoon In-So Kweon

Human behavior understanding is arguably one of the most important mid-level components in artificial intelligence. To make efficient use of data, multi-task learning has been studied in diverse computer vision tasks, including human behavior understanding. However, multi-task learning relies on task-specific datasets, and constructing such datasets can be cumbersome. It requires huge a...

2007
Marco Bressan Gabriela Csurka Yves Hoppenot Jean-Michel Renders

In this paper we present a Travel Blog Assistant System that facilitates travel blog writing by automatically selecting, for each blog paragraph written by the user, the most relevant images from an uploaded image set. To do this, the system first automatically adds metadata to the traveler’s photos based both on a Generic Visual Categorizer (visual keywords) and by exploiting cross-...
