captioning order

Search results for: captioning order

Number of results: 908879

Text Augmentation Using BERT for Image Captioning

Journal: Applied Sciences, 2020


Motion Guided Spatial Attention for Video Captioning

Journal: Proceedings of the AAAI Conference on Artificial Intelligence, 2019


Panoptic Segmentation-Based Attention for Image Captioning

Journal: Applied Sciences, 2020


Image Captioning with Context-Aware Auxiliary Guidance

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence, 2021

Image captioning is a challenging computer vision task, which aims to generate a natural language description of an image. Most recent research follows the encoder-decoder framework, which depends heavily on previously generated words for the current prediction. Such methods cannot effectively take advantage of future predicted information to learn complete semantics. In this paper, we propose Context-Aware Auxil...


Variational joint self‐attention for image captioning

Journal: IET Image Processing, 2022

The image captioning task has attracted great attention from many researchers, and significant progress has been made in the past few years. Existing models, which mainly apply an attention-based encoder-decoder architecture, achieve developments in captioning. These models, however, are limited in caption generation due to potential errors resulting from inaccurate detection of objects or incorrect objects. To alleviate l...


A Practical Video Database Based on Language and Image Analysis

2002

Yiqing Liang, Bede Liu

We integrated a practical digital video database system based on language and image analysis with components from digital video processing, still image search, information retrieval, and closed captioning processing. The aim is to utilize the multiple modalities of information in video and implement data fusion among them. Keyframes are extracted to represent shots based on vi...


Automatic Close Captioning for Live Hungarian Television Broadcast Speech: A Fast and Resource-Efficient Approach

2015

Ádam Varga, Balázs Tarján, Zoltán Tobler, György Szaszák, Tibor Fegyó, Csaba Bordás, Péter Mihajlik

In this paper, the application of LVCSR (Large Vocabulary Continuous Speech Recognition) technology is investigated for real-time, resource-limited broadcast close captioning. The work focuses on transcribing live broadcast conversation speech to make such programs accessible to deaf viewers. Due to computational limitations, the real-time factor (RTF) and memory requirements are kept low during de...


Zero-Resource Neural Machine Translation with Multi-Agent Communication Game

Journal: CoRR, 2018

Yun Chen, Yang Liu, Victor O. K. Li

While end-to-end neural machine translation (NMT) has achieved notable success in the past years in translating a handful of resource-rich language pairs, it still suffers from the data scarcity problem for low-resource language pairs and domains. To tackle this problem, we propose an interactive multimodal framework for zero-resource neural machine translation. Instead of being passively expos...


Disjoint Multi-task Learning between Heterogeneous Human-centric Tasks

Journal: CoRR, 2018

Dong-Jin Kim, Jinsoo Choi, Tae-Hyun Oh, Youngjin Yoon, In-So Kweon

Human behavior understanding is arguably one of the most important mid-level components in artificial intelligence. In order to efficiently make use of data, multi-task learning has been studied in diverse computer vision tasks including human behavior understanding. However, multi-task learning relies on task-specific datasets, and constructing such datasets can be cumbersome. It requires huge a...


Travel Blog Assistant System (TBAS) - An Example Scenario of how to Enrich Text with Images and Images with Text using Online Multimedia Repositories

2007

Marco Bressan, Gabriela Csurka, Yves Hoppenot, Jean-Michel Renders

In this paper we present a Travel Blog Assistant System that facilitates travel blog writing by automatically selecting, for each blog paragraph written by the user, the most relevant images from an uploaded image set. In order to do this, the system first automatically adds metadata to the traveler’s photos based both on a Generic Visual Categorizer (visual keywords) and by exploiting cross-...

