DeepDiary: Automatically Captioning Lifelogging Image Streams

Authors

Chenyou Fan

David J. Crandall

Abstract

Lifelogging cameras capture everyday life from a first-person perspective, but generate so much data that it is hard for users to browse and organize their image collections effectively. In this paper, we propose to use automatic image captioning algorithms to generate textual representations of these collections. We develop and explore novel techniques based on deep learning to generate captions for both individual images and image streams, using temporal consistency constraints to create summaries that are both more compact and less noisy. We evaluate our techniques with quantitative and qualitative results, and apply captioning to an image retrieval application for finding potentially private images. Our results suggest that our automatic captioning algorithms, while imperfect, may work well enough to help users manage lifelogging photo collections.

Similar resources

DeepDiary: Automatic Caption Generation for Lifelogging Image Streams

Lifelogging cameras capture everyday life from a first-person perspective, but generate so much data that it is hard for users to browse and organize their image collections effectively. In this paper, we propose to use automatic image captioning algorithms to generate textual representations of these collections. We develop and explore novel techniques based on deep learning to generate caption...

Understanding Interactions Between Multiple Wearable Cameras for Personal Memory Capture

A recent trend in mobile computing is the increasing use of worn devices for data capture. Wearable lifelogging cameras such as the SenseCam and Narrative Clip reflect this trend, allowing mobile users to continuously capture images for later review. Image streams provided by these mobile devices can be used in a range of applications, but are commonly used as a way of capturing personal memori...

Neural Captioning for the ImageCLEF 2017 Medical Image Challenges

Manual image annotation is a major bottleneck in the processing of medical images and the accuracy of these reports varies depending on the clinician’s expertise. Automating some or all of the processes would have enormous impact in terms of efficiency, cost and accuracy. Previous approaches to automatically generating captions from images have relied on hand-crafted pipelines of feature extrac...

PlaceAvoider: Steering First-Person Cameras away from Sensitive Spaces

Cameras are now commonplace in our social and computing landscapes and embedded into consumer devices like smartphones and tablets. A new generation of wearable devices (such as Google Glass) will soon make ‘first-person’ cameras nearly ubiquitous, capturing vast amounts of imagery without deliberate human action. ‘Lifelogging’ devices and applications will record and share images from people’s...

Image2Text: A Multimodal Caption Generator

In this work, we showcase the Image2Text system, which is a real-time captioning system that can generate human-level natural language description for any input image. We formulate the problem of image captioning as a multimodal translation task. Analogous to machine translation, we present a sequence-to-sequence recurrent neural networks (RNN) model for image caption generation. Different from...

Publication date: 2016
