Object-Centric Image Generation from Layouts

Abstract

We begin with the hypothesis that a model must be able to understand individual objects and the relationships between objects in order to generate complex scenes with multiple objects well. Our layout-to-image-generation method, which we call Object-Centric Generative Adversarial Network (or OC-GAN), relies on a novel Scene-Graph Similarity Module (SGSM). The SGSM learns representations of the spatial relationships between objects in the scene, which lead to our model's improved layout-fidelity. We also propose changes to the conditioning mechanism of the generator that enhance its object instance-awareness. Apart from improving image quality, our contributions mitigate two failure modes in previous approaches: (1) spurious objects being generated without corresponding bounding boxes in the layout, and (2) overlapping bounding boxes in the layout leading to merged objects in images. Extensive quantitative evaluation and ablation studies demonstrate the impact of our contributions, with our model outperforming previous state-of-the-art approaches on both the COCO-Stuff and Visual Genome datasets. Finally, we address an important limitation of the evaluation metrics used in previous works by introducing SceneFID, an object-centric adaptation of the popular Fréchet Inception Distance metric that is better suited for multi-object images.
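The abstract describes SceneFID only as an object-centric adaptation of the Fréchet Inception Distance. One plausible reading, sketched here under stated assumptions, is to compute the usual Fréchet distance over features of object crops taken from the layout's bounding boxes rather than over whole images. This is not the authors' implementation: the names `crop_objects`, `toy_features`, and `object_level_fd` are hypothetical, the `(x0, y0, x1, y1)` pixel box format is assumed, and `toy_features` is a stand-in for the Inception activations a real FID computation would use.

```python
import numpy as np


def crop_objects(image, boxes):
    """Return one crop per bounding box (x0, y0, x1, y1), in pixel coordinates (assumed format)."""
    return [image[y0:y1, x0:x1] for (x0, y0, x1, y1) in boxes]


def toy_features(crop):
    """Stand-in for Inception features: per-channel mean and standard deviation."""
    pixels = crop.reshape(-1, crop.shape[-1]).astype(np.float64)
    return np.concatenate([pixels.mean(axis=0), pixels.std(axis=0)])


def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two feature sets (rows are samples)."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the eigenvalues
    # of cov_a @ cov_b, which are real and non-negative for PSD inputs (up to round-off).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)


def object_level_fd(real_images, real_layouts, fake_images, fake_layouts):
    """Fréchet distance over object crops instead of whole images (hypothetical helper)."""
    def collect(images, layouts):
        return np.stack([
            toy_features(crop)
            for image, boxes in zip(images, layouts)
            for crop in crop_objects(image, boxes)
        ])
    return frechet_distance(collect(real_images, real_layouts),
                            collect(fake_images, fake_layouts))
```

In practice the toy feature function would be replaced by a pretrained network applied to resized crops; the distance computation itself is unchanged.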

Similar resources

Object-Centric Spatial Pooling for Image Classification

Spatial pyramid matching (SPM) based pooling has been the dominant choice for state-of-art image classification systems. In contrast, we propose a novel object-centric spatial pooling (OCP) approach, following the intuition that knowing the location of the object of interest can be useful for image classification. OCP consists of two steps: (1) inferring the location of the objects, and (2) usi...

Generation of Object-Centric Datasets with Adaptive Sky

Adaptive Sky is an ESTO-funded Advanced Information Systems Technology activity that is developing software to enable multiple sensing assets to be dynamically combined into sensor webs. The ASky feature correspondence toolbox consists of a variety of methods for automatically relating the observations of one instrument at time t to the observations of another instrument at time t’. A key end p...

Obj2Text: Generating Visually Descriptive Language from Object Layouts

Generating captions for images is a task that has recently received considerable attention. In this work we focus on caption generation for abstract scenes, or object layouts where the only information provided is a set of objects and their locations. We propose OBJ2TEXT, a sequence-to-sequence model that encodes a set of objects and their locations as an input sequence using an LSTM network, an...

Object-centric Sampling for Fine-grained Image Classification

This paper proposes to go beyond the state-of-the-art deep convolutional neural network (CNN) by incorporating the information from object detection, focusing on dealing with fine-grained image classification. Unfortunately, CNN suffers from overfitting when it is trained on existing fine-grained image classification benchmarks, which typically only consist of less than a few tens of thousands t...

Object-Centric Representation Learning from Unlabeled Videos

Supervised (pre-)training currently yields state-of-the-art performance for representation learning for visual recognition, yet it comes at the cost of (1) intensive manual annotations and (2) an inherent restriction in the scope of data relevant for learning. In this work, we explore unsupervised feature learning from unlabeled video. We introduce a novel object-centric approach to temporal co...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v35i3.16368