Object-Centric Image Generation from Layouts
Abstract
We begin with the hypothesis that a model must be able to understand individual objects and the relationships between them in order to generate complex scenes with multiple objects well. Our layout-to-image-generation method, which we call Object-Centric Generative Adversarial Network (or OC-GAN), relies on a novel Scene-Graph Similarity Module (SGSM). The SGSM learns representations of the spatial relationships between objects in the scene, which lead to our model's improved layout-fidelity. We also propose changes to the conditioning mechanism of the generator that enhance its object instance-awareness. Apart from improving image quality, our contributions mitigate two failure modes of previous approaches: (1) spurious objects being generated without corresponding bounding boxes in the layout, and (2) overlapping bounding boxes in the layout leading to merged objects in images. Extensive quantitative evaluation and ablation studies demonstrate the impact of our contributions, with our model outperforming state-of-the-art approaches on both the COCO-Stuff and Visual Genome datasets. Finally, we address an important limitation of the metrics used in previous works by introducing SceneFID, an object-centric adaptation of the popular Fréchet Inception Distance metric that is better suited for multi-object images.
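The abstract does not spell out how SceneFID is computed. As a rough illustration only, the sketch below assumes that "object-centric" means scoring crops taken from the layout's bounding boxes rather than whole images. Everything here is hypothetical: `crop_objects`, `toy_features` (a stand-in for an Inception embedding), and `diagonal_frechet_distance`, which uses a diagonal-covariance simplification of the Fréchet distance rather than the full-covariance form used by FID.

```python
import math
from statistics import fmean, pvariance


def crop_objects(image, boxes):
    """Cut one crop per bounding box (x0, y0, x1, y1) out of a 2D image.

    The image is a list of pixel rows; boxes are half-open pixel ranges.
    """
    return [[row[x0:x1] for row in image[y0:y1]] for (x0, y0, x1, y1) in boxes]


def toy_features(crop):
    """Hypothetical stand-in for an Inception embedding: [mean, max] pixel value."""
    pixels = [p for row in crop for p in row]
    return [fmean(pixels), max(pixels)]


def diagonal_frechet_distance(feats_a, feats_b):
    """Squared Frechet distance between two sets of feature vectors.

    Assumes a Gaussian with diagonal covariance per set (a simplification):
    sum over dimensions of (mu_a - mu_b)^2 + (sigma_a - sigma_b)^2.
    """
    total = 0.0
    # zip(*feats) regroups the vectors into one tuple of values per dimension.
    for dim_a, dim_b in zip(zip(*feats_a), zip(*feats_b)):
        mu_a, mu_b = fmean(dim_a), fmean(dim_b)
        sd_a, sd_b = math.sqrt(pvariance(dim_a)), math.sqrt(pvariance(dim_b))
        total += (mu_a - mu_b) ** 2 + (sd_a - sd_b) ** 2
    return total
```

Under these assumptions, one would crop every annotated object from the real and the generated images, embed each crop, and pass the two lists of feature vectors to `diagonal_frechet_distance`. A lower value would indicate that generated objects are statistically closer to real ones.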
Similar resources
Object-Centric Spatial Pooling for Image Classification
Spatial pyramid matching (SPM) based pooling has been the dominant choice for state-of-the-art image classification systems. In contrast, we propose a novel object-centric spatial pooling (OCP) approach, following the intuition that knowing the location of the object of interest can be useful for image classification. OCP consists of two steps: (1) inferring the location of the objects, and (2) usi...
Generation of Object-Centric Datasets with Adaptive Sky
Adaptive Sky is an ESTO-funded Advanced Information Systems Technology activity that is developing software to enable multiple sensing assets to be dynamically combined into sensor webs. The ASky feature correspondence toolbox consists of a variety of methods for automatically relating the observations of one instrument at time t to the observations of another instrument at time t’. A key end p...
Obj2Text: Generating Visually Descriptive Language from Object Layouts
Generating captions for images is a task that has recently received considerable attention. In this work we focus on caption generation for abstract scenes, or object layouts where the only information provided is a set of objects and their locations. We propose OBJ2TEXT, a sequence-to-sequence model that encodes a set of objects and their locations as an input sequence using an LSTM network, an...
Object-centric Sampling for Fine-grained Image Classification
This paper proposes to go beyond the state-of-the-art deep convolutional neural network (CNN) by incorporating information from object detection, focusing on fine-grained image classification. Unfortunately, CNNs suffer from overfitting when trained on existing fine-grained image classification benchmarks, which typically consist of fewer than a few tens of thousands t...
Object-Centric Representation Learning from Unlabeled Videos
Supervised (pre-)training currently yields state-of-the-art performance for representation learning for visual recognition, yet it comes at the cost of (1) intensive manual annotations and (2) an inherent restriction in the scope of data relevant for learning. In this work, we explore unsupervised feature learning from unlabeled video. We introduce a novel object-centric approach to temporal co...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2021
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v35i3.16368