InfinityGAN: Towards Infinite-Resolution Image Synthesis
Humans can guess the whole scene given a partial observation of it. Enabling machines to understand images the way humans do has long been an area of high interest for researchers. A paper recently published on arxiv.org presents InfinityGAN as a way to generate images of arbitrary resolution.
Synthesizing infinite-resolution images from finite-resolution inputs with InfinityGAN. Image credit: Chieh Hubert Lin et al., arXiv:2104.03963v1
Challenges with existing models for generating high-resolution images
- To reach higher image resolutions, most generative models require longer training time, larger model size, and stricter data requirements.
- Large images should be locally and globally consistent, avoid repetitive patterns, and look realistic.
- Existing solutions are not resolution-independent, and extrapolating them to a higher resolution becomes very computationally heavy.
InfinityGAN is proposed as a solution to the above challenges, producing infinite-resolution images with limited computational resources. The aim of the research, as stated by the team, is:
We aim to build a generator that trains with image patches, and at inference time synthesizes images well beyond its training data resolution. The generator can thus generalize to an unbounded arbitrarily high resolution.
What is InfinityGAN
InfinityGAN is a method that trains on finite, low-resolution images and generates infinite-resolution images. It has low computational requirements yet yields high-quality, seamless, high-resolution outputs. In this technique, local texture and structure are modelled separately, which allows the method to synthesize diverse local details.
How does InfinityGAN work?
InfinityGAN considers global and local factors to yield high-resolution images.
- Global: InfinityGAN assumes that images have a coherent high-level composition. This means that an image has a global theme or appearance that is consistent across the whole image. For example, a football match has a central theme: fans cheering, players following the ball on the ground, etc. A product launch also has a central theme, with a presenter and an audience. A Walmart store would have items stacked on racks and shoppers shopping.
- Local: A close-up of the image is defined by the structure and texture in its neighbourhood. The structure is defined by the objects, their shapes, and their relative arrangement. Once a structure is defined, InfinityGAN decides the texture based on the material and lighting of the objects in the structure to render a realistic scene. InfinityGAN also maps the texture so that it conforms to the global coherence and to the structure and texture of the neighbouring patches.
Components of InfinityGAN
- Structure synthesizer: Conditioned on the global appearance, the structure synthesizer produces local structural representations.
- Texture synthesizer: It generates texture for the structure provided by the structure synthesizer.
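The split between a shared global appearance, a coordinate-dependent local structure, and a texture stage can be illustrated with a small toy. The sketch below is not the authors' network: `structure_synthesizer`, `texture_synthesizer`, `local_latent`, and the formulas inside them are hypothetical stand-ins chosen only so the example runs without any dependencies. What it shows is the property the article describes: when every pixel depends only on the shared global latent and on absolute coordinates, patches can be generated independently and still stitch together without seams.

```python
import hashlib
import math


def local_latent(seed: int, x: int, y: int) -> float:
    """Deterministic pseudo-random value in [0, 1) tied to an absolute coordinate."""
    digest = hashlib.sha256(f"{seed}:{x}:{y}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def structure_synthesizer(global_latent: float, x: int, y: int, seed: int) -> float:
    """Toy stand-in: local structure from the global latent, the coordinate,
    and a per-location local latent (hypothetical formula)."""
    wave = math.sin(0.3 * x + global_latent) * math.cos(0.2 * y + global_latent)
    return 0.8 * wave + 0.2 * local_latent(seed, x, y)


def texture_synthesizer(structure: float, global_latent: float) -> float:
    """Toy stand-in: turns a structure value into a pixel intensity in [0, 1]."""
    return 0.5 + 0.5 * math.tanh(structure + 0.1 * global_latent)


def generate_patch(global_latent, x0, y0, width, height, seed=0):
    """Render one patch whose top-left corner sits at absolute position (x0, y0)."""
    return [
        [
            texture_synthesizer(
                structure_synthesizer(global_latent, x, y, seed), global_latent
            )
            for x in range(x0, x0 + width)
        ]
        for y in range(y0, y0 + height)
    ]


def generate_by_patches(global_latent, width, height, patch, seed=0):
    """Assemble a large image from independently rendered patches."""
    image = [[0.0] * width for _ in range(height)]
    for y0 in range(0, height, patch):
        for x0 in range(0, width, patch):
            w = min(patch, width - x0)
            h = min(patch, height - y0)
            tile = generate_patch(global_latent, x0, y0, w, h, seed)
            for dy, row in enumerate(tile):
                image[y0 + dy][x0:x0 + w] = row
    return image


# Rendering in one pass and rendering tile by tile give identical pixels.
full = generate_patch(0.7, 0, 0, 20, 12)
tiled = generate_by_patches(0.7, 20, 12, patch=8)
print(full == tiled)
```

In this toy, only one patch needs to be held in working memory at a time, which is the intuition behind generating very large images with limited resources. A real generator built from convolutions would additionally have to handle padding and receptive-field overlap at patch borders, which this sketch does not model.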
Why InfinityGAN
- It works effectively in resource-constrained settings, in terms of both computation and the availability of high-resolution training data.
- It generates images that are locally and globally consistent, avoid repetitive patterns, and look realistic.
Applications of InfinityGAN
- InfinityGAN provides flexibility and controllability by spatially fusing structures and textures from different distributions within an image.
- InfinityGAN allows an image to be outpainted, extending it to an arbitrary length.
- Particularly useful for high-resolution image synthesis where images can be divided into independent patches to make the process faster.
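The first application above, spatially fusing appearances within one image, can be pictured as letting the style vary smoothly with position rather than using a single style everywhere. The snippet below is a minimal sketch under that assumption: `blend_weight`, `fused_style_map`, and the sigmoid blend are illustrative choices, not the fusion procedure from the paper.

```python
import math


def blend_weight(x: int, center: float, sharpness: float = 0.5) -> float:
    """Sigmoid weight: close to 0 left of `center`, close to 1 right of it."""
    return 1.0 / (1.0 + math.exp(-sharpness * (x - center)))


def fused_style_map(style_a, style_b, width, sharpness=0.5):
    """Return one style vector per image column, interpolated from
    style_a (left side) to style_b (right side)."""
    center = (width - 1) / 2
    styles = []
    for x in range(width):
        w = blend_weight(x, center, sharpness)
        styles.append([(1 - w) * a + w * b for a, b in zip(style_a, style_b)])
    return styles


# Hypothetical two-dimensional style vectors for two different scene styles.
styles = fused_style_map([0.0, 1.0], [1.0, 0.0], width=41)
print(styles[0], styles[20], styles[40])
```

A generator that accepts a per-location style could consume such a map column by column; a larger `sharpness` gives a more abrupt transition between the two styles.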
Limitations of InfinityGAN
- Differences in field of view (FoV) and in distance from the scene can negatively impact image extrapolation with InfinityGAN, sometimes leading to a bizarre global view.
- If manipulated photographs are used, it could mislead InfinityGAN to synthesize an inaccurate representation.
- If the motion module is trained along with the image module, InfinityGAN's performance can be inferior.
Conclusion: InfinityGAN trains and infers patch by patch, seamlessly and with low computational resources. The research team's experimental evaluation also supports the claim that InfinityGAN generates images with superior global structure compared to some other techniques. Although InfinityGAN has certain limitations, it is proposed as a valuable resource for generating arbitrary-resolution images.
Source: Chieh Hubert Lin, Hsin-Ying Lee, Yen-Chi Cheng, Sergey Tulyakov and Ming-Hsuan Yang, "InfinityGAN: Towards Infinite-Resolution Image Synthesis".