Adobe releases video super-resolution project VideoGigaGAN

Video super-resolution (VSR) methods have shown impressive temporal consistency in upsampled videos.
However, these methods tend to produce blurrier results than their image counterparts because of their limited generative capability.
This raises a fundamental question: can the success of generative image upsamplers be extended to the VSR task while preserving temporal consistency? The answer proposed here is VideoGigaGAN, a new generative VSR model that can produce videos with high-frequency detail and temporal consistency.
VideoGigaGAN builds on GigaGAN, a large-scale image upsampler. Simply inflating GigaGAN into a video model by adding temporal modules produces severe temporal flickering.
The authors identify several key issues and propose techniques that significantly improve the temporal consistency of the upsampled videos.
Experiments show that, unlike previous VSR methods, VideoGigaGAN generates temporally consistent videos with more fine-grained appearance details.
The effectiveness of VideoGigaGAN is validated by comparing it with state-of-the-art VSR models on public datasets and by showing video results at 8x super-resolution.

The VSR model is built on the asymmetric U-Net architecture of the GigaGAN image upsampler.
To enforce temporal consistency, the image upsampler is first inflated into a video upsampler by adding temporal attention layers to the decoder blocks.
Consistency is further improved by incorporating features from a flow-guided propagation module.
To suppress aliasing artifacts, anti-aliasing blocks (BlurPool) are used in the downsampling layers of the encoder.
Finally, high-frequency features are passed directly to the decoder layers through skip connections to compensate for the detail lost in the BlurPool process.
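
To make the last two steps more concrete, here is a minimal sketch in plain Python, not the project's actual code. It assumes a 1D signal in place of a feature map and a fixed [1, 2, 1] / 4 blur kernel; both choices, and all function names, are illustrative assumptions. It shows how blurring before subsampling (the BlurPool idea) keeps a shifting fine texture from flipping between frames, and how the high-frequency residual that the blur removes can be kept aside and added back later, which is the role the skip connections play.

```python
def blur(signal):
    """Low-pass filter with the kernel [1, 2, 1] / 4, replicating the edges."""
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        out.append((left + 2 * signal[i] + right) / 4)
    return out


def naive_downsample(signal):
    """Keep every second sample with no filtering (prone to aliasing)."""
    return signal[::2]


def blurpool_downsample(signal):
    """Blur first, then keep every second sample."""
    return blur(signal)[::2]


def split_frequencies(signal):
    """Split a signal into its blurred part and the high-frequency residual."""
    low = blur(signal)
    high = [s - l for s, l in zip(signal, low)]
    return low, high


# A one-pixel stripe pattern and the same pattern shifted by one pixel,
# standing in for a fine texture that moves between two consecutive frames.
frame_a = [1.0, 0.0] * 4
frame_b = [0.0, 1.0] * 4

naive_a = naive_downsample(frame_a)
naive_b = naive_downsample(frame_b)
smooth_a = blurpool_downsample(frame_a)
smooth_b = blurpool_downsample(frame_b)

# The residual removed by the blur, carried separately and added back.
low, high = split_frequencies(frame_a)
restored = [l + h for l, h in zip(low, high)]

if __name__ == "__main__":
    print("naive:   ", naive_a, naive_b)
    print("blurpool:", smooth_a, smooth_b)
    print("restored:", restored)
```

In this toy case the naive downsampler turns a one-pixel shift into a completely different output, while the blurred version changes only at the borders. The price is that the stripes themselves are smoothed away, which is why the residual `high` is worth carrying to the decoder.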

Judging from the demo, the results are quite impressive. It supports 8x video upscaling and adapts to videos of different styles.

A more detailed introduction:

The model generates video that stays temporally coherent while retaining high-frequency detail. VideoGigaGAN is designed and optimized on top of GigaGAN, an advanced large-scale image upsampling model.

If GigaGAN is simply extended into a video model by adding modules that process temporal information, the output video suffers from serious flickering. To solve this problem, the authors identified and improved several key technical points, which significantly improved the temporal stability of the video.
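
The time-processing modules mentioned above correspond, in the architecture description, to temporal attention layers in the decoder. As a rough illustration only, the sketch below applies self-attention along the time axis for a single spatial location. It assumes the per-frame features act directly as queries, keys and values (a real layer would use learned projections), and the function names and toy numbers are made up for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(features):
    """Self-attention over the time axis for one spatial location.

    `features` is a list of T feature vectors, one per frame. Learned
    query/key/value projections are omitted (an assumption of this sketch).
    """
    dim = len(features[0])
    scale = 1.0 / math.sqrt(dim)
    output = []
    for query in features:
        # Similarity of this frame to every frame, including itself.
        scores = [scale * sum(q * k for q, k in zip(query, key))
                  for key in features]
        weights = softmax(scores)
        # Each output is a weighted average of all frames' features.
        mixed = [sum(w * value[d] for w, value in zip(weights, features))
                 for d in range(dim)]
        output.append(mixed)
    return output


# Features of one pixel over four frames. The third frame is an outlier,
# standing in for a detail generated in only one frame.
pixel_over_time = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [1.0, 0.0]]
attended = temporal_attention(pixel_over_time)

if __name__ == "__main__":
    for before, after in zip(pixel_over_time, attended):
        print(before, "->", [round(v, 3) for v in after])
```

Because every output is a weighted average over all frames, the outlier frame is pulled toward its neighbours. This mixing alone does not align content that moves between frames, which is consistent with the post's point that temporal modules by themselves are not enough.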

By comparing it with other advanced VSR models on public datasets and demonstrating 8x super-resolution video results, the authors verified the effectiveness of VideoGigaGAN.

If you want to learn more, you can click on the link below the video.
Thank you for watching. If you enjoyed this video, please like and subscribe.

Project address: https://videogigagan.github.io
