Adobe Research at CVPR 2023

In this CVPR paper, the authors present a new method that, for the first time, scales up the training of Generative Adversarial Networks (GANs) to several hundred million image-caption pairs. In the figure, the algorithm turns a 128px input image into a stunning 4K image while running 50 times faster than existing methods.

At the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Adobe co-authored 44 papers: 6 highlight papers (a distinction given to 10% of accepted papers, or 2.5% of all submissions), 35 other main conference papers, and 3 workshop papers.

Adobe authors have also contributed to the conference by co-organizing workshops, giving invited talks at the workshops, area chairing, and paper reviewing. Many of Adobe’s co-authored papers are the result of past internships and collaborations with university students and faculty.

Here is the list of Adobe’s contributions to CVPR 2023.

Highlight Papers

Language-Guided Music Recommendation for Video via Prompt Analogies
Daniel McKee, Justin Salamon, Josef Sivic, Bryan Russell

Normal-guided Garment UV Prediction for Human Re-texturing
Yasamin Jafarian, Tuanfeng Y. Wang, Duygu Ceylan, Jimei Yang, Nathan Carr, Yi Zhou, Hyun Soo Park

PixHt-Lab: Pixel Height Based Light Effect Generation for Image Compositing
Yichen Sheng, Jianming Zhang, Julien Philip, Yannick Hold-Geoffroy, Xin Sun, He Zhang, Lu Ling, Bedrich Benes

REALIMPACT: A Dataset of Impact Sound Fields for Real Objects
Samuel Clarke, Ruohan Gao, Mason Wang, Mark Rau, Julia Xu, Jui-Hsien Wang, Doug L. James, Jiajun Wu

SceneComposer: Any-Level Semantic Image Synthesis
Yu Zeng, Zhe Lin, Jianming Zhang, Qing Liu, John Collomosse, Jason Kuen, Vishal M. Patel

SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model
Shaoan Xie, Zhifei Zhang, Zhe Lin, Tobias Hinz, Kun Zhang

Other Main Conference Papers

Align and Attend: Multimodal Summarization with Dual Contrastive Losses
Bo He, Jun Wang, Jielin Qiu, Trung Bui, Abhinav Shrivastava, Zhaowen Wang

Blowing in the Wind: CycleNet for Human Cinemagraphs from Still Images
Hugo Bertiche, Niloy Mitra, Kuldeep Kulkarni, Chun-Hao P. Huang, Tuanfeng Wang, Meysam Madadi, Sergio Escalera, Duygu Ceylan

Conditional Generation of Audio From Video via Foley Analogies
Yuexi Du, Ziyang Chen, Justin Salamon, Bryan Russell, Andrew Owens

CoralStyleCLIP: Co-optimized Region and Layer Selection for Image Editing
Ambareesh Revanur, Debraj Basu, Shradha Agrawal, Dhwanit Agarwal, Deepak Pai

Complete 3D Human Reconstruction from a Single Incomplete Image
Junying Wang, Jae Shin Yoon, Tuanfeng Y. Wang, Krishna Kumar Singh, Ulrich Neumann

DA Wand: Distortion-Aware Selection using Neural Mesh Parameterization
Richard Liu, Noam Aigerman, Vladimir G. Kim, Rana Hanocka

Domain Expansion of Image Generators
Yotam Nitzan, Michaël Gharbi, Richard Zhang, Taesung Park, Jun-Yan Zhu, Daniel Cohen-Or, Eli Shechtman

DualVector: Unsupervised Vector Font Synthesis with Dual-Part Representation
Ying-Tian Liu, Zhifei Zhang, Yuan-Chen Guo, Matthew Fisher, Zhaowen Wang, Song-Hai Zhang

GamutMLP: A Lightweight MLP for Color Loss Recovery
Hoang Le, Brian L. Price, Scott Cohen, Michael S. Brown

Grid-Guided Neural Radiance Fields for Large Urban Scenes
Linning Xu, Yuanbo Xiangli, Sida Peng, Xingang Pan, Nanxuan Zhao, Christian Theobalt, Bo Dai, Dahua Lin

Improving Visual Grounding by Encouraging Consistent Gradient-based Explanations
Ziyan Yang, Kushal Kafle, Franck Dernoncourt, Vicente Ordonez

Language-Guided Audio-Visual Source Separation via Trimodal Consistency
Reuben Tan, Arijit Ray, Andrea Burns, Bryan A. Plummer, Justin Salamon, Oriol Nieto, Bryan Russell, Kate Saenko

LightPainter: Interactive Portrait Relighting with Freehand Scribble
Yiqun Mei, He Zhang, Xuaner Zhang, Jianming Zhang, Zhixin Shu, Yilin Wang, Zijun Wei, Shi Yan, HyunJoon Jung, Vishal M. Patel

Meta-Personalizing Vision-Language Models to Find Named Instances in Video
Chun-Hsiao Yeh, Bryan Russell, Josef Sivic, Fabian Caba Heilbron, Simon Jenni

MIME: Human-Aware 3D Scene Generation
Hongwei Yi, Chun-Hao P. Huang, Shashank Tripathi, Lea Hering, Justus Thies, Michael J. Black

Modernizing Old Photos Using Multiple References via Photorealistic Style Transfer
Agus Gunawan, Soo Ye Kim, Hyeonjun Sim, Jae-Ho Lee, Munchurl Kim

Multi-concept Customization of Text-to-Image Diffusion
Nupur Kumari, Bingliang Zhang, Richard Zhang, Eli Shechtman, Jun-Yan Zhu

Neural Preset for Color Style Transfer
Zhanghan Ke, Yuhao Liu, Lei Zhu, Nanxuan Zhao, Rynson W.H. Lau

ObjectStitch: Generative Object Compositing
Yizhi Song, Zhifei Zhang, Zhe Lin, Scott Cohen, Brian Price, Jianming Zhang, Soo Ye Kim, Daniel Aliaga

Perspective Fields for Single Image Camera Calibration
Linyi Jin, Jianming Zhang, Yannick Hold-Geoffroy, Oliver Wang, Kevin Blackburn-Matzen, Matthew Sticha, David F. Fouhey

Putting People in Their Place: Affordance-Aware Human Insertion into Scenes
Sumith Kulal, Tim Brooks, Alex Aiken, Jiajun Wu, Jimei Yang, Jingwan Lu, Alexei A. Efros, Krishna Kumar Singh

Realistic Saliency Guided Image Enhancement
S. Mahdi H. Miangoleh, Zoya Bylinskii, Eric Kee, Eli Shechtman, Yağız Aksoy

RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation
Titas Anciukevičius, Zexiang Xu, Matthew Fisher, Paul Henderson, Hakan Bilen, Niloy Mitra, Paul Guerrero

Scaling up GANs for Text-to-Image Synthesis
Minguk Kang, Jun-Yan Zhu, Richard Zhang, Jaesik Park, Eli Shechtman, Sylvain Paris, Taesung Park

Self-Supervised Representation Learning for CAD
Benjamin Jones, Michael Hu, Milin Kodnongbua, Vladimir G. Kim, Adriana Schulz

SimpSON: Simplifying Photo Cleanup with Single-Click Distracting Object Segmentation Network
Chuong Huynh, Yuqian Zhou, Zhe Lin, Connelly Barnes, Eli Shechtman, Sohrab Amirghodsi, Abhinav Shrivastava

Single View Scene Scale Estimation using Scale Field
Byeong-Uk Lee, Jianming Zhang, Yannick Hold-Geoffroy, In So Kweon

SVGformer: Representation Learning for Continuous Vector Graphics using Transformers
Defu Cao, Zhaowen Wang, Jose Echevarria, Yan Liu

TopNet: Transformer-based Object Placement Network for Image Compositing
Sijie Zhu, Zhe Lin, Scott Cohen, Jason Kuen, Zhifei Zhang, Chen Chen

Towards Open-World Segmentation of Parts
Tai-Yu Pan, Qing Liu, Wei-Lun Chao, Brian Price

Towards Transferable Targeted Adversarial Examples
Zhibo Wang, Hongshan Yang, Yunhe Feng, Peng Sun, Hengchang Guo, Zhifei Zhang, Kui Ren

Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training
Dezhao Luo, Jiabo Huang, Shaogang Gong, Hailin Jin, Yang Liu

3D Cinemagraphy from a Single Image
Xingyi Li, Zhiguo Cao, Huiqiang Sun, Jianming Zhang, Ke Xian, Guosheng Lin

Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models
Qiucheng Wu, Yujian Liu, Handong Zhao, Ajinkya Kale, Trung Bui, Tong Yu, Zhe Lin, Yang Zhang, Shiyu Chang

Unsupervised 3D Shape Reconstruction by Part Retrieval and Assembly
Xianghao Xu, Paul Guerrero, Matthew Fisher, Siddhartha Chaudhuri, Daniel Ritchie

Workshop Papers

EKILA: Synthetic Media Provenance and Attribution for Generative Art
Kar Balan, Shruti Agarwal, Simon Jenni, Andy Parsons, Andrew Gilbert, John Collomosse
CVPR Workshop on Media Forensics

RoSteALS: Robust Steganography using Autoencoder Latent Space
Tu Bui, Shruti Agarwal, Ning Yu, John Collomosse
CVPR Workshop on Media Forensics

Scene Graph Driven Text-Prompt Generation for Image Inpainting
Tripti Shukla, Paridhi Maheshwari, Rajhans Singh, Ankita Shukla, Kuldeep Kulkarni, Pavan Turaga
GCV Workshop

Workshop Co-organizers

Srikrishna Karanam is a co-organizer of the Fourth Workshop on Fair, Data-efficient, and Trusted Computer Vision
Chun-Hao Huang is a co-organizer of the First Workshop on Reconstruction of Human-Object Interactions (RHOBIN)
Paul Guerrero is a co-organizer of the StruCo3D Workshop: Structural and Compositional Learning on 3D Data
Yijun Li is a co-organizer of the AI4CC: AI for Content Creation Workshop

Invited Talks

Aaron Hertzmann is on a panel on Vision, Language, and Creativity
John Collomosse is on a panel at the Workshop on Media Forensics
Aaron Hertzmann is a keynote speaker at the CVFAD Workshop
John Collomosse is a keynote speaker at the Video Copy Detection Workshop

Area Chairs

Bryan Russell
Jimei Yang
John Collomosse
Aaron Hertzmann
Yijun Li
Brian L. Price
Niloy Mitra
Richard Zhang

June 21, 2023

Tags: AI & Machine Learning, Computer Vision, Imaging & Video, Conferences
