How Adobe Research is helping unlock the intelligence inside trillions of PDFs

April 11, 2024

Tags: AI & Machine Learning, Document Intelligence

Tons of information lives in the world’s three trillion PDFs, and now Adobe researchers are using generative AI and machine learning (ML) to help users unlock that knowledge.

The new AI Assistant for Reader and Acrobat, now available in beta, can answer a user’s question about information in a PDF, generate summaries and insights, and format its findings for emails, reports, and presentations. The Assistant also takes a unique approach to AI-generated information: built-in features show readers how it produced its answers, and privacy protections cover the information inside each document.

“With the new interactive AI Assistant, anyone with document-related tasks—those in finance, marketing, legal, academia, people in government, you name it—can have a conversation with a PDF. They can easily tap into the information they need with a question like ‘What are the key takeaways in this document?’ or a request such as ‘Write an email about this paper,’” explains Tong Sun, Research Lab Director for Adobe Research.

Crafting AI for the unique world of PDFs

The team at Adobe Research began working on AI-powered features for PDFs in the early days of large language models (LLMs). In prototypes, they extensively explored and evaluated how generative AI could change the way people interact with information, and they developed innovative prompting strategies, grounded in PDF content, to help interactions run smoothly. Along the way, the team created proprietary AI and ML models that go beyond text-based LLMs and can understand the richness of the tables and charts that live inside PDFs.

The Adobe Research team also wanted to make sure users can rely on the information they get from the AI Assistant. That’s why they developed new technologies that allow the AI to explain its sources and reasoning. For example, when analyzing a financial report, a user may ask the AI Assistant, “What was the total revenue for this company in 2022?” If the information isn’t explicit in the document, the AI Assistant can reason from other data within the document, for example by finding the quarterly revenue figures and adding them. The AI Assistant then shares its answer, explains its reasoning, and cites the data it used.
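To make the revenue example concrete, here is a minimal Python sketch of the idea of deriving a total from quarterly figures while keeping track of where each figure came from. It is an illustration only and not Adobe’s implementation: the `CitedFigure` type, the `total_with_citations` function, the page numbers, and the revenue values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CitedFigure:
    """A number found in a document, plus where it was found (hypothetical type)."""
    label: str    # description of the figure, e.g. "Q1 2022 revenue"
    value: float  # amount in millions of dollars (made-up values below)
    page: int     # page of the document where the figure appears


def total_with_citations(figures):
    """Add the figures and return the answer, the arithmetic, and the sources."""
    total = sum(f.value for f in figures)
    steps = " + ".join(f"{f.value:g}" for f in figures)
    return {
        "answer": total,
        "reasoning": f"{steps} = {total:g}",
        "sources": [f"{f.label} (p. {f.page})" for f in figures],
    }


# Invented quarterly figures standing in for data extracted from a report.
quarterly = [
    CitedFigure("Q1 2022 revenue", 120.0, 4),
    CitedFigure("Q2 2022 revenue", 135.0, 7),
    CitedFigure("Q3 2022 revenue", 130.0, 10),
    CitedFigure("Q4 2022 revenue", 145.0, 13),
]

result = total_with_citations(quarterly)
print(result["answer"])
print(result["reasoning"])
print(result["sources"])
```

The point of the sketch is that the answer never travels alone: the arithmetic and the source locations are returned alongside it, so a reader can check each figure against the document.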

“When you’re looking at financial or legal documents, you don’t want to generate creative new ideas—it’s about the facts. So, one of Adobe Research’s key contributions was to cite where information is coming from so we can make that transparent,” says Sun. “It’s about building trust.”

What’s next for AI and PDFs?

The new AI Assistant can answer questions, suggest follow-up questions, and provide insights within a single document, but future iterations will be able to integrate information from multiple documents and handle more types of images and charts. And while the current version is available only in English, more languages will follow.

Further into the future, Sun imagines that the AI Assistant will evolve from consumption to creation. “Right now, PDFs feel like a final format. They’re something I save so everyone can see what I see. But consuming a document is often just the first step in creation. So, our future innovations will help unlock intelligence that can copilot the creation of something new.”

For example, future AI capabilities could help people use the knowledge from multiple documents to begin creating new things, from quickly generating and editing first drafts of articles and presentations to instantly changing their voice, tone, length, and layout for better user engagement. “Our research is all about unlocking the intelligence inside PDFs, and amplifying productivity and creativity for our users,” says Sun. “We’ll always need a human in the loop, but we’re enabling our users to do so much more.”

Reader and Acrobat customers will have access to the full range of AI Assistant capabilities through a new add-on subscription plan once AI Assistant is out of beta. Until then, the new AI Assistant features are available in beta for Acrobat Standard and Pro Individual and Teams subscription plans on desktop and web in English, with features coming to Reader desktop customers in English over the next few weeks, all at no additional cost. Other languages will follow. A private beta is available for enterprise customers.

