What metadata is stored in a PDF file?

PDFs can store descriptive metadata including title, author, subject, keywords, creation date, modification date, creator application, and PDF producer, plus custom properties. Viewers also report derived details such as page count and file size alongside these fields. XMP metadata can carry more detailed information, such as copyright and licensing terms, and embedded images may bring their own metadata (for example GPS coordinates) with them.

Can PDF metadata reveal sensitive information?

Yes. PDF metadata can expose the original author's name, organization, software used, creation and edit timestamps, and sometimes even file paths from the creator's computer. For sensitive documents, always review and clean metadata before sharing.

How do I remove all metadata from a PDF?

Use a metadata editor tool to strip all properties at once. This removes author names, timestamps, creator applications, and custom fields. Some tools let you selectively keep certain fields while removing others, giving you control over exactly what information remains.

Does editing metadata change the PDF content?

No. Metadata editing only changes the document's properties—the visible content on every page remains completely untouched. Your text, images, formatting, and layout are not affected in any way by metadata changes.

PDF Metadata: What It Is and How to Edit It

March 31, 2026 · 12 min read

Table of Contents

What Is PDF Metadata?
Types of PDF Metadata
Why Metadata Matters
How to View PDF Metadata
How to Edit PDF Metadata
Privacy and Security Concerns
Metadata Standards and Schemas
Using Metadata to Compare Documents
Metadata in Professional Workflows
Troubleshooting Common Metadata Issues
Frequently Asked Questions
Related Articles

What Is PDF Metadata?

Every PDF file carries hidden information that most users never see. This invisible layer of data—called metadata—describes the document itself rather than its visible content. Think of it as a detailed label on a package: it tells you who created it, when it was made, what software was used, and much more, all without opening the document to read its pages.

PDF metadata serves essential functions in document management, search, organization, and compliance. Libraries use metadata to catalog digital collections. Legal teams rely on metadata timestamps to establish document provenance. SEO specialists optimize PDF metadata to improve search engine rankings. Organizations use metadata standards to maintain consistent document properties across thousands of files.

Understanding metadata isn't just for power users—it's important for anyone who creates or shares PDFs. The metadata in your documents might reveal more about you and your workflow than you realize, and knowing how to control it gives you power over your digital privacy and professional image.

Metadata exists in two primary layers within a PDF file. The first is the Document Information Dictionary, a legacy format that's been part of PDF since version 1.0. The second is XMP (Extensible Metadata Platform), introduced in PDF 1.4, which uses XML to store more complex and extensible metadata. Modern PDFs typically contain both formats for backward compatibility.

Quick tip: You can view basic PDF metadata in most PDF readers by opening File > Properties or pressing Ctrl+D (Windows) or Cmd+D (Mac). This reveals the document's title, author, creation date, and other standard fields.

Types of PDF Metadata

Document Information Dictionary

The most basic form of PDF metadata, the Document Information Dictionary has been part of the PDF specification since its earliest versions. It stores standard properties that appear in virtually every PDF reader's document properties dialog.

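The eight most commonly used fields in the Document Information Dictionary are listed below (the specification also defines a ninth, Trapped, which is used in prepress workflows):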

Title: The document's title, which may differ from the filename
Author: The person who created the document
Subject: A brief description of the document's topic
Keywords: Search terms relevant to the document content
Creator: The application that created the original document (e.g., "Microsoft Word")
Producer: The application that converted the document to PDF (e.g., "Adobe PDF Library 15.0")
CreationDate: When the document was first created
ModDate: When the document was last modified

These fields are simple text strings (except for dates, which use a specific format). While they're called "standard," they're all optional—a PDF can exist with none of these fields populated.
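
The date format mentioned above is D:YYYYMMDDHHmmSSOHH'mm', where everything after the year is optional and O is +, -, or Z. The following is a minimal sketch of how such a string could be converted to a Python datetime; the helper name parse_pdf_date is our own and not part of any library.

```python
import re
from datetime import datetime, timedelta, timezone

# Pattern for PDF date strings such as "D:20260331142530+02'00'".
# Every component after the year is optional.
_PDF_DATE = re.compile(
    r"^(?:D:)?(?P<year>\d{4})(?P<month>\d{2})?(?P<day>\d{2})?"
    r"(?P<hour>\d{2})?(?P<minute>\d{2})?(?P<second>\d{2})?"
    r"(?:(?P<sign>[+\-Z])(?:(?P<tz_hour>\d{2})'?(?P<tz_minute>\d{2})?'?)?)?$"
)


def parse_pdf_date(value: str) -> datetime:
    """Convert a PDF date string into a datetime (hypothetical helper)."""
    match = _PDF_DATE.match(value.strip())
    if match is None:
        raise ValueError(f"not a PDF date string: {value!r}")
    parts = match.groupdict()

    tzinfo = None  # no offset given: keep the result timezone-naive
    if parts["sign"] == "Z":
        tzinfo = timezone.utc
    elif parts["sign"] in ("+", "-"):
        offset = timedelta(
            hours=int(parts["tz_hour"] or 0),
            minutes=int(parts["tz_minute"] or 0),
        )
        tzinfo = timezone(offset if parts["sign"] == "+" else -offset)

    # Missing month/day default to 1, missing time parts default to 0.
    return datetime(
        int(parts["year"]),
        int(parts["month"] or 1),
        int(parts["day"] or 1),
        int(parts["hour"] or 0),
        int(parts["minute"] or 0),
        int(parts["second"] or 0),
        tzinfo=tzinfo,
    )
```

Real files sometimes contain malformed dates, so production code would probably want to catch the ValueError rather than let it propagate.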

XMP Metadata

XMP (Extensible Metadata Platform) is a standard for embedding metadata in files, created by Adobe and later published as ISO 16684-1. Introduced in 2001, XMP uses XML to store metadata in a structured, extensible format that can accommodate custom properties and complex relationships.

XMP metadata is organized into namespaces, each serving a specific purpose:

Dublin Core (dc): Basic bibliographic information like title, creator, description, and subject
XMP Basic (xmp): Fundamental properties including creation date, modification date, and creator tool
XMP Rights Management (xmpRights): Copyright and usage rights information
PDF Schema (pdf): PDF-specific properties like keywords, PDF version, and producer
Photoshop Schema (photoshop): Image-specific metadata when PDFs contain photos
EXIF: Camera and image capture data for photographs
IPTC: Journalism and media industry metadata standards

XMP's XML structure allows for much richer metadata than the simple key-value pairs of the Document Information Dictionary. You can store arrays of values, nested structures, and custom properties specific to your organization or workflow.
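
To make the array and nested-structure point concrete, here is a rough sketch that locates an XMP packet in raw PDF bytes and reads a few Dublin Core and XMP Basic properties. It assumes the metadata stream is stored uncompressed and unencrypted (common, but not guaranteed) and that the first packet found is the document-level one. The function names are illustrative only.

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Namespace URIs for the RDF wrapper, Dublin Core, and XMP Basic schemas.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "xmp": "http://ns.adobe.com/xap/1.0/",
}


def find_xmp_packet(pdf_bytes: bytes) -> Optional[str]:
    """Return the first XMP packet found in the raw bytes, if any."""
    match = re.search(rb"<x:xmpmeta.*?</x:xmpmeta>", pdf_bytes, re.DOTALL)
    if match is None:
        return None  # no packet, or the stream is compressed/encrypted
    return match.group(0).decode("utf-8", errors="replace")


def read_xmp_fields(packet: str) -> Dict[str, object]:
    """Pull a few common properties out of an XMP packet."""
    root = ET.fromstring(packet)
    fields: Dict[str, object] = {}

    # dc:title is a language alternative: rdf:Alt with one rdf:li per language.
    title = root.find(".//dc:title/rdf:Alt/rdf:li", NS)
    if title is not None and title.text:
        fields["title"] = title.text

    # dc:creator is an ordered array: rdf:Seq with one rdf:li per author.
    creators = [
        li.text
        for li in root.findall(".//dc:creator/rdf:Seq/rdf:li", NS)
        if li.text
    ]
    if creators:
        fields["creators"] = creators

    # Simple properties written as elements (attribute shorthand is not handled).
    tool = root.find(".//xmp:CreatorTool", NS)
    if tool is not None and tool.text:
        fields["creator_tool"] = tool.text
    return fields
```

For anything beyond a quick inspection, a dedicated PDF or XMP library is a safer choice than scanning raw bytes.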

Structural Metadata

Beyond descriptive metadata, PDFs contain structural metadata that defines how the document is organized:

Page labels: Custom numbering schemes (Roman numerals for front matter, Arabic for body)
Bookmarks: Navigation structure and outline hierarchy
Document structure tags: Semantic markup for accessibility (headings, paragraphs, lists)
Logical structure: Reading order and content relationships
Attachments: Embedded files and their descriptions

This structural metadata is crucial for accessibility, navigation, and document understanding by assistive technologies.

Technical Metadata

PDFs also store technical information about the file itself:

PDF version: Which version of the PDF specification the file conforms to
Page dimensions: Size of each page in points
Color space: RGB, CMYK, or other color models used
Font information: Embedded fonts and their properties
Compression methods: How images and content streams are compressed
Encryption settings: Security restrictions and permissions
Linearization: Whether the PDF is optimized for web viewing

This technical metadata is typically managed automatically by PDF creation software and isn't meant for manual editing.
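
Although you would not normally edit these values, reading them is straightforward. As a small illustration, the PDF version appears in a %PDF-x.y header at the start of the file. The sketch below reads it from raw bytes; the function name is our own.

```python
import re
from typing import Optional


def pdf_header_version(data: bytes) -> Optional[str]:
    """Return the version from the %PDF-x.y header, or None if it is missing."""
    # The header is expected at the very start of the file, but some files
    # carry a few stray bytes first, so search the first kilobyte instead.
    match = re.search(rb"%PDF-(\d\.\d)", data[:1024])
    return match.group(1).decode("ascii") if match else None
```

Keep in mind that a Version entry in the document catalog can override the header, so this value is a starting point rather than the final word.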

Metadata Type	Format	Primary Use	User Editable
Document Info Dictionary	Key-value pairs	Basic document properties	Yes
XMP Metadata	XML	Extended properties, rights management	Yes
Structural Metadata	PDF objects	Navigation, accessibility	Partially
Technical Metadata	PDF internal structures	File specifications, rendering	No

Why Metadata Matters

Document Organization and Searchability

Proper metadata transforms a collection of files into a searchable, organized library. When you store hundreds or thousands of PDFs, filenames alone aren't enough to find what you need quickly.

Well-maintained metadata enables:

Desktop search: Operating systems index PDF metadata, making documents findable through system search
Document management systems: Enterprise systems rely on metadata for categorization and retrieval
Digital asset management: Creative teams use metadata to track versions, rights, and usage
Research databases: Academic institutions catalog papers using standardized metadata schemas

A PDF named "Q4_Report_Final_v3_FINAL.pdf" tells you very little. But metadata fields for Title ("Q4 2025 Financial Report"), Author ("Finance Department"), Subject ("Quarterly earnings and projections"), and Keywords ("revenue, expenses, forecast, 2025") make that document far easier to find.

SEO and Web Visibility

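Search engines can index PDFs they find when crawling websites. The Title field in particular may be used as the headline shown in search results. How much weight Google, Bing, and other engines give to the Author, Subject, and Keywords fields is not publicly documented, so treat those fields as supporting signals rather than ranking guarantees.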

Optimizing PDF metadata for SEO involves:

Writing descriptive, keyword-rich titles that match search intent
Including relevant keywords in the Subject and Keywords fields
Ensuring the Author field reflects your brand or organization
Keeping metadata consistent with the document's actual content

A white paper named "Document1.pdf" with no metadata is likely to be harder to find, and less inviting in search results, than one titled "Complete Guide to Cloud Security Best Practices 2026" with carefully written metadata fields.

Legal and Compliance Requirements

In legal, financial, and regulated industries, metadata serves as evidence of document authenticity and chain of custody. Courts may consider metadata as evidence of when documents were created and modified, although it can be altered and is usually weighed alongside other evidence.

Legal teams use metadata to:

Establish document timelines in litigation
Verify document authenticity and detect tampering
Track document versions and revisions
Comply with discovery requirements in legal proceedings
Meet regulatory record-keeping standards

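Financial institutions must maintain audit trails showing when documents were created, who created them, and what changes were made. Metadata contributes to this audit trail, though the standard fields record only the creation date and the most recent modification, not a full change history.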

Professional Presentation

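Metadata affects how your documents appear to recipients. When someone opens your PDF, the viewer's title bar or tab can display the Title field instead of the filename, provided the document is set to show its title (the DisplayDocTitle viewer preference, exposed in Acrobat under Initial View). A professional title makes a better impression than "Untitled" or a cryptic filename.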

Complete metadata signals professionalism and attention to detail. It shows you care about document quality beyond just the visible content.

Pro tip: Before sharing any PDF externally, review its metadata using our Metadata Editor tool. Remove any internal information, set a professional title, and ensure the author field reflects how you want to be identified.

How to View PDF Metadata

Using Adobe Acrobat Reader

Adobe Acrobat Reader, one of the most widely used PDF viewers, provides easy access to document metadata:

Open your PDF in Acrobat Reader
Go to File > Properties or press Ctrl+D (Windows) or Cmd+D (Mac)
The Document Properties dialog opens, showing the Description tab by default
View Title, Author, Subject, and Keywords in the Description tab
Click the Additional Metadata button for XMP metadata
Switch to other tabs (Security, Fonts, Initial View) for additional information

The Additional Metadata dialog shows the complete XMP metadata in a tree structure, organized by namespace. You can expand each namespace to see all properties and their values.

Using Other PDF Readers

Most PDF readers provide similar functionality, though the exact menu location varies:

Foxit Reader: File > Properties or Ctrl+D
PDF-XChange Editor: File > Document Properties
Sumatra PDF: File > Properties
Preview (Mac): Tools > Show Inspector, then click the Info tab
Evince (Linux): File > Properties

Browser-based PDF viewers (Chrome, Firefox, Edge) typically show limited metadata or none at all. For complete metadata access, use a dedicated PDF application.

Using Command-Line Tools

For batch processing or automation, command-line tools extract metadata efficiently:

ExifTool (cross-platform):

exiftool document.pdf

This displays all metadata fields in a readable format. To extract specific fields:

exiftool -Title -Author -Subject document.pdf

pdfinfo (part of Poppler utilities on Linux/Mac):

pdfinfo document.pdf

pdftk (PDF Toolkit):

pdftk document.pdf dump_data

These tools are invaluable for scripting and batch operations across large document collections.
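
As one possible way to script this, the sketch below runs pdfinfo on a file and turns its "Key: value" output into a dictionary. It assumes pdfinfo is installed and on the PATH. Both function names are our own.

```python
import subprocess
from typing import Dict


def parse_pdfinfo(text: str) -> Dict[str, str]:
    """Turn pdfinfo's 'Key:   value' lines into a dictionary."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        # Split on the first colon only: date values contain colons too.
        key, separator, value = line.partition(":")
        if separator:
            fields[key.strip()] = value.strip()
    return fields


def read_metadata(path: str) -> Dict[str, str]:
    """Run pdfinfo on one file and return its parsed output."""
    result = subprocess.run(
        ["pdfinfo", path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if pdfinfo fails
    )
    return parse_pdfinfo(result.stdout)
```

Calling read_metadata in a loop over a folder of PDFs gives you one dictionary per file, which is easy to write out as CSV or feed into a document management system.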

Using Online Tools

Web-based tools offer convenient metadata viewing without installing software. Our PDF Metadata Viewer lets you upload a PDF and instantly see all metadata fields in an organized interface.

Online tools are ideal for quick checks, but be cautious about uploading sensitive documents to third-party services. For confidential files, use local software instead.

How to Edit PDF Metadata

Editing in Adobe Acrobat Pro

Adobe Acrobat Pro (the paid version) allows full metadata editing:

Open your PDF in Acrobat Pro
Go to File > Properties or press Ctrl+D
In the Description tab, click in any field to edit it
Modify Title, Author, Subject, and Keywords as needed
Click Additional Metadata to edit XMP properties
In the Advanced panel, you can add custom properties
Click OK to save changes

Acrobat Pro also offers batch metadata editing through Action Wizard, allowing you to apply the same metadata changes to multiple files simultaneously.

Editing in Free PDF Editors

Several free PDF editors support metadata editing:

PDF-XChange Editor (free version):

File > Document Properties > Description tab
Edit fields directly and click OK to save

LibreOffice Draw:

Open PDF in LibreOffice Draw
File > Properties > Description tab
Edit metadata and export as PDF

PDFtk Free:

Windows GUI for PDFtk with metadata editing interface
Simple form-based editing of standard fields

Note that free tools often have limitations—they may not support XMP metadata editing or custom properties.

Editing with Command-Line Tools

For automation and batch processing, command-line tools are most efficient:

ExifTool can modify most metadata fields:

exiftool -Title="New Title" -Author="John Smith" document.pdf

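To apply the same values to every PDF in the current directory (ExifTool keeps a backup of each file with an _original suffix unless you pass -overwrite_original):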

exiftool -Title="Annual Report" -Author="Finance Dept" *.pdf

pdftk uses a two-step process:

# Extract metadata to a text file
pdftk document.pdf dump_data output metadata.txt

# Edit metadata.txt with a text editor

# Update the PDF with modified metadata
pdftk document.pdf update_info metadata.txt output document_updated.pdf

This approach works well for scripted workflows and integration with other systems.
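
The manual editing step in the middle can also be scripted. The sketch below changes one value in the text that dump_data produces, assuming the InfoBegin / InfoKey / InfoValue layout written by recent pdftk versions. The function name set_info_value is hypothetical.

```python
def set_info_value(dump: str, key: str, value: str) -> str:
    """Return dump_data text with the given Info key set to a new value."""
    lines = dump.splitlines()
    for index, line in enumerate(lines):
        is_key = line.strip() == f"InfoKey: {key}"
        has_value = index + 1 < len(lines) and lines[index + 1].startswith("InfoValue:")
        if is_key and has_value:
            # Replace the value line that follows the matching key line.
            lines[index + 1] = f"InfoValue: {value}"
            break
    else:
        # Key not present: add a new record (layout is an assumption, see above).
        lines = ["InfoBegin", f"InfoKey: {key}", f"InfoValue: {value}"] + lines
    return "\n".join(lines) + "\n"
```

You would read metadata.txt, pass its contents through this function, write the result back, and then run the update_info command shown above.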

Using Online Metadata Editors

Our PDF Metadata Editor provides a user-friendly interface for editing metadata without installing software:

Upload your PDF file
View current metadata in organized fields
Edit any field you want to change
Add new custom properties if needed
Download the updated PDF with modified metadata

The tool preserves all document content and formatting while updating only the metadata layer. It's perfect for quick edits and one-off changes.

Removing Metadata Entirely

Sometimes you want to strip all metadata from a PDF for privacy reasons:

Adobe Acrobat Pro:

Tools > Redact > Remove Hidden Information
Select metadata items to remove
Click Remove to clean the document

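ExifTool:

exiftool -all= document.pdf

ExifTool's PDF edits are reversible: it writes an incremental update, so the old metadata remains inside the file and can be restored. To make the removal permanent, rewrite the file afterward, for example by linearizing it with qpdf (qpdf --linearize document.pdf clean.pdf).

pdftk:

pdftk document.pdf output clean.pdf

This rewrites the file, but a plain rewrite is not guaranteed to drop the Info dictionary or the XMP packet. Check the result with pdfinfo or exiftool before relying on it.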

Our Metadata Remover tool strips all metadata while preserving document content, ideal for sharing documents publicly without revealing internal information.

Pro tip: Before removing metadata, save a copy of the original file. Some workflows require metadata for document management, and once removed, it can't be recovered without the original.

Privacy and Security Concerns

What Metadata Can Reveal

PDF metadata can expose information you didn't intend to share. Every time you create or edit a PDF, metadata accumulates, potentially revealing:

Your identity: Author field often contains your full name or username
Your organization: Company name in the Author field or in custom properties
Your software: Specific applications and versions you use
Your location: File paths may include computer names or network locations
Document history: Creation and modification timestamps reveal workflow patterns
Editing activity: Number of revisions and time spent editing
Internal comments: Hidden annotations or review comments

In 2003, the UK government published a dossier on Iraq as a Word document, and the file's revision log exposed the names of the people who had last edited it. In 2013, metadata in a PDF released by the NSA reportedly revealed the identity of a redacted name. These cases demonstrate how metadata can undermine confidentiality.

Metadata in Sensitive Documents

Certain document types require extra metadata scrutiny:

Legal documents: May reveal attorney-client privileged information or work product
Financial reports: Can expose internal systems and processes
Medical records: May contain patient identifiers beyond the visible content
Government documents: Could reveal classified information or sources
Whistleblower submissions: Metadata can identify the source
Anonymous publications: Author information defeats anonymity

Before sharing sensitive documents, always review and clean metadata. Many organizations have policies requiring metadata removal from externally shared files.

Best Practices for Metadata Privacy

Protect your privacy with these metadata management practices:

Review before sharing: Always check metadata before sending PDFs externally
Use generic author names: Set author to your organization name rather than personal name
Remove metadata from public documents: Strip all metadata from PDFs posted on websites
Configure PDF creation software: Set default metadata values that don't reveal personal information
Use metadata removal tools: Automate cleaning for documents leaving your organization
Educate your team: Ensure everyone understands metadata privacy implications
Implement document policies: Create organizational standards for metadata handling
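
The "review before sharing" step can be partly automated. The sketch below scans a dictionary of metadata fields for a few obvious warning signs: a populated Author field, email addresses, and file paths. The patterns and the function name are our own assumptions, and they will miss plenty of cases, so treat this as a first-pass filter rather than a guarantee.

```python
import re
from typing import Dict, List, Tuple

# Illustrative patterns only; real policies will need more than these.
PATTERNS = {
    "email address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "Windows path": re.compile(r"[A-Za-z]:\\\S+"),
    "network share": re.compile(r"\\\\[^\s\\]+\\\S+"),
    "home directory": re.compile(r"/(?:home|Users)/[^\s/]+"),
}

# Fields that usually hold a person's name.
PERSONAL_FIELDS = {"Author"}


def find_privacy_risks(metadata: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (field, reason) pairs for values that deserve a second look."""
    findings: List[Tuple[str, str]] = []
    for field, value in metadata.items():
        if field in PERSONAL_FIELDS and value.strip():
            findings.append((field, "personal name field is populated"))
        for label, pattern in PATTERNS.items():
            if pattern.search(value):
                findings.append((field, f"contains {label}"))
    return findings
```

An empty result does not prove a file is clean. It only means none of these particular patterns matched.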

Metadata and GDPR Compliance

Under GDPR and similar privacy regulations, metadata containing personal information is subject to the same protections as document content. If metadata includes names, email addresses, or other identifiers, it's considered personal data.

Organizations must:

Include metadata in data protection impact assessments
Respond to subject access requests by providing metadata
Honor right-to-erasure requests by removing metadata
Implement appropriate security measures for metadata
Document metadata handling in privacy policies

Failure to manage metadata properly can result in GDPR violations and significant fines.

Document Type	Privacy Risk	Recommended Action
Internal memos	Low	Keep metadata for document management
Client proposals	Medium	Review and clean sensitive fields
Public white papers	High	Remove all metadata except title and author
Legal filings	High	Strip all metadata, verify with tools
Anonymous submissions	Critical	Complete metadata removal, use clean system

Metadata Standards and Schemas

Dublin Core

Dublin Core is one of the most widely adopted metadata standards, originally developed for describing web resources but now used across many document types including PDFs. It defines 15 core elements:

Title, Creator, Subject, Description, Publisher
Contributor, Date, Type, Format, Identifier
Source, Language, Relation, Coverage, Rights

Dublin Core's simplicity makes it ideal for basic document description. Libraries, archives, and digital repositories commonly use Dublin Core for cataloging PDFs.

PDF/A Metadata Requirements

PDF/A, the ISO standard for long-term archiving, has specific metadata requirements to ensure documents remain accessible decades into the future:

XMP metadata must be present and valid
Document Information Dictionary must match XMP metadata
Title field must be populated
Metadata must be embedded in the file, not referenced externally
Custom metadata schemas must be properly declared

PDF/A-compliant documents ensure metadata survives format migrations and remains readable by future software.
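
The requirement that the Document Information Dictionary match the XMP metadata can be checked mechanically once both have been read into dictionaries. The sketch below compares the text fields using the usual Info-to-XMP correspondence. Dates are left out because the two formats differ and would need normalizing first. The function name and the flat dictionary layout are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Usual correspondence between Info dictionary keys and XMP properties.
INFO_TO_XMP = {
    "Title": "dc:title",
    "Author": "dc:creator",
    "Subject": "dc:description",
    "Keywords": "pdf:Keywords",
    "Creator": "xmp:CreatorTool",
    "Producer": "pdf:Producer",
}


def find_mismatches(
    info: Dict[str, str], xmp: Dict[str, str]
) -> List[Tuple[str, str, Optional[str]]]:
    """List Info fields whose XMP counterpart is missing or different."""
    mismatches = []
    for info_key, xmp_key in INFO_TO_XMP.items():
        if info_key not in info:
            continue  # nothing to compare when the Info field is absent
        xmp_value = xmp.get(xmp_key)
        if info[info_key] != xmp_value:
            mismatches.append((info_key, info[info_key], xmp_value))
    return mismatches
```

A dedicated PDF/A validator checks far more than this, so a clean result here is a useful hint, not a compliance verdict.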

Industry-Specific Schemas

Different industries have developed specialized metadata schemas for their needs:

PRISM (Publishing Requirements for Industry Standard Metadata):

Used by publishers for journals, magazines, and books
Includes fields for ISSN, volume, issue, page numbers
Supports rights management and distribution information

IPTC (International Press Telecommunications Council):

Standard for news and media organizations
Includes fields for byline, headline, caption, copyright
Supports location data and subject categorization

MARC (Machine-Readable Cataloging):

Library standard for bibliographic data
Comprehensive cataloging information
Used by academic and public libraries worldwide

Creating Custom Metadata Schemas

Organizations can define custom metadata schemas for internal needs. XMP's extensibility allows you to create custom namespaces with your own properties:

Define your namespace URI (e.g., "http://yourcompany.com/metadata/1.0/")
Create a schema document describing your properties
Implement the schema in your PDF creation workflow
Document the schema for future reference

Custom schemas are useful for tracking internal document properties like project codes, approval status, or department classifications.
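
As a rough illustration of the first steps, the sketch below builds the RDF portion of an XMP packet with properties in a custom namespace. The namespace URI is the placeholder from the list above, and the "yc" prefix, the function name, and the property names are made up for the example. A complete packet also needs the x:xmpmeta wrapper, and embedding it in a PDF requires a PDF library, neither of which is shown here.

```python
import xml.etree.ElementTree as ET
from typing import Dict

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CUSTOM_NS = "http://yourcompany.com/metadata/1.0/"  # placeholder URI


def build_custom_description(properties: Dict[str, str]) -> str:
    """Serialize custom properties as an rdf:RDF block (illustrative only)."""
    # Register prefixes so the output uses rdf: and yc: instead of ns0:, ns1:.
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("yc", CUSTOM_NS)  # "yc" is a hypothetical prefix

    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": ""}
    )
    for name, value in properties.items():
        # Each property becomes a simple element in the custom namespace.
        child = ET.SubElement(description, f"{{{CUSTOM_NS}}}{name}")
        child.text = value
    return ET.tostring(rdf, encoding="unicode")
```

Calling build_custom_description({"ProjectCode": "FIN-2025", "ApprovalStatus": "approved"}) yields an rdf:RDF element containing yc:ProjectCode and yc:ApprovalStatus, which matches the kind of internal tracking fields described above.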

Pro tip: When implementing custom metadata schemas, maintain backward compatibility by also populating standard fields. This ensures your documents remain usable even if custom schema support isn't available.

Using Metadata to Compare Documents

Identifying Document Versions

Metadata provides crucial clues for identifying which version of a document you're looking at. When you have multiple files with similar names, metadata helps determine which is most recent and authoritative.

Key metadata fields for version identification:

ModDate: Shows when the document was last modified
Version numbers: Some documents include version in the Subject or Keywords field