PDFspy
PDFspy is the ultimate “get info” utility for your PDF documents. It can extract a comprehensive list of attributes from a PDF file into an XML-based format.
New features and enhancements include:
- Support for PDF 1.7/ISO 32000 (Acrobat 9, X, DC)
- Element now shows CMYK separations that are actually used by text and vector elements
- New element that shows the number of shading objects in the PDF file
- Restored writing output to stdout when the -o option is not used; the -quiet option is recommended when writing to stdout
- Fixed calculation of page labels
- Improved text extraction algorithm
- Calculates color simulation values for ICCBased, Separation and DeviceN colorspaces
- Improved Unicode, ISO Latin and AdobePDF character set support
Some examples of the many types of information PDFspy can extract:
- Page information (count, size, boxes)
- Font usage (name, type, embedding & subset status, use of Unicode)
- Colorspaces used (alternates, separation names, index bases)
- Images (size, resolution, compression, colorspace)
- Use of transparency, smooth shadings and patterns
- Presence (or absence) of hidden text and optional content/layers
- Hyperlinks (size, location and destination)
- Annotations (size, location, type, contents, colors)
- PDF/X compliance (including output intent details)
- Metadata (info dictionary & XMP)
- Security and Encryption settings
Example uses:
- Asset management system: extract page count, metadata, font & image information
- Document management: identify text-only or image-only documents, extract comments
- Preflight: extract information about colorspaces, compression & font types
- Developers: easily examine the structure of complex PDF documents
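As a rough illustration of the asset management use above, a script could read the XML report and pull out a page count and font embedding status. This is only a sketch: the element and attribute names (report, pages, count, font, name, embedded) and the summarize helper are invented for the example and do not reflect PDFspy's actual output schema, which should be checked against a real report.

```python
import xml.etree.ElementTree as ET

# Hypothetical report: these element and attribute names are invented
# for illustration and are NOT PDFspy's actual output schema.
SAMPLE_REPORT = """\
<report>
  <pages count="2"/>
  <fonts>
    <font name="Helvetica" embedded="false"/>
    <font name="Minion-Regular" embedded="true"/>
  </fonts>
</report>
"""


def summarize(xml_text):
    """Return the page count and the names of fonts that are not embedded."""
    root = ET.fromstring(xml_text)

    # Page count is read from an assumed <pages count="..."/> element.
    pages = root.find("pages")
    page_count = int(pages.get("count")) if pages is not None else None

    # Collect every assumed <font> element whose embedded flag is not "true".
    unembedded = [
        font.get("name")
        for font in root.iter("font")
        if font.get("embedded") != "true"
    ]
    return {"page_count": page_count, "unembedded_fonts": unembedded}


if __name__ == "__main__":
    print(summarize(SAMPLE_REPORT))
```

In practice the XML would come from a file written with the -o option, or from stdout, rather than from an embedded string; only the lookups inside summarize would need to change to match the real element names.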