spindrift

PDF manipulation in Node.js! Split, join, crop, read, extract, boil, mash, stick them in a stew.

npm install spindrift

Example

var fs = require('fs'); // needed for the createWriteStream calls below
var spindrift = require('spindrift');

// Use and chain any of these commands...
var pdf = spindrift('in.pdf')
   .pages(7, 24)
   .page(1)
   .even()
   .odd()
   .rotate(90)
   .compress()
   .uncompress()
   .crop(100, 100, 300, 200); // left, bottom, right, top

// Join multiple files...
var pdfA = spindrift('1.pdf'), pdfB = spindrift('2.pdf'), pdfC = spindrift('3.pdf');
spindrift.join(pdfA.page(1), pdfB, pdfC.pages(5, 10)).deflate().pdfStream()...

// And output data as streams.
pdf.pdfStream().pipe(fs.createWriteStream('out.pdf')); // PDF of compiled output
pdf.pngStream(300).pipe(fs.createWriteStream('out-page1.png')); // PNG of first page at 300 dpi
pdf.textStream().pipe(process.stdout); // Individual text strings

// Extract content as text or images:
pdf.contentStream().on('data', console.log);
// { type: 'string', x: 1750, y: 594,
//   string: 'Reinhold Messner',
//   font: { height: 112, width: 116, font: 'ZSVUGH+Imago-Book' },
//   color: { r: 137, g: 123, b: 126 } }
// { type: 'image', x: 3049, y: 5680, width: 655, height: 810, index: 4 }

// Use the 'index' property of an image element to extract an image:
pdf.extractImageStream(0)
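
The two extraction calls above can be combined: note the `index` of each image element reported by `contentStream()`, then hand those indices to `extractImageStream()`. The following is a minimal sketch, assuming elements have the shape shown in the sample output. `createImageIndexCollector` is a hypothetical helper written for illustration and is not part of spindrift.

```javascript
// Hypothetical helper (not part of spindrift): gathers the `index` of each
// image element seen on a content stream. Assumes elements look like the
// sample output above, e.g. { type: 'image', ..., index: 4 }.
function createImageIndexCollector() {
  var indices = [];
  return {
    indices: indices,
    // Pass this as the 'data' listener of pdf.contentStream().
    onData: function (element) {
      if (element && element.type === 'image' && typeof element.index === 'number') {
        indices.push(element.index);
      }
    }
  };
}

// Intended use (assumes `pdf` was created with spindrift as shown above):
//   var collector = createImageIndexCollector();
//   pdf.contentStream().on('data', collector.onData);
//   // later, for each collected index i: pdf.extractImageStream(i)
```

When it is safe to read `collector.indices` depends on how the content stream signals completion. The README does not say, so treat any `'end'` handling as an assumption to verify.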

Requirements

  • Install PDFTK (http://www.pdflabs.com/docs/install-pdftk/) on your system.
  • Ensure you have Ghostscript installed (check by running gs --version).
  • (optional) To extract individual images from a page, install pdfimages: brew install xpdf (Homebrew on macOS) or apt-get install poppler-utils (Debian/Ubuntu).
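
To check the requirements above in one step, a small script can try to launch each tool. This is a rough sketch rather than anything spindrift ships. `isInstalled` is a hypothetical helper. `gs --version` comes from the list above, while the arguments passed to pdftk and pdfimages are assumptions. Only whether the binary starts is tested, not its exit code.

```javascript
var spawnSync = require('child_process').spawnSync;

// Hypothetical helper: true if `command` could be started, false otherwise.
function isInstalled(command, args) {
  var result = spawnSync(command, args || [], { stdio: 'ignore' });
  // spawnSync sets `error` (e.g. ENOENT) when the executable cannot be launched.
  return !result.error;
}

// Tools named in the requirements list. The arguments for pdftk and
// pdfimages are assumptions; any argument works for a launch check.
var tools = [
  { name: 'pdftk', args: ['--version'], required: true },
  { name: 'gs', args: ['--version'], required: true },
  { name: 'pdfimages', args: ['-v'], required: false }
];

tools.forEach(function (tool) {
  var found = isInstalled(tool.name, tool.args);
  var status = found ? 'found' : (tool.required ? 'MISSING' : 'missing (optional)');
  console.log(tool.name + ': ' + status);
});
```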
