Seeds Text Creative Commons License

The problem

Documents always contain more information than you want them to. Personal information. Private information. Embarrassing information. Removing it is not so easy. Black rectangles drawn over sensitive text do not hide it if the text is still present in the PDF underneath them.

The solution

Pdf2pdf solves this problem: it generates a WYSIWYG PDF from an arbitrary PDF, with no hidden surfaces and no hidden text.


The quick and dirty way to do this is easy. Fix a resolution. Render each page of the PDF file at this resolution, and store the resulting image in place of the original page in the result. Problem: large files, and loss of detail, because text and line art become pixels that no longer scale.
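To get a feel for the file-size side of that trade-off, here is a minimal sketch that estimates the pixel dimensions and uncompressed size of one rasterized page. It is only arithmetic, not part of pdf2pdf: the function name `raster_size` and the choice of 3 bytes per pixel (8-bit RGB) are assumptions made for illustration, and it relies on the default PDF unit of 1/72 inch. Real output would normally be compressed, so treat the numbers as an upper bound.

```python
POINTS_PER_INCH = 72  # default PDF user space unit: 1 point = 1/72 inch


def raster_size(width_pt, height_pt, dpi, bytes_per_pixel=3):
    """Return (width_px, height_px, uncompressed_bytes) for one page.

    Hypothetical helper: assumes an uncompressed raster with
    bytes_per_pixel bytes per pixel (3 = 8-bit RGB).
    """
    width_px = round(width_pt / POINTS_PER_INCH * dpi)
    height_px = round(height_pt / POINTS_PER_INCH * dpi)
    return width_px, height_px, width_px * height_px * bytes_per_pixel


if __name__ == "__main__":
    # US Letter page: 612 x 792 points (8.5 x 11 inches)
    for dpi in (72, 150, 300, 600):
        w, h, size = raster_size(612, 792, dpi)
        print(f"{dpi:>4} dpi: {w} x {h} px, {size / 1e6:.1f} MB uncompressed")
```

Doubling the resolution quadruples the pixel count, so a resolution high enough to keep small text legible quickly costs tens of megabytes per page before compression.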

The harder solution: for each page in the original PDF, and for each object on that page, perform hidden surface removal, and store only the shape that remains visible in the resulting PDF.
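The idea can be sketched on a deliberately simplified model. The code below assumes every object is an opaque, axis-aligned rectangle and that objects are listed in paint order (later ones are drawn on top). Real PDF objects are arbitrary paths, glyphs and images, possibly transparent or clipped, so this is an illustration of the principle and not pdf2pdf's actual implementation. The names `Rect`, `subtract` and `visible_parts` are made up for this example.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x0: float
    y0: float
    x1: float
    y1: float


def subtract(a: Rect, b: Rect) -> list[Rect]:
    """Return the parts of a not covered by b, as disjoint rectangles."""
    # Intersection of a and b
    ix0, iy0 = max(a.x0, b.x0), max(a.y0, b.y0)
    ix1, iy1 = min(a.x1, b.x1), min(a.y1, b.y1)
    if ix0 >= ix1 or iy0 >= iy1:
        return [a]  # no overlap: a is fully visible
    pieces = []
    if a.y0 < iy0:  # strip below the intersection
        pieces.append(Rect(a.x0, a.y0, a.x1, iy0))
    if iy1 < a.y1:  # strip above the intersection
        pieces.append(Rect(a.x0, iy1, a.x1, a.y1))
    if a.x0 < ix0:  # strip to the left of the intersection
        pieces.append(Rect(a.x0, iy0, ix0, iy1))
    if ix1 < a.x1:  # strip to the right of the intersection
        pieces.append(Rect(ix1, iy0, a.x1, iy1))
    return pieces


def visible_parts(objects: list[Rect]) -> list[list[Rect]]:
    """For each object, keep only what no later (higher) object covers."""
    result = []
    for i, obj in enumerate(objects):
        pieces = [obj]
        for occluder in objects[i + 1:]:
            pieces = [p for piece in pieces for p in subtract(piece, occluder)]
        result.append(pieces)
    return result


if __name__ == "__main__":
    text_box = Rect(10, 10, 50, 20)      # sensitive text
    black_box = Rect(0, 0, 100, 30)      # redaction rectangle painted on top
    print(visible_parts([text_box, black_box]))
```

In this example the text box is completely covered, so nothing of it survives and only the black rectangle would be written to the result. An object that is partly covered would be stored as the pieces that still show.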

The resulting PDF may still contain identifying information if, for example, the font shapes in the original PDF were watermarked (i.e. their outlines contain very subtle, indiscernible patterns that identify their source).