Rotolux

Physics, Python programming, Miscellaneous geekness

Guide to reducing borders around text in a pdf

OK, here’s a howto on using LaTeX with the pdfpages package to strip whitespace from around the text of an existing pdf document. Say you have a pdf with a large blank margin around the text. The following LaTeX file will make a new version by scaling each page up so that the margins fall off the edge of the new a5 page. Unfortunately, it will lose any bookmarks, links etc. I usually run it with pages={-8} to process just the first eight pages, tweaking the offset and scale values until it’s stripping out the bits I want, then change back to pages={-}, or maybe to a different page range to leave out any guff I don’t want. It’ll also let you glue together multiple pdfs into one file.

\documentclass[a5paper,oneside]{book}
\usepackage{pdfpages}
\begin{document}
\pagestyle{empty}
\includepdf[pages={-},offset=0mm -5mm,scale=1.3,pagecommand={}]{pdf_file_name}
\end{document}
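
For the gluing case, one way to avoid editing the file by hand is to generate it. Here is a minimal Python sketch, not part of the original recipe: the function name make_wrapper and the file names part1 and part2 are made up for illustration, and it assumes every input pdf wants the same offset and scale.

```python
def make_wrapper(pdf_names, pages="-", offset="0mm -5mm", scale=1.3):
    """Return LaTeX source that includes each named pdf with the same trim settings."""
    lines = [
        r"\documentclass[a5paper,oneside]{book}",
        r"\usepackage{pdfpages}",
        r"\begin{document}",
        r"\pagestyle{empty}",
    ]
    # One \includepdf per input file; they end up back to back in the output.
    for name in pdf_names:
        lines.append(
            r"\includepdf[pages={%s},offset=%s,scale=%s,pagecommand={}]{%s}"
            % (pages, offset, scale, name)
        )
    lines.append(r"\end{document}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Trial run on the first eight pages of two hypothetical files.
    print(make_wrapper(["part1", "part2"], pages="-8"))
```

Save the printed text as a .tex file and run pdflatex on it. Once the offset and scale look right, call it again with pages="-" to process everything.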

Guide to reducing the file size of a pdf for the eSlick

I reported a problem to Foxit with reading a few pages of Kevin O'Holleran's 27MB thesis on my eSlick.
Christina from Foxit explained that the error message "! This page cannot be displayed normally" indicates a memory issue. So I worked out how to reduce the file size to a comparatively small 3.4MB by re-sampling just the raster images. Now the eSlick handles the file without problems.

To produce the smaller file on a Windows PC:

  1. Install the latest version, v0.9.8, of the free PDFCreator software.
  2. Open the original file and print it to the PDFCreator printer device.
  3. After the printing Progress bar completes and the printer driver queue empties, the PDFCreator dialog appears. Click the Options button. Click the Ghostscript link. In the Additional Ghostscript parameters field, add the text "-dPDFSETTINGS=/screen" (without the quotes) and click the Save button.
  4. Click the Save button on the PDFCreator dialog.
  5. Enjoy reading your new, lighter ebook.
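
If Ghostscript itself is installed, the same -dPDFSETTINGS=/screen preset can probably be applied directly, without going through the printer driver. The Python sketch below only builds and runs a Ghostscript command line. It assumes the executable is on the PATH as gs (on Windows the console version is usually called gswin32c), the function names and file names are made up for illustration, and the output may not match what PDFCreator produces.

```python
import subprocess


def gs_shrink_command(src, dst, preset="/screen", gs_exe="gs"):
    """Build a Ghostscript command that rewrites src as a smaller pdf at dst."""
    return [
        gs_exe,
        "-sDEVICE=pdfwrite",         # write a pdf rather than render to screen
        "-dPDFSETTINGS=" + preset,   # /screen is the low-resolution preset
        "-dNOPAUSE",                 # don't wait for a keypress between pages
        "-dBATCH",                   # exit when the input has been processed
        "-sOutputFile=" + dst,
        src,
    ]


def shrink(src, dst, **kwargs):
    """Run the command; raises CalledProcessError if Ghostscript fails."""
    subprocess.check_call(gs_shrink_command(src, dst, **kwargs))
```

For example, shrink("thesis.pdf", "thesis_small.pdf", gs_exe="gswin32c") would be the Windows equivalent of steps 2 to 4.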

Gutenberg rejoiner

So, what does one do with an e-reader? Why, read Project Gutenberg etexts of course. One way to do this if you've got an eSlick is to use Feedbooks to generate a nice pdf for you. In your profile settings, Feedbooks lets you set up a custom pdf page format. I use 92 x 122mm, 11px Palatino for the eSlick. However, for novels it's probably better to just use a text file. Then the eSlick can reflow the text for you. The trouble is that Gutenberg etexts are hard-wrapped, with a newline at the end of every line, so they don't flow nicely on screen. What to do? Well, if you use Perl, you could try the script here. Alternatively, if you're a more enlightened Pythonista like me, you could use the following Python script to rejoin the lines (Note: I created this listing using this cool page):

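import os
from tkinter import Tk
from tkinter.filedialog import askopenfilename

# Python 3 version. Ask for the input file with a standard file dialog.
rootwin = Tk()
rootwin.withdraw()
path = askopenfilename()
if path:
    folder, fname = os.path.split(path)
    with open(path) as src:
        text = src.read()
    # Protect paragraph breaks, join the remaining lines, then restore the breaks.
    text = text.replace('\n\n', '#uNiQuE#').replace('\n', ' ').replace('#uNiQuE#', '\n')
    # Write the result next to the input file as out_<name>.
    with open(os.path.join(folder, 'out_' + fname), 'w') as dst:
        dst.write(text + '\n')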
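
The script above treats exactly two newlines as a paragraph break, so a run of three or more leaves a stray space at the start of the next paragraph. Here is a variation on the same idea as a plain function, using a regular expression to split on any run of blank lines. The name rejoin is made up for illustration, and note that it also collapses indentation, so it is a poor fit for poetry.

```python
import re


def rejoin(text):
    """Join hard-wrapped lines within each paragraph; keep one newline between paragraphs."""
    # Normalise Windows line endings in case the file was read in binary-ish form.
    text = text.replace('\r\n', '\n')
    # A paragraph break is a newline, optional whitespace, then another newline.
    paragraphs = re.split(r'\n\s*\n', text.strip())
    # Within a paragraph, collapse every run of whitespace to a single space.
    return '\n'.join(' '.join(p.split()) for p in paragraphs) + '\n'


if __name__ == "__main__":
    sample = "It was the best\nof times,\n\n\nit was the worst\nof times.\n"
    print(rejoin(sample), end='')
```

Running it prints the sample as two lines, one per paragraph, which is the form the eSlick can reflow.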

First eSlick post


So, I bought a new toy - one of those newfangled e-ink book readers that herald the death of the print industry - not that I've got anything against the print industry. Occasionally common sense takes leave of me and I become an early adopter of something - in this case it's a Foxit eSlick. That's a photo of me holding it, showing a page of David Paganin's book Coherent X-Ray Optics. Hopefully this post marks the start of me blogging some helpful tips for making reading with the eSlick a nicer experience.