Most of the time, working from the command line poses no problems for the server administrator. Hardly any server work actually requires a graphical interface. After all, you're not writing documents for a case study or creating a presentation! If you need to open and view files at all, they're likely to be plain-text configuration files, which are often easier to manipulate from the command line. Log files with thousands of lines of text are also easier to open and view from the terminal than with a GUI.
However, sometimes you run into problems with files like PDFs. Under normal circumstances, you will rarely need to view or create a PDF file from the command line. They're not text files, after all, and play little role in administration. Moreover, they often have a specific layout that doesn't lend itself well to display on the terminal. As such, if you want to view a PDF file, it's usually best to start up a GUI such as the X Window System and open it there.
However, what if you just need to get a quick peek at a PDF file from within the terminal, and don't have an X Window environment already set up? In this article, I'll show you a quick way to use a Linux tool to open PDF files from the command line.
Step 1: Trying to View a Sample PDF File
Let's say I have a sample PDF file with very basic text like this:
There's nothing special about this. No real formatting, no multiple columns. It might as well be a text file. Unfortunately, trying to open this file in Linux using the regular text-based utilities just gives us gibberish:
This is because PDF is a binary format whose contents are usually compressed, so text editors and viewers can't make sense of it. To view the file properly on the command line, we have to use a tool called "pdftotext".
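If you're not sure whether a file really is a PDF before pointing a text viewer at it, one quick check is to look at its first few bytes, since PDF files normally begin with the marker "%PDF-". The snippet below is only a sketch of that idea; the function name "is_pdf" is my own and not part of any tool.

```shell
# is_pdf: succeed if the file starts with the PDF signature "%PDF-".
# The function name is hypothetical; it only reads the first five bytes.
is_pdf() {
    [ "$(head -c 5 "$1" 2>/dev/null)" = "%PDF-" ]
}
```

For example, running is_pdf pdf-test.pdf && echo "looks like a PDF" prints the message only when the signature is present.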
Step 2: Installing pdftotext
The tool "pdftotext" is part of a package called "poppler-utils". The package contains all kinds of useful tools for working with PDFs, such as merging them, extracting images, and converting them to bitmaps. If you're using a RHEL/CentOS installation, you can install it with yum (as root or via sudo):
yum install poppler-utils
Or "apt-get" if you have Debian/Ubuntu:
apt-get install poppler-utils
After the installation is complete, the tools will be available for you to use.
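To confirm that the installation worked, you can check whether the command is on your PATH. This is a small sketch using the standard "command -v" shell builtin; the helper name "have_tool" is just something I made up for the example.

```shell
# have_tool: succeed if the named command can be found on the PATH.
# "have_tool" is a hypothetical helper name used only for this illustration.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool pdftotext; then
    echo "pdftotext is installed"
else
    echo "pdftotext not found; install poppler-utils first"
fi
```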
Step 3: Converting a Basic PDF File into Text
The syntax for conversion is simple:
pdftotext [pdffilename] [textfilename]
This writes the text contents of the PDF to the text file you supply as the second argument. If you replace the second argument with "-", the text goes to standard output instead, so you can pipe it to the "more" command and read it on the screen, like this:
pdftotext pdf-test.pdf - | more
This gives us:
As you can see, the content has been converted into text, and you can quickly read what the PDF contains.
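If you find yourself doing this often, the command can be wrapped in a small shell function. The following is a minimal sketch, assuming pdftotext is installed; the name "pdfpeek" is hypothetical, and it uses the pager from the PAGER environment variable when one is set, falling back to "more".

```shell
# pdfpeek: print the text of a PDF to the terminal through a pager.
# "pdfpeek" is a hypothetical name; PAGER falls back to "more".
pdfpeek() {
    if [ ! -r "$1" ]; then
        echo "pdfpeek: cannot read '$1'" >&2
        return 1
    fi
    # "-" sends the converted text to standard output instead of a file.
    pdftotext "$1" - | ${PAGER:-more}
}
```

You would then run it as pdfpeek pdf-test.pdf.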
Step 4: Preserving Layouts
Converting the text in a PDF while maintaining a complex layout can be almost impossible. There's no real standard, and even sophisticated GUI suites can't always do a good job. Just try converting a PDF e-book, for example! However, pdftotext will give it a good try if you specify the "-layout" option. For example, here's a PDF with multiple columns:
We convert it using the "-layout" option with this command:
pdftotext -layout pdf-test.pdf - | more
And we get the following output:
The output starts to break down with more complicated layouts, such as three columns, but the tool will try to give you what you want. Just don't expect too much!
You can see more options by running the tool with its help flag, "pdftotext -h":
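To judge whether "-layout" helps for a particular file, it can be useful to convert the PDF both ways and compare the results. Here's a rough sketch of that, assuming pdftotext, mktemp and diff are available; the function name "layout_diff" is my own invention.

```shell
# layout_diff: convert a PDF twice, with and without -layout, and show
# how the two text versions differ. The function name is hypothetical.
layout_diff() {
    plain=$(mktemp) || return 1
    laid=$(mktemp) || { rm -f "$plain"; return 1; }
    pdftotext "$1" "$plain" && pdftotext -layout "$1" "$laid" || {
        rm -f "$plain" "$laid"
        return 1
    }
    # diff exits non-zero when the files differ, which is expected here.
    diff "$plain" "$laid"
    rm -f "$plain" "$laid"
}
```

Running layout_diff pdf-test.pdf | more lets you page through the differences.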
You'll see that you can provide user or owner passwords if necessary, specify the exact pages to extract, as well as the coordinates for a specific portion of the page. All in all, a very useful tool to have on your system if you ever need to quickly glance at the contents of a simple PDF, without needing to open a dedicated GUI tool.
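As an illustration of those extra options, here's a small sketch that prints only a range of pages. It relies on the "-f" (first page) and "-l" (last page) options of pdftotext; the function name "pdf_pages" and the file name "locked.pdf" in the comments are hypothetical.

```shell
# pdf_pages: print only pages FIRST..LAST of a PDF, keeping the layout.
# "pdf_pages" is a hypothetical helper; usage: pdf_pages FILE FIRST LAST
pdf_pages() {
    pdftotext -layout -f "$2" -l "$3" "$1" -
}

# Other options, shown only as comments (file names are made up):
#   pdftotext -upw 'user-password' locked.pdf -        # supply a user password
#   pdftotext -x 0 -y 0 -W 300 -H 400 pdf-test.pdf -   # extract a crop area
```

For example, pdf_pages pdf-test.pdf 2 3 | more shows just the second and third pages.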