How to Extract Images from PDF Documents in Ubuntu/Linux

February 4, 2012 by Ubuntu Genius

PDF (Portable Document Format) documents are a handy way to present text and images to others, knowing they’ll look the same no matter what viewer or operating system is used. Basically, they’re a snapshot of a document, so saving images from them can be a hassle, even if your viewer lets you right-click them and save them as files. There are a few programs around that can do this for you, but it’s actually much easier and faster to do it from the command line.

The pdfimages command is part of poppler-utils, which should already be installed on your system (sudo apt-get install poppler-utils in the terminal if it isn’t). To extract the images from a PDF, just open a terminal in the folder with the document, and run a command like the following:

pdfimages -j Cool-Pix-of-2011.pdf cool2011
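If you’re not sure whether pdfimages is already on your system, a quick check along these lines will tell you. This is only a minimal sketch: the function name check_pdfimages is made up for illustration, and it just reports what it finds rather than installing anything.

```shell
# Hypothetical helper: report whether pdfimages is available on the PATH.
check_pdfimages() {
    if command -v pdfimages >/dev/null 2>&1; then
        echo "pdfimages found at: $(command -v pdfimages)"
    else
        echo "pdfimages not found; try: sudo apt-get install poppler-utils"
    fi
}

check_pdfimages
```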

Note that when extracting from files with spaces in the name, you will need to enclose the filename in quotes (single quotes are used here) so the shell treats it as one argument. Eg:

pdfimages -j 'Cool Pix of 2011.pdf' cool2011

The text at the end of the command is the prefix that each extracted image’s filename will begin with, so the resulting files will be named cool2011-000.jpg, cool2011-001.jpg and so on (note that numbering starts at 000, not 001). If you’d prefer to have spaces in the target names, for example to mirror the name of the original PDF, then enclose the prefix in single quotes too. In the example below the prefix is 'Cool Pix of 2011 ', with a space at the end to provide a bit more separation between '2011' and the hyphen preceding the automatic numbering; this is of course optional, and you can pretty much do what you want. Eg:

pdfimages -j 'Cool Pix of 2011.pdf' 'Cool Pix of 2011 '

Your pictures will now be extracted into the folder, with names running from Cool Pix of 2011 -000.jpg onwards.
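If you have a whole folder of PDFs, the same quoting rules apply inside a loop. The following is a rough sketch rather than anything official: extract_all is a made-up function name, and it assumes you want each PDF’s own name (minus the .pdf, plus a trailing space) as the prefix. Double quotes are used instead of single quotes so that the variables are expanded while the spaces are still kept intact.

```shell
# Hypothetical helper: extract images from every PDF in the current folder,
# using each PDF's name (without ".pdf") plus a trailing space as the prefix.
extract_all() {
    for pdf in ./*.pdf; do
        [ -e "$pdf" ] || continue       # no PDFs here: skip the unexpanded pattern
        prefix="${pdf%.pdf} "           # "./Cool Pix of 2011.pdf" -> "./Cool Pix of 2011 "
        pdfimages -j "$pdf" "$prefix"   # double quotes keep names with spaces in one piece
    done
}

# Usage, from the folder containing the PDFs:
#   extract_all
```

The leading ./ is there so that a PDF whose name happens to start with a hyphen is not mistaken for an option.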

Also, the -j option saves images in the .jpg format when they are stored as JPEGs inside the PDF. Any other images, and all images if you leave out -j, are saved in .ppm (Portable Pixmap) format (or .pbm for monochrome ones), with each file often being over a megabyte. This can mean, for example, that an 18MB document with 120 images can extract to 154MB of .ppm files, whereas exporting them as .jpg ends up with a total of about 18MB, just like the original document. Of course, if you’d prefer to save them as .ppm images, simply leave out the -j option.
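To see which formats you actually ended up with after an extraction, you can count the files per extension. This is a small sketch under a couple of assumptions: count_by_ext is a made-up name, and it only looks at the jpg, ppm and pbm extensions for files named PREFIX-*.ext in the current folder.

```shell
# Hypothetical helper: count the extracted files for a given prefix, per format.
count_by_ext() {
    prefix=$1
    for ext in jpg ppm pbm; do
        set -- "$prefix"-*."$ext"   # expand the pattern into the positional parameters
        [ -e "$1" ] || set --       # nothing matched: discard the unexpanded pattern
        echo "$ext: $#"
    done
}

# Usage:
#   count_by_ext cool2011
#   count_by_ext 'Cool Pix of 2011 '
```

If the ppm count is not zero even though you used -j, those images were most likely not stored as JPEGs in the PDF to begin with.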

If you would like to include the page numbering in the file names, add the -p option. Eg:

pdfimages -j -p 'Cool Pix of 2011.pdf' 'Cool Pix of 2011 '
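With page numbers in the names, it becomes easy to pick out the images from one particular page. The sketch below assumes your version of pdfimages names the files PREFIX-PPP-NNN.ext (three-digit page number, then three-digit image number); images_on_page is a made-up helper name, so check a few of your own filenames before relying on it.

```shell
# Hypothetical helper: list the extracted files that came from one page,
# assuming the -p naming pattern PREFIX-PPP-NNN.ext.
images_on_page() {
    prefix=$1
    page=$(printf '%03d' "$2")   # 3 -> 003 (pass the page number without leading zeros)
    ls -- "$prefix-$page"-*
}

# Usage:
#   images_on_page 'Cool Pix of 2011 ' 3
```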

Lastly, don’t worry if you see the following in the terminal for each image being extracted:

Error (18468081): Missing 'endstream'
Error: Unknown operator 'endstream'
Error: Unknown operator 'endobj'

You shouldn’t see that with every PDF you try to extract from, but even when you do you should find the target images have been created without issue.

Extra Notes:

For more options for this command, run pdfimages -? (or see man pdfimages). For example, you can specify a start and end page with -f and -l, but personally I find it easier to just extract the whole document and delete any images I don’t want afterwards. If you need to specify a password, you will find the options for that there too (-upw for the user password, -opw for the owner password).
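As a rough illustration of the page range and password options, here is a sketch that wraps them in a small function. The name extract_pages and its argument order are made up for this example; -f, -l and -upw are the first page, last page and user password options of pdfimages.

```shell
# Hypothetical helper: extract images from a page range, optionally with a user password.
# Arguments: PDF file, output prefix, first page, last page, [password]
extract_pages() {
    pdf=$1 prefix=$2 first=$3 last=$4 password=${5-}
    if [ -n "$password" ]; then
        pdfimages -j -f "$first" -l "$last" -upw "$password" "$pdf" "$prefix"
    else
        pdfimages -j -f "$first" -l "$last" "$pdf" "$prefix"
    fi
}

# Usage:
#   extract_pages 'Cool Pix of 2011.pdf' 'Cool Pix of 2011 ' 3 7
#   extract_pages 'Cool Pix of 2011.pdf' 'Cool Pix of 2011 ' 3 7 'my password'
```

Bear in mind that a password typed on the command line can end up in your shell history.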


12 Responses

on February 10, 2012 at 5:30 am | Reply OLGA

I’m not sure this is the right area for my question,
but it is very short.

In Ubuntu
I’d like to change the ICONS of programs > internet etc browsers

and also the related icons
that follow in the shortcuts. In Windows that function is in Properties.

I am happy to be a member of this group.
You have such a simple – I MEAN not FRIGHTENING OR
PUSHING UNDER THE WATER THE NON GENIUSES HIHI :)

THANKS
on March 3, 2012 at 12:43 am | Reply Kevin Buchs

Thanks for sharing this helpful tip.
on June 25, 2012 at 10:24 pm | Reply Rodrigo Ferreira

Thanks for the tip! It helped me a lot!!!
on June 18, 2013 at 6:47 am | Reply Rommel

very helpful, thanks.
on August 20, 2013 at 5:23 am | Reply Ubuntuman

Thanks! REALLY HELPFUL! :)
on January 10, 2014 at 12:46 am | Reply titanioverde

Just what I needed. And it seems this comes pre-installed on Mint.
on August 21, 2014 at 10:17 am | Reply chronique d'un retour impossible ? pdf

It’s actually a great and helpful piece of info. I am happy that you just shared this helpful info with us.

Please keep us up to date like this. Thanks for sharing.
on November 14, 2014 at 3:54 am | Reply R.Harish Navnit

Reblogged this on The Enigma and commented:
And that’s how I got the images of all my batchmates. Two PDF files, plenty of photos, even more memories !
on November 14, 2014 at 3:56 am | Reply tonythomas01

Reblogged this on Open and Free Source! and commented:
This works great !:)
on May 1, 2015 at 1:43 am | Reply (Ubuntu) Extracting images and text from PDF | K.I.S Web Formations

[…] source : https://ubuntugenius.wordpress.com/2012/02/04/how-to-extract-images-from-pdf-documents-in-ubuntulinu… […]
on July 6, 2015 at 1:16 am | Reply pony

Images can be extracted from a PDF by converting it to HTML. This can be done with the default tool pdftohtml: https://www.youtube.com/watch?v=CG1rf7k3xo8 .
- on July 6, 2015 at 10:54 am | Reply Ubuntu Genius
  
  Yes, that’s another way of doing it, though the pdfimages way is much easier, and you only end up with the images, and they’re already nicely named. But many thanks for your input.

Ubuntu Genius's Blog

Cool UBUNTU Tips & Tricks brought to you by OzzyFrank