PDF (Portable Document Format) documents are a handy way to present text and images to others, knowing they’ll look the same no matter what word processor or operating system they use. Basically, they’re a snapshot of a document, so saving images from them can be a hassle, even if your viewer lets you right-click them and save them as files. There are a few programs around that can do this for you, but it’s actually much easier and faster to do it from the command line.
The pdfimages command is part of poppler-utils, which should already be installed on your system (if it isn’t, run sudo apt-get install poppler-utils in the terminal). To extract the images from a PDF, just open a terminal in the folder with the document, and run a command like the following:
pdfimages -j Cool-Pix-of-2011.pdf cool2011
Note that when extracting from files with spaces in the name, you will need to enclose the filename in quotes (single quotes are the simplest). E.g.:
pdfimages -j 'Cool Pix of 2011.pdf' cool2011
The text at the end of the command is the prefix each extracted image’s filename will begin with, so the resulting files will be named cool2011-000.jpg onwards (note that numbering starts at 000, not 001). If you’d prefer to have spaces in the target names, for example to mirror the name of the original PDF, then enclose the prefix in single quotes too. A space at the end, as in 'Cool Pix of 2011 ', is optional; it just provides a bit more separation between '2011' and the hyphen preceding the automatic numbering. E.g.:
pdfimages -j 'Cool Pix of 2011.pdf' 'Cool Pix of 2011 '
Your pictures will now be extracted into the folder, with names running from Cool Pix of 2011 -000.jpg onwards.
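If you do this often, the naming step can be wrapped in a small shell function. The following is only a rough sketch: pdf_prefix and extract_images are made-up helper names, not part of poppler-utils, and the prefix here has no trailing space. The images land in whatever folder the terminal is currently in.

```shell
#!/bin/sh
# Hypothetical helper: turn a PDF's file name into an output prefix.
pdf_prefix() {
    # Drop any leading folders, then a trailing ".pdf" if there is one.
    name=${1##*/}
    printf '%s' "${name%.pdf}"
}

# Hypothetical helper: extract the images from one PDF, using the
# PDF's own name as the prefix (quoted, so spaces are safe).
extract_images() {
    pdfimages -j "$1" "$(pdf_prefix "$1")"
}

# Example use:
#   extract_images 'Cool Pix of 2011.pdf'
```

The double quotes around "$1" do the same job as the single quotes in the commands above: they keep a name with spaces together as one argument.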
Also, the -j option saves images that are stored in the PDF as JPEG data in the .jpg format. Without it, everything is saved in .ppm (Portable Pixmap) format (or .pbm for monochrome images), with each file often being over a megabyte; even with -j, any image that isn’t JPEG-encoded in the PDF is still written as .ppm or .pbm. This can mean, for example, that an 18MB document with 120 images can extract to 154MB of files, whereas exporting them as .jpg ends up with a total of 18MB, just like the original document. Of course, if you’d prefer to save them as .ppm images, simply leave out the -j option.
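To get a feel for why the .ppm files are so big: a colour .ppm stores every pixel uncompressed, which for ordinary 8-bit colour is three bytes per pixel plus a short header. A quick back-of-the-envelope calculation in the shell, where the image dimensions are invented examples and ppm_bytes is just an illustrative function name:

```shell
#!/bin/sh
# Rough size in bytes of an 8-bit colour .ppm image:
# width x height x 3 bytes per pixel (the small header is ignored).
ppm_bytes() {
    echo $(( $1 * $2 * 3 ))
}

ppm_bytes 800 600    # prints 1440000, i.e. about 1.4MB
ppm_bytes 640 480    # prints 921600, i.e. just under 1MB
```

So even a modest 800x600 picture comes out at well over a megabyte, which fits with 120 images adding up to around 154MB.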
If you would like to include the page number in the file names, add the -p option. E.g.:
pdfimages -j -p 'Cool Pix of 2011.pdf' 'Cool Pix of 2011 '
Lastly, don’t worry if you see the following in the terminal for each image being extracted:
Error (18468081): Missing ‘endstream’
Error: Unknown operator ‘endstream’
Error: Unknown operator ‘endobj’
You shouldn’t see that with every PDF you try to extract from, but even when you do you should find the target images have been created without issue.
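If those messages make it hard to tell whether anything was actually extracted, one option is to hide them and simply count the resulting files. This is a small sketch rather than anything official: count_extracted is a made-up function name, and it assumes the error messages are printed to standard error (as they normally are), so 2>/dev/null hides them.

```shell
#!/bin/sh
# Hypothetical helper: count the files in the current folder whose
# names start with the given prefix followed by a hyphen.
count_extracted() {
    set -- "$1"-*
    # If nothing matched, the pattern is left as-is and no such file exists.
    [ -e "$1" ] || { echo 0; return; }
    echo "$#"
}

# Example use:
#   pdfimages -j 'Cool Pix of 2011.pdf' cool2011 2>/dev/null
#   count_extracted cool2011
```

If the number matches what you expect from the document, the errors can safely be ignored.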
Extra Notes:
For more options for this command, run pdfimages -?. For example, you can specify a start and end page (-f and -l), but personally I find it easier to just extract the whole document and delete any images I don’t want afterwards. If you need to specify a password for a protected PDF, you will find the options for that there too (-upw for the user password, -opw for the owner password).
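As a rough sketch of the page-range case, here is a small wrapper around pdfimages using its -f (first page) and -l (last page) options. The function name extract_page_range, the page numbers, and the password in the comments are all just examples.

```shell
#!/bin/sh
# Hypothetical wrapper: extract images from pages FIRST to LAST only.
# Arguments: first page, last page, PDF file, output prefix.
extract_page_range() {
    first=$1 last=$2 pdf=$3 prefix=$4
    pdfimages -j -f "$first" -l "$last" "$pdf" "$prefix"
}

# Example use (pages 3 to 7 only):
#   extract_page_range 3 7 'Cool Pix of 2011.pdf' cool2011
#
# For a password-protected PDF, the user password goes after -upw:
#   pdfimages -j -upw 'secret' 'Cool Pix of 2011.pdf' cool2011
```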
I’m not sure this is the right area for my question, but it is very short. In Ubuntu, I would like to change the icons of programs (internet browsers etc.), and also the icons of the related shortcuts. In Windows that function is in Properties.
I am happy to be a member of this group. You keep things so simple. I mean, not frightening, and not pushing the non-geniuses under the water, hihi :)
Thanks
Thanks for sharing this helpful tip.
Thanks for the tip! It helped me a lot!!!
very helpful, thanks.
Thanks! REALLY HELPFUL! :)
Just what I needed. And it seems this comes pre-installed on Mint.
It’s actually a great and helpful piece of info. I am happy that you just shared this helpful info with us.
Please keep us up to date like this. Thanks for sharing.
Reblogged this on The Enigma and commented:
And that’s how I got the images of all my batchmates. Two PDF files, plenty of photos, even more memories !
Reblogged this on Open and Free Source! and commented:
This works great !:)
Images can also be extracted from a PDF by converting it to HTML. This can be done with the default tool pdftohtml – https://www.youtube.com/watch?v=CG1rf7k3xo8 .
Yes, that’s another way of doing it, though the pdfimages way is much easier, and you only end up with the images, and they’re already nicely named. But many thanks for your input.