ImageMagick create PDF version 1.4 from image? - image-processing

I know that I can use ImageMagick's convert tool to turn different image files into PDF documents. However, is there some way to specify what version of PDF document I want to use for the output? Can I convert an image to a PDF v1.4 document?
I am trying to find a way to automate the conversion of image files (probably SVG) to PDF files that need to be sent to a printing service. The printer's service requires the PDF files to meet certain requirements, and one of them is that the PDF file is v1.4. My version of convert is "6.5.7-8 2010-12-02 Q16".
Thanks,
Carl

This question on superuser.com
https://superuser.com/questions/193791/batch-convert-pdf-versions
will give you some hints on how to change the version number of the PDF afterwards.
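As a minimal sketch of such a post-processing step, assuming Ghostscript (`gs`) is installed: its `pdfwrite` device accepts `-dCompatibilityLevel=1.4` and rewrites the file at that level. The helper names and file names below are hypothetical, and the code is Python rather than anything taken from the linked question.

```python
import re
import subprocess


def pdf_version_from_bytes(head):
    """Return the version in a '%PDF-x.y' header, or None if there is none."""
    match = re.search(rb"%PDF-(\d\.\d)", head)
    return match.group(1).decode("ascii") if match else None


def pdf_version(path):
    """Read the start of a file and report its PDF header version."""
    with open(path, "rb") as f:
        return pdf_version_from_bytes(f.read(1024))


def ghostscript_command(src, dst, level="1.4"):
    """Build a Ghostscript command that rewrites src as a PDF of the given level."""
    return [
        "gs",
        "-sDEVICE=pdfwrite",
        f"-dCompatibilityLevel={level}",
        "-dNOPAUSE",
        "-dBATCH",
        "-dQUIET",
        f"-sOutputFile={dst}",
        src,
    ]


def rewrite_pdf(src, dst, level="1.4"):
    """Run Ghostscript (assumed to be on PATH) and return the new header version."""
    subprocess.run(ghostscript_command(src, dst, level), check=True)
    return pdf_version(dst)
```

Calling `rewrite_pdf("from_convert.pdf", "for_printer.pdf")` on the output of convert should then report `1.4`. Checking the header only confirms the declared version; whether the printer's other requirements are met is a separate question.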

Related

How to read pdf and extract text from pdf in symfony1.1?

I am working on Symfony-1.1 in an existing project. How can I read pdf files and extract text from them?
It's not a Symfony 1.1 related question, actually. It's a PHP one. There are several libraries to handle PDFs in PHP. Following are some suggestions.
https://github.com/smalot/pdfparser
http://pastebin.com/dvwySU1a
http://www.pdflib.com/
If you just need to parse the PDF in any way and then process the text in PHP, you can also consider using a Java library like the following.
http://pdfbox.apache.org/ (Is there a PDF parser for PHP?)

How to make a Photoshop (.psd) file have hyperlinks when saved as a .PDF file?

I created a template in Photoshop for a document I want to use.
I want to share this document as a PDF file.
I want some of the text I made in Photoshop to work as hyperlinks and direct people to websites.
How do I save a Photoshop file as a PDF and get hyperlinks to work in the PDF file?
I have tried using the slice tool. It works to assign a url and target.
But when saved as a PDF the links do not work.
Anyone?
I know that InDesign allows saving PDF documents in either print or web format. The latter allows hyperlinks to be available. If Photoshop does not offer this, you can always add them in Acrobat (not Reader) and then resave the .pdf.

Apache Tika Office to PDF conversion

I am trying to convert office files to PDF using POI and iText. I am able to do the basic conversion where I read the word file using WordExtractor and write the contents to PDF file using PDF writer.
However, this does not retain the structure (tables, styles, etc.). I read on a forum that the formatting can be retained using Tika. Are there any working examples of this?

Create thumbnail from Adobe Illustrator file?

Does anybody know how to create a thumbnail from an Adobe Illustrator file without using Illustrator? I have a php/linux based application and I'd like to do so.
-Dave
By default, Adobe Illustrator saves files as PDF compatible. Unless the file was saved in a strange way, you should be able to use ImageMagick directly to generate a thumbnail. For example:
convert file.ai -thumbnail 250x250 -unsharp 0x.5 thumbnail.png
Note: If the file has multiple artboards (which are interpreted as pages of a PDF), it will generate multiple files or, if saved as a GIF, an animated GIF.
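To get a single thumbnail from a multi-artboard file, ImageMagick's `[n]` suffix on the input name selects one page. Here is a minimal sketch that builds the same command for the first artboard only. It is written in Python for illustration (the same argument list could be passed from PHP), and the function names are hypothetical.

```python
import subprocess


def thumbnail_command(src, dst, size="250x250", page=0):
    """Build a convert command that renders only one page (artboard) of src."""
    # 'file.ai[0]' is ImageMagick's syntax for selecting a single page/frame.
    return [
        "convert",
        f"{src}[{page}]",
        "-thumbnail", size,
        "-unsharp", "0x.5",
        dst,
    ]


def make_thumbnail(src, dst, size="250x250", page=0):
    """Run ImageMagick's convert (assumed to be on PATH)."""
    subprocess.run(thumbnail_command(src, dst, size, page), check=True)
```

Passing the arguments as a list avoids shell quoting problems with the square brackets.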
If you can save it in PDF, PS, or EPS format you may be able to manipulate it in things like ImageMagick or Ghostscript.
EDIT:
I think you can actually use ImageMagick's convert with *.ai files as well.

Search Words in pdf files

Is it possible to search "words" in pdf files with delphi?
I have code with which I can search in many other file types (exe, dll, txt), but it doesn't work with pdf files.
It depends on the structure of the specific PDF.
If the pdf is made of images (scanned pages) then you have to OCR each image and build a full text index inside the PDF. (To see if it's image-based, open it with Notepad and look for obj tags full of random chars.) There are a few utilities and apps that do this kind of work for you; CVision PDF Compressor is one that I have used before.
If the pdf is a standard PDF, then you should be able to open it like any other text file and search for the words.
Here is a page that details some of the structure of a PDF. This is an SO post on the same topic.
The components/libraries mentioned in the answer to this question should do what you need.
I'm just working on a project that does this. The method I use is to convert the PDF file to plain text (with pdftotext.exe) and create an index on the resulting text. We do the same with word and other office files, works pretty good!
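The convert-then-index approach can be sketched as follows. This is an illustration in Python rather than Delphi, it assumes the `pdftotext` command-line tool is on the PATH, and the function names and sample documents are made up.

```python
import re
import subprocess
from collections import defaultdict


def pdf_to_text(pdf_path):
    """Extract text with pdftotext; '-' sends the output to stdout."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        check=True, capture_output=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def add_to_index(index, doc_name, text):
    """Record which documents contain each lowercase word."""
    for word in re.findall(r"\w+", text.lower()):
        index[word].add(doc_name)


def search(index, word):
    """Return the sorted names of documents containing the word."""
    return sorted(index.get(word.lower(), set()))


# Sample text stands in for the output of pdf_to_text() here.
index = defaultdict(set)
add_to_index(index, "a.pdf", "Invoice for printing service")
add_to_index(index, "b.pdf", "Printing requirements: PDF version 1.4")
print(search(index, "printing"))  # ['a.pdf', 'b.pdf']
```

In real use, each file's text would come from `pdf_to_text(path)` before being passed to `add_to_index`.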
Searching directly into pdf files from Delphi (without external app) is more difficult I think. If you find anything, please update here as I would also be very interested in that!
One option I have used is Microsoft's IFilter technology, which is used by Windows Desktop Search and many other products such as SharePoint and SQL Server full-text search.
It supports almost any office or office-like file format, even dwg, msg, pdf, and files in zip/rar archives.
The easiest way to use it is to run FiltDump.exe on any files you have, and index the text output.
To know about the filters installed on your PC, you can use ifilter explorer.
Wikipedia has some links on its ifilters page.
Quick PDF Library's GetPageText function can give you the words from a PDF as well as the page number and the co-ordinates of those words - sometimes useful for highlighting.
PDF is not just a binary representation. Think of it as a tree of objects, where an object node has some metadata and some content information. Some of these objects have string data, some don't. Some of these are even encrypted, and some are compressed. So, there's very little chance your string finder will work on any arbitrary PDF.
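A small sketch of why a raw string search tends to fail: text inside a PDF usually sits in content streams, and those are often Flate (zlib) compressed. The fragment below is hand-built for illustration, not a complete PDF, and the helper name is hypothetical. Real files can also use other filters, font-specific encodings, or encryption, none of which this handles.

```python
import re
import zlib


def inflated_streams(pdf_bytes):
    """Yield the inflated bytes of every stream that zlib can decompress."""
    for match in re.finditer(rb"stream\r?\n(.*?)endstream", pdf_bytes, re.DOTALL):
        try:
            # decompressobj tolerates the end-of-line bytes left before 'endstream'
            yield zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate-compressed, or encrypted


# A hand-built fragment of one PDF object with a compressed content stream.
content = b"BT /F1 12 Tf (Hello world) Tj 0 -14 Td (Hello again) Tj ET"
fragment = (
    b"4 0 obj\n<< /Filter /FlateDecode >>\nstream\n"
    + zlib.compress(content)
    + b"\nendstream\nendobj\n"
)

print(b"Hello" in fragment)                                    # False
print(any(b"Hello" in s for s in inflated_streams(fragment)))  # True
```

The word is only found after the stream has been inflated, which is the step a generic file search skips.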
