IFilter for JPG files - delphi

I'm using the IFilter interface to read the content of files such as .docx, .pdf, etc. For those two types (and others) this already works pretty well, but I was wondering whether it is possible to use this mechanism to read the metadata of a jpg file as well.
I created a test image file and added some information to its details (Title, Description, ...).
Interestingly, the Windows Indexer is able to find this file using the text that I specified as title. By using the IFilter interface, however, I retrieve only an empty string for my jpeg.
I also tested the command line tool filtdump.exe from here: https://msdn.microsoft.com/en-us/library/windows/desktop/dd940434(v=vs.85).aspx#command_line, which returned the same results as my implementation.
Does anyone know how the Windows indexer is able to see the content, and how I could use the same mechanism to achieve similar results?

Related

How to find out if a program accepts as parameters just a file or a list of files?

I would like to write some code in Delphi for opening some files (e.g. mp3, png) with the associated program in Windows.
With AssocQueryString, I can find the program for a given extension.
I can then start this program with the given file when only one file is selected.
The problem is when I try to start the program with a list of files.
Example 1 - mp3 is associated with AIMP3 and this call works fine:
D:\Tools\AIMP3\AIMP3.exe "F:\TestFiles\mp3\file1.mp3" "F:\TestFiles\mp3\file2.mp3"
Example 2 - png is associated with IrfanView and this call fails:
D:\Tools\IrfanView\i_view32.exe "F:\TestFiles\png\file1.png" "F:\TestFiles\png\file2.png"
IrfanView does not accept calling it with a list of files, but only with one file.
My question is, how do I find out if a program accepts as parameters just a single file or a list of files?
I have tried to check the Registry but found nothing. In shell->open->command I can find "%1" for both programs.
I have tried to use the IDropTarget interface, but this does not work with IrfanView either (dropping multiple files on i_view32.exe in Windows Explorer doesn't work either).
On the other hand, Windows Explorer (if using Open from the context menu for many png files) opens a new instance of IrfanView for each file. If I had this information, I could also start IrfanView for each file.
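The one-instance-per-file fallback mentioned above can be sketched as follows. This is a minimal illustration in Python rather than Delphi, and it assumes a command template containing the "%1" placeholder as found under shell->open->command. The helper name `build_commands` and the short example paths are hypothetical, and the sketch only builds the command lines; it does not launch anything.

```python
def build_commands(template, files):
    """Return one command line per file by substituting the "%1" placeholder.

    Assumption: the template is the string registered under
    shell->open->command and already carries any quotes around "%1".
    """
    commands = []
    for path in files:
        # Only the placeholder is replaced; everything else is kept verbatim.
        commands.append(template.replace("%1", path))
    return commands


# Hypothetical template and file names, for illustration only.
commands = build_commands('i_view32.exe "%1"', ["file1.png", "file2.png"])
for line in commands:
    print(line)
```

Each resulting line could then be started as its own process, which mirrors what Explorer appears to do for IrfanView. It does not answer whether a given program accepts several files in one call.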

read write file properties with PropertyHandler Shell Extension

I'm trying to create a PropertyHandler shell extension.
What's the best way to embed properties such as Title, Author, ... so that the same file can be used on multiple computers or devices?
Is StgCreateStorageEx the way to go, or are there other ways to do it?
I ask because StgCreateStorageEx deals with NTFS files only, and I'm not sure whether the file keeps these properties when I open it on another device with the same PropertyHandler.
Is there any way to save the properties inside my file?
The StgCreateStorageEx function creates a new storage object using the IStorage interface. This allows storing multiple data objects within a single binary file, see for example https://en.wikipedia.org/wiki/COM_Structured_Storage. So, technically, you can save almost anything in this file including embedded properties.
I don't think that this is limited to NTFS: the old Microsoft Office .doc format (like the formats of many other Microsoft products) uses this storage format and also works on FAT32.
Whether you want to use this binary file format is a completely different question. As you did not provide any information about the content and format of your file, I cannot recommend anything. One alternative would be to store the content of your file in an XML file. Properties like Title and Author could then be added easily.
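As a rough illustration of the XML alternative, the following Python sketch (Python rather than Delphi, for brevity) keeps the properties in the same file as the content, so they travel with the file regardless of the file system. The element names `document`, `properties` and `content`, the sample values, and both helper functions are assumptions made for this example; they are not part of any Windows property handler API.

```python
import xml.etree.ElementTree as ET


def build_document(title, author, body):
    """Build an XML tree that stores the properties next to the content."""
    root = ET.Element("document")
    props = ET.SubElement(root, "properties")
    ET.SubElement(props, "Title").text = title
    ET.SubElement(props, "Author").text = author
    ET.SubElement(root, "content").text = body
    return root


def read_properties(xml_text):
    """Return the stored properties as a dict of tag name -> text."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root.find("properties")}


# Round trip: serialize the document, then read the properties back.
xml_text = ET.tostring(build_document("Report", "Jane Doe", "Hello"), encoding="unicode")
print(read_properties(xml_text))
```

A property handler for such a format would then only need to parse the `properties` element, under the assumption that the file format is yours to define.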

Can't get the F# Word typeprovider to work for custom Word documents

I've downloaded the FSharp 3 Sample Pack and tried out the sample for the Word documents typeprovider which works fine in the TestScript.fsx file when using the provided sample document (AA.docx). But when I try using it with a different Word document it doesn't work i.e. no properties are generated on the type provider instance (Person, MyCompany etc.). Even if I create a new document and copy the contents of AA.docx to it (keeping source formatting) it doesn't work. What could be the issue?
The Word type provider uses the Open XML API. The same Word content can have different XML representations behind the scenes. I'd suggest downloading the Open XML SDK and using its tool to visualize the document's content.

How can I create PDF/X-3:2002 compliant PDF files from Delphi?

From my Delphi application I require to create a PDF document that is PDF/X-3:2002 compliant.
This is a strict requirement of the client as the PDF files are going to be printed in a printing press.
I have wPDF, but it does not support this. (Please see: http://wpcubed.com/forum/viewtopic.php?t=5693)
If no component currently exists, then what techniques and other software can I use to accomplish this? The application allows the user to add images and rich-text onto templates (TPanels) that should make up the pages of the PDF.
How do you manage your color?
If you use RGB Colors in your Delphi application to handle the image, PDF/X-3:2002 won't be just a matter of tagging.
PDFlib does handle this format and can be used from Delphi.
I guess that default PDF/A-1 settings will meet most of the PDF/X-3 requirements, especially:
Embed fonts;
Include color profile;
Contain metadata.
Our Open Source engine is able to produce PDF/A-1 files - if you take a look at the specs, you may be able to generate PDF/X-3:2002 compliant PDF files.

Search Words in pdf files

Is it possible to search "words" in pdf files with delphi?
I have code with which I can search in many other file types (exe, dll, txt), but it doesn't work with pdf files.
It depends on the structure of the specific PDF.
If the PDF is made of images (scanned pages), then you have to OCR each image and build a full-text index inside the PDF. (To see if it's image-based, open it with Notepad and look for obj tags full of random characters.) There are a few utilities and apps that do this kind of work for you; CVision PDF Compressor is one that I have used before.
If the pdf is a standard PDF, then you should be able to open it like any other text file and search for the words.
Here is a page that details some of the structure of a PDF. This is an SO post on the same topic.
The components/libraries mentioned in the answer to this question should do what you need.
I'm just working on a project that does this. The method I use is to convert the PDF file to plain text (with pdftotext.exe) and create an index on the resulting text. We do the same with Word and other Office files, and it works pretty well!
Searching directly in PDF files from Delphi (without an external app) is more difficult, I think. If you find anything, please update here, as I would also be very interested in that!
One option I have used is Microsoft's IFilter technology, which is used by Windows Desktop Search and many other products such as SharePoint and SQL Server full-text search.
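The indexing half of the convert-then-index approach could look roughly like this. It is a minimal Python sketch (not Delphi) that assumes the plain text has already been extracted, for example by pdftotext.exe; `build_index`, `search`, and the sample document names and texts are hypothetical and only illustrate the idea of an inverted index.

```python
import re
from collections import defaultdict


def build_index(texts):
    """Map each lowercased word to the set of document names containing it.

    `texts` maps a document name to its already extracted plain text.
    """
    index = defaultdict(set)
    for name, text in texts.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(name)
    return index


def search(index, word):
    """Return the sorted names of all documents that contain the word."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical extracted texts, for illustration only.
texts = {
    "a.pdf": "Invoice for Delphi training",
    "b.docx": "Delphi release notes",
}
index = build_index(texts)
print(search(index, "Delphi"))
```

Because the index is built once and queried many times, searching stays fast even when the text extraction itself is slow.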
It supports almost any office/office-like file format, even dwg, msg, pdf, and files in zip/rar archives.
The easiest way to use it is to run FiltDump.exe on any files you have, and index the text output.
To find out which filters are installed on your PC, you can use IFilter Explorer.
Wikipedia has some links on its ifilters page.
Quick PDF Library's GetPageText function can give you the words from a PDF as well as the page number and the co-ordinates of those words - sometimes useful for highlighting.
PDF is not just a binary representation. Think of it as a tree of objects, where an object node has some metadata and some content information. Some of these objects have string data, some don't. Some of these are even encrypted, and some are compressed. So, there's very little chance your string finder will work on any arbitrary PDF.
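To illustrate why a plain byte search misses text in compressed objects, here is a small Python sketch (Python rather than Delphi). It builds a PDF-like object by hand instead of reading a real file, and it assumes unencrypted streams compressed with the FlateDecode filter, which is zlib/deflate. Both function names are hypothetical. Real PDFs add further layers, such as other filters, encryption and font encodings, that this sketch does not handle.

```python
import re
import zlib


def make_fake_pdf_object(text):
    """Build a PDF-like object with a Flate-compressed stream (illustration only)."""
    data = zlib.compress(text.encode("latin-1"))
    header = b"1 0 obj\n<< /Length %d /Filter /FlateDecode >>\nstream\n" % len(data)
    return header + data + b"\nendstream\nendobj\n"


def search_streams(raw, needle):
    """Try to inflate the data after every 'stream' keyword and look for needle."""
    for match in re.finditer(rb"stream\r?\n", raw):
        try:
            # decompressobj stops at the end of the deflate data and
            # ignores whatever follows it (e.g. the endstream keyword).
            content = zlib.decompressobj().decompress(raw[match.end():])
        except zlib.error:
            continue  # not Flate data at this position
        if needle in content:
            return True
    return False


raw = make_fake_pdf_object("BT /F1 12 Tf (Hello Delphi) Tj ET")
print(b"Delphi" in raw)               # naive byte search on the raw bytes
print(search_streams(raw, b"Delphi"))  # search after inflating the stream
```

The naive search fails on the raw bytes while the stream-aware search succeeds, which is the reason a dedicated PDF library or a text extractor is usually the more reliable route.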
