"Relative" kerning with ImageMagick - imagemagick

I'm rendering text to a PNG using ImageMagick's convert -annotate command. Now I want to reduce the kerning, i.e. spacing between the letters. This can be done in IM using an option like -kerning -2.
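For reference, here is a sketch of the kind of command I'm running (the font file and text are placeholders):
convert -font SomeFont.ttf -pointsize 72 -kerning -2 label:'AVATAR' output.png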
Unfortunately, this option completely overrides the font's default kerning, so that the bounding boxes of all letters end up the same distance apart, even in combinations like "AV", where they should overlap.
Is there a way to make IM apply the default kerning first, but decrease the resulting spacing by e.g. 2px, instead of using the same spacing everywhere?
Failing that, are there alternative command line (or Ruby) tools that can render text in a custom font to PNG while supporting the desired behaviour?

If you have a later version of ImageMagick (> 6.7.6-3), there is a new feature (I think it only works on Linux machines) which might be able to do what you want. I have not checked it out: http://www.imagemagick.org/Usage/text/#pango
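An untested sketch of what the Pango route might look like: Pango markup has a letter_spacing attribute, specified in 1024ths of a point (so -2 points is -2048), which is applied on top of the font's normal kerning:
convert pango:'<span font="Arial 72" letter_spacing="-2048">AVATAR</span>' kerned.png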

This issue has been fixed in ImageMagick 6.8.9-6 Beta.

It's a bit more work to set up, but lately I've been advising people who want to do server-side document rendering to build their documents in SVG and convert them to bitmaps using Inkscape. I'm fairly sure this feature is supported (get yourself a copy and check in the UI; if it's there, you can do it).
You'd need to be confident in manipulating XML docs - and basic SVG is pretty easy to learn once you get into it.
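A minimal sketch of that route (the font name and sizes are placeholders; SVG's letter-spacing is applied on top of the font's built-in kerning):
<svg xmlns="http://www.w3.org/2000/svg" width="480" height="120">
  <text x="10" y="90" font-family="MyFont" font-size="72" letter-spacing="-2">AVATAR</text>
</svg>
You could then rasterize it with something like the following (flags vary by Inkscape version; 1.0+ uses --export-filename instead):
inkscape text.svg --export-png=text.png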

Related

Finding known text in an image (guided OCR)

I'm looking for a way to locate known text within an image.
Specifically, I'm trying to create a tool to convert a set of scanned pages into PDFs that support searching and copy+paste. I understand how this is usually done: OCR the page, retaining the position of the text, and then add the text as an invisible layer to the PDF. Acrobat has this functionality built in, and Tesseract can output hOCR files (containing the recognized text along with its location), which can be used by hocr2pdf to generate a text layer.
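For concreteness, that usual pipeline looks roughly like this (filenames are placeholders; older Tesseract versions name the output page.html rather than page.hocr):
tesseract page.png page hocr
hocr2pdf -i page.png -o page.pdf < page.hocr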
Unfortunately, my source images are rather low quality (at most 150 DPI, with plenty of JPEG artifacts, and non-solid backgrounds behind some of the text), leading to pretty poor OCR results. However, I do have a copy of the text (sans pictures and layout) that appears on each page.
Matching already-known text to its location on a scanned page seems like it would be much easier to do accurately, but I have failed to find any software with this capability built in. How can I leverage existing software to do this?
Edit: The text varies in size and font, though passages of it are consistent.
The thought that springs to mind for me would be a cross-correlation. So, I would take the list of words that you know occur on the page and render them one at a time onto a canvas to create a picture of each word. You would need to use a font and size similar to the words in the document - which is what I asked about in my comment. Then I would run a normalised cross-correlation of the picture of the word against the scanned image to see where it occurs. I would do all that with ImageMagick, which is available for Windows and OS X (use Homebrew) and included in most Linux distros.
So, let's take a screengrab of the second paragraph of your question and look for the word pretty - where you mention pretty poor OCR.
First, you need to render the word pretty onto a white background. The command will be something like this:
convert -background white -fill black -font Times -pointsize 14 label:pretty word.png
Result:
Then perform a normalised cross-correlation using Fred Weinhaus's script from here like this:
normcrosscorr -p word.png scan.png correlation-result.png
Match Coords: (504,30) And Score In Range 0 to 1: (0.999803)
and you can see the coordinates of the match are 504,30.
Result:
Another Idea
Another idea might be to take Google's Tesseract-OCR and replace the standard dictionary with the text file containing the words on the page you are processing...
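A sketch of that idea, assuming a Tesseract version that supports the --user-words option (the exact mechanism varies between versions, so treat this as a starting point only):
tesseract scan.png out --user-words page-words.txt hocr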

MySQL WorkBench EER diagram dimensions are terrible

I am using MySQL Workbench to generate an EER diagram, and to the best of my knowledge, one cannot control the dimensions of the canvas, only the size in number of pages. The result is a huge amount of white space around your diagram, making it nearly unusable. Why anyone would design it this way is beyond me. There are a lot of questions that ask how to crop a PDF, but they are either more complicated (i.e. crop to a certain dimension, or crop and output to a different format and ratio), or they do not preserve the image quality, or they just plain do not work. My question therefore is this:
How does one create or convert an EER diagram using MySQL Workbench such that there are no white borders AND the image quality is preserved?
Note: I asked the question here as it pertains to databases, but apologies if it is in the wrong place.
Looks like what you are after is a way to limit the output of an image export to a relatively small area, so that it fits nicely in another document. Several options are possible:
1) Export as PNG and simply cut off the unwanted parts (see the sketch after this list). Depending on the further usage this might be good enough.
2) Export as SVG and use any of the SVG editors to limit the image size to the wanted area only. Then convert it to the format you need in your target document.
3) Set a paper size in the model that encompasses the content as closely as possible. The "statement" paper size, for example, is quite small. Then rearrange your objects, resizing them if you need larger ones. By setting a larger font (via Preferences) you should be able to make the entire appearance larger. Then export as PDF.
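For option 1), ImageMagick can cut the white border off automatically; a minimal sketch, assuming the exported PNG has a plain white background:
convert eer-export.png -trim +repage eer-cropped.png
Add something like -bordercolor white -border 10 after +repage if you want to keep a small margin.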

Is there a graphical tool for Mac to assist in positioning CCNode objects on a Layer?

If my designer gives me a 960x640px image of what the screen should look like, as well as all of the individual elements as images or text, is there a way to lay out the images and text on the iPhone/iPad screen without doing it manually through code? The way I'm doing it now is trial and error, trying to guess the position of each element.
By the way, the types of layouts I'm trying to do are simple static layouts for stuff like Menus and High Scores lists, etc.
You should try one of the editing tools: LevelHelper, CocoShop, and CocosBuilder. The problem will be the output format: make sure not only that the editing part works to your specification, but also that you can actually extract the snippet of code you need and plug it into your own code.
Do you have image-editing software like Photoshop or GIMP? How about opening the 960x640px image in such software, hovering your mouse over the center of each element to get its coordinates, and then pumping these values into your code?
In my opinion, this is at least better and way faster than trial and error. :)
If you want to measure the positions of graphic elements, you can try a commercial tool called xScope. The trial version can be downloaded from their official website. It is the best tool I have seen for measuring distances, colors (for instance, it can copy a measured color directly in [UIColor ...] format), etc. If you want something free, I recommend MarkMan, a Chinese tool built on Adobe AIR. All of its elements and buttons are graphical, so you don't need to read Chinese to use it.
You can try using an open-source editor and writing your own exporter. For example, I am using Blender as a level editor for the game I am working on. It has a nice Python API that can be used to export all the information you need.

Generate font from an image of text

Is it possible to generate a specific font from the image of text given below?
My idea is to generate a specific font for the image by manually selecting portions of the image and mapping them to a set of letters, generating a font from those mappings, and then using that font to make the image readable for an OCR. Is generating a font like this possible using any open-source implementation? Also, please suggest any good OCRs.
Abbyy FineReader 10 gets better than expected results but predictably gets confused when the characters touch.
Your problem is that the line spacing is too small. The descenders of each line overlap the character bounding boxes of the characters in the line directly below. This makes character segmentation almost impossible because the characters are touching and overlapping. The number of combinations of overlapping characters is virtually impossible to train for. The 'g' and 'y' characters are the worst offenders.
A double line spaced version of this would probably OCR reasonably well.
A custom solution that segmented and separated each line, along with a good dictionary, would definitely improve the results. There would still be some errors to correct manually, though. The custom routine would have to deal with the ascenders and descenders and try to segment the image into lines, which can then be fed to a decent OCR engine. One way would be to analyse every character blob on the page and allocate it to a line. Leptonica (www.leptonica.com - C imaging library) would probably make this job a little easier.
I would not try this without increasing the resolution to 200 or 300 dpi first.
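For example, a rough ImageMagick sketch for doubling a 150 dpi scan to roughly 300 dpi before OCR (the resize factor and sharpening are a matter of taste):
convert scan.png -resize 200% -sharpen 0x1 scan-2x.png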
With this custom solution, training a font becomes an option if the OCR engine does a poor job initially.
Abbyy (www.abbyy.com) or Google Tesseract OCR 3.00 would be a good place to start.
No guarantees as to whether all of this will work, though. This is quite a difficult page to OCR, and you need to work out whether it would be better to have it typed up manually overseas. It depends on the number of pages you need to process.

How do you scale an image for print without degrading the quality?

I was wondering: how would you print an image scaled to three times its original size without making it look like crap? If you just change the DPI to 300 and print, it'll look like crap. Is there a way to convert it gracefully?
You may have the problem of trying to add detail that isn't there. Hopefully you're aware of this.
The best way to enlarge an image that I know of is to use bicubic interpolation. If it's any help, Photoshop recommends using 'bicubic smoother' for enlargement.
Also, be careful with DPI vs PPI.
This is called supersampling or interpolation. There's no 'perfect' algorithm, since that would imply generating new information where there was none ('between' the pixels); but some methods are better than others in fooling the eye/brain to fill the voids, or at least not making big square boxes.
Start with the Wikipedia articles on nearest-neighbor, bilinear, and bicubic interpolation (the three offered by Photoshop). A few more, such as tricubic interpolation and Lanczos resampling, could be of interest; also check the theory and comparison links.
In short, this isn't a clear-cut issue, but an active field of investigation, full of subjectivity and practical trade-offs.
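If you want to experiment, ImageMagick exposes several of these filters, so you can compare them on your own image (filenames here are placeholders):
convert photo.png -filter Point -resize 300% photo-nearest.png
convert photo.png -filter Triangle -resize 300% photo-bilinear.png
convert photo.png -filter Lanczos -resize 300% photo-lanczos.png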
You should vectorize your image, scale it, and, if you wish, convert it back to the original format (JPG, GIF, PNG...). However, this works best for simple images.
Do you know how to vectorize? There are some sites that do it online, just do some Google research and you'll find some.
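One open-source route (my suggestion, not mentioned above) is potrace, which traces a bitmap into an SVG that can then be rasterized at any size; a sketch, assuming potrace and ImageMagick are installed:
convert logo.png logo.pbm
potrace -s logo.pbm -o logo.svg
convert -density 300 logo.svg logo-large.png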
Changing the DPI won't matter if you don't have enough pixels in your image for the size you are printing. In the biz it's called GIGO (Garbage In, Garbage Out).
If your image is in HTML then create a media="print" stylesheet and feed a high-res image that way.
