How to prevent LaTeX memory overflow

I've got a LaTeX macro that makes small pictures. In each picture I need to draw an area whose borders are quadratic Bézier curves, and the area is to be filled. I did not know how to do that, so currently I'm "filling" the area by drawing plenty of Bézier curves inside it...
This slows down typesetting, and when the macro is used many times (so TeX is drawing really a lot of quadratic Bézier curves) it produces the following error:
! TeX capacity exceeded, sorry [main memory size=3000000].
How can I prevent this error (by freeing memory after the macro, or some such)? Or, even better, how do I fill the area determined by two quadratic Bézier curves?
Code that produces the error:
\usepackage{forloop}
\usepackage{picture}
\usepackage{eepic}
...
\linethickness{\lineThickness\unitlength}%
% "Fill" the region by stacking thin curves: one \qbezier per step of
% \lineThickness, sweeping the control point's y from \cymin to \cymax.
\forloop[\lineThickness]{cy}{\cymin}{\value{cy} < \cymax}{%
\qbezier(\ax, \ay)(\cx, \value{cy})(\bx, \by)%
}%
Here are some example values for variables:
\setlength{\unitlength}{0.01pt}
\lineThickness=20
%cy is just a counter - its initial value is not important
\cymin=450 \cymax=900
%of the following, only the difference between \ax and \bx is important
\ax=0 \ay=0 \bx=550 \by=0
Note: To reproduce the error, this code has to execute approximately 150 times (could be more, depending on your LaTeX memory settings).
Thanks a lot for any help

I admit I don't know how to manage LaTeX's memory. However, there are better drawing frameworks for LaTeX than the old picture environment, which doesn't seem to support filled Bézier paths. Two that come to mind are the modern-style PGF/TikZ (see also the examples) and the more ancient MetaPost.
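For example, TikZ can fill the region bounded by the two quadratic curves with a single path, so the whole \forloop stack of curves (and its memory growth) disappears. A minimal sketch, assuming \cx = 275 since the question leaves it open; TikZ paths use cubic Béziers, so each quadratic control point C is degree-elevated to the cubic controls (A+2C)/3 and (B+2C)/3:
\documentclass{standalone}
\usepackage{tikz}
\begin{document}
\begin{tikzpicture}[x=0.01pt, y=0.01pt]
  % Lower border: quadratic (0,0)-(275,450)-(550,0), degree-elevated.
  % Upper border: quadratic (550,0)-(275,900)-(0,0), degree-elevated.
  \fill (0,0) .. controls (183.3,300) and (366.7,300) .. (550,0)
              .. controls (366.7,600) and (183.3,600) .. (0,0) -- cycle;
\end{tikzpicture}
\end{document}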

For historical reasons the memory available to TeX lives in a static pool whose size is hard-coded at build time. You can recompile TeX with this set to a larger size, and some versions allow it to be configured at runtime. This FAQ entry discusses it in a bit more detail.
This page discusses configuring memory in MiKTeX. Depending on which distribution you're using the details will vary, but something similar can be done on most modern TeX distributions. Some older ones may require you to modify the source.
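For instance, on a reasonably recent TeX Live the pool size is read from texmf.cnf when the format files are built, so raising it is a configuration change rather than a recompile (a sketch; the exact file location and the MiKTeX equivalent vary by version):
% texmf.cnf -- raise the main memory pool (the reported limit was 3000000)
main_memory = 12000000
% then rebuild the formats, e.g. by running:  fmtutil-sys --all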

It seems to me that my question does not have a simple, all-solving answer.
Using a more advanced picture-drawing package, as Little Bobby Tables suggested, let LaTeX draw more pictures within a given memory size (roughly 2 times more), but when drawing more than that the error still occurs.
Enlarging the memory as ConcernedOfTunbridgeWells suggested and then recompiling is something I wanted to avoid. It also has the same problem as Little Bobby's suggestion: you can enlarge it 100 times, but when typesetting a 100 times longer document it will not be sufficient again.
The solution would be to rewrite LaTeX completely, as I find this is only one of several problems that make it insufficient for my purposes, or to use some better typesetting engine (any ideas?). As I find this too hard, I'll be forced to just enlarge the memory.

Related

Complex Number App - graphing with Core Plot, PowerPlot, or something else?

I'm coding an iOS app that will explain complex numbers to the user. Complex numbers can be displayed in Cartesian coordinates, and that's what I want to do: print one or more vectors on the screen.
I am looking for the easiest way to print 3 vectors into a coordinate system that adjusts itself to the vector size (if the x coordinate is greater than the y coordinate, scale both axes to the x coordinate, and vice versa).
I tried using Core Plot, which I think is way too multifunctional for my purpose.
Right now I am working with PowerPlot, and my coordinate system looks okay already, but I still encounter some problems (the x- and y-axes are set to the x and y values, which results in a 45-degree line no matter the user input).
The functionality of the examples in Core Plot and PowerPlot doesn't seem to meet my needs.
My last two approaches were using HTML and a web view, and doing it all myself with Quartz (not the simple way...).
Do you have any advice on how to do this the simple way, as it is a simple problem, I guess?
If you don't want to do much actual graphing and plotting, then using Core Plot or similar sounds like overkill to me. The extra bloat of adding Core Plot to your project, not to mention the time taken to understand how to use it, might not be worth it for some simple graphics.
Quartz is well equipped for the job of showing a few vectors on the screen, assuming you're not interested in fancy 3D graphics. There are plenty of tutorials and examples of using Core Graphics (AKA Quartz) to draw lines and so on. If you're going the Quartz route, perhaps get some simple line drawing going first, then ask more questions if you need help with the maths aspect of it.
The typical technique used when rendering with Quartz is to override drawRect in a subclass of UIView and place calls to Core Graphics drawing functions in there.
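A minimal Swift sketch of that technique (the answers of this era would have been Objective-C, but the structure is identical; the VectorView name, sample vectors, and scaling rule are made up for illustration):
import UIKit

// UIView subclass that draws a few vectors from a centered origin.
// drawRect is draw(_:) in Swift; Core Graphics calls go inside it.
class VectorView: UIView {
    var vectors = [CGPoint(x: 60, y: 40), CGPoint(x: -30, y: 80)]

    override func draw(_ rect: CGRect) {
        guard let ctx = UIGraphicsGetCurrentContext() else { return }
        let origin = CGPoint(x: rect.midX, y: rect.midY)

        // Scale the axes to the largest component, per the question:
        // the longest vector should just fit inside the view.
        let maxComp = vectors.map { max(abs($0.x), abs($0.y)) }.max() ?? 1
        let scale = 0.45 * min(rect.width, rect.height) / maxComp

        ctx.setLineWidth(2)
        ctx.setStrokeColor(UIColor.blue.cgColor)
        for v in vectors {
            ctx.move(to: origin)
            // Flip y: UIKit's y axis points down the screen.
            ctx.addLine(to: CGPoint(x: origin.x + v.x * scale,
                                    y: origin.y - v.y * scale))
        }
        ctx.strokePath()
    }
}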
A decent question and example of Quartz line drawing is here:
How do I draw a line on the iPhone?
If you aren't averse to using Google Chart Image, you can load reasonably complex data sets in a simple manner by calling the appropriate URL and then putting the image in a UIImageView. It takes very little code: here is a blog post explanation with sample code.
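To illustrate how little code it is, a sketch of the URL approach in Swift (the loadChart helper is hypothetical, and the query parameters are illustrative: cht=lc requests a line chart, chs the size, chd=t: the data):
import UIKit

// Build an Image Charts URL from the data points and load the PNG.
func loadChart(into imageView: UIImageView, values: [Int]) {
    let data = values.map(String.init).joined(separator: ",")
    let url = URL(string:
        "https://chart.googleapis.com/chart?cht=lc&chs=300x200&chd=t:\(data)")!
    URLSession.shared.dataTask(with: url) { data, _, _ in
        guard let data = data, let image = UIImage(data: data) else { return }
        DispatchQueue.main.async { imageView.image = image }
    }.resume()
}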
The limitations are:
the length of the data set is restricted by the maximum URL length you can request from Google (2048 characters, which is quite large with encoding), though I've plotted 120 data points in 4 series;
a net connection is required (at least to get the initial chart);
and, perhaps the biggest problem, the API is deprecated and will be discontinued at some point in 2015. You would then have to switch to the UIWebView/JavaScript Google Chart API implementation...

Making a "piece of paper with text on it" in OpenGL (Specifically on iOS 5)

I've never done OpenGL, but I'm looking for some pointers on this particular question for an AR app I'm practicing with.
I'd like to make an app with a "flat rectangle" along with text written on the surface of the rectangle. Visually, I'm imagining something along the lines of a piece of paper with text written on it. Each time the app starts, the text would be something different (the text is pulled from a plist file).
The user would be able to view the paper from all sides, much as if there was a piece of paper hanging in front of him.
Is this trivial to do in OpenGL? How could I get started?
Sorry for the really open-ended question, but I wanted to get a feel for how this kind of thing is done.
Looking at the OpenGL template source code in the Xcode sample projects, I see that there is a big array of vertices. I presume that to create a "flat" rectangle, I'd essentially just have to remove or zero out the z-axis. And then the dynamic text that will attach to the surface of the flat rectangle... I don't have any idea how to do that.
This question is hard to answer unambiguously. In general, this is trivial, but then again it is not.
Drawing a "flat rectangle with something on it" is a couple of API calls, as simple as it can get. Drawing text in OpenGL in an efficient way, and high quality, and without big preprocessing is an entirely different story.
What I would do is render text using whatever the "normal system-supported" way is under iOS (just like you would draw in any window, I wouldn't know this specific detail), but draw into a bitmap rather than on the screen. This should be supported, pretty much every OS has supported this for at least 10-15 years. Then turn this bitmap into a texture, bind it, and draw your trivial flat quad with OpenGL (set up a vertex buffer with 4 vertices, each vertex a texture coordinate, and draw two triangles - as easy as it gets).
The huge advantage of that is that you get to use the installed system fonts (or any fonts available), you don't need to generate a bitmap font and don't need to think about really ugly things such as hinting and proper spacing, and it's much easier to mix different text styles, etc. OpenGL has built-in support for text too, of course, but it is not terribly efficient or nice either. If the text does not change every millisecond, it's really best to render it using the standard renderer that the operating system provides (yes, that probably won't be hardware accelerated, but so what... since the user must read the text, it likely won't change every millisecond).
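A sketch of that render-to-bitmap-then-texture pipeline in Swift, for the OpenGL ES era the question targets (the makeTextTexture name is hypothetical; error handling, context setup, and the power-of-two texture constraint of older devices are glossed over):
import UIKit
import OpenGLES

// Render a string with the system text renderer, then upload the bitmap
// as a GL texture to be mapped onto the flat two-triangle quad.
func makeTextTexture(_ text: String, size: CGSize) -> GLuint {
    // 1. Draw the text into an offscreen bitmap via UIKit.
    UIGraphicsBeginImageContextWithOptions(size, false, 1)
    (text as NSString).draw(at: .zero, withAttributes:
        [.font: UIFont.systemFont(ofSize: 24), .foregroundColor: UIColor.black])
    let image = UIGraphicsGetImageFromCurrentImageContext()!
    UIGraphicsEndImageContext()

    // 2. Extract raw RGBA pixels from the rendered image.
    let cg = image.cgImage!
    var pixels = [UInt8](repeating: 0, count: cg.width * cg.height * 4)
    let ctx = CGContext(data: &pixels, width: cg.width, height: cg.height,
                        bitsPerComponent: 8, bytesPerRow: cg.width * 4,
                        space: CGColorSpaceCreateDeviceRGB(),
                        bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue)!
    ctx.draw(cg, in: CGRect(x: 0, y: 0, width: cg.width, height: cg.height))

    // 3. Upload the pixels as a texture (needs a current EAGLContext).
    var tex: GLuint = 0
    glGenTextures(1, &tex)
    glBindTexture(GLenum(GL_TEXTURE_2D), tex)
    glTexParameteri(GLenum(GL_TEXTURE_2D), GLenum(GL_TEXTURE_MIN_FILTER), GL_LINEAR)
    glTexImage2D(GLenum(GL_TEXTURE_2D), 0, GL_RGBA,
                 GLsizei(cg.width), GLsizei(cg.height), 0,
                 GLenum(GL_RGBA), GLenum(GL_UNSIGNED_BYTE), pixels)
    return tex
}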
Now it gets more complicated if your "piece of paper" should bend and twist too, or do a page-peel effect rather than being just a flat rectangle. In that case you need to tessellate it, which can be harder than it sounds, too. Not all tessellations look optimal for all bends/twists, or they do but do not have the optimal (read: minimum) number of vertices.
There is an article on "page peel" and such tessellation in one of the GPU Gems or GPU Pro books, let me search...
There: Andreas Bizzotto: "A Shader-Based eBook Reader - Page peeling effect", GPU Pro2 pp. 278-299
Maybe you can get hold of a copy or are lucky enough to find it on Google Books or something.

Improve OCR accuracy from scanned documents

I'm scanning a lot of A3 documents using a standard Brother A3 multifunction device, and then using FineReader Pro to OCR the images.
However, I'm getting a lot of errors in the characters recognized, and lots of non-alphanumeric strange characters.
Can someone give me any tips for programmatically improving the OCR accuracy, either pre-processing on the scanned images, or post-processing on the recognized text?
Edit: here is a sample PDF. It includes some sample images from which I get the poorest results.
Do you have a sample image you can post somewhere? Then we can quickly tell you what is causing most of your problems. FineReader is one of the better OCR engines out there, so there are definitely reasons why you are getting poor results.
It could be related to poor contrast and threshold settings, image skew, dirty rollers in the scanner, complex and coloured backgrounds, dithered backgrounds, font sizes too small, scanning DPI too low, etc.
After seeing the attached image, there are a few small issues:
1. There are lots of dirty specks on the background page. FineReader seems to do a reasonable job with these on your images.
2. There is some slight skew, but that is not causing any problems.
3. FineReader is getting confused by the bold, tall Arial-type font used for the column headers.
4. A big problem seems to be the bottom region of the pages, where the contrast is poor and the image is fuzzy. This seems to be a problem with the scanner but could be due to printing problems.
The printing is quite poor and I am guessing it is a scan from a newspaper. Most of your errors are due to scanning issues so it would be hard to programmatically improve the results.
Firstly, I would try scanning the image in grayscale using a slightly higher resolution and see if that helps. FineReader works well with grayscale images. If you have to have a B/W image then see if the scanner driver includes a setting for dynamic thresholding and turn it on.
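For the programmatic side of the question, a sketch of dynamic (adaptive) thresholding, here in Swift over a raw grayscale buffer (the adaptiveThreshold name, window, and bias values are illustrative; a real implementation would use an integral image or a library such as Leptonica for speed):
// Binarize each pixel against the mean of its local window instead of one
// global cutoff; this copes with the poor contrast at the bottom of a page.
func adaptiveThreshold(_ gray: [UInt8], width: Int, height: Int,
                       window: Int = 15, bias: Int = 10) -> [UInt8] {
    var out = [UInt8](repeating: 255, count: gray.count)
    let r = window / 2
    for y in 0..<height {
        for x in 0..<width {
            var sum = 0, n = 0
            for wy in max(0, y - r)...min(height - 1, y + r) {
                for wx in max(0, x - r)...min(width - 1, x + r) {
                    sum += Int(gray[wy * width + wx]); n += 1
                }
            }
            // Black if noticeably darker than the local mean.
            if Int(gray[y * width + x]) < sum / n - bias {
                out[y * width + x] = 0
            }
        }
    }
    return out
}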
Your images would not be an easy task for any OCR engine. You will get better results if you can improve the scanning. Page 3 has a lot of noise in the bottom right corner.
What version of FineReader are you using? FR10 would probably give better results than previous versions.

Generate font from an image of text

Is it possible to generate a specific set of fonts from the image given below?
My idea is to generate a specific font for the image of text given below, by manually selecting portions of the image and mapping them to a set of letters, generating the font from that, and then using this font to make the image readable for an OCR. Is generating a font possible using any open-source implementation? Also, please suggest any good OCRs.
Abbyy FineReader 10 gets better than expected results but predictably gets confused when the characters touch.
Your problem is that the line spacing is too small. The descenders of each line overlap the character bounding boxes of the characters in the line directly below. This makes character segmentation almost impossible because the characters are touching and overlapping. The number of combinations of overlapping characters is virtually impossible to train for. The 'g' and 'y' characters are the worst offenders.
A double line spaced version of this would probably OCR reasonably well.
A custom solution that segmented and separated each line, along with a good dictionary, would definitely improve the results. There would still be some errors to correct manually, though. The custom routine would have to deal with the ascenders and descenders and try to segment the image into lines which can then be fed to a decent OCR engine. One way would be to analyse every character blob on the page and allocate it to a line, as in the sketch below. Leptonica (www.leptonica.com - C imaging library) would probably make this job a little easier.
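A greedy Swift sketch of that blob-to-line allocation (the Box type, groupIntoLines name, and tolerance are hypothetical; Leptonica would supply the actual connected-component boxes, and real pages also need deskewing first):
// Cluster character blobs into text lines by their vertical centers,
// then order each line left to right before handing it to the OCR engine.
struct Box {
    var x, y, w, h: Int
    var cy: Double { Double(y) + Double(h) / 2 }
}

func groupIntoLines(_ blobs: [Box], tolerance: Double) -> [[Box]] {
    var lines: [(center: Double, boxes: [Box])] = []
    for blob in blobs.sorted(by: { $0.cy < $1.cy }) {
        if var line = lines.last, abs(blob.cy - line.center) < tolerance {
            line.boxes.append(blob)
            // Running mean keeps the band centered as the baseline drifts.
            line.center += (blob.cy - line.center) / Double(line.boxes.count)
            lines[lines.count - 1] = line
        } else {
            lines.append((blob.cy, [blob]))
        }
    }
    return lines.map { $0.boxes.sorted { $0.x < $1.x } }
}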
I would not try this without increasing the resolution to 200 or 300 dpi first.
With this custom solution, training a font becomes an option if the OCR engine does a poor job initially.
Abbyy (www.abbyy.com) or Google Tesseract OCR 3.00 would be a good place to start.
No guarantees as to whether all of this will work, though. This is quite a difficult page to OCR, and you need to work out whether it would be better to have it typed up manually overseas. It depends on the number of pages you need to process.

Character extraction methods overview

I am searching for a good character extraction method, or what is sometimes called a stroke model or stroke filter.
I've seen many papers, but they all take a long time to understand and implement, so I want to ask if someone knows of some good source code or demos.
I would also like some kind of full overview of the methods available on this theme: character extraction from (grayscale) images.
The main problem is to get the regions of an image that include only characters; then some binarization can be made, and after that the feature extraction is done (that is where the actual OCR starts).
Maybe GNU Ocrad can be interesting? I haven't looked at the source though.
An area with characters is recognized by a large number of sharp edges. There will be some preferential directions, but this is not as strong as you'd see with box shapes.
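A Swift sketch of that edge-density measure (the edgeDensity name, tile size, and edge threshold are illustrative; a real detector would also check the direction histogram mentioned above):
// Score each tile of a grayscale image by how many sharp horizontal
// intensity steps it contains; text-bearing tiles score high.
func edgeDensity(_ gray: [UInt8], width: Int, height: Int,
                 tile: Int = 32, minStep: Int = 40) -> [[Double]] {
    let rows = height / tile, cols = width / tile
    var score = Array(repeating: Array(repeating: 0.0, count: cols), count: rows)
    for y in 0..<(rows * tile) {
        for x in 1..<(cols * tile) {
            let d = abs(Int(gray[y * width + x]) - Int(gray[y * width + x - 1]))
            if d > minStep {                // a sharp edge, typical of glyphs
                score[y / tile][x / tile] += 1
            }
        }
    }
    let area = Double(tile * tile)
    return score.map { row in row.map { $0 / area } }
}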
You seem to assume that it is possible to get "regions of image that include only characters". This is too optimistic. Just look at this very page: there are symbols mixed in with the text. And above this editing box, the first four tool buttons are B, I, a globe, and a quotation mark. Five, if you count the thin divider bar | after the I.
