Jasper Reports virtualizer - alternative for large amounts of data - memory

JasperReports has a virtualizer option that writes to disk, but I'm curious whether there is another way to handle large amounts of data that cause out-of-memory problems.
Does the gzip virtualizer work well instead of the file-based one, or does it run into the same memory problem? http://community.jaspersoft.com/wiki/comparison-report-virtualizers
Would creating multiple JasperPrint objects for sections of the PDF content and then merging them with the exporter into one output stream work?
Any suggestions?
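For reference, this is roughly how a virtualizer gets plugged into the fill: a minimal sketch assuming JasperReports 5.x, a compiled report.jasper, and a JRDataSource; swapping JRFileVirtualizer for JRGzipVirtualizer is a one-line change.

    import java.util.HashMap;
    import java.util.Map;

    import net.sf.jasperreports.engine.JRDataSource;
    import net.sf.jasperreports.engine.JRParameter;
    import net.sf.jasperreports.engine.JasperExportManager;
    import net.sf.jasperreports.engine.JasperFillManager;
    import net.sf.jasperreports.engine.JasperPrint;
    import net.sf.jasperreports.engine.fill.JRFileVirtualizer;

    public class VirtualizedFill {
        public static void fillAndExport(JRDataSource data) throws Exception {
            // Keep at most 100 pages in memory; spill the rest to temp files on disk.
            // JRGzipVirtualizer(100) would instead keep all pages in memory, gzip-compressed,
            // so it lowers memory use but does not bound it the way the file-based one does.
            JRFileVirtualizer virtualizer =
                    new JRFileVirtualizer(100, System.getProperty("java.io.tmpdir"));
            try {
                Map<String, Object> params = new HashMap<>();
                params.put(JRParameter.REPORT_VIRTUALIZER, virtualizer);

                JasperPrint print = JasperFillManager.fillReport("report.jasper", params, data);
                virtualizer.setReadOnly(true); // filling is done; pages will only be read back now

                JasperExportManager.exportReportToPdfFile(print, "report.pdf");
            } finally {
                virtualizer.cleanup(); // delete the temporary page files once the export is finished
            }
        }
    }

As for merging several JasperPrint objects, newer export APIs do accept a list as input (for example SimpleExporterInput.getInstance(jasperPrintList)), but each JasperPrint still has to live somewhere during export, so splitting alone does not remove the memory pressure unless every part is also virtualized.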

Related

Why is this string not disposed, and how can I optimize memory usage?

I am writing an Excel-parsing web API that reads a structured file into objects (via a recursive function).
But I've noticed a significant memory spike while parsing some files, which ends in an OutOfMemory exception. The Excel parsing engine needs the whole file to be loaded before it can read its structure, yet I found that it's not the loading that consumes most of the memory: it's the parsing (turning the Excel content into structured objects) and the JSON HTTP response (serializing those objects to JSON) that finally kill the memory. For example, a 1 MB file can parse into 70 MB of JSON.
So I googled around, found the .NET Memory Profiler, and tried to analyze what was leading to this huge memory usage. Here is the snapshot I captured while parsing the same file twice. I noticed that there are huge string/Object[] instances that are not being GCed.
Now I'm at a loss. What are the best practices when you are dealing with lots of Lists and lots of strings? Where should I start looking to reduce the memory usage? And what are the best practices for handling a long-running process (adding a queue? using SignalR to notify clients of the result?)?
Some guidance would be really appreciated!
I'd say you need to avoid loading the whole file at once.
Here is a good place to start:
Load Large Excel file in C#
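The question is about .NET, but the advice is language-neutral: stream rows instead of materializing the whole workbook as objects. A minimal sketch of the same idea in Java, using Apache POI's event API (assumes the poi-ooxml dependency and an .xlsx path on the command line); a C# equivalent would use a streaming reader rather than loading the full document model.

    import java.io.InputStream;
    import javax.xml.parsers.SAXParserFactory;
    import org.apache.poi.openxml4j.opc.OPCPackage;
    import org.apache.poi.xssf.eventusermodel.ReadOnlySharedStringsTable;
    import org.apache.poi.xssf.eventusermodel.XSSFReader;
    import org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler;
    import org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler.SheetContentsHandler;
    import org.apache.poi.xssf.model.StylesTable;
    import org.apache.poi.xssf.usermodel.XSSFComment;
    import org.xml.sax.InputSource;
    import org.xml.sax.XMLReader;

    public class StreamingExcelRead {
        public static void main(String[] args) throws Exception {
            try (OPCPackage pkg = OPCPackage.open(args[0])) {          // path to an .xlsx file
                XSSFReader reader = new XSSFReader(pkg);
                ReadOnlySharedStringsTable strings = new ReadOnlySharedStringsTable(pkg);
                StylesTable styles = reader.getStylesTable();

                SheetContentsHandler rowHandler = new SheetContentsHandler() {
                    public void startRow(int rowNum) { }
                    public void endRow(int rowNum) {
                        // convert/emit the finished row here, then let it be garbage collected
                    }
                    public void cell(String ref, String value, XSSFComment comment) {
                        // one cell at a time; the workbook is never materialized in memory
                    }
                    public void headerFooter(String text, boolean isHeader, String tagName) { }
                };

                XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
                parser.setContentHandler(new XSSFSheetXMLHandler(styles, strings, rowHandler, false));

                try (InputStream sheet = reader.getSheetsData().next()) {   // first sheet only
                    parser.parse(new InputSource(sheet));
                }
            }
        }
    }

The same thinking applies on the response side: writing the JSON out with a streaming serializer, directly to the response body, avoids ever holding the full 70 MB string in memory.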

What are the scenarios that make us compress data before we transfer it?

I am wondering why we need to apply file compression before we upload files to a server in some scenarios. To my understanding, as soon as the server receives the compressed files, they need to be extracted so that the server can read their content. That certainly consumes the server's computing power if multiple HTTP POSTs are sent from many client-side platforms.
So the only scenarios I can think of for sending compressed files are uploading backup files, settings files, and files that merely serve as backups for the client-side platforms. Please give me more scenarios for uploading compressed data.
I think the following article gives a perfect explanation of the question: http://www.dataexpedition.com/support/notes/tn0014.html
Here's the content:
Compression Pros & Cons
Simply put, compression is a process which trades CPU cycles for bytes. But the trade isn't always a good one. Sometimes you can spend a lot of valuable CPU cycles for little or no gain.
In the context of network data transport, "Should I compress?" is a common question. But the answer can get complicated, depending on several factors. The most important thing to remember is that compression can actually make your data move much slower, so it should not be used without some consideration.
When Compression Is Good
Compression algorithms try to identify large repeating patterns in a data set and replace them with smaller patterns. Ideally, this shrinks the size of the data set. For the purposes of network transport, having less data to move means it should take less time to move it.
Documents and files which consist mostly of plain text or machine executable code tend to compress well. Examples include word processing documents, HTML files, some .exe files, and some database files.
Combining many small files into a single archive prior to network transfer can often result in faster speeds than transferring each file individually. This may be true even if the individual files themselves are not compressible. Many archiving utilities have options to pack files into an archive without compression, such as the "-0" option for "zip". ExpeDat will combine the contents of a folder into a single data stream when you enable Streaming Folders.
When Compression Is Bad
Many data types are not compressible, because the repeating patterns have already been removed. This includes most images, videos, songs, any data that is already compressed, or any data that has been encrypted.
Trying to compress data that is not compressible wastes CPU time. When you are trying to move data at high speeds, that CPU time may be critical to feeding the network. So by taking away processing time with worthless compression, you can actually end up moving your data much more slowly than if you had compression turned off.
If you are using a compression utility only for the purposes of combining many small files, check for options that disable compression. For example, the "zip" command has a "-0" option which packages files into an archive without spending time trying to compress them.
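(Not part of the quoted article.) To make the "archive without compressing" option concrete, here is a minimal Java sketch (file names are hypothetical) that packs files into a single zip while spending essentially no CPU on compression, the same intent as "zip -0":

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.zip.Deflater;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipOutputStream;

    public class PackWithoutCompression {
        public static void pack(Path[] files, Path archive) throws IOException {
            try (ZipOutputStream zip = new ZipOutputStream(new FileOutputStream(archive.toFile()))) {
                zip.setLevel(Deflater.NO_COMPRESSION); // level 0: no compression work, just packaging
                for (Path file : files) {
                    zip.putNextEntry(new ZipEntry(file.getFileName().toString()));
                    Files.copy(file, zip); // copy the file's bytes into the current archive entry
                    zip.closeEntry();
                }
            }
        }
    }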
Inline versus Offline
Many transport mechanisms allow you to apply compression algorithms to data as it's being transferred. This is convenient because the compression and decompression occur seamlessly without the user having to perform extra steps. But it is also risky, because any CPU time spent on compression is time NOT being spent on feeding data through the network. If the network is very fast, the CPU is very slow, or the compression algorithm is unable to scale, having inline compression turned on may cause your data to move more slowly than if you turn compression off. Inline compression can be slower than no compression even when the data is compressible!
If you are going to be transferring the same data set multiple times, it pays to compress it first using Zip or Tar-Gzip. Then you can transfer the compressed archive without taking CPU cycles away from the network processing. If you are planning to encrypt your data, make sure you compress it first, then encrypt second.
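(Not part of the quoted article.) A minimal sketch of the offline step in Java (paths are hypothetical): compress once before the transfer so no CPU is taken from network processing later, and if encryption is needed, apply it to the already-compressed output.

    import java.io.IOException;
    import java.io.OutputStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.zip.GZIPOutputStream;

    public class CompressBeforeTransfer {
        public static void gzip(Path source, Path target) throws IOException {
            try (OutputStream out = Files.newOutputStream(target);
                 GZIPOutputStream gz = new GZIPOutputStream(out)) {
                Files.copy(source, gz); // compress once, up front; transfer the target file afterwards
            }
            // If the data must also be encrypted, encrypt the gzipped output as a second step:
            // compressing after encryption gains nothing, because encrypted bytes look random.
        }
    }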
Hidden Compression
Devices in your network may be applying compression without you realizing it. This becomes evident if the "speed" of the network seems to change for different data types. If the network seems slow when you are transferring data that is already compressed, but fast when you are transferring uncompressed text files, then you can be pretty sure that something out there is making compression decisions for you.
Network compression devices can be helpful in that they take the compression burden away from the end-point CPUs. But they can also create very inconsistent results since they will not work for all destinations and data types. Network level compression can also run into the same CPU trade-offs discussed above, resulting in some files moving more slowly than they would if there was no compression.
If you are testing the speed of your network, try using data that is already compressed or encrypted to ensure consistent results.
Should I Turn On Inline Compression?
For compressed data, images, audio, video, or encrypted files: No.
For other types of data, test it both ways to see which is faster.
If the network is very fast (hundreds of megabits per second or faster), consider turning off inline compression and instead compress the data before you move it.

iOS: Strategies for downloading very large data from the web

I'm struggling with memory management in iOS while downloading relatively large files from the web (such as 350 MB videos).
The goal here is to download these kinds of files and store them in Core Data in a Binary Data field.
At the moment I'm using the NSURLSession dataTaskWithURL and dataTaskWithRequest methods to retrieve these files, but it looks like these methods don't address memory usage at all; they just keep filling memory until it reaches its maximum, leaving me with a memory warning at around 380 MB.
(Screenshots: initial memory usage, and the subsequent memory warning.)
What's the best strategy for performing this kind of large data retrieval from the web without hitting a memory warning? Can Alamofire or other libraries deal with this problem?
It is better to use a download task.
Save the video as a file to the Documents or Library directory.
Then save the relative path in Core Data.
If you use a download task:
You can resume if the last download fails.
It needs less memory.
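The reason a download task needs less memory is that the response is streamed straight to a file instead of accumulating in RAM; on iOS, NSURLSession download tasks do this for you by writing to a temporary file. The same principle, sketched in Java for consistency with the other examples here (the URL and target path are hypothetical), is simply to copy the HTTP body to disk in small chunks:

    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class StreamToDisk {
        public static void download(String url, Path target) throws Exception {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            try (InputStream in = conn.getInputStream();
                 OutputStream out = Files.newOutputStream(target)) {
                byte[] buffer = new byte[64 * 1024]; // only this buffer lives in memory, never the whole file
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
            } finally {
                conn.disconnect();
            }
        }

        public static void main(String[] args) throws Exception {
            download("https://example.com/video.mp4", Paths.get("video.mp4"));
        }
    }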
You can try AFNetworking to download large files.

Partially loading a PDF into memory

Is there a way to load (large) PDF files only partially? So, let's say: don't load the complete PDF file, but only the first 5 pages.
I'm currently handling large PDF files (30-50 MB), and when I call CGPDFRetain, the whole document (the complete 30-50 MB) is retained in memory.
Is it possible to fetch single pages out of a PDF without first loading the complete PDF into memory? Can somebody help me with that?
Update: Because my app needs to support offline access, the PDFs should be loaded from local storage.
Update 2: I have tried different strategies by now, but the app is still on the edge of its memory limit because I'm loading the PDF completely into memory in one single step. But somehow it should be possible to support big PDF files, shouldn't it?
I don't know what CGPDFRetain is, so I might be totally off. PDF is designed in such a way that you only need parts of it to render it correctly. There is something called a "web optimized" PDF which has its objects arranged in a special way. Every webserver is able to send a byte range of a document, and these two mechanisms allow the partial loading of a PDF.
You should elaborate where you load the PDF.
It's not like this. CGPDFDocument POINTS to a file on disk and has parts cached in memory, but never the whole document.
There are some problems with CGPDFDocument getting too greedy with memory, but in that case just destroy and re-create the CGPDFDocument and you're fine. Otherwise, your app might just crash after CGPDFDocument has allocated too much memory.
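To illustrate the byte-range mechanism mentioned above: any server that honors the HTTP Range header returns only a slice of the document, which is what makes partial loading of a web-optimized (linearized) PDF possible. A minimal Java sketch of such a request (the URL is hypothetical); for the local-storage case, the equivalent is letting CGPDFDocument page data in on demand rather than reading the whole file yourself.

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class ByteRangeFetch {
        // Fetch only the first 64 KB of a remote document instead of the whole file.
        public static byte[] fetchHead(String url) throws Exception {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setRequestProperty("Range", "bytes=0-65535");
            try (InputStream in = conn.getInputStream()) {
                return in.readAllBytes(); // with a 206 Partial Content response, at most 64 KB arrives
            } finally {
                conn.disconnect();
            }
        }
    }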

OutOfMemoryException Processing Large File

We are loading a large flat file into BizTalk Server 2006 (Original release, not R2) - about 125 MB. We run a map against it and then take each row and make a call out to a stored procedure.
We receive an OutOfMemoryException during orchestration processing; the Windows service restarts, uses the full 2 GB of memory, and crashes again.
The server is 32-bit and set to use the /3GB switch.
I've also separated the flow into three hosts: one for receives, one for orchestration, and a third for sends.
Does anyone have any suggestions for getting this file to process without error?
Thanks,
Krip
If this is a flat file being sent through a map, you are converting it to XML, right? The increase in size could be huge: XML can easily add a factor of 5-10 over a flat file, especially if you use descriptive or long XML tag names (which normally you would).
Something simple you could try is renaming the XML nodes to shorter names; depending on the number of records (it sounds like a lot), that might actually have a pretty significant impact on your memory footprint.
A more enterprise-grade approach would be to subdivide the file in a custom pipeline into separate message packets that can be fed through the system in more manageable chunks (similar to what Chris suggests). Then the system's throttling and memory metrics could take over. Without knowing more about your data it is hard to say how best to do this, but with a 125 MB file I am guessing that you have a ton of repeating rows that do not need to be processed sequentially.
Where does it crash? Does it make it past the Transform shape? Another suggestion is to run the transform in the receive port. For more efficient processing, you could even debatch the message and have multiple simultaneous orchestration instances calling the stored procs. This would definitely reduce the memory profile and increase performance.
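Not BizTalk-specific, but a minimal sketch of the debatching idea described in both answers: split the big flat file into fixed-size row chunks up front so that each message the system handles stays small (the chunk size and the chunk-file naming are hypothetical).

    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class FlatFileSplitter {
        // Write every `rowsPerChunk` lines of the input into its own chunk file.
        public static void split(Path input, int rowsPerChunk) throws IOException {
            try (BufferedReader reader = Files.newBufferedReader(input)) {
                String line;
                int row = 0;
                int chunk = 0;
                BufferedWriter writer = null;
                while ((line = reader.readLine()) != null) {
                    if (row % rowsPerChunk == 0) {          // start a new chunk file
                        if (writer != null) writer.close();
                        writer = Files.newBufferedWriter(Paths.get(input + ".chunk" + chunk++));
                    }
                    writer.write(line);
                    writer.newLine();
                    row++;
                }
                if (writer != null) writer.close();
            }
        }
    }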

Resources