Im thinking to use the Huffman coding to make an app that takes pictures right from the iPhone camera and compress it. Would it be possible for the hardware to handle the complex computation and building the tree ? In other words, is it doable?
Thank you
If you mean the image files (like jpg, png, etc), then you should know that they are already compressed with algorithms specific to images. The resulting files would not huffman compress much, if at all.
If you mean that you are going to take the UIImage raw pixel data and compress it, you could do that. I am sure that the iPhone could handle it.
If this is for a fun project, then go for it. If you want this to be a useful and used app, you will have some challenges
It is very unlikely that Huffman will be better than the standard image compression used in JPG, PNG, etc.
Apple has already seen a need for better compression and implemented HEIF in iOS 11. WWDC Video about HEIF
They did a lot of work in the OS and Photos app to make sure to use HEIF locally, but if you share the photo it turns it into something anyone could use (e.g. JPG)
All of the compression they implement uses hardware acceleration. You could do this too, but the code is a lot harder than Huffman.
So, for learning and fun, it's a good project -- it might be easier to do as a Mac app instead, but for something meant to be real, it would be extremely hard to overcome the above issues.
There are 2 parts, encoding and decoding. The encoding process involves constructing a tree or a table based representation of a tree. The decoding process covers reading from huff encoding bytes and undoing a delta. It would likely be difficult to get much speed advantage in the encoding as compared to PNG, but for decoding a very effective speedup can be seen by moving the decoding logic to the GPU with Metal. You can have a look at the full source code of an example that does just that for grayscale images on github Metal Huffman.
Related
I was wondering if the type of photo used to train an object detector makes a difference, I can't seem to find anything about this online. I am using opencv and dlib if that makes a difference but I am interested in a more general answer if possible.
Am I correct in assuming that lossless file formats would be better than lossey formats? And if training for an object jpg would be better than png as pngs are optimized for text and graphs?
As long as the compression doesn't introduce noticeable artifacts it generally won't matter. Also, many real world computer vision systems need to deal with video or images acquired from less than ideal sources. So you usually shouldn't assume you will get super high quality images anyway.
I've been doing research on the best way to do video processing on iOS using the latest technologies and have gotten a few different results. It seems there's ways to do this with Core Image, OpenGL, and some open source frameworks as well. I'd like to steer away from the open source options just so that I can learn what's going on behind the scenes, so the question is:
What is my best option for processing (filters, brightness, contrast, etc.) a pre-recorded video on iOS?
I know Core Image has a lot of great built in filters and has a relatively simple API, but I haven't found any resources on how to actually break down a video into images and then re-encode them. Any help on this topic would be extremely useful, thanks.
As you state, you have several options for this. Whichever you regard as "best" will depend on your specific needs.
Probably your simplest non-open-source route would be to use Core Image. Getting the best performance out of Core Image video filtering will still take a little work, since you'll need to make sure you're doing GPU-side processing for that.
In a benchmark application I have in my GPUImage framework, I have code that uses Core Image in an optimized manner. To do so, I set up AV Foundation video capture and create a CIImage from the pixel buffer. The Core Image context is set to render to an OpenGL ES context, and the properties on that (colorspace, etc.) are set to render quickly. The settings I use there are ones suggested by the Core Image team when I talked to them about this.
Going the raw OpenGL ES route is something I talk about here (and have a linked sample application there), but it does take some setup. It can give you a little more flexibility than Core Image because you can write completely custom shaders to manipulate images in ways that you might not be able to in Core Image. It used to be that this was faster than Core Image, but there's effectively no performance gap nowadays.
However, building your own OpenGL ES video processing pipeline isn't simple, and it involves a bunch of boilerplate code. It's why I wrote this, and I and others have put a lot of time into tuning it for performance and ease of use. If you're concerned about not understanding how this all works, read through the GPUImageVideo class code within that framework. That's what pulls frames from the camera and starts the video processing operation. It's a little more complex than my benchmark application, because it takes in YUV planar frames from the camera and converts those to RGBA in shaders in most cases, instead of grabbing raw RGBA frames. The latter is a little simpler, but there are performance and memory optimizations to be had with the former.
All of the above was talking about live video, but prerecorded video is much the same, only with a different AV Foundation input type. My GPUImageMovie class has code within it to take in prerecorded movies and process individual frames from that. They end up in the same place as frames you would have captured from a camera.
Currently, I am developing an app that needs to store large amount of text on an iPad. My question is, are algorithms like Huffman coding actually used in production? I just need a very simple compression algorithm (there's not going to be a huge amount of text and it only needs a more efficient method of storage), so would something like Huffamn work? Should I look into some other kind of compression library?
From Wikipedia on the subject:
Huffman coding today is often used as a "back-end" to some other compression methods. DEFLATE (PKZIP's algorithm) and multimedia codecs such as JPEG and MP3 have a front-end model and quantization followed by Huffman coding (or variable-length prefix-free codes with a similar structure, although perhaps not necessarily designed by using Huffman's algorithm).
So yes, Huffman coding is used in production. Quite a lot, even.
Huffman coding (also entropy coding) is used very widely. Anything you imagine that is being compressed, with exceptions of some very old schemes, uses them. Image compression, Zip and RAR archives, every imaginable codec and so on.
Keep in mind that Huffman coding is lossless and requires you to know all of the data you're compressing in advance. If you're doing lossy compression, you need to perform some transformations on your data to reduce its entropy first (removing and quantizing DCT coefficients in JPEG compression). If you want Huffnam coding to work on real-time data (you don't know every bit in advance), adaptive Huffman coding is used. You can find a whole lot on this topic in signal processing literature.
Some of the pre-Huffman compression include schemes like runlength coding (fax machines). Runlength coding is still sometimes used (JPEG, again) in combination with Huffman coding.
Yes, they are used in production.
As others have mentioned, true Huffman requires you to analyze the entire corpus first to get the most efficient encoding, so it isn't typically used by itself.
Probably shortly after you were born, I implemented Huffman compression on the Psion Series 3 handheld computer in C in order to compress data which was preloaded onto data packs and only decompressed on the handheld. In those days, space was tight and there was no built-in compression library.
Like most software which is well-specified, I would strongly consider using any feature built into the iOS or standard packages available in your development environment.
This will save a lot of debugging and allow you to concentrate on the most significant portions of your app which add value.
Large amounts of text will be amenable to zip-style compression. And it will be unlikely that spending effort improving its performance (in either space or time) will pay off in the long run.
There's an iOS embedded mechanism to support zlib algorithm (zlib.h in Objective-C).
You may implement your own compression functionality and utilize iOS embedded zlib functions. And compare the performance.
I think the embedded zlib functionality will be faster and will give higher compression ratio.
Huffman codes are the backbone to many "real world" production algorithms. The common compression algorithms today improve upon Huffman codes by transforming their data to improve compression ratios. Of course, there are many application specific techniques used to do this.
As for whether or not you should use Huffman codes, my question is why should you when you can achieve better compression and ease of code by using an already implemented 3rd party library?
Yes, I'm using a huffman compression in my web app for storing a complete snapshot of my engine in an hidden input field. First off it was just curiosity but it offload my SESSION memory moving it to the client browser memory and i used it to store it in a file to backup and exchange that snapshot with my collegue. Man, you have to see their faces when you can just load a file in an admin panel to load the engine in the web!!! It's basically a serialized compressed and base64 encoded array. It helps me to save about 15% bandwith but I think I can do it better now.
Yes, you're using Huffman coding (decoding) to read this page, since web pages are compressed to the gzip format. So it is used pretty much every nanosecond by web browsers all over the planet.
Huffman coding is almost never used by itself, but rather always with some higher-order modeling of the data to give the Huffman algorithm what it needs to work with. So LZ77 models text and other byte-oriented data as repeating strings coded as literals and length/distance pairs, which is then fed to Huffman coding, e.g. in the deflate compressed format using zlib. Or with difference or other prediction coding of pixels for PNG, followed by Huffman coding.
As for what you should use, look at lz4, zlib, and zstd.
I am interested in studying some image processing. I imagine matlab is the best way to go about that but right now I don't have access to matlab. I tried octave but for some reason it can't even load a png, bmp or anything other than 1 specific format. R doesn't seem to be the key here either.
What is the language of choice here? Perl?
Also can anyone point me to any other good tutorials that I may have missed on image processing?
Opencv is an excellent image processing library. Although written in C it comes with some high level tools to display images handle image files, mouse events etc so you can experiment without writing a lot of windows code.
It also works with python, although I haven't used it with the PIL.
If you are interested in how the algorithms work then implementing them yourself using python and numpy for the matrix ops is easy.
I guess it depends on what you want to do. Matlab certainly is a high end choice, but for a lot of things the image modules of general purpose programming languages do the trick.
I did some pixel mangling and image processing with PIL, the python image library. It is perfectly sufficient for processing single RGB images of reasonable size (say, what a consumer digital camera delivers). It can handle alpha channels, has some filters, more or less quick methods of accessing the pixel information - and it is python, a very straightforward and readable language.
The recommended language in my computer vision class was Ch with the OpenCV library. Ch is basically an interpreted version of C, the syntax is quite similar but has a few nice features, like treating arrays as matrices. OpenCV will house pretty much any image processing function you could need.
I think any free programming environments will do basic image processing well. If speed is not an issue, Processing will work fine and you can easily extend your code to Java in the future.
Have a look at Adobe Pixel Bender. It's really fun to play with.
I'm curious as to what type of experiences people have had with libgd. I am looking for an alternative to GDI+ (something faster). I have tried ImageMagick, but can't get the performance out of it i need.
I have heard that ligd is fast, but less feature rich, and that ImageMagick is slow, but more feature rich.
I only need very simple image processing procedures (scale, crop) on a limited number of formats mainly jpg. However I need very high quality interpolation and it has to be fast (well faster than GDI+).
I am considering trying libgd does this seem like a good fit, given my requirements?
My own experiences with libgd are very good.
I've used it on a website where I'm taking jpeg images from disk, adding a text title to them, and rendering them back out, and it feels just about as fast as if the file was being served straight from disk.
Given that each time it's decoding the jpeg, altering it, and then re-encoding it, that's pretty good!