What size should my image be to retrain Inception? - machine-learning

I am following the codelabs TensorFlow for Poets guide to re-train Inception v3 on my own images, but there is no mention of what size my images should be. I also watched some YouTube videos that suggested cropping and filling in white space to make square images, but they didn't really mention the size either.
How should I resize my training images so that I get the best results re-training Inception?

The code does the resizing for you; have a look at retrain.py. Below is the code responsible for deciding the input size of the images depending on the network architecture:
if architecture == 'inception_v3':
  # pylint: disable=line-too-long
  data_url = 'http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz'
  # pylint: enable=line-too-long
  bottleneck_tensor_name = 'pool_3/_reshape:0'
  bottleneck_tensor_size = 2048
  input_width = 299
  input_height = 299
  input_depth = 3
  resized_input_tensor_name = 'Mul:0'

The code prepares the images automatically and feeds them into the network. All you need to do is set up the folders properly and provide enough training images. In my experience the size of the images doesn't matter much: I did the retraining following the instructions with both 640x480 and 1280x1024 images, and got great results with 400-1000 training images per class.
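Since retrain.py resizes whatever you feed it, manual preprocessing is optional. If you do want to square-pad images yourself (the crop-and-fill approach the video suggested), here is a minimal sketch with Pillow (assuming Pillow is installed; 299x299 matches the input_width/input_height in the snippet above):

```python
from PIL import Image

def square_pad_and_resize(img, size=299, fill="white"):
    """Paste the image centered on a white square canvas, then resize
    to size x size (matching Inception v3's input dimensions)."""
    side = max(img.size)
    canvas = Image.new("RGB", (side, side), fill)
    # center the original on the square canvas
    canvas.paste(img, ((side - img.width) // 2, (side - img.height) // 2))
    return canvas.resize((size, size))

img = Image.new("RGB", (640, 480), "blue")   # stand-in for a training photo
print(square_pad_and_resize(img).size)  # (299, 299)
```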

Related

Tips for resizing very wide image for extracting deep-learning model feature-embedding

I am using Resnet50 to extract bottleneck features from an input image. Resnet, or any other model, will always have some input-size requirement; for example, Resnet-50 requires input of shape 224x224x3.
Problem: The input image is of size 5000x90, which is very wide. Resizing this very wide image to 224x224 has two major problems:
Information loss.
The aspect ratio is badly skewed, so directly resizing it to a square does not seem like a good idea to me.
Example: I cannot post the actual image, but my images look something like this (a random example from Google Images):
https://pbs.twimg.com/media/D_X0LXRVUAEW8Pg.jpg
My images are similarly wide webpage ads.
What I tried:
Cut the wide image into 7 parts and stack them vertically: 5000x90 --> 714x90, seven times --> (stacking each part vertically) 714x630.
Side note: since these are ads from the web, I felt we could cut them across; let me know if that is even a valid idea.
Please, fellow experienced developers, guide me on how to tackle this problem.
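The cut-and-stack idea from the question can be sketched with NumPy. This is only a sketch under the assumption that the strips can simply be concatenated; since 5000 is not divisible by 7, the width is padded first:

```python
import numpy as np

def stack_wide_image(img, n_parts=7):
    """Split a wide image into n_parts horizontal strips and stack them
    vertically, padding the width so it divides evenly."""
    h, w = img.shape[:2]
    part_w = -(-w // n_parts)                      # ceiling division
    pad = part_w * n_parts - w
    img = np.pad(img, ((0, 0), (0, pad), (0, 0)))  # zero-pad on the right
    parts = np.split(img, n_parts, axis=1)         # n_parts strips of (h, part_w, c)
    return np.vstack(parts)                        # shape (h * n_parts, part_w, c)

wide = np.zeros((90, 5000, 3), dtype=np.uint8)     # stand-in for a 5000x90 ad
print(stack_wide_image(wide).shape)  # (630, 715, 3)
```

The result (630x715) is nearly square, so resizing it to 224x224 distorts far less than squashing the original 5000x90 strip.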

LibWebP is taking too much time for image compression

I'm working on image compression techniques, analyzing which algorithms can produce the smallest output within 100 ms or less for 1920x1080 images on a laptop with an octa-core processor, for transmission over the network.
I had been using the GDI+ and CxImage libraries, with JPG or PNG compression, which give me the output image within around 30 ms for JPG and around 70 ms for PNG on colorful images. The time taken is pretty good, but the compressed data is much larger if I go for better quality.
Then I came across Google's WebP format. I tried it using libWebP with VC++. The quality and compression rate are really awesome, but the time taken is much higher than I expected: more than 300 ms, and sometimes even more than 1 second if I set alpha filtering and alpha compression to true.
Here are my WebPConfig settings:
m_webp_config.quality = 50;
m_webp_config.alpha_quality = 0;
m_webp_config.lossless = false;
m_webp_config.method = 3;
m_webp_config.alpha_compression = false;
m_webp_config.alpha_filtering = false;
m_webp_config.autofilter = false;
m_webp_config.filter_sharpness = false;
m_webp_config.filter_strength = 0;
m_webp_config.filter_type = 0;
m_webp_config.use_sharp_yuv = false;
And sometimes I get black images whenever I capture the command prompt or Notepad++. (I suspect the problem is with images containing a lot of text, but the same is not true for web pages with huge amounts of text.)
Am I doing anything wrong with the WebPConfig? Is there a way to optimize it?
I didn't find much documentation or any forums that could give me some idea about these problems.
Any help would be appreciated. Thanks in advance.
Try lossless compression? Some lossless modes can be faster than the lossy ones.
If you are in control of the compression and decompression, split the image into 256x256 squares and compress and send them separately. That way you can not only parallelize the computation, but also interleave some of the transmission to happen during the compression, which may simplify your time budgeting for the compression computation.
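The tiling idea can be sketched as follows; this is Python rather than the asker's VC++, and `compress_tile` is a hypothetical stand-in for the real per-tile WebP encode call, but the tile geometry and the thread-pool structure carry over directly:

```python
from concurrent.futures import ThreadPoolExecutor

TILE = 256

def tiles(width, height, tile=TILE):
    """Yield (x, y, w, h) rectangles covering a width x height frame,
    clipping the right and bottom edge tiles."""
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            yield (x, y, min(tile, width - x), min(tile, height - y))

def compress_tile(rect):
    # hypothetical stand-in for a per-tile WebP encode; each call is
    # independent, so the pool can run them in parallel
    return rect

with ThreadPoolExecutor() as pool:
    results = list(pool.map(compress_tile, tiles(1920, 1080)))

print(len(results))  # 8 columns x 5 rows = 40 tiles for a 1920x1080 frame
```

As each tile finishes, it can be handed to the network sender immediately, overlapping transmission with the remaining compression work.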
If you reduce the 'method' value, you will generally get faster WebP compression. For lossless, you need to reduce BOTH method and quality; they control the same thing in a complicated manner. For lossless, try quality 20 and method 1 (or perhaps there is a method 0, too; I don't remember).

Capturing image at higher resolution in iOS

Currently I am working on an image app where I am able to capture an image using the AVFoundation framework. But what I am looking for is to capture an image at a certain resolution and DPI (maybe 300 DPI or greater).
How can I do this?
There have been numerous posts on here about trying to do OCR on camera-generated images. The problem is not that the resolution is too low, but that it's too high. I cannot find the link right now, but there was a question a year or so ago where, in the end, the OCR engine worked better once the image size was reduced by a factor of four or so. If you examine the image in Preview, what you want is for each character to span roughly 16x16 or 32x32 pixels, not 256x256. Frankly, I don't know the exact number, but I'm sure you can research this and find posts from actual framework users telling you the best size.
Here is a nice response on how to best scale a large image (with a link to code).
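The sizing rule above can be turned into a tiny helper; this is purely illustrative and the function name and numbers are assumptions, not part of any framework API:

```python
def ocr_scale_factor(px_per_char, target=32):
    """Return the factor to scale the image by so that each character
    spans roughly `target` pixels (hypothetical helper; the 16-32 px
    target comes from the rule of thumb above)."""
    return target / px_per_char

# a character currently 128 px tall should shrink to a quarter size
print(ocr_scale_factor(128))  # 0.25
```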

Carrierwave - Processed image too big in size

I have a Carrierwave uploader and process images like this:
version :thumbnail do
  process :resize_to_model
  process :quality => 90
end

def resize_to_model
  thumbs_size = model.thumbnail_size
  resize_to_fill thumbs_size[:width], thumbs_size[:height]
end
However, after processing, an image that was 1024x724 px (2.1 MB) comes out at 214x151 px, yet the file size only went down to 1.8 MB. I think 1.8 MB is really a lot for that size. Can I do something about that? Even at 90% quality the image should be maybe 100 KB, shouldn't it?
Before someone asks: the rest works perfectly. No errors, the pixel size is right, and everything else is also fine.
Edit: I forgot to mention I use RMagick (resize_to_fill). Could that be the reason?
The difference between 100% and 90% quality is small, and the storage savings are negligible. If you are truly using this version as a thumbnail, you should look at a much lower quality, say 60% or 40%.
If you are concerned about keeping the quality "good enough", you could also look at different compression techniques. The process used to provide @2x images for Retina displays can be applied here. A great resource is the Filament Group's article Compressive Images.
The tl;dr version: use the original (or near-original) size of the image but drastically reduce the quality (to 0-20%). Then, when using the reduced-quality image, be sure to provide width and height attributes on the <img> element to scale it down to the thumbnail size. Because the image will be scaled down, you will not see the reduction in quality in the "thumbnail" image.
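Carrierwave delegates the actual encoding to RMagick, but the quality/size tradeoff the answer describes is easy to demonstrate with Pillow in Python (assuming Pillow and NumPy are installed; the random-noise image is a stand-in for a photographic thumbnail):

```python
from io import BytesIO
import numpy as np
from PIL import Image

# random noise compresses worst-case, making the quality effect visible
rng = np.random.default_rng(0)
img = Image.fromarray(rng.integers(0, 256, (151, 214, 3), dtype=np.uint8))

sizes = {}
for q in (90, 60, 40):
    buf = BytesIO()
    img.save(buf, format="JPEG", quality=q)   # re-encode at each quality
    sizes[q] = buf.tell()                     # bytes written

print(sizes)  # lower quality settings yield smaller files
```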

Facebook image processing technique

Well, I wonder what compression processes they are using.
I uploaded a test image of 2.3 MB and then downloaded it again.
It was only 92 KB. What the heck, only 92 KB!
And the thumbnail was only 11 KB.
How is all this done, and which algorithms are used? How do I do it myself?
If I had to guess, the file-size decrease is probably due primarily to just old-fashioned downsampling. Images on Facebook are sized to be viewed on part of a screen, but not much larger.
For instance, I uploaded a picture that was 3456x2304 (3.2 MB), which is 7,962,624 pixels. Facebook downsized it to 960x602 (85 kB), which is only 577,920 pixels. That is only about 1/14th of the total number of pixels.
This probably explains the majority of the difference, but it also looks like they are using the sRGB color profile, which can reduce file sizes.
One other possibility is that most JPEG encoders have a quality setting. They may be using a lower quality setting than that of the original.
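The downsampling arithmetic in the answer is easy to verify:

```python
orig = 3456 * 2304   # original pixel count
small = 960 * 602    # Facebook-resized pixel count
print(orig, small, round(orig / small, 1))  # 7962624 577920 13.8
```

So the resized image carries roughly 1/14th of the original pixels, which alone accounts for most of the 2.3 MB to 92 KB drop before any re-encoding.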
