How can I catch corrupt JPEGs when loading an image with imread() in OpenCV? - opencv

OpenCV says something like
Corrupt JPEG data: premature end of data segment
or
Corrupt JPEG data: bad Huffman code
or
Corrupt JPEG data: 22 extraneous bytes before marker 0xd9
when loading a corrupt jpeg image with imread().
Can I somehow catch that? Why would I get this information otherwise?
Do I have to check the binary file on my own?

OpenCV (version 2.4) does not override libjpeg's default message handler, so these warnings are only printed to stderr and cannot be caught. To change that, add the following function to modules/highgui/src/grfmt_jpeg.cpp, right below the definition of error_exit():
METHODDEF(void)
output_message( j_common_ptr cinfo )
{
char buffer[JMSG_LENGTH_MAX];
/* Create the message */
(*cinfo->err->format_message) (cinfo, buffer);
/* Default OpenCV error handling instead of print */
CV_Error(CV_StsError, buffer);
}
Now apply the method to the decoder error handler:
state->cinfo.err = jpeg_std_error(&state->jerr.pub);
state->jerr.pub.error_exit = error_exit;
state->jerr.pub.output_message = output_message; /* Add this line */
Apply the method to the encoder error handler as well:
cinfo.err = jpeg_std_error(&jerr.pub);
jerr.pub.error_exit = error_exit;
jerr.pub.output_message = output_message; /* Add this line */
Recompile and install OpenCV as usual. From now on you should be able to catch libjpeg errors like any other OpenCV error. Example:
>>> cv2.imread("/var/opencv/bad_image.jpg")
OpenCV Error: Unspecified error (Corrupt JPEG data: 1137 extraneous bytes before marker 0xc4) in output_message, file /var/opencv/opencv-2.4.9/modules/highgui/src/grfmt_jpeg.cpp, line 180
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
cv2.error: /var/opencv/opencv-2.4.9/modules/highgui/src/grfmt_jpeg.cpp:180: error: (-2) Corrupt JPEG data: 1137 extraneous bytes before marker 0xc4 in function output_message
(I've submitted a pull request for the above but it got rejected because it would cause issues with people reading images without exception catching.)
Hope this helps anyone still struggling with this issue. Good luck.

It may be easier to fix the files themselves than to patch OpenCV's loading function. On Linux you can use ImageMagick (it is often installed by default) to repair a whole set of images:
$ mogrify -set comment 'Image rewritten with ImageMagick' *.jpg
This command changes a property of the file leaving the image data untouched. However, the image is loaded and resaved, eliminating the extra information that causes the corruption error.
If you need more information about ImageMagick you can visit their website: http://www.imagemagick.org/script/index.php

You cannot catch it if you use imread(). However, there is also the imdecode() function, which is called by imread(). Maybe it gives you more feedback. To use it, you would have to load the file into memory yourself and then call the decoder.
It boils down to: You have to dig through the OpenCV sources to solve your problem.
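As a rough illustration of checking the raw bytes yourself: a well-formed JPEG file starts with the SOI marker (FF D8) and ends with the EOI marker (FF D9). The sketch below only tests for those two markers, and the function name `has_jpeg_markers` is made up for this example. It will not notice a bad Huffman code in the middle of the data, and a valid file with padding after EOI would be rejected, so treat it as a cheap pre-filter for truncated files rather than a real validator.

```python
def has_jpeg_markers(path):
    """Return True if the file starts with SOI (FF D8) and ends with EOI (FF D9)."""
    with open(path, "rb") as f:
        data = f.read()
    # A truncated download typically loses the trailing EOI marker.
    return len(data) >= 4 and data[:2] == b"\xff\xd8" and data[-2:] == b"\xff\xd9"
```

Files that fail this test can be skipped or deleted before they ever reach imread().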

I had to deal with this recently and found a solution here:
http://artax.karlin.mff.cuni.cz/~isa_j1am/other/opencv/
I just needed to make 2 edits in $cv\modules\highgui\src\grfmt_jpeg.cpp.
--- opencv-1.0.0.orig/otherlibs/highgui/grfmt_jpeg.cpp 2006-10-16 13:02:49.000000000 +0200
+++ opencv-1.0.0/otherlibs/highgui/grfmt_jpeg.cpp 2007-08-11 09:10:28.000000000 +0200
@@ -181,7 +181,7 @@
m_height = cinfo->image_height;
m_iscolor = cinfo->num_components > 1;
- result = true;
+ result = (cinfo->err->num_warnings == 0);
}
}
@@ -405,8 +405,9 @@
icvCvt_CMYK2Gray_8u_C4C1R( buffer[0], 0, data, 0, cvSize(m_width,1) );
}
}
- result = true;
+
jpeg_finish_decompress( cinfo );
+ result = (cinfo->err->num_warnings == 0);
}
}

I am using the OpenCV Python package to read some images and also ran into this error message. The error cannot be caught from Python. But if you want to find out which image is corrupted without recompiling OpenCV as @Robbert suggested, you can try the following method.
First, pinpoint the directory where the corrupt images reside, which is fairly easy. Then go to that directory and use the mogrify command-line tool provided by ImageMagick to change the image meta info, as suggested by @goe.
mogrify -set comment "errors fixed in meta info" -format png *.jpg
The above command converts each original jpg image to png format and also cleans the image to remove errors in the meta info. While it runs, mogrify also prints messages naming the corrupted images in the directory, so you can find exactly which ones are affected.
After that, you can do whatever you want with the original corrupted jpg images.

For anyone who stumbles upon this post and reads this answer:
I had to get hold of a corrupted image file to test with.
These websites can help you corrupt a file:
Corrupt a file - The file corrupter you were looking for!
CORRUPT A FILE ONLINE
Corrupt my File
The first and third websites were not very useful.
The second website is more interesting because I could set how much of the file to corrupt.
The OpenCV version I used here is 3.4.0.
I used a plain cv2.imread(fileLocation), where fileLocation is the location of the corrupted image file.
OpenCV didn't show any error message for any of the corrupted files used here.
The first and third websites each gave only one file, and both images were None when I printed them.
The second website let me decide what percentage of the file to corrupt:
Corruption %   Printed image   OpenCV message
4%             None            -
10%            None            -
25%            None            -
50%            None            Corrupt JPEG data: 3 extraneous bytes before marker 0x4f
75%            None            Corrupt JPEG data: 153 extraneous bytes before marker 0xb2
100%           None            Corrupt JPEG data: 330 extraneous bytes before marker 0xc6
I guess the only check we have to make here would be:
if image is not None:
    pass  # do your processing here
else:
    raise IOError("could not read the image")
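The None check above can be wrapped in a small helper that splits a list of paths into readable and unreadable files. This is only a sketch: `split_readable` and its `loader` parameter are names invented here, and the default loader assumes the `cv2` package is installed. A file that merely triggers a warning on stderr may still decode to an image, so this catches only the files OpenCV gives up on.

```python
def split_readable(paths, loader=None):
    """Split paths into (readable, unreadable) using a loader that returns None on failure."""
    if loader is None:
        import cv2  # assumption: OpenCV's Python bindings are available
        loader = cv2.imread
    readable, unreadable = [], []
    for path in paths:
        image = loader(path)
        if image is not None:
            readable.append(path)
        else:
            unreadable.append(path)
    return readable, unreadable
```

Passing the loader in explicitly also makes it easy to swap in another decoder or a stub when testing.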

You can redirect stderr to a file, then after imread, search for the string "Huffman" inside that file. After searching the file, empty it. It works for me and now I am able to discard corrupted images and just process good ones.
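From Python, one way to try this without a long-lived log file is to redirect the process-level stderr descriptor around each call. libjpeg writes through C stdio, so replacing `sys.stderr` alone would likely miss its messages; the sketch below swaps file descriptor 2 instead. The names `capture_stderr_fd` and `load_and_check` and the `loader` parameter are made up for illustration (in practice you would pass `cv2.imread`), and the approach is not thread-safe because fd 2 is shared by the whole process.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def capture_stderr_fd():
    """Temporarily point file descriptor 2 at a temp file and collect what was written."""
    captured = []  # receives one string when the block exits
    saved_fd = os.dup(2)
    with tempfile.TemporaryFile(mode="w+b") as tmp:
        os.dup2(tmp.fileno(), 2)
        try:
            yield captured
        finally:
            # Restore the real stderr before reading back the temp file.
            os.dup2(saved_fd, 2)
            os.close(saved_fd)
            tmp.seek(0)
            captured.append(tmp.read().decode("utf-8", errors="replace"))


def load_and_check(path, loader):
    """Call loader(path); return (image, True if a JPEG warning appeared on stderr)."""
    with capture_stderr_fd() as captured:
        image = loader(path)
    message = captured[0]
    warned = "Corrupt JPEG data" in message or "Premature end of JPEG file" in message
    return image, warned
```

Matching on the message prefix rather than only on "Huffman" also covers the "extraneous bytes" and "premature end" variants quoted in the question.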

If you load your image with imdecode, you can check errno:
std::vector<char> datas;
// Load your image into datas here
errno = 0;
cv::Mat mat = cv::imdecode(datas, -1);
if (errno != 0)
{
    // Error
}
(tested on OpenCV 3.4.1)

I found that the issue is in libjpeg. If OpenCV uses it, it gets error
Corrupt JPEG data: 22 extraneous bytes before marker 0xd9
You can try my workaround, which disables JPEG support at compile time. After that, OpenCV cannot read or write JPEG files, but otherwise it works.
cmake -D CMAKE_BUILD_TYPE=Release -D CMAKE_INSTALL_PREFIX=/usr/local -D BUILD_SHARED_LIBS=OFF -D BUILD_EXAMPLES=OFF -D BUILD_TESTS=OFF -D BUILD_PERF_TESTS=OFF -D WITH_JPEG=OFF -D WITH_IPP=OFF ..

I found an easy solution without the need to recompile openCV.
You can use imagemagick to detect the same errors, however it returns an error as expected. See the description here: https://stackoverflow.com/a/66283167/2887398

Related

Extract/Convert an (multi-file)image that converted

I have a file made with a program, an image sticker maker.
I know this program saves its images (probably an image, a background and a mask) into a single file with the extension ".adf".
I couldn't convert the output file with ImageMagick because of the error below:
convert: no decode delegate for this image format `output.adf' @ error/constitute.c/ReadImage/532.
I don't know how this image was converted with ImageMagick.
Here is my -list configure result:
Path: [built-in]
Name Value
-------------------------------------------------------------------------------
NAME ImageMagick
Path: configure.xml
Name Value
-------------------------------------------------------------------------------
CC vs10
COPYRIGHT Copyright (C) 1999-2011 ImageMagick Studio LLC
DELEGATES bzlib freetype jpeg jp2 lcms png tiff x11 xml wmf zlib
FEATURES OpenMP
HOST Windows
LIB_VERSION 0x671
LIB_VERSION_NUMBER 6,7,1,0
NAME ImageMagick
RELEASE_DATE 2011-07-15
VERSION 6.7.1
WEBSITE http://www.imagemagick.org
I attached the file :
src.adf
* EDIT *
If I run the file command on src.adf, it prints:
root@MexHex-PC:# file -vv src.adf
file-5.25
magic file from /etc/magic:/usr/share/misc/magic
What am I missing?
Thanks
This src.adf looks like a very minimal & flat data file. I know nothing about Dalahoo3D and/or ArcGis products, but we can quickly extract the embedded images with python.
import struct

with open('src.adf', 'rb') as f:
    # Calculate file size.
    f.seek(0, 2)
    total_bytes = f.tell()
    # Rewind to the beginning.
    f.seek(0)
    file_cursor = f.tell()
    image_cursor = 0
    while file_cursor < total_bytes:
        # Scan for the start of a JPEG.
        if f.read(1) == b"\xFF":
            if f.read(3) == b"\xD8\xFF\xE0":
                print("JPEG FOUND!")
                # Back up and read the image size stored just before the JPEG.
                f.seek(-8, 1)
                payload_size = struct.unpack('<I', f.read(4))[0]
                # Write image to disk
                d_filename = 'image{0}.jpeg'.format(image_cursor)
                with open(d_filename, 'wb') as d:
                    d.write(f.read(payload_size))
                image_cursor += 1
            else:
                f.seek(-3, 1)  # Back cursor up, and try again.
        file_cursor = f.tell()
Which dumps the following three images...
I'm sure this file was made with Imagemagick. I had already seen that one would convert the file to tiff image. He told me to do this with Imagemagick but did not explain the method.
I'm guessing this is just a matter of miscommunication. It's true that ImageMagick commonly handles JPEG / TIFF formats, but not geographic information systems and/or 3D modeling. That's usually extended by a vendor -- like ArcGIS. I would bet that ImageMagick is present in the workflow of generating TIFF files, but .ADF wouldn't be supported by ImageMagick until someone writes a delegate coder.
Update
From this question, it looks like you'll need to extend ImageMagick delegates to call GDAL utilities. You'll need to update the delegates.xml file to call the correct utility.

opencv read image “Premature end of JPEG file”

I was using OpenCV to read the images from a folder. A lot of messages like this show up:
Corrupt JPEG data: premature end of data segment
Premature end of JPEG file
Premature end of JPEG file
Premature end of JPEG file
How to catch this exception and remove these image files?
Since you said you are reading 'images' (multiple images), you would be looping through files in the folder that you are reading them from.
In that case, if you check if the image is valid or not by using the following :
Mat image;
image = imread(argv[1], CV_LOAD_IMAGE_COLOR); // Read the file
if(! image.data ) // Check for invalid input
{
cout << "Could not open or find the image" << std::endl ;
return -1;
}
you can then proceed to deleting files which are corrupt/bad.
I've been struggling to find a solution too. Read tens of articles, most of which just state that openCV does not throw errors and only outputs the error on stderr.
Some suggest to use PIL, but that does not detect most of the image corruptions. Usually only premature end of file.
However the same errors that OpenCV warns about can be detected via imagemagick.
Install imagemagick (https://imagemagick.org/)
Make sure you have it in the path.
Put the following function into your code and call it wherever you need to verify a file. identify still prints the errors to stderr, but thanks to "-regard-warnings" it also exits with a non-zero status, which check_returncode() turns into an exception:
import subprocess

def checkFile(imageFile):
    try:
        subprocess.run(["identify", "-regard-warnings", imageFile]).check_returncode()
        return True
    except subprocess.CalledProcessError:
        return False
If you don't want the check to spam your outputs, add stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL params to the run function call.
On Windows, if you have not installed the legacy commands, use the new syntax:
subprocess.run(["magick", "identify", "-regard-warnings", imageFile]).check_returncode()

using imread of OpenCV failed when the image is Ok

I encountered a problem when I want to read an image using the OpenCV function imread().
The image is Ok and I can show it in the image display software.
But when I use the imdecode() to get the image data, the data returns NULL.
I will upload the image and the code and hope some one could help me
Mat img = imread(image_name);
if(!img.data) return -1;
The image's link is here: http://img3.douban.com/view/photo/raw/public/p2198361185.jpg
PS: The image_name is all right.
I guess OpenCV cannot decode this image. So is there any way to decode this image using OpenCV?, like add new decode library. By the way, I can read this image using other image library such as freeImage.
Your image is actually in GIF format, which OpenCV does not support as of now.
Note OpenCV offers support for the image formats Windows bitmap (bmp),
portable image formats (pbm, pgm, ppm) and Sun raster (sr, ras). With
help of plugins (you need to specify to use them if you build yourself
the library, nevertheless in the packages we ship present by default)
you may also load image formats like JPEG (jpeg, jpg, jpe), JPEG 2000
(jp2 - codenamed in the CMake as Jasper), TIFF files (tiff, tif) and
portable network graphics (png). Furthermore, OpenEXR is also a
possibility.
Source - Click here
You can use something like this, to perform the conversion.
I was able to load your image using imread using this. Also, you can check out FreeImage.
You can also try to use the library gif2numpy. It converts a gif image to a numpy image which then can be loaded by OpenCV:
import cv2, gif2numpy
np_images, extensions, image_specs = gif2numpy.convert("yourgifimage.gif")
cv2.imshow("np_image", np_images[0])
cv2.waitKey()
The library can be found here: https://github.com/bunkahle/gif2numpy It is not dependent on PIL or pillow for this like imageio.
There are two methods to read an image in OpenCV, one is using Mat the other one using IplImage. I see you have used the former one. You can try with the second argument of imread also:
image = imread("image.jpg", CV_LOAD_IMAGE_COLOR); // Read the file
else use IplImage
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc_c.h"
#include <opencv2/core/core.hpp>
const char* filename = "filename.jpg";
IplImage* src = 0;
if( (src = cvLoadImage(filename, 1)) == 0 )
{
    printf("Cannot load file image %s\n", filename);
}
If they don't work please check if you have installed libjpeg, libtiff and other dependencies for reading an image in OpenCV.
Hope it would help.

Error during image decoding (imdecode)

I use Python 2.7, Windows 7 and OpenCV 2.4.6, and I am trying to run the following code:
https://github.com/kyatou/python-opencv_tutorial/blob/master/08_image_encode_decode.py
#import opencv library
import cv2
import sys
import numpy
argvs=sys.argv
if (len(argvs) != 2):
    print 'Usage: # python %s imagefilename' % argvs[0]
    quit()
imagefilename = argvs[1]
try:
    img=cv2.imread(imagefilename, 1)
except:
    print 'faild to load %s' % imagefilename
    quit()
#encode to jpeg format
#encode param image quality 0 to 100. default:95
#if you want to shrink data size, choose low image quality.
encode_param=[int(cv2.IMWRITE_JPEG_QUALITY),90]
result,encimg=cv2.imencode('.jpg',img,encode_param)
if False==result:
    print 'could not encode image!'
    quit()
#decode from jpeg format
decimg=cv2.imdecode(encimg,1)
cv2.imshow('Source Image',img)
cv2.imshow('Decoded image',decimg)
cv2.waitKey(0)
cv2.destroyAllWindows()
I keep getting the following error:
encode_param=[int(cv2.IMWRITE_JPEG_QUALITY), 90]
AttributeError: 'module' object has no attribute 'IMWRITE_JPEG_QUALITY'
I have tried a lot of things: reinstall opencv, convert cv2 to cv code and searched different forums but I keep getting this error. Am I missing something? Is there someone who can run this code without getting the error?
BTW: Other opencv code (taking pictures from webcam) runs without problems....
At the moment I save the image to a temp JPG file. Using the imencode function I want to create the jpg file in the memory.
Thanks in advance and with best regards.
The problem is not in your code, which should work, but in your OpenCV Python package. I can't tell you why it raises that error, but you can avoid it by replacing the encode_param line with this one (1 is the numeric value of IMWRITE_JPEG_QUALITY):
encode_param=[1, 90]

Google PageSpeed & ImageMagick JPG compression

Given a user uploaded image, I need to create various thumbnails of it for display on a website. I'm using ImageMagick and trying to make Google PageSpeed happy. Unfortunately, no matter what quality value I specify in the convert command, PageSpeed is still able to suggest compressing the image even further.
Note that http://www.imagemagick.org/script/command-line-options.php?ImageMagick=2khj9jcl1gd12mmiu4lbo9p365#quality mentions:
For the JPEG ... image formats,
quality is 1 [provides the] lowest
image quality and highest compression
....
I actually even tested compressing the image using 1 (it produced an unusable image, though) and PageSpeed still suggests that I can still optimize such image by "losslessly compressing" the image. I don't know how to compress an image any more using ImageMagick. Any suggestions?
Here's a quick way to test what I am talking about:
assert_options(ASSERT_BAIL, TRUE);
// TODO: specify valid image here
$input_filename = 'Dock.jpg';
assert(file_exists($input_filename));
$qualities = array('100', '75', '50', '25', '1');
$geometries = array('100x100', '250x250', '400x400');
foreach($qualities as $quality)
{
echo("<h1>$quality</h1>");
foreach ($geometries as $geometry)
{
$output_filename = "$geometry-$quality.jpg";
$command = "convert -units PixelsPerInch -density 72x72 -quality $quality -resize $geometry $input_filename $output_filename";
$output = array();
$return = 0;
exec($command, $output, $return);
echo('<img src="' . $output_filename . '" />');
assert(file_exists($output_filename));
assert($output === array());
assert($return === 0);
}
echo ('<br/>');
}
The JPEG may contain comments, thumbnails or metadata, which can be removed.
Sometimes it is possible to compress JPEG files more, while keeping the same quality. This is possible if the program which generated the image did not use the optimal algorithm or parameters to compress the image. By recompressing the same data, an optimizer may reduce the image size. This works by using specific Huffman tables for compression.
You may run jpegtran or jpegoptim on your created file, to reduce it further in size.
To minimize the image sizes even more, you should remove all meta data. ImageMagick can do this by adding a -strip to the commandline.
Have you also considered embedding your thumbnail images in the HTML as inline base64-encoded data?
This can make your web page load much faster (even though the total size gets a bit larger), because it saves the browser from making a separate request for each image file referenced in the HTML code.
Your HTML code for such an image would look like this:
<IMG SRC="data:image/png;base64,
iVBORw0KGgoAAAANSUhEUgAAAM4AAABJAQMAAABPZIvnAAAABGdBTUEAALGPC/xh
BQAAAAFzUkdCAK7OHOkAAAAgY0hSTQAAeiYAAICEAAD6AAAAgOgAAHUwAADqYAAA
OpgAABdwnLpRPAAAAAZQTFRFAAAA/wAAG/+NIgAAAAF0Uk5TAEDm2GYAAAABYktH
RACIBR1IAAAACXBIWXMAAABIAAAASABGyWs+AAAB6ElEQVQ4y+3UQY7bIBQG4IeQ
yqYaLhANV+iyi9FwpS69iGyiLuZYpepF6A1YskC8/uCA7SgZtVI3lcoiivkIxu/9
MdH/8U+N6el2pk0oFyibWyr1Q3+PlO2NqJV+/BnRPMjcJ9zrfJ/U+zQ9oAvo+QGF
d+npPqFQn++TXElkrEpEJhAtlTBR6dNHUuzIMhFnEhxAmJDkKxlmt7ATXDDJYcaE
r4Txqtkl42VYSH+t9KrD9b5nxZeog/LWGVHprGInGWVQUTvjDWXca5KdsowqyGSc
DrZRlGlQUl4kQwpUjiSS9gI9VdECZhHFQ2I+UE2CHJQfkNxTNKCl0RkURqlLowJK
1h1p3sjc0CJD39D4BIqD7JvvpH/GAxl2/YSq9mtHSHknga7OKNOHKyEdaFC2Dh1w
9VSJemBeGuHgMuh24EynK03YM1Lr83OjUle38aVSfTblT424rl4LhdglsUag5RB5
uBJSJBIiELSzaAeIN0pUlEeZEMeClC4cBuH6mxOlgPjC3uLproUCWfy58WPN/MZR
86ghc888yNdD0Tj8eAucasl2I5LqX19I7EmEjaYjSb9R/G1SYfQA7ZBuT5H6WwDt
UAfK1BOJmh/eZnKLeKvZ/vA8qonCpj1h6djfbqvW620Tva36++MXUkNDlFREMVkA
AAAldEVYdGRhdGU6Y3JlYXRlADIwMTItMDgtMjJUMDg6Mzc6NDUrMDI6MDBTUnmt
AAAAJXRFWHRkYXRlOm1vZGlmeQAyMDEyLTA4LTIyVDA4OjM3OjQ1KzAyOjAwIg/B
EQAAAA50RVh0bGFiZWwAImdvb2dsZSJdcbX4AAAAAElFTkSuQmCC"
ALT="google" WIDTH=214 HEIGHT=57 VSPACE=5 HSPACE=5 BORDER=0 />
And you would create the base64 encoded image data like this:
base64 -i image.jpg -o image.b64
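If the thumbnails are generated by a script anyway, the same encoding can be done in code instead of with the base64 tool. Here is a minimal Python sketch; the function name `to_data_uri` is made up for this example, and the MIME type passed in has to match the actual image format.

```python
import base64


def to_data_uri(image_bytes, mime_type="image/jpeg"):
    """Return a data: URI that embeds the given image bytes as base64."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return "data:{0};base64,{1}".format(mime_type, encoded)
```

The returned string can be placed directly in the SRC attribute of an IMG tag, as in the HTML above.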
Google performs those calculations based on its WebP image format (https://developers.google.com/speed/webp/).
Although it offers performance gains, it is currently supported only by Chrome and Opera (http://caniuse.com/webp).

Resources