How to produce thumbnails in real-time? - stream

Is there a program or script that can read an image on standard input and write a resized image to standard output without waiting for EOF on standard input? Poor quality is acceptable; waiting for the whole image to load is not.
ImageMagick (convert and stream alike) will read, then process, then output. What I want is more like a real-time stream processor: if I'm scaling down 50%, it should output one row of thumbnail for every two rows of input (roughly), regardless of the state of the input stream.
If this doesn't make sense yet, imagine you're loading an image over a slow network connection. As soon as it can, the browser starts displaying the top edge of the image. If the image is larger than the window, the browser scales it down to fit the window. It doesn't have to wait for the whole image to load.
Here are some of the tools I've used for testing. This serves an image on port 8080 in ten slices, with a one-second delay between slices to simulate a slow network connection:
IMAGE=test.jpg; SLICES=10; SIZE=$(stat -c "%s" $IMAGE); BS=$(($SIZE / $SLICES + 1)); (echo HTTP/1.0 200 OK; echo Content-Type: image/jpeg; echo; for i in $(seq 0 $(($SLICES - 1))); do dd if=$IMAGE bs=$BS skip=$i count=1; sleep 1; done) | nc -lp8080 -q0
Run that and immediately open localhost:8080 in your browser to see the image slowly load. If you pipe the image slices to convert or stream instead of nc (omitting all the echoes), no output appears for ten seconds, and then you get the whole thumbnail at once.
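For anyone who wants to reproduce the slow input without dd, here is a minimal Python sketch of the same slicing arithmetic. The function name slow_slices and its parameters are made up for illustration; the block size is computed the same way as BS above.

```python
import time
from typing import Iterator


def slow_slices(data: bytes, slices: int = 10, delay: float = 1.0) -> Iterator[bytes]:
    """Yield `data` in at most `slices` pieces, pausing `delay` seconds after each."""
    block_size = len(data) // slices + 1  # same arithmetic as BS in the shell version
    for i in range(slices):
        chunk = data[i * block_size:(i + 1) * block_size]
        if not chunk:  # nothing left, like dd skipping past the end of the file
            break
        yield chunk
        time.sleep(delay)
```

Writing each chunk to sys.stdout.buffer and flushing after every write should give a candidate tool the same trickle of input as the dd loop.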

How difficult this is depends on the image format. PNG, for example, is chunked, and its pixel data is zlib-compressed, so you may have to read in a potentially large portion of the file before you can start rendering the image. BMP images are usually stored "bottom-up", with rows running from the lower left to the upper right, so unless your thumbnail will also be a BMP, you will have to either read in the entire image or process the file backwards. JPEG can do this more readily; it's stored in order, and if it's a progressive JPEG you can abuse that and read in only the first N passes to get the thumbnail resolution you need. Wavelet formats like DjVu might also be more straightforward.
I don't think you'll find general-purpose tools that do this, but you could write a custom format-specific streaming decoder to handle it.
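To make the "one row of thumbnail for every two rows of input" idea concrete, here is a minimal sketch of the resizing half of such a tool. It assumes a format-specific decoder already hands over grayscale rows one at a time as lists of integers. The names stream_downscale and _box_average are invented for this example and do not come from any existing library.

```python
from typing import Iterable, Iterator, List


def _box_average(block: List[List[int]], factor: int) -> List[int]:
    """Average each factor x factor square of pixels into one output pixel."""
    width = len(block[0]) // factor
    out = []
    for x in range(width):
        total = sum(row[x * factor + dx] for row in block for dx in range(factor))
        out.append(total // (factor * factor))
    return out


def stream_downscale(rows: Iterable[List[int]], factor: int = 2) -> Iterator[List[int]]:
    """Yield one output row as soon as `factor` input rows have arrived."""
    block: List[List[int]] = []
    for row in rows:
        block.append(row)
        if len(block) == factor:
            yield _box_average(block, factor)
            block = []
    # Trailing rows that do not fill a whole block are dropped in this sketch.
```

Because it is a generator, it never holds more than `factor` input rows in memory and never waits for the end of the input. The hard part remains the decoder that produces the rows.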

USB webcam too slow in taking photos on raspberry pi

I'm using fswebcam to capture an image using node-red exec block running on a raspberry pi.
The time it takes to capture the image is 3+ seconds.
fswebcam -r 1280x720 image.jpg
I tried the same using OpenCV and the result is a little better but similar.
import cv2
cam = cv2.VideoCapture(1)  # open the second video device
s, img = cam.read()  # s is False if no frame could be grabbed
if s:
    cv2.imwrite("/home/pi/pythontest/tt.jpg", img)  # save image
cam.release()
I'm guessing that it takes some time for the USB camera to initialize and take a picture which increases the time drastically. Is there any way to keep the camera initialized?
Any other workarounds to ameliorate this issue?
There may be other methods, but one way to do this is to run the camera continuously during periods when you want faster responses. You will need to consider some things though:
bandwidth used to capture images
wear on your SD card
accessing incomplete images midway through capture.
I'll leave you to determine what USB bandwidth you need for the resolution you are using.
As regards the second - wear on your SD card - I would suggest you capture to /tmp and ensure that is based on a RAM filesystem by becoming root and adding a line like this to your /etc/fstab:
tmpfs /tmp tmpfs defaults,noatime,nosuid 0 0
Then reboot. This way the data never goes near your SD card.
As regards the third - incomplete images still being captured - you can leverage the --exec option of fswebcam to get around this. Basically, you capture to one file and then after it is complete, you use --exec to rename the file to /tmp/latest.jpg and you use that in your application.
fswebcam -r 640x480 --loop 1 --exec 'mv /tmp/inprogress.jpg /tmp/latest.jpg' /tmp/inprogress.jpg
This relies on the fact that, under Unix at least, renaming a file does not affect any process that has that file open and that renaming is atomic. So your application will always get either the entire new file or the entire old file, and never half a file that is still being written.
My camera produces images around 160kB, so I tested the file size like this in a tight loop, reading the file as fast as possible and only notifying me if it is far less than the normal size, i.e. truncated:
while : ; do l=$(wc -c < latest.jpg); [[ $l -lt 140000 ]] && echo $l; done
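If your own code writes the frames instead of fswebcam, the same write-then-rename trick can be sketched in Python. This is an illustration only: write_atomically is a hypothetical helper name, and it assumes the temporary file and the target live on the same filesystem (for example both under /tmp).

```python
import os
import tempfile


def write_atomically(path: str, data: bytes) -> None:
    """Write `data` to a temporary file, then rename it over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # The rename is atomic on POSIX when both names are on one filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temporary file if writing or renaming failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A reader that opens the target at any moment then sees either the previous complete frame or the new complete frame.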
Try profiling your code (using cProfile, for example) to make sure the issue is not Python interpreter start-up time or imwrite.
If the issue is camera initialization, then I suppose the only option is to write a daemon that keeps the camera online and gives you an image on request.
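A rough sketch of what such a daemon could look like inside one Python process: a background thread keeps reading frames and remembers only the newest one. The class name LatestFrameGrabber is made up, and read_frame is assumed to be any callable returning (ok, frame), for example the read method of the cv2.VideoCapture(1) object from the question.

```python
import threading
import time


class LatestFrameGrabber:
    """Keep a frame source open and remember only the most recent frame."""

    def __init__(self, read_frame):
        self._read_frame = read_frame  # callable returning (ok, frame)
        self._lock = threading.Lock()
        self._frame = None
        self._running = False
        self._thread = None

    def start(self):
        self._running = True
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        while self._running:
            ok, frame = self._read_frame()
            if ok:
                with self._lock:
                    self._frame = frame
            else:
                time.sleep(0.01)  # back off briefly if no frame was available

    def latest(self):
        """Return the newest frame, or None if nothing has been captured yet."""
        with self._lock:
            return self._frame

    def stop(self):
        self._running = False
        if self._thread is not None:
            self._thread.join()
```

The camera is opened once, so each later request for latest() should cost only a lock acquisition instead of the multi-second initialization.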

Scan video for text string?

My goal is to find the title screen from a movie trailer. I need a service where I can search a video for a string, then return the frame with that string. Pretty obscure, does anything like this exist?
e.g. for this movie, I'd scan for "Sausage Party" and retrieve this frame:
Edit: I found the CloudSight API, which would actually work except that the cost is prohibitive at $.04 per call, assuming I need to split the video into 1s intervals and scan every image (at least 60 calls per video).
No exact service that I can find, but you could attempt to do this yourself...
ffmpeg -i sausage_party.mp4 -r 1 %04d.png
/usr/local/bin/parallel --no-notice -j 8 \
/usr/local/bin/tesseract -psm 6 -l eng {} {.} \
::: *.png
This extracts one frame a second from the video file, and then uses tesseract to extract the text via OCR into files of the same name as the image frame (e.g. 0135.txt). However, your results are going to vary massively depending on the font used and the quality of the video file.
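Once the per-frame .txt files exist, finding the title frame comes down to searching them. Here is a small sketch of that last step; the names normalize and frames_containing are hypothetical, and it assumes the OCR output sits in one directory as NNNN.txt files. The matching is a plain substring test after lower-casing and stripping punctuation, so it will not rescue badly garbled OCR.

```python
import re
from pathlib import Path
from typing import List


def normalize(text: str) -> str:
    """Lower-case and collapse anything that is not a letter or digit to one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def frames_containing(ocr_dir, needle: str) -> List[str]:
    """Return the frame names (file stems) whose OCR text contains `needle`."""
    wanted = normalize(needle)
    hits = []
    for txt in sorted(Path(ocr_dir).glob("*.txt")):
        if wanted in normalize(txt.read_text(errors="ignore")):
            hits.append(txt.stem)
    return hits
```

With one frame extracted per second, a hit named 0135 corresponds to a point a little over two minutes into the trailer.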
You'd probably find it cheaper/easier to use something like Amazon Mechanical Turk, especially since the OCR is going to have a hard time doing this automatically.
Another option could be implementing this service by yourself using the Scene Text Detection and Recognition module in OpenCV (docs.opencv.org/3.0-beta/modules/text/doc/text.html). You can take a look at this video to get an idea of how such a system would operate. As pointed out above the accuracy would depend on the font used in the movie titles, the quality of the video files, and the OCR.
OpenCV relies on Tesseract as the underlying OCR but, alternatively, you could use the text detection and localization functions (docs.opencv.org/3.0-beta/modules/text/doc/erfilter.html) in OpenCV to find text areas in the image and then employ a different OCR to perform the recognition. The text detection and localization stage can be done very quickly thus achieving real time performance would be mostly a matter of picking a fast OCR.

What is the fastest way to convert PostScript to GIF?

I am using the ImageMagick convert utility right now. I have a PostScript file that takes about 90 seconds to convert to GIF.
I am looking for a faster way to do this, preferably by modifying the options to "convert".
When I say "fast", ideally a few seconds but I'll take any significant speed up. Something suitable for an interactive GUI.
I only need this in black and white or greyscale (specifically, it is an image of seismic data "wiggle traces", so B&W is fine).
Other acceptable formats are BMP, GIF, JPEG, JPG, PCX, PGM, PNG, PNM, PPM, RAS, TGA, TIF, or TIFF.
Trying to stick with ImageMagick as that is already installed and trying to avoid selling my boss on anything new. Still happy to hear other suggestions.
My suggestion is: Use Ghostscript.
Since you already have a working ImageMagick installed, Ghostscript is also there: ImageMagick cannot convert PDF or PostScript to raster images on its own; it has to call Ghostscript as its delegate to do this anyway.
Ghostscript can directly convert PDF/PostScript input to TIFF/TIF/TIFFg4, JPEG, PBM, PCX, PNG, PNM, PPM, BMP raster image output.
The advantages are: you don't need to have ImageMagick involved. So it's faster and also gives you more direct control over the conversion parameters. If you run Ghostscript via ImageMagick that's a level of indirection which isn't always required. (Sometimes it may be required to add some fine-tuning and post-processing manipulations to the raster image data that Ghostscript generated -- but that doesn't seem to be the case for you.)
The only disadvantage is: Ghostscript cannot produce GIF. If you required GIF (which you don't seem to), you need ImageMagick for post-processing the raster output of Ghostscript to GIF.
You can see how ImageMagick calls Ghostscript (and which parameters it uses for the call -- look for a printed line on stderr containing gs, gsx or gswin32c or gswin64c) by running for example:
convert -verbose some.pdf[0] some.gif
Update
I ran a very, very unscientific 'benchmark', running the following two commands 100 times each. They convert the randomly picked page 333 of the official PDF specification (ISO version for PDF-1.7) to GIF, and I measured the time consumed. I ran the two commands concurrently, so both had to deal with the same overall system load, which makes the results more comparable:
'Comfortably' using ImageMagick's convert to directly produce GIF:
time for i in $(seq -w 1 100); do
convert \
PDF32000_2008.pdf[333] \
p333-im-no_${i}.gif ;
done
Using Ghostscript to create from the same page grayscale PNGs, piping Ghostscript's output to ImageMagick's convert in order to get GIFs:
time for i in $(seq -w 1 100); do
gs \
-q \
-o - \
-dFirstPage=333 \
-dLastPage=333 \
-sDEVICE=pnggray \
PDF32000_2008.pdf \
| \
convert \
- \
p333-gs-no_${i}.gif ;
done
Timing results for the first command (running the 'comfortable' convert to achieve the PDF->GIF transformation, which uses Ghostscript only 'behind our backs'):
real 2m29.282s
user 2m22.526s
sys 0m5.647s
Timing results for the second command (running gs directly and openly, piping its output to convert):
real 1m27.370s
user 1m23.447s
sys 0m3.435s
One more thing:
The total size of the 100 'Ghostscript'-GIFs was 1,6 MByte -- but they were 8-bit grayscale.
The total size of the 100 'ImageMagic-direct'-GIFs was 1,2 MByte -- but they were 2-bit black+white.
I don't have the motivation currently to tweak the test commandline parameters more for even closer comparability of the resulting files.
This result (149 seconds vs. 87 seconds) gives me enough confidence in my guess that you can gain significant performance improvements when you follow my recommendation. :-)
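For the record, the arithmetic behind those two "real" figures can be checked with a few lines of Python. The helper to_seconds is just an illustrative name; the two time values are the ones reported above.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a shell `time` value such as '2m29.282s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


via_convert = to_seconds("2m29.282s")  # convert calling Ghostscript itself
via_gs = to_seconds("1m27.370s")       # gs piped into convert
speedup = via_convert / via_gs
saved = 1 - via_gs / via_convert
print(f"{via_convert / 100:.2f}s vs {via_gs / 100:.2f}s per page, "
      f"{speedup:.2f}x faster, {saved:.0%} less time")
```

That works out to roughly 1.7 times faster, or about 1.5 seconds against 0.9 seconds per page, on this one machine and this one page.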
You can start with GhostScript:
gs -dSAFER -dBATCH -dNOPAUSE \
-sDEVICE=pnggray -r300 -sOutputFile=seismic.png seismic.pdf
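Since the goal is an interactive GUI, the same call will probably end up being launched from a program. Here is a hedged sketch of doing that from Python. The function names gs_command and render are invented, and the defaults simply mirror the flags shown above, with -q added to silence the banner. It assumes a gs binary is on the PATH.

```python
import subprocess
from typing import List


def gs_command(source: str, target: str, device: str = "pnggray", dpi: int = 300) -> List[str]:
    """Build the Ghostscript argument list for rendering `source` to `target`."""
    return [
        "gs", "-dSAFER", "-dBATCH", "-dNOPAUSE", "-q",
        f"-sDEVICE={device}",
        f"-r{dpi}",
        f"-sOutputFile={target}",
        source,
    ]


def render(source: str, target: str, **options) -> None:
    """Run Ghostscript and raise CalledProcessError if it fails."""
    subprocess.run(gs_command(source, target, **options), check=True)
```

Trying a lower dpi value is likely the cheapest experiment, since the resolution determines how many pixels Ghostscript has to rasterize.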
A much longer but interesting way would be to analyze exactly what is in those PDFs.
I had to do something similar with the PDF output of an EKG workflow. The original data were unavailable, we only had the PDF, but I discovered that the PDF was vector based and not raster. After a little hacking it was very easy to decode the labels, the legend and the single elementary lines making up the EKG diagram, and I came up with an option to recolor the tracks starting from what appeared a grayscale image. It did take several days, though.
It is possible that your PDF is generated in a similar way, and the data could be decoded (at first I had to use pdftk to get me a non-compressed PDF, then I found a library that I could use - it implemented the Deflate algorithm). It would be really cool to have output in SVG format :-)

Apple's Automator: compression settings for jpg?

When I run Apple's Automator to simply reduce a bunch of images in size, Automator also reduces the quality of the files (jpg) and they get blurry.
How can I prevent this? Are there settings that I can take control of?
Edit:
Or are there any other tools that do the same job but without affecting the image quality?
If you want to have finer control over the amount of JPEG compression, as kopischke said you'll have to use the sips utility, which can be used in a shell script. Here's how you would do that in Automator:
First get the files and the compression setting:
The Ask for Text action should not accept any input (right-click on it, select "Ignore Input").
Make sure that the first Get Value of Variable action is not accepting any input (right-click on it, select "Ignore Input"), and that the second Get Value of Variable takes the input from the first. This creates an array that is then passed on to the shell script. The first item in the array is the compression level that was given to the Automator Script. The second is the list of files that the script will run the sips command on.
In the options on the top of the Run Shell Script action, select "/bin/bash" as the Shell and select "as arguments" for Pass Input. Then paste this code:
itemNumber=0
compressionLevel=0
# The first argument is the compression level; the rest are image files.
for file in "$@"
do
    if [ "$itemNumber" = "0" ]; then
        compressionLevel=$file
    else
        echo "Processing $file"
        filename="$file"
        sips -s format jpeg -s formatOptions "$compressionLevel" "$file" --out "${filename%.*}.jpg"
    fi
    ((itemNumber=itemNumber+1))
done
# Do not count the compression level as a converted file.
((itemNumber=itemNumber-1))
osascript -e "tell app \"Automator\" to display dialog \"${itemNumber} Files Converted\" buttons {\"OK\"}"
If you click on Results at the bottom, it'll tell you what file it's currently working on. Have fun compressing!
Automator’s “Crop Images” and “Scale Images” actions have no quality settings; as is often the case with Automator, simplicity trumps configurability. However, there are two other ways to access CoreImage’s image manipulation facilities without resorting to Cocoa programming:
the Scriptable Image Processing System, which makes image processing functions available to the shell via the sips utility. You can fiddle with the most minute settings using this, but as it is a bit arcane in handling, you might be better served with the second way;
AppleScript via Image Events, a scriptable faceless background application provided by OS X. There are crop and scale commands, and the option of specifying a compression level when saving as a JPEG with
save <image> as JPEG with compression level (low|medium|high)
Use a “Run AppleScript” action instead of your “Crop” / “Scale” one and wrap the Image Events commands in a tell application "Image Events" block, and you should be set. For instance, to scale the image to half its size and save as a JPEG in best quality, overwriting the original:
on run {input, parameters}
    set output to {}
    repeat with aPath in input
        tell application "Image Events"
            set aPicture to open aPath
            try
                scale aPicture by factor 0.5
                set end of output to save aPicture as JPEG with compression level low
            on error errorMessage
                log errorMessage
            end try
            close aPicture
        end tell
    end repeat
    return output -- next action processes edited files.
end run
For other scales, adjust the factor accordingly (1 = 100 %, .5 = 50 %, .25 = 25 % etc.); for a crop, replace "scale aPicture by factor X" with "crop aPicture to {width, height}". Mac OS X Automation has good tutorials on the usage of both scale and crop.
Eric's code is just brilliant and gets most jobs done.
But if an image's filename contains a space, this workflow will not work (the space breaks the shell script when it runs sips).
There is a simple solution for this: add "Rename Finder Item" to the workflow and
replace spaces with "_" or anything you like.
Then it's good to go.
Comment from '20
I changed the script into a quick action, without any prompts (for compression as well as confirmation). It duplicates the file and renames the original version to _original. I also included nyam's solution for the 'space' problem.
You can download the workflow file here: http://mobilejournalism.blog/files/Compress%2080%20percent.workflow.zip (file is zipped, because otherwise it will be recognized as a folder instead of workflow file)
Hopefully this is useful for anyone searching for a solution like this (like I did an hour ago).
Comment from '17
To avoid "space" problem, it's smarter to change IFS than renaming.
Back up current IFS and change it to \n only. And restore original IFS after the processing loop.
ORG_IFS=$IFS
IFS=$'\n'
for file in $@
do
...
done
IFS=$ORG_IFS
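The space problem comes from the shell splitting words, so another way around it is to hand each filename to sips as its own argument. A sketch in Python, where no word splitting happens at all: the names sips_command and compress_all are made up, and the sips flags are the ones used in the shell script above.

```python
import subprocess
from pathlib import Path
from typing import Iterable, List


def sips_command(image: str, quality: str) -> List[str]:
    """Build the sips call that re-encodes `image` as JPEG next to the original."""
    out = str(Path(image).with_suffix(".jpg"))
    return ["sips", "-s", "format", "jpeg", "-s", "formatOptions", quality,
            image, "--out", out]


def compress_all(quality: str, images: Iterable[str]) -> None:
    """Run sips once per image; each path is passed as a single argument."""
    for image in images:
        subprocess.run(sips_command(image, quality), check=True)
```

Because the command is a list and not a string, a name like "My Holiday Photo.png" reaches sips intact without renaming anything or touching IFS.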

pipeline image compression

I have a custom made web server running that I use for scanning documents. To activate the scanner and load the image on screen, I have a scan button that links to a page with the following image tag:
<img src="http://myserver/archive/location/name.jpg?scan" />
When the server receives the request for a ?scan file it streams the output of the following command, and writes it to disk on the requested location.
scanimage --resolution 150 --mode Color | convert - jpg:-
This works well and I am happy with this simple setup. The problem is that convert (ImageMagick) buffers the output of scanimage, and spits out the jpeg image only when the scan is complete. The result of this is that the webpage is loading for a long time with the risk of timeouts. It also keeps me from seeing the image as it is scanned, which should otherwise be possible because it is exactly how baseline encoded jpeg images show up on slow connections.
My question is: is it possible to do jpeg encoding without buffering the image, or is the operation inherently global? If it is possible, what tools could I use? One thought I had is separately encoding strips of eight lines, but I do not know how to put these chunks together. If it is not possible, is there another compression format that does allow this sort of pipeline encoding? My only restriction is that the format should be supported by the mainstream browsers.
Thanks!
You want to subdivide the image with a space-filling curve (SFC). An SFC recursively subdivides the surface into smaller tiles and, because of its fractal dimension, reduces the 2D complexity to a 1D complexity. Once you have subdivided the image, you can use this curve to scan the image continuously. Or you can use a BFS and some sort of low-frequency-detail image filter to continuously scan higher resolutions of your image. You may want to look for Nick's spatial index Hilbert curve quadtree blog, but I don't think you can put the tiles together in JPEG format (with cat?). Or you could continuously reduce the resolution?
scanimage --resolution [1-150] --mode Color | convert - jpg:-
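On the last part of the question (another browser-supported format that allows pipelined encoding), non-interlaced PNG is one candidate: its pixel data is a single zlib stream that may be split over many IDAT chunks, so rows can be compressed and sent as they arrive. Below is a minimal pure-Python sketch under some loud assumptions. The width and height must be known up front, the rows arrive as raw 8-bit RGB bytes, and stream_png is a hypothetical name, not part of any library. It is lossless, so files will be larger than JPEG.

```python
import struct
import zlib
from typing import Iterable, Iterator

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Frame one PNG chunk: length, type, data, CRC over type and data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def stream_png(width: int, height: int, rows: Iterable[bytes]) -> Iterator[bytes]:
    """Yield PNG bytes incrementally for rows of `width * 3` RGB bytes each."""
    yield PNG_SIGNATURE
    # Bit depth 8, colour type 2 (RGB), default compression and filter, no interlace.
    yield _chunk(b"IHDR", struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0))
    compressor = zlib.compressobj()
    for row in rows:
        # Every scanline is prefixed with filter type 0 (no filtering).
        data = compressor.compress(b"\x00" + row)
        # A sync flush pushes out everything compressed so far.
        data += compressor.flush(zlib.Z_SYNC_FLUSH)
        yield _chunk(b"IDAT", data)
    yield _chunk(b"IDAT", compressor.flush())  # finish the zlib stream
    yield _chunk(b"IEND", b"")
```

Flushing after every row costs some compression; flushing once per strip of several rows would be a reasonable compromise. A server could write each yielded piece to the HTTP response as soon as the scanner delivers the corresponding rows.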
