I have a custom made web server running that I use for scanning documents. To activate the scanner and load the image on screen, I have a scan button that links to a page with the following image tag:
<img src="http://myserver/archive/location/name.jpg?scan" />
When the server receives the request for a ?scan file it streams the output of the following command, and writes it to disk on the requested location.
scanimage --resolution 150 --mode Color | convert - jpg:-
This works well and I am happy with this simple setup. The problem is that convert (ImageMagick) buffers the output of scanimage, and spits out the jpeg image only when the scan is complete. The result of this is that the webpage is loading for a long time with the risk of timeouts. It also keeps me from seeing the image as it is scanned, which should otherwise be possible because it is exactly how baseline encoded jpeg images show up on slow connections.
My question is: is it possible to do jpeg encoding without buffering the image, or is the operation inherently global? If it is possible, what tools could I use? One thought I had is separately encoding strips of eight lines, but I do not know how to put these chunks together. If it is not possible, is there another compression format that does allow this sort of pipeline encoding? My only restriction is that the format should be supported by the mainstream browsers.
Thanks!
You want to subdivide the image with a space-filling-curve. A sfc recursivley subivide the surface in smaller tiles and because of it's fractal dimension reduce the 2d complexity to a 1d complexity. When you have subdivide the image you can you use this curve to continously scan the image. Or you can use a BFS and some sort of an image-low-frequency-detail filter to continuously scan higher resolution of your image. You want to look for Nick's spatial index hilbert curve quadtree blog but I don't think you can put the tiles together with a jpg format (cat?). Or you can continously reduce the resolution?
scanimage --resolution [1-150] --mode Color | convert - jpg:-
Related
I am working on a project where my task is to identify machine part by its part number written on label attached to it or engraved on its surface. One such example of label and engraved part is shown in below figures.
My task is to recognise 9 or 10 alphanumerical number (03C 997 032 D in 1st image and 357 955 531 in 2nd image). This seems to be easy task however I am facing problem in distinguishing between useful information in the image and rest of the part i.e. there are many other numbers and characters in both image and I want to focus on only mentioned numbers. I tried many things but no success as of now. Does anyone know the image pre processing methods or any ML/DL model which I should apply to get desired result?
Thanks in advance!
JD
You can use OCR to the get all characters from the image and then use regular expressions to extract the desired patterns.
You can use OCR method, like Tesseract.
Maybe, you want to clean the images before running the text-recognition system, by performing some filtering to remove noise / remove extra information, such as:
Convert to gray scale (colors are not relevant, aren't them?)
Crop to region of interest
Canny Filter
A good start can be one of this tutorial:
OpenCV OCR with Tesseract (Python API)
Recognizing text/number with OpenCV (C++ API)
My goal is to find the title screen from a movie trailer. I need a service where I can search a video for a string, then return the frame with that string. Pretty obscure, does anything like this exist?
e.g. for this movie, I'd scan for "Sausage Party" and retrieve this frame:
Edit: I found the cloudsight api which would actually work except cost is prohibitive # $.04 per call assuming I need to split the video into 1s intervals and scan every image (at least 60 calls per video).
No exact service that I can find, but you could attempt to do this yourself...
ffmpeg -i sausage_party.mp4 -r 1 %04d.png
/usr/local/bin/parallel --no-notice -j 8 \
/usr/local/bin/tesseract -psm 6 -l eng {} {.} \
::: *.png
This extracts one frame a second from the video file, and then uses tesseract to extract the text via OCR into files of the same name as the image frame (eg. 0135.txt. However your results are going to vary massively depending on the font used and the quality of the video file.
You'd probably find it cheaper/easier to use something like Amazon Mechanical Turk , especially since the OCR is going to have a hard time doing this automatically.
Another option could be implementing this service by yourself using the Scene Text Detection and Recognition module in OpenCV (docs.opencv.org/3.0-beta/modules/text/doc/text.html). You can take a look at this video to get an idea of how such a system would operate. As pointed out above the accuracy would depend on the font used in the movie titles, the quality of the video files, and the OCR.
OpenCV relies on Tesseract as the underlying OCR but, alternatively, you could use the text detection and localization functions (docs.opencv.org/3.0-beta/modules/text/doc/erfilter.html) in OpenCV to find text areas in the image and then employ a different OCR to perform the recognition. The text detection and localization stage can be done very quickly thus achieving real time performance would be mostly a matter of picking a fast OCR.
I know this has been posted elsewhere and that this is no means a difficult problem but I'm very new to writing macros in FIJI and am having a hard time even understanding the solutions described in various online resources.
I have a series of images all in the same folder and want to apply the same operations to them all and save the resultant excel files and images in an output folder. Specifically, I'd like to open, smooth the image, do a Max intensity Z projection, then threshold the images to the same relative value.
This thresholding is one step causing a problem. By relative value I mean that I would like to set the threshold so that the same % of the intensity histogram is included. Currently, in FIJI if you go to image>adjust>threshold you can move the sliders such that a certain percentage of the image is thresholded and it will display that value for you in the open window. In my case 98% is what I am trying to achieve, eg thresholding all but the top 2% of the data.
Once the threshold is applied to the MIP, I convert it to binary and do particle analysis and save the results (summary table, results, image overlay.
My approach has been to try and automate all the steps/ do batch processing but I have been having a hard time adapting what I have written to work based on instructions found online. Instead I've been just opening every image in the directory one by one and applying the macro that I wrote, then saving the results manually. Obviously this is a tedious approach so any help would be much appreciated!
What I have been using for my simple macro:
run("Smooth", "stack");
run("Z Project...", "projection=[Max Intensity]");
setAutoThreshold("Default");
//run("Threshold...");
run("Convert to Mask");
run("Make Binary");
run("Analyze Particles...", " show=[Overlay Masks] display exclude clear include summarize in_situ");
You can use the Process ▶ Batch ▶ Macro... command for this.
For further details, see the Batch Processing page of the ImageJ wiki.
Apologies for tagging this just ImageJ - it's a problem regarding MicroManager, a microscopy plugin for it and I thought this would be best.
I'd recently taken images for an important experiment using MicroManager (a recent version, though I cannot recall the exact number). The IT services at my institution have recently been having some networking problems and my saved preferences for the software had been erased. I'd got half way through my experiment when I realised that I'd saved my images as separate image files (three greyscale TIFFs plus metadata text files) instead of OME-TIFF iamge stacks.
All of my ImageJ macros for image processing rely on having a multiple channel image stack, so this is a bit of a problem. Is there any easy way in MicroManager (or ImageJ) to bulk convert these single channel greyscale images into the OME-TIFF image stack after the images have already been taken?
Cheers.
You can start with a macro like this one:
// Convert your images to a stack
run("Images to Stack", "name=Stack title=[] use");
// The stack will default the images to time points. Convert to channels
run("Stack to Hyperstack...", "order=xyczt(default) channels=3 slices=1 frames=1 display=Color");
// Export as OME-TIFF
run("Bio-Formats Exporter");
This is designed to reconstruct one dataset at a time (open 3 images, run the macro and export the OME-TIFF).
If you don't want any dialogs to show you can pass an output directory to the Bio-Formats exporter:
run("Bio-Formats Exporter", "save=/path/to/image.ome.tif export compression=Uncompressed");
For the output file name you can get the original image name in the macro with getTitle()
There is also a template example on iterating over all the files in a directory, if you want to completely automate the macro. However this may take some tweaking since you want to operate on your images 3 at a time.
Hope that helps!
Is there a program or script that can read an image on standard input and write a resized image to standard output without waiting for EOF on standard input? Poor quality is acceptable; waiting for the whole image to load is not.
ImageMagick (convert and stream alike) will read, then process, then output. What I want is more like a real-time stream processor: if I'm scaling down 50%, it should output one row of thumbnail for every two rows of input (roughly), regardless of the state of the input stream.
If this doesn't make sense yet, imagine you're loading an image over a slow network connection. As soon as it can, the browser starts displaying the top edge of the image. If the image is larger than the window, the browser scales it down to fit the window. It doesn't have to wait for the whole image to load.
Here are some of the tools I've used for testing. This serves an image on port 8080 in ten slices, with a one-second delay between slices to simulate a slow network connection:
IMAGE=test.jpg; SLICES=10; SIZE=$(stat -c "%s" $IMAGE); BS=$(($SIZE / $SLICES + 1)); (echo HTTP/1.0 200 OK; echo Content-Type: image/jpeg; echo; for i in $(seq 0 $(($SLICES - 1))); do dd if=$IMAGE bs=$BS skip=$i count=1; sleep 1; done) | nc -lp8080 -q0
Run that and immediately open localhost:8080 in your browser to see the image slowly load. If you pipe the image slices to convert or stream instead of nc (omitting all the echoes), no output appears for ten seconds, and then you get the whole thumbnail at once.
This is difficult depending on the image format. PNG, for example, is chunked, and each chunk zlib-compressed, so you have to read in a potentially large portion of the file before you can start rendering the image. BMP images are stored "bottom-up", where it renders from the lower right to the upper left, so unless your thumbnail will also be a BMP, you will have either read in the entire image or process the file backwards. JPEG can do this more readily; it's stored in order, and if it's a progressive JPEG you can abuse that and read in only the first N passes to get the thumbnail resolution you need. Wavelet formats like DJVU might also be more straightforward.
I don't think you'll find general-purpose tools that do this, but you could write a custom format-specific streaming decoder to handle it.