OpenCV read image from csv file - opencv

I have image in csv file and i want to load it in my program. I found that I can load image from cvs like this:
CvMLData mlData;
mlData.read_csv(argv[1]);
const CvMat* tmp = mlData.get_values();
cv::Mat img(tmp, true),img1;
img.convertTo(img, CV_8UC3);
cv::namedWindow("img");
cv::imshow("img", img);
I have RGB picture in that file but I got grey picture... Can somebody explain me how to load color image or how can I modify this code to get color image?
Thanks!

Updated
Ok, I don't know how to read your file into OpenCV for the moment, but I can offer you a work-around to get you started. The following will create a header for a PNM format file to match your CSV file and then append your data onto the end and you should end up with a file that you can load.
printf "P3\n284 177\n255\n" > a.pnm # Create PNM header
tr -d ',][' < izlaz.csv >> a.pnm # Append CSV data, after removing commas and []
If I do the above, I can see your bench, tree and river.
If you cannot read that PNM file directly into OpenCV, you can make it into a JPEG with ImageMagick like this:
convert a.pnm a.jpg
I also had a look at the University of Wisconsin ML data archive, that is read with those OpenCV functions that you are using, and the format of their data is different from yours... theirs is like this:
1000025,5,1,1,1,2,1,3,1,1,2
1002945,5,4,4,5,7,10,3,2,1,2
1015425,3,1,1,1,2,2,3,1,1,2
1016277,6,8,8,1,3,4,3,7,1,2
yours looks like this:
[201, 191, 157, 201 ... ]
So maybe this tr command is enough to convert your data:
tr -d '][' < izlaz.csv > TryMe.csv
Original Answer
If you run the following on your CSV file, it translates commas into newlines and then counts the lines:
tr "," "\n" < izlaz.csv | wc -l
And that gives 150,804 lines, which means 150,804 commas in your file and therefore 150,804 integers in your file (+/- 1 or 2). If your greyscale image is 177 rows by 852 columns, you are going to need 150,804 RGB triplets (i.e. 450,000 +/- integers) to represent a colour image, as it is you only have a single greyscale value for each pixel.
The fault is in the way you write the file, not the way you read it.

To see color image I must set number of channels. So this code works for me:
CvMLData mlData;
mlData.read_csv(argv[1]);
const CvMat* tmp = mlData.get_values();
cv::Mat img(tmp, true),img1;
img.convertTo(img, CV_8UC3);
img= img.reshape(3); //set number of channels

Related

PySimpleGUI/tk: drawing OpenCV image into sg.Graph efficiently

An image captured from a camera is stored into a numpy ndarray object with a shape of (1224,1024,3).
This format is very convenient for using OpenCV methods over it.
I was looking for the way to draw it into (or onto) an sg.Graph element of PySimpleGUI.
The method I have found worked, but was very inefficient:
def draw_img(self, img):
# turn the image into a PIL image object:
pil_im = Image.fromarray(img)
# use PIL to convert the image into an in-memory PNG file
with BytesIO() as output:
pil_im.save(output, format="PNG")
png = output.getvalue()
# remove any previous elements from the canvas of our sg.Graph:
self.image_element.erase()
# add an image into the sg.Graph element
self.image_element.draw_image(data=png, location=(0, self.img_sz[1]))
The reason for being inefficient is clearly because we are encoding the raw image into PNG.
However I could not find any better way to do this! In my case, I had to show every frame coming from the camera, and it was way too slow.
So what is a better way to do it?
I could not find any 'by the book' solution, but I have found a working way.
The idea came from a discussion in this page. With it, I had to copy the original code of draw_image in PySimpleGUI and modify it.
def draw_img(self, img):
# turn our ndarray into a bytesarray of PPM image by adding a simple header:
# this header is good for RGB. for monochrome, use P5 (look for PPM docs)
ppm = ('P6 %d %d 255 ' % (self.img_sz[0], self.img_sz[1])).encode('ascii') + img.tobytes()
# turn that bytesarray into a PhotoImage object:
image = tk.PhotoImage(width=self.img_sz[0], height=self.img_sz[1], data=ppm, format='PPM')
# for first time, create and attach an image object into the canvas of our sg.Graph:
if self.img_id is None:
self.img_id = self.image_element.Widget.create_image((0, 0), image=image, anchor=tk.NW)
# we must mimic the way sg.Graph keeps a track of its added objects:
self.image_element.Images[self.img_id] = image
else:
# we reuse the image object, only changing its content
self.image_element.Widget.itemconfig(self.img_id, image=image)
# we must update this reference too:
self.image_element.Images[self.img_id] = image
Using this method, I could achieve a great performance, so I decided to share this solution with the community. Hoping this will help anybody!
Thank you for helping me. I modified your plan.
my code:
def image_source(path,width=128,height=128):
img = Image.open(path).convert('RGB')#PIL image
image = img.resize((width, height))
ppm = ('P6 %d %d 255 ' % (width, height)).encode('ascii') + image.tobytes()
return ppm
image1=image_source(r'Z:\xxxx6.png')
layout = [.... sg.Image(data=image1) ...

Python parallelization for code to combine multiple images

I am new to Python and am trying to parallelize a program that I somehow pieced together from the internet. The program reads all image files (usually multiple series of images such as abc001,abc002...abc015 and xyz001,xyz002....xyz015) in a specific folder and then combines images in a specified range. Most times, the number of files exceeds 10000, and my latest case requires me to combine 24000 images. Could someone help me with:
Taking 2 sets of images from different directories. Currently I have to move these images into 1 directory and then work in said directory.
Reading only specified files. Currently my program reads all files, saves names in an array (I think it's an array. Could be a directory also) and then uses only the images required to combine. If I specify a range of files, it still checks against all files in the directory and takes a lot of time.
Parallel Processing - I work with usually 10k files or sometimes more. These are images saved from the fluid simulations that I run at specific times. Currently, I save about 2k files at a time in separate folders and run the program to combine these 2000 files at one time. And then I copy all the output files to a separate folder to keep them together. It would be great if I could use all 16 cores on the processor to combine all files in 1 go.
Image series 1 is like so.
Consider it to be a series of photos of the cat walking towards the camera. Each frame is is suffixed with 001,002,...,n.
Image series 1 is like so.
Consider it to be a series of photos of the cat's expression changing with each frame. Each frame is is suffixed with 001,002,...,n.
The code currently combines each frame from set1 and set2 to provide output.png as shown in the link here.
import sys
import os
from PIL import Image
keywords=input('Enter initial characters of image series 1 [Ex:Scalar_ , VoF_Scene_]:\n')
keywords2=input('Enter initial characters of image series 2 [Ex:Scalar_ , VoF_Scene_]:\n')
directory = input('Enter correct folder name where images are present :\n') # FOLDER WHERE IMAGES ARE LOCATED
result1 = {}
result2={}
name_count1=0
name_count2=0
for filename in os.listdir(directory):
if keywords in filename:
name_count1 +=1
result1[name_count1] = os.path.join(directory, filename)
if keywords2 in filename:
name_count2 +=1
result2[name_count2] = os.path.join(directory, filename)
num1=input('Enter initial number of series:\n')
num2=input('Enter final number of series:\n')
num1=int(num1)
num2=int(num2)
if name_count1==(num2-num1+1):
a1=1
a2=name_count1
elif name_count2==(num2-num1+1):
a1=1
a2=name_count2
else:
a1=num1
a2=num2+1
for x in range(a1,a2):
y=format(x,'05') # '05' signifies number of digits in the series of file name Ex: [Scalar_scene_1_00345.png --> 5 digits], [Temperature_section_2_951.jpg --> 3 digits]. Change accordingly
y=str(y)
for comparison_name1 in result1:
for comparison_name2 in result2:
test1=result1[comparison_name1]
test2=result2[comparison_name2]
if y in test1 and y in test2:
a=test1
b=test2
test=[a,b]
images = [Image.open(x) for x in test]
widths, heights = zip(*(i.size for i in images))
total_width = sum(widths)
max_height = max(heights)
new_im = Image.new('RGB', (total_width, max_height))
x_offset = 0
for im in images:
new_im.paste(im, (x_offset,0))
x_offset += im.size[0]
output_name='output'+y+'.png'
new_im.save(os.path.join(directory, output_name))
I did a Python version as well, it's not quite as fast but it is maybe closer to your heart :-)
#!/usr/bin/env python3
import cv2
import numpy as np
from multiprocessing import Pool
def doOne(params):
"""Append the two input images side-by-side to output the third."""
imA = cv2.imread(params[0], cv2.IMREAD_UNCHANGED)
imB = cv2.imread(params[1], cv2.IMREAD_UNCHANGED)
res = np.hstack((imA, imB))
cv2.imwrite(params[2], res)
if __name__ == '__main__':
# Build the list of jobs - each entry is a tuple with 2 input filenames and an output filename
jobList = []
for i in range(1000):
# Horizontally append a-XXXXX.png to b-XXXXX.png to make c-XXXXX.png
jobList.append( (f'a-{i:05d}.png', f'b-{i:05d}.png', f'c-{i:05d}.png') )
# Make a pool of processes - 1 per CPU core
with Pool() as pool:
# Map the list of jobs to the pool of processes
pool.map(doOne, jobList)
You can do this a little quicker with libvips. To join two images left-right, enter:
vips join left.png out.png result.png horizontal
To test, I made 200 pairs of 1200x800 PNGs like this:
for i in {1..200}; do cp x.png left$i.png; cp x.png right$i.png; done
Then tried a benchmark:
time parallel vips join left{}.png right{}.png result{}.png horizontal ::: {1..200}
real 0m42.662s
user 2m35.983s
sys 0m6.446s
With imagemagick on the same laptop I see:
time parallel convert left{}.png right{}.png +append result{}.png ::: {1..200}
real 0m55.088s
user 3m24.556s
sys 0m6.400s
You can do that much faster without Python, and using multi-processing with ImageMagick or libvips.
The first part is all setup:
Make 20 images, called a-000.png ... a-019.png that go from red to blue:
convert -size 64x64 xc:red xc:blue -morph 18 a-%03d.png
Make 20 images, called b-000.png ... b-019.png that go from yellow to magenta:
convert -size 64x64 xc:yellow xc:magenta -morph 18 b-%03d.png
Now append them side-by-side into c-000.png ... c-019.png
for ((f=0;f<20;f++))
do
z=$(printf "%03d" $f)
convert a-${z}.png b-${z}.png +append c-${z}.png
done
Those images look like this:
If that looks good, you can do them all in parallel with GNU Parallel:
parallel convert a-{}.png b-{}.png +append c-{}.png ::: {1..19}
Benchmark
I did a quick benchmark and made 20,000 images a-00000.png...a-019999.png and another 20,000 images b-00000.png...b-019999.png with each image 1200x800 pixels. Then I ran the following command to append each pair horizontally and write 20,000 output images c-00000.png...c-019999.png:
seq -f "%05g" 0 19999 | parallel --eta convert a-{}.png b-{}.png +append c-{}.png
and that takes 16 minutes on my MacBook Pro with all 12 CPU cores pegged at 100% throughout. Note that you can:
add spacers between the images,
write annotation onto the images,
add borders,
resize
if you wish and do lots of other processing - this is just a simple example.
Note also that you can get even quicker times - in the region of 10-12 minutes if you accept JPEG instead of PNG as the output format.

How do I create a dataset with multiple images the same format as CIFAR10?

I have images 1750*1750 and I would like to label them and put them into a file in the same format as CIFAR10. I have seen a similar answer before that gave an answer:
label = [3]
im = Image.open(img)
im = (np.array(im))
print(im)
r = im[:,:,0].flatten()
g = im[:,:,1].flatten()
b = im[:,:,2].flatten()
array = np.array(list(label) + list(r) + list(g) + list(b), np.uint8)
array.tofile("info.bin")
but it doesn't include how to add multiple images in a single file. I have looked at CIFAR10 and tried to append the arrays in the same way, but all I got was the following error:
E tensorflow/core/client/tensor_c_api.cc:485] Read less bytes than requested
Note that I am using Tensorflow to do my computations, and I have been able to isolate the problem from the data.
The CIFAR-10 binary format represents each example as a fixed-length record with the following format:
1-byte label.
1 byte per pixel for the red channel of the image.
1 byte per pixel for the green channel of the image.
1 byte per pixel for the blue channel of the image.
Assuming you have a list of image filenames called images, and a list of integers (less than 256) called labels corresponding to their labels, the following code would write a single file containing these images in CIFAR-10 format:
with open(output_filename, "wb") as f:
for label, img in zip(labels, images):
label = np.array(label, dtype=np.uint8)
f.write(label.tostring()) # Write label.
im = np.array(Image.open(img), dtype=np.uint8)
f.write(im[:, :, 0].tostring()) # Write red channel.
f.write(im[:, :, 1].tostring()) # Write green channel.
f.write(im[:, :, 2].tostring()) # Write blue channel.

Matching image to images collection

I have large collecton of card images, and one photo of particular card. What tools can I use to find which image of collection is most similar to mine?
Here's collection sample:
Abundance
Aggressive Urge
Demystify
Here's what I'm trying to find:
Card Photo
New method!
It seems that the following ImageMagick command, or maybe a variation of it, depending on looking at a greater selection of your images, will extract the wording at the top of your cards
convert aggressiveurge.jpg -crop 80%x10%+10%+10% crop.png
which takes the top 10% of your image and 80% of the width (starting at 10% in from the top left corner and stores it in crop.png as follows:
And if your run that through tessseract OCR as follows:
tesseract crop.png agg
you get a file called agg.txt containing:
E‘ Aggressive Urge \L® E
which you can run through grep to clean up, looking only for upper and lower case letters adjacent to each other:
grep -Eo "\<[A-Za-z]+\>" agg.txt
to get
Aggressive Urge
:-)
Thank you for posting some photos.
I have coded an algorithm called Perceptual Hashing which I found by Dr Neal Krawetz. On comparing your images with the Card, I get the following percentage measures of similarity:
Card vs. Abundance 79%
Card vs. Aggressive 83%
Card vs. Demystify 85%
so, it is not an ideal discriminator for your image type, but kind of works somewhat. You may wish to play around with it to tailor it for your use case.
I would calculate a hash for each of the images in your collection, one at a time and store the hash for each image just once. Then, when you get a new card, calculate its hash and compare it to the stored ones.
#!/bin/bash
################################################################################
# Similarity
# Mark Setchell
#
# Calculate percentage similarity of two images using Perceptual Hashing
# See article by Dr Neal Krawetz entitled "Looks Like It" - www.hackerfactor.com
#
# Method:
# 1) Resize image to black and white 8x8 pixel square regardless
# 2) Calculate mean brightness of those 64 pixels
# 3) For each pixel, store "1" if pixel>mean else store "0" if less than mean
# 4) Convert resulting 64bit string of 1's and 0's, 16 hex digit "Perceptual Hash"
#
# If finding difference between Perceptual Hashes, simply total up number of bits
# that differ between the two strings - this is the Hamming distance.
#
# Requires ImageMagick - www.imagemagick.org
#
# Usage:
#
# Similarity image|imageHash [image|imageHash]
# If you pass one image filename, it will tell you the Perceptual hash as a 16
# character hex string that you may want to store in an alternate stream or as
# an attribute or tag in filesystems that support such things. Do this in order
# to just calculate the hash once for each image.
#
# If you pass in two images, or two hashes, or an image and a hash, it will try
# to compare them and give a percentage similarity between them.
################################################################################
function PerceptualHash(){
TEMP="tmp$$.png"
# Force image to 8x8 pixels and greyscale
convert "$1" -colorspace gray -quality 80 -resize 8x8! PNG8:"$TEMP"
# Calculate mean brightness and correct to range 0..255
MEAN=$(convert "$TEMP" -format "%[fx:int(mean*255)]" info:)
# Now extract all 64 pixels and build string containing "1" where pixel > mean else "0"
hash=""
for i in {0..7}; do
for j in {0..7}; do
pixel=$(convert "${TEMP}"[1x1+${i}+${j}] -colorspace gray text: | grep -Eo "\(\d+," | tr -d '(,' )
bit="0"
[ $pixel -gt $MEAN ] && bit="1"
hash="$hash$bit"
done
done
hex=$(echo "obase=16;ibase=2;$hash" | bc)
printf "%016s\n" $hex
#rm "$TEMP" > /dev/null 2>&1
}
function HammingDistance(){
# Convert input hex strings to upper case like bc requires
STR1=$(tr '[a-z]' '[A-Z]' <<< $1)
STR2=$(tr '[a-z]' '[A-Z]' <<< $2)
# Convert hex to binary and zero left pad to 64 binary digits
STR1=$(printf "%064s" $(echo "obase=2;ibase=16;$STR1" | bc))
STR2=$(printf "%064s" $(echo "obase=2;ibase=16;$STR2" | bc))
# Calculate Hamming distance between two strings, each differing bit adds 1
hamming=0
for i in {0..63};do
a=${STR1:i:1}
b=${STR2:i:1}
[ $a != $b ] && ((hamming++))
done
# Hamming distance is in range 0..64 and small means more similar
# We want percentage similarity, so we do a little maths
similarity=$((100-(hamming*100/64)))
echo $similarity
}
function Usage(){
echo "Usage: Similarity image|imageHash [image|imageHash]" >&2
exit 1
}
################################################################################
# Main
################################################################################
if [ $# -eq 1 ]; then
# Expecting a single image file for which to generate hash
if [ ! -f "$1" ]; then
echo "ERROR: File $1 does not exist" >&2
exit 1
fi
PerceptualHash "$1"
exit 0
fi
if [ $# -eq 2 ]; then
# Expecting 2 things, i.e. 2 image files, 2 hashes or one of each
if [ -f "$1" ]; then
hash1=$(PerceptualHash "$1")
else
hash1=$1
fi
if [ -f "$2" ]; then
hash2=$(PerceptualHash "$2")
else
hash2=$2
fi
HammingDistance $hash1 $hash2
exit 0
fi
Usage
I also tried a normalised cross-correlation of each of your images with the card, like this:
#!/bin/bash
size="300x400!"
convert card.png -colorspace RGB -normalize -resize $size card.jpg
for i in *.jpg
do
cc=$(convert $i -colorspace RGB -normalize -resize $size JPG:- | \
compare - card.jpg -metric NCC null: 2>&1)
echo "$cc:$i"
done | sort -n
and I got this output (sorted by match quality):
0.453999:abundance.jpg
0.550696:aggressive.jpg
0.629794:demystify.jpg
which shows that the card correlates best with demystify.jpg.
Note that I resized all images to the same size and normalized their contrast so that they could be readily compared and effects resulting from differences in contrast are minimised. Making them smaller also reduces the time needed for the correlation.
I tried this by arranging the image data as a vector and taking the inner-product between the collection image vectors and the searched image vector. The vectors that are most similar will give the highest inner-product. I resize all the images to the same size to get equal length vectors so I can take inner-product. This resizing will additionally reduce inner-product computational cost and give a coarse approximation of the actual image.
You can quickly check this with Matlab or Octave. Below is the Matlab/Octave script. I've added comments there. I tried varying the variable mult from 1 to 8 (you can try any integer value), and for all those cases, image Demystify gave the highest inner product with the card image. For mult = 8, I get the following ip vector in Matlab:
ip =
683007892
558305537
604013365
As you can see, it gives the highest inner-product of 683007892 for image Demystify.
% load images
imCardPhoto = imread('0.png');
imDemystify = imread('1.jpg');
imAggressiveUrge = imread('2.jpg');
imAbundance = imread('3.jpg');
% you can experiment with the size by varying mult
mult = 8;
size = [17 12]*mult;
% resize with nearest neighbor interpolation
smallCardPhoto = imresize(imCardPhoto, size);
smallDemystify = imresize(imDemystify, size);
smallAggressiveUrge = imresize(imAggressiveUrge, size);
smallAbundance = imresize(imAbundance, size);
% image collection: each image is vectorized. if we have n images, this
% will be a (size_rows*size_columns*channels) x n matrix
collection = [double(smallDemystify(:)) ...
double(smallAggressiveUrge(:)) ...
double(smallAbundance(:))];
% vectorize searched image. this will be a (size_rows*size_columns*channels) x 1
% vector
x = double(smallCardPhoto(:));
% take the inner product of x and each image vector in collection. this
% will result in a n x 1 vector. the higher the inner product is, more similar the
% image and searched image(that is x)
ip = collection' * x;
EDIT
I tried another approach, basically taking the euclidean distance (l2 norm) between reference images and the card image and it gave me very good results with a large collection of reference images (383 images) I found at this link for your test card image.
Here instead of taking the whole image, I extracted the upper part that contains the image and used it for comparison.
In the following steps, all training images and the test image are resized to a predefined size before doing any processing.
extract the image regions from training images
perform morphological closing on these images to get a coarse approximation (this step may not be necessary)
vectorize these images and store in a training set (I call it training set even though there's no training in this approach)
load the test card image, extract the image region-of-interest(ROI), apply closing, then vectorize
calculate the euclidean distance between each reference image vector and the test image vector
choose the minimum distance item (or the first k items)
I did this in C++ using OpenCV. I'm also including some test results using different scales.
#include <opencv2/opencv.hpp>
#include <iostream>
#include <algorithm>
#include <string.h>
#include <windows.h>
using namespace cv;
using namespace std;
#define INPUT_FOLDER_PATH string("Your test image folder path")
#define TRAIN_IMG_FOLDER_PATH string("Your training image folder path")
void search()
{
WIN32_FIND_DATA ffd;
HANDLE hFind = INVALID_HANDLE_VALUE;
vector<Mat> images;
vector<string> labelNames;
int label = 0;
double scale = .2; // you can experiment with scale
Size imgSize(200*scale, 285*scale); // training sample images are all 200 x 285 (width x height)
Mat kernel = getStructuringElement(MORPH_ELLIPSE, Size(3, 3));
// get all training samples in the directory
hFind = FindFirstFile((TRAIN_IMG_FOLDER_PATH + string("*")).c_str(), &ffd);
if (INVALID_HANDLE_VALUE == hFind)
{
cout << "INVALID_HANDLE_VALUE: " << GetLastError() << endl;
return;
}
do
{
if (!(ffd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY))
{
Mat im = imread(TRAIN_IMG_FOLDER_PATH+string(ffd.cFileName));
Mat re;
resize(im, re, imgSize, 0, 0); // resize the image
// extract only the upper part that contains the image
Mat roi = re(Rect(re.cols*.1, re.rows*35/285.0, re.cols*.8, re.rows*125/285.0));
// get a coarse approximation
morphologyEx(roi, roi, MORPH_CLOSE, kernel);
images.push_back(roi.reshape(1)); // vectorize the roi
labelNames.push_back(string(ffd.cFileName));
}
}
while (FindNextFile(hFind, &ffd) != 0);
// load the test image, apply the same preprocessing done for training images
Mat test = imread(INPUT_FOLDER_PATH+string("0.png"));
Mat re;
resize(test, re, imgSize, 0, 0);
Mat roi = re(Rect(re.cols*.1, re.rows*35/285.0, re.cols*.8, re.rows*125/285.0));
morphologyEx(roi, roi, MORPH_CLOSE, kernel);
Mat testre = roi.reshape(1);
struct imgnorm2_t
{
string name;
double norm2;
};
vector<imgnorm2_t> imgnorm;
for (size_t i = 0; i < images.size(); i++)
{
imgnorm2_t data = {labelNames[i],
norm(images[i], testre) /* take the l2-norm (euclidean distance) */};
imgnorm.push_back(data); // store data
}
// sort stored data based on euclidean-distance in the ascending order
sort(imgnorm.begin(), imgnorm.end(),
[] (imgnorm2_t& first, imgnorm2_t& second) { return (first.norm2 < second.norm2); });
for (size_t i = 0; i < imgnorm.size(); i++)
{
cout << imgnorm[i].name << " : " << imgnorm[i].norm2 << endl;
}
}
Results:
scale = 1.0;
demystify.jpg : 10989.6, sylvan_basilisk.jpg : 11990.7, scathe_zombies.jpg : 12307.6
scale = .8;
demystify.jpg : 8572.84, sylvan_basilisk.jpg : 9440.18, steel_golem.jpg : 9445.36
scale = .6;
demystify.jpg : 6226.6, steel_golem.jpg : 6887.96, sylvan_basilisk.jpg : 7013.05
scale = .4;
demystify.jpg : 4185.68, steel_golem.jpg : 4544.64, sylvan_basilisk.jpg : 4699.67
scale = .2;
demystify.jpg : 1903.05, steel_golem.jpg : 2154.64, sylvan_basilisk.jpg : 2277.42
If i understand you correctly you need to compare them as pictures. There is one very simple, but effective solution here - it's called Sikuli.
What tools can I use to find which image of collection is most similar to mine?
This tool is working very good with the image-processing and is not only capable to find if your card(image) is similar to what you have already defined as pattern, but also search partial image content (so called rectangles).
By default you can extend it's functionality via Python. Any ImageObject can be set to accept similarity_pattern in percentages and by doing so you'll be able to precisely find what you are looking for.
Also another big advantage of this tool is that you can learn basics in one day.
Hope this helps.

Convert ESRI projection coordinates to lat-lng

I have a large dataset of x,y coordinates in "NAD 1983 StatePlane Michigan South FIPS 2113 Feet" (aka ESRI 102690). I'd like to convert them to lat-lng points.
In theory, this is something proj is built to handle, but the documentation hasn't given me a clue -- it seems to describe much more complicated cases.
I've tried using a python interface, like so:
from pyproj import Proj
p = Proj(init='esri:102690')
sx = 13304147.06410000000 #sample points
sy = 288651.94040000000
x2, y2 = p(sx, sy, inverse=True)
But that gives wildly incorrect output.
There's a Javascript library, but I have ~50,000 points to handle, so that doesn't seem appropriate.
What worked for me:
I created a file called ptest with each pair on its own line, x and y coordinates separated by a space, like so:
13304147.06410000000 288651.94040000000
...
Then I fed that file into the command and piped the results to an output file:
$>cs2cs -f %.16f +proj=lcc +lat_1=42.1 +lat_2=43.66666666666666
+lat_0=41.5 +lon_0=-84.36666666666666 +x_0=4000000 +y_0=0 +ellps=GRS80
+datum=NAD83 +to_meter=0.3048006096012192 +no_defs +zone=20N +to
+proj=latlon ptest > out.txt
If you only need to reproject and can do some data-mining on your text files use whatever you like and use http://spatialreference.org/ref/esri/102690/ as reference.
For example use the Proj4 and store it in a shell/cmd file and call out your input file with proj4 (linux/windows version available) no problem with the size your dataset.
cs2cs +proj=latlong +datum=NAD83 +to +proj=utm +zone=10 +datum=NAD27 -r <<EOF
cs2cs -f %.16f +proj=utm +zone=20N +to +proj=latlon - | awk '{print $1 " " $2}
so in your case something like this:
cs2cs -f %.16f +proj=lcc +lat_1=42.1 +lat_2=43.66666666666666 +lat_0=41.5 +lon_0=-84.36666666666666 +x_0=4000000 +y_0=0 +ellps=GRS80 +datum=NAD83 +to_meter=0.3048006096012192 +no_defs +zone=20N +to +proj=latlon
http://trac.osgeo.org/proj/wiki/man_cs2cs
http://trac.osgeo.org/proj/
If you have coordinates in TXT, CSV or XLS file, you can do CTRL+C and insert them to http://cs2cs.mygeodata.eu where you can set appropriate input and desired output coordinate system. It is possible to insert thousands of coordinates in various formats...

Resources