I'd like to use DeepLearningKit for iOS and classify UIImage objects. The sample application only uses a float array loaded from a JSON file, so I have to create a bitmap representation of the UIImage as a float array and pass that to the classify method.
Can anybody help me with that? Is there a way to create a bitmap representation of a UIImage? I also have to swap the channels from RGB to BGR.
Thank you
I have added an extension to UIImage that allows setting and getting RGB(A) pixels directly. Key methods:
public func setPixelColorAtPoint(point:CGPoint, color: RawColorType) -> UIImage?
func getPixelColorAtLocation(point:CGPoint)->UIColor?
where RawColorType is defined as
public typealias RawColorType = (newRedColor:UInt8, newgreenColor:UInt8, newblueColor:UInt8, newalphaValue:UInt8)
This way you should be able to convert back and forth between the bitmap representation and UIImage. I wrote a blog post that gives some more context: http://deeplearningkit.org/tutorials-for-ios-os-x-and-tvos/tutorial-image-handling-in-deeplearningkit/
I've written a function to convert an image file to a Caffe blob on the iOS platform. You can find it here. I hope it helps you.
Code snippet:
// Convert the interleaved bitmap (height*width*channels) to the planar blob layout (channels*height*width)
// and drop the alpha channel
int input_channels = input_layer->channels();
LOG(INFO) << "image_channels:" << image_channels << " input_channels:" << input_channels;
if (input_channels == 3 && image_channels != 4) {
    LOG(ERROR) << "image_channels and input_channels do not match.";
    return false;
} else if (input_channels == 1 && image_channels != 1) {
    LOG(ERROR) << "image_channels and input_channels do not match.";
    return false;
}
float *input_data = input_layer->mutable_cpu_data();
for (size_t h = 0; h < height; h++) {
    for (size_t w = 0; w < width; w++) {
        for (size_t c = 0; c < input_channels; c++) {
            // OpenCV/Caffe use BGR instead of RGB, so swap the source channel index
            size_t cc = c;
            if (input_channels == 3) {
                cc = 2 - c;
            }
            // Convert uint8_t to float
            input_data[c*width*height + h*width + w] =
                static_cast<float>(result[h*width*image_channels + w*image_channels + cc]);
            if (mean.size() == input_channels) {
                input_data[c*width*height + h*width + w] -= mean[c];
            }
        }
    }
}
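For readers who just need the float array the original question asks about, here is a minimal standalone sketch of the same index arithmetic without the Caffe/OpenCV dependencies. It assumes a tightly packed RGBA source bitmap with no row padding; the function name and the mean handling are illustrative only:
#include <cstdint>
#include <vector>

// Sketch: convert an interleaved RGBA bitmap (height x width x 4, uint8_t)
// into a planar BGR float buffer (3 x height x width), optionally subtracting
// a per-channel mean given in BGR order.
std::vector<float> rgbaToPlanarBGR(const uint8_t* rgba, size_t width, size_t height,
                                   const std::vector<float>& mean)
{
    const size_t image_channels = 4;  // RGBA source
    const size_t input_channels = 3;  // BGR destination
    std::vector<float> planar(input_channels * width * height);
    for (size_t h = 0; h < height; h++)
        for (size_t w = 0; w < width; w++)
            for (size_t c = 0; c < input_channels; c++) {
                // map destination BGR channel c to the corresponding source RGBA channel (R and B swapped)
                size_t cc = 2 - c;
                float v = static_cast<float>(rgba[h*width*image_channels + w*image_channels + cc]);
                if (mean.size() == input_channels)
                    v -= mean[c];
                planar[c*width*height + h*width + w] = v;
            }
    return planar;
}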
I have a binary image and a color image of the same size. I need to iterate over each blob (block of white pixels) in the binary image, use it as a mask, and find the mean color of that blob region in the color image.
I have tried:
HierarchyIndex[] hierarchy;
Point[][] contours;
binaryImage.FindContours(out contours, out hierarchy, RetrievalModes.List, ContourApproximationModes.ApproxNone);

using (Mat mask = Mat.Zeros(colorImage.Size(), MatType.CV_8UC1))
    foreach (var bl in contours)
        if (Cv2.ContourArea(bl) > 5)
        {
            mask.DrawContour(bl, Scalar.White, -1);
            Rect rect = Cv2.BoundingRect(bl);
            Scalar mean = Cv2.Mean(colorImage[rect], mask[rect]);
            mask.DrawContour(bl, Scalar.Black, -1);
        }
This works for blobs that don't have holes. However, in my case many blob regions have huge holes that affect the mean calculation.
I couldn't figure out how to solve this using the hierarchy info, or with another approach.
(My code is for OpenCvSharp, but an answer in any other wrapper or language is welcome.)
Edit: I've added an example image. The traffic signs part is the problem.
Actually I think I have solved this problem with this method:
using PLine = List<Point>;
using Shape = List<List<Point>>;

internal static IEnumerable<Tuple<PLine, Shape>> FindContoursWithHoles(this Mat mat)
{
    Point[][] contours;
    HierarchyIndex[] hierarchy;
    mat.FindContours(out contours, out hierarchy, RetrievalModes.Tree, ContourApproximationModes.ApproxNone);

    Dictionary<int, bool> dic = new Dictionary<int, bool>();
    for (int i = 0; i < contours.Length; i++)
        if (hierarchy[i].Parent < 0)
            dic[i] = true;

    bool ok = false;
    while (!ok)
    {
        ok = true;
        for (int i = 0; i < contours.Length; i++)
            if (dic.ContainsKey(i))
            {
                bool isParent = dic[i];
                var hi = hierarchy[i];
                if (hi.Parent >= 0) dic[hi.Parent] = (!isParent);
                if (hi.Child >= 0) dic[hi.Child] = (!isParent);
                while (hi.Next >= 0)
                {
                    dic[hi.Next] = isParent;
                    hi = hierarchy[hi.Next];
                    if (hi.Parent >= 0) dic[hi.Parent] = (!isParent);
                    if (hi.Child >= 0) dic[hi.Child] = (!isParent);
                }
                hi = hierarchy[i];
                while (hi.Previous >= 0)
                {
                    dic[hi.Previous] = isParent;
                    hi = hierarchy[hi.Previous];
                    if (hi.Parent >= 0) dic[hi.Parent] = (!isParent);
                    if (hi.Child >= 0) dic[hi.Child] = (!isParent);
                }
            }
            else
                ok = false;
    }

    foreach (int i in dic.Keys.Where(a => dic[a]))
    {
        PLine pl = contours[i].ToList();
        Shape childs = new Shape();
        var hiParent = hierarchy[i];
        if (hiParent.Child >= 0)
        {
            childs.Add(contours[hiParent.Child].ToList());
            var hi = hierarchy[hiParent.Child];
            while (hi.Next >= 0)
            {
                childs.Add(contours[hi.Next].ToList());
                hi = hierarchy[hi.Next];
            }
            hi = hierarchy[hiParent.Child];
            while (hi.Previous >= 0)
            {
                childs.Add(contours[hi.Previous].ToList());
                hi = hierarchy[hi.Previous];
            }
        }
        yield return Tuple.Create(pl, childs);
    }
}
By drawing the holes as black, we can use each blob as a single mask:
var blobContours = blobs.FindContoursWithHoles().ToList();

using (Mat mask = Mat.Zeros(mat0.Size(), MatType.CV_8UC1))
    for (int i = 0; i < blobContours.Count; i++)
    {
        var tu = blobContours[i];
        var bl = tu.Item1;
        if (Cv2.ContourArea(bl) > 100)
        {
            mask.DrawContour(bl, Scalar.White, -1);
            foreach (var child in tu.Item2)
                mask.DrawContour(child, Scalar.Black, -1);
            Rect rect = Cv2.BoundingRect(bl);
            Scalar mean = Cv2.Mean(mat0[rect], mask[rect]);
        }
    }
I think there should be an easier way.
And yet there is another problem. In some cases, an individual red part of a sign (which is a separate white blob) is not found as a parent outer circle with a child inner circle, but as one large parent contour with two circles as children (i.e. a hole inside another hole makes a separate blob that is not found as a parent). Yes, it is hierarchically correct, but it does not help me. I hope I could make myself clear; sorry for my English.
@Miki thank you very much. I was able to achieve what I want using ConnectedComponents. It's simple and fast:
var cc = Cv2.ConnectedComponentsEx(binaryImage, PixelConnectivity.Connectivity8);
foreach (var bl in cc.Blobs)
    using (Mat mask = new Mat())
    {
        cc.FilterByBlob(binaryImage, mask, bl);
        Rect rect = bl.Rect;
        Scalar mean = Cv2.Mean(colorImage[rect], mask[rect]);
    }
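Since the question welcomes answers in other wrappers or languages, here is a rough, untested C++ sketch of the same ConnectedComponents idea using cv::connectedComponentsWithStats; the image names are placeholders and the sketch is only meant to illustrate the approach:
#include <opencv2/opencv.hpp>

// Sketch: label the white blobs (8-connectivity), then use each label as a mask
// when computing the mean color. binaryImage is CV_8UC1, colorImage is CV_8UC3.
// Holes are excluded automatically because they carry a different label.
void meanColorPerBlob(const cv::Mat& binaryImage, const cv::Mat& colorImage)
{
    cv::Mat labels, stats, centroids;
    int n = cv::connectedComponentsWithStats(binaryImage, labels, stats, centroids, 8);

    for (int label = 1; label < n; label++)   // label 0 is the background
    {
        cv::Rect rect(stats.at<int>(label, cv::CC_STAT_LEFT),
                      stats.at<int>(label, cv::CC_STAT_TOP),
                      stats.at<int>(label, cv::CC_STAT_WIDTH),
                      stats.at<int>(label, cv::CC_STAT_HEIGHT));
        cv::Mat mask = (labels(rect) == label);
        cv::Scalar mean = cv::mean(colorImage(rect), mask);
        // ... use mean ...
    }
}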
I am using OrbFeaturesFinder to detect keypoints in images.
Ptr<FeaturesFinder> finder;
finder = makePtr<OrbFeaturesFinder>();
vector<ImageFeatures> features(num_images);
(*finder)(img, features[i]);
I used this code on Linux and implemented the same on Android, but the results are sometimes different, as shown in this link:
http://imgur.com/a/wQXZx
What can be the reason behind this behavior?
Method of accessing images on Android (the image is saved in JPEG form, then read) [edit]:
for(int i = 0; i < imgNames.size(); i++){
    Bitmap bitmap = getThumbnail(imgNames.get(i));
    int imageW = bitmap.getWidth();
    int imageH = bitmap.getHeight();
    byte[] rgb = getByteArray(imageW, imageH, bitmap, "RGB");
    bitmap.recycle();
    Mat mRgb = new Mat(imageH, imageW, CvType.CV_8UC3);
    mRgb.put(0, 0, rgb);
    Imgproc.cvtColor(mRgb, mRgb, Imgproc.COLOR_BGR2RGB, 3);
    panoImgs.add(mRgb);
}
and sent to JNI:
jclass matClass = env->FindClass("org/opencv/core/Mat");
jmethodID getNativeAddr = env->GetMethodID(matClass, "getNativeObjAddr", "()J");
int numImgs = env->GetArrayLength(jInputArray);
vector<Mat> natImgs;
for(int i = 0; i < numImgs; ++i) {
    natImgs.push_back(
        *(Mat*)env->CallLongMethod(
            env->GetObjectArrayElement(jInputArray, i),
            getNativeAddr
        )
    );
}
For Linux, I am saving the same image in JPEG format and then using imread to access the files.
My program is a DirectX program that draws a container cube with smaller cubes inside it; these smaller cubes fall over time. I hope you understand what I mean.
The program isn't complete yet. It should draw only the container for now, but it draws nothing; only the background color is visible. I have only included what I think is needed.
These are the routines that initialize the program:
bool Game::init(HINSTANCE hinst, HWND _hw){
    Directx11::init(hinst, _hw);
    return LoadContent();
}
Directx11::init()
bool Directx11::init(HINSTANCE hinst,HWND hw){
_hinst=hinst;_hwnd=hw;
RECT rc;
GetClientRect(_hwnd,&rc);
height= rc.bottom - rc.top;
width = rc.right - rc.left;
UINT flags=0;
#ifdef _DEBUG
flags |=D3D11_CREATE_DEVICE_DEBUG;
#endif
HR(D3D11CreateDevice(0,_driverType,0,flags,0,0,D3D11_SDK_VERSION,&d3dDevice,&_featureLevel,&d3dDeviceContext));
if (d3dDevice == 0 || d3dDeviceContext == 0)
return 0;
DXGI_SWAP_CHAIN_DESC sdesc;
ZeroMemory(&sdesc,sizeof(DXGI_SWAP_CHAIN_DESC));
sdesc.Windowed=true;
sdesc.BufferCount=1;
sdesc.BufferDesc.Format=DXGI_FORMAT_R8G8B8A8_UNORM;
sdesc.BufferDesc.Height=height;
sdesc.BufferDesc.Width=width;
sdesc.BufferDesc.Scaling=DXGI_MODE_SCALING_UNSPECIFIED;
sdesc.BufferDesc.ScanlineOrdering=DXGI_MODE_SCANLINE_ORDER_UNSPECIFIED;
sdesc.OutputWindow=_hwnd;
sdesc.BufferDesc.RefreshRate.Denominator=1;
sdesc.BufferDesc.RefreshRate.Numerator=60;
sdesc.Flags=0;
sdesc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT;
if (m4xMsaaEnable)
{
    sdesc.SampleDesc.Count=4;
    sdesc.SampleDesc.Quality=m4xMsaaQuality-1;
}
else
{
    sdesc.SampleDesc.Count=1;
    sdesc.SampleDesc.Quality=0;
}
IDXGIDevice *Device=0;
HR(d3dDevice->QueryInterface(__uuidof(IDXGIDevice),reinterpret_cast <void**> (&Device)));
IDXGIAdapter*Ad=0;
HR(Device->GetParent(__uuidof(IDXGIAdapter),reinterpret_cast <void**> (&Ad)));
IDXGIFactory* fac=0;
HR(Ad->GetParent(__uuidof(IDXGIFactory),reinterpret_cast <void**> (&fac)));
fac->CreateSwapChain(d3dDevice,&sdesc,&swapchain);
ReleaseCOM(Device);
ReleaseCOM(Ad);
ReleaseCOM(fac);
ID3D11Texture2D *back = 0;
HR(swapchain->GetBuffer(0,__uuidof(ID3D11Texture2D),reinterpret_cast <void**> (&back)));
HR(d3dDevice->CreateRenderTargetView(back,0,&RenderTarget));
D3D11_TEXTURE2D_DESC Tdesc;
ZeroMemory(&Tdesc,sizeof(D3D11_TEXTURE2D_DESC));
Tdesc.BindFlags = D3D11_BIND_DEPTH_STENCIL;
Tdesc.ArraySize = 1;
Tdesc.Format= DXGI_FORMAT_D24_UNORM_S8_UINT;
Tdesc.Height= height;
Tdesc.Width = width;
Tdesc.Usage = D3D11_USAGE_DEFAULT;
Tdesc.MipLevels=1;
if (m4xMsaaEnable)
{
    Tdesc.SampleDesc.Count=4;
    Tdesc.SampleDesc.Quality=m4xMsaaQuality-1;
}
else
{
    Tdesc.SampleDesc.Count=1;
    Tdesc.SampleDesc.Quality=0;
}
HR(d3dDevice->CreateTexture2D(&Tdesc,0,&depthview));
HR(d3dDevice->CreateDepthStencilView(depthview,0,&depth));
d3dDeviceContext->OMSetRenderTargets(1,&RenderTarget,depth);
D3D11_VIEWPORT vp;
vp.TopLeftX=0.0f;
vp.TopLeftY=0.0f;
vp.Width = static_cast <float> (width);
vp.Height= static_cast <float> (height);
vp.MinDepth = 0.0f;
vp.MaxDepth = 1.0f;
d3dDeviceContext -> RSSetViewports(1,&vp);
return true;
}
SetBuild() prepares the matrices inside the container for the smaller cubes; I didn't program it to draw the smaller cubes yet.
And this is the function that draws the scene:
void Game::Render(){
    d3dDeviceContext->ClearRenderTargetView(RenderTarget, reinterpret_cast<const float*>(&Colors::LightSteelBlue));
    d3dDeviceContext->ClearDepthStencilView(depth, D3D11_CLEAR_DEPTH | D3D11_CLEAR_STENCIL, 1.0f, 0);
    d3dDeviceContext->IASetInputLayout(_layout);
    d3dDeviceContext->IASetPrimitiveTopology(D3D10_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
    d3dDeviceContext->IASetIndexBuffer(indices, DXGI_FORMAT_R32_UINT, 0);
    UINT strides = sizeof(Vertex), off = 0;
    d3dDeviceContext->IASetVertexBuffers(0, 1, &vertices, &strides, &off);
    D3DX11_TECHNIQUE_DESC des;
    Tech->GetDesc(&des);
    Floor * Lookup; /* a variable to look up inside the matrices structure (Floor contains XMMATRIX Pieces[9]) */
    std::vector<XMFLOAT4X4> filled; // saves the matrices of the smaller cubes
    XMMATRIX V = XMLoadFloat4x4(&View), P = XMLoadFloat4x4(&Proj);
    XMMATRIX vp = V * P;
    XMMATRIX wvp;
    for (UINT i = 0; i < des.Passes; i++)
    {
        d3dDeviceContext->RSSetState(BuildRast);
        wvp = XMLoadFloat4x4(&(B.Memory[0].Pieces[0])) * vp; // loading the matrix at translation (0,0,0)
        HR(ShadeMat->SetMatrix(reinterpret_cast<float*>(&wvp)));
        HR(Tech->GetPassByIndex(i)->Apply(0, d3dDeviceContext));
        d3dDeviceContext->DrawIndexed(build_ind_count, build_ind_index, build_vers_index);
        d3dDeviceContext->RSSetState(PieseRast);
        UINT r1 = B.GetSize(), r2 = filled.size();
        for (UINT j = 0; j < r1; j++)
        {
            Lookup = &B.Memory[j];
            for (UINT r = 0; r < Lookup->filledindeces.size(); r++)
            {
                filled.push_back(Lookup->Pieces[Lookup->filledindeces[r]]);
            }
        }
        for (UINT j = 0; j < r2; j++)
        {
            ShadeMat->SetMatrix(reinterpret_cast<const float*>(&filled[i]));
            Tech->GetPassByIndex(i)->Apply(0, d3dDeviceContext);
            d3dDeviceContext->DrawIndexed(piese_ind_count, piese_ind_index, piese_vers_index);
        }
    }
    HR(swapchain->Present(0, 0));
}
Thanks in advance.
One bug in your program appears to be that you're using i, the index of the current pass, as an index into the filled vector, when you should apparently be using j.
Another apparent bug is that in the loop where you are supposed to be iterating over the elements of filled, you're not iterating over all of them. The value r2 is set to the size of filled before you append anything to it during that pass. During the first pass this means that nothing will be drawn by this loop. If your technique only has one pass then this means that the second DrawIndexed call in your code will never be executed.
It also appears you should be only adding matrices to filled once, regardless of the number of the passes the technique has. You should consider if your code is actually meant to work with techniques with multiple passes.
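To make those points concrete, here is a rough, untested sketch of how the piece-drawing part of Render might look after those changes, reusing the names from the question; whether the piece matrices also need the vp multiplication (as the container matrix does) is something only you can confirm:
// Build the list of piece matrices once, before the pass loop.
std::vector<XMFLOAT4X4> filled;
for (UINT j = 0; j < B.GetSize(); j++)
{
    Floor* Lookup = &B.Memory[j];
    for (UINT r = 0; r < Lookup->filledindeces.size(); r++)
        filled.push_back(Lookup->Pieces[Lookup->filledindeces[r]]);
}

for (UINT i = 0; i < des.Passes; i++)
{
    // ... set states and draw the container exactly as before ...

    d3dDeviceContext->RSSetState(PieseRast);
    for (UINT j = 0; j < filled.size(); j++)   // iterate over everything in filled
    {
        // Index with j (the matrix index), not i (the pass index).
        // If these are world matrices, they may also need to be multiplied by vp,
        // as is done for the container matrix.
        HR(ShadeMat->SetMatrix(reinterpret_cast<const float*>(&filled[j])));
        HR(Tech->GetPassByIndex(i)->Apply(0, d3dDeviceContext));
        d3dDeviceContext->DrawIndexed(piese_ind_count, piese_ind_index, piese_vers_index);
    }
}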
I am in the process of creating a small program which detects an object (a small image) in a large image, and I am using OpenCV Java.
As I have to consider rotation and scaling, I have used FeatureDetector.BRISK and DescriptorExtractor.BRISK.
The following approach is used to filter the match results to get only the best matches.
I have two questions:
Is there a better way to find min_dist and max_dist than with the loop I have used below?
Most important question: now I need to use these matches to determine whether the object (template) was found or not. It would be great if someone could help me here.
Thanks in advance.
FeatureDetector fd = FeatureDetector.create(FeatureDetector.BRISK);
final MatOfKeyPoint keyPointsLarge = new MatOfKeyPoint();
final MatOfKeyPoint keyPointsSmall = new MatOfKeyPoint();
fd.detect(largeImage, keyPointsLarge);
fd.detect(smallImage, keyPointsSmall);
System.out.println("keyPoints.size() : "+keyPointsLarge.size());
System.out.println("keyPoints2.size() : "+keyPointsSmall.size());
Mat descriptorsLarge = new Mat();
Mat descriptorsSmall = new Mat();
DescriptorExtractor extractor = DescriptorExtractor.create(DescriptorExtractor.BRISK);
extractor.compute(largeImage, keyPointsLarge, descriptorsLarge);
extractor.compute(smallImage, keyPointsSmall, descriptorsSmall);
System.out.println("descriptorsA.size() : "+descriptorsLarge.size());
System.out.println("descriptorsB.size() : "+descriptorsSmall.size());
MatOfDMatch matches = new MatOfDMatch();
DescriptorMatcher matcher = DescriptorMatcher.create(DescriptorMatcher.BRUTEFORCE_HAMMINGLUT);
matcher.match(descriptorsLarge, descriptorsSmall, matches);
System.out.println("matches.size() : "+matches.size());
MatOfDMatch matchesFiltered = new MatOfDMatch();
List<DMatch> matchesList = matches.toList();
List<DMatch> bestMatches= new ArrayList<DMatch>();
Double max_dist = 0.0;
Double min_dist = 100.0;
for (int i = 0; i < matchesList.size(); i++)
{
    Double dist = (double) matchesList.get(i).distance;
    if (dist < min_dist && dist != 0)
    {
        min_dist = dist;
    }
    if (dist > max_dist)
    {
        max_dist = dist;
    }
}

System.out.println("max_dist : " + max_dist);
System.out.println("min_dist : " + min_dist);

double threshold = 3 * min_dist;
double threshold2 = 2 * min_dist;
if (threshold2 >= max_dist)
{
    threshold = min_dist * 1.1;
}
else if (threshold >= max_dist)
{
    threshold = threshold2 * 1.4;
}
System.out.println("Threshold : " + threshold);

for (int i = 0; i < matchesList.size(); i++)
{
    Double dist = (double) matchesList.get(i).distance;
    System.out.println(String.format(i + " match distance best : %s", dist));
    if (dist < threshold)
    {
        bestMatches.add(matches.toList().get(i));
        System.out.println(String.format(i + " best match added : %s", dist));
    }
}
matchesFiltered.fromList(bestMatches);
System.out.println("matchesFiltered.size() : " + matchesFiltered.size());
Edit
I edited my code as follows. I know it's still not the best way to decide whether the object was found or not based on the number of best matches.
So please share your views.
System.out.println("max_dist : "+max_dist);
System.out.println("min_dist : "+min_dist);
if (min_dist > 50)
{
    System.out.println("No match found");
    System.out.println("Just return ");
    return false;
}

double threshold = 3 * min_dist;
double threshold2 = 2 * min_dist;
if (threshold > 75)
{
    threshold = 75;
}
else if (threshold2 >= max_dist)
{
    threshold = min_dist * 1.1;
}
else if (threshold >= max_dist)
{
    threshold = threshold2 * 1.4;
}
System.out.println("Threshold : " + threshold);

for (int i = 0; i < matchesList.size(); i++)
{
    Double dist = (double) matchesList.get(i).distance;
    if (dist < threshold)
    {
        bestMatches.add(matches.toList().get(i));
        //System.out.println(String.format(i + " best match added : %s", dist));
    }
}

matchesFiltered.fromList(bestMatches);
System.out.println("matchesFiltered.size() : " + matchesFiltered.size());

if (matchesFiltered.rows() >= 1)
{
    System.out.println("match found");
    return true;
}
else
{
    return false;
}
Your edited code is working perfectly for me.
The following are the changes I made in your code for detecting the object (small image) in the large image:
Use the SURF method for feature detection as well as feature extraction. (SURF is available in OpenCV 4.1.1 for Android and earlier; it has been removed after that, so here I have used OpenCV 4.1.1.)
Change the threshold for whether the image matched from 1 to 4, i.e. change the following line
if(matchesFiltered.rows() >= 1)
to
if(matchesFiltered.rows() >= 4)
Only these changes worked perfectly for me. Make sure that the object/small image has rich texture (at least it should have keypoints that can be matched).
There are several approaches for detecting objects inside images. I'll just put some links here:
Open CV 2 Computer Vision Application Programming Cookbook, Chapter 8/9
http://docs.opencv.org/doc/tutorials/features2d/feature_homography/feature_homography.html
http://robocv.blogspot.de/2012/02/real-time-object-detection-in-opencv.html
The last link shows a way to calculate the min and max values; it should be nearly the same in Java. All the links should hopefully give you some ideas on how to match objects.
I also noticed that there are a lot of magic numbers in your code. Maybe you could put them in named variables to reduce the possibility of error and get a better overview.
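Regarding the second question (deciding whether the template was found): the feature_homography tutorial linked above goes one step further and estimates a homography from the filtered matches, then checks how many matches agree with it. A rough C++ sketch of that idea (the question's code is Java, so the names here are placeholders and the inlier threshold is an arbitrary assumption):
#include <opencv2/core.hpp>
#include <opencv2/calib3d.hpp>
#include <vector>

// Decide whether the template was found, given the keypoints of both images and the
// filtered matches. Matching order follows the Java code above: query = large/scene
// image, train = small/template image.
bool templateFound(const std::vector<cv::KeyPoint>& keypointsLarge,
                   const std::vector<cv::KeyPoint>& keypointsSmall,
                   const std::vector<cv::DMatch>& bestMatches,
                   int minInliers = 10)   // arbitrary, tunable threshold
{
    std::vector<cv::Point2f> objPts, scenePts;
    for (const cv::DMatch& m : bestMatches) {
        objPts.push_back(keypointsSmall[m.trainIdx].pt);    // template keypoints
        scenePts.push_back(keypointsLarge[m.queryIdx].pt);  // scene keypoints
    }
    if (objPts.size() < 4)                 // findHomography needs at least 4 pairs
        return false;

    cv::Mat inlierMask;
    cv::Mat H = cv::findHomography(objPts, scenePts, cv::RANSAC, 3.0, inlierMask);
    // Treat the object as found if a homography exists and enough matches agree with it.
    return !H.empty() && cv::countNonZero(inlierMask) >= minInliers;
}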
I am taking a computer graphics class and I need to work with textures, but I can't use any library to do it. I am stuck on loading the RGB values of the images I need to use (the images can be in any format: jpg, raw, png, etc.), so my question is: what is the easiest way to get the RGB values of an image (of any format) without using any libraries? Here is what I found already on the site:
unsigned char *data;
FILE *file;
file = fopen("image.png", "rb");   // open in binary mode
data = (unsigned char *)malloc(TH*TV*3); // TH and TV are both 50
fread(data, TH*TV*3, 1, file);
fclose(file);
int i;
for(i=0;i<TH*TV*3;i++){
    // supposing I have a struct RGB for the rgb values
    RGB.r = data[?]; // how do I get the r value
    RGB.g = data[?]; // how do I get the g value
    RGB.b = data[?]; // how do I get the b value
}
Thanks
Rather than iterating through every byte that you read in, you want to iterate over every pixel, which consists of 3 bytes. So replace i++ with i+=3.
for(i=0;i<TH*TV*3;i+=3){
    RGB.r = data[i];
    RGB.g = data[i+1];
    RGB.b = data[i+2];
}
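If you ever need to address a specific pixel by its coordinates instead of scanning linearly, the offset in the same tightly packed buffer follows directly from that layout. A small sketch reusing data and TH from the question's snippet (TH and TV are both 50 there, so it doesn't matter which is the width; with non-square images use the row width here):
// Offset of pixel (x, y) in a tightly packed, row-major RGB buffer (no row padding assumed).
size_t idx = ((size_t)y * TH + x) * 3;   // TH taken as the image width
unsigned char r = data[idx];
unsigned char g = data[idx + 1];
unsigned char b = data[idx + 2];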
Try using a framework like OpenCV; there are several options to get the colors or to manipulate an image.
Here is an example I found:
cv::Mat img = cv::imread("lenna.png");
for(int i = 0; i < img.rows; i++) {
    for(int j = 0; j < img.cols; j++) {
        // You can now access the pixel value with cv::Vec3b
        // (cast to int so the values print as numbers rather than characters)
        std::cout << (int)img.at<cv::Vec3b>(i,j)[0] << " ";
        std::cout << (int)img.at<cv::Vec3b>(i,j)[1] << " ";
        std::cout << (int)img.at<cv::Vec3b>(i,j)[2] << std::endl;
    }
}
But please note that the code above is not very performant; it should just give you an idea of how to read the pixels.
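If performance matters, a common pattern is to fetch a row pointer once per row instead of calling at() for every pixel. A small sketch of that variant (same assumed lenna.png input as above):
#include <opencv2/opencv.hpp>
#include <cstdio>

int main() {
    cv::Mat img = cv::imread("lenna.png");   // OpenCV loads images as BGR
    if (img.empty()) return 1;

    for (int i = 0; i < img.rows; i++) {
        const cv::Vec3b* row = img.ptr<cv::Vec3b>(i);   // one row lookup instead of per-pixel at()
        for (int j = 0; j < img.cols; j++) {
            int b = row[j][0];
            int g = row[j][1];
            int r = row[j][2];
            std::printf("%d %d %d\n", r, g, b);
        }
    }
    return 0;
}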