How to Crop a cuda.GpuMat in GoCV (OpenCV 4)?

I am using GoCV (the OpenCV 4 bindings for Go) and I want to crop an image represented as a cuda.GpuMat, given an image.Rectangle.
With a regular gocv.Mat this operation is simple enough:
func Crop(src *gocv.Mat, rect image.Rectangle) *gocv.Mat {
	res := src.Region(rect)
	return &res
}
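(As far as I understand, Region does not copy any pixels; it returns a new Mat header that points at the same underlying data, which is presumably why it is so cheap. If an independent copy is needed, a hypothetical CropCopy helper along these lines would clone just the cropped area:)
func CropCopy(src *gocv.Mat, rect image.Rectangle) *gocv.Mat {
	region := src.Region(rect) // view into src; no pixels are copied
	defer region.Close()
	res := region.Clone() // deep copy of only the cropped area
	return &res
}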
However, I do not see a similar Region method on cuda.GpuMat in GoCV (some bindings are not implemented yet), nor do I see one in the C++ source/docs: https://docs.opencv.org/master/d0/d60/classcv_1_1cuda_1_1GpuMat.html.
I have managed to effectively crop using cuda.Remap as follows:
func Crop(src *cuda.GpuMat, rect image.Rectangle) *cuda.GpuMat {
	rectWidth := rect.Dx()
	rectHeight := rect.Dy()
	dst := cuda.NewGpuMat()
	map1 := gocv.NewMatWithSize(rectHeight, rectWidth, gocv.MatTypeCV32F)
	defer map1.Close()
	map2 := gocv.NewMatWithSize(rectHeight, rectWidth, gocv.MatTypeCV32F)
	defer map2.Close()
	offsetX := rect.Min.X
	offsetY := rect.Min.Y
	for x := 0; x < map1.Cols(); x++ {
		for y := 0; y < map2.Rows(); y++ {
			map1.SetFloatAt(x, y, float32(y+offsetY))
			map2.SetFloatAt(x, y, float32(x+offsetX))
		}
	}
	gmap1, gmap2 := cuda.NewGpuMat(), cuda.NewGpuMat()
	defer gmap1.Close()
	defer gmap2.Close()
	gmap1.Upload(map1)
	gmap2.Upload(map2)
	cuda.Remap(*src, &dst, &gmap1, &gmap2, cuda.InterpolationDefault, cuda.BorderConstant, color.RGBA{0, 0, 0, 0})
	return &dst
}
But after running some benchmarks, this implementation is 1-2 orders of magnitude slower than gocv.Mat#Region.
Given that there is no Region method on cuda.GpuMat in GoCV or OpenCV 4, what is the most efficient equivalent operation?
I'm not too concerned about whether the GoCV bindings exist yet; I'm more interested in the OpenCV 4 equivalent of Region for a GpuMat.
UPDATE
I have found another method of cropping a cuda.GpuMat, but it is also about an order of magnitude slower than gocv.Mat#Region.
func Crop2(src *cuda.GpuMat, rect image.Rectangle) *cuda.GpuMat {
	rectWidth := rect.Dx()
	rectHeight := rect.Dy()
	dst := cuda.NewGpuMat()
	sz := image.Point{
		X: rectWidth,
		Y: rectHeight,
	}
	cuda.Rotate(*src, &dst, sz, 0, -float64(rect.Min.X), -float64(rect.Min.Y), cuda.InterpolationDefault)
	return &dst
}
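For anyone who wants to reproduce the comparison, here is a minimal sketch of how it can be benchmarked with Go's standard testing package. The image path and rectangle are placeholders, and it assumes the Remap-based Crop above is in the same package:
package main

import (
	"image"
	"testing"

	"gocv.io/x/gocv"
	"gocv.io/x/gocv/cuda"
)

var benchRect = image.Rect(100, 100, 400, 300)

func BenchmarkCropCPU(b *testing.B) {
	src := gocv.IMRead("input.jpg", gocv.IMReadColor)
	defer src.Close()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		res := src.Region(benchRect)
		res.Close()
	}
}

func BenchmarkCropGPU(b *testing.B) {
	src := gocv.IMRead("input.jpg", gocv.IMReadColor)
	defer src.Close()
	gsrc := cuda.NewGpuMat()
	defer gsrc.Close()
	gsrc.Upload(src)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		res := Crop(&gsrc, benchRect) // the Remap-based Crop shown above
		res.Close()
	}
}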

Related

Implementing a linear regression using gradient descent

I'm trying to implement linear regression with gradient descent as explained in this article (https://towardsdatascience.com/linear-regression-using-gradient-descent-97a6c8700931).
I've followed the implementation to the letter, yet my results overflow after a few iterations.
I'm trying to get approximately this result: y = -0.02x + 8499.6.
The code:
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
)

const (
	iterations   = 1000
	learningRate = 0.0001
)

func computePrice(m, x, c float64) float64 {
	return m*x + c
}

func computeThetas(data [][]float64, m, c float64) (float64, float64) {
	N := float64(len(data))
	dm, dc := 0.0, 0.0
	for _, dataField := range data {
		x := dataField[0]
		y := dataField[1]
		yPred := computePrice(m, x, c)
		dm += (y - yPred) * x
		dc += y - yPred
	}
	dm *= -2 / N
	dc *= -2 / N
	return m - learningRate*dm, c - learningRate*dc
}

func main() {
	data := readXY()
	m, c := 0.0, 0.0
	for k := 0; k < iterations; k++ {
		m, c = computeThetas(data, m, c)
	}
	fmt.Printf("%.4fx + %.4f\n", m, c)
}

func readXY() [][]float64 {
	file := strings.NewReader(data)
	reader := csv.NewReader(file)
	records, err := reader.ReadAll()
	if err != nil {
		panic(err)
	}
	records = records[1:]
	size := len(records)
	data := make([][]float64, size)
	for i, v := range records {
		val1, err := strconv.ParseFloat(v[0], 64)
		if err != nil {
			panic(err)
		}
		val2, err := strconv.ParseFloat(v[1], 64)
		if err != nil {
			panic(err)
		}
		data[i] = []float64{val1, val2}
	}
	return data
}
var data = `km,price
240000,3650
139800,3800
150500,4400
185530,4450
176000,5250
114800,5350
166800,5800
89000,5990
144500,5999
84000,6200
82029,6390
63060,6390
74000,6600
97500,6800
67000,6800
76025,6900
48235,6900
93000,6990
60949,7490
65674,7555
54000,7990
68500,7990
22899,7990
61789,8290`
And here it can be worked on in the Go playground:
https://play.golang.org/p/2CdNbk9_WeY
What do I need to fix to get the correct result?
Why would a formula work on one data set and not on another?
In addition to sascha's remarks, here's another way to look at the problems with this application of gradient descent: the algorithm offers no guarantee that an iteration yields a better result than the previous one, so it doesn't necessarily converge, because:
The gradients dm and dc along the axes m and c are handled independently of each other; m is updated in the descending direction according to dm, and c at the same time is updated in the descending direction according to dc. But with certain curved surfaces z = f(m, c), the gradient in a direction between the m and c axes can have the opposite sign compared to the gradients along m and c on their own, so while updating either m or c alone would converge, updating both at once moves away from the optimum.
However, the more likely failure reason in this case of linear regression on a point cloud is the entirely arbitrary magnitude of the update to m and c, determined by the product of an obscure learning rate and the gradient. It is quite possible that such an update oversteps a minimum of the target function, and even that the overshoot grows in amplitude with each iteration.
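To make the second point concrete: the km values are in the tens of thousands, so the very first dm is on the order of a billion, the 0.0001 learning rate turns that into a huge jump in m, and the overshoot grows with every iteration until the floats overflow. A common remedy is to rescale x before running the descent. The sketch below is mine (not from the question or the article); it assumes the "math" package is imported, divides x by its maximum, runs plain gradient descent, and maps the slope back to the original units:
func fitScaled(data [][]float64, lr float64, iters int) (float64, float64) {
	// use the largest |x| as the scale factor so x ends up roughly in [0, 1]
	s := 0.0
	for _, d := range data {
		s = math.Max(s, math.Abs(d[0]))
	}
	N := float64(len(data))
	m, c := 0.0, 0.0
	for k := 0; k < iters; k++ {
		dm, dc := 0.0, 0.0
		for _, d := range data {
			x, y := d[0]/s, d[1]
			diff := y - (m*x + c)
			dm += -2 / N * diff * x
			dc += -2 / N * diff
		}
		m -= lr * dm
		c -= lr * dc
	}
	// the fit is y = m*(x/s) + c, so the slope in original units is m/s
	return m / s, c
}
With the data above, a call like fitScaled(data, 0.5, 10000) stays numerically stable because the gradients are now of moderate size; the learning rate and iteration count still have to be tuned rather than guessed.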

OpenCV equivalent of np.where()

When using the gocv package it is possible, for example, to perform template matching of a pattern within an image. The package also provides the MinMaxLoc function to retrieve the locations of minimums and maximums within the matrix.
However, in the Python example below, the writer uses numpy.where to threshold the matrix and get the locations of multiple maximums. The Python zip function is used to glue the values together so they behave like a slice [][2]int, the inner slice being the x and y of each match found.
The syntax loc[::-1] reverses the array.
The star operator in zip(*loc...) unpacks the slices given to zip.
https://docs.opencv.org/master/d4/dc6/tutorial_py_template_matching.html
import cv2 as cv
import numpy as np
from matplotlib import pyplot as plt
img_rgb = cv.imread('mario.png')
img_gray = cv.cvtColor(img_rgb, cv.COLOR_BGR2GRAY)
template = cv.imread('mario_coin.png', 0)
w, h = template.shape[::-1]
res = cv.matchTemplate(img_gray, template, cv.TM_CCOEFF_NORMED)
threshold = 0.8
loc = np.where(res >= threshold)
for pt in zip(*loc[::-1]):
    cv.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0, 0, 255), 2)
cv.imwrite('res.png', img_rgb)
How do I implement the same np.where algorithm in Go to get the multiple locations after the threshold is applied?
OpenCV has a built-in (semi-)equivalent function to np.where(), which is findNonZero(). As implied by the name, it finds the non-zero elements in an image, which is what np.where() does when called with a single argument, as the numpy docs state.
And this is available in the golang bindings as well. From the gocv docs on FindNonZero:
func FindNonZero(src Mat, idx *Mat)
FindNonZero returns the list of locations of non-zero pixels.
For further details, please see: https://docs.opencv.org/master/d2/de8/group__core__array.html#gaed7df59a3539b4cc0fe5c9c8d7586190
Note: np.where() returns indexes in array order, that is, (row, col) or (i, j) which is opposite to typical image indexing (x, y). That is why loc is reversed in Python. When using findNonZero() you won't need to do that, since OpenCV always uses (x, y) for points.
For anyone coming across this, I hope a full example keeps you from spending days banging your head against the wall, reading the same Google results over and over until something clicks.
package main

import (
	"fmt"
	"image"
	"image/color"
	"os"

	"gocv.io/x/gocv"
)

func OpenImage(path string) (image.Image, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	img, _, err := image.Decode(f)
	return img, err
}

func main() {
	src := gocv.IMRead("haystack.png", gocv.IMReadGrayScale)
	tgt := gocv.IMRead("needle.png", gocv.IMReadGrayScale)
	if src.Empty() {
		fmt.Printf("failed to read image")
		os.Exit(1)
	}
	if tgt.Empty() {
		fmt.Printf("failed to read image")
		os.Exit(1)
	}
	// Get the template size.
	tgtImg, _ := tgt.ToImage()
	iX, iY := tgtImg.Bounds().Size().X, tgtImg.Bounds().Size().Y
	// Perform a match template operation.
	res := gocv.NewMat()
	gocv.MatchTemplate(src, tgt, &res, gocv.TmSqdiffNormed, gocv.NewMat())
	// Set a threshold. With gocv.TmSqdiffNormed the confidence levels are
	// reversed, meaning the lowest value is actually the greatest confidence.
	// So here I perform an inverse binary threshold, setting all values at or
	// below 0.16 to 1 and everything above to 0.
	thresh := gocv.NewMat()
	gocv.Threshold(res, &thresh, 0.16, 1.0, gocv.ThresholdBinaryInv)
	// Filter out all the non-zero values.
	gocv.FindNonZero(thresh, &res)
	// FindNonZero returns a list (vector) of locations in the form of a gocv.Mat.
	// There may be a better way to do this, but I iterate through each found
	// location, getting the int vector at each row. I have to convert the
	// returned int32 values into ints, then draw a rectangle around each point.
	//
	// The result of res.GetVeciAt(i, 0) is just a slice of x, y integers, so each
	// value can be accessed with slice/array syntax.
	for i := 0; i < res.Rows(); i++ {
		x, y := res.GetVeciAt(i, 0)[0], res.GetVeciAt(i, 0)[1]
		xi, yi := int(x), int(y)
		gocv.Rectangle(&src, image.Rect(xi, yi, xi+iX, yi+iY), color.RGBA{0, 0, 0, 1}, 2)
	}
	w := gocv.NewWindow("Test")
	w.IMShow(src)
	if w.WaitKey(0) > 1 {
		os.Exit(0)
	}
}

Find people with GOCV

I worked with OpenCV and Python last year. Today I wanted to try OpenCV in Go using the GoCV package. I just wanted to reproduce a simple Python example, but in Go. I even used the same parameters (except for hiThresh and finalThreshold, where I used the default values). Somehow I cannot get it working with GoCV; it only finds one centered result.
Here is my code:
package main

import (
	"encoding/json"
	"fmt"
	"image"
	"image/color"

	"gocv.io/x/gocv"
)

func main() {
	// define default hog descriptor
	hog := gocv.NewHOGDescriptor()
	defer hog.Close()
	hog.SetSVMDetector(gocv.HOGDefaultPeopleDetector())
	// color for the rect when faces detected
	blue := color.RGBA{0, 0, 255, 0}
	// read image
	img := gocv.IMRead("images/person_010.bmp", 0)
	// resize image
	fact := float64(400) / float64(img.Cols())
	newY := float64(img.Rows()) * fact
	gocv.Resize(img, img, image.Point{X: 400, Y: int(newY)}, 0, 0, 1)
	// detect people in image
	rects := hog.DetectMultiScaleWithParams(img, 0, image.Point{X: 8, Y: 8}, image.Point{X: 16, Y: 16}, 1.05, 2, false)
	// print found points
	printStruct(rects)
	// draw a rectangle around each face on the original image,
	// along with text identifying it as "Human"
	for _, r := range rects {
		gocv.Rectangle(img, r, blue, 3)
		size := gocv.GetTextSize("Human", gocv.FontHersheyPlain, 1.2, 2)
		pt := image.Pt(r.Min.X+(r.Min.X/2)-(size.X/2), r.Min.Y-2)
		gocv.PutText(img, "Human", pt, gocv.FontHersheyPlain, 1.2, blue, 2)
	}
	if ok := gocv.IMWrite("loool.jpg", img); !ok {
		fmt.Println("Error")
	}
}

func printStruct(i interface{}) {
	b, err := json.Marshal(i)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(string(b))
}
Here is the input image:
And here is the result:
Actually, I've just run the code you posted with the image you provided, and I got a different resulting image:
I'm running:
gocv version: 0.10.0
opencv lib version: 3.4.1

Copying transparent (32bit alpha) bitmap from TImageList in ComboBox DrawItem event

I am customizing the OnDrawItem event to draw icons next to item names.
Here is my code so far for the OnDrawItem event:
void __fastcall TForm1::ComboBox1DrawItem(TWinControl *Control, int Index,
	TRect &Rect, TOwnerDrawState State)
{
	TComboBox* CB = static_cast<TComboBox*>(Control);
	CB->Canvas->FillRect(Rect);
	boost::scoped_ptr<Graphics::TBitmap> bitmap(new Graphics::TBitmap());
	bitmap->PixelFormat = pf32bit;
	bitmap->AlphaFormat = afPremultiplied;
	ImageList1->GetBitmap(Index, bitmap.get());
	bitmap->AlphaFormat = afPremultiplied;
	if (bitmap->Canvas->Handle)
	{
		// structure for alpha blending
		BLENDFUNCTION bf;
		bf.BlendOp = AC_SRC_OVER;
		bf.BlendFlags = 0;
		bf.SourceConstantAlpha = 0xFF; // 0x00 (transparent) through 0xFF (opaque)
		bf.AlphaFormat = AC_SRC_ALPHA; // Use bitmap alpha
		::AlphaBlend(CB->Canvas->Handle, // handle to destination DC
			Rect.Left + 2,           // x-coord of upper-left corner
			Rect.Top,                // y-coord of upper-left corner
			bitmap->Width,           // destination width
			bitmap->Height,          // destination height
			bitmap->Canvas->Handle,  // handle to source DC
			0,                       // x-coord of upper-left corner
			0,                       // y-coord of upper-left corner
			bitmap->Width,           // source width
			bitmap->Height,          // source height
			bf                       // alpha-blending function
		);
	}
	Rect = Bounds(Rect.Left + 20 + 2, Rect.Top, Rect.Right - Rect.Left, Rect.Bottom - Rect.Top);
	DrawTextW(CB->Canvas->Handle, CB->Items->Strings[Index].c_str(), -1, &Rect, DT_VCENTER | DT_SINGLELINE | DT_END_ELLIPSIS);
}
The problem, of course, is getting the transparent images from TImageList1 copied to a transparent TBitmap while preserving 32-bit alpha transparency/semi-transparency. Currently the icon comes out with a white background in the resulting TBitmap.
Just to be clear, the TImageList ColorDepth is set to cd32bit with DrawingStyle = dsTransparent before loading images into it, and the images in it are transparent; no problems there.
What is the trick to solve this?
UPDATE AND MY FINAL SOLUTION
Based on a reply here, here is my final working code for anyone else who might need it in the future. This of course is just template code which you may want to customize further to your own needs.
void __fastcall TForm1::ComboBox1DrawItem(TWinControl *Control, int Index, TRect &Rect, TOwnerDrawState State)
{
	if (Index >= 0)
	{
		TComboBox* CB = static_cast<TComboBox*>(Control);
		CB->Canvas->FillRect(Rect);
		// Note - ImageList1 already has DrawingStyle set to dsTransparent
		ImageList1->Draw(CB->Canvas, Rect.Left + 2, Rect.Top, 0);
		Rect = Bounds(Rect.Left + ImageList1->Width + 2 + 2, Rect.Top, Rect.Right - Rect.Left - ImageList1->Width - 2, Rect.Bottom - Rect.Top);
		DrawTextW(CB->Canvas->Handle, CB->Items->Strings[Index].c_str(), -1, &Rect, DT_VCENTER | DT_SINGLELINE | DT_END_ELLIPSIS);
	}
}
You don't need to try to grab the original bitmap from the image list, because the image list itself knows how to draw honoring its transparency information; you can use its Draw method for that.
Otherwise, an answer here suggests that setting AlphaFormat to afIgnored before calling GetBitmap should preserve transparency.

How to discover the area chart data if we only have the image?

The area chart (image) has a few data series, which are charted in different colors. We know the image size and the coordinates of each label on the x-axis. Is it possible to recover the y-axis values of each series by image recognition? Can anybody shed some light?
If you know the y-axis scale, it should be possible.
To screen-scrape, you could first filter your image with a color filter for each of the series.
The second step would be to gather the coordinates of all remaining pixels in your temporary image and transform them to the needed scale.
Given
a pixel at coordinates x, y,
the offset of the chart's origin in image pixels xoffset, yoffset,
the scale of your chart axes xscale, yscale,
you could calculate the data for this pixel (pseudocode):
pixelData.x := (x - xoffset) * xscale
pixelData.y := (y - yoffset) * yscale
Afterwards, do some interpolation if your series line is more than one pixel wide (for example, take the average value of all pixels in a single column).
Update 1: pseudocode for a naive color filter that keeps only the red series
//set up desired color levels to filter out
redmin := 240;
redmax := 255;
bluemin := 0;
bluemax := 0;
greenmin := 0;
greenmax := 0;
//load source bitmap
myBitmap := LoadBitmap("Chartfile.bmp");
//loop over bitmap pixels
for iX := 0 to myBitmap.width-1 do
	for iY := 0 to myBitmap.height-1 do
	begin
		myColorVal := myBitmap.GetPixels(iX, iY);
		//if the pixel color is inside your target color range, store it
		if ((myColorVal.r >= redmin) and (myColorVal.r <= redmax)) and
		   ((myColorVal.g >= greenmin) and (myColorVal.g <= greenmax)) and
		   ((myColorVal.b >= bluemin) and (myColorVal.b <= bluemax)) then
			storeDataValue(iX, iY); //performs the value scaling operation mentioned above
	end;
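For completeness, here is what the same filter-and-rescale idea can look like in Go, using only the standard library image packages. This is a sketch: the red thresholds, the origin offset, the axis scales, and the file name are placeholders that depend on the actual chart, and note that image y grows downwards, so it is flipped before scaling.
package main

import (
	"fmt"
	"image"
	_ "image/png" // register the PNG decoder for image.Decode
	"os"
)

type DataPoint struct{ X, Y float64 }

// extractSeries keeps every pixel that looks red and converts its image
// coordinates into chart coordinates.
func extractSeries(img image.Image, xoffset, yoffset int, xscale, yscale float64) []DataPoint {
	var points []DataPoint
	b := img.Bounds()
	for y := b.Min.Y; y < b.Max.Y; y++ {
		for x := b.Min.X; x < b.Max.X; x++ {
			r, g, bl, _ := img.At(x, y).RGBA()
			r, g, bl = r>>8, g>>8, bl>>8 // scale 16-bit channels back to 0..255
			if r >= 240 && g <= 20 && bl <= 20 { // naive "is it red?" test
				points = append(points, DataPoint{
					X: float64(x-xoffset) * xscale,
					Y: float64(yoffset-y) * yscale, // flip: image y grows downwards
				})
			}
		}
	}
	return points
}

func main() {
	f, err := os.Open("chart.png")
	if err != nil {
		panic(err)
	}
	defer f.Close()
	img, _, err := image.Decode(f)
	if err != nil {
		panic(err)
	}
	// placeholder origin and scale; measure these from the real chart
	fmt.Println(extractSeries(img, 40, 460, 0.5, 2.0))
}
Averaging the points that share the same X, as suggested above, then gives one data value per column.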
