I'm calling a function that returns an integer representing a bitfield of 16 binary inputs; each of the colors can be either on or off.
I'm trying to create a function that gets the changes between the old state and the new state:
function getChanges(oldColors, newColors)
    local sampleOutput = {white = "", orange = "added", magenta = "removed" .....}
    return sampleOutput
end
I've tried subtracting oldColors from newColors, and newColors from oldColors, but this seems to result in chaos when more than one value changes.
This is to detect rising/falling edges from multiple inputs.
Edit: there appears to be a subset of the Lua bit API available.
From the ComputerCraft wiki:
Color             Decimal  Hex     Binary
colors.white      1        0x1     0000000000000001
colors.orange     2        0x2     0000000000000010
colors.magenta    4        0x4     0000000000000100
colors.lightBlue  8        0x8     0000000000001000
colors.yellow     16       0x10    0000000000010000
colors.lime       32       0x20    0000000000100000
colors.pink       64       0x40    0000000001000000
colors.gray       128      0x80    0000000010000000
colors.lightGray  256      0x100   0000000100000000
colors.cyan       512      0x200   0000001000000000
colors.purple     1024     0x400   0000010000000000
colors.blue       2048     0x800   0000100000000000
colors.brown      4096     0x1000  0001000000000000
colors.green      8192     0x2000  0010000000000000
colors.red        16384    0x4000  0100000000000000
colors.black      32768    0x8000  1000000000000000
function getChanges(oldColors, newColors)
    -- Bits set in the new state but not in the old one (rising edges)
    local added = bit.band(newColors, bit.bnot(oldColors))
    -- Bits set in the old state but not in the new one (falling edges)
    local removed = bit.band(oldColors, bit.bnot(newColors))
    local color_names = {
        white = 1,
        orange = 2,
        magenta = 4,
        lightBlue = 8,
        yellow = 16,
        lime = 32,
        pink = 64,
        gray = 128,
        lightGray = 256,
        cyan = 512,
        purple = 1024,
        blue = 2048,
        brown = 4096,
        green = 8192,
        red = 16384,
        black = 32768
    }
    local diff = {}
    for cn, mask in pairs(color_names) do
        diff[cn] = bit.band(added, mask) ~= 0 and 'added'
            or bit.band(removed, mask) ~= 0 and 'removed' or ''
    end
    return diff
end
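For readers who want to check the bit logic outside ComputerCraft, here is a minimal Python sketch of the same idea. The names COLOR_MASKS and get_changes are made up for this illustration; the masks are copied from the wiki table, and the 0xFFFF mask is only there to keep the result within 16 bits.

```python
# Masks copied from the ComputerCraft colors table (16 one-bit flags).
COLOR_MASKS = {
    "white": 0x1, "orange": 0x2, "magenta": 0x4, "lightBlue": 0x8,
    "yellow": 0x10, "lime": 0x20, "pink": 0x40, "gray": 0x80,
    "lightGray": 0x100, "cyan": 0x200, "purple": 0x400, "blue": 0x800,
    "brown": 0x1000, "green": 0x2000, "red": 0x4000, "black": 0x8000,
}


def get_changes(old_colors, new_colors):
    """Return {color: 'added' | 'removed' | ''} for two 16-bit states."""
    # Bits on now that were off before (rising edges).
    added = new_colors & ~old_colors & 0xFFFF
    # Bits off now that were on before (falling edges).
    removed = old_colors & ~new_colors & 0xFFFF
    diff = {}
    for name, mask in COLOR_MASKS.items():
        if added & mask:
            diff[name] = "added"
        elif removed & mask:
            diff[name] = "removed"
        else:
            diff[name] = ""
    return diff


# white + magenta were on; now white + orange are on.
changes = get_changes(0b0101, 0b0011)
print(changes["white"], changes["orange"], changes["magenta"])
```

Subtraction fails here because borrows and carries spill into neighbouring bits as soon as more than one input changes; the AND/NOT combination treats every bit independently.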
I am new to OpenCV with CUDA, so I have been testing the simplest case, loading a model on the GPU rather than the CPU, to see how fast the GPU is, and I am horrified at the result I get.
GPU                           CPU
21.913758993148804 seconds    3.0586464405059814 seconds
22.379303455352783 seconds    3.1384341716766357 seconds
21.500431060791016 seconds    2.9400241374969482 seconds
21.292986392974854 seconds    3.3738017082214355 seconds
20.88358211517334 seconds     3.388749599456787 seconds
I will give my code snippet in case I am doing something wrong that causes the GPU time to spike so high.
def loadYolo():
    net = cv2.dnn.readNet("yolov4.weights", "yolov4.cfg")
    classes = []
    with open("coco.names", "r") as f:
        classes = [line.strip() for line in f.readlines()]
    layer_names = net.getLayerNames()
    output_layers = [layer_names[i[0] - 1] for i in net.getUnconnectedOutLayers()]
    return net, classes, layer_names, output_layers

def image(data_image):
    sbuf = StringIO()
    b = io.BytesIO(base64.b64decode(data_image))
    if str(data_image) == 'data:,':
        return  # assumed: skip empty frames (the branch body is missing from the post)
    start_time = time.time()  # assumed: start_time is not defined in the post
    # Assumed call site: the post does not show where loadYolo() is called
    net, classes, layer_names, output_layers = loadYolo()
    pimg = Image.open(b)
    frame = cv2.cvtColor(np.array(pimg), cv2.COLOR_RGB2BGR)
    frame = resize(frame, width=700)
    frame = cv2.flip(frame, 1)
    height, width, channels = frame.shape
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416),
                                 swapRB=True, crop=False)
    net.setInput(blob)  # required before forward(); missing from the post
    outs = net.forward(output_layers)
    print("--- %s seconds ---" % (time.time() - start_time))
    class_ids = []
    confidences = []
    boxes = []
    for out in outs:
        for detection in out:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]
            if confidence > 0.5:
                # Object detected
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)
                # Rectangle coordinates
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)
                boxes.append([x, y, w, h])
                # Needed by NMSBoxes and the labels below; missing from the post
                confidences.append(float(confidence))
                class_ids.append(class_id)
    indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
    font = cv2.FONT_HERSHEY_PLAIN  # assumed: font is not defined in the post
    colors = np.random.uniform(0, 255, size=(len(classes), 3))
    for i in range(len(boxes)):
        if i in indexes:
            x, y, w, h = boxes[i]
            label = str(classes[class_ids[i]])
            color = colors[class_ids[i]]
            cv2.rectangle(frame, (x, y), (x + w, y + h), color, 2)
            cv2.putText(frame, label, (x, y + 30), font, 1, color, 2)
    imgencode = cv2.imencode('.jpg', frame)[1]
    stringData = base64.b64encode(imgencode).decode('utf-8')
    b64_src = 'data:image/jpg;base64,'
    stringData = b64_src + stringData
    emit('response_back', stringData)
My GPU is an Nvidia 1050 Ti and my CPU is a 9th-gen i5, in case someone needs the specifications. Can someone please enlighten me, as I am super confused right now? Thank you very much.
EDIT 1: I tried to use cv2.dnn.DNN_TARGET_CUDA instead of cv2.dnn.DNN_TARGET_CUDA_FP16, but the time is still terrible compared to the CPU. Below is the GPU result:
--- 10.91195559501648 seconds ---
--- 11.344025135040283 seconds ---
--- 11.754926204681396 seconds ---
--- 12.779674530029297 seconds ---
Below is the CPU result:
--- 4.780993223190308 seconds ---
--- 4.910650253295898 seconds ---
--- 4.990436553955078 seconds ---
--- 5.246175050735474 seconds ---
It is still slower than the CPU.
EDIT 2: OpenCV is 4.5.0, CUDA is 11.1 and cuDNN is 8.0.1.
You should definitely only load YOLO once. Recreating it for every image that comes through the socket is slow for both CPU and GPU, but GPU takes longer to initially load which is why you're seeing it run slower than CPU.
I don't understand what you mean by using an LRU cache for your YOLO model. Without seeing the rest of your code structure I can't make any real suggestions, but can you try at least temporarily putting the network into the global space just to see if it runs faster? (remove the function altogether and put its body in the global space)
Something like this:
# Load the network once, at import time, instead of once per image
net = cv2.dnn.readNet("yolov4.weights", "yolov4.cfg")
classes = []
with open("coco.names", "r") as f:
    classes = [line.strip() for line in f.readlines()]
layer_names = net.getLayerNames()
output_layers = [layer_names[i[0] - 1] for i in net.getUnconnectedOutLayers()]

def image(data_image):
    sbuf = StringIO()
    b = io.BytesIO(base64.b64decode(data_image))
    if str(data_image) == 'data:,':
        return  # assumed: skip empty frames (the branch body is missing from the post)
    pimg = Image.open(b)
    frame = cv2.cvtColor(np.array(pimg), cv2.COLOR_RGB2BGR)
    frame = resize(frame, width=700)
    frame = cv2.flip(frame, 1)
    height, width, channels = frame.shape
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416),
                                 swapRB=True, crop=False)
    start_time = time.time()  # assumed: start_time is not defined in the post
    net.setInput(blob)  # required before forward(); missing from the post
    outs = net.forward(output_layers)
    print("--- %s seconds ---" % (time.time() - start_time))
    class_ids = []
    confidences = []
    boxes = []
    for out in outs:
        for detection in out:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]
            if confidence > 0.5:
                # Object detected
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)
                # Rectangle coordinates
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)
                boxes.append([x, y, w, h])
                # Needed by NMSBoxes and the labels below; missing from the post
                confidences.append(float(confidence))
                class_ids.append(class_id)
    indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
    font = cv2.FONT_HERSHEY_PLAIN  # assumed: font is not defined in the post
    colors = np.random.uniform(0, 255, size=(len(classes), 3))
    for i in range(len(boxes)):
        if i in indexes:
            x, y, w, h = boxes[i]
            label = str(classes[class_ids[i]])
            color = colors[class_ids[i]]
            cv2.rectangle(frame, (x, y), (x + w, y + h), color, 2)
            cv2.putText(frame, label, (x, y + 30), font, 1, color, 2)
    imgencode = cv2.imencode('.jpg', frame)[1]
    stringData = base64.b64encode(imgencode).decode('utf-8')
    b64_src = 'data:image/jpg;base64,'
    stringData = b64_src + stringData
    emit('response_back', stringData)
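If moving the network into the global space does not fit the structure of the app, a cached loader gives the same "load once" behaviour. The sketch below is only an illustration of that pattern: expensive_load is a hypothetical stand-in for cv2.dnn.readNet plus any backend setup, and get_net and handle_image are invented names, not part of OpenCV or of the code above.

```python
import functools
import time

LOAD_COUNT = 0  # counts how many times the slow loader really runs


def expensive_load():
    """Hypothetical stand-in for cv2.dnn.readNet(...) and backend setup."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    time.sleep(0.05)  # pretend that reading the weights is slow
    return {"name": "fake-net"}


@functools.lru_cache(maxsize=1)
def get_net():
    """Load the model on the first call and reuse it afterwards."""
    return expensive_load()


def handle_image(data):
    """Per-image handler: fetches the cached net instead of reloading it."""
    net = get_net()
    return net["name"], len(data)


for payload in (b"frame-1", b"frame-2", b"frame-3"):
    handle_image(payload)
print("loads performed:", LOAD_COUNT)
```

With the real network, only the first request would pay the loading cost; every later request reuses the same object.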
From the previous two answers I managed to find the solution. Changing the target from cv2.dnn.DNN_TARGET_CUDA_FP16 to cv2.dnn.DNN_TARGET_CUDA roughly doubled the GPU speed, because my GPU type is not compatible with FP16; this is thanks to Amir Karami. And although Ian Chu's answer did not solve my problem, it gave me the basis to force all the images to use only one net instance. This lowered the processing time significantly, from about 10 seconds per image to 0.03-0.04 seconds, surpassing the CPU speed many times over. I did not accept either answer because neither fully solved my problem, but both were a strong basis for my solution, so I upvoted them. I am leaving my answer here in case anyone else runs into this problem.
DNN_TARGET_CUDA_FP16 refers to 16-bit floating point. Since your GPU is a 1050 Ti, it seems it does not work well with FP16. You can check it from here and your compute capability from here.
I think you should change the target from cv2.dnn.DNN_TARGET_CUDA_FP16 to cv2.dnn.DNN_TARGET_CUDA.
import cv2

def clear(img):
    back = cv2.imread("back.png", cv2.IMREAD_GRAYSCALE)
    img = cv2.bitwise_xor(img, back)
    ret, img = cv2.threshold(img, 120, 255, cv2.THRESH_BINARY_INV)
    return img

def threshold(img):
    ret, img = cv2.threshold(img, 120, 255, cv2.THRESH_BINARY_INV)
    img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    ret, img = cv2.threshold(img, 248, 255, cv2.THRESH_BINARY)
    return img

def fomatImage(img):
    img = threshold(img)
    img = clear(img)
    return img

img = fomatImage(cv2.imread("1566135246468.png",cv2.IMREAD_COLOR))
This is my code. But when I tried to identify it with tesseract-ocr, I got a warning.
Warning: Invalid resolution 0 dpi. Using 70 instead.
How should I set up dpi?
AFAIK, OpenCV doesn't set the dpi of PNG files it writes, so you are looking at work-arounds. Here are some ideas...
Method 1 - Use PIL/Pillow instead of OpenCV
PIL/Pillow can write dpi information into PNG files. So you would:
Step 1 - Convert your BGR OpenCV image into RGB to match PIL's channel ordering
from PIL import Image
RGBimage = cv2.cvtColor(BGRimage, cv2.COLOR_BGR2RGB)
Step 2 - Convert the OpenCV Numpy array into a PIL Image
PILimage = Image.fromarray(RGBimage)
Step 3 - Write with PIL
PILimage.save('result.png', dpi=(72,72))
As Fred mentions in the comments, you could equally use Python Wand in much the same way.
Method 2 - Write with OpenCV but modify afterwards with some tool
You could use Python's subprocess module to shell out to, say, ImageMagick and set the dpi like this:
magick OpenCVImage.png -set units pixelspercentimeter -density 28.3 result.png
All you need to know is that PNG uses metric (dots per centimetre) rather than imperial (dots per inch) and there are 2.54cm in an inch, so 72 dpi becomes 28.3 dots per cm.
If your ImageMagick version is older than v7, replace magick with convert.
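The unit arithmetic is easy to check in a few lines of Python. This is just a worked example of the conversion; the helper names are made up here. The second helper uses the same int(dpi/0.0254) expression that Method 3 below uses for the pHYs chunk, which stores pixels per metre.

```python
def dpi_to_dots_per_cm(dpi):
    """There are 2.54 cm in an inch, so divide dots per inch by 2.54."""
    return dpi / 2.54


def dpi_to_pixels_per_metre(dpi):
    """One inch is 0.0254 m; truncate to an integer as PNG expects one."""
    return int(dpi / 0.0254)


print(round(dpi_to_dots_per_cm(72), 1))   # the ImageMagick -density value
print(dpi_to_pixels_per_metre(72))        # the value stored in pHYs
print(dpi_to_pixels_per_metre(300))
```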
Method 3 - Write with OpenCV and insert dpi yourself
You could write your file to memory using OpenCV's imencode(). Then search in the file for the IDAT (image data) chunk - which is the one containing the image pixels and insert a pHYs chunk before that which sets the density. Then write to disk.
It's not that hard actually - it's just 9 bytes, see here and also look at pngcheck output at end of answer.
This code is not production tested but seems to work pretty well for me:
#!/usr/bin/env python3
import struct
import numpy as np
import cv2
import zlib

def writePNGwithdpi(im, filename, dpi=(72,72)):
    """Save the image as PNG with embedded dpi"""
    # Encode as PNG into memory
    retval, buffer = cv2.imencode(".png", im)
    s = buffer.tobytes()  # tostring() is a deprecated alias of tobytes()
    # Find start of IDAT chunk (its 4-byte length field precedes the name)
    IDAToffset = s.find(b'IDAT') - 4
    # Create our lovely new pHYs chunk - https://www.w3.org/TR/2003/REC-PNG-20031110/#11pHYs
    pHYs = b'pHYs' + struct.pack('!IIc',int(dpi[0]/0.0254),int(dpi[1]/0.0254),b"\x01" )
    pHYs = struct.pack('!I',9) + pHYs + struct.pack('!I',zlib.crc32(pHYs))
    # Open output filename and write...
    # ... stuff preceding IDAT as created by OpenCV
    # ... new pHYs as created by us above
    # ... IDAT onwards as created by OpenCV
    with open(filename, "wb") as out:
        out.write(s[:IDAToffset])
        out.write(pHYs)
        out.write(s[IDAToffset:])

# main
# Load sample image
im = cv2.imread('lena.png')
# Save at specific dpi
writePNGwithdpi(im, "result.png", (32,300))
Whichever method you use, you can use pngcheck -vv image.png to check what you have done:
pngcheck -vv a.png
Sample Output
File: a.png (306 bytes)
chunk IHDR at offset 0x0000c, length 13
100 x 100 image, 1-bit palette, non-interlaced
chunk gAMA at offset 0x00025, length 4: 0.45455
chunk cHRM at offset 0x00035, length 32
White x = 0.3127 y = 0.329, Red x = 0.64 y = 0.33
Green x = 0.3 y = 0.6, Blue x = 0.15 y = 0.06
chunk PLTE at offset 0x00061, length 6: 2 palette entries
chunk bKGD at offset 0x00073, length 1
index = 1
chunk pHYs at offset 0x00080, length 9: 255x255 pixels/unit (1:1). <-- THIS SETS THE DENSITY
chunk tIME at offset 0x00095, length 7: 19 Aug 2019 10:15:00 UTC
chunk IDAT at offset 0x000a8, length 20
zlib: deflated, 2K window, maximum compression
row filters (0 none, 1 sub, 2 up, 3 avg, 4 paeth):
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
(100 out of 100)
chunk tEXt at offset 0x000c8, length 37, keyword: date:create
chunk tEXt at offset 0x000f9, length 37, keyword: date:modify
chunk IEND at offset 0x0012a, length 0
No errors detected in a.png (11 chunks, 76.5% compression).
While I am editing PNG chunks, I also managed to set a tIME chunk and a tEXt chunk with the Author. They go like this:
# Create a new tIME chunk - https://www.w3.org/TR/2003/REC-PNG-20031110/#11tIME
year, month, day, hour, min, sec = 2020, 12, 25, 12, 0, 0 # Midday Christmas day 2020
tIME = b'tIME' + struct.pack('!HBBBBB',year,month,day,hour,min,sec)
tIME = struct.pack('!I',7) + tIME + struct.pack('!I',zlib.crc32(tIME))
# Create a new tEXt chunk - https://www.w3.org/TR/2003/REC-PNG-20031110/#11tEXt
Author = "Author\x00Sir Mark The Great"
tEXt = b'tEXt' + bytes(Author.encode('ascii'))
tEXt = struct.pack('!I',len(Author)) + tEXt + struct.pack('!I',zlib.crc32(tEXt))
# Open output filename and write...
# ... stuff preceding IDAT as created by OpenCV
# ... new pHYs as created by us above
# ... new tIME as created by us above
# ... new tEXt as created by us above
# ... IDAT onwards as created by OpenCV
with open(filename, "wb") as out:
    out.write(s[:IDAToffset])
    out.write(pHYs)
    out.write(tIME)
    out.write(tEXt)
    out.write(s[IDAToffset:])
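All three chunks follow the same layout (4-byte length, 4-byte type, data, CRC over type plus data), so the pattern can be factored into one helper. The following is a rough, self-contained sketch that needs no OpenCV: make_chunk, iter_chunks, tiny_png and insert_before_idat are names invented for this example, and the 1x1 grayscale PNG is built by hand only so the result can be inspected.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, data):
    """Build one PNG chunk: length, type, data, CRC32 of type + data."""
    body = chunk_type + data
    return struct.pack("!I", len(data)) + body + struct.pack("!I", zlib.crc32(body))


def iter_chunks(png):
    """Yield (type, data) for every chunk after the 8-byte signature."""
    pos = len(PNG_SIGNATURE)
    while pos < len(png):
        (length,) = struct.unpack("!I", png[pos:pos + 4])
        yield png[pos + 4:pos + 8], png[pos + 8:pos + 8 + length]
        pos += 12 + length  # length field + type + data + CRC


def tiny_png():
    """Hand-built 1x1 8-bit grayscale PNG, standing in for imencode output."""
    ihdr = struct.pack("!IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte 0, then one black pixel
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))


def insert_before_idat(png, chunk):
    """Insert a ready-made chunk just before the first IDAT chunk."""
    offset = png.find(b"IDAT") - 4  # step back over IDAT's length field
    return png[:offset] + chunk + png[offset:]


# 2834 pixels per metre is 72 dpi; the final byte 1 means "unit is the metre".
phys = make_chunk(b"pHYs", struct.pack("!IIB", 2834, 2834, 1))
result = insert_before_idat(tiny_png(), phys)
print([ctype.decode() for ctype, _ in iter_chunks(result)])
```

The same make_chunk call would build the tIME and tEXt chunks from their packed data, which avoids repeating the length and CRC bookkeeping for each one.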
Keywords: OpenCV, PIL, Pillow, dpi, density, imwrite, PNG, chunks, pHYs chunk, Python, image, image-processing, tEXt chunk, tIME chunk, author, comment
I'm quite puzzled about the endianness on an ARM device. The device I'm testing uses little endian.
Say there's code here which swaps elements in an array:
uint32_t* srcPtr = (uint32_t*)src->get();
uint8_t* dstPtr = dst->get();
dstPtr[0] = ((*srcPtr) >> 16) & 0xFF;
dstPtr[1] = ((*srcPtr) >> 8) & 0xFF;
dstPtr[2] = (*srcPtr) & 0xFF;
dstPtr[3] = ((*srcPtr) >> 24);
My understanding is that if srcPtr points to {0, 1, 2, 3}, the output in dstPtr should be {1, 2, 3, 0}.
But the actual output in dstPtr is {2, 1, 0, 3}.
Does this mean that the bytes at srcPtr are read in the order 3, 2, 1, 0?
Can someone please help me? :)
Is this due to little endian?
Say at address 0x100 I have the values 0x00, 0x11, 0x22, 0x33: 0x00 is at 0x100, 0x11 at 0x101, and so on. If I point at address 0x100 with a 32-bit unsigned pointer, then I get the value 0x33221100. That is true for ARM (little endian), true for x86 (little endian), etc.
So now if I take x = 0x33221100, then (x>>16)&0xFF is 0x22, (x>>8)&0xFF is 0x11, x&0xFF is 0x00, and (x>>24)&0xFF is 0x33, which gives {2,1,0,3}.
Where is your confusion? Is it the conversion from 0x00,0x11,0x22,0x33 to 0x33221100? Little endian means least significant byte first, so the lowest address (0x100) holds the least significant byte (0x00, bits 0 to 7 of the number), 0x101 holds bits 8 to 15, 0x102 holds bits 16 to 23, and 0x103 holds bits 24 to 31 of the 32-bit value.
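The same experiment can be reproduced on any machine with Python's struct module, which lets you choose the byte order explicitly. This is only an illustration of the arithmetic above; the function name shuffle is invented for the example, and the four shifts are copied from the C++ code in the question.

```python
import struct


def shuffle(raw, byte_order):
    """Read 4 bytes as one 32-bit value, then extract bytes as the C++ does.

    byte_order is '<' for little endian or '>' for big endian.
    """
    (value,) = struct.unpack(byte_order + "I", raw)
    return [
        (value >> 16) & 0xFF,
        (value >> 8) & 0xFF,
        value & 0xFF,
        (value >> 24) & 0xFF,
    ]


raw = bytes([0, 1, 2, 3])  # memory contents at increasing addresses

print(shuffle(raw, "<"))   # little endian, as on the ARM device
print(shuffle(raw, ">"))   # big endian, for comparison
```

The little-endian read reproduces the observed {2, 1, 0, 3}, while the big-endian read gives the {1, 2, 3, 0} that the question expected, so the expectation implicitly assumed big-endian byte order.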
I'm trying to set up a scene for a VeraLite. I want the LEDs to change color depending on the temperature. In the following LUUP code, device ID 12 is the temperature sensor (CurrentTemperature); R, G, B and W are device IDs 18, 17, 19 and 20 respectively.
I would like to know why my code doesn't work.
Thank you so much for your help.
local lul_temp = luup.variable_get("urn:upnp-org:serviceId:TemperatureSensor1","CurrentTemperature", 12)
local R = 18 -- RGB Red, device ID
local G = 17 -- RGB Green, device ID
local B = 19 -- RGB Blue, device ID
local W = 20 -- RGB White, device ID
local Colours = {
    [32] = {Temp=32, R=32, G=32, B=32, W=0, Name='32'},
    [33] = {Temp=33, R=33, G=33, B=33, W=0, Name='33'},
    [34] = {Temp=34, R=34, G=34, B=34, W=0, Name='34'},
    [72] = {Temp=72, R=72, G=72, B=72, W=0, Name='72'}
}
local temp = tonumber(lul_temp)
local v = temp and Colours[temp] -- look up the table value using index; nil if there is no entry
if v and temp > 31 then
    -- The argument table must be closed with } before the device ID
    luup.call_action("urn:upnp-org:serviceId:Dimming1", "SetLoadLevelTarget", {newLoadlevelTarget = v.R}, R) -- RGB Red
    luup.call_action("urn:upnp-org:serviceId:Dimming1", "SetLoadLevelTarget", {newLoadlevelTarget = v.G}, G) -- RGB Green
    luup.call_action("urn:upnp-org:serviceId:Dimming1", "SetLoadLevelTarget", {newLoadlevelTarget = v.B}, B) -- RGB Blue
    luup.call_action("urn:upnp-org:serviceId:Dimming1", "SetLoadLevelTarget", {newLoadlevelTarget = v.W}, W) -- RGB White
end
I am using an image, the details of which I got using imfinfo in matlab are as follows:
Filename: 'dog.jpg'
FileModDate: '25-Mar-2011 15:54:00'
FileSize: 8491
Format: 'jpg'
FormatVersion: ''
Width: 194
Height: 206
BitDepth: 24
ColorType: 'truecolor'
FormatSignature: ''
NumberOfSamples: 3
CodingMethod: 'Huffman'
CodingProcess: 'Sequential'
Comment: {}
NewSubFileType: 0
BitsPerSample: [8 8 8]
PhotometricInterpretation: 'RGB'
ImageDescription: [1x13 char]
StripOffsets: 154
SamplesPerPixel: 3
RowsPerStrip: 206
StripByteCounts: 119892
It shows the number of channels as 3 (NumberOfSamples: 3), but when I query the number of channels in OpenCV using the following code, I get No. of Channels = 1:
Mat img = imread("dog.jpg", 0);
printf("No. of Channels = %d\n", img.channels());
Why is that? Please explain.
As @berak commented, by passing 0 as the second parameter of imread(), you are loading the image as grayscale. Pass a negative value (<0) to return the loaded image as is (with alpha channel, if present), or a positive value (>0) to return a 3-channel color image.
Mat img = imread("dog.jpg", -1); // <0 Return the loaded image as is