OpenCV RunningAvg implementation

I am writing a small script (in Python) that generates and updates a running average of a camera feed. When I call cv.RunningAvg it raises:
cv2.error: func != 0
Where am I stumbling in implementing cv.RunningAvg? The script follows:
import cv

feed = cv.CaptureFromCAM(0)
frame = cv.QueryFrame(feed)
moving_average = cv.QueryFrame(feed)
cv.NamedWindow('live', cv.CV_WINDOW_AUTOSIZE)

def loop():
    frame = cv.QueryFrame(feed)
    cv.ShowImage('live', frame)
    c = cv.WaitKey(10)
    cv.RunningAvg(frame, moving_average, 0.020, None)

while True:
    loop()

I am not sure about the exact error, but check out the documentation for cv.RunningAvg: it says the destination should be 32-bit or 64-bit floating point.
So I made a small correction to your code and it works. I created a 32-bit floating-point image to store the running average values, and another 8-bit image so that I can display the running average:
import cv2.cv as cv

feed = cv.CaptureFromCAM(0)
frame = cv.QueryFrame(feed)
moving_average = cv.CreateImage(cv.GetSize(frame), 32, 3)  # image to store running avg
avg_show = cv.CreateImage(cv.GetSize(frame), 8, 3)         # image to show running avg

def loop():
    frame = cv.QueryFrame(feed)
    c = cv.WaitKey(10)
    cv.RunningAvg(frame, moving_average, 0.1, None)
    cv.ConvertScaleAbs(moving_average, avg_show)  # converting back to 8-bit to show
    cv.ShowImage('live', frame)
    cv.ShowImage('avg', avg_show)

while True:
    loop()
cv.DestroyAllWindows()
Now see the result:
At a particular instant, I saved a frame and its corresponding running-average frame.
Original frame:
You can see how the obstacle (my hand) blocks the objects behind it.
Now the running-average frame:
It has almost removed my hand and shows the objects in the background.
That is why it is a good tool for background subtraction.
One more example, from a typical traffic video:
You can see more details and samples here: http://opencvpython.blogspot.com/2012/07/background-extraction-using-running.html
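For reference, the cv module used above is the legacy binding that was removed in later OpenCV releases. A minimal sketch of the same idea with the modern cv2 API, assuming a camera at index 0, could look like this (cv2.accumulateWeighted is the modern counterpart of cv.RunningAvg):
import cv2
import numpy as np

cap = cv2.VideoCapture(0)          # assumption: camera at index 0
ret, frame = cap.read()
moving_average = np.float32(frame)  # accumulator must be 32-bit (or 64-bit) float

while True:
    ret, frame = cap.read()
    if not ret:
        break
    cv2.accumulateWeighted(frame, moving_average, 0.1)  # running average, alpha = 0.1
    avg_show = cv2.convertScaleAbs(moving_average)      # back to 8-bit for display
    cv2.imshow('live', frame)
    cv2.imshow('avg', avg_show)
    if cv2.waitKey(10) == 27:  # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()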

Related

Texture transformation

I am working on the eigen-transformation of texture to detect objects in an image. This work was published in ACCV 2006, page 71; the full PDF is available as chapter 3 of https://www.diva-portal.org/smash/get/diva2:275069/FULLTEXT01.pdf. I am not able to follow what comes after getting the texture descriptors.
I am working on the attached image. The image size is 954x1440.
I took image patches of 32x32 and for every patch calculated the eigenvalues to get the texture descriptor. What to do with these texture descriptors afterwards is what I am not able to follow.
Any help to unblock me would be really appreciated. The code for calculating the descriptors looks like this:
import numpy as np

w = 32  # patch size
descriptors = np.zeros((gray.shape[0]//w, gray.shape[1]//w))
for i in range(gray.shape[0]//w):
    for j in range(gray.shape[1]//w):
        # eigenvalues of the 32x32 patch, sorted in descending order
        sorted_eigen = -np.sort(-np.linalg.eigvals(gray[i*w:(i+1)*w, j*w:(j+1)*w]))
        l = 13   # lower eigenvalue index within the patch
        k = w    # upper eigenvalue index within the patch
        theta_svd = (1/(k-l+1)) * np.sum(np.abs(sorted_eigen[l:k]))
        descriptors[i, j] = theta_svd

Run Mediapipe pose tracker on Colab as a continuous video stream

I am running a pose-tracking application on Colab (with MediaPipe). It does not show a continuous video; instead it processes my video frame by frame, showing the frames in succession in the output section below my block of code. This makes processing a single video very slow, and the output section fills up with frames, so I have to scroll a lot to reach the top or the bottom of the output. The goal is to have a video stream like a normal Linux application on my PC.
This is the main() section of my application:
import time
import cv2
from google.colab.patches import cv2_imshow
# poseDetector is defined elsewhere in my application

cap = cv2.VideoCapture('1500_doha.mp4')
pTime = 0
detector = poseDetector()
while cap.isOpened():
    success, img = cap.read()
    height, width, c = img.shape
    img = detector.findPose(img)
    lmList = detector.findPosition(img, draw=False)
    angle = detector.findAngle(img, 11, 13, 15)  # careful, the arm changes every now and then!!
    cTime = time.time()
    fps = 1 / (cTime - pTime)
    pTime = cTime
    text = cv2.putText(img, str(int(fps)), (70, 50), cv2.FONT_HERSHEY_PLAIN, 3,
                       (255, 0, 0), 3)
    cv2_imshow(img)
    cv2.waitKey(10)
The problem is clearly in cv2_imshow(), because if I run a YOLOv4 box detector I don't need this command and I obtain a continuous stream. Do you have any suggestions? Is there already a solution online?
Here you can find part of the output box of my Google Colab.
Here you can find the complete file: https://colab.research.google.com/drive/1uEWiCGh8XY5DwalAzIe0PpzYkvDNtXID#scrollTo=HPF2oi7ydpdV
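One common workaround, offered here as a sketch rather than a verified fix, is to clear the cell output before drawing each frame, so successive frames replace one another instead of stacking up. It assumes the same cap and detector objects as in the code above:
# Sketch: redraw frames in place on Colab by clearing the cell output.
# Assumes the same cap/detector setup as in the question.
from IPython.display import clear_output
from google.colab.patches import cv2_imshow

while cap.isOpened():
    success, img = cap.read()
    if not success:
        break
    img = detector.findPose(img)
    clear_output(wait=True)  # wait=True swaps the output only when the new frame is ready
    cv2_imshow(img)
This keeps the output area down to a single frame, though the refresh rate is still limited by how fast Colab can ship images to the browser.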

OpenCV 4.1.1.26 reports 90000.0 fps for a 25fps RTSP stream

I have an RTP/RTSP stream that runs at 25 fps, as verified by ffprobe -i <URI>. VLC also plays back the RTSP stream at a real-time rate, but doesn't show me the FPS in the Media Information window.
However, when I use OpenCV 4.1.1.26 to retrieve the input stream's frame rate, it gives me 90000.0.
Question: How can I use OpenCV to probe for the correct frame rate of the RTSP stream? What would cause it to report 90000.0 instead of 25?
Here's my Python function to retrieve the frame rate:
import cv2

vid: cv2.VideoCapture = cv2.VideoCapture('rtsp://192.168.1.10/cam1/mpeg4')

def get_framerate(video: cv2.VideoCapture):
    fps = video.get(cv2.CAP_PROP_FPS)
    print('FPS is {0}'.format(fps))

get_framerate(vid)
macOS Catalina
Python 3.7.4
I hope this helps you somehow. It is a simple counter: it takes a number of captures (cont), measures the start and end times, and then converts that to FPS with a rule of three.
Regarding your second question, I have read that it could be due to a bad installation. (For what it's worth, 90000 Hz is the standard RTP timestamp clock rate for video, so the reported value may simply be the stream's timebase rather than its frame rate.) You can also check that your camera is working properly by printing the ret variable: if it is True you should be able to measure the FPS; if it is False you can get an unpredictable result.
cv2.imshow() and key = cv2.waitKey(1) should be commented out, as they add delay and skew the measurement.
I post this as a comment because I do not have enough reputation points.
from datetime import datetime
import cv2

img = cv2.VideoCapture('rtsp://192.168.1.10/cam1/mpeg4')
cont = 0
start = datetime.now()
while True:
    if cont == 50:
        a = datetime.now() - start
        b = a.seconds * 1e6 + a.microseconds  # elapsed time in microseconds
        print(b, "fps = ", (50 * 1e6) / b)
        break
    ret, frame = img.read()
    # Comment these out for the best measurement
    cv2.imshow('fer', frame)
    key = cv2.waitKey(1)
    if key == ord('q'):
        break
    cont += 1
img.release()
cv2.destroyAllWindows()
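Building on the same idea, here is a small sketch (mine, not from the original answer) that trusts CAP_PROP_FPS only when it looks plausible and otherwise times a burst of reads; the sanity bound of 1000 is an arbitrary assumption:
import time
import cv2

def effective_fps(cap: cv2.VideoCapture, n: int = 50) -> float:
    """Return the reported FPS if plausible, else measure it over up to n reads."""
    fps = cap.get(cv2.CAP_PROP_FPS)
    if 0 < fps < 1000:  # 90000.0 and 0.0 both fail this sanity check
        return fps
    frames = 0
    t0 = time.time()
    for _ in range(n):
        if not cap.read()[0]:
            break
        frames += 1
    return frames / (time.time() - t0)

cap = cv2.VideoCapture('rtsp://192.168.1.10/cam1/mpeg4')
print('Effective FPS:', effective_fps(cap))
cap.release()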

How to skip to a specific frame in a given spectrogram file

I'm having trouble skipping ahead to a specific frame of a melspec feature set found here. The aim of getting features from the feature set is to analyse the difference in beats per second (BPS) so that I can match up the BPS of two tracks, in order to mix between them or warp the timing of a track to synchronise the two pieces of music. The feature set specifies the following:
Pre-extracted in the "feature" directory are space-delimited floating-point ASCII matrices:
beat_synchronus: one beat-synchronus vector per line
non-beat-synchronus: 512-sample hop frames @ 22050Hz sample rate, one vector per line
I'm not quite sure how to interpret this: is the melspec beat-synchronous or non-beat-synchronous, and how does that affect how frames are delimited?
I've got as far as working out the frame duration thanks to this answer, but I don't know how to apply that knowledge to the task of navigating to a specific timecode or frame. The closest I've got is dividing the offset by the frame duration to work out how many frames need to be skipped to reach the offset (1 second into the track, for example, gives 2583 frames). However, the file is not demarcated into lines and, as far as I can tell, is just a continuous list of entries. This leads to the question of what the size of a given frame is (if that's the right terminology): is it the case that 2583 entries per second need to be skipped to get to the right entry, or does each frame have a specific number of entries, so that I need to skip 2583 frames of size x? And what is size x (512?)?
I've been able to open the melspec file, but there are no delimiters between its entries; it is instead a continuous list of entries.
The code I have so far works out the duration of a frame, and therefore the number of frames to skip for a given offset into the track. However, it does not indicate the size of a given frame, or how to access that frame in the melspec file.
spectrogram is the file path of a given feature set; offset is the time in seconds from the start of the track.
def skipToFrame(spectrogram, offset):
    SAMPLE_RATE = 22050
    HOP_LENGTH = 512
    # work out the duration of each frame
    FRAME_TIME = HOP_LENGTH / SAMPLE_RATE
    # work out how many frames are in the offset period (e.g. 1 second)
    SHIFT_FRAMES = offset / FRAME_TIME
    # read lines of the file so that the offset can be applied
    with open(spectrogram) as feature_set:
        indices = int(SHIFT_FRAMES)
        for line in feature_set:
            print(line)
This gives a list of 10 lines of results, which do not seem to be naturally delimited by line.
The sample file you are referring to is a matrix of 128 x 7392 values.
To better understand the format of this file, you may look at the extractFeatures.py script used to extract the features. You may notice that the melspec feature is described as "non-beat-synchronus" and is computed using librosa.feature.melspectrogram, mostly with default arguments, producing an output S of n_mel rows by t columns.
To figure out the value of n_mel you need to look at librosa.filters.mel, which indicates a default value of 128. The number of frames t, on the other hand, is computed internally in librosa.util.frame as 1 + int((len(y) - frame_length) / hop_length), where frame_length uses the default value 2048 and hop_length uses the default value 512.
To summarize, the 128 rows correspond to the 128 mel-frequency bins, and the 7392 columns correspond to time frames.
You could thus use the following to extract the column of interest:
def skipToFrame(spectrogram, offset):
    SAMPLE_RATE = 22050
    HOP_LENGTH = 512
    # work out the duration of each frame
    FRAME_TIME = HOP_LENGTH / SAMPLE_RATE
    # work out how many frames are in the offset period (e.g. 1 second)
    SHIFT_FRAMES = offset / FRAME_TIME
    # read lines of the file so that the offset can be applied
    with open(spectrogram) as feature_set:
        indices = int(SHIFT_FRAMES)
        for line in feature_set:
            print(line.split(" ")[indices])
Using numpy you could also read the entire spectrogram and address a specific column:
import numpy as np

def skipToFrame(spectrogram, offset):
    SAMPLE_RATE = 22050
    HOP_LENGTH = 512
    # work out the duration of each frame
    FRAME_TIME = HOP_LENGTH / SAMPLE_RATE
    # work out how many frames are in the offset period (e.g. 1 second)
    SHIFT_FRAMES = offset / FRAME_TIME
    data = np.loadtxt(spectrogram)
    column = int(SHIFT_FRAMES)
    print(data[:, column])
Going back to the fact that the feature extraction was done using librosa, you may also consider using librosa.core.time_to_frames instead of computing the frame number manually:
import librosa

def skipToFrame(spectrogram, offset):
    SHIFT_FRAMES = librosa.core.time_to_frames(offset, sr=22050, hop_length=512, n_fft=2048)
    ...
On a final note, you should be aware that each of these time frames uses 2048 samples, but successive frames overlap, each advancing 512 samples relative to the previous frame. So the frames cover the following time intervals:
frame # | start (s) | end (s)
================================
1 | 0.000 | 0.093
2 | 0.023 | 0.116
3 | 0.046 | 0.139
...
41 | 0.929 | 1.022
42 | 0.952 | 1.045
...
7392 | 171.619 | 171.712
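As a sanity check, the table above can be reproduced directly from the default librosa parameters (sr=22050, hop_length=512, n_fft=2048); this is just an illustration of the arithmetic, not part of the feature-set tooling:
# Reproduce the frame-interval table from the STFT parameters.
SR, HOP, N_FFT = 22050, 512, 2048
for k in (1, 2, 3, 41, 42, 7392):        # 1-indexed frame numbers, as in the table
    start = (k - 1) * HOP / SR           # frame k starts hop_length samples after frame k-1
    end = ((k - 1) * HOP + N_FFT) / SR   # and spans n_fft samples
    print(f"{k:>5} | {start:7.3f} | {end:7.3f}")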

Python OpenCV: create image from bytearray

I am capturing video from a Ricoh Theta V camera. It delivers the video as Motion JPEG (MJPEG). To get the video you have to do an HTTP POST, which, alas, means I cannot use the cv2.VideoCapture(url) feature.
So the way to do this, per numerous posts on the web and on SO, is something like this:
import cv2
import numpy as np

# 'stream' is the file-like body of the HTTP POST response delivering the MJPEG data
buf = bytes()
while True:
    buf += stream.read(1024)
    a = buf.find(b'\xff\xd8')  # JPEG SOI marker
    b = buf.find(b'\xff\xd9')  # JPEG EOI marker
    if a != -1 and b != -1:
        jpg = buf[a:b+2]
        buf = buf[b+2:]
        i = cv2.imdecode(np.frombuffer(jpg, dtype=np.uint8), cv2.IMREAD_COLOR)
        cv2.imshow('i', i)
        if cv2.waitKey(1) == 27:
            exit(0)
That actually works, except it is slow. I'm processing a 1920x1080 JPEG stream on a MacBook Pro running OS X 10.12.6, and the call to imdecode takes approximately 425,000 microseconds (425 ms) per image.
Any idea how to do this without imdecode, or how to make imdecode faster? I'd like it to work at 60 fps with HD video (at least).
I'm using Python 3.7 and OpenCV 4.
Updated Again
I looked into JPEG decoding from the memory buffer using PyTurboJPEG; the code below compares it with OpenCV's imdecode():
#!/usr/bin/env python3
import cv2
import numpy as np
from turbojpeg import TurboJPEG

# Load image into memory
r = open('image.jpg', 'rb').read()
inp = np.asarray(bytearray(r), dtype=np.uint8)

# Decode JPEG from memory into Numpy array using OpenCV
i0 = cv2.imdecode(inp, cv2.IMREAD_COLOR)

# Use default library installation
jpeg = TurboJPEG()

# Decode JPEG from memory using turbojpeg
i1 = jpeg.decode(r)

cv2.imshow('Decoded with TurboJPEG', i1)
cv2.waitKey(0)
And the answer is that TurboJPEG is 7x faster! That is 4.6ms versus 32.2ms.
In [18]: %timeit i0 = cv2.imdecode(inp, cv2.IMREAD_COLOR)
32.2 ms ± 346 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [19]: %timeit i1 = jpeg.decode(r)
4.63 ms ± 55.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Kudos to @Nuzhny for spotting it first!
Updated Answer
I have been doing some further benchmarks on this and was unable to verify your claim that it is faster to save an image to disk and read it with imread() than it is to use imdecode() from memory. Here is how I tested in IPython:
import cv2
import numpy as np

# First use 'imread()'
%timeit i1 = cv2.imread('image.jpg', cv2.IMREAD_COLOR)
116 ms ± 2.86 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

# Now prepare the exact same image in memory
r = open('image.jpg', 'rb').read()
inp = np.asarray(bytearray(r), dtype=np.uint8)

# And try again with 'imdecode()'
%timeit i0 = cv2.imdecode(inp, cv2.IMREAD_COLOR)
113 ms ± 1.17 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
So I find imdecode() around 3% faster than imread() on my machine. Even if I include the np.asarray() in the timing, it is still quicker from memory than from disk, and that is on a machine with seriously fast 3 GB/s NVMe disks...
Original Answer
I haven't tested this, but it seems to me that you are doing the following in a loop:
read 1 kB of data
append it to a buffer
look for the JPEG SOI marker (0xffd8)
look for the JPEG EOI marker (0xffd9)
if you have found both the start and the end of a JPEG frame, decode it
1) Now, most JPEG images with any interesting content that I have seen are between 30 kB and 300 kB, so you are going to do 30-300 append operations on the buffer. I don't know much about Python, but I guess that may cause re-allocation of memory, which may be slow.
2) Next, you are going to look for the SOI marker in the first 1 kB, then again in the first 2 kB, then again in the first 3 kB, then again in the first 4 kB, even if you have already found it!
3) Likewise, you are going to look for the EOI marker in the first 1 kB, the first 2 kB...
So, I would suggest you try the following (a sketch follows below):
1) allocating a bigger buffer at the start and acquiring directly into it at the appropriate offset
2) not searching for the SOI marker if you have already found it, e.g. set it to -1 at the start of each frame and only try to find it while it is still -1
3) only looking for the EOI marker in the new data on each iteration, not in all the data you have already searched on previous iterations
4) furthermore, not bothering to look for the EOI marker at all unless you have already found the SOI marker, because the end of a frame without the corresponding start is of no use to you anyway; it is incomplete.
I may be wrong in my assumptions (I have been before!), but at least if they are public, someone cleverer than me can check them!
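Here is a minimal sketch of that search strategy, under the same assumption as the question that stream is the file-like MJPEG response; it illustrates the bookkeeping rather than being a tuned drop-in replacement:
# Sketch of the suggested strategy: remember the SOI position once found,
# and only search marker bytes in data we have not scanned before.
import cv2
import numpy as np

buf = bytearray()
soi = -1       # position of the SOI marker once found, else -1
scanned = 0    # offset up to which the buffer has already been searched

while True:
    chunk = stream.read(65536)  # larger reads mean fewer append operations
    if not chunk:
        break
    buf += chunk
    if soi == -1:
        soi = buf.find(b'\xff\xd8', scanned)  # only search the new data
    if soi != -1:
        eoi = buf.find(b'\xff\xd9', max(soi + 2, scanned))
        if eoi != -1:
            jpg = bytes(buf[soi:eoi + 2])     # one complete JPEG frame
            del buf[:eoi + 2]                 # drop the consumed bytes
            soi, scanned = -1, 0
            img = cv2.imdecode(np.frombuffer(jpg, np.uint8), cv2.IMREAD_COLOR)
            cv2.imshow('i', img)
            if cv2.waitKey(1) == 27:
                break
            continue
    scanned = max(len(buf) - 1, 0)  # keep 1 byte of overlap for markers split across reads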
I recommend using turbo-jpeg. It has a Python API: PyTurboJPEG.
