I have a Django web application hosted locally on a desktop workstation that is supposed to retrieve a video feed/video frames from a webcam connected via USB. Each frame is then used as input to an object detection model to count some objects, but since the objects are small, I need to retrieve the frames at a higher resolution (720, 1280) instead of the default resolution. After the counting is done, I stop reading frames from the webcam. Here is the sequence:
1. Press a button in the web application to start retrieving video from the webcam.
2. Create a new OpenCV cap; since I need a higher resolution, I have to specify it:
   cap = cv2.VideoCapture(0 + cv2.CAP_DSHOW)
   cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
   cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
3. Run a while loop that includes frame retrieval, the counting logic, and visualization of the frame with inference results using cv2.imshow().
4. Stop retrieving video frames after counting is done (call cv2.destroyAllWindows() and break out of the while loop), finally calling cap.release().
The problem with this sequence is that the video stream takes a few seconds (about 3) to load: after clicking the button in step 1, I only see the video stream from step 3 after about 3 seconds, which is quite slow, and there is an autofocus pass that takes about 1 second after the video stream pops up.
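If the autofocus pass is part of the annoyance, many UVC/DirectShow drivers let you disable it through OpenCV properties, though some ignore them silently, so treat the following as something to try rather than a guaranteed fix:

import cv2

cap = cv2.VideoCapture(0 + cv2.CAP_DSHOW)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
# Attempt to turn off autofocus and pin a fixed focus value; whether these
# properties are honoured depends on the camera and its DirectShow driver.
cap.set(cv2.CAP_PROP_AUTOFOCUS, 0)
cap.set(cv2.CAP_PROP_FOCUS, 0)   # fixed focus position; the value range is driver-specific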
I am considering this new sequence, which does not involve releasing the cap. I found that the video stream loads much faster and there is no autofocus pass each time the video stream is loaded:
1. When starting the Django server, create the OpenCV cap and specify the required resolution (meaning this is only done once):
   cap = cv2.VideoCapture(0 + cv2.CAP_DSHOW)
   cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
   cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
2. Run a while loop that includes frame retrieval, the counting logic, and visualization of the frame with inference results using cv2.imshow().
3. Stop retrieving video frames after counting is done (break out of the while loop WITHOUT calling cap.release()).
Would the second flow result in any problems or damage to the webcam, especially if the cap is kept open for a long time (weeks or even months)?
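For reference, a minimal sketch of the second flow, assuming a single module-level capture object created once and reused across runs; the helper name get_camera, the lock, and counting_done are illustrative placeholders, not part of the original code:

import threading
import cv2

_cap = None
_cap_lock = threading.Lock()

def get_camera():
    # Return the shared VideoCapture, creating and configuring it on first use.
    global _cap
    with _cap_lock:
        if _cap is None or not _cap.isOpened():
            _cap = cv2.VideoCapture(0 + cv2.CAP_DSHOW)
            _cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
            _cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
        return _cap

def run_counting():
    cap = get_camera()
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # ... run the object detection / counting on `frame` here ...
        cv2.imshow("inference", frame)
        cv2.waitKey(1)
        if counting_done(frame):      # placeholder for the real stop condition
            break
    cv2.destroyAllWindows()           # the cap is deliberately NOT released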
Related
I realized that when I use OpenCV to grab video (cv2.VideoCapture('rtsp://...')) from an RTSP URL, I am actually getting every frame of the stream and not the real-time frame.
Example: if a video is 30 fps and 10 seconds long, and I get the first frame and then wait 1 second before getting the next one, I get frame number 2 and not the real-time frame (it should be frame number 30 or 31).
I am worried about this because if my code takes a little longer to do the video processing (deep learning convolutions), the result will always be delivered late and not in real time.
Any ideas how I can manage to always get the current frame when I capture from RTSP?
Thanks!
This is not about the code. Many IP cameras give you encoded output (H.265/H.264).
When you use VideoCapture(), the camera's output data is decoded by the CPU. A delay such as the one you mention, between 1 and 2 seconds, is normal.
What can be done to make it faster:
If you have GPU hardware, you can decode the data on it. This will give you really good results (in my experience with the latest NVIDIA GPUs, you will get roughly 25 milliseconds of delay). To achieve that in your code, you need:
a CUDA installation
a CUDA-enabled OpenCV installation
the VideoReader class of OpenCV
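A rough sketch of that GPU path, assuming an OpenCV build compiled with CUDA and the cudacodec module (the URL is a placeholder, and the exact Python binding signatures can vary between builds):

import cv2

# Requires an OpenCV build with CUDA and cudacodec enabled; the standard
# pip wheels do not ship this module.
reader = cv2.cudacodec.createVideoReader("rtsp://camera/stream")   # placeholder URL
while True:
    ok, gpu_frame = reader.nextFrame()    # decoded frame stays in GPU memory
    if not ok:
        break
    frame = gpu_frame.download()          # copy to host memory only when needed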
You can use VideoCapture() with the FFMPEG flag; FFMPEG has advanced methods for decoding encoded data and will probably give you the fastest output you can get with your CPU, but this will not reduce the delay by much.
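For the CPU path, recent OpenCV versions let you pass the backend explicitly when opening the stream. And if the underlying problem is the one from the question above, namely that old frames queue up while your processing runs, a common workaround (not part of this answer) is to keep reading frames in a background thread and only hand the newest one to the detector. A sketch, with the URL and run_detection as placeholders:

import threading
import cv2

url = "rtsp://user:pass@camera/stream"        # placeholder URL
cap = cv2.VideoCapture(url, cv2.CAP_FFMPEG)   # select the FFMPEG backend explicitly

latest = None
lock = threading.Lock()

def reader():
    global latest
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        with lock:
            latest = frame    # overwrite: stale frames are simply dropped

threading.Thread(target=reader, daemon=True).start()

# Main loop: always processes the most recent frame, however long inference takes.
while True:
    with lock:
        frame = None if latest is None else latest.copy()
    if frame is not None:
        run_detection(frame)   # placeholder for the slow processing step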
I'm using OpenCV to record video from a Full HD camera running at 30 FPS. I know the FPS of the camera because I measure it by counting the number of valid frames (this also matches the specifications). I do this using a 5 ms timer. Here is the code that I run every 5 ms:
cv::Mat frame;
if (capture.read(frame)){
    // Display the frame and update the resolution/FPS labels in the UI.
    showCurrentFrame(frame);
    fpsCounter.newFrame();
    ui->labelVideo->setText("Video (" + QString::number(frame.cols) + "x" + QString::number(frame.rows) + ")");
    ui->labFPS->setText("FPS: " + QString::number(fpsCounter.getFPS()));
    if (isRecording){
        // Append the frame to the open recording.
        recorder << frame;
        fpsRecordCounter++;
    }
}
I set the FPS of the recorded video to 30 when I press the "Start Recording" button:
recorder.open(currentVideoFile.toStdString(),
              VIDEO_CODEC_FOURCC, // Tells it to record MJPEG
              VIDEO_REC_FPS,      // Defined as 30
              frameSize);         // 1920x1080
I developed my program on my workstation, which runs CentOS 7, using Qt and OpenCV 2.4.5. When I record on the desktop PC, the FPS is consistently about 30.
However, this needs to record from a moving car, so I copy-pasted the code AS IS onto my laptop and compiled it with zero issues.
My laptop uses Debian Testing and OpenCV 2.4.9. It is HERE that the slowdown is observed.
Since I'm using Qt I need to process the cv::Mat in order to display it. I do this in the showCurrentFrame function.
If I deactivate this function while recording, I get maybe 23 FPS (but then I can't see what I'm recording).
If I leave it as is in the code above, I get about 16-17 FPS.
The first thing that I thought was that my computer was not powerful enough, but it shouldn't be. This is the model:
https://www.asus.com/Notebooks/ASUS_VivoBook_S550CA/specifications/
It's the i5 variant with a 500 GB HDD.
So I'm at a loss. Is it some sort of bug introduced in newer OpenCV versions, or is it simply that my laptop is not powerful enough?
On iOS 7, how do I get the current microphone input volume in a range between 0 and 1?
I've seen several approaches like this one, but the results I get baffle me.
The return values of peakPowerForChannel: are documented to be in the range of -160 to 0 with 0 being the loudest and -160 near absolute silence.
Problem: given a quiet room and a short but loud noise, the power goes all the way up in an instant but takes a very long time to drop back to the quiet level (way longer than the actual noise...)
What I want: Essentially I want an exact copy of the Audio Input patch of Quartz Composer with its Volume Peak output. Any tips?
To get a similar volume peak measurement, you might have to input raw audio via the iOS Audio Queue API (or the RemoteIO Audio Unit) and analyze the raw PCM waveform samples in each audio callback, looking for a magnitude maximum over your desired frame width or analysis time.
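The per-buffer math itself is simple. Here is a language-agnostic sketch (written in Python purely for illustration; on iOS the same computation would run inside the Audio Queue or RemoteIO callback on that callback's sample buffer, and the decay factor is an assumption you would tune):

def buffer_peak(samples, previous_peak, decay=0.95):
    # Peak level in 0..1 for one callback buffer of float samples in -1..1.
    # The decay factor lets the displayed peak fall back quickly but smoothly
    # instead of sticking at the last loud value.
    current = max((abs(s) for s in samples), default=0.0)
    return max(current, previous_peak * decay)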
This one keeps me awake:
I have an OS X audio application which has to react if the user changes the current sample rate of the device.
To do this I register a callback for both the input and output devices on kAudioDevicePropertyNominalSampleRate.
So if one of the devices' sample rates gets changed, I get the callback and set the new sample rate on the devices with AudioObjectSetPropertyData and kAudioDevicePropertyNominalSampleRate as the selector.
The next steps were mentioned on the Apple mailing list and I followed them:
stop the input AudioUnit and the AUGraph which consists of a mixer and the output AudioUnit
uninitialize them both.
check for the node count, step over them and use AUGraphDisconnectNodeInput to disconnect the mixer from the output
now set the new sample rate on the output scope of the input unit
and on the in- and output scope on the mixer unit
reconnect the mixer node to the output unit
update the graph
init input and graph
start input and graph
Render and Output callbacks start again but now the audio is distorted. I believe it's the input render callback which is responsible for the signal but I'm not sure.
What did I forget?
The sample rate doesn't affect the buffer size as far as I know.
If I start my application with the other sample rate everything is OK, it's the change that leads to the distorted signal.
I look at the stream format (kAudioUnitProperty_StreamFormat) before and after. Everything stays the same except the sample rate which of course changes to the new value.
As I said I think it's the input render callback which needs to be changed. Do I have to notify the callback that more samples are needed? I checked the callbacks and buffer sizes with 44k and 48k and nothing was different.
I wrote a small test application so if you want me to provide code, I can show you.
Edit: I recorded the distorted audio (a sine) and looked at it in Audacity.
What I found was that after every 495 samples the audio drops for another 17 samples.
I think you see where this is going: 495 samples + 17 samples = 512 samples. Which is the buffer size of my devices.
But I still don't know what I can do with this finding.
I checked my input and output render procs and their access to the ring buffer (I'm using the fixed version of CARingBuffer).
Both store and fetch 512 frames so nothing is missing here...
Got it!
After disconnecting the graph, it seems to be necessary to tell both devices the new sample rate.
I already did this before the callback, but it seems this has to be done at a later time.
I'm trying to develop an iPhone app that will use the camera to record only the last few minutes/seconds.
For example, you record a movie for 5 minutes, click "save", and only the last 30 seconds are saved. I don't want to actually record five minutes and then chop off the last 30 seconds (this won't work for me). This idea is called "loop recording".
This results in an endless video recording, of which you keep only the last part.
The Precorder app does what I want to do. (I want to use this feature in another context.)
I think this should be easy to simulate with a circular buffer.
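That circular buffer is conceptually just a queue trimmed to a time window. A minimal sketch (Python, purely to illustrate the idea; `chunk` stands in for whatever encoded sample data you end up capturing):

import collections
import time

class LoopBuffer:
    # Keep only the last `seconds` worth of (timestamp, chunk) pairs.
    def __init__(self, seconds):
        self.seconds = seconds
        self.chunks = collections.deque()

    def push(self, chunk):
        now = time.monotonic()
        self.chunks.append((now, chunk))
        # Drop everything older than the retention window.
        while self.chunks and now - self.chunks[0][0] > self.seconds:
            self.chunks.popleft()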
I started a project with AVFoundation. It would be awesome if I could somehow redirect the video data to a circular buffer (which I will implement). I found information only on how to write it to a file.
I know I can chop the video into intervals and save them, but saving a clip and restarting the camera to record another part takes time, and it is possible to lose some important moments in the movie.
Any clues how to redirect data from camera would be appreciated.
Important! As of iOS 8 you can use VTCompressionSession and have direct access to the NAL units instead of having to dig through the container.
Well, luckily you can do this, and I'll tell you how, but you're going to have to get your hands dirty with either the MP4 or MOV container. A helpful resource for this (though more MOV-specific) is Apple's QuickTime File Format introduction:
http://developer.apple.com/library/mac/#documentation/QuickTime/QTFF/QTFFPreface/qtffPreface.html#//apple_ref/doc/uid/TP40000939-CH202-TPXREF101
First things first: you're not going to be able to start your saved movie from an arbitrary point 30 seconds before the end of the recording; you'll have to use some I-frame at approximately 30 seconds. Depending on what your keyframe interval is, it may be several seconds before or after that 30-second mark. You could use all I-frames and start from an arbitrary point, but then you'll probably want to re-encode the video afterward because it will be quite large.
So, knowing that, let's move on.
First step is when you set up your AVAssetWriter, you will want to set its AVAssetWriterInput's expectsMediaDataInRealTime property to YES.
In the captureOutput callback you'll be able to do an fread from the file you are writing to. The first fread will get you a little bit of MP4/MOV (whatever format you're using) header (i.e. 'ftyp' atom, 'wide' atom, and the beginning of the 'mdat' atom). You want what's inside the 'mdat' section. So the offset you'll start saving data from will be 36 or so.
Each read will get you 0 or more AVC NAL Units. You can find a listing of NAL unit types from ISO/IEC 14496-10 Table 7-1. They will be in a slightly different format than specified in Annex B, but it's fine. Additionally, there will only be IDR slices and non-IDR slices in the MP4/MOV file. IDR will be the I-Frame you're looking to hang onto.
The NAL unit format in the MP4/MOV container is as follows:
4 bytes - Size
[Size] bytes - NALU Data
data[0] & 0x1F - NALU Type
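To make that layout concrete, here is a small parsing sketch (Python, just to illustrate the byte layout described above; iter_mp4_nalus and is_idr are illustrative names, and type 5 being the IDR slice follows the H.264 NAL unit type table):

import struct

def iter_mp4_nalus(data):
    # Yield (nalu_type, nalu_bytes) from length-prefixed AVC NAL units
    # as they appear inside the 'mdat' data.
    offset = 0
    while offset + 4 <= len(data):
        (size,) = struct.unpack_from(">I", data, offset)   # 4-byte big-endian length
        offset += 4
        nalu = data[offset:offset + size]
        offset += size
        if not nalu:
            break
        yield nalu[0] & 0x1F, nalu    # low 5 bits of the first byte = NALU type

def is_idr(nalu_type):
    # Type 5 is an IDR (key frame) slice; type 1 is a non-IDR slice.
    return nalu_type == 5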
So now you have the data you're looking for. When you go to save this file, you'll have to update the MP4/MOV container with the correct length and sample count, update the 'stsz' atom with the correct sizes for each sample, and update the media headers and track headers with the correct duration of the movie, and so on. What I would probably recommend is creating a sample container on the first run that you can more or less just overwrite/augment with the appropriate data for that particular movie. You'll want to do this because the encoders on the various iDevices don't all have the same settings, and the 'avcC' atom contains encoder information.
You don't really need to know much about the AVC stream in this case, so you'll probably want to concentrate your experimenting around updating the container format you choose correctly. Good luck.