Displaying frames as a video on Google Colab - opencv

I am trying video processing on Google Colab. My code reads the video and breaks it into frames, and as each frame is processed I want to display the frames as a video, like cv2.imshow does on a local computer. But cv2.imshow gives an error in Colab, so as suggested I used cv2_imshow via from google.colab.patches import cv2_imshow. It displays the frames, but stacked in a column (like separate images) instead of replacing the previously displayed frame. Here is my colab link: https://colab.research.google.com/drive/1RUOGahcGngTWG9nBoisrsPzCLQ1Jq88v?usp=sharing
You can see the output at the end of the page, where the multiple images are.
Any help is really appreciated :)

Try this (the last three lines go inside your frame loop):
from google.colab.patches import cv2_imshow
from IPython.display import clear_output
from time import sleep
clear_output()
cv2_imshow(img)
sleep(0.1)
It's far from perfect (there are some frame drops for some reason), but that's the closest thing I could find.
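For completeness, here is a fuller sketch of the same idea, assuming the frames come from a cv2.VideoCapture on some video path (the path and the 0.1 s delay are placeholders; plug in your own per-frame processing):
import cv2
from google.colab.patches import cv2_imshow
from IPython.display import clear_output
from time import sleep

cap = cv2.VideoCapture('video.mp4')  # placeholder path
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # ... per-frame processing goes here ...
    clear_output(wait=True)  # replace the previous frame instead of stacking images in a column
    cv2_imshow(frame)
    sleep(0.1)               # crude pacing; tune or drop as needed
cap.release()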

Related

Using librosa.effects.trim to remove the silent parts of audio

I am working on speech emotion recognition (ML).
I currently use pyAudioAnalysis for multi-directory feature extraction. However, the dataset's audio files contain a lot of approximately silent sections. My objective is to remove the approximately silent parts from all the audio files and then extract meaningful features.
My current approach is to use librosa to trim the silent parts.
from librosa.effects import trim
import librosa
from pyAudioAnalysis import audioBasicIO
import matplotlib.pyplot as plt
signal, Fs = librosa.load(file_directory)
trimed_signal = trim(signal,top_db=60)
fig, ax = plt.subplots(nrows=3, sharex=True, sharey=True)
librosa.display.waveplot(trimed_signal, sr=Fs, ax=ax[0])
ax[0].set(title='Monophonic')
ax[0].label_outer()
I tried to plot the wave after trimming using librosa.display.waveplot, but an AttributeError occurred: AttributeError: module 'librosa' has no attribute 'display'
My questions are:
How do I plot the trimmed wave?
Is it possible to generate a trimmed .wav file? This is because pyAudioAnalysis's input for feature extraction is a .wav file path, but the output of librosa is an array.
You need to import librosa.display separately. See this issue for the reason.
You can use librosa.output.write_wav (check the docs) to store the trimmed array as a wave file. E.g. librosa.output.write_wav(path, trimed_signal, Fs).
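A minimal sketch pulling both suggestions together (the file paths are placeholders; note that trim returns the trimmed signal plus the interval it kept, and that librosa.output.write_wav only exists in librosa versions before 0.8, where soundfile.write is the usual replacement):
import librosa
import librosa.display  # must be imported explicitly, as noted above
import matplotlib.pyplot as plt

signal, Fs = librosa.load('speech.wav')  # placeholder path
trimmed_signal, interval = librosa.effects.trim(signal, top_db=60)  # trim returns (signal, interval)

fig, ax = plt.subplots()
librosa.display.waveplot(trimmed_signal, sr=Fs, ax=ax)
ax.set(title='Trimmed signal')

# write the trimmed array back out as a .wav so pyAudioAnalysis can read it
librosa.output.write_wav('speech_trimmed.wav', trimmed_signal, Fs)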

Can't show an image using PIL on Google Colab

I am trying to use PIL to show an image. I know that I can use other modules to do that, but I am working on Google Colab and can't figure out why PIL is not showing the output image.
%matplotlib inline
import numpy as np
import PIL
from PIL import Image  # needed for Image.open below
im = Image.open('/content/drive/My Drive/images-process.jpeg')
print(im.width, im.height, im.mode, im.format, type(im))
im.show()
Output: 739 415 RGB JPEG <class 'PIL.JpegImagePlugin.JpegImageFile'>
Instead of
im.show()
Try just
im
Colab should display it on its own. See the example notebook.
Use
display(im)
instead of im.show() or im.
When these calls come after multiple other lines or inside a loop, a bare im won't display anything.
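For instance, a minimal sketch of that approach (reusing the path from the question):
from PIL import Image
from IPython.display import display

im = Image.open('/content/drive/My Drive/images-process.jpeg')
display(im)  # renders the image inline, even inside loops or after other statements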
After you open an image (which you have done using Image.open()), try converting it with im.convert() to whichever mode the image is in, then do display(im).
It will work.

Saving seaborn figures to sageplot

I have been trying to get statistical graphs (box plot, bar chart, histogram, etc.) of some random data generated with Seaborn into LaTeX without saving them to a file first. I use \sageplot[width=8cm][png]{(Python_Graphics_Format)} from the SageTeX package to do this.
For example, when I draw a box plot with Seaborn, it offers all kinds of output methods (name.gcf(), name.show(), name.plot(), name.draw(), etc.) but no Graphics format. Is there any way to do this without using name.savefig() or the like?
Why is it important for me?
I would like to generate a list of predefined Sage functions in a separate tex file, together with a bunch of randomly generated data, and input them at the top of my TeX code after \maketitle. This way I will be able to generate multiple problems of a similar nature and upload them to the online HW system Ximera.
Here is some code that I took from Stack Overflow:
import seaborn as sns
sns.set_style("whitegrid")
tips = sns.load_dataset("tips")
ax = sns.boxplot(x=tips["total_bill"])
Your help is most appreciated.
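One possible direction, offered only as a sketch: seaborn draws onto a matplotlib Axes, so the underlying Figure object can be retrieved with ax.get_figure(); whether \sageplot will accept that object directly depends on your SageTeX setup, so treat that as an assumption to verify rather than a confirmed answer.
import seaborn as sns

sns.set_style("whitegrid")
tips = sns.load_dataset("tips")
ax = sns.boxplot(x=tips["total_bill"])
fig = ax.get_figure()  # the matplotlib Figure behind the seaborn plot; this is the object \sageplot would need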

Mov file has more frames than written / possible iOS AVAssetWriter usage issue

I am manually generating a .mov video file.
Here is a link to an example file: link. I wrote a few image frames, and then after a long break wrote approximately 15 more image frames, just to emphasise my point for debugging purposes. When I extract images from the video, ffmpeg returns around 400 frames instead of the 15-20 I expected. Is this because the API I am using is inserting these frames automatically? Is it a part of the .mov file format that requires this? Or is it due to the way the library extracts the image frames from the video? I have tried searching the internet but could not arrive at an answer.
My use case is that I am trying to record the current sensor data from Core Motion while writing a video. For each frame I receive from the camera, I use "AppendPixelBuffer" to write the frame to the video and then write a corresponding row of sensor data to the CSV file.
The end result I want is a 1:1 ratio of frames in the video to rows in the CSV file. I have confirmed I am writing the CSV file correctly using various counters etc., so my issue is clearly my understanding of the movie format or the API.
Thanks for any help.
UPDATED
It looks like your ffmpeg extractor is wrong. To extract only the timestamped frames (and not frames sampled at 24Hz) in your file, try this:
ffmpeg -i video.mov -r 1/1 image-%03d.jpeg
This gives me the 20 frames expected.
OLD ANSWER
ffprobe reports that your video has a frame rate of 2.19 frames/s and a duration of 17s, which gives 2.19 * 17 = 37 frames, which is closer to your expected 15-20 than ffmpeg's 400.
So maybe the ffmpeg extractor is at fault?
Hard to say if you don't show how you encode and decode the file.

Distorted sound after sample rate change

This one keeps me awake:
I have an OS X audio application which has to react if the user changes the current sample rate of the device.
To do this I register a callback on 'kAudioDevicePropertyNominalSampleRate' for both the input and output devices.
So if one of the devices sample rates get changed I get the callback and set the new sample rate on the devices with 'AudioObjectSetPropertyData' and 'kAudioDevicePropertyNominalSampleRate' as the selector.
The next steps were mentioned on the Apple mailing list and I followed them:
stop the input AudioUnit and the AUGraph which consists of a mixer and the output AudioUnit
uninitialize them both.
check for the node count, step over them and use AUGraphDisconnectNodeInput to disconnect the mixer from the output
now set the new sample rate on the output scope of the input unit
and on the in- and output scope on the mixer unit
reconnect the mixer node to the output unit
update the graph
init input and graph
start input and graph
Render and Output callbacks start again but now the audio is distorted. I believe it's the input render callback which is responsible for the signal but I'm not sure.
What did I forget?
The sample rate doesn't affect the buffer size, as far as I know.
If I start my application with the other sample rate, everything is OK; it's the change that leads to the distorted signal.
I looked at the stream format (kAudioUnitProperty_StreamFormat) before and after. Everything stays the same except the sample rate, which of course changes to the new value.
As I said I think it's the input render callback which needs to be changed. Do I have to notify the callback that more samples are needed? I checked the callbacks and buffer sizes with 44k and 48k and nothing was different.
I wrote a small test application so if you want me to provide code, I can show you.
Edit: I recorded the distorted audio (a sine) and looked at it in Audacity.
What I found was that after every 495 samples the audio drops out for another 17 samples.
I think you see where this is going: 495 samples + 17 samples = 512 samples, which is the buffer size of my devices.
But I still don't know what I can do with this finding.
I checked my input and output render procs and their access of the ring buffer (I'm using the fixed version of CARingBuffer).
Both store and fetch 512 frames, so nothing is missing here...
Got it!
After disconnecting the graph, it seems to be necessary to tell both devices the new sample rate.
I already did this before the callback, but it seems this has to be done at a later time.
