Using the DirectX API to view an H.264 stream decoded by FFmpeg

I am trying to stream a video between two clients.
Client A shall upstream the video to a server in H.264 format and Client B shall downstream it from the server. To downstream, I am using FFmpeg to decode the NAL units carried in the RTP packets.
My problem is that I must display the image using the DirectX API, which requires these parameters:
bitstream
picture parameters
quantization matrix
slice info.
On the other hand, the resulting parameters from downstreaming with FFmpeg are the SPS (Sequence Parameter Set) and PPS (Picture Parameter Set).
I assume that FFmpeg's PPS and DirectX's "picture parameters" are at least tangentially related; however, I'm not sure how to obtain the remaining parameters (bitstream, quant_matrix and slice_info) from the PPS and SPS.
Any suggestions (barring those that send me back to Google, whence I wearily trudge after two days' worth of searches) are greatly appreciated.
Regards
-E

Sounds like you're trying to use a DirectX interface that wants encoded video, not the decoded video you should be getting from ffmpeg. You should have a series of decoded frames that you simply need to display via DirectX/DirectShow.
If you want DirectX and/or the video driver/hardware to decode it, you need to find the right interface to submit it to.
I'm afraid your question lacks the detail needed to give a better answer.
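For illustration, here is a minimal sketch of the decode side using FFmpeg's libavcodec (the send/receive API of newer FFmpeg releases; error handling and the RTP depacketizing/parsing step are omitted). The output is raw YUV frames, which is what you would copy into a DirectX surface for display:

    /* Sketch only: decode H.264 data into raw YUV frames with libavcodec.
       Assumes 'data' holds a complete Annex-B access unit. */
    #include <libavcodec/avcodec.h>

    void decode_and_display(const uint8_t *data, int size)
    {
        const AVCodec *codec = avcodec_find_decoder(AV_CODEC_ID_H264);
        AVCodecContext *ctx = avcodec_alloc_context3(codec);
        avcodec_open2(ctx, codec, NULL);

        AVPacket *pkt = av_packet_alloc();
        AVFrame *frame = av_frame_alloc();
        pkt->data = (uint8_t *)data;
        pkt->size = size;

        if (avcodec_send_packet(ctx, pkt) == 0) {
            while (avcodec_receive_frame(ctx, frame) == 0) {
                /* frame->data[0..2] now holds the decoded YUV planes -
                   copy/convert them into a DirectX surface here */
            }
        }

        av_frame_free(&frame);
        av_packet_free(&pkt);
        avcodec_free_context(&ctx);
    }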

Related

Will there be any trace if I encode video with my laptop?

As the title says,
I want to know if there is anything that would help someone trace the laptop or machine used to encode a video.
Also, is there any trace in an image file? For example, if I watermark an image with ffmpeg, is my machine's identity added to that image's metadata?
With ffmpeg, no. Add -bitexact to be sure.
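For example (filenames are placeholders), the following suppresses the encoder identification the muxer would otherwise write, such as the "Lavf" encoder tag:

    ffmpeg -i input.mp4 -c copy -bitexact output.mp4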
Depending on the application you are using and the container and codec you are encoding to, this is possible.
For ffmpeg, I am not aware of it putting any machine-related information into any format or codec.
Even when you are using external encoders such as the AMD or NVIDIA hardware ones instead of the built-in ones, the codecs currently do not allow such data to be put into the stream.
Sure, future audio/video codecs might allow such metadata in order to find out whether the encoder is licensed correctly, but as of now I am not aware of any such mechanism.
What cameras do, for example, to overcome the lack of codec and format support for storing this information is simply to write some XML alongside the media file, where they store the serial number and such.
If such information were contained in the file, analyzer tools like "mediainfo" would show it. I am not yet affiliated with mediainfo Sarl.

Raspberry Pi camera and OpenCv: 10bit?

The Raspberry Pi Camera v1 contains an OmniVision OV5647 sensor, which offers up to 10-bit raw RGB data. Using OpenCV's cvQueryFrame I get only 8-bit data. I am only interested in grayscale imagery - how do I get the 10-bit data?
There may be simpler options available, but here are a couple of possible ideas. I have not coded or tested either, like I normally would - sorry.
Option 1.
Use "Video for Linux" (v4l2) and open the camera, do the ioctl()s and manage the buffers yourself - great link here.
Option 2.
Use popen() to start raspivid and tell it you want the raw option (--raw), then grab the raw data off the end of the JPEG, with information on Bayer decoding from here. Other, somewhat simpler-to-follow information is available in section 5.11 here.
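A rough sketch of that approach - note that the JPEG-tail behaviour described is that of raspistill --raw (which appends the 10-bit Bayer block behind a "BRCM" marker at the end of the JPEG), so the sketch shells out to raspistill; the flags and the fixed-size buffer are illustrative only:

    /* Sketch: run raspistill with --raw via popen() and slurp the JPEG
       (with the appended Bayer block) from its stdout. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        FILE *p = popen("raspistill --raw -t 1000 -o -", "r");
        if (!p)
            return 1;

        size_t cap = 16 * 1024 * 1024;          /* generous buffer */
        unsigned char *jpeg = malloc(cap);
        size_t len = fread(jpeg, 1, cap, p);    /* reads until EOF */
        pclose(p);

        /* scan backwards through jpeg[0..len) for the "BRCM" marker
           that precedes the raw Bayer block, then unpack the 10-bit
           samples as in the linked write-up */
        free(jpeg);
        return 0;
    }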
Assuming you want to capture RAW data from still images and not necessarily video, you have 2 options I know of:
Option 1: picamera
picamera is a Python library that will let you capture data to a stream. Be sure to read the docs as it's pretty tricky to work with.
Option 2: raspistill
You can also shell out to raspistill to capture your image file, and then process that however you want - if you want to process the raw data (captured with raspistill --raw), you can use picamraw on- or off-board the Pi.
Even though we're a heavily Python shop, my team went with option 2 (in combination with picamraw, which we released ourselves) because picamera was not stable enough.

Is there a simple DirectShow filter that can mix audio together of the exact same format?

I have a DirectShow application written in Delphi 6 using the DSPACK component library. I want to be able to mix together audio coming from the output pins from multiple Capture Filters that are set to the exact same media format. Is there an open source or "sdk sample" filter that does this?
I know that intelligent mixing is a big deal and that I'd most likely have to buy a commercial library to do that. But all I need is a DirectShow filter that can accept wave audio input from multiple output pins and do a straight addition of the samples received. I know there are Tee filters for splitting a single stream into multiple streams (one-to-many), but I need something that does the opposite (many-to-one), preferably with format checking on each input connection attempt so that any attempt to attach an output pin with a different media format than the ones already added is thwarted with an error. Is there anything out there?
Not sure about anything available out of the box; it would definitely be a third-party component.
The complexity of creating this custom filter is not very high (it is not rocket science to build such a component yourself for a specific need). You basically need to have all input audio converted to the same PCM format, match the timestamps, add the data together, and then deliver it via the output pin.
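The "add the data" step itself is tiny; here is a sketch for 16-bit PCM (widening to 32 bits before summing and clipping back down are the only subtle points - the DirectShow pin and media-type plumbing wraps around this):

    /* Sketch: sum two identically formatted 16-bit PCM buffers. */
    #include <stddef.h>
    #include <stdint.h>

    void mix_pcm16(const int16_t *a, const int16_t *b,
                   int16_t *out, size_t nsamples)
    {
        for (size_t i = 0; i < nsamples; i++) {
            int32_t s = (int32_t)a[i] + (int32_t)b[i]; /* widen: no overflow */
            if (s >  32767) s =  32767;                /* clip to int16 range */
            if (s < -32768) s = -32768;
            out[i] = (int16_t)s;
        }
    }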

Converting raw pcm to speex?

For latency issues, I would like to send speex encoded audio frame data to a server instead of the raw PCM like I'm sending right now.
The problem is that I'm doing this in flash, and I want to use a socket connection to stream encoded spx frames of data.
I read the Speex manual, and it unfortunately does not go over the actual CELP algorithm used to convert PCM to spx data; it only briefly introduces the use of excitation gains and how it grabs the filter coefficients.
Its libraries are in DLLs - dead ends.
I really would like to create a conversion class in ActionScript. Is this possible? Is there any documentation on this? I've been googling to no avail. You'd think there would be more documentation on Speex out there...
And if I can't do this, what would be the most documented audio format to use?
thanks
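For context on what such a conversion class would have to replicate, here is a minimal sketch of the encode loop using the C libspeex API (narrowband mode; the frame count and output buffer size are illustrative). One practical route at the time was cross-compiling this kind of C code to the Flash runtime with Adobe Alchemy rather than reimplementing CELP in ActionScript:

    /* Sketch: encode 16-bit PCM frames to Speex with libspeex. */
    #include <speex/speex.h>

    void encode_pcm(spx_int16_t *pcm, int nframes)
    {
        void *enc = speex_encoder_init(&speex_nb_mode); /* narrowband */
        int frame_size;                                 /* 160 for nb */
        speex_encoder_ctl(enc, SPEEX_GET_FRAME_SIZE, &frame_size);

        SpeexBits bits;
        speex_bits_init(&bits);

        char out[200];
        for (int i = 0; i < nframes; i++) {
            speex_bits_reset(&bits);
            speex_encode_int(enc, &pcm[i * frame_size], &bits);
            int nbytes = speex_bits_write(&bits, out, sizeof(out));
            /* send 'out' (nbytes bytes) over the socket here */
            (void)nbytes;
        }

        speex_bits_destroy(&bits);
        speex_encoder_destroy(enc);
    }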

Snapshot using vlc (to get snapshot in RAM)

I was planning to use the vlc library to decode an H.264 based RTSP stream and extract each frame from it (convert vlc picture to IplImage). I have done a bit of exploration of the vlc code and concluded that there is a function called libvlc_video_take_snapshot which does a similar thing. However the captured frame in this case is saved on the hard disk which I wish to avoid due to the real time nature of my application. What would be the best way to do this? Would it be possible without modifying the vlc source (I want to avoid recompilation if possible). I have heard of vmem etc but could not really figure out what it does and how to use it.
The picture_t structure is internal to the library; how can we get access to it?
Awaiting your response.
P.S. Earlier I tried doing this using FFmpeg; however, the ffmpeg library has a lot of issues decoding an H.264-based RTSP stream on Windows, hence I had to switch to VLC.
Regards,
Saurabh Gandhi
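For reference, the vmem mechanism mentioned above works by handing libVLC a caller-owned buffer through libvlc_video_set_callbacks, so no file ever touches the disk. A minimal sketch (the RTSP URL, 640x480 size and RV24 chroma are placeholders; shutdown and synchronization are elided):

    /* Sketch: receive decoded frames in our own buffer via libVLC vmem. */
    #include <stdlib.h>
    #include <vlc/vlc.h>

    static unsigned char *framebuf;      /* w * h * 3 bytes of RGB */

    static void *lock_cb(void *opaque, void **planes)
    {
        *planes = framebuf;              /* decoder writes pixels here */
        return NULL;                     /* picture identifier, unused */
    }

    static void unlock_cb(void *opaque, void *pic, void *const *planes) {}

    static void display_cb(void *opaque, void *pic)
    {
        /* framebuf now holds one complete frame - wrap it in an
           IplImage (cvCreateImageHeader + cvSetData) here */
    }

    int main(void)
    {
        const unsigned w = 640, h = 480;
        framebuf = malloc(w * h * 3);

        libvlc_instance_t *vlc = libvlc_new(0, NULL);
        libvlc_media_t *media =
            libvlc_media_new_location(vlc, "rtsp://example.com/stream");
        libvlc_media_player_t *mp = libvlc_media_player_new_from_media(media);
        libvlc_media_release(media);

        libvlc_video_set_callbacks(mp, lock_cb, unlock_cb, display_cb, NULL);
        libvlc_video_set_format(mp, "RV24", w, h, w * 3); /* 24-bit RGB */

        libvlc_media_player_play(mp);
        /* ...run your processing loop, then stop and release... */
        return 0;
    }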
