RTP H.264 Packet Depacketizer

RTP H.264 Packet Depacketizer - parsing

For video, the marker bit of an RTP packet usually indicates the last packet of a frame.
Does that guarantee that I will receive exactly one frame per packet, or can a packet carry more than one?
If it can, do I need a parser beyond the depacketizer to separate the H.264 frames?
If an RTP packet can carry more than one frame, can it also contain a piece of the next frame? Or are all frames within an RTP packet complete, even when there is more than one?
Best regards,

RFC 6184 "RTP Payload Format for H.264 Video" has answers for the raised questions. It can be both ways: 2+ NAL units per packet, and 1 NAL unit fragmented over 2+ packets.
See quotes below:
5.7.1. Single-Time Aggregation Packet (STAP)
A single-time aggregation packet (STAP) SHOULD be used whenever NAL
units are aggregated that all share the same NALU-time.
and
5.8. Fragmentation Units (FUs)
This payload type allows fragmenting a NAL unit into several RTP
packets. Doing so on the application layer instead of relying on
lower-layer fragmentation (e.g., by IP) has the following advantages:
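
To make the two cases concrete, here is a minimal depacketizer sketch in Python. It assumes only the RFC 6184 payload structures named above (single NAL unit, types 1-23; STAP-A, type 24; FU-A, type 28). The names depacketize_h264 and fu_buffer are made up for illustration, and the sketch does no loss detection via RTP sequence numbers, which a real receiver would need.

```python
import struct

def depacketize_h264(payload, fu_buffer):
    """Return the complete NAL units produced by one RTP payload.

    fu_buffer is a bytearray that holds an in-progress FU-A reassembly
    between calls (hypothetical helper state, not part of any library).
    """
    if not payload:
        return []
    nal_type = payload[0] & 0x1F          # low 5 bits of the first payload byte

    if 1 <= nal_type <= 23:               # single NAL unit packet
        return [bytes(payload)]

    if nal_type == 24:                    # STAP-A: several whole NAL units
        nals, pos = [], 1
        while pos + 2 <= len(payload):
            (size,) = struct.unpack_from("!H", payload, pos)  # 16-bit NALU size
            pos += 2
            if size == 0 or pos + size > len(payload):
                break                     # malformed aggregate, stop parsing
            nals.append(bytes(payload[pos:pos + size]))
            pos += size
        return nals

    if nal_type == 28 and len(payload) >= 2:   # FU-A: one fragment of one NAL unit
        fu_header = payload[1]
        if fu_header & 0x80:              # S bit: first fragment
            fu_buffer.clear()
            # Rebuild the NAL header: F and NRI from the FU indicator,
            # the real NAL type from the FU header.
            fu_buffer.append((payload[0] & 0xE0) | (fu_header & 0x1F))
        elif not fu_buffer:
            return []                     # start fragment was never seen
        fu_buffer.extend(payload[2:])
        if fu_header & 0x40:              # E bit: last fragment
            nal = bytes(fu_buffer)
            fu_buffer.clear()
            return [nal]
        return []

    return []                             # other packet types are not handled here
```

In this sketch every NAL unit that comes out is whole: an aggregation packet yields several complete units, and a fragmented unit is only emitted once its end fragment arrives.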

Related

Does SCTP really prevent head-of-line blocking?

I've known about SCTP for a decade or so, and although I have never had the chance to use it, I've always wanted to, because of some of its promising (purported) features:
multi-homing
multiplexing w/o head-of-line blocking
mixed order/unordered delivery on the same connection (aka association)
no TIME_WAIT
no SYN flooding
A Comparison between QUIC and SCTP however claims
SCTP intended to get rid of HOL-Blocking by substreams, but its
Transmission Sequence Number (TSN) couples together the transmission
of all data chunks. [...] As a result, in SCTP if a packet is lost,
all the packets with TSN after this lost packet cannot be received
until it is retransmitted.
That statement surprised me because:
removing head-of-line blocking is a stated goal of SCTP
SCTP does have a per-stream sequence number, see below quote from RFC 4960, which should allow processing per stream, regardless of the association-global TSN
SCTP has been in use in the telecommunications sector for perhaps close to 2 decades, so how could this have been missed?
Internally, SCTP assigns a Stream Sequence Number to each message
passed to it by the SCTP user. On the receiving side, SCTP ensures
that messages are delivered to the SCTP user in sequence within a
given stream. However, while one stream may be blocked waiting for
the next in-sequence user message, delivery from other streams may
proceed.
Also, there is a paper, Head-of-line Blocking in TCP and SCTP: Analysis and Measurements, that actually measures the round-trip time of a multiplexed echo service in the face of packet loss and concludes:
Our results reveal that [..] a small number of SCTP streams or SCTP unordered mode can avoid this head-of-line blocking. The alternative solution of multiple TCP connections performs worse in most cases.

The answer is not very scholarly, but at least according to the specification in RFC 4960, SCTP seems capable of circumventing head-of-line blocking. The relevant claim seems to be in Section 7.1.
Note: TCP guarantees in-sequence delivery of data to its upper-layer protocol within a single TCP session. This means that when TCP notices a gap in the received sequence number, it waits until the gap is filled before delivering the data that was received with sequence numbers higher than that of the missing data. On the other hand, SCTP can deliver data to its upper-layer protocol even if there is a gap in TSN if the Stream Sequence Numbers are in sequence for a particular stream (i.e., the missing DATA chunks are for a different stream) or if unordered delivery is indicated. Although this does not affect cwnd, it might affect rwnd calculation.
The open question is what "are in sequence for a particular stream" entails. There is a stipulation about delaying delivery to the upper layer until chunks are reordered (see Section 6.6, below), but reordering does not seem to be conditional on filling the gaps at the level of the association. Also note the mention in Section 6.2 of the subtle distinction between ACK and delivery to the ULP (Upper Layer Protocol).
Whether other stipulations of the RFC indirectly result in the occurrence of HOL blocking, and whether the mechanism is effective in real-life implementations and situations, are questions that warrant further investigation.
Below are some of the excerpts which I've come across in the RFC and which may be relevant.
RFC 4960, Section 6.2 Acknowledgement on Reception of DATA Chunks
When the receiver's advertised window is 0, the receiver MUST drop any new incoming DATA chunk with a TSN larger than the largest TSN received so far. If the new incoming DATA chunk holds a TSN value less than the largest TSN received so far, then the receiver SHOULD drop the largest TSN held for reordering and accept the new incoming DATA chunk. In either case, if such a DATA chunk is dropped, the receiver MUST immediately send back a SACK with the current receive window showing only DATA chunks received and accepted so far. The dropped DATA chunk(s) MUST NOT be included in the SACK, as they were not accepted.
Under certain circumstances, the data receiver may need to drop DATA chunks that it has received but hasn't released from its receive buffers (i.e., delivered to the ULP). These DATA chunks may have been acked in Gap Ack Blocks. For example, the data receiver may be holding data in its receive buffers while reassembling a fragmented user message from its peer when it runs out of receive buffer space. It may drop these DATA chunks even though it has acknowledged them in Gap Ack Blocks. If a data receiver drops DATA chunks, it MUST NOT include them in Gap Ack Blocks in subsequent SACKs until they are received again via retransmission. In addition, the endpoint should take into account the dropped data when calculating its a_rwnd.
These circumstances highlight how a sender may receive acknowledgement for chunks that are ultimately not delivered to the ULP (Upper Layer Protocol). Note that this applies to chunks with a TSN higher than the Cumulative TSN (i.e. those acknowledged in Gap Ack Blocks). This, together with the unreliability of SACK ordering, is a good reason for the stipulation in Section 7.1 (see below).
RFC 4960, Section 6.6 Ordered and Unordered Delivery
Within a stream, an endpoint MUST deliver DATA chunks received with the U flag set to 0 to the upper layer according to the order of their Stream Sequence Number. If DATA chunks arrive out of order of their Stream Sequence Number, the endpoint MUST hold the received DATA chunks from delivery to the ULP until they are reordered.
This is the only stipulation on ordered delivery within a stream in this section; seemingly, reordering does not depend on filling the gaps in ACK-ed chunks.
RFC 4960, Section 7.1 SCTP Differences from TCP Congestion Control
Gap Ack Blocks in the SCTP SACK carry the same semantic meaning as the TCP SACK. TCP considers the information carried in the SACK as advisory information only. SCTP considers the information carried in the Gap Ack Blocks in the SACK chunk as advisory. In SCTP, any DATA chunk that has been acknowledged by SACK, including DATA that arrived at the receiving end out of order, is not considered fully delivered until the Cumulative TSN Ack Point passes the TSN of the DATA chunk (i.e., the DATA chunk has been acknowledged by the Cumulative TSN Ack field in the SACK).
This is stated from the perspective of the sending endpoint, and is accurate for the reason emphasized in section 6.6 above.
Note: TCP guarantees in-sequence delivery of data to its upper-layer protocol within a single TCP session. This means that when TCP notices a gap in the received sequence number, it waits until the gap is filled before delivering the data that was received with sequence numbers higher than that of the missing data. On the other hand, SCTP can deliver data to its upper-layer protocol even if there is a gap in TSN if the Stream Sequence Numbers are in sequence for a particular stream (i.e., the missing DATA chunks are for a different stream) or if unordered delivery is indicated. Although this does not affect cwnd, it might affect rwnd calculation.
This seems to be the core answer to what interests you.
In support of this argument, see the format of the SCTP SACK chunk as described here and here.
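
The delivery rule quoted from Sections 6.6 and 7.1 can be illustrated with a toy receiver model. This is only a sketch of the rule as the RFC text states it, not of any real SCTP stack: the class name SctpReceiverModel is invented, and fragmentation, sequence number wrap-around, and receive window handling are all ignored.

```python
from collections import defaultdict

class SctpReceiverModel:
    """Toy model: delivery is gated by per-stream SSN, not by the global TSN."""

    def __init__(self):
        self.next_ssn = defaultdict(int)   # next expected SSN for each stream
        self.held = defaultdict(dict)      # stream -> {ssn: data} awaiting reordering
        self.delivered = []                # (stream, data) in delivery order

    def receive(self, tsn, stream, ssn, data, unordered=False):
        # The TSN would drive SACK generation; it plays no role in delivery here.
        if unordered:                      # U flag set: deliver immediately
            self.delivered.append((stream, data))
            return
        self.held[stream][ssn] = data
        # Release every chunk that is now in sequence for this stream only.
        while self.next_ssn[stream] in self.held[stream]:
            chunk = self.held[stream].pop(self.next_ssn[stream])
            self.delivered.append((stream, chunk))
            self.next_ssn[stream] += 1

rx = SctpReceiverModel()
# TSN 1 (stream 0, SSN 0) is lost in transit; TSN 2 and TSN 3 arrive.
rx.receive(tsn=2, stream=1, ssn=0, data="b0")   # other stream: delivered at once
rx.receive(tsn=3, stream=0, ssn=1, data="a1")   # same stream as the loss: held
after_loss = list(rx.delivered)
rx.receive(tsn=1, stream=0, ssn=0, data="a0")   # retransmission fills the gap
```

In this model, after_loss contains only the stream 1 message, and stream 0 catches up once the retransmission arrives. Whether a given implementation behaves this way in practice is exactly the open question raised above.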

How to track mpeg ts i b p frames in a PCAP file through wireshark/dpkt

I am working through a PCAP file in Wireshark that consists of a single channel of MPEG TS packets carried over UDP, and I have a few questions:
What's the difference between the TS packets and the PES packets? The TS packets are far more numerous.
Is there a way to analyze the payloads of the TS packets and extract the I, B and P frames from the data, along with timestamps, so that I could perhaps see their throughput?

You can usually check for I frames by looking at the "random access indicator" bit, but it is possible for the muxer to set that incorrectly. For P and B frames, it is codec dependent: each codec has a different process that requires parsing the codec bitstream.
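
As a rough illustration, the sketch below reads the fields of a single 188-byte TS packet that matter here: the PID, the payload_unit_start_indicator (set on the TS packet in which a new PES packet begins, which is why one PES packet spans many TS packets), and the random access indicator in the adaptation field. The function name parse_ts_packet and the returned dictionary keys are invented for this example; extracting the UDP payload from the capture (for instance with dpkt) is left out.

```python
TS_PACKET_SIZE = 188
TS_SYNC_BYTE = 0x47

def parse_ts_packet(packet):
    """Extract a few header fields from one MPEG transport stream packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != TS_SYNC_BYTE:
        raise ValueError("expected a 188-byte packet starting with sync byte 0x47")

    payload_unit_start = bool(packet[1] & 0x40)       # a PES packet or section starts here
    pid = ((packet[1] & 0x1F) << 8) | packet[2]       # 13-bit packet identifier
    adaptation_field_control = (packet[3] >> 4) & 0x03
    continuity_counter = packet[3] & 0x0F

    random_access = False
    # Bit 1 of adaptation_field_control means an adaptation field is present.
    # Its first byte is the length, its second byte carries the flags.
    if adaptation_field_control & 0x02 and packet[4] > 0:
        random_access = bool(packet[5] & 0x40)

    return {
        "pid": pid,
        "payload_unit_start": payload_unit_start,
        "continuity_counter": continuity_counter,
        "random_access": random_access,
    }
```

A UDP datagram typically carries several TS packets back to back, so in practice the datagram payload would be cut into 188-byte slices before calling this on each slice.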

How to parse WMV (ASF) file? Can't find length of data packets

I am trying to parse WMV (ASF) files without any SDK, just by decoding the raw bytes. I have a problem with ASF_Data_Object: I can't find the length of a data packet, more precisely of a single-payload data packet.
See image:
Here I have 9 packets, but I am unable to find the size of an individual packet. How can I determine the border between packets?
I think my problem is at byte 0x411, in the "Length Type Flags" field. As you can see, the value there is 0, so all flags are zero, including Packet Length Type.
Yes, a value of 0 is allowed here. But how do I read this type of content?
This is not a compressed payload, as the replicated data length is 8, not 1. So this is a single payload without additional size fields.
Sample of WMV file: https://files.catbox.moe/b51l2j.wmv

You seem to have fixed-size packets with no explicit payload length included, meaning that the payload data size is derived from the top-level data object structure.
Spec quote commented:
That is, the ASF data object carries 9 packets of 3200 bytes each; internally, each packet contains 3174 bytes of payload, except the last one, which has less data and some padding.
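
With fixed-size packets, the packet borders follow from simple arithmetic. The sketch below assumes the usual ASF layout: the Data Object begins with a 50-byte header (object GUID, object size, File ID, total data packet count, reserved field), and the packet size is the value that the File Properties Object gives for both its minimum and maximum data packet size. The function name and parameters are made up for illustration, and the offset of the Data Object itself has to come from your own parsing.

```python
# Size of the fields that precede the first data packet inside the Data Object:
# GUID (16) + object size (8) + File ID (16) + total data packets (8) + reserved (2)
ASF_DATA_OBJECT_HEADER_SIZE = 50

def packet_offsets(data_object_offset, packet_size, packet_count):
    """Return the file offset at which each fixed-size data packet starts."""
    first_packet = data_object_offset + ASF_DATA_OBJECT_HEADER_SIZE
    return [first_packet + i * packet_size for i in range(packet_count)]
```

For the file in the question this would be called with a packet size of 3200 and a packet count of 9; each returned offset is where the next packet's error correction and payload parsing information starts.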

How to MUX RTP stream depending on the type of NAL unit

My work is like the following:
Before streaming a video, the type, size and start address of each NAL unit is extracted using H.264 bitstream parser.
During streaming, a single NAL unit is encapsulated into one RTP packet.
My question: I need to use the extracted NAL unit information as input to a mux, so that the mux can determine whether an RTP packet contains a specific NAL unit type such as PPS. If so, it will steer the packet to a TCP tunnel; otherwise, the RTP packet will be steered to a UDP tunnel.
FYI: I'm using OEFMON, which integrates Qualnet and DirectShow.
Any help will be appreciated.

The RTP header does not carry any information about the NAL type. You have to parse the RTP payload to get the NAL type.
The following code snippet shows the basics:
nType = getbits(pRaw+12, 3, 5);
where pRaw is the start of the entire RTP packet, which makes pRaw + 12 the start of the RTP payload (assuming the fixed 12-byte RTP header, with no CSRC entries and no header extension). So the function essentially reads the value of the 5 bits starting at bit offset 3 from the beginning of the RTP payload data.
This is defined in RFC 6184.
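
A slightly more defensive version of the same idea, sketched in Python, also skips any CSRC entries and header extension before reading the NAL type. The function names and the choice of which NAL types go to the TCP tunnel (7 for SPS and 8 for PPS here) are assumptions made for illustration; the question only mentions PPS as an example.

```python
RTP_FIXED_HEADER_SIZE = 12
TCP_NAL_TYPES = frozenset({7, 8})   # assumed routing policy: SPS and PPS over TCP

def rtp_payload_offset(packet):
    """Return the offset of the RTP payload, past CSRCs and any header extension."""
    if len(packet) < RTP_FIXED_HEADER_SIZE:
        raise ValueError("packet shorter than the fixed RTP header")
    csrc_count = packet[0] & 0x0F
    has_extension = bool(packet[0] & 0x10)
    offset = RTP_FIXED_HEADER_SIZE + 4 * csrc_count
    if has_extension:
        if len(packet) < offset + 4:
            raise ValueError("truncated RTP header extension")
        # Extension header: 16-bit profile id, then 16-bit length in 32-bit words.
        ext_words = int.from_bytes(packet[offset + 2:offset + 4], "big")
        offset += 4 + 4 * ext_words
    return offset

def nal_unit_type(packet):
    """Read the 5-bit NAL unit type from the first payload byte."""
    offset = rtp_payload_offset(packet)
    if offset >= len(packet):
        raise ValueError("RTP packet has no payload")
    return packet[offset] & 0x1F

def choose_tunnel(packet):
    """Return "tcp" for the selected NAL types and "udp" for everything else."""
    return "tcp" if nal_unit_type(packet) in TCP_NAL_TYPES else "udp"
```

This matches the setup in the question, where each RTP packet carries exactly one NAL unit. With aggregation or fragmentation packets, the first payload byte would report type 24 or 28 instead, and the inner NAL type would have to be read from the aggregated unit or the FU header.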

How can I convert the RTP payload containing SILK-encoded audio into a file?

I have the pcap of a VoIP call involving SILK.
I'm able to see in Wireshark the RTP payload.
From the RTP headers I can understand the sample rate (e.g. 24 kHz) and the frame size (e.g. 20 ms).
What I'd like to do is extract the RTP payload and generate a file containing the SILK-encoded audio.
From the RTP payload format description I can see that in the case of storage in a file, each block of audio needs a block header, to specify the sample rate and block size (because the block size is variable and can be different on each frame).
How can I generate a file with the correct file header ("magic number") and add a block header for each block of audio?
I can use a few different programming languages so I'm mainly interested in the required algorithm, but would appreciate references to code implementations (or perhaps some existing tool?).

Use pcaputil of pjproject: Converts captured RTP packets in PCAP file to WAV file or play it to audio device. Can filter the PCAP file on source or destination IP or port, is able to deal with SRTP and supports all codecs in PJMEDIA, including SILK (have not tried this myself).
Examples:
pcaputil file.pcap output.wav
pcaputil -c AES_CM_128_HMAC_SHA1_80 -k VLDONbsbGl2Puqy+0PV7w/uGfpSPKFevDpxGsxN3 file.pcap output.wav
