How can I get the length of an IDR slice in an H264 stream - parsing

Please guide me to resolve this issue.
I have parsed the h264 video stream and identified the frames[I/P/B]. I have followed the below steps.
• NAL Units start code: 00 00 01 X Y
• X = IDR Picture NAL Units (25, 45, 65)
• Y = Non IDR Picture NAL Units (01, 21, 41, 61) ; 01 = b-frames, 41 = p-frames
Now my question is how to find the length of each individual frame so that I can write each frame to a file. Please give some help.
Regards,
Spk

OK, so your source is an Annex B formatted elementary stream. Every NALU begins with a start code (two or more 0x00 bytes followed by a 0x01 byte). The next byte is the NAL header; its low 5 bits hold the NAL unit type. The rest is payload. The NALU ends when the next start code is encountered, or when you reach the end of the stream. So, to get the length, look for the next start code and subtract the two offsets.
You will likely find this post useful. Possible Locations for Sequence/Picture Parameter Set(s) for H.264 Stream
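The start-code scan described in this answer can be sketched in Python roughly as follows. This is a minimal illustration, not a full parser: the function name `split_nal_units` is made up for the example, the whole stream is assumed to fit in memory, and trailing zero bytes are stripped on the assumption that they belong to the next (4-byte) start code or to padding, since a NAL unit does not end in 0x00.

```python
def split_nal_units(data: bytes):
    """Return a list of (payload_offset, nal_bytes) for an Annex B byte stream.

    Hypothetical helper for illustration only.
    """
    start_code = b"\x00\x00\x01"

    # Collect the offset of every 00 00 01 pattern in the stream.
    positions = []
    pos = data.find(start_code)
    while pos != -1:
        positions.append(pos)
        pos = data.find(start_code, pos + 3)

    units = []
    for k, sc in enumerate(positions):
        begin = sc + 3  # first byte after the start code (the NAL header)
        end = positions[k + 1] if k + 1 < len(positions) else len(data)
        # Zero bytes right before the next start code are the leading zero
        # of a 4-byte start code (or padding), not part of this NAL unit.
        nal = data[begin:end].rstrip(b"\x00")
        units.append((begin, nal))
    return units


if __name__ == "__main__":
    stream = (b"\x00\x00\x00\x01\x67\xaa\xbb"
              b"\x00\x00\x01\x65\x11\x22\x33"
              b"\x00\x00\x00\x01\x41\x99")
    for offset, nal in split_nal_units(stream):
        print(offset, len(nal), nal[0] & 0x1F)  # offset, length, nal_unit_type
```

The length of each NAL unit is then simply `len(nal)`, and each one can be written to its own file.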

Printing NV logo with Star SP700 (ESC-POS)

I'm trying to print the logo stored in my printer.
\x1B\x1C\x70\x00\x1
This is the code I'm executing.
But I still can't see the logo when I print.
What am I missing?
I'm using C++, by the way.
Printer: Star SP700
According to the specification, the 4th byte is a value in the range 0x01 to 0xFF instead of 0x00.
Dot Impact Printer STAR Command Specifications Rev. 1.91
Page 56, section 3-48
ASCII ESC FS p n m
Hexadecimal 1B 1C 70 n m
n: Logo Specification
n Function
1 to 255 Specified logo number
This works for me: \x1B\x1C\x70\x01\x00
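As a rough sketch of the byte sequence quoted above (ESC FS p n m), the command can be assembled like this in Python. The helper name `build_logo_command` is hypothetical, and `m` is passed through unchanged because its meaning is not quoted in the excerpt; only the 1 to 255 range of `n` comes from the specification text cited here.

```python
def build_logo_command(n: int, m: int = 0) -> bytes:
    """Build the ESC FS p n m sequence (1B 1C 70 n m).

    Hypothetical helper; n is the stored logo number (1 to 255).
    """
    if not 1 <= n <= 255:
        raise ValueError("logo number n must be in the range 1 to 255")
    if not 0 <= m <= 255:
        raise ValueError("m must fit in one byte")
    return bytes([0x1B, 0x1C, 0x70, n, m])


if __name__ == "__main__":
    # Same bytes as the working sequence \x1B\x1C\x70\x01\x00
    print(build_logo_command(1).hex(" "))
```

Passing `n = 0`, as in the original question, is rejected here because it falls outside the documented range.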

Direct Mapped Cache of Blocks Example

So I have this question in my homework assignment that I have been struggling with a bit. I looked over my lecture content/notes and have been able to use those to answer the questions; however, I am not 100% sure that I did everything correctly. There are two parts (parts C and D) of the question that I was not able to figure out even after consulting my notes and online sources. I am not looking for a solution to those two parts by any means, but it would be greatly appreciated if I could get at least a nudge in the right direction on how to go about solving them.
I know this is a rather large question; however, I hope someone could check my answers and tell me whether my work and my way of looking at this problem are correct. As always, thank you for any help :)
Alright, so now that we have the formalities out of the way,
--------------------------Here is the Question:--------------------------
Suppose a small direct-mapped cache of blocks with 32 blocks is constructed. Each cache block stores
eight 32-bit words. The main memory—which is byte addressable—is 16,384 bytes in size. 32-bit words are stored
word aligned in memory, i.e., at an address that is divisible by 4.
(a) How many 32-bit words can the memory store (in decimal)?
(b) How many address bits would be required to address each byte of memory?
(c) What is the range of memory addresses, in hex? That is, what are the addresses of the first and last bytes of
memory? I'll give you a hint: memory addresses are numbered starting at 0.
(d) What would be the address of the last word in memory?
(e) Using the cache mapping scheme discussed in the Chapter 5 lecture notes, how many and which address bits
would be used to form the block offset?
(f) How many and which memory address bits would be used to form the cache index?
(g) How many and which address bits would be used to form the tag field for each cache block?
(h) To which cache block (in decimal) would memory address 0x2A5C map to?
(i) What would be the block offset (in decimal) for 0x2A5C?
(j) How many other main memory words would map to the same block as 0x2A5C?
(k) When the word at 0x2A5C is moved into a cache block, what are the memory addresses (in hex) of the other
words which will also be moved into this block? Express your answer as a range, e.g., [0x0000, 0x0200].
(l) The first word of a main memory block that is mapped to a cache block will always be at an address that is
divisible by __ (in decimal)?
(m) Including the V and tag bits of each cache block, what would be the total size of the cache (in bytes)
(n) what would be the size allocated for the data bits (in bytes)?
----------------------My answers and work-----------------------------------
a) memory = 16384 bytes. 16384 bytes into bits = 131072 bits. 131072/32 = 4096 32-bit words
b) 2^14 (main memory) * 2^2 (4 bits/word) = 2^16. take log(base2)(2^16) = 16 bits
c) couldnt figure this part out (would appreciate some input (NOT A SOLUTION) on how i can go about looking at this problem
d)could not figure this part out either :(
e)8 words in each cache line. 8 * 4(2^2 bits/word) = 32 bits in each cache line. log(base2)(2^5) = 5 bits used for block offset.
f) # of blocks = 2^5 = 32 blocks. log(base2)(2^5) = 5 bits for cache index
g) tag = 16 - 5 - 5 - 2(word alignment) = 4 bits
h) 0x2A5C
0010 10100 10111 00
tag index offset word aligned bits
maps to cache block index = 10100 = 0x14
i) maps to block offset = 10111 = 0x17
j) 4 tag bits, 5 block offset = 2^9 other main memory words
k) it is a permutation of the block offsets. so it maps the memory addresses with the same tag and cache index bits and block offsets of 0x00 0x01 0x02 0x04 0x08 0x10 0x11 0x12 0x14 0x18 0x1C 0x1E 0x1F
l)divisible by 4
m) 2(V+tag+data) = 2(1+4+2^3*2^5) = 522 bits = 65.25 bytes
n)data bits = 2^5 blocks * 2^3 words per block = 256 bits = 32 bytes
Part C:
If a memory has M bytes, and the memory is byte addressable, then the memory addresses range from 0 to M - 1.
For your question, this means that memory addresses range from 0 to 16383, or in hex 0x0 to 0x3FFF.
Part D:
Words are 4 bytes long. So given your answer to C, the last word is at:
(0x3FFF - 3) -> 0x3FFC.
You can see that this is correct because the lowest 2 bits of the address are 0, which must be true of any 4 byte aligned address.
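The arithmetic for parts C and D can be checked with a few lines of Python. This is only a sanity check of the numbers above; the variable names are chosen for the example.

```python
MEMORY_BYTES = 16384   # size of main memory from the question
WORD_BYTES = 4         # a 32-bit word is 4 bytes

first_byte = 0
last_byte = MEMORY_BYTES - 1            # addresses run from 0 to M - 1
last_word = MEMORY_BYTES - WORD_BYTES   # same as last_byte - 3

print(hex(first_byte), hex(last_byte))  # range of byte addresses
print(hex(last_word))                   # address of the last aligned word

# A word-aligned address is divisible by 4, i.e. its low 2 bits are 0.
print(last_word % WORD_BYTES == 0)
```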

Incorrect values from reading image EXIF Orientation on iOS?

I am using Exif information to have a correct rotation for an image captured from mobile camera.
In Android version the possible values are 1,3,6,8, and 9.
In iOS, I am using the same code, but getting invalid values like 393216, 196608, 524288, 65536 etc..
I don't understand why there is such a difference ?
Short answer:
For iOS you need to read those bytes in reverse order for correct value. Plus you are incorrectly reading 24-bits (3 bytes) instead of just 16-bits (2 bytes). Or maybe you are extracting 2 bytes but somehow your bytes are getting an extra "zero" byte added at the end??
You could try an OR check inside an if statement that tests both byte-order equivalents. Since Android = 3 would become iOS = 768, you can try:
if (orient_val == 3 || orient_val == 768)
{ /* do whatever you do here */ }
PS: 1==256 2==512 3==768 4==1024 5==1280 6==1536 7==1792 8==2048, 9==2304
long version:
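The two platforms are evidently handing you the bytes in opposite orders: one as little-endian, the other as big-endian. (Strictly, the byte order of EXIF data is declared in the TIFF header inside the file, "II" for little-endian and "MM" for big-endian, rather than being fixed by the processor.) Basically one order is read right-to-left and the other left-to-right: where one has ABCD, the other has DCBA.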
Some pointers:
Your 3, as 2 bytes, is written 03 00 in little-endian but 00 03 in big-endian.
Problem is, if you don't adapt and read those 2 bytes in the wrong order (03 00 taken as big-endian), you get 768.
Worse still, somehow you are reading it as 03 00 00, which gives you 196608.
Another is 06 00 00 giving you 393216 instead of reading 06 00 for 1536.
Fix your code to drop the extra 00 byte at the end.
You were lucky on Android cos I suspect it wants 4 bytes instead of 2 bytes. So that 00 00 06 was being read as 00 00 00 06 and since x000006 and x00000006 mean the same thing=6.
Anyways to fix this normally you could just tell AS3 to consider your Jpeg bytes as Big Endian but that would now fix iOS but then break it on Android.
A quick easy solution is to check if the number you got is bigger than 1 digit, if it is then you assume app is running on iOS and try reverse-ordering to see if now the result is 1 digit. So..
Note: option B shown in code is risky because if you have wrong numbers anyway you'll get a wrong result. You know computers.. "bad input = bad output; do Next();"
import flash.utils.ByteArray;
var Orientation_num:uint = 0;
var jpeg_bytes:ByteArray = new ByteArray(); //holds entire JPEG data as bytes
var bytes_val:ByteArray = new ByteArray(); //holds byte values as needed
Orientation_num = 2048; //Example: Detected big number that should be 8.
if (Orientation_num > 8) //since 8 is maximum of orientation types
{
    trace ("Orientation_num is too big : Attempting fix..");
    //## A: CORRECT.. Either read directly from JPEG bytes
    //jpeg_bytes.position = (XX) - 1; //where XX is start of EXIF orientation (2 bytes)
    //bytes_val = jpeg_bytes.readShort(); //extracts the 2 bytes
    //## B: RISKY.. Or use the already detected big number anyway
    bytes_val.writeShort(Orientation_num);
    //Flip the bytes : Make x50 x00 become x00 x50
    var tempNum_ba:ByteArray = new ByteArray(); //temporary number as bytes
    tempNum_ba[0] = bytes_val[1];
    tempNum_ba[1] = bytes_val[0];
    //tempNum_ba.position = 0; //reset pos before checking
    Orientation_num = tempNum_ba.readShort(); //pos also MOVES forward by 2 bytes
    trace ("Orientation_num (FIXED) : " + Orientation_num);
}
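For comparison, here is the same "risky" recovery idea (option B) as a small Python sketch. It is an assumption-laden illustration, not a replacement for reading the tag with the correct byte order: the helper names `swap16` and `normalize_orientation` are invented for the example, and the third branch (shifting right by 16 bits) is an extra guess added here because the reported values 196608 and 393216 equal 3 and 6 shifted left by 16 bits.

```python
def swap16(value: int) -> int:
    """Swap the two bytes of a 16-bit value (0x0300 becomes 0x0003)."""
    return ((value & 0xFF) << 8) | ((value >> 8) & 0xFF)


def normalize_orientation(value: int):
    """Try to recover an orientation in 1..8 from a mis-read integer.

    Hypothetical helper. Returns None if no interpretation fits.
    """
    if 1 <= value <= 8:
        return value                      # already a plausible orientation
    swapped = swap16(value & 0xFFFF)
    if 1 <= swapped <= 8:
        return swapped                    # 2 bytes read in the wrong order
    shifted = value >> 16
    if value & 0xFFFF == 0 and 1 <= shifted <= 8:
        return shifted                    # 2-byte value read as part of 4 bytes
    return None


if __name__ == "__main__":
    for raw in (3, 768, 2048, 196608, 393216, 524288, 65536):
        print(raw, normalize_orientation(raw))
```

As the note above says, guessing from an already wrong number is fragile; reading the 2 bytes from the JPEG with the byte order declared in the file is the safer route.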

Non IDR Picture NAL Units - 0x21 and 0x61 meaning

Does anyone know what 0x21 and 0x61 mean in an H.264 encoded video stream?
I know that 0x01 means it's a B-frame and 0x41 means it's a P-frame. My encoded video gives me two 0x21 frames followed by one B-frame.
I 21 21 B 21 21 B......
What is this 0x21?
First point: a NALU is not the same thing as a frame. A frame can contain more than one NALU (but not fewer than one). A frame can also be made up of more than one slice type; a single frame can have I, B and P slices. If it is an IDR frame, then EVERY slice of that frame must be IDR.
0x01 is NOT a B slice. It is a "Coded slice of a non-IDR picture", exactly like 0x21 and 0x61. It could be an I, P or B slice (or SP/SI); you need to parse slice_type from the slice header to know more.
From H.264 spec:
7.3.1 NAL unit syntax
forbidden_zero_bit - 1 bit - shall be equal to 0.
nal_ref_idc - 2 bits - not equal to 0 specifies that the content of the NAL unit contains a sequence parameter set [...]
nal_unit_type - 5 bits - specifies the type of RBSP data structure contained in the NAL unit [...]
0x21 and 0x61 make it NAL unit type 1 (Coded slice of a non-IDR picture) with different values for nal_ref_idc.
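The three header fields quoted from section 7.3.1 can be pulled out of the first NAL byte with plain bit operations. The following Python sketch (the function name `parse_nal_header` is made up for the example) shows why 0x01, 0x21, 0x41 and 0x61 all decode to NAL unit type 1 and differ only in nal_ref_idc.

```python
def parse_nal_header(byte: int):
    """Split the one-byte NAL header into its three fields.

    Hypothetical helper; returns (forbidden_zero_bit, nal_ref_idc, nal_unit_type).
    """
    forbidden_zero_bit = (byte >> 7) & 0x01  # top bit, shall be 0
    nal_ref_idc = (byte >> 5) & 0x03         # next 2 bits
    nal_unit_type = byte & 0x1F              # low 5 bits
    return forbidden_zero_bit, nal_ref_idc, nal_unit_type


if __name__ == "__main__":
    for header in (0x01, 0x21, 0x41, 0x61, 0x25, 0x45, 0x65):
        print(hex(header), parse_nal_header(header))
```

Type 1 is a coded slice of a non-IDR picture and type 5 is a coded slice of an IDR picture; neither field says whether the slice is I, P or B.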
UPD. There is no one-to-one mapping from a specific bit, especially one at a fixed position from the beginning of the "frame", to whether it is an I/P/B frame. You will need to parse the bitstream and read the values per 7.4.3 Slice header semantics of the H.264 spec (this is doable in most cases since the value is very close to the beginning of the bitstream; check the H.264 spec for details).

websocket client packet unframe/unmask

I am trying to implement the latest WebSocket spec. However, I am unable to get through the unmasking step after a successful handshake.
I receive following web socket frame:
<<129,254,1,120,37,93,40,60,25,63,71,88,92,125,80,81,73,
51,91,1,2,53,92,72,85,103,7,19,79,60,74,94,64,47,6,83,
87,58,7,76,87,50,92,83,70,50,68,19,77,41,92,76,71,52,
70,88,2,125,90,85,65,96,15,14,20,107,31,14,28,100,27,9,
17,122,8,72,74,96,15,86,68,37,68,18,76,48,15,28,93,48,
68,6,73,60,70,91,24,122,77,82,2,125,80,81,85,45,18,74,
64,47,91,85,74,51,21,27,20,115,24,27,5,37,69,80,75,46,
18,68,72,45,88,1,2,40,90,82,31,37,69,76,85,103,80,94,
74,46,64,27,5,60,75,87,24,122,25,27,5,47,71,73,81,56,
21,27,93,48,88,76,31,57,77,74,11,55,73,68,73,115,65,81,
31,104,26,14,23,122,8,75,68,52,92,1,2,110,24,27,5,53,
71,80,65,96,15,13,2,125,75,83,75,41,77,82,81,96,15,72,
64,37,92,19,93,48,68,7,5,62,64,93,87,46,77,72,24,40,92,
90,8,101,15,28,83,56,90,1,2,108,6,13,21,122,8,82,64,42,
67,89,92,96,15,93,19,56,28,8,65,101,31,94,16,105,28,10,
20,56,30,14,65,56,27,93,71,106,16,11,17,63,25,4,17,57,
73,89,17,59,29,88,29,106,24,27,5,46,65,72,64,54,77,69,
24,122,66,93,93,49,5,12,8,109,15,28,76,59,90,93,72,56,
76,1,2,41,90,73,64,122,8,89,85,50,75,84,24,122,25,15,
23,105,25,5,19,106,26,14,20,111,25,27,5,53,77,85,66,53,
92,1,2,110,26,13,2,125,95,85,65,41,64,1,2,108,27,10,19,
122,7,2>>
As per base framing protocol defined here (https://datatracker.ietf.org/doc/html/draft-ietf-hybi-thewebsocketprotocol-17#section-5.2) i have:
fin:1, rsv:0, opcode:1, mask:1, length:126
Masked application+payload data comes out to be:
<<87,58,7,76,87,50,92,83,70,50,68,19,77,41,92,76,71,52,70,88,2,125,90,85,65,96,
15,14,20,107,31,14,28,100,27,9,17,122,8,72,74,96,15,86,68,37,68,18,76,48,15,
28,93,48,68,6,73,60,70,91,24,122,77,82,2,125,80,81,85,45,18,74,64,47,91,85,
74,51,21,27,20,115,24,27,5,37,69,80,75,46,18,68,72,45,88,1,2,40,90,82,31,37,
69,76,85,103,80,94,74,46,64,27,5,60,75,87,24,122,25,27,5,47,71,73,81,56,21,
27,93,48,88,76,31,57,77,74,11,55,73,68,73,115,65,81,31,104,26,14,23,122,8,75,
68,52,92,1,2,110,24,27,5,53,71,80,65,96,15,13,2,125,75,83,75,41,77,82,81,96,
15,72,64,37,92,19,93,48,68,7,5,62,64,93,87,46,77,72,24,40,92,90,8,101,15,28,
83,56,90,1,2,108,6,13,21,122,8,82,64,42,67,89,92,96,15,93,19,56,28,8,65,101,
31,94,16,105,28,10,20,56,30,14,65,56,27,93,71,106,16,11,17,63,25,4,17,57,73,
89,17,59,29,88,29,106,24,27,5,46,65,72,64,54,77,69,24,122,66,93,93,49,5,12,8,
109,15,28,76,59,90,93,72,56,76,1,2,41,90,73,64,122,8,89,85,50,75,84,24,122,
25,15,23,105,25,5,19,106,26,14,20,111,25,27,5,53,77,85,66,53,92,1,2,110,26,
13,2,125,95,85,65,41,64,1,2,108,27,10,19,122,7,2>>
While the 32-bit masking key is:
<<37,93,40,60,25,63,71,88,92,125,80,81,73,51,91,1,2,53,92,72,85,103,7,19,79,60,
74,94,64,47,6,83>>
As per https://datatracker.ietf.org/doc/html/draft-ietf-hybi-thewebsocketprotocol-17#section-5.2 :
j = i MOD 4
transformed-octet-i = original-octet-i XOR masking-key-octet-j
However, I don't seem to get the original octets sent from the client side, which is basically an XML packet. Any direction, corrections or suggestions are greatly appreciated.
I think you've mis-read the data framing section of the protocol spec.
Your interpretation of the first byte (129) is correct - fin + opcode 1 - final (and first) fragment of a text message.
The next byte (254) implies that the body of the message is masked and that the following 2 bytes provide its length (length values of 126 or 127 imply longer messages whose lengths can't be represented in 7 bits: 126 means that the following 2 bytes hold the length; 127 means that the following 8 bytes do).
The following 2 bytes - 1, 120 - imply a message length of 376 bytes.
The following 4 bytes - 37,93,40,60 - are your mask.
The remaining data is your message which should be transformed as you write, giving the message
<body xmlns='http://jabber.org/protocol/httpbind' rid='2167299354' to='jaxl.im' xml:lang='en' xmpp:version='1.0' xmlns:xmpp='urn:xmpp:xbosh' ack='1' route='xmpp:dev.jaxl.im:5222' wait='30' hold='1' content='text/xml; charset=utf-8' ver='1.10' newkey='a6e44d87b54461e62de3ab7874b184dae4f5d870' sitekey='jaxl-0-0' iframed='true' epoch='1324196722121' height='321' width='1366'/>
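The framing steps walked through in this answer can be sketched in Python as follows. It is a minimal illustration that assumes one complete, unfragmented frame is already in memory; the function names `unmask` and `parse_frame` are chosen for the example and it does no validation of reserved bits or control frames.

```python
import struct


def unmask(payload: bytes, key: bytes) -> bytes:
    """XOR each payload octet i with masking-key octet (i MOD 4)."""
    return bytes(b ^ key[i % 4] for i, b in enumerate(payload))


def parse_frame(frame: bytes):
    """Return (fin, opcode, payload) for a single complete frame."""
    fin = frame[0] >> 7
    opcode = frame[0] & 0x0F
    masked = frame[1] >> 7
    length = frame[1] & 0x7F
    pos = 2
    if length == 126:                     # next 2 bytes hold the length
        (length,) = struct.unpack_from("!H", frame, pos)
        pos += 2
    elif length == 127:                   # next 8 bytes hold the length
        (length,) = struct.unpack_from("!Q", frame, pos)
        pos += 8
    key = b""
    if masked:                            # 4-byte masking key follows
        key = frame[pos:pos + 4]
        pos += 4
    payload = frame[pos:pos + length]
    if masked:
        payload = unmask(payload, key)
    return fin, opcode, payload


if __name__ == "__main__":
    # Header and key taken from the frame in the question; the body is a
    # stand-in message of the same length (376 bytes).
    header = bytes([129, 254, 1, 120, 37, 93, 40, 60])
    message = b"<body" + b"x" * 371
    frame = header + unmask(message, header[4:])  # masking is its own inverse
    print(parse_frame(frame)[:2], len(parse_frame(frame)[2]))
```

With the real frame from the question, the first four masked bytes 25, 63, 71, 88 XORed with the key 37, 93, 40, 60 give `<bod`, the start of the XML shown above.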
