H.264 NALU Byte Alignment - parsing

I am trying to get my head around the H.264 NALU headers in the following data stored in a mov container.
Example from file:
00 00 00 02 09 30 00 00 00 0E 06 01 09 00 02 08
24 68 00 00 03 00 01 80 00 00 2B 08 21 9A 01 01
64 47 D4 B2 5C 45 76 DA 72 E4 3B F3 AE A9 56 91
B2 3F FE CE 87 1A 48 13 14 A9 E0 12 C8 AD E9 22
...
So far I have assumed that the bit-stream is not byte aligned due to the start code sequence offset to the left by one bit:
0x00 0x00 0x00 0x02 -> 00000000 00000000 00000000 00000010
So I have shifted the these and subsequent bytes to the right one bit which results in the following start sequence code and header bits for the first header:
0000000 00000000 00000000 00000001 [0 00 00100]
However I am coming unstuck when I reach following byte sequence in the example:
0x00 0x00 0x00 0x0E
I am assuming it is another start sequence code but with a different byte alignment.
00000000 00000000 00000000 00001110 00000110 00000001 00001001 00000000
After byte alignment I am getting the following header byte:
00000 00000000 00000000 00000001 [1 10 00000]
The first bit in the header (the forbidden_zero_bit) is non-zero which violates the rule that it must be zero
Where am I tripping up?
Am I making the wrong assumptions here?

As was already answered MOV-container (or MP4) doesn't use Annex B encoding with start codes. It use MP4-style encoding where NALs are prefixed with NALUnitLength field. This field can be of different size (and that size signaled somewhere else in container) but usually it is 4 bytes. In your case NALUnitLength is probably 4 bytes and 3 NALs from you dump have sizes of: 2-bytes (00 00 00 02), 14-bytes (00 00 00 0E) and 11016-bytes (00 00 2B 08).

Start codes are used in "Byte stream format" (H.264 Annex B) and are byte aligned themselves. Decoder is supposed to identify start code by checking byte sequences, without bit shifts.
MOV, MP4 containers don't use start codes, however they have their own structure (atoms, boxes) with parameter set NAL units, without prefixes, in sample description atoms and then data itself separately again as original NAL units.
What you quoted is presumably a fragment of MOV atoms which correspond to file structure bytes and not NAL units.

Related

Unable to get the correct CRC16 with a given Polynomial

I'm struggling with an old radiation sensor and his communication protocol.
The sensor is event driven, the master starts the communication with a data transmission or a data request.
Each data telegram uses a CRC16 to check only the variable data block and a CRC8 to check all the telegram.
My main problem is the crc16, According to the datasheet the poly used to check the data block is: CRC16 = X^14 + X^12 + X^5 + 1 --> 0x5021 ??
I captured some data with a valid CRC16 and tried to replicate the expected value in order to send my own data transmission, but I can't get the same value.
I'm using the sunshine CRC calculator trying any possible combination with that poly.
I also try CRC Reveng but no results.
Here are a few data with the correct CRC16:.
Data | CRC16 (MSB LSB)
14 00 00 0A | 1B 84
15 00 00 0C | 15 88
16 00 00 18 | 08 1D
00 00 00 00 | 00 00
00 00 00 01 | 19 D8
00 00 00 02 | 33 B0
01 00 00 00 | 5A DC
08 00 00 00 | c6 c2
10 00 00 00 | 85 95
80 00 00 00 | 0C EC
ff ff ff ff | f3 99
If I send an invalid CRC16 in the telegram, the sensor send a negative acknowledge with the expected value, so I can try any data in order to test or get more examples if needed.
if useful, the sensor uses a 8bit 8051 microprocessor, and this is an example of a valid CRC8 checked with sunshine CRC:
CRC8 = X^8 + X^6 + X^3 + 1 --> 0x49
Input reflected Result reflected
control byte | Data |CRC16 | CRC8
01 0E 01 00 24 2A 06 ff ff ff ff f3 99 |-> 0F
Any help is appreciated !
Looks like a typo on the polynomial. An n-bit CRC polynomial always starts with xn. Like your correct 8-bit polynomial. The 16-bit polynomial should read X16 + X12 + X5 + 1, which in fact is a very common 16-bit CRC polynomial.
To preserve the note in the comment, the four data bytes in the examples are swapped in each pair of bytes, which needs to be undone to get the correct CRC. (The control bytes in the CRC8 example are not swapped.)
So 14 00 00 0a becomes 00 14 0a 00, for which the above-described CRC gives the expected 0x1b84.
I would guess that the CRC is stored in the stream also swapped, so the message as bytes would be 00 14 0a 00 84 1b. That results in a sequence whose total CRC is 0.

What is this unusual text being used with loadstring() in Lua?

I have some Lua code which appears to be an attempt to secure the code by obscurity. My understanding of the loadstring() function is a text string is composed of Lua source code text and then converted to executable Lua code by the loadstring() method.
With the following Lua source, I tried to read the contents of the variable code by invoking print on the variable code; while I did see some valid source text in the converted string, a majority of the characters were not displayed (I assume ones with character codes below 40 and above 176). Note that there are some particularly high values in there for ASCII, e.g. 231 is obviously in the extended set, being the trademark sign. Additionally, there are several null characters in there. All this makes me doubt if it is indeed ASCII.
Could someone please tell me if the string is valid Lua source, and how to be able to get Lua to return the string as printable characters so that I can see what this code does?
When I run my version with print in the Lua console on Windows I get many empty boxes, presumably the console can only print pure ASCII?
Note that the code is executed using Lua version 5.0.2
code='\27\76\117\97\80\1\4\4\4\6\8\9\9\8\182\9\147\104\231\245\125\65\12\0\0\0\64\108\117\97\101\109\103\46\108\117\97\0\1\0\0\0\0\0\0\5\23\0\0\0\8\0\0\0\16\0\0\0\17\0\0\0\17\0\0\0\17\0\0\0\17\0\0\0\17\0\0\0\18\0\0\0\18\0\0\0\19\0\0\0\20\0\0\0\21\0\0\0\35\0\0\0\35\0\0\0\26\0\0\0\49\0\0\0\49\0\0\0\37\0\0\0\59\0\0\0\59\0\0\0\54\0\0\0\61\0\0\0\66\0\0\0\2\0\0\0\4\0\0\0\104\52\120\0\1\0\0\0\22\0\0\0\7\0\0\0\77\111\100\117\108\101\0\12\0\0\0\22\0\0\0\0\0\0\0\12\0\0\0\4\13\0\0\0\122\122\97\78\111\100\101\78\97\109\101\115\0\4\6\0\0\0\90\90\65\48\49\0\4\6\0\0\0\90\90\65\48\50\0\4\14\0\0\0\122\122\97\84\101\120\116\90\101\105\108\101\110\0\4\12\0\0\0\122\122\97\80\111\115\105\116\105\111\110\0\3\0\0\0\0\0\0\240\63\4\8\0\0\0\122\122\97\84\101\120\116\0\4\1\0\0\0\0\4\20\0\0\0\122\122\97\67\117\114\114\101\110\116\84\101\120\116\86\97\108\117\101\0\4\9\0\0\0\122\122\97\83\101\116\117\112\0\4\10\0\0\0\122\122\97\83\101\108\101\99\116\0\4\9\0\0\0\122\122\97\82\101\115\101\116\0\4\0\0\0\0\0\0\0\2\0\0\0\0\1\0\7\14\0\0\0\3\0\0\0\3\0\0\0\4\0\0\0\4\0\0\0\4\0\0\0\5\0\0\0\5\0\0\0\5\0\0\0\5\0\0\0\4\0\0\0\5\0\0\0\7\0\0\0\7\0\0\0\8\0\0\0\4\0\0\0\7\0\0\0\115\116\114\116\98\108\0\0\0\0\0\13\0\0\0\16\0\0\0\40\102\111\114\32\103\101\110\101\114\97\116\111\114\41\0\5\0\0\0\11\0\0\0\12\0\0\0\40\102\111\114\32\115\116\97\116\101\41\0\5\0\0\0\11\0\0\0\2\0\0\0\118\0\5\0\0\0\11\0\0\0\0\0\0\0\2\0\0\0\4\7\0\0\0\98\117\102\102\101\114\0\4\1\0\0\0\0\0\0\0\0\14\0\0\0\65\0\0\1\7\0\0\1\0\0\0\1\3\128\1\2\222\0\128\1\5\0\0\4\198\0\0\5\83\1\2\4\7\0\0\4\29\0\0\1\84\254\127\0\5\0\0\1\27\0\1\1\27\128\0\0\0\0\0\0\26\0\0\0\1\1\0\4\18\0\0\0\27\0\0\0\28\0\0\0\28\0\0\0\29\0\0\0\29\0\0\0\30\0\0\0\32\0\0\0\32\0\0\0\32\0\0\0\32\0\0\0\32\0\0\0\33\0\0\0\33\0\0\0\33\0\0\0\33\0\0\0\33\0\0\0\27\0\0\0\35\0\0\0\2\0\0\0\8\0\0\0\122\122\97\70\105\108\101\0\0\0\0\0\17\0\0\0\6\0\0\0\122\101\105\108\101\0\3\0\0\0\16\0\0\0\1\0\0\0\7\0\0\0\77\111\100\117\108\101\0\5\0\0\0\4\5\0\0\0\114\101\97\100\0\0\4\14\0\0\0\122\122\97\84\101\120\116\90\101\105\108\101\110\0\4\12\0\0\0\122\122\97\80\111\115\105\116\105\111\110\0\3\0\0\0\0\0\0\240\63\0\0\0\0\18\0\0\0\148\3\128\0\139\62\0\1\153\0\1\1\85\128\125\0\20\0\128\0\148\2\128\0\4\0\0\2\6\63\1\2\4\0\0\3\70\191\1\3\73\128\1\2\4\0\0\2\4\0\0\3\70\191\1\3\140\191\1\3\201\128\126\2\212\251\127\0\27\128\0\0\0\0\0\0\37\0\0\0\1\2\0\7\21\0\0\0\39\0\0\0\39\0\0\0\39\0\0\0\39\0\0\0\39\0\0\0\40\0\0\0\40\0\0\0\40\0\0\0\40\0\0\0\43\0\0\0\46\0\0\0\46\0\0\0\46\0\0\0\46\0\0\0\46\0\0\0\46\0\0\0\46\0\0\0\46\0\0\0\46\0\0\0\46\0\0\0\49\0\0\0\3\0\0\0\6\0\0\0\118\97\108\117\101\0\0\0\0\0\20\0\0\0\9\0\0\0\110\111\100\101\78\97\109\101\0\0\0\0\0\20\0\0\0\20\0\0\0\122\122\97\83\101\108\101\99\116\101\100\80\111\115\105\116\105\111\110\0\10\0\0\0\20\0\0\0\1\0\0\0\7\0\0\0\77\111\100\117\108\101\0\7\0\0\0\4\8\0\0\0\122\122\97\84\101\120\116\0\4\14\0\0\0\122\122\97\84\101\120\116\90\101\105\108\101\110\0\4\20\0\0\0\122\122\97\67\117\114\114\101\110\116\84\101\120\116\86\97\108\117\101\0\4\5\0\0\0\67\97\108\108\0\4\5\0\0\0\90\90\65\48\0\4\14\0\0\0\58\65\99\116\105\118\97\116\101\78\111\100\101\0\3\0\0\0\0\0\0\240\63\0\0\0\0\21\0\0\0\4\0\0\2\4\0\0\3\198\190\1\3\6\128\1\3\201\0\125\2\4\0\0\2\4\0\0\3\134\190\1\3\201\0\126\2\0\0\0\2\197\0\0\3\1\1\0\4\0\128\0\5\65\1\0\6\147\1\2\4\1\1\0\5\0\0\1\6\147\129\2\5\129\1\0\6\89\0\2\3\27\128\0\0\0\0\0\0\54\0\0\0\1\0\0\4\19\0\0\0\56\0\0\0\56\0\0\0\56\0\0\0\56\0\0\0\56\0\0\0\56\0\0\0\56\0\0\0\56\0\0\0\56\0\0\0\57\0\0\0\57\0\0\0\57\0\0\0\57\0\0\0\57\0\0\0\57\0\0\0\57\0\0\0\57\0\0\0\57\0\0\0\59\0\0\0\0\0\0\0\1\0\0\0\7\0\0\0\77\111\100\117\108\101\0\7\0\0\0\4\5\0\0\0\67\97\108\108\0\4\13\0\0\0\122\122\97\78\111\100\101\78\97\109\101\115\0\3\0\0\0\0\0\0\240\63\4\14\0\0\0\58\65\99\116\105\118\97\116\101\78\111\100\101\0\4\4\0\0\0\97\108\108\0\3\0\0\0\0\0\0\0\0\3\0\0\0\0\0\0\0\64\0\0\0\0\19\0\0\0\5\0\0\0\4\0\0\1\198\190\0\1\6\191\0\1\193\0\0\2\147\128\0\1\1\1\0\2\65\1\0\3\89\0\2\0\5\0\0\0\4\0\0\1\198\190\0\1\6\192\0\1\193\0\0\2\147\128\0\1\1\1\0\2\65\1\0\3\89\0\2\0\27\128\0\0\23\0\0\0\34\0\0\0\202\0\0\1\10\0\1\2\65\0\0\3\129\0\0\4\95\0\0\2\137\0\125\1\10\0\0\2\137\128\126\1\201\63\127\1\73\64\128\1\73\64\129\1\98\0\0\2\0\128\0\0\137\128\129\1\162\0\0\2\0\128\0\0\137\0\130\1\226\0\0\2\0\128\0\0\137\128\130\1\27\0\1\1\27\128\0\0';
return loadstring(code)();
This string is valid chunk of Lua code precompiled into bytecode. Header say it's for Lua 5.0. It's not a text, it doesn't need decoding, so can be run directly with loadstring()
To provide a few more details than Vlad's answer for anyone who may come across this posting.
The Lua loadstring() function accepts a string of characters that are either Lua source text or Lua bytecode. It appears that the function determines type of the text by looking at the first character of the string to see if it is an escape character (0x1b or decimal 27) or not.
The loadstring() function returns an anonymous function so in the code sample:
code='\27\76\117\97\80\1\4\4\4\6\8\9\9\8\182\9\147\104\231\245\125\65\12\0\0\0\64\108\117\97\101\109\103\46\108\117\97\0\1\0\0\0\0\0\0\5\23\0\0\0\8\0\0\0\16\0\0\0\17\0\0\0\17\0\0\0\17\0\0\0\17\0\0\0\17\0\0\0\18\0\0\0\18\0\0\0\19\0\0\0\20\0\0\0\21\0\0\0\35\0\0\0\35\0\0\0\26\0\0\0\49\0\0\0\49\0\0\0\37\0\0\0\59\0\0\0\59\0\0\0\54\0\0\0\61\0\0\0\66\0\0\0\2\0\0\0\4\0\0\0\104\52\120\0\1\0\0\0\22\0\0\0\7\0\0\0\77\111\100\117\108\101\0\12\0\0\0\22\0\0\0\0\0\0\0\12\0\0\0\4\13\0\0\0\122\122\97\78\111\100\101\78\97\109\101\115\0\4\6\0\0\0\90\90\65\48\49\0\4\6\0\0\0\90\90\65\48\50\0\4\14\0\0\0\122\122\97\84\101\120\116\90\101\105\108\101\110\0\4\12\0\0\0\122\122\97\80\111\115\105\116\105\111\110\0\3\0\0\0\0\0\0\240\63\4\8\0\0\0\122\122\97\84\101\120\116\0\4\1\0\0\0\0\4\20\0\0\0\122\122\97\67\117\114\114\101\110\116\84\101\120\116\86\97\108\117\101\0\4\9\0\0\0\122\122\97\83\101\116\117\112\0\4\10\0\0\0\122\122\97\83\101\108\101\99\116\0\4\9\0\0\0\122\122\97\82\101\115\101\116\0\4\0\0\0\0\0\0\0\2\0\0\0\0\1\0\7\14\0\0\0\3\0\0\0\3\0\0\0\4\0\0\0\4\0\0\0\4\0\0\0\5\0\0\0\5\0\0\0\5\0\0\0\5\0\0\0\4\0\0\0\5\0\0\0\7\0\0\0\7\0\0\0\8\0\0\0\4\0\0\0\7\0\0\0\115\116\114\116\98\108\0\0\0\0\0\13\0\0\0\16\0\0\0\40\102\111\114\32\103\101\110\101\114\97\116\111\114\41\0\5\0\0\0\11\0\0\0\12\0\0\0\40\102\111\114\32\115\116\97\116\101\41\0\5\0\0\0\11\0\0\0\2\0\0\0\118\0\5\0\0\0\11\0\0\0\0\0\0\0\2\0\0\0\4\7\0\0\0\98\117\102\102\101\114\0\4\1\0\0\0\0\0\0\0\0\14\0\0\0\65\0\0\1\7\0\0\1\0\0\0\1\3\128\1\2\222\0\128\1\5\0\0\4\198\0\0\5\83\1\2\4\7\0\0\4\29\0\0\1\84\254\127\0\5\0\0\1\27\0\1\1\27\128\0\0\0\0\0\0\26\0\0\0\1\1\0\4\18\0\0\0\27\0\0\0\28\0\0\0\28\0\0\0\29\0\0\0\29\0\0\0\30\0\0\0\32\0\0\0\32\0\0\0\32\0\0\0\32\0\0\0\32\0\0\0\33\0\0\0\33\0\0\0\33\0\0\0\33\0\0\0\33\0\0\0\27\0\0\0\35\0\0\0\2\0\0\0\8\0\0\0\122\122\97\70\105\108\101\0\0\0\0\0\17\0\0\0\6\0\0\0\122\101\105\108\101\0\3\0\0\0\16\0\0\0\1\0\0\0\7\0\0\0\77\111\100\117\108\101\0\5\0\0\0\4\5\0\0\0\114\101\97\100\0\0\4\14\0\0\0\122\122\97\84\101\120\116\90\101\105\108\101\110\0\4\12\0\0\0\122\122\97\80\111\115\105\116\105\111\110\0\3\0\0\0\0\0\0\240\63\0\0\0\0\18\0\0\0\148\3\128\0\139\62\0\1\153\0\1\1\85\128\125\0\20\0\128\0\148\2\128\0\4\0\0\2\6\63\1\2\4\0\0\3\70\191\1\3\73\128\1\2\4\0\0\2\4\0\0\3\70\191\1\3\140\191\1\3\201\128\126\2\212\251\127\0\27\128\0\0\0\0\0\0\37\0\0\0\1\2\0\7\21\0\0\0\39\0\0\0\39\0\0\0\39\0\0\0\39\0\0\0\39\0\0\0\40\0\0\0\40\0\0\0\40\0\0\0\40\0\0\0\43\0\0\0\46\0\0\0\46\0\0\0\46\0\0\0\46\0\0\0\46\0\0\0\46\0\0\0\46\0\0\0\46\0\0\0\46\0\0\0\46\0\0\0\49\0\0\0\3\0\0\0\6\0\0\0\118\97\108\117\101\0\0\0\0\0\20\0\0\0\9\0\0\0\110\111\100\101\78\97\109\101\0\0\0\0\0\20\0\0\0\20\0\0\0\122\122\97\83\101\108\101\99\116\101\100\80\111\115\105\116\105\111\110\0\10\0\0\0\20\0\0\0\1\0\0\0\7\0\0\0\77\111\100\117\108\101\0\7\0\0\0\4\8\0\0\0\122\122\97\84\101\120\116\0\4\14\0\0\0\122\122\97\84\101\120\116\90\101\105\108\101\110\0\4\20\0\0\0\122\122\97\67\117\114\114\101\110\116\84\101\120\116\86\97\108\117\101\0\4\5\0\0\0\67\97\108\108\0\4\5\0\0\0\90\90\65\48\0\4\14\0\0\0\58\65\99\116\105\118\97\116\101\78\111\100\101\0\3\0\0\0\0\0\0\240\63\0\0\0\0\21\0\0\0\4\0\0\2\4\0\0\3\198\190\1\3\6\128\1\3\201\0\125\2\4\0\0\2\4\0\0\3\134\190\1\3\201\0\126\2\0\0\0\2\197\0\0\3\1\1\0\4\0\128\0\5\65\1\0\6\147\1\2\4\1\1\0\5\0\0\1\6\147\129\2\5\129\1\0\6\89\0\2\3\27\128\0\0\0\0\0\0\54\0\0\0\1\0\0\4\19\0\0\0\56\0\0\0\56\0\0\0\56\0\0\0\56\0\0\0\56\0\0\0\56\0\0\0\56\0\0\0\56\0\0\0\56\0\0\0\57\0\0\0\57\0\0\0\57\0\0\0\57\0\0\0\57\0\0\0\57\0\0\0\57\0\0\0\57\0\0\0\57\0\0\0\59\0\0\0\0\0\0\0\1\0\0\0\7\0\0\0\77\111\100\117\108\101\0\7\0\0\0\4\5\0\0\0\67\97\108\108\0\4\13\0\0\0\122\122\97\78\111\100\101\78\97\109\101\115\0\3\0\0\0\0\0\0\240\63\4\14\0\0\0\58\65\99\116\105\118\97\116\101\78\111\100\101\0\4\4\0\0\0\97\108\108\0\3\0\0\0\0\0\0\0\0\3\0\0\0\0\0\0\0\64\0\0\0\0\19\0\0\0\5\0\0\0\4\0\0\1\198\190\0\1\6\191\0\1\193\0\0\2\147\128\0\1\1\1\0\2\65\1\0\3\89\0\2\0\5\0\0\0\4\0\0\1\198\190\0\1\6\192\0\1\193\0\0\2\147\128\0\1\1\1\0\2\65\1\0\3\89\0\2\0\27\128\0\0\23\0\0\0\34\0\0\0\202\0\0\1\10\0\1\2\65\0\0\3\129\0\0\4\95\0\0\2\137\0\125\1\10\0\0\2\137\128\126\1\201\63\127\1\73\64\128\1\73\64\129\1\98\0\0\2\0\128\0\0\137\128\129\1\162\0\0\2\0\128\0\0\137\0\130\1\226\0\0\2\0\128\0\0\137\128\130\1\27\0\1\1\27\128\0\0';
return loadstring(code)();
you have a text string that contains Lua bytecode, as indicated by the leading escape character of \27, and then a call to loadstring() to create a function which is then executed.
The first few characters of the text string contain the precompiled Lua header (see Lua 5.2 Bytecode and Virtual Machine). The length of this header varies depending on the version of Lua. However the first few characters seem to be fairly standard. code='\27\76\117\97\80 ... contains the escape character (0x1b or decimal 27), the capital letter L (decimal 76), the lower case letter u (decimal 117), the lower case letter a (decimal 97), and the Lua version (decimal 80 is 0x50 indicating version 5.0).
The following example is from Lua 5.2 Bytecode and Virtual Machine.
What exactly is in the bytecode? Here is the hexdump of hello.luac
(made by hd on my system).
00000000 1b 4c 75 61 52 00 01 04 04 04 08 00 19 93 0d 0a |.LuaR...........|
00000010 1a 0a 00 00 00 00 00 00 00 00 00 01 04 07 00 00 |................|
00000020 00 01 00 00 00 46 40 40 00 80 00 00 00 c1 80 00 |.....F##........|
00000030 00 96 c0 00 01 5d 40 00 01 1f 00 80 00 03 00 00 |.....]#.........|
00000040 00 04 06 00 00 00 48 65 6c 6c 6f 00 04 06 00 00 |......Hello.....|
The format is not officially documented, and needs to be
reverse-engineered. The necessary material is in the Lua source code,
of course, in several places, mainly ldump.c and lundump.c. I have
also cross-checked with NFI and LAT, but any remaining errors are
mine.
The code starts with an 18-byte file header, which is the same for all
official Lua 5.2 bytecode compiled on a machine like yours, whether by
luac or load or loadfile. Lua 5.1 only had a 12-byte header, similar
to the first 12 bytes of this one.
Byte numbers are in origin-1 decimal (mostly showing the arithmetic)
and origin-0 hex.
1 x00: 1b 4c 75 61 LUA_SIGNATURE from lua.h.
5 x04: 52 00
Binary-coded decimal 52 for the Lua version, 00 to say the bytecode is
compatible with the "official" PUC-Rio implementation.
5+2 x06: 01 04
04 04 08 00 Six system parameters. On x386 machines they mean:
little-endian, 4-byte integers, 4-byte VM instructions, 4-byte size_t
numbers, 8-byte Lua numbers, floating-point. These parameters must all
match up between the bytecode file and the Lua interpreter, otherwise
the bytecode is invalid.
7+6 x0c: 19 93 0d 0a 1a 0a
Present in all
bytecode produced by Lua 5.2 from PUC-Rio. Described in lundump.h as
"data to catch conversion errors". Might be constructed from
binary-coded decimal 1993 (the year it all started), Windows line
terminator, MS-DOS text file terminator, Unix line terminator.
After these 18 bytes come the functions defined in the file. Each function
starts with an 11-byte function header.
13+6 x12: 00 00 00 00 Line number in source code where chunk starts.
0 for the main chunk.
19+4 x16: 00 00 00 00 Line number in source
code where chunk stops. 0 for the main chunk.
23+4 x1a: 00 01 04
Number of parameters, vararg flag, number of registers used by this
function (not more than 255, obviously). Local variables are stored in
registers; there may not be more than 200 of them (see lparser.c).

Option rom: PXE design

During the study of the PCI firmware specification and the looking at the existing implementations of the PXE Boot Agents, I had a misunderstanding of how this should work.
According to PCI Firmware Specification, during the POST procedure the BIOS should map Option ROM into UMB memory (0xC000-0xF000), then call "Init" entry point by the offset 0x3, and after this the BIOS can disable Option ROM.
PXE oprom binary consists from three parts: "Initialization code", "Base code" and "UNDI code".
BIOS loads into UMB only "Initialization code". Base code and UNDI code are loaded into memory later through copying directly from the flash memory (from PCI Flash BAR (BAR1, according Intel specifications).
The question: what are the reasons for the need for such an algorithm of work?
Why the vendors do not use the BIOS mechanisms and do not load the entire Extension ROM into memory (instead copying from Flash BARs)?
A monolithic PXE option ROM was a single unit but most PXE option ROMs now have a split architecture (split into UNDI option ROM and a BC option ROM). Although, the BC ROM is typically embedded in the BIOS and may not even appear as an option ROM.
The NIC only has one option ROM nowadays, the UNDI option ROM.
Option ROM Header: 0x000DA000
55 AA 08 E8 76 10 CB 55 BC 01 00 00 00 00 00 00 U...v..U........
00 00 00 00 00 00 20 00 40 00 60 00 ...... .#.`.
Signature 0xAA55
Length 0x08 (4096 bytes)
Initialization entry 0xCB1076E8 //call then far return
Reserved 0x55 0xBC 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
Reserved 0x00 0x00 0x00 0x00 0x00
PXEROMID Offset 0x0020 //RWEverything didn't pick it up as a separate field and made it part of the reserved section so I separated it.
PCI Data Offset 0x0040
Expansion Header Offset 0x0060
UNDI ROM ID Structure: 0x000DA020 //not recognised by RW Everything so I parsed it myself
55 4E 44 49 16 08 00 00 01 02 32 0D 00 08 B0 C4 UNDI......2...
80 46 50 43 49 52 ¦-ÇFPCIR
Signature UNDI
StructLength 0x16
Checksum 0x08
StructRev 0x00
UNDIRev 0x00 0x01 0x02
UNDI Loader Offset 0x0D32
StackSize 0x0800
DataSize 0xC4B0
CodeSize 0x4680
BusType PCIR
PCI Data Structure: 0x000DA040
50 43 49 52 EC 10 68 81 00 00 1C 00 03 00 00 02 PCIR..h.........
08 00 01 02 00 80 08 00 ........
Signature PCIR
Vendor ID 0x10EC - Realtek Semiconductor
Device ID 0x8168
Product Data 0x0000
Structure Length 0x001C
Structure Revision 0x03
Class Code 0x00 0x00 0x02
Image Length 0x0008
Revision Level 0x0201
Code Type 0x00
Indicator 0x80
Reserved 0x0008
PnP Expansion Header: 0x000DA060
24 50 6E 50 01 02 00 00 00 D7 00 00 00 00 AF 00 $PnP............
92 01 02 00 00 E4 00 00 00 00 C1 0B 00 00 00 00 ................
Signature $PnP
Revision 0x01
Length 0x02 (32 bytes)
Next Header 0x0000
Reserved 0x00
Checksum 0xD7
Device ID 0x00000000
Manufacturer 0x00AF - Intel Corporation
Product Name 0x0192 - Realtek PXE B02 D00
Device Type Code 0x02 0x00 0x00
Device Indicators 0xE4
Boot Connection Vector 0x0000
Disconnect Vector 0x0000
Bootstrap Entry Vector 0x0BC1 // will be at 0xDABC1
Reserved 0x0000
Resource info. vector 0x0000

Why could be natural thinking to Unicode encoding as an array of 32-bit integers?

I was reading Python guide about Unicode. In this section, it says:
To summarize the previous section: a Unicode string is a sequence of code points, which are numbers from 0 to 0x10ffff. This sequence needs to be represented as a set of bytes (meaning, values from 0-255) in memory. The rules for translating a Unicode string into a sequence of bytes are called an encoding.
The first encoding you might think of is an array of 32-bit integers. In this representation, the string “Python” would look like this:
P y t h o n
0x50 00 00 00 79 00 00 00 74 00 00 00 68 00 00 00 6f 00 00 00 6e 00 00 00
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
Why might we think of 32-bit integers if code points are numbers from 0 to 0x10ffff? Maybe is it assuming that we are on a 32-bit system?

4 byte checksum, sum32 algorithm for Epson printers

I'm programming a low level communication with an Epson tm-t88iv thermal printer on a Linux device, which receives only hexadecimal packages. I have read the manual trying to understand how the checksum is built but i can't manage to recreate it.
the manual says that the checksum are 4 bytes representing the 2 bytes sum of all the data in the package sent.
I have currently four working examples I found by listening to a port on a windows computer with a different program. the last 4 hexadecimals are the checksum (03 marks the end of the data and is included in the checksum calculation, according to the manual).
02 AC 00 01 1C 00 00 03 30 30 43 45
02 AC 00 00 1C 80 80 1C 00 00 1C 00 00 1C 03 30 32 32 31
02 AD 07 01 1C 00 00 1C 31 30 03 30 31 35 33
02 AD 00 00 1C 80 80 1C 00 00 1C 00 00 1C 03 30 32 32 32
I have read somewhere that there is a sum32 algorithm but i can't find any example of it or how to program it.
Wow, this is a bad algorithm! If someone else finds himself trying to understand Epson's terrible low-level communication manual, this is how the check-sum is done:
The checksum base is 30 30 30 30
Sum in hexadecimals all of the data package (for example, 02+89+00+00+1C+80+80+1C+00+01+1C+09+0C+1C+03 = 214)
Then separate the result digit by digit, if its a letter add 1 to the value (for example B2 would be 2|1|4).
sum it to the checksum base number by number starting from right to left (this would be a checksum of 30 32 31 34).
Note: It works perfectly, but for some reason the examples I posted above don't seem to match so much. They are all the printers response, but slightly after it got a hardware problem and had to be reformatted by technical support, so maybe it got fixed.
I hope it helps somebody somewhere.

Resources