Checksum/CRC reverse engineering of Microsoft NDIS packet - checksum

I am trying to decode a 42-byte packet which seems to include a 2-byte CRC/checksum field.
This is a Microsoft NDIS packet (type IPX) which is sent in HLK (WHQL) tests.
I have decoded most parts of the NDIS header, but I can't figure out the CRC/checksum algorithm.
Sample of a 45-byte packet (just to explain the decoded fields):
char packet_bytes[] = {
0x02, 0xe4, 0x55, 0xee, 0x12, 0x56, 0x02, 0x93,
0x19, 0x40, 0x89, 0x00, 0x00, 0x1f, 0xaa, 0xaa,
0x03, 0x00, 0x00, 0x00, 0x81, 0x37, 0x4e, 0x44,
0x49, 0x53, 0x01, 0x49, 0x03, 0x00, 0x98, 0xd4,
0x58, 0x55, 0x25, 0xf5, 0x39, 0x00, 0x14, 0x00,
0x00, 0x00, 0x49, 0x4a, 0x4b
};
Raw: 02e455ee1256029319408900001faaaa0300000081374e4449530149030098d4585525f5390014000000494a4b
Decoded fields:
802.2 ethernet header: (Wireshark decoding)
02e455ee1256 : Destination
029319408900 : Source
001f : Length
Logical_link Control: (Wireshark decoding)
aa : DSAP
aa : SSAP
03 : Control
000000 : Organization
8137 : Type (Netware IPX/SPX)
NDIS header: (my estimation for NDIS decoded fields)
4e444953 : NDIS ascii String ("NDIS")
01 : Unknown
49 : payload counter start (first byte of payload, with increasing value afterwards)
0300 : Payload length ( = 0003)
98d4 : test identification number (equal on all packets of the same test)
5855 : Assumed to be checksum
25f53900 : Packet counter ( = 0039f525, Increases gradually per packet)
14000000 : Payload offset ( = 00000014), offset from start of NDIS header to start of payload
494a4b : Payload (3 bytes of increasing counter 49,4a,4b)
To understand the checksum algorithm with as few packet bytes as possible,
I captured packets of the minimal size (42 bytes).
Those packets include the headers above but no payload at all.
I tried to reverse engineer them using the reveng CRC decoder, which failed to find any known CRC algorithm.
Sample 42-byte packets:
02e455ee1256029319408900001caaaa0300000081374e444953016b000098d495262502000014000000
02e455ee1256029319408900001caaaa0300000081374e44495301a2000098d481ef3802000014000000
02e455ee1256029319408900001caaaa0300000081374e4449530152000098d47f3f3b02000014000000
02e455ee1256029319408900001caaaa0300000081374e44495301d0000098d476c14302000014000000
02e455ee1256029319408900001caaaa0300000081374e44495301f7000098d4539a6602000014000000
02e455ee1256029319408900001caaaa0300000081374e44495301b6000098d444db7502000014000000
02e455ee1256029319408900001caaaa0300000081374e44495301a6000098d431eb8802000014000000
02e455ee1256029319408900001caaaa0300000081374e444953016a000098d40627b402000014000000
Reverse engineering the CRC:
reveng.exe -w 16 -s 02e455ee1256029319408900001caaaa0300000081374e444953016b000098d495262502000014000000 02e455ee1256029319408900001caaaa0300000081374e44495301a2000098d481ef3802000014000000 02e455ee1256029319408900001caaaa0300000081374e4449530152000098d47f3f3b02000014000000 02e455ee1256029319408900001caaaa0300000081374e44495301d0000098d476c14302000014000000 02e455ee1256029319408900001caaaa0300000081374e44495301f7000098d4539a6602000014000000 02e455ee1256029319408900001caaaa0300000081374e44495301b6000098d444db7502000014000000 02e455ee1256029319408900001caaaa0300000081374e44495301a6000098d431eb8802000014000000 02e455ee1256029319408900001caaaa0300000081374e444953016a000098d40627b402000014000000
reveng.exe: no models found
I also tried reverse engineering only the NDIS header part:
4e444953016b000098d495262502000014000000
4e44495301a2000098d481ef3802000014000000
4e4449530152000098d47f3f3b02000014000000
4e44495301d0000098d476c14302000014000000
4e44495301f7000098d4539a6602000014000000
4e44495301b6000098d444db7502000014000000
4e44495301a6000098d431eb8802000014000000
4e444953016a000098d40627b402000014000000
reveng.exe -w 16 -s 4e444953016b000098d495262502000014000000 4e44495301a2000098d481ef3802000014000000 4e4449530152000098d47f3f3b02000014000000 4e44495301d0000098d476c14302000014000000 4e44495301f7000098d4539a6602000014000000 4e44495301b6000098d444db7502000014000000 4e44495301a6000098d431eb8802000014000000 4e444953016a000098d40627b402000014000000
reveng.exe: no models found
Any help would be appreciated.

This seems to be the Internet Checksum, described in RFC 1071, calculated over the NDIS header part of the packet.
In short, you need to add up all of the header contents (except the 16-bit checksum field itself) as 16-bit values, then add the carries (if any) to the least significant 16 bits of the result (thus forming the one's complement sum), and finally, calculate one's complement of this one's complement sum by inverting all bits.
For the example packet you listed, the manual calculation steps would be the following.
Given the whole packet:
02e455ee1256029319408900001faaaa0300000081374e4449530149030098d4585525f5390014000000494a4b
Extract the NDIS header part only, without the payload:
4e4449530149030098d4585525f5390014000000
Split into 16-bit values:
4e44
4953
0149
0300
98d4
5855
25f5
3900
1400
0000
Substitute the checksum field with zeroes:
4e44
4953
0149
0300
98d4
0000
25f5
3900
1400
0000
Add all those 16-bit values together:
1A7A9
Here, the 16 least significant bits are A7A9 and the arithmetic carry is 1. So, add these together (as 16-bit words), to form the so-called one's complement sum:
0001
+ A7A9
= A7AA
Now, invert all bits (apply the bitwise NOT operation), to get the one's complement:
~ A7AA
= 5855
Place this checksum back into the field that we temporarily zeroed out:
4e44
4953
0149
0300
98d4
5855
25f5
3900
1400
0000
If you only want to check the checksum, do the following.
First, take the original NDIS header (as 16-bit values):
4e44
4953
0149
0300
98d4
5855
25f5
3900
1400
0000
Then sum all of this up:
1FFFE
Again, add the carry to the 16-bit LSB part:
0001
+ FFFE
= FFFF
If all the bits of the result are 1 (i.e., if the result is FFFF), the check is successful.
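The manual steps above can be sketched in C as follows. This is a minimal illustration, not code from the NDIS test suite: the function name ndis_checksum and the csum_off parameter are made up for this example, and the checksum field offset (10 bytes into the NDIS header) is taken from the decoded field layout in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 Internet checksum over `len` bytes, read as big-endian
 * 16-bit words (the same word order as in the worked example).
 * The word starting at byte offset `csum_off` is treated as zero.
 * Pass csum_off == len to include every word (verification mode).
 * Hypothetical helper for illustration only. */
uint16_t ndis_checksum(const uint8_t *hdr, size_t len, size_t csum_off)
{
    assert(hdr != NULL && len % 2 == 0);

    uint32_t sum = 0;
    for (size_t i = 0; i < len; i += 2) {
        if (i == csum_off)
            continue;                       /* skip the checksum field */
        sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
    }

    /* Fold the carries back into the low 16 bits (one's complement sum). */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    /* One's complement of the one's complement sum. */
    return (uint16_t)~sum;
}
```

Calling ndis_checksum(header, 20, 10) on the 20-byte NDIS header of the 45-byte sample yields 0x5855, the value stored in the packet. Calling it with csum_off set to 20 sums every word including the stored checksum, so a valid header yields 0 (the complement of FFFF).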

Related

CRC exploring encoding principles(reserve)

We are looking for a way to work out which input produces a given CRC-16 result.
Hello. I am currently trying to reproduce a CRC-16 result from specific hex data.
I have identified the input hex values and the CRC-16 algorithm, but the result does not match no matter how I combine the hex values, so I am asking here.
The input hex values are 0x170, 0xA, 0x00, 0x31.
The CRC-16 algorithm used is CRC-16-CCITT XMODEM (Poly = 0x1021, Init = 0x0000).
The expected result is 0x6121 or 0x2161.
I suspect that 0x0170 and 0xA are mixed and split in some way before being fed to the CRC-16 (for example, combined into 0x017A and then split into 0x01 and 0x7A). I have tried feeding 0x01, 0x70, 0x0A, 0x00, 0x31 in that order, the reverse order 0x31, 0x00, 0x0A, 0x70, 0x01, and various other orderings, but none of them gives the expected result.
Can you tell me how to find the input sequence or combination of hex input data that solves this?
Waiting for your reply.
Thank you.
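For experimenting with candidate byte orderings, a bitwise CRC-16/XMODEM routine with the parameters stated in the question (Poly = 0x1021, Init = 0x0000, no reflection, no final XOR) might look like the sketch below. The function name crc16_xmodem is an illustrative choice, and the sketch makes no claim about which input ordering produces the expected value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16/XMODEM: polynomial 0x1021, initial value 0x0000,
 * most significant bit first, no input/output reflection, no final XOR. */
uint16_t crc16_xmodem(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);

    uint16_t crc = 0x0000;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);   /* feed next byte into the top */
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000)
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}
```

The standard check value for this CRC variant is 0x31C3 for the ASCII string "123456789", which is a quick way to confirm the routine before trying the candidate inputs.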

How to decode GSM-TCAP messages using asn1c generated code

I am using the c code generated by asn1c from the TCAP protocol specification (i.e., the corresponding ASN1 files).
I can successfully encode TCAP packets by the generated code.
However, trying to "decode" related byte streams fails.
A sample code is as follows.
// A real byte stream of a TCAP message:
unsigned char packet_bytes[] = {
0x62, 0x43, 0x48, 0x04, 0x00, 0x18, 0x02, 0x78,
0x6b, 0x1a, 0x28, 0x18, 0x06, 0x07, 0x00, 0x11,
0x86, 0x05, 0x01, 0x01, 0x01, 0xa0, 0x0d, 0x60,
0x0b, 0xa1, 0x09, 0x06, 0x07, 0x04, 0x00, 0x00,
0x01, 0x00, 0x14, 0x03, 0x6c, 0x1f, 0xa1, 0x1d,
0x02, 0x01, 0x00, 0x02, 0x01, 0x2d, 0x30, 0x15,
0x80, 0x07, 0x91, 0x64, 0x21, 0x92, 0x05, 0x31,
0x74, 0x81, 0x01, 0x01, 0x82, 0x07, 0x91, 0x64,
0x21, 0x00, 0x00, 0x90, 0x02
};
// Initializing ...
TCAP_TCMessage_t _pdu, *pdu = &_pdu;
memset(pdu, 0, sizeof(*pdu));
// Decoding:
asn_dec_rval_t dec_ret = ber_decode(NULL, &asn_DEF_TCAP_TCMessage, (void **) &pdu, packet_bytes, sizeof(packet_bytes));
While the message type ("Begin", in this case) is correctly detected, the other parameters are not parsed.
Using other encoding rules, i.e., aper_decode() and uper_decode(), also fails.
I would be thankful if anyone can describe how to use the auto-generated c code for decoding (parsing) a byte string of TCAP messages.
@Vasil, thank you very much for your answer.
Which asn1c are you using (git commit id) and where do you get it
from as there are quite a lot of forks out there?
I use the mouse07410's branch.
How do you know that Begin is correctly detected?
From the field present of the pdu variable that is evaluated by ber_decode (you can see the pdu type in the sample code).
From the "Wireshark" output for this byte stream, I know that the correct type of the message is Begin.
You could try compiling with -DASN_EMIT_DEBUG=1 in CFLAGS (or
-DEMIT_ASN_DEBUG=1 depending on the asn1c version you are using) to get some more debug messages.
Thanks for providing the hint; it was helpful.
The problem was related to the asn1 files I was using.
I used osmocom asn1 files and compiled them by
ASN=../asn
asn1c $ASN/DialoguePDUs.asn $ASN/tcap.asn $ASN/UnidialoguePDUs.asn
in which, DialoguePortion is defined as follows (note that the first definition is commented):
--DialoguePortion ::= [APPLICATION 11] EXPLICIT EXTERNAL
-- WS adaptation
DialoguePortion ::= [APPLICATION 11] IMPLICIT DialogueOC
DialogueOC ::= OCTET STRING
To be able to decode TCAP messages,
one needs to use the former definition (as in the standard), i.e., DialoguePortion should be defined as
DialoguePortion ::= [APPLICATION 11] EXPLICIT EXTERNAL
After switching to this definition in the asn1 file
and recompiling the asn1 files, the problem was solved.
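To see why the definition matters, it can help to look at the BER tag and length octets in the sample byte stream directly. The sketch below is a hand-written helper, not part of asn1c; the names ber_tl and ber_read_tl are made up for this illustration, and it only handles low tag numbers (0 to 30) and definite lengths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded BER identifier and length octets (illustrative type). */
typedef struct {
    unsigned tag_class;   /* 0 UNIVERSAL, 1 APPLICATION, 2 context, 3 PRIVATE */
    int      constructed; /* 1 if the constructed bit is set */
    unsigned tag_number;  /* low-tag-number form only (0..30) */
    size_t   length;      /* length of the contents */
    size_t   header_len;  /* number of identifier + length octets */
} ber_tl;

/* Reads one tag/length header at buf. Returns 0 on success, -1 if the
 * header is truncated or uses a form this sketch does not support. */
int ber_read_tl(const uint8_t *buf, size_t size, ber_tl *out)
{
    assert(buf != NULL && out != NULL);
    if (size < 2 || (buf[0] & 0x1F) == 0x1F)
        return -1;                         /* too short, or high tag number */

    out->tag_class   = buf[0] >> 6;
    out->constructed = (buf[0] >> 5) & 1;
    out->tag_number  = buf[0] & 0x1F;

    if (buf[1] < 0x80) {                   /* short length form */
        out->length = buf[1];
        out->header_len = 2;
        return 0;
    }
    size_t n = buf[1] & 0x7F;              /* long form: n length octets */
    if (n == 0 || n > sizeof(size_t) || size < 2 + n)
        return -1;                         /* indefinite or oversized */
    size_t len = 0;
    for (size_t i = 0; i < n; i++)
        len = (len << 8) | buf[2 + i];
    out->length = len;
    out->header_len = 2 + n;
    return 0;
}
```

Applied to the sample message, offset 0 (0x62 0x43) reads as [APPLICATION 2], constructed, length 0x43, and offset 8 (0x6b 0x1a) reads as [APPLICATION 11], constructed, length 0x1a. The first octets inside it (0x28 0x18) read as UNIVERSAL 8, constructed, which is the EXTERNAL tag. That appears consistent with the EXPLICIT EXTERNAL definition rather than an IMPLICIT OCTET STRING.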
P.S.: This question is also related to my problem.
I am using the c code generated by asn1c from the TCAP protocol specification
Which asn1c are you using (git commit id) and where do you get it from as there are quite a lot of forks out there?
While the message type ("Begin", in this case) is correctly detected, the other parameters are not parsed.
How do you know that Begin is correctly detected?
Using other encoding rules, i.e., aper_decode() and uper_decode(), also fails.
There is no point in trying other encodings as they are not binary compatible.
I would be thankful if anyone can describe how to use the auto-generated c code for decoding (parsing) a byte string of TCAP messages.
You are using it correctly and probably there is a bug somewhere in the BER decoder.
You could try compiling with -DASN_EMIT_DEBUG=1 in CFLAGS (or -DEMIT_ASN_DEBUG=1 depending on the asn1c version you are using) to get some more debug messages.

SSE: shuffle (permutevar) 4x32 integers

I have some code using the AVX2 intrinsic _mm256_permutevar8x32_epi32 aka vpermd to select integers from an input vector by an index vector. Now I need the same thing but for 4x32 instead of 8x32. _mm_permutevar_ps does it for floating point, but I'm using integers.
One idea is _mm_shuffle_epi32, but I'd first need to convert my 4x32 index values to a single integer, that is:
imm[1:0] := idx[31:0]
imm[3:2] := idx[63:32]
imm[5:4] := idx[95:64]
imm[7:6] := idx[127:96]
I'm not sure what's the best way to do that, and moreover I'm not sure it's the best way to proceed. I'm looking for the most efficient method on Broadwell/Haswell to emulate the "missing" _mm_permutevar_epi32(__m128i a, __m128i idx). I'd rather use 128-bit instructions than 256-bit ones if possible (i.e. I don't want to widen the 128-bit inputs then narrow the result).
It's useless to generate an immediate at run-time, unless you're JITing new code. An immediate is a byte that's literally part of the machine-code instruction encoding. That's great if you have a compile-time-constant shuffle (after inlining + template expansion), otherwise forget about those shuffles that take the control operand as an integer (see Footnote 1).
Before AVX, the only variable-control shuffle was SSSE3 pshufb (_mm_shuffle_epi8). That's still the only 128-bit (or in-lane) integer shuffle instruction in AVX2 and I think AVX512.
AVX1 added some in-lane 32-bit variable shuffles, like vpermilps (_mm_permutevar_ps). AVX2 added lane-crossing integer and FP shuffles, but somewhat strangely no 128-bit version of vpermd. Perhaps because Intel microarchitectures have no penalty for using FP shuffles on integer data. (Which is true on Sandybridge family, I just don't know if that was part of the reasoning for the ISA design). But you'd think they would have added __m128i intrinsics for vpermilps if that's what you were "supposed" to do. Or maybe the compiler / intrinsics design people didn't agree with the asm instruction-set people?
If you have a runtime-variable vector of 32-bit indices and want to do a shuffle with 32-bit granularity, by far your best bet is to just use AVX _mm_permutevar_ps.
_mm_castps_si128( _mm_permutevar_ps (_mm_castsi128_ps(a), idx) )
On Intel at least, it won't even introduce any extra bypass latency when used between integer instructions like paddd; i.e. FP shuffles specifically (not blends) have no penalty for use on integer data in Sandybridge-family CPUs.
If there's any penalty on AMD Bulldozer or Ryzen, it's minor and definitely cheaper than the cost of calculating a shuffle-control vector for (v)pshufb.
Using vpermd ymm and ignoring the upper 128 bits of input and output (i.e. by using cast intrinsics) would be much slower on AMD (because its 128-bit SIMD design has to split lane-crossing 256-bit shuffles into several uops), and also worse on Intel where it makes it 3c latency instead of 1 cycle.
@Iwill's answer shows a way to calculate a shuffle-control vector of byte indices for pshufb from a vector of 4x32-bit dword indices. But it uses SSE4.1 pmulld which is 2 uops on most CPUs, and could easily be a worse bottleneck than shuffles. (See discussion in comments under that answer.) Especially on older CPUs without AVX, some of which can do 2 pshufb per clock unlike modern Intel (Haswell and later only have 1 shuffle port and easily bottleneck on shuffles. IceLake will add another shuffle port, according to Intel's Sunny Cove presentation.)
If you do have to write an SSSE3 or SSE4.1 version of this, it's probably best to still use only SSSE3 and use pshufb plus a left shift to duplicate a byte within a dword before ORing in the 0,1,2,3 into the low bits, not pmulld. SSE4.1 pmulld is multiple uops and even worse than pshufb on some CPUs with slow pshufb. (You might not benefit from vectorizing at all on CPUs with only SSSE3 and not SSE4.1, i.e. first-gen Core2, because it has slow-ish pshufb.)
On 2nd-gen Core2, and Goldmont, pshufb is a single-uop instruction with 1-cycle latency. On Silvermont and first-gen Core 2 it's not so good. But overall I'd recommend pshufb + pslld + por to calculate a control-vector for another pshufb if AVX isn't available.
An extra shuffle to prepare for a shuffle is far worse than just using vpermilps on any CPU that supports AVX.
Footnote 1:
You'd have to use a switch or something to select a code path with the right compile-time-constant integer, and that's horrible; only consider that if you don't even have SSSE3 available. It may be worse than scalar unless the jump-table branch predicts perfectly.
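The pshufb + pslld + por approach recommended above for CPUs without AVX could be sketched like this. It is an illustration under stated assumptions, not code from the answer: the names vperm_ssse3 and vperm_ssse3_u32 are made up, the target attribute assumes GCC or Clang on x86, and the AND-mask is included so that only the low 2 bits of each index are used.

```c
#include <assert.h>
#include <stdint.h>
#include <tmmintrin.h>  /* SSSE3: _mm_shuffle_epi8 */

#if defined(__GNUC__)
#define SSSE3_TARGET __attribute__((target("ssse3")))
#else
#define SSSE3_TARGET
#endif

/* Emulates a 4x32-bit variable permute using only SSE2/SSSE3:
 * result dword i = a dword (idx[i] & 3). */
SSSE3_TARGET
static __m128i vperm_ssse3(__m128i a, __m128i idx)
{
    /* Keep only the low 2 bits of each 32-bit index. */
    idx = _mm_and_si128(idx, _mm_set1_epi32(3));
    /* pshufb: copy the low byte of each dword into all 4 bytes of that dword. */
    const __m128i bcast = _mm_set_epi8(12, 12, 12, 12, 8, 8, 8, 8,
                                       4, 4, 4, 4, 0, 0, 0, 0);
    __m128i ctl = _mm_shuffle_epi8(idx, bcast);
    /* pslld: multiply every byte by 4 (values 0..3 cannot spill across bytes). */
    ctl = _mm_slli_epi32(ctl, 2);
    /* por: add the byte offsets 0,1,2,3 within each dword. */
    ctl = _mm_or_si128(ctl, _mm_set1_epi32(0x03020100));
    /* Final pshufb performs the dword permute on a. */
    return _mm_shuffle_epi8(a, ctl);
}

/* Array wrapper so callers do not need to handle __m128i directly. */
SSSE3_TARGET
void vperm_ssse3_u32(const uint32_t a[4], const uint32_t idx[4], uint32_t out[4])
{
    assert(a != NULL && idx != NULL && out != NULL);
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vi = _mm_loadu_si128((const __m128i *)idx);
    _mm_storeu_si128((__m128i *)out, vperm_ssse3(va, vi));
}
```

With a = {0x0000, 0xCAFE, 0xBEEF, 0xDEAD} and idx = {2, 3, 0, 1} in memory order (the same values as in the other examples in this thread), the result is {0xBEEF, 0xDEAD, 0x0000, 0xCAFE}.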
Although Peter Cordes is correct in saying that the AVX instruction vpermilps and its intrinsic _mm_permutevar_ps() will probably do the job, if you're working on machines older than Sandy Bridge, an SSE4.1 variant using pshufb works quite well too.
AVX variant
Credits to @PeterCordes
#include <stdio.h>
#include <immintrin.h>
// Select 32-bit elements of a using the low 2 bits of each index in idx (vpermilps).
__m128i vperm(__m128i a, __m128i idx){
    return _mm_castps_si128(_mm_permutevar_ps(_mm_castsi128_ps(a), idx));
}
int main(int argc, char* argv[]){
    __m128i a   = _mm_set_epi32(0xDEAD, 0xBEEF, 0xCAFE, 0x0000);
    __m128i idx = _mm_set_epi32(1,0,3,2);
    __m128i shu = vperm(a, idx);
    // Print elements from highest to lowest.
    printf("%04x %04x %04x %04x\n", ((unsigned*)(&shu))[3],
                                    ((unsigned*)(&shu))[2],
                                    ((unsigned*)(&shu))[1],
                                    ((unsigned*)(&shu))[0]);
    return 0;
}
SSE4.1 variant
#include <stdio.h>
#include <immintrin.h>
__m128i vperm(__m128i a, __m128i idx){
    idx = _mm_and_si128  (idx, _mm_set1_epi32(0x00000003)); // keep the low 2 bits of each index
    idx = _mm_mullo_epi32(idx, _mm_set1_epi32(0x04040404)); // 4*idx in every byte of the dword
    idx = _mm_or_si128   (idx, _mm_set1_epi32(0x03020100)); // add byte offsets 0..3
    return _mm_shuffle_epi8(a, idx);                        // byte shuffle performs the dword permute
}
int main(int argc, char* argv[]){
    __m128i a   = _mm_set_epi32(0xDEAD, 0xBEEF, 0xCAFE, 0x0000);
    __m128i idx = _mm_set_epi32(1,0,3,2);
    __m128i shu = vperm(a, idx);
    // Print elements from highest to lowest.
    printf("%04x %04x %04x %04x\n", ((unsigned*)(&shu))[3],
                                    ((unsigned*)(&shu))[2],
                                    ((unsigned*)(&shu))[1],
                                    ((unsigned*)(&shu))[0]);
    return 0;
}
This compiles down to the crisp
0000000000400550 <vperm>:
400550: c5 f1 db 0d b8 00 00 00 vpand 0xb8(%rip),%xmm1,%xmm1 # 400610 <_IO_stdin_used+0x20>
400558: c4 e2 71 40 0d bf 00 00 00 vpmulld 0xbf(%rip),%xmm1,%xmm1 # 400620 <_IO_stdin_used+0x30>
400561: c5 f1 eb 0d c7 00 00 00 vpor 0xc7(%rip),%xmm1,%xmm1 # 400630 <_IO_stdin_used+0x40>
400569: c4 e2 79 00 c1 vpshufb %xmm1,%xmm0,%xmm0
40056e: c3 retq
The AND-masking is optional if you can guarantee that the control indices will always be the 32-bit integers 0, 1, 2 or 3.

Export an elliptic curve key from iOS to work with OpenSSL

I have a private/public key pair generated and stored inside Secure Enclave.
It is a 256-bit elliptic curve key (the only key type that can be stored in the Secure Enclave).
I use SecKeyCreateWithData and SecKeyCopyExternalRepresentation to import/export the public key between iOS devices, and it works.
However, the exported key doesn't seem to work with OpenSSL.
It always shows 'unable to load Key' for this command:
openssl ec -pubin -in public_key_file -text
How can I export the key so that I can use it with OpenSSL?
To work with OpenSSL, you need the subject public key info (SPKI), in either DER or PEM format.
SPKI contains essential information, for example, key.type, key.parameters, key.value.
SecKeyCopyExternalRepresentation only returns the raw key binary, which is only the key.value part.
You have to create the SPKI from that key.value. The normal way to do this is to read https://www.rfc-editor.org/rfc/rfc5480 and encode the ASN.1 structure in binary DER format.
But here is a shortcut.
Secure Enclave only supports one key type, the 256-bit EC key secp256r1 (equivalent to prime256v1 in OpenSSL).
The SPKI in DER format is binary-encoded data, for example,
3059301306072a8648ce3d020106082a8648ce3d03010703420004fad2e70b0f70f0bf80d7f7cbe8dd4237ca9e59357647e7a7cb90d71a71f6b57869069bcdd24272932c6bdd51895fe2180ea0748c737adecc1cefa3a02022164d
It always consists of two parts:
fixed schema header 3059301306072a8648ce3d020106082a8648ce3d030107034200
raw key value 04.......
You can create SPKI by combining these two parts.
spki = fixed_schema_header + SecKeyCopyExternalRepresentation(...)
func createSubjectPublicKeyInfo(rawPublicKeyData: Data) -> Data {
    // Fixed DER prefix of a secp256r1 SubjectPublicKeyInfo (everything before the raw key value)
    let secp256r1Header = Data([
        0x30, 0x59, 0x30, 0x13, 0x06, 0x07, 0x2a, 0x86, 0x48, 0xce, 0x3d, 0x02, 0x01, 0x06, 0x08, 0x2a,
        0x86, 0x48, 0xce, 0x3d, 0x03, 0x01, 0x07, 0x03, 0x42, 0x00
    ] as [UInt8])
    return secp256r1Header + rawPublicKeyData
}
// Usage
let rawPublicKeyData = SecKeyCopyExternalRepresentation(...)!
let publicKeyDER = createSubjectPublicKeyInfo(rawPublicKeyData: rawPublicKeyData)
write(publicKeyDER, to: "public_key.der")
// Test with OpenSSL
// openssl ec -pubin -in public_key.der -text -inform der

Reading temperature and fan speeds on chipset without WMI support in Windows, from Delphi

I have searched extensively but found nothing on how to read sensor information from the Nuvoton NCT6776F chip in Delphi (I am using XE2). I am guessing I need some assembly somewhere, but I can't find anything on how to even begin. Here are the registry details of the chip.
Bus Type = ISAIO
One NCT6776F
Unvoton NCT6776F, IndexReg=A35, DataReg=A36
=============================================================
Fan1 Fan Speed, Bank 6, Offset 0x30, 0x31 RPM = 1350000/(Data=HighByte[12:5], LowByte[4:0])
Fan2 Fan Speed, Bank 6, Offset 0x32, 0x33 RPM = 1350000/(Data=HighByte[12:5], LowByte[4:0])
Fan3 Fan Speed, Bank 6, Offset 0x34, 0x35 RPM = 1350000/(Data=HighByte[12:5], LowByte[4:0])
CPU Voltage, Bank 0, Offset 0x20 Voltage = Data* 0.008
VCCSA Voltage, Bank 0, Offset 0x21 Voltage = Data* 0.008
+3.3V Voltage, Bank 0, Offset 0x22 Voltage = Data* 0.016
Gfx Voltage, Bank 0, Offset 0x24 Voltage = Data* 0.008
+5V Voltage, Bank 0, Offset 0x25 Voltage = Data* 0.008/ (10./40.)
+12V Voltage, Bank 0, Offset 0x26 Voltage = Data* 0.008/ (10./66.2)
3.3VSB Voltage, Bank 5, Offset 0x50 Voltage = Data* 0.016
VBAT Voltage, Bank 5, Offset 0x51 Voltage = Data* 0.016
CPU Temperature, Bank 7, Offset 0x17, 0x18 PECI Count = (Data=HighByte,LowByte<15:6> highest bit as sign bit)
High: PECI Count>-15; Medium: -40<PECI Count<=-15; Low: PECI Count<=-40
System Temperature, Bank 0, Offset 0x27 Temperature = Data
Peripheral Temperature, Bank 1, Offset 0x50 Temperature = Data
Chassis Intrusion, Bank 0, Offset 0x42, BitMask 0x10 1 = Bad, 0 = Good
(Clear Bit: Bank 0, Offset 0x46, BitMask 0x80)
Power Supply Failure, NCT6776F, Logical Device 0x0B, CRF7h, BitMask 0x01 0 = Good, 1 = Bad
If anyone has any idea how I can read these addresses and get the required information I would be very grateful. If anyone could post some example code, that would be even better. What I am in fact trying to do is add a Temperature sensor gauge to my server software for monitoring purposes. I need to integrate the data directly and not use a third party application due to the nature of the application I am building.
Thanks.
Alex.
Based on the information on the lm-sensors wiki - the device is accessed using the LPC bus. There is a dedicated GPLed linux driver that can be downloaded to access the device under linux. I would not look at this source if I was planning on an implementation myself because of the possibility of tainting any proprietary code that is written to access the device.
In order to perform peripheral I/O using Delphi (as in the inb/outb instructions or their equivalent), you should look at the question "how to write to I/O ports in Windows XP".
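Once the raw register bytes have been read, the conversion formulas from the register listing in the question are plain arithmetic. The C sketch below covers only that arithmetic and assumes the bytes were already obtained through some port I/O mechanism. The function names are illustrative, and the bit layout follows the listing (HighByte holds bits 12:5 of the fan count, LowByte holds bits 4:0).

```c
#include <assert.h>
#include <stdint.h>

/* 13-bit fan count: HighByte carries bits [12:5], LowByte bits [4:0]. */
uint16_t nct_fan_count(uint8_t high, uint8_t low)
{
    return (uint16_t)(((unsigned)high << 5) | (low & 0x1Fu));
}

/* RPM = 1350000 / count, per the listing. Returns 0 for a zero count
 * to avoid dividing by zero (a choice made for this sketch). */
uint32_t nct_fan_rpm(uint8_t high, uint8_t low)
{
    uint16_t count = nct_fan_count(high, low);
    return count ? 1350000u / count : 0u;
}

/* Voltage = Data * lsb_volts / divider.
 * Listing examples: CPU    -> lsb 0.008, divider 1.0
 *                   +3.3V  -> lsb 0.016, divider 1.0
 *                   +5V    -> lsb 0.008, divider 10.0/40.0
 *                   +12V   -> lsb 0.008, divider 10.0/66.2 */
double nct_volts(uint8_t data, double lsb_volts, double divider)
{
    assert(divider > 0.0);
    return data * lsb_volts / divider;
}
```

For example, a fan count of 1024 (HighByte 0x20, LowByte 0x00) corresponds to 1350000/1024, about 1318 RPM, and a +12V register value of 227 converts to roughly 12.02 V.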
