Manual CBC encryption handling with Crypto++

I am experimenting with doing CBC-mode encryption by hand while still using Crypto++ for the block cipher, just to find out whether I can do it manually.
The CBC algorithm is (AFAIK):
Assume we have n plaintext blocks K[1] ... K[n]
0. cipher = empty;
1. xor(IV, K1) -> t1
2. encrypt(t1) -> r1
3. cipher += r1
4. xor (r1, K2) -> t2
5. encrypt(t2) -> r2
6. cipher += r2
7. xor(r2, K3)->t3
8. ...
So I tried to implement it with Crypto++. I have a text file containing only alphanumeric characters. Test 1 reads the file chunk by chunk (16 bytes at a time), encrypts each chunk in CBC mode manually, and concatenates the ciphertext. Test 2 uses Crypto++'s built-in CBC mode.
Test 1
char* key;
char* iv;
//Iterate in K[n] array of n blocks
BSIZE = 16;
std::string vectorToString(vector<char> v){
string s ="";
for (int i = 0; i < v.size(); i++){
s[i] = v[i];
}
return s;
}
vector<char> xor( vector<char> s1, vector<char> s2, int len){
vector<char> r;
for (int i = 0; i < len; i++){
int u = s1[i] ^ s2[i];
r.push_back(u);
}
return r;
}
vector<char> byteToVector(byte *b, int len){
vector<char> v;
for (int i = 0; i < len; i++){
v.push_back( b[i]);
}
return v;
}
string cbc_manual(byte [n]){
int i = 0;
//Open a file and read from it, buffer size = 16
// , equal to DEFAULT_BLOCK_SIZE
std::ifstream fin(fileName, std::ios::binary | std::ios::in);
const int BSIZE = 16;
vector<char> encryptBefore;
//This function will return cpc
string cpc ="";
while (!fin.eof()){
char buffer[BSIZE];
//Read a chunk of file
fin.read(buffer, BSIZE);
int sb = sizeof(buffer);
if (i == 0){
encryptBefore = byteToVector( iv, BSIZE);
}
//If i == 0, xor IV with current buffer
//else, xor encryptBefore with current buffer
vector<char> t1 = xor(encryptBefore, byteToVector((byte*) buffer, BSIZE), BSIZE);
//After xored, encrypt the xor result, it will be current step cipher
string r1= encrypt(t1, BSIZE).c_str();
cpc += r1;
const char* end = r1.c_str() ;
encryptBefore = stringToVector( r1);
i++;
}
return cpc;
}
This is my encrypt() function. Because there is only one block at a time, I use ECB (?) mode:
string encrypt(string s, int size){
ECB_Mode< AES >::Encryption e;
e.SetKey(key, size);
string cipher;
StringSource ss1(s, true,
new StreamTransformationFilter(e,
new StringSink(cipher)
) // StreamTransformationFilter
); // StringSource
return cipher;
}
And this is the pure Crypto++ solution:
Test 2
encryptCBC(char * plain){
CBC_Mode < AES >::Encryption encryption(key, sizeof(key), iv);
StreamTransformationFilter encryptor(encryption, NULL);
for (size_t j = 0; j < plain.size(); j++)
encryptor.Put((byte)plain[j]);
encryptor.MessageEnd();
size_t ready = encryptor.MaxRetrievable();
string cipher(ready, 0x00);
encryptor.Get((byte*)&cipher[0], cipher.size());
}
The results of Test 1 and Test 2 are different. In fact, the ciphertext from Test 1 contains the result of Test 2, with extra bytes in between. Example:
Test 1's result: aaa[....]bbb[....]ccc[...]...
Test 2 (Crypto++ built-in CBC)'s result: aaabbbccc...
I know the xor() function may cause a problem related to "sameChar ^ sameChar = 0", but is there any problem with the algorithm in my code?
This is my Test 2.1, written after jww's first suggestion.
static string auto_cbc2(string plain, long size){
CBC_Mode< AES >::Encryption e;
e.SetKeyWithIV(key, sizeof(key), iv, sizeof(iv));
string cipherText;
CryptoPP::StringSource ss(plain, true,
new CryptoPP::StreamTransformationFilter(e,
new CryptoPP::StringSink(cipherText)
, BlockPaddingSchemeDef::NO_PADDING
) // StreamTransformationFilter
); // StringSource
return cipherText;
}
It throws an exception:
Unhandled exception at 0x7407A6F2 in AES-CRPP.exe: Microsoft C++
exception: CryptoPP::InvalidDataFormat at memory location 0x00EFEA74
I only get this error when using BlockPaddingSchemeDef::NO_PADDING. If I remove the padding argument or use BlockPaddingSchemeDef::DEFAULT_PADDING, I get no error.

StringSource ss1(s, true,
new StreamTransformationFilter(e,
new StringSink(cipher)));
This uses PKCS padding by default. PKCS padding always appends at least one byte, so a 16-byte input gains a full extra block and produces a 32-byte output. You should do one of two things.
First, you can use BlockPaddingSchemeDef::NO_PADDING. Something like:
StringSource ss1(s, true,
new StreamTransformationFilter(e,
new StringSink(cipher),
BlockPaddingSchemeDef::NO_PADDING));
Second, you can process blocks manually, 16 bytes at a time. Something like:
AES::Encryption encryptor(key, keySize);
byte ibuff[<some size>] = ...;
byte obuff[<some size>];
ASSERT(<some size> % AES::BLOCKSIZE == 0);
unsigned int BLOCKS = <some size>/AES::BLOCKSIZE;
for (unsigned int i=0; i<BLOCKS; i++)
{
// Do the CBC XOR thing on the input block first, then encrypt it
encryptor.ProcessBlock(&ibuff[i*16], &obuff[i*16]);
}
You may be able to call ProcessAndXorBlock from the BlockCipher base class and do it in one shot.
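The chaining order from the question (XOR with the previous ciphertext block, then encrypt, then feed the result forward) can be sketched without any library. This is only a minimal sketch under stated assumptions: toyEncryptBlock is a made-up stand-in for a real single-block cipher call such as AES's ProcessBlock and offers no security, the names Block and cbcEncrypt are hypothetical, and the input length is assumed to be a multiple of 16 so no padding is involved.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

constexpr std::size_t kBlockSize = 16;
using Block = std::array<std::uint8_t, kBlockSize>;

// Hypothetical stand-in for a real block cipher (NOT secure):
// XORs each byte with the key and rotates the block by one position.
Block toyEncryptBlock(const Block& in, const Block& key) {
    Block out{};
    for (std::size_t i = 0; i < kBlockSize; ++i)
        out[(i + 1) % kBlockSize] = in[i] ^ key[i];
    return out;
}

// CBC encryption without padding: c[i] = E(p[i] XOR c[i-1]), with c[-1] = IV.
std::vector<std::uint8_t> cbcEncrypt(const std::vector<std::uint8_t>& plain,
                                     const Block& key, const Block& iv) {
    if (plain.size() % kBlockSize != 0)
        throw std::invalid_argument("length must be a multiple of the block size");
    std::vector<std::uint8_t> cipher;
    cipher.reserve(plain.size());
    Block prev = iv;
    for (std::size_t off = 0; off < plain.size(); off += kBlockSize) {
        Block x{};
        for (std::size_t i = 0; i < kBlockSize; ++i)
            x[i] = plain[off + i] ^ prev[i];  // XOR happens before encryption
        prev = toyEncryptBlock(x, key);        // this ciphertext feeds the next block
        cipher.insert(cipher.end(), prev.begin(), prev.end());
    }
    return cipher;
}
```

Each 16-byte input block yields exactly 16 output bytes here. If the per-block encrypt step added padding instead, every block would grow to 32 bytes, which would match the aaa[....]bbb[....] pattern described in the question.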

What are my options for converting an OpenCV reduce loop to native iOS code? SIMD, anyone?

Which native iOS framework is best suited to eliminate this CPU hog written with OpenCV?
/// Reduce the channel elements of given Mat to a single channel
static func reduce(input: Mat) throws -> Mat {
let output = Mat(rows: input.rows(), cols: input.cols(), type: CvType.CV_8UC1)
for x in 0 ..< input.rows() {
for y in 0 ..< input.cols() {
let value = input.get(row: x, col: y)
let dataValue = value.reduce(0, +)
try output.put(row: x, col: y, data: [dataValue])
}
}
return output
}
It takes 20+ seconds to do those gets and puts on the real-world data I ran this code on.
Assuming your input matrix is CV_64FC2, call the computeSumX2 C function for each row.
Untested.
#include <arm_neon.h>
#include <stdint.h>
#include <stddef.h>
// Load 8 FP64 values, add pairwise, narrow uint64 to uint32, combine into a single vector
inline uint32x4_t reduce4( const double* rsi )
{
// Load 8 values
float64x2x4_t f64 = vld1q_f64_x4( rsi );
// Add them pairwise
float64x2_t f64_1 = vpaddq_f64( f64.val[ 0 ], f64.val[ 1 ] );
float64x2_t f64_2 = vpaddq_f64( f64.val[ 2 ], f64.val[ 3 ] );
// Convert FP64 to uint64
uint64x2_t i64_1 = vcvtq_u64_f64( f64_1 );
uint64x2_t i64_2 = vcvtq_u64_f64( f64_2 );
// Convert int64 to int32 in a single vector, using saturation
uint32x2_t low = vqmovn_u64( i64_1 );
return vqmovn_high_u64( low, i64_2 );
}
// Compute pairwise sum of FP64 values, cast to bytes
void computeSumX2( uint8_t* rdi, size_t length, const double* rsi )
{
const double* const rsiEnd = rsi + length * 2;
size_t lengthAligned = ( length / 16 ) * 16;
const double* const rsiEndAligned = rsi + lengthAligned * 2;
for( ; rsi < rsiEndAligned; rsi += 16 * 2, rdi += 16 )
{
// Each iteration of the loop loads 32 source values, stores 16 bytes
uint16x4_t low16 = vqmovn_u32( reduce4( rsi ) );
uint16x8_t u16 = vqmovn_high_u32( low16, reduce4( rsi + 8 ) );
uint8x8_t low8 = vqmovn_u16( u16 );
low16 = vqmovn_u32( reduce4( rsi + 8 * 2 ) );
u16 = vqmovn_high_u32( low16, reduce4( rsi + 8 * 3 ) );
uint8x16_t res = vqmovn_high_u16( low8, u16 );
vst1q_u8( rdi, res );
}
for( ; rsi < rsiEnd; rsi += 2, rdi++ )
{
// Each iteration of the loop loads 2 source values, stores a single byte
float64x2_t f64 = vld1q_f64( rsi );
double sum = vaddvq_f64( f64 );
*rdi = (uint8_t)sum;
}
}
For folks such as myself who have a poor grasp of ARM intrinsics, a simpler solution is to bridge into Objective-C code as Soonts did, thereby ditching the crude Swift API to OpenCV and bypassing the costly memory copying of gets and puts.
void fasterSumX2( const char *input,
int rows,
int columns,
long step,
int channels,
char* output,
long output_step
)
{
for(int j = 0;j < rows;j++){
for(int i = 0;i < columns;i++){
long offset = step * j + i * channels;
const unsigned char *ptr = (const unsigned char *)(input + offset);
int res = ptr[0]+ptr[1];
if (res > 0) {
if (res > 255) {
assert(false);
}
}
*(output + output_step * j + i) = res;
}
}
}

Windows DPDK l2fwd: receiving packets out of sequence

I am validating DPDK receive functionality. For this I am sending a pcap from an external shooter, and I added code to l2fwd to dump the received packets to a pcap file. The pcap dumped by l2fwd contains all the packets from the shooter, but some of them are out of sequence.
The shooter has already been validated.
DPDK version in use: 21.11
Link to the pcap used: https://wiki.wireshark.org/uploads/__moin_import__/attachments/SampleCaptures/tcp-ecn-sample.pcap
The out-of-order packets are random. On the first run I saw no jumbled packets, but I was able to reproduce the issue on the second run, where the 2nd, 3rd and 4th packets arrived in the order 3, 4, 2.
Below is a snippet from the l2fwd example, with our modifications marked as // TESTCODE:
/* Read packet from RX queues. 8< */
for (i = 0; i < qconf->n_rx_port; i++) {
portid = qconf->rx_port_list[i];
nb_rx = rte_eth_rx_burst(portid, 0,
pkts_burst, MAX_PKT_BURST);
port_statistics[portid].rx += nb_rx;
for (j = 0; j < nb_rx; j++) {
m = pkts_burst[j];
// TESTCODE_STARTS
uint8_t* pkt = rte_pktmbuf_mtod(m, uint8_t*);
dump_to_pcap(pkt, rte_pktmbuf_pkt_len(m));
// TESTCODE_ENDS
rte_prefetch0(rte_pktmbuf_mtod(m, void *));
l2fwd_simple_forward(m, portid);
}
}
/* >8 End of read packet from RX queues. */
Below is the code for dump_to_pcap:
static int
dump_to_pcap(uint8_t* pkt, int pkt_len)
{
static FILE* fp = NULL;
static int init_file = 0;
if (0 == init_file) {
printf("Creating pcap\n");
char pcap_filename[256] = { 0 };
char Two_pcap_filename[256] = { 0 };
currentDateTime(pcap_filename);
sprintf(Two_pcap_filename,".\\Rx_%d_%s.pcap", 0, pcap_filename);
printf("FileSName to Create: %s\n", Two_pcap_filename);
fp = fopen(Two_pcap_filename, "wb");
if (NULL == fp) {
printf("Unable to open file\n");
fp = NULL;
}
else {
printf("File create success..\n");
init_file = 1;
typedef struct pcap_file_header1 {
unsigned int magic; // a 32-bit "magic number"
unsigned short version_major; //a 16-bit major version number
unsigned short version_minor; //a 16-bit minor version number
unsigned int thiszone; //a 32-bit "time zone offset" field that's actually not used, so you can (and probably should) just make it 0
unsigned int sigfigs; //a 32-bit "time stamp accuracy" field that's not actually used, so you can (and probably should) just make it 0
unsigned int snaplen; //a 32-bit "snapshot length" field
unsigned int linktype; //a 32-bit "link layer type" field
}dumpFileHdr;
dumpFileHdr file_hdr;
file_hdr.magic = 2712847316; //0xa1b2c3d4;
file_hdr.version_major = 2;
file_hdr.version_minor = 4;
file_hdr.thiszone = 0;
file_hdr.sigfigs = 0;
file_hdr.snaplen = 65535;
file_hdr.linktype = 1;
fwrite((void*)(&file_hdr), sizeof(dumpFileHdr), 1, fp);
//printf("Pcap Header written\n");
}
}
typedef struct pcap_pkthdr1 {
unsigned int ts_sec; /* time stamp */
unsigned int ts_usec;
unsigned int caplen; /* length of portion present */
unsigned int len; /* length this packet (off wire) */
}dumpPktHdr;
dumpPktHdr pkt_hdr;
static int ts_sec = 1;
pkt_hdr.ts_sec = ts_sec++;
pkt_hdr.ts_usec = 0;
pkt_hdr.caplen = pkt_hdr.len = pkt_len;
if (NULL != fp) {
fwrite((void*)(&pkt_hdr), sizeof(dumpPktHdr), 1, fp);
fwrite((void*)(pkt), pkt_len, 1, fp);
fflush(fp);
}
return 0;
}

How can I generate a checksum in Dart?

I want to use the PayMaya EMV Merchant Presented QR Code Specification for Payment Systems. Everything works except the CRC; I don't understand how to generate it.
This is everything the specification says about it, but I still can't work out how to generate the value:
The checksum shall be calculated according to [ISO/IEC 13239] using the polynomial '1021' (hex) and initial value 'FFFF' (hex). The data over which the checksum is calculated shall cover all data objects, including their ID, Length and Value, to be included in the QR Code, in their respective order, as well as the ID and Length of the CRC itself (but excluding its Value).
Following the calculation of the checksum, the resulting 2-byte hexadecimal value shall be encoded as a 4-character Alphanumeric Special value by converting each nibble to an Alphanumeric Special character.
Example: a CRC with a two-byte hexadecimal value of '007B' is included in the QR Code as "6304007B".
This converts a string to its UTF-8 representation as a sequence of bytes and returns the 16-bit Cyclic Redundancy Check of those bytes (CRC-16/CCITT-FALSE).
import 'dart:convert';
import 'dart:typed_data';
int crc16_CCITT_FALSE(String data) {
int initial = 0xFFFF; // initial value
int polynomial = 0x1021; // 0001 0000 0010 0001 (0, 5, 12)
Uint8List bytes = Uint8List.fromList(utf8.encode(data));
for (var b in bytes) {
for (int i = 0; i < 8; i++) {
bool bit = ((b >> (7-i) & 1) == 1);
bool c15 = ((initial >> 15 & 1) == 1);
initial <<= 1;
if (c15 ^ bit) initial ^= polynomial;
}
}
return initial &= 0xffff;
}
The CRC for ISO/IEC 13239 is CRC-16/ISO-HDLC, per the notes in the CRC catalog. This implements that CRC and prints its check value, 0x906e:
import 'dart:typed_data';
int crc16ISOHDLC(Uint8List bytes) {
int crc = 0xffff;
for (var b in bytes) {
crc ^= b;
for (int i = 0; i < 8; i++)
crc = (crc & 1) != 0 ? (crc >> 1) ^ 0x8408 : crc >> 1;
}
return crc ^ 0xffff;
}
void main() {
Uint8List msg = Uint8List.fromList([0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39]);
print("0x" + crc16ISOHDLC(msg).toRadixString(16));
}
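The two answers above use different parameter sets, so it can help to run both over the conventional test string "123456789" and compare each result with its published check value before deciding which one a given QR reader expects. The following is a minimal C++ sketch, not taken from the specification; the function names crc16CcittFalse, crc16IsoHdlc and toHex4 are made up for illustration. toHex4 shows the 4-character uppercase encoding from the quoted example ('007B').

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF,
// bits processed MSB first, no final XOR.
std::uint16_t crc16CcittFalse(const std::string& data) {
    std::uint16_t crc = 0xFFFF;
    for (unsigned char b : data) {
        crc ^= static_cast<std::uint16_t>(b << 8);
        for (int i = 0; i < 8; ++i)
            crc = (crc & 0x8000) ? static_cast<std::uint16_t>((crc << 1) ^ 0x1021)
                                 : static_cast<std::uint16_t>(crc << 1);
    }
    return crc;
}

// CRC-16/ISO-HDLC: same polynomial in reflected form (0x8408),
// initial value 0xFFFF, bits processed LSB first, final XOR 0xFFFF.
std::uint16_t crc16IsoHdlc(const std::string& data) {
    std::uint16_t crc = 0xFFFF;
    for (unsigned char b : data) {
        crc ^= b;
        for (int i = 0; i < 8; ++i)
            crc = (crc & 1) ? static_cast<std::uint16_t>((crc >> 1) ^ 0x8408)
                            : static_cast<std::uint16_t>(crc >> 1);
    }
    return static_cast<std::uint16_t>(crc ^ 0xFFFF);
}

// Encode the 2-byte CRC as four uppercase hexadecimal characters.
std::string toHex4(std::uint16_t crc) {
    char buf[5];
    std::snprintf(buf, sizeof buf, "%04X", static_cast<unsigned>(crc));
    return std::string(buf);
}
```

For "123456789", crc16CcittFalse gives 0x29B1 and crc16IsoHdlc gives 0x906E, which are the catalogued check values for those two algorithms.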

Manual CTR mode implementation with Crypto++

I am trying to implement CTR mode manually on top of ECB mode, while still using Crypto++.
The idea is:
For a single block, just use ECB. For multiple blocks, use the CTR algorithm
(AFAIK):
//We have n block of plain data -> M
PlainData M[n];
key;
iv;
char *CTR;
cipher ="";
for(i = 0; i<n; i++ ){
if(i ==0){
CTR = iv;
}
ei = encryptECB(CTR + i)
cipherI = xor(ei, M[i])
cipher += cipherI;
}
//My xor() to XOR two char array
void xor(char *s1, char* s2, char *& result, int len){
try{
int i;
for (i = 0; i < len; i++){
int u = s1[i] ^ s2[i];
result[i] = u;
}
result[i] = '\0';
}
catch (...){
cout << "Errp";
}
}
Test 1: 100% Crypto++ CTR
string auto_ctr(char * s1, long size){
CTR_Mode< AES >::Encryption e;
e.SetKeyWithIV(key, sizeof(key), iv, sizeof(iv));
string cipherZ;
StringSource s(s1, true,
new StreamTransformationFilter(e,
new StringSink(cipherZ), BlockPaddingSchemeDef::BlockPaddingScheme::NO_PADDING
)
);
return cipherZ;
}
Test 2: Manual CTR based on ECB
string encrypt(char* s1, int size){
ECB_Mode< AES >::Encryption e;
e.SetKey(key, size);
string cipher;
string s(s1, size);
StringSource ss1(s, true,
new StreamTransformationFilter(e,
new StringSink(cipher), BlockPaddingSchemeDef::BlockPaddingScheme::NO_PADDING
) // StreamTransformationFilter
); // StringSource
return cipher;
}
static string manual_ctr(char *plain, long &size){
int nBlocks = size / BLOCK_SIZE;
char* encryptBefore = new char[BLOCK_SIZE];
char *ci = new char[BLOCK_SIZE] ;
string cipher;
for (int i = 0; i < nBlocks; i++){
//If the first loop, CTR = IV
if (i == 0){
memcpy(encryptBefore, iv, BLOCK_SIZE);
}
encryptBefore[BLOCK_SIZE] = '\0';
memcpy(encryptBefore, encryptBefore + i, BLOCK_SIZE);
char *buffer = new char[BLOCK_SIZE];
memcpy(buffer, &plain[i], BLOCK_SIZE);
buffer[BLOCK_SIZE] = '\0';
//Encrypt the CTR
string e1 = encrypt(encryptBefore, BLOCK_SIZE);
//Xor it with m[i] => c[i]
xor((char*)e1.c_str(), buffer, ci, BLOCK_SIZE);
//Append to the summary cipher
/*for (int j = 0; j < BLOCK_SIZE/2; j++){
SetChar(cipher, ci[j], i*BLOCK_SIZE + j);
}*/
cipher += ci;
//Set the cipher back to iv
//memcpy(encryptBefore, ci, BLOCK_SIZE);
}
return cipher;
}
And this is Main for testing:
void main(){
long size = 0;
char * plain = FileUtil::readAllByte("some1.txt", size);
string auto_result = auto_ctr(plain, size);
string manual_result = manual_ctr(plain, size);
getchar();
}
The auto_result is:
"Yž+eÞsÂÙ\bü´\x1a¨Ü_ÙR•L¸Ð€¦å«ÎÍÊ[w®Ÿg\fT½\ý7!p\r^ÍdžúP\bîT\x3\x1cZï.s%\x1ei{ÚMˆØ…Pä¾õ\x46\r5\tâýï‚ú\x16ç’Qiæ²\x15š€á^ªê]W
ÊNqdŒ¥ ˆ†¾j%8.Ìù\x6Þ›ÔÏ’[c\x19"
The manual_result is:
"Yž+eÞsÂÙ\bü´\x1a¨Ü_Ù·\x18ýuù\n\nl\x11Á\x19À†Žaðƒºñ®GäþŽá•\x11ÇYœf+^Q\x1a\x13B³‘QQµºëÑÌåM\"\x12\x115â\x10¿Ô„›s°‰=\x18*\x1c:²IF'n#ŠŠ¾mGÂzõžÀ\x1eÏ\SëYU¼í‘"
What is the problem with my implementation?
Since your first block seems to be working fine, I only looked for problems in the handling of the counter itself, and here is what looks wrong to me:
memcpy(encryptBefore, encryptBefore + i, BLOCK_SIZE);
Here I presume you are trying to increment your IV i times, but that is not what happens. What this does is copy BLOCK_SIZE bytes starting at encryptBefore + i into encryptBefore (for i > 0 it also reads past the end of the BLOCK_SIZE-byte buffer). That does not increment the IV at all; it only works for the first block because i = 0 there.
What you actually want is to build a big integer from the IV using CryptoPP::Integer, increment that integer, and convert it back to a byte array with the Encode(byte *output, size_t outputLen, Signedness sign=UNSIGNED) const function of the CryptoPP Integer class whenever you need bytes instead of an integer.
PS: when performing I/O, I recommend using hexadecimal strings. Take a look at the CryptoPP::HexEncoder and HexDecoder classes; both are well documented on the Crypto++ wiki.
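As an alternative to a big-integer class, the counter can also be stepped directly on the 16-byte buffer. The sketch below assumes the whole block is treated as one big-endian integer; that is my understanding of how Crypto++'s CTR mode advances its counter, but treat it as an assumption to check against the library rather than a guarantee. The names CounterBlock, incrementCounter and counterForBlock are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using CounterBlock = std::array<std::uint8_t, 16>;

// Add 1 to the block, treating it as a big-endian integer:
// start at the last byte and carry leftwards while a byte wraps to 0.
void incrementCounter(CounterBlock& ctr) {
    for (std::size_t i = ctr.size(); i-- > 0;) {
        if (++ctr[i] != 0)
            break;  // no carry out of this byte, so we are done
    }
}

// Counter value for block index n: the IV incremented n times.
CounterBlock counterForBlock(CounterBlock iv, std::size_t n) {
    for (std::size_t k = 0; k < n; ++k)
        incrementCounter(iv);
    return iv;
}
```

In the manual loop, block i would then be produced by encrypting counterForBlock(iv, i) with the raw block cipher and XORing the result with the i-th plaintext block. Appending the result with an explicit length, for example cipher.append(ci, BLOCK_SIZE), avoids truncation at embedded zero bytes.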

Get a 32-bit number in iOS

How can I get a 32-bit number in Objective-C from a byte array, similar to this Java code:
ByteBuffer bb = ByteBuffer.wrap(truncation);
return bb.getInt();
where truncation is the byte array?
It returns a 32-bit number. Is this possible in Objective-C?
If the number is encoded in little-endian within the buffer, then use:
int32_t getInt32LE(const uint8_t *buffer)
{
uint32_t value = 0; // accumulate unsigned so the shifts cannot overflow a signed int
unsigned length = 4;
while (length > 0)
{
value <<= 8;
value |= buffer[--length];
}
return (int32_t)value;
}
If the number is encoded in big-endian within the buffer, then use:
int32_t getInt32BE(const uint8_t *buffer)
{
uint32_t value = 0; // accumulate unsigned so the shifts cannot overflow a signed int
for (unsigned i = 0; i < 4; i++)
{
value <<= 8;
value |= *buffer++;
}
return (int32_t)value;
}
UPDATE: If you are using data created on the same host, then endianness is not an issue. In that case you can use a union as a bridge between the buffer and the integer, which avoids some unpleasant casting:
union
{
uint8_t b[sizeof(int32_t)];
int32_t i;
} u;
memcpy(u.b, buffer, sizeof(u.b));
// value is u.i
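Depending on the endianness (with b0 to b3 being the bytes in buffer order):
uint32_t n = (uint32_t)b0 << 24 | (uint32_t)b1 << 16 | (uint32_t)b2 << 8 | b3;
or
uint32_t n = (uint32_t)b3 << 24 | (uint32_t)b2 << 16 | (uint32_t)b1 << 8 | b0;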
If you just want to read 4 bytes and assign that value to an integer, do this:
int32_t number;
memcpy(&number, truncation, sizeof(number));
About endianness
From your question it seemed clear to me that the bytes were already ordered correctly. However, if you have to reorder them, use ntohl() after memcpy():
number=ntohl(number);
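The byte-order cases discussed in the answers above can be checked with a small, host-independent sketch. It is written in C++ and the function names readUint32BE and readUint32LE are made up for illustration; the same shifts carry over to C or Objective-C with C-style casts. Java's ByteBuffer reads in big-endian order unless told otherwise, so readUint32BE is the counterpart of the getInt() call in the question, apart from Java returning the value as a signed int.

```cpp
#include <cstdint>

// Big-endian: the first byte in the buffer is the most significant one.
std::uint32_t readUint32BE(const std::uint8_t* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) |
            static_cast<std::uint32_t>(p[3]);
}

// Little-endian: the first byte in the buffer is the least significant one.
std::uint32_t readUint32LE(const std::uint8_t* p) {
    return (static_cast<std::uint32_t>(p[3]) << 24) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
            static_cast<std::uint32_t>(p[0]);
}
```

For the bytes 0x12 0x34 0x56 0x78, readUint32BE returns 0x12345678 and readUint32LE returns 0x78563412. Because the bytes are combined with shifts rather than reinterpreted in memory, the result does not depend on the endianness of the machine running the code.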
