Given a 64-bit int, I need to split it into four 2-byte ints.
For example, decimal 66309 is 0000 0000 0000 0001 0000 0011 0000 0101
I need to convert this into an array of 4 ints {0, 1, 3, 5}. How can I do it in Lua?
First, splitting 66309 into four 16-bit integers wouldn't give {0, 1, 3, 5} but {0, 0, 1, 773}; in your example you are splitting it into 8-bit integers. The code below produces 16-bit integers, least significant chunk first, so t ends up as {773, 1, 0, 0}.
local int = 66309
local t = {}
for i = 0, 3 do
t[i+1] = (int >> (i * 16)) & 0xFFFF
end
If you want it to be 8 bit integers change the 3 in the loop to 7, the 16 in the shift expression to an 8, and the hex mask 0xFFFF to 0xFF.
And finally, this only works in Lua 5.3 or later, where integers are 64 bits wide and the bitwise operators exist. You cannot accurately represent a 64-bit integer in Lua before this version without external libraries.
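For readers outside Lua, here is the same shift-and-mask loop as a minimal C sketch. The function name split_u64_to_u16 is made up for illustration; out[0] receives the least significant chunk, matching t[1] in the Lua loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the answer): split a 64-bit value into four
   16-bit chunks, least significant chunk first, like the Lua loop. */
void split_u64_to_u16(uint64_t value, uint16_t out[4])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++) {
        /* Shift chunk i down to the bottom, then keep only its 16 bits. */
        out[i] = (uint16_t)((value >> (i * 16)) & 0xFFFFu);
    }
}
```

Called with 66309, this fills out with {773, 1, 0, 0}.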
byte 0: min_value (0-3 bit)
max_value (4-7 bit)
The byte0 should be the min and max values combined.
min and max values are both integers (in 0-15 range).
I should convert them into 4-bit binary, and combine them somehow? (how?)
E.g.
min_value=2 // 0010
max_value=3 // 0011
The result should be an Uint8, and the value: 00100011
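You can use the shift left operator << to get the result you want. The byte value itself is (min_value << 4) + max_value; the toRadixString and padLeft calls only format it as an 8-digit binary string: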
result = ((min_value << 4) + max_value).toRadixString(2).padLeft(8, '0');
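The same packing, sketched in C with the inverse operation added. The names pack_nibbles and unpack_nibbles are assumptions for illustration, and min_value is placed in the upper four bits, as in the 00100011 example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: min_value goes into the upper 4 bits, max_value into
   the lower 4 bits. Both must be in the range 0..15. */
uint8_t pack_nibbles(uint8_t min_value, uint8_t max_value)
{
    assert(min_value <= 15 && max_value <= 15);
    return (uint8_t)((min_value << 4) | max_value);
}

/* Inverse of pack_nibbles: recover the two 4-bit values from one byte. */
void unpack_nibbles(uint8_t packed, uint8_t *min_value, uint8_t *max_value)
{
    *min_value = (uint8_t)(packed >> 4);
    *max_value = (uint8_t)(packed & 0x0F);
}
```

With min_value = 2 and max_value = 3, pack_nibbles returns 0x23, which is 00100011 in binary.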
I am learning SIMD programming little by little, and I've devised a (seemingly) simple problem that I hope I can speed up using SIMD (AVX; at the moment I only have access to AVX CPUs).
I have a long string over an alphabet of 2^k characters (for instance 0, 1, 2, 3), and I'd like to:
generate all substrings of a given length substringlength
convert all the substrings into bits
The substrings are just sequences of characters from the input string:
012301230123012301230123012301233012301301230123123213012301230
substringlength = 6;
string bits
------+--+-----------------
012301 -> 01 00 11 10 01 00
123012 -> 10 01 00 11 10 01
230123 -> 11 10 01 00 11 10
301230 -> 00 11 10 01 00 11
...
My question is due to my inexperience with SIMD (I've only read "Modern x86 Assembly Language Programming", by Kusswurm):
Is this a task where SIMD could help?
Edit: for simplicity, let's just assume k = 2, and so the ASCII numbers will be just '0'..'3'.
Iteration 1
Reading the comments and playing around, I've come to these realizations. I can convert the ASCII into values and, as suggested, multiply-add adjacent bytes:
// SIMD 128-bit registers; apparently I cannot use the 256-bit AVX ones directly (the integer operations need AVX2 or AVX-512)
__m128i sse, val, adj, res;
// Multipliers, listed from byte 15 down to byte 0: 1<<2 moves a symbol up by k = 2 bits, 1 leaves it in place
auto mask = _mm_set_epi8(1, 1<<2, 1, 1<<2, 1, 1<<2, 1, 1<<2, 1, 1<<2, 1, 1<<2, 1, 1<<2, 1, 1<<2);
auto zero = _mm_set1_epi8('0');
// Load ASCII values (unaligned load)
sse = _mm_loadu_si128((const __m128i*) s.data());
// Convert to integer values
val = _mm_sub_epi8(sse, zero);
// Multiply with mask byte by byte (aka SHL every other byte of val) and sum adjacent pairs
adj = _mm_maddubs_epi16(val, mask);
An idea of what it does, to people learning like me, is given here (I will need more 128-bits to encode one substring, ascii is in hex):
bytes 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ascii 30 31 30 31 30 31 30 31 30 31 30 31 30 31 30 31
_mm_sub_epi8:
value 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
_mm_maddubs_epi16:
value 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
* * * * * * * * * * * * * * * *
mask 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4
+ + + + + + + +
| | | | | | | | |
(16-bits)
bits ....0100 ....0100 ....0100 ....0100 ....0100 ....0100 ....0100 ....0100
In other words the first 4 bits are correct, encoding 2 ascii chars, if I understand correctly what _mm_maddubs_epi16 did to my values, which I am not sure at all!
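One way to check that reading is a scalar model of the instruction. The sketch below follows the documented behaviour of _mm_maddubs_epi16 (unsigned bytes times signed bytes, adjacent products added and saturated to 16 bits); the name maddubs_model is made up, and index 0 here is byte 0, the rightmost column of the diagram.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of _mm_maddubs_epi16: a holds unsigned bytes, b signed bytes.
   Each 16-bit output is a[2i]*b[2i] + a[2i+1]*b[2i+1], saturated to int16. */
void maddubs_model(const uint8_t a[16], const int8_t b[16], int16_t out[8])
{
    assert(a != NULL && b != NULL && out != NULL);
    for (int i = 0; i < 8; i++) {
        int32_t sum = (int32_t)a[2 * i] * b[2 * i]
                    + (int32_t)a[2 * i + 1] * b[2 * i + 1];
        if (sum > INT16_MAX) sum = INT16_MAX;   /* saturate high */
        if (sum < INT16_MIN) sum = INT16_MIN;   /* saturate low */
        out[i] = (int16_t)sum;
    }
}
```

With the values and the multipliers 4 and 1 from the diagram, every 16-bit lane comes out as 4 (binary 0100), so the even byte of each pair lands in bits 2..3 and the odd byte in bits 0..1.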
Now I'd need some sort of "shift-or" of adjacent bytes, something like _mm_maddubs_epi16 that shifts left the first, and ORs with the second argument, producing an 8-bit or 16-bit value:
(16-bits)
bits ....0100 ....0100 ....0100 ....0100 ....0100 ....0100 ....0100 ....0100
| shl 4 | | shl 4 | | shl 4 | | shl 4 |
0100.... ....0100 0100.... ....0100 0100.... ....0100 0100.... ....0100
OR OR OR OR
....01000100 ....01000100 ....01000100 ....01000100
However, I cannot see how _mm_bslli_si128 could help me here, or if there is a smarter way to do this. Maybe even this "horizontal" approach is foolish, and I have to rethink it.
Any hint is welcome!
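Whatever the SIMD version ends up looking like, a scalar baseline is useful to test it against. Here is a minimal sketch, assuming k = 2, input characters '0'..'3', and the bit order of the table above (first character of the substring in the lowest two bits). The name encode_windows is made up; it slides over the string once and reuses the previous code instead of re-encoding each substring.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar baseline: encode every window of n symbols ('0'..'3')
   as 2 bits per symbol, first symbol of the window in the lowest bits.
   Writes len - n + 1 codes to out and returns that count (0 if len < n). */
size_t encode_windows(const char *s, size_t len, unsigned n, uint64_t *out)
{
    assert(n >= 1 && n <= 32);          /* 2 * n bits must fit in 64 */
    if (len < n) return 0;

    uint64_t code = 0;
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        assert(s[i] >= '0' && s[i] <= '3');
        uint64_t v = (uint64_t)(s[i] - '0');
        /* Drop the oldest symbol from the bottom, add the new one on top. */
        code = (code >> 2) | (v << (2 * (n - 1)));
        if (i + 1 >= n) out[count++] = code;
    }
    return count;
}
```

For the first nine characters of the example string and n = 6, this yields 0x4E4, 0x939, 0xE4E and 0x393, which are the four bit patterns listed in the table.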
I'm quite puzzled about the endianness on an ARM device. The device I'm testing uses little endian.
Say there's code here which swaps elements in an array:
uint32_t* srcPtr = (uint32_t*)src->get();
uint8_t* dstPtr = dst->get();
dstPtr[0] = ((*srcPtr) >> 16) & 0xFF;
dstPtr[1] = ((*srcPtr) >> 8) & 0xFF;
dstPtr[2] = (*srcPtr) & 0xFF;
dstPtr[3] = ((*srcPtr) >> 24);
My understanding is that if srcPtr contains {0, 1, 2, 3} the output dstPtr should be {1, 2, 3, 0}.
But the output dstPtr is {2, 1, 0, 3}.
Does this mean that srcPtr is read in the order 3, 2, 1, 0?
Can someone please help me? :)
Is this due to little endian?
Say at address 0x100 I have the bytes 0x00, 0x11, 0x22, 0x33: 0x00 is at 0x100, 0x11 at 0x101, and so on. If I point at address 0x100 with a 32-bit unsigned pointer, I get the value 0x33221100. That is true for ARM (little endian), true for x86 (little endian), etc.
So now if I take 0x33221100, then (x>>16)&0xFF is 0x22, (x>>8)&0xFF is 0x11, x&0xFF is 0x00 and (x>>24)&0xFF is 0x33. With your bytes {0, 1, 2, 3} the same shifts give {2, 1, 0, 3}.
Where is your confusion? Is it the conversion from 0x00, 0x11, 0x22, 0x33 to 0x33221100? Little endian means least significant byte first: the lowest address you come across (0x100) holds the least significant byte (0x00, the lower 8 bits of the number), 0x101 holds the next least significant bits 8 to 15, 0x102 holds bits 16 to 23, and 0x103 holds bits 24 to 31 of the 32-bit value.
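To make this reproducible on any host, here is a small sketch that builds the 32-bit value the way a little-endian load would and then applies the shifts from the question. The function names load_le32 and swizzle are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build the 32-bit value a little-endian CPU sees when it loads 4 bytes:
   the byte at the lowest address becomes the least significant byte. */
uint32_t load_le32(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Same shifts and masks as the code in the question. */
void swizzle(const uint8_t src[4], uint8_t dst[4])
{
    uint32_t x = load_le32(src);
    dst[0] = (uint8_t)((x >> 16) & 0xFF);   /* third byte in memory */
    dst[1] = (uint8_t)((x >> 8) & 0xFF);    /* second byte in memory */
    dst[2] = (uint8_t)(x & 0xFF);           /* first byte in memory */
    dst[3] = (uint8_t)((x >> 24) & 0xFF);   /* fourth byte in memory */
}
```

Feeding {0, 1, 2, 3} through swizzle gives {2, 1, 0, 3}, the output observed on the device.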
I am using Lua on Redis and want to compare two signed 64-bit numbers, which are stored in two 8-byte/character strings.
How can I compare them using the libraries available in Redis?
http://redis.io/commands/EVAL#available-libraries
I'd like to know >/< and == checks. I think this probably involves pulling two 32-bit numbers for each 64-bit int, and doing some clever math on those, but I am not sure.
I have some code to make this less abstract. a0, a1, b0, b1 are all 32-bit numbers that hold the most and least significant halves of two signed 64-bit ints:
-- ...
local comp_int64s = function (a0, a1, b0, b1)
local cmpres = 0
-- TODO: Real comparison
return cmpres
end
local l, a0, a1, b0, b1
a0, l = bit.tobit(struct.unpack("I4", ARGV[1]))
a1, l = bit.tobit(struct.unpack("I4", ARGV[1], 5))
b0, l = bit.tobit(struct.unpack("I4", blob))
b1, l = bit.tobit(struct.unpack("I4", blob, 5))
print("Cmp result", comp_int64s(a0, a1, b0, b1))
EDIT: Added code
I came up with a method that looks like it's working. It's a little ugly though.
The first step is to compare the top 32 bits as two's complement numbers.
The MSB sign bit stays, so the numbers keep the correct relations:
-1 -> -1
0 -> 0
9223372036854775807 = 0x7fff ffff ffff ffff -> 0x7fff ffff = 2147483647
So returning the result from the MSBs works unless they are equal; then the LSBs need to be checked.
I have a few cases to establish some patterns:
-1 = 0xffff ffff ffff ffff
-2 = 0xffff ffff ffff fffe
32 bit is:
-1 -> 0xffff ffff = -1
-2 -> 0xffff fffe = -2
-1 > -2 would be like -1 > -2 : GOOD
And
8589934591 = 0x0000 0001 ffff ffff
8589934590 = 0x0000 0001 ffff fffe
32 bit is:
8589934591 -> ffff ffff = -1
8589934590 -> ffff fffe = -2
8589934591 > 8589934590 would be -1 > -2 : GOOD
The sign bit on MSB’s doesn’t matter b/c negative numbers have the same relationship between themselves as positive numbers. e.g regardless of sign bit, lsb values of 0xff > 0xfe, always.
What about if the MSB on the lower 32 bits is different?
0xff7f ffff 7fff ffff = -36,028,799,166,447,617
0xff7f ffff ffff ffff = -36,028,797,018,963,969
32 bit is:
-..799.. -> 0x7fff ffff = 2147483647
-..797.. -> 0xffff ffff = -1
-..799.. < -..797.. would be 2147483647 < -1 : BAD!
So we need to ignore the sign bit on the lower 32 bits. And since the relationships are the same for the LSBs regardless of sign, just using
the lowest 32 bits unsigned works for all cases.
This means I want signed for the MSBs and unsigned for the LSBs, so I am changing I4 to i4 for the MSBs. I am also making big endian official by using '>' in the struct.unpack calls:
-- ...
local comp_int64s = function (as0, au1, bs0, bu1)
if as0 > bs0 then
return 1
elseif as0 < bs0 then
return -1
else
-- msb's equal comparing lsbs - these are unsigned
if au1 > bu1 then
return 1
elseif au1 < bu1 then
return -1
else
return 0
end
end
end
local l, as0, au1, bs0, bu1
-- No bit.tobit here: it would wrap the unsigned low words back into the signed 32-bit range
as0, l = struct.unpack(">i4", ARGV[1])
au1, l = struct.unpack(">I4", ARGV[1], 5)
bs0, l = struct.unpack(">i4", blob)
bu1, l = struct.unpack(">I4", blob, 5)
print("Cmp result", comp_int64s(as0, au1, bs0, bu1))
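The same idea as a minimal C sketch, for checking the reasoning outside Redis. It assumes each number arrives as 8 bytes, big endian, two's complement; the names load_be32 and cmp_int64_be are made up. Instead of casting the upper word to a signed type, it flips the top bit, which maps signed order onto unsigned order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read 4 bytes as a big-endian unsigned 32-bit value. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: returns -1, 0 or 1. a and b each point at 8 bytes
   holding a big-endian two's complement int64. */
int cmp_int64_be(const uint8_t *a, const uint8_t *b)
{
    assert(a != NULL && b != NULL);
    /* Upper words compare as signed: flipping the sign bit lets an
       unsigned comparison give the signed ordering. */
    uint32_t a_hi = load_be32(a) ^ 0x80000000u;
    uint32_t b_hi = load_be32(b) ^ 0x80000000u;
    if (a_hi != b_hi) return a_hi < b_hi ? -1 : 1;

    /* Upper words equal: the lower words compare as plain unsigned. */
    uint32_t a_lo = load_be32(a + 4);
    uint32_t b_lo = load_be32(b + 4);
    if (a_lo != b_lo) return a_lo < b_lo ? -1 : 1;
    return 0;
}
```

It agrees with the worked cases above: -1 > -2, 8589934591 > 8589934590, and 0xff7f ffff 7fff ffff < 0xff7f ffff ffff ffff.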
Equality is a simple string compare, s1 == s2.
Greater-than holds when neither s1 == s2 nor i1 < i2.
Less-than is the real work. string.byte lets you get single bytes as unsigned chars. For unsigned integers you would just check the bytes from the most significant one downwards: b1==b2 -> check next byte; through all bytes -> false (equal); b1>b2 -> false (greater than); b1<b2 -> true. Signed requires one more step: first check the sign bit (uppermost byte >127). If sign 1 is set but not sign 2, integer 1 is negative but integer 2 is not -> true. The opposite obviously results in false. When both signs are equal, you can do the unsigned processing.
If you can pack several bytes into one integer, that works too, but you have to adjust the sign bit check. If you have LuaJIT, you can use the ffi library to cast your string to a byte array and then to an int64.
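The byte-by-byte procedure can be sketched in C as follows. It assumes the 8 bytes are stored most significant byte first; the name int64_be_less is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if a < b, where a and b each point at 8 bytes
   holding a big-endian two's complement int64 (assumed layout). */
bool int64_be_less(const uint8_t *a, const uint8_t *b)
{
    assert(a != NULL && b != NULL);

    /* Sign check: the uppermost byte is above 127 when the sign bit is set. */
    bool a_neg = a[0] > 127;
    bool b_neg = b[0] > 127;
    if (a_neg != b_neg) return a_neg;   /* negative < non-negative */

    /* Same sign: compare as unsigned, most significant byte first. */
    for (size_t i = 0; i < 8; i++) {
        if (a[i] != b[i]) return a[i] < b[i];
    }
    return false;                       /* all bytes equal */
}
```

Equality is then a plain byte comparison, and greater-than is "not equal and not less".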
I need to set some bits in a ByteData at a position counted in bits.
How can I do this?
E.g.
var byteData = new ByteData(1024);
var bitData = new BitData(byteData);
// Offset in bits: 387
// Number of bits: 5
// Value: 3
bitData.setBits(387, 5, 3);
Yes, it is quite complicated. I don't know Dart, but these are the general steps you need to take. I will label each variable with a letter and use a more complicated example to show what happens when the bits spill over into the next byte.
1. Construct the BitData object with a ByteData object (A)
2. Call setBits(offset (B), bits (C), value (D));
I will use example values of:
A: 11111111 11111111 11111111 11111111
B: 7
C: 10
D: 00000000 11111111
3. Rather than using an integer with a fixed length of bits, you could
use another ByteData object (D) containing your bits you want to write.
Also create a mask (E) containing the significant bits.
e.g.
A: 11111111 11111111 11111111 11111111
D: 00000000 11111111
E: 00000011 11111111 (2^C - 1)
4. As an extra bonus step, we can make sure the insignificant
bits are really zero by ANDing with the bitmask.
D = D & E
D 00000000 11111111
E 00000011 11111111
5. Make sure D and E contain at least one full zero byte since we want
to shift them.
D 00000000 00000000 11111111
E 00000000 00000011 11111111
6. Work out these two integer values:
F = The extra bit offset for the start byte: B mod 8 (e.g. 7)
G = The insignificant bits: size(D) - C (e.g. 14)
7. H = G-F which should not be negative here. (e.g. 14-7 = 7)
8. Shift both D and E left by H bits.
D 00000000 01111111 10000000
E 00000001 11111111 10000000
9. Work out first byte number (J) floor(B / 8) e.g. 0
10. Read the value of A at this index out and let this be K
K = 11111111 11111111 11111111
11. AND the current (K) with NOT E to set zeros for the new bits.
Then you can OR the new bits over the top.
L = (K & ~E) | D
K & ~E = 11111110 00000000 01111111
L = 11111110 01111111 11111111
12. Write L to the same place you read it from.
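The steps above can be sketched in C as follows. This is only an illustration under stated assumptions: bits are counted from the most significant bit of byte 0 (as in the example values), at most 16 bits are written per call so the window never exceeds three bytes, and the name set_bits_msb and its parameters are made up. The comments map the variables back to the letters used in the steps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write the low nbits bits of value into buf, starting
   offset bits from the start of the buffer, counting bits MSB-first. */
void set_bits_msb(uint8_t *buf, size_t len, size_t offset,
                  unsigned nbits, uint32_t value)
{
    assert(nbits >= 1 && nbits <= 16);
    assert(offset + nbits <= len * 8);

    size_t   first = offset / 8;                    /* J: first byte index */
    unsigned skip  = (unsigned)(offset % 8);        /* F: bit offset in it */
    unsigned span  = (skip + nbits + 7) / 8;        /* bytes touched, 1..3 */
    unsigned shift = span * 8 - skip - nbits;       /* H: left shift */

    uint32_t mask = ((1u << nbits) - 1u) << shift;  /* E, shifted */
    uint32_t bits = (value << shift) & mask;        /* D, masked and shifted */

    /* K: read the affected bytes, first byte most significant. */
    uint32_t window = 0;
    for (unsigned i = 0; i < span; i++)
        window = (window << 8) | buf[first + i];

    window = (window & ~mask) | bits;               /* L = (K & ~E) | D */

    /* Write L back to the same bytes. */
    for (unsigned i = 0; i < span; i++)
        buf[first + i] = (uint8_t)(window >> (8 * (span - 1 - i)));
}
```

With a buffer of four 0xFF bytes, set_bits_msb(buf, 4, 7, 10, 0xFF) leaves FE 7F FF FF, which is the value of L from step 11 followed by the untouched fourth byte.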
There is no BitData class, so you'll have to do some of the bit-pushing yourself.
Find the corresponding byte offset, read in some bytes, mask out the existing bits and set the new ones at the correct bit offset, then write it back.
The real complexity comes when you need to store more bits than you can read/write in a single operation.
For endianness, if you are treating the memory as a sequence of bits with arbitrary width, I'd go for little-endian. Endianness only really makes sense for full-sized (2^n-bit, n > 3) integers. A 5 bit integer as the one you are storing can't have any endianness, and a 37 bit integer also won't have any natural way of expressing an endianness.
You can try something like this code (which can definitely be optimized more):
import "dart:typed_data";
void setBitData(ByteBuffer buffer, int offset, int length, int value) {
assert(value < (1 << length));
assert(offset + length <= buffer.lengthInBytes * 8);
int byteOffset = offset >> 3;
int bitOffset = offset & 7;
if (length + bitOffset <= 32) {
ByteData data = new ByteData.view(buffer);
// Can update it in one read/modify/write operation.
int mask = ((1 << length) - 1) << bitOffset;
int bits = data.getUint32(byteOffset, Endianness.LITTLE_ENDIAN);
bits = (bits & ~mask) | (value << bitOffset);
data.setUint32(byteOffset, bits, Endianness.LITTLE_ENDIAN);
return;
}
// Split the value into chunks of no more than 32 bits, aligned.
do {
int bits = (length > 32 ? 32 : length) - bitOffset;
setBitData(buffer, offset, bits, value & ((1 << bits) - 1));
offset += bits;
length -= bits;
value >>= bits;
bitOffset = 0;
} while (length > 0);
}
Example use:
main() {
var b = new Uint8List(32);
setBitData(b.buffer, 3, 8, 255);
print(b.map((v)=>v.toRadixString(16)));
setBitData(b.buffer, 13, 6*4, 0xffffff);
print(b.map((v)=>v.toRadixString(16)));
setBitData(b.buffer, 47, 21*4, 0xaaaaaaaaaaaaaaaaaaaaa);
print(b.map((v)=>v.toRadixString(16)));
}