How much value can an 8-bit variable hold? - memory

The answer is probably 256, but I am not satisfied with it.
Suppose a variable has 8 bits. That means its 8th bit can hold the value 256. But it also has seven other bits. Wouldn't the total value be the sum of all bits?
To me, the final value that an 8-bit variable holds would be the sum of all its bits. But it isn't. Why?

The max value 8 bits can hold is 11111111, which is equal to 255. If you have a signed value, the max value it can hold is 127, because the left-most bit is used for the sign.
The binary 10000000 equals 128 (2 ^ 7), not 256. That's where your confusion lies, I think.
00000001 = 2 ^ 0 = 1
00000010 = 2 ^ 1 = 2
00000100 = 2 ^ 2 = 4
00001000 = 2 ^ 3 = 8
00010000 = 2 ^ 4 = 16
00100000 = 2 ^ 5 = 32
01000000 = 2 ^ 6 = 64
10000000 = 2 ^ 7 = 128
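As a quick check of the table above, here is a minimal Python sketch (Python is used only for illustration, and the variable names are made up for this example) that sums the place values of all eight bits:

```python
# Place value of bit i in an unsigned integer is 2**i, with i counted from 0.
place_values = [2 ** i for i in range(8)]   # bit 0 .. bit 7

max_unsigned = sum(place_values)            # value when all eight bits are set
msb_value = place_values[7]                 # value of the eighth bit alone

print(place_values)
print(max_unsigned, bin(max_unsigned))
print(msb_value, bin(msb_value))
```

The eighth bit contributes 128 on its own, and all eight bits together add up to 255.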

The value is indeed the sum of all bits set to 1, but the place value of the eighth bit is 2^7 (128), not 256 as you suggest. The least significant bit is 2^0 (i.e. 1), so for eight bits the MSB is 2^7. You appear to have started from 2^1 (2).
For an unsigned integer:
Bit 0 = 2^0 = 1
Bit 1 = 2^1 = 2
Bit 2 = 2^2 = 4
Bit 3 = 2^3 = 8
Bit 4 = 2^4 = 16
Bit 5 = 2^5 = 32
Bit 6 = 2^6 = 64
Bit 7 = 2^7 = 128
Sum of all ones = 255, not 256 as you suggest: 0 to 255 = 2^8 (256) values.
For a two's complement signed 8 bit type:
Bit 7 = -(2^7) = -128
Sum of all ones = -1,
while if bit 7 = 0 and all other bits are 1, sum = +127,
and all zeros except bit 7 = -128.
(-128 to +127 = 2^8 (256) values).
Either way, an 8 bit integer, signed or otherwise, has 2^8 (256) possible bit patterns.
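The two readings of the same bit pattern can be sketched in Python as follows. This is only an illustration; the helper names `as_unsigned` and `as_signed` are invented here and are not part of any library.

```python
def as_unsigned(bits: str) -> int:
    """Interpret a string of 0/1 characters as an unsigned integer."""
    return int(bits, 2)


def as_signed(bits: str) -> int:
    """Interpret the same string as two's complement.

    The top bit weighs -2**(n-1) instead of +2**(n-1), which is the same
    as subtracting 2**n from the unsigned value when the top bit is set.
    """
    value = int(bits, 2)
    if bits[0] == "1":
        value -= 1 << len(bits)
    return value


for pattern in ("11111111", "01111111", "10000000"):
    print(pattern, as_unsigned(pattern), as_signed(pattern))

# Every one of the 256 patterns maps to a distinct signed value.
signed_values = {as_signed(format(p, "08b")) for p in range(256)}
print(len(signed_values), min(signed_values), max(signed_values))
```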

Related

Displaying the bits values of a number in Wireshark Postdissector

I am writing a Wireshark dissector for a custom protocol using Lua. This custom protocol has no underlying TCP or UDP port, so I have written a postdissector.
I am able to capture the payload from the layers below and convert it into a string.
local io_b = tostring(customprotocol)
After this, io_b has the following data:
io_b = 10:10:10:10:01:0f:00:0d:00:00:00:00:01:00:00:00:00:20:0a:00:00
First I split this string with : as the separator and copy the elements into an array/table.
datafields = {}
index = 1
for value in string.gmatch(io_b, "[^:]+") do
    datafields[index] = value
    index = index + 1
end
Then I read each element of the datafields table as a uint8 value and check whether a bit is set in that element. How can I make sure that each element of the table is a uint8?
function lshift(x, by)
    return x * 2 ^ by
end
--checks if a bit is set at a position
function IsBitSet( b, pos)
    if b ~= nil then
        return tostring(bit32.band(tonumber(b),lshift(1,pos)) ~= 0)
    else
        return "nil"
    end
end
Then I want to display the value of each bit in Wireshark. I don't care about the first four bytes. The script displays each bit of the 5th byte (the first byte I consider) correctly, but displays all the bit values of the 6th byte and the remaining bytes as "nil".
local data_in_2 = subtree:add(customprotocol,"secondbyte")
data_in_2:add(firstbit,(IsBitSet((datafields[6]),7)))
data_in_2:add(secondbit,(IsBitSet((datafields[6]),6)))
data_in_2:add(thirdbit,(IsBitSet((datafields[6]),5)))
data_in_2:add(fourbit,(IsBitSet((datafields[6]),4)))
data_in_2:add(fivebit,(IsBitSet((datafields[6]),3)))
data_in_2:add(sixbit,(IsBitSet((datafields[6]),2)))
data_in_2:add(sevenbit,(IsBitSet((datafields[6]),1)))
data_in_2:add(eightbit,(IsBitSet((datafields[6]),0)))
What am I doing wrong?
Maybe I am wrong, but it seems you can do it more simply with...
io_b = '10:10:10:10:01:0f:00:0d:00:00:00:00:01:00:00:00:00:20:0a:00:00'
-- Now replace all : on the fly with nothing and convert it with @Egor's comment tip
-- Simply by using string method gsub() from within io_b
b_num = tonumber(io_b:gsub('%:', ''), 16)
print(b_num)
-- Output: 537526272
@shakingwindow - I can't comment, so I ask here...
Do you mean...
io_b = '10:10:10:10:01:0f:00:0d:00:00:00:00:01:00:00:00:00:20:0a:00:00'
-- Converting HEX to string - Replacing : with ,
io_hex = io_b:gsub('[%x]+', '"%1"'):gsub(':', ',')
-- Converting string to table
io_hex_tab = load('return {' .. io_hex .. '}')()
-- Put out key/value pairs by converting HEX value string to a number on the fly
for key, value in pairs(io_hex_tab) do
print(key, '=', tonumber(value, 16))
end
...that puts out...
1 = 16
2 = 16
3 = 16
4 = 16
5 = 1
6 = 15
7 = 0
8 = 13
9 = 0
10 = 0
11 = 0
12 = 0
13 = 1
14 = 0
15 = 0
16 = 0
17 = 0
18 = 32
19 = 10
20 = 0
21 = 0
...?
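One thing worth checking in the question's code: the fields are two-digit hexadecimal octets, and in Lua `tonumber("0f")` without a base returns nil, whereas `tonumber("0f", 16)` returns 15. Whether that fully explains the "nil" output is an assumption on my part. The sketch below shows the same parse-then-test-bits idea in Python, purely as an illustration; `is_bit_set` is a name made up for this example.

```python
io_b = "10:10:10:10:01:0f:00:0d:00:00:00:00:01:00:00:00:00:20:0a:00:00"

# Each field is a two-digit hexadecimal octet, so it is parsed with base 16.
# Two hex digits can only encode 0..255, so every element fits in a uint8.
datafields = [int(field, 16) for field in io_b.split(":")]


def is_bit_set(byte: int, pos: int) -> bool:
    """Return True if bit `pos` (0 = least significant) is set in `byte`."""
    return (byte >> pos) & 1 == 1


# Python lists are 0-based, so index 5 is the 6th byte ("0f").
sixth_byte = datafields[5]
bits_msb_first = [is_bit_set(sixth_byte, pos) for pos in range(7, -1, -1)]
print(sixth_byte, bits_msb_first)
```

For the 6th byte (0f) this reports the four high bits as clear and the four low bits as set.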

256 possible values in 8 bits

I am confused by the details section, which says that 1 byte (8 bits) gives us a potential of 2^8 or 256 possible values. (https://en.wikipedia.org/wiki/8-bit_computing)
If I am doing the math correctly:
2^0 = 1
2^1 = 2
2^2 = 4
2^3 = 8
2^4 = 16
2^5 = 32
2^6 = 64
2^7 = 128
Total = 255
The way I see it, there are a total of 255 possible values.
0 is also a value so for 8 bits, the value range is 0-255.
00000000 is the lowest and 11111111 (255) is the highest.
2^x gives you the total number of possible values for x bits. You should be using 2^x to get the number of possible combinations only where x > 0. If x = 0, it points to a no-bit scenario which is irrelevant.
For your case, it is not correct to sum the values from 2^0 to 2^7. That sum gives the largest value 8 bits can hold (255), not the number of values. To count the values, just calculate 2^8, which is 256.
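The difference between "largest value" and "number of values" can be checked by brute force. A small Python sketch (illustrative only; the variable names are invented here):

```python
n_bits = 8

# Enumerate every bit pattern that fits in n_bits, starting from all zeros.
patterns = [format(value, "08b") for value in range(2 ** n_bits)]

# Summing the place values gives the largest value, not the count.
largest_value = sum(2 ** i for i in range(n_bits))

print(len(patterns), patterns[0], patterns[-1])
print(largest_value)
```

The count is one more than the largest value because 00000000 is also a valid pattern.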

Character to bits with SIMD (and substrings)

I am learning SIMD programming little by little, and I've devised a (seemingly) simple problem that I hope I can speed up using SIMD (AVX; at the moment I only have access to AVX CPUs).
I have a long string over an alphabet of 2^k characters (for instance 0, 1, 2, 3), and I'd like to:
generate all substrings of a given length substringlength
convert all the substrings to bits
The substrings are just sequences of characters from the input string:
012301230123012301230123012301233012301301230123123213012301230
substringlength = 6;
string    bits
------ -> -----------------
012301 -> 01 00 11 10 01 00
123012 -> 10 01 00 11 10 01
230123 -> 11 10 01 00 11 10
301230 -> 00 11 10 01 00 11
...
My question is due to my inexperience with SIMD (I've only read "Modern x86 Assembly Language Programming", by Kusswurm):
Is this a task where SIMD could help?
Edit: for simplicity, let's just assume k = 2, and so the ASCII numbers will be just '0'..'3'.
Iteration 1
Reading the comments and playing around, I've come to these realizations. I can convert the ASCII into values and, as suggested, multiply-add adjacent bytes:
// SIMD 128-bit registers, apparently I cannot use AVX ones directly (some operations are AVX2 or AVX-512)
__m128i sse, val, adj, res;
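// k = 2 bits per character, so one byte of each pair is scaled by 1<<2 (4), matching the "mask 1 4 1 4" row in the diagram below
auto mask = _mm_set_epi8(1, 1<<2, 1, 1<<2, 1, 1<<2, 1, 1<<2, 1, 1<<2, 1, 1<<2, 1, 1<<2, 1, 1<<2);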
auto zero = _mm_set_epi8('0', '0', '0', '0', '0', '0', '0', '0',
'0', '0', '0', '0', '0', '0', '0', '0');
// Load ascii values
sse = _mm_loadu_si128((__m128i*) s.data());
// Convert to integer values
val = _mm_sub_epi8(sse, zero);
// Multiply with mask byte by byte (aka SHL second bytes of val) and sum
adj = _mm_maddubs_epi16(val, mask);
An idea of what it does, to people learning like me, is given here (I will need more 128-bits to encode one substring, ascii is in hex):
bytes    15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
ascii    30 31 30 31 30 31 30 31 30 31 30 31 30 31 30 31
_mm_sub_epi8:
value     0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1
_mm_maddubs_epi16:
value     0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1
          *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *
mask      1  4  1  4  1  4  1  4  1  4  1  4  1  4  1  4
           +     +     +     +     +     +     +     +
           |     |     |     |     |     |     |     |
(16-bits)
bits     ....0100 ....0100 ....0100 ....0100 ....0100 ....0100 ....0100 ....0100
In other words the first 4 bits are correct, encoding 2 ascii chars, if I understand correctly what _mm_maddubs_epi16 did to my values, which I am not sure at all!
Now I'd need some sort of "shift-or" of adjacent bytes, something like _mm_maddubs_epi16 that shifts left the first, and ORs with the second argument, producing an 8-bit or 16-bit value:
(16-bits)
bits     ....0100 ....0100 ....0100 ....0100 ....0100 ....0100 ....0100 ....0100
         | shl 4|          | shl 4|          | shl 4|          | shl 4|
         0100.... ....0100 0100.... ....0100 0100.... ....0100 0100.... ....0100
                 OR                OR                OR                OR
         ....01000100      ....01000100      ....01000100      ....01000100
However, I cannot see how _mm_bslli_si128 could help me here, or if there is a smarter way to do this. Maybe even this "horizontal" approach is foolish, and I have to rethink it.
Any hint is welcome!
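Before vectorizing, it may help to have a scalar reference to compare any SIMD output against. The Python sketch below is only such a baseline, not a SIMD solution. It assumes k = 2 and the digits '0'..'3' as in the edit above, and it places the first character of each substring in the lowest bits, which is how the table in the question appears to be laid out. The function names are invented for this example.

```python
def encode_substrings(s: str, length: int, k: int = 2) -> list:
    """Pack every substring of `length` characters into one integer.

    Each character takes k bits; the first character of a substring goes
    into the lowest k bits. Consecutive substrings overlap, so each new
    code is derived from the previous one with a shift and an OR.
    """
    values = [ord(c) - ord("0") for c in s]
    if len(values) < length:
        return []

    # Build the code of the first substring character by character.
    code = 0
    for i in range(length):
        code |= values[i] << (k * i)
    codes = [code]

    # Slide the window: drop the oldest character, append the newest on top.
    top_shift = k * (length - 1)
    for v in values[length:]:
        code = (code >> k) | (v << top_shift)
        codes.append(code)
    return codes


def as_groups(code: int, length: int, k: int = 2) -> str:
    """Format a code as k-bit groups, most significant group first."""
    bits = format(code, "0{}b".format(k * length))
    return " ".join(bits[i:i + k] for i in range(0, len(bits), k))


text = "0123012301"
for start, code in enumerate(encode_substrings(text, 6)):
    print(text[start:start + 6], "->", as_groups(code, 6))
```

The sliding update touches each input character once, so a SIMD version would mainly need to beat one shift and one OR per substring.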

formatting strings in lua in a pattern

I want to make a script that takes any number, counts up to it, and prints the numbers in a format.
For example, this
for i = 1,9 do
print(i)
end
will return
1
2
3
4
5
6
7
8
9
however I want it to print like this
1 2 3
4 5 6
7 8 9
and I want it to work for numbers greater than 9 too, so something like 20 would look like this
1 2 3
4 5 6
7 8 9
10 11 12
13 14 15
16 17 18
19 20
I'm sure it can be done using the string library in Lua, but I am not sure how to use that library.
Any help?
function f(n,per_line)
    per_line = per_line or 3
    for i = 1,n do
        io.write(i,'\t')
        if i % per_line == 0 then io.write('\n') end
    end
end
f(9)
f(20)
The for loop takes an optional third argument, the step (this version assumes the count is a multiple of 3):
for i = 1, 9, 3 do
    print(string.format("%d %d %d", i, i + 1, i + 2))
end
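I can think of 2 ways to do this:
local NUMBER = 20
local str = {}
-- Full rows of three; NUMBER-2 keeps the last full row when NUMBER is a multiple of 3
for i=1,NUMBER-2,3 do
    table.insert(str,i.." "..i+1 .." "..i+2)
end
-- Leftover numbers that do not fill a whole row
local left = {}
for i=NUMBER-NUMBER%3+1,NUMBER do
    table.insert(left,i)
end
str = table.concat(str,"\n").."\n"..table.concat(left," ")
print(str)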
And another one using gsub:
local NUMBER = 20
local str = {}
for i=1,NUMBER do
str[i] = i
end
-- Makes "1 2 3 4 ..."
str = table.concat(str," ")
-- Divides it per 3 numbers
-- "%d+ %d+ %d+" matches 3 numbers divided by spaces
-- (You can replace the spaces (including in concat) with "\t")
-- The (...) capture allows us to get those numbers as %1
-- The "%s?" at the end is to remove any trailing whitespace
-- (Else each line would be "N N N " instead of "N N N")
-- (Using the '?' as the last triplet might not have a space)
-- ^ e.g. NUMBER = 6 would make it end with "4 5 6"
-- The "%1\n" just gets us our numbers back and adds a newline
str = str:gsub("(%d+ %d+ %d+)%s?","%1\n")
print(str)
I've benchmarked both code snippets. The upper one is a tiny bit faster, although the difference is almost nothing:
Benchmarked using 10000 iterations
NUMBER   20       20       20       100       100
Upper    256 ms   276 ms   260 ms   1129 ms   1114 ms
Lower    284 ms   280 ms   282 ms   1266 ms   1228 ms
Use a temporary table to contain the values until you print them:
local temp = {}
local cols = 3
for i = 1,9 do
    if #temp == cols then
        print(table.unpack(temp))
        temp = {}
    end
    temp[#temp + 1] = i
end
--Last minute check for leftovers
if #temp > 0 then
    print(table.unpack(temp))
end
temp = nil
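For comparison only, the same "collect up to three values, then emit a row" idea can be written with list slicing. This sketch is in Python rather than Lua, and the function name `rows` is made up for the example:

```python
def rows(n: int, per_line: int = 3) -> list:
    """Return the numbers 1..n as lines of at most `per_line` numbers."""
    numbers = list(range(1, n + 1))
    return [
        " ".join(str(x) for x in numbers[i:i + per_line])
        for i in range(0, n, per_line)
    ]


print("\n".join(rows(9)))
print("\n".join(rows(20)))
```

Slicing handles the leftover row (19 20) without a separate check, which is the part the temporary-table version covers with its final `if`.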

Negative probability in GMM

I am so confused. I tested the following MATLAB code myself:
feature_train=[1 1 2 1.2 1 1 700 709 708 699 678];
No_of_Clusters = 2;
No_of_Iterations = 10;
[m,v,w]=gaussmix(feature_train,[],No_of_Iterations,No_of_Clusters);
feature_ubm=[1000 1001 1002 1002 1000 1060 70 79 78 99 78 23 32 33 23 22 30];
No_of_Clusters = 3;
No_of_Iterations = 10;
[mubm,vubm,wubm]=gaussmix(feature_ubm,[],No_of_Iterations,No_of_Clusters);
feature_test=[2 2 2.2 3 1 600 650 750 800 658];
[lp_train,rp,kh,kp]=gaussmixp(feature_test,m,v,w);
[lp_ubm,rp,kh,kp]=gaussmixp(feature_test,mubm,vubm,wubm);
However, the result puzzles me, because feature_test should be classified as feature_train, not feature_ubm. As you see below the probability of feature_ubm is more than feature_train!?!
Can anyone explain for me what is the problem ?
Is the problem related to the gaussmixp and gaussmix MATLAB functions?
sum(lp_ubm)
ans =
-3.4108e+06
sum(lp_train)
ans =
-1.8658e+05
As you see below the probability of feature_ubm is more than feature_train!?!
You see exactly the opposite. The lp values are log-probabilities, which is why they are negative. Although the absolute value for ubm is larger, you are comparing negative numbers, and
sum(lp_train) > sum(lp_ubm)
hence
P(test|train) > P(test|ubm)
So your test chunk is correctly classified as train, not as ubm.
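The comparison can be reproduced with a small 1-D mixture log-density written from scratch. This Python sketch does not call gaussmix or gaussmixp. The test vector is taken from the question, but the model parameters are hypothetical values chosen by eye from the training data, so the totals will not match the numbers above. Only the sign and the ordering are the point.

```python
import math


def gmm_log_density(x, means, variances, weights):
    """Log density of a 1-D Gaussian mixture at x (log-sum-exp for stability)."""
    terms = [
        math.log(w) - 0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for m, v, w in zip(means, variances, weights)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


feature_test = [2, 2, 2.2, 3, 1, 600, 650, 750, 800, 658]

# Hypothetical parameters (NOT fitted output): two clusters near the
# training data, three clusters near the UBM data.
train_model = dict(means=[1.2, 698.8], variances=[0.15, 120.0],
                   weights=[6 / 11, 5 / 11])
ubm_model = dict(means=[1010.0, 80.0, 27.0], variances=[500.0, 100.0, 20.0],
                 weights=[6 / 17, 5 / 17, 6 / 17])

total_train = sum(gmm_log_density(x, **train_model) for x in feature_test)
total_ubm = sum(gmm_log_density(x, **ubm_model) for x in feature_test)

# Both totals are negative; the one closer to zero is the better fit.
print(total_train, total_ubm, total_train > total_ubm)
```

With log-probabilities, the larger (less negative) total indicates the more likely model, which is the same reading as sum(lp_train) > sum(lp_ubm) above.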

Resources