I'm not sure how to go about solving this.
I have a byte of data that is received that defines which colour LEDs are available in a piece of hardware.
There are 4 LEDs Red, Green, Blue and White.
bit 0 = Red (1 On | 0 Off)
bit 1 = Green (1 On | 0 Off)
bit 2 = Blue (1 On | 0 Off)
bit 3 = White (1 On | 0 Off)
bit 4 = Unused / Future Use
bit 5 = Unused / Future Use
bit 6 = Unused / Future Use
bit 7 = Unused / Future Use
If I get an int value 11 from the hardware the bit mask is: 0000 1011 (Little Endian) so the LEDs in use are White, Green and Red.
If I got an int value 15 then all LEDs are in use, if it is 7 all but white are in use.
What I'm trying to work out is a good way to evaluate the bits that are set and then display which LEDs are available.
What is the best way to evaluate which bits are set and then display it in NSString to the user.
Should I use an enum as below and then try to evaluate that, how would I do the evaluation?
typedef enum {
RedLEDAvailable = 1 << 0,
GreenLEDAvailable = 1 << 1,
BlueLEDAvailable = 1 << 2,
WhiteLEDAvailable = 1 << 3
} LEDStatus;
Thanks in advance.
Yes you have a bit mask for each LED and then you can compare the value you get from the hardware to determine which LEDs are available.
Evaluating bit masks is easy you either use the & or the | operators to "and" or "or" the bits together.
As an example of how this works take the numbers 0001 and 0101 (in binary). If you | them together the computer examines each bit in turn of both numbers sees if EITHER one or both numbers has a 1 in each position and if that is the case then a 1 goes in that position for the result.
If you & them together then a 1 is put in that position only if BOTH bit masks have a 1 there.
Sorry if this is a bit garbled but basically this means that 0001 & 0101 = 0001. And 0001 | 0101 = 0101.
This means that if you want to COMBINE bit masks you can use the | operator and if you want to evaluate them you can use the & operator.
E.g.:
if ((LEDsAvailable & RedLEDAvailable) == RedLEDAvailable) {
// Red light ready
}
else {
// Red light not ready
}
In terms of displaying this as a string to the user you can just create an NSMutableString and append the appropriate messages depending on how your bit mask evaluates.
Related
I'm looking for an SSE Bitwise OR between components of same vector. (Editor's note: this is potentially an X-Y problem, see below for the real comparison logic.)
I am porting some SIMD logic from SPU intrinsics. It has an instruction
spu_orx(a)
Which according to the docs
spu_orx: OR word across d = spu_orx(a) The four word elements of
vector a are logically Ored. The result is returned in word element 0
of vector d. All other elements (1,2,3) of d are assigned a value of
zero.
How can I do that with SSE 2 - 4 involving minimum instruction? _mm_or_ps is what I got here.
UPDATE:
Here is the scenario from SPU based code:
qword res = spu_orx(spu_or(spu_fcgt(x, y), spu_fcgt(z, w)))
So it first ORs two 'greater' comparisons, then ORs its result.
Later couples of those results are ANDed to get final comparison value.
This is effectively doing (A||B||C||D||E||F||G||H) && (I||J||K||L||M||N||O||P) && ... where A..D are the 4x 32-bit elements of the fcgt(x,y) and so on.
Obviously vertical _mm_or_ps of _mm_cmp_ps results is a good way to reduce down to 1 vector, but then what? Shuffle + OR, or something else?
UPDATE 1
Regarding "but then what?"
I perform
qword res = spu_orx(spu_or(spu_fcgt(x, y), spu_fcgt(z, w)))
On SPU it goes like this:
qword aRes = si_and(res, res1);
qword aRes1 = si_and(aRes, res2);
qword aRes2 = si_and(aRes1 , res3);
return si_to_uint(aRes2 );
several times on different inputs,then AND those all into a single result,which is finally cast to integer 0 or 1 (false/true test)
SSE4.1 PTEST bool any_nonzero = !_mm_testz_si128(v,v);
That would be a good way to horizontal OR + booleanize a vector into a 0/1 integer. It will compile to multiple instructions, and ptest same,same is 2 uops on its own. But once you have the result as a scalar integer, scalar AND is even cheaper than any vector instruction, and you can branch on the result directly because it sets integer flags.
#include <immintrin.h>
bool any_nonzero_bit(__m128i v) {
return !_mm_testz_si128(v,v);
}
On Godbolt with gcc9.1 -O3 -march=nehalem:
any_nonzero(long long __vector(2)):
ptest xmm0, xmm0 # 2 uops
setne al # 1 uop with false dep on old value of RAX
ret
This is only 3 uops on Intel for a horizontal OR into a single bit in an integer register. AMD Ryzen ptest is only 1 uop so it's even better.
The only risk here is if gcc or clang creates false dependencies by not xor-zeroing eax before doing a setcc into AL. Usually gcc is pretty fanatical about spending extra uops to break false dependencies so I don't know why it doesn't here. (I did check with -march=skylake and -mtune=generic in case it was relying on Nehalem partial-register renaming for -march=nehalem. Even -march=znver1 didn't get it to xor-zero EAX before the ptest.)
It would be nice if we could avoid the _mm_or_ps and have PTEST do all the work. But even if we consider inverting the comparisons, the vertical-AND / horizontal-OR behaviour doesn't let us check something about all 8 elements of 2 vectors, or about any of those 8 elements.
e.g. Can PTEST be used to test if two registers are both zero or some other condition?
// NOT USEFUL
// 1 if all the vertical pairs AND to zero.
// but 0 if even one vertical AND result is non-zero
_mm_testz_si128( _mm_castps_si128(_mm_cmpngt_ps(x,y)),
_mm_castps_si128(_mm_cmpngt_ps(z,w)));
I mention this only to rule it out and save you the trouble of considering this optimization idea. (#chtz suggested it in comments. Inverting the comparison is a good idea that can be useful for other ways of doing things.)
Without SSE4.1 / delaying the horizontal OR
We might be able to delay horizontal ORing / booleanizing until after combining some results from multiple vectors. This makes combining more expensive (imul or something), but saves 2 uops in the vector -> integer stage vs. PTEST.
x86 has cheap vector mask->integer bitmap with _mm_movemask_ps. Especially if you ultimately want to branch on the result, this might be a good idea. (But x86 doesn't have a || instruction that booleanizes its inputs either so you can't just & the movemask results).
One thing you can do is integer multiply movemask results: x * y is non-zero iff both inputs are non-zero. Unlike x & y which can be false for 0b0101 &0b1010for example. (Our inputs are 4-bit movemask results andunsigned` is 32-bit so we have some room before we overflow). AMD Bulldozer family has an integer multiply that isn't fully pipelined so this could be a bottleneck on old AMD CPUs. Using just 32-bit integers is also good for some low-power CPUs with slow 64-bit multiply.
This might be good if throughput is more of a bottleneck than latency, although movmskps can only run on one port.
I'm not sure if there are any cheaper integer operations that let us recover the logical-AND result later. Adding doesn't work; the result is non-zero even if only one of the inputs was non-zero. Concatenating the bits together (shift+or) is also of course like an OR if we eventually just test for any non-zero bit. We can't just bitwise AND because 2 & 1 == 0, unlike 2 && 1.
Keeping it in the vector domain
Horizontal OR of 4 elements takes multiple steps.
The obvious way is _mm_movehl_ps + OR, then another shuffle+OR. (See Fastest way to do horizontal float vector sum on x86 but replace _mm_add_ps with _mm_or_ps)
But since we don't actually need an exact bitwise-OR when our inputs are compare results, we just care if any element is non-zero. We can and should think of the vectors as integer, and look at integer instructions like 64-bit element ==. One 64-bit element covers/aliases two 32-bit elements.
__m128i cmp = _mm_castps_si128(cmpps_result); // reinterpret: zero instructions
// SSE4.1 pcmpeqq 64-bit integer elements
__m128i cmp64 = _mm_cmpeq_epi64(cmp, _mm_setzero_si128()); // -1 if both elements were zero, otherwise 0
__m128i swap = _mm_shuffle_epi32(cmp64, _MM_SHUFFLE(1,0, 3,2)); // copy and swap, no movdqa instruction needed even without AVX
__m128i bothzero = _mm_and_si128(cmp64, swap); // both halves have the full result
After this logical inversion, ORing together multiple bothzero results will give you the AND of multiple conditions you're looking for.
Alternatively, SSE4.1 _mm_minpos_epu16(cmp64) (phminposuw) will tell us in 1 uop (but 5 cycle latency) if either qword is zero. It will place either 0 or 0xFFFF in the lowest word (16 bits) of the result in this case.
If we inverted the original compares, we could use phminposuw on that (without pcmpeqq) to check if any are zero. So basically a horizontal AND across the whole vector. (Assuming that it's elements of 0 / -1). I think that's a useful result for inverted inputs. (And saves us from using _mm_xor_si128 to flip the bits).
An alternative to pcmpeqq (_mm_cmpeq_epi64) would be SSE2 psadbw against a zeroed vector to get 0 or non-zero results in the bottom of each 64-bit element. It won't be a mask, though, it's 0xFF * 8. Still, it's always that or 0 so you can still AND it. And it doesn't invert.
So i have this question in my homework assignment that i have struggling a bit with. I looked over my lecture content/notes and have been able to utilize those to answer the questions, however, i am not 100% sure that i did everything correctly. There are two parts (part C and D) in the question that i was not able to figure out even after consulting my notes and online sources. I am not looking for a solution for those two parts by any means, but it would be greatly appreciated if i could get, at least, a nudge in the right direction in how i can go about solving it.
I know this is a rather large question, however, i hope someone could possibly check my answers and tell me if all my work and methods of looking at this problem is correct. As always, thank you for any help :)
Alright, so now that we have the formalities out of the way,
--------------------------Here is the Question:--------------------------
Suppose a small direct-mapped cache of blocks with 32 blocks is constructed. Each cache block stores
eight 32-bit words. The main memory—which is byte addressable1—is 16,384 bytes in size. 32-bit words are stored
word aligned in memory, i.e., at an address that is divisible by 4.
(a) How many 32-bit words can the memory store (in decimal)?
(b) How many address bits would be required to address each byte of memory?
(c) What is the range of memory addresses, in hex? That is, what are the addresses of the first and last bytes of
memory? I'll give you a hint: memory addresses are numbered starting at 0.
(d) What would be the address of the last word in memory?
(e) Using the cache mapping scheme discussed in the Chapter 5 lecture notes, how many and which address bits
would be used to form the block offset?
(f) How many and which memory address bits would be used to form the cache index?
(g) How many and which address bits would be used to form the tag field for each cache block?
(h) To which cache block (in decimal) would memory address 0x2A5C map to?
(i) What would be the block offset (in decimal) for 0x2A5C?
(j) How many other main memory words would map to the same block as 0x2A5C?
(k) When the word at 0x2A5C is moved into a cache block, what are the memory addresses (in hex) of the other
words which will also be moved into this block? Express your answer as a range, e.g., [0x0000, 0x0200].
(l) The first word of a main memory block that is mapped to a cache block will always be at an address that is
divisible by __ (in decimal)?
(m) Including the V and tag bits of each cache block, what would be the total size of the cache (in bytes)
(n) what would be the size allocated for the data bits (in bytes)?
----------------------My answers and work-----------------------------------
a) memory = 16384 bytes. 16384 bytes into bits = 131072 bits. 131072/32 = 4096 32-bit words
b) 2^14 (main memory) * 2^2 (4 bits/word) = 2^16. take log(base2)(2^16) = 16 bits
c) couldnt figure this part out (would appreciate some input (NOT A SOLUTION) on how i can go about looking at this problem
d)could not figure this part out either :(
e)8 words in each cache line. 8 * 4(2^2 bits/word) = 32 bits in each cache line. log(base2)(2^5) = 5 bits used for block offset.
f) # of blocks = 2^5 = 32 blocks. log(base2)(2^5) = 5 bits for cache index
g) tag = 16 - 5 - 5 - 2(word alignment) = 4 bits
h) 0x2A5C
0010 10100 10111 00
tag index offset word aligned bits
maps to cache block index = 10100 = 0x14
i) maps to block offset = 10111 = 0x17
j) 4 tag bits, 5 block offset = 2^9 other main memory words
k) it is a permutation of the block offsets. so it maps the memory addresses with the same tag and cache index bits and block offsets of 0x00 0x01 0x02 0x04 0x08 0x10 0x11 0x12 0x14 0x18 0x1C 0x1E 0x1F
l)divisible by 4
m) 2(V+tag+data) = 2(1+4+2^3*2^5) = 522 bits = 65.25 bytes
n)data bits = 2^5 blocks * 2^3 words per block = 256 bits = 32 bytes
Part C:
If a memory has M bytes, and the memory is byte addressable, the the memory addresses range from 0 to M - 1.
For your question, this means that memory addresses range from 0 to 16383, or in hex 0x0 to 0x3FFF.
Part D:
Words are 4 bytes long. So given your answer to C, the last word is at:
(0x3FFFF - 3) -> 0x3FFC.
You can see that this is correct because the lowest 2 bits of the address are 0, which must be true of any 4 byte aligned address.
I'm trying to get my head around the Bit Syntax in Erlang and I'm having some trouble understand how this works:
Red = 10.
Green = 61.
Blue = 20.
Color = << Red:5, Green:6, Blue:5 >> .
I've seen this example in the Software for a concurrent world by Joe Armstrong second edition and this code will
create a 16 bit memory area containing a single RGB triplet.
My question is how can 3 bytes be packed in a 16-bit memory area?. I'm not familiar whatsoever with bit shifting and I wasn't able to find anything relevant to this subject referring to erlang as well. My understand so far is that the segment is made up of 16 parts and that Red occupies 5, green 6 and blue 5 however I'm note sure how this is even possible.
Given that
61 = 0011011000110001
which alone is 16 bits how is this packaging possible?
To start with, 61 is only equal to 00110110 00110001 if you store it as two ASCII digits. When written in binary, 61 is 111101.
Note that the binary representation requires six binary digits, or six "bits" for short. That's what we're taking advantage of in this line:
Color = << Red:5, Green:6, Blue:5 >> .
We're using 5 bits for the red value, 6 bits for the green value, and 5 bits for the blue value, for a total of 16 bits. This works since the red value and the blue value are both less than 32 (since 31 is the largest number that can be represented with 5 bits), and the green value is less than 64 (since 63 is the largest number that can be represented with 6 bits).
The complete value is 01010 111101 10100 (three segments for red, green and blue), or if we split it into two bytes, 01010111 10110100.
I was reading through some code examples and came across a & on Oracle's website on their Bitwise and Bit Shift Operators page. In my opinion it didn't do too well of a job explaining the bitwise &. I understand that it does a operation directly to the bit, but I am just not sure what kind of operation, and I am wondering what that operation is. Here is a sample program I got off of Oracle's website: http://docs.oracle.com/javase/tutorial/displayCode.html?code=http://docs.oracle.com/javase/tutorial/java/nutsandbolts/examples/BitDemo.java
An integer is represented as a sequence of bits in memory. For interaction with humans, the computer has to display it as decimal digits, but all the calculations are carried out as binary. 123 in decimal is stored as 1111011 in memory.
The & operator is a bitwise "And". The result is the bits that are turned on in both numbers. 1001 & 1100 = 1000, since only the first bit is turned on in both.
The | operator is a bitwise "Or". The result is the bits that are turned on in either of the numbers. 1001 | 1100 = 1101, since only the second bit from the right is zero in both.
There are also the ^ and ~ operators, that are bitwise "Xor" and bitwise "Not", respectively. Finally there are the <<, >> and >>> shift operators.
Under the hood, 123 is stored as either 01111011 00000000 00000000 00000000 or 00000000 00000000 00000000 01111011 depending on the system. Using the bitwise operators, which representation is used does not matter, since both representations are treated as the logical number 00000000000000000000000001111011. Stripping away leading zeros leaves 1111011.
It's a binary AND operator. It performs an AND operation that is a part of Boolean Logic which is commonly used on binary numbers in computing.
For example:
0 & 0 = 0
0 & 1 = 0
1 & 0 = 0
1 & 1 = 1
You can also perform this on multiple-bit numbers:
01 & 00 = 00
11 & 00 = 00
11 & 01 = 01
1111 & 0101 = 0101
11111111 & 01101101 = 01101101
...
If you look at two numbers represented in binary, a bitwise & creates a third number that has a 1 in each place that both numbers have a 1. (Everywhere else there are zeros).
Example:
0b10011011 &
0b10100010 =
0b10000010
Note that ones only appear in a place when both arguments have a one in that place.
Bitwise ands are useful when each bit of a number stores a specific piece of information.
You can also use them to delete/extract certain sections of numbers by using masks.
If you expand the two variables according to their hex code, these are:
bitmask : 0000 0000 0000 1111
val: 0010 0010 0010 0010
Now, a simple bitwise AND operation results in the number 0000 0000 0000 0010, which in decimal units is 2. I'm assuming you know about the fundamental Boolean operations and number systems, though.
Its a logical operation on the input values. To understand convert the values into the binary form and where bot bits in position n have a 1 the result has a 1. At the end convert back.
For example with those example values:
0x2222 = 10001000100010
0x000F = 00000000001111
result = 00000000000010 => 0x0002 or just 2
Knowing how Bitwise AND works is not enough. Important part of learning is how we can apply what we have learned. Here is a use case for applying Bitwise AND.
Example:
Adding any even number in binary with 1's binary will result in zeros. Because all the even number has it's last bit(reading left to right) 0 and the only bit 1 has is 1 at the end.
If you were to ask write a function which takes an argument as a number and returns true for even number without using addition, multiplication, division, subtraction, modulo and you cannot convert number to string.
This function is a perfect use case for using Bitwise AND. As I have explained earlier. You ask show me the code? Here is the java code.
/**
* <p> Helper function </p>
* #param number
* #return 0 for even otherwise 1
*/
private int isEven(int number){
return (number & 1);
}
is doing the logical and digit by digit so for example 4 & 1 became
10 & 01 = 1x0,0x1 = 00 = 0
n & 1 is used for checking even numbers since if a number is even the oeration it will aways be 0
import.java.io.*;
import.java.util.*;
public class Test {
public static void main(String[] args) {
int rmv,rmv1;
//this R.M.VIVEK complete bitwise program for java
Scanner vivek=new Scanner();
System.out.println("ENTER THE X value");
rmv = vivek.nextInt();
System.out.println("ENTER THE y value");
rmv1 = vivek.nextInt();
System.out.println("AND table based\t(&)rmv=%d,vivek=%d=%d\n",rmv,rmv1,rmv&rmv1);//11=1,10=0
System.out.println("OR table based\t(&)rmv=%d,vivek=%d=%d\n",rmv,rmv1,rmv|rmv1);//10=1,00=0
System.out.println("xOR table based\t(&)rmv=%d,vivek=%d=%d\n",rmv,rmv1,rmv^rmv1);
System.out.println("LEFT SWITH based to %d>>4=%d\n",rmv<<4);
System.out.println("RIGTH SWITH based to %d>>2=%d\n",rmv>>2);
for(int v=1;v<=10;v++)
System.out.println("LIFT SWITH based to (-NAGATIVE VALUE) -1<<%d=%p\n",i,-1<<1+i);
}
}
This question already has answers here:
What does the ^ operator do to a BOOL?
(7 answers)
Closed 10 years ago.
Sorry to ask such a simple question but these things are hard to Google.
I have code in iOS which is connected to toggle which is switching between Celsius and Fahrenheit and I don't know what ^ 1 means. self.celsius is Boolean
Thanks
self.celsius = self.celsius ^ 1;
It's a C-language operator meaning "Bitwise Exclusive OR".
Wikipedia gives a good explanation:
XOR
A bitwise XOR takes two bit patterns of equal length and performs the
logical exclusive OR operation on each pair of corresponding bits. The
result in each position is 1 if only the first bit is 1 or only the
second bit is 1, but will be 0 if both are 0 or both are 1. In this we
perform the comparison of two bits, being 1 if the two bits are
different, and 0 if they are the same. For example:
0101 (decimal 5)
XOR 0011 (decimal 3)
= 0110 (decimal 6)
The bitwise XOR may be used to invert selected bits in a register
(also called toggle or flip). Any bit may be toggled by XORing it with
1. For example, given the bit pattern 0010 (decimal 2) the second and fourth bits may be toggled by a bitwise XOR with a bit pattern
containing 1 in the second and fourth positions:
0010 (decimal 2)
XOR 1010 (decimal 10)
= 1000 (decimal 8)
It's the bitwise XOR operator (see http://www.techotopia.com/index.php/Objective-C_Operators_and_Expressions#Bitwise_XOR).
What it's doing in this case is switching back and forth, because 0 ^ 1 is 1, and 1 ^ 1 is 0.
It's an exclusive OR operation.