Empty Binary List with pattern matching - erlang

I'm trying to parse a binary file and when it comes to returning numbers packed in little endian into 16 bits, I am hoping that this would work:
foo(Bin, Bits) when is_binary(Bin) ->
<<A, B, C, D, _Rest>> = Bin,
(bar(<<A, B>>, Bits) =/= 0) and (bar(<<C, D>>, Bits) =/= 0).
bar(<<N:16/little-unsigned-integer>>, Bits) ->
binary:at(Bits, N).
Unfortunately, the match fails when Bin is 4 bytes or fewer. Is there a better way to make it so the rest can be empty? Ideally, I would avoid testing the binary's length in the caller.

You could do something like:
foo(<<A:16/little-unsigned-integer,B:16/little-unsigned-integer,_Rest/binary>>, Bits) ->
(binary:at(Bits, A) =/= 0) and (binary:at(Bits, B) =/= 0).
This will not work with a binary which is less than 4 bytes long. What is supposed to happen in that case?
N.B. binary:at/2 works on binaries not bitstrings and the offset is in bytes.
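For readers who want to check the byte layout outside Erlang, the clause above can be modelled in Python. This is only a sketch: raising ValueError on short input is an assumption that stands in for the failed clause match in Erlang, and it is not part of the original answer.

```python
import struct

def foo(data: bytes, bits: bytes) -> bool:
    """Rough Python model of the Erlang clause above."""
    if len(data) < 4:
        # Assumption: short input is an error, like a failed clause match.
        raise ValueError("need at least 4 bytes")
    # "<HH" = two little-endian unsigned 16-bit integers; unpack_from
    # ignores any trailing bytes, which plays the role of _Rest/binary.
    a, b = struct.unpack_from("<HH", data)
    # Indexing bytes is by byte offset, as with binary:at/2.
    return bits[a] != 0 and bits[b] != 0
```

Exactly 4 bytes is accepted here because the trailing part is allowed to be empty, which is what _Rest/binary gives you in the Erlang version.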

Related

Implementation of split_binary function of Erlang

I'm new in the Erlang world. I'm trying to implement the function split_binary. The function takes as input (list, index) and it splits the list in two lists according to the index.
split(Lst, N) when N>=list:lenght(Lst) -> Lst;
split(Lst, N) when N<list:lenght(Lst) -> splitHelper(list:reverse(Lst), 0, N, []).
splitHelper([H|T], X, N, Acc) ->
if
X>=N ->
(list:reverse([H|T]), list:reverse(Acc));
X<N ->
splitHelper(T, X+1, N, [H|Acc])
end.
How can I improve my code?
"I'm new in the Erlang world. I'm trying to implement the function split_binary. The function takes as input (list, index) and it splits the list in two lists according to the index."
According to the Erlang docs for split_binary/2, the two arguments are a binary, which is not a list, and the number of bytes at which you want to split the binary.
First, you need to have a basic understanding of what a binary is. A binary is a sequence of bytes, where each byte is 8 bits representing some integer, e.g.
0010 0001
which is 33. Here is an example of a binary:
<<1, 2, 3>>
When you don't specify a size for each integer, by default each integer will occupy one byte. If you wanted the 2 to occupy two bytes instead, i.e. 0000 0000 0000 0010, which is 16 bits, then you could write:
<<1, 2:16, 3>>
which the shell would display as:
<<1,0,2,3>>
Huh? Where did that 0 come from? The shell displays a binary byte by byte, and the first byte of the integer 0000 0000 0000 0010 is 0000 0000, which is 0.
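The same byte layout can be reproduced in Python as a quick check. This is just an illustration; it relies on big-endian order, which is the default for integer segments in Erlang binaries.

```python
# <<1, 2:16, 3>>: 1 and 3 take one byte each, 2 takes two bytes (big-endian).
packed = bytes([1]) + (2).to_bytes(2, "big") + bytes([3])
print(list(packed))  # [1, 0, 2, 3]
```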
Next, you can step through a binary just like you can for a list, extracting any number of bits at a time from the front of the binary. It so happens that split_binary/2 extracts 8 bits, or 1 byte, at a time from the head of the binary.
There are a couple of tricks to learning how to step through a binary:
For lists, [] means an empty list, and for binaries <<>> means an empty binary.
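For lists you write [Head|Tail] to extract the head of the list, and for binaries you write <<Bits:3, Rest/bitstring>> to extract 3 bits from the front of the binary (the remainder is then no longer a whole number of bytes, so it has to be matched as a bitstring rather than a binary). In your case, you need to extract 8 bits from the front of the binary, so the remainder can be matched with Rest/binary.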
Here is an example of what you can do:
-module(a).
-compile(export_all).
split_b(Bin, N) ->
split_b(Bin, N, _Acc = <<>>).
split_b( Bin, _N = 0, Acc) -> [Acc, Bin];
split_b(<<Bits:8, Rest/binary>>, N, Acc) ->
split_b(Rest, N-1, <<Acc/binary, Bits>>).
In the shell:
40> c(a).
a.erl:2: Warning: export_all flag enabled - all functions will be exported
{ok,a}
41> a:split_b(<<5,6,7>>, 1).
[<<5>>,<<6,7>>]
42> a:split_b(<<5,6,7>>, 2).
[<<5,6>>,<<7>>]
Note that when constructing a binary one of the segments of the binary can be another binary:
23> Bin = <<1, 2, 3>>.
<<1,2,3>>
24> Acc = <<Bin/binary, 4>>.
<<1,2,3,4>>
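As a cross-check of the recursion in split_b, here is a rough Python model of the same idea: move one byte at a time from the front of the input onto an accumulator. It is a sketch only; raising ValueError when the input runs out is an assumption that stands in for the Erlang version having no matching clause.

```python
def split_b(data: bytes, n: int) -> list[bytes]:
    """Move n bytes, one at a time, from the front of data into acc."""
    acc = b""
    while n > 0:
        if not data:
            # Assumption: mirrors the Erlang version, where no clause matches.
            raise ValueError("n is larger than the binary")
        acc += data[:1]   # like <<Acc/binary, Bits>>
        data = data[1:]   # like Rest
        n -= 1
    return [acc, data]

print(split_b(bytes([5, 6, 7]), 1))  # [b'\x05', b'\x06\x07']
```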
If you are actually trying to implement lists:split/2, you can do this:
-module(a).
-compile(export_all).
split_l(N, List) ->
split_l(N, List, _Acc=[]).
split_l(_N=0, List, Acc) ->
[lists:reverse(Acc), List];
split_l(N, [H|T], Acc) ->
split_l(N-1, T, [H|Acc]).
In the shell:
2> c(a).
a.erl:2: Warning: export_all flag enabled - all functions will be exported
{ok,a}
3> a:split_l(1, [10, 20, 30]).
["\n",[20,30]]
4> shell:strings(false).
true
5> a:split_l(1, [10, 20, 30]).
[[10],[20,30]]
6> a:split_l(2, [10, 20, 30]).
[[10,20],[30]]
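The "\n" in the first result is only a display artifact: the shell prints a list of integers as a string when every element is a printable character code, and 10 is the code for newline. Calling shell:strings(false) turns that off. The character code itself can be confirmed in Python:

```python
# 10 is the character code for newline, so the list [10] can be shown as "\n".
print(chr(10) == "\n")  # True
print(list(b"\n"))      # [10]
```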
I think @7stud's answer is the best one, but I wanted to add a few minor details about your code, without actually checking if it works or not…
list:lenght/1 doesn't exist (unless you also created your own list module).
If you created your own list module, you can't use it in guards. Only a restricted set of BIFs and operators is allowed there.
If you're trying to use stdlib's function to check the length of a list, then you should use erlang:length/1 or just length/1.
It's more idiomatic in Erlang to use snake_case (e.g. split_helper) instead of camelCase (e.g. splitHelper) for module names, function names and atoms in general.
You can use guards on separate function clauses instead of writing an if as the sole expression of your function…
split_helper([H|T], X, N, Acc) when X >= N ->
(list:reverse([H|T]), list:reverse(Acc));
split_helper([H|T], X, N, Acc) when X<N ->
split_helper(T, X+1, N, [H|Acc]).
Tuples are denoted with curly braces and not parentheses: {list:reverse([H|T]),…. BTW… This should have prevented your code from compiling at all. The error should've looked like syntax error before: ','
Also, you might have written your own list module, but if not and if you're trying to use stdlib functionality, it's lists:reverse/1 not list:reverse/1.
Finally, beyond that list, I would strongly recommend that you write some simple tests for your code. This article may help you with that.

cryptol: arithmetic with different width

How to perform arithmetic with values of different widths ?
In verilog there is no problem xoring 2 bits with 8 bits but cryptol complains:
cryptol> let test(x: [2],y: [8]) = x ^ y
[error] at <interactive>:1:31--1:32:
Type mismatch:
Expected type: 2
Inferred type: 8
My original problem:
I would like to rotate the bytes in a 64 bit value, with the number of bytes to shift depending on a two bit input. I struggle to get this working:
cryptol> let shift (v, s:[2]) = v >>> (s*16+8)
[error] at <interactive>:1:5--1:38:
Unsolved constraint:
2 >= 5
arising from
use of literal or demoted expression
at <interactive>:1:33--1:35
In the interpreter I can remove the type specification of s and then it works however I need to get that working from a file and with s being really a 2 bit value.
The type of ^ is:
Cryptol> :t (^)
(^) : {a} (Logic a) => a -> a -> a
Note that it requires both arguments to have exactly the same type. You're getting the type error because [2] is not the same type as [8]: they differ in size. Unlike Verilog, Cryptol will not "pad" things implicitly, and I think Cryptol is definitely doing the right thing here. Verilog programmers can chime in with countless bugs they had due to implicit casting.
All such casting in Cryptol has to be explicit.
The typical way to deal with this situation in Cryptol is to use the polymorphic constant zero:
Cryptol> :t zero
zero : {a} (Zero a) => a
The value zero inhabits all types (you can ignore the Zero constraint for now), and as you can imagine is the "right" padding value in this case. So, you'd define your function as:
Cryptol> let test(x:[2], y:[8]) = (zero#x)^y
Cryptol> :t test
test : ([2], [8]) -> [8]
And use it like this:
Cryptol> test (1, 5)
0x04
And if you wanted to pad on the right for some reason, you'd do:
Cryptol> let test(x:[2], y:[8]) = (x#zero)^y
Cryptol> test(1,5)
0x45
This way, everything is explicit and you don't have to know all the magical rules about how things get padded to become the right size.
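The two padding choices can be checked with plain integer arithmetic. The following Python sketch is only a model of the bit layout, not Cryptol; it uses the fact that # puts its first operand in the high-order bits.

```python
x, x_width = 0b01, 2          # x : [2]
y, y_width = 0b00000101, 8    # y : [8]

# zero # x : zeros go in front, so x keeps its numeric value.
left_padded = x
# x # zero : zeros go behind, so x moves up to the top bits.
right_padded = x << (y_width - x_width)

print(hex(left_padded ^ y))   # 0x4
print(hex(right_padded ^ y))  # 0x45
```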
If you want to get real fancy, then you can do:
Cryptol> let test(x, y) = (zero#x)^(zero#y)
Cryptol> :t test
test : {n, m, i, j, a} (Logic a, Zero a, m + i == n + j, fin n,
fin m) =>
([i]a, [j]a) -> [m + i]a
Now, that type looks a bit scary, but essentially it's telling you that you can give it arguments of any size, and it will be valid for any result size, so long as that size is at least as large as the larger of the two you've given. Of course, this inferred type is far more polymorphic than you probably cared for, so you can give it something more readable:
test : {m, n} (fin m, fin n) => [m] -> [n] -> [max m n]
test x y = (zero#x) ^ (zero#y)
I believe this captures your intent perfectly. Note how cryptol will make sure your inputs are finite, and you get the maximum of the two sizes given.
Getting back to your example, Cryptol is telling you that to multiply by 16 you need at least 5 bits, and thus 2>=5 is not satisfiable. This is a bit cryptic, but arises from the use of literals which are polymorphically typed. You can use the zero trick to address the issue in the same way as before:
Cryptol> let shift (v, s:[2]) = v >>> ((zero#s)*16+8)
[warning] at <interactive>:1:32--1:38:
Defaulting type argument 'front' of '(#)' to 3
But note how cryptol is warning you about the type of zero that's used there, since the type of >>> is polymorphic enough to allow different size shifts/rotates:
Cryptol> :t (>>>)
(>>>) : {n, ix, a} (fin n, fin ix) => [n]a -> [ix] -> [n]a
In these cases, Cryptol will pick the smallest possible size to default to by looking at the expressions. Unfortunately, it does the wrong thing here. By picking size 3 for zero, you'll have a 5 bit shift, but your expression can produce the maximum value of 3*16+8=56, which requires at least 6 bits to represent. Note that Cryptol only uses the minimum size required to handle the multiplication there, and does not care about overflows! This is why it's important to pay attention to such warnings.
To be clear: Cryptol did the right thing per the language rules on how type inference works, but it ended up picking a size that is just too small for what you wanted to do.
So, you should write your shift as follows:
Cryptol> let shift (v, s:[2]) = v >>> (((zero:[4])#s)*16+8)
Cryptol> :t shift
shift : {n, a} (fin n) => ([n]a, [2]) -> [n]a
The important thing here is to make sure the expression s*16+8 will fit in the final result. Since s is only 2 bits wide, the largest value will be 56 as discussed above, which needs at least 6 bits to represent. This is why I chose [4] as the size of zero: 4 bits of padding plus the 2 bits of s gives a 6-bit value.
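To see concretely what goes wrong with the 5-bit default, the rotation amount can be computed modulo 2^width for every possible 2-bit s. This Python sketch only models the fixed-width arithmetic; the helper name rotate_amounts is made up for the illustration.

```python
def rotate_amounts(width: int) -> list[int]:
    """s*16+8 for every 2-bit s, computed modulo 2**width."""
    return [(s * 16 + 8) % (1 << width) for s in range(4)]

print(rotate_amounts(5))  # [8, 24, 8, 24]
print(rotate_amounts(6))  # [8, 24, 40, 56]
```

With 5 bits, s = 2 and s = 3 silently wrap around to the same amounts as s = 0 and s = 1. With 6 bits, all four amounts are distinct.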
The moral of the story here is that you should always be explicit about the sizes of your bitvectors, and Cryptol will give you the right framework to express your constraints in a polymorphic way to allow for code reuse without ambiguity, avoiding many of the pitfalls of Verilog and other similar languages.

F#: inefficient sequence processing

I have the following code to perform the Sieve of Eratosthenes in F#:
let sieveOutPrime p numbers =
numbers
|> Seq.filter (fun n -> n % p <> 0)
let primesLessThan n =
let removeFirstPrime = function
| s when Seq.isEmpty s -> None
| s -> Some(Seq.head s, sieveOutPrime (Seq.head s) (Seq.tail s))
let remainingPrimes =
seq {3..2..n}
|> Seq.unfold removeFirstPrime
seq { yield 2; yield! remainingPrimes }
This is excruciatingly slow when the input to primesLessThan is remotely large: primesLessThan 1000000 |> Seq.skip 1000;; takes nearly a minute for me, though primesLessThan 1000000 itself is naturally very fast because it just builds a lazy sequence without evaluating it.
I did some playing around, and I think that the culprit has to be that Seq.tail (in my removeFirstPrime) is doing something intensive. According to the docs, it's generating a completely new sequence, which I could imagine being slow.
If this were Python and the sequence object were a generator, it would be trivial to ensure that nothing expensive happens at this point: just yield from the sequence, and we've cheaply dropped its first element.
LazyList in F# doesn't seem to have the unfold method (or, for that matter, the filter method); otherwise I think LazyList would be the thing I wanted.
How can I make this implementation fast by preventing unnecessary duplications/recomputations? Ideally primesLessThan n |> Seq.skip 1000 would take the same amount of time regardless of how large n was.
Recursive solutions and sequences don't go well together (compare the answers here, it's very much the same pattern you are using). You might want to inspect the generated code, but I'd just consider this a rule of thumb.
LazyList (as defined in FSharpX) does of course come with unfold and filter defined, it would have been quite bizarre if it didn't. Typically in F# code this sort of functionality is provided in separate modules rather than as instance members on the type itself, a convention that does seem to confuse most of those documentation systems.
As you probably know, Seq is a lazily evaluated collection. The sieve algorithm is about filtering out non-primes from a sequence so that you don't have to consider them again.
However, when you combine the sieve with a lazily evaluated collection, you end up filtering the same non-primes over and over again.
You see much better performance if you switch from Seq to Array or List, because those collections are evaluated eagerly, so each non-prime is filtered out only once.
One way to improve performance in your code is to introduce caching.
let removeFirstPrime s =
let s = s |> Seq.cache
match s with
| s when Seq.isEmpty s -> None
| s -> Some(Seq.head s, sieveOutPrime (Seq.head s) (Seq.tail s))
I implemented a LazyList that works a lot like Seq and lets me count the number of evaluations.
For all primes up to 2000:
Without caching: 14753706 evaluations
With caching: 97260 evaluations
Of course if you really need performance you use a mutable array implementation.
PS. Performance metrics
Running 'seq' ...
it took 271 ms with cc (16, 4, 0), the result is: 1013507
Running 'list' ...
it took 14 ms with cc (16, 0, 0), the result is: 1013507
Running 'array' ...
it took 14 ms with cc (10, 0, 0), the result is: 1013507
Running 'mutable' ...
it took 0 ms with cc (0, 0, 0), the result is: 1013507
This is Seq with caching. Seq in F# has rather high overhead; there are interesting lazy alternatives to Seq, like Nessos.
List and Array perform roughly the same, but because of its more compact memory layout the GC metrics are better for Array (10 cc0 collections for Array vs 16 cc0 collections for List). Seq has worse GC metrics in that it forced 4 cc1 collections.
The mutable implementation of the sieve algorithm has better memory and performance metrics by a large margin.
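The answer does not show its mutable implementation. As a rough idea of what such a sieve looks like, here is a minimal sketch in Python rather than F#; the function name primes_up_to and the use of a bytearray are assumptions made for the illustration, not the code that was benchmarked.

```python
def primes_up_to(n: int) -> list[int]:
    """Sieve of Eratosthenes on a mutable byte array."""
    if n < 2:
        return []
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    p = 2
    while p * p <= n:
        if is_prime[p]:
            # Cross out multiples of p in place, starting at p*p.
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = 0
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]

print(primes_up_to(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Each composite is crossed out in place and never re-examined through a chain of lazy filters, which is the property the lazy Seq version lacks.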

Bit manipulation with Erlang's bitwise bnot operator

I'm reading chapter 2 of Hacker's Delight and trying to implement bit manipulation in Erlang.
I'm stuck on this one:
Use the following formula to create a word with 0's at the positions of the trailing 1's in x, and 1's elsewhere, producing all 1's if none (e.g. 10100111 => 11111000):
¬ x | (x + 1)
Here is what I tried:
(bnot X) bor (X + 2#01)
But the result is -1000 (base 2) for some reason, and not 2#11111000.
What's strange is that bnot 2#10100111 is -10101000 (base 2).
Any idea what's going on?
You have to limit the width of the numbers being manipulated. Erlang integers are signed and have arbitrary precision (bignums), so there is no fixed word size for bnot to invert: bnot X is simply -X-1.
The next example uses 8 bits, but it would work the same way with 128 bits; in that case the result would be 340282366920938463463374607431768211448 instead of 248 for your test case.
1> Msk = fun(X) -> X band 2#11111111 end. % limit to 8 bits
#Fun<erl_eval.6.52032458>
2> Op = fun(X) -> Msk(bnot(X)) bor Msk(X+1) end.
#Fun<erl_eval.6.52032458>
3> Op(2#10100111).
248
4> 2#11111000.
248
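Python integers behave the same way as Erlang's here (signed, arbitrary precision), so the effect of the mask is easy to reproduce. The sketch below is an illustration only; the helper name trailing_ones_mask is made up for it.

```python
def trailing_ones_mask(x: int, width: int = 8) -> int:
    """not(x) | (x + 1), truncated to `width` bits."""
    mask = (1 << width) - 1
    return (~x & mask) | ((x + 1) & mask)

x = 0b10100111
print(~x | (x + 1))                              # -8, i.e. -0b1000: no fixed width
print(trailing_ones_mask(x))                     # 248 == 0b11111000
print(trailing_ones_mask(x, 128) == 2**128 - 8)  # True
```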

What does this symbol mean: ∧?

I need to calculate this equation using Delphi programming language
z = (Rot(y ∧ n1 , K2) ∧ K1 ) ⊕ n2
Where:
K1, K2, n1, n2, y are 96-bits binary values
I just want to know what this symbol "∧" means, and how to use it in Delphi.
It might be bitwise AND.
The ⊕ could be exclusive or, which is xor in Delphi.
The tricky bit might be the ROT operation, which rotates the bits of a variable. There is no ROT operator, but there are shl and shr for left and right shifts. See Delphi Expressions.
To make things even harder, you don't have a native 96-bit data type. LongInt is 4 bytes = 32 bits. You will need to use an array if you need to represent the full 96 bits.
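To make the intended arithmetic concrete, here is a sketch in Python rather than Delphi, since Python integers can hold 96 bits directly. It rests on several assumptions that the question does not confirm: that ∧ is bitwise AND, that ⊕ is XOR, and that the rotation is to the left. How K2 maps to a rotation count is not stated, so the count is passed in as a hypothetical rot_amount parameter.

```python
MASK96 = (1 << 96) - 1

def rotl96(value: int, count: int) -> int:
    """Rotate a 96-bit value left by count bits (direction is an assumption)."""
    count %= 96
    value &= MASK96
    return ((value << count) | (value >> (96 - count))) & MASK96

def z(y: int, n1: int, n2: int, k1: int, rot_amount: int) -> int:
    # Assumptions: "∧" is bitwise AND, "⊕" is XOR, and the rotation count
    # derived from K2 is supplied by the caller as rot_amount.
    return (rotl96(y & n1, rot_amount) & k1) ^ n2
```

In Delphi the same rotate would have to be built from shl, shr and or across the elements of an array, carrying the bits that fall off one element into the next.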
