Pattern matching a binary in erlang

Pattern matching a binary in erlang - erlang

I'm trying to pattern match a binary against this
<<_:(A * ?N + A + B)/binary,T:1/binary,_/binary>>
However it seems erlang throws an error saying that variable T is unbound. Just a quick explanation: I want to ignore a certain number of bytes and then read a byte and then ignore the remaining bytes. How can I achieve this?

In bit syntax we can't use runtime expressions as bit size.
We can use only constants, compile time expressions like _:(4*8)/binary and variables: _:Var/binary.
In your case, solution is to bind A * ?N + A + B to variable first.
IgnoredBytes = A * ?N + A + B,
<<_:IgnoredBytes/binary,T:1/binary,_/binary>> = SomeBinary,
T.
It's Better explained in answer from [erlang-questions]

Related

Performant, idiomatic way to concatenate two chars that are not in a list into a string

I've done most of my development in C# and am just learning F#. Here's what I want to do in C#:
string AddChars(char char1, char char2) => char1.ToString() + char2.ToString();
EDIT: added ToString() method to the C# example.
I want to write the same method in F# and I don't know how to do it other than this:
let addChars char1 char2 = Char.ToString(char1) + Char.ToString(char2)
Is there a way to add concatenate these chars into a string without converting both into strings first?
Sidenote:
I also have considered making a char array and converting that into a string, but that seems similarly wasteful.
let addChars (char1:char) (char2: char) = string([|char1; char2|])

As I said in my comment, your C# code is not going to do what you want ( i.e. concatenate the characters into a string). In C#, adding a char and a char will result in an int. The reason for this is because the char type doesn't define a + operator, so C# reverts to the nearest compatable type that does, which just happens to be int. (Source)
So to accomplish this behavior, you will need to do something similar to what you are already trying to do in F#:
char a = 'a';
char b = 'b';
// This is the wrong way to concatenate chars, because the
// chars will be treated as ints and the result will be 195.
Console.WriteLine(a + b);
// These are the correct ways to concatenate characters into
// a single string. The result of all of these will be "ab".
// The third way is the recommended way as it is concise and
// involves creating the fewest temporary objects.
Console.WriteLine(a.ToString() + b.ToString());
Console.WriteLine(Char.ToString(a) + Char.ToString(b));
Console.WriteLine(new String(new[] { a, b }));
(See https://dotnetfiddle.net/aEh1FI)
F# is the same way in that concatenating two or more chars doesn't result in a String. Unlike C#, it results instead in another char, but the process is the same - the char values are treated like int and added together, and the result is the char representation of the sum.
So really, the way to concatenate chars into a String in F# is what you already have, and is the direct translation of the C# equivalent:
let a = 'a'
let b = 'b'
// This is still the wrong way (prints 'Ã')
printfn "%O" (a + b)
// These are still the right ways (prints "ab")
printfn "%O" (a.ToString() + b.ToString())
printfn "%O" (Char.ToString(a) + Char.ToString(b))
printfn "%O" (String [| a;b |]) // This is still the best way
(See https://dotnetfiddle.net/ALwI3V)
The reason the "String from char array" approach is the best way is two-fold. First, it is the most concise, since you can see that that approach offers the shortest line of code in both languages (and the difference only increases as you add more and more chars together). And second, only one temporary object is created (the array) before the final String, whereas the other two methods involve making two separate temporary String objects to feed into the final result.
(Also, I'm not sure if it works this way as the String constructors are hidden in external sources, but I imagine that the array passed into the constructor would be used as the String's backing data, so it wouldn't end up getting wasted at all.) Strings are immutable, but using the passed array directly as the created String's backing data could result in a situation where a reference to the array could be held elsewhere in the program and jeopardize the String's immutability, so this speculation wouldn't fly in practice. (Credit: #CaringDev)
Another option you could do in F# that could be more idiomatic is to use the sprintf function to combine the two characters (Credit: #rmunn):
let a = 'a'
let b = 'b'
let s = sprintf "%c%c" a b
printfn "%O" s
// Prints "ab"
(See https://dotnetfiddle.net/Pp9Tee)
A note of warning about this method, however, is that it is almost certainly going to be much slower than any of the other three methods listed above. That's because instead of processing array or String data directly, sprintf is going to be performing more advanced formatting logic on the output. (I'm not in a position where I could benchmark this myself at the moment, but plugged into #TomasPetricek's benckmarking code below, I wouldn't be surprised if you got performance hits of 10x or more.)
This might not be a big deal as for a single conversion it will still be far faster than any end-user could possibly notice, but be careful if this is going to be used in any performance-critical code.

The answer by #Abion47 already lists all the possible sensible methods I can think of. If you are interested in performance, then you can run a quick experiment using the F# Interactive #time feature:
#time
open System
open System.Text
let a = 'a'
let b = 'b'
Comparing the three methods, the one with String [| a; b |] turns out to be about twice as fast as the methods involving ToString. In practice, that's probably not a big deal unless you are doing millions of such operations (as my experiment does), but it's an interesting fact to know:
// 432ms, 468ms, 472ms
for i in 0 .. 10000000 do
let s = a.ToString() + b.ToString()
ignore s
// 396ms 440ms, 458ms
for i in 0 .. 10000000 do
let s = Char.ToString(a) + Char.ToString(b)
ignore s
// 201ms, 171ms, 170ms
for i in 0 .. 10000000 do
let s = String [| a;b |]
ignore s

Display polynomials in reverse order in SageMath

So I would like to print polynomials in one variable (s) with one parameter (a), say
a·s^3 − s^2 - a^2·s − a + 1.
Sage always displays it with decreasing degree, and I would like to get something like
1 - a - a^2·s - s^2 + a·s^3
to export it to LaTeX. I can't figure out how to do this... Thanks in advance.

As an alternative to string manipulation, one can use the series expansion.
F = a*s^3 - s^2 - a^2*s - a + 1
F.series(s, F.degree(s)+1)
returns
(-a + 1) + (-a^2)*s + (-1)*s^2 + (a)*s^3
which appears to be what you wanted, save for some redundant parentheses.
This works because (a) a power series is ordered from lowest to highest coefficients; (b) making the order of remainder greater than the degree of the polynomial ensures that the series is just the polynomial itself.

This is not easy, because the sort order is defined in Pynac, a fork of Ginac, which Sage uses for its basic symbolic manipulation. However, depending on what you need, it is possible programmatically:
sage: F = 1 + x + x^2
sage: "+".join(map(str,sorted([f for f in F.operands()],key=lambda exp:exp.degree(x))))
'1+x+x^2'
I don't know whether this sort of thing is powerful enough for your needs, though. You may have to traverse the "expression tree" quite a bit but at least your sort of example seems to work.
sage: F = a + a^2*x + x^2 - a*x^2
sage: "+".join(map(str,sorted([f for f in F.operands()],key=lambda exp:exp.degree(x))))
'a+a^2*x+-a*x^2+x^2'
Doing this in a short statement requires a number of Python tricks like this, which are very well worth learning if you are going to use Sage (or Numpy, or pandas, or ...) a fair amount.

Matching a string with a specific end in LPeg

I'm trying to capture a string with a combination of a's and b's but always ending with b. In other words:
local patt = S'ab'^0 * P'b'
matching aaab and bbabb but not aaa or bba. The above however does not match anything. Is this because S'ab'^0 is greedy and matches the final b? I think so and can't think of any alternatives except perhaps resorting to lpeg.Cmt which seems like overkill. But maybe not, anyone know how to match such a pattern? I saw this question but the problem with the solution there is that it would stop at the first end marker (i.e. 'cat' there, 'b' here) and in my case I need to accept the middle 'b's.
P.S.
What I'm actually trying to do is match an expression whose outermost rule is a function call.
E.g.
func();
func(x)(y);
func_arr[z]();
all match but
exp;
func()[1];
4 + 5;
do not. The rest of my grammar works and I'm pretty sure this boils down to the same issue but for completeness, the grammar I'm working with looks something like:
top_expr = V'primary_expr' * V'postfix_op'^0 * V'func_call_op' * P';';
postfix_op = V'func_call_op' + V'index_op';
And similarly the V'postfix_op'^0 eats up the func_call_op I'm expecting at the end.

Yes, there is no backtracking, so you've correctly identified the problem. I think the solution is to list the valid postfix_op expressions; I'd change V'func_call_op' + V'index_op' to V'func_call_op'^0 * V'index_op' and also change the final V'func_call_op' to V'func_call_op'^1 to allow several function calls at the end.
Update: as suggested in the comments, the solution to the a/b problem would be (P'b'^0 * P'a')^0 * P'b'^1.

How about this?
local final = P'b' * P(-1)
local patt = (S'ab' - final)^0 * final
The pattern final is what we need at the end of the string.
The pattern patt matches the set 'ab' unless it is followed by the final sequence. Then it asserts that we have the final sequence. That stops the final 'b' from being eaten.
This doesn't guarantee that we get any a's (but neither would the pattern in the question have).

Sorry my answer comes too late but I think it's worth to give this question a more correct answer.
As I understand it, you just want a non-blind greedy match. But unfortunately the "official documentation" of LPeg only tells us how to use LPeg for blind greedy match (or repetition). But this pattern can be described by a parsing expression grammar. For rule S if you want to match as many E1 as you can followed by E2, you need to write
S <- E1 S / E2
The solution to a/b problem becomes
S <- [ab] S / 'b'
You might want to optimize the rule by inserting some a's in the first option
S <- [ab] 'a'* S / 'b'
which will reduce the recursions a lot. As for your real problem, here's my answser:
top_expr <- primary_expr p_and_f ';'
p_and_f <- postfix_op p_and_f / func_call_op
postfix_op <- func_call_op / index_op

Implementing post / pre increment / decrement when translating to Lua

I'm writing a LSL to Lua translator, and I'm having all sorts of trouble implementing incrementing and decrementing operators. LSL has such things using the usual C like syntax (x++, x--, ++x, --x), but Lua does not. Just to avoid massive amounts of typing, I refer to these sorts of operators as "crements". In the below code, I'll use "..." to represent other parts of the expression.
... x += 1 ...
Wont work, coz Lua only has simple assignment.
... x = x + 1 ...
Wont work coz that's a statement, and Lua can't use statements in expressions. LSL can use crements in expressions.
function preIncrement(x) x = x + 1; return x; end
... preIncrement(x) ...
While it does provide the correct value in the expression, Lua is pass by value for numbers, so the original variable is not changed. If I could get this to actually change the variable, then all is good. Messing with the environment might not be such a good idea, dunno what scope x is. I think I'll investigate that next. The translator could output scope details.
Assuming the above function exists -
... x = preIncrement(x) ...
Wont work for the "it's a statement" reason.
Other solutions start to get really messy.
x = preIncrement(x)
... x ...
Works fine, except when the original LSL code is something like this -
while (doOneThing(x++))
{
doOtherThing(x);
}
Which becomes a whole can of worms. Using tables in the function -
function preIncrement(x) x[1] = x[1] + 1; return x[1]; end
temp = {x}
... preincrement(temp) ...
x = temp[1]
Is even messier, and has the same problems.
Starting to look like I might have to actually analyse the surrounding code instead of just doing simple translations to sort out what the correct way to implement any given crement will be. Anybody got any simple ideas?

I think to really do this properly you're going to have to do some more detailed analysis, and splitting of some expressions into multiple statements, although many can probably be translated pretty straight-forwardly.
Note that at least in C, you can delay post-increments/decrements to the next "sequence point", and put pre-increments/decrements before the previous sequence point; sequence points are only located in a few places: between statements, at "short-circuit operators" (&& and ||), etc. (more info here)
So it's fine to replace x = *y++ + z * f (); with { x = *y + z * f(); y = y + 1; }—the user isn't allowed to assume that y will be incremented before anything else in the statement, only that the value used in *y will be y before it's incremented. Similarly, x = *--y + z * f(); can be replaced with { y = y - 1; x = *y + z * f (); }

Lua is designed to be pretty much impervious to implementations of this sort of thing. It may be done as kind of a compiler/interpreter issue, since the interpreter can know that variables only change when a statement is executed.
There's no way to implement this kind of thing in Lua. Not in the general case. You could do it for global variables by passing a string to the increment function. But obviously it wouldn't work for locals, or for variables that are in a table that is itself global.
Lua doesn't want you to do it; it's best to find a way to work within the restriction. And that means code analysis.

Your proposed solution only will work when your Lua variables are all global. Unless this is something LSL also does, you will get trouble translating LSL programs that use variables called the same way in different places.
Lua is only able of modifying one lvalue per statement - tables being passed to functions are the only exception to this rule. You could use a local table to store all locals, and that would help you out with the pre-...-crements; they can be evaluated before the expression they are contained in is evauated. But the post-...-crements have to be evaluated later on, which is simply not possible in lua - at least not without some ugly code involving anonymous functions.
So you have one options: you must accept that some LSL statements will get translated to several Lua statements.
Say you have a LSL statement with increments like this:
f(integer x) {
integer y = x + x++;
return (y + ++y)
}
You can translate this to a Lua statement like this:
function f(x) {
local post_incremented_x = x + 1 -- extra statement 1 for post increment
local y = x + post_incremented_x
x = post_incremented_x -- extra statement 2 for post increment
local pre_incremented_y = y + 1
return y + pre_incremented_y
y = pre_incremented_y -- this line will never be executed
}
So you basically will have to add two statements per ..-crement used in your statements. For complex structures, that will mean calculating the order in which the expressions are evaluated.
For what is worth, I like with having post decrements and predecrements as individual statements in languages. But I consider it a flaw of the language when they can also be used as expressions. The syntactic sugar quickly becomes semantic diabetes.

After some research and thinking I've come up with an idea that might work.
For globals -
function preIncrement(x)
_G[x] = _G[x] + 1
return _G[x]
end
... preIncrement("x") ...
For locals and function parameters (which are locals to) I know at the time I'm parsing the crement that it is local, I can store four flags to tell me which of the four crements is being used in the variables AST structure. Then when it comes time to output the variables definition, I can output something like this -
local x;
function preIncrement_x() x = x + 1; return x; end
function postDecrement_x() local y = x; x = x - 1; return y; end
... preIncrement_x() ...

In most of your assessment of configurability to the code. You are trying to hard pass data types from one into another. And call it a 'translator'. And in all of this you miss regex and other pattern match capacities. Which are far more present in LUA than LSL. And since the LSL code is being passed into LUA. Try using them, along with other functions. Which would define the work more as a translator, than a hard pass.
Yes I know this was asked a while ago. Though, for other viewers of this topic. Never forget the environments you are working in. EVER. Use what they give you to the best ability you can.

What does 'deferred substitution' mean?

I'm writing a simple parser/interpreter for a language. The instructions keep mentioning 'deferred substitution', as in
Extend the fun language feature described so that functions
can accept a list of zero or more arguments instead of just one. All
arguments to the function must evaluate with the same deferred
substitutions.
I don't need any help with implementing this, I'm just confused about what 'deferred substitution' means. Any thoughts?

Deferred substitution refers to the practice of substituting the values of variables at the latest step possible. By doing so, you are deferring the substitution of it!
Here's an example that might help you understand what it means:
Suppose that you have the following function:
f(x) = 500 + 300 + 2x + 45x
Let's say that x = 1
If you want to defer the substitution of x, you would probably do:
f(x) = 800 + 2x + 45x
f(x) = 800 + 47x
f(1) = 800 + 47(1)
Notice that we have substituted the values of x at the latest step possible, after simplifying everything that is not a variable in this function.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart

Pattern matching a binary in erlang - erlang

Related

Performant, idiomatic way to concatenate two chars that are not in a list into a string

Display polynomials in reverse order in SageMath

Matching a string with a specific end in LPeg

Implementing post / pre increment / decrement when translating to Lua

What does 'deferred substitution' mean?

Categories

Resources