Erlang - Parse data from the enclosed curly braces - erlang

Erlang experts, I am getting data like the following from the ejabberd server:
I(<0.397.0>:mod_http_offline:38) : Data of Fromu {jid,"timok","localhost",
"25636221451404911062246700",
"timok","localhost",
"25636221451404911062246700"}
I am very confused about this data type. All I need is to get timok from inside the curly braces {}, but I am not sure how to get the value. Any code to get the value would be very helpful. Currently I am printing the value using the code below:
?INFO_MSG("Data of Fromu ~p",[_From]),
Thanks once again for your time and effort.

That's an erlang record (it's a tuple, first element an atom, other elements lists/strings/binaries).
Recommended:
Ejabberd has a jid record definition (line 411):
-record(jid, {user = <<"">> :: binary(),
server = <<"">> :: binary(),
resource = <<"">> :: binary(),
luser = <<"">> :: binary(),
lserver = <<"">> :: binary(),
lresource = <<"">> :: binary()}).
It's in the ejabberd/include/jlib.hrl file, so you should be able to make it known to your module by including it this way:
-include_lib("ejabberd/include/jlib.hrl").
Now, in your module to access the (first) "timok" element of your data, you can use the erlang record syntax (assuming JidData contains the data mentioned above):
Out = JidData#jid.user.
Not recommended:
As records are, behind their appearance, tuples, you can also access the nth element of the tuple
Out = element(2,JidData).
Or simply use pattern matching:
{_, Out, _, _, _, _, _} = JidData.
Use Record Definitions
A record is basically syntactic sugar on a tuple. It remains a tuple and can be treated as such. Records are easy to work with, but you should do what you can to avoid treating a record as a tuple, unless you really know what you're doing.
And because in this case you don't even control the record definition, you really should use it, otherwise changes in the definition, following an update, will invalidate your code.
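As a minimal sketch of the record-based approach: the record below is a local stand-in that copies the field order of the jid definition quoted above, so the example compiles without the ejabberd headers. In a real module you would include jlib.hrl instead. The module name jid_demo and the function user_of/1 are made up for illustration.

```erlang
-module(jid_demo).
-export([user_of/1]).

%% Local stand-in for the jid record from ejabberd's jlib.hrl
%% (assumption: same field order as the definition quoted above).
-record(jid, {user, server, resource, luser, lserver, lresource}).

%% Match on the record in the function head. An argument that is not a
%% jid record fails with function_clause instead of returning a wrong field.
user_of(#jid{user = User}) ->
    User.
```

Calling jid_demo:user_of(From) on the data from the question should give back the first "timok", and the code keeps working if fields are later added to or reordered in the record definition, as long as it is recompiled against the new header.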

You seem to be trying to access the second item in the tuple stored in variable _From. This can be accessed simply by using pattern matching:
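{_, Username, _, _, _, _, _} = _From
Note that the tuple has seven elements (the jid tag plus six fields), so the pattern needs seven positions. Since you are using the variable, it should not have a leading underscore, which by convention marks a variable as unused. In your code, change _From to From.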

Related

Erlang syntax in function

I have found this code in Ejabberd:
maybe_post_request([$< | _ ] = Data, Host, ClientIp)
I don't understand what the [$< | _ ] = Data part does with Data. Could somebody explain?
The construct
[$< | _] = Data
applies a pattern match to Data, expecting it to be a list whose first element is the character < and ignoring the rest of the elements. Try it in the Erlang shell:
1> Data = "<foo>".
"<foo>"
2> [$<|_] = Data.
"<foo>"
But if Data doesn't match, we get an exception:
3> f(Data), Data = "foo".
"foo"
4> [$<|_] = Data.
** exception error: no match of right hand side value "foo"
I don't understand what the [$< | _ ] = Data part does with Data. Could
somebody explain?
It binds the variable Data to the entire first argument to the function.
The left-hand side pattern matches the first argument, so this function clause only matches when the first argument is a string (list) starting with the character <. The variable Data is bound to the entire string for use in the function body.
It's a way of having your cake and eating it at the same time. Data refers to the whole thing, while [$<|_] lets you match it and pull it apart. Putting them together with = in a pattern allows you to do both. In a pattern like this it is generally called an alias. It means that both sides must match, and in an argument in a function head (which is where you saw it) the order is irrelevant, so the function head could have been written as
maybe_post_request([$< | _ ] = Data, Host, ClientIp)
or
maybe_post_request(Data = [$< | _ ], Host, ClientIp)
Of course in the function body or in the shell they are not equivalent.
I personally prefer the first alternative as that says matching, pulling apart to me.
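A small sketch of the alias pattern in a function head. The module alias_demo and the function classify/1 are hypothetical names used only for this illustration; they are not part of ejabberd.

```erlang
-module(alias_demo).
-export([classify/1]).

%% The head checks that the list starts with $< and, through the alias,
%% also binds the whole list to Data.
classify([$< | _] = Data) ->
    {xml, Data};
%% Anything else falls through to this clause.
classify(Data) ->
    {other, Data}.
```

With this, classify("<foo>") selects the first clause and classify("foo") the second, so the clause is chosen by the shape of the argument while the body still sees the complete value.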

Simple explanation of Erlang atom

I am learning Erlang and stuck trying to understand the concept of atoms. I know Python: What is a good explanation of these "atoms" in simple terms, or analogously with Python. So far, my understanding is that the type is like a string but without string operations?
Docs say that:
An atom is a literal, a constant with name.
Sometimes you have a couple of options that you would like to choose from. In C, for example, you have enum:
enum Weekday { Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday };
In C, it is really an integer, but you can use it in code as one of the options. Atoms in Erlang are very useful in pattern matching. Let's consider a very simple server:
loop() ->
receive
{request_type_1, Request} ->
handle_request_1(Request),
loop();
{request_type_2, Request} ->
handle_request_2(Request),
loop();
{stop, Reason} ->
{ok, Reason};
_ ->
{error, bad_request}
end.
Your server receives messages that are two-element tuples and uses atoms to differentiate between the types of requests: request_type_1, request_type_2 and stop. This is called pattern matching.
The server also uses atoms as return values. The ok atom means that everything went well. _ matches everything, so if the server receives something unexpected, it quits with the tuple {error, Reason}, where the reason is also an atom, bad_request.
Boolean values true and false are also atoms. You can build logical functions using function clauses like this:
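logical_and(true, true) ->
true;
logical_and(_, _) ->
false.
logical_or(false, false) ->
false;
logical_or(_, _) ->
true.
(The functions are named logical_and and logical_or because and and or are reserved words in Erlang and cannot be used as unquoted function names. The example is a little oversimplified, because you can call logical_or(atom1, atom2) and it will return true, but it is only for illustration.)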
Module names in Erlang are also atoms, so you can bind module name to variable and call it, for example type this in Erlang shell:
io:format("asdf").
Variable = io.
Variable:format("asdf").
You should not use atoms as strings, because they are not garbage collected. If you start creating them dynamically, you can run out of memory. They should only be used when there is a fixed number of options that you type into the code by hand. Of course, you can use the same atom as many times as you want, because it always refers to the same entry in the atom table.
They are better than C enums because the name is still available at runtime. While debugging C code, you would see 1 instead of Tuesday in the debugger. Atoms don't have that disadvantage: you will see tuesday both in your code and in the Erlang shell.
Also, they're often used to tag a tuple, for descriptiveness. For example:
{age, 42}
Rather than just
42
An atom is a literal constant. It has no value other than its own name, but it can be used as a value. Examples are true, false and undefined. If you want to use it as a string, apply atom_to_list(Atom) to get a string (list) to work with. Module names are also atoms.
Take a look at http://www.erlang.org/doc/reference_manual/data_types.html
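One way to respect the advice about not creating atoms dynamically is to convert external input only to atoms that already exist. The sketch below uses the built-in list_to_existing_atom/1; the module atom_demo, the function safe_atom/1 and the unknown_atom error reason are hypothetical names chosen for this example.

```erlang
-module(atom_demo).
-export([safe_atom/1]).

%% Convert a string to an atom only if that atom already exists in the
%% atom table. list_to_existing_atom/1 raises badarg otherwise, so
%% untrusted input cannot grow the atom table.
safe_atom(String) when is_list(String) ->
    try list_to_existing_atom(String) of
        Atom -> {ok, Atom}
    catch
        error:badarg -> {error, unknown_atom}
    end.
```

Whether a given string maps to an existing atom depends on which modules are loaded, so this is a guard against unbounded atom creation rather than a validation of the input itself.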

Erlang matchspecs with tuple comparison

I want to use erlang datetime values in the standard format {{Y,M,D},{H,Min,Sec}} in a MNESIA table for logging purposes and be able to select log entries by comparing with constant start and end time tuples.
It seems that the matchspec guard compiler somehow confuses tuple values with guard sub-expressions. Evaluating ets:match_spec_compile(MatchSpec) fails for
MatchSpec = [
{
{'_','$1','$2'}
,
[
{'==','$2',{1,2}}
]
,
['$_']
}
]
but succeeds when I compare $2 with any non-tuple value.
Is there a restriction that match guards cannot compare tuple values?
I believe the answer is to use double braces when using tuples (see Variables and Literals section of http://www.erlang.org/doc/apps/erts/match_spec.html#id69408). So to use a tuple in a matchspec expression, surround that tuple with braces, as in,
{'==','$2',{{1,2}}}
So, if I understand your example correctly, you would have
22> M=[{{'_','$1','$2'},[{'==','$2',{{1,2}}}],['$_']}].
[{{'_','$1','$2'},[{'==','$2',{{1,2}}}],['$_']}]
23> ets:match_spec_run([{1,1,{1,2}}],ets:match_spec_compile(M)).
[{1,1,{1,2}}]
24> ets:match_spec_run([{1,1,{2,2}}],ets:match_spec_compile(M)).
[]
EDIT: (sorry to edit your answer but this was the easiest way to get my comment in a readable form)
Yes, this is how it must be done. An easier way to get the match-spec is to use the (pseudo) function ets:fun2ms/1 which takes a literal fun as an argument and returns the match-spec. So
10> ets:fun2ms(fun ({A,B,C}=X) when C == {1,2} -> X end).
[{{'$1','$2','$3'},[{'==','$3',{{1,2}}}],['$_']}]
The shell recognises ets:fun2ms/1. For more information see ETS documentation. Mnesia uses the same match-specs as ETS.
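For the original use case of selecting log entries between two datetime tuples, a possible sketch is shown here. It assumes three-element rows of the form {Id, Timestamp, Message} (an assumption, the question does not give the layout) and wraps the bounds in {const, Term}, which match specs accept as a literal term, so the nested datetime tuples do not have to be double-braced at every level. The module log_select and the function between/3 are made-up names.

```erlang
-module(log_select).
-export([between/3]).

%% Return all rows whose second element lies in [Start, End].
%% Assumption: rows look like {Id, {{Y,M,D},{H,Min,Sec}}, Message}.
between(Tab, Start, End) ->
    MatchSpec = [{{'_', '$1', '_'},
                  %% {const, T} marks T as a literal, so the tuple is not
                  %% mistaken for a guard sub-expression.
                  [{'>=', '$1', {const, Start}},
                   {'=<', '$1', {const, End}}],
                  ['$_']}],
    ets:select(Tab, MatchSpec).
```

This works because Erlang compares tuples of the same size element by element, so datetime tuples in this format order chronologically. The sketch targets a plain ETS table; for a Mnesia table the same guards would go into the match spec passed to mnesia:select.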

How to retrieve value from optional parser in Parsec?

Sorry if it's a novice question - I want to parse something defined by
Exp ::= Mandatory_Part Optional_Part0 Optional_Part1
I thought I could do this:
proc::Parser String
proc = do {
;str<-parserMandatoryPart
;str0<-optional(parserOptionalPart0) --(1)
;str1<-optional(parserOptionalPart1) --(2)
;return str++str0++str1
}
I want to get str0/str1 if optional parts are present, otherwise, str0/str1 would be "".
But (1) and (2) won't work since optional() doesn't allow extracting result from its parameters, in this case, parserOptionalPart0/parserOptionalPart1.
Now What would be the proper way to do it?
Many thanks!
Billy R
The function you're looking for is optionMaybe. It returns Nothing if the parser fails without consuming input, and wraps the result in Just if the parser succeeds.
From the docs:
option x p tries to apply parser p. If p fails without consuming input, it returns the value x, otherwise the value returned by p.
So you could do:
proc :: Parser String
proc = do
str <- parserMandatoryPart
str0 <- option "" parserOptionalPart0
str1 <- option "" parserOptionalPart1
return (str++str0++str1)
Watch out for the "without consuming input" part. You may need to wrap either or both optional parsers with try.
I've also adjusted your code style to be more standard, and fixed an error on the last line. return isn't a keyword; it's an ordinary function. So return a ++ b is (return a) ++ b, i.e. almost never what you want.

String splitting problems in Erlang

I've been playing around with the splitting of atoms and have a problem with strings. The input data will always be an atom that consists of some letters and then some numbers, for instance ms444, r64 or min1. Since the function lists:splitwith/2 takes a list the atom is first converted into a list:
24> lists:splitwith(fun (C) -> is_atom(C) end, [m,s,4,4,4]).
{[m,s],[4,4,4]}
25> lists:splitwith(fun (C) -> is_atom(C) end, atom_to_list(ms444)).
{[],"ms444"}
26> atom_to_list(ms444).
"ms444"
I want to separate the letters from the numbers and I've succeeded in doing that when using a list, but since I start out with an atom I get a "string" as result to put into my splitwith function...
Is it interpreting each item in the list as a string or what is going on?
You might want to have a look at the string module documentation:
http://www.erlang.org/doc/man/string.html
The following function might interest you:
tokens(String, SeparatorList) -> Tokens
Since strings in Erlang are just a list() of integer(), the fun tests whether each item is an atom() when it is in fact an integer(), so the test fails on the very first character. If the test is changed to look for lowercase letters, it works:
29> lists:splitwith(fun (C) -> (C >= $a) and (C =< $z) end, atom_to_list(ms444)).
{"ms","444"}
An atom in erlang is a named constant and not a variable (or not like a variable is in an imperative language).
You should really not create atoms in dynamic fashion (that is, don't convert things to atoms at runtime)
They are used more in pattern matching and send/receive code.
Pid ! {matchthis, X}
receive
{foobar,Y} -> doY(Y);
{matchthis,X} -> doX(X);
Other -> doother(Other)
end
A variable such as X can be bound to an atom, for example X = if 1 == 1 -> ok; true -> fail end. I may suffer from poor imagination, but I can't think of a reason why you would want to parse an atom. You should be in charge of the atoms you write and not use list_to_atom(CharIntegerList).
Can you perhaps give more of an overview of what you would like to accomplish?
A "string" in Erlang is not a primitive type: it is just a list() of integer(). So if you want to separate the letters from the digits, you have to compare against the integer representation of the characters.
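Putting the answers together, a minimal sketch of splitting such an atom into its letter prefix and its number could look like this. It assumes the input always has the form described in the question (lowercase letters followed by at least one digit); the module split_demo and the function split/1 are hypothetical names.

```erlang
-module(split_demo).
-export([split/1]).

%% Split an atom such as ms444 into {"ms", 444}.
%% Assumption: lowercase letters followed by one or more digits.
split(Atom) when is_atom(Atom) ->
    %% Characters are integers, so compare against character literals.
    IsLower = fun(C) -> C >= $a andalso C =< $z end,
    {Letters, Digits} = lists:splitwith(IsLower, atom_to_list(Atom)),
    %% list_to_integer/1 raises badarg if Digits is empty or not numeric.
    {Letters, list_to_integer(Digits)}.
```

The letters are returned as a string rather than converted back into an atom, in line with the advice above about not creating atoms at runtime.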

Resources