Hidden Features of Erlang [closed] - erlang

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
In the spirit of:
Hidden Features of C#
Hidden Features of Java
Hidden Features of ASP.NET
Hidden Features of Python
Hidden Features of HTML
and other Hidden Features questions
What are the hidden features of Erlang that every Erlang developer should be aware of?
One hidden feature per answer, please.

Inheritance! http://www.erlang.se/euc/07/papers/1700Carlsson.pdf
Parent
-module(parent).
-export([foo/0, bar/0]).
foo() ->
io:format("parent:foo/0 ~n", []).
bar() ->
io:format("parent:bar/0 ~n", []).
Child
-module(child).
-extends(parent).
-export([foo/0]).
foo() ->
io:format("child:foo/0 ~n", []).
Console
23> parent:foo().
parent:foo/0
ok
24> parent:bar().
parent:bar/0
ok
25> child:foo().
child:foo/0
ok
26> child:bar().
parent:bar/0
ok

The magic commands in the shell. The full list is in the manual, but the ones I use most are:
f() - forget all variables
f(X) - forget X
v(42) - recall result from line 42
v(-1) - recall result from previous line
e(-1) - reexecute expression on previous line
rr(foo) - read record definitions from module foo
rr("*/*") - read record definitions from every module in every subdirectory
rp(expression) - print full expression with record formating

Parameterized Modules! From http://www.lshift.net/blog/2008/05/18/late-binding-with-erlang and http://www.erlang.se/euc/07/papers/1700Carlsson.pdf
-module(myclass, [Instvar1, Instvar2]).
-export([getInstvar1/0, getInstvar2/0]).
getInstvar1() -> Instvar1.
getInstvar2() -> Instvar2.
And
Eshell V5.6 (abort with ^G)
1> Handle = myclass:new(123, 234).
{myclass,123,234}
2> Handle:getInstvar1().
123
3> Handle:getInstvar2().
234

user_default.erl - you can build your own shell builtins by having a compiled user_default.beam in your path which can be pretty nifty

beam_lib:chunks can get source code from a beam that was compiled with debug on which can be really usefull
{ok,{_,[{abstract_code,{_,AC}}]}} = beam_lib:chunks(Beam,[abstract_code]).
io:fwrite("~s~n", [erl_prettypr:format(erl_syntax:form_list(AC))]).

Ports, external or linked-in, accept something called io-lists for sending data to them. An io-list is a binary or a (possibly deep) list of binaries or integers in the range 0..255.
This means that rather than concatenating two lists before sending them to a port, one can just send them as two items in a list. So instead of
"foo" ++ "bar"
one do
["foo", "bar"]
In this example it is of course of miniscule difference. But the iolist in itself allows for convenient programming when creating output data. io_lib:format/2,3 itself returns an io list for example.
The function erlang:list_to_binary/1 accepts io lists, but now we have erlang:iolist_to_binary/1 which convey the intention better. There is also an erlang:iolist_size/1.
Best of all, since files and sockets are implemented as ports, you can send iolists to them. No need to flatten or append.

That match specifications can be built using ets:fun2ms(...) where the Erlang fun syntax is used and translated into a match specification with a parse transform.
1> ets:fun2ms(fun({Foo, _, Bar}) when Foo > 0 -> {Foo, Bar} end).
[{{'$1','_','$2'},[{'>','$1',0}],[{{'$1','$2'}}]}]
So no fun-value is ever built, the expression gets replaced with the match-spec at compile-time. The fun may only do things a match expression could do.
Also, ets:fun2ms is available for usage in the shell, so fun-expressions can be tested easily.

.erlang_hosts gives a nice way to share names across machines

Not necessarily "hidden", but I don't see this often. Anonymous functions can have multiple clauses, just like module functions, i.e.
-module(foo).
-compile(export_all).
foo(0) -> "zero";
foo(1) -> "one";
foo(_) -> "many".
anon() ->
fun(0) ->
"zero";
(1) ->
"one";
(_) ->
"many"
end.
1> foo:foo(0).
"zero"
2> foo:foo(1).
"one"
3> foo:foo(2).
"many"
4> (foo:anon())(0).
"zero"
5> (foo:anon())(1).
"one"
6> (foo:anon())(2).
"many"

The gen___tcp and ssl sockets have a {packet, Type} socket option to aid in decoding a number of protocols. The function erlang:decode_packet/3 has a good description on what the various Type values can be and what they do.
Together with a {active, once} or {active, true} setting, each framed value will be delivered as a single message.
Examples: the packet http mode is used heavily for iserve and the packet fcgi mode for ifastcgi. I can imagine that many of the other http servers use packet http as well.

.erlang can preload libraries and run commands on a shells startup, you can also do specific commands for specific nodes by doing a case statement on node name.

If you want to execute more than one expression in a list comprehension, you can use a block. For example:
> [begin erlang:display(N), N*10 end || N <- lists:seq(1,3)].
1
2
3
[10,20,30]

It is possible to define your own iterator for QLC to use. For example, a result set from an SQL query could be made into a QLC table, and thus benefit from the features of QLC queries.
Besides mnesia tables, dets and ets have the table/1,2 functions to return such a "Query Handle" for them.

Not so hidden, but one of the most important aspects, when chosing Erlang as platform for development:
Possibility of enhanced tracing on live nodes (in-service) and being one of the best in debugging!

You can hide an Erlang node by starting it with:
erl -sname foo -hidden
You can still connect to the node, but it won't appear in the list returned by nodes/0.

Matching with the append operator:
"pajamas:" ++ Color = "pajamas:blue"
Color now has the value "blue". Be aware that this trick has it’s limitations - as far as I know it only works with a single variable and a single constant in the order given above.

Hot code loading. From wiki.
Code is loaded and managed as "module" units, the module is a compilation unit. The system can keep two versions of a module in memory at the same time, and processes can concurrently run code from each.
The versions are referred to the "new" and the "old" version. A process will not move into the new version until it makes an external call to its module.

Related

Erlang regexp matching on Chinese characters

TL;DR:
25> re:run("йцу.asd", xmerl_regexp:sh_to_awk("*.*"), [{capture, none}]).
** exception error: bad argument
in function re:run/3
called as re:run([1081,1094,1091,46,97,115,100],
"^(.*\\..*)$",
[{capture,none}])
How to make this work? 'йцу' are characters that don't belong in a latin charset, obviously; is there a way to tell the re module or entire system to run with a different charset for "strings"?
ORIGINAL QUESTION (for the record):
Another "Programming Erlang" question )
in Chapter 16 there's an example about reading tags from the mp3 files. It works, great. But, there seems to be some bug in a provided module, lib_find, which has a function for searching a path for matching files. This is the call that works:
61> lib_find:files("../..", "*.mp3", true).
["../../early/files/Veronique.mp3"]
and this call fails:
62> lib_find:files("../../..", "*.mp3", true).
** exception error: bad argument
in function re:run/3
called as re:run([46,46,47,46,46,47,46,46,47,46,107,101,114,108,47,98,117,
105,108,100,115,47,50,48,46,49,47,111|...],
"^(.*\\.mp3)$",
[{capture,none}])
in call from lib_find:find_files/6 (lib_find.erl, line 29)
in call from lib_find:find_files/6 (lib_find.erl, line 39)
in call from lib_find:files/3 (lib_find.erl, line 17)
Ironically, the investigation led to finding the culprit in Erlang's own installation:
.kerl/builds/20.1/otp_src_20.1/lib/ssh/test/ssh_sftp_SUITE_data/sftp_tar_test_data_高兴
OK, this seems to mean Erlang is using a more restrictive default charset, which doesn't include hànzì. What are the options? Obviously, I can just ignore this and move on with my study, but I feel I can learn more from this one =) Such as - where/how can I fix the default charset? I'm a little surprised it's something other than UTF8 by default - so maybe I'm on a wrong track?
Thanks!
TL;DR:
UTF-8 regexs are accessible by putting the regex pattern into unicode mode with the option unicode. (Note below that the string "^(.*\\..*)$" is the result of your call to xmerl_regexp:sh_to_awk/1.)
1> re:run("なにこれ.txt", "^(.*\\..*)$").
** exception error: bad argument
in function re:run/2
called as re:run([12394,12395,12371,12428,46,116,120,116],"^(.*\\..*)$")
2> re:run("なにこれ.txt", "^(.*\\..*)$", [unicode]).
{match,[{0,16},{0,16}]}
And from your exact example:
11> re:run("йцу.asd", "^(.*\\..*)$", [unicode, {capture, none}]).
match
Or
12> {ok, Pattern} = re:compile("^(.*\\..*)$", [unicode]).
{ok,{re_pattern,1,1,0,
<<69,82,67,80,87,0,0,0,16,8,0,0,65,0,0,0,255,255,255,
255,255,255,...>>}}
13> re:run("йцу.asd", Pattern, [{capture, none}]).
match
The docs for re are pretty long and extensive, but that's because regexs are an inherently complex subject. You can find options for compiled regexs in the docs for re:compile/2 and the options for run in the docs for re:run/3.
Discussion
Erlang has settled on the idea that strings, though still a list of codepoints, are all UTF-8 everywhere. As I work in Japan and deal with this all the time, this has come as a big relief to me because I can stop using about half of the conversion libraries I had needed in the past (yay!), but has complicated matters a bit for users of the string module because many operations there now perform under slightly different assumptions (a string is still considered "flat" even if it is a deep list of grapheme clusters, so long as those clusters exist on the first level of the list).
Unfortunately, encodings are just not very easy things to deal with and UTF-8 is anything but simple once you step out of the most common representations -- so much of this is a work in progress. I can tell you with confidence, though, that dealing with UTF-8 data in binary, string, deep list, and io_data() forms, whether file names, file data, network data, or user input from WX or web forms works as expected once you read the unicode, regex and string docs.
But that is, of course, a lot of stuff to get familiar with. 99% of the time things will work as expected if you decode everything incoming from outside as UTF-8 using unicode:characters_to_list/1 and unicode:characters_to_binary/1, and specify binary strings as utf8 binary types everywhere:
3> UnicodeBin = <<"この文書はUTF-8です。"/utf8>>.
<<227,129,147,227,129,174,230,150,135,230,155,184,227,129,
175,85,84,70,45,56,227,129,167,227,129,153,227,128,130>>
4> UnicodeString = unicode:characters_to_list(UnicodeBin).
[12371,12398,25991,26360,12399,85,84,70,45,56,12391,12377,
12290]
5> io:format("~ts~n", [UnicodeString]).
この文書はUTF-8です。
ok
6> re:run(UnicodeString, "UTF-8", [unicode]).
{match,[{15,5}]}
7> re:run(UnicodeBin, "UTF-8", [unicode]).
{match,[{15,5}]}
8> unicode:characters_to_binary(UnicodeString).
<<227,129,147,227,129,174,230,150,135,230,155,184,227,129,
175,85,84,70,45,56,227,129,167,227,129,153,227,128,130>>
9> unicode:characters_to_binary(UnicodeBin).
<<227,129,147,227,129,174,230,150,135,230,155,184,227,129,
175,85,84,70,45,56,227,129,167,227,129,153,227,128,130>>

How to invoke Erlang function with variable?

4> abs(1).
1
5> X = abs.
abs
6> X(1).
** exception error: bad function abs
7> erlang:X(1).
1
8>
Is there any particular reason why I have to use the module name when I invoke a function with a variable? This isn't going to work for me because, well, for one thing it is just way too much syntactic garbage and makes my eyes bleed. For another thing, I plan on invoking functions out of a list, something like (off the top of my head):
[X(1) || X <- [abs, f1, f2, f3...]].
Attempting to tack on various module names here is going to make the verbosity go through the roof, when the whole point of what I am doing is to reduce verbosity.
EDIT: Look here: http://www.erlangpatterns.org/chain.html The guy has made some pipe-forward function. He is invoking functions the same way I want to above, but his code doesn't work when I try to use it. But from what I know, the guy is an experienced Erlang programmer - I saw him give some keynote or whatever at a conference (well I saw it online).
Did this kind of thing used to work but not anymore? Surely there is a way I can do what I want - invoke these functions without all the verbosity and boilerplate.
EDIT: If I am reading the documentation right, it seems to imply that my example at the top should work (section 8.6) http://erlang.org/doc/reference_manual/expressions.html
I know abs is an atom, not a function. [...] Why does it work when the module name is used?
The documentation explains that (slightly reorganized):
ExprM:ExprF(Expr1,...,ExprN)
each of ExprM and ExprF must be an atom or an expression that
evaluates to an atom. The function is said to be called by using the
fully qualified function name.
ExprF(Expr1,...,ExprN)
ExprF
must be an atom or evaluate to a fun.
If ExprF is an atom the function is said to be called by using the implicitly qualified function name.
When using fully qualified function names, Erlang expects atoms or expression that evaluates to atoms. In other words, you have to bind X to an atom: X = atom. That's exactly what you provide.
But in the second form, Erlang expects either an atom or an expression that evaluates to a function. Notice that last word. In other words, if you do not use fully qualified function name, you have to bind X to a function: X = fun module:function/arity.
In the expression X=abs, abs is not a function but an atom. If you want thus to define a function,you can do so:
D = fun erlang:abs/1.
or so:
X = fun(X)->abs(X) end.
Try:
X = fun(Number) -> abs(Number) end.
Updated:
After looking at the discussion more, it seems like you're wanting to apply multiple functions to some input.
There are two projects that I haven't used personally, but I've starred on Github that may be what you're looking for.
Both of these projects use parse transforms:
fun_chain https://github.com/sasa1977/fun_chain
pipeline https://github.com/stolen/pipeline
Pipeline is unique because it uses a special syntax:
Result = [fun1, mod2:fun2, fun3] (Arg1, Arg2).
Of course, it could also be possible to write your own function to do this using a list of {module, function} tuples and applying the function to the previous output until you get the result.

Simple explanation of Erlang atom

I am learning Erlang and stuck trying to understand the concept of atoms. I know Python: What is a good explanation of these "atoms" in simple terms, or analogously with Python. So far, my understanding is that the type is like a string but without string operations?
Docs say that:
An atom is a literal, a constant with name.
Sometimes you have couple of options, that you would like to choose from. In C for example, you have enum:
enum Weekday { Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday };
In C, it is really an integer, but you can use it in code as one of options. Atoms in Erlang are very useful in pattern matching. Lets consider very simple server:
loop() ->
receive
{request_type_1, Request} ->
handle_request_1(Request),
loop();
{request_type_2, Request} ->
handle_request_2(Request),
loop();
{stop, Reason} ->
{ok, Reason};
_ ->
{error, bad_request}
end.
Your server receives messages, that are two element tuples and uses atoms to differentiate between different types of requests: request_type_1, request_type_2 and stop. It is called pattern matching.
Server also uses atoms as return values. ok atom means, that everything went ok. _ matches everything, so in case, that simple server receives something unexpected, it quits with tuple {error, Reason}, where the reason is also atom bad_request.
Boolean values true and false are also atoms. You can build logical functions using function clauses like this:
and(true, true) ->
true;
and(_, _) ->
false.
or(false, false) ->
false;
or(_, _) ->
true.
(It is a little bit oversimplified, because you can call it like this: or(atom1, atom2) and it will return true, but it is only for illustration.)
Module names in Erlang are also atoms, so you can bind module name to variable and call it, for example type this in Erlang shell:
io:format("asdf").
Variable = io.
Variable:format("asdf").
You should not use atoms as strings, because they are not garbage collected. If you start creating them dynamically, you can run out of memory. They should be only used, when there is fixed amount of options, that you type in code by hand. Of course, you can use the same atom as many times as you want, because it always points to the same point in memory (an atom table).
They are better than C enums, because the value is known at runtime. So while debugging C code, you would see 1 instead of Tuesday in debugger. Using atoms doesn't have that disadvantage, you will see tuesday in your both in your code and Erlang shell.
Also, they're often used to tag a tuple, for descriptiveness. For example:
{age, 42}
Rather than just
42
Atom is a literal constant. Has no value but can be used as a value. Examples are: true, false, undefined. If you want to use it as a string, you need to apply atom_to_list(atom) to get a string (list) to work with. Module names are also atoms.
Take a look at http://www.erlang.org/doc/reference_manual/data_types.html

Why doesn't Haskell's Prelude.read return a Maybe?

Is there a good reason why the type of Prelude.read is
read :: Read a => String -> a
rather than returning a Maybe value?
read :: Read a => String -> Maybe a
Since the string might fail to be parseable Haskell, wouldn't the latter be be more natural?
Or even an Either String a, where Left would contain the original string if it didn't parse, and Right the result if it did?
Edit:
I'm not trying to get others to write a corresponding wrapper for me. Just seeking reassurance that it's safe to do so.
Edit: As of GHC 7.6, readMaybe is available in the Text.Read module in the base package, along with readEither: http://hackage.haskell.org/packages/archive/base/latest/doc/html/Text-Read.html#v:readMaybe
Great question! The type of read itself isn't changing anytime soon because that would break lots of things. However, there should be a maybeRead function.
Why isn't there? The answer is "inertia". There was a discussion in '08 which got derailed by a discussion over "fail."
The good news is that folks were sufficiently convinced to start moving away from fail in the libraries. The bad news is that the proposal got lost in the shuffle. There should be such a function, although one is easy to write (and there are zillions of very similar versions floating around many codebases).
See also this discussion.
Personally, I use the version from the safe package.
Yeah, it would be handy with a read function that returns Maybe. You can make one yourself:
readMaybe :: (Read a) => String -> Maybe a
readMaybe s = case reads s of
[(x, "")] -> Just x
_ -> Nothing
Apart from inertia and/or changing insights, another reason might be that it's aesthetically pleasing to have a function that can act as a kind of inverse of show. That is, you want that read . show is the identity (for types which are an instance of Show and Read) and that show . read is the identity on the range of show (i.e. show . read . show == show)
Having a Maybe in the type of read breaks the symmetry with show :: a -> String.
As #augustss pointed out, you can make your own safe read function. However, his readMaybe isn't completely consistent with read, as it doesn't ignore whitespace at the end of a string. (I made this mistake once, I don't quite remember the context)
Looking at the definition of read in the Haskell 98 report, we can modify it to implement a readMaybe that is perfectly consistent with read, and this is not too inconvenient because all the functions it depends on are defined in the Prelude:
readMaybe :: (Read a) => String -> Maybe a
readMaybe s = case [x | (x,t) <- reads s, ("","") <- lex t] of
[x] -> Just x
_ -> Nothing
This function (called readMaybe) is now in the Haskell prelude! (As of the current base -- 4.6)

Can I disable printing lists of small integers as strings in Erlang shell?

The Erlang shell "guesses" whether a given list is a printable string and prints it that way for convenience. Can this "convenience" be disabled?
I don't know if it's possible to change the default behavior of the shell, but you can at least format your output correctly, using io:format.
Here is an example:
1> io:format("~p~n", [[65, 66, 67]]).
"ABC"
ok
2> io:format("~w~n", [[65, 66, 67]]).
[65,66,67]
ok
And since the shell is only for experimenting / maintenance, io:format() should be at least enough for your real application. Maybe you should also consider to write your own format/print method, e.g. formatPerson() or something like that, which formats everything nicely.
You can disable such behavior with shell:strings/1 function starting with Erlang R16B.
Just remember that this is option global to all node shells, and it might be wise to set it back after finishing playing is shell in longer living nodes.
I tend to do it by prepending an atom to my list in the shell.
for example:
Eshell V5.7.4 (abort with ^G)
1> [65,66,67].
"ABC"
2> [a|[65,66,67]].
[a,65,66,67]
could also be [a,65,66,67], of course. but [a|fun_that_returns_a_list()] will print "the right thing(ish) most of the time"
As of Erlang/OTP R16B, you can use the function shell:strings/1 to turn this on or off. Note that it also affects printing of things that are actually meant to be strings, such as "foo" in the following example:
1> {[8,9,10], "foo"}.
{"\b\t\n","foo"}
2> shell:strings(false).
true
3> {[8,9,10], "foo"}.
{[8,9,10],[102,111,111]}
No, there is no way to disable it. The best alternative I find is to either explicitly print out the value in the query (with io:format) or after the fact do: io:format("~w\n", [v(-1)]).
I don't think you can prevent it.
Prepending an atom seems like a kludge - it does alter your original string.
I typically use lists:flatten(String) to force it to a string - especially the returnvalue of io_lib:format() does not always print as a string. Using lists:flatten() on it makes it one.
I use the following "C-style":
sprintf(Format) ->
sprintf(Format, []).
sprintf(Format, Args) ->
lists:flatten(io_lib:format(Format, Args)).
The problem is that the string is not a type in Erlang. A string is just a list of integers, so there's no way for the shell to distinguish a printable string from a generic list. Don't know if this answer to your question.

Resources