httpc + many request in spawn not worked - erlang

I tried to use Erlang's httpc module for high concurrent requests
My code for many requests in spawn hasn't worked:
-module(t).
-compile(export_all).
start() ->
ssl:start(),
inets:start( httpc, [{profile, default}] ),
httpc:set_options([{max_sessions, 200}, {pipeline_timeout, 20000}], default),
{ok, Device} = file:open("c:\urls.txt", read),
read_each_line(Device).
read_each_line(Device) ->
case io:get_line(Device, "") of
eof -> file:close(Device);
Line -> go( string:substr(Line, 1,length(Line)-1)),
read_each_line(Device)
end.
go(Url)->
spawn(t,geturl, [Url] ).
geturl(Url)->
UrlHTTP=lists:concat(["http://www.", Url]),
io:format(UrlHTTP),io:format("~n"),
{ok, RequestId}=httpc:request(get,{UrlHTTP,[{"User-Agent", "Opera/9.80 (Windows NT 6.1; U; ru) Presto/2.8.131 Version/11.10"}]}, [],[{sync, false}]),
receive
{http, {RequestId, {_HttpOk, _ResponseHeaders, Body}}} -> io:format("ok"),ok
end.
httpc:request is not received in html Body - if I can use spawn in
go(Url)->
spawn(t,geturl, [Url] ).
http://erlang.org/doc/man/httpc.html
Note
If possible the client will keep its connections alive and use
persistent connections with or without pipeline depending on
configuration and current circumstances. The HTTP/1.1 specification
does not provide a guideline for how many requests would be ideal to
be sent on a persistent connection, this very much depends on the
application. Note that a very long queue of requests may cause a user
perceived delay as earlier requests may take a long time to complete.
The HTTP/1.1 specification does suggest a limit of 2 persistent
connections per server, which is the default value of the max_sessions
option
urls.txt contain different urls - for example
google.com
amazon.com
alibaba.com
...
What's wrong?

Your code never actually starts the httpc service (and inets, the application that it depends on), and the confusion probably comes from the unfortunate overloading of the inets:start/[0,1,2,3] function:
inets:start/[0,1] starts the inets application itself and the httpc service with the default profile (called default).
inets:start/[2,3] (which should be called start_service) starts one of the services that can run atop inets (viz. ftpc, tftp, httpc, httpd) once the inets application has already started.
start() ->
start(Type) -> ok | {error, Reason}
    Starts the Inets application.
start(Service, ServiceConfig) -> {ok, Pid} | {error, Reason}
start(Service, ServiceConfig, How) -> {ok, Pid} | {error, Reason}
    Dynamically starts an Inets service after
the Inets application has been started
    (with inets:start/[0,1]).
So your spawned process simply crashed when trying to call httpc:request/4 as the service itself was not running. To illustrate, inets:start( httpc, [{profile, default}] ) from your start/0 function would fail to start inets and the httpc service:
Eshell V10.7 (abort with ^G)
1> inets:start(httpc, [{profile, default}]).
{error,inets_not_started}
You should check the returned value of application start to track potential problems:
...
ok = ssl:start(),
ok = inets:start(),
...
Or, if the application could be already started, use a function like this:
...
ok = ensure_start(ssl),
ok = ensure_start(inets),
...
ensure_start(M) ->
case M:start() of
ok -> ok;
{error,{already_started,M}} -> ok;
Other -> Other
end.
[edit 2 - small code enhancement]
I have tested this code and it works on my PC. Note that you are using a '' in the string for file access, this is an escape sequence that make the line to fail.
-module(t).
-compile(export_all).
start() -> start(2000).
% `To` is a parameter which is passed to `getUrl`
% to change the timeout value. You can play with
% it to see the request queue effect, and how
% much the response times of each site varies.
%
% The default timeout value is set to 2 seconds.
start(To) ->
ok = ensure_start(ssl),
ok = ensure_start(inets),
ok = httpc:set_options([{max_sessions, 200}, {pipeline_timeout, 20000}], default),
{ok, Device} = file:open("D:/urls.txt", read),
read_each_line(Device,To).
read_each_line(Device,To) ->
case io:get_line(Device, "") of
eof -> file:close(Device);
Line -> go( string:substr(Line, 1,length(Line)-1),To),
read_each_line(Device,To)
end.
go(Url,To)->
spawn(t,geturl, [Url,To] ).
geturl(Url,To)->
UrlHTTP=lists:concat(["http://www.", Url]),
io:format(UrlHTTP), io:format("~n"),
{ok, RequestId}=httpc:request(get,{UrlHTTP,[{"User-Agent", "Opera/9.80 (Windows NT 6.1; U; ru) Presto/2.8.131 Version/11.10"}]}, [],[{sync, false}]),
M = receive
{http, {RequestId, {_HttpOk, _ResponseHeaders, _Body}}} -> ok
after To ->
not_ok
end,
io:format("httprequest to ~p: ~p~n",[UrlHTTP,M]).
ensure_start(M) ->
case M:start() of
ok -> ok;
{error,{already_started,M}} -> ok;
Other -> Other
end.
and in the console:
1> t:start().
http://www.povray.org
http://www.google.com
http://www.yahoo.com
ok
httprequest to "http://www.google.com": ok
httprequest to "http://www.povray.org": ok
httprequest to "http://www.yahoo.com": ok
2> t:start().
http://www.povray.org
http://www.google.com
http://www.yahoo.com
ok
httprequest to "http://www.google.com": ok
httprequest to "http://www.povray.org": ok
httprequest to "http://www.yahoo.com": ok
3>
Note that thanks to the ensure_start/1 you can launch the application twice.
I have tested also with a bad url and it is detected.
My test include only 3 urls, and I guess that if there are many urls, the time to get the response will increase, because the loop to spawn processes is faster to execute than the request themselves. So you must expect at some point some timeout issue. There may be also some limitation in the http client, I didn't check the doc for this particular point.

Related

Erlang server connecting with ports to send and receive a Json file to a Java application

I have tried to implement a server with Erlang to my Java application.
Seems that my server is working, but still full of bugs and dead points.
I need to receive a JSON file parsed by the Java application into a map and send it back to all clients, including the one that uploaded the file.
Meanwhile, I need to keep track who made the request and which part of the message was sent, in case of any problems the client should be restarted from this point, not from the beginning. Unless the client leaves the application, then it should restart.
My three pieces of code will be below:
The app.erl
-module(erlServer_app).
-behaviour(application).
%% Application callbacks
-export([start/2, stop/1]).
%%%===================================================================
%%% Application callbacks
%%%===================================================================
start(_StartType, _StartArgs) ->
erlServer_sup:start_link().
stop(_State) ->
ok.
The supervisor.erl:
-module(erlServer_sup).
-behaviour(supervisor).
%% API
-export([start_link/0]).
%% Supervisor callbacks
-export([init/1, start_socket/0, terminate_socket/0, empty_listeners/0]).
-define(SERVER, ?MODULE).
%%--------------------------------------------------------------------
%% #doc
%% Starts the supervisor
%%
%% #end
%%--------------------------------------------------------------------
start_link() ->
supervisor:start_link({local, ?SERVER}, ?MODULE, []).
%%%===================================================================
%%% Supervisor callbacks
%%%===================================================================
init([]) -> % restart strategy 'one_for_one': if one goes down only that one is restarted
io:format("starting...~n"),
spawn_link(fun() -> empty_listeners() end),
{ok,
{{one_for_one, 5, 30}, % The flag - 5 restart within 30 seconds
[{erlServer_server, {erlServer_server, init, []}, permanent, 1000, worker, [erlServer_server]}]}}.
%%%===================================================================
%%% Internal functions
%%%===================================================================
start_socket() ->
supervisor:start_child(?MODULE, []).
terminate_socket() ->
supervisor:delete_child(?MODULE, []).
empty_listeners() ->
[start_socket() || _ <- lists:seq(1,20)],
ok.
The server.erl:(I have a lot of debugging io:format.)
-module(erlServer_server).
%% API
-export([init/0, start_server/0]).
%% Defining the port used.
-define(PORT, 8080).
%%%===================================================================
%%% API
%%%===================================================================
init() ->
start_server().
%%%===================================================================
%%% Server callbacks
%%%===================================================================
start_server() ->
io:format("Server started.~n"),
Pid = spawn_link(fun() ->
{ok, ServerSocket} = gen_tcp:listen(?PORT, [binary, {packet, 0},
{reuseaddr, true}, {active, true}]),
io:format("Baba~p", [ServerSocket]),
server_loop(ServerSocket) end),
{ok, Pid}.
server_loop(ServerSocket) ->
io:format("Oba~p", [ServerSocket]),
{ok, Socket} = gen_tcp:accept(ServerSocket),
Pid1 = spawn(fun() -> client() end),
inet:setopts(Socket, [{packet, 0}, binary,
{nodelay, true}, {active, true}]),
gen_tcp:controlling_process(Socket, Pid1),
server_loop(ServerSocket).
%%%===================================================================
%%% Internal functions
%%%===================================================================
client() ->
io:format("Starting client. Enter \'quit\' to exit.~n"),
Client = self(),
{ok, Sock} = gen_tcp:connect("localhost", ?PORT, [{active, false}, {packet, 2}]),
display_prompt(Client),
client_loop(Client, Sock).
%%%===================================================================
%%% Client callbacks
%%%===================================================================
send(Sock, Packet) ->
{ok, Sock, Packet} = gen_tcp:send(Sock, Packet),
io:format("Sent ~n").
recv(Packet) ->
{recv, ok, Packet} = gen_tcp:recv(Packet),
io:format("Received ~n").
display_prompt(Client) ->
spawn(fun () ->
Packet = io:get_line("> "),
Client ! {entered, Packet}
end),
Client ! {requested},
ok.
client_loop(Client, Sock) ->
receive
{entered, "quit\n"} ->
gen_tcp:close(Sock);
{entered, Packet} ->
% When a packet is entered we receive it,
recv(Packet),
display_prompt(Client),
client_loop(Client, Sock);
{requested, Packet} ->
% When a packet is requested we send it,
send(Sock, Packet),
display_prompt(Client),
client_loop(Client, Sock);
{error, timeout} ->
io:format("Send timeout, closing!~n", []),
Client ! {self(),{error_sending, timeout}},
gen_tcp:close(Sock);
{error, OtherSendError} ->
io:format("Some other error on socket (~p), closing", [OtherSendError]),
Client ! {self(),{error_sending, OtherSendError}},
gen_tcp:close(Sock)
end.
This is the first server I'm doing and I got lost probably in the middle. When I run it seems to be working, but hanging. Can someone help me? My localhost never loads anything it keeps loading forever.
How can my java app receive it from the same port?
I MUST use Erlang and I MUST use ports to connect to the java application.
Thanks for helping me!
Let's rework this just a little...
First up: Naming
We don't use camelCase in Erlang. It is confusing because capitalized variable names and lower-case (or single-quoted) atoms mean different things. Also, module names must be the same as file names which causes problems on case-insensitive filesystems.
Also, we really want a better name than "server". Server can mean a lot of things in a system like this, and while the system overall may be a service written in Erlang, it doesn't necessarily mean that we can call everything inside a "server" without getting super ambiguous! That's confusing. I'm going to namespace your project as "ES" for now. So you'll have es_app and es_sup and so on. This will come in handy later when we want to start defining new modules, maybe some of them called "server" without having to write "server_server" all over the place.
Second: Input Data
Generally speaking, we would like to pass arguments to functions instead of burying literals (or worse, macro rewrites) inside of code. If we are going to have magic numbers and constants let's do our best to put them into a configuration file so that we can access them in a programmatic way, or even better, let's use them in the initial startup calls as arguments to subordinate processes so that we can rework the behavior of the system (once written) only by messing around with the startup calling functions in the main application module.
-module(es).
-behaviour(application).
-export([listen/1, ignore/0]).
-export([start/0, start/1]).
-export([start/2, stop/1]).
listen(PortNum) ->
es_client_man:listen(PortNum).
ignore() ->
es_client_man:ignore().
start() ->
ok = application:ensure_started(sasl),
ok = application:start(?MODULE),
io:format("Starting es...").
start(Port) ->
ok = start(),
ok = es_client_man:listen(Port),
io:format("Startup complete, listening on ~w~n", [Port]).
start(normal, _Args) ->
es_sup:start_link().
stop(_State) ->
ok.
I added a start/1 function above as well as a start/0, a listen/1 and an ignore/0 that you'll see again later in es_client_man. These are mostly convenience wrappers around things you could call more explicitly, but probably won't want to type all the time.
This application module kicks things off by having an application master start the project for us (by calling application:start/1) and then the next line calls erl_server_server to tell it to start listening. In early development I find this approach much more useful than burying autostarts to every component all over the place, and later on it gives us a very simple way to write an external interface that can turn various components on and off.
Ah, also... we're going to start this as a for-real Erlang application, so we'll need an app file in ebin/ (or if you're using erlang.mk or something similar an app.src file in src/):
ebin/es.app looks like this:
{application,es,
[{description,"An Erlang Server example project"},
{vsn,"0.1.0"},
{applications,[stdlib,kernel,sasl]},
{modules,[es,
es_sup,
es_clients,
es_client_sup,
es_client,
es_client_man]},
{mod,{es,[]}}]}.
The list under modules reflects the layout of the supervision tree, actually, as you will see below.
The start/2 function above now asserts that we will only start in normal mode (which may or may not be appropriate), and disregards startup args (which also may or may not be appropriate).
Third: The Supervision Tree
-module(es_sup).
-behaviour(supervisor).
-export([start_link/0]).
-export([init/1]).
start_link() ->
supervisor:start_link({local, ?MODULE}, ?MODULE, []).
init([]) ->
RestartStrategy = {one_for_one, 1, 60},
Clients = {es_clients,
{es_clients, start_link, []},
permanent,
5000,
supervisor,
[es_clients]},
Children = [Clients],
{ok, {RestartStrategy, Children}}.
And then...
-module(es_clients).
-behavior(supervisor).
-export([start_link/0]).
-export([init/1]).
start_link() ->
supervisor:start_link({local, ?MODULE}, ?MODULE, none).
init(none) ->
RestartStrategy = {rest_for_one, 1, 60},
ClientSup = {es_client_sup,
{es_client_sup, start_link, []},
permanent,
5000,
supervisor,
[es_client_sup]},
ClientMan = {es_client_man,
{es_client_man, start_link, []},
permanent,
5000,
worker,
[es_client_man]},
Children = [ClientSup, ClientMan],
{ok, {RestartStrategy, Children}}.
Whoa! What is going on here?!? Well, the es_sup is a supervisor, not a one-off thingy that just spawns other one-off thingies. (Misunderstanding supervisors is part of your core problem.)
Supervisors are boring. Supervisors should be boring. All they really do as a reader of code is what the structure of the supervision tree is inside. What they do for us in terms of OTP structure is extremely important, but they don't require that we write any procedural code, just declare what children it should have. What we are implementing here is called a service -> worker structure. So we have the top-level supervisor for your overall application called es_sup. Below that we have (for now) a single service component called es_clients.
The es_clients process is also a supervisor. The reason for this is to define a clear way for the client connection parts to not affect whatever ongoing state may exist in the rest of your system later. Just accepting connections from clients is useless -- surely there is some state that matters elsewhere, like long-running connections to some Java node or whatever. That would be a separate service component maybe called es_java_nodes and that part of the program would begin with its own, separate supervisor. That's why it is called a "supervision tree" instead of a "supervision list".
So back to clients... We will have clients connecting. That is why we call them "clients", because from the perspective of this Erlang system the things connecting are clients, and the processes that accept those connections abstract the clients so we can treat each connection handler as a client itself -- because that is exactly what it represents. If we connect to upstream services later, we would want to call those whatever they are abstracting so that our semantics within the system is sane.
You can then think in terms of "an es_client sent a message to an es_java_node to query [thingy]" instead of trying to keep things straight like "a server_server asked a java_server_client to server_server the service_server" (which is literally how stupid things get if you don't keep your naming principles straight in terms of inner-system perspective).
Blah blah blah...
So, here is the es_client_sup:
-module(es_client_sup).
-behaviour(supervisor).
-export([start_acceptor/1]).
-export([start_link/0]).
-export([init/1]).
start_acceptor(ListenSocket) ->
supervisor:start_child(?MODULE, [ListenSocket]).
start_link() ->
supervisor:start_link({local, ?MODULE}, ?MODULE, none).
init(none) ->
RestartStrategy = {simple_one_for_one, 1, 60},
Client = {es_client,
{es_client, start_link, []},
temporary,
brutal_kill,
worker,
[es_client]},
{ok, {RestartStrategy, [Client]}}.
Are you picking up a pattern? I wasn't kidding when I said that "supervisors should be boring..." :-) Note that here we are actually passing an argument in and we have defined an interface function. That is so we have a logicalish place to call if we need a socket acceptor to start.
Fourth: The Client Service Itself
Let's look at the client manager:
-module(es_client_man).
-behavior(gen_server).
-export([listen/1, ignore/0]).
-export([start_link/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
code_change/3, terminate/2]).
-record(s, {port_num = none :: none | inet:port_number(),
listener = none :: none | gen_tcp:socket()}).
listen(PortNum) ->
gen_server:call(?MODULE, {listen, PortNum}).
ignore() ->
gen_server:cast(?MODULE, ignore).
start_link() ->
gen_server:start_link({local, ?MODULE}, ?MODULE, none, []).
init(none) ->
ok = io:format("Starting.~n"),
State = #s{},
{ok, State}.
handle_call({listen, PortNum}, _, State) ->
{Response, NewState} = do_listen(PortNum, State),
{reply, Response, NewState};
handle_call(Unexpected, From, State) ->
ok = io:format("~p Unexpected call from ~tp: ~tp~n", [self(), From, Unexpected]),
{noreply, State}.
handle_cast(ignore, State) ->
NewState = do_ignore(State),
{noreply, NewState};
handle_cast(Unexpected, State) ->
ok = io:format("~p Unexpected cast: ~tp~n", [self(), Unexpected]),
{noreply, State}.
handle_info(Unexpected, State) ->
ok = io:format("~p Unexpected info: ~tp~n", [self(), Unexpected]),
{noreply, State}.
code_change(_, State, _) ->
{ok, State}.
terminate(_, _) ->
ok.
do_listen(PortNum, State = #s{port_num = none}) ->
SocketOptions =
[{active, once},
{mode, binary},
{keepalive, true},
{reuseaddr, true}],
{ok, Listener} = gen_tcp:listen(PortNum, SocketOptions),
{ok, _} = es_client:start(Listener),
{ok, State#s{port_num = PortNum, listener = Listener}};
do_listen(_, State = #s{port_num = PortNum}) ->
ok = io:format("~p Already listening on ~p~n", [self(), PortNum]),
{{error, {listening, PortNum}}, State}.
do_ignore(State = #s{listener = none}) ->
State;
do_ignore(State = #s{listener = Listener}) ->
ok = gen_tcp:close(Listener),
State#s{listener = none}.
Hmmm, what is all that about? The basic idea here is that we have a service supervisor over the whole concept of clients (es_clients, as discussed above), and then we have the simple_one_for_one to handle whatever clients happen to be alive just now (es_client_sup), and here we have the management interface to the subsystem. All this manager does is keep track of what port we are listening on, and owns the socket that we are listening to if one is open at the moment. Note that this can be easily rewritten to allow any arbitrary number of ports to be listened to simultaneously, or track all living clients, or whatever. There really is no limit to what you might want to do.
So how do we start clients that can accept connections? By telling them to spawn and listen to the listen socket that we pass in as an argument. Go take another look at es_client_sup above. We are passing in an empty list as its first argument. What will happen when we call its start_link function is that whatever else we pass in as a list will get added to the list of arguments overall. In this case we will pass in the listen socket and so it will be started with the argument [ListenSocket].
Whenever a client listener accepts a connection, its next step will be to spawn its successor, handing it the original ListenSocket argument. Ah, the miracle of life.
-module(es_client).
-export([start/1]).
-export([start_link/1, init/2]).
-export([system_continue/3, system_terminate/4,
system_get_state/1, system_replace_state/2]).
-record(s, {socket = none :: none | gen_tcp:socket()}).
start(ListenSocket) ->
es_client_sup:start_acceptor(ListenSocket).
start_link(ListenSocket) ->
proc_lib:start_link(?MODULE, init, [self(), ListenSocket]).
init(Parent, ListenSocket) ->
ok = io:format("~p Listening.~n", [self()]),
Debug = sys:debug_options([]),
ok = proc_lib:init_ack(Parent, {ok, self()}),
listen(Parent, Debug, ListenSocket).
listen(Parent, Debug, ListenSocket) ->
case gen_tcp:accept(ListenSocket) of
{ok, Socket} ->
{ok, _} = start(ListenSocket),
{ok, Peer} = inet:peername(Socket),
ok = io:format("~p Connection accepted from: ~p~n", [self(), Peer]),
State = #s{socket = Socket},
loop(Parent, Debug, State);
{error, closed} ->
ok = io:format("~p Retiring: Listen socket closed.~n", [self()]),
exit(normal)
end.
loop(Parent, Debug, State = #s{socket = Socket}) ->
ok = inet:setopts(Socket, [{active, once}]),
receive
{tcp, Socket, <<"bye\r\n">>} ->
ok = io:format("~p Client saying goodbye. Bye!~n", [self()]),
ok = gen_tcp:send(Socket, "Bye!\r\n"),
ok = gen_tcp:shutdown(Socket, read_write),
exit(normal);
{tcp, Socket, Message} ->
ok = io:format("~p received: ~tp~n", [self(), Message]),
ok = gen_tcp:send(Socket, ["You sent: ", Message]),
loop(Parent, Debug, State);
{tcp_closed, Socket} ->
ok = io:format("~p Socket closed, retiring.~n", [self()]),
exit(normal);
{system, From, Request} ->
sys:handle_system_msg(Request, From, Parent, ?MODULE, Debug, State);
Unexpected ->
ok = io:format("~p Unexpected message: ~tp", [self(), Unexpected]),
loop(Parent, Debug, State)
end.
system_continue(Parent, Debug, State) ->
loop(Parent, Debug, State).
system_terminate(Reason, _Parent, _Debug, _State) ->
exit(Reason).
system_get_state(Misc) -> {ok, Misc}.
system_replace_state(StateFun, Misc) ->
{ok, StateFun(Misc), Misc}.
Note that above we have written a Pure Erlang process that integrates with OTP the way a gen_server would, but has a bit more straightforward loop that handles only the socket. This means we do not have the gen_server call/cast mechanics (and may need to implement those ourselves, but usually asynch-only is a better approach for socket handling). This is started through the proc_lib module which is designed specifically to bootstrap OTP-compliant processes of any arbitrary type.
If you are going to use supervisors then you really want to go all the way and use OTP properly.
So what we have above right now is a very basic Telnet echo service. Instead of writing a magical client process inside the server module to tie your brain in knots (Erlangers don't like their brains tied in knots), you can start this, tell it to listen to some port, and telnet to it yourself and see the results.
I added some scripts to automate launching things, but basically the hinge on the code and make modules. Your project is laid out like
es/
Emakefile
ebin/es.app
src/*.erl
The contents of the Emakefile will make things easier for us. In this case it is just one line:
enter code here{"src/*", [debug_info, {i, "include/"}, {outdir, "ebin/"}]}.
In the main es/ directory if we enter an erl shell we can now do...
1> code:add_patha("ebin").
true
2> make:all().
up_to_date
3> es:start().
And you'll see a bunch of SASL start reports scrolling up the screen.
From there let's do es:listen(5555):
4> es:listen(5555).
<0.94.0> Listening.
ok
Cool! So it seems like things are working. Let's try to telnet to ourselves:
ceverett#changa:~/vcs/es$ telnet localhost 5555
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Hello es thingy
You sent: Hello es thingy
Yay! It works!
You sent: Yay! It works!
bye
Bye!
Connection closed by foreign host.
What does that look like on the other side?
<0.96.0> Listening.
<0.94.0> Connection accepted from: {{127,0,0,1},60775}
<0.94.0> received: <<"Hello es thingy\r\n">>
<0.94.0> received: <<"Yay! It works!\r\n">>
<0.94.0> Client saying goodbye. Bye!
Ah, here we see the "Listening." message from the next listener <0.96.0> that was spawned by the first one <0.94.0>.
How about concurrent connections?
<0.97.0> Listening.
<0.96.0> Connection accepted from: {{127,0,0,1},60779}
<0.98.0> Listening.
<0.97.0> Connection accepted from: {{127,0,0,1},60780}
<0.97.0> received: <<"Screen 1\r\n">>
<0.96.0> received: <<"Screen 2\r\n">>
<0.97.0> received: <<"I wonder if I could make screen 1 talk to screen 2...\r\n">>
<0.96.0> received: <<"Time to go\r\n">>
<0.96.0> Client saying goodbye. Bye!
<0.97.0> Client saying goodbye. Bye!
Oh, neato. A concurrent server!
From here you can tool around and make this basic structure change to do pretty much anything you might imagine.
DO NOTE that there is a lot missing from this code. I have stripped it of edoc notations and typespecs (used by Dialyzer, a critically important tool in a large project) -- which is a BAD thing to do for a production system.
For an example of a production-style project that is small enough to wrap your head around (only 3 modules + full docs), refer to zuuid. It was written specifically to serve as a code example, though it happens to be a full-featured UUID generator.
Forgive the hugeness of this answer to your (much shorter) question. This comes up from time to time and I wanted to write a full example of a cut-down network socket service to which I can refer people in the future, and just happened to get the hankering to do so when I read your question. :-) Hopefully the SO nazis will forgive this grievous infraction.

How to handle dynamically registered processes and pattern matching in Erlang?

When I register a process to an atom, I can send a message via the atom instead of the Pid interchangeably, which is convenient. However, pattern matching seems to treat Pid and atom as different entities, which is expected but inconvenient. In my example, the {Pid, Response} pattern does not match since Pid in this scope is an atom but the message sent as response contains the actual Pid.
Is there a preferred way to handle this?
The Program:
-module(ctemplate).
-compile(export_all).
start(AnAtom, Fun) ->
Pid = spawn(Fun),
register(AnAtom, Pid).
rpc(Pid, Request) ->
Pid ! {self(), Request},
receive
{Pid, Response} ->
Response;
Any ->
io:format("Finish (wrong):~p~n",[{Pid, Any}])
end.
loop(X) ->
receive
{Sender, Any} ->
io:format("Received: ~p~n",[{Sender, Any}]),
Sender ! {self(), "Thanks for contacting us"},
loop(X)
end.
The Shell:
Eshell V5.10.2 (abort with ^G)
1> c(ctemplate).
{ok,ctemplate}
2> ctemplate:start(foo, fun() -> ctemplate:loop([]) end).
true
3> ctemplate:rpc(foo, ["bar"]).
Received: {<0.32.0>,["bar"]}
Finish (wrong):{foo,{<0.40.0>,"Thanks for contacting us"}}
ok
4> whereis(foo).
<0.40.0>
Use references instead. The example you are suggesting is actually one of the reasons why refs are better for synchronous messages. Another reason is that sometimes you cannot guarantee that the received message is the one you are actually expecting.
So, your code will look like something
rpc(PidOrName, Request) ->
Ref = make_ref(),
PidOrName ! {{self(), Ref}, Request},
receive
{{Pid, Ref}, Response} ->
Response;
Any ->
io:format("Finish (wrong):~p~n",[{PidOrName, Any}])
end.
loop(X) ->
receive
{{Pid, Ref}, Any} ->
io:format("Received: ~p~n",[{Sender, Any}]),
Sender ! {{self(), Ref}, "Thanks for contacting us"},
end,
loop(X).
A couple notes about your and my code:
Note how I moved the last loop/1 call to the end of the function out of the receive block. Erlang does compile-time tail call optimizations, so your code should be fine, but it's a better idea to make tail calls explicitly – helps you to avoid mistakes.
You are probably trying to re-invent gen_server. The only two major differences between gen_server:call/2 and my code above are timeouts (gen_server has them) and that the reference is created by monitoring the remote process. This way, if the process dies before the timeout is thrown, we receive an immediate message. It's slower in many cases, but sometimes proves itself useful.
Overall, try to use OTP and read its code. It's good and gives you better ideas of how Erlang application should work.
rpc(Pid, Request) ->
Pid ! {self(), Request},
receive
{whereis(Pid), Response} ->
Response;
Any ->
io:format("Finish (wrong):~p~n",[{Pid, Any}])
end.
you can this function whereis():
whereis(RegName) -> pid() | port() | undefined
Types:
RegName = atom()
Returns the pid or port identifier with the registered name RegName. Returns undefined if the name is not registered.
For example:
whereis(db).
<0.43.0>

When to use timeouts and monitors in Erlang?

The Learn You Some Erlang book has the below code. Why does the book sometimes use just timeouts and sometimes use both monitors and timeouts? In the below what is the need for the monitor since the timeout will effectively detect if the process is down?
%% Synchronous call
order_cat(Pid, Name, Color, Description) ->
Ref = erlang:monitor(process, Pid),
Pid ! {self(), Ref, {order, Name, Color, Description}},
receive
{Ref, Cat} ->
erlang:demonitor(Ref, [flush]),
Cat;
{'DOWN', Ref, process, Pid, Reason} ->
erlang:error(Reason)
after 5000 ->
erlang:error(timeout)
end.
Also compare the following where add_event doesn't use a monitor but subscribe does
subscribe(Pid) ->
Ref = erlang:monitor(process, whereis(?MODULE)),
?MODULE ! {self(), Ref, {subscribe, Pid}},
receive
{Ref, ok} ->
{ok, Ref};
{'DOWN', Ref, process, _Pid, Reason} ->
{error, Reason}
after 5000 ->
{error, timeout}
end.
add_event(Name, Description, TimeOut) ->
Ref = make_ref(),
?MODULE ! {self(), Ref, {add, Name, Description, TimeOut}},
receive
{Ref, Msg} -> Msg
after 5000 ->
{error, timeout}
end.
There is a big difference between those two examples.
order_cat(Pid, Name, Color, Description) uses a monitor for the duration of one request, and calls erlang:demonitor/2 after receiving a successful response.
subscribe(Pid) establishes a monitor more permanently. The intent is that the event client would receive the {'DOWN', Ref, process, Pid, Reason} message in its main receive block and handle the fact that the event server died. Note how subscribe(Pid) returns the monitor reference as {ok, Ref} for the client to use for this purpose. Unfortunately, the book doesn't show what the event client would look like.
Now as to the more general question about monitors vs. timeouts: The disadvantage of monitoring is a small additional cost, and a small amount of complexity. The advantages compared to timeouts include:
If the target process doesn't exist or crashes trying to handle the message you just sent, you find out immediately rather than after a timeout. For some applications, that time saved is very important.
If the target process crashes, the Reason part of the monitor message tells you something about why. In some applications, the client may try different recovery actions depending on the reason.
The sending process knows whether to expect a response from the target process in the future. In the case of a timeout, if the target process was merely slow to respond, the sending process will get a response in its mailbox. If the sender fails to clear such responses from its mailbox, it will become slow and eventually terminate. The book touches on this problem in the Selective Receive section. The gen_server:call/3 documentation also explains this issue briefly.
The implementation of gen_server:call/2 uses a monitor. Since this is how many production erlang requests are sent, you can trust that it is well optimized and the recommended default. In fact, you should be using OTP rather than rolling your own.
See the source code here. Here's the relevant function:
do_call(Process, Label, Request, Timeout) ->
try erlang:monitor(process, Process) of
Mref ->
%% If the monitor/2 call failed to set up a connection to a
%% remote node, we don't want the '!' operator to attempt
%% to set up the connection again. (If the monitor/2 call
%% failed due to an expired timeout, '!' too would probably
%% have to wait for the timeout to expire.) Therefore,
%% use erlang:send/3 with the 'noconnect' option so that it
%% will fail immediately if there is no connection to the
%% remote node.
catch erlang:send(Process, {Label, {self(), Mref}, Request},
[noconnect]),
receive
{Mref, Reply} ->
erlang:demonitor(Mref, [flush]),
{ok, Reply};
{'DOWN', Mref, _, _, noconnection} ->
Node = get_node(Process),
exit({nodedown, Node});
{'DOWN', Mref, _, _, Reason} ->
exit(Reason)
after Timeout ->
erlang:demonitor(Mref, [flush]),
exit(timeout)
end
catch
error:_ ->
%% Node (C/Java?) is not supporting the monitor.
%% The other possible case -- this node is not distributed
%% -- should have been handled earlier.
%% Do the best possible with monitor_node/2.
%% This code may hang indefinitely if the Process
%% does not exist. It is only used for featureweak remote nodes.
Node = get_node(Process),
monitor_node(Node, true),
receive
{nodedown, Node} ->
monitor_node(Node, false),
exit({nodedown, Node})
after 0 ->
Tag = make_ref(),
Process ! {Label, {self(), Tag}, Request},
wait_resp(Node, Tag, Timeout)
end
end.
Why does the book sometimes use just timeouts and sometimes use both monitors and timeouts?
The process may terminate before the timeout expires, but you don’t want to wait longer for no reason. You want the call to take at most five seconds, not always five seconds.
In the below what is the need for the monitor since the timeout will effectively detect if the process is down?
It won’t; terminations don’t trigger timeouts, time does.

Erlang newbie: why do I have to restart to load new code

I am trying to write a first program in Erlang that effects message communication between a client and server. In theory the server exits when it receives no message from the client, but every time I edit the client code and run the server again, it executes the old code. I have to ^G>q>erl>[re-enter command] to get it to see the new code.
-module(srvEsOne).
%%
%% export functions
%%
-export([start/0]).
%%function definition
start()->
io:format("Server: Starting at pid: ~p \n",[self()]),
case lists:member(serverEsOne, registered()) of
true ->
unregister(serverEsOne); %if the token is present, remove it
false ->
ok
end,
register(serverEsOne,self()),
Pid = spawn(esOne, start,[self()]),
loop(false, false,Pid).
%
loop(Prec, Nrec,Pd)->
io:format("Server: I am waiting to hear from: ~p \n",[Pd]),
case Prec of
true ->
case Nrec of
true ->
io:format("Server: I reply to ~p \n",[Pd]),
Pd ! {reply, self()},
io:format("Server: I quit \n",[]),
ok;
false ->
receiveLoop(Prec,Nrec,Pd)
end;
false ->
receiveLoop(Prec,Nrec,Pd)
end.
receiveLoop(Prec,Nrec,Pid) ->
receive
{onPid, Pid}->
io:format("Server: I received a message to my pid from ~p \n",[Pid]),
loop(true, Nrec,Pid);
{onName,Pid}->
io:format("Server: I received a message to name from ~p \n",[Pid]),
loop(Prec,true,Pid)
after
5000->
io:format("Server: I received no messages, i quit\n",[]),
ok
end.
And the client code reads
-module(esOne).
-export([start/1, func/1]).
start(Par) ->
io:format("Client: I am ~p, i was spawned by the server: ~p \n",[self(),Par]),
spawn(esOne, func, [self()]),
io:format("Client: Now I will try to send a message to: ~p \n",[Par]),
Par ! {self(), hotbelgo},
serverEsOne ! {self(), hotbelgo},
ok.
func(Parent)->
io:format("Child: I am ~p, i was spawned from ~p \n",[self(),Parent]).
The server is failing to receive a message from the client, but I can't sensibly begin to debug that until I can try changes to the code in a more straightforward manner.
When you make modification to a module you need to compile it.
If you do it in an erlang shell using the command c(module) or c(module,[options]), the new compiled version of the module is automatically loaded in that shell. It will be used by all the new process you launch.
For the one that are alive and already use it is is more complex to explain and I think it is not what you are asking for.
If you have several erlang shells running, only the one where you compile the module loaded it. That means that in the other shell, if the module were previously loaded, basically if you already use the module in those shell, and even if the corresponding processes are terminated, the new version is ignored.
Same thing if you use the command erlc to compile.
In all these cases, you need to explicitly load the module with the command l(module) in the shell.
Your server loop contain only local function calls. Running code is changed only if there is remote (or external) function call. So you have to export your loop function first:
-export([loop/3]).
and then you have to change all loop/3 calls in function receiveLoop/3 to
?MODULE:loop(...)
Alternatively you can do same thing with receiveLoop/3 instead. Best practice for serious applications is doing hot code swapping by demand so you change loop/3 to remote/external only after receiving some special message.

How to write a simple webserver in Erlang?

Using the default Erlang installation what is the minimum code needed to produce a "Hello world" producing web server?
Taking "produce" literally, here is a pretty small one. It doesn't even read the request (but does fork on every request, so it's not as minimal possible).
-module(hello).
-export([start/1]).
start(Port) ->
spawn(fun () -> {ok, Sock} = gen_tcp:listen(Port, [{active, false}]),
loop(Sock) end).
loop(Sock) ->
{ok, Conn} = gen_tcp:accept(Sock),
Handler = spawn(fun () -> handle(Conn) end),
gen_tcp:controlling_process(Conn, Handler),
loop(Sock).
handle(Conn) ->
gen_tcp:send(Conn, response("Hello World")),
gen_tcp:close(Conn).
response(Str) ->
B = iolist_to_binary(Str),
iolist_to_binary(
io_lib:fwrite(
"HTTP/1.0 200 OK\nContent-Type: text/html\nContent-Length: ~p\n\n~s",
[size(B), B])).
For a web server using only the built in libraries check out inets http_server.
When in need of some more power but still with simplicity you should check out the mochiweb library. You can google for loads of example code.
Do you actually want to write a web server in Erlang, or do you want an Erlang web server so that you can create dynamic web content using Erlang?
If the latter, try YAWS. If the former, have a look at the YAWS source code for inspiration
Another way, similar to the gen_tcp example above but with less code and already offered as a suggestion, is using the inets library.
%%%
%%% A simple "Hello, world" server in the Erlang.
%%%
-module(hello_erlang).
-export([
main/1,
run_server/0,
start/0
]).
main(_) ->
start(),
receive
stop -> ok
end.
run_server() ->
ok = inets:start(),
{ok, _} = inets:start(httpd, [
{port, 0},
{server_name, "hello_erlang"},
{server_root, "/tmp"},
{document_root, "/tmp"},
{bind_address, "localhost"}
]).
start() -> run_server().
Keep in mind, this exposes your /tmp directory.
To run, simply:
$ escript ./hello_erlang.erl
For a very easy to use webserver for building restful apps or such check out the gen_webserver behaviour: http://github.com/martinjlogan/gen_web_server.
Just one fix for Felix's answer and it addresses the issues Martin is seeing. Before closing a socket, all data being sent from the client should be received (using for example do_recv from gen_tcp description).
Otherwise there's a race condition for the browser/proxy sending the HTTP request being quick enough to send the http request before the socket is closed.

Resources