Different behaviour of ETS with and without a shell - erlang

First, a disclaimer: I am learning Erlang and am not an expert at all.
While writing some examples using ETS I came across something I do not understand (even after searching).
I have a process where I create a public ETS with
TableID = ets:new(tablename, [public])
I then pass TableID to other processes. When I do this by running the modules from the shell, all is OK. When I run exactly the same modules with erl -noshell -s ... or even without the -noshell option, it fails.
I keep getting error:badarg as if the table does not exist. The ID is properly passed, but the table actually behaves as if it were private!
Is there a difference between running modules interactively from the shell or without?
Thanks
I am adding an example of the code I am using to try and debug the issue. As it is a piece of a larger software (and it is basically stripped to the bone to find the issue), it might be difficult to understand.
-record(loop_state, {
commands
}).
start() ->
LoopState = #loop_state{commands = ets:new(commands, [public])},
tcpserver_otp_backend:start(?MODULE, 7000, {?MODULE, loop}, LoopState).
loop(Socket, LoopState = #loop_state{commands = Commands}) ->
case gen_tcp:recv(Socket, 0) of
{ok, Data} ->
% the call below fails, no error generated, AND only in non interactive shell
A = newCommand(Commands, Data),
gen_tcp:send(Socket, A),
loop(Socket, LoopState);
{error, closed} ->
ok
end.
newCommand(CommandTableId, Command) ->
case ets:lookup(CommandTableId,Command) of
[] ->
_A = ets:insert(CommandTableId, {Command, 1}),
<<1, "new", "1">>; % used for testing
_ ->
<<1, "old", "1">> % used for testing
end.
When I remove the "offending command" ets:lookup, everything works again as in the interactive shell.

The problem seems to be that you create the ets table in your start() function. An ets table has an owner (by default the creating process), and when the owner dies, the table gets deleted. When you run the start/0 function from the command line by passing -s to erl, that owner process will be some internal process in the Erlang kernel that is part of handling the startup sequence. Regardless of whether you pass -noshell or not, that process is probably transient and will die before the lookup function gets time to execute, so the table no longer exists when the lookup finally happens.
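The ownership rule described above can be checked in isolation. The following is a minimal sketch, not the asker's code: the module name ets_owner_demo and the stop message are made up for illustration. A short-lived process creates a public table, the parent uses it while the owner is alive, and then inspects it after the owner has exited.

```erlang
-module(ets_owner_demo).
-export([run/0]).

%% Returns {LookupWhileOwnerAlive, InfoAfterOwnerDied}.
run() ->
    Parent = self(),
    %% The spawned process creates the table and therefore owns it.
    {Owner, Ref} = spawn_monitor(fun() ->
        Tab = ets:new(commands, [public]),
        Parent ! {table, self(), Tab},
        %% Stay alive until told to stop.
        receive stop -> ok end
    end),
    Tab = receive {table, Owner, T} -> T end,
    true = ets:insert(Tab, {key, 1}),
    %% Works: the owner is still alive and the table is public.
    Before = ets:lookup(Tab, key),
    Owner ! stop,
    receive {'DOWN', Ref, process, Owner, _Reason} -> ok end,
    %% ets:info/1 returns undefined for a table that no longer exists.
    After = ets:info(Tab),
    {Before, After}.
```

Calling ets:lookup/2 on the table after the owner has died would raise badarg, which matches the symptom in the question.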
The proper place to create the ets table would be in the init() callback function of the gen_server that you start up. If it's supposed to be a public ets table accessed by several processes, then you might want to create a separate server process whose task it is to own the table.
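As a rough sketch of the "separate server process that owns the table" idea, assuming a hypothetical module name command_table and a hypothetical table/0 accessor (neither is part of the original code), the table is created in init/1 so that it lives as long as the server does:

```erlang
-module(command_table).
-behaviour(gen_server).
-export([start_link/0, table/0]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Ask the owner for the table id so that other processes can use it.
table() ->
    gen_server:call(?MODULE, table).

init([]) ->
    %% Created inside the long-lived server process, which becomes the owner.
    Tab = ets:new(commands, [public]),
    {ok, Tab}.

handle_call(table, _From, Tab) ->
    {reply, Tab, Tab}.

handle_cast(_Msg, Tab) ->
    {noreply, Tab}.
```

Other processes can then call command_table:table() and use ets:lookup/2 and ets:insert/2 directly on the returned id. In a real system the server would normally sit under a supervisor.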

Related

Erlang: spawn a process and wait for termination without using `receive`

In Erlang, can I call some function f (BIF or not) whose job is to spawn a process, run the function argf I provided, and not "return" until argf has "returned", and do this without using a receive clause? (The reason is that f will be invoked in a gen_server, and I don't want to pollute the gen_server's mailbox.)
A snippet would look like this:
%% some code omitted ...
F = fun() -> blah, blah, timer:sleep(10000) end,
f(F), %% like `spawn(F), but doesn't return until 10 seconds has passed`
%% ...
The only way to communicate between processes is message passing (of course you could poll for a specific key in an ets table or a file, but I don't like this).
If you use a spawn_monitor function in f/1 to start the F process and then have a receive block only matching the possible system messages from this monitor:
f(F) ->
{_Pid, MonitorRef} = spawn_monitor(F),
receive
{_Tag, MonitorRef, _Type, _Object, _Info} -> ok
end.
you will not mess up your gen_server mailbox. The example is the minimum code; you can add a timeout (fixed or passed as a parameter), execute some code on normal or error completion, and so on.
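One possible way to add the timeout mentioned above is sketched here. The module name wait_for, the return values, and the decision to kill the worker on timeout are assumptions made for illustration, not part of the original answer.

```erlang
-module(wait_for).
-export([f/2]).

%% Runs F in a monitored process and waits at most Timeout milliseconds.
f(F, Timeout) when is_function(F, 0) ->
    {Pid, Ref} = spawn_monitor(F),
    receive
        %% Only the monitor message for this worker is matched, so any
        %% other message in the mailbox is left untouched.
        {'DOWN', Ref, process, Pid, Reason} ->
            {finished, Reason}
    after Timeout ->
        %% Drop the monitor (and a possibly queued 'DOWN' message),
        %% then stop the worker.
        erlang:demonitor(Ref, [flush]),
        exit(Pid, kill),
        timeout
    end.
```

Whether to kill the worker or let it keep running after the timeout depends on the job; this sketch picks the former.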
You will not "pollute" the gen_server's mailbox if you spawn and wait for the message before you return from the call or cast. A more serious problem with this may be that you will block the gen_server while you are waiting for the other process to terminate. A way around this is to not explicitly wait but return from the call/cast, and then, when the completion message arrives, handle it in handle_info/2 and do what is necessary.
If the spawning is done in a handle_call and you want to return the "result" of that process then you can delay returning the value to the original call from the handle_info handling the process termination message.
Note that however you do it a gen_server:call has a timeout value, either implicit or explicit, and if no reply is returned it generates an error in the calling process.
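The delayed-reply approach described in the last few paragraphs could look roughly like the sketch below. The module name deferred_reply, the {run, Fun} request, the {done, Reason} reply, and the 30 second call timeout are all assumptions for illustration.

```erlang
-module(deferred_reply).
-behaviour(gen_server).
-export([start_link/0, run/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Runs Fun in a separate process; returns once that process has terminated.
%% The explicit timeout replaces the default 5 second call timeout.
run(Server, Fun) when is_function(Fun, 0) ->
    gen_server:call(Server, {run, Fun}, 30000).

init([]) ->
    %% The state maps each monitor reference to the caller waiting on it.
    {ok, #{}}.

handle_call({run, Fun}, From, Pending) ->
    {_Pid, Ref} = spawn_monitor(Fun),
    %% No reply yet: the server stays free to handle other messages.
    {noreply, Pending#{Ref => From}}.

handle_cast(_Msg, Pending) ->
    {noreply, Pending}.

handle_info({'DOWN', Ref, process, _Pid, Reason}, Pending) ->
    case maps:take(Ref, Pending) of
        {From, Rest} ->
            %% The worker has terminated: answer the original call now.
            gen_server:reply(From, {done, Reason}),
            {noreply, Rest};
        error ->
            {noreply, Pending}
    end;
handle_info(_Other, Pending) ->
    {noreply, Pending}.
```

The caller still blocks in gen_server:call/3, but the server itself does not, so several workers can be in flight at once.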
The main way to communicate between processes inside the Erlang VM is message passing with the erlang:send/2 or erlang:send/3 functions (the ! operator is an alias for send/2). But you can "hack" Erlang and use other ways of communicating between processes.
You can use erlang:link/1 to communicate the state of a process; it is mainly used to learn that a process is dying, has ended, or has hit an error (an exception or a throw).
You can use erlang:monitor/2, which is similar to erlang:link/1 except that the notification arrives as an ordinary message in the monitoring process's mailbox.
You can also hack Erlang and use some internal mechanism (shared ETS/DETS/Mnesia tables) or external methods (a database or other things like that). This is clearly not recommended and "destroys" the Erlang philosophy... But you can do it.
It seems your problem can be solved with the supervisor behaviour. supervisor supports several strategies to control supervised processes:
one_for_one: If one child process terminates and is to be restarted, only that child process is affected. This is the default restart strategy.
one_for_all: If one child process terminates and is to be restarted, all other child processes are terminated and then all child processes are restarted.
rest_for_one: If one child process terminates and is to be restarted, the 'rest' of the child processes (that is, the child processes after the terminated child process in the start order) are terminated. Then the terminated child process and all child processes after it are restarted.
simple_one_for_one: A simplified one_for_one supervisor, where all child processes are dynamically added instances of the same process type, that is, running the same code.
You can also modify a supervisor strategy or create your own from scratch, or base it on supervisor_bridge.
So, to summarize, you need a process that waits for one or more processes to terminate. This behaviour is supported natively by OTP, but you can also create your own model. To do that, you need to share the status of every started process, using a cache or a database, or pass the parent to the process when it is spawned. Something like this:
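Parent = self().
Fun = fun
    MyFun(ParentProcess, {result, Data})
      when is_pid(ParentProcess) ->
        ParentProcess ! {self(), Data};
    MyFun(ParentProcess, MyData)
      when is_pid(ParentProcess) ->
        % do something, then recurse with the result
        MyData2 = {result, MyData},
        MyFun(ParentProcess, MyData2)
end.
%% self() must be evaluated in the parent, not inside the spawned fun
spawn(fun() -> Fun(Parent, InitData) end).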
EDIT: I forgot to add an example without send/receive. I use an ETS table to store every result from the lambda function. The ETS table is passed in when we spawn the process. To get the result, we can select data from this table. Note that the key of each row is the process id of the process that produced it.
spawner(Ets, Fun, Args)
  when is_function(Fun, 2) ->
    %% table ids are references in current OTP releases, so no is_integer/1 guard
    spawn(fun() -> Fun(Ets, Args) end).
Fun = fun
    F(Ets, {result, Data}) ->
        ets:insert(Ets, {self(), Data});
    F(Ets, Data) ->
        % do something here
        Data2 = Data,
        F(Ets, {result, Data2})
end.

Erlang escript launching application with start parameters

Currently my Erlang application is started within an escript (TCP server) and all works fine since it uses the default port I provided. Now I want to pass the port via the escript to the application but I have no idea how. (The app runs a supervisor)
script.escript
#!/usr/bin/env escript
%% -*- erlang -*-
-export([main/1]).
main([UDPort, TCPort]) ->
U = list_to_integer(UDPort),
T = list_to_integer(TCPort),
app:start(), %% Want to pass T into the startup.
receive
_ -> ok
end;
...
app.erl
-module(app).
-behaviour(application).
-export([start/0, start/2, stop/0, stop/1]).
-define(PORT, 4300).
start () -> application:start(?MODULE). %% This is called by the escript.
stop () -> application:stop(?MODULE).
start (_StartType, _StartArgs) -> supervisor:start(?PORT).
stop (_State) -> ok.
I'm honestly not sure if this is possible with using application but I thought it best to just ask.
The common way is to start things from whatever shell just calling
erl -run foo
But you can also do
erl -appname key value
to set an environment value and then
application:get_env(appname, key)
to get the value you are looking for.
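A minimal sketch of reading such a value with a fallback follows. The application name myapp, the key tcp_port, and the module name port_config are placeholders; 4300 is the default port from the question.

```erlang
-module(port_config).
-export([tcp_port/0]).

%% Returns the configured port, or 4300 when nothing has been set.
%% `myapp` and `tcp_port` are placeholder names for this sketch.
tcp_port() ->
    application:get_env(myapp, tcp_port, 4300).
```

From an escript, one option is to call application:load(myapp), then application:set_env(myapp, tcp_port, T), and only then start the application, so that the start/2 callback can read the value with the function above.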
That said...
I like to have service applications be things that don't have to shut down to be (re)configured. I usually include some message protocol like {config, Aspect, Setting} or similar that can alter the basic state of a service on the fly. Because I often do this I usually just wind up having whatever script starts up the application also send a configuration message to it.
So with this in mind, consider this rough conceptual example:
#!/usr/bin/env escript
%% -*- erlang -*-
-export([main/1]).
main([UDPort, TCPort]) ->
U = list_to_integer(UDPort),
T = list_to_integer(TCPort),
ok = case whereis(app) of
undefined -> app:start();
_Pid -> ok
end,
ok = set_ports(U, T).
%% Just an illustration.
%% Making this a synchronous gen_server/gen_fsm call is way better.
set_ports(U, T) ->
app ! {config, listen, {tcp, T}},
app ! {config, listen, {udp, U}},
ok.
Now not only is the startup script a startup script, it is also a config script. The point isn't to have a startup script, it is to have a service running on the ports you designated. This isn't a conceptual fit for all tools, of course, but it should give you some ideas. There is also the practice of putting a config file somewhere the application knows to look and just reading terms from it, among other techniques (like including ports in the application specification, etc.).
Edit
I just realized you are doing this in an escript which will spawn a new node every time you call it. To make the technique above work properly you would need to make the escript name a node for the service to run on, and locate it if it already exists.

Pass some arguments to supervisor init function when app starts

I want to pass some arguments to supervisor:init/1 function and it is desirable, that the application's interface was so:
redis_pool:start() % start all instances
redis_pool:start(Names) % start only given instances
Here is the application:
-module(redis_pool).
-behaviour(application).
...
start() -> % start without params
application:ensure_started(?APP_NAME, transient).
start(Names) -> % start with some params
% I want to pass Names to supervisor init function
% in order to do that I have to bypass application:ensure_started
% which is not GOOD :(
application:load(?APP_NAME),
case start(normal, [Names]) of
{ok, _Pid} -> ok;
{error, {already_started, _Pid}} -> ok
end.
start(_StartType, StartArgs) ->
redis_pool_sup:start_link(StartArgs).
Here is the supervisor:
init([]) ->
{ok, Config} = get_config(),
Names = proplists:get_keys(Config),
init([Names]);
init([Names]) ->
{ok, Config} = get_config(),
PoolSpecs = lists:map(fun(Name) ->
PoolName = pool_utils:name_for(Name),
{[Host, Port, Db], PoolSize} = proplists:get_value(Name, Config),
PoolArgs = [{name, {local, PoolName}},
{worker_module, eredis},
{size, PoolSize},
{max_overflow, 0}],
poolboy:child_spec(PoolName, PoolArgs, [Host, Port, Db])
end, Names),
{ok, {{one_for_one, 10000, 1}, PoolSpecs}}.
As you can see, the current implementation is ugly and may be buggy. The question is: how can I pass some arguments and start the application and the supervisor (with the params that were given to start/1)?
One option is to start application and run redis pools in two separate phases.
redis_pool:start(),
redis_pool:run([] | Names).
But what if I want to run supervisor children (redis pool) when my app starts?
Thank you.
The application callback Module:start/2 is not an API to call in order to start the application. It is called when the application is started by application:start/1,2. This means that overloading it to provide differing parameters is probably the wrong thing to do.
In particular, application:start will be called directly if someone adds your application as a dependency of theirs (in the foo.app file). At this point, they have no control over the parameters, since they come from your .app file, in the {mod, {Mod, Args}} term.
Some possible solutions:
Application Configuration File
Require that the parameters be in the application configuration file; you can retrieve them with application:get_env/2,3.
Don't start a supervisor
This means one of two things: becoming a library application (removing the {mod, Mod} term from your .app file) -- you don't need an application behaviour; or starting a dummy supervisor that does nothing.
Then, when someone wants to use your library, they can call an API to create the pool supervisor, and graft it into their supervision tree. This is what poolboy does with poolboy:child_spec.
Or, your application-level supervisor can be a normal supervisor, with no children by default, and you can provide an API to start children of that, via supervisor:start_child. This is (more or less) what cowboy does.
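The last option, an application-level supervisor with no children plus an API for adding them, might be sketched as follows. The module name pool_sup_sketch and the start_pool/1 function are hypothetical; the child spec passed in could be built with poolboy:child_spec/3 as in the question.

```erlang
-module(pool_sup_sketch).
-behaviour(supervisor).
-export([start_link/0, start_pool/1]).
-export([init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Hypothetical API: callers add children after the application is up.
start_pool(ChildSpec) ->
    supervisor:start_child(?MODULE, ChildSpec).

init([]) ->
    %% No children by default; they are added later via start_pool/1.
    {ok, {{one_for_one, 5, 10}, []}}.
```

With this shape, redis_pool:start(Names) could start the application normally and then call start_pool/1 once per name, without bypassing application:ensure_started.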
You can pass arguments in the AppDescr argument to application:load/1 (though it's a mighty big tuple already...) as {mod, {Module, StartArgs}} according to the docs ("according to the docs" as in, I don't recall doing it this way myself, ever: http://www.erlang.org/doc/apps/kernel/application.html#load-1). Note that the third element of AppDescr is a list of options:
application:load({application, some_app, [{mod, {Module, [Stuff]}}]})
Without knowing anything about the internals of the application you're starting, it's hard to say which way is best, but a common way to do this is to start up the application and then send it a message containing the data you want it to know.
You could make receipt of the message tell the application to go through a configuration assertion procedure, so that the message you send on startup is the same sort of thing you would send to reconfigure it on the fly. I find this more useful than one-shotting arguments on startup.
In any case, it is usually better to think in terms of starting something, then asking it to do something for you, than to try telling it everything in init parameters. This can be as simple as having it start up and wait for some message that will tell the listener to then spin up the supervisor the way you're trying to here -- isolated one step from the application inclusion issues RL mentioned in his answer.

Make a process end before timeout

It seems that an Erlang process will stay alive until the 5 sec default timeout even if it has finished its work.
I have a gen_server call that issues a command to the Windows CLI which can be completed in less than 1 sec, but the process waits 5 sec before I see the result of the operation. What's going on? Is it something to do with the timeout, or might it be something else?
EDIT This call doesn't do anything for 5 seconds (the default timeout!)
handle_call({create_app, Path, Name, Args}, _From, State) ->
case filelib:ensure_dir(Path) of
{error, Reason} ->
{reply, Reason, State};
_ ->
file:set_cwd(Path),
Response = os:cmd(string:join(["Rails", Name, Args], " ")),
{reply, Response, State}
end;
I'm guessing the os:cmd is taking that long to return the results. It's possible that maybe the os:cmd is having trouble telling when the rails command is completed and doesn't return till the process triggers the timeout. But from your code I'd say the most likely culprit is the os:cmd call.
Does the return contain everything you expect it to?
You still have not added any information on what the problem is. But I see some other things I'd like to comment on.
Current working dir
You are using file:set_cwd(Path) so the started command will inherit that path. The cwd of the file server is global. You should probably not be using it at all in application code. It's useful for setting the cwd to where you want erlang crash dumps to be written etc.
Your desire to let Rails execute with the cwd according to Path is better served with something like this:
_ ->
Response = os:cmd(string:join(["cd", Path, "&&", "Rails", Name, Args], " ")),
{reply, Response, State}
That is, start a shell to parse the command line, have the shell change the cwd, and then start Rails.
Blocking a gen_server
The gen_server is there to serialize processing. That is, it processes one message after the other. It doesn't handle them all concurrently. It is its reason for existence to not handle them concurrently.
You are (in relation to other costs) doing some very costly computation in the gen_server: starting an external process that runs this rails application. Is it your intention to have at most one rails application running at any one time? (I've heard about ruby on rails requiring tons of memory per process, so it might be a sensible decision).
If you don't need to update the State with any values from a costly call, as in your example code, then you can do the work in a separate process and reply with an explicit gen_server:reply/2 call.
_ ->
%% From must be bound in the handle_call/3 head (From, not _From)
spawn_link(fun () -> rails_cmd(From, Path, Name, Args) end),
{noreply, State}
And then you have
rails_cmd(From, Path, Name, Args) ->
Response = os:cmd(string:join(["cd", Path, "&&", "Rails", Name, Args], " ")),
gen_server:reply(From, Response).

Erlang - spawning processes and passing arguments

I keep running into this. I want to spawn processes and pass arguments
to them without using the MFA form (module/function/arguments), so
basically without having to export the function I want to spawn with
arguments. I've gotten around this a few times using closures (funs)
and having the arguments just be bound values outside the fun (that I then reference inside the fun), but it's
limiting my code structure... I've looked at the docs and spawn only
has the regular spawn/1 and the spawn/3 form, nothing else...
I understand that code reloading in spawned processes is not possible without the use of the MFA form but the spawned processes are not of the long running nature and finish relatively quickly so that's not an issue (I also want to contain all the code in one module-level function with sub-jobs being placed in funs inside that function).
much appreciated
thanks
Actually, Richard pointed me in the right direction to avoid the issue nicely (in a reply to the same post I put up on the Erlang Google Group):
http://groups.google.com/group/erlang-programming/browse_thread/thread/1d77a697ec67935a
His answer:
By "using closures", I hope you mean something like this:
Pid = spawn(fun () -> any_function(Any, Number, Of, Arguments) end)
How would that be limiting to your code structure?
/Richard
Thank you for promptly commenting on my question. Much appreciated.
Short answer: you can't. Spawn (in all its varying forms) only takes a 0-arity function. Using a closure and bringing in bound variables from the spawning function is the way to go, short of using some sort of shared data store like ETS (which is Monster Overkill).
I've never found using a closure to severely hamper my code structure, though; can you give an example of the problems you're having, and perhaps someone can tidy it up for you?
This is an old question but I believe it can be properly answered with a bit of creativity:
The goal of the question is to
Invoke a function
With the following limits;
No M:F/A formatting
No exporting of the Invoked function
This can be solved in the following;
Using the 1st limitation leads us to the following solution:
run() ->
Module = module,
Function = function,
Args = [arg1, arg2, arg3],
erlang:spawn(Module, Function, Args).
In this solution however, the function is required to be exported.
Using the 2nd limitation (No exporting of the Invoked function) alongside the 1st leads us to the following solution using conventional erlang logic:
run() ->
%% Generate an anonymous fun and execute it
erlang:spawn(fun() -> function(arg1, arg2, arg3) end).
This solution generates an anonymous fun on every execution, which may or may not be wanted depending on your design, due to the extra work that the garbage collector will need to perform (note that, generally, this will be negligible and issues will potentially only be seen in larger systems).
An alternative way to write the above without generating Anonymous Funs would be to spawn an erlang:apply/2 which can execute functions with given parameters.
By passing a Function Ref. to erlang:apply/2, we can reference a local function and invoke it with the given arguments.
The following implements this solution:
run() ->
%% Function Ref. to a local (non-exported) function
Function = fun function/3, % the arity must match the length of Args
Args = [arg1, arg2, arg3],
erlang:spawn(erlang, apply, [Function, Args]).
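To make the two variants concrete, here is a small self-contained sketch. The module name spawn_local, the local function work/3, and the {done, Tag, N} message are invented for illustration. It spawns the same non-exported function once through a closure and once through erlang:apply/2, and collects a message from each.

```erlang
-module(spawn_local).
-export([run/0]).

run() ->
    Parent = self(),
    %% Closure form: arguments are captured from the enclosing scope.
    spawn(fun() -> work(Parent, closure, 1) end),
    %% apply form: a fun reference to the local function plus an argument list.
    spawn(erlang, apply, [fun work/3, [Parent, apply, 2]]),
    %% Collect one message from each worker; sort because arrival order may vary.
    lists:sort([receive {done, Tag, N} -> {Tag, N} end || _ <- [1, 2]]).

%% Not exported: reachable from spawn only through a fun.
work(Parent, Tag, N) ->
    Parent ! {done, Tag, N}.
```

Both forms run work/3 without exporting it; the choice between them is mostly a matter of taste.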
Edit: This type of solution can be found within the Erlang Src whereby erlang:apply/2 is being called to execute a fun() with args.
%% https://github.com/erlang/otp/blob/71af97853c40d8ac5f499b5f2435082665520642/erts/preloaded/src/erlang.erl#L2888
%% Spawn and atomically set up a monitor.
-spec spawn_monitor(Fun) -> {pid(), reference()} when
Fun :: function().
spawn_monitor(F) when erlang:is_function(F, 0) ->
erlang:spawn_opt(erlang,apply,[F,[]],[monitor]);
spawn_monitor(F) ->
erlang:error(badarg, [F]).
First, there is no code in the question, so we can't help you much. The best way to control your functions and their arguments in spawned processes is to spawn the process with a function that sits in a receive; you can then talk to the process using send and receive. Try:
Pid = spawn(Node, ModuleName, functionThatReceive, []),
%% or just spawn(ModuleName, ...) if the program is not distributed
Pid ! {self(), {M1, f1, A1}},
%% separate variables: an already bound Reply would only match an identical second reply
Reply1 = receive
{Pid, R1} -> R1
end,
Pid ! {self(), {M2, f2, A2}},
Reply2 = receive
{Pid, R2} -> R2
end,
.......
functionThatReceive() ->
receive
{From, {_M1, f1, _A1}} -> From ! {self(), doSomething1};
{From, {_M2, f2, _A2}} -> From ! {self(), doSomething2}
end,
%% loop so that the next request is handled too
functionThatReceive().
