Hello I am trying to figure out why my implementation of gen_server is not respecting pattern matching:
If I run gen_server:call(ServerRef,state) it gets in the second pattern of handle_call and i do not understand why , since the the first pattern should get hit.Is there a problem when sending atoms?
Module
-module(wk).
-behaviour(gen_server).
-compile(export_all).
-record(state,{
limit,
processed=[],
unknown=[],
counter=0
}).
start_link(Limit)->
gen_server:start_link(?MODULE, Limit, []).
start(Limit)->
gen_server:start(?MODULE,Limit,[]).
init(Limit)->
State=#state{limit=Limit},
{ok,State}.
handle_call(From,state,State)->
{reply,State,State};
handle_call(From,Message,State=#state{processed=P,limit=L,counter=C})->
Reply=if C=:=L;C>L -> exit(consumed);
C<L -> {{processed,self(),os:timestamp()},Message}
end,
{reply,Reply,State#state{counter=C+1,processed=[Message,P]}}.
handle_info(Message,State=#state{unknown=U})->
{noreply,State#state{unknown=[Message|U]}}.
Calling:
>gen_server:call(ServerRef,state) enters into the second pattern too.
Because you have wrong parameter order: it should be Module:handle_call(Request, From, State). So the first pattern would only match when From is state.
Related
I am learning Erlang from a Ruby background and having some difficulty grasping the thought process. The problem I am trying to solve is the following:
I need to make the same request to an api, each time I receive a unique ID in the response which I need to pass into the next request until there is not ID returned. From each response I need to extract certain data and use it for other things as well.
First get the iterator:
ShardIteratorResponse = kinetic:get_shard_iterator(GetShardIteratorPayload).
{ok,[{<<"ShardIterator">>,
<<"AAAAAAAAAAGU+v0fDvpmu/02z5Q5OJZhPo/tU7fjftFF/H9M7J9niRJB8MIZiB9E1ntZGL90dIj3TW6MUWMUX67NEj4GO89D"...>>}]}
Parse out the shard_iterator..
{_, [{_, ShardIterator}]} = ShardIteratorResponse.
Make the request to kinesis for the streams records...
GetRecordsPayload = [{<<"ShardIterator">>, <<ShardIterator/binary>>}].
[{<<"ShardIterator">>,
<<"AAAAAAAAAAGU+v0fDvpmu/02z5Q5OJZhPo/tU7fjftFF/H9M7J9niRJB8MIZiB9E1ntZGL90dIj3TW6MUWMUX67NEj4GO89DETABlwVV"...>>}]
14> RecordsResponse = kinetic:get_records(GetRecordsPayload).
{ok,[{<<"NextShardIterator">>,
<<"AAAAAAAAAAFy3dnTJYkWr3gq0CGo3hkj1t47ccUS10f5nADQXWkBZaJvVgTMcY+nZ9p4AZCdUYVmr3dmygWjcMdugHLQEg6x"...>>},
{<<"Records">>,
[{[{<<"Data">>,<<"Zmlyc3QgcmVjb3JkISEh">>},
{<<"PartitionKey">>,<<"BlanePartitionKey">>},
{<<"SequenceNumber">>,
<<"49545722516689138064543799042897648239478878787235479554">>}]}]}]}
What I am struggling with is how do I write a loop that keeps hitting the kinesis endpoint for that stream until there are no more shard iterators, aka I want all records. Since I can't re-assign the variables as I would in Ruby.
WARNING: My code might be bugged but it's "close". I've never ran it and don't see how last iterator can look like.
I see you are trying to do your job entirely in shell. It's possible but hard. You can use named function and recursion (since release 17.0 it's easier), for example:
F = fun (ShardIteratorPayload) ->
{_, [{_, ShardIterator}]} = kinetic:get_shard_iterator(ShardIteratorPayload),
FunLoop =
fun Loop(<<>>, Accumulator) -> % no clue how last iterator can look like
lists:reverse(Accumulator);
Loop(ShardIterator, Accumulator) ->
{ok, [{_, NextShardIterator}, {<<"Records">>, Records}]} =
kinetic:get_records([{<<"ShardIterator">>, <<ShardIterator/binary>>}]),
Loop(NextShardIterator, [Records | Accumulator])
end,
FunLoop(ShardIterator, [])
end.
AllRecords = F(GetShardIteratorPayload).
But it's too complicated to type in shell...
It's much easier to code it in modules.
A common pattern in erlang is to spawn another process or processes to fetch your data. To keep it simple you can spawn another process by calling spawn or spawn_link but don't bother with links now and use just spawn/3.
Let's compile simple consumer module:
-module(kinetic_simple_consumer).
-export([start/1]).
start(GetShardIteratorPayload) ->
Pid = spawn(kinetic_simple_fetcher, start, [self(), GetShardIteratorPayload]),
consumer_loop(Pid).
consumer_loop(FetcherPid) ->
receive
{FetcherPid, finished} ->
ok;
{FetcherPid, {records, Records}} ->
consume(Records),
consumer_loop(FetcherPid);
UnexpectedMsg ->
io:format("DROPPING:~n~p~n", [UnexpectedMsg]),
consumer_loop(FetcherPid)
end.
consume(Records) ->
io:format("RECEIVED:~n~p~n",[Records]).
And fetcher:
-module(kinetic_simple_fetcher).
-export([start/2]).
start(ConsumerPid, GetShardIteratorPayload) ->
{ok, [ShardIterator]} = kinetic:get_shard_iterator(GetShardIteratorPayload),
fetcher_loop(ConsumerPid, ShardIterator).
fetcher_loop(ConsumerPid, {_, <<>>}) -> % no clue how last iterator can look like
ConsumerPid ! {self(), finished};
fetcher_loop(ConsumerPid, ShardIterator) ->
{ok, [NextShardIterator, {<<"Records">>, Records}]} =
kinetic:get_records(shard_iterator(ShardIterator)),
ConsumerPid ! {self(), {records, Records}},
fetcher_loop(ConsumerPid, NextShardIterator).
shard_iterator({_, ShardIterator}) ->
[{<<"ShardIterator">>, <<ShardIterator/binary>>}].
As you can see both processes can do their job concurrently.
Try from your shell:
kinetic_simple_consumer:start(GetShardIteratorPayload).
Now your see that your shell process turns to consumer and you will have your shell back after fetcher will send {ItsPid, finished}.
Next time instead of
kinetic_simple_consumer:start(GetShardIteratorPayload).
run:
spawn(kinetic_simple_consumer, start, [GetShardIteratorPayload]).
You should play with spawning processes - it's erlang main strength.
In Erlang, you can write loop using tail recursive functions. I don't know the kinetic API, so for simplicity, I just assume, that kinetic:next_iterator/1 return {ok, NextIterator} or {error, Reason} when there are no more shards.
loop({error, Reason}) ->
ok;
loop({ok, Iterator}) ->
do_something_with(Iterator),
Result = kinetic:next_iterator(Iterator),
loop(Result).
You are replacing loop with iteration. First clause deals with case, where there are no more shards left (always start recursion with the end condition). Second clause deals with case, where we got some iterator, we do something with it and call next.
The recursive call is last instruction in the function body, which is called tail recursion. Erlang optimizes such calls - they don't use call stack, so they can run infinitely in constant memory (you will not get anything like "Stack level too deep")
What is the difference in ending a function with end and ok in Erlang? I've been trying to grasp the meaning in the following code:
-module(esOne).
-export([start/1, func/1]).
start(Par) ->
io:format("Client: I am ~p, spawned by the server: ~p~n",[self(),Par]),
spawn(esOne, func, [self()]),
Par ! {onPid, self()},
serverEsOne ! {onName, self()},
receiveMessage(),
ok.
receiveMessage() ->
receive
{reply, N} ->
io:format("Client: I received a message: ~p~n",[N])
after
5000->
io:format("Client: I received no message, i quit~n",[])
end.
func(Parent)->
io:format("Child: I am ~p, spawned from ~p~n",[self(),Parent]).
This code works in conjunction with another .erl file that acts as server. I managed to write this only through analyzing the given server file and copying it's behavior. First I thought ok was used to end every function, but that is not the case as I can't end receiveMessage() with ok. Then I thought I could maybe end every function with end, but start(Par) will give an error if I replace ok by end. Not only that, but in the server file I see that ok and end are used within functions to end loops. The way they're used look the same to me, yet they clearly fulfill a separate function as one cannot be replaced by another. Some clarification would be much appreciated.
Two points to understand:
Some code block types in Erlang are closed with an "end". So if ... end, case ... end, receive ... [after N] ... end and so on. It is certainly possible to use "end" as its own atom in place of OK, but that is not what is happening above.
Every function in Erlang returns some value. If you aren't explicit about it, it returns the value of the last expression. The "=" operator isn't assignment to a variable like in other languages, it is assignment to a symbol as in math, meaning that reassigning is effectively a logical assertion. If the assertion fails, the process throws an exception (meaning it crashes, usually).
When you end something with "ok" or any other atom you are providing a known final value that will be returned. You don't have to do anything with it, but if you want the calling process to assert that the function completed or crash if anything unusual happened then you can:
do_stuff() ->
ok = some_func().
instead of
do_stuff() ->
some_func().
If some_func() may have had a side effect that can fail, it will usually return either ok or {error, Reason} (or something similar). By checking that the return was ok we prevent the calling process from continuing execution if something bad happened. That is central to the Erlang concept of "let it crash". The basic idea is that if you call a function that has a side-effect and it does anything unexpected at all, you should crash immediately, because proceeding with bad data is worse than not proceeding at all. The crash will be cleaned up by the supervisor, and the system will be restored to a known state instead of being in whatever random condition was left after the failure of the side-effect.
A variation on the bit above is to have the "ok" part appear in a tuple if the purpose of a function is to return a value. You can see this in any dict-type handling library, for example. The reason some data returning functions have a return type of {ok, Value} | {error, Reason} instead of just Value | {error, Reason} is to make pattern matching more natural.
Consider the following case clauses:
case dict:find(Key, Dict) of
{ok, Value} ->
Value;
{error, Reason} ->
log(error, Reason),
error
end.
And:
case finder(Key, Struct) of
Value ->
Value;
{error, Reason}
log(error, Reason),
error
end.
In the first example we match the success condition first. In the second version, though, this is impossible because the error clause could never match; any return at all would always be represented by Value. Oops.
Most of the time (but not quite always) functions that return a value or crash will return just the value. This is especially true of pure functions that carry no state but what you pass in and have no side effects (for example, dict:fetch/2 gives the value directly, or crashes the calling process, giving you an easy choice which way you want to do things). Functions that return a value or signal an error usually wrap a valid response in {ok, Value} so it is easy to match.
I've created the snippet below based on this tutorial. The last two lines (feed_squid(FeederRP) and feed_red_panda(FeederSquid)) are obviously violating the defined constraints, yet Dialyzer finds them okay. This is quite disappointing, because this is exactly the type of error I want to catch with a tool performing static analysis.
There is an explanation provided in the tutorial:
Before the functions are called with the wrong kind of feeder, they're
first called with the right kind. As of R15B01, Dialyzer would not
find an error with this code. The observed behaviour is that as soon
as a call to a given function succeeds within the function's body,
Dialyzer will ignore later errors within the same unit of code.
What is the rationale for this behavior? I understand that the philosophy behind success typing is "to never cry wolf", but in the current scenario Dialyzer plainly ignores the intentionally defined function specifications (after it sees that the functions have been called correctly earlier). I understand that the code does not result in a runtime crash. Can I somehow force Dialyzer to always take my function specifications seriously? If not, is there a tool that can do it?
-module(zoo).
-export([main/0]).
-type red_panda() :: bamboo | birds | eggs | berries.
-type squid() :: sperm_whale.
-type food(A) :: fun(() -> A).
-spec feeder(red_panda) -> food(red_panda());
(squid) -> food(squid()).
feeder(red_panda) ->
fun() ->
element(random:uniform(4), {bamboo, birds, eggs, berries})
end;
feeder(squid) ->
fun() -> sperm_whale end.
-spec feed_red_panda(food(red_panda())) -> red_panda().
feed_red_panda(Generator) ->
Food = Generator(),
io:format("feeding ~p to the red panda~n", [Food]),
Food.
-spec feed_squid(food(squid())) -> squid().
feed_squid(Generator) ->
Food = Generator(),
io:format("throwing ~p in the squid's aquarium~n", [Food]),
Food.
main() ->
%% Random seeding
<<A:32, B:32, C:32>> = crypto:rand_bytes(12),
random:seed(A, B, C),
%% The zoo buys a feeder for both the red panda and squid
FeederRP = feeder(red_panda),
FeederSquid = feeder(squid),
%% Time to feed them!
feed_squid(FeederSquid),
feed_red_panda(FeederRP),
%% This should not be right!
feed_squid(FeederRP),
feed_red_panda(FeederSquid).
Minimizing the example quite a bit I have these two versions:
First one that Dialyzer can catch:
-module(zoo).
-export([main/0]).
-type red_panda_food() :: bamboo.
-type squid_food() :: sperm_whale.
-spec feed_squid(fun(() -> squid_food())) -> squid_food().
feed_squid(Generator) -> Generator().
main() ->
%% The zoo buys a feeder for both the red panda and squid
FeederRP = fun() -> bamboo end,
FeederSquid = fun() -> sperm_whale end,
%% CRITICAL POINT %%
%% This should not be right!
feed_squid(FeederRP),
%% Time to feed them!
feed_squid(FeederSquid)
Then the one with no warnings:
[...]
%% CRITICAL POINT %%
%% Time to feed them!
feed_squid(FeederSquid)
%% This should not be right!
feed_squid(FeederRP).
Dialyzer's warnings for the version it can catch are:
zoo.erl:7: The contract zoo:feed_squid(fun(() -> squid_food())) -> squid_food() cannot be right because the inferred return for feed_squid(FeederRP::fun(() -> 'bamboo')) on line 15 is 'bamboo'
zoo.erl:10: Function main/0 has no local return
... and is a case of preferring to trust its own judgement against a user's tighter spec.
For the version it doesn't catch, Dialyzer assumes that the feed_squid/1 argument's type fun() -> bamboo is a supertype of fun() -> none() (a closure that will crash, which, if not called within feed_squid/1, is still a valid argument). After the types have been inferred, Dialyzer cannot know if a passed closure is actually called within a function or not.
Dialyzer still gives a warning if the option -Woverspecs is used:
zoo.erl:7: Type specification zoo:feed_squid(fun(() -> squid_food())) -> squid_food() is a subtype of the success typing: zoo:feed_squid(fun(() -> 'bamboo' | 'sperm_whale')) -> 'bamboo' | 'sperm_whale'
... warning that nothing prevents this function to handle the other feeder or any given feeder! If that code did check for the closure's expected input/output, instead of being generic, then I am pretty sure that Dialyzer would catch the abuse. From my point of view, it is much better if your actual code checks for erroneous input instead of you relying on type specs and Dialyzer (which never promised to find all the errors anyway).
WARNING: DEEP ESOTERIC PART FOLLOWS!
The reason why the error is reported in the first case but not the second has to do with the progress of module-local refinement. Initially the function feed_squid/1 has success typing (fun() -> any()) -> any(). In the first case the function feed_squid/1 will first be refined with just the FeederRP and will definitely return bamboo, immediately falsifying the spec and stopping further analysis of main/0. In the second, the function feed_squid/1 will first be refined with just the FeederSquid and will definitely return sperm_whale, then refined with both FeederSquid and FeederRP and return sperm_whale OR bamboo. When then called with FeederRP the expected return value success-typing-wise is sperm_whale OR bamboo. The spec then promises that it will be sperm_whale and Dialyzer accepts it. On the other hand, the argument should be fun() -> bamboo | sperm_whale success-typing-wise, it is fun() -> bamboo so that leaves it with just fun() -> bamboo. When that is checked against the spec (fun() -> sperm_whale), Dialyzer assumes that the argument could be fun() -> none(). If you never call the passed function within feed_squid/1 (something that Dialyzer's type system doesn't keep as information), and you promise in the spec that you will always return sperm_whale, everything is fine!
What can be 'fixed' is for the type system to be extended to note when a closure that is passed as an argument is actually used in a call and warn in cases where the only way to 'survive' some part of the type inference is to be fun(...) -> none().
(Note, I am speculating a bit here. I have not read the dialyzer code in detail).
A "Normal" full-fledged type checker has the advantage that type checking is decidable. We can ask "Is this program well-typed" and get either a Yes or a No back when the type checker terminates. Not so for the dialyzer. It is essentially in the business of solving the halting problem. The consequence is that there will be programs which are blatantly wrong, but still slips through the grips of the dialyzer.
However, this is not one of those cases :)
The problem is two-fold. A success type says "If this function terminates normally, what is its type?". In the above, our feed_red_panda/1 function terminates for any argument matching fun (() -> A) for an arbitrary type A. We could call feed_red_panda(fun erlang:now/0) and it should also work. Thus our two calls to the function in main/0 does not give rise to a problem. They both terminate.
The second part of the problem is "Did you violate the spec?". Note that often, specs are not used in the dialyzer as a fact. It infers the types itself and uses the inference patterns instead of your spec. Whenever a function is called, it is annotated with the parameters. In our case, it will be annotated with the two generator types: food(red_panda()), food(squid()). Then a function local analysis is made based on these annotations in order to figure out if you violated the spec. Since the correct parameters are present in the annotations, we must assume the function is used correctly in some part of the code. That it is also called with squids could be an artifact of code which are never called due to other circumstances. Since we are function-local we don't know and give the benefit of doubt to the programmer.
If you change the code to only make the wrong call with a squid-generator, then we find the spec-discrepancy. Because we know the only possible call site violates the spec. If you move the wrong call to another function, it is not found either. Because the annotation is still on the function and not on the call site.
One could imagine a future variant of the dialyzer which accounted for the fact that each call-site can be handled individually. Since the dialyzer is changing as well over time, it may be that it will be able to handle this situation in the future. But currently, it is one of the errors that will slip through.
The key is to notice that the dialyzer cannot be used as a "Checker of well-typedness". You can't use it to enforce structure on your programs. You need to do that yourself. If you would like more static checking, it would probably be possible to write a type checker for Erlang and run it on parts of your code base. But you will run into trouble with code upgrades and distribution, which are not easy to handle.
I am learning Erlang and trying to figure out how I can, and should, save state inside a process.
For example, I am trying to write a program that given a list of numbers in a file, tells me whether a number appears in that file. My approach is to uses two processes
cache which reads the content of the file into a set, then waits for numbers to check, and then replies whether they appear in the set.
is_member_loop(Data_file) ->
Numbers = read_numbers(Data_file),
receive
{From, Number} ->
From ! {self(), lists:member(Number, Numbers)},
is_member_loop(Data_file)
end.
client which sends numbers to cache and waits for the true or false response.
check_number(Number) ->
NumbersPid ! {self(), Number},
receive
{NumbersPid, Is_member} ->
Is_member
end.
This approach is obviously naive since the file is read for every request. However, I am quite new at Erlang and it is unclear to me what would be the preferred way of keeping state between different requests.
Should I be using the process dictionary? Is there a different mechanism I am not aware of for that sort of process state?
Update
The most obvious solution, as suggested by user601836, is to pass the set of numbers as a param to is_member_loop instead of the filename. It seems to be a common idiom in Erlang and there is a good example in the fantastic online book Learn you some Erlang.
I think, however, that the question still holds for more complex state that I'd want to preserve in my process.
Simple solution, you can pass to your function is_member_loop(Data_file) the list of numbers rather then the file name.
The best solution when you deal with a state consists in using a gen_server. To learn more you should take a look at records and gen_server behaviour (this may also be useful).
In practice:
1) start with a module (yourmodule.erl) based on gen_server behaviour
2) read your file in the init function of the gen_server and pass it as state field:
init([]) ->
Numbers = read_numbers(Data_file),
{ok, #state{numbers=Numbers}}.
3) write a function which will be used to trigger a call to the gen_server
check_number(Number) ->
gen_server:call(?MODULE, {check_number, Number}).
4) write the code in order to handle messages triggered from your function
handle_call({check_number, Number}, _From, #state{numbers=Numbers} = State) ->
Reply = lists:member(Number, Numbers)},
{reply, Reply, State};
handle_call(_Request, _From, State) ->
Reply = ok,
{reply, Reply, State}.
5) export from yourmodule.erl function check_number
-export([check_number/1]).
Two things to be explained about point 4:
a) we extract values inside the record State using pattern matching
b) As you may see I left the generic handle call, otherwise your gen_server will fail due to wrong pattern matching whenever a message different from {check_number, Number} is received
Note: if you are new to erlang, don't use process dictionary
Not sure how idiomatic this is, since I'm not exactly an Erlang pro yet, but I'd handle this by using ETS. Basically,
read_numbers_to_ets(DataFile) ->
Table = ets:new(numbers, [ordered_set]),
insert_numbers(Table, DataFile),
Table.
insert_numbers(Table, DataFile) ->
case read_next_number(DataFile) of
eof -> ok;
Num -> ets:insert(numbers, {Num})
end.
you could then define your is_member as
is_member(TableId, Number) ->
case ets:match(TableId, {Number}) of
[] -> false; %% no match from ets
[[]] -> true %% ets found the number you're looking for in that table
end.
Instead of taking a Data_file, your is_member_loop would take the id of the table to do a lookup on.
I often see Erlang functions return ok, or {ok, <some value>}, or {error, <some problem>}.
Suppose my function returns an integer N. Should my function return just N, or {ok, N}?
Or suppose my function includes the call io:function("Yada yada"). Should it return ok or nothing at all?
Or suppose I'm making a record or fun. Should I return {ok, R} or (ok, F}?
Thanks,
LRP
It's a matter of style, but making reasonable choices goes a long way toward making your code more readable and maintainable. Here are some thoughts based on my own preferences and what I see in the standard library:
If the function can return with both a success and a recoverable failure case, you may want the success case to look like {ok, _} or ok. Good examples are: orddict:find/2 and application:start/1. We do this so both cases can be easily pattern matched when making the call.
If the function has only a success case with a meaningful result, then just return the result as there is no need to add additional information. In fact if you were to wrap the result in a tuple, then you can't easily chain calls together e.g. foo(bar(N)) may not work if bar/1 returns {ok, _}. Good examples are: lists:nth/2 and math:cos/1.
If the function has only a success case without a meaningful result, the common idiom is to return ok or true as opposed to whatever the last returned value in the function happened to be. Good examples are: ets:delete/1 and io:format/1.
Suppose my function returns an integer N. Should my function return just N, or {ok, N}?
If there is no possibility of error at all, or any error indicates a programming error, N. If there may be a good reason to fail, return {ok, N} (and {error, Reason} in case of failure).
Should it return ok or nothing at all?
In Erlang, you can't return nothing at all. If you don't have a meaningful return value, ok will do fine.
This is a conundrum I find myself in many atimes. The rule of the thumb (atleast my thumb) is that if a function is to be used primarily with other functions (higher order functions like map(), filter() etc..) or if the function is designed to be "purely" functional (no side effects), then I would prefer to NOT return an {ok, _} and simply return the result. Errors in this case must be due to some logical/algorithmic error which should be fixed and the client of the function should be made aware of any such error.