Reading files in Erlang - erlang

I have an Erlang app that fetches key/value pairs from files.
In one of the files, instead of key/value pairs, I have a path to a file in a different folder, and that file contains the key/value pairs.
Reading the files works fine, but whenever I have to read the path from a file and then read the file it points to, the app complains that the file does not exist.
If I jump into the container and check, I can see the file and the subsequent file. So I guess I'm missing something.
Here is what I have:
get_kvs("keyvalue.properties"), %% <-- This works
{ok, PathToFile} = file:read_file("pathtofile.properties"),
get_kvs(PathToFile), %% <-- This crashes
Files:
keyvalue.properties:
key_1=val_1
key_2=val_2
key_3=val_3
pathtofile.properties:
/data/myfolder/hidden_keyvalue.properties
/data/myfolder/hidden_keyvalue.properties:
extra_key1=extra_val1
extra_key2=extra_val2
extra_key3=extra_val3
And the get_metadata function:
get_metadata(FileName) ->
    io:format(FileName),
    {ok, MetadataFile} = file:read_file(FileName),
    io:format(MetadataFile),
    Lines = binary:split(MetadataFile, <<"\n">>, [trim, global]),
    make_tuples(Lines, []).
make_tuples([Line|Lines], Acc) ->
    [Key, Value] = binary:split(Line, <<"=">>),
    make_tuples(Lines, [{Key, Value}|Acc]);
make_tuples([], Acc) -> lists:reverse(Acc).
Whenever running it I can see that the PathToFile is being properly populated, but when I try to read the path I get the error below:
keyvalue.propertiesextra_key_1=extra_val_1
extra_key_2=extra_val_2
extra_key_3=extra_val_3
/data/myfolder/hidden_keyvalue.properties
=CRASH REPORT==== 23-Mar-2022::07:46:30.612093 ===
crasher:
initial call: cowboy_stream_h:request_process/3
pid: <0.735.0>
registered_name: []
exception error: no match of right hand side value {error,enoent}
in function hello_handler:get_metadata/1 (/data/apps/hello_server/src/hello_handler.erl, line 40)
in call from hello_handler:child_call/1 (/data/apps/hello_server/src/hello_handler.erl, line 28)
Any ideas what I am missing?

After the question was edited to show the actual point of failure (by removing the try-catch), the error turns out to be {error, enoent}, which means the file does not exist. However, the same function works in other scenarios; it fails only when the path of the file to be read is itself taken from another file.
Make sure there are no additional characters, such as newlines or non-printable characters, after the content of the file, which should be exactly a valid path.
For example, when I tried a value such as <<"hidden_keyvalue.properties\n\n">>, read_file gave me the same result, {error, enoent}.
So it is possible that the file from which the path is read has additional characters, such as a trailing newline, at the end of its content.
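A minimal sketch of stripping such trailing characters before using the path. The module name path_util and both function names are hypothetical, and the sketch assumes OTP 20 or later, where string:trim/1 accepts binaries:

```erlang
-module(path_util).
-export([read_path/1, clean_path/1]).

%% Hypothetical helper: read a file whose whole content is a single path
%% and return that path without surrounding whitespace.
%% Crashes with a badmatch if the file cannot be read.
read_path(FileName) ->
    {ok, Bin} = file:read_file(FileName),
    clean_path(Bin).

%% string:trim/1 removes leading and trailing whitespace, including
%% "\n" and "\r\n", and returns a binary when given a binary.
clean_path(Bin) when is_binary(Bin) ->
    string:trim(Bin).
```

With this in place, the failing call could become get_kvs(path_util:read_path("pathtofile.properties")), assuming get_kvs accepts a binary file name as it does in the question.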
Ignore (Wrong Assumption)
I tried with a local setup and I think this line inside make_tuples is causing that behavior.
[Key, Value] = binary:split(Line, <<"=">>) %% <--- This match will succeed only if there are exactly 2 elements in the list produced by binary:split(...).
Given we are doing inside get_metadata
Lines = binary:split(MetadataFile, <<"\n">>, [trim, global]), %% <-- This splits the contents of the keyvalue.properties file into a List
make_tuples(Lines, [])
From the provided text, it appears that the content of the keyvalue.properties file is ...
key_1=val_1
key_2=val_2
key_3=val_3
/data/myfolder/hidden_keyvalue.properties %% <-- This isn't a valid key-value pair
And with that last line, make_tuples while doing a match will fail the following way...
[Key, Value] = binary:split(<<"/data/myfolder/hidden_keyvalue.properties">>, <<"=">>).
** exception error: no match of right hand side value [<<"/data/myfolder/hidden_keyvalue.properties">>]
That pattern-match expression requires exactly 2 elements on the right-hand side (the term/value side). The line with the path entry in keyvalue.properties does not contain an =, so the split produces a list with 1 element and the match fails.
To address this...
We can change the format of keyvalue.properties so that every line is a valid key-value pair, though I am not sure how feasible that is in the context of the program.
Or we can change the list pattern so that it accepts a varying number of elements:
make_tuples([Line|Lines], Acc) ->
    %% [Key|Value] matches a list of at least 1 element; [Key, Value] matches exactly 2
    [Key|Value] = binary:split(Line, <<"=">>), %% <-- Now Value is a list of binaries
    %% get_value/1 extracts the value binary from the head of that list
    make_tuples(Lines, [{Key, get_value(Value)}|Acc]);
make_tuples([], Acc) -> lists:reverse(Acc).
get_value([]) ->
    <<>>;
get_value([V|_]) -> V.
Assumptions
get_kvs invokes get_metadata which in turn invokes make_tuples.
keyvalue.properties file content has both valid key-value pairs and also some non-key-value entries

Related

Tests: check if tuple is returned

I am writing a test which checks response from gen_server. The response itself is either {profile, SomeProfileFromGenServer} or {error, ErrorResponse}
So I wanted to write a test which does:
Profile = mygenserver:get_profile(),
?assertEqual(Profile, {profile, SomeProfile})
I don't really care about the SomeProfile value, but the compiler says that SomeProfile is unbound :( Is there a way to fix it?
You can use ?assertMatch, with the first argument being a pattern:
?assertMatch({profile, _}, Profile)
assertMatch(GuardedPattern, Expr)
Evaluates Expr and matches the result against GuardedPattern, if testing is enabled. If the match fails, an informative exception will be generated; see the assert macro for further details. GuardedPattern can be anything that you can write on the left hand side of the -> symbol in a case-clause, except that it cannot contain comma-separated guard tests.
The main reason for using assertMatch also for simple matches, instead of matching with =, is that it produces more detailed error messages.
Examples:
?assertMatch({found, {fred, _}}, lookup(bloggs, Table))
?assertMatch([X|_] when X > 0, binary_to_list(B))
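As a rough sketch of how this could look in a complete EUnit module: the module name profile_tests and the local get_profile/0 are hypothetical stand-ins for mygenserver:get_profile/0, so the example runs without a gen_server.

```erlang
-module(profile_tests).
-include_lib("eunit/include/eunit.hrl").

%% Hypothetical stand-in for mygenserver:get_profile/0.
get_profile() ->
    {profile, #{name => "fred"}}.

profile_shape_test() ->
    Profile = get_profile(),
    %% Only the tag is checked; `_` accepts any profile value.
    ?assertMatch({profile, _}, Profile).

guarded_match_test() ->
    %% A guard can further constrain what the pattern accepts.
    ?assertMatch({profile, P} when is_map(P), get_profile()).
```

The tests can be run from the shell with eunit:test(profile_tests).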

Proper Call of runParser Haskell [duplicate]

Right now I have two types:
type Rating = (String, Int)
type Film = (String, String, Int, [Rating])
I have a file that has this data in it:
"Blade Runner"
"Ridley Scott"
1982
("Amy",5), ("Bill",8), ("Ian",7), ("Kevin",9), ("Emma",4), ("Sam",7), ("Megan",4)
"The Fly"
"David Cronenberg"
1986
("Megan",4), ("Fred",7), ("Chris",5), ("Ian",0), ("Amy",6)
How can I look through the file, storing all of the entries in something like FilmDatabase = [Film]?
Haskell provides a unique way of sketching out your approach. Begin with what you know
module Main where
type Rating = (String, Int)
type Film = (String, String, Int, [Rating])
main :: IO ()
main = do
  films <- readFilms "ratings.dat"
  print films
Attempting to load this program into ghci will produce
films.hs:8:12: Not in scope: `readFilms'
It needs to know what readFilms is, so add just enough code to keep moving.
readFilms = undefined
It is a function that should do something related to Film data. Reload this code (with the :reload command or :r for short) to get
films.hs:9:3:
Ambiguous type variable `a0' in the constraint:
(Show a0) arising from the use of `print'
...
The type of print is
Prelude> :t print
print :: Show a => a -> IO ()
In other words, print takes a single argument that, informally, knows how to show itself (that is, convert its contents to a string) and creates an I/O action that when executed outputs that string. It’s more-or-less how you expect print to work:
Prelude> print 3
3
Prelude> print "hi"
"hi"
We know that we want to print the Film data from the file, but, although good, ghc can’t read our minds. But after adding a type hint
readFilms :: FilePath -> Film
readFilms = undefined
we get a new error.
films.hs:8:12:
Couldn't match expected type `IO t0'
with actual type `(String, String, Int, [Rating])'
Expected type: IO t0
Actual type: Film
In the return type of a call of `readFilms'
In a stmt of a 'do' expression: films <- readFilms "ratings.dat"
The error tells you that the compiler is confused about your story. You said readFilms should give it back a Film, but the way you called it in main, the computer should have to first perform some I/O and then give back Film data.
In Haskell, this is the difference between a pure string, say "JamieB", and a side effect, say reading your input from the keyboard after prompting you to input your Stack Overflow username.
So now we know we can sketch readFilms as
readFilms :: FilePath -> IO Film
readFilms = undefined
and the code compiles! (But we can’t yet run it.)
To dig down another layer, pretend that the name of a single movie is the only data in ratings.dat and put placeholders everywhere else to keep the typechecker happy.
readFilms :: FilePath -> IO Film
readFilms path = do
  alldata <- readFile path
  return (alldata, "", 0, [])
This version compiles, and you can even run it by entering main at the ghci prompt.
dave4420’s answer has great hints about other functions to use. Think of the method above as putting together a jigsaw puzzle where the individual pieces are functions. For your program to be correct, all the types must fit together. You can make progress toward your final working program by taking little baby steps as above, and the typechecker will let you know if you have a mistake in your sketch.
Things to figure out:
How do you convert the whole blob of input to individual lines?
How do you figure out whether the line your program is examining is a title, a director, and so on?
How do you convert the year in your file (a String) to an Int to cooperate with your definition of Film?
How do you skip blank or empty lines?
How do you make readFilms accumulate and return a list of Film data?
Is this homework?
You might find these functions useful:
readFile :: FilePath -> IO String
lines :: String -> [String]
break :: (a -> Bool) -> [a] -> ([a], [a])
dropWhile :: (a -> Bool) -> [a] -> [a]
null :: [a] -> Bool
read :: Read a => String -> a
Remember that String is the same as [Char].
Some clues:
dropWhile null will get rid of empty lines from the start of a list
break null will split a list into the leading run of non-empty lines, and the rest of the list
Haskell has a great way of using the types to find the right function. For instance, in Greg's answer, he wants you to figure out (among other things) how to convert the year of the film from a String to an Int. Well, you need a function. What should be the type of that function? It takes a String and returns an Int, so the type should be String -> Int. Once you have that, go to Hoogle and enter that type. This will give you a list of functions with similar types. The function you need actually has a slightly different type, Read a => String -> a, so it is a bit down the list, but guessing a type and then scanning the resulting list is often a very useful strategy.

Erlang syntax in function

I have found this code in Ejabberd:
maybe_post_request([$< | _ ] = Data, Host, ClientIp)
I don't understand what the [$< | _ ] = Data part does with Data. Could somebody explain?
The construct
[$< | _] = Data
applies a pattern match to Data, expecting it to be a list whose first element is the character < and ignoring the rest of the elements. Try it in the Erlang shell:
1> Data = "<foo>".
"<foo>"
2> [$<|_] = Data.
"<foo>"
But if Data doesn't match, we get an exception:
3> f(Data), Data = "foo".
"foo"
4> [$<|_] = Data.
** exception error: no match of right hand side value "foo"
I don't understand what [$< | _ ] = Data part do with Data. Could
somebody explain?
It binds the variable Data to the entire first argument to the function.
The left-hand side pattern matches the first argument, so this function clause only matches when the first argument is a string (list) starting with the character <. The variable Data is bound to the entire string for use in the function body.
It's a way of having your cake and eating it at the same time. Data refers to the whole thing, while [$<|_] lets you match it and pull it apart. Putting them together with = in a pattern allows you to do both. In a pattern like this it is generally called an alias. It means that both sides must match, and in an argument in a function head (which is where you saw it) the order is irrelevant, so the function head could have been written as
maybe_post_request([$< | _ ] = Data, Host, ClientIp)
or
maybe_post_request(Data = [$< | _ ], Host, ClientIp)
Of course in the function body or in the shell they are not equivalent.
I personally prefer the first alternative as that says matching, pulling apart to me.
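A small sketch of an alias pattern in a function head. The module alias_demo and the function classify/1 are made up for illustration and are not part of ejabberd:

```erlang
-module(alias_demo).
-export([classify/1]).

%% The alias `[$< | _] = Data` both checks that the argument is a list
%% starting with `<` and binds the whole argument to Data.
classify([$< | _] = Data) ->
    {xml, Data};
%% Any other argument falls through to this clause.
classify(Data) ->
    {other, Data}.
```

Calling alias_demo:classify("<foo>") selects the first clause, while alias_demo:classify("foo") falls through to the second.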

Erlang - Parse data from the enclosed curly braces

Erlang experts, I am getting a data like the following from ejabberd server
I(<0.397.0>:mod_http_offline:38) : Data of Fromu {jid,"timok","localhost",
"25636221451404911062246700",
"timok","localhost",
"25636221451404911062246700"}
I am very confused about this data type. All I need is to get timok from inside the curly braces {}, but I am not sure how to get the value. Any code to get the value would be very helpful. Currently I am printing the values using the code below:
?INFO_MSG("Data of Fromu ~p",[_From]),
Thanks once again for your time and effort.
That's an Erlang record (it's a tuple: the first element is an atom, the other elements are lists/strings/binaries).
Recommended:
Ejabberd has a jid record definition (line 411):
-record(jid, {user = <<"">> :: binary(),
server = <<"">> :: binary(),
resource = <<"">> :: binary(),
luser = <<"">> :: binary(),
lserver = <<"">> :: binary(),
lresource = <<"">> :: binary()}).
It's in the ejabberd/include/jlib.hrl file, so you should be able to make it known to your module by including it this way:
-include_lib("ejabberd/include/jlib.hrl").
Now, in your module to access the (first) "timok" element of your data, you can use the erlang record syntax (assuming JidData contains the data mentioned above):
Out = JidData#jid.user.
Not recommended:
As records are, behind their appearance, tuples, you can also access the nth element of the tuple
Out = element(2,JidData).
Or simply use pattern matching:
{_, Out, _, _, _, _, _} = JidData.
Use Record Definitions
A record is basically syntactic sugar on top of a tuple. It remains a tuple and can be treated as such. Records are easy to work with, but you should do what you can to avoid treating a record as a tuple, unless you really know what you're doing.
And because in this case you don't even control the record definition, you really should use it, otherwise changes in the definition, following an update, will invalidate your code.
You seem to be trying to access the second element of the tuple stored in the variable _From. You can get it with pattern matching (the jid tuple has seven elements: the record tag plus six fields):
{_, Username, _, _, _, _, _} = _From
Since you are using the variable, it should not have an underscore in front of it. In your code, change _From to From.
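A self-contained sketch of the record-based approach. The record definition here is a local copy of the field names only, declared so the example compiles on its own; in a real ejabberd module you would include the header rather than redefine the record. The module name jid_demo and both functions are hypothetical:

```erlang
-module(jid_demo).
-export([username/1, example/0]).

%% Assumption: a local stand-in for ejabberd's jid record, with the same
%% field order, for illustration only.
-record(jid, {user, server, resource, luser, lserver, lresource}).

%% Match on the record in the function head and bind only the user field.
username(#jid{user = User}) ->
    User.

%% The tuple printed in the question has the shape of a jid record,
%% so it matches the record pattern above.
example() ->
    From = {jid, "timok", "localhost", "25636221451404911062246700",
            "timok", "localhost", "25636221451404911062246700"},
    username(From).
```

Matching by field name keeps working if fields are added to the record later, which the positional tuple pattern does not.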

reading file whole flat text file to an array

I'm working with Erlang, writing an escript, and I've seen many examples of file I/O that are not so easy to follow, so I found this:
Text = file:read_file("f.txt"),
io:format("~n", Text).
It works somehow: it does print the file contents, followed by multiple errors
in call from erl_eval:do_apply/6 (erl_eval.erl, line 572)
in call from escript:eval_exprs/5 (escript.erl, line 850)
in call from erl_eval:local_func/5 (erl_eval.erl, line 470)
in call from escript:interpret/4 (escript.erl, line 768)
in call from escript:start/1 (escript.erl, line 277)
in call from init:start_it/1 (init.erl, line 1050)
in call from init:start_em/1 (init.erl, line 1030)
So what would be the easiest way to read the whole file and store the contents in an array or list for later use?
First, file:read_file/1 will return {ok, Binary} on success, where Binary is a binary representing the contents of the file. On error, {error, Reason} is returned. Thus your Text variable is actually a tuple. The easy fix (crashing if there is an error):
{ok, Text} = file:read_file("f.txt")
Next, the first argument to io:format/2 is a format string. ~n is a format that means "newline", but you haven't given it a format that means anything else, so it's not expecting Text as an argument. Furthermore, all arguments to the format string should be in a list passed as the second argument. ~s means string, so:
io:format("~s~n", [Text])
will print out the entire file, followed by a newline. If you want to pass multiple arguments, it would look something like:
io:format("The number ~B and the string ~s~n", [100, "hello"])
Notice how there are only two arguments to io:format/2; one just happens to be a list containing multiple entries.
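To get the contents into a list for later use, one option is to split the binary into lines. This is a sketch rather than the only way to do it; the module name read_lines and its functions are hypothetical, and the trim_all option assumes OTP 18 or later:

```erlang
-module(read_lines).
-export([from_file/1, from_binary/1]).

%% Read the whole file and return its lines as a list of binaries.
%% Crashes with a badmatch if the file cannot be read.
from_file(FileName) ->
    {ok, Bin} = file:read_file(FileName),
    from_binary(Bin).

%% Split on "\n"; trim_all drops empty parts, so blank lines and the
%% trailing newline do not produce empty entries.
from_binary(Bin) ->
    binary:split(Bin, <<"\n">>, [global, trim_all]).
```

Each element of the result is a binary; binary_to_list/1 converts one to a string if a character list is needed.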
Since your question asked for an easy way to read the contents of a file into a data structure, you might enjoy file:consult/1. This solution assumes you have control over the format of the file, since consult/1 expects the file to consist of Erlang terms, each terminated with '.'. It returns {ok, [term()]} | {error, Reason}.
So, if your file, t.txt, consisted of lines terminated by '.' as follows:
'this is an atom'.
{person, "john", "smith"}.
[1,2,3].
then you could utilize file:consult/1
1> file:consult("c:/t.txt").
{ok,['this is an atom',{person,"john","smith"},[1,2,3]]}

Resources