Reading a whole flat text file into an array - Erlang

I'm working with Erlang, writing an escript. I've seen many file I/O examples, but they were not easy to follow, so I found this:
Text = file:read_file("f.txt"),
io:format("~n", Text).
This works somehow: it does print the file contents, followed by multiple errors:
in call from erl_eval:do_apply/6 (erl_eval.erl, line 572)
in call from escript:eval_exprs/5 (escript.erl, line 850)
in call from erl_eval:local_func/5 (erl_eval.erl, line 470)
in call from escript:interpret/4 (escript.erl, line 768)
in call from escript:start/1 (escript.erl, line 277)
in call from init:start_it/1 (init.erl, line 1050)
in call from init:start_em/1 (init.erl, line 1030)
So what would be the easiest way to read the whole file and store the contents in an array or list for later use?

First, file:read_file/1 will return {ok, Binary} on success, where Binary is a binary representing the contents of the file. On error, {error, Reason} is returned. Thus your Text variable is actually a tuple. The easy fix (crashing if there is an error):
{ok, Text} = file:read_file("f.txt")
Next, the first argument to io:format/2 is a format string. ~n is a format that means "newline", but you haven't given it a format that means anything else, so it's not expecting Text as an argument. Furthermore, all arguments to the format string should be in a list passed as the second argument. ~s means string, so:
io:format("~s~n", [Text])
will print out the entire file, followed by a newline. If you want to pass multiple arguments, it would look something like:
io:format("The number ~B and the string ~s~n", [100, "hello"])
Notice how there are only two arguments to io:format/2; one just happens to be a list containing multiple entries.
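To get from the single binary to the "array or list" the question asks for, one option is to split the binary on newlines. The following is a minimal sketch, not the only way to do it: the module name read_lines and both function names are made up for illustration, and it assumes Unix line endings (with Windows line endings each line would keep a trailing \r).

```erlang
-module(read_lines).
-export([read_lines/1, lines_from_binary/1]).

%% Read a whole file and return its lines as a list of binaries.
%% Returns {ok, Lines} or {error, Reason}, mirroring file:read_file/1.
read_lines(FileName) ->
    case file:read_file(FileName) of
        {ok, Binary} -> {ok, lines_from_binary(Binary)};
        {error, Reason} -> {error, Reason}
    end.

%% 'global' splits at every newline; 'trim' drops empty trailing parts,
%% such as the one left after the final newline.
lines_from_binary(Binary) ->
    binary:split(Binary, <<"\n">>, [global, trim]).
```

With this, {ok, Lines} = read_lines:read_lines("f.txt") binds Lines to a list of binaries, one per line, which can be kept around for later use.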

Since your question asked for an easy way to read the contents of a file into a data structure, you might enjoy file:consult/1. This solution assumes you have control over the format of the file, since consult/1 expects the file to consist of Erlang terms, each terminated by '.'. It returns {ok, Terms} on success, where Terms is a list of the terms read, or {error, Reason} on failure.
So, if your file, t.txt, consisted of terms terminated by '.' as follows:
'this is an atom'.
{person, "john", "smith"}.
[1,2,3].
then you could use file:consult/1:
1> file:consult("c:/t.txt").
{ok,['this is an atom',{person,"john","smith"},[1,2,3]]}
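A small round trip may make the file format concrete. This is only a sketch: the module name consult_demo and the function roundtrip/1 are hypothetical, and the file is written to whatever path the caller passes in.

```erlang
-module(consult_demo).
-export([roundtrip/1]).

%% Write three dot-terminated terms to FileName, then read them back.
roundtrip(FileName) ->
    Terms = ['this is an atom', {person, "john", "smith"}, [1, 2, 3]],
    %% ~p prints each term in Erlang syntax; the "." after it ends the term.
    Text = [io_lib:format("~p.~n", [T]) || T <- Terms],
    ok = file:write_file(FileName, Text),
    %% Terms is already bound, so this match also checks that the
    %% terms read back are equal to the ones written.
    {ok, Terms} = file:consult(FileName),
    Terms.
```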

Related

Reading files in Erlang

I have an Erlang app that fetches some key/value pair from files.
In one of the files, instead of the key/value pairs, I have a path to a different folder, in which it has the key/value pairs.
Reading the files works fine, but whenever I have to read the path from a file and only then read the subsequent file, the app complains that the file is nonexistent.
If I jump into the container and check, I can see the file and the subsequent file. So I guess I'm missing something.
Here is what I have:
get_kvs("keyvalue.properties"), %% <-- This works
{ok, PathToFile} = file:read_file("pathtofile.properties"),
get_kvs(PathToFile), %% <-- This crashes
Files:
keyvalue.properties:
key_1=val_1
key_2=val_2
key_3=val_3
pathtofile.properties:
/data/myfolder/hidden_keyvalue.properties
/data/myfolder/hidden_keyvalue.properties:
extra_key1=extra_val1
extra_key2=extra_val2
extra_key3=extra_val3
And the get_metadata function:
get_metadata(FileName) ->
    io:format(FileName),
    {ok, MetadataFile} = file:read_file(FileName),
    io:format(MetadataFile),
    Lines = binary:split(MetadataFile, <<"\n">>, [trim, global]),
    make_tuples(Lines, []).
make_tuples([Line|Lines], Acc) ->
    [Key, Value] = binary:split(Line, <<"=">>),
    make_tuples(Lines, [{Key, Value}|Acc]);
make_tuples([], Acc) -> lists:reverse(Acc).
Whenever running it I can see that the PathToFile is being properly populated, but when I try to read the path I get the error below:
keyvalue.propertiesextra_key_1=extra_val_1
extra_key_2=extra_val_2
extra_key_3=extra_val_3
/data/myfolder/hidden_keyvalue.properties
=CRASH REPORT==== 23-Mar-2022::07:46:30.612093 ===
crasher:
initial call: cowboy_stream_h:request_process/3
pid: <0.735.0>
registered_name: []
exception error: no match of right hand side value {error,enoent}
in function hello_handler:get_metadata/1 (/data/apps/hello_server/src/hello_handler.erl, line 40)
in call from hello_handler:child_call/1 (/data/apps/hello_server/src/hello_handler.erl, line 28)
Any ideas what I am missing?
After the question was modified to reflect the actual point of failure by removing the try-catch, the error turns out to be {error, enoent}, which means the file does not exist. However, the same function works in some scenarios and fails only when the path of the file to be read is itself taken from another file.
Just make sure there are no additional characters, such as newlines or non-printable characters, after the content of the file, which should be a valid path.
For example, when I tried with a value such as <<"hidden_keyvalue.properties\n\n">>, read_file gave me the same result, {error, enoent}.
So it is possible that the content of the file from which the path is read has additional non-printable characters at the end.
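If trailing whitespace is indeed the cause, stripping it before using the path should be enough. Below is a minimal sketch, assuming OTP 20 or later (where string:trim/1 accepts binaries); the module name path_from_file and its functions are hypothetical and not part of the original code.

```erlang
-module(path_from_file).
-export([read_path/1, clean_path/1]).

%% Read a file whose content is a single path and return the cleaned path.
%% Crashes with a badmatch if the file cannot be read.
read_path(FileName) ->
    {ok, Raw} = file:read_file(FileName),
    clean_path(Raw).

%% string:trim/1 removes leading and trailing whitespace, including
%% "\n" and "\r\n", and returns a binary when given a binary.
clean_path(Raw) ->
    string:trim(Raw).
```

In the question's code this would amount to calling get_kvs(string:trim(PathToFile)) instead of get_kvs(PathToFile).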
Ignore (Wrong Assumption)
I tried with a local setup and I think this line inside make_tuples is causing that behavior.
[Key, Value] = binary:split(Line, <<"=">>) %% <--- This match will succeed only if there are exactly 2 elements in the list produced by binary:split(...).
Given that inside get_metadata we do:
Lines = binary:split(MetadataFile, <<"\n">>, [trim, global]), %% <-- This splits the contents of the keyvalue.properties file into a List
make_tuples(Lines, [])
From the provided text, it appears that the content of the keyvalue.properties file is:
key_1=val_1
key_2=val_2
key_3=val_3
/data/myfolder/hidden_keyvalue.properties %% <-- This isn't a valid key-value pair
And with that last line, make_tuples while doing a match will fail the following way...
[Key, Value] = binary:split(<<"/data/myfolder/hidden_keyvalue.properties">>, <<"=">>).
** exception error: no match of right hand side value [<<"/data/myfolder/hidden_keyvalue.properties">>]
That pattern-match expression requires exactly 2 elements on the right-hand side (the term/value side). Since the line with the path entry in keyvalue.properties does not contain an =, the split produces a list with 1 element, so the match fails.
To address this...
We can change the format of keyvalue.properties so that every line is a valid key-value pair, but I'm not sure how feasible that is in the context of the program.
Or we can change the list pattern so that it accepts a varying number of elements when matching:
make_tuples([Line|Lines], Acc) ->
    %% [Key|Value] matches a list of at least 1 element; [Key, Value] matches exactly 2 elements
    [Key|Value] = binary:split(Line, <<"=">>), %% <-- Now Value is a list of binaries
    %% get_value/1 extracts the value binary from the head of the Value list
    make_tuples(Lines, [{Key, get_value(Value)}|Acc]);
make_tuples([], Acc) -> lists:reverse(Acc).
get_value([]) ->
    <<>>;
get_value([V|_]) -> V.
Assumptions
get_kvs invokes get_metadata which in turn invokes make_tuples.
keyvalue.properties file content has both valid key-value pairs and also some non-key-value entries

Erlang printing hex instead of integer

I have been trying to fix a problem for hours now; I'm very new to Erlang.
lists:sublist([6,9,15,24,39,6,96],7,1).
I want this to print "100" instead of "d"
What am I doing wrong here?
The shell is going to try to print strings as strings whenever it would be legal. That means lists of integers that happen to all be valid characters will be printed as characters, and lists that contain other things will be printed as lists:
1> [65,66,67].
"ABC"
2> [3,65,66,67].
[3,65,66,67]
But notice that I did not actually call any output functions. That was just the shell's convenience operation of implicitly echoing whatever a returned value was so you, as a programmer, can inspect it.
If I want to explicitly call an output function I should use a format string that specifies the nature of the values to be interpolated:
3> List = [65,66,67], io:format("This is a list: ~tw~n", [List]).
This is a list: [65,66,67]
ok
4> io:format("This is a list rendered as an implied string: ~tp~n", [List]).
This is a list rendered as an implied string: "ABC"
ok
5> io:format("This is a string: ~ts~n", [List]).
This is a string: ABC
ok
Note the additional atom ok after each print. That is because the return value from io:format/2 is ok. So we are getting the explicit output from format/2 and then seeing its return value.
The io module doc page has the gritty details: http://erlang.org/doc/man/io.html#format-1
Back to your example...
6> lists:sublist([6,9,15,24,39,6,96],7,1).
"`"
7> io:format("~tw~n", [lists:sublist([6,9,15,24,39,6,96],7,1)]).
[96]
ok
Addendum
There is a setting called shell:strings/1 that tells the shell to turn string formatting on and off:
1> [65,66,67].
"ABC"
2> shell:strings(false).
true
3> [65,66,67].
[65,66,67]
4> <<65,66,67>>.
<<65,66,67>>
5> shell:strings(true).
false
6> <<65,66,67>>.
<<"ABC">>
But I don't mess with this setting ever anymore for a few reasons:
It is almost never worth the effort to remember this detail of the shell (convenience output from the shell is mostly useful for discovering return value structures, not specific values held by those structures -- and when you want that data you usually want strings printed as strings anyway).
It can cause surprising shell output in any case where you really are dealing with strings.
This is almost never the behavior you actually want.
When dealing with real programs you will need actual output functions using io or io_lib modules, and developing habits around format strings is much more useful than worrying over convenience output from the shell.
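The difference between the three format directives can also be captured as strings with io_lib:format/2, which is handy when the result should go somewhere other than the terminal. This is a small sketch; the module name fmt_demo and its function names are made up for illustration.

```erlang
-module(fmt_demo).
-export([as_list/1, as_term/1, as_string/1]).

%% ~tw always writes the raw term, so a list of integers stays a list.
as_list(L) -> lists:flatten(io_lib:format("~tw", [L])).

%% ~tp pretty-prints, showing printable lists as quoted strings.
as_term(L) -> lists:flatten(io_lib:format("~tp", [L])).

%% ~ts treats the argument as a string and emits its characters.
as_string(L) -> lists:flatten(io_lib:format("~ts", [L])).
```

For example, fmt_demo:as_list([96]) returns the string "[96]", while fmt_demo:as_string([65,66,67]) returns "ABC".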

Setting the length at which ~P wraps the text in Erlang

Is there a way to make Erlang print the full string even if one has used ~P in an io:format call?
I'm having some trouble with EDoc: it keeps wrapping the error messages to "...".
Are there any flags, or some other way, to force Erlang to print the entire string?
The only method I have found is to use io_lib:print(Term, Column, LineLength, Depth). That function allows you to specify the starting column, the line length to control wrapping, etc. It returns a string which you can then print using io:format with ~s format.
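As a rough sketch of that approach: io_lib:print/4 treats a Depth of -1 as "no limit", so nothing is replaced by "...". The module name print_full, the function names, and the default line length of 80 are assumptions made for this example.

```erlang
-module(print_full).
-export([full/1, full/2]).

%% Render Term completely, wrapping lines at 80 columns.
full(Term) ->
    full(Term, 80).

%% Column 1 is the starting column; Depth -1 disables depth truncation.
full(Term, LineLength) ->
    lists:flatten(io_lib:print(Term, 1, LineLength, -1)).
```

The result is an ordinary string, so it can be printed with io:format("~ts~n", [print_full:full(Term)]).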

Why is this bad UTF-8 character error caused while creating a document in CouchDB?

Creating a document in CouchDB generates the following error:
12> ADoc.
[{<<"Adress">>,<<"Hjalmar Brantingsgatan 7 C">>},
{<<"District">>,<<"Brämaregården">>},
{<<"Rent">>,3964},
{<<"Rooms">>,2},
{<<"Area">>,0}]
13> IDoc.
[{<<"Adress">>,<<"Segeparksgatan 2A">>},
{<<"District">>,<<"Kirseberg">>},
{<<"Rent">>,9701},
{<<"Rooms">>,3},
{<<"Area">>,83}]
14> erlang_couchdb:create_document({"127.0.0.1", 5984}, "proto_v1", IDoc).
{json,{struct,[{<<"ok">>,true},
{<<"id">>,<<"c6d96b5f923f50bfb9263638d4167b1e">>},
{<<"rev">>,<<"1-0d17a3416d50129328f632fd5cfa1d90">>}]}}
15> erlang_couchdb:create_document({"127.0.0.1", 5984}, "proto_v1", ADoc).
** exception exit: {ucs,{bad_utf8_character_code}}
in function xmerl_ucs:from_utf8/1 (xmerl_ucs.erl, line 185)
in call from mochijson2:json_encode_string/2 (/Users/admin/AlphaGroup/src/mochijson2.erl, line 200)
in call from mochijson2:'-json_encode_proplist/2-fun-0-'/3 (/Users/admin/AlphaGroup/src/mochijson2.erl, line 181)
in call from lists:foldl/3 (lists.erl, line 1197)
in call from mochijson2:json_encode_proplist/2 (/Users/admin/AlphaGroup/src/mochijson2.erl, line 184)
in call from erlang_couchdb:create_document/3 (/Users/admin/AlphaGroup/src/erlang_couchdb.erl, line 256)
Of the two documents above, one (IDoc) can be created in CouchDB with no problem.
Can anyone help me figure out why this happens?
I think the problem is in <<"Brämaregården">>. The Unicode text first needs to be converted to a UTF-8 binary. Examples are in the following links:
the unicode discussion, and the core function in the unicode module.
Entering non-ASCII characters in Erlang code is fiddly, not the least because it works differently in the shell than in compiled Erlang code.
Try inputting the binary explicitly as UTF-8:
<<"Br", 16#c3, 16#a4, "mareg", 16#c3, 16#a5, "rden">>
That is, "ä" is represented by the bytes C3 A4 in UTF-8, and "å" by C3 A5. There are many ways to find those codes; a quick search turned up this table.
Normally you'd get the input from somewhere outside your code, e.g. reading from a file, typed into a web form etc, and then you wouldn't have this problem.
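Instead of spelling out the UTF-8 bytes by hand, the conversion can be left to unicode:characters_to_binary/1, which encodes a list of Unicode code points as UTF-8 by default. The sketch below writes the two non-ASCII characters as explicit code points so that it does not depend on the encoding of the source file or the shell; the module name utf8_demo and the function district/0 are made up for illustration.

```erlang
-module(utf8_demo).
-export([district/0]).

%% Build "Brämaregården" from code points and encode it as UTF-8.
district() ->
    %% 16#E4 is the code point of "ä", 16#E5 the code point of "å".
    Chars = "Br" ++ [16#E4] ++ "mareg" ++ [16#E5] ++ "rden",
    unicode:characters_to_binary(Chars).
```

The returned binary contains the same bytes as the hand-written one above and could be used as the value for <<"District">> in ADoc.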

Haskell -> After parsing how to work with strings

Hello
After doing the parsing with a script in Haskell, I got a file with the 'appearance' of lists of strings. However, when I read the file content with getContents or hGetContents, I get a plain String with lines (schematically, what I want is: "[" aaa "," bbb "" ccc "]" -> ["aaa", "bbb" "ccc"]). I have tried the read function, but without results. I need to work with these lists of strings in order to concatenate them all into a single list.
I'm using the lines function, but I think it only 'works' one line at a time, doesn't it?
What I need is a function that verifies whether an element of one line is repeated on another line. If I had a list of lists of strings it would be easier (but what I have is a line of a string that looks like a list of strings).
Regards
Thanks.
"I have tried with the read function but without results"
Just tested, and it works fine:
Prelude> read "[\"aaa\",\"bbb\",\"ccc\"]" :: [String]
["aaa","bbb","ccc"]
Note that you need to give the return type explicitly, since it can't be determined from the type of the argument.
I think the function you are looking for is the lines function from Data.List (reexported by the Prelude) that breaks up a multi-line string into a list of strings.
As I understand it, what you can do is:
create a function that receives a list of lists, where each inner list is one line of the input string, and checks whether an element of one line occurs in another line;
then pass it the entire string, separated into lines using lines.
