Can I set a particular byte in a string? - erlang

I have a very long string returned from os:cmd. My executable's output contains some characters with code 4, so I replaced them with another character and put metadata at the beginning of the output. Now I want to replace the characters back. What is the quickest way to do it?

I'm an Erlang noob, so this answer is most likely not the best answer. There's probably a function that does this in a chapter I haven't reached yet in the Erlang Programming book. However, I think this does what you want:
-module(replace).
-export([replace/3]).
replace([], _, _) -> [];
replace([OldChar | T], OldChar, NewChar) -> [NewChar | replace(T, OldChar, NewChar)];
replace([H | T], OldChar, NewChar) -> [H | replace(T, OldChar, NewChar)].
It just goes through the list (your string) and replaces the old character with the new one. It doesn't handle I18N. There are probably faster ways to do this. It will let you do this:
24> replace:replace([48,49,50,51,52,53,54,55,56,57], 53, 45).
"01234-6789"
or this:
28> replace:replace("39582049867", 57, 45).
"3-58204-867"
In terms of the quickest way - I'm going to guess that would be a provided function. If not, you'll have to code it up different ways and run the numbers.

Erlang strings are lists. Erlang lists are immutable. So you can't change particular bytes within a string, you can only generate another string with these bytes replaced.
Either replace the characters again (using map), or pass the original string around.
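As a minimal sketch of the "replace using map" option, the module below builds a new string with lists:map/2. The module name replace_map and the function name replace/3 are made up for this illustration; only lists:map/2 comes from the standard library.

```erlang
-module(replace_map).
-export([replace/3]).

%% Returns a new list in which every occurrence of Old is replaced by New.
%% The input list is left untouched, since Erlang lists are immutable.
replace(String, Old, New) ->
    lists:map(fun(C) when C =:= Old -> New;
                 (C) -> C
              end, String).
```

Calling replace_map:replace("39582049867", $9, $-) gives the same result as the hand-written recursive version above.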

Related

Can I match against a string that contains non-ASCII characters?

I am writing a program in which I am dealing with strings of the form, e.g., "\001SOURCE\001". That is, the strings contain alphanumeric text with an ASCII character of value 1 at each end. I am trying to write a function to match strings like these. I have tried a match like this:
handle(<<1,"SOURCE",1>>) -> ok.
But the match does not succeed. I have tried a few variations on this theme, but all have failed.
Is there a way to match a string that contains mostly alphanumeric text, with the exception of a non-alpha character at each end?
You can also do the following
[1] ++ "SOURCE" ++ [1] == "\001SOURCE\001".
Or convert to binary using list_to_binary and pattern match as
<<1,"SOURCE",1>> == <<"\001SOURCE\001">>.
Strings are syntactic sugar for lists. Lists are a type and binaries are a different type, so your match isn't working out because you're trying to match a list against a binary (same problem if you tried to match {1, "STRING", 1} to it, tuples aren't lists).
Remembering that strings are lists, we have a few options:
handle([1,83,84,82,73,78,71,1]) -> ok.
This will work just fine. Another, more readable (but uglier, sort of) way is to use character literals:
handle([1, $S,$T,$R,$I,$N,$G, 1]) -> ok.
Yet another way would be to strip the non-character values, and then pass that on to a handler:
handle(String) -> dispatch(string:strip(String, both, 1)).
dispatch("STRING") -> do_stuff();
dispatch("OTHER") -> do_other_stuff().
And, if at all possible, the best case is to stop using strings for text values entirely and process binaries directly instead. The syntax of binaries is much friendlier, they take up far fewer resources, and quite a few binary operations are significantly more efficient than their string/list counterparts. But that doesn't fit every case! (But it's awesome when dealing with sockets...)
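To illustrate processing binaries directly, here is a small sketch. The module name bin_handle, the function names and the returned atoms are assumptions for the example; the pattern <<1,"SOURCE",1>> is the one from the question, and list_to_binary/1 is the standard BIF.

```erlang
-module(bin_handle).
-export([handle/1, handle_list/1]).

%% Match directly on binaries: byte 1, then the text, then byte 1.
handle(<<1, "SOURCE", 1>>) -> source;
handle(<<1, "OTHER", 1>>) -> other;
handle(Bin) when is_binary(Bin) -> unknown.

%% If the data arrives as a string (a list), convert it once and
%% reuse the binary clauses above.
handle_list(String) when is_list(String) ->
    handle(list_to_binary(String)).
```

With this, bin_handle:handle_list("\001SOURCE\001") ends up in the first clause, because the list has been converted to a binary before matching.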

Simple explanation of Erlang atom

I am learning Erlang and am stuck trying to understand the concept of atoms. I know Python. What is a good explanation of these "atoms" in simple terms, or by analogy with Python? So far, my understanding is that the type is like a string but without string operations?
Docs say that:
An atom is a literal, a constant with name.
Sometimes you have a couple of options that you would like to choose from. In C, for example, you have enum:
enum Weekday { Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday };
In C, it is really an integer, but you can use it in code as one of the options. Atoms in Erlang are very useful in pattern matching. Let's consider a very simple server:
loop() ->
    receive
        {request_type_1, Request} ->
            handle_request_1(Request),
            loop();
        {request_type_2, Request} ->
            handle_request_2(Request),
            loop();
        {stop, Reason} ->
            {ok, Reason};
        _ ->
            {error, bad_request}
    end.
Your server receives messages that are two-element tuples and uses atoms to differentiate between the types of requests: request_type_1, request_type_2 and stop. This is called pattern matching.
The server also uses atoms as return values. The ok atom means that everything went fine. _ matches anything, so if the simple server receives something unexpected, it quits with the tuple {error, Reason}, where the reason is also an atom, bad_request.
Boolean values true and false are also atoms. You can build logical functions using function clauses like this:
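logical_and(true, true) ->
    true;
logical_and(_, _) ->
    false.
logical_or(false, false) ->
    false;
logical_or(_, _) ->
    true.
(The names logical_and and logical_or are used here because and and or are reserved words in Erlang and cannot be used as unquoted function names. The example is a little oversimplified, because you can call it like this: logical_or(atom1, atom2) and it will return true, but it is only for illustration.)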
Module names in Erlang are also atoms, so you can bind a module name to a variable and call functions through it. For example, type this in the Erlang shell:
io:format("asdf").
Variable = io.
Variable:format("asdf").
You should not use atoms as strings, because they are not garbage collected. If you start creating them dynamically, you can run out of memory. They should only be used when there is a fixed set of options that you type in the code by hand. Of course, you can use the same atom as many times as you want, because it always refers to the same entry in the atom table.
They are better than C enums, because the name is still known at runtime. While debugging C code, you would see 1 instead of Tuesday in the debugger. Atoms don't have that disadvantage: you will see tuesday both in your code and in the Erlang shell.
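One way to avoid creating atoms dynamically is the BIF list_to_existing_atom/1, which fails with badarg instead of adding a new entry to the atom table. The sketch below wraps it; the module name safe_atom, the function name to_known_atom/1 and the return tuples are assumptions made for this example.

```erlang
-module(safe_atom).
-export([to_known_atom/1]).

%% Converts a string to an atom only if that atom already exists,
%% so untrusted input cannot grow the atom table.
to_known_atom(String) when is_list(String) ->
    try list_to_existing_atom(String) of
        Atom -> {ok, Atom}
    catch
        error:badarg -> {error, unknown_atom}
    end.
```

For example, safe_atom:to_known_atom("ok") succeeds because the atom ok is already known to the system, while a string that has never been used as an atom is rejected.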
Also, they're often used to tag a tuple, for descriptiveness. For example:
{age, 42}
Rather than just
42
An atom is a literal constant. It has no value other than its own name, but it can be used as a value. Examples are: true, false, undefined. If you want to use it as a string, you need to apply atom_to_list(Atom) to get a string (list) to work with. Module names are also atoms.
Take a look at http://www.erlang.org/doc/reference_manual/data_types.html

Converting string to binary

I have the following issue.
I have a file which is used for storing an array of records (of unknown structure). All I know is that the records are separated by "." (dot). One of the "fields" of each record is a binary value.
So the structure is:
multiline_text <<binary_value>> multiline_text .
I can read the file chunk by chunk (because it is pretty large) and parse it to get the actual data "<>", but it is a string, not a binary value. I'm trying to convert it to a binary (to convert it to a term later), but I have had no success.
I tried to use the BIF list_to_binary (but it won't work because it is not a list) - it's already a binary. I tried to convert it to a list of integers, fold them and convert, and it still doesn't work.
I suppose I'm missing something basic (I'm a newbie in Erlang).
Any advice?
If you get the binary you're interested in into a string in this format, for example:
S = "<< 1,2,3 >>".
then you can do something like this:
> {ok, T, _} = erl_scan:string(S ++ ".").
> {ok, Term} = erl_parse:parse_term(T).
{ok,<<1,2,3>>}
and then you can use Term, which now holds the binary that you originally read as a string.
Here is a version without erl_parse, just for study:
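The two shell steps above can be wrapped in a function. This is only a sketch: the module name parse_bin and the function name string_to_term/1 are invented here, while erl_scan:string/1 and erl_parse:parse_term/1 are the calls used in the answer. Note that it crashes with a badmatch if the string cannot be tokenized.

```erlang
-module(parse_bin).
-export([string_to_term/1]).

%% Parses the textual form of a term, e.g. "<< 1,2,3 >>", into the term.
%% A trailing "." is appended because the parser expects a complete term.
%% Returns {ok, Term} or {error, ErrorInfo}, as erl_parse:parse_term/1 does.
string_to_term(S) when is_list(S) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(S ++ "."),
    erl_parse:parse_term(Tokens).
```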
str2bin(Bin) ->
    Bin1 = string:strip(Bin, left, $<),
    Bin2 = string:strip(Bin1, right, $>),
    list_to_binary(lists:map(fun(Str) ->
                                 {Int, _Rest} = string:to_integer(string:strip(Str)),
                                 Int
                             end, string:tokens(Bin2, ","))).

convert a string into another format

I would like to convert the binary string <<"abc">> into the string "<a><b><c>".
In other words, each byte shall be written between one "less than" character and one "greater than" character.
I suppose that the function is recursive? Note that abc is just an example!
1>lists:flatten([[$<,C,$>]||C<-binary_to_list(<<"abc">>)]).
"<a><b><c>"
Alternatively:
lists:flatmap(fun(C)-> [$<,C,$>] end,binary_to_list(<<"abc">>)).
or
f(C) -> [$<,C,$>].
lists:flatmap(fun f/1,binary_to_list(<<"abc">>)).
The most efficient approach, if you want a flat list, would probably be:
fr(<<C,Rest/binary>>) ->
[$<,C,$>|fr(Rest)];
fr(<<>>) -> [].
This expansion is similar to what a list/binary comprehension expands to.
Use a binary comprehension:
2> [[$<, C, $>] || <<C:1/binary>> <= <<"abc">>].
[[60,<<"a">>,62],[60,<<"b">>,62],[60,<<"c">>,62]]
So you don't have to convert the binary into a list first and then work on it. Note that the result is a nested list (an iolist), not a flat string. It is probably a bit faster, especially for large binaries, so if performance matters to you, it may be a viable alternative.
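If a flat string is needed at the end, one possible variation is a binary comprehension that builds a binary directly and converts it once. This is a sketch, not a benchmarked solution; the module name wrap and the function name wrap/1 are made up for the example.

```erlang
-module(wrap).
-export([wrap/1]).

%% Wraps every byte of Bin in "<" and ">" and returns a flat string.
%% The comprehension produces a binary; binary_to_list/1 flattens it.
wrap(Bin) when is_binary(Bin) ->
    binary_to_list(<< <<$<, C, $>>> || <<C>> <= Bin >>).
```

Here wrap:wrap(<<"abc">>) returns "<a><b><c>". Dropping the binary_to_list/1 call keeps the result as a binary, which may be preferable if it is going to be written to a socket or file anyway.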
This answer is probably not the best one in terms of efficiency (I didn't compare it to the other solutions), but it helps to show how you can write your own iterators over different collections in Erlang, tailored to your specific goal, instead of using the predefined ones:
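fr(<<>>, Output) -> Output;
fr(<<"b", Rest/binary>>, Output) ->
    % Special case for illustration: "b" is copied through unwrapped.
    % Output must be appended with /binary, otherwise this fails with badarg.
    fr(Rest, <<Output/binary, "b">>);
fr(<<C:8, Rest/binary>>, Output) ->
    fr(Rest, <<Output/binary, $<, C:8, $>>>).
f(Input) -> fr(Input, <<>>).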
P.S. it looks like this solution is actually the most efficient :)

String splitting problems in Erlang

I've been playing around with the splitting of atoms and have a problem with strings. The input data will always be an atom that consists of some letters and then some numbers, for instance ms444, r64 or min1. Since the function lists:splitwith/2 takes a list the atom is first converted into a list:
24> lists:splitwith(fun (C) -> is_atom(C) end, [m,s,4,4,4]).
{[m,s],[4,4,4]}
25> lists:splitwith(fun (C) -> is_atom(C) end, atom_to_list(ms444)).
{[],"ms444"}
26> atom_to_list(ms444).
"ms444"
I want to separate the letters from the numbers and I've succeeded in doing that when using a list, but since I start out with an atom I get a "string" as result to put into my splitwith function...
Is it interpreting each item in the list as a string or what is going on?
You might want to have a look at the string module documentation:
http://www.erlang.org/doc/man/string.html
The following function might interest you:
tokens(String, SeparatorList) -> Tokens
Since strings in Erlang are just a list() of integer(), the fun tests whether each item is an atom() when it is in fact an integer(), so the test never succeeds. If the test is changed to look for lowercase letters, it works:
29> lists:splitwith(fun (C) -> (C >= $a) and (C =< $z) end, atom_to_list(ms444)).
{"ms","444"}
An atom in Erlang is a named constant and not a variable (or at least not like a variable in an imperative language).
You should really not create atoms dynamically (that is, don't convert things to atoms at runtime).
They are used more in pattern matching and in send/receive code.
Pid ! {matchthis, X}
receive
    {foobar,Y} -> doY(Y);
    {matchthis,X} -> doX(X);
    Other -> doother(Other)
end
A variable, like X, could be bound to an atom. For example: X = if 1==1 -> ok; true -> fail end. I may suffer from poor imagination, but I can't think of a reason why you would want to parse an atom. You should be in charge of which atoms you write and not use list_to_atom(CharIntegerList).
Can you perhaps give more of an overview of what you would like to accomplish?
A "string" in Erlang is not a primitive type: it is just a list() of integer() values. So if you want to "separate" the letters from the digits, you'll have to compare against the integer representation of the characters.
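Putting the pieces together, a sketch of splitting an atom such as ms444 into its letters and its number could look like the following. The module name split_atom and the function name split/1 are invented for illustration, and the code assumes, as the question states, that the input is always some lowercase letters followed by at least one digit.

```erlang
-module(split_atom).
-export([split/1]).

%% Splits an atom like ms444 into {"ms", 444}.
%% Characters are compared as integers: $a..$z are the lowercase letters.
%% list_to_integer/1 raises badarg if no digits follow the letters.
split(Atom) when is_atom(Atom) ->
    IsLetter = fun(C) -> C >= $a andalso C =< $z end,
    {Letters, Digits} = lists:splitwith(IsLetter, atom_to_list(Atom)),
    {Letters, list_to_integer(Digits)}.
```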
