differentiate a string from a list in Erlang - erlang

In Erlang when you have a list of printable characters, its a string, but a string is also a list of items and all functions of a list can be applied onto a string. Really, the data structure string doesn't exist in Erlang.
Part of my code needs to be sure that something is not only a list, but it's a string. (A real string). It needs to separate lists e.g. [1,2,3,a,b,"josh"] , from string e.g. "Muzaaya".
The guard expression is_list/1 will say true for both strings and lists. There is no such guard as is_string/1 and so this means I need a code snippet will make sure that my data is a string.
A string in this case is a list of only printable (alphabetical, both cases, upper and lower), and may contain numbers e.g "Muzaaya2536 618 Joshua". I need a code snippet please (Erlang) that will check this for me and ensure that the variable is a string, not just a list. thanks

You have two functions in the module io_lib which can be helpful: io_lib:printable_list/1 and io_lib:printable_unicode_list/1 which test if the argument is a list of printable latin1 or unicode characters respectively.

using the isprint(3) definition of printable characters --
isprint(X) when X >= 32, X < 127 -> true;
isprint(_) -> false.
is_string(List) when is_list(List) -> lists:all(fun isprint/1, List);
is_string(_) -> false.
you won't be able to use it as a guard, though.

Related

How can I detect a list [1,2,3] (not a string "example string")? [duplicate]

In Erlang when you have a list of printable characters, its a string, but a string is also a list of items and all functions of a list can be applied onto a string. Really, the data structure string doesn't exist in Erlang.
Part of my code needs to be sure that something is not only a list, but it's a string. (A real string). It needs to separate lists e.g. [1,2,3,a,b,"josh"] , from string e.g. "Muzaaya".
The guard expression is_list/1 will say true for both strings and lists. There is no such guard as is_string/1 and so this means I need a code snippet will make sure that my data is a string.
A string in this case is a list of only printable (alphabetical, both cases, upper and lower), and may contain numbers e.g "Muzaaya2536 618 Joshua". I need a code snippet please (Erlang) that will check this for me and ensure that the variable is a string, not just a list. thanks
You have two functions in the module io_lib which can be helpful: io_lib:printable_list/1 and io_lib:printable_unicode_list/1 which test if the argument is a list of printable latin1 or unicode characters respectively.
using the isprint(3) definition of printable characters --
isprint(X) when X >= 32, X < 127 -> true;
isprint(_) -> false.
is_string(List) when is_list(List) -> lists:all(fun isprint/1, List);
is_string(_) -> false.
you won't be able to use it as a guard, though.

Can I match against a string that contains non-ASCII characters?

I am writing an program in which I am dealing with strings in the form, e.g., of "\001SOURCE\001". That is, the strings contained alphanumeric text with an ASCII character of value 1 at each end. I am trying to write a function to match strings like these. I have tried a match like this:
handle(<<1,"SOURCE",1>>) -> ok.
But the match does not succeed. I have tried a few variations on this theme, but all have failed.
Is there a way to match a string that contains mostly alphanumeric text, with the exception of a non-alpha character at each end?
You can also do the following
[1] ++ "SOURCE" ++ [1] == "\001SOURCE\001".
Or convert to binary using list_to_binary and pattern match as
<<1,"SOURCE",1>> == <<"\001SOURCE\001">>.
Strings are syntactic sugar for lists. Lists are a type and binaries are a different type, so your match isn't working out because you're trying to match a list against a binary (same problem if you tried to match {1, "STRING", 1} to it, tuples aren't lists).
Remembering that strings are lists, we have a few options:
handle([1,83,84,82,73,78,71,1]) -> ok.
This will work just fine. Another, more readable (but uglier, sort of) way is to use character literals:
handle([1, $S,$T,$R,$I,$N,$G, 1]) -> ok.
Yet another way would be to strip the non-character values, and then pass that on to a handler:
handle(String) -> dispatch(string:strip(String, both, 1)).
dispatch("STRING") -> do_stuff();
dispatch("OTHER") -> do_other_stuff().
And, if at all possible, the best case is if you just stop using strings for text values entirely (if that's feasible) and process binaries directly instead. The syntax of binaries is much friendlier, they take up way fewer resources, and quite a few binary operations are significantly more efficient than their string/list counterparts. But that doesn't fit every case! (But its awesome when dealing with sockets...)

Matching function in erlang based on string format

I have user information coming in from an outside source and I need to check if that user is active. Sometimes I have a User and a Server and other times I have User#Server. The former case is no problem, I just have:
active(User, Server) ->
do whatever.
What I would like to do with the User#Server case is something like:
active([User, "#", Server]) ->
active(User, Server).
Doesn't seem to work. When calling active in the erlang terminal with a#b for example, I get an error that there is no match. Any help would be appreciated!
You can tokenize the string to get the result:
active(UserString) ->
[User,Server] = string:tokens(UserString,"#"),
active(User,Server).
If you need something more elaborate, or with better handling of something like email addresses, it might then be time to delve into using regular expressions with the re module.
active(UserString) ->
RegEx = "^([\\w\\.-]+)#([\\w\\.-]+)$",
{match, [User,Server]} = re:run(UserString,RegEx,[{capture,all_but_first,list}]),
active(User,Server).
Note: The supplied Regex is hardly sufficient for email address validation, it's just an example that allows all alphanumeric characters including underscores (\\w), dots (\\.), and dashes (-) seperated by an at symbol. And it will fail if the match doesn't stretch the whole length of the string: (^ to $).
A note on the pattern matching, for the real solution to your problem I think #chops suggestions should be used.
When matching patterns against strings I think it's useful to keep in mind that erlang strings are really lists of integers. So the string "#" is actually the same as [64] (64 being the ascii code for #)
This means that you match pattern [User, "#", Server] will match lists like: [97,[64],98], but not "a#b" (which in list form is [97,64,98]).
To match the string you need to do [User,$#,Server]. The $ operator gives you the ascii value of the character.
However this match pattern limits the matching string to be 1 character followed by # and then one more character...
It can be improved by doing [User, $# | Server] which allows the server part to have arbitrary length, but the User variable will still only match one single character (and I don't see a way around that).

Haskell -> After parsing how to work with strings

Hello
after doing the parsing with a script in Haskell I got a file with the 'appearance' of lists of strings. However when I call the file content with the function getContents or hGetContents, ie, reading the contents I get something like: String with lines (schematically what I want is: "[" aaa "," bbb "" ccc "]" -> ["aaa", "bbb" "ccc"]). I have tried with the read function but without results. I need to work with these lists of strings to concatenating them all in a single list.
I'm using the lines function, but I think it only 'works' one line at a time, doesn't it?
What I need is a function that verify if an element of a line is repeted on other line. If I could have a list of a list of strings it could be easier (but what I have is a line of a string that looks like a list of strings)
Regards
Thanks.
I have tried with the read function but without results
Just tested, and it works fine:
Prelude> read "[\"aaa\",\"bbb\",\"ccc\"]" :: [String]
["aaa","bbb","ccc"]
Note that you need to give the return type explicitly, since it can't be determined from the type of the argument.
I think the function you are looking for is the lines function from Data.List (reexported by the Prelude) that breaks up a multi-line string into a list of strings.
in my understanding, what you can do is
create a function that receives a list of lists, each list is a line of the entire string, of the argument passed in, and checks if a element of a line occurs in other line.
then, this function passes the entire string, separated by lines using [lines][1].

String splitting problems in Erlang

I've been playing around with the splitting of atoms and have a problem with strings. The input data will always be an atom that consists of some letters and then some numbers, for instance ms444, r64 or min1. Since the function lists:splitwith/2 takes a list the atom is first converted into a list:
24> lists:splitwith(fun (C) -> is_atom(C) end, [m,s,4,4,4]).
{[m,s],[4,4,4]}
25> lists:splitwith(fun (C) -> is_atom(C) end, atom_to_list(ms444)).
{[],"ms444"}
26> atom_to_list(ms444).
"ms444"
I want to separate the letters from the numbers and I've succeeded in doing that when using a list, but since I start out with an atom I get a "string" as result to put into my splitwith function...
Is it interpreting each item in the list as a string or what is going on?
You might want to have a look at the string module documentation:
http://www.erlang.org/doc/man/string.html
The following function might interest you:
tokens(String, SeparatorList) -> Tokens
Since strings in Erlang are just a list() of integer() the test in the fun will be made if the item is an atom() when it is in fact an integer(). If the test is changed to look for letters it works:
29> lists:splitwith(fun (C) -> (C >= $a) and (C =< $Z) end, atom_to_list(ms444)).
{"ms","444"}
An atom in erlang is a named constant and not a variable (or not like a variable is in an imperative language).
You should really not create atoms in dynamic fashion (that is, don't convert things to atoms at runtime)
They are used more in pattern matching and send recive code.
Pid ! {matchthis, X}
recive
{foobar,Y} -> doY(Y);
{matchthis,X} -> doX(X);
Other -> doother(Other)
end
A variable, like X could be set to an atom. For example X=if 1==1 -> ok; true -> fail end. I could suffer from poor imagination but I can't think of a way why you would like to parse atom. You should be in charge of what atoms you write and not use list_to_atom(CharIntegerList).
Can you perhaps give a more overview of what you like to accomplish?
A "string" in Erlang is not a primitive type: it is just a list() of integers(). So if you want to "separate" the letters from the digits, you'll have to do comparison with the integer representation of the characters.

Resources