Range Operator for Base62 Encoding - f#

For base 62 encoding, I need all 62 alphanumeric characters. The F# range operator offers a nice shorthand for this.
let alphaNumericCharacters =
seq {
yield! [|'a'..'z'|]
yield! [|'A'..'Z'|]
yield! [|'0'..'9'|]
} |> Array.ofSeq
This is nice and concise, but I'm greedy. Is there a way of doing this in one line?

let alphaNumericCharacters = Array.concat [| [|'0'..'9'|]; [|'A'..'Z'|]; [|'a'..'z'|] |]

let alphaNumericCharacters = ['a'..'z'] # ['A'..'Z'] # ['0'..'9'] |> List.toArray

If your feeling funny:
let alphaNumericCharacters = [|Char.MinValue..Char.MaxValue|] |> Array.filter Char.IsLetterOrDigit

Related

Remove All but First Occurrence of a Character in a List of Strings

I have a list of names, and I need to output a single string that shows the letters from the names in the order they appear without the duplicates (e.g. If the list is ["John"; "James"; "Jack"], the output string should be Johnamesck). I've got a solution (folding all the names into a string then parse), but I feel like I'm cheesing it a bit by making my string mutable.
I also want to state this is not a school assignment, just an exercise from a work colleague as I'm coming into F# from only ever knowing Java Web stuff.
Here is my working solution (for insight purposes):
let lower = ['a' .. 'z']
let upper = ['A' .. 'Z']
let mutable concatedNames = ["John"; "James"; "Jack"] |> List.fold (+) ""
let greaterThanOne (length : int) = length > 1
let stripChars (str : string) letter =
let parts = str.Split([| letter |])
match greaterThanOne (Array.length parts) with
| true -> seq {
yield Array.head parts
yield string letter
yield! Array.tail parts
}
|> String.concat ""
| _ -> str
let killAllButFirstLower = lower |> List.iter (fun letter -> concatedNames <- (stripChars concatedNames letter))
let killAllButFirstUpper = upper |> List.iter ( fun letter -> concatedNames <- (stripChars concatedNames letter))
printfn "All names with duplicate letters removed: %s" concatedNames
I originally wanted to do this explicitly with functions alone and had a solution previous to above
let lower = ['a' .. 'z']
let upper = ['A' .. 'Z']
:
:
:
let lowerStripped = [""]
let killLowerDuplicates = lower |> List.iter (fun letter ->
match lowerStripped.Length with
| 1 ->
(stripChars concatedNames letter)::lowerStripped |> ignore
| _ -> (stripChars (List.head lowerStripped) letter)::lowerStripped |> ignore
)
let upperStripped = [List.head lowerStripped]
let killUpperDuplicates = lower |> List.iter ( fun letter -> (stripChars (List.head upperStripped) letter)::upperStripped |> ignore )
let strippedAll = List.head upperStripped
printfn "%s" strippedAll
But I couldn't get this working because I realized the consed lists weren't going anywhere (not to mention this is probably inefficient). The idea was that by doing it this way, once I parsed everything, the first element of the list would be the desired string.
I understand it may be strange asking a question I already have a solution to, but I feel like using mutable is just me not letting go of my Imperative habits (as I've read it should be rare to need to use it) and I want to more reinforce pure functional. So is there a better way to do this? Is the second solution a feasible route if I can somehow pipe the result somewhere?
You can use Seq.distinct to remove duplicates and retain ordering, so you just need to convert the list of strings to a single string, which can be done with String.concat "":
let distinctChars s = s |> String.concat ""
|> Seq.distinct
|> Array.ofSeq
|> System.String
If you run distinctChars ["John"; "James"; "Jack"], you will get back:
"Johnamesck"
This should do the trick:
let removeDuplicateCharacters strings =
// Treat each string as a seq<char>, flattening them into one big seq<char>
let chars = strings |> Seq.collect id // The id function (f(x) = x) is built in to F#
// We use it here because we want to collect the characters themselves
chars
|> Seq.mapi (fun i c -> i,c) // Get the index of each character in the overall sequence
|> Seq.choose (fun (i,c) ->
if i = (chars |> Seq.findIndex ((=) c)) // Is this character's index the same as the first occurence's index?
then Some c // If so, return (Some c) so that `choose` will include it,
else None) // Otherwise, return None so that `choose` will ignore it
|> Seq.toArray // Convert the seq<char> into a char []
|> System.String // Call the new String(char []) constructor with the choosen characters
Basically, we just treat the list of strings as one big sequence of characters, and choose the ones where the index in the overall sequence is the same as the index of the first occurrence of that character.
Running removeDuplicateCharacters ["John"; "James"; "Jack";] gives the expected output: "Johnamesck".

How do I do in F# what would be called compression in APL?

In APL one can use a bit vector to select out elements of another vector; this is called compression. For example 1 0 1/3 5 7 would yield 3 7.
Is there a accepted term for this in functional programming in general and F# in particular?
Here is my F# program:
let list1 = [|"Bob"; "Mary"; "Sue"|]
let list2 = [|1; 0; 1|]
[<EntryPoint>]
let main argv =
0 // return an integer exit code
What I would like to do is compute a new string[] which would be [|"Bob"; Sue"|]
How would one do this in F#?
Array.zip list1 list2 // [|("Bob",1); ("Mary",0); ("Sue",1)|]
|> Array.filter (fun (_,x) -> x = 1) // [|("Bob", 1); ("Sue", 1)|]
|> Array.map fst // [|"Bob"; "Sue"|]
The pipe operator |> does function application syntactically reversed, i.e., x |> f is equivalent to f x. As mentioned in another answer, replace Array with Seq to avoid the construction of intermediate arrays.
I expect you'll find many APL primitives missing from F#. For lists and sequences, many can be constructed by stringing together primitives from the Seq, Array, or List modules, like the above. For reference, here is an overview of the Seq module.
I think the easiest is to use an array sequence expression, something like this:
let compress bits values =
[|
for i = 0 to bits.Length - 1 do
if bits.[i] = 1 then
yield values.[i]
|]
If you only want to use combinators, this is what I would do:
Seq.zip bits values
|> Seq.choose (fun (bit, value) ->
if bit = 1 then Some value else None)
|> Array.ofSeq
I use Seq functions instead of Array in order to avoid building intermediary arrays, but it would be correct too.
One might say this is more idiomatic:
Seq.map2 (fun l1 l2 -> if l2 = 1 then Some(l1) else None) list1 list2
|> Seq.choose id
|> Seq.toArray
EDIT (for the pipe lovers)
(list1, list2)
||> Seq.map2 (fun l1 l2 -> if l2 = 1 then Some(l1) else None)
|> Seq.choose id
|> Seq.toArray
Søren Debois' solution is good but, as he pointed out, but we can do better. Let's define a function, based on Søren's code:
let compressArray vals idx =
Array.zip vals idx
|> Array.filter (fun (_, x) -> x = 1)
|> Array.map fst
compressArray ends up creating a new array in each of the 3 lines. This can take some time, if the input arrays are long (1.4 seconds for 10M values in my quick test).
We can save some time by working on sequences and creating an array at the end only:
let compressSeq vals idx =
Seq.zip vals idx
|> Seq.filter (fun (_, x) -> x = 1)
|> Seq.map fst
This function is generic and will work on arrays, lists, etc. To generate an array as output:
compressSeq sq idx |> Seq.toArray
The latter saves about 40% of computation time (0.8s in my test).
As ildjarn commented, the function argument to filter can be rewritten to snd >> (=) 1, although that causes a slight performance drop (< 10%), probably because of the extra function call that is generated.

How to define x++ (where x: int ref) in F#?

I currently use this function
let inc (i : int ref) =
let res = !i
i := res + 1
res
to write things like
let str = input.[inc index]
How define increment operator ++, so that I could write
let str = input.[index++]
You cannot define postfix operators in F# - see 4.4 Operators and Precedence. If you agree to making it prefix instead, then you can define, for example,
let (++) x = incr x; !x
and use it as below:
let y = ref 1
(++) y;;
val y : int ref = {contents = 2;}
UPDATE: as fpessoa pointed out ++ cannot be used as a genuine prefix operator, indeed (see here and there for the rules upon characters and character sequences comprising valid F# prefix operators).
Interestingly, the unary + can be overloaded for the purpose:
let (~+) x = incr x; !x
allowing
let y = ref 1
+y;;
val y : int ref = {contents = 2;}
Nevertheless, it makes sense to mention that the idea of iterating an array like below
let v = [| 1..5 |]
let i = ref -1
v |> Seq.iter (fun _ -> printfn "%d" v.[+i])
for the sake of "readability" looks at least strange in comparison with the idiomatic functional manner
[|1..5|] |> Seq.iter (printfn "%d")
which some initiated already had expressed in comments to the original question.
I was trying to write it as a prefix operator as suggested, but you can't define (++) as a proper prefix operator, i.e., run things like ++y without the () as you could for example for (!+):
let (!+) (i : int ref) = incr i; !i
let v = [| 1..5 |]
let i = ref -1
[1..5] |> Seq.iter (fun _ -> printfn "%d" v.[!+i])
Sorry, but I guess the answer is that actually you can't do even that.

Unpack tuple into another tuple

Suppose I need to construct a tuple of length three:
(x , y, z)
And I have a function which returns a tuple of length two - exampleFunction and the last two elements of the tuple to be constructed are from this tuple.
How can I do this without having to call the exampleFunction two times:
(x, fst exampleFunction , snd exampleFunction)
I just want to do / achieve something like
(x, exampleFunction)
but it complains that the tuples have unmatched length ( of course )
Not looking at doing let y,z = exampleFunction()
There may be a built in function, but a custom one would work just as well.
let repack (a,(b,c)) = (a,b,c)
repack (x,exampleFunction)
I'm not sure it if worth a separate answer, but both answers provided above are not optimal since both construct redundant Tuple<'a, Tuple<'b, 'c>> upon invocation of the helper function. I would say a custom operator would be better for both readability and performance:
let inline ( +# ) a (b,c) = a, b, c
let result = x +# yz // result is ('x, 'y, 'z)
The problem you have is that the function return a*b so the return type becomes 'a*('b*'c) which is different to 'a*'b*'c the best solution is a small helper function like
let inline flatten (a,(b,c)) = a,b,c
then you can do
(x,examplefunction) |> flatten
I have the following function in my common extension file.
You may find this useful.
let inline squash12 ((a,(b,c) ):('a*('b*'c) )):('a*'b*'c ) = (a,b,c )
let inline squash21 (((a,b),c ):(('a*'b)*'c )):('a*'b*'c ) = (a,b,c )
let inline squash13 ((a,(b,c,d)):('a*('b*'c*'d))):('a*'b*'c*'d) = (a,b,c,d)
let seqsquash12 (sa:seq<'a*('b*'c) >) = sa |> Seq.map squash12
let seqsquash21 (sa:seq<('a*'b)*'c >) = sa |> Seq.map squash21
let seqsquash13 (sa:seq<'a*('b*'c*'d)>) = sa |> Seq.map squash13
let arrsquash12 (sa:('a*('b*'c) ) array) = sa |> Array.map squash12
let arrsquash21 (sa:(('a*'b)*'c ) array) = sa |> Array.map squash21
let arrsquash13 (sa:('a*('b*'c*'d)) array) = sa |> Array.map squash13

F# Basics: Folding 2 lists together into a string

a little rusty from my Scheme days, I'd like to take 2 lists: one of numbers and one of strings, and fold them together into a single string where each pair is written like "{(ushort)5, "bla bla bla"},\n". I have most of it, i'm just not sure how to write the Fold properly:
let splitter = [|","|]
let indexes =
indexStr.Split(splitter, System.StringSplitOptions.None) |> Seq.toList
let values =
valueStr.Split(splitter, System.StringSplitOptions.None) |> Seq.toList
let pairs = List.zip indexes values
printfn "%A" pairs
let result = pairs |> Seq.fold
(fun acc a -> String.Format("{0}, \{(ushort){1}, \"{2}\"\}\n",
acc, (List.nth a 0), (List.nth a 1)))
Your missing two things. The initial state of the fold which is an empty string and you can't use list comprehension on tuples in F#.
let splitter = [|","|]
let indexes =
indexStr.Split(splitter, System.StringSplitOptions.None) |> Seq.toList
let values =
valueStr.Split(splitter, System.StringSplitOptions.None) |> Seq.toList
let pairs = List.zip indexes values
printfn "%A" pairs
let result =
pairs
|> Seq.fold (fun acc (index, value) ->
String.Format("{0}{{(ushort){1}, \"{2}\"}},\n", acc, index, value)) ""
fold2 version
let result =
List.fold2
(fun acc index value ->
String.Format("{0}{{(ushort){1}, \"{2}\"}},\n", acc, index, value))
""
indexes
values
If you are concerned with speed you may want to use string builder since it doesn't create a new string every time you append.
let result =
List.fold2
(fun (sb:StringBuilder) index value ->
sb.AppendFormat("{{(ushort){0}, \"{1}\"}},\n", index, value))
(StringBuilder())
indexes
values
|> string
Fold probably isn't the best method for this task. Its a lot easier to map and concat like this:
let l1 = "a,b,c,d,e".Split([|','|])
let l2 = "1,2,3,4,5".Split([|','|])
let pairs =
Seq.zip l1 l2
|> Seq.map (fun (x, y) -> sprintf "(ushort)%s, \"%s\"" x y)
|> String.concat "\n"
I think you want List.fold2. For some reason the List module has a fold2 member but Seq doesn't. Then you can dispense with the zip entirely.
The types of your named variables and the type of the result you hope for are all implicit, so it's difficult to help, but if you are trying to accumulate a list of strings you might consider something along the lines of
let result = pairs |> Seq.fold
(fun prev (l, r) ->
String.Format("{0}, \{(ushort){1}, \"{2}\"\}\n", prev, l, r)
"" pairs
My F#/Caml is very rusty so I may have the order of arguments wrong. Also note your string formation is quadratic; in my own code I would go with something more along these lines:
let strings =
List.fold2 (fun ss l r ->
String.format ("\{(ushort){0}, \"{1}\"\}\n", l, r) :: ss)
[] indexes values
let result = String.concat ", " strings
This won't cost you quadratic time and it's a little easier to follow. I've checked MSDN and believe I have the correct order of arguments on fold2.
Keep in mind I know Caml not F# and so I may have details or order of arguments wrong.
Perhaps this:
let strBuilder = new StringBuilder()
for (i,v) in Seq.zip indexes values do
strBuilder.Append(String.Format("{{(ushort){0}, \"{1}\"}},\n", i,v))
|> ignore
with F# sometimes is better go imperative...
map2 or fold2 is the right way to go. Here's my take, using the (||>) operator:
let l1 = [| "a"; "b"; "c"; "d"; "e" |]
let l2 = [| "1"; "2"; "3"; "4"; "5" |]
let pairs = (l1, l2) ||> Seq.map2 (sprintf ("(ushort)%s, \"%s\""))
|> String.concat "\n"

Resources