Reading from a text file and sorting - f#

I have managed to read my text file, which contains random numbers, one per line. When I output the lines using printfn "%A" lines I get seq ["45"; "5435"; "34"; ...], so I assume that lines must be a list datatype.
open System
let readLines filePath = System.IO.File.ReadLines(filePath);;
let lines = readLines @"C:\Users\Dan\Desktop\unsorted.txt"
I am now trying to sort the list by lowest to highest but it does not have the .sortBy() method. Any chance anyone can tell me how to manually do this? I have tried turning it to an array to sort it but it doesn't work.
let array = [||]
let counter = 0
for i in lines do
    array.[counter] = i
    counter +1
Console.ReadKey <| ignore
Thanks in advance.

If all the lines are integers, you can just use Seq.sortBy int, like so:
open System
let readLines filePath = System.IO.File.ReadLines(filePath)
let lines = readLines @"C:\Users\Dan\Desktop\unsorted.txt"
let sorted = lines |> Seq.sortBy int
If some of the lines may not be valid integers, then you'd need to run through a parsing and validation step. E.g.:
// The string annotation lets the compiler pick the right TryParse overload.
let tryParseInt (s: string) =
    match System.Int32.TryParse s with
    | true, n -> Some n
    | false, _ -> None
let readLines filePath = System.IO.File.ReadLines(filePath)
let lines = readLines @"C:\Users\Dan\Desktop\unsorted.txt"
let sorted = lines |> Seq.choose tryParseInt |> Seq.sort
Note that the tryParseInt function I just wrote is returning the int value, so I used Seq.sort instead of Seq.sortBy int, and the output of that function chain is going to be a sequence of ints rather than a sequence of strings. If you really wanted a sequence of strings, but only the strings that could be parsed to ints, you could have done it like this:
// Same validation, but the original string is kept instead of the parsed int.
let tryParseInt (s: string) =
    match System.Int32.TryParse s with
    | true, _ -> Some s
    | false, _ -> None
let readLines filePath = System.IO.File.ReadLines(filePath)
let lines = readLines @"C:\Users\Dan\Desktop\unsorted.txt"
let sorted = lines |> Seq.choose tryParseInt |> Seq.sortBy int
Note how I'm returning s from this version of tryParseInt, so that Seq.choose keeps the strings (but throws away any strings that failed to validate through System.Int32.TryParse). There are plenty more possibilities, but that should give you enough to get started.

All the comments are valid but I'm a bit more concerned about your very imperative loop.
So here's an example:
To read all the lines:
open System.IO
let file = @"c:\tmp\sort.csv"
let lines = File.ReadAllLines(file)
To sort the lines:
let sorted = Seq.sort lines
sorted |> Seq.length // to get the number of lines
sorted |> Seq.map (fun x -> x.Length) // to iterate over all lines and get the length of each line
You can also use a list comprehension syntax:
[for l in sorted -> l.ToUpper()]
Seq will work for all kinds of collections but you can replace it with Array (mutable) or List (F# List).
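To make the last point concrete, and to show why sorting the raw strings differs from sorting by numeric value, here is a minimal sketch. The sample values are made up for illustration and stand in for the lines of the file.

```fsharp
// Hypothetical sample data standing in for the lines read from the file.
let sampleLines = [ "45"; "5435"; "34"; "100" ]

// Sorting the strings themselves compares them character by character.
let asText = sampleLines |> List.sort

// Sorting by the parsed integer gives numeric order.
let asNumbers = sampleLines |> List.sortBy int

// The same call shape works for arrays and sequences.
let fromArray = sampleLines |> List.toArray |> Array.sortBy int
let fromSeq = sampleLines |> Seq.sortBy int |> Seq.toList

printfn "%A" asText     // ["100"; "34"; "45"; "5435"]
printfn "%A" asNumbers  // ["34"; "45"; "100"; "5435"]
```

If the lines are numbers, plain Seq.sort on the strings is probably not what you want; sortBy int (or parsing first) keeps the numeric order.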

Related

How to efficiently create a list in reversed order in F#

Is there any way to construct a list in the right order without having to reverse it?
Here is an example, I read all lines from stdin
#!/usr/bin/env dotnet fsi
open System
let rec readLines1 () =
    let rec helper acc =
        match Console.ReadLine() with
        | null -> acc
        | line ->
            helper (line :: acc)
    helper [] |> List.rev
readLines1 () |> List.iter (printfn "%s")
Before returning from readLines1 I have to call List.rev so that the result is in the right order. Since the result is a singly linked list, it has to walk through all of it and create the reversed version. Is there any way of creating the list in the right order?
You can use a sequence instead of accumulating the lines in a list:
open System
let readLines1 () =
    let rec helper () =
        seq {
            match Console.ReadLine() with
            | null -> ()
            | line ->
                yield line
                yield! helper ()
        }
    helper () |> Seq.toList
readLines1 () |> List.iter (printfn "%s")
You cannot create a list in reverse order, because that would require mutation. If you read inputs one by one and want to turn them into a list immediately, the only thing you can do is create a new list that links to the previous one.
In practice, reversing the list is perfectly fine and that's probably the best way of solving this.
Out of curiosity, you could try defining a mutable list that has the same structure as the immutable F# list:
open System
type MutableList<'T> =
    { mutable List : MutableListBody<'T> }
and MutableListBody<'T> =
    | Empty
    | Cons of 'T * MutableList<'T>
Now you can implement your function by mutating the list:
let rec readLines () =
    let res = { List = Empty }
    let rec helper acc =
        match Console.ReadLine() with
        | null -> res
        | line ->
            let next = { List = Empty }
            acc.List <- Cons(line, next)
            helper next
    helper res
This may be educational, but it's not very useful and, if you really wanted mutation in F#, you should probably use ResizeArray.
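For comparison, a minimal sketch of the ResizeArray route might look like the following. The helper name readAllLines is made up for illustration, and it takes a TextReader (rather than calling Console.ReadLine directly) only so that it can be tried without real console input.

```fsharp
open System
open System.IO

// Reads every line from a TextReader into an F# list, in input order.
// readAllLines is a hypothetical helper name, not part of any library.
let readAllLines (reader: TextReader) : string list =
    let buffer = ResizeArray<string>()
    let mutable line = reader.ReadLine()
    while not (isNull line) do
        buffer.Add line
        line <- reader.ReadLine()
    // One final copy turns the mutable buffer into an immutable list.
    List.ofSeq buffer

// Console.In is a TextReader, so `readAllLines Console.In` would read stdin.
// A StringReader stands in for stdin here.
let demo = readAllLines (new StringReader("first\nsecond\nthird"))
printfn "%A" demo
```

The mutation stays local to the function, so callers still only see an immutable list.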
Yet another trick is to work with functions that take the tail of the list:
let rec readLines () =
    let rec helper acc =
        match Console.ReadLine() with
        | null -> acc []
        | line -> helper (fun tail -> acc (line :: tail))
    helper id
In the line case, this returns a function that takes a tail, adds line in front of that tail, and then calls whatever function was constructed before to add more things to the front.
This actually creates the list in the right order, but it's probably less efficient than creating a list and reversing it. It may look nice, but you are allocating a new function for each iteration, which is not better than allocating an extra copy of the list. (But it is a nice trick, nevertheless!)
Alternative solution without implementing recursive functions:
let lines =
    Seq.initInfinite (fun _ -> Console.ReadLine())
    |> Seq.takeWhile (not << isNull)
    |> Seq.toList

Reading text file, iterating over lines to find a match, and return the value with FSharp

I have a text file that contains the following and I need to retrieve the value assigned to taskId, which in this case is AWc34YBAp0N7ZCmVka2u.
projectKey=ProjectName
serverUrl=http://localhost:9090
serverVersion=10.5.32.3
interfaceUrl=http://localhost:9090/interface?id=ProjectName
taskId=AWc34YBAp0N7ZCmVka2u
taskUrl=http://localhost:9090/api/ce/task?id=AWc34YBAp0N7ZCmVka2u
I have two different ways of reading the file that I've written.
let readLines (filePath:string) = seq {
    use sr = new StreamReader (filePath)
    while not sr.EndOfStream do
        yield sr.ReadLine ()
}
readLines (FindFile currentDirectory "../**/sample.txt")
|> Seq.iter (fun line ->
    printfn "%s" line
)
and
let readLines (filePath:string) =
    (File.ReadAllLines filePath)
readLines (FindFile currentDirectory "../**/sample.txt")
|> Seq.iter (fun line ->
    printfn "%s" line
)
At this point, I don't know how to approach getting the value I need. Options that, I think, are on the table are:
use Contains()
Regex
Record type
Active Pattern
How can I get this value returned and fail if it doesn't exist?
I think all the options would be reasonable - it depends on how complex the file will actually be. If there is no escaping then you can probably just look for = in the line and use that to split the line into a key value pair. If the syntax is more complex, this might not always work though.
My preferred method would be to use Split on string - you can then filter to find values with your required key, map to get the value and use Seq.head to get the value:
["foo=bar"]
|> Seq.map (fun line -> line.Split('='))
|> Seq.filter (fun kvp -> kvp.[0] = "foo")
|> Seq.map (fun kvp -> kvp.[1])
|> Seq.head
Using active patterns, you could define a pattern that takes a string and splits it using = into a list:
let (|Split|) (s:string) = s.Split('=') |> List.ofSeq
This then lets you get the value using Seq.pick with a pattern matching that looks for strings where the substring before = is e.g. foo:
["foo=bar"] |> Seq.pick (function
    | Split ["foo"; value] -> Some value
    | _ -> None)
The active pattern trick is quite neat, but it might be unnecessarily complicating the code if you only need this in one place.
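One thing to watch for with the sample file: the taskUrl line contains two = characters, so splitting on every = yields three parts and a two-element pattern would not match that line. Both Seq.head and Seq.pick also raise an exception when nothing matches, which may or may not be the failure you want. A minimal sketch that splits at the first = only and makes the failure explicit could look like this; tryKeyValue, tryFindValue and requireValue are made-up names for illustration.

```fsharp
// Splits "key=value" at the first '=' only, so values that themselves
// contain '=' (such as URLs with query strings) are kept whole.
let tryKeyValue (line: string) =
    match line.IndexOf('=') with
    | -1 -> None
    | i -> Some (line.Substring(0, i), line.Substring(i + 1))

// Returns Some value for the first line whose key matches, otherwise None.
let tryFindValue (key: string) (lines: seq<string>) =
    lines
    |> Seq.choose tryKeyValue
    |> Seq.tryPick (fun (k, v) -> if k = key then Some v else None)

// Fails with a descriptive message when the key is missing.
let requireValue key lines =
    match tryFindValue key lines with
    | Some value -> value
    | None -> failwithf "Key '%s' not found" key

// A few lines from the sample file in the question.
let sample =
    [ "projectKey=ProjectName"
      "taskId=AWc34YBAp0N7ZCmVka2u"
      "taskUrl=http://localhost:9090/api/ce/task?id=AWc34YBAp0N7ZCmVka2u" ]

printfn "%s" (requireValue "taskId" sample)
```

With the readLines function from the question, the call would be along the lines of readLines path |> requireValue "taskId".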

Sorting indexes in list of lists - F#

Currently I have a function that returns the first element of each list (of floats) within a list as a separate list.
let firstElements list =
    match list with
    | head::_ -> head
    | [] -> 0.00
My question is, how do I expand this to return elements at the same index into different lists while I don't know how long this list is? For example
let biglist = [[1;2;3];[4;5;6];[7;8;9]]
If I did not know the length of this list, what is the most efficient and safest way to get
[[1;4;7];[2;5;8];[3;6;9]]
List.transpose has been added recently to FSharp.Core
let biglist = [[1;2;3];[4;5;6];[7;8;9]]
let res = biglist |> List.transpose
//val res : int list list = [[1; 4; 7]; [2; 5; 8]; [3; 6; 9]]
You can use the recently added List.transpose function. But it is always good to be able to write such functions yourself. If you want to solve the problem yourself, think of a general algorithm. One would be:
1) From the first element of each list, create a new list.
2) Drop the first element of each list.
3) If you end up with empty lists, stop; otherwise repeat from step 1).
This could be a first attempt at solving the problem. The function names are made up at this point.
let transpose lst =
    if allEmpty lst
    then // Some default value, we don't know yet
    else ...
The else branch looks like the following. First we want to pick the first element of every sub-list. We imagine a function pickFirsts that does this task, so we could write pickFirsts lst. The result is a list that itself is the first element of a new list.
The rest of the new list is computed from the remaining elements. Again we imagine a function, dropFirsts lst, that drops the first element of every sub-list. On that list we need to repeat step 1), which we do with a recursive call to transpose.
Overall we get:
let rec transpose lst =
    if allEmpty lst
    then // Some default value, we don't know yet
    else (pickFirsts lst) :: (transpose (dropFirsts lst))
At this point we can think about the default value. transpose needs to return a value when it ends up with a list of empty lists. Since we prepend an element to the result of transpose, that result must be a list, and the best default value is an empty list. So we end up with:
let rec transpose lst =
    if allEmpty lst
    then []
    else (pickFirsts lst) :: (transpose (dropFirsts lst))
Next we need to implement the remaining functions allEmpty, pickFirsts and dropFirsts.
pickFirsts is easy. We need to iterate over each sub-list and return its first value. We get the first value of a list with List.head, and iterating over a list and turning every element into a new value is what List.map does.
let pickFirsts lst = List.map List.head lst
dropFirsts needs to iterate over each sub-list and remove its first element, or in other words keep the tail of each list.
let dropFirsts lst = List.map List.tail lst
The remaining allEmpty is a predicate that returns true or false depending on whether all the sub-lists are empty. Since the return value is a bool, we need a function that can return a type other than a list. This is the usual reason to use List.fold. An implementation could look like this:
let allEmpty lst =
    let folder acc x =
        match x with
        | [] -> acc
        | _ -> false
    List.fold folder true lst
It starts with true as the default value. As long as it finds empty lists, it returns the accumulator unchanged. As soon as a non-empty list is found, it returns false (not empty) as the new accumulator value.
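As a side note, the same predicate can also be expressed without a hand-written fold. The sketch below uses List.forall with List.isEmpty; the name allEmptyAlt is made up here so it does not clash with the allEmpty used in the rest of the answer.

```fsharp
// True when every inner list is empty (and also for an empty outer list),
// which matches the behaviour of the fold-based version.
let allEmptyAlt (lst: 'a list list) = List.forall List.isEmpty lst

printfn "%b" (allEmptyAlt [ [ 1; 2 ]; [] ])   // false
printfn "%b" (allEmptyAlt [ ([]: int list); [] ])   // true
```

Unlike the fold, List.forall stops at the first non-empty list instead of visiting the rest.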
The whole code:
let allEmpty lst =
    let folder acc x =
        match x with
        | [] -> acc
        | _ -> false
    List.fold folder true lst
let pickFirsts lst = List.map List.head lst
let dropFirsts lst = List.map List.tail lst
let rec transpose lst =
    if allEmpty lst
    then []
    else (pickFirsts lst) :: (transpose (dropFirsts lst))
transpose [[1;2;3];[4;5;6];[7;8;9]]
Another approach would be to turn it into a two-dimensional mutable array, check the lengths, do the transformation, and return the result as an immutable list again.
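A minimal sketch of that array-based idea, under the assumption that ragged input should be rejected with an exception, might look like this. The function name transposeViaArray is made up for illustration.

```fsharp
// Transposes a rectangular list of lists by going through a 2D array.
// Raises ArgumentException when the inner lists differ in length.
let transposeViaArray (rows: 'T list list) : 'T list list =
    match rows with
    | [] -> []
    | first :: _ ->
        let width = List.length first
        // Length check: every row must be as long as the first one.
        if rows |> List.exists (fun r -> List.length r <> width) then
            invalidArg "rows" "All inner lists must have the same length"
        let grid = array2D rows
        let height = Array2D.length1 grid
        // Read the grid column by column to build the transposed rows.
        [ for c in 0 .. width - 1 ->
            [ for r in 0 .. height - 1 -> grid.[r, c] ] ]

printfn "%A" (transposeViaArray [ [ 1; 2; 3 ]; [ 4; 5; 6 ]; [ 7; 8; 9 ] ])
```

The explicit length check is the main difference from the recursive version, which relies on List.head and List.tail and would fail partway through on ragged input.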

F# check if a string contains only number

I am trying to figure out a nice way to check if a string contains only numbers. This is the result of my effort but it seems really verbose:
open System
let isDigit c = Char.IsDigit c
let rec strContainsOnlyNumber (s:string)=
    let charList = List.ofSeq s
    match charList with
    | x :: xs ->
        if isDigit x then
            strContainsOnlyNumber ( String.Concat (Array.ofList xs))
        else
            false
    | [] -> true
For example, it seems really ugly that I have to convert a string to a char list and then back to a string.
Can you figure out a better solution?
There are a few different options for approaching this.
Given that System.String is a sequence of characters, which you're currently using to turn into a list, you can skip the list conversions and just use Seq.forall to directly test:
let strContainsOnlyNumber (s:string) = s |> Seq.forall Char.IsDigit
If you want to see if it's a valid number, you can parse it into a number directly:
let strContainsOnlyNumber (s:string) = System.Int32.TryParse s |> fst
Note that this will also return true for things like "-342" (which contains -, but is a valid number).
Another approach would be to use a regular expression:
let numberCheck = System.Text.RegularExpressions.Regex("^[0-9]+$")
let strContainsOnlyNumbers (s:string) = numberCheck.IsMatch s
This matches only the ASCII digits 0-9, but could be adapted to include other symbols in numbers if needed.
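The Seq.forall and regular-expression versions are not quite interchangeable, and a short sketch shows two inputs where they disagree. The names viaForall and viaRegex are made up for this comparison.

```fsharp
open System
open System.Text.RegularExpressions

let viaForall (s: string) = s |> Seq.forall Char.IsDigit
let viaRegex (s: string) = Regex.IsMatch(s, "^[0-9]+$")

// Empty string: Seq.forall is vacuously true, the regex requires one digit.
printfn "%b %b" (viaForall "") (viaRegex "")              // true false

// U+0663 ARABIC-INDIC DIGIT THREE is a digit for Char.IsDigit,
// but it is outside the ASCII range [0-9].
printfn "%b %b" (viaForall "\u0663") (viaRegex "\u0663")  // true false
```

Which behaviour is right depends on what you plan to do with the string afterwards.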
If the goal is to later use the string as a number, my suggestion would be to just do a conversion, and store in an option:
// The string annotation lets the compiler pick the right TryParse overload.
let tryToInt (s: string) =
    match System.Int32.TryParse s with
    | true, v -> Some v
    | false, _ -> None
This will allow you to check to see if the value was a number (via Option.isSome), pattern match to use the results, and more.
Note that conversion to floating point numbers is nearly identical: just change Int32.TryParse to Double.TryParse if you want to handle float values.
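A minimal sketch of the floating point variant follows. Passing the invariant culture is an assumption added here so that "3.14" parses the same way regardless of the machine's regional settings; the single-argument Double.TryParse uses the current culture instead.

```fsharp
open System
open System.Globalization

// Returns Some value when the string is a valid float, otherwise None.
// NumberStyles.Float allows a sign, a decimal point and an exponent.
let tryToFloat (s: string) =
    match Double.TryParse(s, NumberStyles.Float, CultureInfo.InvariantCulture) with
    | true, v -> Some v
    | false, _ -> None

printfn "%A" (tryToFloat "3.14")  // Some 3.14
printfn "%A" (tryToFloat "abc")   // None
```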

parse log files with f#

I'm trying to parse data from IIS log files.
Each row has a date that I need like this:
u_ex15090503.log:3040:2015-09-05 03:57:45
And a name and email address I need in here:
&actor=%7B%22name%22%3A%5B%22James%2C%20Smith%22%5D%2C%22mbox%22%3A%5B%22mailto%3AJames.Smith%40student.colled.edu%22%5D%7D&
I start off by getting the correct column like this. This part works fine.
//get the correct column
let getCol =
    let line = fileReader inputFile
    line
    |> Seq.filter (fun line -> not (line.StartsWith("#")))
    |> Seq.map (fun line -> line.Split())
    |> Seq.map (fun line -> line.[7],1)
    |> Seq.toArray
getCol
Now I need to parse the above and get the date, name, and email, but I'm having a hard time figuring out how to do that.
So far I have this, which gives me 2 errors(below):
//split the above column at every "&"
let getDataInCol =
    let line = getCol
    line
    |> Seq.map (fun line -> line.Split('&'))
    |> Seq.map (fun line -> line.[5], 1)
    |> Seq.toArray
getDataInCol
The errors:
Seq.map (fun line -> line.Split('&'))
the field constructor 'Split' is not defined
Seq.map (fun line -> line.[5], 1)
the operator 'expr.[idx]' has been used on an object of indeterminate type based on information prior to this program point.
Maybe I'm going about this all wrong. I'm very new to F# so I apologize for the sloppy code.
Something like this would get the name and email. You'll still need to parse the date.
#r "Newtonsoft.Json.dll"
open System
open System.Text.RegularExpressions
open Newtonsoft.Json.Linq
// Partial active pattern: matches when the regex succeeds and
// yields the captured groups (without the whole-match group).
let (|Regex|_|) pattern input =
    let m = Regex.Match(input, pattern)
    if m.Success then Some(List.tail [ for g in m.Groups -> g.Value ])
    else None
type ActorDetails =
    {
        Date: DateTime
        Name: string
        Email: string
    }
let parseActorDetails queryString =
    match queryString with
    | Regex @"[\?|&]actor=([^&]+)" [json] ->
        let jsonValue = JValue.Parse(Uri.UnescapeDataString(json))
        {
            Date = DateTime.UtcNow (* replace with parsed date *)
            Name = jsonValue.Value<JArray>("name").[0].Value<string>()
            // [7..] drops the leading "mailto:" prefix
            Email = jsonValue.Value<JArray>("mbox").[0].Value<string>().[7..]
        }
    | _ -> invalidArg "queryString" "Invalid format"
parseActorDetails "&actor=%7B%22name%22%3A%5B%22James%2C%20Smith%22%5D%2C%22mbox%22%3A%5B%22mailto%3AJames.Smith%40student.colled.edu%22%5D%7D&"
val it : ActorDetails = {Date = 11/10/2015 9:14:25 PM;
                         Name = "James, Smith";
                         Email = "James.Smith@student.colled.edu";}
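For the date that still needs parsing, one possible sketch is shown below. It assumes each row starts with the shape shown in the question, that is a file name, a line number and a timestamp separated by colons; tryParseLogDate is a made-up helper name, and the timestamp is parsed without any time zone conversion.

```fsharp
open System
open System.Globalization

// Hypothetical helper: assumes the row starts with
// "<file>:<line number>:<yyyy-MM-dd HH:mm:ss>", as in the sample row.
let tryParseLogDate (row: string) : DateTime option =
    let format = "yyyy-MM-dd HH:mm:ss"
    // Split into at most three parts so the colons inside the time survive.
    match row.Split([| ':' |], 3) with
    | [| _; _; rest |] when rest.Length >= format.Length ->
        let stamp = rest.Substring(0, format.Length)
        let ok, date =
            DateTime.TryParseExact(stamp, format, CultureInfo.InvariantCulture, DateTimeStyles.None)
        if ok then Some date else None
    | _ -> None

printfn "%A" (tryParseLogDate "u_ex15090503.log:3040:2015-09-05 03:57:45")
```

The resulting value could then replace the DateTime.UtcNow placeholder in the Date field, for example by passing the parsed date into parseActorDetails as an extra argument.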

Resources