Array indexer within choose() won't work - f#

Please pardon my dust here while I am trying to learn F#.
I have a function that gives me a Seq of arrays read from a CSV file. Each element of those arrays represents one column's data.
let file = readFile("""C:\path\to\file.csv""")
The first column contains dates, which I am trying to fetch. Here is my code:
let dates =
file
|> Seq.skip(1)
|> Seq.choose(fun x -> x.[0])
I am getting the following compile error
error FS0001: This expression was expected to have type 'a option
Am I using it wrong? When I hover the mouse over 'x', IntelliSense tells me x is of type string[].

What you actually wanted was
let dates =
file
|> Seq.skip(1)
|> Seq.map(fun x -> x.[0])
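Seq.choose maps and filters in one step: the function you pass must return an option, and every None is dropped from the result. That is why the compiler expected 'a option where you returned a string. As you don't use the filtering, you only need Seq.map.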
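To make the difference concrete, here is a minimal sketch. The rows value is a hypothetical stand-in for whatever readFile returns (a header row followed by data rows), and treating a blank first column as "missing" is my assumption, not something from the question:

```fsharp
// Hypothetical rows standing in for the parsed CSV (header row first).
let rows : seq<string[]> =
    seq {
        yield [| "Date"; "Value" |]
        yield [| "2014-01-01"; "10" |]
        yield [| ""; "11" |]
        yield [| "2014-01-03"; "12" |]
    }

// Seq.map: exactly one output per input row, blanks included.
let allDates =
    rows
    |> Seq.skip 1
    |> Seq.map (fun x -> x.[0])
    |> List.ofSeq

// Seq.choose: the function returns an option, and None drops the row.
let nonBlankDates =
    rows
    |> Seq.skip 1
    |> Seq.choose (fun x -> if x.[0] = "" then None else Some(x.[0]))
    |> List.ofSeq
```

If every row should produce a value, Seq.map is the simpler choice; Seq.choose only pays off when some rows should be skipped.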

I got it fixed. Some() is what I wanted.
let dates =
file
|> Seq.skip(1)
|> Seq.choose(fun x -> Some(x.[0]))

Related

Why does F# not like the type ('a list list) as input?

*I edited my original post to include more info.
I'm working on an F# assignment where I'm supposed to create a function that takes an "any list list" as input and outputs an "any list". It should be able to concatenate a list of lists into a single list.
Here's what my function looks like:
let llst = [ [1] ; [2;3] ; ['d';'e';'f'] ]
let concat (llst:'a list list) : 'a list =
List.concat llst
List.iter (fun elem -> printf "%d " elem) concat
This solution is more or less copied directly from Microsoft's example of using the List.concat function, the only exception being the specification of input/output types.
When I run the code I get this error:
concat.fsx(7,43): error FS0001: This expression was expected to have type
''a list'
but here has type
''b list list -> 'b list'
So it appears that concat is turning my llst into a character list, which I don't understand.
Can anyone help me understand why I'm getting this type error and how I can write a function that takes the types that I need?
The problem is somewhere in your implementation of the concat function. It is hard to say where exactly without seeing your code, but since this is an assignment, it is actually perhaps better to explain what the error message is telling you, so that you can find the issue yourself.
The error message is telling you that the F# type inference algorithm found a place in your code where the actual type of what you wrote does not match the type that is expected in that location. It also tells you what the two mismatching types are. For example, say you write something like this:
let concat (llst:'a list list) : 'a list =
llst
You will get the error you are getting on the second line, because the type of llst is 'a list list (the compiler knows this from the type annotation you give on line 1), but the expected type is the same as the result type of the function which is 'a list - also specified by your type annotation.
So, to help you find the issue: look at the exact place where you are getting the error, try to infer why the compiler thinks that the actual type is 'a list list, and try to understand why it expects 'a list in that place.
This is correct:
let concat (llst:'a list list) : 'a list =
List.concat llst
However, it's really equivalent to let concat = List.concat
This, however, doesn't compile, the elements of the lists need to be of the same type:
let llst = [ [1] ; [2;3] ; ['d';'e';'f'] ]
This also is problematic:
List.iter (fun elem -> printf "%d " elem) concat
List.iter takes two arguments, and the second one needs to be a list. However, in your case you are (as per the compiler error) passing the concat function itself, which has type 'a list list -> 'a list.
What I suspect you meant to do is apply the concat function to your llst first:
List.iter (fun elem -> printf "%d " elem) (concat llst)
// or
llst
|> concat
|> List.iter (fun elem -> printf "%d " elem)
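Putting those pieces together, a version that compiles might look like the sketch below. The all-int llst is my own substitute for the mixed int/char list in the question, which cannot compile as written:

```fsharp
// All inner lists share one element type (int), so this list is valid.
let llst = [ [1]; [2; 3]; [4; 5; 6] ]

// Same function as in the question: 'a list list -> 'a list.
let concat (llst: 'a list list) : 'a list =
    List.concat llst

// Apply concat to the data first; the result is an int list.
let flat = concat llst

// Now List.iter receives a list, not a function, as its second argument.
flat |> List.iter (fun elem -> printf "%d " elem)
```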
However, all of this is perhaps missing the point of the exercise. What perhaps you need to do is implement some simple recursion based on the empty / non-empty state of your list, ie. fill in the blanks from here:
let rec myconcat acc inlist =
match inlist with
| [] -> ??
| elt :: tail -> ??

CSV Type Provider & Accessing Data

Good evening! I am a very new programmer getting my feet wet with F#. I am attempting to do some simple data analysis and plotting but I cannot figure out how to access the data properly. I get everything set up and use the CsvProvider and it works perfectly:
#load @"packages\FsLab\FsLab.fsx"
#load @"packages\FSharp.Charting\FSharp.Charting.fsx"
open Deedle
open FSharp.Data
type Pt = CsvProvider<"C:/Users/berkl/Test10/CGC.csv">
let data = Pt.Load("C:/Users/berkl/Test10/CGC.csv")
Then, I pull out the data for a specific entry:
let test = data.Rows |> Seq.filter (fun r -> r.``Patient number`` = 2104)
This works as expected and prints the following to FSI:
test;;
val it : seq<CsvProvider<...>.Row> =
seq
[(2104, "Cita 1", "Nuevo", "Femenino", nan, nan, nan);
(2104, "Cita 2", "Establecido", "", 18.85191818, 44.0, 103.0);
(2104, "Cita 3", "Establecido", "Femenino", 17.92617533, 46.0, 108.0);
(2104, "Cita 4", "Establecido", "Femenino", nan, nan, nan); ...]
Here is where I'm at a loss. I want to take out the fifth column and plot it against the sixth column. And I don't know how to access it.
What I can do so far is access a single value in one of the columns:
let Finally = Seq.item 1 test
let PtHt = Finally.Ht_cm
Any help is much appreciated!!
I would probably recommend using the XPlot library instead of F# Charting, because that is the one that's going to be available in FsLab in the long term (it is cross-platform).
To create a chart using XPlot, you need to give it a sequence of pairs with X and Y values:
#load "packages/FsLab/FsLab.fsx"
open XPlot.Plotly
Chart.Scatter [ for x in 0.0 .. 0.1 .. 10.0 -> x, sin x ]
In your example, you can get the required format using sequence comprehensions (as in the above example) or using Seq.map as in the existing answer - both options do the same thing:
// Using sequence comprehensions
Chart.Scatter [ for row in test -> row.Ht_cm, row.Wt_kg ]
// Using Seq.map and piping
test |> Seq.map (fun row -> row.Ht_cm, row.Wt_kg) |> Chart.Scatter
The key thing is that you need to produce one sequence (or a list) containing the X and Y values as a tuple (rather than producing two separate sequences).
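As a rough illustration of that shape, without any charting library involved: the Row record below is a hypothetical stand-in for the type provider's row type, the field names Ht_cm and Wt_kg are taken from the snippets above, and dropping NaN rows is my own assumption about what you would want before plotting.

```fsharp
// Hypothetical stand-in for the CsvProvider row type.
type Row = { Ht_cm: float; Wt_kg: float }

let test =
    [ { Ht_cm = nan; Wt_kg = nan }
      { Ht_cm = 44.0; Wt_kg = 103.0 }
      { Ht_cm = 46.0; Wt_kg = 108.0 } ]

// Keep only rows where both values are present.
let complete =
    test
    |> List.filter (fun row ->
        not (System.Double.IsNaN(row.Ht_cm) || System.Double.IsNaN(row.Wt_kg)))

// One list of (x, y) tuples, built directly from the rows.
let points = complete |> List.map (fun row -> row.Ht_cm, row.Wt_kg)

// If the columns were already pulled out separately, Seq.zip pairs them up again.
let heights = complete |> Seq.map (fun row -> row.Ht_cm)
let weights = complete |> Seq.map (fun row -> row.Wt_kg)
let zipped = Seq.zip heights weights |> List.ofSeq
```

Either points or zipped has the one-sequence-of-pairs shape that a scatter chart call expects.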
What you want to do is transform your sequence of rows to a sequence of values from a column. You use Seq.map for any such transformation.
In your case, you could do (modulo the correct column names which I don't have)
let col5 =
test
|> Seq.map (fun row -> row.Ht_cm)
let col6 =
test
|> Seq.map (fun row -> row.Wt_kg)

Get a column by name as array from CsvFile.Load (or create dictionary of arrays from csv)

I have the following code to load a csv. What is the best way to get a column from "msft" (preferably by name) as an array? Or should I be loading the data in a different way to do this?
#r "FSharp.Data.dll"
open FSharp.Data.Csv
let msft = CsvFile.Load("http://ichart.finance.yahoo.com/table.csv?s=MSFT").Cache()
Edit: Alternatively, what would be an efficient way to import a csv into a dictionary of arrays keyed by column name? If I should really be creating a new question for this, please let me know. Not yet familiar with all stackoverflow standards.
Building on Latkin's answer, this seems like the more functional or F# way of doing what you want.
let getVector columnAccessor msft =
[| yield! msft.Data |> Seq.map columnAccessor |]
(* Now we can get the column all at once *)
let closes = getVector (fun x -> x.Close) msft
(* Or we can create an accessor and pipe our data to it. *)
let getCloses = getVector (fun x -> x.Close)
let closes = msft |> getCloses
I hope that this helps.
I went through this example as well. Something like the following should do it.
let data =
msft.Data
|> List.fold (fun acc row -> row.Date :: acc) List.Empty<DateTime>
Here I am piping the msft.Data list of msft data records and folding it down to a list of one item from that list. Please check the documentation for all functions mentioned. I have not run this.
When you say you want a column "by name" it's not clear if you mean "someone passes me the column name as a string" or "I use the column name in my code." Type providers are perfect for the latter case, but do not really help with the former.
For the latter case, you could use this:
let closes = [| yield! msft.Data |> Seq.map (fun x -> x.Close) |]
If the former, you might want to consider reading in the data some other way, perhaps to a dictionary keyed by column names.
The whole point of type providers is to make all of this strongly typed and code-focused, and to move away from passing column names as strings which might or might not be valid.
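For the string-keyed case, one possible sketch that skips the type provider entirely is shown below. The sample lines are made up, the split on ',' is deliberately naive (no quoted fields), and every row is assumed to have as many fields as the header:

```fsharp
// Hypothetical CSV content; a real file could come from System.IO.File.ReadAllLines.
let lines =
    [| "Date,Open,Close"
       "2014-01-02,37.35,37.16"
       "2014-01-03,37.20,36.91" |]

// Naive parsing: split on ',' only, so quoted fields are not handled.
let header = lines.[0].Split(',')
let rows = lines.[1..] |> Array.map (fun line -> line.Split(','))

// Map from column name to that column's values, in row order.
let columns : Map<string, string[]> =
    header
    |> Array.mapi (fun i name -> name, rows |> Array.map (fun r -> r.[i]))
    |> Map.ofArray

// Look a column up by name at run time and convert it as needed.
let closes = columns.["Close"] |> Array.map float
```

The trade-off is the one described above: a misspelled column name now fails at run time with a KeyNotFoundException instead of at compile time.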

Working with large text files?

I need to import a large text file (55MB) (525000 * 25) and manipulate the data and produce some output. As usual I started exploring with f# interactive, and I get some really strange behaviours.
Is this file too large or my code wrong?
First test was to import and simply compute the sum over one column (not the end goal but first test):
let calctest =
let reader = new StreamReader(path)
let csv = reader.ReadToEnd()
csv.Split([|'\n'|])
|> Seq.skip 1
|> Seq.map (fun line -> line.Split([|','|]))
|> Seq.filter (fun a -> a.[11] = "M")
|> Seq.map (fun values -> float(values.[14]))
As expected this produces a seq of float both in typecheck and in interactive. If I now add:
|> Seq.sum
Type check works and says this function should return a float but if I run it in interactive I get this error:
System.IndexOutOfRangeException: Index was outside the bounds of the array
Then I removed the last line again and thought I look at the seq of float in a text file:
let writetest =
let str = calctest |> Seq.map (fun i -> i.ToString())
System.IO.File.WriteAllLines("test.txt", str )
Again, this passes the type check but throws errors in interactive.
Can the standard StreamReader not handle that amount of data? Or am I going wrong somewhere? Should I use a different function than StreamReader?
Thanks.
Seq is lazy, which means the mapping and filtering are only actually done once you add Seq.sum; that's why you don't see the error before adding that line. Are you sure you have at least 15 columns on all rows? That's probably the problem.
I would advise you to use the CSV Type Provider instead of just doing a string.Split. That way you'll be sure not to get an accidental IndexOutOfRangeException, and you'll handle comma escaping correctly.
Additionally, you're reading the whole CSV file into memory by calling reader.ReadToEnd(). The CsvProvider supports streaming if you set the Cache parameter to false. It's not a problem with a 55MB file, but if you have something much larger it might be.
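Here is a small sketch of both the laziness and one possible cause. The data is made up, with a trailing newline so that Split produces an empty last line; whether that is what happens in your file is only a guess on my part. Column indexes 1 and 2 stand in for 11 and 14 from the question.

```fsharp
// Hypothetical data; the trailing '\n' makes Split yield a final empty line.
let csv = "h0,h1,h2\na,M,1.5\nb,F,2.0\nc,M,2.5\n"

// Building the pipeline runs nothing yet, so no exception is raised here.
let lazyValues =
    csv.Split([| '\n' |])
    |> Seq.skip 1
    |> Seq.map (fun line -> line.Split([| ',' |]))
    |> Seq.filter (fun a -> a.[1] = "M")
    |> Seq.map (fun values -> float(values.[2]))

// Enumerating (here via Seq.sum) is what finally triggers the indexing error.
let failed =
    try
        lazyValues |> Seq.sum |> ignore
        false
    with :? System.IndexOutOfRangeException -> true

// Guarding on the field count before indexing skips short or empty lines.
let total =
    csv.Split([| '\n' |])
    |> Seq.skip 1
    |> Seq.map (fun line -> line.Split([| ',' |]))
    |> Seq.filter (fun a -> a.Length > 2 && a.[1] = "M")
    |> Seq.sumBy (fun values -> float(values.[2]))
```

The length guard silently drops malformed rows, which may or may not be what you want; logging them instead would be a reasonable alternative.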

Splitting a string list list

I'm quite (very) new to F# and I'm scratching my head over a little problem. I have a string list list that I'm trying to manipulate and transform. This is probably trivial.
The following data is being read in from a CSV file:
1,ABC,3
1,DEF,3
1,XYZ,1
2,ABC,2
2,XYZ,1
3,DEF,2
3,XYZ,2
Which, right or wrong, I'm reading into a string list list. This data represents a non-normalized set of data, where the field at index 0 on each record is an Identifier field. At the moment I'm just trying to split the outer list up so that I end up with a string list list list representing the following:
1,ABC,3 2,ABC,2 3,DEF,2
1,DEF,3 2,XYZ,1 3,XYZ,2
1,XYZ,1
The results above will then be pushed into my Typed model and fed into the rest of the application.
In your code:
csvRecords
|> Seq.groupBy (fun record -> (record.Item 0))
|> List.ofSeq
|> List.map(toTypedModel)
record.Item 0 isn't a good way to get the first element of a list. You should either use List.head or pattern matching for that purpose.
Your example would look like:
csvRecords
|> Seq.groupBy List.head
|> Seq.map toTypedModel
|> List.ofSeq
I also changed the order to use toTypedModel with sequence, it helps to avoid allocating an unnecessary list.
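If you only want the grouped string list list list from the question, before any typed model is involved, a sketch along these lines should do it. csvRecords here is the sample data from the question typed in by hand, and discarding the group key is my choice for the illustration:

```fsharp
// The sample data from the question as a string list list.
let csvRecords : string list list =
    [ [ "1"; "ABC"; "3" ]
      [ "1"; "DEF"; "3" ]
      [ "1"; "XYZ"; "1" ]
      [ "2"; "ABC"; "2" ]
      [ "2"; "XYZ"; "1" ]
      [ "3"; "DEF"; "2" ]
      [ "3"; "XYZ"; "2" ] ]

// Seq.groupBy yields (key, records) pairs; keep only the records of each group.
let groups : string list list list =
    csvRecords
    |> Seq.groupBy List.head
    |> Seq.map (fun (_, records) -> List.ofSeq records)
    |> List.ofSeq
```

Note that List.head throws on an empty record, so blank lines would need to be filtered out first.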
Use Seq.groupBy:
input
|> Seq.groupBy (fun (a,b,c) -> a)
|> Seq.toList

Resources