Any benefit to targeting F# runtime 4.0 or 3.1 instead of 3.0?

In the Visual Studio 2015 Preview you can select from three target F# runtimes: 3.0, 3.1, and 4.0.
Is there any benefit to targeting the newer versions? Do they give you access to additional APIs? If so, which ones? It would be great if we could generate a comprehensive list.
F# Core Library Reference

Unfortunately, I don't think there is a complete list of things that you get from referencing F# 4.0. However, looking at the list of new things on CodePlex, there are a few obvious ones:
Lots of new functions in List, Seq and Array modules (so that equivalent functionality is available in all of the modules where possible)
A number of other library additions (search the table for "Library"), including things like tryUnbox, isNull, ofObj, toObj, ofNullable, toNullable but also AwaitTask for non-generic tasks
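A few of those additions in action (a quick sketch of mine, not from the original answer; it requires the F# 4.0-era FSharp.Core):

```fsharp
// isNull, Option.ofObj and Option.toNullable all arrived with F# 4.0
let missing : string = null

printfn "%b" (isNull missing)              // true
printfn "%A" (Option.ofObj missing)        // None - null becomes an option
printfn "%A" (Option.toNullable (Some 42)) // a Nullable<int> holding 42
```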
Out of the language features, the support for quoting arguments of method calls is definitely one that requires the new F# core.
Also, I'm not quite sure which of these are actually in the preview - I suspect most of them are not.
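To illustrate the auto-quotation of method arguments mentioned above (this sketch is mine, not from the answer): marking a parameter with [<ReflectedDefinition>] makes the compiler pass the quotation of the caller's expression rather than just its value.

```fsharp
open Microsoft.FSharp.Quotations

type Logger =
    // Callers write Logger.Log(1 + 2); the compiler quotes the argument for them
    static member Log([<ReflectedDefinition>] expr: Expr<int>) = sprintf "%A" expr

// Prints the quotation tree, e.g. Call (None, op_Addition, [Value (1), Value (2)])
printfn "%s" (Logger.Log(1 + 2))
```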

I was able to generate a complete list of new additions to the public surface area of FSharp.Core for 3.1 and 4.0. The code I used to generate the list of differences is included and can be re-purposed.
https://gist.github.com/ctaggart/0205da3f153cd20b099d

A quick-n-dirty way to see the deltas in the public surface area is to crib the code from the FSharp.Core public surface area unit tests :-)
Create a console app with the code below, and rebuild/rerun it against each version you are interested in. It will dump all public APIs in that version. You can use windiff or your diff tool of choice to compare APIs between versions.
open System.Reflection

let file = typeof<int list>.Assembly.Location
let asm = Assembly.ReflectionOnlyLoadFrom(file)
let referenced = asm.GetReferencedAssemblies()
for ref in referenced do
    Assembly.ReflectionOnlyLoad(ref.FullName) |> ignore
let types = asm.GetExportedTypes()
let values =
    types
    |> Array.collect (fun t -> t.GetMembers())
    |> Array.map (fun v -> sprintf "%s: %s" (v.ReflectedType.ToString()) (v.ToString()))
    |> Array.sort
    |> String.concat "\r\n"
// dump to a file or print to console
printfn "%s" values

How to do tuple augmentation

The following code is from chapter 5 of "F# 4.0 Design Patterns".
let a = 1,"car"

type System.Tuple<'T1,'T2> with
    member t.AsString() =
        sprintf "[[%A]:[%A]]" t.Item1 t.Item2

(a |> box :?> System.Tuple<int,string>).AsString()
The desired output is [[1]:["car"]]
However, a red squiggly appears under AsString(). "The field, constructor or member 'AsString' is not defined. Maybe you want one of the following: ToString"
This is a bit of an odd code sample - I suspect the point it is making is that F# tuples are actually .NET tuples represented using System.Tuple, shown by the fact that an extension to System.Tuple can be invoked on ordinary F# tuples.
I suspect the behaviour of F# has changed and it no longer allows this - it may be that adding extensions was allowed on System.Tuple but not on ordinary F# tuples, and the two have since become more unified in the compiler.
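You can verify the shared representation directly (my snippet, not part of the original answer):

```fsharp
// A boxed F# tuple really is a System.Tuple at runtime
let t = box (1, "car")
printfn "%b" (t :? System.Tuple<int, string>)   // true
printfn "%s" (t.GetType().Name)                 // Tuple`2
```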
However, you can do a very similar thing using the .NET-style extension methods:
let a = 1,"car"

[<System.Runtime.CompilerServices.ExtensionAttribute>]
type TupleExtensions =
    [<System.Runtime.CompilerServices.ExtensionAttribute>]
    static member AsString(t: System.Tuple<'T1,'T2>) =
        sprintf "[[%A]:[%A]]" t.Item1 t.Item2

let st = (a |> box :?> System.Tuple<int,string>)
st.AsString()
This can actually be also invoked directly on an F# tuple value:
("car", 32).AsString()

F# CSV TypeProvider less robust in console application

I am trying to experiment with live data from the Coronavirus pandemic (unfortunately and good luck to all of us).
I have developed a small script and I am transitioning it into a console application; it uses CSV type providers.
I have the following issue. Suppose we want to filter the Italian spread data by province; we can use this code in a .fsx file:
open FSharp.Data
let provinceData = CsvProvider<"https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-province/dpc-covid19-ita-province.csv", IgnoreErrors = true>.GetSample()
let filterDataByProvince province =
    provinceData.Rows
    |> Seq.filter (fun x -> x.Sigla_provincia = province)
Sequences being lazy, suppose I then force evaluation to load into memory the data for the province of Rome; I can add:
let romeProvince = filterDataByProvince "RM" |> Seq.toArray
This works fine, run by FSI, locally.
Now, if I transition this code into a console application using a .fs file, I declare exactly the same functions and use exactly the same type provider loader; but instead of using the last line to gather the data, I put it into a main function:
[<EntryPoint>]
let main _ =
    let romeProvince = filterDataByProvince "RM" |> Seq.toArray
    Console.Read() |> ignore
    0
This results in the following runtime exception:
System.Exception
HResult=0x80131500
Message=totale_casi is missing
Source=FSharp.Data
StackTrace:
at <StartupCode$FSharp-Data>.$TextRuntime.GetNonOptionalValue#139-4.Invoke(String message)
at CoronaSchiatta.Evoluzione.provinceData#10.Invoke(Object parent, String[] row) in C:\Users\glddm\source\repos\CoronaSchiatta\CoronaSchiatta\CoronaEvolution.fs:line 10
at FSharp.Data.Runtime.CsvHelpers.parseIntoTypedRows#174.GenerateNext(IEnumerable`1& next)
Can you explain that?
Some rows have an odd format, possibly, but the FSI session is robust to those, whilst the console version is fragile; why? How can I fix that?
I am using VS2019 Community Edition, targeting .NET Framework 4.7.2, F# runtime: 4.7.0.0;
as FSI, I am using the following: FSI Microsoft (R) F# Interactive version 10.7.0.0 for F# 4.7
PS: Please also be aware that if I use CsvFile, instead of type providers, as in:
let test =
    "https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-province/dpc-covid19-ita-province.csv"
    |> CsvFile.Load
    |> (fun x -> x.Rows)
    |> Seq.filter (fun x -> x.[6] = "RM")
    |> Seq.iter (fun x -> x.[9] |> Console.WriteLine)
Then it works like a charm in the console application too. Of course I would like to use type providers; otherwise I have to add type definitions mapping the schema to the columns (and that would be more fragile). The last line was just a quick test.
Fragility
CSV Type Providers can be fragile if you don't have a good schema or sample.
Getting a runtime error at this point is almost certainly because your data doesn't match up with the inferred schema.
How do you figure it out? One way is to run through your data first:
provinceData.Rows |> Seq.iteri (fun i x -> printfn "Row %d: %A" (i + 1) x)
This runs up to Row 2150. And sure enough, the next row:
2020-03-11 17:00:00,ITA,19,Sicilia,994,In fase di definizione/aggiornamento,,0,0,
You can see the last value (totale_casi) is missing.
One of CsvProvider's options is InferRows. This is the number of rows the provider scans in order to build up a schema - and its default value happens to be 1000.
So, setting it to 0 makes the provider scan the whole file before inferring the schema (note that a type provider's static argument must be a literal):
let [<Literal>] uri = "https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-province/dpc-covid19-ita-province.csv"
type COVID = CsvProvider<uri, InferRows = 0>
A better way to prevent this from happening in the future is to manually define a sample from a sub-set of data:
type COVID = CsvProvider<"sample-dpc-covid19-ita-province.csv">
and sample-dpc-covid19-ita-province.csv is:
data,stato,codice_regione,denominazione_regione,codice_provincia,denominazione_provincia,sigla_provincia,lat,long,totale_casi
2020-02-24 18:00:00,ITA,13,Abruzzo,069,Chieti,CH,42.35103167,14.16754574,0
2020-02-24 18:00:00,ITA,13,Abruzzo,066,L'Aquila,AQ,42.35122196,13.39843823,
2020-02-24 18:00:00,ITA,13,Abruzzo,068,Pescara,PE,42.46458398,14.21364822,0
2020-02-24 18:00:00,ITA,13,Abruzzo,067,Teramo,TE,42.6589177,13.70439971,0
With this the type of totale_casi is now Nullable<int>.
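A nullable column then has to be handled explicitly when you consume it; here is a self-contained sketch of the pattern (mine, not from the answer):

```fsharp
open System

// Treat a missing totale_casi as 0 when aggregating
let totaleCasi : Nullable<int> list = [ Nullable 3; Nullable(); Nullable 5 ]
let total = totaleCasi |> List.sumBy (fun n -> if n.HasValue then n.Value else 0)
printfn "%d" total   // 8
```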
If you don't mind NaN values, you can also use:
CsvProvider<..., AssumeMissingValues = true>
Why does FSI seem more robust?
FSI isn't more robust. This is my best guess:
Your schema source is being regularly updated.
Type Providers cache the schema so that it isn't regenerated every time you compile your code, which would be impractical. When you restart an FSI session you regenerate the Type Provider, but not so with the console application. So FSI can sometimes have the effect of seeming less error-prone, simply because it has worked with a newer version of the source.

Why are some functions available only in F# script files, not in source files?

I have noticed this a few times now. An example of an offending function is Array.take. In a script file I can write
[|1; 2; 4; 7; 6; 5|]
|> Array.take 3
|> Array.iter (printfn "%d")
and this works without a problem. But if I try to use Array.take in an actual source file, I get the following error
[|1; 2; 4; 7; 6; 5|]
|> Array.take 3 // RED SQUIGGLY ERROR HERE
|> Array.iter (printfn "%d")
and the error message is:
The value, constructor, namespace or type 'take' is not defined
So, what gives? Thanks in advance for your help.
I suspect that what you're seeing is due to different versions of F#. There was an attempt to regularise a lot of the List, Seq and Array functions in F# 4.0, see: https://visualfsharp.codeplex.com/wikipage?title=Status
One of the functions that was added as part of that process was Array.take.
In F# interactive, no doubt you are using the latest version of F# but presumably you are not in your compiled project.
This could be because you haven't changed the version in the project settings, or because you have a NuGet package attached to your project which references a specific version of FSharp.Core.
If you go to your project properties, you should see a 'Target F# Runtime' setting; change this to F# 4.0. If I remember correctly, a NuGet reference to a specific FSharp.Core version will prevent you from changing that setting, in which case you'll need to delete the reference to FSharp.Core and re-add the correct version from the list of assemblies.
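Once you are targeting the newer runtime, note the semantics (my aside, not from the original answer): like Seq.take, Array.take throws when there are not enough elements, whereas Array.truncate stops quietly.

```fsharp
let xs = [| 1; 2; 4; 7; 6; 5 |]

printfn "%A" (Array.take 3 xs)        // [|1; 2; 4|]
printfn "%A" (Array.truncate 10 xs)   // the whole array, no exception

// Array.take 10 xs raises InvalidOperationException instead
try Array.take 10 xs |> ignore with _ -> printfn "take past the end throws"
```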

File transform in F#

I am just starting to work with F# and trying to understand typical idioms and effective ways of thinking and working.
The task at hand is a simple transform of a tab-delimited file to one which is comma-delimited. A typical input line will look like:
let line = "#ES# 01/31/2006 13:31:00 1303.00 1303.00 1302.00 1302.00 2514 0"
I started out with looping code like this:
// inFile and outFile defined in preceding code not shown here
for line in File.ReadLines(inFile) do
    let typicalArray = line.Split '\t'
    let transformedLine = typicalArray |> String.concat ","
    outFile.WriteLine(transformedLine)
I then replaced the split/concat pair of operations with a single Regex.Replace():
for line in File.ReadLines(inFile) do
    let transformedLine = Regex.Replace(line, "\t", ",")
    outFile.WriteLine(transformedLine)
And now, finally, have replaced the looping with a pipeline:
File.ReadLines(inFile)
|> Seq.map (fun x -> Regex.Replace(x, "\t", ","))
|> Seq.iter (fun y -> outFile.WriteLine(y))
// other housekeeping code below here not shown
While all versions work, the final version seems to me the most intuitive. Is this how a more experienced F# programmer would accomplish this task?
I think all three versions are perfectly fine, idiomatic code that F# experts would write.
I generally prefer writing code using built-in language features (like for loops and if conditions) if they let me solve the problem I have. These are imperative, but I think using them is a good idea when the API requires imperative code (like outFile.WriteLine). As you mentioned - you started with this version (and I would do the same).
Using higher-order functions is nice too - although I would probably do that only if I wanted to write a data transformation and get a new sequence or list of lines - this would be handy if you were using File.WriteAllLines instead of writing lines one-by-one. That said, the same could also be achieved by simply wrapping your second version in a sequence expression:
let transformed =
    seq { for line in File.ReadLines(inFile) -> Regex.Replace(line, "\t", ",") }
File.WriteAllLines(outFilePath, transformed)
I do not think there is any objective reason to prefer one of the versions. My personal stylistic preference is to use for and refactor to sequence expressions (if needed), but others will likely disagree.
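As a further aside (mine, not from the answers): when the transform is a literal character swap, String.Replace does the job without the regex machinery. The file names here are made up for the sketch:

```fsharp
open System.IO

// Hypothetical paths, standing in for inFile/outFile from the question
let inFile, outFile = "input.tsv", "output.csv"
File.WriteAllText(inFile, "a\tb\tc\n1\t2\t3\n")

// A plain char-for-char replacement; no Regex needed
File.ReadLines(inFile)
|> Seq.map (fun line -> line.Replace('\t', ','))
|> fun lines -> File.WriteAllLines(outFile, lines)
```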
A side note: if you want to write to the same file you are reading from, remember that Seq does lazy evaluation.
Using Array as opposed to Seq makes sure the file is closed for reading by the time it is needed for writing.
This works:
let lines =
    file
    |> File.ReadAllLines
    |> Array.map (fun line -> ..modify line..)
File.WriteAllLines(file, lines)
This does not (causes file access file violation)
let lines =
    file
    |> File.ReadLines
    |> Seq.map (fun line -> ..modify line..)
File.WriteAllLines(file, lines)
(This potentially overlaps with another discussion, where an intermediate variable helps with the same problem.)

Using F# for the semantic web implementing in memory triple stores

What is an effective means of implementing an in-memory semantic web triple store in F#, using the basic .NET collection classes?
Are there any F# examples or projects already doing this?
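Before reaching for a library, the core idea fits in a few lines of F# over the basic collections; a minimal sketch (mine; the names and API are invented for illustration), where an omitted argument plays the role of a SPARQL variable:

```fsharp
open System.Collections.Generic

// A triple is just subject, predicate, object
type Triple = { S: string; P: string; O: string }

// Minimal in-memory store over a HashSet
type MiniStore() =
    let triples = HashSet<Triple>()
    member this.Add(s, p, o) = triples.Add { S = s; P = p; O = o } |> ignore
    // Omitted arguments act as wildcards, like variables in a basic graph pattern
    member this.Query(?s: string, ?p: string, ?o: string) =
        triples
        |> Seq.filter (fun t ->
            (s |> Option.forall ((=) t.S))
            && (p |> Option.forall ((=) t.P))
            && (o |> Option.forall ((=) t.O)))
        |> Seq.toList

let store = MiniStore()
store.Add("alice", "knows", "bob")
store.Add("bob", "knows", "carol")
printfn "%d" (store.Query(p = "knows") |> List.length)   // 2
```

A real store would add indexes (for instance dictionaries keyed by subject and by predicate) so that patterns don't require a full scan.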
There is also SemWeb, a C# library which provides its own SQL-based Triple Store - http://razor.occams.info/code/semweb/
I'm working on a new C# library for RDF called dotNetRDF and have just released the latest Alpha http://www.dotnetrdf.org.
Here's an equivalent program to the one spoon16 showed:
open System
open VDS.RDF
open VDS.RDF.Parsing
open VDS.RDF.Query

//Get a Graph and fill it from a file
let g = new Graph()
let parser = new TurtleParser()
parser.Load(g, "test.ttl")

//Place into a Triple Store and query
let store = new TripleStore()
store.Load(g)
let results = store.ExecuteQuery("SELECT ?s ?p ?o WHERE {?s ?p ?o} LIMIT 10") :?> SparqlResultSet

//Output the results
Console.WriteLine(results.Count.ToString() + " Results")
for result in results.Results do
    Console.WriteLine(result.ToString())

//Wait for user to hit enter so they can see the results
Console.ReadLine() |> ignore
My library currently supports my own SQL databases, AllegroGraph, 4store, Joseki, Sesame, Talis and Virtuoso as backing stores.
Check out LinqToRdf which, in addition to simple VS.NET hosted modeling tools, provides a full LINQ query provider and round-tripping data when dealing with in-memory databases:
var ctx = new MusicDataContext(@"http://localhost/linqtordf/SparqlQuery.aspx");
var q = (from t in ctx.Tracks
         where t.Year == "2006" &&
               t.GenreName == "History 5 | Fall 2006 | UC Berkeley"
         orderby t.FileLocation
         select new { t.Title, t.FileLocation }).Skip(10).Take(5);
foreach (var track in q)
{
    Console.WriteLine(track.Title + ": " + track.FileLocation);
}
Intellidimension offers a .NET based in-memory triple store as part of their Semantics SDK. They often provide free licenses for research/education if you contact them.
I use their technology every single day from C# and PowerShell, though, and I really enjoy it.
//disclaimer: really the first time I have used F# so this may not be any good...
//but it does work
open Intellidimension.Rdf
open System.IO
let rdfXml = File.ReadAllText(@"C:\ontology.owl")
let gds = new GraphDataSource()
gds.Read(rdfXml) |> ignore
let tbl = gds.Query("select ?s ?p ?o where {?s ?p ?o} limit 10")
System.Console.Write(tbl.RowCount)
System.Console.ReadLine() |> ignore
Aduna's Sesame framework is ported to .NET. This is a sample code that shows how to use F# to connect to Sesame Http repository: http://debian.fmi.uni-sofia.bg/~toncho/myblog/archives/309-Using-F-to-connect-to-a-Sesame-repository.html
I know this doesn't directly answer your question, but you could use 4store which is a stable, proven triplestore, and write a client for it in .Net, instead of developing your own triplestore.