Is it possible to use arrays as static parameters in F# type providers?

I want to create a type provider with a static parameter that is an array. I thought this might work if the array's element type was itself a primitive type (int, string, etc.), but it does not seem to work.
As a motivating use-case, this would, for example, allow specifying header names when using a CSV type provider for .csv files without a header row.

Only primitive types can be used as type provider parameters. The current JSON type provider works around that by taking a comma-separated list of parameters as a single string.

Related

Aliases for primitive types in AVRO

I want to define aliases/logical names for primitive types. But if I do that directly like for PartyId in the example below, Avro-Tools do not accept this:
protocol Test {
  string PartyId;
  record Header {
    PartyId partyId;
  }
}
Is that possible in IDL or AVRO Schema language? How? As a work-around, I can define:
record PartyId {
  string _value;
}
Although the binary encoding seems equivalent, semantically (e.g. in generated Java code) this is not the same: PartyId is a structured type, not a primitive type.
I can define custom names for enums and records, but it seems to me that AVRO doesn't offer means for aliasing primitive types.
Aliases in Avro are for remapping the names and namespaces of named types such as records, not for giving a primitive type a new name.
Records and enums have names because they are actual named types, not primitives.
Just as you cannot refer to a Java String by another class name, Avro has no need for such a feature. You can write doc comments or name the fields appropriately to make it clear what each field is.

F# using XML Type Provider to modify xml

I need to process a bunch of XML documents. They are quite complex in their structure (i.e. loads of nodes), but the processing consists in changing the values for a few nodes and saving the file under a different name.
I am looking for a way to do that without having to reconstruct the output XML by explicitly instantiating all the types and passing all of the unchanged values in, but simply by copying them from the input. If the types generated automatically by the type provider were record types, I could simply create the output by let output = { input with changedNode = myNewValue }, but with the type provider I have to do let output = MyXml.MyRoot(input.UnchangedNode1, input.UnchangedNode2, myNewValue, input.UnchangedNode3, ...). This is further complicated by my changed values being in some of the nested nodes, so I have quite a lot of fluff to pass in to get to it.
The F# Data type providers were primarily designed to provide easy access when reading data, so they do not have a very good story for writing data (partly, the issue is that the underlying JSON representation is quite different from the underlying XML representation).
For XML, the type provider just wraps the standard XElement types, which happen to be mutable. This means that you can actually navigate to the elements using provided types, but then use the underlying LINQ to XML to mutate the value. For example:
open System.Xml.Linq
open FSharp.Data

type X = XmlProvider<"<foos><foo a=\"1\" /><foo a=\"2\" /></foos>">

// Change the 'a' attribute of all 'foo' nodes to 1234
let doc = X.GetSample()
for f in doc.Foos do
    f.XElement.SetAttributeValue(XName.Get "a", 1234)

// Prints the modified document
doc.ToString()
This is likely not perfect - sometimes, you'll need to change the parent element (like here, the provided f.A property is not mutable), but it might do the trick. I don't know whether this is the best way of solving the problem in general, or whether something like XSLT might be easier - it probably depends on the concrete transformations.

Csv Type Provider convert to Json

I am using the Csv Type Provider to read data from a local csv file.
I want to export the data as JSON, so I am taking each row and serializing it using the Json.NET library with JsonConvert.SerializeObject(x).
The problem is that each row is modeled as a tuple, meaning that the column headers do not become property names when serializing. Instead I get Item1="..."; Item2="..." etc.
How can I export to Json without 'hand-rolling' a class/record type to hold the values and maintain the property names?
The type provider works by providing compile-time type safety. The code that is actually compiled maps (at compile time) the nice accessors to tuple values (for performance reasons, I guess), so at run time the JSON serializer sees only tuples.
AFAIK there is no way around hand-rolling records, unless we eventually get type providers that are allowed to take types as parameters (which would allow a Lift<T>-style provider) or the CSV type provider implementation is adjusted accordingly.

Data Structure to store Token Properties

I am writing an interpreter for a mathematical language in Rust which is intended to be used to solve mathematical expressions.
When lexing, the program needs to know based on the characters used in a token, what type of token it is (for example is it a function or an operator).
Currently I use an enumeration to represent a type of token:
pub enum IdentifierType {
    Function,
    Variable,
    Operator,
    Integer,
}
To check the type of a token I use a function which takes an IdentifierType as input and matches based on input to return a bool. The data structures that could be used in this case are relatively simple as tokens only have a single property: allowed characters.
When parsing to an Abstract Syntax Tree (AST), I would like to know what specific operator or function is being used based on a token and to be able to add a reference to that operator and its associated functions to the AST.
When interpreting, I would like to be able to call execute on a node and have it know how to perform its own function.
I have tried to come up with a solution to store all of these related items, but none that I have encountered has felt satisfactory.
For example, I stored all of the operators in a TOML file (a type of configuration file that maps to a hash table), but storing enumerations (values that are constrained) is difficult and there is no way to store an operator's function. I also want to be able to search by multiple keys, such as operator associativity (e.g. get all operators that are right associative), which means storing within source code is not very satisfactory.
Other possible ideas I have had include some kind of SQL hybrid system; however, that seems tough to implement.
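One way the requirements above could be met in plain source code is sketched below. It is an assumption-laden illustration, not an established answer: a static table of operator descriptions, where each entry carries its symbol, an associativity enum, and a function pointer, and where "searching by another key" is an iterator filter over the table. All names here (Associativity, Operator, OPERATORS, lookup, with_associativity) are hypothetical, and operators are assumed to be binary functions over f64.

```rust
// Hypothetical names throughout; none of these come from the question.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Associativity {
    Left,
    Right,
}

pub struct Operator {
    pub symbol: &'static str,
    pub precedence: u8,
    pub associativity: Associativity,
    // Plain function pointer: how this operator evaluates its operands.
    pub apply: fn(f64, f64) -> f64,
}

fn add(a: f64, b: f64) -> f64 { a + b }
fn sub(a: f64, b: f64) -> f64 { a - b }
fn mul(a: f64, b: f64) -> f64 { a * b }
fn pow(a: f64, b: f64) -> f64 { a.powf(b) }

// The whole operator table lives in source as a static slice.
pub static OPERATORS: &[Operator] = &[
    Operator { symbol: "+", precedence: 1, associativity: Associativity::Left, apply: add },
    Operator { symbol: "-", precedence: 1, associativity: Associativity::Left, apply: sub },
    Operator { symbol: "*", precedence: 2, associativity: Associativity::Left, apply: mul },
    Operator { symbol: "^", precedence: 3, associativity: Associativity::Right, apply: pow },
];

// Look an operator up by the text of its token.
pub fn lookup(symbol: &str) -> Option<&'static Operator> {
    OPERATORS.iter().find(|op| op.symbol == symbol)
}

// Query by a different key: every operator with the given associativity.
pub fn with_associativity(assoc: Associativity) -> Vec<&'static Operator> {
    OPERATORS.iter().filter(|op| op.associativity == assoc).collect()
}
```

Under these assumptions an AST node could hold a `&'static Operator` obtained from `lookup`, and executing the node would amount to calling `(op.apply)(lhs, rhs)`. Whether a linear scan is acceptable depends on how many operators the language has.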

Can a Type Provider be passed into a function as a parameter

I am learning F# and the FSharp.Data library. I have a task in which I need to read 20 CSV files. Each file has a different number of columns, but the records share the same nature: they are keyed on a date string and all the remaining columns are float numbers. I need to do some statistical calculations on the float columns before persisting the results into the database. I got all the plumbing logic working:
read in the CSV via the FSharp.Data CSV type provider,
use reflection to get the type of each column field; together with the header names, these are fed into a pattern match, which decides the relevant calculation logic,
SqlBulkCopy the result.
However, I ended up with 20 functions (1 per CSV file).
The solution is far from acceptable. I thought I could create a generic top level function as the driver to loop through all the files. However after days of attempts I am getting nowhere.
The FSharp.Data CSV type provider has the following pattern:
type Stocks = CsvProvider<"../docs/MSFT.csv">
let msft = Stocks.Load("http://ichart.finance.yahoo.com/table.csv?s=MSFT")
msft.Data |> Seq.map(fun row -> do something with row)
...
I have tried:
let mainfunc (typefile:string) (datafile:string) =
    let msft = CsvProvider<typefile>.Load(datafile)
    ....
This doesn't work: CsvProvider complains that typefile is not a valid constant expression. I am guessing the type provider needs the file to deduce the column types at compile time, so the type inference cannot be deferred until mainfunc is called with the relevant information.
I then tried to pass the type into mainfunc as a parameter. Neither
let mainfunc (typeProvider:CsvProvider<"../docs/MSFT.csv">) =
    ....
nor
let mainfunc<typeProvider:CsvProvider<"../docs/MSFT.csv">> =
    ....
worked.
I then tried to pass msft from
type Stocks = CsvProvider<"../docs/MSFT.csv">
let msft = Stocks.Load("http://ichart.finance.yahoo.com/table.csv?s=MSFT")
into mainfunc. According to IntelliSense, msft has a type of CsvProvider<...> and msft.Data has a type of seq<CsvProvider<...>>. I tried to declare an input parameter with an explicit type for each of these, but neither compiles.
Can anyone please help and point me in the right direction? Am I missing something fundamental here? Any .NET type or class can be used in an F# function to explicitly specify a parameter type, but can I do the same with a type from a type provider?
If the answer to the above question is no, what are the alternatives for making the logic generic enough to handle 20 or even 200 different files?
This is related to the question "Type annotation for using a F# TypeProvider type e.g. FSharp.Data.JsonProvider<...>.DomainTypes.Url".
Even though IntelliSense shows you CsvProvider<...>, to reference the type of msft in a type annotation you have to use Stocks, and for the rows in msft.Data, instead of CsvProvider<...>.Row, you have to use Stocks.Row.
If you want to do something dynamic, you can get the column names with msft.Headers and you can get the types of the columns using Microsoft.FSharp.Reflection.FSharpType.GetTupleElements(typeof<Stocks.Row>) (this works because the row is erased to a tuple at runtime).
EDIT:
If the formats are incompatible, and you're dealing with dynamic data that doesn't conform to a common format, you might want to use CsvFile instead (http://fsharp.github.io/FSharp.Data/library/CsvFile.html), but you'll lose all the type safety of the type provider. You might also consider using Deedle instead (http://bluemountaincapital.github.io/Deedle/)

Resources