I have a JSON document that I'm parsing using Thoth.Json.Net. The document has an array of objects, each of which has a "type" attribute whose value identifies its type. Each of these types needs a different decoder, so I need some way to select a decoder based on the value of the "type" attribute. How can I do this?
Update:
After getting the "hack" that I describe above working, I revisited using the CE and the decodeByType custom decoder, with each decoder returning a value from a Discriminated Union, as mentioned by #tranquillity above. Once I had got my head around all the types, the only thing I had to do was specify the types for the Builder:
type DecoderBuilder () =
member __.Bind((decoder:Decoder<'a>), (func:('a -> Decoder<'b>))) =
Decode.andThen func decoder
This enabled the use of the CE to combine decoders easily, as described by #brianberns below. I created a new file for the custom decoders and the Discriminated Union and found that I could extract the values from the JSON in a way that maps more closely to the domain model (making invalid states unrepresentable, since an error is returned if the JSON structure doesn't match).
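For anyone following along, here is a minimal sketch of what that ended up looking like. The Shape union, its field names and the type strings are hypothetical stand-ins for my real domain model, and it assumes a decode instance of the builder with Return/ReturnFrom defined, as in the answer below:
open Thoth.Json.Net

// Hypothetical union and per-case decoders; the real types live in the domain model.
type Shape =
    | Circle of radius : float
    | Label of text : string

let circleDecoder : Decoder<Shape> =
    Decode.field "radius" Decode.float |> Decode.map Circle

let labelDecoder : Decoder<Shape> =
    Decode.field "text" Decode.string |> Decode.map Label

// Dispatch on the "type" attribute; an unknown type becomes a decoder error.
let shapeDecoder : Decoder<Shape> =
    decode {
        let! typ = Decode.field "type" Decode.string
        return!
            match typ with
            | "circle" -> circleDecoder
            | "label" -> labelDecoder
            | other -> Decode.fail (sprintf "Unknown type: %s" other)
    }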
All in all cleaner, more functional, and elegant code. Thank you for the help.
I'm not a Thoth expert, but here's what I'd do. First, I find it easier to combine decoders using a computation expression:
type DecodeBuilder() =
member _.Bind(decoder, f) : Decoder<_> =
Decode.andThen f decoder
member _.Return(value) =
Decode.succeed value
member _.ReturnFrom(decoder : Decoder<_>) =
decoder
let decode = DecodeBuilder()
Then I invented two custom decoders, just as examples, and put them in a map by name. One reverses a string, and one decodes using the ROT13 cipher. Of course, you'll use your own custom decoders here instead:
let decodeReverse =
decode {
let! str = Decode.string
return str
|> Seq.rev
|> Seq.toArray
|> String
}
let decodeRot13 =
let rot13 c =
if 'a' <= c && c <= 'm' || 'A' <= c && c <= 'M' then
char (int c + 13)
elif 'n' <= c && c <= 'z' || 'N' <= c && c <= 'Z' then
char (int c - 13)
else c
decode {
let! str = Decode.string
return str
|> Seq.map rot13
|> Seq.toArray
|> String
}
let customDecoders =
Map [
"Reverse", decodeReverse
"Rot13", decodeRot13
]
Then, custom decoding is just a matter of decoding the "type" field, looking up the corresponding custom decoder, and using it to decode the "value" field:
let decodeByType =
decode {
let! typ = Decode.field "type" Decode.string
return! Decode.field "value" customDecoders.[typ]
}
Example usage:
Decode.fromString
(Decode.array decodeByType)
"""
[
{
"type" : "Reverse",
"value" : "edcba"
},
{
"type" : "Rot13",
"value" : "qrpbqr guvf"
}
]
"""
|> printfn "%A" // Ok [|"abcde"; "decode this"|]
I put the complete program here for reference.
I'm trying to learn F# and got stuck trying to find a better approach for converting a CSV file to a JSON array, where each row combined with the header becomes a JSON object in that array.
After some trial and error I finally caved and went for an ugly approach with a mutable list and map. Are there any better ways this can be implemented?
let csvFileToJsonList (csvFile: FSharp.Data.CsvFile) =
let mutable tempList = List.empty<Map<string,string>>
let heads =
match csvFile.Headers with
| Some h -> h
| None -> [|"Missing"|] // what to do here?
let nbrOfColumns = csvFile.NumberOfColumns
for row in csvFile.Rows do
let columns = row.Columns
let mutable tempMap = Map.empty<string,string>
for i = 0 to nbrOfColumns-1 do
tempMap <- tempMap.Add(heads.[i], columns.[i])
tempList <- tempMap :: tempList
System.Text.Json.JsonSerializer.Serialize(tempList)
This outputs the following which is the goal:
[
{
"Header1": "Row1Val1",
"Header2": "Row1Val2",
"Header3": "Row1Val3",
"Header4": "Row1Val4",
"Header5": "Row1Val5"
},
{
"Header1": "Row2Val1",
"Header2": "Row2Val2",
"Header3": "Row2Val3",
"Header4": "Row2Val4",
"Header5": "Row2Val5"
}
]
This is about as simple as I could make it, although a longer version might be more readable for you:
let csvFileToJsonList (csvFile: FSharp.Data.CsvFile) =
let heads = csvFile.Headers |> Option.defaultValue [||]
csvFile.Rows
|> Seq.map (fun row -> Seq.zip heads row.Columns |> Map)
|> System.Text.Json.JsonSerializer.Serialize
This produces the output in the original order, which I'm assuming is preferable (your solution reverses the order).
This also assumes some headers exist, otherwise the output will be empty objects.
Description: for each row, use Seq.zip to produce a sequence of header/value tuples, and pass that to the Map constructor to create a map. This yields a sequence of maps, which can then be serialized.
Note that using dict instead of Map might be a bit faster.
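For example, the dict-based variant might look like this (a sketch; System.Text.Json should serialize an IDictionary<string, string> the same way it serializes a Map):
let csvFileToJsonListDict (csvFile: FSharp.Data.CsvFile) =
    // Same shape as above, but each row becomes a read-only IDictionary
    // via the built-in dict function instead of an F# Map.
    let heads = csvFile.Headers |> Option.defaultValue [||]
    csvFile.Rows
    |> Seq.map (fun row -> Seq.zip heads row.Columns |> dict)
    |> System.Text.Json.JsonSerializer.Serialize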
You could also use CsvProvider to create a typed object (Row):
open FSharp.Data
type Persons =
CsvProvider<"David,Raab,19.02.1983",
Schema="First (string), Last (string), BirthDay(string)",
HasHeaders=true>
let parseCsv (reader:System.IO.TextReader) = [
let data = Persons.Load reader
for row in data.Rows do
Map [
("First", row.First)
("Last", row.Last)
("Birthday", row.BirthDay)
]
]
This returns a list of maps instead of JSON, but I guess you will know how to change it to serialize the data.
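If you do want JSON out of it, the missing step might look like this sketch (csvToJson is a hypothetical name; it reuses System.Text.Json from the question):
// Serialize the list of maps produced by parseCsv; the reader is any
// System.IO.TextReader over the CSV data.
let csvToJson (csvReader : System.IO.TextReader) =
    parseCsv csvReader
    |> System.Text.Json.JsonSerializer.Serialize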
Please consider this dataset, composed of men and women, which I then filter according to a few variables:
type ls = JsonProvider<"...">
let dt = ls.GetSamples()
let dt2 =
dt |> Seq.filter (fun c -> c.Sex = "male" && c.Height > Some 150)
dt2
[{"sex":"male","height":180,"weight":85},
{"sex":"male","height":160" "weight":60},
{"sex":"male","height":180,"weight":85}]
Let's suppose that I would like to add a fourth key, "body mass index" or "bmi", and that its value is roughly given by "weight"/"height". Hence I expect:
[{"sex":"male","height":180,"weight":85, "bmi":(180/85)},
{"sex":"male","height":160" "weight":60, "bmi":(160/60},
{"sex":"male","height":180,"weight":85, "bmi":(180/85)}]
I thought that map.Add may help.
let dt3 = dt2.Add("bmi", (dt2.Height/dt2.Weight))
Unfortunately, it returns an error:
error FS0039: The field, constructor or member 'Add' is not defined
I am sure there are further errors in my code, but without this function I cannot actually look for them. Am I, at least, approaching the problem correctly?
Creating modified versions of the JSON is sadly one thing that the F# Data type provider does not make particularly easy. What makes that hard is the fact that we can infer the type from the source JSON, but we cannot "predict" what kind of fields people might want to add.
To do this, you'll need to access the underlying representation of the JSON value and operate on that. For example:
open FSharp.Data

type ls = JsonProvider<"""
[{"sex":"male","height":180,"weight":85},
{"sex":"male","height":160,"weight":60},
{"sex":"male","height":180,"weight":85}]""">
let dt = ls.GetSamples()
let newJson =
dt
|> Array.map (fun recd ->
// To do the calculation, you can access the fields via inferred types
let bmi = float recd.Height / float recd.Weight
// But now we need to look at the underlying value, check that it is
// a record and extract the properties, which is an array of key-value pairs
match recd.JsonValue with
| JsonValue.Record props ->
// Append the new property to the existing properties & re-create record
Array.append [| "bmi", JsonValue.Float bmi |] props
|> JsonValue.Record
| _ -> failwith "Unexpected format" )
// Re-create a new JSON array and format it as JSON
JsonValue.Array(newJson).ToString()
I'm reading the Expert F# book and I found this code:
open System.Collections.Generic
let divideIntoEquivalenceClasses keyf seq =
// The dictionary to hold the equivalence classes
let dict = new Dictionary<'key,ResizeArray<'T>>()
// Build the groupings
seq |> Seq.iter (fun v ->
let key = keyf v
let ok,prev = dict.TryGetValue(key)
if ok then prev.Add(v)
else let prev = new ResizeArray<'T>()
dict.[key] <- prev
prev.Add(v))
dict |> Seq.map (fun group -> group.Key, Seq.readonly group.Value)
and the example use:
> divideIntoEquivalenceClasses (fun n -> n % 3) [ 0 .. 10 ];;
val it : seq<int * seq<int>>
= seq [(0, seq [0; 3; 6; 9]); (1, seq [1; 4; 7; 10]); (2, seq [2; 5; 8])]
First of all, to me this code is really ugly, even though it is safe. It looks more like imperative languages than functional ones, especially compared to Clojure. But that's not the problem; I'm having problems with the Dictionary definition.
when I type this:
let dict = new Dictionary<'key,ResizeArray<'T>>();;
I get this:
pruebafs2a.fs(32,5): error FS0030: Value restriction. The value 'dict' has been inferred to have generic type
val dict : Dictionary<'_key,ResizeArray<'_T>> when '_key : equality
Either define 'dict' as a simple data term, make it a function with explicit arguments or, if you do not intend for it to be generic, add a type annotation.
Is it OK?
Thanks so much.
Update:
OK, I've been reading about the value restriction and I found this helpful information:
In particular, only function definitions and simple immutable data
expressions are automatically generalized
OK, this explains why
let dict = new Dictionary<'key,ResizeArray<'T>>();;
doesn't work. The book then shows 4 different techniques, although in my opinion they only resolve the error and aren't real solutions for using generic code:
Technique 1: Constrain Values to Be Nongeneric
let empties : int list [] = Array.create 100 []
Technique 3: Add Dummy Arguments to Generic Functions When Necessary
let empties () = Array.create 100 []
let intEmpties : int list [] = empties()
Technique 4: Add Explicit Type Arguments When Necessary (similar to tec 3)
let emptyLists = Seq.init 100 (fun _ -> [])
> emptyLists<int>;;
val it : seq<int list> = seq [[]; []; []; []; ...]
----- and the only one that lets me use real generic code -----
Technique 2: Ensure Generic Functions Have Explicit Arguments
let mapFirst = List.map fst //doesn't work
let mapFirst inp = List.map fst inp
OK, in 3 of the 4 techniques I need to resolve the generic types before I can work with the code. Now, returning to the book example, consider when the compiler knows the values of 'key and 'T in
let dict = new Dictionary<'key,ResizeArray<'T>>()
In that scope the code is very generic: 'key is allowed to be any type, and the same happens with 'T.
And the biggest dummy question is:
when I enclose the code in a function (technique 3):
let empties = Array.create 100 [] //doesn't work
let empties () = Array.create 100 []
val empties : unit -> 'a list []
I need to define the type before I can begin using it:
let intEmpties : int list [] = empties()
To me (admittedly I'm a bit of a dummy with statically typed languages) this is not really generic, because the compiler can't infer the type when I use it: I have to define the type and then pass values, rather than have the type inferred from the values I pass. Is there another way to define the type without being so explicit?
Thanks so much, I really appreciate any help.
This line
let dict = new Dictionary<'key,ResizeArray<'T>>();;
fails because when you type the ;; the compiler doesn't know what 'key and 'T are. As the error message states, you need to add a type annotation, allow the compiler to infer the types by using the value later, or make it a function.
Examples
Type annotation change
let dict = new Dictionary<int,ResizeArray<int>>();;
Using types later
let dict = new Dictionary<'key,ResizeArray<'T>>()
dict.[1] <- 2
Using a function
let dict() = new Dictionary<'key,ResizeArray<'T>>();;
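With the function form, each call site then fixes the type parameters. A hypothetical usage (assuming System.Collections.Generic is open, as in the book's snippet):
// The annotation on the binding pins down 'key and 'T for this particular call.
let intDict : Dictionary<int, ResizeArray<string>> = dict()
intDict.[1] <- ResizeArray ["a"; "b"]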
This actually doesn't cause an issue when it's defined all together. That is, select the entire block that you posted and send it to FSI in one go. I get this:
val divideIntoEquivalenceClasses :
('T -> 'key) -> seq<'T> -> seq<'key * seq<'T>> when 'key : equality
However, if you type these individually into FSI then as John Palmer says there is not enough information in that isolated line for the interpreter to determine the type constraints. John's suggestions will work, but the original code is doing it correctly - defining the variable and using it in the same scope so that the types can be inferred.
to me this code is really ugly, even though it is safe. It looks more like imperative languages than functional ones.
I agree completely – it's slightly tangential to your direct question, but I think a more idiomatic (functional) approach would be:
let divideIntoEquivalenceClasses keyf seq =
(System.Collections.Generic.Dictionary(), seq)
||> Seq.fold (fun dict v ->
let key = keyf v
match dict.TryGetValue key with
| false, _ -> dict.Add (key, ResizeArray(Seq.singleton v))
| _, prev -> prev.Add v
dict)
|> Seq.map (function KeyValue (k, v) -> k, Seq.readonly v)
This allows sufficient type inference to obviate the need for your question in the first place.
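Hypothetical usage, just to check it against the book's example:
// Expected grouping, as in the original:
// 0 -> [0; 3; 6; 9], 1 -> [1; 4; 7; 10], 2 -> [2; 5; 8]
divideIntoEquivalenceClasses (fun n -> n % 3) [ 0 .. 10 ]
|> Seq.iter (fun (key, values) -> printfn "%d -> %A" key (List.ofSeq values))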
The workarounds proposed by the other answers are all good. Just to clarify based on your latest updates, let's consider two blocks of code:
let empties = Array.create 100 []
as opposed to:
let empties = Array.create 100 []
empties.[0] <- [1]
In the second case, the compiler can infer that empties : int list [], because we are inserting an int list into the array in the second line, which constrains the element type.
It sounds like you'd like the compiler to infer a generic value empties : 'a list [] in the first case, but this would be unsound. Consider what would happen if the compiler did that and we then entered the following two lines in another batch:
empties.[0] <- [1] // treat 'a list [] as int list []
List.iter (printfn "%s") empties.[0] // treat 'a list [] as string list []
Each of these lines unifies the generic type parameter 'a with a different concrete type (int and string). Either of these unifications is fine in isolation, but they are incompatible with each other and would result in treating the int value 1 inserted by the first line as a string when the second line is executed, which is clearly a violation of type safety.
Contrast this with an empty list, which really is generic:
let empty = []
Then in this case, the compiler does infer empty : 'a list, because it's safe to treat empty as a list of different types in different locations in your code without ever impacting type safety:
let l1 : int list = empty
let l2 : string list = empty
let l3 = 'a' :: empty
In the case where you make empties the return value of a generic function:
let empties() = Array.create 100 []
it is again safe to infer a generic type, since if we try our problematic scenario from before:
empties().[0] <- [1]
List.iter (printfn "%s") (empties().[0])
we are creating a new array on each line, so the types can be different without breaking the type system.
Hopefully this helps explain the reasons behind the limitation a bit more.
Given an F# record:
type R = { X : string ; Y : string }
and two objects:
let a = { X = null ; Y = "##" }
let b = { X = "##" ; Y = null }
and a predicate on strings:
let (!?) : string -> bool = String.IsNullOrWhiteSpace
and a function:
let (-?>) : string -> string -> string = fun x y -> if !? x then y else x
is there a way to use F# quotations to define:
let (><) : R -> R -> R
with behaviour:
let c = a >< b // = { X = a.X -?> b.X ; Y = a.Y -?> b.Y }
in a way that somehow lets (><) work for any arbitrary F# record type, not just for R.
Short: Can quotations be used to generate F# code for a definition of (><) on the fly given an arbitrary record type and a complement function (-?>) applicable to its fields?
If quotations cannot be used, what can?
You could use F# quotations to construct a function for every specific record and then compile it using the quotation compiler available in F# PowerPack. However, as mentioned in the comments, it is definitely easier to use F# reflection:
open Microsoft.FSharp.Reflection
let applyOnFields (recd1:'T) (recd2:'T) f =
let flds1 = FSharpValue.GetRecordFields(recd1)
let flds2 = FSharpValue.GetRecordFields(recd2)
let flds = Array.zip flds1 flds2 |> Array.map f
FSharpValue.MakeRecord(typeof<'T>, flds)
This function takes the records, gets their fields dynamically and then applies f to the fields. You can use it to implement your operator like this (I'm using a function with a readable name instead):
type R = { X : string ; Y : string }
let a = { X = null ; Y = "##" }
let b = { X = "##" ; Y = null }
let selectNotNull (x:obj, y) =
if String.IsNullOrWhiteSpace (unbox x) then y else x
let c = applyOnFields a b selectNotNull
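If you really want the (><) operator from the question, a thin wrapper might look like this sketch (note that FSharpValue.MakeRecord returns obj, so we unbox back to R):
// Wrap applyOnFields in the requested operator, specialized to R.
let (><) (a : R) (b : R) : R =
    applyOnFields a b selectNotNull |> unbox

let c2 = a >< b   // = { X = "##"; Y = "##" }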
The solution using Reflection is quite easy to write, but it might be less efficient. It requires running .NET Reflection each time the function applyOnFields is called. You could use quotations to build an AST that represents the function that you could write by hand if you knew the record type. Something like:
let applyOnFields (a:R) (b:R) f = { X = f (a.X, b.X); Y = f (a.Y, b.Y) }
Generating the function using quotations is more difficult, so I won't post a complete sample, but the following example shows at least a part of it:
open Microsoft.FSharp.Quotations
open Microsoft.FSharp.Reflection
// Get information about fields
let flds = FSharpType.GetRecordFields(typeof<R>) |> List.ofSeq
// Generate two variables to represent the arguments
let aVar = Var.Global("a", typeof<R>)
let bVar = Var.Global("b", typeof<R>)
// For all fields, we want to generate an 'f (a.Field, b.Field)' expression
let args = flds |> List.map (fun fld ->
// Create tuple to be used as an argument of 'f'
let arg = Expr.NewTuple [ Expr.PropertyGet(Expr.Var(aVar), fld)
Expr.PropertyGet(Expr.Var(bVar), fld) ]
// Call the function 'f' (which needs to be passed as an input somehow)
  Expr.Application(???, arg) )
// Create an expression that builds new record
let body = Expr.NewRecord(typeof<R>, args)
Once you build the right quotation, you can compile it using F# PowerPack. See for example this snippet.
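For completeness, here is one way the ??? above might be filled in. This is a sketch building on the snippet above (it reuses flds, aVar and bVar, and replaces the args and body bindings); introducing fVar and wrapping the result in lambdas are my assumptions, not part of the original answer:
// Introduce a variable for 'f' so it can be applied inside the quotation.
// For R every field is a string, so 'f' gets the type string * string -> string.
let fVar = Var("f", typeof<string * string -> string>)

let args = flds |> List.map (fun fld ->
    let arg = Expr.NewTuple [ Expr.PropertyGet(Expr.Var(aVar), fld)
                              Expr.PropertyGet(Expr.Var(bVar), fld) ]
    Expr.Application(Expr.Var(fVar), arg))

// Build the record and wrap everything up as a function of f, a and b.
let body = Expr.NewRecord(typeof<R>, args)
let combineExpr = Expr.Lambda(fVar, Expr.Lambda(aVar, Expr.Lambda(bVar, body)))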
Two functions are defined:
let to2DStrArray (inObj : string[][]) =
Array2D.init inObj.Length inObj.[0].Length (fun i j -> inObj.[i].[j])
let toTypedList typeFunc (strArray : string[,]) =
if (Array2D.length1 strArray) = 0 then
[]
else
List.init (Array2D.length1 strArray) typeFunc
Trying to call them from an .fsx script as follows fails:
let testData = to2DStrArray [|[||]|]
let failingCall = testData
|> toTypedList (fun row -> (Double.Parse(testData.[row,0]),
Double.Parse(testData.[row,1])))
What is a working/better way to get this code to handle the case of empty 2-dimensional string arrays?
The problem is not in the toTypedList function, so you don't have to check whether strArray is empty or not. It will give an error when you access inObj.[0].Length in the to2DStrArray function if the input array is empty. A safe way to create an Array2D from an array of arrays is the array2D operator:
let to2DStrArray (inObj : string[][]) =
array2D inObj
Of course, you have to guarantee that all inner arrays have the same length. And the other function is shortened as follows:
let toTypedList typeFunc (strArray : string[,]) =
List.init (Array2D.length1 strArray) typeFunc
Given your use case, note that [|[||]|] is not an empty string[][]; it is an array consisting of a single element, which in turn is an empty string array. That causes a problem for the anonymous function you passed to toTypedList: since the two-dimensional array has length2 <= 1 and you access the first two column indices, it results in an index out of bounds exception. The function could be fixed by returning option values, and you can extract the values from those options to use later on:
let testData = to2DStrArray [|[||]|]
let failingCall = testData
|> toTypedList (fun row -> if Array2D.length2 testData >= 2 then Some (Double.Parse(testData.[row,0]), Double.Parse(testData.[row,1])) else None)
Realistically you will have another problem, as testData.[0].Length <> testData.[1].Length in general, unless you know from somewhere else that all rows have the same length. I suspect the best approach is something like:
let ysize = (inObj |> Array.maxBy (fun t -> t.Length)).Length
I quickly tested this and it seems to work, although it may still fail at the point where you access the array.
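Putting that together, a padded version might look like this sketch (padding ragged rows with empty strings is my assumption about the desired behaviour):
let to2DStrArrayPadded (inObj : string[][]) =
    // Use the longest row to size the second dimension, and pad shorter
    // rows with empty strings so indexing never goes out of bounds.
    let xSize = inObj.Length
    let ySize =
        if xSize = 0 then 0
        else (inObj |> Array.maxBy (fun t -> t.Length)).Length
    Array2D.init xSize ySize (fun i j ->
        if j < inObj.[i].Length then inObj.[i].[j] else "")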