I have the following code:
type Client =
    { Name : string; Income : int; YearsInJob : int
      UsesCreditCard : bool; CriminalRecord : bool }
type QueryInfo =
    { Title : string
      Check : Client -> bool
      Positive : Decision
      Negative : Decision }
and Decision =
    | Result of string
    | Querys of QueryInfo
let tree =
Querys {Title = "More than €40k"
Check = (fun cl -> cl.Income > 40000)
Positive = moreThan40
Negative = lessThan40}
But in the last line :
Querys {Title = "More than €40k"
Check = (fun cl -> cl.Income > 40000)
Positive = moreThan40
Negative = lessThan40}
I have an error:
No assignment has given for field 'Check' of type 'Script.QueryInfo'
F# is white-space sensitive, which means that white-space is used to indicate scope. The code as given doesn't compile because Check appears too far to the left.
This, on the other hand, ought to compile (if moreThan40 and lessThan40 are correctly defined):
let tree =
    Querys {Title = "More than €40k"
            Check = (fun cl -> cl.Income > 40000)
            Positive = moreThan40
            Negative = lessThan40}
The curly brackets here do not indicate scope, but instead the start and end of a record. Because of the incorrect indentation in the OP, the compiler sees the Check binding as outside the scope of the record expression. That's the reason it complains that no value has been bound to the field Check.
Until you get used to significant white-space, it can be a bit annoying, but it does save you from a lot of explicit opening and closing of scopes (e.g. with curly brackets), so in my opinion, it's a benefit once you get used to it.
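For reference, here is a minimal self-contained sketch of the whole thing with the record fields aligned. The definitions of moreThan40 and lessThan40 are not shown in the question, so the two Result leaves below are assumptions, as are the decide helper and the sample client.

```fsharp
type Client =
    { Name : string; Income : int; YearsInJob : int
      UsesCreditCard : bool; CriminalRecord : bool }

type QueryInfo =
    { Title : string
      Check : Client -> bool
      Positive : Decision
      Negative : Decision }
and Decision =
    | Result of string
    | Querys of QueryInfo

// Hypothetical leaves: the question does not show how these are defined.
let moreThan40 = Result "approve"
let lessThan40 = Result "decline"

// Every field starts in the same column, so all four belong to the record.
let tree =
    Querys
        { Title = "More than €40k"
          Check = (fun cl -> cl.Income > 40000)
          Positive = moreThan40
          Negative = lessThan40 }

// Hypothetical helper that walks the tree until it reaches a Result.
let rec decide client decision =
    match decision with
    | Result text -> text
    | Querys q ->
        let next = if q.Check client then q.Positive else q.Negative
        decide client next

let john =
    { Name = "John"; Income = 50000; YearsInJob = 1
      UsesCreditCard = true; CriminalRecord = false }
```

Moving Check, Positive or Negative to the left of Title in this sketch should bring the original error back.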
I have a data pipeline where at each step more data fields are required. I would like to do this in a functional way by respecting immutability. I could achieve this with a class, but I am wondering if there is an F# way of doing it?
// code that loads initial field information and returns record A
type recordA = {
    A: int
}
// code that loads additional field information and returns record AB
type recordAB = {
    A: int
    B: int
}
// code that loads additional field information and returns record ABC
type recordABC = {
    A: int
    B: int
    C: int
}
As records are sealed I can't just inherit them. How can I avoid having to define a new record with the exact same fields as the previous step and adding the required fields? Preferably I would like to have something like one record that has all required fields and the fields get assigned to their values in each step.
Note that the number of fields added in each step could be more than 1.
I think this can be a good use case for the anonymous records recently introduced in F#.
let a = {| X = 3 |}
let b = {| a with Y = "1"; Z = 4.0|}
let c = {| b with W = 1 |}
printfn "%d, %s, %f, %d" c.X c.Y c.Z c.W
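Applied to the pipeline in the question, this might look roughly like the sketch below. The step functions loadA, addB and addC and the values they compute are made up for illustration; the last step adds two fields at once. Anonymous records need F# 4.6 or later.

```fsharp
// Hypothetical pipeline steps: each one copies the incoming fields
// and adds the ones it is responsible for.
let loadA () = {| A = 1 |}

let addB (r : {| A : int |}) = {| r with B = r.A + 1 |}

// A step may add more than one field.
let addC (r : {| A : int; B : int |}) =
    {| r with C = r.A + r.B; Note = "done" |}

let result = loadA () |> addB |> addC

printfn "%d, %d, %d, %s" result.A result.B result.C result.Note
```

Each intermediate value stays immutable, and the compiler still checks that a step only reads fields that earlier steps have produced.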
One way to do it in a very FP-style would be to use a DU with a case for each step of the workflow, and the appropriate data for each step in each case:
type WorkflowState =
    | StepOne of int
    | StepTwo of int * int
    | StepThree of int * int * int
Then your entire workflow state, both what step you're currently on and the data produced/consumed by that step, would be modeled in the data type. Of course, you would probably create record types for the data of each case, rather than using progressively larger tuples.
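A small sketch of how such a state could be moved from one step to the next. The advance function and the way it derives the new values are assumptions for illustration only, and the named union fields are just there to label the tuple items.

```fsharp
type WorkflowState =
    | StepOne of a : int
    | StepTwo of a : int * b : int
    | StepThree of a : int * b : int * c : int

// Hypothetical transition: each case carries exactly the data known so far.
let advance state =
    match state with
    | StepOne a -> StepTwo (a, a * 2)
    | StepTwo (a, b) -> StepThree (a, b, a + b)
    | StepThree _ -> state   // already at the final step

let final = StepOne 1 |> advance |> advance
```

Because the match has to cover every case, forgetting to handle a step shows up as a compiler warning rather than as a missing field at run time.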
Depending on the application, this may be a (mis-)use case for a dynamic data container.
F# might help by providing user-defined dynamic lookup operators, for which a special syntactic translation occurs.
let (?) (m : Map<_,_>) k = m.Item k
// val ( ? ) : m:Map<'a,'b> -> k:'a -> 'b when 'a : comparison
let (?<-) (m : Map<_,_>) k v = m.Add(k, v)
// val ( ?<- ) : m:Map<'a,'b> -> k:'a -> v:'b -> Map<'a,'b> when 'a : comparison
let m = Map.empty<_,_>
let ma = m?A <- "0"
let mabc = (ma?B <- "1")?C <- "2"
ma?A // val it : string = "0"
mabc?C // val it : string = "2"
You can "inherit" records:
type RecordA =
    {
        a : int
    }
type RecordAB =
    {
        a : RecordA
        b : int
    }
type RecordABC =
    {
        ab : RecordAB
        c : int
    }
Then you can access all of the elements, though with a longer and longer chain as you go deeper and deeper.
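To make the "longer chain" concrete, here is a minimal sketch with made-up values. Both RecordA and RecordAB have a field called a, so the innermost literal is qualified with the type name to help inference.

```fsharp
type RecordA = { a : int }
type RecordAB = { a : RecordA; b : int }
type RecordABC = { ab : RecordAB; c : int }

// Qualify the field name because 'a' also exists on RecordAB.
let ra = { RecordA.a = 1 }
let rab = { a = ra; b = 2 }
let rabc = { ab = rab; c = 3 }

// Reading the innermost value needs the full chain.
let innermost = rabc.ab.a.a

// Updating a nested field needs nested copy-and-update expressions.
let updated = { rabc with ab = { rabc.ab with b = 20 } }
```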
However, why don't you just use a list of elements to store the result?
First, I would create a type to handle all possible types that you may have on each step, e.g.:
type Step =
| Int of int
| String of string
// ...
Then you can represent the workflow simply as:
type WorkflowState = list<Step>
and if you want to ensure that you always have at least one element then you can use:
type WorkflowState = Step * list<Step>
However, the records have labels and the structure above does not have them! So, if you do need labels, then you can represent them by a map using either a strong type:
type Label =
| A
| B
// ...
type WorkflowMappedState = Map<Label, Step>
or just a string based one, e.g.
type WorkflowMappedState = Map<string, Step>
The benefits of list or map based approach in comparison to the answers above is that you don't have to know the maximum number of possible steps. What if the number of steps is over 100? Would you want to create a record with 100+ labels? Most likely not! The anonymous records are great, but what if you want to use them outside of module where they were created? I think that that would cause some troubles.
Having said all that, I think that I would go with a list-based approach: type WorkflowState = list<Step>. It is a very F# way of doing it, and it is very easy to transform further.
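If labels are needed, the string-keyed map variant could be used along these lines. This is only a sketch: the labels "A" and "B" and the stored values are invented for the example.

```fsharp
type Step =
    | Int of int
    | String of string

type WorkflowMappedState = Map<string, Step>

// Each step returns a new map; the previous one is left untouched.
let stepOne : WorkflowMappedState = Map.empty |> Map.add "A" (Int 1)
let stepTwo : WorkflowMappedState = stepOne |> Map.add "B" (String "two")

// Reading a value back means handling the case where the label is missing
// and the case where it holds a different kind of Step.
let describe label (state : WorkflowMappedState) =
    match Map.tryFind label state with
    | Some (Int i) -> sprintf "%s = %d" label i
    | Some (String s) -> sprintf "%s = %s" label s
    | None -> sprintf "%s is not set" label
```

The trade-off compared with records is visible in describe: the presence and type of a field are checked at run time instead of by the compiler.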
I have a single-case discriminated union
type R = R of string * int * string option * .....
And I got a value of R.
let r: R = getAValue ()
Now I need to replace the first item of r with an empty string and keep all the other values. How can I do it? Record types have the with construct:
let r' = { r with Item1 = "" }
I know it can use 'pattern match' to extract all the items and create a new one. But it seems very cumbersome.
I assume you do not want to involve reflection, do you?
Then, I believe your only option is pattern matching. The (quite limited) burden is spelling out the arity of your type R as a pattern for matching.
Let's assume, for example, that your R wraps a tuple of 3 elements, i.e. has arity 3:
type R = R of string * int * string option
In this case all you need to do is define the following function:
let modR = function
    | R(x,y,z) -> R("",y,z)
The signature of modR is R -> R, a quick check of your scenario:
let r = R("abc",1,None)
modR r
in fsi brings back
>
val it : R = R ("",1,None)
All you need to do to apply the above to your specific R is to put the actual arity of your type into the pattern.
UPDATE: As Fyodor Soikin pointed out, a matching function isn't needed at all for unwrapping a single-case DU (see the docs). The sought conversion function can be defined as simply as
let modR (R(_,y,z)) = R("",y,z)
UPDATE2: While considering the comment from ca9163d9, I recalled yet another flavor of pattern matching, namely the as pattern. Using it to implement the sought conversion as a DU member gives:
type R = R of string * int * string option with
member self.modR() = let R(_,b,c) as x = self in R("",b,c)
Also, @FyodorSoikin and @kaefer have pointed out in the comments that the as x form isn't required for simple DU unwrapping, similarly to the terser modR function definition above:
member self.modR() = let (R(_,b,c)) = self in R("",b,c)
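Putting the function and member variants side by side for the 3-item R assumed above, a runnable sketch could look like this. The WithFirst member is a hypothetical generalisation that takes the replacement value as an argument; it is not part of the original answer.

```fsharp
type R =
    | R of string * int * string option
    // Hypothetical helper: replaces the first item and keeps the rest.
    member self.WithFirst (first : string) =
        let (R (_, b, c)) = self
        R (first, b, c)

// Pattern directly in the parameter position, as in the first update.
let modR (R (_, y, z)) = R ("", y, z)

let r = R ("abc", 1, None)
let viaFunction = modR r
let viaMember = r.WithFirst ""
```

For a wider R, only the two patterns need to grow; the call sites stay the same.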
I'd like to create a type with a definition a bit like this:
type LeftRight<'left, 'right> = {
    Left : 'left list
    Right : 'right list
}
and a couple of functions:
let makeLeft xs = { Left = xs; Right = [] }
let makeRight ys = { Left = []; Right = ys }
and I'd like to provide a 'combiner' function:
let combine l r = { Left = l.Left @ r.Left; Right = l.Right @ r.Right }
When I try and make something, I (obviously!) get issues as my value is generic:
let aaa = makeLeft [1;2;3]
// Value restriction. The value 'aaa' has been inferred to have generic type
// val aaa : LeftRight<int,'_a>
If I combine a left and a right, type inference kicks in and everything's A-OK:
let bbb = makeRight [1.0;2.0;3.0]
let comb = combine aaa bbb // LeftRight<int, float>
but I want to use one with only lefts on its own. I tried creating an 'Any' type:
type Any = Any
and explicitly specified the types on makeLeft and makeRight:
let makeLeft xs : LeftRight<_, Any> = { Left = xs; Right = [] }
let makeRight ys : LeftRight<Any, _> = { Left = []; Right = ys }
which makes the value definitions happy, but makes the combine function sad:
let combined = combine aaa bbb
// Type mismatch. Expecting a
// LeftRight<int,Any>
// but given a
// LeftRight<Any,float>
// The type 'int' does not match the type 'Any'
I feel like there's probably a way around this with loads of voodoo with .Net's overloading of function calls, but I can't make it work. Has anyone tried this before/have any ideas?
The value restriction is not a problem in this case; you need the result of makeLeft or makeRight to be generic if you ever hope to use them generically further down the line.
In F# (and OCaml), generic syntactic values must be explicitly marked as such, with full type annotations. Indeed, the compiler reports this:
error FS0030: Value restriction. The value 'aaa' has been inferred to
have generic type
    val aaa : LeftRight<int,'_a>
Either define 'aaa' as a simple data term, make it a function with explicit arguments or, if you do
not intend for it to be generic, add a type annotation.
Without going into too much detail*, this is to avoid issues that can occur when combining polymorphism and side effects. The downside is that it does reject some perfectly safe code as a result.
So, the solution is simple, we make these values explicitly generic:
let aaa<'a> : LeftRight<int,'a> = makeLeft [1;2;3]
let bbb<'a> : LeftRight<'a, float> = makeRight [1.0;2.0;3.0]
Putting them together in FSI:
let comb = combine aaa bbb;;
val comb : LeftRight<int,float> = {Left = [1; 2; 3];
Right = [1.0; 2.0; 3.0];}
Note that if you combine without intermediate let bindings, you no longer have a generic value and the proper type can be inferred by the compiler:
combine (makeLeft [1;2;3]) (makeRight [1.0;2.0;3.0]);;
val it : LeftRight<int,float> = {Left = [1; 2; 3];
Right = [1.0; 2.0; 3.0];}
*For more detail, check out this article.
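Collected into one self-contained sketch, the explicitly generic values and the combiner look like this. The onlyLefts binding is an addition for illustration: it shows one way to use a lefts-only value on its own by fixing the unused type parameter (unit is an arbitrary choice here).

```fsharp
type LeftRight<'left, 'right> =
    { Left : 'left list
      Right : 'right list }

let makeLeft xs = { Left = xs; Right = [] }
let makeRight ys = { Left = []; Right = ys }
let combine l r = { Left = l.Left @ r.Left; Right = l.Right @ r.Right }

// Explicit type parameters mark these values as generic.
let aaa<'a> : LeftRight<int, 'a> = makeLeft [1; 2; 3]
let bbb<'a> : LeftRight<'a, float> = makeRight [1.0; 2.0; 3.0]

// The missing side is inferred from the other argument.
let comb = combine aaa bbb

// Hypothetical: pin the unused side when only lefts are needed.
let onlyLefts : LeftRight<int, unit> = aaa
```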
I need a data structure for the following:
In a device that has memory slots, each of the slots has a set of parameters. These parameters have different types. The list of possible parameters is fixed, so there is no need for generic flexibility à la »Support of arbitrary parameters without change«. Also, for each parameter, the structure of the contents is known. Typical use cases are the retrieval and modification of one specific parameter as well as a transformation of the complete parameter set into a different (but already defined) data structure.
The natural choice of F# data structure would be a sum type like this:
type SomeParameterContentType = { Field1 : string; Field2 : int }
type SomeOtherParameterContentType = { Other1 : bool option; Other2 : double }
type Parameter =
    | SomeParameter of SomeParameterContentType
    | SomeOtherParameter of SomeOtherParameterContentType
This way I could create a set and store the parameters there with a very nice data structure. The question here is: given this idea, what would looking for a specific parameter look like? I don't know of any way to specify a predicate for a find function for sets. It would be possible to define another sum type listing just the parameter types without their contents and use it as the key for a Dictionary, but I don't like this idea too much. Using strings instead of the second sum type doesn't make things better, as it would still require providing the list of possible parameters twice.
Does anyone have a better idea?
Thx
--Mathias.
Sounds like all you want is a tryFind for a Set:
module Set =
    let tryFind p =
        Set.toList >> List.tryFind p
Usage:
let mySet = Set.ofList [1;2;3;4;5]
let m = mySet |> Set.tryFind (fun t -> t = 2)
val m : int option = Some 2
Usage with your Types:
let yourSet = Set.ofList [SomeParameter {Field1="hello";Field2=3}]
let mYours =
    yourSet
    |> Set.tryFind (fun t ->
        match t with
        | SomeParameter p -> true
        | SomeOtherParameter p -> false)
val mYours : Parameter option = Some (SomeParameter {Field1 = "hello";
Field2 = 3;})
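For the two use cases named in the question (retrieving and modifying one specific parameter), a variation on the same idea is sketched below. It relies on a Set being usable as a sequence, so Seq.tryPick can return the payload directly without a separate tryFind. The helper names tryGetSomeParameter and setSomeParameter and the sample values are assumptions.

```fsharp
type SomeParameterContentType = { Field1 : string; Field2 : int }
type SomeOtherParameterContentType = { Other1 : bool option; Other2 : double }
type Parameter =
    | SomeParameter of SomeParameterContentType
    | SomeOtherParameter of SomeOtherParameterContentType

let parameters =
    Set.ofList
        [ SomeParameter { Field1 = "hello"; Field2 = 3 }
          SomeOtherParameter { Other1 = Some true; Other2 = 1.5 } ]

// Retrieval: return the contents of the SomeParameter case, if present.
let tryGetSomeParameter (ps : Set<Parameter>) =
    ps |> Seq.tryPick (function SomeParameter p -> Some p | _ -> None)

// Modification: drop any existing SomeParameter and add the new one.
let setSomeParameter newValue (ps : Set<Parameter>) =
    ps
    |> Set.filter (function SomeParameter _ -> false | _ -> true)
    |> Set.add (SomeParameter newValue)

let updatedParameters =
    parameters |> setSomeParameter { Field1 = "hello"; Field2 = 10 }
```

Both helpers scan the set linearly, which is probably fine for a small, fixed list of parameters but is worth keeping in mind if the set grows.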
I am writing a compiler for mini-Pascal in OCaml. I would like my compiler to accept the following code, for instance:
program test;
var
a,b : boolean;
n : integer;
begin
...
end.
I have difficulties in dealing with the declaration of variables (the part following var). At the moment, the type of variables is defined like this in sib_syntax.ml:
type s_var =
{ s_var_name: string;
s_var_type: s_type;
s_var_uniqueId: s_uniqueId (* key *) }
Where s_var_uniqueId (instead of s_var_name) is the unique key of the variables. My first question is, where and how I could implement the mechanism of generating a new id (actually by increasing the biggest id by 1) every time I have got a new variable. I am wondering if I should implement it in sib_parser.mly, which probably involves a static variable cur_id and the modification of the part of binding, again don't know how to realize them in .mly. Or should I implement the mechanism at the next stage - the interpreter.ml? but in this case, the question is how to make the .mly consistent with the type s_var, what s_var_uniqueId should I provide in the part of binding?
Another question is about this part of statement in .mly:
id = IDENT COLONEQ e = expression
{ Sc_assign (Sle_var {s_var_name = id; s_var_type = St_void}, e) }
Here, I also need to provide the next level (the interpreter.ml) a variable of which I only know the s_var_name, so what could I do regarding its s_var_type and s_var_uniqueId here?
Could anyone help? Thank you very much!
The first question to ask yourself is whether you actually need a unique id. In my experience, they're almost never necessary or even useful. If what you're trying to do is make variables unique through alpha-equivalence, then this should happen after parsing is complete, and will probably involve some form of DeBruijn indices instead of unique identifiers.
Either way, a function which returns a new integer identifier every time it is called is:
let unique =
  let last = ref 0 in
  fun () -> incr last ; !last
let one = unique () (* 1 *)
let two = unique () (* 2 *)
So, you can simply assign { ... ; s_var_uniqueId = unique () } in your Menhir rules.
The more important problem you're trying to solve here is that of variable binding. Variable x is defined in one location and used in another, and you need to determine that it happens to be the same variable in both places. There are many ways of doing this, one of them being to delay the binding until the interpreter. I'm going to show you how to deal with this during parsing.
First, I'm going to define a context: it's a set of variables that allows you to easily retrieve a variable based on its name. You might want to create it with hash tables or maps, but to keep things simple I will be using List.assoc here.
type s_context = {
  s_ctx_parent : s_context option ;
  s_ctx_bindings : (string * (int * s_type)) list ;
  s_ctx_size : int ;
}
let empty_context parent = {
  s_ctx_parent = parent ;
  s_ctx_bindings = [] ;
  s_ctx_size = 0
}
let bind v_name v_type ctx =
  (* List.assoc takes the key first, then the association list *)
  try let _ = List.assoc v_name ctx.s_ctx_bindings in
      failwith "Variable is already defined"
  with Not_found ->
    { ctx with
      s_ctx_bindings = (v_name, (ctx.s_ctx_size, v_type))
                       :: ctx.s_ctx_bindings ;
      s_ctx_size = ctx.s_ctx_size + 1 }
let rec find v_name ctx =
  try 0, List.assoc v_name ctx.s_ctx_bindings
  with Not_found ->
    match ctx.s_ctx_parent with
    | Some parent -> let depth, found = find v_name parent in
                     depth + 1, found
    | None -> failwith "Variable is not defined"
So, bind adds a new variable to the current context, find looks for a variable in the current context and its parents, and returns both the bound data and the depth at which it was found. So, you could have all global variables in one context, then all parameters of a function in another context that has the global context as its parent, then all local variables in a function (when you'll have them) in a third context that has the function's main context as the parent, and so on.
So, for instance, find "x" ctx will return something like 0, (3, St_int) where 0 is the DeBruijn index of the variable, 3 is the position of the variable in the context identified by the DeBruijn index, and St_int is the type.
type s_var = {
s_var_deBruijn: int;
s_var_type: s_type;
s_var_pos: int
}
let find v_name ctx =
let deBruijn, (pos, typ) = find v_name ctx in
{ s_var_deBruijn = deBruijn ;
s_var_type = typ ;
s_var_pos = pos }
Of course, you need your functions to store their context, and make sure that the first argument is the variable at position 0 within the context:
type s_fun =
  { s_fun_name: string;
    s_fun_type: s_type;
    s_fun_params: s_context;
    s_fun_body: s_block; }
let context_of_paramlist parent paramlist =
List.fold_left
(fun ctx (v_name,v_type) -> bind v_name v_type ctx)
(empty_context parent)
paramlist
Then, you can change your parser to take into account the context. The trick is that instead of returning an object representing part of your AST, most of your rules will return a function that takes a context as an argument and returns an AST node.
For instance:
int_expression:
(* Constant : ignore the context *)
| c = INT { fun _ -> Se_const (Sc_int c) }
(* Variable : look for the variable inside the context *)
| id = IDENT { fun ctx -> Se_var (find id ctx) }
(* Subexpressions : pass the context to both *)
| e1 = int_expression o = operator e2 = int_expression
{ fun ctx -> Se_binary (o, e1 ctx, e2 ctx) }
;
So, you simply propagate the context "down" recursively through the expressions. The only clever parts are those when new contexts are created (you don't have this syntax yet, so I'm just adding a placeholder):
| function_definition_expression (args, body)
{ fun ctx -> let ctx = context_of_paramlist (Some ctx) args in
{ s_fun_params = ctx ;
s_fun_body = body ctx } }
As well as the global context (the program rule itself does not return a function, but the block rule does, and so a context is created from the globals and provided).
prog:
PROGRAM IDENT SEMICOLON
globals = variables
main = block
DOT
{ let ctx = context_of_paramlist None globals in
{ globals = ctx;
main = main ctx } }
All of this makes the implementation of your interpreter much easier due to the DeBruijn indices: you can have a "stack" which holds your values (of type value) defined as:
type stack = value array list
Then, reading and writing variable x is as simple as:
let read stack x =
(List.nth stack x.s_var_deBruijn).(x.s_var_pos)
let write stack x value =
(List.nth stack x.s_var_deBruijn).(x.s_var_pos) <- value
Also, since we made sure that function parameters are in the same order as their position in the function context, if you want to call function f and its arguments are stored in the array args, then constructing the stack is as simple as:
let inner_stack = args :: stack in
(* Evaluate f.s_fun_body with inner_stack here *)
But I'm sure you'll have a lot more questions to ask when you start working on your interpreter ;)
How to create a global id generator:
let unique =
let counter = ref (-1) in
fun () -> incr counter; !counter
Test:
# unique ();;
- : int = 0
# unique ();;
- : int = 1
Regarding your more general design question: it seems that your data representation does not faithfully represent the compiler phases. If you must return a type-aware data-type (with this field s_var_type) after the parsing phase, something is wrong. You have two choices:
devise a more precise data representation for the post-parsing AST, that would be different from the post-typing AST, and not have those s_var_type fields. Typing would then be a conversion from the untyped to the typed AST. This is a clean solution that I would recommend.
admit that you must break the data representation semantics because you don't have enough information at this stage, and try to be at peace with the idea of returning garbage such as St_void after the parsing phase, to reconstruct the correct information later. This is less typed (as you have an implicit assumption on your data which is not apparent in the type), more pragmatic, ugly but sometimes necessary. I don't think it's the right decision in this case, but you will encounter situations where it's better to be a bit less typed.
I think the specific choice of unique id handling design depends on your position on this more general question, and your concrete decisions about types. If you choose a finer-typed representation of post-parsing AST, it's your choice to decide whether to include unique ids or not (I would, because generating a unique ID is dead simple and doesn't need a separate pass, and I would rather slightly complexify the grammar productions than the typing phase). If you choose to hack the type field with a dummy value, it's also reasonable to do that for variable ids if you wish to, putting 0 as a dummy value and defining it later; but still I personally would do that in the parsing phase.