I am looking to encode (in some .NET language; F# seems the most likely to support this) a class of types that form a current context.
The rules would be that we start with an initial context of type 'a. As the computation progresses, the context is added to, but in a way that always lets me get at previous values. So assume an operation that adds information to the context, 'a -> 'b, which implies that all elements of 'a are also in 'b.
The idea is similar to an immutable map, but I would like it to be statically typed. Is that feasible? How, or why not? TIA.
Update: The answer appears to be that you cannot quite do this at this time, although I have some good suggestions for modeling what I am looking for in a different way. Thanks to all who tried to help with my poorly worded question.
Separate record types in F# are distinct, even if superficially they have similar structure. Even if the fields of record 'a form a subset of fields of record 'c, there's no way of enforcing that relationship statically. If you have a valid reason to use distinct record types there, the best you could do would be to use reflection to get the fields using FSharpType.GetRecordFields and check if one forms the subset of the other.
Furthermore, introducing a new record type for each piece of data added would result in horrendous amounts of boilerplate.
I see two ways to model it that would feel more at home in F# and still let you enforce some form of your 'a :> 'c constraint at runtime.
1) If you foresee a small number of records, all of which are useful in other parts of your program, you can use a discriminated union to enumerate the steps of your process:
type NameAndAmountAndFooDU =
    | Initial of Name
    | Intermediate of NameAndAmount
    | Final of NameAndAmountAndFoo
With that, records that previously were unrelated types 'a and 'c, become part of a single type. That means you can store them in a list inside Context and easily go back in time to see if the changes are going in the right direction (Initial -> Intermediate -> Final).
2) If you foresee a lot of changes like 'adding' a single field, and you care more about the final product than the intermediate ones, you can define a record of option fields based on the final record:
type NameAndAmountAndFooOption =
    {
        Name: string option
        Amount: decimal option
        Foo: bool option
    }
and have a way to convert it to a non-option NameAndAmountAndFoo (or the intermediate ones like NameAndAmount if you need them for some reason). Then in Context you can set the values of individual fields one at a time, and again, collect the previous records to keep track of how changes are applied.
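Both approaches can be sketched roughly as follows. This is only an illustration: the three stage records (Name, NameAndAmount, NameAndAmountAndFoo) are not shown above, so their definitions here are assumptions, and rank, movesForward and tryFinalize are hypothetical helper names.

```fsharp
// Hypothetical stage records (assumed shapes, not taken from the question).
type Name = { Name: string }
type NameAndAmount = { Name: string; Amount: decimal }
type NameAndAmountAndFoo = { Name: string; Amount: decimal; Foo: bool }

// Approach 1: a single union type that enumerates the stages.
type NameAndAmountAndFooDU =
    | Initial of Name
    | Intermediate of NameAndAmount
    | Final of NameAndAmountAndFoo

// Position of a stage in the Initial -> Intermediate -> Final order.
let rank step =
    match step with
    | Initial _ -> 0
    | Intermediate _ -> 1
    | Final _ -> 2

// Runtime check that a stored history never goes backwards.
let movesForward (history: NameAndAmountAndFooDU list) =
    history
    |> List.map rank
    |> List.pairwise
    |> List.forall (fun (a, b) -> a <= b)

// The annotations pick the right record type, since the field names overlap.
let history =
    [ Initial ({ Name = "bob" } : Name)
      Intermediate ({ Name = "bob"; Amount = 10m } : NameAndAmount)
      Final ({ Name = "bob"; Amount = 10m; Foo = true } : NameAndAmountAndFoo) ]

// Approach 2: one record of option fields, filled in one field at a time.
type NameAndAmountAndFooOption =
    { Name: string option
      Amount: decimal option
      Foo: bool option }

// Convert to the non-option record only when every field has been set.
let tryFinalize (p: NameAndAmountAndFooOption) : NameAndAmountAndFoo option =
    match p.Name, p.Amount, p.Foo with
    | Some name, Some amount, Some foo ->
        Some ({ Name = name; Amount = amount; Foo = foo } : NameAndAmountAndFoo)
    | _ -> None

let empty : NameAndAmountAndFooOption = { Name = None; Amount = None; Foo = None }
let partial = { empty with Name = Some "bob" }
let complete = { partial with Amount = Some 10m; Foo = Some true }
```

In both cases the "subset" relationship is only checked at runtime (by movesForward or by tryFinalize returning None), which is the trade-off described above.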
Something like this?
type Property =
    | Name of string
    | Amount of float

let context =
    Map.empty<string, Property>
    // Parse or whatever
    |> Map.add "Name" (Name "bob")
    |> Map.add "Amount" (Amount 3.14)
I have a feeling that if you could show us a bit more of your problem space, there may be a more idiomatic overall solution.
A quick screenshot with the point of interest:
There are 2 questions here.
This happens in a tight loop. The 12.8% code is this:
{
    this with Side = side; PositionPrice = position'; StopLossPrice = sl'; TakeProfitPrice = tp'; Volume = this.Volume + this.Quantity * position'
}
This object is passed around a lot and has 23 fields so it's not tiny. It looks like immutability is great for stable code, but it's horrible for performance.
Since this recursive loop is run in parallel, I need to store its context variables in an object.
I am looking for a general idea of what makes sense, not something specific to that code because I have a few tight loops with a bunch of math which I need to profile as well. I am sure I'll find the same pattern in several places.
The flaw here is that I store both the context for the calculations and its variables in a single type that gets passed through the loop. As the variable fields get updated, the whole object has to be recreated.
What would make sense here (in general, for this type of situation)?
Make the fields that can change mutable. In this case, that means keeping the type as is (23 fields) and making some fields mutable (only 5 fields get changed regularly).
Move the mutable fields to their own type, so there is a general context object and a separate one holding all the variables. In this case, that means a context with the remaining 18 fields (23 - 5) and a separate 5-field type.
Turn the mutable fields into variables and move them out of the type. In this case, these 5 fields would be passed as arguments in the recursive loop.
and for the second question:
I have no idea what the 10.0% line with get_Tag is. I have nothing called 'Tag' in the code, so I assume that's a dotnet internal thing.
I have a type called Side and there is a field with the same name used in the loop, but what is the 'Tag' part?
What I would suggest is not to modify your existing immutable type at all. Instead, create a new type with mutable fields that is only used within your tight loop. If the type leaves that loop, convert it back to your immutable type (assuming you don't need a copy to go through the rest of your program with every iteration).
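A minimal sketch of that idea, under assumptions: State is a cut-down stand-in for the 23-field record, Scratch is a hypothetical mutable working type, and the loop body is invented purely to show the shape.

```fsharp
type Side = Buy | Sell

// Stand-in for the real immutable record (only a few fields shown).
type State =
    { Side: Side
      PositionPrice: float
      Volume: float
      Quantity: float }

// Hypothetical mutable working copy, used only inside the tight loop.
type Scratch(initial: State) =
    member val Side = initial.Side with get, set
    member val PositionPrice = initial.PositionPrice with get, set
    member val Volume = initial.Volume with get, set
    // Convert back to the immutable record once, when leaving the loop.
    member this.ToState() : State =
        { initial with
            Side = this.Side
            PositionPrice = this.PositionPrice
            Volume = this.Volume }

// Illustrative loop: mutates the scratch object instead of copying the record.
let run (start: State) (prices: float[]) : State =
    let s = Scratch(start)
    for p in prices do
        s.PositionPrice <- p
        s.Volume <- s.Volume + start.Quantity * p
    s.ToState()
```

Only one Scratch object and one final State are allocated per run, rather than one record copy per iteration. Whether that matters in practice is something the profiler has to confirm.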
get_Tag in this case is likely the auto-generated get-only property on a discriminated union, it's just how the F# compiler represents this sort of type in CLR. The property can most easily be seen when looking at F# code from C#, here's a great page on F# decompiled:
https://fsharpforfunandprofit.com/posts/fsharp-decompiled/#unions
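The property can also be seen from F# itself with a bit of reflection. This is just a sketch; the Side type below is a hypothetical two-case union standing in for the one in the question.

```fsharp
// Hypothetical stand-in for the Side union from the question.
type Side = Buy | Sell

// The compiler-generated property that shows up as get_Tag in the profiler.
let tagProperty = typeof<Side>.GetProperty("Tag")

// Tags are assigned in declaration order, starting at 0.
let tagOfBuy = tagProperty.GetValue(box Buy) :?> int
let tagOfSell = tagProperty.GetValue(box Sell) :?> int
```

Every pattern match or equality test on a Side value can end up reading this property, which would explain why it appears in a hot loop.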
For the performance issues I can only offer some suggestions:
If you can constrain the context object to your code only, then try making a mutable version and see what effect it has.
You mention that the context object is quite large, is it possible to split it up?
Taking the example of writing a parser-unparser pair for a DU.
The unparser function of course uses match expression, where we can count on static exhaustiveness check if new cases are added later.
Now, for the parsing side, the obvious route (with FParsec, for example) is to use a choice parser with a list of the DU case parsers. This works, but does not have exhaustiveness checking.
My best solution for this is to create a DU_Case -> parser function that uses a match expression, then use runtime reflection to get all DU cases, call the function with each of them, and pass the resulting list to the choice combinator. This works, and does have static exhaustiveness checking, BUT it gets pretty clunky fast, especially if the cases have varying fields.
In other words, is there a good pattern other than using reflection for ensuring exhaustiveness when the discriminated union is the output and not the input?
Edit: example on the reflection based solution
By getting clunky, I mean that for every case that has fields, I have to also create the field values dynamically, instead of the simple Array.zeroCreate(0) which is of course always valid. But what in the else branch? There needs to be code to dynamically create the correct number of field values, with their correct (possibly complex) constructors and so on. Not impossible with reflection, but clunky.
FSharpType.GetUnionCases duType
|> Seq.map (fun duCase ->
    if duCase.GetFields().Length = 0 then
        let case = FSharpValue.MakeUnion(duCase, Array.zeroCreate(0))
    else
        (*...*)
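For the simple situation where every case is field-less, the reflection approach described above can be sketched like this. It does not use FParsec; a plain keyword lookup table stands in for the choice combinator, and Color, keyword, allCases and tryParse are hypothetical names chosen for the illustration.

```fsharp
open Microsoft.FSharp.Reflection

// Hypothetical union with only field-less cases.
type Color =
    | Red
    | Green
    | Blue

// Case -> keyword. The match is exhaustive, so adding a case
// produces a compiler warning here until it is handled.
let keyword color =
    match color with
    | Red -> "red"
    | Green -> "green"
    | Blue -> "blue"

// Enumerate every case at runtime (only valid for field-less cases).
let allCases : Color list =
    FSharpType.GetUnionCases(typeof<Color>)
    |> Array.map (fun c -> FSharpValue.MakeUnion(c, [||]) :?> Color)
    |> List.ofArray

// Keyword -> case table, built from the exhaustive function above.
let table = allCases |> List.map (fun c -> keyword c, c) |> Map.ofList

let tryParse (text: string) : Color option = Map.tryFind text table
```

The clunky part mentioned above is exactly what this sketch avoids: as soon as a case carries fields, MakeUnion needs real argument values, and there is no obvious generic way to produce them.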
I'm writing an SVG parser, mainly as an exercise for learning how to use Parsec. Currently I'm using the following data type to represent my SVG file:
data SVG = Element String [Attribute] [SVG]
         | SelfClosingTag [Attribute]
         | Body String
         | Comment String
         | XMLDecl String
This works quite well, however I'm not sure about the Element String [Attribute] [SVG] part of my data type.
Since there is only a limited number of potential tags for an SVG, I was thinking about using a type to represent an SVG element instead of using a String. Something like this:
data SVG = Element TagName [Attribute] [SVG]
         | ...
data TagName = A
             | AltGlyph
             | AltGlyphDef
             ...
             | View
             | Vkern
Is it a good idea? What would be the benefits of doing this if there are any?
Is there a more elegant solution?
I personally prefer the approach of enumerating all possible TagNames. This way, the compiler can give you errors and warnings if you make any careless mistakes. For example, if I want to write a function that covers every possible type of Element, then if every type is enumerated in an ADT, the compiler can give you non-exhaustive match warnings. If you represent it as a string, this is not possible. Additionally, if I want to match an Element of a specific type, and I accidentally misspell the TagName, the compiler will catch it. A third reason, which probably doesn't really apply here, but is worth noting in general is that if I later decide to add or remove a variant of TagName, then the compiler will tell me every place that needs to be modified. I doubt this will happen for SVG tag names, but in general it is something to keep in mind.
To answer your question:
You can do this either way depending on what you are going to do with your parse tree after you make it.
If all you care to do with your SVG parser is describe the shape of the SVG data, you are just fine with a string.
On the other hand if you want to somehow transform that SVG data into something like a graphic (that is you anticipate evaluating your AST) you will find that it is best to represent all semantic information in the type system. It will make the next steps much easier.
The question in my mind is whether the parsing pass is exactly the place to make that happen. (Full disclosure, I have only a passing familiarity with SVG.) I suspect that rather than just a flat list of tags, you would be better off with Elements that each have their own set of required and optional attributes. If this transformation "happens later in the program", there is no need to create a TagName data type. You can catch all the type errors at the same time you merge the attributes into the Elements.
On the other hand, a good argument could be made to parse straight into a complete Element tree in which case, I would drop the generic [Attribute] and [SVG] fields of the Element constructor and instead make appropriate fields in your TagName constructor.
Another answer to the question you didn't ask:
Put source code location into your parse tree early. From personal experience, I can tell you it gets harder the larger your program gets.
I understand well the benefit of option, but in this case, I want to avoid using option for performance reasons. option wraps a type in a class, which just means more work for the garbage collector -- and I want to avoid that.
In this case especially, I have multiple fields that are all Some under the same circumstances, but I don't want to put them in a tuple because, again, tuples are classes and put additional stress on the GC. So I end up accessing field.Value, which defeats the purpose of option.
So unless there's an optimization I don't know about that causes option types to be treated as references that are potentially null, I want to just use null. Is there a way that I can do that?
Edit: To expand on what I'm doing, I'm making a bounding volume hierarchy, which is really a binary tree with data only at the leaf nodes. I'm implementing it as a class rather than as a discriminated union because keeping the items immutable isn't an option for performance reasons, and discriminated unions can't have mutable members, only refs -- again, adding to GC pressure.
As silly as it is in a functional language, I may just end up doing each node type as an inheritance of a Node parent type. Downcasting isn't exactly the fastest operation, but as far as XNA and WP7 are concerned, almost anything is better than angering the GC.
According to this MSDN documentation, if you decorate your type with the [<AllowNullLiteral>] attribute, you can use the null literal for values of that type. Unchecked.defaultof<T> (a value, not a function call, so no parentheses) also gives you a null for any reference type.
That seems to be the only way within F# to do what you want. Otherwise, you could marshal out to another .NET language and get nulls from there, but I'm guessing that is not what you want at all.
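As a rough sketch of what that looks like for a tree node: the Node class below is a hypothetical stand-in for the bounding volume hierarchy node in the question, with an int payload chosen just for the example.

```fsharp
// Hypothetical tree node; the attribute lets null be used as a Node value.
[<AllowNullLiteral>]
type Node(value: int) =
    member this.Value = value
    // Children default to null instead of being wrapped in option.
    member val Left : Node = null with get, set
    member val Right : Node = null with get, set

// A node is a leaf when both children are null.
let isLeaf (n: Node) = isNull n.Left && isNull n.Right

let root = Node(1)
root.Left <- Node(2)

// Unchecked.defaultof is a value, not a function call; for a class it is null.
let nothing : Node = Unchecked.defaultof<Node>
```

The price is the one option was designed to avoid: nothing forces a caller to check for null before touching Left or Right.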
Now there are Value Options which may give you the best of both worlds
[<StructuralEquality; StructuralComparison>]
[<Struct>]
type ValueOption<'T> =
    | ValueNone
    | ValueSome of 'T
No class wrapping, with syntax and semantics that mirror Option<'T>.
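A short usage sketch, assuming a version of FSharp.Core that ships ValueOption (F# 4.5 or later); tryHalve and describe are hypothetical names for the example.

```fsharp
// Returns a struct-based option: no heap allocation for the wrapper.
let tryHalve (n: int) : int voption =
    if n % 2 = 0 then ValueSome (n / 2) else ValueNone

// Pattern matching works the same way as with Some / None.
let describe (v: int voption) =
    match v with
    | ValueSome x -> sprintf "half is %d" x
    | ValueNone -> "odd"

// ValueOption is a value type, unlike Option.
let isStruct = typeof<ValueOption<int>>.IsValueType
```

Note that a struct is copied when passed around, so for large payloads the copy cost may offset the saved allocation.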
I am having a brain freeze on F#'s option types. I have 3 books and have read all I can, but I am not getting them.
Does someone have a clear and concise explanation and maybe a real world example?
TIA
Gary
Brian's answer has been rated as the best explanation of option types, so you should probably read it :-). I'll try to write a more concise explanation using a simple F# example...
Let's say you have a database of products and you want a function that searches the database and returns the product with a specified name. What should the function do when there is no such product? When using null, the code could look like this:
Product p = GetProduct(name);
if (p != null)
    Console.WriteLine(p.Description);
A problem with this approach is that you are not forced to perform the check, so you can easily write code that will throw an unexpected exception when product is not found:
Product p = GetProduct(name);
Console.WriteLine(p.Description);
When using option type, you're making the possibility of missing value explicit. Types defined in F# cannot have a null value and when you want to write a function that may or may not return value, you cannot return Product - instead you need to return option<Product>, so the above code would look like this (I added type annotations, so that you can see types):
let (p:option<Product>) = GetProduct(name)
match p with
| Some prod -> Console.WriteLine(prod.Description)
| None -> () // No product found
You cannot directly access the Description property, because the result of the search is not a Product. To get the actual Product value, you need to use pattern matching, which forces you to handle the case when a value is missing.
Summary. To summarize, the purpose of option type is to make the aspect of "missing value" explicit in the type and to force you to check whether a value is available each time you work with values that may possibly be missing.
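For completeness, here is a self-contained version of the product example that can be pasted into F# Interactive. The Product record, the in-memory products list and the function names are assumptions made for the sketch; a real GetProduct would query the database.

```fsharp
// Hypothetical product type and in-memory "database".
type Product = { Name: string; Description: string }

let products =
    [ { Name = "tea"; Description = "Loose-leaf green tea" }
      { Name = "mug"; Description = "Ceramic mug" } ]

// The return type is Product option, so "not found" is visible in the signature.
let getProduct (name: string) : Product option =
    products |> List.tryFind (fun p -> p.Name = name)

// Callers must handle both cases before they can reach Description.
let describe (name: string) : string =
    match getProduct name with
    | Some prod -> prod.Description
    | None -> "No product found"
```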
See,
http://msdn.microsoft.com/en-us/library/dd233245.aspx
The intuition behind the option type is that it "implements" a null value. But in contrast to null, you have to explicitly request that a value can be null, whereas in most other languages, references can be null by default. There is a similarity to SQL's NULL/NOT NULL, if you are familiar with those.
Why is this clever? It is clever because the language can assume that no output of any expression can ever be null. Hence, it can eliminate all null-pointer checks from the code, yielding a lot of extra speed. Furthermore, it frees the programmer from having to check for the null case everywhere in order to produce safe code.
For the few cases where a program does require a null value, the option type exists. As an example, consider a function which asks for a key inside an .ini file. The value returned is an integer, but the .ini file might not contain the key. In this case, it does make sense to return 'null' if the key is not found. None of the integer values are usable as a marker for this, since the user might have entered exactly that integer value in the file. Hence, we need to 'lift' the domain of integers and give it a new value representing "no information", i.e., the null. So we wrap the 'int' into an 'int option'. Now, if there is no integer value we will get 'None', and if there is an integer value, we will get 'Some(N)', where N is the integer value in question.
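The .ini example could look something like the sketch below. Reading the file is left out; the settings are assumed to have been loaded into a Map<string, string> already, and tryGetInt is a hypothetical helper name.

```fsharp
// Settings as they might look after reading an .ini file (assumed shape).
let settings = Map.ofList [ "port", "8080"; "retries", "0" ]

// Returns None when the key is missing or is not a valid integer.
let tryGetInt (key: string) (settings: Map<string, string>) : int option =
    match Map.tryFind key settings with
    | Some text ->
        match System.Int32.TryParse(text) with
        | true, n -> Some n
        | _ -> None
    | None -> None
```

Here Some 0 for "retries" and None for a missing key are different values, which is exactly what a plain int return type could not express.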
There are two beautiful consequences of the choice. One, we can use the general pattern match features of F# to discriminate the values in e.g., a case expression. Two, the framework of algebraic datatypes used to define the option type is exposed to the programmer. That is, if there were no option type in F# we could have created it ourselves!
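To illustrate that last point, a home-made equivalent takes only a few lines. Maybe, Nothing, Just and withDefault are names picked for this sketch (borrowed from Haskell) so that they do not clash with the built-in option type.

```fsharp
// A hand-rolled option type: an ordinary two-case discriminated union.
type Maybe<'T> =
    | Nothing
    | Just of 'T

// Ordinary pattern matching discriminates the two cases.
let withDefault (fallback: 'T) (value: Maybe<'T>) : 'T =
    match value with
    | Just x -> x
    | Nothing -> fallback
```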