I have an avro schema define like -
#namepsace("com.test.customer)
protocol customerData {
record customerData {
union {null, string} id;
union {null, string} firstname;
}
}
after it went live, we added another field -
#namepsace("com.test.customer)
protocol customerData {
record customerData {
union {null, string} id;
union {null, string} firstname,
union {null, string} customerType
}
}
customerType is defined as null, string. even then when while registering the schema with the confluent registry, we get the error -
Schema being registered is incompatible with an earlier schema.
Please let us know if there is any reason behind it. we have fixed this by explicitly defaultting customerType to null,
union {null, string} customerType = null;
But somehow I feel that this is not required. Please let me know why we get the error even when the schema is defined as {null, string}
"union {null, string} customerType" means that customerType can be null, but the default is undefined. The default can be anything, like
union {null, string} customerType = ""; (empty)
union {null, string} customerType = null;
You have to specify a default value so that when the field is missing, Avro knows what to use for the field.
Related
So say I have an employee type
type employee = {
employee_id: int
name: string
department: int
}
type department = {
department_id: int
department_name: string
}
And I want a second type that includes everything in the employee type, but also everything from the department type (in practice the result of an SQL join).
E.g
type employee_extended = {
employee_id: int
name: string
department: int
department_id: int
department_name: string
}
In reality the tables I have have many more columns so just wondering if there was a shorthand to define the extended type :)
Can assume no duplicate of property names, but if it's possible to handle them might be useful in the future.
If you have some data in two types (employee and department) and want to create a new type that contains information about both the employee and the department, the best way is to define a new record type that contains the two other types as fields:
type Employee = {
EmployeeId: int
Name: string
Department: int
}
type Department = {
DepartmentId: int
Department_name: string
}
// New type containing information about
// an employee and also their department
type EmployeeWithDepartment = {
Employee : Employee
Department : Department
}
I have two kinds of entity in my application: customers and products. They are each identified at a database level by a UUID.
In my F# code, this can be represented by System.Guid.
For readability, I added some types like this:
open System
type CustomerId = Guid
type ProductId = Guid
However, this does not prevent me from using a ProductId as a CustomerId and vice-versa.
I came up with a wrapper idea to prevent this:
open System
[<Struct>]
type ProductId =
{
Product : Guid
}
[<Struct>]
type CustomerId =
{
Customer : Guid
}
This makes initialization a little more verbose, and perhaps less intuitive:
let productId = { Product = Guid.NewGuid () }
But it adds type-safety:
// let customerId : CustomerId = productId // Type error
I was wondering what other approaches there are.
You can use single-case union types:
open System
[<Struct>]
type ProductId = ProductId of Guid
[<Struct>]
type CustomerId = CustomerId of Guid
let productId = ProductId (Guid.NewGuid())
Normally we add some convenient helper methods/properties directly to the types:
[<Struct>]
type ProductId = private ProductId of Guid with
static member Create () = ProductId (Guid.NewGuid())
member this.Value = let (ProductId i) = this in i
[<Struct>]
type CustomerId = private CustomerId of Guid with
static member Create () = CustomerId (Guid.NewGuid())
member this.Value = let (CustomerId i) = this in i
let productId = ProductId.Create ()
productId.Value |> printfn "%A"
Another approach, which is less common, but worth mentioning is to use so-called phantom types. The idea is that you will have a generic wrapper ID<'T> and then use different types for 'T to represent different types of IDs. Those types are never actually instantiated, which is why they're called phantom types.
[<Struct>]
type ID<'T> = ID of System.Guid
type CustomerID = interface end
type ProductID = interface end
Now you can create ID<CustomerID> and ID<ProductID> values to represent two kinds of IDs:
let newCustomerID () : ID<CustomerID> = ID(System.Guid.NewGuid())
let newProductID () : ID<ProductID> = ID(System.Guid.NewGuid())
The nice thing about this is that you can write functions that work with any ID easily:
let printID (ID g) = printfn "%s" (g.ToString())
For example, I can now create one customer ID, one product ID and print both, but I cannot do equality test on those IDs, because they're types do not match:
let ci = newCustomerID ()
let pi = newProductID ()
printID ci
printID pi
ci = pi // Type mismatch. Expecting a 'ID<CustomerID>' but given a 'ID<ProductID>'
This is a neat trick, but it is a bit more complicated than just using new type for each ID. In particular, you will likely need more type annotations in various places to make this work and the type errors might be less clear, especially when there is generic code involved. However, it's worth mentioning this as an alternative.
My intent is to define a module with functions which can operate on all records types which comply with certain assumptions about the keys.
To illustrate, let us have the following code:
> type DBRow = { id: string ; createdAt: System.DateTime } ;;
type DBRow =
{id: string;
createdAt: System.DateTime;}
> let logCreationInfo row = printf "Record %s created at %s " row.id (row.createdAt.ToString()) ;;
val logCreationInfo : row:DBRow -> unit
I would like to change the above logCreationInfo to be able to operate on all records which have id: string and createdAt: System.DateTime (and maybe other things).
Coming from typescript's structural typing, I'd have expected this to be trivial, but I am exploring the possibility that there is a more idiomatic way to handle this in F#.
I had attempted to handle this using interfaces, but even if that could work, since F# supports only explicit interfaces, this will not be suitable for types I don't define myself.
You could use statically resolved type constraints.
let inline logCreationInfo (x : ^t) =
printfn "Record %s created at %s"
(^t : (member id : string) (x))
((^t : (member createdAt : System.DateTime) (x)).ToString())
F# largely uses nominative typing - this is a natural choice in its runtime environment, as this is what Common Type System specification prescribes. Adherence to that set of rules allows F# code to near-seamlessly interoperate with other .NET languages.
It's worth noting that this follows the same reasoning as to why TypeScript uses structural typing. Since that language builds up on top of dynamically typed JavaScript, it's more natural to express object relationships in terms of their structure rather than nominal types - which are a foreign concept in JS.
F# does have a "backdoor" for structural typing through already mentioned SRTPs, but I would suggest using it very sparingly. SRTPs are resolved and the code using them is inlined by the compiler, making for longer compilation times and reduced interoperability with other languages and the .NET platform in general (simply put, you can't refer to that code from other languages or using reflection API, because it's "compiled away").
Usually there are other solutions available. Interfaces were already mentioned, though the example used was a bit contrived - this is simpler:
type IDBRow =
abstract Id: string
abstract CreatedAt: System.DateTime
type Person =
{
id: string
name: string
age: int
createdAt: System.DateTime
}
interface IDBRow with
member this.Id = this.id
member this.CreatedAt = this.createdAt
let logCreationInfo (row: #IDBRow) =
printf "Record %s created at %s" row.Id (string row.CreatedAt)
let x = { id = "1"; name = "Bob"; age = 32; createdAt = DateTime.Now }
logCreationInfo x
Or using composition and a generic type to capture the generic part of what it means to be a DBRow:
type DBRow<'data> =
{
id: string
data: 'data
createdAt: System.DateTime
}
type Person =
{
name: string
age: int
}
let logCreationInfo (row: DBRow<_>) =
printf "Record %s created at %s" row.id (string row.createdAt)
let x = { id = "1"; data = { name = "Bob"; age = 32 }; createdAt = DateTime.Now }
logCreationInfo x
Here's a version with interfaces:
open System
type DBRow1 = {
id: string
createdAt: DateTime
}
type DBRow2 = {
id: string
createdAt: DateTime
address: string
}
/// The types are defined above without an interface
let row1 = {id = "Row1"; createdAt = DateTime.Now}
let row2 = {id = "Row2"; createdAt = DateTime.Now; address = "NYC"}
type IDBRow<'A> =
abstract member Data:(string * DateTime)
// Object expression implements the interface
let Data1 (x:DBRow1) = {
new IDBRow<_> with
member __.Data = (x.id, x.createdAt)
}
let Data2 (x: DBRow2) = {
new IDBRow<_> with
member __.Data = (x.id, x.createdAt)
}
//pass in both the object expression and the record
let getData (ifun: 'a -> IDBRow<'b>) xrec =
(ifun xrec).Data
// You could partially apply the functions: `getData1 = getData Data1`
getData Data1 row1 //("Row1", 2018/02/05 9:24:17)
getData Data2 row2 //("Row2", 2018/02/05 9:24:17)
You can certainly use an interface (an object expression in this case) to tack on another member, .Data, even if you don'T have access to the original type. You would still need to put together one object expression for each type though, so SRTP might be a more "elegant" solution.
When I have the following code:
[<Struct>]
type Person = { mutable FirstName:string ; LastName : string}
let john = { FirstName = "John"; LastName = "Connor"}
john.FirstName <- "Sarah";
The compiler complains that "A value must be mutable in order to mutate the contents". However when I remove the Struct attribute it works fine. Why is that so ?
This protects you from a gotcha that used to plague the C# world a few years back: structs are passed by value.
Note that the red squiggly (if you're in IDE) appears not under FirstName, but under john. The compiler complains not about changing the value of john.FirstName, but about changing the value of john itself.
With non-structs, there is an important distinction between the reference and the referenced object:
Both the reference and the object itself can be mutable. So that you can either mutate the reference (i.e. make it point to a different object), or you can mutate the object (i.e. change the contents of its fields).
With structs, however, this distinction does not exist, because there is no reference:
This means that when you mutate john.FirstName, you also mutate john itself. They are one and the same.
Therefore, in order to perform this mutation, you need to declare john itself as mutable too:
[<Struct>]
type Person = { mutable FirstName:string ; LastName : string}
let mutable john = { FirstName = "John"; LastName = "Connor"}
john.FirstName <- "Sarah" // <-- works fine now
For further illustration, try this in C#:
struct Person
{
public string FirstName;
public string LastName;
}
class SomeClass
{
public Person Person { get; } = new Person { FirstName = "John", LastName = "Smith" };
}
class Program
{
static void Main( string[] args )
{
var c = new SomeClass();
c.Person.FirstName = "Jack";
}
}
The IDE will helpfully underline c.Person and tell you that you "Cannot modify the return value of 'SomeClass.Person' because it is not a variable".
Why is that? Every time you write c.Person, that is translated into calling the property getter, which is just like another method that returns you a Person. But because Person is passed by value, that returned Person is going to be a different Person every time. The getter cannot return you references to the same object, because there can be no references to a struct. And therefore, any changes you make to this returned value will not be reflected in the original Person that lives inside SomeClass.
Before this helpful compiler error existed, a lot of people would do this:
c.Person.FirstName = "Jack"; // Why the F doesn't it change? Must be compiler bug!
I clearly remember answering this question almost daily. Those were the days! :-)
I'd like some way to define related records. For example,
type Thing = { field1: string; field2: float }
type ThingRecord = { field1: string; field2: float; id: int; created: DateTime }
or
type UserProfile = { username: string; name: string; address: string }
type NewUserReq = { username: string; name: string; address: string; password: string }
type UserRecord = { username: string; name: string; address: string; encryptedPwd: string; salt: string }
along with a way to convert from one to the other, without needing to write so much boilerplate. Even the first example in full would be:
type Thing =
{ field1: string
field2: float }
with
member this.toThingRecord(id, created) =
{ field1 = this.field1
field2 = this.field2
id = id
created = created } : ThingRecord
and ThingRecord =
{ field1: string
field2: float
id: int
created: DateTime }
with
member this.toThing() =
{ field1 = this.field1
field2 = this.field2 } : Thing
As you get up to field10 etc, it gets to be a liability.
I currently do this in an unsafe (and slow) manner using reflection.
I added a request for with syntax to be extended to record definitions on uservoice, which would fill this need.
But is there perhaps an typesafe way to do this already? Maybe with type providers?
Yes, that's a chink in F#'s otherwise shiny armor. I don't feel there's a universal solution there for easily inheriting or extending a record. No doubt there is an appetite for one - I've counted over a dozen uservoice submissions advocating improvements along these lines - here are a few leading ones, feel free to vote up: 1, 2, 3, 4, 5.
For sure, there are things you can do to work around the problem, and depending on your scenario they might work great for you. But ultimately - they're workarounds and there's something you have to sacrifice:
Speed and type safety when using reflection,
Brevity when you go the type safe way and have full-fledged records with conversion functions between them,
All the syntactic and semantic goodness that records give you for free when you decide to fall back to plain .NET classes and inheritance.
Type providers won't cut it because they're not really a good tool for metaprogramming. That's not what they were designed for. If you try to use them that way, you're bound to hit some limitation.
For one, you can only provide types based on external information. This means that while you could have a type provider that would pull in types from a .NET assembly via reflection and provide some derived types based on that, you can't "introspect" into the assembly you're building. So no way of deriving from a type defined earlier in the same assembly.
I guess you could work around that by structuring your projects around the type provider, but that sounds clunky. And even then, you can't provide record types anyway yet, so best you could do are plain .NET classes.
For a more specific use case of providing some kind of ORM mapping for a database - I imagine you could use type providers just fine. Just not as a generic metaprogramming facility.
Why don't you make them more nested, like the following?
type Thing = { Field1: string; Field2: float }
type ThingRecord = { Thing : Thing; Id: int; Created: DateTime }
or
type UserProfile = { Username: string; Name: string; Address: string }
type NewUserReq = { UserProfile: UserProfile; Password: string }
type UserRecord = { UserProfile: UserProfile; EncryptedPwd: string; Salt: string }
Conversion functions are trivial:
let toThingRecord id created thing = { Thing = thing; Id = id; Created = created }
let toThing thingRecord = thingRecord.Thing
Usage:
> let tr = { Field1 = "Foo"; Field2 = 42. } |> toThingRecord 1337 (DateTime (2016, 6, 24));;
val tr : ThingRecord = {Thing = {Field1 = "Foo";
Field2 = 42.0;};
Id = 1337;
Created = 24.06.2016 00:00:00;}
> tr |> toThing;;
val it : Thing = {Field1 = "Foo";
Field2 = 42.0;}