Aliases for primitive types in AVRO

I want to define aliases (logical names) for primitive types. But if I do that directly, as for PartyId in the example below, avro-tools does not accept it:
protocol Test {
  string PartyId;
  record Header {
    PartyId partyId;
  }
}
Is that possible in IDL or AVRO Schema language? How? As a work-around, I can define:
record PartyId {
  string _value;
}
Although the binary encoding seems equivalent, semantically (e.g. in generated Java code) this is not the same: PartyId becomes a structured type, not a primitive type.
I can define custom names for enums and records, but it seems to me that AVRO doesn't offer means for aliasing primitive types.

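Aliases in Avro give alternate names to named types (records, enums, fixed) and to record fields, so that a reader's schema can be matched against a writer's schema that used a different name or namespace. They do not introduce new names for primitive types.
Records and enums have names because they are actual objects, not primitives.
Just as you cannot refer to a Java String by another class name, Avro has no need for such a feature. You can write doc comments or name the fields appropriately to make it clear what each field holds.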
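The question's remark that the wrapper record is binary-equivalent to a bare string can be checked by hand. Per the Avro specification, a string is written as a zig-zag varint length followed by UTF-8 bytes, and a record is just the concatenation of its fields with no header. The sketch below uses only the Java standard library; the class and method names are hypothetical, and it is an illustration of the encoding rules rather than the Avro library's own encoder.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not part of the Avro library.
final class AvroStringEncodingSketch {
    // Avro writes int/long values zig-zag encoded, then as a base-128 varint.
    static void writeLong(ByteArrayOutputStream out, long value) {
        long n = (value << 1) ^ (value >> 63);
        while ((n & ~0x7FL) != 0) {
            out.write((int) ((n & 0x7F) | 0x80));
            n >>>= 7;
        }
        out.write((int) n);
    }

    // An Avro string is its UTF-8 byte length (as a long) followed by the bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // Encoding of a plain `string` value.
    static byte[] encodeBareString(String partyId) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, partyId);
        return out.toByteArray();
    }

    // Encoding of `record PartyId { string _value; }`: the fields in
    // declaration order, with nothing added for the record itself.
    static byte[] encodePartyIdRecord(String value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, value); // the only field, _value
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bare = encodeBareString("ACME");
        byte[] wrapped = encodePartyIdRecord("ACME");
        System.out.println(Arrays.equals(bare, wrapped));
    }
}
```

So the cost of the workaround is only in the schema and the generated classes, not in the bytes on the wire.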

Related

Deserialize missing values of type to certain value

I have made a wrapper type called Skippable<'a> (an F# discriminated union, not unlike Option) specifically meant for indicating which members should be excluded when serializing types:
type Skippable<'a> =
  | Skip
  | Serialize of 'a
I have functioning converters, but during deserialization, I want missing JSON values to be deserialized to the Skip case of the DU (instead of null, as is currently happening).
I know of DefaultValueAttribute, but that only works with constant values, and besides I don't want to use an attribute on each and every Skippable-wrapped property in my DTOs.
Is it possible in some way to tell Newtonsoft.Json to populate missing values of a certain type (Skippable<'a>) with a certain value of that type (Skip)? Using converters, contract resolvers, or other methods?
Making Skippable a struct union is one way to do it, since then the default value (e.g. via Unchecked.defaultof) seems to be the first case, with any fields (none, in this case) at their default values.
[<Struct>]
type Skippable<'a> =
  | Skip
  | Serialize of 'a
// Unchecked.defaultof<Skippable<obj>> = Skip
This is part of the FSharp.JsonSkippable library, which allows you to control in a simple and strongly typed manner whether to include a given property when serializing (and determine whether a property was included when deserializing), and moreover, to control/determine exclusion separately of nullability.

Csv Type Provider convert to Json

I am using the Csv Type Provider to read data from a local csv file.
I want to export the data as JSON, so I am taking each row and serializing it using the Json.NET library with JsonConvert.SerializeObject(x).
The problem is that each row is modeled as a tuple, meaning that the column headers do not become property names when serializing. Instead I get Item1="..."; Item2="..." etc.
How can I export to Json without 'hand-rolling' a class/record type to hold the values and maintain the property names?
The TypeProvider works by providing compile time type safety. The actual code that is compiled maps (at compile time) the nice accessors to tupled values (for performance reasons, I guess). So at run time the JSON serializer sees tuples only.
AFAIK there is no way around hand-rolling records. (That is, unless we eventually get type providers that are allowed to take types as parameters, which would allow a Lift<T>-style provider, or unless the CSV type provider implementation is adjusted accordingly.)

Is it possible to use arrays as static parameters in F# type providers?

I want to create a type provider with a static parameter that is an array. I thought this might work if the array was of another primitive type (int, string etc), but this seems not to work.
As a motivating use-case, this would, for example, allow specifying header names when using a CSV type provider for .csv files without a header row.
Only primitive types can be used as type provider parameters. The current json type provider gets around that by using a comma separated list of parameters as a string.

How can I code nullable objects in Google Cloud Dataflow?

This post is intended to answer questions like the following:
Which built-in Coders support nullable values?
How can I encode nullable objects?
What about classes with nullable fields?
What about collections with null entries?
You can inspect the built-in Coders in the DataflowJavaSDK source.
Some of the default Coders do not support null values, often for efficiency. For example, DoubleCoder always encodes a double using 8 bytes; adding a bit to reflect whether the double is null would add a (padded) 9th byte to all non-null values.
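The size trade-off described above can be sketched with the standard library alone. The class and method names here are hypothetical, and the one-byte presence marker is an assumption made for illustration; this is not the SDK's DoubleCoder or its documented wire format.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration class; not part of the Dataflow SDK.
final class NullableDoubleSketch {
    // A non-nullable double always takes exactly 8 bytes.
    static byte[] encode(double value) {
        return ByteBuffer.allocate(8).putDouble(value).array();
    }

    // Assumed layout: one marker byte (0 = null, 1 = present),
    // followed by the 8-byte double when present.
    static byte[] encodeNullable(Double value) {
        if (value == null) {
            return new byte[] {0};
        }
        return ByteBuffer.allocate(9).put((byte) 1).putDouble(value).array();
    }

    static Double decodeNullable(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        if (buf.get() == 0) {
            return null;
        }
        return Double.valueOf(buf.getDouble());
    }

    public static void main(String[] args) {
        System.out.println(encode(3.5).length);            // fixed-width value
        System.out.println(encodeNullable(3.5).length);    // marker + value
        System.out.println(encodeNullable(null).length);   // marker only
    }
}
```

Every non-null value pays for the extra marker byte, which is the overhead a fixed-width coder avoids by rejecting nulls.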
It is possible to encode nullable values using the techniques outlined below.
We generally recommend using AvroCoder to encode classes. AvroCoder has support for nullable fields annotated with the org.apache.avro.reflect.Nullable annotation:
@DefaultCoder(AvroCoder.class)
class MyClass {
  @Nullable String nullableField;
}
See the TrafficMaxLaneFlow for a more complete code example.
AvroCoder also supports fields that include Null in a Union.
We recommend using NullableCoder to encode nullable objects themselves. This implements the strategy in #1.
For example, consider the following working code:
PCollection<String> output =
    p.apply(Create.of(null, "test1", null, "test2", null)
        .withCoder(NullableCoder.of(StringUtf8Coder.of())));
Nested null fields/objects are supported by many coders, as long as the nested coder supports null fields/objects.
For example, the SDK should be able to infer a working coder using the default CoderRegistry for a List<MyClass> -- it should automatically use a ListCoder with a nested AvroCoder.
Similarly, a List<String> with possibly-null entries can be encoded with the Coder:
Coder<List<String>> coder = ListCoder.of(NullableCoder.of(StringUtf8Coder.of()));
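To make the nesting idea concrete, here is a minimal standard-library sketch of how a null-aware wrapper can compose with a list coder. The SketchCoder interface and the helper names are hypothetical stand-ins, not the SDK's Coder, NullableCoder, or ListCoder classes, and the byte layout is an assumption chosen for simplicity.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of the Dataflow SDK.
final class NestedCoderSketch {
    interface SketchCoder<T> {
        void encode(T value, DataOutputStream out) throws IOException;
        T decode(DataInputStream in) throws IOException;
    }

    // A string coder that, like many basic coders, cannot handle null.
    static final SketchCoder<String> STRINGS = new SketchCoder<String>() {
        public void encode(String value, DataOutputStream out) throws IOException {
            out.writeUTF(value); // throws NullPointerException for null
        }
        public String decode(DataInputStream in) throws IOException {
            return in.readUTF();
        }
    };

    // Wraps any coder: writes a presence flag, then delegates if non-null.
    static <T> SketchCoder<T> nullable(final SketchCoder<T> inner) {
        return new SketchCoder<T>() {
            public void encode(T value, DataOutputStream out) throws IOException {
                out.writeBoolean(value != null);
                if (value != null) {
                    inner.encode(value, out);
                }
            }
            public T decode(DataInputStream in) throws IOException {
                return in.readBoolean() ? inner.decode(in) : null;
            }
        };
    }

    // Writes the element count, then each element with the nested coder.
    static <T> SketchCoder<List<T>> listOf(final SketchCoder<T> element) {
        return new SketchCoder<List<T>>() {
            public void encode(List<T> values, DataOutputStream out) throws IOException {
                out.writeInt(values.size());
                for (T v : values) {
                    element.encode(v, out);
                }
            }
            public List<T> decode(DataInputStream in) throws IOException {
                int size = in.readInt();
                List<T> result = new ArrayList<>(size);
                for (int i = 0; i < size; i++) {
                    result.add(element.decode(in));
                }
                return result;
            }
        };
    }

    // Encodes and immediately decodes a value with the given coder.
    static <T> T roundTrip(SketchCoder<T> coder, T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            coder.encode(value, new DataOutputStream(bytes));
            return coder.decode(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(null, "test1", null, "test2");
        System.out.println(roundTrip(listOf(nullable(STRINGS)), input));
    }
}
```

The list coder itself knows nothing about nulls; null entries survive only because the element coder it was given handles them, which mirrors the point that nesting works as long as the nested coder supports null.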
Finally, in some cases Coders must be deterministic, e.g., the key used for GroupByKey. In AvroCoder, the @Nullable fields are coded deterministically as long as the Coder for the base type is itself deterministic. Similarly, using NullableCoder should not affect whether an object can be encoded deterministically.

Can a record have a nullable field?

Is it legal for a record to have a nullable field such as:
type MyRec = { startDate : System.Nullable<DateTime>; }
This example does build in my project, but is this good practice if it is legal, and what problems if any does this introduce?
It is legal, but F# encourages using option types instead:
type MyRec = { startDate : option<DateTime>; }
With option you can pattern match easily and use operations that transform option values, such as mapping (Option.map), as well as abstractions such as the Maybe monad (Option.bind). Nullable does not offer these, and only value types can be made nullable.
You will notice most F# functions (such as List.choose) work with options instead of nullables. Some language features, like optional parameters, are interpreted as the F# option type.
However, in some cases, such as when you need to interact with C#, you may want to use Nullable.
When using LINQ to query a database, consider the Linq.Nullable module and the nullable operators.
F# does not allow types that are declared in F# to be null. However, if you're using types that are not defined in F#, you are still allowed to use null. This is why your code is still legal. This is needed for inter-operability, because you may need to pass null to a .NET library or accept it as a result.
But I would say it is not good practice unless you specifically need interoperability. As others pointed out, you can use the option type. However, this doesn't create an optional record field whose value you can omit when creating the record. To create a value of the record type, you still need to provide a value for the option field.
Also, you can mark a type with the AllowNullLiteral attribute, and the F# compiler will then allow null as a value for that specific type, even if it is a type declared in F#. But AllowNullLiteral can't be applied to record types.
Oh and I almost forgot to mention: option types are NOT compatible with nullable types. Something that I kind of naively expected to just work (stupid me!). See this nice SO discussion for details.
