Avro SpecificSchemaCompiler null pointer exception - avro

I am trying to generate avro java classes programmatically. I have defined avro schema file. I am invoking SpecificSchemaCompiler object with this schema and then calling compileToDestination(src,dest) method to get the generated class. However I am facing a null pointer exception.
The above issue I faced in my existing workspace. So I created a new java project in eclipse and simply tried the above approach. There it worked and I am getting the generate java file.
I am using the following schema.
{
"type": "record",
"namespace": "gov.mrm",
"name": "Customer",
"doc": "Avro Schema for our Customer",
"fields": [
{ "name": "first_name", "type": "string", "doc": "First Name of Customer" },
{ "name": "last_name", "type": "string", "doc": "Last Name of Customer" },
{ "name": "age", "type": "int", "doc": "Age at the time of registration" },
{ "name": "height", "type": "float", "doc": "Height at the time of registration in cm" },
{ "name": "weight", "type": "float", "doc": "Weight at the time of registration in kg" },
{ "name": "automated_email", "type": "boolean", "default": true, "doc": "Field indicating if the user is enrolled in marketing emails" }
]
}
public class Test {
public static void main(String[] args) throws IOException
{
SpecificCompiler compiler = new SpecificCompiler(new Schema.Parser().parse(new File("src/main/avro/customer.avsc")));
compiler.compileToDestination(new File("src/main/avro"), new File("src/main/java"));
}
}
I am getting following exception.
Exception in thread "main" java.lang.RuntimeException: java.lang.NullPointerException
at org.apache.avro.compiler.specific.SpecificCompiler.renderTemplate(SpecificCompiler.java:387)
at org.apache.avro.compiler.specific.SpecificCompiler.compile(SpecificCompiler.java:456)
at org.apache.avro.compiler.specific.SpecificCompiler.compileToDestination(SpecificCompiler.java:374)
at main.avro.Test.main(Test.java:14)
Caused by: java.lang.NullPointerException
at org.apache.velocity.runtime.RuntimeInstance.getTemplate(RuntimeInstance.java:831)
at org.apache.velocity.runtime.RuntimeInstance.getTemplate(RuntimeInstance.java:813)
at org.apache.velocity.app.VelocityEngine.getTemplate(VelocityEngine.java:470)
at org.apache.avro.compiler.specific.SpecificCompiler.renderTemplate(SpecificCompiler.java:385)
... 3 more

Related

Avro Tools Failure Expected start-union. Got VALUE_STRING

I've defined the below avro schema (car_sales_customer.avsc),
{
"type" : "record",
"name" : "topLevelRecord",
"fields" : [ {
"name": "cust_date",
"type": "string"
},
{
"name": "customer",
"type": {
"type": "array",
"items": {
"name": "customer",
"type": "record",
"fields": [
{
"name": "address",
"type": "string"
},
{
"name": "driverlience",
"type": [
"null",
"string"
],
"default": null
},
{
"name": "name",
"type": "string"
},
{
"name": "phone",
"type": "string"
}
]
}
}
}]
}
and my input json payload (car_sales_customer.json) is as follows,
{"cust_date":"2017-04-28","customer":[{"address":"SanFrancisco,CA","driverlience":"K123499989","name":"JoyceRidgely","phone":"16504378889"}]}
I'm trying to use avro-tools and convert the above json to avro using the avro schema,
java -jar ./avro-tools-1.9.2.jar fromjson --schema-file ./car_sales_customer.avsc ./car_sales_customer.json > ./car_sales_customer.avro
I get the below error when I execute the above statement,
Exception in thread "main" org.apache.avro.AvroTypeException: Expected start-union. Got VALUE_STRING
at org.apache.avro.io.JsonDecoder.error(JsonDecoder.java:514)
at org.apache.avro.io.JsonDecoder.readIndex(JsonDecoder.java:433)
at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:283)
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:187)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:259)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:247)
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:298)
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:183)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:259)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:247)
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
at org.apache.avro.tool.DataFileWriteTool.run(DataFileWriteTool.java:89)
at org.apache.avro.tool.Main.run(Main.java:66)
at org.apache.avro.tool.Main.main(Main.java:55)
Is there a solution to overcome the error?

Apache Avro UnresolvedUnionException: Not in union ["null",{"type":"int","logicalType":"date"}]: 2001-01-01

Despite examples collected here and there, I haven't been able to produce a correct Avro 1.9.1 schema for my (lomboked) class, getting the title's error message at serialization time of my LocalDate field.
Can someone please explain what I'm missing?
#Data
public class Person {
private Long id;
private String firstname;
private LocalDate birth;
private Integer votes = 0;
}
This is the schema:
{
"type": "record",
"name": "Person",
"namespace": "com.example.demo",
"fields": [
{
"name": "id",
"type": "long"
},
{
"name": "firstname",
"type": "string"
},
{
"name": "birth",
"type": [ "null", { "type": "int", "logicalType": "date" }]
},
{
"name": "votes",
"type": "int"
}]
}
The error, meaning java.time.LocalDate is not found in the union's "index named" map, is this:
org.apache.avro.UnresolvedUnionException: Not in union ["null",{"type":"int","logicalType":"date"}]: 2001-01-01
Index named map keys are "null" and "int", which seems logical.

How to use Spring-Kafka to read AVRO message with Confluent Schema registry?

How to use Spring-Kafka to read AVRO message with Confluent Schema registry? Is there any sample? I can't find it in official reference document.
Below code can read the message from customer-avro topic. Here's the AVRO schema on value i have defined as.
{
"type": "record",
"namespace": "com.example",
"name": "Customer",
"version": "1",
"fields": [
{ "name": "first_name", "type": "string", "doc": "First Name of Customer" },
{ "name": "last_name", "type": "string", "doc": "Last Name of Customer" },
{ "name": "age", "type": "int", "doc": "Age at the time of registration" },
{ "name": "height", "type": "float", "doc": "Height at the time of registration in cm" },
{ "name": "weight", "type": "float", "doc": "Weight at the time of registration in kg" },
{ "name": "automated_email", "type": "boolean", "default": true, "doc": "Field indicating if the user is enrolled in marketing emails" }
]
}
Below is a complete code snippet to read this example with manual commit.
import com.example.Customer;
import io.confluent.kafka.serializers.KafkaAvroDeserializer;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;
import java.util.Calendar;
import java.util.Collections;
import java.util.Properties;
public class KafkaAvroJavaConsumerV1Demo {
public static void main(String[] args) {
Properties properties = new Properties();
// normal consumer
properties.setProperty("bootstrap.servers","127.0.0.1:9092");
properties.put("group.id", "customer-consumer-group-v1");
properties.put("auto.commit.enable", "false");
properties.put("auto.offset.reset", "earliest");
// avro part (deserializer)
properties.setProperty("key.deserializer", StringDeserializer.class.getName());
properties.setProperty("value.deserializer", KafkaAvroDeserializer.class.getName());
properties.setProperty("schema.registry.url", "http://127.0.0.1:8081");
properties.setProperty("specific.avro.reader", "true");
KafkaConsumer<String, Customer> kafkaConsumer = new KafkaConsumer<>(properties);
String topic = "customer-avro";
kafkaConsumer.subscribe(Collections.singleton(topic));
System.out.println("Waiting for data...");
while (true){
System.out.println("Polling at " + Calendar.getInstance().getTime().toString());
ConsumerRecords<String, Customer> records = kafkaConsumer.poll(1000);
for (ConsumerRecord<String, Customer> record : records){
Customer customer = record.value();
System.out.println(customer);
}
kafkaConsumer.commitSync();
}
}
}
This does not vary if use spring kafka or native java client for kafka.Assuming you want to consume messages from a topic and the key and value are avro records you can have the below properties added to consumerproperties.
consumerProperties.put("key.deserializer", "io.confluent.kafka.serializers.KafkaAvroDeserializer");
consumerProperties.put("value.deserializer", "io.confluent.kafka.serializers.KafkaAvroDeserializer");
consumerProperties.put("schema.registry.url", KAFKA_SCHEMA_REGISTRY_URL);
If you are consuming a specific record from the topic you need to add the following line as well
consumerProperties.put(KafkaAvroDeserializerConfig.SPECIFIC_AVRO_READER_CONFIG, true);

Avro schema definition nesting types

I am fairly new to Avro and going through documentation for nested types. I have the example below working nicely but many different types within the model will have addresses. Is it possible to define an address.avsc file and reference that as a nested type? If that is possible, can you also take it a step further and have a list of Addresses for a Customer? Thanks in advance.
{"namespace": "com.company.model",
"type": "record",
"name": "Customer",
"fields": [
{"name": "firstname", "type": "string"},
{"name": "lastname", "type": "string"},
{"name": "email", "type": "string"},
{"name": "phone", "type": "string"},
{"name": "address", "type":
{"type": "record",
"name": "AddressRecord",
"fields": [
{"name": "streetaddress", "type": "string"},
{"name": "city", "type": "string"},
{"name": "state", "type": "string"},
{"name": "zip", "type": "string"}
]}
}
]
}
There are 4 possible ways:
Including it in pom file as mentioned in this ticket.
Declare all your types in a single avsc file.
Using a single static parser that first parses all the imports and then parse the actual data types.
(This is a hack) Use avdl file and use imports like https://avro.apache.org/docs/1.7.7/idl.html#imports . Though, IDL is intended for RPC calls.
Example for 2. Declare all your types in a single avsc file. Also answers array declaration on address.
[
{
"type": "record",
"namespace": "com.company.model",
"name": "AddressRecord",
"fields": [
{
"name": "streetaddress",
"type": "string"
},
{
"name": "city",
"type": "string"
},
{
"name": "state",
"type": "string"
},
{
"name": "zip",
"type": "string"
}
]
},
{
"namespace": "com.company.model",
"type": "record",
"name": "Customer",
"fields": [
{
"name": "firstname",
"type": "string"
},
{
"name": "lastname",
"type": "string"
},
{
"name": "email",
"type": "string"
},
{
"name": "phone",
"type": "string"
},
{
"name": "address",
"type": {
"type": "array",
"items": "com.company.model.AddressRecord"
}
}
]
},
{
"namespace": "com.company.model",
"type": "record",
"name": "Customer2",
"fields": [
{
"name": "x",
"type": "string"
},
{
"name": "y",
"type": "string"
},
{
"name": "address",
"type": {
"type": "array",
"items": "com.company.model.AddressRecord"
}
}
]
}
]
Example for 3. Using a single static parser
Parser parser = new Parser(); // Make this static and reuse
parser.parse(<location of address.avsc file>);
parser.parse(<location of customer.avsc file>);
parser.parse(<location of customer2.avsc file>);
If we want a hold of the Schema, that is if we want to create new records, we can either do
https://avro.apache.org/docs/1.5.4/api/java/org/apache/avro/Schema.Parser.html#getTypes() method to get the schema
or
Parser parser = new Parser(); // Make this static and reuse
Schema addressSchema =parser.parse(<location of address.avsc file>);
Schema customerSchema=parser.parse(<location of customer.avsc file>);
Schema customer2Schema =parser.parse(<location of customer2.avsc file>);
Just to added to #Princey James answer, the nested type must be defined before it is used.
Other add to #Princey James
With the Example for 2. Declare all your types in a single avsc file.
It will work for Serializing and deserializing with code generation
but Serializing and deserializing without code generation is not working
you will get org.apache.avro.AvroRuntimeException: Not a record schema: [{"type":" ...
working example with code generation :
#Test
public void avroWithCode() throws IOException {
UserPerso UserPerso3 = UserPerso.newBuilder()
.setName("Charlie")
.setFavoriteColor("blue")
.setFavoriteNumber(null)
.build();
AddressRecord adress = AddressRecord.newBuilder()
.setStreetaddress("mo")
.setCity("Paris")
.setState("IDF")
.setZip("75")
.build();
ArrayList<AddressRecord> li = new ArrayList<>();
li.add(adress);
Customer cust = Customer.newBuilder()
.setUser(UserPerso3)
.setPhone("0101010101")
.setAddress(li)
.build();
String fileName = "cust.avro";
File a = new File(fileName);
DatumWriter<Customer> customerDatumWriter = new SpecificDatumWriter<>(Customer.class);
DataFileWriter<Customer> dataFileWriter = new DataFileWriter<>(customerDatumWriter);
dataFileWriter.create(cust.getSchema(), new File(fileName));
dataFileWriter.append(cust);
dataFileWriter.close();
DatumReader<Customer> custDatumReader = new SpecificDatumReader<>(Customer.class);
DataFileReader<Customer> dataFileReader = new DataFileReader<>(a, custDatumReader);
Customer cust2 = null;
while (dataFileReader.hasNext()) {
cust2 = dataFileReader.next(cust2);
System.out.println(cust2);
}
}
without :
#Test
public void avroWithoutCode() throws IOException {
Schema schemaUserPerso = new Schema.Parser().parse(new File("src/main/resources/avroTest/user.avsc"));
Schema schemaAdress = new Schema.Parser().parse(new File("src/main/resources/avroTest/user.avsc"));
Schema schemaCustomer = new Schema.Parser().parse(new File("src/main/resources/avroTest/user.avsc"));
System.out.println(schemaUserPerso);
GenericRecord UserPerso3 = new GenericData.Record(schemaUserPerso);
UserPerso3.put("name", "Charlie");
UserPerso3.put("favorite_color", "blue");
UserPerso3.put("favorite_number", null);
GenericRecord adress = new GenericData.Record(schemaAdress);
adress.put("streetaddress", "mo");
adress.put("city", "Paris");
adress.put("state", "IDF");
adress.put("zip", "75");
ArrayList<GenericRecord> li = new ArrayList<>();
li.add(adress);
GenericRecord cust = new GenericData.Record(schemaCustomer);
cust.put("user", UserPerso3);
cust.put("phone", "0101010101");
cust.put("address", li);
String fileName = "cust.avro";
File file = new File(fileName);
DatumWriter<GenericRecord> datumWriter = new GenericDatumWriter<>(schemaCustomer);
DataFileWriter<GenericRecord> dataFileWriter = new DataFileWriter<>(datumWriter);
dataFileWriter.create(schemaCustomer, file);
dataFileWriter.append(cust);
dataFileWriter.close();
File a = new File(fileName);
DatumReader<GenericRecord> datumReader = new GenericDatumReader<>(schemaCustomer);
DataFileReader<GenericRecord> dataFileReader = new DataFileReader<>(a, datumReader);
GenericRecord cust2 = null;
while (dataFileReader.hasNext()) {
cust2 = dataFileReader.next(cust2);
System.out.println(cust2);
}
}

How do I use the swagger models section?

Inside the Swagger API Documentation there is inside the json beside the apis array a model object entry but no documentation about it. How can I use this "models" part?
{
apiVersion: "0.2",
swaggerVersion: "1.1",
basePath: "http://petstore.swagger.wordnik.com/api",
resourcePath: "/pet.{format}"
...
apis: [...]
models: {...}
}
Models are nothing but like your POJO classes in java which have variables and properties. In models section you can define your own custom class and you can refer it as data type.
If you see below
{
"path": "/pet.{format}",
"description": "Operations about pets",
"operations": [
{
"httpMethod": "POST",
"summary": "Add a new pet to the store",
"responseClass": "void",
"nickname": "addPet",
"parameters": [
{
"description": "Pet object that needs to be added to the store",
"paramType": "body",
"required": true,
"allowMultiple": false,
"dataType": "Pet"
}
],
"errorResponses": [
{
"code": 405,
"reason": "Invalid input"
}
]
}
Here in parameter section it have one parameter who's dataType is Pet and pet is defined in models as below
{
"models": {
"Pet": {
"id": "Pet",
"properties": {
"id": {
"type": "long"
},
"status": {
"allowableValues": {
"valueType": "LIST",
"values": [
"available",
"pending",
"sold"
]
},
"description": "pet status in the store",
"type": "string"
},
"name": {
"type": "string"
},
"photoUrls": {
"items": {
"type": "string"
},
"type": "Array"
}
}
}
}}
You can have nested models , for more information see Swagger PetStore example
So models are nothing but like classes.

Resources