How to evolve a schema in Confluent Schema Registry (Avro)

I'm working with Kafka and Avro, using Confluent's Schema Registry.
I created some subjects and basic schemas with simple types using .avsc and .avdl files.
I'm following Confluent's API documentation to try to evolve a schema to version 2, particularly this part:
https://docs.confluent.io/current/schema-registry/using.html#register-a-new-version-of-a-schema-under-the-subject-kafka-key
But when I POST to this endpoint I get a 422 error.
I'm using BACKWARD compatibility, and the new version only adds one field. The previous version:
{
  "type": "record",
  "name": "Address",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "street", "type": "string"}
  ]
}
And the new version:
{
  "type": "record",
  "name": "Address",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "street", "type": "string"},
    {"name": "number", "type": "int"}
  ]
}
Can anyone tell me how to evolve a schema?

Solved it by setting
access.control.allow.methods=GET,POST,OPTIONS,PUT,DELETE
I had previously enabled CORS with a wildcard (*) origin, but it seems that alone only allows GET, so after adding this property I was able to change/evolve the schema.
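For reference, a minimal sketch of the registration call with curl, assuming the registry runs on localhost:8081 and the subject is named Address-value (both are assumptions); a default is added to the new number field so the change stays BACKWARD-compatible:

# Register version 2 of the Address schema under the (assumed) subject "Address-value".
# The escaped string under "schema" is the new Avro schema; the response returns the schema id.
curl -X POST \
  -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{"schema": "{\"type\":\"record\",\"name\":\"Address\",\"fields\":[{\"name\":\"id\",\"type\":\"string\"},{\"name\":\"street\",\"type\":\"string\"},{\"name\":\"number\",\"type\":\"int\",\"default\":0}]}"}' \
  http://localhost:8081/subjects/Address-value/versions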

Related

Replace an enum field with a string in an Avro schema

I need a little help removing an enum and replacing it with a string in an Avro schema.
I have an Avro schema file which has something like this among other entries:
{
  "name": "anonymizedLanguage",
  "type": [
    "null",
    "com.publicevents.common.LanguageCode"
  ],
  "default": null
}
The LanguageCode type is defined in its own .avsc file as below:
{
  "name": "LanguageCode",
  "type": "enum",
  "namespace": "com.publicevents.common",
  "symbols": [
    "EN",
    "NL",
    "FR",
    "ES"
  ]
}
I want to remove the language enum and replace it with a string holding the language code. How would I go about doing that?
You can only do "type": ["null", "string"]. You cannot make the schema itself constrain the value to specific language codes; that's what an enum is for. Once it is a plain string, enforcing specific values becomes app-specific validation logic.
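For illustration, the changed field could look like this (a minimal sketch, assuming the field keeps its name and stays nullable):

{
  "name": "anonymizedLanguage",
  "type": ["null", "string"],
  "default": null
}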

How does schema evolution in the JDBC Kafka Connector work?

I've configured my Confluent Schema Registry's compatibility to BACKWARD_TRANSITIVE. I'm using the Confluent JDBC connector to pull incremental changes from a MySQL DB.
Suppose my initial schema v1 looks like this:
{"connect.name": "customers",
"type": "record",
"name": "user",
"fields": [
{"name": "name", "type": "string"},
{"name": "favorite_number", "type": "int"},
{"name": "favorite_color", "type": "string", "default": null}
]
}
I perform the following actions on the customers table in my database and schedule the JDBC connector to pull data after each action (I also push some DML after each action so events are captured in the Kafka topic):
1. I drop the (optional) favorite_color column from the customers table.
2. I add a new (optional) column, address, to the customers table.
When action 1 is performed I observe no change in my Avro schema. However, when action 2 is performed, I notice that the schema version bumps up to v2 with the changes below:
{"connect.name": "customers",
"type": "record",
"name": "user",
"fields": [
{"name": "name", "type": "string"},
{"name": "favorite_number", "type": "int"},
{"name": "address", "type": "string", "default": null}
]
}
The favorite_color column got dropped from the schema and the address column got introduced.
I noticed similar behaviour for non-optional columns as well.
Why didn't the Schema Registry update the schema when action 1 was performed?
Why did it update when action 2 was performed?
Could you walk me through which kinds of schema changes get registered eagerly and which get registered lazily?
The behaviour is the same irrespective of compatibility types.
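If you want to verify what actually got registered after each action, you can list the subject's versions directly; a sketch, assuming the subject is named customers-value and the registry runs on localhost:8081 (both assumptions):

# List all schema versions registered under the (assumed) subject
curl http://localhost:8081/subjects/customers-value/versions

# Fetch the latest registered schema to compare against the current table definition
curl http://localhost:8081/subjects/customers-value/versions/latest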

Swagger/Swashbuckle not generating Tags element at root

We have Swagger set up on our .NET API (not Core) using Swashbuckle.
I'm looking at LucyBot to make a nicer-looking documentation page. Looking at their sample, their OpenAPI file has a 'tags' element at the root, which is used to split the display into groups. Ours (/swagger/docs/v1) has no such element. I've tried playing around with everything I can see in SwaggerConfig.cs, but am having no luck.
Is there an easy way to auto-generate this? Some option, or comments, or something I'm just overlooking? Their sample looks like this:
"swagger": "2.0",
"info": {
"description": "This is a demo of [LucyBot's API Documentation](http:\/\/lucybot.com) using swagger.io's Petstore server. You can find out more about Swagger at [http:\/\/swagger.io](http:\/\/swagger.io) or on [irc.freenode.net, #swagger](http:\/\/swagger.io\/irc\/). For this sample, you can use the api key `special-key` to test the authorization filters.\n\nTo use this documentation for your own API, visit [http:\/\/lucybot.com](http:\/\/lucybot.com)",
"version": "1.0.0",
"title": "Swagger Petstore",
"termsOfService": "http:\/\/swagger.io\/terms\/",
"contact": {
"email": "apiteam#swagger.io"
},
"license": {
"name": "Apache 2.0",
"url": "http:\/\/www.apache.org\/licenses\/LICENSE-2.0.html"
}
},
"host": "petstore.swagger.io",
"basePath": "\/v2",
"tags": [
{
"name": "pet",
"description": "Everything about your Pets",
"externalDocs": {
"description": "Find out more",
"url": "http:\/\/swagger.io"
}
},
{
"name": "store",
"description": "Access to Petstore orders"
},
{
"name": "user",
"description": "Operations about user",
"externalDocs": {
"description": "Find out more about our store",
"url": "http:\/\/swagger.io"
}
}
],```
Looking at the Swashbuckle code for the externalDocs:
https://github.com/domaindrivendev/Swashbuckle/blob/master/Swashbuckle.Core/Swagger/SwaggerDocument.cs#L134
Good thing it is there in the definitions, but it is not used anywhere...
I think your only option would be to use an IDocumentFilter and inject your missing tags.
here is an example of how to use those filters:
https://github.com/domaindrivendev/Swashbuckle/blob/5489aca0d2dd7946f5569341f621f581720d4634/Swashbuckle.Dummy.Core/SwaggerExtensions/AppendVersionToBasePath.cs
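For illustration, a minimal sketch of such a document filter, assuming Swashbuckle (non-Core) and placeholder tag names; adjust the tags to whatever groups your operations use:

using System.Collections.Generic;
using System.Web.Http.Description;
using Swashbuckle.Swagger;

// Hypothetical filter that injects a root-level "tags" array into the generated document.
// Register it in SwaggerConfig.cs inside EnableSwagger: c.DocumentFilter<InjectTagsDocumentFilter>();
public class InjectTagsDocumentFilter : IDocumentFilter
{
    public void Apply(SwaggerDocument swaggerDoc, SchemaRegistry schemaRegistry, IApiExplorer apiExplorer)
    {
        // Tag names and descriptions are placeholders; use the names your operations are grouped by.
        swaggerDoc.tags = new List<Tag>
        {
            new Tag { name = "pet", description = "Everything about your Pets" },
            new Tag { name = "store", description = "Access to Petstore orders" }
        };
    }
}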

How does one parse nested Avro records correctly in NiFi?

I have incoming Avro records that roughly follow the format below. I am able to read them and convert them in existing NiFi flows. However, a recent change requires me to read from these files and parse the nested record (employers in this example). I read the Apache NiFi blog post Record-Oriented Data with NiFi, but was unable to figure out how to get the AvroRecordReader to parse nested records.
{
  "name": "recordFormatName",
  "namespace": "nifi.examples",
  "type": "record",
  "fields": [
    { "name": "id", "type": "int" },
    { "name": "firstName", "type": "string" },
    { "name": "lastName", "type": "string" },
    { "name": "email", "type": "string" },
    { "name": "gender", "type": "string" },
    { "name": "employers",
      "type": "record",
      "fields": [
        {"name": "company", "type": "string"},
        {"name": "guid", "type": "string"},
        {"name": "streetaddress", "type": "string"},
        {"name": "city", "type": "string"}
      ]
    }
  ]
}
What I hope to achieve is a flow that reads the employers record for each recordFormatName record and uses the PutDatabaseRecord processor to keep track of the employers values seen. The current plan is to insert the records into a MySQL database. As suggested in an answer below, I plan on using PartitionRecord to sort the records based on a value in the employers subrecord. I do not need the top-level details for this particular flow.
I have tried to parse with the AvroRecordReader but cannot figure out how to specify the nested records. Is this something that can be accomplished with the AvroRecordReader alone, or does preprocessing, say a JOLT transform, need to happen first?
EDIT: Added further details about the database after receiving a response.
What is your target DB and what does your target table look like? PutDatabaseRecord may not be able to handle nested records unless your DB, driver, and target table support them.
Alternatively you may need to use UpdateRecord to flatten the "employers" object into fields at the top level of the record. This is a manual process (until NIFI-4398 is implemented), but you only have 4 fields. After flattening the records, you could use PartitionRecord to get all records with a specific value for, say, employers.company. The outgoing flow files from PartitionRecord would technically constitute the distinct values for the partition field(s). I'm not sure what you're doing with the distinct values, but if you can elaborate I'd be happy to help.
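For illustration, a rough sketch of that flattening step, assuming Avro reader/writer services are already configured and the writer schema declares the new top-level fields (the property names below are user-defined RecordPaths, not built-in settings):

UpdateRecord (Replacement Value Strategy: Record Path Value)
  /company        =  /employers/company
  /guid           =  /employers/guid
  /streetaddress  =  /employers/streetaddress
  /city           =  /employers/city

PartitionRecord (one user-defined property per partition field; its value becomes a flow file attribute)
  company         =  /company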

Google Cloud Endpoints REST Discovery Document missing format

I've upgraded to Cloud Endpoints 2.0 which no longer supports RPC. Therefore, I generated a new discovery document and used the service generator with the REST discovery doc as input in order to generate the client library for my iOS app.
Using the new REST discovery doc I am getting the following error when trying to generate the library:
~/workspace/google-api-objectivec-client-for-rest/Source/Tools/ServiceGenerator/build/Release/ServiceGenerator discovery/servUsApi-v1-rest.discovery --outputDir GTLAPI --gtlrFrameworkName GoogleAPIClientForREST
ERROR: Failure, exception: Looking at parameter 'creditKickbackKash:creditAmount', found a type/format pair of 'number/(null)', and don't how to map that to Objective-C
I was able to manually fix this by adding, in numerous places in the discovery doc, the "format": "double" key and value for all double parameters. Notice that creditAmount below is missing a format, like all the other doubles.
The generated discovery doc looks like this:
"creditKickbackKash": {
"httpMethod": "PUT",
"id": "servUsApi.admin.creditKickbackKash",
"parameterOrder": [
"userId",
"creditAmount"
],
"parameters": {
"userId": {
"format": "int64",
"location": "path",
"required": true,
"type": "string"
},
"creditAmount": {
"location": "path",
"required": true,
"type": "number"
}
},
"path": "creditKickbackKash/{userId}/{creditAmount}",
"response": {
"$ref": "ResultDTO"
},
"scopes": [
"https://www.googleapis.com/auth/userinfo.email"
]
}
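For reference, the hand-edited parameter that generated correctly looked like this (the rest of the method entry is unchanged):

"creditAmount": {
  "format": "double",
  "location": "path",
  "required": true,
  "type": "number"
}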
Is anyone else having this issue? How can I get the discovery document generation to properly format the document including double number types?
I had the same problem. I rolled back from 1.9.50 to 1.9.48 and the problem is gone.
