No Type Error when trying to transform JSON to Avro

I'm trying to convert a JSON payload to Avro to publish to a Kafka topic. However, when I run the DataWeave transformation I get a "No type" error, and I'm not sure what's causing it. I originally thought it might be due to the transformation not knowing the MIME type of the inbound payload, so I made sure it's set to application/json, but that made no difference.
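(A minimal sketch of pinning the reader type inside the script itself with a DataWeave input directive, in case the flow doesn't set the MIME type upstream; the full transformation appears further below:)
%dw 2.2
input payload application/json
output application/avro schemaUrl="http://schema-registry.domain.com:8081/subjects/Coupon-value/versions/1"
---
payload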
Avro Schema
{
  "compatibility": "forward",
  "name": "ContentManagerCoupons",
  "type": "record",
  "namespace": "com.rentpath",
  "fields": [
    {"name": "clientID", "type": "string"},
    {"name": "outputHistoryId", "type": "string"},
    {
      "name": "categoryCoupons",
      "type": {
        "type": "array",
        "items": {
          "name": "categoryCoupons_record",
          "type": "record",
          "fields": [
            {"name": "applyBy", "type": ["string", "int", "null"]},
            {"name": "applyPeriod", "type": ["string", "null"]},
            {"name": "cashValue", "type": ["int", "null"]},
            {"name": "couponCategory", "type": "string"},
            {"name": "cashOffDesc", "type": ["string", "null"]},
            {"name": "endDate", "type": ["string", "null"]},
            {"name": "feeType", "type": ["string", "null"]},
            {"name": "freeWeeks", "type": ["string", "null"]},
            {"name": "generatedText", "type": "string"},
            {"name": "leaseby", "type": ["string", "null"]},
            {"name": "leaseTerm", "type": ["int", "null"]},
            {"name": "offerText", "type": ["string", "null"]},
            {"name": "startDate", "type": "string"},
            {"name": "unitType", "type": ["string", "null"]}
          ]
        }
      }
    }
  ]
}
JSON Message
{
  "outputHistoryId": "55324456",
  "clientID": "112345",
  "categoryCoupons": [
    {
      "unitType": null,
      "startDate": "07/21/2020",
      "offerText": "This would be the special offer message.",
      "leaseTerm": null,
      "leaseby": null,
      "generatedText": "This would be the special offer message..",
      "freeWeeks": null,
      "feeType": null,
      "endDate": "10/01/2020",
      "couponCategory": "Special Offer",
      "cashValue": null,
      "cashOffDesc": null,
      "applyPeriod": null,
      "applyBy": null
    }
  ]
}
DataWeave
%dw 2.2
output application/avro schemaUrl="http://schema-registry.domain.com:8081/subjects/Coupon-value/versions/1"
---
payload
Error Message
"org.apache.avro.SchemaParseException - No type: {"subject":"ContentManager.Coupon-value","version":1,"id":342,"schema":"{"type":"record","name":"ContentManagerCoupons","namespace":"com.rentpath","fields":[{"name":"clientID","type":"string"},{"name":"outputHistoryId","type":"string"},{"name":"categoryCoupons","type":{"type":"array","items":{"type":"record","name":"categoryCoupons_record","fields":[{"name":"applyBy","type":["string","int","null"]},{"name":"applyPeriod","type":["string","null"]},{"name":"cashValue","type":["int","null"]},{"name":"couponCategory","type":"string"},{"name":"cashOffDesc","type":["string","null"]},{"name":"endDate","type":["string","null"]},{"name":"feeType","type":["string","null"]},{"name":"freeWeeks","type":["string","null"]},{"name":"generatedText","type":"string"},{"name":"leaseby","type":["string","null"]},{"name":"leaseTerm","type":["int","null"]},{"name":"offerText","type":["string","null"]},{"name":"startDate","type":"string"},{"name":"unitType","type":["string","null"]}]}}}],"compatibility":"forward"}"}
org.apache.avro.SchemaParseException: No type: {"subject":"ContentManager.Coupon-value","version":1,"id":342,"schema":"{"type":"record","name":"ContentManagerCoupons","namespace":"com.rentpath","fields":[{"name":"clientID","type":"string"},{"name":"outputHistoryId","type":"string"},{"name":"categoryCoupons","type":{"type":"array","items":{"type":"record","name":"categoryCoupons_record","fields":[{"name":"applyBy","type":["string","int","null"]},{"name":"applyPeriod","type":["string","null"]},{"name":"cashValue","type":["int","null"]},{"name":"couponCategory","type":"string"},{"name":"cashOffDesc","type":["string","null"]},{"name":"endDate","type":["string","null"]},{"name":"feeType","type":["string","null"]},{"name":"freeWeeks","type":["string","null"]},{"name":"generatedText","type":"string"},{"name":"leaseby","type":["string","null"]},{"name":"leaseTerm","type":["int","null"]},{"name":"offerText","type":["string","null"]},{"name":"startDate","type":"string"},{"name":"unitType","type":["string","null"]}]}}}],"compatibility":"forward"}"}
at org.apache.avro.Schema.getRequiredText(Schema.java:1753)
at org.apache.avro.Schema.parse(Schema.java:1604)
at org.apache.avro.Schema$Parser.parse(Schema.java:1394)
at org.apache.avro.Schema$Parser.parse(Schema.java:1365)
at org.mule.weave.v2.module.avro.AvroWriter.doWriteValue(AvroWriter.scala:195)
at org.mule.weave.v2.module.writer.Writer.writeValue(Writer.scala:41)
at org.mule.weave.v2.module.writer.Writer.writeValue$(Writer.scala:39)
at org.mule.weave.v2.module.avro.AvroWriter.writeValue(AvroWriter.scala:44)
at org.mule.weave.v2.module.writer.DeferredWriter.doWriteValue(DeferredWriter.scala:73)
at org.mule.weave.v2.module.writer.Writer.writeValue(Writer.scala:41)
at org.mule.weave.v2.module.writer.Writer.writeValue$(Writer.scala:39)
at org.mule.weave.v2.module.writer.DeferredWriter.writeValue(DeferredWriter.scala:16)
at org.mule.weave.v2.module.writer.WriterHelper$.writeValue(Writer.scala:120)
at org.mule.weave.v2.module.writer.WriterHelper$.writeAndGetResult(Writer.scala:98)
at org.mule.weave.v2.interpreted.InterpretedMappingExecutableWeave.write(InterpreterMappingCompilerPhase.scala:236)
at org.mule.weave.v2.el.WeaveExpressionLanguageSession.evaluateWithTimeout(WeaveExpressionLanguageSession.scala:243)
at org.mule.weave.v2.el.WeaveExpressionLanguageSession.evaluate(WeaveExpressionLanguageSession.scala:108)
at org.mule.runtime.core.internal.el.dataweave.DataWeaveExpressionLanguageAdaptor$1.evaluate(DataWeaveExpressionLanguageAdaptor.java:308)
at org.mule.runtime.core.internal.el.DefaultExpressionManagerSession.evaluate(DefaultExpressionManagerSession.java:105)
at com.mulesoft.mule.runtime.core.internal.processor.SetPayloadTransformationTarget.process(SetPayloadTransformationTarget.java:32)
at com.mulesoft.mule.runtime.core.internal.processor.TransformMessageProcessor.lambda$0(TransformMessageProcessor.java:92)
at java.util.Optional.ifPresent(Optional.java:159)
at com.mulesoft.mule.runtime.core.internal.processor.TransformMessageProcessor.process(TransformMessageProcessor.java:92)
at org.mule.runtime.core.api.util.func.CheckedFunction.apply(CheckedFunction.java:25)
at org.mule.runtime.core.api.rx.Exceptions.lambda$checkedFunction$2(Exceptions.java:84)
at org.mule.runtime.core.internal.util.rx.Operators.lambda$nullSafeMap$0(Operators.java:47)
at reactor.core.publisher.FluxHandleFuseable$HandleFuseableSubscriber.onNext(FluxHandleFuseable.java:165)
at org.mule.runtime.core.privileged.processor.chain.AbstractMessageProcessorChain$2.onNext(AbstractMessageProcessorChain.java:425)
at org.mule.runtime.core.privileged.processor.chain.AbstractMessageProcessorChain$2.onNext(AbstractMessageProcessorChain.java:420)
at reactor.core.publisher.FluxHide$SuppressFuseableSubscriber.onNext(FluxHide.java:127)
at reactor.core.publisher.FluxPeekFuseable$PeekFuseableSubscriber.onNext(FluxPeekFuseable.java:204)
at reactor.core.publisher.FluxOnAssembly$OnAssemblySubscriber.onNext(FluxOnAssembly.java:345)
at reactor.core.publisher.FluxSubscribeOnValue$ScheduledScalar.run(FluxSubscribeOnValue.java:178)
at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:50)
at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:27)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.mule.service.scheduler.internal.AbstractRunnableFutureDecorator.doRun(AbstractRunnableFutureDecorator.java:111)
at org.mule.service.scheduler.internal.RunnableFutureDecorator.run(RunnableFutureDecorator.java:54)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748), while writing Avro at payload.

It seems to work for me, so maybe there is a problem accessing the schema. Because I don't have access to that URL, I replaced it with a local file:
output application/avro schemaUrl="classpath://schema.json"

Apparently the answer is pretty simple: I just needed to append /schema to the end of my URL. The bare versions endpoint returns the schema wrapped in registry metadata such as subject, version, and id (which is exactly the JSON visible in the error above, and why the Avro parser complains there is no type), while the /schema endpoint returns only the raw schema.
New DataWeave
%dw 2.2
output application/avro schemaUrl="http://schema-registry.domain.com:8081/subjects/Coupon-value/versions/1/schema"
---
payload
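For reference, the difference between the two Schema Registry endpoints is easy to see with curl (host name as in the question; response bodies abbreviated). The bare versions endpoint wraps the schema in registry metadata, which is not itself a valid Avro schema document, while /schema returns the schema alone:
$ curl http://schema-registry.domain.com:8081/subjects/Coupon-value/versions/1
{"subject":"ContentManager.Coupon-value","version":1,"id":342,"schema":"{\"type\":\"record\",\"name\":\"ContentManagerCoupons\", ... }"}
$ curl http://schema-registry.domain.com:8081/subjects/Coupon-value/versions/1/schema
{"type":"record","name":"ContentManagerCoupons","namespace":"com.rentpath","fields":[ ... ]}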

Related


Error creating a kafka message to producer - Expected start-union. Got VALUE_STRING
{
"namespace": "de.morris.audit",
"type": "record",
"name": "AuditDataChangemorris",
"fields": [
{"name": "employeeID", "type": "string"},
{"name": "employeeNumber", "type": ["null", "string"], "default": null},
{"name": "serialNumbers", "type": [ "null", {"type": "array", "items": "string"}]},
{"name": "correlationId", "type": "string"},
{"name": "timestamp", "type": "long", "logicalType": "timestamp-millis"},
{"name": "employmentscreening","type":{"type": "enum", "name": "employmentscreening", "symbols": ["NO","YES"]}},
{"name": "vouchercodes","type": ["null",
{
"type": "array",
"items": {
"name": "Vouchercodes",
"type": "record",
"fields": [
{"name": "voucherName","type": ["null","string"], "default": null},
{"name": "authocode","type": ["null","string"], "default": null}
]
}
}], "default": null}
]
}
When I try to create sample data in JSON format based on the above avsc for a Kafka consumer, I get the below error upon testing.
{
  "employeeID": "qtete46524",
  "employeeNumber": {
    "string": "custnumber9813"
  },
  "serialNumbers": {
    "type": "array",
    "items": ["363536623", "5846373733"]
  },
  "correlationId": "corr-656532443",
  "timestamp": 1476538955719,
  "employmentscreening": "NO",
  "vouchercodes": [
    {
      "voucherName": "skygo",
      "authocode": "A238472ASD"
    }
  ]
}
I get the below error when I run the Dataflow job in GCP:
Error message from worker: java.lang.RuntimeException: java.io.IOException: Insert failed: [{"errors":[{"debugInfo":"","location":"serialnumbers","message":"Array specified for non-repeated field: serialnumbers.","reason":"invalid"}],"index":0}]
How do I create correct sample data based on the above schema?
Read the spec
The value of a union is encoded in JSON as follows:
if its type is null, then it is encoded as a JSON null;
otherwise it is encoded as a JSON object with one name/value pair whose name is the type’s name and whose value is the recursively encoded value
So, here's the data it expects.
{
  "employeeID": "qtete46524",
  "employeeNumber": {
    "string": "custnumber9813"
  },
  "serialNumbers": {"array": [
    "serialNumbers3521"
  ]},
  "correlationId": "corr-656532443",
  "timestamp": 1476538955719,
  "employmentscreening": "NO",
  "vouchercodes": {"array": [
    {
      "voucherName": {"string": "skygo"},
      "authocode": {"string": "A238472ASD"}
    }
  ]}
}
With this schema
{
  "namespace": "de.morris.audit",
  "type": "record",
  "name": "AuditDataChangemorris",
  "fields": [
    {"name": "employeeID", "type": "string"},
    {"name": "employeeNumber", "type": ["null", "string"], "default": null},
    {"name": "serialNumbers", "type": ["null", {"type": "array", "items": "string"}]},
    {"name": "correlationId", "type": "string"},
    {"name": "timestamp", "type": {"type": "long", "logicalType": "timestamp-millis"}},
    {"name": "employmentscreening", "type": {"type": "enum", "name": "employmentscreening", "symbols": ["NO", "YES"]}},
    {
      "name": "vouchercodes",
      "type": [
        "null",
        {
          "type": "array",
          "items": {
            "name": "Vouchercodes",
            "type": "record",
            "fields": [
              {"name": "voucherName", "type": ["null", "string"], "default": null},
              {"name": "authocode", "type": ["null", "string"], "default": null}
            ]
          }
        }
      ],
      "default": null
    }
  ]
}
Here's an example of producing to and consuming from Kafka:
$ jq -rc < /tmp/data.json | kafka-avro-console-producer --topic foobar --property value.schema="$(jq -rc < /tmp/data.avsc)" --bootstrap-server localhost:9092 --sync
$ kafka-avro-console-consumer --topic foobar --from-beginning --bootstrap-server localhost:9092 | jq
{
  "employeeID": "qtete46524",
  "employeeNumber": {
    "string": "custnumber9813"
  },
  "serialNumbers": {
    "array": [
      "serialNumbers3521"
    ]
  },
  "correlationId": "corr-656532443",
  "timestamp": 1476538955719,
  "employmentscreening": "NO",
  "vouchercodes": {
    "array": [
      {
        "voucherName": {
          "string": "skygo"
        },
        "authocode": {
          "string": "A238472ASD"
        }
      }
    ]
  }
}
^CProcessed a total of 1 messages

Avro schema cannot deserialize autoregistered avro schema by connector

We are trying to consume a topic that has data emitted by a connector. We are using a handwritten schema that matches the data in the topic.
{
  "type": "record",
  "name": "Event",
  "namespace": "com.example.avro",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "type", "type": ["null", "string"], "default": null},
    {"name": "entity_id", "type": ["null", "string"], "default": null},
    {"name": "emitted_at", "type": ["null", "string"], "default": null},
    {"name": "data", "type": ["null", "string"], "default": null}
  ]
}
Unfortunately, it cannot deserialize this because of the schema the connector auto-registered:
{
  "type": "record",
  "name": "Value",
  "namespace": "postgres.public.events",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "type", "type": ["null", "string"], "default": null},
    {"name": "entity_id", "type": ["null", "string"], "default": null},
    {
      "name": "emitted_at",
      "type": [
        "null",
        {
          "type": "string",
          "connect.version": 1,
          "connect.name": "io.debezium.time.ZonedTimestamp"
        }
      ],
      "default": null
    },
    {
      "name": "data",
      "type": [
        "null",
        {
          "type": "string",
          "connect.version": 1,
          "connect.name": "io.debezium.data.Json"
        }
      ],
      "default": null
    }
  ],
  "connect.name": "postgres.public.events.Value"
}
We are getting the following error:
Caused by: org.apache.kafka.common.errors.SerializationException: Could not find class postgres.public.events.Value specified in writer's schema whilst finding reader's schema for a SpecificRecord.
How do we resolve this issue?
You can either download the schema from the registry instead of defining your own (there are Maven plugins to do this), or change the namespace and name of your own schema so that the generated class matches the writer's.
Adding an alias might work as well, but I've not had much experience/luck with that, personally.
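For instance, pulling the writer's schema straight from the registry (registry URL is an assumption here; Debezium typically registers the value schema under <topic>-value) could look like:
$ curl http://localhost:8081/subjects/postgres.public.events-value/versions/latest/schema > src/main/avro/Value.avsc
And the alias route, per the Avro spec, declares the writer's full name on your own record so schema resolution can match the two names (a sketch, fields elided):
{
  "type": "record",
  "name": "Event",
  "namespace": "com.example.avro",
  "aliases": ["postgres.public.events.Value"],
  "fields": [ ... ]
}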

Avro schema getting undefined type name when using Record type

So I'm trying to parse an object with this Avro schema.
The object is like:
myInfo: {size: 'XL'}
But it's behaving like the record type doesn't actually exist, and I'm getting undefined type name: data.platform_data.test_service.result.record at Function.Type.forSchema for it.
schema looks like:
"avro": {
"metadata": {
"loadType": "full",
"version": "0.1"
},
"schema": {
"name": "data.platform_data.test_service.result",
"type": "record",
"fields": [
{
"name": "myInfo",
"type": "record",
"fields": [{
"name": "size",
"type": {"name":"size", "type": "string"}
}]
}
]
}
}
I should mention I'm also using avsc for this. Anybody have any ideas? I've tried pretty much all combinations, but AFAIK the only way of parsing out an object like this is with record.
Playing around with the schema, I found that "type": "record" was the problem. I moved it into a nested definition and it worked. It seems the description here is a little bit confusing.
Change
Before:
{
  "name": "myInfo",
  "type": "record",
  "fields": [{
    "name": "size",
    "type": {"name": "size", "type": "string"}
  }]
}
After:
{
  "name": "myInfo",
  "type": {
    "type": "record",
    "name": "myInfo",
    "fields": [
      {
        "name": "size",
        "type": {"name": "size", "type": "string"}
      }
    ]
  }
}
Updated schema which is working:
{
  "name": "data.platform_data.test_service.result",
  "type": "record",
  "fields": [
    {
      "name": "myInfo",
      "type": {
        "type": "record",
        "name": "myInfo",
        "fields": [
          {
            "name": "size",
            "type": {"name": "size", "type": "string"}
          }
        ]
      }
    }
  ]
}
To make a record attribute nullable, the process is the same as for any other attribute: you union its type with "null" (as shown in the schema below):
{
  "name": "data.platform_data.test_service.result",
  "type": "record",
  "fields": [
    {
      "name": "myInfo",
      "type": [
        "null",
        {
          "type": "record",
          "name": "myInfo",
          "fields": [
            {
              "name": "size",
              "type": {"name": "size", "type": "string"}
            }
          ]
        }
      ]
    }
  ]
}

Avro Tools Failure Expected start-union. Got VALUE_STRING

I've defined the below avro schema (car_sales_customer.avsc),
{
  "type": "record",
  "name": "topLevelRecord",
  "fields": [
    {"name": "cust_date", "type": "string"},
    {
      "name": "customer",
      "type": {
        "type": "array",
        "items": {
          "name": "customer",
          "type": "record",
          "fields": [
            {"name": "address", "type": "string"},
            {"name": "driverlience", "type": ["null", "string"], "default": null},
            {"name": "name", "type": "string"},
            {"name": "phone", "type": "string"}
          ]
        }
      }
    }
  ]
}
and my input JSON payload (car_sales_customer.json) is as follows:
{"cust_date":"2017-04-28","customer":[{"address":"SanFrancisco,CA","driverlience":"K123499989","name":"JoyceRidgely","phone":"16504378889"}]}
I'm trying to use avro-tools to convert the above JSON to Avro using the schema:
java -jar ./avro-tools-1.9.2.jar fromjson --schema-file ./car_sales_customer.avsc ./car_sales_customer.json > ./car_sales_customer.avro
I get the below error when I execute the above statement,
Exception in thread "main" org.apache.avro.AvroTypeException: Expected start-union. Got VALUE_STRING
at org.apache.avro.io.JsonDecoder.error(JsonDecoder.java:514)
at org.apache.avro.io.JsonDecoder.readIndex(JsonDecoder.java:433)
at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:283)
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:187)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:259)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:247)
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:298)
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:183)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:259)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:247)
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
at org.apache.avro.tool.DataFileWriteTool.run(DataFileWriteTool.java:89)
at org.apache.avro.tool.Main.run(Main.java:66)
at org.apache.avro.tool.Main.main(Main.java:55)
Is there a solution to overcome the error?
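This looks like the same union-encoding rule quoted in the "Read the spec" answer earlier on this page: in Avro's JSON encoding, a non-null value for the ["null", "string"] union driverlience must be wrapped in an object keyed by its type. A corrected payload sketch, with everything else unchanged:
{"cust_date":"2017-04-28","customer":[{"address":"SanFrancisco,CA","driverlience":{"string":"K123499989"},"name":"JoyceRidgely","phone":"16504378889"}]}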

Avro schema issue when record missing a field

I am using the NiFi (v1.2) processor ConvertJSONToAvro. I am not able to parse a record that contains only one of the two elements in a "record" type. This element is also allowed to be missing entirely from the data. Is my Avro schema incorrect?
Schema snippet:
"name": "personname",
"type": [
"null":,
{
"type": "record",
"name": "firstandorlast",
"fields": [
{
"name": "first",
"type": [
"null",
"string"
]
},
{
"name": "last",
"type": [
"null",
"string"
]
}
]
}
]
If "personname" contains both "first" and "last" it works, but if it only contains one of the elements, it fails with the error: Cannot convert field personname: cannot resolve union:
{ "last":"Smith" }
not in
"type": [ "null":,
{
"type": "record",
"name": "firstandorlast",
"fields": [
{
"name": "first",
"type": [
"null",
"string"
]
},
{
"name": "last",
"type": [
"null",
"string"
]
}
]
}
]
You are missing the default value
https://avro.apache.org/docs/1.8.1/spec.html#schema_record
Your schema should look like:
"name": "personname",
"type": [
"null":,
{
"type": "record",
"name": "firstandorlast",
"fields": [
{
"name": "first",
"type": [
"null",
"string"
],
"default": "null"
},
{
"name": "last",
"type": [
"null",
"string"
],
"default": "null"
}
]
}
]
