Schema issue with Kafka schema registry - avro

I hate to bother you with yet another problem, but I'm stuck on an issue that is probably my own fault, and I can't work out the solution.
I'm using a standalone installation of the Confluent platform (4.0.0 open source version) in order to demonstrate how to adopt the platform for a specific use case.
While trying to demonstrate the value of the Schema Registry, I'm facing the following issue when posting a new schema with Postman.
The request is:
URL: http://host:8081/subjects/test/versions
Method: POST
Headers:
Accept: application/vnd.schemaregistry.v1+json, application/vnd.schemaregistry+json, application/json
Content-Type: application/json
Body:
{"schema":"{{\"namespace\":\"com.testlab\",\"name\":\"test\",\"type\":\"record\",\"fields\":[{\"name\":\"resourcepath\",\"type\":\"string\"},{\"name\":\"resource\",\"type\":\"string\"}]}}" }
The response is: {"error_code":42201,"message":"Input schema is an invalid Avro schema"}
After looking at the docs and a lot of googling, I'm out of options.
Any suggestions?
Thanks for your time
R.

You have an extra pair of braces {} around the schema inside the "schema" field.
One way to test this is with jq:
Before
$ echo '{"schema":"{{\"namespace\":\"com.testlab\",\"name\":\"test\",\"type\":\"record\",\"fields\":[{\"name\":\"resourcepath\",\"type\":\"string\"},{\"name\":\"resource\",\"type\":\"string\"}]}}" }' | jq '.schema|fromjson'
jq: error (at <stdin>:1): Objects must consist of key:value pairs at line 1, column 146 (while parsing '{{"namespace":"com.testlab","name":"test","type":"record","fields":[{"name":"resourcepath","type":"string"},{"name":"resource","type":"string"}]}}')
After
$ echo '{"schema":"{\"namespace\":\"com.testlab\",\"name\":\"test\",\"type\":\"record\",\"fields\":[{\"name\":\"resourcepath\",\"type\":\"string\"},{\"name\":\"resource\",\"type\":\"string\"}]}" }' | jq '.schema|fromjson'
{
  "namespace": "com.testlab",
  "name": "test",
  "type": "record",
  "fields": [
    {
      "name": "resourcepath",
      "type": "string"
    },
    {
      "name": "resource",
      "type": "string"
    }
  ]
}
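With the extra braces removed, the same request can be sent with curl (host and subject taken from the question; the Schema Registry's own content type is used here, which it also accepts):
$ curl -X POST http://host:8081/subjects/test/versions \
-H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data '{"schema":"{\"namespace\":\"com.testlab\",\"name\":\"test\",\"type\":\"record\",\"fields\":[{\"name\":\"resourcepath\",\"type\":\"string\"},{\"name\":\"resource\",\"type\":\"string\"}]}"}'
On success the registry responds with the id assigned to the schema, e.g. {"id":1}.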
See my comment here about importing AVSC files so that you don't need to type out the JSON on the CLI

Related

How could I manage Graylog to parse my JSON logs correctly?

I have a Rails app and I'm trying to configure logging to Graylog. The pipeline consists of the following steps:
1) Logs are written to a file in JSON format by the SemanticLogger gem. A log message consists of header info (first-level tags) and a payload with several levels of hierarchy:
{
  "tag": "mortgage",
  "app": "sneakers",
  "pid": 3448,
  "env": "production",
  "host": "thesaurus-mortgage",
  "thread": "91090300",
  "level": "info",
  "name": "Sneakers",
  "payload": {
    "class": "EgrnListenerWorker",
    "method": "work",
    "json": {
      "resource": "kontur",
      "action": "request_egrn_done",
      "system_code": "thesaurus",
      "id": 35883717,
      "project_id": "mortgage",
      "bank_id": "ab",
      "params": {
        "egrn": {
          "zip": "rosreestr/kontur/kontur_4288_2018-10-11_021848.zip",
          "pdf": "rosreestr/kontur/kontur_4288_2018-10-11_021848.pdf",
          "xml": "rosreestr/kontur/kontur_4288_2018-10-11_021848.xml"
        },
        "code": "SUCCESS"
      }
    },
    "valid_json": true
  },
  "created_at": "2018-10-11T17:44:58.262+00:00"
}
2) The file is read by the Filebeat service and sent to Graylog.
Graylog does not parse the payload contents correctly:
As you can see, the keys are concatenated with ":" into one string, in the manner key1=value1:key2=value2. This is not what I expected. It would be perfect if I could get Graylog to parse the contents of payload into separate fields named payload.key1, payload.key2 and so on (so I could search on those fields).
P.S. My log data is heterogeneous, i.e. the payload contents depend on the functionality that produced them, so I expect a huge number of different "payload.xxxxx" fields. Is that OK?
This isn't exactly a Filebeat question, since Filebeat only ships the logs in their original JSON format (zipped, if wanted).
From the Graylog Website: http://docs.graylog.org/en/2.4/pages/extractors.html
Using the JSON extractor
Since version 1.2, Graylog also supports extracting data from messages sent in JSON format.
Using the JSON extractor is easy: once a Graylog input receives messages in JSON format, you can create an extractor by going to System -> Inputs and clicking on the Manage extractors button for that input. Next, you need to load a message to extract data from, and select the field containing the JSON document. The following page lets you add some extra information to tell Graylog how it should extract the information.
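As an illustrative sketch (option names and separators vary between Graylog versions): applying the JSON extractor to the field holding the JSON document, with its flatten option enabled and "_" as the key separator, should turn the nested payload from the question into individually searchable fields, roughly:
payload_class: EgrnListenerWorker
payload_method: work
payload_json_resource: kontur
payload_json_action: request_egrn_done
payload_valid_json: true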
This should get you going.

influx schema type validations

I am using InfluxDB and I want to enforce some kind of schema validation.
I had a problem where Influx learned a field with the wrong type due to a developer mistake. As a result, once we sent the right type, Influx wouldn't persist it because it had already recorded the field as another type.
Can I force field types such as String, Integer and Double?
I use Java
Regards,
Ido
We had to wait a long time to finally see this feature in a newer release.
Starting with InfluxDB v2.4, you can create a bucket (the new name for a database in InfluxDB 2.x) with an explicit schema. That is:
Create a bucket with an explicit schema (see more details here)
influx bucket create \
--name my_schema_bucket \
--schema-type explicit
Adding measurement schemas to your bucket (see more details here)
influx bucket-schema create \
--bucket my_schema_bucket \
--name temperature \
--columns-file columns.ndjson
where columns.ndjson defines the measurement's columns, similar to a DDL (NDJSON, one JSON object per line):
{"name": "time", "type": "timestamp"}
{"name": "alert", "type": "field", "dataType": "string"}
{"name": "cdi", "type": "field", "dataType": "float"}
You could refer to this blog as well.

how to import raw (not json-formatted) mqtt values in thingsboard iot-gateway?

I've searched in the documentation and in the message boards, but was unable to find the following.
I'm trying to import data from MQTT into ThingsBoard, using the IoT Gateway.
The documentation outlines how to configure the IoT Gateway to import JSON-formatted data.
{
  "topicFilter": "sensors",
  "converter": {
    "type": "json",
    "filterExpression": "",
    "deviceNameJsonExpression": "${$.serialNumber}",
    "attributes": [
      {
        "type": "string",
        "key": "model",
        "value": "${$.model}"
      }
    ],
    "timeseries": [
      {
        "type": "double",
        "key": "temperature",
        "value": "${$.temperature}"
      }
    ]
  }
}
from https://thingsboard.io/docs/iot-gateway/getting-started/#step-81-basic-mapping-example.
That mapping then works to import data published like this:
mosquitto_pub -h localhost -p 1883 -t "sensors" -m '{"serialNumber":"SN-001", "model":"T1000", "temperature":36.6}'
I am hoping it is also possible to import raw data, i.e. without JSON formatting, because I already have many topics with raw data payloads: just raw ASCII-encoded values, like this:
mosquitto_pub -h localhost -p 1883 -t "sensors/livingroom/temperature" -m '36.6'
Is that possible with the IOT gateway, and if so, what would the configuration look like?
It is possible, but you will need to implement a new converter type. The one we provide uses JSON. You can implement your own converter that accepts binary data, so your configuration would look similar to this:
{
  "topicFilter": "sensors",
  "converter": {
    "type": "binary",
    /* whatever configuration structure is applicable to your use case */
  }
}
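The conversion logic itself is up to you: with raw payloads, the device name and telemetry key have to come from the topic path rather than the message body. A hypothetical sketch of that mapping in shell (the names and the output shape are illustrative, not the gateway's actual converter API):
# hypothetical: map sensors/livingroom/temperature + "36.6" to device/key/value
$ topic="sensors/livingroom/temperature"; payload="36.6"
$ device=$(echo "$topic" | cut -d/ -f2)   # livingroom
$ key=$(echo "$topic" | cut -d/ -f3)      # temperature
$ echo "{\"device\":\"$device\",\"telemetry\":{\"$key\":$payload}}"
{"device":"livingroom","telemetry":{"temperature":36.6}}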

How to deal with GeoJson in CKAN DataStore?

Is it true that the CKAN DataStore is able to deal with GeoJSON? I've not seen any reference in the documentation except for this link about the DataStore Map visualization, which says:
Shows data stored on the DataStore in an interactive map. It supports plotting markers from a pair of latitude / longitude fields or from a field containing a GeoJSON representation of the geometries.
Thus, I'm supposing GeoJSON is accepted in DataStore columns. However, I've not found any GeoJSON CKAN type, so again I'm guessing the plain JSON type must be used for this purpose.
Can anybody confirm this? Thanks!
EDIT 1
I've created a resource, a datastore, and a "recline_map_view" associated with the resource. Then I've upserted a value, which is shown by this datastore_search operation:
$ curl -X POST "https://host:port/api/3/action/datastore_search" -d '{"resource_id":"14418d40-de42-4fdd-84f7-3c51244c7469"}' -H "Authorization: xxx" -k
{"help": "https://host:port/api/3/action/help_show?name=datastore_search", "success": true, "result": {"resource_id": "14418d40-de42-4fdd-84f7-3c51244c7469", "fields": [{"type": "int4", "id": "_id"}, {"type": "text", "id": "label"}, {"type": "json", "id": "geojson"}], "records": [{"_id": 1, "geojson": {"type": "Point", "coordinates": [48.856699999999996, 2.3508]}, "label": "Paris"}], "_links": {"start": "/api/3/action/datastore_search", "next": "/api/3/action/datastore_search?offset=100"}, "total": 1}}
Nevertheless, nothing is shown in CKAN :(
EDIT 2
It was a problem with my CKAN. I've tested Ifurini's solution at demo.ckan.org and it works.
GeoJSON is just a (particular kind of) JSON, so it does not get any special treatment as a database field.
So, you can create a resource with a GeoJSON field from a simple CSV file like this:
Name,Position
"Paris","{""type"":""Point"",""coordinates"":[2.3508,48.8567]}"
(note the double double quotes "" instead of just a single double quote ")
If you call the column "GeoJSON" (or "geojson", "gEoJsOn", etc., as capitalization is not important) the Map View will automatically use that field to mark the data in the map, instead of just letting you manually select which field to use.
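As an alternative to the CSV upload, the same kind of record can be written through the DataStore API, as in EDIT 1 of the question. A sketch with datastore_upsert (the resource id and authorization header are placeholders; method "upsert" requires a primary key on the resource, otherwise use "insert"):
$ curl -X POST "https://host:port/api/3/action/datastore_upsert" -H "Authorization: xxx" -k \
-d '{"resource_id":"14418d40-de42-4fdd-84f7-3c51244c7469","method":"upsert","records":[{"label":"Paris","geojson":{"type":"Point","coordinates":[2.3508,48.8567]}}]}'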

Nested query parameters in Swagger 2.0

I'm documenting a Rails app with Swagger 2.0 and using Swagger-UI as the human-readable documentation/sandbox solution.
I have a resource where clients can store arbitrary metadata to query later. According to the Rails convention, the query would be submitted like so:
/posts?metadata[thing1]=abc&metadata[thing2]=def
which Rails translates to params of:
{ "metadata" => { "thing1" => "abc", "thing2" => "def" } }
which can easily be used to generate the appropriate WHERE clause for the database.
Is there any support for something like this in Swagger? Ultimately, I want Swagger-UI to provide some way to modify the generated request to add arbitrary params under the metadata namespace.
This doesn't appear supported yet (over 2 years after you asked the question), but there's an ongoing discussion & open ticket about adding support for this on the OpenAPI github repo. They refer to this type of nesting as deepObjects.
There's another open issue where an implementation was attempted here. Using the most recent stable swagger-ui release, however, I have observed it working as I expect:
"parameters": [
{
"name": "page[number]",
"in": "query",
"type": "integer",
"default": 1,
"required": false
},
{
"name": "page[size]",
"in": "query",
"type": "integer",
"default": 25,
"required": false
}
This presents the expected dialog box and works with "Try it out" against a working server.
I don't believe there is a good way to specify arbitrary keys or a selection of values (e.g. an enum), so you may have to add parameters for every nesting option.
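For what it's worth, if you can move from Swagger 2.0 to OpenAPI 3.0, the deepObject style from the ticket mentioned above expresses the original metadata[thing1]=abc case directly. A sketch (note that swagger-ui support for deepObject serialization only landed in later releases):
{
  "name": "metadata",
  "in": "query",
  "style": "deepObject",
  "explode": true,
  "schema": {
    "type": "object",
    "additionalProperties": { "type": "string" }
  }
}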
