How to store business ID in elasticsearch? - twitter

I'm trying to store tweets in an Elasticsearch index using Spring Data Elasticsearch (I fetch the tweets with twitter4j).
I followed a basic example and I'm using this annotated POJO (metadata fields with complex types have been removed):
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;

@Document(indexName = "twitter", type = "tweet")
public class StorableTweet {
    @Id
    private long id;
    private String createdAt;
    private String text;
    private String source;
    private boolean isTruncated;
    private long inReplyToStatusId;
    private long inReplyToUserId;
    private boolean isFavorited;
    private boolean isRetweeted;
    private int favoriteCount;
    private String inReplyToScreenName;
    private String userScreenName = null;
    // Getters/setters removed
}
To store a tweet using this model, I use:
public interface TweetRepository extends ElasticsearchRepository<StorableTweet, Long> {
}
and, in my storage service:
tweetRepository.save(storableTweet);
It works, but my tweet ID is stored in "_id" (fine by me) and a different number that I never set is stored in "id" (why?):
{
  "_index": "twitter",
  "_type": "tweet",
  "_id": "655008947099840512",   <-- this is the real tweet id
  "_version": 1,
  "found": true,
  "_source": {
    "id": 655008947099840500,    <-- this number comes from nowhere
    "createdAt": "Fri Oct 16 15:14:37 CEST 2015",
    "text": "tweet text(...)",
    "source": "Twitter for iPhone",
    "inReplyToStatusId": -1,
    "inReplyToUserId": -1,
    "favoriteCount": 0,
    "inReplyToScreenName": null,
    "user": "971jml",
    "favorited": false,
    "retweeted": false,
    "truncated": false
  }
}
What I would like is either my tweet ID stored in "_id" (with no "id" field), or my tweet ID stored in "id" with a generated value in "_id". Either way, I want to get rid of this seemingly random number in "id".
EDIT
The mapping:
{
"twitter":
{
"mappings":
{
"tweet":
{
"properties":
{
"createdAt":
{
"type": "string"
},
"favoriteCount":
{
"type": "long"
},
"favorited":
{
"type": "boolean"
},
"inReplyToScreenName":
{
"type": "string"
},
"inReplyToStatusId":
{
"type": "long"
},
"inReplyToUserId":
{
"type": "long"
},
"retweeted":
{
"type": "boolean"
},
"source":
{
"type": "string"
},
"text":
{
"type": "string"
},
"truncated":
{
"type": "boolean"
},
"tweetId":
{
"type": "long"
},
"user":
{
"type": "string"
}
}
}
}
}
}
EDIT 2: It looks like the problem is not related to the @Id annotation but to the long type. Some other longs (not all) come back altered by a few units after being stored in Elasticsearch via Spring Data Elasticsearch.
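One possible explanation, offered here as an assumption rather than a confirmed diagnosis: if any component between the index and the screen (for example a browser-based JSON viewer) handles JSON numbers as IEEE 754 doubles, longs above 2^53 can no longer all be represented exactly, and a double is usually printed with only as many digits as are needed to identify it. The sketch below uses only the JDK; the class name `LongPrecisionDemo` and its methods are hypothetical. It checks whether a given long survives a round trip through double.

```java
final class LongPrecisionDemo {

    /** Converts a long to double and back, as a double-based JSON parser effectively would. */
    static long roundTrip(long value) {
        return (long) (double) value;
    }

    /** True if the value is unchanged by the round trip through double. */
    static boolean survivesDouble(long value) {
        return roundTrip(value) == value;
    }

    public static void main(String[] args) {
        long tweetId = 655008947099840512L;  // the "_id" value from the question
        long aboveLimit = (1L << 53) + 1;    // smallest positive long a double cannot hold exactly

        System.out.println(survivesDouble(tweetId));     // true
        System.out.println(survivesDouble(aboveLimit));  // false
        System.out.println(roundTrip(aboveLimit));       // 9007199254740992
    }
}
```

For the tweet ID in the question the round trip is lossless, which would be consistent with the stored value being intact and only its displayed form being shortened to 655008947099840500. That remains an assumption to verify, for example by reading the document back through the Java client instead of a browser tool.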

Related

Owner_id field does not pass validation error. What is wrong with my schema?

I am trying to find out what is wrong with my schema. I'm attempting to insert a document into my collection. I have gotten a slew of different errors as I've changed things around, and this is the closest I have come to a successful insert. I am using the mongodb-stitch-browser-sdk in a React Ionic project and I have a valid user logged in.
I am using StitchUser.id, which is a string, as my owner_id (it matches the id of my valid user in the users collection).
Here is my schema, followed by the error from the Stitch logs. I was simply trying to insert a document into my Goals collection. There are no filters on this collection, and there is only one role, with the following rule.
{
"owner_id": "%%user.id"
}
This gives the user read and write permissions on the collection's documents that they created.
{
"bsonType": "object",
"required": [
"goalTitle",
"startDate",
"endDate",
"owner_id"
],
"properties": {
"_id": {
"bsonType": "objectId"
},
"owner_id": {
"bsonType": "string",
"validate": {
"%or": [
{
"%%prevRoot.owner_id": {
"%exists": false
}
},
{
"%%prevRoot.owner_id": "%%this"
}
]
}
},
"goalTitle": {
"bsonType": "string",
"minLength": {
"$numberInt": "1"
},
"maxLength": {
"$numberInt": "30"
}
},
"goalDescription": {
"bsonType": "string",
"minLength": {
"$numberInt": "0"
},
"maxLength": {
"$numberInt": "600"
}
},
"startDate": {
"bsonType": "string"
},
"endDate": {
"bsonType": "string"
}
}
}
Error:
role "owner" in "todo_list.Goals" does not have insert permission for document with _id: ObjectID("5e6aa8d11d233536e3ea8604"): could not validate document:
owner_id: Does not pass validation
Stack Trace:
StitchError: insert not permitted
Details:
{
"serviceAction": "insertOne",
"serviceName": "mongodb-atlas",
"serviceType": "mongodb-atlas"
}
{
"arguments": [
{
"collection": "Goals",
"database": "todo_list",
"document": {
"goalTitle": "Test Goal",
"goalDescription": "Test Description",
"endDate": "2020-03-11",
"startDate": "2020-03-10",
"owner_id": "5e6891382e6039c1c32f7d46",
"_id": {
"$oid": "5e6aa8d11d233536e3ea8604"
}
}
}
],
"name": "insertOne",
"service": "mongodb-atlas"
}
I've created another collection with no schema and the same rule checking owner_id, and documents can be inserted into that collection just fine, so I suspect the schema is at fault.

Apache Avro UnresolvedUnionException: Not in union ["null",{"type":"int","logicalType":"date"}]: 2001-01-01

Despite examples collected here and there, I haven't been able to produce a correct Avro 1.9.1 schema for my (lomboked) class, getting the title's error message at serialization time of my LocalDate field.
Can someone please explain what I'm missing?
import java.time.LocalDate;
import lombok.Data;

@Data
public class Person {
    private Long id;
    private String firstname;
    private LocalDate birth;
    private Integer votes = 0;
}
This is the schema:
{
"type": "record",
"name": "Person",
"namespace": "com.example.demo",
"fields": [
{
"name": "id",
"type": "long"
},
{
"name": "firstname",
"type": "string"
},
{
"name": "birth",
"type": [ "null", { "type": "int", "logicalType": "date" }]
},
{
"name": "votes",
"type": "int"
}]
}
The error, which means that java.time.LocalDate was not found in the union's "index named" map, is:
org.apache.avro.UnresolvedUnionException: Not in union ["null",{"type":"int","logicalType":"date"}]: 2001-01-01
Index named map keys are "null" and "int", which seems logical.
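For context on what the union's "int" branch expects: Avro's `date` logical type annotates an int holding the number of days since 1970-01-01. One likely reading of the error is that the data model in use has no conversion registered for `LocalDate`, so the value matches neither branch. The sketch below uses only the JDK to show the mapping for the value in the error message. The class and method names are hypothetical, and the code does not configure Avro itself; it only illustrates what a conversion (or a manual conversion of the field) would have to produce.

```java
import java.time.LocalDate;

final class AvroDateSketch {

    /** Encodes a LocalDate the way Avro's "date" logical type stores it: days since 1970-01-01. */
    static int toAvroDate(LocalDate date) {
        return Math.toIntExact(date.toEpochDay());
    }

    /** Decodes an Avro "date" int back into a LocalDate. */
    static LocalDate fromAvroDate(int daysSinceEpoch) {
        return LocalDate.ofEpochDay(daysSinceEpoch);
    }

    public static void main(String[] args) {
        LocalDate birth = LocalDate.of(2001, 1, 1);  // the value from the error message
        int encoded = toAvroDate(birth);
        System.out.println(encoded);                 // 11323
        System.out.println(fromAvroDate(encoded));   // 2001-01-01
    }
}
```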

Using specific properties while referencing to schema

I'm using Swagger 2.0 and I have the following schema (definition):
"User": {
"type": "object",
"properties": {
"firstName": {
"type": "string",
"example": "Tom"
},
"lastName": {
"type": "string",
"example": "Hanks"
},
"email": {
"type": "string",
"example": "Tom.Hanks@gmail.com"
},
"password": {
"type": "string",
"example": "azerty#123456"
}
}
}
I want to refer to this schema in one of my responses, so I do the following:
"responses": {
"201": {
"description": "Created.",
"schema": {
"$ref": "#/definitions/User"
}
}
}
So far everything works perfectly, but I don't want to expose the password property in the response schema. Is there any way to choose exactly which properties I want to use from the User definition?
No, there is no way to do that. I'd suggest defining two types:
One type for user data without the password; let's name it User.
Another type that inherits from it and additionally contains a password attribute; let's name it UserWithCredential.
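In Java terms, the split suggested above could look like the following sketch. The field names come from the question's definition; the constructors, getters, the `withoutCredential` helper and the `UserModelSketch` class are assumptions added for illustration. In the Swagger 2.0 document itself, this kind of inheritance is typically expressed with `allOf`.

```java
/** Public view of a user: safe to reference from response schemas. */
class User {
    private final String firstName;
    private final String lastName;
    private final String email;

    User(String firstName, String lastName, String email) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.email = email;
    }

    String getFirstName() { return firstName; }
    String getLastName() { return lastName; }
    String getEmail() { return email; }
}

/** Adds the password; intended only for requests that carry credentials. */
class UserWithCredential extends User {
    private final String password;

    UserWithCredential(String firstName, String lastName, String email, String password) {
        super(firstName, lastName, email);
        this.password = password;
    }

    String getPassword() { return password; }

    /** Drops the password by copying only the public fields (hypothetical helper). */
    User withoutCredential() {
        return new User(getFirstName(), getLastName(), getEmail());
    }
}

final class UserModelSketch {
    public static void main(String[] args) {
        UserWithCredential created =
                new UserWithCredential("Tom", "Hanks", "tom.hanks@example.com", "secret");
        User response = created.withoutCredential();  // what a 201 response would expose
        System.out.println(response.getEmail());
    }
}
```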

How to provide string array (default values) in the swagger definition

Currently, my Swagger definition has:
"MyAttribute": {
"type": "string",
"default": "item1",
"enum": [
"item1",
"item2"
],
"xml": {
"attribute": true
}
},
Swagger Codegen generates this Java code:
@XmlEnum(String.class)
public enum MyAttribute {
    @XmlEnumValue("item1")
    item1("item1"),
    @XmlEnumValue("item2")
    item2("item2");

    private String value;

    /**
     * @param value the string representation of the map specification type attribute
     */
    MyAttribute(final String value) {
        this.value = value;
    }

    @Override
    public String toString() {
        return String.valueOf(value);
    }
}
Now, I want a string array (with values) instead of an enum.
I tried the following, but it shows errors in the Swagger editor (http://editor.swagger.io/):
"MyAttribute": {
"type": "array",
"items": {
"type": "string",
"values": [
"item1",
"item2"
]
},
"xml": {
"attribute": true
}
}
How can I achieve this?

Can we refer to only one property of another schema?

I have a REST service that exposes two paths:
http://server/path/AddressResource and
http://server/path/AddressResource/someAnotherPath
I have the definitions below.
"definitions": {
"address": {
"type": "object",
"properties": {
"street_address": { "type": "string" },
"city": { "type": "string" },
"state": { "type": "string" }
},
"required": ["street_address", "city", "state"]
}
}
That is the response of the first path; in the second path I just want to return the "city" property of address.
Can I create a schema that refers to address and uses just one of its properties?

Resources