My current understanding:
Elasticsearch creates the mapping for an index the first time it receives JSON documents for it.
This mapping cannot be changed afterwards, but the data can be re-mapped (re-indexed).
Question:
Forget re-mapping. Is there any way to tell ES to behave by default as:
"Consider everything that is not a date to be of string type"?
Also, will I be losing out on much if I do this?
Update:
I added the file config/mappings/_default/mapping.json with the following contents:
{
    "dynamic_templates": [
        {
            "template_1": {
                "match": "*",
                "match_mapping_type": "int",
                "mapping": {
                    "type": "string"
                }
            },
            "template_2": {
                "match": "*",
                "match_mapping_type": "long",
                "mapping": {
                    "type": "string"
                }
            }
        }
    ]
}
I also tried placing the following at config/default_mapping.json:
{
    "_default_": {
        "match": "*",
        "match_mapping_type": "int",
        "mapping": {
            "type": "string"
        }
    }
}
My motive is to get rid of the errors that crop up when a field first seen as an int or long later arrives as a string. Will this map all int and long values as string across all indexes that are created in the future? Do I need to nest the dynamic_templates key within _all?
Update II:
Adding this mapping file causes Elasticsearch to throw:
[2014-02-04 10:48:34,396][DEBUG][action.admin.indices.create] [Her] [logstash-2014.02.04] failed to create
org.elasticsearch.index.mapper.MapperParsingException: mapping [mapping.json]
at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:312)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:298)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:135)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:701)
Caused by: java.lang.ClassCastException: java.util.ArrayList cannot be cast to java.util.Map
at org.elasticsearch.index.mapper.DocumentMapperParser.extractMapping(DocumentMapperParser.java:268)
at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:155)
at org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:314)
at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:193)
at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:309)
... 5 more
2014-02-04 10:48:34 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2014-02-04 10:48:33 +0000 error_class="Net::HTTPServerException" error="400 \"Bad Request\"" instance=17509700
When you start from scratch, thus without a mapping, you rely on the defaults. Every time you send a document, the fields that weren't mapped yet are automatically mapped based on their JSON type (and conventions for dates). That said, if you send a field in your first document as a number and that same field becomes a string in your second document, the index operation for the second document will return an error.
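For example (hypothetical index and field names), if the first document indexed is

    { "response_code": 404 }

the response_code field gets dynamically mapped as a numeric type, and a later document such as

    { "response_code": "n/a" }

is rejected, because that value can no longer be parsed into the already-mapped numeric field.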
There are APIs to manage mappings, which doesn't mean you have to declare all your fields. You can specify just the ones that you want to behave differently from the default. You can provide mappings while creating an index, use the put mapping API if the index already exists, or even include them in index templates, for indices that have yet to be created.
Changing the mappings is possible, but only backwards compatible changes can be applied. You can always add new fields, but you can't change the type or the analyzer for an existing field. What you could do in that case is try to make the change backwards compatible by using multi-fields; otherwise you need to reindex against the updated mappings.
As for your last question, if you index everything as a string, you lose what you can usually do with numbers, e.g. range queries. Whether this is feasible or not depends on your data and what you need to do with it.
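To make that concrete, here is a rough sketch of the index-template route mentioned above, assuming Elasticsearch 1.x and made-up template/field names (the body would be sent to PUT /_template/strings_only). Note that dynamic_templates has to be nested under a mapping type such as _default_, and each template goes in its own single-key object inside the array; JSON whole numbers are dynamically detected as long and decimals as double, so those are the match_mapping_type values to use:

    {
        "template": "*",
        "mappings": {
            "_default_": {
                "dynamic_templates": [
                    {
                        "longs_as_strings": {
                            "match": "*",
                            "match_mapping_type": "long",
                            "mapping": { "type": "string" }
                        }
                    },
                    {
                        "doubles_as_strings": {
                            "match": "*",
                            "match_mapping_type": "double",
                            "mapping": { "type": "string" }
                        }
                    }
                ]
            }
        }
    }

With "template": "*" the mapping would apply to every index created after the template is registered, which is what the "all future indexes" requirement asks for.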
In a large database I have to change the data type of a property for a type of nodes from Integer to String (i.e. 42 to "42") in order to also support non-numerical IDs.
I've managed to do the migration itself and the property now has the expected type in the database.
I have verified this using the Neo4j Browser's ability to show the query result as JSON:
"graph": {
"nodes": [
{
"id": "4190",
"labels": [
"MyEntity"
],
"properties": {
"id": "225"
}
}
}
Note that the "id" property is different from the node's own (numerical) id.
In the corresponding Spring Data Neo4j 4 app, I adjusted the type of the corresponding property from Integer to String as well. I expected that to be enough; however, upon first loading an affected entity I now receive:
org.neo4j.ogm.exception.MappingException: Error mapping GraphModel to instance of com.example.MyEntity
[...]
Caused by: java.lang.RuntimeException: java.lang.IllegalArgumentException: Can not set java.lang.String field de.moneysoft.core.model.base.UriEntity.transfermarktId to java.lang.Integer
at org.neo4j.ogm.entity.io.FieldWriter.write(FieldWriter.java:43)
at org.neo4j.ogm.entity.io.FieldWriter.write(FieldWriter.java:68)
at org.neo4j.ogm.context.GraphEntityMapper.writeProperty(GraphEntityMapper.java:232)
at org.neo4j.ogm.context.GraphEntityMapper.setProperties(GraphEntityMapper.java:184)
at org.neo4j.ogm.context.GraphEntityMapper.mapNodes(GraphEntityMapper.java:151)
at org.neo4j.ogm.context.GraphEntityMapper.mapEntities(GraphEntityMapper.java:135)
... 122 common frames omitted
Caused by: java.lang.IllegalArgumentException: Can not set java.lang.String field com.example.MyEntity.id to java.lang.Integer
at sun.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException(UnsafeFieldAccessorImpl.java:167)
at sun.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException(UnsafeFieldAccessorImpl.java:171)
at sun.reflect.UnsafeObjectFieldAccessorImpl.set(UnsafeObjectFieldAccessorImpl.java:81)
at java.lang.reflect.Field.set(Field.java:764)
at org.neo4j.ogm.entity.io.FieldWriter.write(FieldWriter.java:41)
... 127 common frames omitted
I am not aware of Neo4j-OGM storing any kind of model or datatype (at least I don't see it in the graph). Why does it still believe that my property is an Integer?
Edit:
Node Entity after migration:
@NodeEntity
public class MyEntity
{
    @Property
    protected String name;

    @Property
    private String id;
}
I am not aware of any other relevant code.
Well, if the error you see looks implausible, it probably is.
After a good night's sleep, I realized that I had connected to the wrong database instance: not the one that was migrated and that I was looking at in the browser, but another one that contained an unmigrated state.
After connecting to the correct instance, everything worked as expected!
We are adding CORS support to our Swagger API, which includes defining an options operation per path. Since this is boilerplate code, we want to define the options operation once in the definitions section, like so:
"definitions":{
"CORS":{ .. }
}
And then reference the operation in our paths, like so:
"paths":{
"/system/info":{
"options" : {
"$ref": "#/definitions/CORS"
}
}
}
This does not seem to work when we upload the Swagger definition. What is the proper way to accomplish our goal of defining a path operation once and then reusing it across paths?
You can reference an entire path from an external location:
"paths": {
"/system/info": {
"$ref": "cors.json"
}
}
but not an individual HTTP method. In addition, the spec doesn't allow such a reference to resolve within the same document; you'll have to put it in a separate document.
See the Swagger specification for information on the Path Item Object and the top-level Swagger Object.
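As a rough sketch of what such an external file might contain (operation fields and header names here are illustrative, not taken from the original API), cors.json would itself have to be a Path Item Object holding the shared options operation:

    {
        "options": {
            "summary": "CORS preflight support",
            "responses": {
                "200": {
                    "description": "CORS headers returned",
                    "headers": {
                        "Access-Control-Allow-Origin": { "type": "string" },
                        "Access-Control-Allow-Methods": { "type": "string" },
                        "Access-Control-Allow-Headers": { "type": "string" }
                    }
                }
            }
        }
    }

Keep in mind that the $ref stands in for the whole path item, so the path's other operations (get, post, ...) would typically need to live in the referenced document as well, which limits how much boilerplate this actually saves.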
I'm trying to save a nested people association, which is a JSON array, and Grails complains that it requires a Set.
Another problem I encountered is that the dateTime field is rejected as null, even though it already contains a value.
What do I need to do before binding the params into my object, or do I have to change how my JSON is built? I'm trying to save a JSON post like this:
// relationship of Test
//static hasMany = [people: Person, samples: Sample]
def jsonParams= JSON.parse(request.JSON.toString())
def testInstance= new Test(jsonParams)
//Error requiring a Set
[Failed to convert property value of type 'org.codehaus.groovy.grails.web.json.JSONArray' to required type 'java.util.Set' for property 'people'; nested exception is java.lang.IllegalStateException: Cannot convert value of type [java.lang.String] to required type [com.Person] for property 'people[0]': no matching editors or conversion strategy found]]
//error saying its null
Field error in object 'com.Test' on field 'samples[2].dateTime': rejected value [null]; codes [com.Sample]
//...
"samples[0].dateTime_hour":"0",
"samples[0].dateTime_minute":"0",
"samples[0].dateTime_day":"1",
"samples[0].dateTime_month":"0",
"samples[0].dateTime_year":"-1899",
"samples[0]":{
"dateTime_minute":"0",
"dateTime_day":"1",
"dateTime_year":"-1899",
"dateTime_hour":"0",
"dateTime_month":"0"
},
"people":[
"1137",
"1141"
], //...
First off, this line is unnecessary:
def jsonParams= JSON.parse(request.JSON.toString())
The request.JSON can be directly passed to the Test constructor:
def testInstance = new Test(request.JSON)
I'm not sure what your Person class looks like, but I'm assuming those numbers (1137, 1141) are ids. If that is the case, then your JSON should work - there's a chance that passing the request.JSON directly could help. I tested your JSON locally and it has no problem associating the hasMany collection. I also used:
// JSON numbers rather than strings
"people": [1137, 1141]

// using Person map with the id
"people": [{
    "id": 1137
}, {
    "id": 1141
}]
Both of these worked as well and are worth trying.
Concerning the null dateTime, I would rework your JSON. I would send the dateTime in a single field, instead of splitting the value into hour/minute/day/etc. The default formats are yyyy-MM-dd HH:mm:ss.S and yyyy-MM-dd'T'hh:mm:ss'Z', but these can be redefined via the grails.databinding.dateFormats config setting (Config.groovy). There are other ways to do the binding as well (the @BindingFormat annotation), but it's going to be easiest to just send the date in a way that Grails can handle without additional configuration.
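For example (hypothetical values, using the first default format above), each samples entry could carry a full timestamp string that the stock data binder can parse without extra configuration:

    "samples": [
        { "dateTime": "2014-02-04 10:48:34.0" },
        { "dateTime": "2014-02-05 09:15:00.0" }
    ]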
If you are dead set on splitting the dateTime into pieces, then you could use the @BindUsing annotation:
class Sample {
    @BindUsing({ obj, source ->
        def hour = source['dateTime_hour']
        def minute = source['dateTime_minute']
        ...
        // set obj.dateTime based on these pieces
    })
    Date dateTime
}
An additional comment on your JSON: you seem to have samples[0] defined twice and are using two syntaxes for your internal collections (JSON arrays and indexed keys). I personally would stick with a single syntax to clean it up:
"samples": [
{"dateTime": "1988-01-01..."}
{"dateTime": "2015-10-21..."}
],"people": [
{"id": "1137"},
{"id": "1141"}
],
I am new to Neo4j and I am trying to build a proof of concept for high-availability spatial/temporal querying.
I have a setup with 2 standalone Neo4J Enterprise servers and a single Java application running with an embedded HA Neo4J server.
Everything was simple to set up, and basic queries are easy to write and efficient. Additionally, the queries derived from the Neo4j SpatialRepository work as expected.
What I am struggling to understand is how to use SDN to make a spatial query in combination with any other where clauses. As a trivial example, how could I write "find all places a user called X has been within Y miles of a lat/lon"? Because the SpatialRepository is not part of the regular Spring Repository class tree, I do not believe there are any naming conventions that I can use. Is the intention that I perform the spatial query and then filter the results?
I have traced the code through to a LegacyIndexSearcher (which has a name that scares me!) and cannot see any mechanism for extending the search. I have also had a look at the IndexProviderTest on GitHub which could provide a manual mechanism for performing the query against the index, except that I think there may be two indexes in play.
It might be helpful if I understood how to construct a Cypher query that I could use within an @Query annotation. Whilst I have been able to use the console to perform a simple REST query using:
:POST /db/data/ext/SpatialPlugin/graphdb/findGeometriesWithinDistance
{
    "layer": "location",
    "pointX": 0.0,
    "pointY": 51.526256,
    "distanceInKm": 100
}
This does not work:
start n=node:location('withinDistance:[51.526256,0.0,100.0]') return n;
The error is:
Index `location` does not exist
Neo.ClientError.Schema.NoSuchIndex
The index was (possibly naively) created using Spring:
@Indexed(indexType = IndexType.POINT, indexName = "location")
String wkt;
If I run index --indexes in the console I can see that there is no index named location, but that there is one named location__neo4j-spatial__LayerNodeIndex__internal__spatialNodeLookup__.
Am I required to create the Index manually? If so, could someone point me in the direction of the documentation and I'll get on with it.
Assuming that it is just ignorance that has stopped me getting the simple Cypher query to run, is it as simple as adding a regular Cypher WHERE clause to the query to perform the combination of Spatial and property based querying?
Added more index detail
Having run :GET /db/data/index/node/ from the console I could see two possibly useful indexes (other indexes removed):
{
    "location__neo4j-spatial__LayerNodeIndex__internal__spatialNodeLookup__": {
        "template": "/db/data/index/node/location__neo4j-spatial__LayerNodeIndex__internal__spatialNodeLookup__/{key}/{value}",
        "provider": "lucene",
        "type": "exact"
    },
    "GeoTemporalThing": {
        "template": "/db/data/index/node/GeoTemporalThing/{key}/{value}",
        "provider": "lucene",
        "type": "exact"
    }
}
So perhaps this is the correct format for the query I was trying:
start n=node:GeoTemporalThing('withinDistance:[51.526256,0.0,100.0]') return n;
But that gives me this error (which I am now Googling)
org.apache.lucene.queryParser.ParseException: Cannot parse 'withinDistance:[51.526256,0.0,100.0]': Encountered " "]" "] "" at line 1, column 35.
Was expecting one of:
"TO" ...
...
...
Update
Having decided that my index didn't exist and that it should, I used the REST interface to create an index with the name that I expected SDN to create, like this:
:POST /db/data/index/node
{
    "name": "location",
    "config": {
        "provider": "spatial",
        "geometry_type": "point",
        "wkt": "wkt"
    }
}
And now everything seems to work just fine. So, my question is: should I have to create that index manually? If I look at the code in org.springframework.data.neo4j.support.index.IndexType, it looks as if it should use exactly the settings that I used above, but it had only created the long-named Lucene index:
public enum IndexType
{
    @Deprecated
    SIMPLE { public Map<String, String> getConfig() { return LuceneIndexImplementation.EXACT_CONFIG; } },
    LABEL { public Map<String, String> getConfig() { return null; } public boolean isLabelBased() { return true; } },
    FULLTEXT { public Map<String, String> getConfig() { return LuceneIndexImplementation.FULLTEXT_CONFIG; } },
    POINT { public Map<String, String> getConfig() { return MapUtil.stringMap(
            IndexManager.PROVIDER, "spatial", "geometry_type", "point", "wkt", "wkt"); } }
    ;
    public abstract Map<String, String> getConfig();
    public boolean isLabelBased() { return false; }
}
I did clear down the system and the behaviour was the same; is there a step I have missed?
Software details:
Java:
neo4j 2.0.1
neo4j-ha 2.0.1
neo4j-spatial 0.12-neo4j-2.0.1
spring-data-neo4j 3.0.0.RELEASE
Standalone Servers:
neo4j-enterprise-2.0.1
neo4j-spatial-0.12-neo4j-2.0.1-server-plugin
I'm not sure if this is a bug in Spring Data when setting up the index, but manually creating the index through the REST API worked:
:POST /db/data/index/node
{
    "name": "location",
    "config": {
        "provider": "spatial",
        "geometry_type": "point",
        "wkt": "wkt"
    }
}
I can now perform queries with minimal effort using Cypher in an @Query annotation (more parameters coming, obviously):
@Query(value = "start n=node:location('withinDistance:[51.526256,0.0,100.0]') MATCH user-[wa:WAS_HERE]-n WHERE wa.ts > {ts} return user")
Page findByTimeAtLocation(@Param("ts") long ts);
I have a problem with cross-referencing terminals that are only locally unique (in their block/scope), but not globally unique. I found tutorials that describe how to use fully qualified names or package declarations, but my case is syntactically a little different from those examples, and I cannot change the DSL to support something like explicit fully qualified names or package declarations.
In my DSL I have two types of structured JSON resources:
The instance that contains my data.
A meta model, containing type information etc. for my data.
I can easily parse those two, and get an EMF model with the following Java snippet:
new MyDSLStandaloneSetup().createInjectorAndDoEMFRegistration();
ResourceSet rs = new ResourceSetImpl();
rs.getResource(URI.createPlatformResourceURI("/Foo/meta.json", true), true);
Resource instanceResource= rs.getResource(URI.createPlatformResourceURI("/Bar/instance.json", true), true);
EObject eobject = instanceResource.getContents().get(0);
Simplified example:
meta.json
{
    "toplevel_1": {
        "sublevels": {
            "sublevel_1": {
                "type": "int"
            },
            "sublevel_2": {
                "type": "long"
            }
        }
    },
    "toplevel_2": {
        "sublevels": {
            "sublevel_1": {
                "type": "float"
            },
            "sublevel_2": {
                "type": "double"
            }
        }
    }
}
instance.json
{
    "toplevel_1": {
        "sublevel_1": "1",
        "sublevel_2": "2"
    },
    "toplevel_2": {
        "sublevel_1": "3",
        "sublevel_2": "4"
    }
}
From this I want to infer that:
toplevel_1:sublevel_1 has type int and value 1
toplevel_1:sublevel_2 has type long and value 2
toplevel_2:sublevel_1 has type float and value 3
toplevel_2:sublevel_2 has type double and value 4
I was able to cross-reference the unique toplevel elements and iterate over all sublevels until I found the ones I was looking for, but for my use case that is quite inefficient and complicated. Also, the generated editor does not link between the sublevels this way.
I played around with linking and scoping, but I'm unsure what I really need, and whether I have to extend the provider classes AbstractDeclarativeScopeProvider and/or DefaultDeclarativeQualifiedNameProvider.
What's the best way to go?
See also:
Xtext cross reference using custom terminal rule
http://www.eclipse.org/Xtext/documentation.html#scoping
http://www.eclipse.org/Xtext/documentation.html#linking
After some trial and error I solved my problem with a ScopeProvider.
The main issue was that I didn't really understand what a scope is in Xtext terms, and what I have to provide it to.
Looking at the signature from the documentation:
IScope scope_<RefDeclaringEClass>_<Reference>(<ContextType> ctx, EReference ref)
In my example language:
RefDeclaringEClass would refer to the Sublevel from instance.json,
Reference to the cross-reference to the Sublevel from meta.json, and
ContextType would match the RefDeclaringEClass.
Using the eContainer of ctx I can get the Toplevel from instance.json.
This Toplevel already has a cross-reference to the matching Toplevel from meta.json, which I can use to get the Sublevels from meta.json. This collection of Sublevels is basically the scope within which the current Sublevel should be unique.
To get the IScope I used Scopes#scopeFor(Iterable).
I didn't post any code here because the actual grammar is bigger/different, and therefore doesn't really help the explanation.