ElasticSearch & Tire: Using Mapping and to_indexed_json - ruby-on-rails

While reading the Tire docs, I was under the impression that you should use either the mapping or the to_indexed_json method, since (as I understood it) the mapping is used to feed to_indexed_json.
The problem is that I found some tutorials where both are used. Why?
Basically, my app works right now with to_indexed_json, but I can't figure out how to set the boost value of some of the attributes (hence the reason I started looking at mapping), and I was wondering if using both would create any conflicts.

While the mapping and to_indexed_json methods are related, they in fact serve two different purposes.
The purpose of the mapping method is to define the mapping for the document properties within an index. You may want to define a certain property as "not_analyzed" so it is not broken into tokens, set a specific analyzer for the property, or (as you mention) set an index-time boost factor. You may also define a multi-field property, custom formats for date types, etc.
This mapping is then used, e.g., when Tire automatically creates an index for your model.
The purpose of the to_indexed_json method is to define a JSON serialization for your documents/models.
The default to_indexed_json method does use your mapping definition: only properties defined in the mapping are serialized, on the basis that if you care enough to define a mapping, Tire should by default index only those properties.
Now, when you want a tight grip on how your model is serialized into JSON for elasticsearch, you just define your own to_indexed_json method (as the README instructs).
This custom MyModel#to_indexed_json method usually does not care about the mapping definition, and builds the JSON serialization from scratch (by leveraging ActiveRecord's to_json, using a JSON builder such as jbuilder, or just building a plain old Hash and calling Hash#to_json).
So, to answer the last part of your question: using both mapping and to_indexed_json will not create any conflicts. In fact, combining them is required to use the more advanced elasticsearch features.
To sum up:
You use the mapping method to define the mapping of your models for the search engine.
You use a custom to_indexed_json method to define how the search engine sees your documents/models.
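To illustrate, here's a minimal sketch of a model using both. The Article class, its attributes, and the boost value are made up for illustration; only the Tire macros themselves are from the library:

class Article < ActiveRecord::Base
  include Tire::Model::Search
  include Tire::Model::Callbacks

  # Index-time settings: boost the title, keep the author untokenized.
  mapping do
    indexes :title,  :type => 'string', :analyzer => 'snowball', :boost => 10
    indexes :author, :type => 'string', :index => 'not_analyzed'
  end

  # Serialization: only what is returned here is sent to elasticsearch.
  def to_indexed_json
    {
      :title  => title,
      :author => author,
      :comment_count => comments.size   # a derived property, not a column
    }.to_json
  end
end

The mapping block controls how elasticsearch treats each field at index time, while to_indexed_json controls which data reaches the index in the first place.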

Related

Restrict search to specific domains

I'm using the elasticsearch plugin and I'm running searches using elasticSearchService.search(myKeywords) which searches for keywords over all the domain classes marked as searchable.
Now I want to restrict the search to two specific domain classes. I can see there are options named indices and types that can be passed to the search method, but if I simply use my domain class names in them I get errors telling me the index or type doesn't exist. What exactly should I do to achieve what I want?
(I'm new to Lucene and elasticsearch, and I'm not sure I understand the index and type concepts. Reading the docs, I could only find examples that restrict searches to a specific field, not a whole domain class or whatever it is mapped to in Lucene/elasticsearch concepts.)
The way to go is:
elasticSearchService.search(myKeywords, [types:["myPackage.MyClass","myPackage.MyOtherClass"]])
The results are as expected, but I'm still worried about having one index (and one type) per domain class. That's not what I expected, and I can't see how to map all domain classes to a single index for the whole database, as stated by the docs.

How to create nodes in neo4j with properties defined by a dictionary via neo4jclient in C#

As a complete novice programmer I am trying to populate my neo4j DB with data from heterogeneous sources. For this I am trying to use the Neo4jClient C# API. The heterogeneity of my data comes from a custom, continuously evolving DSL/DSML/metamodel that defines the possible types of elements, i.e. models, thus creating classes for each type would not be ideal.
As I understand, my options are the following:
Have a predefined class for each type of element: this way I can easily serialize my objects, that is, if all properties are primitive types or arrays/lists.
Have a base class (with a Dictionary to hold properties) that I use as an interface between the models that I'm trying to serialize and neo4j. I've seen an example for this at Can Neo4j store a dictionary in a node?, but I don't understand how to use the converter (defined in the answer) to add a node. Also, I don't see how an int-based dictionary would allow me to store Key-Value pairs where the keys (that are strings) would translate to Property names in neo4j.
Generate a custom query dynamically, as seen at https://github.com/Readify/Neo4jClient/wiki/cypher#manual-queries-highly-discouraged. This is not recommended and possibly is not performant.
Ultimately, what I would like to achieve is to avoid the need to define a separate class for every type of element that I have, but still be able to add properties that are defined by types in my metamodel.
I would also be interested in somehow influencing the serializer to ignore non-compatible properties (similar to XmlIgnore), so that I would not need to create a separate class for each class that has more than just primitive types.
Thanks,
J
There are two problems you're trying to solve: the first is how to program the C# part of this; the second is how to store the solution to the first problem.
At some point you'll need to access this data in your C# code, and unless you're going fully dynamic, you'll need some sort of class structure.
Taking your 3 options:
Please have a look at this question: neo4jclient heterogenous data return which I think covers this scenario.
In that answer, the converter does the work for you; you would create, delete, etc. as before, and the converter just handles the IDictionary instance. The IDictionary<int, string> in the answer is only an example: you can use whatever you want, including IDictionary<string, string>. In fact, in that example, all you would need to do is change the IntString property to an IDictionary<string, string>, and it should just work.
Even if you went down the route of using custom queries (which you really shouldn't need to) you will still need to bring back objects as classes. Nothing changes, it just makes your life a lot harder.
In terms of XmlIgnore - have you tried JsonIgnore?
Alternatively - look at the custom converter and get the non-compatible properties into your DB.

doctrine2 mapping required

I'm a newbie with Doctrine 2. I want to know which mapping format is required for Doctrine 2.
I created my annotation mappings with all columns, methods, etc.
I need to know whether it's required to define XML or YAML mappings for any of Doctrine 2's features, and which features I would lose without them.
http://docs.doctrine-project.org/projects/doctrine-orm/en/2.0.x/reference/basic-mapping.html
You only need to use one type of mapping and you've done so already with the annotation approach. This is also the preferred approach from what I gather.
The other mapping methods just let you work the way you prefer. If you want to keep your schema defined away from your model source code, you can use XML or YAML. A lot of the Symfony guys use YAML because it's what the framework is configured with. I personally prefer the annotation method.

Creating the same model from multiple data sources

This is mostly a design-pattern question. I have one type of model, and I'm going to get the data to create instances from multiple sources. So, for example, one record may be created from an API while another is created via screen scraping with Nokogiri.
My issue lies in how best to abstract these different data sources. Right now I'm building lib classes that return the same hash, which I then use to set the attributes of the model. But I'm wondering if this isn't more of a case for STI, or if there is some other way of doing this that I'm just not thinking about.
I think your design decision would depend largely on what attributes need to be stored. From your description, it sounds like you have a model with multiple data sources, but which would be storing the same attributes regardless of the source. In that case STI seems like overkill. When you retrieve a row from the table, does it matter whether the source is the API or the screen scraper? If not, then you could just define separate methods for each data source and use the appropriate method in the controller.
@instance = MyModel.new(:datasource => "API")
I'd say don't worry about inheritance (or mixing in code from modules) unless you really need to. There are some gotchas -- STI is not fully supported by some gems/plugins, for example.
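For example, here is a sketch of the hash-returning adapter approach described above. The class names, URL, and CSS selectors are made up for illustration:

require 'open-uri'
require 'nokogiri'

# Each source adapter returns the same attribute hash, so the model
# never needs to know where its data came from.
class ApiSource
  def attributes_for(payload)
    # 'payload' stands in for an already-parsed API response
    { :title => payload['title'], :body => payload['body'] }
  end
end

class ScrapedSource
  def attributes_for(url)
    doc = Nokogiri::HTML(open(url))
    { :title => doc.at('h1').text, :body => doc.at('.content').text }
  end
end

# Usage: the model is built the same way regardless of the source.
# record = MyModel.new(ScrapedSource.new.attributes_for('http://example.com/page'))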

How do I handle data which must be persisted in a database, but isn't a proper model, in Ruby on Rails?

Imagine a web application written in Ruby on Rails. Part of the state of that application is represented in a piece of data which doesn't fit the description of a model. This state descriptor needs to be persisted in the same database as the models.
Where it differs from a model is that there needs to be only one instance of its class and it doesn't have relationships with other classes.
Has anyone come across anything like this?
From your description I think the rails-settings plugin should do what you need.
From the Readme:
"Settings is a plugin that makes managing a table of global key, value pairs easy. Think of it like a global Hash stored in you database, that uses simple ActiveRecord like methods for manipulation. Keep track of any global setting that you dont want to hard code into your rails app. You can store any kind of object. Strings, numbers, arrays, or any object."
http://github.com/Squeegy/rails-settings/tree/master
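Usage looks roughly like this; the key names here are made up, so check the plugin's Readme for the exact interface:

Settings.app_mode = 'maintenance'        # persists a key/value row
Settings.app_mode                        # => "maintenance"
Settings.allowed_hosts = ['example.com'] # any serializable object works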
If it's data, and it's in the database, it's part of the model.
This isn't really a RoR problem; it's a general OO design problem.
If it were me, I'd probably find a way to conceptualize the data as a model and then just make it a singleton with a factory method and a private constructor.
Alternatively, you could think of this as a form of logging. In that case, you'd just have a Logger class (also a singleton) that reads/writes the database directly and is invoked at the beginning and end of each request.
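A minimal sketch of that singleton-model idea, assuming a hypothetical app_states table with a single row:

class AppState < ActiveRecord::Base
  # Factory method: always return the single row, creating it on first use.
  def self.instance
    first || create!
  end

  # Private constructor: discourage building extra instances directly.
  private_class_method :new
end

# Usage:
# AppState.instance.update_attribute(:phase, 'importing')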
In Rails, if data is in the database it's in a model. In this case the model may be called "Configuration", but it is still mapped to an ActiveRecord class in your Rails system.
If this data is truly static, you may not need the database at all.
You could use (as an example) a variable in your application controller:
class ApplicationController < ActionController::Base
  helper :all
  @@data = "YOUR DATA HERE"
end
There are a number of approaches that can be used to instantiate data for use in a Rails application.
I'm not sure I understand why you say it can't fit in a Rails model.
If it's just a complex data structure, just save a bunch of Ruby code in a text field in the database :-)
If for example you have a complex nested hash you want to save, assign the following to your 'data' text field:
ComplexThing.data = complex_hash.inspect
When you want to read it back, simply
complex_hash = eval ComplexThing.data
Let me point out two more things about this solution:
If your data structure is not standard Ruby classes, a simple inspect may not do it. If you see #<MyClass:0x4066e3c> anywhere, something's not being serialized properly.
This is a naive implementation. You may want to check out real marshalling solutions if you risk having unicode data or if you really are saving a lot of custom-made classes.
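One such built-in option is ActiveRecord's serialize macro, which stores the value as YAML and restores it on load, with no eval involved. Here, ComplexThing and its 'data' text column are carried over from the example above:

class ComplexThing < ActiveRecord::Base
  serialize :data   # transparently dump/load the column as YAML
end

thing = ComplexThing.new
thing.data = { :nested => { :values => [1, 2, 3] } }
thing.save!
ComplexThing.first.data   # => {:nested=>{:values=>[1, 2, 3]}}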
