Calculating the Count of a related Collection - ruby-on-rails

I have two models, Professionals and Projects:
Professionals has_many Projects
Projects belongs_to Professionals
On the Professionals index page I need to show the number of projects each Professional has.
Right now I am running the following query to get all the Professionals. How can I also fetch the count of each Professional's Projects?

@pros = Professionals.all.asc(:name)

I would add a projects_count field to Professional. Then:

class Project
  belongs_to :professional, counter_cache: true
end

Rails will then maintain the count every time a project is added to or removed from a professional, and you can just call .projects_count on each professional.
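For reference, a minimal sketch of how that can look in Mongoid (assumed here because the question's .asc(:name) is Mongoid syntax; recent Mongoid versions also accept the counter_cache option on belongs_to), with the counter kept as a plain field on the parent document:

class Professional
  include Mongoid::Document
  field :name, type: String
  field :projects_count, type: Integer, default: 0  # maintained by the counter cache
  has_many :projects
end

class Project
  include Mongoid::Document
  belongs_to :professional, counter_cache: true  # bumps projects_count on create/destroy
end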
Edit: if you actually want the additional data as well:

@pros = Professionals.includes(:projects).order(:name)

Then:

@pros.each do |pro|
  pro.name
  pro.projects.each do |project|
    project.name
  end
end
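If all you need on the index page is the number, the eager-loaded association already holds it; a small sketch reusing the names above:

@pros.each do |pro|
  # .size counts the already-loaded array, so no extra query is fired
  puts "#{pro.name}: #{pro.projects.size} projects"
end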

I am just abstracting here because Rails really isn't my bag, so let's talk about the schema and what to look for. The code below is really just pseudo-code, but it should be close to what is wanted.
Consider just how MongoDB is going to store the data, given that you presumably have multiple collections. I am not saying that is or is not the best model; I am just dealing with it as it stands.
Let us assume we have this data for "Projects"
{
  "_id" : ObjectId("53202e1d78166396592cf805"),
  "name": "Project1",
  "desc": "Building Project"
},
{
  "_id" : ObjectId("532197fb423c37c0edbd4a52"),
  "name": "Project2",
  "desc": "Renovation Project"
}
And that for "Professionals" we might have something like this:
{
  "_id" : ObjectId("531e22b7ba53b9dd07756bc8"),
  "name": "Steve",
  "projects": [
    ObjectId("53202e1d78166396592cf805"),
    ObjectId("532197fb423c37c0edbd4a52")
  ]
}
Right. So now we see that the "Professional" has to have some concept that there are related items in another collection, and of what those related items are.
Now I presume (and again, it's not my bag) that there is a way to get down to the lower level of the driver implementation in Mongoid (I believe that is Moped, off the top of my head), and that it is likely invoked in a way similar to this (assuming "Professionals" as the model class name):
Professionals.collection.aggregate([
  { "$unwind": "$projects" },
  { "$group": {
    "_id": "$_id",
    "count": { "$sum": 1 }
  }}
])
Or some similar form that is more or less the analog of what you would do in the native MongoDB shell. The point is that with something like this you just made the server do the work, rather than pulling all the results to your client and looping through them.
Suggesting that you use native code to iterate results from your data store is counterproductive, and counterintuitive to using any kind of back-end database store. Whether it be a SQL database or a NoSQL database, the general preference holds: as long as the database has methods to do the aggregation work, use them.
If you are writing code that essentially pulls every record from your store and then cycles through them to get the result, then you are doing something wrong.
Use the database methods. Otherwise you might as well just use a text file and be done with it.
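For completeness, a sketch of consuming that aggregation from Ruby and turning it into a lookup of counts; this is only a hedged illustration, since the exact return shape depends on the driver version:

counts = Professionals.collection.aggregate([
  { "$unwind" => "$projects" },
  { "$group" => { "_id" => "$_id", "count" => { "$sum" => 1 } } }
]).each_with_object({}) do |doc, acc|
  acc[doc["_id"]] = doc["count"]  # professional _id => number of projects
end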

Related

Ruby: Hash: use one record attribute as key and another as value

Let's say I have a User with attributes name and badge_number
For a JavaScript autocomplete field I want the user to be able to start typing the user's name and get a select list.
I'm using Materialize which offers the JS needed, I just need to provide it the data in this format:
data: { "Sarah Person": 13241, "Billiam Gregory": 54665, "Stephan Stevenston": 98332 }
This won't do, since select returns User records rather than a hash:

User.select(:name, :badge_number)
# => [#<User name: "Sarah Person", badge_number: 13241>, ...]
And this feels repetitive, icky and redundant (and repetitive):
user_list = User.select(:name, :badge_number)
hsh = {}
user_list.each do |user|
  hsh[user.name] = user.badge_number
end
hsh
...though it does give me my intended result, performance will suck over time.
Any better ways than this weird, slimy loop?
This will give the desired output
User.pluck(:name, :badge_number).to_h
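To see why this works, here is what each step produces, using the values from the question:

User.pluck(:name, :badge_number)
# => [["Sarah Person", 13241], ["Billiam Gregory", 54665], ["Stephan Stevenston", 98332]]

User.pluck(:name, :badge_number).to_h
# => {"Sarah Person"=>13241, "Billiam Gregory"=>54665, "Stephan Stevenston"=>98332}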
Edit:
Though the code above is a one-liner, it still loops internally, client-side, over the plucked rows. Offloading such loops to the database may improve performance when dealing with very many rows, but there is no database-agnostic way to achieve this in Active Record. See the following answer for a Postgres-specific approach.
If your RDBMS is PostgreSQL, you can use the PostgreSQL function json_build_object for this specific case:

User.select("json_build_object(name, badge_number) as json_col")
    .map(&:json_col)

The whole JSON can be built using PostgreSQL-supplied functions too:

User.select("array_to_json(array_agg(json_build_object(name, badge_number))) as json_col")
    .limit(1)[0]
    .json_col
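If your Postgres is recent enough (9.4+), json_object_agg aggregates key/value pairs into a single JSON object, which is exactly the shape the question asks for; a hedged variant of the query above:

User.select("json_object_agg(name, badge_number) as json_col")
    .limit(1)[0]
    .json_col
# one object like {"Sarah Person": 13241, "Billiam Gregory": 54665, ...}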

Is using a case statement in my ruby on rails example ok?

I have built a little online tool using Rails. The app has no database.
(I know I should have used another, simpler framework, like Sinatra.)
I need to capture the car model selected in the interface and use its power, speed, and dimensions properties.
Is it OK to use a case statement, like the code below?
class Car
  def get_car_properties(dropdown_selection)
    info = {}
    info = Hash.new { 2 }
    case dropdown_selection
    when 'Mazda 3'
      info = { speed: 'fast', power: 'real strong', dimensions: '4X3X3' }
    when 'Lancer 81'
      info = { speed: 'slow', power: 'real weak', dimensions: '2X2X2' }
    else
      info = {}
    end
    return info
  end
end

selected_car = Car.new
puts selected_car.get_car_properties('Mazda 3')
For simplicity, I have not used the exact scenario. Also note that my real case statement is about 30 options big, and each hash has about 10 symbols.
Big case statements are a code smell, meaning they are a hint that there may be an opportunity to improve your code.
In this case, while your app does not have a database, you still have data. The reason why your code looks messy right now is that you have not separated that data from your program logic.
I would recommend storing the data about each car in a hash, and then treat that hash as if it were your database.
class Car
  CAR_INFO = {
    'Mazda 3' => {
      speed: 'fast',
      power: 'real strong',
      dimensions: '4X3X3'
    },
    'Lancer 81' => {
      speed: 'slow',
      power: 'real weak',
      dimensions: '2X2X2'
    },
    # etc
  }

  def get_car_properties(dropdown_selection)
    CAR_INFO.fetch(dropdown_selection, {})
  end
end

selected_car = Car.new
puts selected_car.get_car_properties('Mazda 3')
Why is this better? For one thing, the hash is pure data. It's clear that there's no additional control logic hiding somewhere in one of the branches of your case statement. You can now tell at a glance exactly what get_car_properties is doing - it's looking up the selection in a data structure, and returning an empty hash if nothing is found. You may wish to move CAR_INFO into a separate file, or perhaps into a proper database later on - get_car_properties won't change much in either of those cases.
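For instance, a minimal sketch of the separate-file option, assuming a hypothetical cars.yml next to the class (note that YAML keys load as strings rather than symbols):

# cars.yml (hypothetical):
#
# Mazda 3:
#   speed: fast
#   power: real strong
#   dimensions: 4X3X3

require 'yaml'

class Car
  # Loaded once when the class is defined; the data stays out of the logic.
  CAR_INFO = YAML.load_file(File.expand_path('cars.yml', __dir__))

  def get_car_properties(dropdown_selection)
    CAR_INFO.fetch(dropdown_selection, {})
  end
end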
The code you've written probably works, but it definitely smells and would violate the Ruby style guide.
I suggest you use a code analyzer such as RuboCop, which enforces the Ruby style guide and will help you write conforming code and avoid common pitfalls.

Association integration in a REST API with Rails and Angular 2/4

I'm working on a web application with Angular 2/4 on the client side and a Rails application on the server side. On the Rails side I have a model Product with the following serializer:
class ProductSerializer < ActiveModel::Serializer
  attributes :id, :weight
  belongs_to :brand
end
This serializer generates:
{
  "id": 1,
  "weight": 62.0,
  "brand": {
    "id": 1,
    "name": "foo"
  }
}
Then, in Angular I have this class to parse the JSON:
export class Produto implements Model {
  id: number;
  weight: number;
  brand: Brand;
}
Up to this point everything is fine; the issue begins when the user alters the Product and Angular has to send the altered Product to the update route in Rails. If I just send the TypeScript object in the body of the PUT or POST, the generated JSON will have the same pattern as the one received from Rails, but my Rails app expects the following pattern:
{
  "id": 1,
  "weight": 62.0,
  "brand_id": 1
}
In the ProductController I have the params method below (produto_params), where I did this workaround to accept the pattern sent by Angular:
def produto_params
  # Guard against a missing :brand key before digging into it
  params[:brand_id] = params[:brand][:id] if params[:brand] && params[:brand][:id]
  params.delete(:brand) if params[:brand]
  params.permit(:weight, :brand_id)
end
All said, this looks ugly and not scalable at all. This is just one association; if my product grows to five associations, this method will grow big. Besides that, I can have many other situations like this with other models.
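Just to illustrate the scaling concern, the workaround could be generalized over a list of association names. This is only a sketch of the same hack, not the structural fix I am asking about:

def produto_params
  # Hypothetical: fold each nested {assoc => {id: ...}} object into assoc_id
  [:brand].each do |assoc|  # would grow to [:brand, :category, ...]
    if params[assoc] && params[assoc][:id]
      params["#{assoc}_id"] = params[assoc][:id]
      params.delete(assoc)
    end
  end
  params.permit(:weight, :brand_id)
end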
I think I'm structuring this wrong, but where it is wrong and what would be right are obscure to me, so I need some help structuring it in a scalable way.
Some considerations: both the Angular app and the Rails app are mine, and they are in early development, so I can make structural changes with little trouble. I am learning Ruby, Rails, TypeScript and Angular with this project, but I'm trying to reuse as much code as I can, so what I want to achieve is a structure where I don't need to write a serializer for every model on both sides. One approach I thought of was a generic serializer in Angular, using decorators on the fields that carry an associated object, so that by overriding the toJSON() method I would transform a brand into a brand_id (I don't know if that is even possible). Would this be a bad approach?
Thanks in advance.

PredictionIO suggest to like items that have already been liked

I'm trying to use the PredictionIO recommendation engine in a Rails app to suggest items for users to like. I have three models: user, product, and favorite(user_id, product_id). This is what the algorithms.json file looks like:
[
  {
    "name": "ncMahoutItemBased",
    "params": {
      "booleanData": true,
      "itemSimilarity": "LogLikelihoodSimilarity",
      "weighted": false,
      "threshold": 0.6,
      "nearestN": 10,
      "unseenOnly": false,
      "freshness": 0,
      "freshnessTimeUnit": 86400
    }
  }
]
The thing is, after training and deploying, I get a list of suggested items for a user, some of which the user has already liked. Why is this?
Also, what is the name of the user-based algorithm to use instead of "ncMahoutItemBased"?
Thanks.
There is nothing wrong with recommending an item the user has already shown a preference for. This is expected behavior in a clothing store, where I always buy Levi's Jeans and they want to remind me of that.
In your case you may not want to recommend items already preferred, so filter them out of the recommendations. In most Mahout recommenders this is done for you, so PredictionIO must have disabled that feature. Is there some param or config option that tells PredictionIO to filter out a user's preferred items? Judging by the algorithms.json above, the unseenOnly flag (currently false) looks like a candidate, though I have not verified it.

Keeping elasticsearch and database in sync

I am trying to figure out a way to keep my MySQL database and Elasticsearch in sync. I have set up a JDBC river using the jprante/elasticsearch-river-jdbc plugin for Elasticsearch. When I execute the request below:
curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d '{
  "type" : "jdbc",
  "jdbc" : {
    "driver" : "com.mysql.jdbc.Driver",
    "url" : "jdbc:mysql://localhost:3306/MY-DATABASE",
    "user" : "root",
    "password" : "password",
    "sql" : "select * from users",
    "poll" : "1m"
  },
  "index" : {
    "index" : "test_index",
    "type" : "user"
  }
}'
the river starts indexing data, but for some records I get org.elasticsearch.index.mapper.MapperParsingException. There is discussion related to this issue here, but I want to know a way to get around it.
Is it possible to fix this permanently by creating an explicit mapping for all fields of the type that I am trying to index, or is there a better way to solve the issue?
Another question I have: when the JDBC river polls the database again, it seems to re-index the entire data set (given in the SQL query) into ES. I am not sure, but is this done because Elasticsearch wants to add fresh data as well as pick up changes to existing data? Is it possible to index only the fresh data, if the table's existing data is static?
Did you look at the default mapping?
http://www.elasticsearch.org/guide/reference/mapping/dynamic-mapping.html
I think it can help you here.
If you have an insertion-date field in your table, you can use it to filter what you have to index.
See https://github.com/jprante/elasticsearch-river-jdbc#time-based-selecting
HTH
David
Elasticsearch has dropped the river sync concept altogether. It was never a recommended path, because it usually doesn't make sense to keep the same normalized SQL table structure in a document store like Elasticsearch.
Say you have a Product entity with some attributes, and Reviews on the Product entity in a parent-child table, since there can be multiple reviews of the same product:

Products(id, name, status, ... etc)
Product_reviews(product_id, review_id)
Reviews(id, note, rating, ... etc)

In the document store you would instead create a single index, say product, where each document embeds the product attributes together with its reviews, roughly:

{
  "id": ...,
  "name": ...,
  "status": ...,
  "reviews": [ { review 1 }, { review 2 }, ... ]
}
Here is an approach to syncing in such a setup.
Assumptions:
A SQL database (the true source of record)
Elasticsearch or any other NoSQL document store
Solution:
As soon as an update happens, publish an event to JMS/AMQP/a database queue/a file-system queue/Amazon SQS etc., carrying either the full Product or just its primary ID (I would recommend just the ID).
The queue consumer should then call a web service to get the full object (if only the primary ID was pushed to the queue) or take the object itself, and send the respective changes to Elasticsearch/the NoSQL database, as in the sketch below.
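A minimal sketch of such a consumer in Ruby, assuming RabbitMQ via the bunny gem and the elasticsearch gem; the queue name, index name, and Product model are hypothetical:

require 'bunny'
require 'elasticsearch'
require 'json'

es    = Elasticsearch::Client.new(url: 'http://localhost:9200')
conn  = Bunny.new.tap(&:start)
queue = conn.create_channel.queue('product_updates', durable: true)

# Each message carries just the primary ID; the SQL DB stays the source of record.
queue.subscribe(block: true) do |_delivery_info, _properties, payload|
  product = Product.find(JSON.parse(payload)['id'])
  es.index(
    index: 'products',
    id:    product.id,
    body:  product.as_json(include: :reviews)  # the denormalized document
  )
end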
