Rails reporting objects - ruby-on-rails

I am currently attempting to implement a reporting module in a Rails app. Thanks to some assistance provided here: Ruby on Rails object reporting, I have decided to go down the road of coding common metrics and populating reports with these.
What I have to work out is how to create the metrics. Essentially, I need a metric object that I can use within my targeting framework (e.g. if I have a target object where target.value is 0.5, I can use target.metric_id to know which metric is being targeted, and thus report on it).
My problem is how to store the formula for the metric within the model structure. A simple example of a metric would be profit, where I could do sales.selling_prices.sum - sales.cost_prices.sum. How can I set up some columns that allow this formula to be stored? All formulas will be calculated using other objects, as in the profit example.
Any assistance would be greatly appreciated.
Thanks!

Depending on how ambitious your formulas get, you could start with something like this for metric:
operation_type:string, one of %w(add sub mult div)
left_operand:decimal
right_operand:decimal
Then, to calculate, you might have a method on metric like:
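def result
  if operation_type == 'add'
    left_operand + right_operand
  elsif operation_type == 'sub'
    left_operand - right_operand
  elsif operation_type == 'mult'
    left_operand * right_operand
  elsif operation_type == 'div'
    left_operand / right_operand
  end
end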
When you create your metrics (maybe an admin panel of some kind) you could have ways of selecting the source inputs (for instance, left_operand is set to sales.selling_prices.sum, etc).
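As a rough illustration of the idea above, here is a minimal plain-Ruby sketch. The `Metric` struct and the `METRIC_OPERATORS` table are hypothetical stand-ins (in a Rails app these would be columns on a model), and the price arrays stand in for `sales.selling_prices` and `sales.cost_prices`:

```ruby
# Hypothetical lookup table: stored operation_type => Ruby operator.
METRIC_OPERATORS = { 'add' => :+, 'sub' => :-, 'mult' => :*, 'div' => :/ }.freeze

# Plain-Ruby stand-in for the metric model; the three members mirror the
# operation_type, left_operand and right_operand columns suggested above.
Metric = Struct.new(:operation_type, :left_operand, :right_operand) do
  def result
    operator = METRIC_OPERATORS.fetch(operation_type) do
      raise ArgumentError, "unknown operation_type: #{operation_type.inspect}"
    end
    # Apply the operator to the two operands, e.g. left_operand - right_operand.
    left_operand.public_send(operator, right_operand)
  end
end

# Assumed sample data standing in for the sales figures.
selling_prices = [120, 80, 50]
cost_prices = [70, 60, 30]

profit = Metric.new('sub', selling_prices.sum, cost_prices.sum)
puts profit.result
```

A lookup table keeps the set of allowed operations in one place, so adding an operation means adding one entry rather than another branch.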

Related

Reclassify data using filters

My goal is to include or exclude dimensional data from a calculation that creates a category on that dimension (in this example, Customer Name). I have achieved the inclusion/exclusion using Parameters, but they only accept single values. That means I need to create several parameters to achieve a selection of 10 items or more.
To explain the case in full: I'm using the SuperStore sample dataset on Tableau Desktop 2021.1, and I have created the following calculation:
Top 10 Customers
IF {FIXED [Customer Name] : SUM([Sales])} > 10000
THEN [Customer Name]
ELSE "Other"
END
That renders the following visual
How can I move Bart Watters and Denny Joy to Other, without filtering the data? The idea is providing the user the ability to classify - instead of hard coding the selection into the calculation.

How to write filtering query with graphql?

Currently we are using the graphql/graphql-ruby library. I have written a few queries and mutations as per our requirements.
I have the use case below, and I am not sure how to implement it.
I already have a query/endpoint named allManagers which returns all manager details.
Today I got a requirement to implement another query that returns all the managers based on a region filter.
I have 2 options to handle this scenario:
1. Create an optional argument for region, and inside the query check whether the region was passed; if so, filter based on region.
2. Use something like https://www.howtographql.com/graphql-ruby/7-filtering/ .
Which approach is the correct one?
Looks like you can accomplish this with either approach. #2 looks a bit more complicated, but may be more extensible if you end up adding a ton of different types of filters.
Are you going to be asked to select multiple regions? Or negative regions (any region except North America)? Those are the types of questions you want to be thinking about when choosing an approach.
Sounds like a good conversation to have with a coworker :)
I'd probably opt to start with a simple approach and then change it out for a more complex one when the simple solution isn't solving all of my needs any more.
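To make the simple option concrete, here is a plain-Ruby sketch of the filtering logic only. It mirrors the shape of a graphql-ruby resolver method, which receives field arguments as keyword arguments; in the real schema the field would also declare the argument as optional (something like `argument :region, String, required: false`). The `Manager` struct and the sample data are assumptions for illustration:

```ruby
# Hypothetical stand-in for the manager records.
Manager = Struct.new(:name, :region)

MANAGERS = [
  Manager.new('Asha', 'EMEA'),
  Manager.new('Ben', 'APAC'),
  Manager.new('Carla', 'EMEA')
].freeze

# Returns every manager when no region is given, otherwise only the
# managers in that region.
def all_managers(region: nil)
  return MANAGERS if region.nil?

  MANAGERS.select { |manager| manager.region == region }
end

puts all_managers.map(&:name).inspect
puts all_managers(region: 'EMEA').map(&:name).inspect
```

If requirements such as multiple regions or "any region except" show up later, the single `region` keyword is the part that would need to grow into a richer filter input.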

Improve Mahout suggestions

I'm looking for a way to improve Mahout suggestions (from an item-based recommender; the data sets are originally user/item/weight) using an 'external' set of data.
Assume we already have recommendations: a number of users were suggested for a number of items.
It's also possible to receive feedback from these suggested users in binary form: 'no, not for me' or 'yes, I was suggested because I know about the items'; that is, 1/0 from each of the suggested users.
What's the best way to use this kind of data? Are there any approaches built into Mahout? If not, what approach would be suitable to train the data set and use that information in the next rounds?
It's not ideal that you get explicit user feedback only as 0 or 1 (strongly disagree or strongly agree); otherwise the feedback could be treated like any other user rating from the input.
Anyway, you can introduce this user feedback into your initial training set, with the recommended score ('1' feedback) or 1 - recommended score ('0' feedback) as the weight, and retrain your model.
It would be nice to add a third option, 'neutral', that does not do anything, to avoid noise in the data (e.g. the recommended score is 0.5 and the user disagrees; you would still add it as 0.5 regardless...) and model overfitting.
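The weighting rule described above can be sketched in a few lines of Ruby. The hash keys (`:user_id`, `:item_id`, `:score`, `:feedback`) and the use of `nil` for the proposed 'neutral' option are assumptions made for this example, not part of Mahout:

```ruby
# feedback: 1 = user agreed, 0 = user disagreed, nil = neutral (no signal).
# Returns the weight to feed back into the training set, or nil to skip.
def feedback_weight(recommended_score, feedback)
  case feedback
  when 1 then recommended_score
  when 0 then 1.0 - recommended_score
  end
end

# Builds user/item/weight rows from recommendations that received feedback,
# dropping neutral ones so they add no noise.
def feedback_rows(recommendations)
  recommendations.filter_map do |rec|
    weight = feedback_weight(rec[:score], rec[:feedback])
    [rec[:user_id], rec[:item_id], weight] if weight
  end
end

recommendations = [
  { user_id: 1, item_id: 10, score: 0.75, feedback: 1 },
  { user_id: 2, item_id: 11, score: 0.25, feedback: 0 },
  { user_id: 3, item_id: 12, score: 0.5,  feedback: nil }
]

feedback_rows(recommendations).each { |row| puts row.join(',') }
```

The resulting rows have the same user/item/weight shape as the original input, so they can be appended to it before retraining.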
Boolean data IS ideal, but you have two actions: "like" and "dislike".
The latest way to use this is with indicators and cross-indicators. You want to recommend things that are liked, so for this data you create an indicator. However, it is quite likely that a user's pattern of "dislikes" can be used to recommend likes; for this you need to create a cross-indicator.
The latest Mahout SNAPSHOT-1.0 has the tools you need in "spark-itemsimilarity". It can take two actions, one primary and the other secondary, and will create an indicator matrix and a cross-indicator matrix. You index these and query them using a search engine, where the query is a user's history of likes and dislikes. The search will return an ordered list of recommendations.
By using cross-indicators you can begin to use many different actions a user takes in your app. The process of creating cross-indicators will find important correlations between the two actions. In other words it will find the "dislikes" that lead to specific "likes". You can do the same with page-views, applying tags, viewing categories, almost any recorded user action.
The method requires Mahout, Spark, Hadoop, and a search engine like Solr. It is explained here: http://mahout.apache.org/users/recommender/intro-cooccurrence-spark.html under "How to use Multiple User Actions".
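To give a feel for what an indicator and a cross-indicator contain, here is a deliberately simplified Ruby sketch that only counts how many users share two items. This is not what spark-itemsimilarity does internally (as I understand it, the real job scores and filters these correlations, using a log-likelihood ratio, rather than keeping raw counts); the user histories and the helper name `cooccurrence_counts` are made up for the example:

```ruby
# Hypothetical user histories: user id => item ids for each action.
likes    = { 'u1' => %w[a b], 'u2' => %w[b],   'u3' => %w[a] }
dislikes = { 'u1' => %w[x],   'u2' => %w[x y], 'u3' => %w[y] }

# counts[primary_item][secondary_item] = number of users who performed the
# primary action on one item and the secondary action on the other.
def cooccurrence_counts(primary, secondary)
  counts = Hash.new { |hash, key| hash[key] = Hash.new(0) }
  primary.each do |user, primary_items|
    secondary_items = secondary.fetch(user, [])
    primary_items.uniq.each do |p|
      secondary_items.uniq.each do |s|
        # When comparing an action with itself, skip an item paired with itself.
        next if primary.equal?(secondary) && p == s

        counts[p][s] += 1
      end
    end
  end
  counts
end

indicator = cooccurrence_counts(likes, likes)           # like -> like
cross_indicator = cooccurrence_counts(likes, dislikes)  # like -> dislike

puts indicator.inspect
puts cross_indicator.inspect
```

In the cross-indicator, a row for a liked item lists the disliked items that tend to come from the same users, which is the kind of "dislikes that lead to specific likes" relationship described above.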

Apache Mahout modified abstract similarity .. To incorporate trust network .. Need suggestions

I have modified the AbstractSimilarity class / UserSimilarity method with the following:
Collection c = multiMap.get(user1);
if (c.contains(user2)) {
    result = result + 0.50;
}
I use the Epinions dataset, which has two files: one with userid, itemid, rating, and a user-user trust network, which is stored in the multimap above. The rating set is in the DataModel.
Finally: I would like to add a value to a user (e.g. +0.50) if he is in the trust network of the user who asks for the recommendations.
Would it be better to use two DataModels?
Thanks
You've hit upon a very interesting topic in recommenders: multi-modal or multi-action recommenders. They solve the problem of having several actions by the same users, and of how to recommend the primary action using all available data. For instance, how to recommend purchases with purchase AND page view data.
To use epinions is good intuition on your part. The problem is that there may be no correlation between trust and rating for an individual user. The general technique you use here is to correlate the two bits of data by using a multi-action indicator. Just adding a weight may have little or no effect and can, in your own real-world data, even produce a negative effect.
The snapshot Mahout 1.0 has a new spark-itemsimilarity CLI job (you can use it like a library too) that takes two actions and correlates the second to the first producing two "indicator" outputs. The primary action is the one you want to recommend, in this case recommending people that an individual might like. The secondary action may be anything but must have the user IDs in common, in epinions it's the trust action. The epinions data is actually what is used to test this technique.
Running both inputs through spark-itemsimilarity will produce an "indicator-matrix" and a "cross-indicator-matrix". These are the core of any "cooccurrence" recommender. If you want to learn more about this technique I'd suggest bringing it up on the Mahout mailing list: user@mahout.apache.org

What is the 'Rails Way' to implement a dynamic reporting system on data

Intro
I'm doing a system where I have a very simple layout only consisting of transactions (with basic CRUD). Each transaction has a date, a type, a debit amount (minus) and a credit amount (plus). Think of an online banking statement and that's pretty much it.
The issue I'm having is keeping my controller skinny and worrying about possibly over-querying the database.
A Simple Report Example
The total debit over the chosen period e.g. SUM(debit) as total_debit
The total credit over the chosen period e.g. SUM(credit) as total_credit
The overall total e.g. total_credit - total_debit
The report must allow a dynamic date range e.g. where(date BETWEEN 'x' and 'y')
The date range would never be more than a year and will only be a max of say 1000 transactions/rows at a time
So in the controller I create:
def report
  @d = Transaction.select("SUM(debit) as total_debit").where("date BETWEEN 'x' AND 'y'")
  @c = Transaction.select("SUM(credit) as total_credit").where("date BETWEEN 'x' AND 'y'")
  @t = @c.credit_total - @d.debit_total
end
Additional Question Info
My actual report has closer to 6 or 7 database queries (e.g. pulling out the total credit/debit as per type == 1 or type == 2 etc) and has many more calculations e.g totalling up certain credit/debit types and then adding and removing these totals off other totals.
I'm trying my best to adhere to 'skinny controller, fat model' but am having issues with the number of variables my controller needs to pass to the view. Rails has seemed very straightforward up until the point where you create variables to pass to the view. I don't see how else you do it apart from putting the variable-creating line into the controller and making it 'skinnier' by putting some query bits and pieces into the model.
Is there something I'm missing where you create variables in the model and then have the controller pass those to the view?
A more idiomatic way of writing your query in ActiveRecord would probably be something like:
class Transaction < ActiveRecord::Base
  def self.within(start_date, end_date)
    where(:date => start_date..end_date)
  end

  def self.total_credit
    sum(:credit)
  end

  def self.total_debit
    sum(:debit)
  end
end
This would mean issuing 3 queries in your controller, which should not be a big deal if you create database indices, and limit the number of transactions as well as the time range to a sensible amount:
@transactions = Transaction.within(start_date, end_date)
@total = @transactions.total_credit - @transactions.total_debit
Finally, you could also use Ruby's Enumerable#reduce method to compute your total by directly traversing the list of transactions retrieved from the database.
@total = @transactions.reduce(0) { |memo, t| memo + (t.credit - t.debit) }
For very small datasets this might result in faster performance, as you would hit the database only once. However, I reckon the first approach is preferable, and it will certainly deliver better performance when the number of records in your db starts to increase.
I'm putting in params[:year_start]/params[:year_end] for x and y, is that safe to do?
You should never embed params[:anything] directly in a query string. Instead use this form:
where("date BETWEEN ? AND ?", params[:year_start], params[:year_end])
My actual report probably has closer to 5 database calls and then 6 or 7 calculations on those variables, should I just be querying the date range once and then doing all the work on the array/hash etc?
This is a little subjective but I'll give you my opinion. Typically it's easier to scale the application layer than the database layer. Are you currently having performance issues with the database? If so, consider moving the logic to Ruby and adding more resources to your application server. If not, maybe it's too soon to worry about this.
I'm really not seeing how I would get the majority of the work/calculations into the model, I understand scopes but how would you put the date range into a scope and still utilise GET params?
Have you seen has_scope? This is a great gem that lets you define scopes in your models and have them automatically get applied to controller actions. I generally use this for filtering/searching, but it seems like you might have a good use case for it.
If you could give an example on creating an array via a broad database call and then doing various calculations on that array and then passing those variables to the template that would be awesome.
This is not a great fit for Stack Overflow and it's really not far from what you would be doing in a standard Rails application. I would read the Rails guide and a Ruby book and it won't be too hard to figure out.
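For what it's worth, the "load the rows once, then calculate in Ruby" idea from the question could look roughly like the sketch below. `TransactionReport` is a hypothetical plain Ruby class, and the `Transaction` struct here is only a stand-in for rows already loaded from the database (with about 1000 rows per report, as stated in the question, working in memory should be cheap):

```ruby
require 'date'

# Stand-in for Transaction records fetched once for the chosen date range.
Transaction = Struct.new(:date, :type, :debit, :credit)

# Hypothetical report object: holds the rows and exposes each figure as a
# method, so the controller passes one object to the view.
class TransactionReport
  def initialize(transactions)
    @transactions = transactions
  end

  # Total debit, optionally restricted to one transaction type.
  def total_debit(type: nil)
    of_type(type).sum(&:debit)
  end

  # Total credit, optionally restricted to one transaction type.
  def total_credit(type: nil)
    of_type(type).sum(&:credit)
  end

  def total
    total_credit - total_debit
  end

  private

  def of_type(type)
    type.nil? ? @transactions : @transactions.select { |t| t.type == type }
  end
end

rows = [
  Transaction.new(Date.new(2013, 1, 5), 1, 0, 200),
  Transaction.new(Date.new(2013, 2, 9), 2, 50, 0),
  Transaction.new(Date.new(2013, 3, 1), 1, 30, 0)
]

report = TransactionReport.new(rows)
puts report.total
puts report.total_debit(type: 1)
```

In a controller this would reduce to a single instance variable, for example `@report = TransactionReport.new(Transaction.within(start_date, end_date).to_a)`, with the view calling `@report.total` and friends.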

Resources