Rails 3 Select Distinct Order By Number of Occurrences

In one of my models I have a country column. How would I go about selecting the top 3 countries based on how many models have that country?

Without any further information, you can try this:
YourModel.group('country').order('count_country DESC').limit(3).count('country')
When you call count on a field, Rails automatically adds an AS count_field_name alias to the query, which is why the order clause can refer to count_country.
count must be called last because it runs the query and returns an ordered hash rather than a relation, so nothing more can be chained after it.
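To make the shape of that result concrete, here is a minimal plain-Ruby sketch of the same group, count, order, limit steps done in memory. It does not use ActiveRecord; the top_countries method and the sample data are hypothetical, and the tie-break by country name is an assumption added only to keep the output deterministic.

```ruby
# In-memory stand-in for GROUP BY country / COUNT / ORDER BY count DESC / LIMIT.
# `countries` is hypothetical sample data: one entry per row's country column.
def top_countries(countries, limit = 3)
  counts = Hash.new(0)
  countries.each { |country| counts[country] += 1 } # group and count

  # Highest count first; ties broken by name (an assumption, the SQL query
  # above leaves the order of ties up to the database).
  sorted = counts.sort_by { |country, count| [-count, country] }
  sorted.first(limit).to_h # same shape as the hash returned by count
end

SAMPLE_COUNTRIES = %w[FR US US DE FR US JP DE FR BR]
top_countries(SAMPLE_COUNTRIES).each do |country, count|
  puts "#{country}: #{count}"
end
```

The returned hash maps each country to its count, which is the same shape the grouped count call hands back.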

Related

Count total number of records in Rails join table

I have two models, users and departments, and a join table users_departments to enable a has_and_belongs_to_many association between them. I am using PostgreSQL as the database.
# users table columns
id
name
# departments table columns
id
name
# users_departments table columns
user_id
department_id
What is the best way in Rails for counting the total number of records in the users_departments table? Preferably without creating a new model class.
Please note that I do not want to count the records for a specific user or department (user.departments.count / departments.users.count), but the total number of records in the table, considering all users and departments.
The best way is to just create a model called UsersDepartment and do a nice and easy query on that.
count = UsersDepartment.count
However, you can also query the table directly with exec_query, which gives you an ActiveRecord::Result object to work with.
result = ActiveRecord::Base.connection.exec_query('select count(*) as count from users_departments')
count = result[0]['count']

Report using Rails ActiveRecord group by

I am trying to generate a report to screen of accounting transaction history. In most situations it is one display row per record in the AccountingTransaction table. But occasionally there are transactions that I wish to display to the end user as one transaction which are really, behind the scenes, two accounting transactions. This is caused by deferral of revenues and fund splitting since this app is a fund accounting app.
If I display all rows one by one, those double entries look odd to the user since the fund splitting and deferral is "behind the scenes". So I want to roll up all the related transactions into one display row on screen.
My query currently uses group to group the related transactions:
@history = AccountingTransaction.where("customer_id in (?) AND no_download <> 1", customers_in_account).group(:transaction_type_id, :reference_id).order(:created_at)
As I loop through, I get the transactions grouped as I want, but I am struggling with how to display the total sum of the 'credit' field for all records in the group (it only shows the credit for the first record of the group). If I add .sum(:credit) to my query it returns the sums just as I want, of course, but not all the other data.
Is there a way for me to group these records like in my @history query and also get the sum of the credit field for each respective group?
* Addition *
What I really want is what the following SQL query would give me.
SELECT transaction_type_id, reference_id, sum(credit)
FROM accounting_transactions
WHERE customer_id in (21,22,23,24) AND no_download <> 1
GROUP BY reference_id, transaction_type_id ORDER BY created_at
I'm not sure you can do "ORDER BY created_at" and not include it in the select fields, but here is an example.
@history = AccountingTransaction.
select([:reference_id, :transaction_type_id, :created_at]).
select(AccountingTransaction.arel_table[:credit].sum.as("credit_sum")).
where("customer_id in (?) AND no_download <> 1", customers_in_account).
group(:transaction_type_id, :reference_id).
order(:created_at)
To access the credit_sum you could do:
@history[0].attributes["credit_sum"]
If you'd like, you could wrap that in a method on the model:
def credit_sum
  attributes["credit_sum"]
end
EDIT
As stated in the comments, you can access the attribute directly:
@history[0].credit_sum
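For comparison, here is a plain-Ruby sketch of the roll-up the query is aiming for: one row per (transaction_type_id, reference_id) pair with the credits summed. The Txn struct, the roll_up method, and the sample rows are hypothetical stand-ins for the real records. Using the earliest created_at of each group for ordering is an assumption, since the SQL above does not say which created_at a group should sort by.

```ruby
# Hypothetical stand-in for AccountingTransaction rows (credits in whole units).
Txn = Struct.new(:transaction_type_id, :reference_id, :credit, :created_at)

# Groups rows by (transaction_type_id, reference_id) and sums credit per group.
def roll_up(transactions)
  groups = transactions.group_by { |t| [t.transaction_type_id, t.reference_id] }
  rows = groups.map do |(type_id, ref_id), members|
    {
      transaction_type_id: type_id,
      reference_id: ref_id,
      credit_sum: members.map(&:credit).inject(0, :+),
      # Assumption: a group is ordered by its earliest transaction.
      created_at: members.map(&:created_at).min
    }
  end
  rows.sort_by { |row| row[:created_at] }
end

SAMPLE_TXNS = [
  Txn.new(1, 100, 50, Time.utc(2013, 1, 2)),
  Txn.new(1, 100, 25, Time.utc(2013, 1, 3)), # same group as the row above
  Txn.new(2, 200, 10, Time.utc(2013, 1, 1))
]

roll_up(SAMPLE_TXNS).each do |row|
  puts "type #{row[:transaction_type_id]} ref #{row[:reference_id]}: #{row[:credit_sum]}"
end
```

Doing the sum in the database, as in the answer above, is usually preferable for large tables; the sketch only shows what each display row should contain.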

Change Data Capture with table joins in ETL

In my ETL process I am using Change Data Capture (CDC) to discover only the rows that have changed in the source tables since the last extraction. Then I do the transformation only for these rows. The problem arises when I have, for example, two tables that I want to join into one dimension, and only one of them has changed. For example, I have the tables Countries and Towns as follows:
Countries:
ID Name
1 France
Towns:
ID Name Country_ID
1 Lyon 1
Now let's say a new row is added to the Towns table:
ID Name Country_ID
1 Lyon 1
2 Paris 2
The Countries table has not changed, so CDC for these tables shows me only the new row from the Towns table. The problem is that when I join Countries and Towns, there is no row in the Countries change set, so the join results in an empty set.
Do you have an idea how to solve this? Of course there might be more difficult cases, involving three or more tables and consecutive joins.
This is a typical problem found when doing real-time Change Data Capture, or even incremental-only daily loads.
There are multiple ways to solve this.
One way would be to do your joins on the natural keys in the dimension or mapping table, to get the associated country (SELECT distinct country_name, [..other attributes..] from dim_table where country_id = X).
Another alternative would be to do the join as part of the change capture process: when a row is loaded to towns, a trigger goes off that loads the foreign key values into the associated staging tables (country, etc).
There is a lot more I could say, but I will stick to what is in your question. I would suggest the following to get the results:
1st Pass is where everything matches via the join...
Union All
2nd Pass Gets all towns where there isn't a country
(left outer join with a where condition that
requires the ID in the countries table to be null/missing).
In that unmatched join you would default the Country ID to a value designated as "unmatched". Typically 0 or -1 is used, or a series of standard negative numbers to which you can assign descriptions later to identify why the data is bad. In your example, -1 could mean "Found Town Without Country".
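The two passes can be sketched in plain Ruby as a single lookup of each changed town against the full Countries table (not the Countries change set), with a sentinel for towns that find no country. The method name, the hash layout, and the choice of -1 as the sentinel are assumptions for illustration; Lyon is included in the change set only to show a matched row.

```ruby
# Assumed sentinel meaning "Found Town Without Country".
UNMATCHED_COUNTRY_ID = -1

# changed_towns: rows reported by CDC; all_countries: the complete table.
def build_town_dimension(changed_towns, all_countries)
  countries_by_id = {}
  all_countries.each { |country| countries_by_id[country[:id]] = country }

  changed_towns.map do |town|
    country = countries_by_id[town[:country_id]] # nil when no country matches
    {
      town_id: town[:id],
      town_name: town[:name],
      # Pass 1 (match found) keeps the real ID, pass 2 (no match) gets the sentinel.
      country_id: country ? country[:id] : UNMATCHED_COUNTRY_ID,
      country_name: country ? country[:name] : nil
    }
  end
end

COUNTRIES = [{ id: 1, name: "France" }]
CHANGED_TOWNS = [
  { id: 1, name: "Lyon", country_id: 1 },
  { id: 2, name: "Paris", country_id: 2 } # no country with ID 2 exists
]

build_town_dimension(CHANGED_TOWNS, COUNTRIES).each do |row|
  puts "#{row[:town_name]} -> country #{row[:country_id]}"
end
```

Because the lookup runs against the complete Countries table, an unchanged country no longer makes the join come back empty.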

How do I get an array of unique values in my controller that is NOT connected to a model?

I have a table named Donations which has a column named season. Season contains the actual season the donation was made in... like 2011 or 2010, etc.
I also have a controller named ReportController that would like to pass a unique list of seasons from the Donations table.
In the ReportController, how do I get an array of those unique values? Is there something like @valid_seasons = Donations.find(:all).unique{|x| x.season} that I can use in my ReportController? Will I then be able to pass @valid_seasons as the options for a select element in the views/report/foo.html.erb file?
You can use uniq_by
Donations.all.uniq_by{|x| x.season}
However this still executes a select * on your table.
You might be better off with using raw sql. Something like:
Donations.find_by_sql("SELECT * FROM donations GROUP BY season")
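The first example retrieves all the records and then filters them in Ruby. The second fetches only one row per unique season; which row you get is up to the database, and stricter databases such as PostgreSQL reject SELECT * combined with GROUP BY season.
You don't mention whether this is Rails 3 but, if so, this should do the trick: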
Donations.select(:season).group(:season)
This will execute a proper group by:
SELECT season FROM "donations" GROUP BY season
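Note that the grouped query still returns model objects rather than bare values, so the controller needs one more step to end up with a plain array. A minimal plain-Ruby sketch of that step follows; the Donation struct and the valid_seasons method are hypothetical stand-ins, and the newest-first ordering is an assumption about what a select list would want.

```ruby
# Hypothetical stand-in for rows of the donations table.
Donation = Struct.new(:season, :amount)

# Returns the distinct seasons as plain values, newest first.
def valid_seasons(donations)
  donations.map(&:season).uniq.sort.reverse
end

SAMPLE_DONATIONS = [
  Donation.new(2010, 25),
  Donation.new(2011, 40),
  Donation.new(2010, 10),
  Donation.new(2009, 5)
]

puts valid_seasons(SAMPLE_DONATIONS).join(", ")
```

With the Rails query above, the equivalent step would be calling map(&:season) on the result, and the resulting array can then be handed to a view helper such as options_for_select.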

SQL Syntax Challenge

I have two tables, one containing a list of different options users can select from. For example:
tbl_options
id_option
option
The next table I use to store which of these options the user selects. For example:
tbl_selected
id_selected
id_option
id_user
I use PHP to loop through the tbl_options table to generate a full list of checkboxes that the user can select from. When a user selects an option, the id_option and id_user are stored in the tbl_selected table. When a user deselects an option, the id_selected record is deleted from the tbl_selected table.
The challenge I am having is the best way to retrieve the full list of options in tbl_options, plus having the query indicate the associated records stored in the tbl_selected table.
I've tried LEFT JOIN'ing tbl_options to tbl_selected which provides me with the full list of options, but as soon as I add the WHERE id_user = ### the query only returns those records with values in tbl_selected. Ideally, I would like to see the results from a query as follows:
id_option option id_user
1 Apples 3
2 Oranges 3
3 Bananas
4 Pears
5 Peaches 3
This would indicate that user #3 has stored Apples, Oranges and Peaches. This also indicates that user #3 has not selected Bananas or Pears.
Is this possible using a SQL statement or should I pursue a different technique?
Your problem is that the user-restriction is applied to the whole query. To apply it only to the Join condition you need to add it to the ON clause like this:
select o.id_option, o.[option], s.id_user
from tbl_options o
left outer join tbl_selected s
on o.id_option = s.id_option and s.id_user = 3
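The difference between filtering in the ON clause and filtering in WHERE can be sketched in plain Ruby with a small in-memory left join. The left_join method, the sample rows, and the extra selection by user 7 are all hypothetical and only serve to show which rows survive in each case.

```ruby
# Minimal in-memory LEFT OUTER JOIN: every option is kept, and an option with
# no selection satisfying the join condition (the block) gets a nil id_user.
def left_join(options, selections)
  options.flat_map do |o|
    matches = selections.select { |s| yield(o, s) }
    matches = [nil] if matches.empty?
    matches.map do |s|
      { id_option: o[:id_option], option: o[:option], id_user: s && s[:id_user] }
    end
  end
end

OPTIONS = [
  { id_option: 1, option: "Apples" }, { id_option: 2, option: "Oranges" },
  { id_option: 3, option: "Bananas" }, { id_option: 4, option: "Pears" },
  { id_option: 5, option: "Peaches" }
]
SELECTED = [
  { id_option: 1, id_user: 3 }, { id_option: 2, id_user: 3 },
  { id_option: 5, id_user: 3 }, { id_option: 3, id_user: 7 } # another user
]

# User filter inside the join condition (ON ... AND s.id_user = 3).
def filter_in_on(user_id)
  left_join(OPTIONS, SELECTED) do |o, s|
    o[:id_option] == s[:id_option] && s[:id_user] == user_id
  end
end

# User filter applied after the join (WHERE id_user = 3).
def filter_in_where(user_id)
  joined = left_join(OPTIONS, SELECTED) { |o, s| o[:id_option] == s[:id_option] }
  joined.select { |row| row[:id_user] == user_id }
end

puts "ON:    #{filter_in_on(3).size} rows"
puts "WHERE: #{filter_in_where(3).size} rows"
```

Filtering in the join condition keeps all five options and leaves id_user empty for Bananas and Pears, while filtering afterwards discards every row that user 3 did not select.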
