Rails 4 bulk updating array of models - ruby-on-rails

I have an array of ActiveRecord models and I want to renumber one column and bulk update them.
Code looks like this:
rules = subject.email_rules.order(:number)
rules.each_with_index do |rule, index|
rule.number = index + 1
end
EmailRule.update(rules.map(&:id), rules.map { |r| { number: r.number } })
But this issues N SQL statements and I would like just one. Is there a way to do it?

Assuming you are using Postgres, you can use row_number() and the somewhat strange-looking UPDATE ... FROM construct. This is the basic version:
UPDATE email_rules target
SET number = src.idx
FROM (
SELECT
email_rules.id,
row_number() OVER () as idx
FROM email_rules
) src
WHERE src.id = target.id
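You will probably need to scope this to a subject and number the rows in order of the existing number column. The ordering has to go inside the window definition, because an ORDER BY on the subquery does not determine how row_number() assigns its values. That could look like this:
UPDATE email_rules target
SET number = src.idx
FROM (
SELECT
email_rules.id,
row_number() OVER (PARTITION BY subject_id ORDER BY number ASC) as idx
FROM email_rules
) src
WHERE src.id = target.id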
(assuming subject_id is the foreign key that associates subjects/email_rules)
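If the new order is already known in Ruby (for example, the ids sorted by the existing number), another way to get a single statement is to send the id/position pairs as a VALUES list. This is only a sketch: renumber_sql is a hypothetical helper, not part of Rails, and it assumes Postgres and integer primary keys.

```ruby
# Hypothetical helper (not a Rails API): builds ONE UPDATE statement that
# sets number to the 1-based position of each id in the given array.
# Assumes Postgres syntax and integer ids.
def renumber_sql(table, ids)
  ids = ids.map { |id| Integer(id) } # raises on non-integers, keeps the SQL safe
  return nil if ids.empty?

  values = ids.each_with_index.map { |id, index| "(#{id}, #{index + 1})" }.join(", ")

  "UPDATE #{table} AS target SET number = src.idx " \
    "FROM (VALUES #{values}) AS src(id, idx) " \
    "WHERE src.id = target.id"
end

puts renumber_sql("email_rules", [42, 7, 19])
```

In a Rails app the resulting string could be passed to EmailRule.connection.execute, which sends it as a single statement.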

One alternative is to wrap all the updates in a transaction. It still issues N statements, but there is only a single commit at the end, which makes it considerably faster.
ActiveRecord::Base.transaction do
...
end

Related

How to combine 3 SQL request into one and order it Rails

I'm creating filter for my Point model on Ruby on Rails app. App uses ActiveAdmin+Ransacker for filters. I wrote 3 methods to filter the Point:
def self.filter_by_customer_bonus(bonus_id)
Point.joins(:customer).where('customers.bonus_id = ?', bonus_id)
end
def self.filter_by_classificator_bonus(bonus_id)
Point.joins(:setting).where('settings.bonus_id = ?', bonus_id)
end
def self.filter_by_bonus(bonus_id)
Point.where(bonus_id: bonus_id)
end
Everything works fine, but I need to merge the results of the 3 methods into one array. When Point.count is large (more than 1,000,000 on the production server, for example) this is too slow, so I need to combine them into one method. The problem is that I need to order the final merged array this way:
The result array should start with the results of the first method, followed by the results of the second method, and then the third.
Is it possible to combine these 3 queries into 1 to make it faster, and keep the ordering I described?
For example my Points are [1,2,3,4,5,6,7,8,9,10]
Result of first = [1,2,3]
Result of second = [2,3,4]
Result of third = [5,6,7]
After the merge I should get [1,2,3,4,5,6,7], but it should come from a single method, not 3 methods plus a merge. Hope you understand me :)
UPDATE:
The result of the first answer:
Point Load (8.0ms) SELECT "points".* FROM "points" INNER JOIN "customers" ON "customers"."number" = "points"."customer_number" INNER JOIN "managers" ON "managers"."code" = "points"."tp" INNER JOIN "settings" ON "settings"."classificator_id" = "managers"."classificator_id" WHERE "points"."bonus_id" = $1 AND "customers"."bonus_id" = $2 AND "settings"."bonus_id" = $3 [["bonus_id", 2], ["bonus_id", 2], ["bonus_id", 2]]
It returns an empty array.
You can union these using or (documentation):
def self.filter_trifecta(bonus_id)
(
filter_by_customer_bonus(bonus_id)
).or(
filter_by_classificator_bonus(bonus_id)
).or(
filter_by_bonus(bonus_id)
)
end
Note: you might have to hoist those joins up to the first condition. I'm not sure if or will handle those forks well as-is.
The query below gives you all the results in a single query. If you have indexes on the foreign keys used here, it should be able to handle a million records.
The one provided earlier does an AND on all 3 conditions, which is why you got zero results. You need a UNION; the below should work. (Note: if you are using Rails 5, there is Active Record syntax for this, which the first answer provided.)
Updated:
Point.from(
"(#{Point.joins(:customer).where(customers: {bonus_id: bonus_id}).to_sql}
UNION
#{Point.joins(:setting).where(settings: {bonus_id: bonus_id}).to_sql}
UNION
#{Point.where(bonus_id: bonus_id).to_sql})
AS points")
Instead you can also use your 3 methods like below:
Point.from("(#{Point.filter_by_customer_bonus(bonus_id).to_sql}
UNION
#{Point.filter_by_classificator_bonus(bonus_id).to_sql}
UNION
#{Point.filter_by_bonus(bonus_id).to_sql}
) as points")
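To check that a single-query version returns what the question asks for, it can help to have a plain-Ruby reference for the expected merge. The sketch below uses the example ids from the question; merge_in_priority is a hypothetical name. Keep in mind that SQL UNION removes duplicates but does not by itself guarantee row order, so the ordering requirement still has to be handled explicitly.

```ruby
# Hypothetical reference implementation of the desired merge:
# results are concatenated in priority order and each element is kept
# only at its first occurrence.
def merge_in_priority(*results)
  results.flatten(1).uniq
end

first  = [1, 2, 3] # stand-in for filter_by_customer_bonus
second = [2, 3, 4] # stand-in for filter_by_classificator_bonus
third  = [5, 6, 7] # stand-in for filter_by_bonus

p merge_in_priority(first, second, third) # => [1, 2, 3, 4, 5, 6, 7]
```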

Getting Conditional Count in Join with Laravel Query Builder

I am trying to achieve the following with Laravel Query builder.
I have a table called deals. Below is the basic schema:
id
deal_id
merchant_id
status
deal_text
timestamps
I also have another table called merchants whose schema is
id
merchant_id
merchant_name
about
timestamps
Currently I am getting deals using the following query
$deals = DB::table('deals')
-> join ('merchants', 'deals.merchant_id', '=', 'merchants.merchant_id')
-> where ('merchant_url_text', $merchant_url_text)
-> get();
Since only 1 merchant is associated with a deal, I am getting deals and related merchant info with the query.
Now I have a 3rd table called tbl_deal_votes. Its schema looks like
id
deal_id
vote (1 if voted up, 0 if voted down)
timestamps
What I want to do is join this 3rd table (on deal_id) to my existing query and be able to also get the upvotes and down votes each deal has received.
To do this in a single query you'll probably need to use SQL subqueries, which don't seem to have good fluent query support in Laravel 4/5. Since you're not using Eloquent objects, the raw SQL is probably easiest to read. (Note that the example below ignores your deals.deal_id and merchants.merchant_id columns, which can likely be dropped. Instead it just uses your deals.id and merchants.id fields by convention.)
$deals = DB::select(
DB::raw('
SELECT
deals.id AS deal_id,
deals.status,
deals.deal_text,
merchants.id AS merchant_id,
merchants.merchant_name,
merchants.about,
COALESCE(tbl_upvotes.upvotes_count, 0) AS upvotes_count,
COALESCE(tbl_downvotes.downvotes_count, 0) AS downvotes_count
FROM
deals
JOIN merchants ON (merchants.id = deals.merchant_id)
LEFT JOIN (
SELECT deal_id, count(*) AS upvotes_count
FROM tbl_deal_votes
WHERE vote = 1
GROUP BY deal_id
) tbl_upvotes ON (tbl_upvotes.deal_id = deals.id)
LEFT JOIN (
SELECT deal_id, count(*) AS downvotes_count
FROM tbl_deal_votes
WHERE vote = 0
GROUP BY deal_id
) tbl_downvotes ON (tbl_downvotes.deal_id = deals.id)
')
);
If you'd prefer to use fluent, this should work:
$upvotes_subquery = '
SELECT deal_id, count(*) AS upvotes_count
FROM tbl_deal_votes
WHERE vote = 1
GROUP BY deal_id';
$downvotes_subquery = '
SELECT deal_id, count(*) AS downvotes_count
FROM tbl_deal_votes
WHERE vote = 0
GROUP BY deal_id';
$deals = DB::table('deals')
->select([
DB::raw('deals.id AS deal_id'),
'deals.status',
'deals.deal_text',
DB::raw('merchants.id AS merchant_id'),
'merchants.merchant_name',
'merchants.about',
DB::raw('COALESCE(tbl_upvotes.upvotes_count, 0) AS upvotes_count'),
DB::raw('COALESCE(tbl_downvotes.downvotes_count, 0) AS downvotes_count')
])
->join('merchants', 'merchants.id', '=', 'deals.merchant_id')
->leftJoin(DB::raw('(' . $upvotes_subquery . ') tbl_upvotes'), function($join) {
$join->on('tbl_upvotes.deal_id', '=', 'deals.id');
})
->leftJoin(DB::raw('(' . $downvotes_subquery . ') tbl_downvotes'), function($join) {
$join->on('tbl_downvotes.deal_id', '=', 'deals.id');
})
->get();
A few notes about the fluent query:
Used the DB::raw() method to rename a few selected columns.
Otherwise, there would have been a conflict between deals.id
and merchants.id in the results.
Used COALESCE to default null votes to 0.
Split the subqueries into separate PHP strings to improve readability.
Used left joins for the subqueries so deals with no upvotes/downvotes still show up.
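The effect of the left joins plus COALESCE can be illustrated outside SQL. The following Ruby sketch works on in-memory hashes with made-up sample data (vote_counts is a hypothetical helper): every deal gets a row with both counts, and deals without votes end up with zeros rather than being dropped.

```ruby
# Hypothetical in-memory equivalent of the two grouped subqueries.
# votes: array of { deal_id:, vote: } where vote is 1 (up) or 0 (down).
def vote_counts(deal_ids, votes)
  # Start every deal at zero, like LEFT JOIN + COALESCE(..., 0).
  counts = deal_ids.map { |id| [id, { upvotes_count: 0, downvotes_count: 0 }] }.to_h

  votes.each do |v|
    row = counts[v[:deal_id]]
    next unless row # vote for a deal that is not in the result set

    key = v[:vote] == 1 ? :upvotes_count : :downvotes_count
    row[key] += 1
  end

  counts
end

votes = [
  { deal_id: 1, vote: 1 },
  { deal_id: 1, vote: 1 },
  { deal_id: 1, vote: 0 },
  { deal_id: 2, vote: 0 }
]

p vote_counts([1, 2, 3], votes)
```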

How to convert SQL statement "delete from TABLE where someID not in (select someID from Table group by property1, property2)

I'm trying to convert the following SQL statement to Core Data:
delete from SomeTable
where someID not in (
select someID
from SomeTable
group by property1, property2, property3
)
Basically, I want to retrieve and delete possible duplicates in a table where a record is deemed a duplicate if property1, property2 and property3 are equal to another record.
How can I do that?
PS: As the title says, I'm trying to convert the above SQL statement into iOS Core Data methods, not trying to improve, correct or comment on the above SQL, that is beyond the point.
Thank you.
It sounds like you are asking for SQL to accomplish your objective. Your starting query won't do what you describe, and most databases wouldn't accept it at all on account of the aggregate subquery attempting to select a column that is not a function of the groups.
UPDATE
I had initially thought the request was to delete all members of each group containing dupes, and wrote code accordingly. Having reinterpreted the original SQL as MySQL would do, it seems the objective is to retain exactly one element for each combination of (property1, property2, property3). I guess that makes more sense anyway. Here is a standard way to do that:
delete from SomeTable st1
where someID not in (
select min(st2.someId)
from SomeTable st2
group by property1, property2, property3
)
That's distinguished from the original by use of the min() aggregate function to choose a specific one of the someId values to retain from each group. This should work, too:
delete from SomeTable st1
where someID in (
select st3.someId
from SomeTable st2
join SomeTable st3
on st2.property1 = st3.property1
and st2.property2 = st3.property2
and st2.property3 = st3.property3
where st2.someId < st3.someId
)
These two queries will retain the same rows. I like the second better, even though it's longer, because the NOT IN operator is kinda nasty for choosing a small number of elements from a large set. If you anticipate having enough rows to be concerned about scaling, though, then you should try both, and perhaps look into optimizations (for example, an index on (property1, property2, property3)) and other alternatives.
As for writing it in terms of Core Data calls, however, I don't think you exactly can. Core Data does support grouping, so you could write Core Data calls that perform the subquery in the first alternative and return you the entity objects or their IDs, grouped as described. You could then iterate over the groups, skip the first element of each, and call Core Data deletion methods for all the rest. The details are out of scope for the SO format.
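The group-then-skip-the-first logic described above is independent of Core Data and can be sketched in a few lines. The Ruby below is only an illustration on in-memory hashes; duplicate_ids and the key names are hypothetical, and it keeps the lowest someId of each group, matching the min() query.

```ruby
# Hypothetical helper: returns the ids that would be deleted, i.e. every
# row except the one with the lowest some_id in each
# (property1, property2, property3) group.
def duplicate_ids(rows)
  rows
    .group_by { |r| [r[:property1], r[:property2], r[:property3]] }
    .values
    .flat_map { |group| group.map { |r| r[:some_id] }.sort.drop(1) }
end

rows = [
  { some_id: 1, property1: "a", property2: "b", property3: "c" },
  { some_id: 2, property1: "a", property2: "b", property3: "c" },
  { some_id: 3, property1: "x", property2: "y", property3: "z" },
  { some_id: 4, property1: "a", property2: "b", property3: "c" }
]

p duplicate_ids(rows) # => [2, 4]
```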
I have to say, though, that doing such a job in Core Data is going to be far more costly than doing it directly in the database, both in time and in required memory. Doing it directly in the database is not friendly to an ORM framework such as Core Data, however. This sort of thing is one of the tradeoffs you've chosen by going with an ORM framework.
I'd recommend that you try to avoid the need to do this at all. Define a unique index on SomeTable(property1, property2, property3) and do whatever you need to do to avoid trying to create duplicates or to gracefully recover from a (failed) attempt to do so.
DELETE SomeTable
FROM SomeTable
LEFT OUTER JOIN (
SELECT MIN(RowId) as RowId, property1, property2, property3
FROM SomeTable
GROUP BY property1, property2, property3
) as KeepRows ON
SomeTable.RowId = KeepRows.RowId
WHERE
KeepRows.RowId IS NULL
A few pointers for doing this in iOS: before iOS 9 the only way to delete objects is individually, i.e. you will need to iterate through an array of duplicates and delete each one. (If you are targeting iOS 9, there is a new NSBatchDeleteRequest which will help delete them all in one go. It acts directly on the store but also does some cleanup, e.g. to ensure relationships are updated where necessary.)
The other problem is identifying the duplicates. You can configure a fetch to group its results (see the propertiesToGroupBy of NSFetchRequest), but you will have to specify NSDictionaryResultType (so the results are NOT the objects themselves, just the values from the relevant properties.) Furthermore, CoreData will not let you fetch properties (other than aggregates) that are not specified in the GROUP BY. So the suggestion (in the other answer) to use min(someId) will be necessary. (To fetch an expression such as this, you will need to use an NSExpression, embed it in an NSExpressionDescription and pass the latter in propertiesToFetch of the fetch request).
The end result will be an array of dictionaries, each holding the someId value of your prime records (ie the ones you don't want to delete), from which you have then got to work out the duplicates. There are various ways, but none will be very efficient.
So as the other answer says, duplicates are better avoided in the first place. On that front, note that iOS 9 allows you to specify attributes that you would like to be unique (individually or collectively).
Let me know if you would like me to elaborate on any of the above.
Group-wise Maximum:
select t1.someId
from SomeTable t1
left outer join SomeTable t2
on t1.property1 = t2.property1
and t1.property2 = t2.property2
and t1.property3 = t2.property3
and t1.someId < t2.someId
where t2.someId is null;
So, this could be the answer
delete SomeTable
where someId not in
(select t1.someId
from SomeTable t1
left outer join SomeTable t2
on t1.property1 = t2.property1
and t1.property2 = t2.property2
and t1.property3 = t2.property3
and t1.someId < t2.someId
where t2.someId is null);
Sqlfiddle demo
You can use exists function to check for each row if there is another row that exists whose id is not equal to the current row and all other properties that define the duplicate criteria of each row are equal to all the properties of the current row.
delete from sometable
where
id in (SELECT
sm.id
FROM
sometable sm
where
exists( select
1
from
sometable sm2
where
sm.prop1 = sm2.prop1
and sm.prop2 = sm2.prop2
and sm.prop3 = sm2.prop3
and sm.id != sm2.id)
);
I think you could easily handle this by creating a derived duplicate_flg column and set it to 1 when all three property values are equal. Once that is done, you could just delete those records where duplicate_flg = 1. Here is a sample query on how to do this:
-- retrieve all records that have the same property values (property1, property2 and property3)
SELECT *
FROM (
SELECT someid
,property1
,property2
,property3
,CASE
WHEN property1 = property2
AND property1 = property3
THEN 1
ELSE 0
END AS duplicate_flg
FROM SomeTable
) q1
WHERE q1.duplicate_flg = 1;
Here is a sample delete statement:
DELETE
FROM SomeTable
WHERE someid IN (
SELECT someid
FROM (
SELECT someid
,property1
,property2
,property3
,CASE
WHEN property1 = property2
AND property1 = property3
THEN 1
ELSE 0
END AS duplicate_flg
FROM SomeTable
) q1
WHERE q1.duplicate_flg = 1
);
If you simply want to remove duplicates from the table, you can execute the query below:
delete from SomeTable
where rowid not in (
select max(rowid)
from SomeTable
group by property1, property2, property3
)
If you want to delete all duplicate records, try the code below:
WITH tblTemp as
(
SELECT ROW_NUMBER() Over(PARTITION BY Property1,Property2,Property3 ORDER BY Property1) As RowNumber,* FROM Table_1
)
DELETE FROM tblTemp where RowNumber >1
Hope it helps
Use the query below to delete the duplicate data from that table:
delete from SomeTable where someID not in
(select Min(someID) from SomeTable
group by property1, property2, property3)

How to build inner join in Rails with conditions?

I've a model StockUpdate which keeps track of stocks for every product for a store. Table attributes are: :product_id, :stock, :store_id. I was trying to find out last entry for every product for a given store. According to that I build my query in PGAdmin which is given below and it's working fine. I'm new in Rails and I don't know how to represent it in Model. Please help.
SELECT a.*
FROM stock_updates a
INNER JOIN
(
SELECT product_id, MAX(id) max_id
FROM stock_updates where store_id = 9 and stock > 0
GROUP BY product_id
) b ON a.product_id = b.product_id AND
a.id = b.max_id
I don't clearly understand what you want to do, but I think you can do something like this:
class StockUpdate < ActiveRecord::Base
scope :a_good_name, -> { joins(:product).where('store_id = ? and stock > ?', 9, 0) }
end
You can also call StockUpdate.a_good_name.explain to check the generated SQL.
What you need is really simple and can be easily accomplished with 2 queries. Otherwise it becomes very complicated in a single query (it's still doable though):
store_ids = [0, 9]
latest_stock_update_ids = StockUpdate.
where(store_id: store_ids).
group(:product_id).
maximum(:id).
values
StockUpdate.where(id: latest_stock_update_ids)
Two queries, without any joins necessary. The same could be possible with a single query too. But like your original code, it would include subqueries.
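What the first of the two queries computes can be shown on plain data. The sketch below is not ActiveRecord code: it uses in-memory hashes with invented sample rows, and latest_ids_per_product is a hypothetical helper that mirrors where(store_id: ...).group(:product_id).maximum(:id).values.

```ruby
# Hypothetical in-memory equivalent of the grouped maximum(:id) query:
# for the given stores, return the highest id per product.
def latest_ids_per_product(rows, store_ids)
  rows
    .select { |r| store_ids.include?(r[:store_id]) }
    .group_by { |r| r[:product_id] }
    .map { |_product_id, group| group.map { |r| r[:id] }.max }
end

rows = [
  { id: 1, product_id: 10, store_id: 9, stock: 5 },
  { id: 2, product_id: 10, store_id: 9, stock: 3 },
  { id: 3, product_id: 11, store_id: 9, stock: 7 },
  { id: 4, product_id: 10, store_id: 4, stock: 1 }
]

p latest_ids_per_product(rows, [9]) # => [2, 3]
```

Row 4 has the highest id for product 10 overall, but it belongs to another store, so it is filtered out before grouping.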
Something like this should work:
StockUpdate.
where(store_id: store_ids).
where("stock_updates.id = (
SELECT MAX(su.id) FROM stock_updates AS su WHERE (
su.product_id = stock_updates.product_id
)
)
")
Or perhaps:
StockUpdate.where("id IN (
SELECT MAX(su.id) FROM stock_updates AS su GROUP BY su.product_id
)")
And to answer your original question, you can manually specify a joins like so:
Model1.joins("INNER JOIN #{Model2.table_name} ON #{conditions}")
# That INNER JOIN can also be LEFT OUTER JOIN, etc.

PGError: ERROR: aggregates not allowed in WHERE clause on a AR query of an object and its has_many objects

Running the following query on a has_many association. Recommendations has_many Approvals.
I am running Rails 3 and PostgreSQL:
Recommendation.joins(:approvals).where('approvals.count = ?
AND recommendations.user_id = ?', 1, current_user.id)
This is returning the following error: https://gist.github.com/1541569
The error message tells you:
aggregates not allowed in WHERE clause
count() is an aggregate function. Use the HAVING clause for that.
The query could look like this:
SELECT r.*
FROM recommendations r
JOIN approvals a ON a.recommendation_id = r.id
WHERE r.user_id = $current_user_id
GROUP BY r.id
HAVING count(a.recommendation_id) = 1
With PostgreSQL 9.1 or later it is enough to GROUP BY the primary key of a table (presuming recommendations.id is the PK). In Postgres versions before 9.1 you had to include all columns of the SELECT list that are not aggregated in the GROUP BY list. With recommendations.* in the SELECT list, that would be every single column of the table.
I quote the release notes of PostgreSQL 9.1:
Allow non-GROUP BY columns in the query target list when the primary
key is specified in the GROUP BY clause (Peter Eisentraut)
Simpler with a sub-select
Either way, this is simpler and faster, doing the same:
SELECT *
FROM recommendations r
WHERE user_id = $current_user_id
AND (SELECT count(*)
FROM approvals
WHERE recommendation_id = r.id) = 1;
Avoid multiplying rows with a JOIN a priori, then you don't have to aggregate them back.
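The condition both queries express is "recommendations of this user that have exactly one approval". As a rough illustration of that logic on in-memory data (with_exactly_one_approval and the sample rows are hypothetical, not part of the original code), counting approvals first and filtering afterwards looks like this:

```ruby
# Hypothetical in-memory version of the HAVING count(...) = 1 filter.
def with_exactly_one_approval(recommendations, approvals, user_id)
  # Count approvals per recommendation (the aggregate step).
  counts = Hash.new(0)
  approvals.each { |a| counts[a[:recommendation_id]] += 1 }

  # Filter on the aggregated count (the HAVING step) and on the user.
  recommendations.select { |r| r[:user_id] == user_id && counts[r[:id]] == 1 }
end

recommendations = [
  { id: 1, user_id: 1 },
  { id: 2, user_id: 1 },
  { id: 3, user_id: 2 }
]
approvals = [
  { recommendation_id: 1 },
  { recommendation_id: 2 },
  { recommendation_id: 2 },
  { recommendation_id: 3 }
]

p with_exactly_one_approval(recommendations, approvals, 1).map { |r| r[:id] } # => [1]
```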
Looks like you have a column named count and PostgreSQL is interpreting that column name as the count aggregate function. Your SQL ends up like this:
SELECT "recommendations".*
FROM "recommendations"
INNER JOIN "approvals" ON "approvals"."recommendation_id" = "recommendations"."id"
WHERE (approvals.count = 1 AND recommendations.user_id = 1)
The error message specifically points at the approvals.count:
LINE 1: ...ecommendation_id" = "recommendations"."id" WHERE (approvals....
^
I can't reproduce that error in my PostgreSQL (9.0) but maybe you're using a different version. Try double quoting that column name in your where:
Recommendation.joins(:approvals).where('approvals."count" = ? AND recommendations.user_id = ?', 1, current_user.id)
If that sorts things out then I'd recommend renaming your approvals.count column to something else so that you don't have to worry about it anymore.

Resources