Rails transaction isn't acting atomic?


I wrote an accounting system that follows standard double-entry accounting practices.
Double-entry accounting has a feature called the 'trial balance', which lets you verify that the entire system is correct: when you run it, it should always equal 0.00.
I have written tests and always run my trial balance when the system is 'stopped', but under write-heavy database load, while seeding lots of records, I noticed my trial balance is WRONG about 1 out of 10 tries.
When it's at rest (no inserts), however, it's always correct at 0.00.
When I insert transactions they're always in a transaction, like this:
2000.times do |i|
  ActiveRecord::Base.transaction do
    puts "#{i} ==================================================================="
    entry = JournalEntry.create!(description: 'Purchase mower on credit', user: user)
    entry.line_items.create!(amount: Money.from_amount(1551.75).cents, account: property.accounts.find_by(name: 'Equipment'), side: :debit)
    entry.line_items.create!(amount: Money.from_amount(1551.75).cents, account: property.accounts.find_by(name: 'Accounts Payable'), side: :credit)
  end
end
The fact it breaks under load makes me think I'm not understanding something vital about how Rails transactions work......
What could be causing this?
FWIW my trial balance function (GeneralLedger.new(property).trial_balance) executes the following pseudo-SQL (NOT in a transaction):
SELECT sum(...) WHERE account = 'asset'
SELECT sum(...) WHERE account = 'liability'
SELECT sum(...) WHERE account = 'equity'
SELECT sum(...) WHERE account = 'income'
SELECT sum(...) WHERE account = 'expense'
I then add them together according to the Accounting Formula to arrive at 0.00:
def trial_balance
  balance_category(:asset) - (balance_category(:liability) + balance_category(:equity) + balance_category(:income) - balance_category(:expense))
end
The balance_category function is what triggers each SELECT for a total of 5 times, once for each category.
Because it sometimes returns something other than 0, it must somehow be selecting while an entry is only halfway inserted. I have no idea how this is happening.
I could understand if the creation of the journal entry/line item was not in a transaction and it was SELECTing halfway inserted rows, but it should only select from the entire group as a whole after the transaction ends?

If you want to avoid repeated statements, collapse them into one, something of this form:
SELECT account, SUM(...) AS amount FROM ...
WHERE account IN ('asset', 'liability', ...)
GROUP BY account
You can fetch these like this:
where(account: ACCOUNT_TYPES).group(:account).pluck('account, SUM(...)')
Where ACCOUNT_TYPES is an array of the account types you need to fetch.
You can always take this pluck form and convert to a quick look-up hash with .to_h then use it like this:
balance_category = ...where(...)...pluck(...).to_h
balance_category[:asset]
If you need a default value, consider:
balance_category = Hash.new(0).merge(...where(...)...pluck(...).to_h)
Where that default can be integer (0) or a float (0.0) or anything at all.
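As a rough sketch of how the pieces above fit together, the snippet below takes rows shaped like the [account, sum] pairs that pluck returns and feeds them into the asker's trial balance formula. The sample rows and the helper names (balances_from, trial_balance_from) are made up for illustration. The keys are symbolized first, on the assumption that pluck hands back the account type as a string; otherwise a lookup such as balance_category[:asset] would just hit the default. A single grouped statement should also read all five sums from one snapshot on databases that take a snapshot per statement (an assumption about the asker's setup), which five separate SELECTs under concurrent inserts do not.

```ruby
# Hypothetical rows, shaped like the result of
# ...group(:account).pluck('account, SUM(...)'); amounts are in cents.
rows = [
  ["asset",     155_175],
  ["liability", 155_175]
]

# Build a lookup hash with a default of 0 for categories that have no rows.
# Keys are symbolized so that balances[:asset] works.
def balances_from(rows)
  Hash.new(0).merge(rows.to_h { |account, sum| [account.to_sym, sum] })
end

# The accounting formula from the question, applied to one set of sums.
def trial_balance_from(balances)
  balances[:asset] - (balances[:liability] + balances[:equity] +
                      balances[:income] - balances[:expense])
end

balances = balances_from(rows)
puts trial_balance_from(balances)
```

Categories with no rows (equity, income and expense in this sample) fall back to the default of 0, so the formula still works on a sparse result.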

Related

I need advice in speeding up this rails method that involves many queries

I'm trying to display a table that counts webhooks and arranges the various counts into cells by date_sent, sending_ip, and esp (email service provider). Within each cell, the controller needs to count the webhooks that are labelled with the "opened" event, and the "sent" event. Our database currently includes several million webhooks, and adds at least 100k per day. Already this process takes so long that running this index method is practically useless.
I was hoping that Rails could break down the enormous model into smaller lists using a line like this:
@today_hooks = @m_webhooks.where(:date_sent => this_date)
I thought that the queries after this line would only look at the partial list, instead of the full model. Unfortunately, running this index method generates hundreds of SQL statements, and they all look like this:
SELECT COUNT(*) FROM "m_webhooks" WHERE "m_webhooks"."date_sent" = $1 AND "m_webhooks"."sending_ip" = $2 AND (m_webhooks.esp LIKE 'hotmail') AND (m_webhooks.event LIKE 'sent')
It appears that the "date_sent" condition is included in every one of these queries, which implies that each query searches through all 1M records again instead of a pre-filtered list.
I've read over a dozen articles about increasing performance in Rails queries, but none of the tips that I've found there have reduced the time it takes to complete this method. Thank you in advance for any insight.
m_webhooks.controller.rb
def index
  def set_sub_count_hash(thip)
    {
      gmail_hooks: {opened: a = thip.gmail.send(@event).size, total_sent: b = thip.gmail.sent.size, perc_opened: find_perc(a, b)},
      hotmail_hooks: {opened: a = thip.hotmail.send(@event).size, total_sent: b = thip.hotmail.sent.size, perc_opened: find_perc(a, b)},
      yahoo_hooks: {opened: a = thip.yahoo.send(@event).size, total_sent: b = thip.yahoo.sent.size, perc_opened: find_perc(a, b)},
      other_hooks: {opened: a = thip.other.send(@event).size, total_sent: b = thip.other.sent.size, perc_opened: find_perc(a, b)},
    }
  end
  @m_webhooks = MWebhook.select("date_sent", "sending_ip", "esp", "event", "email").all
  @event = params[:event] || "unique_opened"
  @m_list_of_ips = [
    # List of three ip addresses
  ]
  end_date = Date.today
  start_date = Date.today - 10.days
  date_range = (end_date - start_date).to_i
  @count_array = []
  date_range.times do |n|
    this_date = end_date - n.days
    @today_hooks = @m_webhooks.where(:date_sent => this_date)
    @count_array[n] = {:this_date => this_date}
    @m_list_of_ips.each_with_index do |ip, index|
      thip = @today_hooks.where(:sending_ip => ip) # Stands for "Today Hooks ip"
      @count_array[n][index] = set_sub_count_hash(thip)
    end
  end
end

Well, your problem is very simple, actually. You have to remember that when you use where(condition), the query is not executed in the DB right away.
Rails builds the query lazily: it keeps chaining conditions and only runs the query once you need a concrete result (a list, an object, or a count or #size like in your case). In your code, you keep chaining conditions onto the main query inside a loop (date_range). And it gets worse: you start another loop inside this one, adding conditions to each query created in the first loop.
Then you pass the query (not concrete yet: it has not been executed and has no results!) to the method set_sub_count_hash, which goes on to execute a variation of it many times.
Therefore you have something like:
10(date_range) * 3(ip list) * 8 # (times the query is materialized in the #set_sub_count method)
and then you have a problem.
What you want to do is to do the whole query at once and group it by date, ip and email. You should have a hash structure after that, which you would pass to the #set_sub_count method and do some ruby gymnastics to get the counts you're looking for.
I imagine the query something like:
main_query = @m_webhooks.where('date_sent > ?', 10.days.ago.to_date)
                        .where(sending_ip: @m_list_of_ips)
OK, now you have one query, which is nice, but I think you should split it into 4 (gmail, hotmail, yahoo and other), which gives you 4 queries (the first one, main_query, will not be executed until you ask for materialized results, don't forget that). That is still something like 100 times faster.
I think this is the result that should be grouped, mapped and passed to #set_sub_count, instead of passing the raw query and calling methods on it over and over. It will be a little work to do the grouping, mapping and counting for sure, but hey, it's faster. =)

In case this helps anybody else, I learned how to fill a hash with counts in a much simpler way. More importantly, this approach runs a single query (as opposed to the 240 queries that I was running before).
@count_array[esp_index][j] = MWebhook.where('date_sent > ?', start_date.to_date)
                                     .group('date_sent', 'sending_ip', 'event', 'esp').count
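For anyone wondering what to do with that result: a grouped count on several columns returns a hash whose keys are arrays in group-column order. The sketch below shows one possible way to fold such a hash into per-date, per-IP, per-ESP cells. The sample counts are invented, build_table is a hypothetical helper, and percentage is only a stand-in for the question's find_perc, which was never shown.

```ruby
require "date"

# Hypothetical sample of what
# MWebhook.where(...).group('date_sent', 'sending_ip', 'event', 'esp').count
# returns: a hash keyed by [date_sent, sending_ip, event, esp].
counts = {
  [Date.new(2016, 1, 2), "10.0.0.1", "sent",          "gmail"]   => 200,
  [Date.new(2016, 1, 2), "10.0.0.1", "unique_opened", "gmail"]   => 50,
  [Date.new(2016, 1, 2), "10.0.0.1", "sent",          "hotmail"] => 80
}

# Stand-in for the question's find_perc, which is not shown in the original.
def percentage(part, whole)
  whole.zero? ? 0.0 : (part.to_f / whole * 100).round(1)
end

# Fold the flat counts into
# table[date][ip][esp] = { opened:, total_sent:, perc_opened: }.
def build_table(counts, opened_event)
  table = Hash.new { |h, date| h[date] = Hash.new { |g, ip| g[ip] = {} } }
  counts.each do |(date, ip, event, esp), n|
    cell = (table[date][ip][esp] ||= { opened: 0, total_sent: 0 })
    cell[:opened] += n if event == opened_event
    cell[:total_sent] += n if event == "sent"
  end
  table.each_value do |by_ip|
    by_ip.each_value do |by_esp|
      by_esp.each_value { |c| c[:perc_opened] = percentage(c[:opened], c[:total_sent]) }
    end
  end
  table
end

table = build_table(counts, "unique_opened")
p table[Date.new(2016, 1, 2)]["10.0.0.1"]["gmail"]
```

All the arithmetic happens in Ruby on a few hundred aggregated rows, so the database is only asked once.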

My program isn't saving/displaying all the bills I've added

I've flirted with learning web dev in the past and haven't had the time as I am a full time Business Student.
I started digging back in today and decided to take a break from the learning and practice what I've learned today by writing a simple program that allows the user to enter in their bills and will eventually calculate how much disposable income they have after their bills are paid each month.
My problem is that the program runs through without errors and the loop continues/exits when it should, but either the program is not storing the user's input in the hash the way I want it to, or it's not displaying all the bills entered as it should. Here is my program:
# This program allows you to assign monthly payments
# to their respective bills and will automatically
# calculate how much disposable income you have
# after your bills are paid
# Prompts user to see if they have any bills to enter
puts "Do you have any bills you would like to enter, Yes or No?"
new_bill = gets.chomp.downcase
until new_bill == 'no'
  # Creates a hash to store a key/value pair
  # of the bill name and the respective payment amount
  bills = {}
  puts "Enter the bill name: "
  bill_name = gets.chomp
  puts "How much is this bill?"
  pay_amt = gets.chomp
  bills[bill_name] = pay_amt
  puts "Would you like to add another bill, Yes or No?"
  new_bill = gets.chomp.downcase
end
bills.each do |bill_name, pay_amt|
  puts "Your #{bill_name} bill is $#{pay_amt}."
end
My questions are:
Is my hash set up properly to store the key/value pairs from the users input?
If not, how can I correct it?
I'm getting only the last bill that was entered by the user. I've tried several bills at a time but only getting the last entry.
As I stated, I'm a noob but I'm extremely ambitious to learn. I've referred to the Ruby docs on hashes to see if there is an error in my code but wasn't able to locate a solution (still finding my way around the Ruby docs).
Any help is appreciated! Also, if you have any recommendations on ways I can make my code more efficient, could you point me in the direction where I can obtain the appropriate information to do so?
Thank you.
Edit:
The main question has been answered. This is a follow-up question about the same program. I'm getting this error message:
budget_calculator.rb:35:in `-': Hash can't be coerced into Float (TypeError)
from budget_calculator.rb:35:in `<main>'
from the following code (keeping the program above in mind):
# Displays the users bills
bills_hash.each {|key,value| puts "Your #{key} bill is $#{value}."}
# Get users net income
puts "What is your net income?"
net_income = gets.chomp.to_f
#Calculates the disposable income of the user
disposable_income = net_income - bills_hash.each {|value| value}
puts disposable_income
I understand the error is appearing from this line of code:
disposable_income = net_income - bills_hash.each {|value| value}
I'm just not understanding why this is unacceptable. I'm trying to subtract all of the values in the hash (pay_amt) from the net income to derive the disposable income.

This is the part that's getting you:
bills = {}
You're resetting the hash every time the program loops. Try declaring bills at the top of the program.
As to your second question about bills_hash, it's not working because the program is attempting to subtract a hash from a float. You've got the right idea, but the way it's set up, it's not going to subtract each value from net_income in turn.
The return value of #each is the original hash that you were looping over. You can see this if you open IRB and type
[1,2,3].each {|n| puts n}
The block is evaluated for each element of the list, but the final return value is the original list:
irb(main):007:0> [1,2,3].each {|n| puts n}
1
2
3
=> [1, 2, 3] # FINAL RETURN VALUE
So according to the order of operations, your #each block is iterating, then returning the original bills_hash hash, and then trying to subtract that hash from net_income, which looks like this (assuming my net_income is 1000):
1000 - {rent: 200, video_games: 800}
hence the error.
There are a couple ways you could go about fixing this. One would be to sum all of the values in bills_hash as its own variable, then subtract that from the net_income:
total_expenditures = bills_hash.values.inject(&:+) # sum the values
disposable_income = net_income - total_expenditures
Using the same #inject method, this could also be done in one function call:
disposable_income = bills_hash.values.inject(net_income, :-)
# starting with net_income, subtract each value in turn
See the documentation for Enumerable#inject.
It's a very powerful and useful method to know. But make sure you go back and understand how return values work and why the original setup was raising an exception.
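To tie both fixes together, here is a minimal sketch of the corrected flow. It replaces the gets.chomp prompts with a made-up list of entries so it can run unattended, and it converts each amount with to_f, on the assumption that the amounts are still the strings returned by gets.chomp (strings would be concatenated rather than added).

```ruby
# Simulated user input as [bill name, amount typed] pairs (made-up values),
# standing in for the gets.chomp calls in the original program.
entries = [["rent", "200"], ["video games", "800.50"]]

# Declare the hash once, outside the loop, so earlier bills are kept.
bills = {}
entries.each do |bill_name, pay_amt|
  # gets.chomp returns a String, so convert before doing arithmetic.
  bills[bill_name] = pay_amt.to_f
end

net_income = 1500.0

# Hash#each returns the hash itself, so sum the values explicitly instead.
total_expenditures = bills.values.sum
disposable_income = net_income - total_expenditures

bills.each { |name, amount| puts "Your #{name} bill is $#{format('%.2f', amount)}." }
puts "Disposable income: $#{format('%.2f', disposable_income)}"
```

Array#sum (Ruby 2.4 and later) returns 0 for an empty list, whereas inject(&:+) returns nil, so sum is a little safer when no bills were entered.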

Random record in Rails

I am building a web application (online word learning) that lets the user choose the correct meaning of a word. When they click start, it selects one word at random from the database and shows it to the user. After the user chooses an answer, it moves on to the next question.
Please see the image below:
If I use Word.order("rand()").limit(1), could the new word turn out to be the same as the last selected word?
With the app as in the image above, are there any better ideas for solving this problem?

I would add the following scopes to the model (depends on the database you are using):
# in app/models/word.rb
# 'RANDOM' works with postgresql and sqlite, whereas mysql uses 'RAND'
scope :random, -> { order('RAND()') }
scope :without, ->(ids) { where.not(id: ids) }
With those scopes you can write the following query in your controller:
@word = Word.random.without(params[:last_ids]).limit(1)
When you want to load new random elements in the view, just add the ids of the current words to the request. This ensures that these ids (params[:last_ids]) are not randomly chosen again.

Long story short, in order not to repeat yourself, you have to store those words somewhere. Either the ones that are yet to be shown, or the ones that have been already displayed. And If I were you I would go one of the following routes:
Fetch all the words before starting the quiz and randomize them. This could be something like:
session[:words] = Word.order("RAND()").select(:id).take(10)
Or even better by defining a scope for your random words:
class Word < ActiveRecord::Base
  # ...
  # limit(10) keeps the LIMIT in SQL, so only ten ids are fetched
  scope :random_quiz, -> { order("RAND()").limit(10).pluck(:id) }
  # ...
end
# ... in the controller when the quiz is getting started:
session[:words] = Word.random_quiz
# ... in the controller when you want to show the word:
new_word = Word.find(session[:words].pop)
As ORDER BY RAND() is a very expensive operation, doing it only once per quiz makes sense. You then pop the word IDs one by one using session[:words].pop and present the questions.
This way it will guarantee that you won't repeat the words in the quiz and give you pretty optimal performance.
Fetch words one by one as you're progressing with giving out the questions and save the ones you've already asked about.
class Word < ActiveRecord::Base
  # ...
  def self.random_word(exclusions)
    # where.not copes with an empty list; 'id NOT IN (?)' with [] becomes NOT IN (NULL) and matches nothing
    eligible = where.not(id: exclusions)
    # exclusive range, so the offset always stays below the row count
    eligible.offset(rand(0...eligible.count)).take!
  end
  # ..
end
# ... in the controller when you need a new word:
session[:words_shown] ||= [ ]
new_word = Word.random_word(session[:words_shown])
# mark the word as shown:
session[:words_shown].push(new_word.id)
You might have noticed the weird way of getting a random record in the second example. It turns out to be more efficient as it generates the following query:
SELECT * FROM words OFFSET _random_number_ LIMIT 1
Instead of:
SELECT * FROM words ORDER BY RAND() LIMIT 1
The first one is just an ordinary select, while the second one requires an unindexed sort of the entire table by RAND() before giving you that random result. In my experience the former turns out to be almost tenfold faster than the latter.
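The bookkeeping for the first route can be sketched without a database at all. In the snippet below, all_word_ids stands in for the ids in the words table and session is a plain Hash rather than the Rails session; both are assumptions made purely so the example runs on its own, and next_word_id is a hypothetical helper.

```ruby
# Stand-ins (assumptions for illustration): all_word_ids plays the role of the
# ids in the words table, and session is a plain Hash instead of Rails' session.
all_word_ids = (1..50).to_a
session = {}

# When the quiz starts: pick 10 distinct ids in random order, once.
session[:words] = all_word_ids.sample(10)

# For each question: pop the next id; nil means the quiz is over.
def next_word_id(session)
  session[:words].pop
end

asked = []
while (id = next_word_id(session))
  asked << id
end

puts "asked #{asked.size} words, #{asked.uniq.size} distinct"
```

Because the ids are drawn once and then only removed, no word can come up twice within the same quiz.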
Hope that makes sense!

Testing if object exists in joined ActiveRecord query

I have User model that is related to a Friend model (has_many / belongs_to)
After joining, I would like to be able to check if a certain friend object exists in the friends that were joined to users:
users = User.joins(:friends).where("some condition") # subset of total friends
fs = Friend.all
fs.each do |f|
  if users.friends.includes?(f) # match!
    ...
  else # no match
    ...
  end
end
The code as-is does not work and I am having difficulties getting this functionality in code.

Try something like this:
users.friends.where(id: u.id).exists?
That should generate a query like so:
SELECT 1 AS one FROM `users` WHERE `users`.`friend_id` = 42 AND `users`.`id` = 1 LIMIT 1
The SQL selects a placeholder 1, but exists? itself returns true if a matching row was found and false otherwise.
Side note: Unless you need to use your u variable later, you can probably get away with simply placing some_id directly in the where clause, and not do the second User lookup.
Edit
Just noticed a problem in your loop that might be what is causing your original problem. When you loaded up the list of users, unless you have some limit clause or invoked .first, you'll get back an array of users. So I'm guessing your application is crapping out on this line:
users.friends.includes?(f)
Because .friends is a method of a User object, not of an array.
So you'll have to do a nested loop instead like so:
fs.each do |f|
  users.each do |u|
    u.friends.include?(f) # note: the method is include?, not includes?
  end
end
Note that this method might be very slow, depending on the number of friends and users. It is a very inefficient algorithm, which is why I'm trying to understand your situation better in the comments, because I'm certain there's a more efficient way to accomplish your task.
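One possible way around the nested loop, sketched here in plain Ruby with Structs standing in for the models: collect the ids of the filtered users into a Set once, then test each friend's owner against it. This assumes Friend belongs_to :user and therefore carries a user_id, which the question implies but does not show; the sample records are invented.

```ruby
require "set"

# Plain-Ruby stand-ins for the models (assumption: Friend belongs_to :user,
# so each friend row carries a user_id). Sample data is made up.
User   = Struct.new(:id)
Friend = Struct.new(:id, :user_id)

users   = [User.new(1), User.new(2)] # the users that passed the condition
friends = [Friend.new(10, 1), Friend.new(11, 3), Friend.new(12, 2)]

# Build the lookup once; Set#include? is a hash lookup, not a scan.
user_ids = users.map(&:id).to_set

# Split the friends into those owned by a selected user and the rest.
matches, misses = friends.partition { |f| user_ids.include?(f.user_id) }

matches.each { |f| puts "friend #{f.id}: match" }
misses.each  { |f| puts "friend #{f.id}: no match" }
```

This does one pass over the users and one over the friends, instead of one pass over every user's friends for each friend.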

Returning a semi-unique set of most recent records

In my application a User has Highlights.
Each Highlight has a HighlightType. So if I run user.highlights I might see an output like this:
Notice that there are many highlights of type_id 47. This marks milestones of the number of times the user has gone running.
What I would like to do is return this full list of records, but only include one highlight for each highlight_type, and I want that one record to be the most recent record (in this case the "50th run" highlight). So in the example above I would get the same results but with IDs 195-199 removed.
Is there an efficient way to accomplish this?

I don't think there is an easy or clean way to achieve that, nor a "Rails way". Look at e.g. this link
According to one suggestion in that link you would do this SQL request:
SELECT h1.*
FROM highlights h1
LEFT JOIN highlights h2
  ON (h1.user_id = h2.user_id
      AND h1.highlight_type_id = h2.highlight_type_id
      AND h1.created_at < h2.created_at)
WHERE h2.id IS NULL AND h1.user_id = <the user id you are interested in>
GROUP BY h1.highlight_type_id
I think this may cause performance problems if you have big tables, and it is not very clean either.
Otherwise, if there aren't that many highlights per user, I would do something like this:
rows = {}
user.highlights.order('highlight_type_id, created_at DESC').each do |hi|
  rows[hi.highlight_type_id] ||= hi
end
# then use rows which will have one object for each highlight_type_id
The DESC on created_at is important: ||= keeps the first row seen for each type, so the newest one has to come first.
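The same idea can also be written so that it does not depend on the sort order, by grouping in Ruby and taking the newest row of each group. A minimal sketch with a Struct standing in for the Highlight model and made-up sample rows:

```ruby
# Plain-Ruby stand-in for Highlight rows (made-up sample data).
Highlight = Struct.new(:id, :highlight_type_id, :created_at)

highlights = [
  Highlight.new(195, 47, Time.utc(2014, 1, 1)),
  Highlight.new(199, 47, Time.utc(2014, 3, 1)),
  Highlight.new(200, 47, Time.utc(2014, 4, 1)),
  Highlight.new(180, 12, Time.utc(2014, 2, 1))
]

# For each type keep only the newest row; no particular input order is needed.
latest_per_type = highlights
  .group_by(&:highlight_type_id)
  .map { |_type_id, rows| rows.max_by(&:created_at) }

latest_per_type.each { |h| puts "type #{h.highlight_type_id}: highlight #{h.id}" }
```

Like the loop above, this loads all of a user's highlights into memory, so it only makes sense when that list is small.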
EDIT:
I also saw some suggestions based on this
user.highlights.group('highlight_type_id').order('created_at DESC')
That was also how I first thought it should be solved, but I tested it and it doesn't seem to give a correct result, at least on my test data.
