Let's say I have a list of emails in an array, let's say ~2000 emails.
emails = ["AllenXiang#boyaa.com", "2dlogic#gmail.com", "support#KalromSystems.com", "kangisupport#helendorongroup.com", "James#APPCRASHCOURSE.COM", "James#appcrashcourse.com", "SpartanAppsUK#gmail.com"]
Let's say I want to get each email's site name. I could do emails.each { |email| puts email.split("#")[1] }
Which would get me each of the site names for the emails. But I'm curious, is there a faster way to get that out of the array?
Ideally I'd like to create an array.uniq that contains a unique list of every site linked to the emails. I could do this manually, but I'm wondering if there's a quicker way to do this on the array itself (the array I actually have is ~2 million emails in length).
What you're creating isn't really an array, which cares about order and doesn't care about uniqueness. You want a Set, which doesn't concern itself with order, but doesn't allow duplicates.
require 'set'
email_domains = Set.new
emails.each do |email|
email_domains.add email.split('#', 2).last
end
I prefer this solution
emails.map { |email| email.split('#').last }.uniq
Updated
Or this one
emails.collect { |email| email.split('#').last }.uniq
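The sample data in the question contains the same site in two letter cases (APPCRASHCOURSE.COM and appcrashcourse.com), so a plain uniq would keep both. A minimal sketch that combines the Set approach with case folding follows; the helper name unique_domains is made up for illustration, and '#' is used as the separator only because that is what the question's data uses.

```ruby
require 'set'

# Sample data copied from the question; '#' is the separator used there.
emails = [
  "AllenXiang#boyaa.com", "2dlogic#gmail.com",
  "James#APPCRASHCOURSE.COM", "James#appcrashcourse.com",
  "SpartanAppsUK#gmail.com"
]

# Hypothetical helper: collect each domain once, ignoring letter case.
def unique_domains(emails, separator = '#')
  emails.each_with_object(Set.new) do |email, domains|
    # A limit of 2 splits only on the first separator.
    domains << email.split(separator, 2).last.downcase
  end
end

domains = unique_domains(emails)
puts domains.to_a.inspect
```

This makes a single pass over the array, which matters more at ~2 million entries than at 2000, since map followed by uniq builds an intermediate array of the same length first.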
Related
I know that find_each is designed to use less memory than each.
I found some code that someone else wrote long ago, and I think it's wrong.
Consider this code:
users = User.where(:active => false) # What does this line actually do? Nothing?
users.find_each do |user|
# update or do something..
user.update(:do_something => "yes")
end
In this case, it will store all the user objects in the users variable, so we have already used up the full amount of memory. There is no point in using find_each afterwards.
Am I correct?
In other words, if you want to use find_each, you always need to use it with an ActiveRecord::Relation object, like this:
User.where(:active => false).find_each do |user|
# do something...
end
What do you think guys?
Update
Regarding the line users = User.where(:active => false):
Some developers insist that Rails never executes the query until we actually do something with that variable.
What if we have a class whose initialize method contains a query?
class Test
  def initialize
    @users = User.where(:active => true)
  end

  def do_something
    @users.find_each do |user|
      # do something really..
    end
  end
end
If we call Test.new, what would happen? Nothing will happen?
users = User.where(:active => false) doesn't run a query against the database and it doesn't return an array with all inactive users. Instead, where returns an ActiveRecord::Relation. Such a relation basically describes a database query that hasn't run yet. The defined query is only run against the database when the actual records are needed. This happens for example when you run one of the following methods on that relation: find, to_a, count, each, and many others.
That means the change you made isn't a huge improvement, because it doesn't change when or how the database is queried.
But IMHO your version is still slightly better: if you do not plan to reuse the relation, why assign it to a variable in the first place?
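To make the deferred execution concrete without a Rails app, here is a toy stand-in written in plain Ruby. ToyRelation and QUERY_LOG are invented for this sketch and are not part of ActiveRecord; the only point is that chaining where records a condition, while iterating is what triggers the (simulated) query.

```ruby
# Toy stand-in for a lazy relation (hypothetical, NOT the Rails class).
class ToyRelation
  include Enumerable

  # Shared log so every chained relation reports to the same place.
  QUERY_LOG = []

  def initialize(rows, conditions = {})
    @rows = rows
    @conditions = conditions
  end

  # Chaining only records the condition and returns a new relation.
  def where(extra)
    ToyRelation.new(@rows, @conditions.merge(extra))
  end

  # Iterating is what finally runs the simulated query.
  def each(&block)
    QUERY_LOG << @conditions
    @rows.select { |row| @conditions.all? { |key, value| row[key] == value } }
         .each(&block)
  end
end

rows = [
  { name: "ann", active: false },
  { name: "bob", active: true },
  { name: "cy",  active: false }
]

users = ToyRelation.new(rows).where(active: false)
queries_before = ToyRelation::QUERY_LOG.size   # assignment alone ran nothing
names = users.map { |user| user[:name] }       # iteration triggers the query
queries_after = ToyRelation::QUERY_LOG.size

puts "before: #{queries_before}, after: #{queries_after}, names: #{names.inspect}"
```

Assigning the relation to users costs nothing here, which mirrors the answer's point that storing a relation in a variable does not load any records.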
users = User.where(:active => false)
users.find_each do |user|
User.where(:active => false).find_each do |user|
Those do the same thing.
The only difference is the first one stores the ActiveRecord::Relation object in users before calling #find_each on it.
This isn't a Rails thing, it applies to all of Ruby. It's method chaining common to most object-oriented languages.
array = Call.some_method
array.each{ |item| do_something(item) }
Call.some_method.each{ |item| do_something(item) }
Again, same thing. The only difference is that in the first version the intermediate array stays referenced by the variable, whereas in the second it is built, used, and then becomes eligible for garbage collection.
If we call Test.new, what would happen? Nothing will happen?
Exactly. Rails will make an ActiveRecord::Relation and it will defer actually contacting the database until you actually do a query.
This lets you chain queries together.
@inactive_users = User.where(active: false).order(name: :asc)
Later you can add to the query:
# Inactive users whose favorite color is green ordered by name.
@inactive_users.where(favorite_color: :green).find_each do |user|
...
end
No query is made until find_each is called.
In general, pass around relations rather than arrays of records. Relations are more flexible and if it's never used there's no cost.
find_each is special in that it works in batches to avoid consuming too much memory on large tables.
A common mistake is to write this:
User.where(:active => false).each do |user|
Or worse:
User.all.each do |user|
Calling each on an ActiveRecord::Relation will pull all the results into memory before iterating. This is bad for large tables.
find_each will load the results in batches of 1000 to avoid using too much memory. It hides this batching from you.
There are other methods which work in batches, see ActiveRecord::Batches.
For more see the Rails Style Guide and use rubocop-rails to scan your code for issues and make suggestions and corrections.
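The batching idea behind find_each can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the Rails implementation: TABLE is an in-memory stand-in for a database table, and fetch_batch and each_in_batches are hypothetical helpers that page through it by primary key, one limited query at a time.

```ruby
# Hypothetical in-memory table standing in for a database.
TABLE = (1..25).map { |id| { id: id, active: false } }
FETCH_SIZES = []   # records how many rows each simulated query returned

# Simulates: SELECT * FROM users WHERE id > after_id ORDER BY id LIMIT limit
def fetch_batch(after_id, limit)
  batch = TABLE.select { |row| row[:id] > after_id }
               .sort_by { |row| row[:id] }
               .first(limit)
  FETCH_SIZES << batch.size
  batch
end

# Yields rows one by one while holding at most batch_size rows at a time.
def each_in_batches(batch_size)
  last_id = 0
  loop do
    batch = fetch_batch(last_id, batch_size)
    batch.each { |row| yield row }
    # A short (or empty) batch means the table is exhausted.
    break if batch.size < batch_size
    last_id = batch.last[:id]
  end
end

seen = []
each_in_batches(10) { |row| seen << row[:id] }
puts "rows seen: #{seen.size}, batch sizes: #{FETCH_SIZES.inspect}"
```

The caller's block still sees every row exactly once, but no single fetch returns more than the batch size, which is why memory stays bounded on large tables.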
How can I deliver_now conditionally based on the content of the data returned by the mailer method?
I have a Mailer that is sent via a loop through a list of users, like so:
users.each do |u|
AbcMailer.with(user_id: u.user_id).abc_report.deliver_now
end
The list of users that should receive the mailer (users in the loop) lives in ActiveRecord, and all the users' actual data lives in an external MySql DB.
The abc_report method in the AbcMailer class makes some queries to the MySql DB and returns a bunch of info for each user, which is then inserted into an html.erb email template, and delivered now.
My issue is that I need to only deliver to some of those users, because in the DB, one of the pieces of info that comes back is whether the user is active or not. So I would like to only deliver_now if the user has active = 1. But I can't find any examples of unchaining these methods to do what I want.
When I just do AbcMailer.with(user_id: u.user_id).abc_report, what it returns is actually the filled-out html.erb template already. When I do AbcMailer.with(user_id: u.user_id) by itself, it returns #<ActionMailer::Parameterized::Mailer:0x00007fdd43eb5528>.
Things I've Tried
I tried inserting a return if user["Active"] == 0 in the abc_report method but that obviously killed the entire loop rather than skipping to the next item, so I'm working under the assumption that the skip has to happen with a next in the actual loop itself, not in an external method being called.
I also found this which seems like a great solution in a plain Ruby context but because in this case, abc_report is automatically filling out and returning the AbcMailer html.erb template...I'm stumped on how I would get it to just return a boolean without killing the whole loop.
Assuming you have a boolean active field in the users table and you want to send the emails only to active users:
users.each do |u|
AbcMailer.with(user_id: u.user_id).abc_report.deliver_now if u.active
end
You can add any other condition to this as well.
Add this to user.rb
scope :active, -> {
where(active: 1)
}
Now in your mailer
User.active.each do |user|
AbcMailer.with(user_id: user.user_id).abc_report.deliver_now
end
This way you fetch less data from the database, which makes the query faster.
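Both answers assume the active flag is available on the Rails side. The question says it lives in the external MySQL database, so one option is to look the flag up before building the mail and skip inactive users with next. The sketch below is plain Ruby with invented stand-ins: ACTIVE_FLAGS plays the role of the external query, and FakeReportMailer records deliveries instead of sending anything.

```ruby
# Hypothetical stand-ins: a lookup that plays the role of the external
# MySQL query, and a fake mailer that records deliveries instead of sending.
ACTIVE_FLAGS = { 1 => 1, 2 => 0, 3 => 1 }  # user_id => Active column

class FakeReportMailer
  DELIVERED = []

  def initialize(user_id)
    @user_id = user_id
  end

  def deliver_now
    DELIVERED << @user_id
  end
end

# Unknown users are treated as inactive.
def active?(user_id)
  ACTIVE_FLAGS.fetch(user_id, 0) == 1
end

user_ids = [1, 2, 3]

user_ids.each do |user_id|
  # `next` skips only this iteration; the loop carries on with the next user.
  next unless active?(user_id)

  FakeReportMailer.new(user_id).deliver_now
end

puts FakeReportMailer::DELIVERED.inspect
```

Keeping the check in the loop rather than inside the mailer method means the decision is made before any template is rendered.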
In our Rails app, the user (or we, on his behalf) loads some data or even inserts it manually through a CRUD interface.
After this step the user must validate all the configuration (the data) and "accept and agree" that it's all correct.
On a given day, the application will execute some tasks according to the configuration.
Today, we already have a "freeze" flag that prevents changes to the data, so the user cannot mess things up...
But we would also like to hash the data and say something like "your config is frozen and the hash is 34FE00...".
This would give the user certainty that the system is running with the configuration he approved.
How can we do that? There are 7 or 8 tables. The total of records created would be around 2k or 3k.
How to hash the data to detect changes after the approval? How would you do that?
I'm thinking about doing a find_by_user on each table, looping over all records, using some (or all) fields to build a string, and hashing it at the end of each loop.
After looping over all the tables, I would have 8 hash strings, which I would concatenate and hash into a final hash.
How does that look? Any ideas?
Here's a possible implementation. Just define object as an Array of all the stuff you'd like to hash :
require 'digest/md5'
def validation_hash(object, len = 16)
Digest::MD5.hexdigest(object.to_json)[0,len]
end
puts validation_hash([Actor.first,Movie.first(5)])
# => 94eba93c0a8e92f8
# After changing a single character in the first Actor's biography :
# => 35f342d915d6be4e
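The per-table scheme described in the question can be sketched in plain Ruby. This is only one possible shape, under several assumptions: records are represented as plain hashes with an :id key, the helper names table_digest and config_digest are made up, and SHA-256 is swapped in for the MD5 used above. Records and keys are sorted before hashing so that row order or attribute order does not change the result.

```ruby
require 'digest'
require 'json'

# Hypothetical helper: digest of one table's records, independent of
# row order and of key order inside each record.
def table_digest(records)
  canonical = records
    .sort_by { |record| record[:id] }
    .map { |record| record.sort_by { |key, _| key.to_s }.to_h }
  Digest::SHA256.hexdigest(JSON.generate(canonical))
end

# Hypothetical helper: combine the per-table digests into one final digest.
def config_digest(tables)
  per_table = tables.sort_by { |name, _| name.to_s }
                    .map { |name, records| "#{name}:#{table_digest(records)}" }
  Digest::SHA256.hexdigest(per_table.join("|"))
end

approved = {
  rules:  [{ id: 1, limit: 5 }, { id: 2, limit: 9 }],
  actors: [{ id: 1, name: "ann" }]
}
# Same data, different row and key order.
reordered = {
  actors: [{ name: "ann", id: 1 }],
  rules:  [{ id: 2, limit: 9 }, { id: 1, limit: 5 }]
}
# One value changed after approval.
tampered = {
  rules:  [{ id: 1, limit: 6 }, { id: 2, limit: 9 }],
  actors: [{ id: 1, name: "ann" }]
}

approved_hash = config_digest(approved)
puts "same after reorder: #{approved_hash == config_digest(reordered)}"
puts "same after change:  #{approved_hash == config_digest(tampered)}"
```

Storing approved_hash at approval time and recomputing it before each run would then reveal any change to the frozen configuration.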
I'm creating an API to create new users (plural) from an array of emails.
We are assuming there are no validations other than that a user requires an email, so all I need to create a user is an email.
The reason I'm doing this is that I'm building an API.
How do I create users from an array of emails?
Here is the array. I actually have real emails, but for this example I will make them up.
# => [
# [0] "email1#example.com.au",
# [1] "email2#example.com.au",
# [2] "email3#example.com",
# [3] "email4#example.com.au",
# [4] "email5#example.com.au"
# ]
To create a user, it's just the typical way:
User.new(email: "123#example.com")
Thanks in advance for any help.
To save records based on your array, you will want to store that array in a variable (remember that in Ruby, everything is an object). Let's say you have:
emails = ["email1#example.com.au",
"email2#example.com.au",
"email3#example.com",
"email4#example.com.au",
"email5#example.com.au"]
From there you can write a loop that iterates over your lovely array and creates a User for each array item you declared:
emails.each do |e|
User.create(email: e)
end
User.new will not save the records, so please use User.create.
If you just want to save records based on the emails list (which is an array, based on your description), you just need to do what @l0010o0001l said (I love this nickname! :) ).
But in my opinion, you could do something more if this API will be provided to others.
First, you should convert each email address to lower case before saving it. This helps a lot whenever you save new records or maintain old ones. For example:
emails.each do |email|
User.create(email: email.downcase)
end
Then you need to present the result to whoever calls the API. For example, if the whole email list was created successfully, you can respond with the number of records created. If some records could not be created (format error, record already exists, etc.), you should respond with the error info (you may need user.errors.full_messages to get the error messages).
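A rough sketch of collecting that result is shown below. It runs in plain Ruby, so FakeUser is an invented stand-in for the real model (with a save method and an errors list that only loosely imitate ActiveRecord), and bulk_create is a hypothetical helper name. The '#' in the addresses simply follows the example data above.

```ruby
require 'set'

# Hypothetical stand-in for the User model; it only imitates save/errors.
class FakeUser
  SAVED = Set.new
  attr_reader :email, :errors

  def initialize(email:)
    @email = email
    @errors = []
  end

  def save
    @errors << "Email can't be blank" if email.empty?
    @errors << "Email has already been taken" if SAVED.include?(email)
    return false unless @errors.empty?

    SAVED << email
    true
  end
end

# Hypothetical helper: returns what was created and why the rest failed.
def bulk_create(emails)
  result = { created: [], failed: {} }
  emails.each do |raw|
    user = FakeUser.new(email: raw.to_s.strip.downcase)
    if user.save
      result[:created] << user.email
    else
      result[:failed][raw] = user.errors
    end
  end
  result
end

result = bulk_create(["Email1#example.com", "email1#EXAMPLE.com", ""])
puts result.inspect
```

The returned hash maps directly onto an API response: the size of created is the success count, and failed carries the per-email error messages.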
The way to do this is something like:
user_emails.each do |user_email|
  User.create(email: user_email)
end
I'm writing an application that allows users to send one another messages about an 'offer'.
I thought I'd save myself some work and use the Mailboxer gem.
I'm following a test driven development approach with RSpec. I'm writing a test that should ensure that only one Conversation is allowed per offer. An offer belongs_to two different users (the user that made the offer, and the user that received the offer).
Here is my failing test:
describe "after a message is sent to the same user twice" do
before do
2.times { sending_user.message_user_regarding_offer! offer, receiving_user, random_string }
end
specify { sending_user.mailbox.conversations.count.should == 1 }
end
So before the test runs a user sending_user sends a message to the receiving_user twice. The message_user_regarding_offer! looks like this:
def message_user_regarding_offer! offer, receiver, body
conversation = offer.conversation
if conversation.nil?
self.send_message(receiver, body, offer.conversation_subject)
else
self.reply_to_conversation(conversation, body)
# I put a binding.pry here to examine in console
end
offer.create_activity key: PublicActivityKeys.message_received, owner: self, recipient: receiver
end
On the first iteration in the test (when the first message is sent) the conversation variable is nil therefore a message is sent and a conversation is created between the two users.
On the second iteration the conversation created in the first iteration is returned and the user replies to that conversation, but a new conversation isn't created.
This all works, but the test fails and I cannot understand why!
When I place a pry binding in the code in the location specified above I can examine what is going on... now riddle me this:
self.mailbox.conversations[0] returns a Conversation instance
self.mailbox.conversations[1] returns nil
self.mailbox.conversations clearly shows a collection containing ONE object.
self.mailbox.conversations.count returns 2?!
What is going on there? the count method is incorrect and my test is failing...
What am I missing? Or is this a bug?!
EDIT
offer.conversation looks like this:
def conversation
Conversation.where({subject: conversation_subject}).last
end
and offer.conversation_subject:
def conversation_subject
"offer-#{self.id}"
end
EDIT 2 - Showing the first and second iteration in pry
Also...
Conversation.all.count returns 1!
and:
Conversation.all == self.mailbox.conversations returns true
and
Conversation.all.count == self.mailbox.conversations.count returns false
How can that be if the arrays are equal? I don't know what's going on here, blown hours on this now. Think it's a bug?!
EDIT 3
From the source of the Mailboxer gem...
def conversations(options = {})
  conv = Conversation.participant(@messageable)
  if options[:mailbox_type].present?
    case options[:mailbox_type]
    when 'inbox'
      conv = Conversation.inbox(@messageable)
    when 'sentbox'
      conv = Conversation.sentbox(@messageable)
    when 'trash'
      conv = Conversation.trash(@messageable)
    when 'not_trash'
      conv = Conversation.not_trash(@messageable)
    end
  end
  if (options.has_key?(:read) && options[:read]==false) || (options.has_key?(:unread) && options[:unread]==true)
    conv = conv.unread(@messageable)
  end
  conv
end
The reply_to_conversation code is available here -> http://rubydoc.info/gems/mailboxer/frames.
Just can't see what I'm doing wrong! Might rework my tests to get around this. Or ditch the gem and write my own.
See this: Rails 3: Difference between Relation.count and Relation.all.count
In short, Rails ignores the select columns (if more than one) when you apply count to the query. This is because SQL's COUNT accepts at most one column as a parameter.
From the Mailboxer code:
scope :participant, lambda {|participant|
select('DISTINCT conversations.*').
where('notifications.type'=> Message.name).
order("conversations.updated_at DESC").
joins(:receipts).merge(Receipt.recipient(participant))
}
self.mailbox.conversations.count ignores the select('DISTINCT conversations.*') and counts the join table with receipts, essentially counting number of receipts with duplicate conversations in it.
On the other hand, self.mailbox.conversations.all.count first gets the records applying the select, which gets unique conversations and then counts it.
self.mailbox.conversations.all == self.mailbox.conversations since both of them query the db with the select.
To solve your problem you can use sending_user.mailbox.conversations.all.count or sending_user.mailbox.conversations.group('conversations.id').length
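The mismatch can be reproduced with plain data. The rows below are invented for illustration and assume the sender ends up with two receipts that belong to one conversation: counting the joined rows gives 2, while counting distinct conversations gives 1.

```ruby
# Hypothetical result of joining conversations with receipts:
# one conversation, two receipts for the same participant.
joined_rows = [
  { conversation_id: 1, receipt_id: 10 },
  { conversation_id: 1, receipt_id: 11 }
]

# What a COUNT over the join sees: one row per receipt.
row_count = joined_rows.size

# What loading with DISTINCT and then counting sees: one row per conversation.
distinct_count = joined_rows.map { |row| row[:conversation_id] }.uniq.size

puts "joined rows: #{row_count}, distinct conversations: #{distinct_count}"
```

That matches the numbers seen in the pry session: count reports 2 while the loaded collection holds a single Conversation.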
I have tended to use the size method in my code. As per the ActiveRecord code, size will use a cached count if available and also returns the correct number when models have been created through relations and have not yet been saved.
# File activerecord/lib/active_record/relation.rb, line 228
def size
  loaded? ? @records.length : count
end
There is a blog on this here.
In Ruby, #length and #size are synonyms and both do the same thing: they tell you how many elements are in an array or hash. Technically #length is the method and #size is an alias to it.
In ActiveRecord, there are several ways to find out how many records are in an association, and there are some subtle differences in how they work.
post.comments.count - Determine the number of elements with an SQL COUNT query. You can also specify conditions to count only a subset of the associated elements (e.g. :conditions => {:author_name => "josh"}). If you set up a counter cache on the association, #count will return that cached value instead of executing a new query.
post.comments.length - This always loads the contents of the association into memory, then returns the number of elements loaded. Note that this won't force an update if the association had been previously loaded and then new comments were created through another way (e.g. Comment.create(...) instead of post.comments.create(...)).
post.comments.size - This works as a combination of the two previous options. If the collection has already been loaded, it will return its length just like calling #length. If it hasn't been loaded yet, it's like calling #count.
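The three behaviours can be imitated with a small plain-Ruby model. ToyAssociation is invented for this sketch and only mirrors the rules described above: count always asks the "database", length loads and caches the records, and size picks whichever is available.

```ruby
# Toy model of an association (hypothetical, not ActiveRecord).
class ToyAssociation
  attr_reader :count_queries

  def initialize(db_rows)
    @db_rows = db_rows      # stands in for the database table
    @records = nil          # in-memory cache, filled on first load
    @count_queries = 0
  end

  def loaded?
    !@records.nil?
  end

  # Always "queries" the database.
  def count
    @count_queries += 1
    @db_rows.size
  end

  # Loads the records once, then answers from memory.
  def length
    @records ||= @db_rows.dup
    @records.length
  end

  # Uses the cache when loaded, otherwise falls back to count.
  def size
    loaded? ? @records.length : count
  end
end

db = [:a, :b]
comments = ToyAssociation.new(db)

size_unloaded = comments.size    # not loaded yet, so this runs a count
comments.length                  # loads and caches two records
db << :c                         # a row added behind the association's back
size_loaded = comments.size      # answered from the stale cache
fresh_count = comments.count     # asks the database again

puts "unloaded size: #{size_unloaded}, loaded size: #{size_loaded}, count: #{fresh_count}"
```

The stale value returned by size after the extra row is added is the same effect the note about records created outside the association describes.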
It is also worth mentioning to be careful if you are not creating models through associations, as the related model will not necessarily have those instances in its association proxy/collection.
# do this
mailbox.conversations.build(attrs)
# or this
mailbox.conversations << Conversation.new(attrs)
# or this
mailbox.conversations.create(attrs)
# or this
mailbox.conversations.create!(attrs)
# NOT this
Conversation.new(mailbox_id: some_id, ....)
I don't know if this explains what's going on, but the ActiveRecord count method queries the database for the number of records stored. The length of the Relation could be different, as discussed in http://archive.railsforum.com/viewtopic.php?id=6255, although in that example, the number of records in the database was less than the number of items in the Rails data structure.
Try
self.mailbox.conversations.reload; self.mailbox.conversations.count
or perhaps
self.mailbox.reload; self.mailbox.conversations.count
or, if neither of those work, just try reloading as many of the objects as possible to see if you can get it to work (self, mailbox, conversations, etc.).
My guess is that something is messed up between memory and the DB. This is definitely a really weird error though, might wanna put in an issue on Rails to see why this would be the case.
The result of mailbox.conversations is cached after the first call. To reload it write mailbox.conversations(true)