Ruby on Rails: search performance with an indexed column

Hi, I'm trying to measure the search performance of my Rails application.
I have four search inputs: title, content, created_on (date), and updated_on (date).
I want to compare search performance with and without an index (in my case, created_on is indexed and updated_on is not).
My Post controller:
def index
  search_start_time = Time.now
  @posts = Post.search(params[:title], params[:content], params[:created_on], params[:updated_on])
  # print the elapsed time to check search performance
  puts Time.now - search_start_time
end
My schema:
create_table 'posts', force: :cascade do |t|
  t.string 'title', null: false
  t.string 'content', null: false
  t.date 'created_on', null: false, index: true
  t.date 'updated_on', null: false
end
In my post.rb, I wrote the search method like this:
def self.search(title, content, created_on, updated_on)
  where([
    "title LIKE ? AND content LIKE ? AND CAST(created_on AS text) LIKE ? AND CAST(updated_on AS text) LIKE ?",
    "%#{title}%", "%#{content}%", "%#{created_on}%", "%#{updated_on}%"
  ])
end
With this code I measured the performance, but there was no big difference between searching on the indexed column and on the non-indexed column.
Is there a problem with my code? Or does the index not affect search performance?
The table has 10 million records, and a search on the indexed column always takes about as long as a search on the unindexed column.
I tried changing the signature of my search method like this:
def self.search(title = '', content = '', created_on = '', updated_on = '')
But it made no difference.
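One possible explanation, offered as a hedged sketch and not a confirmed diagnosis: a condition of the form CAST(column AS text) LIKE '%...%' wraps the column in a function and starts with a wildcard, so a plain B-tree index on the date column generally cannot be used, and the indexed and unindexed columns end up being scanned the same way. The helper below (build_search_conditions is a hypothetical name, not part of Rails) builds the conditions array for where while comparing the date columns directly and skipping blank inputs. It assumes the created_on and updated_on columns from the schema above, and that dates arrive as complete date strings that Date.parse can read.

```ruby
require "date"

# Hypothetical helper (not a Rails API): returns [sql, *binds] for `where`.
# Blank inputs are skipped; date columns are compared directly, without
# CAST or LIKE, so an index on the column is at least usable.
def build_search_conditions(title: nil, content: nil, created_on: nil, updated_on: nil)
  clauses = []
  binds = []

  # Text columns: substring match, as in the original method.
  { "title" => title, "content" => content }.each do |column, value|
    next if value.nil? || value.to_s.strip.empty?
    clauses << "#{column} LIKE ?"
    binds << "%#{value}%"
  end

  # Date columns: exact match on a parsed Date (assumes a full date string).
  { "created_on" => created_on, "updated_on" => updated_on }.each do |column, value|
    next if value.nil? || value.to_s.strip.empty?
    clauses << "#{column} = ?"
    binds << Date.parse(value.to_s)
  end

  [clauses.join(" AND "), *binds]
end
```

It could be called as Post.where(build_search_conditions(title: params[:title], created_on: params[:created_on])). When every input is blank, the first element is an empty string and the where call is better skipped. Note also that Post.search returns a lazy relation, so the timing in the controller only covers building the query unless something forces it to run, for example Benchmark.realtime { @posts.load }.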

Related

Why integer is automatically converted to string when updating/saving in DB?

I have a column that is supposed to be a string. In schema.rb it looks something like this:
create_table "users", force: :cascade do |t|
  t.string "login_token", default: "xxxxx", null: false
end
But if I try to update the column, the DB accepts integers and automatically converts them to strings for some reason.
user = User.first
user.update(login_token: 1)
#=> true
user.login_token
#=> "1"
Why is this, and is it possible to add any restrictions to the DB or validations in Rails to prevent this kind of typecasting?
It works the other way around too. If you have an integer column and pass a string, Rails tries to convert it. This is very useful when you, say, create a record from an HTML form (almost everything comes as a string from the browser).
user_params # => { "age" => "20" }
u = User.create(user_params)
u.age # => 20
It's a feature/convention. I wouldn't fight it, if I were you.

Rails 5 - How to get records with OR condition instead of AND

I'm trying to get a complicated piece of data from rails.
What I want is all users together with their associated tasks, limited to a specific project.
Here's the task schema for reference:
create_table "tasks", force: :cascade do |t|
  t.date "start_date"
  t.date "end_date"
  t.integer "hours"
  t.string "priority"
  t.integer "project_id"
  t.integer "user_id"
  t.string "name"
  t.string "description"
end
I accomplish part of this with the following call:
users = User.includes(:tasks)
  .where('tasks.end_date BETWEEN ? AND ?', Date.today.beginning_of_week, Date.today.end_of_week).references(:tasks)
  .where('tasks.start_date BETWEEN ? AND ?', Date.today.beginning_of_week, Date.today.end_of_week).references(:tasks)
  .where('tasks.project_id = ?', @project.id)
  .map { |user| user.as_json.merge({tasks: user.tasks.as_json}) }
My problem is that my query is not finding the tasks based on their dates correctly.
I am trying to find all tasks within a week range that either have a start_date or end_date within that week.
Is this possible within one query or do I require more advanced logic?
If you are using Rails 5, you can make use of the or method:
User.joins(:tasks).where(tasks: {end_date: Date.today.beginning_of_week..Date.today.end_of_week})
  .or(
    User.joins(:tasks).where(tasks: {start_date: Date.today.beginning_of_week..Date.today.end_of_week})
  )
For the sake of brevity, I haven't included the project where clause.
I haven't tested this, but I think your chained where calls were being combined with AND: the first where selects users with tasks whose end_date falls within the given range, and the second restricts that further to tasks whose start_date also falls within the range. That leaves only users whose tasks both start and end within the given week.
users = User.includes(:tasks)
  .where('((tasks.end_date BETWEEN ? AND ?) OR (tasks.start_date BETWEEN ? AND ?)) AND tasks.project_id = ?', Date.today.beginning_of_week, Date.today.end_of_week, Date.today.beginning_of_week, Date.today.end_of_week, @project.id)
  .references(:tasks)
  .map { |user| user.as_json.merge({tasks: user.tasks.as_json}) }
Hope it helps. Cheers!
Here's what ended up working for me, thanks @vijay:
User.includes(:tasks)
  .where(tasks: {end_date: Date.today.beginning_of_week..Date.today.end_of_week})
  .or(User.includes(:tasks).where(tasks: {start_date: Date.today.beginning_of_week..Date.today.end_of_week}))
  .where('tasks.project_id = ?', @project.id)
  .map { |user| user.as_json.merge({tasks: user.tasks.as_json}) }
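The difference between chaining where calls (AND) and using or can be checked on plain Ruby data, without a database. The sketch below is only an illustration: Task is a hypothetical Struct standing in for the model, and week_range is a hypothetical helper that mimics the Monday-to-Sunday range that beginning_of_week and end_of_week give by default.

```ruby
require "date"

# Hypothetical stand-in for the Task model, for illustration only.
Task = Struct.new(:name, :start_date, :end_date)

# Monday..Sunday range containing `day` (the default week in ActiveSupport).
def week_range(day)
  monday = day - ((day.wday - 1) % 7)
  monday..(monday + 6)
end

# OR semantics: a task qualifies if it starts or ends within the week.
def tasks_touching_week(tasks, day)
  week = week_range(day)
  tasks.select { |t| week.cover?(t.start_date) || week.cover?(t.end_date) }
end

# AND semantics: what two chained `where` calls express instead.
def tasks_inside_week(tasks, day)
  week = week_range(day)
  tasks.select { |t| week.cover?(t.start_date) && week.cover?(t.end_date) }
end
```

A task that starts before the week and ends inside it is returned by tasks_touching_week but not by tasks_inside_week, which matches the behaviour described in the question.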

ThinkingSphinx: dynamic indices on the SQL-backed indices?

I am trying to use ThinkingSphinx (with SQL-backed indices) in my Rails 5 project.
I need some dynamic run-time indices to search over.
I have a Message model:
class Message < ApplicationRecord
  belongs_to :sender, class_name: 'User', :inverse_of => :messages
  belongs_to :recipient, class_name: 'User', :inverse_of => :messages
end
and its indexer:
ThinkingSphinx::Index.define :message, :with => :active_record, :delta => true do
  indexes text
  indexes sender.email, :as => :sender_email, :sortable => true
  indexes recipient.email, :as => :recipient_email, :sortable => true
  indexes [sender.email, recipient.email], :as => :messager_email, :sortable => true
  has sender_id, created_at, updated_at
  has recipient_id
end
schema.rb:
create_table "messages", force: :cascade do |t|
  t.integer "sender_id"
  t.integer "recipient_id"
  t.text "text"
  t.datetime "created_at", null: false
  t.datetime "updated_at", null: false
  t.boolean "read", default: false
  t.boolean "spam", default: false
  t.boolean "delta", default: true, null: false
  t.index ["recipient_id"], name: "index_messages_on_recipient_id", using: :btree
  t.index ["sender_id"], name: "index_messages_on_sender_id", using: :btree
end
The problem is about so-called "dialogs". They don't exist in the database; they are determined at run-time. A dialog is a set of messages between 2 users, where each user may be either the sender or the recipient.
The task is to search through my dialogs and find a dialog (its messages) by a fragment of the correspondent's email. So complicated!
Here's my effort:
conditions = {messager_email: search_email}
with_current_user_dialogs =
  "*, IF(sender_id = #{current_user.id} OR recipient_id = #{current_user.id}, 1, 0) AS current_user_dialogs"
messages = Message.search search_email, conditions: conditions,
  select: with_current_user_dialogs,
  with: {'current_user_dialogs' => 1}
This is almost right, but not quite. The query correctly searches only within my dialogs (the messages I sent or received), but it searches the :sender and :recipient fields simultaneously (which is not ideal).
Say my email is "client1@example.com". Other emails are like "client2@example.com", "client3@example.com", "manager1@example.com".
The trouble is that when I search for "client1", I get all the messages where I was either the sender or the recipient. But I should get nothing in response: I need to search only across my correspondents' emails, not my own.
It gets even worse when querying "client": I get back the correct correspondents "client2@example.com" and "client3@example.com", but the result is spoiled by the wrong "client1@example.com".
I need a way to choose at run-time - which index subset to search within.
I mean this condition is not enough for me:
indexes [sender.email, recipient.email], :as => :messager_email, :sortable => true
It searches (for "client") within all the sender.email and all the recipient.email at once.
But I need to choose dynamically, like: "search only within sender.email values where sender.id != current_user.id" OR "search only within recipient.email values where recipient.id != current_user.id" (because I can be either the sender or the recipient).
That's what I call a "dynamic index".
How can I do that? Such a "dynamic index" would surely depend on the current_user value, so it would differ between users even over the same set of messages.
It is clear that I can't apply any post-search filtering (what would I cut off?); I need to somehow limit the search itself.
I tried to search over some scope - but got the error that "searching is impossible over scopes" - something like that.
Maybe I should use the real-time indexing instead of the SQL-backed indexing?
Sorry for the complexity of my question.
Would the following work?
other = User.find_by :email => search_email
with_current_user_dialogs = "*, IF((sender_id = #{current_user.id} AND recipient_id = #{other.id}) OR (recipient_id = #{current_user.id} AND sender_id = #{other.id}), 1, 0) AS current_user_dialogs"
Or do you need partial matches on the searched email address?
[EDIT]
Okay, from the discussion in the comments below, it's clear that the field data is critical. While you can construct a search query that uses both fields and attributes, you can't have logic in the query that combines the two. That is, you can't say: "Search field 'x' when attribute 'i' is 1, otherwise search field 'y'."
The only way I can possibly see this working is if you're using fields for both parts of the logic. Perhaps something like the following:
current_user_email = "\"" + current_user.email + "\""
Message.search(
  "(@sender_email #{current_user_email} @recipient_email #{search_email}) | (@sender_email #{search_email} @recipient_email #{current_user_email})"
)

How to show entries from the current month?

For the sake of explanation, I'm writing an app where a User can log their expenses.
In the User's show view, I want to only show the User's expenses from the current month.
My expenses table looks like this:
create_table "expenses", force: :cascade do |t|
  t.date "date"
  t.string "name"
  t.integer "cost"
  t.datetime "created_at", null: false
  t.datetime "updated_at", null: false
  t.integer "user_id"
end
The date field is in the date format, so looks like: Thu, 14 Apr 2016
In my controller, I've got something like:
def show
  month = Date.today.strftime("%m")
  @user = User.find(params[:id])
  @expenses = Expense.where(:user_id => @user.id, :date => month)
end
Obviously, this isn't going to work, but it will be something along these lines, I'm guessing?
Any help would be great, thanks!
Usually you can tackle it this way:
Expense.where(date: (Date.today.beginning_of_month..Date.today.end_of_month))
That defines a range, which is translated into a BETWEEN x AND y clause.
If this is a common operation you might want to express the date as a separate column in YYYYMM format so that these are easily retrieved.
If you're using MySQL, you can use the extract function to build a where clause like this:
def show
  # integers, so that they compare cleanly with extract(); note that "%y" would give a two-digit year
  month = Date.today.month
  year = Date.today.year
  @user = User.find(params[:id])
  @expenses = Expense.where('extract(month from `date`) = ? AND extract(year from `date`) = ? AND `user_id` = ?', month, year, @user.id)
end
I haven't tested it, but it should work.
Sources:
https://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html
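The range approach can also be written without ActiveSupport. A minimal sketch, assuming a hypothetical helper name month_range: Date.new(year, month, -1) is plain Ruby for the last day of a month, so the range covers the whole month, including February in leap years.

```ruby
require "date"

# Hypothetical helper: first..last day of the month containing `day`.
def month_range(day = Date.today)
  Date.new(day.year, day.month, 1)..Date.new(day.year, day.month, -1)
end

# Intended use in the controller (assumes the Expense model from the question):
#   @expenses = Expense.where(user_id: @user.id, date: month_range)
```

Passing the range as a hash condition keeps the query database-independent, unlike the MySQL-specific extract version.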

Get data in batch using ActiveRecord

I am building a Rails app and would like to fetch a batch of data starting from a specific point. I use ActiveRecord, and my table structure looks like this:
create_table(:types) do |t|
  t.string :name, null: false
  t.string :type, null: false
  t.string :type_id, null: false
  t.text :metadata
  t.timestamps
end
To get the data I use type_id, which is in the following format (GUID):
"b2d506fd-409d-4ec7-b02f-c6d2295c7edd"
I would like to fetch a specific number of records, in ascending or descending order, starting from a specific type_id. To be more specific, I want to do something like this:
Model.get_batch(type_id: type, count: 20).desc
Can I do this simply in ActiveRecord?
You can use ActiveRecord::Batches to find records in batches.
Example:
Model.where('your condition').find_in_batches(start: 2000, batch_size: 2000) do |group|
  # do something with batch
end
See also ActiveRecord::Batches#find_in_batches.
You can do it like this (where returns a relation that can be chained, whereas find_by_type_id would return a single record):
Model.where(type_id: type).offset(batch_offset).limit(amount_in_batch)
Or, as in the answer above:
Model.where(type_id: your_value).find_in_batches(start: 2000, batch_size: 2000) do |group|
  # do something with batch
end
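What the question asks for (a fixed number of rows, ascending or descending, starting from a given type_id) can be sketched on plain Ruby data. This is only an illustration of the intended semantics under stated assumptions: Row is a hypothetical Struct standing in for the model, get_batch is the hypothetical method name from the question, and type_id values are assumed to be ordered as plain strings.

```ruby
# Hypothetical stand-in for rows of the `types` table.
Row = Struct.new(:type_id, :name)

# Returns up to `count` rows beginning at `from_type_id`, walking the rows
# in ascending or descending type_id order.
def get_batch(rows, from_type_id:, count:, direction: :asc)
  sorted = rows.sort_by(&:type_id)
  if direction == :desc
    sorted.reverse.drop_while { |r| r.type_id > from_type_id }.first(count)
  else
    sorted.drop_while { |r| r.type_id < from_type_id }.first(count)
  end
end
```

In ActiveRecord the ascending case would correspond roughly to Model.where("type_id >= ?", from_type_id).order(:type_id).limit(count). Note that the start: option of find_in_batches refers to the primary key, not to type_id.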

Resources