Let's say I have a Ruby on Rails model with a text field. Let's say I also have a query string. I want to make a query that sorts the records in descending order of the maximal number of overlapping characters between the text field and the query string. How would I go about this?
You can probably use a SQL LIKE statement, but it will not be efficient.
Here is an example:
Company.where(Company.arel_table[:name].matches("%stack%"))
If you need to run a lot of queries that search the text inside a database field, you really should start looking at full-text search software.
I can recommend Elasticsearch, unless you are already familiar with some other tool.
There is a recent tutorial here, and also a Railscast on the subject.
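For a small table, the ranking asked about in the question can also be done in memory. The following is only a sketch: it assumes "overlapping characters" means the longest common substring, and the helper names are made up for illustration. It will not scale the way a search server does.

```ruby
# Length of the longest run of consecutive characters shared by a and b
# (dynamic-programming longest common substring).
def longest_common_substring_length(a, b)
  best = 0
  prev = Array.new(b.length + 1, 0)
  a.each_char do |ca|
    curr = Array.new(b.length + 1, 0)
    b.each_char.with_index(1) do |cb, j|
      next unless ca == cb
      curr[j] = prev[j - 1] + 1
      best = curr[j] if curr[j] > best
    end
    prev = curr
  end
  best
end

# Sort names by overlap with the query, best match first (case-insensitive).
def rank_by_overlap(names, query)
  q = query.downcase
  names.sort_by { |name| -longest_common_substring_length(name.downcase, q) }
end

# Sample data; in a Rails app the names might come from Company.pluck(:name).
names = ["Stack Exchange", "Overflow Inc", "Stackoverflow"]
ranked = rank_by_overlap(names, "stackoverflow")
```

Since every name is loaded and compared in Ruby, this only makes sense when the table is small or the candidate set has already been narrowed down by another query.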
Related
I'm trying to use dart parse server in order to do a full-text search as explained here.
As far as I understand, I have the following 2 options if I want to do that:
whereContainsWholeWord
whereContains
In the case of the first one, I will search the whole database for that specific word, and for the second one, I will search the database for some partial word.
This is exactly what I need, but because the second search is going to utilize regex, this is going to be slow.
Is there any way to create an index for full-text search via the dart parse server, and afterward run a query against that index? Or is this feature not implemented yet? I could not find anything about it in the documentation.
I am trying to fetch all cards that do not belong to the current user and whose title contains some keywords.
What I have so far after research is this code below.
@other_cards = Card.where(:user_id.ne => @temp_current_user, "title LIKE ?" => "%Business%").limit(10).order('created_at desc down_title asc')
But no records are returned even though they are present.
I have also tried skipping the :user_id condition and all the limits and orders, resulting in this simple code to check whether it returns anything:
@other_cards = Card.where("title LIKE ?", "%Business%")
EDIT
I have also tried putting the conditions in an array, as suggested in the first comment. It also resulted in an error.
EDIT 2
Attaching screenshot of running the query in rails console
LIKE syntax is part of SQL. MongoDB, and Mongoid, do not (directly) support SQL. The general rule of thumb is Ruby methods carry over from ActiveRecord to Mongoid, for example the following works with both AR and Mongoid:
Card.where(id: 123)
... but once you start to specify complex conditions in either keys or values, they are often specific to the database you are working with. In the code you provided, the :user_id.ne syntax will not be recognized by ActiveRecord, and Mongoid/MongoDB do not recognize title LIKE ?.
Your problem statement was:
I am trying to fetch all cards that do not belong to the current user and its title contains some keywords.
MongoDB has native full text search capability, which is a superior solution to trying to match keywords with regular expressions or SQL LIKE operators. MongoDB has good documentation on full text searches here. The Ruby driver has further documentation, and examples, for how to use full text search here. Mongoid does not currently have documentation for full text search (there is an in progress pull request to add it here) but the gist of it is to combine the driver documentation referenced earlier with the documentation here for how to work with indexes in Mongoid in general.
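As a rough illustration of what the query could look like, here is a sketch that builds a raw MongoDB selector using the `$ne` and `$text` operators. The helper name is hypothetical, and the Mongoid usage in the comments assumes a text index has been declared on `title`; check it against the driver documentation before relying on it.

```ruby
# Build a raw MongoDB selector: cards not owned by the given user whose
# text-indexed fields match the keywords. With $text, space-separated
# terms are matched as alternatives (any term may match).
def other_cards_selector(current_user_id, keywords)
  {
    "user_id" => { "$ne" => current_user_id },
    "$text"   => { "$search" => Array(keywords).join(" ") }
  }
end

selector = other_cards_selector(42, %w[Business])
# Hypothetical usage with Mongoid, assuming Card declares
#   index({ title: "text" })
# and the index has been created in the database:
#   @other_cards = Card.where(selector).order(created_at: :desc).limit(10)
```

Keeping the selector as a plain hash makes it easy to inspect in the console before passing it to `where`.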
I have a search record that stores an all_words column. It is a list of words separated by a space. I have another model named Lead. I want to search all the columns of all the rows of the leads table for the values in all_words. And any record that produces a match in any of its columns will be retrieved. Kind of like this:
possible_values = search.all_words.split
Lead.where(first_name: possible_values )
.where(last_name: possible_values )
.where(status: possible_values )
...
But this doesn't look clean. How can I go about this?
Indexing
You'll be much better served by an index-based search solution.
I wrote about this the other day - if you're going to "search" your database (especially multiple attributes), you really need to use a third party solution to provide access to the data you require.
The "common" way to search databases, across all flavours of SQL, is to use full text search (which basically looks up information within an array of different attributes / columns, rather than specific matches).
The following solutions are popular for Rails based projects:
Thinking Sphinx
Sunspot Solr
ElasticSearch
--
References
The magic of these is that they will index any of the data you wish to search, storing the data in a semi-persistent data set.
This is vitally important, as one of the main reasons why full-text searching your database directly is a bad idea is the performance cost it incurs. You'll be best off using one of the aforementioned gems to get it working correctly.
There's a good Railscast about this here:
What you are looking for is full-text search. Depending on the type of database you have you will use different strategies.
You will be able to create a search index on as many columns as you like.
For Postgresql
The good thing is that PostgreSQL already has full-text search capabilities. You can use these gems to benefit from it.
PG Search
Textacular
For Mysql
Dusen (uses FULLTEXT index capabilities in MySQL and LIKE queries)
Thinking Sphinx (uses Sphinx search server)
Sunspot (uses Solr search server)
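If exact matches are all that is needed and the table is small, a plain SQL condition may be enough before reaching for a search server. Note that the chained `.where` calls in the question combine with AND, whereas "a match in any of its columns" calls for OR. The sketch below builds such a condition; the helper name and sample values are made up for illustration.

```ruby
# Build a SQL condition that is true when ANY of the given columns holds
# one of the candidate values. Column names are interpolated into the SQL,
# so they must come from a fixed list in code, never from user input.
def any_column_in(columns, values)
  sql = columns.map { |column| "#{column} IN (?)" }.join(" OR ")
  [sql] + Array.new(columns.size) { values }
end

# Stand-in for search.all_words.split from the question.
possible_values = "john smith open".split
condition = any_column_in(%w[first_name last_name status], possible_values)
# Hypothetical usage in the Rails app: Lead.where(*condition)
```

ActiveRecord expands each array bound to a `?` into a comma-separated list, so every column gets its own `IN (...)` clause.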
I want to do the opposite of this:
image_name = "blah"
Pipe.where("LOWER(name) like ?", "%#{image_name.downcase}%") # this would find a Pipe named 'blahzzz'
What I want is the opposite, where I have a pipe named ah and given image_name = "blah", I want to be able to find that Pipe. How would I accomplish this?
I think what you are looking for is the functionality of a full-text search index like Lucene. Search for Sunspot or Tire; they provide bindings for Solr and Elasticsearch. Those are the most common full-text servers out there.
If you want to find partials of text, there is a feature called n-grams that allows you to find matching parts or substrings. I think that would be the way to go on a larger scale.
If you have just one place where you are going to implement this functionality and your database is not too large, you can mimic the behavior in a relational database by combining a lot of OR LIKE queries and providing substrings of the input.
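A minimal sketch of the substring idea, assuming an exact match of the pipe name against any substring of the input is acceptable (a simpler variation on the OR LIKE approach). The helper name is hypothetical.

```ruby
# Every contiguous substring of text, without duplicates.
def all_substrings(text)
  (0...text.length).flat_map do |start|
    (1..text.length - start).map { |len| text[start, len] }
  end.uniq
end

image_name = "blah"
candidates = all_substrings(image_name.downcase)
# Hypothetical usage in the Rails app:
#   Pipe.where("LOWER(name) IN (?)", candidates)
# which would match a Pipe named "ah".
```

The number of substrings grows quadratically with the input length, so this is only reasonable for short inputs.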
I'm building my first Rails app and have it working great with Thinking Sphinx. I'm understanding most of it but would love it if someone could help me clarify a few conceptual questions
When displaying search results after a sphinx query, should I be using the sphinx_attributes that are returned from the sphinx query? Or should my view use normal rails objects, such as @property.title, @property.amenities.title etc? If I use normal rails objects, doesn't that mean it's doing extra queries?
In a forum, I'd like to display 'unread posts'. Obviously this is true/false for each user/topic combination, so I'm thinking I should be caching the 'reader' ids within the topic's sphinx index. This way I can quickly do a query for all unread posts for a given user_id. I've got this working, but then realised it's pointless, as there is a time delay between sphinx indexes. So if a user clicks on an unread post, it will still appear unread until the sphinx DB is re-indexed.
I'm still in development so I'm manually indexing/rebuilding, but in production, what is a standard time between re-indexing?
I have a model with several text fields - should I concat these all into one column in the sphinx index for a keyword search? Surely this is quicker than indexing all the separate fields.
Slightly off-topic, but just wondering - when you access nested models, for example @property.agents.name, does this affect performance? Or does rails automatically fetch all associated entries when a property is pulled from the database?
To answer each of your points:
For both of your examples, sphinx_attributes would not be helpful. Firstly, you've already loaded the property, so the title is available directly without an extra database hit. And for property.amenities.title you're dealing with an array of strings, which Sphinx has no concept of. Generally, I would only use sphinx_attributes for complicated calculated attributes, not standard column references.
Yes, you're right, there will be a delay with this value.
It depends on how often your data changes. I have some apps where I can index every day because changes are so rare, but others where we'll run it every 10 minutes. If the data is particularly volatile, I'll look at using deltas (usually via Sidekiq) to have changes reflected in Sphinx in a few seconds.
I don't think there's much difference either way - unless you want to search on any of those columns separately? If so, it'll need to be a separate field.
By default, as you use each property's agents, the agents for that property will be loaded from the database (one SQL call per property). You could look at the eager loading docs for how to manage this better when you're dealing with multiple records. Thinking Sphinx has the ability to pass through :include options to the underlying ActiveRecord call.