rails search all columns of all rows for value - ruby-on-rails

I have a search record that stores an all_words column. It is a list of words separated by a space. I have another model named Lead. I want to search all the columns of all the rows of the leads table for the values in all_words. And any record that produces a match in any of its columns will be retrieved. Kind of like this:
possible_values = search.all_words.split
Lead.where(first_name: possible_values )
.where(last_name: possible_values )
.where(status: possible_values )
...
But this doesn't look clean. How can I go about this?

Indexing
You'll be much better served by an index-based search solution.
I wrote about this the other day - if you're going to "search" your database (especially multiple attributes), you really need to use a third party solution to provide access to the data you require.
The "common" way to search databases, across all flavours of SQL, is to use full text search (which basically looks up information within an array of different attributes / columns, rather than specific matches).
The following solutions are popular for Rails based projects:
Thinking Sphinx
Sunspot Solr
ElasticSearch
--
References
The magic of these is that they will index any of the data you wish to search, storing the data in a semi-persistent data set.
This is vitally important, because one of the main reasons full text searching directly against your database is a bad idea is the performance cost. You're best off using one of the aforementioned gems to get it working correctly.
There's a good Railscast about this here:

What you are looking for is full-text search. Depending on the type of database you have you will use different strategies.
You will be able to create a search index on as many columns as you like.
For PostgreSQL
The good thing is that PostgreSQL already has full-text search capabilities. You can use these gems to benefit from it.
PG Search
Textacular
For MySQL
Dusen (uses FULLTEXT index capabilities in MySQL and LIKE queries)
Thinking Sphinx (uses Sphinx search server)
Sunspot (uses Solr search server)
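For a small table where a search index is not yet worth setting up, note that the chained where calls in the question are combined with AND, so a row would have to match in every column. A minimal sketch of the "any column" version follows. The helper name any_column_conditions is hypothetical, and it only builds the condition array; nothing here comes from the gems listed above.

```ruby
# Hypothetical helper: builds an ActiveRecord-style condition array that
# matches when ANY of the given columns equals ANY of the words.
def any_column_conditions(columns, words)
  raise ArgumentError, "no columns given" if columns.empty?

  # Column names are interpolated into SQL, so only allow plain identifiers.
  columns.each do |column|
    unless column.to_s.match?(/\A[a-z_][a-z0-9_]*\z/i)
      raise ArgumentError, "bad column name: #{column}"
    end
  end

  # One "column IN (?)" fragment per column, joined with OR instead of AND.
  sql = columns.map { |column| "#{column} IN (?)" }.join(" OR ")

  # One bind value (the whole word list) per placeholder.
  [sql] + Array.new(columns.size) { words }
end

possible_values = "john smith active".split
conditions = any_column_conditions(%w[first_name last_name status], possible_values)
# In a Rails app this array could be passed on as: Lead.where(conditions)
```

This still does exact matching only, and it scans every listed column, so it is a stopgap rather than a replacement for full-text search.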

Related

How do I do a general search across string properties in my nodes?

Working with Neo4j in a Rails app.
I have nodes with several string properties containing long strings of user generated content. For example in my nodes of type: "Book", I might have properties, "review", and "summary", which would contain long-form string values.
I was trying to design queries that returned nodes which match those properties to general language search terms provided by a user in a search box. As my query got increasingly complicated, it occurred to me that I was trying to resolve natural language search.
I looked into some of the popular search gems in Rails, but they all seem to depend on ActiveRecord. What search solutions exist for Neo4j.rb?
There are a few ways that you could go about this!
As FrobberOfBits said, Neo4j has what are called "legacy indexes", which use Lucene in the background to provide indexing of generic things. It also supports the newer schema indexes. Unfortunately those are based on exact matches (though I'm pretty sure that will change somewhat in Neo4j 2.3.x).
Neo4j does support pattern matching on strings via the =~ operator, but those queries aren't indexed. So the performance depends on the size of your database.
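As a rough sketch of the =~ approach: the operator has to match the whole property value, so a "contains" search needs .* on both sides. The helper name, the Book label and the property names below are assumptions taken from the question's example, and the {pattern} placeholder assumes the Neo4j 2.x parameter syntax.

```ruby
# Hypothetical helper: turns a user's search term into a pattern for Cypher's
# =~ operator, which must match the entire property value.
def cypher_contains_pattern(term)
  # (?i) makes the match case-insensitive, .* on both sides lets the term
  # appear anywhere, and Regexp.escape neutralises regex metacharacters.
  # Note: "." does not match line breaks unless the (?s) flag is added too.
  "(?i).*#{Regexp.escape(term)}.*"
end

# Assumed label and property names; the query is only built here, not run.
BOOK_SEARCH = "MATCH (b:Book) " \
              "WHERE b.summary =~ {pattern} OR b.review =~ {pattern} " \
              "RETURN b"

params = { pattern: cypher_contains_pattern("c++ (2nd ed.)") }
```

The query string and params hash would then be handed to whatever query method your Neo4j.rb version provides. Since these queries aren't indexed, every Book node is still scanned.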
We often recommend a gem called searchkick which lets you define indexes for Elasticsearch in your models. Then you can just call a Model.search method to do your searches and it will first query elasticsearch to get the node IDs and then load those nodes via Neo4j.rb. You can use that via the neo4j-searchkick gem: https://github.com/neo4jrb/neo4j-searchkick
Lastly, if you're doing NLP and are trying to extract important words from your text, you could create a Tag/Word label and create relationships from your nodes to these NLP extracted nodes so that you can search based on those nodes in the future. You could even build recommendations from one text node to another based on the number/type of common tag nodes.
I don't know if anything specific exists for neo4j.rb and activerecord. What I can say is that generally this stuff is handled through the use of legacy indexes that are implemented by Lucene.
The premise is that you create a lucene-managed index on certain properties, and that then gives you access to use the Lucene query language via cypher to get data from those indices. Relative to neo4j.rb, it doesn't look any different than running cypher queries, like this:
START item=node:node_auto_index("(title:'foo bar' AND body:baz*) OR title:'bat'")
RETURN item
Note that lucene indexes and that query language can only be used in a START block, not a MATCH block. Refer to the Lucene Query Syntax to discover more about what you can do with that query syntax (fuzzy matching, wildcards, etc -- quite a bit more extensive than what regex would give you).

Neo4j schema indexes for fuzzy search

Right now I'm thinking about the possibility of building fuzzy search into my application on top of my Neo4j database.
The main criteria are: fuzzy search and performance.
What is the best way to achieve these goals with the latest version of Neo4j community edition?
Fuzzy search is a tricky thing. Even in plain lucene (where you can do fuzzy search with lucene query strings) it is not recommended because it is quite expensive.
You can use that query syntax in Neo4j too, provided you have indexed your data with a manual index.
The solution that most suggest is to rather go with auto-suggestion, i.e. match on the first few characters, present the options in the auto-complete box and then search by using the user selected strings.
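To illustrate the auto-suggestion idea, here is a minimal in-memory sketch. The TITLES list and the suggest method are hypothetical stand-ins; in a real application the prefix match would run against an index rather than a Ruby array.

```ruby
# Hypothetical in-memory stand-in for an indexed list of searchable titles.
TITLES = ["Neo4j in Action", "Neural Networks", "Network Science", "Graph Databases"].freeze

# Returns up to `limit` titles that start with the typed prefix, ignoring case.
def suggest(prefix, titles, limit: 5)
  needle = prefix.downcase
  return [] if needle.empty?

  titles.select { |title| title.downcase.start_with?(needle) }.sort.first(limit)
end

options = suggest("ne", TITLES)
# The user then picks one of `options`, and the final search is an exact
# match on the selected string.
```

The point of the design is that only cheap prefix matching and exact lookups are needed, so no fuzzy query ever has to run.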

Rails Active Record Relation sorting

Let's say I have a ruby on rails model, with a text field. Let's say I also have a query string. I want to make a query that sorts the database in descending order of maximal number of overlapping characters between the text field and the query string. How would I go about this?
You can probably go with a SQL LIKE statement, but it will not be efficient.
Here is an example:
Company.where(Company.arel_table[:name].matches("%stack%"))
If you need to run a lot of queries that search the text inside a database field, you really need to start looking at full text search software.
I can recommend Elasticsearch, unless you are familiar with some other tool.
There is a recent tutorial and also a Railscast on the subject.
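If the table is small enough to load into memory, the ordering from the question can also be sketched in plain Ruby. This assumes "overlapping characters" means the longest common substring, which is only one possible reading, and both method names are hypothetical.

```ruby
# Length of the longest run of consecutive characters shared by a and b
# (dynamic-programming longest common substring, keeping two rows only).
def longest_common_substring_length(a, b)
  best = 0
  previous = Array.new(b.length + 1, 0)

  a.each_char do |char_a|
    current = Array.new(b.length + 1, 0)
    b.each_char.with_index(1) do |char_b, j|
      next unless char_a == char_b

      # Extend the common run that ended at the previous character pair.
      current[j] = previous[j - 1] + 1
      best = current[j] if current[j] > best
    end
    previous = current
  end

  best
end

# Sorts plain strings (stand-ins for the text column) by overlap, best first.
def sort_by_overlap(texts, query)
  texts.sort_by { |text| -longest_common_substring_length(text.downcase, query.downcase) }
end

ranked = sort_by_overlap(["apple", "stack exchange", "stick"], "stack")
```

Each comparison costs time proportional to the product of the two string lengths, and every row has to be loaded first, so this does not scale the way a search index does.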

Rails 4 + SQL Lite: Find records with substring of big string?

I want to do the opposite of this:
image_name = "blah"
Pipe.where("LOWER(name) like ?", "%#{image_name.downcase}%") # this would find a Pipe named 'blahzzz'
What I want is the opposite, where I have a pipe named ah and given image_name = "blah", I want to be able to find that Pipe. How would I accomplish this?
I think what you are looking for is the functionality of a full-text search index like Lucene. Search for sunspot or tire; they provide bindings for Solr and Elasticsearch. Those are the most common full-text servers out there.
If you want to find partials of text, there is a feature called n-gram that allows you to find matching parts or substrings. I think that would be the way to go on a larger scale.
If you have just one place where you are going to implement this functionality and your database is not too large, you can mimic the behavior in a relational database by combining a lot of OR LIKE queries and providing substrings of the input.
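A rough sketch of that last option is below. It swaps the chained OR LIKE clauses for a single IN list, which behaves the same when the patterns contain no wildcards. Both method names and the min_length cut-off are assumptions for illustration.

```ruby
# All distinct contiguous substrings of `text` with at least `min_length` characters.
def substrings(text, min_length: 2)
  found = []
  (0...text.length).each do |start|
    (min_length..(text.length - start)).each do |length|
      found << text[start, length]
    end
  end
  found.uniq
end

# Hypothetical helper: condition array matching rows whose lowercased name
# equals any substring of the input.
def name_within_conditions(input, min_length: 2)
  ["LOWER(name) IN (?)", substrings(input.downcase, min_length: min_length)]
end

conditions = name_within_conditions("blah")
# In a Rails app this array could be passed on as: Pipe.where(conditions)
```

With image_name = "blah" the list contains "ah", so a Pipe named ah would be found. The number of substrings grows quadratically with the input length, which is why this only suits short inputs.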

Thinking Sphinx & Rails questions

I'm building my first Rails app and have it working great with Thinking Sphinx. I'm understanding most of it but would love it if someone could help me clarify a few conceptual questions
When displaying search results after a sphinx query, should I be using the sphinx_attributes that are returned from the sphinx query? Or should my view use normal rails objects, such as #property.title, #property.amenities.title etc? If I use normal rails objects, doesn't that mean it's doing extra queries?
In a forum, I'd like to display 'unread posts'. Obviously this is true/false for each user/topic combination, so I'm thinking I should be caching the 'reader' ids within the topic's sphinx index. This way I can quickly do a query for all unread posts for a given user_id. I've got this working, but then realised it's pointless, as there is a time delay between sphinx indexes. So if a user clicks on an unread post, it will still appear unread until the sphinx DB is re-indexed.
I'm still on development so I'm manually indexing/rebuilding, but on production, what is a standard time between re-indexing?
I have a model with several text fields - should I concat these all into one column in the sphinx index for a keyword search? Surely this is quicker than indexing all the separate fields.
Slightly off-topic, but just wondering - when you access nested models, for example #property.agents.name, does this affect performance? Or does rails automatically fetch all associated entries when a property is pulled from the database?
To answer each of your points:
For both of your examples, sphinx_attributes would not be helpful. Firstly, you've already loaded the property, so the title is available directly without an extra database hit. And for property.amenities.title you're dealing with an array of strings, which Sphinx has no concept of. Generally, I would only use sphinx_attributes for complicated calculated attributes, not standard column references.
Yes, you're right, there will be a delay with this value.
It depends on how often your data changes. I have some apps where I can index every day because changes are so rare, but others where we'll run it every 10 minutes. If the data is particularly volatile, I'll look at using deltas (usually via Sidekiq) to have changes reflected in Sphinx in a few seconds.
I don't think it's much difference either way - unless you want to search on any of those columns separately? If so, it'll need to be a separate field.
By default, as you use each property's agents, the agents for that property will be loaded from the database (one SQL call per property). You could look at the eager loading docs for how to manage this better when you're dealing with multiple records. Thinking Sphinx has the ability to pass through :include options to the underlying ActiveRecord call.
