Search Gems for Rails - ruby-on-rails

I was browsing reddit for the answer to this and came across this conversation which lists out a bunch of search gems for rails, which is cool. But what I wanted was something where I could:
Enter: OMG Happy Cats
It searches the whole database looking for anything that has OMG Happy Cats and returns me a an array of model objects that contain that value, that I can then use Active model serializer (Very important to be able to use this) on to return you a json object of search results so you can display what ever you want to the user.
So that json object, if this was a blog, would have a post object, maybe a category object and even a comment object.
Everything I have seen is very specific to one controller, one model. Which is nice an all but I am more of a "search for what you want, we will return you what you want, maybe grow smarter like this gem, searchkick which also has the ability to offer spelling suggestion.
I am building this with an API, so it would be limited to everything that belongs to a blog object (as to make it not so huge of a search), so it would search things like posts, tags, categories, comments and pages looking for your term, return a json object (as described) and boom done.
Any ideas?

You'll be best considering the underlying technology for this
--
Third Party
As far as I know (I'm not super experienced in this area), the best way to search an entire Rails database is to use a third party system to "index" the various elements of data you require, allowing you to search them as required.
Some examples of this include:
Sunspot / Solr
ElasticSearch
Essentially, having one of these "third party" search systems gives you the ability to index the various records you want in a separate database, which you can then search with your application.
--
Notes
There are several advantages to handling "search" with a third party stack.
Firstly, it takes the load off your main web server - which means it'll be more reliable & able to handle more traffic.
Secondly, it will ensure you're able to search all the data of your application, instead of tying into a particular model / data set
Thirdly, because many of these third party solutions index the content you're looking for, it will free up your database connectivity for your actual application, making it more efficient & scaleable

For PostgreSQL you should be able to use pg_search.
I've never used it myself but going by the documentation on GitHub, it should allow you to do:
documents = PgSearch.multisearch('OMG Happy Cats').to_a
objects = documents.map(&:searchable)
groups = objects.group_by{|o| o.class.name.pluralize.downcase}
json = Hash[groups.map{|k,v| [k,ActiveModel::ArraySerializer.new(v).as_json]}].as_json
puts json.to_json

Related

What is a good approach to allow users subscribe to keywords

I have a rails application using Rails 4, PostgreSQL and hosted on Heroku.
The application revolves around the following models: User and Article.
A user can create articles. An article contains a title, description, location (latitude, longitude) and an image.
I would like to add a notification system that works as follows:
A user can set-up a list of keywords that they wish to subscribe to.
The user gets a notification if an article containing one of their keywords is added (in the title, but perhaps in description in time).
What is the best approach to implement this in a scalable way?
In its simplest form, I could create a model called Keyword that stores what keywords a user wants to be notified for.
Then in the create action for article, check to see if the title (or description) contains any of the saved keywords.
This sounds good but will probably fall over once any reasonable amount of users are added.
Obviously, a background task would do the trick but it still sounds wrong to do a basic string contains directly on the database.
Perhaps I could tokenize the title and description into an index and use a background process to handle the heavy lifting? I heard Postgres has some built in text search - could this work?
Could I use a Heroku add-on like Solr or Redis to handle all this or is it overkill? (Not having to pay for an add-on is an advantage).
Perhaps someone has a better implementation for the same functionality.
I know I can implement it quickly, I just want to be sure it implementation is up to scratch.
Thanks,
Brian
I have faced a similar problem. The slowest thing is to do a case insensitive search. What I would suggest to you is the following approach: let TID be the id of the row in which you store the title; then create a table which has one row for every word in your title in lowercase, with the corresponding TID. Than what you need is a join between the word and the keywords of the given user. You can speed up this query with hash indexes.
In my case, no one of the postgres text function was usable because they all have poor performance.
PS we implemented a full text search over about 60000 documents, so your case might be a bit different.

Rails - Store unique data for each open tab/window

I have an application that has different data sets depending on which company the user has currently selected (dropdown box on sidebar currently used to set a session variable).
My client has expressed a desire to have the ability to work on multiple different data sets from a single browser simultaneously. Hence, sessions no longer cut it.
Googling seems to imply get or post data along with every request is the way, which was my first guess. Is there a better/easier/rails way to achieve this?
You have a few options here, but as you point out, the session system won't work for you since it is global across all instances of the same browser.
The standard approach is to add something to the URL that identifies the context in which to execute. This could be as simple as a prefix like /companyx/users instead of /users where you're fetching the company slug and using that as a scope. Generally you do this by having a controller base class that does this work for you, then inherit from that for all other controllers that will be affected the same way.
Another approach is to move the company identifying component from the URL to the host name. This is common amongst software-as-a-service providers because it makes sharding your application much easier. Instead of myapp.com/companyx/users you'd have companyx.myapp.com/users. This has the advantage of preserving the existing URL structure, and when you have large amounts of data, you can partition your app by customer into different databases without a lot of headache.
The answer you found with tagging all the URLs using a GET token or a POST field is not going to work very well. For one, it's messy, and secondly, a site with every link being a POST is very annoying to work with as it makes navigating with the back-button or forcing a reload troublesome. The reason it has seen use is because out of the box PHP and ASP do not have support routes, so people have had to make do.
You can create a temporary database table, or use a key-value database and store all data you need in it. The uniq key can be used as a window id. Furthermore, you have to add this window id to each link. So you can receive the corresponding data for each browser tab out of the database and store it in the session, object,...
If you have an object, lets say #data, you can store it in the database using Marshal.dump and get it back with Marshal.load.

MongoDB and embedded documents, good use cases

I am using embedded documents in MongoDB for a Rails 3 app. I like that I can use embedded documents and the values are all returned with one query and there is less load on the database server. But what happens if I want my users to be able to update properties that really should be shared across documents. Is this sort of operation feasible with MongoDB or would I be better off using normal id based relations? If ID based relations are the way to go would it affect performance to a great degree?
If you need to know anything else about the application or data I would be happy to let you know what I am working with.
Document that has many properties that all documents share.
Person
name: string
description: string
Document that wants to use these properties:
Post
(references many people)
body: string
This all depends on what are you going to do with your Person model later. I know of at least one working example (blog using MongoDB) where its developer keeps user data inside comments they make and uses one collection for the entire blog. Well, ok, he uses second one for his "tag cloud" :) He just doesn't need to keep centralized list of all commenters, he doesn't care. His blog contains consolidated data from all his previous sites/blogs?, almost 6000 posts total. Posts contain comments, comments contain users, users have emails, he got "subscribe to comments" option for every user who comments some post, authorization is handled by the external OpenID service aggregator (Loginza), he keeps user email got from Loginza response and their "login token" in their cookies. So the functionality is pretty good.
So, the real question is - what are you going to do with your Users later? If really feel like you need a separate collection (you're going to let users have centralized control panels, have site-based registration, you're going to make user-centristic features and so on), make it separate. If not - keep it simple and have fun :)
It depends on what user info you want to share acrross documents. Lets say if you have user and user have emails. Does not make sence to move emails into separate collection since will be not more that 10, 20, 100 emails per user. But if user say have some big related information that always growing, like blog posts then make sence to move it into separate collection.
So answer depend on user document structure. If you show your user document structure and what you planning to move into separate collection i will help you make decision.

Is it worth using "pretty URLs" if you don't care about SEO/SEM

I'm designing a hosted software-as-a-service application that's like a highly specialized version of 37Signal's Highrise product. In that context, where SEO is a non-issue, is it worth implementing "pretty URLs" instead of going with numeric IDs (e.g. customers/john-smith instead of customers/1234)? I notice that a lot of web applications don't bother with them unless they provide a real value (e.g. e-commerce apps, blogs - things that need SEO to be found via search engines)
Depends on how often URLs are transmitted verbally by its users. People tend to find it relatively difficult to pronounce something like
http://www.domain.com/?id=4535&f=234&r=s%39fu__
and like
http://www.domain.com/john-doe
much better ;)
In addition to readability, another thing to keep in mind is that by exposing an auto-incrementing numeric key you also allow someone to guess the URLs for other resources and could give away certain details about your data. For instance, if someone signs up for your app and sees that their account is at /customer/12, it may effect their confidence in your application knowing that you only have 11 other customers. This wouldn't be an issue if they had a url of /customer/some-company.
It's always worth it if you just have the time to do it right.
Friendly-urls look a lot nicer and they give a better idea where the link will lead. This is useful if the link is shared eg. via instant message.
If you're searching for a specific page from browser history, human readable url helps.
Friendly url is a lot easier to remember (useful in some cases).
Like said earlier, it is also a lot easier to communicate verbally (needed more often than you'd think).
It hides unnecessary technical details from the user. In one case where user id was visible in the url, several users asked why their user id is higher than total amount of users. No damage done, but why have a confused user if you can avoid it.
I sure am a lot more likely to click on a link when I mouseover it, and it has http://www.example.com/something-i-am-interested-in.html.
Rather than seeing http://www.example.com/23847ozjo8uflidsa.asp.
It's quite annoying clicking links on MSDN because I never know what to expect I will get.
When I create applications I try my best to hide its structure from prying eyes - while it's subjective on how much "SEO" you get out of it - Pretty URLs tend to help people navigate and understand where they are while protecting your code from possible injections.
I notice you're using Rails app - so you probably wouldn't have a huge query string like in ASP, PHP, or those other languages - but in my opinion the added cleanliness and overall appearance is a plus for customer interaction. When sharing links it's nicer for customers to be able to copy the url: customer/john_doe than have to hunt for a "link me" or a random /customer/
Marco
I typically go with a combination -- keeping the ease of using Rails RESTful routing while still providing some extended information in URLs.
My app URLs look something like this:
http://example.com/discussions/123-is-it-worth-using-pretty-urls/
http://example.com/discussions/123-is-it-worth-using-pretty-urls/comments
http://example.com/discussions/123-is-it-worth-using-pretty-urls/comments/34567
You don't have to add ANY custom routes to pull this off, you just need to add the following method to your model:
def to_param
[ id, permalink ].join("-")
end
And ensure any find calling params[:id] in your controller is converted to an integer by setting params[:id].to_i.
Just a note, you'll need to set a permalink attribute when your record is saved...
If your application is restful, the URLs that rails gives you are SEO-friendly by default.
In your example, customers/1234 will probably return something like
<h1>Customer</h1>
<p><strong>Name:</strong> John Smith</p>
etc etc
Any current SEO spider will be smart enough to parse the destination page and extract that "John Smith" from there anyway.
So, in that sense, customers/1234 is already a "nice" URL (as opposed to other systems, in which you would have something like resource/123123/1234 for customer 1234 resource/23232/321 for client 321).
Now, if you want your users to be regularly using urls (like in delicious, etc) you might want to start using logins and readable fields instead of ids.
But for SEO, ids are just fine.

RESTful route for a list of members that are not in a collection

I'm trying to figure out what the best way to show a list of members (users) that aren't a collection (group).
/users
is my route for listing all of the users in the account
/group/:id/members
is my route for listing all of the users in the group
/users?not_in_group=:id
is my current option for showing a list of users NOT in the group. Is there a more RESTFul way of displaying this?
/group/:id/non_members
seems sort of odd…
Either query parameters or paths can be used to get at the representation you want. But I'd follow Pete's advice and make sure your API is hypertext-driven. Not doing so introduces coupling between client and server that REST was intended to prevent.
The best answer to your question might depend on your application. For example, if your system is small enough, it may suffice to only support a representation consisting of a list of users and their respective groups (the resource found at /users). Then let the client sort out what they want to do with the information. If your system has lots of groups and lots of users, each of which belongs to only a couple of groups, your available_users representation for any group is likely to be only slightly smaller than the entire list of users anyway.
Creative design of media types can go a long way to solving problems like this.
Spoke with my partner. He suggested:
/group/:id/available_members
Seems much more positive.
The main precept of REST is "hypertext as the engine of application state". The form of the URI is irrelevant, what matters is that it is navigable from the representation returned at the application's entry point.

Resources