I am trying to add full-text search capabilities to my RoR app, but I'm facing some issues when it comes to Arabic. AFAIK, there aren't many search engines out there that support Arabic stemming, morphology, and other advanced full-text search features. The only thing I found was Lucene with the AraMorph tokenizer.
The acts_as_solr plugin (Solr is based on Lucene, and this plugin integrates it with Rails) seems to be abandoned, and I can't find any helpful documentation.
I've looked into Sphinx, Xapian, Ferret, and acts_as_searchable, but none of them offers advanced Arabic search functionality, to the best of my knowledge.
Any help would be really appreciated
== Update
I've got suggestions to use sphinx, and I did use it on an earlier project, and it works just fine. However, it does not provide any advanced search capabilities.
For instance, the words كتاب (book), مكتبة (library), and كاتب (writer) are all derived from the same stem كتب. I want to be able to search for "writer" and get results for all words derived from the same stem.
Also, I want the search to take into account common Arabic spelling variations. Some people write the "hamza" (همزة) and some don't. Some write words with the letter "taa marboota" (التاء المربوطة) while others use the letter "haa" (الهاء). A good Arabic search engine should recognize such subtle differences and account for them.
With Sphinx you only get exactly what you search for, and the only engine I found that accommodates these aspects of the Arabic language was Lucene with the AraMorph tokenizer. However, acts_as_solr (the Lucene plugin for Rails) is abandoned. So my question is: is there any other such tokenizer for any search engine?
KandadaBoggu mentioned sunspot, I'll give that a go, and respond back
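To make the two requirements in the update concrete, here is a minimal Ruby sketch of what such an analysis step does before indexing. It is only an illustration: the `ArabicSearch` module, its folding rules (hamza-carrying alef forms folded to bare alef, taa marboota folded to haa), and the hand-written root table are all assumptions made for this example, not the behaviour of AraMorph, Lucene, or Sphinx. A real morphological analyzer would derive the root rather than look it up.

```ruby
# Sketch only: the folding rules and the root table are illustrative
# assumptions, not the behaviour of AraMorph or any search engine.
module ArabicSearch
  HAMZA_ALEFS  = "\u0623\u0625\u0622" # alef with hamza above, hamza below, madda
  BARE_ALEF    = "\u0627"
  TAA_MARBOOTA = "\u0629"
  HAA          = "\u0647"

  # Fold the spelling variants so both forms produce the same index key.
  def self.normalize(word)
    word.tr(HAMZA_ALEFS, BARE_ALEF).tr(TAA_MARBOOTA, HAA)
  end

  # Hand-written table for the three words in the question, keyed by
  # their normalized spelling. A real stemmer would compute the root.
  ROOTS = {
    "كتاب"  => "كتب", # book
    "مكتبة" => "كتب", # library
    "كاتب"  => "كتب"  # writer
  }.each_with_object({}) { |(word, root), table| table[normalize(word)] = root }.freeze

  # The key stored in (and looked up from) the index: the root if we
  # know one, otherwise the normalized word itself.
  def self.index_key(word)
    normalized = normalize(word)
    ROOTS.fetch(normalized, normalized)
  end
end

library_with_haa = "مكتبة".tr("\u0629", "\u0647") # same word, written with haa
documents = ["كتاب", library_with_haa, "كاتب", "قلم"]
query_key = ArabicSearch.index_key("كاتب")
matches = documents.select { |word| ArabicSearch.index_key(word) == query_key }
puts matches.length
```

The point of the sketch is that both the documents and the query go through the same `index_key` step, which is what a tokenizer or analyzer plugged into the search engine would do.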
You could try this by extending the Thinking Sphinx options.
Read this: http://www.expressionlab.com/2008/11/19/thinking-sphinx-in-arabic-unicode
For Solr use Sunspot and Sunspot Rails.
For Sphinx use Thinking Sphinx
Both gems are excellent and have a large install base. I have used Thinking Sphinx in a few projects and I highly recommend it.
Related
I'm building my first rails app and I'm trying to build the search page on an ecommerce type site. The idea is the model pulls all the data from the database according to the filter that is checked or selected on the view such as (category, sub-category, price, date, etc.)
I've watched Railscasts on Elasticsearch, Solr, etc. They seem like they'd each work in this scenario, but are they overkill? I'm just not sure what the quickest and most scalable way to set up this search is. I've read a little about the has_scope gem, which seems like it would be one way, but I can't find a good tutorial or documentation on has_scope. Can someone point me in the right direction for creating this search page? Should I build it out with has_scope, Solr, or Elasticsearch?
Thanks
In my own experience solr is the best search technology I've come across. It provides a feature called faceting which is what you are describing. You can read about it on their wiki here: http://wiki.apache.org/solr/SolrFacetingOverview
The best solr gem I've come across is sunspot. It has a very easy to use DSL for interfacing with solr from a Rails app and hooks in to active record very easily. Take a look at their github project page. I think that will answer your question.
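For anyone unsure what faceting means in practice, here is a rough plain-Ruby sketch of the idea: filter the items by the selected values, and count how many items fall under each value of a field so the view can show something like "books (2)". The `Product` struct, the `CATALOG` data, and both helper methods are made up for this illustration. Solr computes these counts inside the search engine, and this is not the Sunspot DSL.

```ruby
# Sketch only: Product and CATALOG are hypothetical stand-ins for
# indexed records; Solr does this work inside the search engine.
Product = Struct.new(:name, :category, :price)

CATALOG = [
  Product.new("Ruby book", "books", 30),
  Product.new("Rails book", "books", 45),
  Product.new("Headphones", "electronics", 80)
].freeze

# Keep only the items matching every selected filter value.
def apply_filters(items, filters)
  items.select do |item|
    filters.all? { |field, value| item.public_send(field) == value }
  end
end

# Count how many items fall under each value of a field (the "facet").
def facet_counts(items, field)
  items.each_with_object(Hash.new(0)) do |item, counts|
    counts[item.public_send(field)] += 1
  end
end

category_facet = facet_counts(CATALOG, :category)
books = apply_filters(CATALOG, category: "books")
puts category_facet.inspect
puts books.map(&:name).inspect
```

For a small catalogue, plain database scopes can do the same job; a search engine starts to pay off once you also need text relevance and fast counts over many records.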
I am wondering on how to implement a search functionality like Github.
Just one search box on the top header right and when searched for a keyword, displays the results for Repository, Code and User.
Is there any tutorial or example to implement this on Rails 3?
Odds are really good they're doing separate searches across the tables for the same value, then combining the results afterwards.
Use Rails to create a small form containing a text field. When it's submitted take the value of the field and do a query using that as the search term.
If you're not sure how to do queries using ActiveRecord, see "Active Record Query Interface" for a nice overview.
You will have to do several queries, one per model, and put the results together on the same view.
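Here is a small sketch of that "one query per model, then combine" idea in plain Ruby. The structs and arrays are hypothetical stand-ins for ActiveRecord models, and `search_everything` is a name invented for this example. In a Rails app each branch would be a real query against its own table, but the shape of the result (one group per kind of record) is the part that matters for the view.

```ruby
# Sketch only: these structs and arrays stand in for ActiveRecord models.
Repository = Struct.new(:name)
User       = Struct.new(:login)

SEARCHABLE = {
  repositories: { records: [Repository.new("rails"), Repository.new("sinatra")], field: :name },
  users:        { records: [User.new("railsfan"), User.new("matz")], field: :login }
}.freeze

# Run the same term against every source and keep the results grouped,
# so the view can render one section per kind of record.
def search_everything(term)
  needle = term.downcase
  SEARCHABLE.transform_values do |source|
    source[:records].select do |record|
      record.public_send(source[:field]).downcase.include?(needle)
    end
  end
end

results = search_everything("rails")
results.each { |kind, records| puts "#{kind}: #{records.length}" }
```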
If your question is "how do I do full text searches on several activerecord models in a DRY way" then there are basically two paths:
The common solution, but a bit complex, is using a dedicated daemon on your machine, like Sphinx. Sphinx is a service (like Apache or MySQL) that indexes your content and allows you to do searches. You can use the Thinking Sphinx gem to communicate with it easily from Rails. An alternative to Sphinx is Solr (there's also a gem for it called Sunspot).
If you are using PostgreSQL, there's a simpler alternative that doesn't require external services running on your server. PostgreSQL comes with some full-text search capabilities built in. There's a gem called texticle that helps you use these features from Rails. You can have that working very quickly.
I've been working on a new project lately where a fantastic search engine is crucial. It's a Rails 3 app hosted on Heroku and I'm looking into possible solutions (a rubygem would be ideal) which offer an easy way to have powerful full-text search.
Right now, I'm using acts_as_tsearch, which leverages PostgreSQL and performs a basic MATCH query. However, it's not really returning good results (for example, if I search for "create a project" and "how do i create a project" exists as a query, it doesn't find it).
Can anyone share their experiences with full-text search? Has anyone tried out Solr?
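To make the expected behaviour concrete, here is a toy inverted index in plain Ruby where a query matches any document that contains all of its words, wherever they appear. The `TinyIndex` class is invented for this example and is not how acts_as_tsearch, PostgreSQL, or Solr work internally. It is only a sketch of the token-based matching I would expect from a full-text engine.

```ruby
require "set"

# Sketch only: a toy inverted index, not a real search engine.
class TinyIndex
  def initialize
    # token => set of ids of the documents containing that token
    @postings = Hash.new { |hash, token| hash[token] = Set.new }
    @documents = {}
  end

  def add(id, text)
    @documents[id] = text
    tokenize(text).each { |token| @postings[token] << id }
  end

  # Return every document that contains all of the query's tokens.
  def search(query)
    tokens = tokenize(query)
    return [] if tokens.empty?

    ids = tokens.map { |token| @postings.fetch(token, Set.new) }.reduce(:&)
    ids.map { |id| @documents[id] }
  end

  private

  # Lowercase the text and split it into alphanumeric words.
  def tokenize(text)
    text.downcase.scan(/[[:alnum:]]+/)
  end
end

index = TinyIndex.new
index.add(1, "How do I create a project")
index.add(2, "Delete a project")
found = index.search("create a project")
puts found.inspect
```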
IndexTank is your best bet. They were recently added as a Heroku add-on.
We recently tried to just run our own search for our Heroku app and it's just not worth it because you have to worry about stability and scaling of that search box. It's better to go with a provider, like IndexTank.
IndexTank also powers Reddit and WordPress.com, so you can bet it'll be reliable.
Solr works very nicely. It's a bit pricey to get started ($20 a month), but it just works, and works well.
They recently added the ability to ask the user "Did you mean to search for [correct spelling]".
You can easily cross-model search (search for Users and Cars and Dealerships).
Heroku offers addons which you can easily add to your application. You should take a look at Solr and IndexTank.
There's a free solution in the Texticle gem. It uses PostgreSQL's (> 8.3) full text index support and creates a search method on your models. If you create indexes, the speed is very good (for a free solution).
Hope that helps!
I need to add full web search to my site. I need something like Google Custom Search but with no ads and it has to be free. Any recommendation of a web service or open source project that can index my site and allow me to search it will be helpful.
My site is made in ruby on rails, if that helps.
I'll make this question community-wiki so you can edit my bad English. I think many people can benefit from this question.
Check out Lucene. It's an open source search engine that will certainly be a fun learning experience to implement on your own site. It was originally written by Doug Cutting, who had previously worked at Excite, I do believe.
Ferret is the Ruby port of Lucene. Check out the acts_as_ferret plugin.
It depends what you mean by full web search, really. If you want to search the whole web, then the answers above won't help you much, as they are really for indexing and searching the content of your own site. I would suggest using the Google AJAX Search API (just a "powered by Google" notice needed, no ads) or BOSS from Yahoo (it might require ads, I'm not sure).
http://code.google.com/apis/ajaxsearch/
http://developer.yahoo.com/search/boss/
People are moving to acts_as_solr and Thinking Sphinx in the blogs I read:
http://acts-as-solr.rubyforge.org/
http://ts.freelancing-gods.com/
I've also been looking at tsearch in Postgres; it looks very capable:
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/
What do you mean by "full web search"?
There are good answers available for full-text search where a search engine indexes and queries the model objects stored in your database.
If you mean something that indexes and queries your rendered HTML, Nutch is a popular option with a web-crawler, parser, indexer, and query interface.
I recommend acts_as_xapian. It's very easy to implement, it's fast enough, and it's got the features you'll normally need.
I would like to do full-text searching of data in my Ruby on Rails application. What options exist?
There are several options available and each has different strengths and weaknesses. If you would like to add full-text searching, it would be prudent to investigate each a little bit and try them out to see how well each works for you in your environment.
MySQL has built-in support for full-text searching. It has online support meaning that when new records are added to the database, they are automatically indexed and will be available in the search results. The documentation has more details.
acts_as_tsearch offers a wrapper for similar built-in functionality for recent versions of PostgreSQL
For other databases you will have to use other software.
Lucene is a popular search provider written in Java. You can use Lucene through its search server Solr with Rails using acts_as_solr.
If you don't want to use Java, there is a port of Lucene to Ruby called Ferret. Support for Rails is added using the acts_as_ferret plugin.
Xapian is another good option and is supported in Rails using the acts_as_xapian plugin.
Finally, my preferred choice is Sphinx using the Ultrasphinx plugin. It is extremely fast and has many options on how to index and search your databases, but is no longer being actively maintained.
Another plugin for Sphinx is Thinking Sphinx which has a lot of positive feedback. It is a little easier to get started using Thinking Sphinx than Ultrasphinx. I would suggest investigating both plugins to determine which fits better with your project.
I can recommend Sphinx. Ryan Bates has a great screencast on using the Thinking Sphinx plugin to create a full-text search solution.
You can use Ferret (which is Lucene written in Ruby). It integrates seamlessly with Rails using the acts_as_ferret mixin. Take a look at "How to Integrate Ferret With Rails". An alternative is Sphinx.
Two main options, depending on what you're after.
1) Full Text Indexing and MATCH() AGAINST().
If you're just looking to do a fast search against a few text columns in your table, you can simply use a full text index of those columns and use MATCH() AGAINST() in your queries.
Create the full text index in a migration file:
add_index :table, :column, type: :fulltext
Query using that index:
where( "MATCH( column ) AGAINST( ? )", term )
2) ElasticSearch and Searchkick
If you're looking for a full blown search indexing solution that allows you to search for any column in any of your records while still being lightning quick, take a look at ElasticSearch and Searchkick.
ElasticSearch is the indexing and search engine.
Searchkick is the integration library with Rails that makes it very easy to index your records and search them.
Searchkick's README does a fantastic job at explaining how to get up and running and to fine tune your setup, but here is a little snippet:
Install and start ElasticSearch.
brew install elasticsearch
brew services start elasticsearch
Add searchkick gem to your bundle:
bundle add searchkick --strict
The --strict option just tells Bundler to use an exact version in your Gemfile, which I highly recommend.
Add searchkick to a model you want to index:
class MyModel < ApplicationRecord
  searchkick
end
Index your records.
MyModel.reindex
Search your index.
matching_records = MyModel.search( "term" )
I've been compiling a list of the various Ruby on Rails search options in this other question. I'm not sure how, or whether, to combine our questions.
It depends on what database you are using. I would recommend using Solr as it offers up a lot of nice options. The downside is you have to run a separate process for it. I have used Ferret as well, but found it to be less stable in terms of multi-threaded access to the index. I haven't tried Sphinx because it only works with MySQL and Postgres.
Just a note for future reference: Ultrasphinx is no longer being maintained. Thinking Sphinx is its replacement. Although it currently lacks several features that Ultrasphinx had, such as excerpting, it makes up for it with other features.
I would recommend acts_as_ferret, as I am using it for the Scrumpad project at work. The indexing can be done as a separate process, which ensures that we can still use our application while re-indexing. This can reduce the downtime of the website. Searching is also much faster. You can search through multiple models at a time and have your results sorted by the fields you prefer.