I am using thinking_sphinx in Rails. As far as I know, Sphinx is used for full text search. Let's say a search involves these parameters:
keyword
country
sort order
I use Sphinx for all the searches above. However, when I am querying without a keyword - with just country and sort order - is it better to use a normal MySQL query instead of Sphinx?
In other words, should Sphinx be used only when keyword is searched?
I'm asking with overall performance and speed in mind.
Not to sound snarky, but does performance really matter?
If you're building an application which will only be used by a handful of users within an organization, then you can probably dismiss the performance benefits of using one method over the other and focus instead on simplicity in your code.
On the other hand, if your application is accessed by a large number of users on the interwebz and you really need to focus on being performant, then you should follow barryhunter's advice and benchmark to determine the best approach in a given circumstance.
P.S. Don't optimize before you need to. Fight with all your heart to keep code out of your code.
Benchmark! Benchmark! Benchmark!
I.e. test it yourself. The exact performance will vary depending on the exact data, and perhaps even the relative performance of your Sphinx and MySQL servers.
Sphinx will offer killer speeds over MySQL when searching by a text string, and MySQL will probably be faster when searching by a numerical key.
So, assuming that both "country" and "sort order" can be handled with numerical indexes in MySQL, it is better to use Sphinx only when a "keyword" is present, and a normal MySQL query for the other two.
However, benchmarks won't hurt, as barryhunter suggested ;)
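For illustration, here's a minimal sketch of that split, assuming a hypothetical Movie model with country_id and created_at as Sphinx attributes and indexed MySQL columns (all names here are assumptions, not from the question):

    # Hypothetical Movie model; country_id and created_at are assumed to be
    # Sphinx attributes as well as indexed MySQL columns.

    # Keyword present: let Sphinx do the full text matching, filtering and sorting.
    Movie.search params[:keyword],
                 :with  => { :country_id => params[:country_id] },
                 :order => 'created_at DESC'

    # No keyword: a plain ActiveRecord query against indexed MySQL columns.
    Movie.where(:country_id => params[:country_id]).order('created_at DESC')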
I would like to modify the way Cypher processes queries sent to it for pattern matching. I have read about execution plans and how Cypher chooses the best plan with the least number of operations, and that is pretty good. However, I am looking into implementing a similarity search feature that allows you to specify a query graph that would be matched even when the match is not exact, just close (similar). I have seen a few examples of this in theory, and I would like to implement something of this sort for Neo4j, which I am guessing would require a change in how the query engine deals with queries sent to it. Or worse :)
Here are some links that demonstrate the idea
http://www.cs.cmu.edu/~dchau/graphite/graphite.pdf
http://www.cidrdb.org/cidr2013/Papers/CIDR13_Paper72.pdf
I am looking for ideas. Anything at all in relation to the topic would be helpful. Thanks in advance
(:I)<-[:NEEDING_HELP_FROM]-(:YOU)
From my point of view, the better option for you is to create an unmanaged extension, because that way you can build your own custom functionality into the Neo4j server.
You are not able to extend the Cypher language itself without your own fork of the source code.
I'm assuming the best practice for Neo4j (using v2.2.3 and higher) is to use MATCH with the new indexing and avoid using START and legacy indexes--which I don't think you can even create with Cypher.
Is there any current functionality that cannot be adequately achieved using 100% new methods, avoiding legacy indexes entirely?
Well, in the documentation, full-text and Lucene indexes are considered legacy indexes.
So I think the answer to your question is, "Yes, quite a lot".
For example, anything you'd do via the Lucene syntax would probably fall into that category: fuzzy search, edit distance search, querying for numeric ranges, etc.
I don't think the new schema indexes and methods that have come along since 2.0 address those features in any way.
EDIT: See also a related answer I wrote about how Lucene indexes work.
I am using Thinking Sphinx version 2.0.10 in Rails for full text search, and I am dealing with millions of records in the database. It takes a huge amount of time to return results, so is there any way to keep the indexes on a swap device so that it works faster?
Thank you for the help.
Thinking Sphinx configures Sphinx to store attributes in memory - but as far as I know there's no such setting that applies to field data. Sphinx index files can be stored on any disk you like though, instead of just RAILS_ROOT/db/sphinx/RAILS_ENV - this is configured using the searchd_file_path setting in config/sphinx.yml.
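For example, a minimal sketch of what that setting might look like (the path itself is just an assumption):

    # config/sphinx.yml -- searchd_file_path relocates the index files from
    # the default RAILS_ROOT/db/sphinx/RAILS_ENV; the path is an example.
    production:
      searchd_file_path: /mnt/fast_disk/sphinx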
Perhaps you could elaborate on how you're using Sphinx and Thinking Sphinx - what kinds of queries you're running that are slow, and what the relevant index structures look like. There may be other ways of improving the speed of this.
I'm using ElasticSearch to implement search on a web app (Rails + Tire). When querying the ES server, is there a way to know which field of the returned JSON matched the query?
The simplest way is to use the highlight feature, see support in Tire: https://github.com/karmi/tire/blob/master/test/integration/highlight_test.rb.
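As a rough sketch of how that looks with Tire (the index name and the title field are assumptions):

    require 'tire'

    # Assumed 'articles' index with a 'title' field; highlighted fragments
    # reveal which field matched the query.
    search = Tire.search('articles') do
      query { string 'elasticsearch' }
      highlight :title
    end

    search.results.each do |document|
      # document.highlight.title holds the matching fragments, when present
      puts document.highlight.title if document.highlight
    end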
Have you tried using the Explain API from Elasticsearch? The output of explain gives you a detailed explanation of why a document was matched, along with its relevance score.
The algorithm(s) used for searching the records are often much more complex than a single string match. Also, given that a term can match multiple fields (possibly with different weights), it may not be easy to come up with a simple answer. But by looking at the output of the Explain API, you should be able to construct a meaningful message.
One caveat: do not use the Explain API for anything other than debugging purposes, as it will negatively affect performance.
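For debugging, here's a rough sketch of hitting the _explain endpoint directly (the index, type, document id, and field are all assumptions):

    require 'net/http'
    require 'json'
    require 'uri'

    # The _explain endpoint breaks down why (and how strongly) a single
    # document matched a query; 'articles', 'article', '1' and 'title'
    # are placeholder names for illustration.
    uri = URI('http://localhost:9200/articles/article/1/_explain')
    request = Net::HTTP::Post.new(uri.request_uri, 'Content-Type' => 'application/json')
    request.body = { :query => { :match => { :title => 'elasticsearch' } } }.to_json

    response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
    puts JSON.pretty_generate(JSON.parse(response.body))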
I have a database in PostgreSQL (psql) that contains about 16,000 records; they are the titles of movies. I am trying to figure out the best way to go about searching them (currently they are searched via the web on a Heroku-hosted Ruby on Rails website). However, some queries, such as searching for the word 'a', can take up to 20 seconds. I was thinking of using Sphinx; however, such packages are advertised for full text searching, so I am wondering if that is appropriate for my problem. Any advice would be appreciated.
16,000 records is too small, in both number and size (as you said, titles), to call for a dedicated search engine. Try out your database's normal full text search, and set up indexes to make it faster.
However, this does not stop you from trying out a search engine like Sphinx or Solr. Both are open source, and Sphinx is pretty easy to set up too. But again, to reiterate: there is no need for this, as the data size is too small and falls within the domain of database full text search.
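For example, a minimal sketch of PostgreSQL full text search in Rails (the Movie model, table, and column names are assumptions):

    # Hypothetical migration: a GIN index over to_tsvector('english', title)
    # lets full text matches avoid scanning the whole table.
    class AddTitleFtsIndexToMovies < ActiveRecord::Migration
      def up
        execute "CREATE INDEX index_movies_on_title_fts " \
                "ON movies USING gin (to_tsvector('english', title))"
      end

      def down
        execute "DROP INDEX index_movies_on_title_fts"
      end
    end

    # The query must use the same expression for the planner to use the index:
    Movie.where("to_tsvector('english', title) @@ plainto_tsquery('english', ?)",
                params[:q])

As a side benefit, with the 'english' configuration a stop word like 'a' is dropped from the query entirely, which also avoids the pathological 20-second case you describe.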
If your database is Heroku Postgres, then Sphinx is not possible, as Heroku Postgres does not currently support working with Sphinx. The remaining choice, then, is Solr, which is also good for full text search and takes only a few simple steps to implement.