If I search for:
project=cp and summary ~ "Bridge"
I get 52 results. However, if I search for:
issueFunction in issueFieldMatch("project=CP", "summary", "Bridge")
I get 0 results.
I don't have admin access, so I can't look at the logs to try to debug this. Any ideas?
Check and compare the issues found by the first query. Do they really contain the word "Bridge"?
There are two things that might affect your search of the latter query:
stemming -- summary ~ "Bridge" also finds issues containing related forms such as "bridging", while the latter query does not
case sensitivity -- note that the latter query is case sensitive and won't find summaries containing "bridge"
To search summaries case-insensitively, use this instead:
issueFunction in issueFieldMatch("project=CP", "summary", "(?i)Bridge")
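Both points can be checked outside Jira. This is a minimal sketch, assuming issueFieldMatch behaves like an ordinary regular-expression search over the summary text. The sample summaries are made up for illustration, and Python's re module only stands in for whatever regex engine the function actually uses:

```python
import re

# Hypothetical summaries, for illustration only.
summaries = ["Bridge repair plan", "bridge inspection", "Bridging loan request"]

def matching(pattern, texts):
    """Return the texts in which the regex pattern is found anywhere."""
    return [t for t in texts if re.search(pattern, t)]

# Plain pattern: case sensitive, so only the capitalised "Bridge" matches.
case_sensitive = matching("Bridge", summaries)

# Inline (?i) flag: case insensitive, so "bridge" matches as well.
case_insensitive = matching("(?i)Bridge", summaries)
```

Neither pattern matches "Bridging loan request", which is the kind of issue a stemmed text search might still return.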
I would like to exclude some terms a user may enter into the search bar, as they bloat the results.
For separate reasons I need the operator: :or, but if I have a search term like "The Beatles" it searches the whole database for "The" and "Beatles", which returns far too many results. I would like to exclude "the" from any query received, so it would be as if the user had only searched for "Beatles".
Maybe this isn't possible. Thanks for the help 🙏
I tried adding exclude: ["the"], but this removed any results that contained the term "the".
Have you tried this?
"The Beatles".gsub('The ', '')
Elasticsearch has a great solution for this kind of problem: analyzers.
In your case you need to use the Stop analyzer in your mapping. Details and docs can be found here: Stop Analyzer
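The two suggestions above can be sketched in Python. This is only an illustration under assumptions: strip_stop_words is a hypothetical client-side helper (a case-insensitive version of the gsub idea), and the analyzer name without_the is made up. The settings dictionary uses Elasticsearch's built-in "stop" analyzer type with its stopwords parameter; how it gets attached to your field depends on your mapping.

```python
def strip_stop_words(query, stop_words=("the",)):
    """Drop stop words from a user query, ignoring case."""
    stops = {w.lower() for w in stop_words}
    kept = [tok for tok in query.split() if tok.lower() not in stops]
    # Fall back to the original query if every token was a stop word.
    return " ".join(kept) if kept else query

# Index settings declaring a stop analyzer that removes "the".
# The analyzer name "without_the" is hypothetical.
index_settings = {
    "settings": {
        "analysis": {
            "analyzer": {
                "without_the": {"type": "stop", "stopwords": ["the"]}
            }
        }
    }
}

cleaned = strip_stop_words("The Beatles")
```

The client-side helper only changes the query, while the analyzer removes the term during analysis of the field it is applied to, so "the" never becomes a search term.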
I am working on a model using wikipedia topics' names for my experiments in full-text index.
I set up an index on 'topic' (legacy) and ran a full-text search for 'united states':
start n=node:topic('name:(united states)') return n
The first results are not relevant at all:
'List of United States National Historic Landmarks in United States commonwealths and territories, associated states, and foreign states'
[...]
and the actual 'united states' is buried deep down the list.
This raises a problem: in order to find the best match among the results (e.g. with Levenshtein distance, bi-grams, and similar algorithms), you first have to fetch all the items matching the pattern.
That would be a serious constraint, because in this case alone I get 21K rows in ~4 seconds.
Which algorithm does Neo4j use to order the results of a full-text search (START)?
What rationale does it use to sort the results, and how can I change it using Cypher?
The docs say to use the Java API to apply sort(). It would be very useful to have a tutorial pointing to which files to modify, and also to know which ranking rationale is used before making any tweak.
EDITED based on comments below - pagination of results is possible as:
start n=node:topic('name:(united states)') return n skip 10 limit 50;
(skip before limit) but I need to ensure the first results are meaningful before paginating.
I don't know which algorithm Lucene uses to order the results.
However, regarding pagination, if you change the order of limit and skip as follows, it should be OK.
start n=node:topic('name:(united states)') return n skip 10 limit 50 ;
I would also add that if you are performing full-text search, a solution like Solr may be more appropriate.
For just a Lucene index lookup with scoring you might be better off with this:
http://neo4j.com/docs/stable/rest-api-indexes.html#rest-api-find-node-by-query
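If the index order cannot be changed, one workaround is to re-rank a fetched page of results on the client. This is a rough sketch only: it uses difflib's similarity ratio from the Python standard library as a stand-in for Levenshtein or bi-gram scoring, it is not what Neo4j or Lucene does internally, and it only reorders rows that were already fetched. The candidate names are sample data.

```python
from difflib import SequenceMatcher

def rerank(query, names):
    """Sort candidate names so those most similar to the query come first."""
    q = query.lower()

    def similarity(name):
        # Ratio in [0, 1]; 1.0 means the strings are identical.
        return SequenceMatcher(None, q, name.lower()).ratio()

    return sorted(names, key=similarity, reverse=True)

# Sample candidates, for illustration only.
candidates = [
    "List of United States National Historic Landmarks in United States "
    "commonwealths and territories, associated states, and foreign states",
    "United States",
    "United States Navy",
]
ranked = rerank("united states", candidates)
```

With this scoring the exact topic name comes first and the long list title drops to the end.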
I am trying to link two types of documents in my Solr index. The parent is named "house" and the child is named "available". So, I want to return a list of houses that have available documents with some filtering. However, the following query gives me around 18 documents, which is wrong. It should return 0 documents.
q=*:*
&fq={!join from=house_id_fk to=house_id}doctype:available AND discount:[1 TO *] AND start_date:[NOW/DAY TO NOW/DAY%2B21DAYS]
&fq={!join from=house_id_fk to=house_id}doctype:available AND sd_year:2014 AND sd_month:11
To debug it, I first tried to check whether there are any available documents matching the given filter queries. So, I tried the following query:
q=*:*
&fq=doctype:available AND discount:[1 TO *] AND start_date:[NOW/DAY TO NOW/DAY%2B21DAYS]
&fq=doctype:available AND sd_year:2014 AND sd_month:11
The query gives 0 results, which is correct. As you can see, both queries are the same; the only difference is the use of the join query parser. I am a bit confused about why the first query gives results. My understanding is that this should not happen, because the second query shows that there are no available documents that satisfy the given filter queries.
I have figured it out.
The reason is simply how the join works in Solr. Each join filter query is executed separately and resolved to a set of houses before the filters are combined. So a house that has one available document with discount >= 1 and another available document with (sd_year:2014 AND sd_month:11) will be returned, even though my intention was to apply both conditions at the same time.
However, in the second case, both conditions are applied at the same time to find available documents, and houses would then be returned based on the matching available documents. Since no available document satisfies both conditions, there is no matching house, which gives zero results.
It really took some time to figure this out. I hope this will help someone else.
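The difference can be reproduced with plain sets. This is a minimal sketch with made-up available documents, leaving out the start_date condition for brevity. The helper join_houses is hypothetical and only mimics what one join filter does: it maps matching child documents to their house ids.

```python
# Hypothetical "available" documents, for illustration only.
available = [
    {"house_id_fk": 1, "discount": 5, "sd_year": 2015, "sd_month": 1},
    {"house_id_fk": 1, "discount": 0, "sd_year": 2014, "sd_month": 11},
    {"house_id_fk": 2, "discount": 0, "sd_year": 2014, "sd_month": 11},
]

def has_discount(doc):
    return doc["discount"] >= 1

def in_nov_2014(doc):
    return doc["sd_year"] == 2014 and doc["sd_month"] == 11

def join_houses(predicate):
    """Mimic one join filter: ids of houses with at least one matching child."""
    return {doc["house_id_fk"] for doc in available if predicate(doc)}

# Two separate join filters: each is resolved to houses first, then intersected.
separate = join_houses(has_discount) & join_houses(in_nov_2014)

# One join filter with both conditions: a single child must satisfy both.
combined = join_houses(lambda d: has_discount(d) and in_nov_2014(d))
```

Here house 1 passes the separate filters through two different child documents, while the combined filter returns nothing. In Solr terms, that suggests putting all the available conditions into a single join filter query rather than two.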
I'm trying to get partial search working. A search for
"sw"
"swe"
"swed"
should match "Sweden"
I looked around and just can't get it to work
Rails Code
I'm using this code from the Tire repo as template code.
whole words still match!
I have reindexed and also tried using the edgengram filter.
I'm not a Ruby developer but found this article useful:
http://dev.af83.com/2012/01/19/autocomplete-with-tire.html
As it says:
Most databases handle that feature with a filter (with LIKE keyword in SQL, regular expression search with mongoDB). The strategy is simple: iterate on all results and keep only words which match the filter. This is brutal and hurts the hard drive. Elastic Search can do it too, with the prefix query.
With a small index, it plays well. For large indexes it will be more slow and painful.
You said:
whole words still match!
And what do you expect? Is "Sweden" not supposed to match "Sweden", but only "Swe", "Swed" or "Swede"?
Because your query on the field is also analyzed
Use the edgengram token filter. That will get you what you're looking for.
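To illustrate what an edge n-gram filter produces and why the query side matters, here is a small Python sketch. It is not Tire or Elasticsearch code. The min_gram and max_gram defaults are assumed values, and the lookup set stands in for the index.

```python
def edge_ngrams(token, min_gram=2, max_gram=10):
    """Return the prefixes of token with lengths min_gram..max_gram."""
    token = token.lower()
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]

# Index time: store every prefix of the word.
indexed_terms = set(edge_ngrams("Sweden"))

def matches(query):
    """Search time: lowercase the query but do not n-gram it."""
    return query.lower() in indexed_terms
```

With these settings "sw", "swe" and "swed" all match, and so does the whole word "Sweden", since the full token is its own longest prefix. If the query were also split into edge n-grams, a query such as "swedish" would match too through its shared prefixes, which is usually not wanted.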
Is it possible to do something like the following in Lucene? If not, can you give any suggestions for how to get around this limitation?
SELECT
start.dt AS eventstarttime,
last.dt AS eventfinishtime
WHERE
start.evt:"Started" AND last.evt:"Ended" AND start.evtgrpid = last.evtgrpid
Your question does not give enough information to fully answer it. This SQL is not even valid - where is the FROM clause (for a start)?
Suggestion 1: run two queries ("Started" and "Ended") separately and merge the results based on evtgrpid.
Suggestion 2: run one query (e.g. "Started") and filter the results based on "Ended" criteria.
Suggestion 3: do not use Lucene for what databases are built for. Really. Often database logic does not even apply to Lucene (e.g. what if stopwords are used when indexing?).
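Suggestion 1 could look roughly like this on the client side. This is a sketch in Python rather than Lucene code: the two hit lists are made-up stand-ins for the results of the "Started" and "Ended" queries, and it assumes at most one "Ended" document per evtgrpid.

```python
def merge_events(started, ended):
    """Pair 'Started' and 'Ended' hits that share the same evtgrpid."""
    end_by_group = {doc["evtgrpid"]: doc["dt"] for doc in ended}
    return [
        {
            "evtgrpid": doc["evtgrpid"],
            "eventstarttime": doc["dt"],
            "eventfinishtime": end_by_group[doc["evtgrpid"]],
        }
        for doc in started
        if doc["evtgrpid"] in end_by_group  # skip groups that never ended
    ]

# Hypothetical query results, for illustration only.
started_hits = [
    {"evtgrpid": "g1", "dt": "2014-01-01T10:00"},
    {"evtgrpid": "g2", "dt": "2014-01-01T11:00"},
]
ended_hits = [{"evtgrpid": "g1", "dt": "2014-01-01T10:30"}]

merged = merge_events(started_hits, ended_hits)
```

Building a dictionary from the "Ended" hits first keeps the merge linear in the number of hits instead of comparing every pair.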