I know that you can do an OData query like this for wildcards:
$filter=search.ismatch('lux*', 'Description')
What I would like to do is this
$filter=search.ismatch('*lux', 'Description')
I tried the query above, but it returned nothing, even though I know there are matches for '*lux'.
Ideally I would like to have 2 different fields in the query like this
search=&$filter=Hotel eq 'Southern' and search.ismatch('*lux', 'Description')
That syntax does not return anything either.
Ideal result set:
Hotel     Description
Southern  Ultra lux
Southern  Mega lux
Also, I did not know how to tag this because I don't work with it a lot, so I am sorry if it is mistagged.
I found this answer: You cannot use a * or ? symbol as the first character of a search. No text analysis is performed on wildcard search queries. At query time, wildcard query terms are compared against analyzed terms in the search index and expanded.
So it looks like this is impossible only when the wildcard is at the beginning of the search term. If the wildcard is at the end or in the middle of the term, it works fine.
You can do prefix, suffix and infix queries in Azure Cognitive Search if your queryType is set to "full" (invokes the full Lucene parser) and if you're using regular full text search ("search=") instead of a $filter. Prefix, suffix, and infix are variations of wildcard search forms. There are some examples in the docs if you want to take a look.
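As a rough illustration of the approach above, here is a minimal Python sketch that only builds the JSON body for a search request; it does not call the service. The field names Hotel and Description come from the question. The regular-expression form /.*lux/ for a suffix match and the helper name build_suffix_query are assumptions for illustration, so check them against the current docs before relying on them.

```python
import json


def build_suffix_query(suffix, field, hotel):
    """Build a request body for a suffix search combined with an exact filter.

    Assumption: with queryType "full", a suffix match is written as a
    regular expression between forward slashes, e.g. /.*lux/.
    """
    # Single quotes inside OData string literals are escaped by doubling them.
    safe_hotel = hotel.replace("'", "''")
    return {
        "queryType": "full",              # invoke the full Lucene parser
        "search": f"/.*{suffix}/",        # suffix match goes in search=, not $filter
        "searchFields": field,            # restrict matching to one field
        "filter": f"Hotel eq '{safe_hotel}'",  # exact match stays in the filter
    }


body = build_suffix_query("lux", "Description", "Southern")
print(json.dumps(body, indent=2))
```

The point of the split is that the wildcard part moves into the full text search expression while the exact Hotel comparison stays in the filter.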
We are using Neo4j version 4.1.1, and we have a graph that represents a structure of objects.
We support translation by using translation nodes: the relationship between an object and a translation node holds the object's name and description.
For example:
(n:object)-[r:Translation]-(:ru)
means that relationship r holds the name and description of object n in Russian.
In order to search by name and description, we created a full-text index like this:
CALL db.index.fulltext.createRelationshipIndex("TranslationRelationshipIndex",["Translation"],["Name","Description"], { eventually_consistent: "true" })
We also support searching for items. To do that we query the index, and we have names like "UFO41.SI01V03":
CALL db.index.fulltext.queryRelationships('TranslationRelationshipIndex', '*FO41.SI0*') YIELD relationship, score
But for names like the one shown above (matching [0-9.*]), no results are returned,
while results are returned for a name like "ab.or".
Does anyone know how to make this work? I've tried all 46 analyzers available.
I know we can solve it just by using match()-[r]-() where r.Name contains "<string>",
but we would prefer a more efficient solution that uses an index.
Stay safe, and thanks in advance.
P.S. If needed, I can supply a few lines to recreate it locally; just ask.
The analyzer probably treats words like ab.or differently from ab.or123, considering the first a single token and the second two tokens.
No built-in analyzer will really fit your needs; you would have to create your own.
You can, however, replace the . in your query with a simple AND, e.g.:
CALL db.index.fulltext.queryNodes('Test', replace("*FO41.SI0*", ".", " AND "))
This returns the results you're looking for.
Resources for creating your own analyzer:
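If the query string is built on the client side rather than in Cypher, the same rewrite can be sketched in Python. This is only an illustration: the helper name dots_to_and is hypothetical, and the Cypher text simply reuses the relationship index name from the question with a $query parameter.

```python
def dots_to_and(term):
    """Rewrite 'a.b' as 'a AND b' so each piece is matched as its own token."""
    # Drop empty pieces so a leading or trailing dot does not yield a dangling AND.
    parts = [part for part in term.split(".") if part]
    return " AND ".join(parts)


# Cypher to run with the rewritten string passed as the $query parameter.
CYPHER = (
    "CALL db.index.fulltext.queryRelationships("
    "'TranslationRelationshipIndex', $query) "
    "YIELD relationship, score"
)

query = dots_to_and("*FO41.SI0*")
print(query)
```

Passing the rewritten string as a parameter keeps the Lucene syntax out of the Cypher text itself.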
https://graphaware.com/neo4j/2019/09/06/custom-fulltext-analyzer.html
https://neo4j.com/docs/java-reference/current/extending-neo4j/full-text-analyzer-provider/
When searching Azure Search through the REST API provided by Microsoft, the search does not behave correctly when the search string contains '#'.
Example: I have 3 rows in Azure:
CES
CES#123
CES#1234
When my search string was CES*, all 3 rows were returned.
When my search string was CES#123*, only the one exactly matching record was returned.
When my search string was CES#*, there was no result.
My requirement is that for the search string "CES#*", all 3 records should be part of the result set.
I've tried replacing # with " " (a space) and it works, but my data contains #, so I need to keep it in the search.
I'm using SearchMode:Any.
This behavior is expected.
Query terms of prefix queries are not analyzed. Therefore, in your example with "CES#*" you are searching for term CES# while the # sign was stripped from the terms in the index: CES, 123, 1234.
Here is an excerpt from the How full text search works in Azure Search article:
Exceptions to lexical analysis
Lexical analysis applies only to query types that require complete
terms – either a term query or a phrase query. It doesn’t apply to
query types with incomplete terms – prefix query, wildcard query,
regex query – or to a fuzzy query. Those query types, including the
prefix query with term air-condition* in our example, are added
directly to the query tree, bypassing the analysis stage. The only
transformation performed on query terms of those types is lowercasing.
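The effect can be reproduced with a toy model in Python. This is a deliberately simplified sketch, not how Azure Search is implemented: the analyzer here just lowercases and splits on anything that is not a letter or digit, and the prefix query is lowercased but otherwise left untouched.

```python
import re


def analyze(text):
    """Toy analyzer: lowercase, then split on anything that is not a letter or digit."""
    return [term for term in re.split(r"[^a-z0-9]+", text.lower()) if term]


def build_index(docs):
    """Collect every analyzed term from every document."""
    terms = set()
    for doc in docs:
        terms.update(analyze(doc))
    return terms


def prefix_matches(index_terms, query):
    """Prefix queries skip analysis; only lowercasing is applied to the query."""
    prefix = query.rstrip("*").lower()
    return sorted(term for term in index_terms if term.startswith(prefix))


index_terms = build_index(["CES", "CES#123", "CES#1234"])
print(sorted(index_terms))                   # the '#' never reaches the index
print(prefix_matches(index_terms, "CES*"))   # the prefix 'ces' finds a term
print(prefix_matches(index_terms, "CES#*"))  # the prefix 'ces#' finds nothing
```

In this model the index holds ces, 123 and 1234, so no indexed term can start with ces#.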
Dear programmers and IT experts, I need your help. I've just started to research what Sphinx is. I even made my own "google suggest" that fixes frequent and common human search input mistakes. The problem is that it tries to fix errors all the time and interrupts the real input.
Well, I want the search engine to first try to find a match in the searched field by substring and then, only if no matches are found, use my logic for fixing errors. In short, I want Sphinx to first execute the equivalent of this SQL command:
SELECT * FROM suggest WHERE keyword LIKE('%$keyword%')
then, if nothing is found, continue with fixing mistakes.
The main question is: is it possible to tell Sphinx to search by substring?
Sphinx can mostly do that, but you need to understand how it works. Sphinx indexes individual words and matches by keywords. It uses a large inverted index to make queries fast (rather than running a substring match).
So you can do MATCH('one two') as a query, and it will match a document that contains '... one two ...', but the order doesn't matter and other words can be present, so it will ALSO match '... two three one ...', which wouldn't happen with MySQL LIKE (it's a pure substring match).
You can use the phrase operator to get that: MATCH('"one two"')
Furthermore, Sphinx matches whole words by default. So MATCH('one two') will only match those two words. It won't match a document containing, say, '... one twotwentyone ...', whereas LIKE doesn't restrict to whole words.
So you can use wildcards to allow partial matches: MATCH('"*one two*"'). You also need to enable this on the index with the min_infix_len config!
What's more, Sphinx doesn't index punctuation etc. (with the default charset_table), so a document containing, say, '... One! (two?) ...' WOULD still match MATCH('"one two"'). SQL LIKE would NOT ignore that.
You could change Sphinx to index more punctuation (via charset_table) to get closer to a substring match.
So SELECT * FROM index WHERE MATCH('"*$keyword*"') is possibly the closest Sphinx query to your original (i.e. a substring match), just as long as you are aware of the differences. There are also MySQL collations to consider; they are not exactly the same as charset_table.
(Frankly, while this is correct, I wonder if it's a bit OTT. If you just have a textual corpus you want to search, you could index it as normal, then run queries through CALL KEYWORDS() to get an idea of whether the query is a valid word in the index (i.e. it just tells you how many times the given words appear in the index). You can then run your algorithm to fix the mistakes.)
As a side note, Sphinx does have a built-in suggest system:
http://sphinxsearch.com/blog/2016/10/03/2-3-2-feature-built-in-suggests/
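The two-stage lookup described in the question (substring match first, correction only as a fallback) can be sketched in plain Python. This is an illustration under loud assumptions: an in-memory list stands in for the Sphinx index, the function name suggest is hypothetical, and difflib.get_close_matches is a stand-in for whatever error-fixing logic you already have.

```python
from difflib import get_close_matches


def suggest(keyword, known_keywords):
    """Return substring hits if there are any, otherwise fall back to fuzzy matches."""
    needle = keyword.lower()
    # Stage 1: plain substring match, the LIKE '%keyword%' equivalent.
    hits = [k for k in known_keywords if needle in k.lower()]
    if hits:
        return hits
    # Stage 2: only when nothing matched, try to correct the input.
    return get_close_matches(needle, known_keywords, n=3, cutoff=0.6)


keywords = ["hello", "world"]
print(suggest("ello", keywords))   # found by substring, no correction attempted
print(suggest("wrold", keywords))  # no substring hit, so the fuzzy fallback runs
```

With Sphinx, stage 1 would be the wildcard MATCH query and stage 2 would run only when that query returns no rows.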
Using Rails 3.2, Sphinx Search and the Thinking Sphinx gem. I have tried wrapping the keywords in % and * wildcards in my controller, but can't achieve what I want:
Keywords to search:
world trade
worldtrade
worl trade
trade world
trad worl
Expected matched search results:
World Trade Center
How should I format the keywords in my controller so that I get the expected search result with the different keywords as shown above?
For Sphinx, the wildcard character is * - so there's no point using %.
If you want every query to have wildcards added to it, use :star => true in your Thinking Sphinx search call.
If you want to match on parts of words, you need to enable either min_prefix_len or min_infix_len. The default is 0 (disabled), but given your examples, perhaps 4 is a good setting? The lower it is, the larger your index files get (and so searches may get slower as well), so perhaps don't set it to 1 unless you need to.
All of the above points won't make 'worldtrade' match, because you're providing one long word instead of two separate ones. You could look into adding a wordforms file to your Sphinx configuration, but that may be overkill in this case, as you'd want to cover each specific case.
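To see why four of the five keyword variants can match and 'worldtrade' cannot, here is a toy Python model. It is only a sketch under assumptions: add_stars approximates what :star => true is understood to do (wrap every word in *), and toy_infix_match treats each starred word as a plain substring test against the document's words, ignoring min_infix_len and everything else Sphinx does.

```python
def add_stars(query):
    """Wrap every word in wildcards, e.g. 'world trade' -> '*world* *trade*'."""
    return " ".join(f"*{word}*" for word in query.split())


def toy_infix_match(query, document):
    """Every starred query word must occur inside some single word of the document."""
    doc_words = document.lower().split()
    for term in add_stars(query).split():
        fragment = term.strip("*").lower()
        if not any(fragment in word for word in doc_words):
            return False
    return True


doc = "World Trade Center"
for q in ["world trade", "worldtrade", "worl trade", "trade world", "trad worl"]:
    print(q, "->", toy_infix_match(q, doc))
```

The fragment 'worldtrade' is not contained in any single indexed word, which is why wildcards alone do not help for that case.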
I'm trying to get partial search working; a search for
"sw"
"swe"
"swed"
should match "Sweden"
I looked around and just can't get it to work.
Rails Code
I'm using this code from the Tire repo as template code.
whole words still match!
I have reindexed and also tried using the edgengram filter.
I'm not a Ruby developer, but I found this article useful:
http://dev.af83.com/2012/01/19/autocomplete-with-tire.html
As it says:
Most databases handle that feature with a filter (with LIKE keyword in SQL, regular expression search with mongoDB). The strategy is simple: iterate on all results and keep only words which match the filter. This is brutal and hurts the hard drive. Elastic Search can do it too, with the prefix query.
With a small index, it plays well. For large indexes it will be more slow and painful.
You said:
whole words still match!
And what do you expect? Is "Sweden" not supposed to match "Sweden", but only "Swe", "Swed" or "Swede"?
Because your query on the field is also analyzed.
Use the edgengram token filter. That will get you what you're looking for.
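To make the idea concrete, here is a small Python sketch of what an edge n-gram filter produces for one token. The parameter names min_gram and max_gram mirror the filter's settings; the function itself is an illustration, not Elasticsearch code.

```python
def edge_ngrams(token, min_gram=1, max_gram=10):
    """Return the prefixes of a token between min_gram and max_gram characters long."""
    token = token.lower()
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


indexed_terms = edge_ngrams("Sweden", min_gram=2, max_gram=10)
print(indexed_terms)

# At query time a plain term lookup is enough, because the prefixes are already indexed.
for query in ["sw", "swe", "swed", "sweden"]:
    print(query, "->", query in indexed_terms)
```

The full word is simply the longest gram, which is why whole words still match. If the query is run through the same edge n-gram analyzer, it is also broken into prefixes, so it is common to use a simpler analyzer on the search side.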