Open Search Server v1.4 select query special characters - search-engine

We're using Open Search Server v1.4. When the user enters a search for the text "Refrigerator temperature chart (5" we create a URL something like
http://10.192.16.160:8080/services/rest/select/search/<indexname>/json?login=<login>&key=<apikey>&template=search&query=Refrigerator%20temperature%20chart%20%285&start=0&rows=1000&filter=fileType%3afile&lang=ENGLISH
This fails with ...
HTTP Status 500 - org.apache.cxf.interceptor.Fault:
com.jaeksoft.searchlib.SearchLibException:
com.jaeksoft.searchlib.query.ParseException:
org.apache.lucene.queryParser.ParseException: Cannot parse
'content:(Refrigerator temperature chart (5) OR content:("Refrigerator
temperature chart (5") OR
So adding an escape character %5C before the open bracket fixes this query like so...
http://10.192.16.160:8080/services/rest/select/search/<indexname>/json?login=<login>&key=<apikey>&template=search&query=Refrigerator%20temperature%20chart%20%5C%285&start=0&rows=1000&filter=fileType%3afile&lang=ENGLISH
Can someone point me to some documentation that lists all the special characters that can be used in an Open Search select query that need to be escaped when entered as part of the search string?

Yes, you are right: the characters listed in the "Escaping Special Characters" section of the page you linked also need to be escaped in OpenSearchServer.
We recently released a fix that allows those characters to be escaped in queries of type Search (field), for searched fields configured with a pattern mode.
Previously, escaping was only available in queries of type Search (pattern).
(More information on these two kinds of queries is available here: http://www.opensearchserver.com/documentation/tutorials/functionalities.html#two-kinds-of-queries)
Regards,
Alexandre

I believe Open Search Server is based on Lucene. The query syntax for the Lucene engine is described here...
http://lucene.apache.org/core/2_9_4/queryparsersyntax.html
Lucene supports escaping special characters that are part of the query
syntax. The current list of special characters is
+ - && || ! ( ) { } [ ] ^ " ~ * ? : \
To escape these characters, use \ before the character. For example,
to search for (1+1):2 use the query:
\(1\+1\)\:2
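The escaping rule above could be automated before the URL is built. The following is only a sketch: escapeLuceneTerm and buildQueryParam are hypothetical helper names, and escaping every single & and | (rather than only the two-character && and || operators) is a simplifying assumption.

```javascript
// Characters listed as special on the Lucene 2.9 query parser syntax page.
// Assumption: each "&" and "|" is escaped individually, which is broader
// than escaping only the two-character operators "&&" and "||".
const LUCENE_SPECIAL = /[+\-&|!(){}\[\]^"~*?:\\]/g;

// Hypothetical helper: put a backslash in front of every special character.
function escapeLuceneTerm(text) {
  return text.replace(LUCENE_SPECIAL, "\\$&");
}

// Hypothetical helper: escape for Lucene first, then percent-encode the
// result so it can be used as the value of the "query" URL parameter.
function buildQueryParam(userInput) {
  return encodeURIComponent(escapeLuceneTerm(userInput));
}

console.log(escapeLuceneTerm("(1+1):2"));
// \(1\+1\)\:2
console.log(buildQueryParam("Refrigerator temperature chart (5"));
// Refrigerator%20temperature%20chart%20%5C(5
```

Note that encodeURIComponent leaves ( unencoded, so the result contains %5C( rather than %5C%28; both forms decode to the same escaped query.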

Related

TFS code search find <Button>

I want to find all usages of my React component in code.
I tried <Button>, but the special characters <> are not supported.
I tried "Button" and I get both "Button" and lowercase "button" as results.
So an exact match is not supported either.
Is there a way to find a string exactly, without any additional results?
Unfortunately, searching for symbols (<> and "" in your scenario) is not supported in code search.
In TFS, double quotes are used to find an exact match for a set of words: you enclose your search terms in double quotes. For example, "Client not found".
Is there a way to find a string exactly, without any additional
results?
Yes, but it is a little complex. See my answer in another thread: Is there a way to make TFS code search recognize the "#" symbol?
Checked for some characters in code search. You can't use symbol characters other than * and ? as part of your search query. This includes the following characters: . , : ; / \ ` ' " # = ! # $ & + ^ | ~ < > ( ) { } [ ]. The search will simply ignore these symbols.
But you can use the wildcard characters * and ? to broaden your search.
You can use wildcard characters anywhere in your search string except as a prefix in a simple search string or in a query that uses a code type filter. For example, you cannot use a search query such as *RequestHandler or class:?RequestHandler. However, you can use prefix wildcards with the other search filter functions; for example, the search query strings file:*RequestHandler.cs and repo:?Handlers are valid.
Please see Broaden your search with wildcards for details.
If you want to search for strings that include these symbols exactly (such as '#' here), you can first run a code search with other strings (e.g. testexample.com here) to narrow down the scope, then copy the specific code into a text editor that supports the symbols (e.g. Notepad++), and then search for the strings with the symbol characters there.
Besides, if you are using Git, another workaround is the code search tool Hound, a lightning fast code search tool that supports the symbol characters. See this thread for how to use it:
How can I publish source code (Visual Studio) on a intranet?
Also, there is a User Voice item suggesting this feature; you can vote it up to help get it implemented in the future.

Jira REST API JQL isn't working when using &

I am using the Jira REST API to read data from our Jira system. In most cases this works fine, but I have a problem if & is in the query.
I tried the following query:
http://jira-test.meinServer.de/rest/api/2/search?jql=labels="F&E"
which produces this error message:
"Fehler in der JQL-Abfrage: Die Zeichenfolge mit Anführungszeichen 'F' wurde nicht abgeschlossen. (Zeile 1, Zeichen 8)"
In English:
"Error in the JQL query: The quoted string 'F' has not been completed. (line 1, character 8)"
I found that & is the problem, but I can't find a workaround or any documentation on how to escape it.
Does anyone have a solution for this?
You didn't encode your query string properly. Try: http://jira-test.meinServer.de/rest/api/2/search?jql=labels%3D%22F%26E%22
Reference
The part of a URI between the ? and # characters is called the query string (the part after # is called the fragment). Often the query is a sequence of key–value pairs (laid out as {key}={value}) separated with the & character.
If you analyze the URL you provided and split it by &, you'll see that you are in fact passing two parameters in your query string:
parameter jql with a value of labels="F
parameter E" with no value
The second parameter is ignored because this particular REST endpoint doesn't expect it. As you can now see, you are passing malformed JQL in your URL, because the JQL contains a character (&) that has a special meaning in a query string.
To make this work, you have to percent-encode your JQL properly. This is a common problem, and most platforms provide tools to do it. Here are examples for JavaScript, C# and Java.
Click here to learn more about the query strings.
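As a minimal JavaScript sketch of the encoding step described above: buildJiraSearchUrl is a hypothetical helper name, and the host is the one from the question. The second part shows how a standard query-string parser splits the unencoded value at the &.

```javascript
// Hypothetical helper: append percent-encoded JQL to the search endpoint.
function buildJiraSearchUrl(baseUrl, jql) {
  return baseUrl + "/rest/api/2/search?jql=" + encodeURIComponent(jql);
}

const url = buildJiraSearchUrl("http://jira-test.meinServer.de", 'labels="F&E"');
console.log(url);
// http://jira-test.meinServer.de/rest/api/2/search?jql=labels%3D%22F%26E%22

// Without encoding, the raw "&" ends the jql parameter early.
const broken = new URLSearchParams('jql=labels="F&E"');
console.log(broken.get("jql"));
// labels="F
console.log(Array.from(broken.keys()));
// [ 'jql', 'E"' ]
```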

Neo4j - search like query with non english characters

Is there an option in Neo4j to write a select query with a WHERE clause that ignores diacritics on non-English characters?
MATCH (places:Place)
WHERE (places.name =~ '.*(?ui)Fabergé.*')
RETURN places
I have a place named Fabergé in the graph, and I want to find it when the user types either Fabergé or Faberge (without the accented character).
I'm not aware of an easy way to do this directly with a regex match in Cypher.
One possible workaround is to store the string in question in normalized form in a second property, e.g. place.name_normalized, and then compare it with the normalized search string. Of course, the normalization needs to be done on the client side; see another SO question on how to achieve this: Remove diacritical marks (ń ǹ ň ñ ṅ ņ ṇ ṋ ṉ ̈ ɲ ƞ ᶇ ɳ ȵ) from Unicode chars
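One way the client-side normalization could look, as a sketch only: stripDiacritics and normalizeForSearch are hypothetical helper names, and the Cypher string assumes the name_normalized property suggested above, a $needle parameter, and a Neo4j version that supports the CONTAINS operator.

```javascript
// Hypothetical helper: decompose characters (NFD) and drop the combining
// marks. Letters without a decomposition (e.g. "ł" or "ø") are left as-is.
function stripDiacritics(text) {
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// Hypothetical helper: apply the same normalization to the stored property
// and to the user's search string so that both sides compare equal.
function normalizeForSearch(text) {
  return stripDiacritics(text).toLowerCase();
}

// Assumed query shape; the property and parameter names are illustrative.
const CYPHER =
  "MATCH (places:Place) " +
  "WHERE places.name_normalized CONTAINS $needle " +
  "RETURN places";

console.log(normalizeForSearch("Fabergé")); // faberge
console.log(normalizeForSearch("Faberge")); // faberge
```

The value written to name_normalized when the node is created has to go through the same function as the search string passed as $needle.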

Apache Solr does not work with special characters in search?

I am building a search query to filter my data, but it does not filter as I expect.
My query string is: Reinvestment Act of 2009 - RD&D
It does not return any results.
After changing the string to: Reinvestment Act of 2009 / RD&D
it works fine.
Is there a limitation in Solr search? If so, which special characters are not allowed,
and what is the alternative for searching with special characters in Solr?
Solr query parsers treat certain characters specially. For example, - means exclude the next term in the Solr Query Parser syntax. You can escape these special characters with a backslash, or enclose them in quotes.
You can find more information in the Solr query documentation.
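The "enclose them in quotes" option could be sketched as follows. quoteSolrPhrase is a hypothetical helper, and it assumes that inside a quoted phrase only the double quote and the backslash still need a backslash escape. The quoted phrase is then percent-encoded, because a raw & in the URL would otherwise end the q parameter early.

```javascript
// Hypothetical helper: wrap the input in double quotes so the parser reads
// it as one phrase. Assumption: only '"' and '\' need escaping inside it.
function quoteSolrPhrase(text) {
  return '"' + text.replace(/[\\"]/g, "\\$&") + '"';
}

const phrase = quoteSolrPhrase("Reinvestment Act of 2009 - RD&D");
console.log(phrase);
// "Reinvestment Act of 2009 - RD&D"

// Percent-encode the phrase before placing it in the request URL.
const qParam = "q=" + encodeURIComponent(phrase);
console.log(qParam);
// q=%22Reinvestment%20Act%20of%202009%20-%20RD%26D%22
```

Whether a phrase query returns the expected documents also depends on how the field is analyzed at index time, which this sketch does not address.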

Detecting regional settings (List Separator) from web

After having the unpleasant surprise that Comma Separated Value (CSV) files are not necessarily comma-separated, I'm trying to find out whether there is any way to detect the regional settings list separator value on the client machine from an HTTP request.
The scenario is as follows: a user can download some data in CSV format from a web site (RoR, if it matters). That CSV file is generated on the fly, sent to the user, and most of the time double-clicked and opened in MS Excel on a Windows machine at the destination. Now, if the user has ',' set as the list separator, the data is properly arranged in columns, but if any other separator (';' is widely used here) is set, it all just gets thrown into a single column. So, is there any way to detect what separator is used on the client machine, and generate the file accordingly?
I have a sinking feeling that it is not, but I'd like to be sure before I pass the 'can't be done, sorry' line to the customer :)
Here's a JavaScript solution that I just wrote based on the method shown here:
function getListSeparator() {
    var list = ['a', 'b'], str;
    if (list.toLocaleString) {
        str = list.toLocaleString();
        // Return ';' only when the joined string has a ';' and no ','.
        if (str.indexOf(';') > 0 && str.indexOf(',') == -1) {
            return ';';
        }
    }
    return ',';
}
The key is in the toLocaleString() method that uses the system list separator.
You could use JavaScript to get the list separator and set it in a cookie which you could then detect from your server.
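A rough sketch of the cookie round trip: the cookie name listSeparator and both helper names are assumptions, and the reading side is shown in JavaScript only for illustration, since the actual server here is Rails.

```javascript
// Hypothetical helper: build the cookie string for the detected separator.
// In a browser it would be used as:
//   document.cookie = buildSeparatorCookie(getListSeparator());
function buildSeparatorCookie(separator) {
  return "listSeparator=" + encodeURIComponent(separator) + "; path=/";
}

// Hypothetical helper: read the separator back from a Cookie request
// header. Anything other than ';' falls back to ','.
function readSeparatorFromCookieHeader(header) {
  const match = /(?:^|;\s*)listSeparator=([^;]*)/.exec(header || "");
  if (!match) {
    return ",";
  }
  try {
    return decodeURIComponent(match[1]) === ";" ? ";" : ",";
  } catch (err) {
    return ","; // malformed percent-encoding in the cookie value
  }
}

console.log(buildSeparatorCookie(";"));
// listSeparator=%3B; path=/
console.log(readSeparatorFromCookieHeader("session=abc; listSeparator=%3B"));
// ;
```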
I checked all the Windows Locales, and it seems that the default list separator is virtually always either ',' or ';'. For some locales the drop-down list in the Control Panel offers both options; for others it offers just ','. One locale, Divehi, has a strange character that I've not seen before as the list separator, and, for any locale, it is possible for the user to enter any string they want as the list separator.
Putting random strings as the separator in a CSV file sounds like trouble to me, so my function above will only return either a ';' or a ',', and it will only return a ';' if it can't find a ',' in the Array.toLocaleString string. I'm not entirely sure whether Array.toLocaleString has a format that's guaranteed across browsers, hence the indexOf checks rather than picking out a character at a specific index.
Using Array.toLocaleString to get the list separator works on IE6, IE7, and IE8, but unfortunately it doesn't seem to work on Firefox, Safari, Opera, and Chrome (or at least the versions of those browsers on my computer): they all seem to separate array items with a comma, irrespective of the Windows "list separator" setting.
Also worth noting that by default Excel seems to use the system "decimal separator" when it's parsing numbers out of CSV files. Yuk. So, if you're localizing the list separator you might want to localize the decimal separator too.
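For the decimal separator, one option in current browsers is the Intl API. This is only a sketch: getDecimalSeparator is a hypothetical helper, and it reports the convention of the browser's locale, which is not necessarily the same as the Windows regional setting that Excel reads.

```javascript
// Hypothetical helper: ask Intl.NumberFormat how it formats 1.5 and pick
// out the part that is tagged as the decimal separator.
function getDecimalSeparator(locale) {
  const parts = new Intl.NumberFormat(locale).formatToParts(1.5);
  const decimal = parts.find(function (part) {
    return part.type === "decimal";
  });
  return decimal ? decimal.value : ".";
}

console.log(getDecimalSeparator("en-US")); // .
// With no argument, the runtime's default locale is used.
console.log(getDecimalSeparator());
```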
I think everyone should use Calc from OpenOffice: when you open a file, it asks about the encoding, column separators and so on. I don't know the answer to your question, but maybe you can try sending the data as HTML tables or as XML; Excel should read both of them correctly. In my experience it isn't easy to export data to Excel. A few weeks ago I had a problem with it, and after a few hours of work I asked the person who couldn't open my CSV file in Excel which version they were using. It was Excel 98...
Take a look at the HTML example and the XML example.
A simpler version of the getListSeparator function, which allows any character to be the separator:
function getListSeparator_bis() {
    var list = ['a', 'b'];
    // The character right after 'a' is the separator.
    return list.toLocaleString().charAt(1);
}
Just set any character (for example '#') as the list separator in your OS and try the code above. The corresponding character (i.e. '#' if set as suggested) is returned.
Could you just have the users with non comma separators set a profile kind of option and then generate CSVs based on user settings with the default being commas?
Toms, as far as I'm aware there is no way of achieving what you're after. The most you can do is try and detect the user locale and map it against a database of locales/list separators, altering the list separator in the .CSV file as a result.
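Whichever way the separator is chosen (a profile setting or a locale lookup), generating the file accordingly could look like the sketch below. toCsvField and toCsv are hypothetical helpers; the quoting follows the common CSV convention of wrapping a field in double quotes and doubling any embedded quotes.

```javascript
// Hypothetical helper: quote a field when it contains the separator,
// a double quote, or a line break; embedded quotes are doubled.
function toCsvField(value, separator) {
  const text = String(value);
  const needsQuotes = text.includes(separator) || /["\r\n]/.test(text);
  return needsQuotes ? '"' + text.replace(/"/g, '""') + '"' : text;
}

// Hypothetical helper: join rows using the user's separator (default ',').
function toCsv(rows, separator = ",") {
  return rows
    .map(function (row) {
      return row
        .map(function (field) {
          return toCsvField(field, separator);
        })
        .join(separator);
    })
    .join("\r\n");
}

// With ';' as the separator, "1,5" needs no quotes but "b;c" does.
console.log(toCsv([["a", "b;c"], ["1,5", "x"]], ";"));
```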
