Using "~*" in Postgres using Entity Framework 6

I'm trying to use the case-insensitive regex operator ~* of Postgres using Entity Framework 6.
When using Regex.IsMatch(column, pattern, RegexOptions.IgnoreCase), this translates into ~
(as described in the mapping page https://www.npgsql.org/efcore/mapping/translations.html). This does result in a case-insensitive filter, but using ~* also ignores diacritics, so it would give better results for our users.
Is it possible to create a query with ~* using EF6?
edit:
Here is a screenshot of a query as a reaction to the remark by Shay Rojansky.

The answer here is that there's no behavioral difference between the PostgreSQL ~* operator and Npgsql's current way of doing case-insensitive matches (?i)...; both are equivalent.
Note that I've merged this PR for version 8.0 which improves various aspects of the SQL generated around regexes, and also switches to ~* (but for unrelated readability reasons).
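
As a rough analogy only (this is Python's re module, not PostgreSQL or Npgsql), the sketch below compares asking for case-insensitivity inside the pattern with (?i) against asking for it outside the pattern with a flag. The function names and sample strings are made up for illustration. In this Python analogy the two forms agree on every sample, and neither treats "é" as "e".

```python
import re

def matches_inline(pattern: str, text: str) -> bool:
    # Case-insensitivity requested inside the pattern, similar in spirit to '(?i)' + pattern.
    return re.search("(?i)" + pattern, text) is not None

def matches_flag(pattern: str, text: str) -> bool:
    # Case-insensitivity requested outside the pattern, similar in spirit to a ~* operator.
    return re.search(pattern, text, re.IGNORECASE) is not None

# Hypothetical sample values, chosen to include a diacritic.
samples = ["Cafe", "CAFE", "café", "tea"]
for s in samples:
    # Both ways of asking for case-insensitivity give the same answer.
    assert matches_inline("cafe", s) == matches_flag("cafe", s)
    print(s, matches_flag("cafe", s))
```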

Related

Separating relationship types by | (pipe) vs |: (pipe colon)

The Neo4j MATCH documentation says that
To match on one of multiple types, you can specify this by chaining them together with the pipe symbol |
but gives an example where the separator used is in fact |:, not just |
MATCH (wallstreet { title:'Wall Street' })<-[:ACTED_IN|:DIRECTED]-(person)
RETURN person
Experimenting in my local Neo4j browser, it seems that the two separators (| and |:) behave identically; that is, the query
MATCH (wallstreet { title:'Wall Street' })<-[:ACTED_IN|DIRECTED]-(person)
RETURN person
seems to do the same thing as the one from the Neo4j docs, at least on my data set. But this invites the question of why Neo4j would implement two similar syntaxes to do exactly the same thing.
Are the behaviours of the two syntaxes above in fact identical, or is there a subtle difference between them that doesn't show up on my data set? Whatever the answer may be, is it documented anywhere? And if there is no difference between them, what is the rationale for Cypher supporting both syntaxes?
AFAIK, there are no differences.
Rationale: backward compatibility.
Over time, the Cypher language has evolved.
If I recall correctly, there have been ~3 implementations of the Cypher language.
So, to let users migrate to new Neo4j versions without rewriting all their queries, Cypher retained support for the old syntax.
For example, in the past (< 3.0.0) you were able to use "bare node" syntax:
node-[rel]-otherNode
General recommendation: do not use deprecated syntax.
If a syntax is not explicitly mentioned in the documentation, it can be considered deprecated.
Deprecations page in the documentation: http://neo4j.com/docs/stable/deprecations.html

Querying for tag values in a given list

Is there any short-form syntax in InfluxDB to query for membership in a list? I'm thinking of something along the lines of
SELECT * FROM some_measurement WHERE some_tag IN ('a', 'b', 'c')
For now I can string this together using ORed =s, but that seems very inefficient. Any better approaches? I looked through the language spec and I don't see this as a possibility in the expression productions.
Another option I was thinking of was the regex approach, but that seems like a worse approach to me.
InfluxDB 0.9 supports regex for tag matching. It's the correct approach although of course regex can be problematic. It's not a performance issue for InfluxDB, and in fact would likely be faster than multiple chained OR statements. There is no support yet for clauses like IN or HAVING.
For example: SELECT * FROM some_measurement WHERE some_tag =~ /a|b|c/
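
Note that /a|b|c/ is unanchored, so it also matches any tag value that merely contains a, b or c. A minimal sketch of building an anchored pattern from a list of values follows. The helper names tag_in_regex and build_query are hypothetical, and using Python's re.escape (Python 3.7+) as an approximation of escaping for InfluxDB's regex syntax is an assumption, not something taken from the InfluxDB documentation.

```python
import re

def tag_in_regex(values):
    """Build a regex literal that matches exactly one of `values` (hypothetical helper)."""
    # re.escape neutralises regex metacharacters (assumed close enough to
    # InfluxDB's regex dialect); '/' is escaped separately because it
    # delimits the regex literal in the query text.
    escaped = [re.escape(v).replace("/", r"\/") for v in values]
    # ^( ... )$ anchors the alternation so only whole tag values match.
    return "/^(" + "|".join(escaped) + ")$/"

def build_query(measurement, tag, values):
    # Assumes measurement and tag are trusted identifiers, not user input.
    return f"SELECT * FROM {measurement} WHERE {tag} =~ {tag_in_regex(values)}"

print(build_query("some_measurement", "some_tag", ["a", "b", "c"]))
# prints: SELECT * FROM some_measurement WHERE some_tag =~ /^(a|b|c)$/
```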

MongoID Query with Regex and escaping

I want to know whether it is necessary to escape regexes in query calls with Rails/Mongoid.
This is my current query:
#model.where(nice_id_string: /#{params[:nice_id_string]}/i)
I am now unsure whether this is secure enough, because of the regex.
Should I use the code below, or does Mongoid automatically escape query calls?
#model.where(nice_id_string: /#{Regexp.escape(params[:nice_id_string])}/i)
Of course you should escape the input. Consider params[:nice_id_string] being .*, your current query would be:
#model.where(nice_id_string: /.*/i)
whereas your second would be:
#model.where(nice_id_string: /\.\*/i)
Those do very different things, one of which you probably don't want. Someone with a sufficiently bad attitude could probably slip some catastrophic backtracking through your current version and I'm not sure what MongoDB/V8's regex engine will do with that.
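
The difference can be seen in a few lines. This sketch uses Python's re.escape as a stand-in for Ruby's Regexp.escape, and a hypothetical find_matches function over a list of strings in place of the Mongoid where(...) call, so it only illustrates the matching behaviour, not Mongoid itself.

```python
import re

def find_matches(user_input, records, escape):
    # Hypothetical stand-in for the query: filter strings with a
    # case-insensitive regex built from user input.
    source = re.escape(user_input) if escape else user_input
    pattern = re.compile(source, re.IGNORECASE)
    return [r for r in records if pattern.search(r)]

# Made-up sample data for illustration.
records = ["abc-123", "x.*y", "other"]

# Unescaped, ".*" is a wildcard and every record matches.
print(find_matches(".*", records, escape=False))
# Escaped, ".*" is a literal string and only "x.*y" matches.
print(find_matches(".*", records, escape=True))
```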

idiomatic way to do regular expression searches in rails models?

in my rails controller, i would like to do a regular expression search of my model. my googling seemed to indicate that i would have to write something like:
Model.find( :all, :conditions => ["field REGEXP ?", regex_str] )
which is rather nasty as it implies MySQL syntax (i'm using Postgres).
is there a cleaner way of forcing rails (4 in my case) to do a regexp search on a field?
i also much prefer using where() as it allows me to map my strong parameters (hash) directly to a query. so what i would like is something like:
Model.where( params, :match_by => { 'field': '~' } )
which would loosely translate to something like (if params['field'] = 'regex_str')
select * from models where field ~ regex_str
Unfortunately, there is no idiomatic way to do this. There's no built-in support for regular expressions in ActiveRecord. It'd be impossible to do efficiently unless each database adapter had a database-specific implementation, and not all databases support regular expression matches. Those that do don't all support the same syntax (for example, Postgres doesn't have the same regexp syntax as Ruby's Regexp class).
You'll have to roll your own using SQL, as you've noted in your question. There are alternatives, however.
For a Postgres-specific solution, check out pg_search, which uses Postgres's full text search capabilities. This is very fast and supports fuzzy searching and some pattern matching.
Elasticsearch requires more setup, but is incredibly fast, with some nice gems to make your life easier. Here's a RailsCasts episode introducing it. It requires running a separate server, but it's not too hard to get started, and it's powerful. Still no regular expressions, but it's worth looking at.
If you're just doing a one-off regexp search against a single field, SQL is probably the way to go.

SQLite Order By places umlauts & special chars at end

I'm using Phonegap to do a dictionary app for iOS.
When querying the database for an alphabetical list I use COLLATE NOCASE:
ORDER BY term COLLATE NOCASE ASC
This solved the problem that terms starting with a lower case letter were appended to the end (picked it up from that question).
However, non-ASCII characters such as öäüéêè still get sorted at the end. Here are 2 examples:
Expected: Öffnungszeiten, Zuzahlung
Observed: Zuzahlung, Öffnungszeiten
(or)
Expected: clé, cliquer sur
Observed: cliquer sur, clé
I looked around and found similar matters discussed here or here but it seems the general advice is to install some type of extension
This extension can probably help you ...
...use ICU either as an extension
SQLite supports integration with ICU ...
But I'm not sure if this is applicable in my situation, where the database is not hosted by me but runs on the customer's device. So I guess I'd have to ship this extension with my app package.
I'm not very familiar with iOS, but I have the feeling that would be complicated, to say the least.
In the official forum I also found this hint:
SQLite does not properly handle accented characters.
and a little bit down in the text the poster mentions a bug in SQLite.
None of the links I've found has been active for a year or more, and none of them seems to deal with the mobile environment I'm currently developing in.
So I was wondering if anyone else found a solution on their iOS projects.
The documentation states there are only 3 default collation options:
6.0 Collating Sequences
When SQLite compares two strings, it uses a collating sequence or collating function (two words for the same thing) to determine which string is greater or if the two strings are equal. SQLite has three built-in collating functions: BINARY, NOCASE, and RTRIM.
BINARY - Compares string data using memcmp(), regardless of text encoding.
NOCASE - The same as binary, except the 26 upper case characters of ASCII are folded to their lower case equivalents before the comparison is performed. Note that only ASCII characters are case folded. SQLite does not attempt to do full UTF case folding due to the size of the tables required.
RTRIM - The same as binary, except that trailing space characters are ignored.
For now my best guess would be to do the sorting in JavaScript, but I suspect that this wouldn't do overall performance any good.
The reason is that SQLite on iOS doesn't come with ICU (International Components for Unicode) enabled. So you need to build your own SQLite with ICU enabled, build ICU itself as a static library, add the ICU .dat file, and make SQLite load that file. Then you can load any collation via a simple SQL command (e.g. 'icu_load_collation("de_DE", "DEUTSCH")', once after the database has been opened).
It doesn't just sound like dirty work, it really is. Try to find a version of SQLite + ICU where all of this has already been done.
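
Where the SQLite binding lets the application register its own collation function, a lighter alternative to a full ICU build might be a small accent-folding collation. The sketch below uses Python's sqlite3 module purely for illustration. Whether the SQLite plugin used with PhoneGap exposes an equivalent hook is not confirmed here, the collation name UNACCENTED is made up, and stripping accents is a simplification rather than real locale-aware (ICU) ordering.

```python
import sqlite3
import unicodedata

def fold(text):
    # Decompose accented letters (NFKD), drop the combining marks, ignore case.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

def accent_insensitive(a, b):
    # A collation callback returns a negative, zero, or positive number.
    ka, kb = fold(a), fold(b)
    return (ka > kb) - (ka < kb)

conn = sqlite3.connect(":memory:")
# "UNACCENTED" is a hypothetical collation name chosen for this sketch.
conn.create_collation("UNACCENTED", accent_insensitive)
conn.execute("CREATE TABLE terms (term TEXT)")
conn.executemany(
    "INSERT INTO terms VALUES (?)",
    [("Zuzahlung",), ("Öffnungszeiten",), ("cliquer sur",), ("clé",), ("apfel",)],
)
rows = conn.execute("SELECT term FROM terms ORDER BY term COLLATE UNACCENTED").fetchall()
sorted_terms = [row[0] for row in rows]
print(sorted_terms)
# prints: ['apfel', 'clé', 'cliquer sur', 'Öffnungszeiten', 'Zuzahlung']
```

The collation has to be registered on every connection that uses it, since it lives in the application rather than in the database file.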
