fuzzy search using cypher - neo4j

I have user nodes with FirstName and LastName properties. I want to search for a value in both properties, matching anywhere in the string. Let me explain with an example.
FirstName LastName
--------- --------
Manish Pal
Pal Dharmesh
Rajpal Yadav
sharma shreepal
Now I want to find the nodes whose FirstName or LastName contains 'pal'.
I have written a query like this:
START users=node(*)
WHERE (users.FirstName =~ '(?i)pal.*' OR users.LastName =~ '(?i)pal.*')
RETURN users;
It returns just 2 nodes, but I want all nodes containing 'pal'.
If I try it like this:
START users=node(*)
WHERE (users.FirstName =~ '(?i)*.pal.*' OR users.LastName =~ '(?i)*.pal.*')
RETURN users;
It gives me the following error:
"PatternSyntaxException"
Dangling meta character '*' near index 4 (?i)*.ant.* ^
I have set example here for your ready reference.
Thanks.

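Your first query returns only two nodes because =~ has to match the whole property value, so '(?i)pal.*' only matches names that start with 'pal'. The second query contains invalid regular expression syntax. I think you mean: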
START users=node(*)
WHERE (users.FirstName =~ '(?i).*pal.*' OR users.LastName =~ '(?i).*pal.*')
RETURN users
Note the difference from the query in your post:
'(?i)*.pal.*' in your post, and
'(?i).*pal.*' in the above query
The asterisk * means that the expression before it may appear any number of times, including zero. But (?i) is not an expression that can be repeated; it is just a flag that makes the rest of the pattern case-insensitive, so the asterisk has nothing to apply to. I think you meant .*. The regular expression . matches any character, and the asterisk lets it appear any number of times.
Thus, '(?i).*pal.*' says: [ignore case] <arbitrary number of any characters><the exact character sequence: "pal"><arbitrary number of any characters>
The above query returned four results for me:
users.FirstName | users.LastName
---------------------------------
sharma | shreepal
Rajpal | Yadav
Pal | Dharmesh
Manish | Pal
Which is what you wanted, if I understood you correctly.
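If you want to try the patterns outside Neo4j, here is a minimal Python sketch. It assumes that =~ must match the whole property value, which re.fullmatch imitates; the users list just hard-codes the four sample rows from the question, and find_users is a hypothetical helper, not anything Neo4j provides.

```python
import re

# Sample rows from the question, hard-coded for illustration.
users = [
    ("Manish", "Pal"),
    ("Pal", "Dharmesh"),
    ("Rajpal", "Yadav"),
    ("sharma", "shreepal"),
]

def find_users(pattern, rows):
    """Return rows whose first or last name matches `pattern` in full."""
    regex = re.compile(pattern)
    return [row for row in rows if any(regex.fullmatch(name) for name in row)]

prefix_hits = find_users(r"(?i)pal.*", users)      # names starting with "pal"
contains_hits = find_users(r"(?i).*pal.*", users)  # names containing "pal"

print(len(prefix_hits), len(contains_hits))

# A quantifier with nothing in front of it is rejected by Python's engine too.
try:
    re.compile(r"(?i)*.pal.*")
    invalid_pattern_rejected = False
except re.error:
    invalid_pattern_rejected = True
```

The prefix pattern only finds the rows where a name begins with "pal", while the pattern with .* on both sides finds every row.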

Related

How to count and compare amount of regex matches

I want to use Sumo Logic to count how often different APIs are called. I want to have a table with API call name and value. My current query is like this:
_sourceCategory="my_category"
| parse regex "GET.+443 (?<getUserByUserId>/user/v1/)\d+" nodrop
| parse regex "GET.+443 (?<getUserByUserNumber>/user/v1/userNumber)\d+"
| count by getUserByUserId, getUserByUserNumber
This gets the correct values, but they go into different columns. With more variables, the table becomes very wide and hard to read.
I figured it out: I need to use the same group name for all regexes. Like this:
_sourceCategory="my_category"
| parse regex "GET.+443 (?<endpoint>/user/v1/)\d+" nodrop
| parse regex "GET.+443 (?<endpoint>/user/v1/userNumber)\d+"
| count by endpoint
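The same idea, one shared group name and then a count by that name, can be sketched in Python. This is only an illustration: the log lines are made up (the real log format is not shown above), the userNumber pattern assumes a trailing slash before the digits, and endpoint_of is a hypothetical helper.

```python
import re
from collections import Counter

# Hypothetical log lines; the real log format is not shown in the question.
log_lines = [
    "GET api.example.com:443 /user/v1/1001",
    "GET api.example.com:443 /user/v1/userNumber/77",
    "GET api.example.com:443 /user/v1/1002",
    "POST api.example.com:443 /user/v1/1003",
]

# Every pattern captures into the same named group, "endpoint".
# The more specific pattern comes first so it wins over the generic one.
patterns = [
    re.compile(r"GET.+443 (?P<endpoint>/user/v1/userNumber/)\d+"),
    re.compile(r"GET.+443 (?P<endpoint>/user/v1/)\d+"),
]

def endpoint_of(line):
    """Return the endpoint captured by the first matching pattern, or None."""
    for pattern in patterns:
        match = pattern.search(line)
        if match:
            return match.group("endpoint")
    return None

# One counter keyed by endpoint gives a two-column table (endpoint, count).
endpoint_counts = Counter(
    endpoint for endpoint in map(endpoint_of, log_lines) if endpoint is not None
)

for endpoint, count in endpoint_counts.most_common():
    print(endpoint, count)
```

Because all patterns write to the same key, adding another API call adds a row instead of a column.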

[splunk]: Obtain a count of hits in a query of regexes

I am searching for a list of regexes in a splunk alert like this:
... | regex "regex1|regex2|...|regexn"
Can I modify this query to get a table of the regexes found along with their counts? The table shouldn't show rows with 0 counts. For example:
regex2 17
regexn 3
The regex command merely filters events. All we know is each result passed the regular expression. There is no record or indication of why or how any event passed.
To do that, you'd have to extract a unique field or value from each regex and then test the resulting events to see which field or value was present. The regex command, however, does not extract anything. You'd need the rex command or the match function to do that.
It looks like the | regex line is not needed. This works for me; notice the extra brackets.
| rex max_match=0 "(?P<countfields>((regex1)|(regex2)|..|(regexn)))"
| stats count by countfields
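To see the mechanics outside Splunk, here is a rough Python sketch of extracting every match and then counting. It goes one step further and gives each alternative its own named group so the count is per regex rather than per matched text; the sub-patterns, group names, and events are all invented for illustration.

```python
import re
from collections import Counter

# Hypothetical sub-patterns and events, for illustration only.
named_patterns = {
    "timeout": r"timed? ?out",
    "refused": r"connection refused",
    "denied": r"permission denied",
}
events = [
    "request timed out after 30s",
    "connection refused by host",
    "upstream timeout, then connection refused",
]

# One alternation with a named group per sub-pattern.
combined = re.compile(
    "|".join(f"(?P<{name}>{regex})" for name, regex in named_patterns.items())
)

# finditer returns every match in an event; lastgroup names the alternative
# that produced the match.
hit_counts = Counter(
    match.lastgroup for event in events for match in combined.finditer(event)
)
print(hit_counts)
```

A Counter only holds keys that were seen at least once, so alternatives with zero hits never show up in the result.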

In Sumo Logic, how to search for logs matching a regular expression?

I'm trying to do a Sumo Logic search for logs matching the following regular expression:
"Authorization \d+ for story is not voided. Story not removed"
That is, the \d+ consists of one or more digits, but it doesn't matter what they are exactly.
Based on the search examples cheat sheet (https://help.sumologic.com/05Search/Search-Cheat-Sheets/General-Search-Examples-Cheat-Sheet), I've tried to use a * | parse regex pattern for this, but that doesn't work: I get a 'No capture group found in regex' error. I'm not really interested in capturing the digits, though, just in matching the regular expression in my search. How can I achieve this?
I managed to get it to work in two ways. Firstly, using the regular parse instead of parse regex:
* | parse "Authorization * for story is not voided. Story not removed" as id |
count by _sourceHost | sort by _count
or, when using a regular expression, it needs to be a named group:
* | parse regex "Authorization (?<id>\d+) for story is not voided. Story not removed" |
count by _sourceHost | sort by _count
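For comparison, in a general-purpose regex library no capture group is needed just to test for a match. A minimal Python sketch of the same filter-then-count step, with made-up (host, message) pairs standing in for log events:

```python
import re
from collections import Counter

# Hypothetical (source host, message) pairs standing in for log events.
events = [
    ("host-a", "Authorization 123 for story is not voided. Story not removed"),
    ("host-b", "Authorization 9 for story is not voided. Story not removed"),
    ("host-a", "Authorization 77 for story is not voided. Story not removed"),
    ("host-a", "Authorization abc for story is not voided. Story not removed"),
    ("host-b", "Story removed"),
]

# No capture group: the pattern is used purely as a filter.
# The dots are escaped so they match a literal full stop.
pattern = re.compile(r"Authorization \d+ for story is not voided\. Story not removed")

# Count the matching events per host, most frequent first.
counts_by_host = Counter(host for host, message in events if pattern.search(message))

for host, count in counts_by_host.most_common():
    print(host, count)
```

The event with "abc" instead of digits is filtered out, which is the behaviour the \d+ in the original expression asks for.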

ActiveRecord: match fields that start with plus (+)

I am trying to get all records where my phone field starts with a '+'
Company.where("phone LIKE ?", "+%") # returns 0 results
For some reason it returns zero results, even though there are records whose phone starts with '+'.
I also tried using a \ to escape any special meaning of '+', to no avail.
However, if I try to match strings that start with +1, for example, it works as expected.
Company.where("phone LIKE ?", "+1%") # works fine
It's the quotation marks!
Company.where("phone LIKE ?", '+%')
Using ' single quotes instead of " double quotes fixed the query.
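As a side note on the pattern itself: in SQL LIKE only % and _ are wildcards, so a leading + is an ordinary character and should not need escaping. A minimal sketch that checks this with Python's built-in sqlite3 module rather than ActiveRecord; the table, company names, and phone numbers are invented, and other databases or collations may behave differently.

```python
import sqlite3

# In-memory table with hypothetical phone numbers.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE companies (name TEXT, phone TEXT)")
connection.executemany(
    "INSERT INTO companies VALUES (?, ?)",
    [("Acme", "+15550100"), ("Globex", "5550101"), ("Initech", "+445550102")],
)

# The pattern is bound as a parameter, like the ? placeholder in the question.
rows = connection.execute(
    "SELECT name FROM companies WHERE phone LIKE ? ORDER BY name", ("+%",)
).fetchall()
plus_names = [name for (name,) in rows]
print(plus_names)
connection.close()
```

Only the rows whose phone begins with + come back, without any escaping of the plus sign.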

Configure Sphinx to index dash and search it with and without it

I have a record
Item id: 1, name: "wd-40"
How do I configure Sphinx to match this record on the following queries:
Item.search("wd40")
Item.search("wd-40")
To answer your title question, charset_table is what you want.
http://sphinxsearch.com/docs/current.html#charsets
But that doesn't actually solve the problem of matching both of those queries. Indexing the dash on its own wouldn't work; it would just be the inverse of not indexing it.
Instead, you probably want ignore_chars
http://sphinxsearch.com/docs/current.html#conf-ignore-chars
First indexing:
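By default, Sphinx indexes only the characters listed in its default charset_table (essentially letters, digits, and the underscore); everything else, including the dash, is treated as a word separator. To fix that, you need to use the charset_table parameter to map the dash to the dash character.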
Second searching:
AFAIK, it is not possible to make Sphinx treat both searches the way you are asking. However, you can just use something like:
# Python sketch; Item.search stands in for your own Sphinx client call
def build_query(word):
    """Return an OR query matching the word with and without dashes."""
    if '-' in word:
        return word + " | " + word.replace('-', '')
    return word

Item.search(build_query('wd-40'))  # query sent: 'wd-40 | wd40'
