Escaping the @ (at) symbol in the Ruby Elasticsearch gem? - ruby-on-rails

I have the following code in a custom ES 'where' wrapper method:
filter: { term: params }
Then we have a sample ES document that contains:
"emails" => { "email" => "johndoe@email.com" }
It is returned when my search is:
query.where("emails.email" => "johndoe")
but I get no results when:
query.where("emails.email" => "johndoe@email.com")
It seems like I have to escape the @ symbol somehow when using the ES gem?

It's probably because your field is analyzed using the default standard analyzer and is thus tokenized at the @ sign.
You can see what ES has indexed by running the command below:
curl -XGET 'localhost:9200/_analyze?analyzer=standard&pretty' -d 'johndoe@email.com'
And the result is
{
"tokens" : [ {
"token" : "johndoe",
"start_offset" : 0,
"end_offset" : 7,
"type" : "<ALPHANUM>",
"position" : 1
}, {
"token" : "email.com",
"start_offset" : 8,
"end_offset" : 17,
"type" : "<ALPHANUM>",
"position" : 2
} ]
}
As you can see, your email field has been tokenized as two different tokens and that's probably why searching for johndoe works, while searching for the full email address doesn't.
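A minimal Ruby sketch of why the exact-match lookup misses. It assumes a deliberately simplified stand-in for the standard analyzer (the real one follows Unicode text segmentation rules), and `approx_standard_tokens` and `term_match?` are hypothetical helpers, not part of any gem:

```ruby
# Rough approximation of the standard analyzer for this one input:
# lowercase, then split on anything that is not a letter, digit, or dot.
def approx_standard_tokens(text)
  text.downcase.split(/[^a-z0-9.]+/).reject(&:empty?)
end

# A term filter is not analyzed: the search value must equal
# one of the indexed tokens exactly.
def term_match?(indexed_tokens, value)
  indexed_tokens.include?(value)
end

indexed = approx_standard_tokens("johndoe@email.com")
puts indexed.inspect                            # ["johndoe", "email.com"]
puts term_match?(indexed, "johndoe")            # true
puts term_match?(indexed, "johndoe@email.com")  # false
```

The full address never appears as a single indexed token, so a term filter on it cannot match.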
There are a few ways out from here, but one way that would work is to create your own analyzer based on a pattern_capture token filter and use it as index_analyzer for your emails.email field.
{
"settings" : {
"analysis" : {
"filter" : {
"email" : {
"type" : "pattern_capture",
"preserve_original" : 1,
"patterns" : [ "([^@]+)", "(\\p{L}+)", "(\\d+)", "@(.+)" ]
}
},
"analyzer" : {
"email" : {
"tokenizer" : "uax_url_email",
"filter" : [ "email", "lowercase", "unique" ]
}
}
}
},
"mappings": {
"emails": {
"properties": {
"email": {
"type": "string",
"analyzer": "email" <-- use the analyzer here
}
}
}
}
}
At indexing time, that analyzer will produce all of the following tokens, which will allow you to search for any parts of your email address:
johndoe@email.com
johndoe
email.com
email
com
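To see where those tokens come from, here is a rough Ruby emulation of the filter chain. It is only a sketch: it assumes the tokenizer hands over the whole address as one token, `email_tokens` is a hypothetical helper, and the order in which Elasticsearch emits the tokens may differ.

```ruby
# The same four patterns as in the pattern_capture filter above.
EMAIL_PATTERNS = [/([^@]+)/, /(\p{L}+)/, /(\d+)/, /@(.+)/].freeze

# Emulates preserve_original + pattern_capture + lowercase + unique
# for a single token.
def email_tokens(address)
  token = address.downcase
  # Collect every capture-group match of every pattern.
  captures = EMAIL_PATTERNS.flat_map { |pattern| token.scan(pattern).flatten }
  # Keep the original token first, then drop duplicates.
  ([token] + captures).uniq
end

puts email_tokens("johndoe@email.com")
```

Run against johndoe@email.com, this prints the same five tokens as the list above.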

Related

How do I compare two nested documents using mongo_dart

Here is my database entry structure. It has a nested document called friends. I want to compare the friends lists of two different _ids in Dart using mongo_dart.
{
"_id" : ObjectId("60ae06074e162995281b4666"),
"email" : "one@one.com",
"emailverified" : false,
"username" : "one@one.com",
"displayName" : "complete n00b",
"phonenumber" : "",
"dob" : "",
"points" : 0,
"friends" : [
{
"username" : "three@one.com",
"sent" : ISODate("2021-05-26T10:01:30.616Z")
},
{
"username" : "six@one.com",
"sent" : ISODate("2021-05-26T10:43:16.822Z")
}
]
}
Here is my code, but it does not return any results:
Future<Map> commonFriends(store, myObjectId, theirObjectId) async {
var commonList = await store.aggregate([
{
'\$project': {
'friends': 1,
'commonToBoth': {
'\$setIntersection': [
{'_id': myObjectId, 'friends': '\$username'},
{'_id': theirObjectId, 'friends': '\$username'}
]
},
}
}
]);
return commonList;
}
I am getting an error from db.dart, which is part of the mongo_dart package. The error is "Exception has occurred. Map (4 items)"

Firebase database filtering is not working. Need some assistance

I have a simple test database that I can't get to filter. I indexed the category in the rules:
"questions":{
".indexOn": ["category"]
},
My filter for the quiz app:
/questions.json?orderBy="category"&equalTo="Basics"&print=pretty
and my database:
"-MKoucSP33zm4jC43AnY" : {
"title" : {
"answers" : [ {
"score" : 30,
"text" : "Pineapple"
}, {
"score" : 5,
"text" : "Ham"
}, {
"score" : 20,
"text" : "Yogurt"
}, {
"score" : 10,
"text" : "Crab"
} ],
"category" : "Basics",
"questionId" : "101",
"questionImage" : "",
"questionLink" : "",
"questionText" : "What topping do you like the best on pizza?"
}
}
The category property is nested under the title node, so the property you need to order/filter on is title/category:
/questions.json?orderBy="title/category"&equalTo="Basics"&print=pretty
You'll also need to update your index definition for that path, so:
"questions": { ".indexOn": "title/category" }
Working example: https://stackoverflow.firebaseio.com/64596200/questions.json?orderBy="title/category"&equalTo="Basics"
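If you build the request in code, the quoted values need URL encoding. A small Ruby sketch, assuming a placeholder database URL (`example-project` is hypothetical) and a hypothetical helper name:

```ruby
require "uri"

# Placeholder database URL; substitute your own project.
BASE_URL = "https://example-project.firebaseio.com"

# Builds the REST URL that filters questions on the nested title/category path.
def questions_by_category_url(category)
  query = URI.encode_www_form(
    "orderBy" => '"title/category"', # the double quotes are part of the value
    "equalTo" => %("#{category}"),
    "print"   => "pretty"
  )
  "#{BASE_URL}/questions.json?#{query}"
end

puts questions_by_category_url("Basics")
```

The encoded query string carries the same orderBy and equalTo values as the hand-written URL above.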

How to query specific branch build number from Jenkins JSON Remote Access API

In the browser, I'm running the following query against my Jenkins job:
lastStableBuild/api/json?pretty=true&tree=actions[buildsByBranchName[*[*]]]
Results from the above query
{
"_class" : "hudson.model.FreeStyleBuild",
"actions" : [
{
"_class" : "hudson.model.CauseAction"
},
{
},
{
"_class" : "jenkins.metrics.impl.TimeInQueueAction"
},
{
},
{
"_class" : "hudson.plugins.git.util.BuildData",
"buildsByBranchName" : {
"my-branch-name" : {
"_class" : "hudson.plugins.git.util.Build",
"buildNumber" : 587,
"buildResult" : null,
"marked" : {
"SHA1" : "***",
"branch" : [
{
}
]
},
"revision" : {
"SHA1" : "***",
"branch" : [
{
}
]
}
},
"my-other-branch-name" : {
"_class" : "hudson.plugins.git.util.Build",
"buildNumber" : 1373,
"buildResult" : null,
"marked" : {
"SHA1" : "***",
"branch" : [
{
}
]
},
"revision" : {
"SHA1" : "***",
"branch" : [
{
}
]
}
},
I would like to be able to narrow it down to just the build number like you would get with
/lastSuccessfulBuild/buildNumber
using the api but I would settle for just everything inside of the branch name key so that I wouldn't have to loop through all branches and compare the name. I'm assuming I can narrow it down more where I have my "*" specified but can't figure out the right syntax to use.
I got that info from here instead.
tree=actions[lastBuiltRevision[*,branch[*]]]
Either way, if you want the branch info, from inside the buildsByBranchName section of the tree, you will have to query it as I did above.
If you don't mind getting your answer back in xml, xpath works very well.
For the url:
/lastStableBuild/api/xml?xpath=//buildsByBranchName&wrapper=meep
This creates XML that looks like:
<meep>
<buildsByBranchName>
...
</buildsByBranchName>
</meep>
It will be populated with the buildsByBranchName elements for the build selected in the URL (lastStableBuild here). Note that there may be more than one if there are multiple git remotes, hence the need for a wrapper. You can substitute anything for the word "meep"; it becomes the wrapper element of the newly created XML document.
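If you would rather stay with the JSON endpoint, the branch lookup can also be done client-side without looping over every branch. A minimal Ruby sketch, assuming the response has the shape shown in the question (`build_number_for` is a hypothetical helper, and the sample is a trimmed copy of that response):

```ruby
require "json"

# Returns the build number recorded for `branch`, or nil if it is absent.
# Only the first action carrying buildsByBranchName is inspected.
def build_number_for(response_body, branch)
  actions = JSON.parse(response_body).fetch("actions", [])
  git_data = actions.find { |action| action.key?("buildsByBranchName") }
  return nil unless git_data

  git_data.dig("buildsByBranchName", branch, "buildNumber")
end

# Trimmed copy of the response shown in the question.
SAMPLE_RESPONSE = <<~BODY
  {
    "actions": [
      {},
      {
        "_class": "hudson.plugins.git.util.BuildData",
        "buildsByBranchName": {
          "my-branch-name": { "buildNumber": 587 },
          "my-other-branch-name": { "buildNumber": 1373 }
        }
      }
    ]
  }
BODY

puts build_number_for(SAMPLE_RESPONSE, "my-branch-name")
```

With multiple git remotes there may be several such actions, so a real client would likely need to check each of them.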

How to check elasticsearch tokens after running a query in Rails?

My problem is the following:
I run an Elasticsearch query in a Rails app using specific settings for my index and my search analyzer. The problem is that it doesn't return any results in the app. On the other hand, when I run the analyzer directly against my Elasticsearch Docker instance, tokens are returned, and if I use these tokens in my app query, I get results.
This is my Elasticsearch _analyze request and its response:
curl -XGET 'localhost:9200/development-stoot-services/_analyze?analyzer=search_francais' -d 'cours de guitare'
{"tokens":[{"token":"cour","start_offset":0,"end_offset":5,"type":"<ALPHANUM>","position":1},{"token":"guitar","start_offset":9,"end_offset":16,"type":"<ALPHANUM>","position":3}]}
Here is the query from my Rails app to Elasticsearch:
query = {
"query" : {
"bool" : {
"must" : [
{
"range" : {
"deadline" : {
"gte" : "2016-05-26T10:27:19+02:00"
}
}
},
{
"terms" : {
"state" : [
"open"
]
}
},
{
"query_string" : {
"query" : "cours de guitare",
"default_operator" : "AND",
"fields" : [
"title",
"description",
"brand",
"category_name"
]
}
}
]
}
},
"filter" : {
"and" : [
{
"geo_distance" : {
"distance" : "40km",
"location" : {
"lat" : 48.855736,
"lon" : 2.32927300000006
}
}
}
]
},
"sort" : [
{
"created_at" : "desc"
}
]
}
The last query does not return any results, but if I run a query with the tokens returned by Elasticsearch ('cour', 'guitar'), I get the expected results. So I guess there is a problem between Rails and Elasticsearch that I can't find.
Can anyone help with that?
To analyze cours de guitare the same way the _analyze endpoint did, specify the search_francais analyzer in your query_string. Try modifying your query like this:
...
{
"query_string" : {
"query" : "cours de guitare",
"default_operator" : "AND",
"analyzer": "search_francais", <--- add this line
"fields" : [
"title",
"description",
"brand",
"category_name"
]
}
},
...
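On the Rails side, the clause could be built as a plain Ruby hash so the analyzer is always set. This is only a sketch: `text_clause` is a hypothetical helper, and the field list is copied from the question.

```ruby
require "json"

# Builds the query_string clause with an explicit search analyzer.
def text_clause(text, analyzer:)
  {
    "query_string" => {
      "query" => text,
      "default_operator" => "AND",
      "analyzer" => analyzer,
      "fields" => %w[title description brand category_name]
    }
  }
end

puts JSON.pretty_generate(text_clause("cours de guitare", analyzer: "search_francais"))
```

The resulting hash would take the place of the existing query_string entry in the bool must array.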

How to set "search_type" to "count" in elasticsearch-rails?

Here's the query I'd like to get working with elasticsearch-rails. (The query works in Sense). My goal is to return all the buckets for items that have a person whose name begins with the letter B. My first stumbling block is that I can't figure out how to specify that the search_type should be set to count.
GET _search?search_type=count
{
"query": {
"prefix": {
"person": "B"
}
},
"aggs" : {
"facets" : {
"terms" : {
"field" : "person",
"size" : 0,
"order" : { "_term" : "asc" }
}
}
}
}
According to this issue, this doesn't seem supported yet.
An alternative that works is simply setting size: 0 in your query, like this:
{
"size": 0, <--- add this
"query": {
"prefix": {
"person": "B"
}
},
"aggs" : {
"facets" : {
"terms" : {
"field" : "person",
"size" : 0,
"order" : { "_term" : "asc" }
}
}
}
}
It is worth noting, though, that search_type=count is now deprecated in ES 2.0, and the recommendation is to simply set size: 0 in your query as mentioned above. Doing so makes you ES 2.0-compliant... at least for that query, that is :)
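In Ruby, the same body can be written as a hash. A minimal sketch, where `person_facets_body` is a hypothetical helper:

```ruby
require "json"

# Search body that returns only aggregation buckets (no hits),
# using size: 0 instead of search_type=count.
def person_facets_body(prefix)
  {
    size: 0,
    query: { prefix: { person: prefix } },
    aggs: {
      facets: {
        terms: { field: "person", size: 0, order: { _term: "asc" } }
      }
    }
  }
end

puts JSON.pretty_generate(person_facets_body("B"))
```

With elasticsearch-rails, a hash like this would typically be passed to the model's search method, for example Item.search(person_facets_body("B")), where Item is a hypothetical model name.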
