elasticsearch filter aggregation - multiple filters - ruby-on-rails

I'm using an Elasticsearch aggregation to aggregate data on my documents. The aggregation request looks something like this:
{
  "query": {
    "bool": {
      "filter": [
        {
          "terms": {
            "status": ["active", "deleted"]
          }
        },
        {
          "terms": {
            "address_ids": [4078, 4080]
          }
        }
      ]
    }
  },
  "aggregations": {
    "neighbourhoods": {
      "terms": {
        "field": "neighbourhoods",
        "size": 100
      }
    },
    "budgets_stats": {
      "filter": {
        "range": {
          "budget": { "lt": 100000 }
        }
      },
      "aggregations": {
        "budget": {
          "stats": { "field": "budget" }
        }
      }
    }
  }
}
I want to filter out documents that match **budget lower than 100000** and add more filters to that specific aggregation.
I can't add it to the query clause, because I have other aggregations to which I don't want these budget filters to apply.
I can't use the filters aggregation either, because it creates a different bucket for each filter, instead of one bucket with all filters applied.
How can I create a filter aggregation with more than one filter? Let's say the other condition should be:
"budgets_stats": {
  "filter": {
    "range": {
      "vip_budget": { "gt": 500 }
    }
  }
}
I have tried using an array, but it's not working:
filter: [{:range=>{:budget=>{:lt=>100000}}}, {:range=>{:vip_budget=>{:gt=>500}}}]
I'm getting this error (using the Rails gems):
Elasticsearch::Transport::Transport::Errors::BadRequest Exception: [400] {"error":{"root_cause":[{"type":"parsing_exception","reason":"Expected [START_OBJECT] under [filter], but got a [START_ARRAY] in [budgets_stats]","line":1,"col":233}],"type":"parsing_exception","reason":"Expected [START_OBJECT] under [filter], but got a [START_ARRAY] in [budgets_stats]","line":1,"col":233},"status":400}
Is this even possible? I couldn't find a reference in the documentation of Elasticsearch or the Rails gems for Elasticsearch.
I'd appreciate any help,
Thanks
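One way to sketch a fix: the "filter" key of a filter aggregation expects a single query object (hence the "Expected [START_OBJECT] under [filter], but got a [START_ARRAY]" error), but that single object can itself be a bool query, whose own filter clause does accept an array. A minimal sketch in the same Ruby hash style as the snippet above, using the field names from the question:

```ruby
# The filter aggregation takes ONE query object; to combine several
# conditions, wrap them in a bool query whose :filter clause takes an array.
budgets_stats = {
  filter: {
    bool: {
      filter: [
        { range: { budget:     { lt: 100_000 } } },
        { range: { vip_budget: { gt: 500 } } }
      ]
    }
  },
  aggregations: {
    budget: { stats: { field: "budget" } }
  }
}
```

Because bool's filter clauses are ANDed together, this yields a single bucket with all conditions applied, unlike the filters (plural) aggregation.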

Related

How can I run a Graphql query in a Rails console?

I have a GraphQL query working in GraphiQL:
query MyConfigurationType {
  myConfiguration {
    number
    expirationDate
  }
}
Returns
{
  "data": {
    "myConfiguration": {
      "number": 1,
      "expirationDate": "2022/10/04"
    }
  }
}
But I need to actually use that result in my app, so I want to be able to run it in my Rails console. There doesn't seem to be much info about this.
How would one go about executing a GraphQL query in the Rails console?
After looking at some documentation, the best I could manage was the following. In the Rails console, do:
query_string = "query MyConfigurationType {
  myConfiguration {
    number
    expirationDate
  }
}"
and then run:
result = MySchema.execute(query_string)
Which has a result of
=> #<GraphQL::Query::Result #query=... #to_h={"data"=>{"myConfiguration"=>{"number"=>1, "expirationDate"=>"2022/10/04"}}}>
Therefore one can now do
[1] pry(main)> result['data']
=> {"myConfiguration"=>{"number"=>1, "expirationDate"=>"2022/10/04"}}
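Since the GraphQL::Query::Result supports hash-style access, nested values can be pulled out with `dig`. A small sketch, using a plain hash with the same shape as the response above to stand in for the result:

```ruby
# Stand-in for MySchema.execute(query_string): the result behaves like
# a hash with the shape shown above.
result = {
  "data" => {
    "myConfiguration" => { "number" => 1, "expirationDate" => "2022/10/04" }
  }
}

# #dig walks the nested keys and returns nil (rather than raising)
# if any key along the path is missing.
number = result.dig("data", "myConfiguration", "number")
```

`result.to_h` on the real Result object gives the same plain-hash form if you need to serialize or pass it around.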

Match multiple paths in artifactory repo

So I'm trying to write a search_spec.json file to exclude all Docker images with path */latest or */develop in my Artifactory repo. However, I cannot find a way to exclude multiple paths from the search results. With the current solution I'm getting a 400 from Artifactory. Any ideas?
{
  "files": [
    {
      "aql": {
        "items.find": {
          "repo": { "$eq": "my-docker-repo" },
          "path": {
            "$nmatch": "**/latest*",
            "$nmatch": "**/develop*"
          },
          "updated": { "$before": "8w" },
          "stat.downloaded": { "$before": "12w" }
        }
      },
      "recursive": "true",
      "sortBy": ["created"],
      "limit": 10000
    }
  ]
}
You can use compound criteria for that, using an $and operator. Keys within a JSON object must be unique, so two "$nmatch" criteria cannot live in the same path object; each pattern needs its own object.
In your example, change:
"path": {
  "$nmatch": "**/latest*",
  "$nmatch": "**/develop*"
},
to:
"$and": [
  {
    "path": { "$nmatch": "**/latest*" }
  },
  {
    "path": { "$nmatch": "**/develop*" }
  }
],
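One way to avoid the duplicate-key pitfall entirely is to generate the spec from code, since a Ruby hash (like a JSON object) cannot hold two "$nmatch" keys either. A sketch that builds the corrected search_spec.json shown above:

```ruby
require 'json'

# Build the AQL criteria; the two $nmatch patterns live in separate
# objects under "$and", since one hash cannot hold the key twice.
find = {
  "repo" => { "$eq" => "my-docker-repo" },
  "$and" => [
    { "path" => { "$nmatch" => "**/latest*" } },
    { "path" => { "$nmatch" => "**/develop*" } }
  ],
  "updated" => { "$before" => "8w" },
  "stat.downloaded" => { "$before" => "12w" }
}

spec = {
  "files" => [
    {
      "aql"       => { "items.find" => find },
      "recursive" => "true",
      "sortBy"    => ["created"],
      "limit"     => 10000
    }
  ]
}

File.write("search_spec.json", JSON.pretty_generate(spec))
```

The repo name and time windows are taken from the question; adjust them to your setup.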

Elasticsearch function_score not working?

I'm using the following function score for outfits purchased:
{
  "query": {
    "function_score": {
      "field_value_factor": {
        "field": "purchased",
        "factor": 1.2,
        "modifier": "sqrt",
        "missing": 1
      }
    }
  }
}
However, when I run a search, I get the following error:
"type":"illegal_argument_exception","reason":"Fielddata is disabled on text fields by default. Set fielddata=true on [purchased] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
The syntax is correct for the search, as I've run it locally and it works perfectly. I'm now running it on my server and it's not working. Do I need to define purchased as an integer somewhere, or is this due to something else?
The purchased field is an analyzed string field, hence the error you see.
When indexing your documents, make sure that the numbers are not within double quotes, i.e.
Wrong:
{
  "purchased": "324"
}
Right:
{
  "purchased": 324
}
...or if you can't change the source documents (because you're not responsible for producing them), make sure that you create a mapping that defines the purchased field as being an integer field.
{
  "your_type": {
    "properties": {
      "purchased": {
        "type": "integer"
      }
    }
  }
}
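For completeness, the mapping above expressed in Ruby. Note that an existing text field cannot be changed to integer in place; the mapping must be set on a fresh index (or the data reindexed). The index name "outfits" and type name "your_type" below are placeholders, not from the original question:

```ruby
require 'json'

# Mapping that declares purchased as an integer field.
# "your_type" is a placeholder type name; adjust to your setup.
mapping = {
  your_type: {
    properties: {
      purchased: { type: "integer" }
    }
  }
}

# With the elasticsearch-ruby gem, this could be applied as:
#   client.indices.put_mapping index: "outfits", type: "your_type", body: mapping
puts JSON.generate(mapping)
```

After the mapping is in place, `field_value_factor` can read the numeric doc values directly and the fielddata error goes away.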

Elasticsearch validate API explain query terms from more like this against single field getting highlighted terms

I have an index, "document_texts", that effectively contains the plain text of converted Word or PDF documents. It's built on a Rails stack (the ActiveModel is DocumentText) using the elasticsearch-rails gems for the model and API. I want to be able to match similar Word or PDF documents based on the document text.
I have been able to match documents against each other by using
response = DocumentText.search \
  query: {
    filtered: {
      query: {
        more_like_this: {
          ids: ["12345"]
        }
      }
    }
  }
But I want to see HOW the result set was queried: what query terms were used to match the documents?
Using the elasticsearch API gem I can do the following
client = Elasticsearch::Client.new log: true
client.indices.validate_query index: 'document_texts',
  explain: true,
  body: {
    query: {
      filtered: {
        query: {
          more_like_this: {
            ids: ['12345']
          }
        }
      }
    }
  }
But I get this in response
{"valid":true,"_shards":{"total":1,"successful":1,"failed":0},"explanations":[{"index":"document_texts","valid":true,"explanation":"+(like:null -_uid:document_text#12345)"}]}
I would like to find out how the query was built. It uses up to 25 terms for the matching; what were those 25 terms, and how can I get them from the query?
I'm not sure if it's possible, but I would like to get the 25 terms used by Elasticsearch's analyzer and then reapply the query with boosted values on the terms of my choice.
I also want to highlight the matches in the document text, so I tried this:
response = DocumentText.search \
  from: 0, size: 25,
  query: {
    filtered: {
      query: {
        more_like_this: {
          ids: ["12345"]
        }
      },
      filter: {
        bool: {
          must: [
            { match: { documentable_type: model } }
          ]
        }
      }
    }
  },
  highlight: {
    pre_tags: ["<tag1>"],
    post_tags: ["</tag1>"],
    fields: {
      doc_text: {
        type_name: {
          content: { term_vector: "with_positions_offsets" }
        }
      }
    }
  }
But this fails to produce anything; I think I was being rather hopeful. I know this should be possible, but I'd be keen to hear from anyone who has done it, or what the best approach is. Any ideas?
For anyone else out there: the following gives an easy way to see the terms used for the query (some stop words and other settings are included just for illustration). It doesn't solve the highlighting issue, but it does reveal the terms used in the more_like_this matching process.
curl -XGET 'http://localhost:9200/document_texts/document_text/_validate/query?rewrite=true' -d '
{
  "query": {
    "filtered": {
      "query": {
        "more_like_this": {
          "ids": ["12345"],
          "min_term_freq": 1,
          "max_query_terms": 50,
          "stop_words": ["this", "of"]
        }
      }
    }
  }
}'
https://github.com/elastic/elasticsearch-ruby/pull/359
Once this is merged, it should be easier:
client.indices.validate_query index: 'document_texts',
  rewrite: true,
  explain: true,
  body: {
    query: {
      filtered: {
        query: {
          more_like_this: {
            ids: ['10538']
          }
        }
      }
    }
  }
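Once the rewritten explanation comes back, the individual terms can be pulled out of the explanation string and reapplied with boosts. A rough sketch; the field name `doc_text`, the sample terms, and the `field:term` explanation format are assumptions for illustration, not taken from an actual response:

```ruby
# Sample explanation string in the shape the validate API returns
# for a rewritten query (a flat list of field:term clauses).
explanation = "+(doc_text:contract doc_text:agreement doc_text:tenant)"

# Extract the terms the more_like_this query was rewritten into.
terms = explanation.scan(/doc_text:(\w+)/).flatten

# Rebuild a bool/should query, boosting the terms you care about.
boosted = terms.map do |t|
  boost = (t == "tenant") ? 2.0 : 1.0  # hypothetical boost choice
  { term: { doc_text: { value: t, boost: boost } } }
end

query = { query: { bool: { should: boosted } } }
```

Because the rebuilt query uses ordinary term clauses rather than more_like_this, the standard highlighter should then be able to mark the matched terms in doc_text.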

Neo4j http API NullPointerException

I am using the HTTP API to query the Neo4j server. The exact same queries with slightly different values do not work consistently; in fact, the entire system breaks because of the NullPointerException that is thrown. I cannot figure out the root of this problem.
{
  "query": "START n=node( { current_user_node } ), n1 = node( { contact_node } ) CREATE UNIQUE n-[:has_contact {device: {device_id}, name: {name} }]->(n1)",
  "params": { "current_user_node": 2, "contact_node": 5941, "device_id": "F1485935-48F8-4624-AF5D-67529AE91227", "name": "Samir Coll " }
}
The above query returns
{
  "exception": "NullPointerException",
  "fullname": "java.lang.NullPointerException",
  "stacktrace": []
}
I tried the above query in the neo4j-shell from the command line, and the query returned null.
While
{
  "query": "START n=node( { current_user_node } ), n1 = node( { contact_node } ) CREATE UNIQUE n-[:has_contact {device: {device_id}, name: {name} }]->(n1)",
  "params": { "current_user_node": 1, "contact_node": 5658, "device_id": "FA2C589A-6AB5-4D78-ADED-7146CA71D0FC", "name": "Jayesh New" }
}
The above returns
{ "columns": [], "data": [] }
The data is empty because the relationship already exists.
I am running Neo4j 2.0.0 stable. All the nodes mentioned in the above queries are valid. I am unsure how to proceed and would appreciate it if someone could help with the problem.