Find open Shops through Timetable with Elasticsearch/Tire - ruby-on-rails

I have a Shop model, and each Shop has a relation to Timetable, which could contain something like:
shop_id: 1, day: 5, open_hour: 7, open_minutes: 0, close_hour: 13, close_minute: 30
shop_id: 1, day: 5, open_hour: 14, open_minutes: 30, close_hour: 18, close_minute: 00
Of course Timetable could have a more elegant format, but the question is: how can I find Shops that are currently open using Elasticsearch (Tire)?
All ideas will be appreciated! Thanks!
Found solution:
create separate index for each day (sunday, monday, ..)
for each day build full array of minutes from Timetable:
((open_hour * 60 + open_minute)..(close_hour * 60 + close_minute)).to_a
add filter to search:
filter :term, current_day_name => (current_hour * 60 + current_minutes)
This solution works, but it looks cumbersome: if a Shop is open 8 hours per day, I end up with an array of 8 * 60 = 480 values (converted to strings as an indexed field). That is why this question is still open; maybe someone will find a better solution.
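The minute-array approach described above can be sketched in plain Ruby as follows. This is only an illustration under assumptions: the `Row` struct, the `open_minutes_by_day` helper and the mapping of `day: 5` to Friday (0 = Sunday) are hypothetical and not taken from the original model.

```ruby
# Hypothetical stand-in for a Timetable record (field names are illustrative).
Row = Struct.new(:day, :open_hour, :open_minute, :close_hour, :close_minute)

# Assumption: day numbers follow Ruby's Time#wday convention (0 = Sunday).
DAY_NAMES = %w[sunday monday tuesday wednesday thursday friday saturday].freeze

# Expand every row into the minutes of the day during which the shop is open,
# grouped by day name (one array per day, ready to be indexed as a field).
def open_minutes_by_day(rows)
  rows.each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |row, acc|
    from = row.open_hour * 60 + row.open_minute
    to   = row.close_hour * 60 + row.close_minute
    acc[DAY_NAMES[row.day]].concat((from..to).to_a)
  end
end

rows = [
  Row.new(5, 7, 0, 13, 30),
  Row.new(5, 14, 30, 18, 0)
]
minutes = open_minutes_by_day(rows)

puts minutes["friday"].size                   # number of indexed values for that day
puts minutes["friday"].include?(13 * 60 + 15) # open at 13:15?
puts minutes["friday"].include?(14 * 60)      # open at 14:00?
```

The size of the resulting arrays is what makes this approach heavy: every open minute becomes a separate indexed value.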
Tire part for @Andrei Stefan's answer:
indexes :open_hours, type: :nested do
  indexes :open, type: 'integer'
  indexes :close, type: 'integer'
end
open_hours_query = Tire::Search::Query.new do
  filtered do
    query { all }
    filter :range, "open_hours.open" => { lte: current_time }
    filter :range, "open_hours.close" => { gte: current_time }
  end
end
filter :nested, { path: 'open_hours', query: open_hours_query.to_hash }

I would consider doing it like the following:
The opening and closing hours are stored as integer values in an array of nested objects in Elasticsearch.
Example: a shop open from 07:00 to 13:30 and then from 14:30 to 18:00 on Day 1 would be translated to this in ES:
"shop_name": "Shop 1",
"open_hours": [
{ "open": 420, "close": 810 },
{ "open": 870, "close": 1080 }
]
Each day of the week (1 -> 7) has an offset that is added to the number of minutes:
Day 1 = addition 0
Day 2 = addition 2000
Day 3 = addition 4000
...
Day 7 = addition 12000
So each day adds an increment of 2000: a day contains at most 1440 minutes (24 hours * 60 minutes), and for a single number to identify its day, the ranges of different days must not overlap.
So the example above, with the shop opening at 07:00, would be translated to this for Day 4:
"shop_name": "Shop 1",
"open_hours": [
{ "open": 6420, "close": 6810 },
{ "open": 6870, "close": 7080 }
]
When querying these documents, the point in time you search for needs to obey the same rules as above. For example, to check whether "Shop 1" is open on Day 4 at 13:45, you would search for the value 6000 + 13*60 + 45 = 6825.
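The day-offset arithmetic can be written down as a pair of small Ruby helpers. This is a minimal sketch: the names `DAY_OFFSET`, `encode_minute` and `decode_minute` are made up for illustration, and the offset of 2000 per day is the one described in the answer.

```ruby
# Per-day increment described in the answer (any value >= 1440 would avoid overlap).
DAY_OFFSET = 2000

# Turn (day 1..7, hour, minute) into the single integer stored in open/close.
def encode_minute(day, hour, minute)
  raise ArgumentError, "day must be in 1..7" unless (1..7).cover?(day)

  (day - 1) * DAY_OFFSET + hour * 60 + minute
end

# Inverse operation: recover [day, hour, minute] from a stored value.
def decode_minute(value)
  day_index, minute_of_day = value.divmod(DAY_OFFSET)
  [day_index + 1, minute_of_day / 60, minute_of_day % 60]
end

puts encode_minute(4, 13, 45)    # Day 4 at 13:45
puts encode_minute(4, 7, 0)      # Day 4 opening time from the example
p decode_minute(6825)
```

Keeping the encoding in one place means the indexing code and the search code cannot drift apart on how a day is represented.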
The mapping for everything above in Elasticsearch would be this:
{
"mappings": {
"shop" : {
"properties": {
"shop_name" : { "type" : "string" },
"open_hours" : {
"type" : "nested",
"properties": {
"open" : { "type" : "integer" },
"close": { "type" : "integer" }
}
}
}
}
}
}
Test data:
POST /shops/shop/_bulk
{"index":{}}
{"shop_name":"Shop 1","open_hours":[{"open":420,"close":810},{"open":870,"close":1080}]}
{"index":{}}
{"shop_name":"Shop 2","open_hours":[{"open":0,"close":500},{"open":1000,"close":1440}]}
{"index":{}}
{"shop_name":"Shop 3","open_hours":[{"open":0,"close":10},{"open":70,"close":450},{"open":900,"close":1050}]}
{"index":{}}
{"shop_name":"Shop 4","open_hours":[{"open":2000,"close":2480}]}
{"index":{}}
{"shop_name":"Shop 5","open_hours":[{"open":2220,"close":2480},{"open":2580,"close":3000},{"open":3100,"close":3440}]}
{"index":{}}
{"shop_name":"Shop 6","open_hours":[{"open":6000,"close":6010},{"open":6700,"close":6900}]}
Querying for shops open on Day 2 at 06:40, i.e. the value 2400 (2000 + 400):
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "open_hours",
"query": {
"bool": {
"must": [
{
"filtered": {
"filter": {
"range": {
"open_hours.open": {
"lte": 2400
}}}}},
{
"filtered": {
"filter": {
"range": {
"open_hours.close": {
"gte": 2400
}}}}}
]
}}}}
]
}}}
Would output Shop 4 and Shop 5:
"shop_name": "Shop 4",
"open_hours": [
{
"open": 2000,
"close": 2480
}
]
"shop_name": "Shop 5",
"open_hours": [
{
"open": 2220,
"close": 2480
},
{
"open": 2580,
"close": 3000
},
{
"open": 3100,
"close": 3440
}
]
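To see why the result is Shop 4 and Shop 5, and why the `nested` type matters, the matching logic can be simulated in plain Ruby over the same test data. This is an in-memory sketch, not how Elasticsearch executes the query; `open_at_nested` and `open_at_flattened` are hypothetical helper names. The "flattened" variant imitates what happens with a plain object array, where the `open` and `close` values of different intervals are no longer tied together.

```ruby
# Test data from the bulk request above, as name => [[open, close], ...].
SHOPS = {
  "Shop 1" => [[420, 810], [870, 1080]],
  "Shop 2" => [[0, 500], [1000, 1440]],
  "Shop 3" => [[0, 10], [70, 450], [900, 1050]],
  "Shop 4" => [[2000, 2480]],
  "Shop 5" => [[2220, 2480], [2580, 3000], [3100, 3440]],
  "Shop 6" => [[6000, 6010], [6700, 6900]]
}.freeze

# Nested semantics: one and the same interval must satisfy both range conditions.
def open_at_nested(shops, value)
  shops.select { |_, intervals| intervals.any? { |open, close| open <= value && value <= close } }.keys
end

# Flattened semantics: any open value and any close value may satisfy the conditions.
def open_at_flattened(shops, value)
  shops.select do |_, intervals|
    intervals.any? { |open, _| open <= value } && intervals.any? { |_, close| close >= value }
  end.keys
end

p open_at_nested(SHOPS, 2400)
p open_at_nested(SHOPS, 850)     # falls in the gap between Shop 1's two intervals
p open_at_flattened(SHOPS, 850)  # without nested, shops with a gap match wrongly
```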
LATER EDIT: Elasticsearch has come a long way since I added this reply and many things have changed. A filtered filter (in the context of the bool must I used) can be replaced by a bool filter or even a simple must. Also, the string type doesn't exist in 6.x anymore, so use text if you need to search by shop name using analyzers ("shop_name" : { "type" : "text" }), or keyword otherwise:
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "open_hours",
"query": {
"bool": {
"filter": [
{
"range": {
"open_hours.open": {
"lte": 2400
}
}
},
{
"range": {
"open_hours.close": {
"gte": 2400
}
}
}
]
}
}
}
}
]
}
}
}
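On the Ruby side, the same request body can be built as a plain hash for a given point in time. This is a sketch under assumptions: `open_now_query` is a hypothetical helper, and Day 1 is assumed to be Monday (the answer does not say which weekday Day 1 is).

```ruby
require "json"

# Build the nested bool/filter query for the moment given as a Time.
# Assumption: Day 1 = Monday ... Day 7 = Sunday, with an offset of 2000 per day.
def open_now_query(time)
  day   = (time.wday + 6) % 7 + 1 # Time#wday is 0 for Sunday
  value = (day - 1) * 2000 + time.hour * 60 + time.min

  {
    query: {
      bool: {
        must: [
          {
            nested: {
              path: "open_hours",
              query: {
                bool: {
                  filter: [
                    { range: { "open_hours.open"  => { lte: value } } },
                    { range: { "open_hours.close" => { gte: value } } }
                  ]
                }
              }
            }
          }
        ]
      }
    }
  }
end

# A Tuesday at 06:40 maps to Day 2 under the assumption above.
query = open_now_query(Time.utc(2023, 1, 3, 6, 40))
puts JSON.pretty_generate(query)
```

The resulting hash can be serialized with `to_json` and sent with whichever client is in use.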

Related

Getting a "hole" in Twitter data: am I doing something wrong?

I'm looking at (and trying to retrieve) data for a specific search query on the full-archive (premium) search API (v1), but I'm getting a weird "hole" of about 9 days of data, from 7 to 16 January. Results for the rest of January up until now are apparently OK.
Parameters passed to the search endpoint are:
'query' => '<a longish query string about 750 characters>',
'fromDate' => '202301070000',
'toDate' => '202301140830',
'maxResults' => '500'
but apparently the data is missing from the count endpoint as well, since this is what I’m getting with a bucket=day granularity (this covers all of Jan up to now):
{
"results": [{
"timePeriod": "202301010000",
"count": 525
},
>>>>> ...EVERYTHING FINE UP TO HERE <<<<<<
{
"timePeriod": "202301070000",
"count": 15 <--- THIS IS A PARTIAL RESULT
}, {
"timePeriod": "202301080000",
"count": 0
}, {
"timePeriod": "202301090000",
"count": 0
}, {
"timePeriod": "202301100000",
"count": 0
}, {
"timePeriod": "202301110000",
"count": 0
}, {
"timePeriod": "202301120000",
"count": 0
}, {
"timePeriod": "202301130000",
"count": 0
}, {
"timePeriod": "202301140000",
"count": 0
}, {
"timePeriod": "202301150000",
"count": 0
}, {
"timePeriod": "202301160000",
"count": 195 <--- ALSO A PARTIAL RESULT
},
{
"timePeriod": "202301170000",
"count": 682
},
>>>>> ...FINE FROM HERE ON <<<<<<
],
"totalCount": 10456,
"requestParameters": {
"bucket": "day",
"fromDate": "202301010000",
"toDate": "202301241720"
}
}
For your enjoyment, here is a chart of what I'm (not) getting.
I'm a bit weirded out. Also, premium API access is anything but free and is paid upfront, you know.
From further research and consultation on the Twitter dev forums, this is platform-wide. There may be a reindexing in a few days, but as of now all searches return no results for the second week of January 2023.
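For anyone who wants to check their own counts response for a similar hole, runs of empty buckets can be located with a few lines of Ruby. This is a hedged sketch: `zero_runs` is a hypothetical helper, and the sample body is a shortened subset in the same shape as the response above, not a full API response.

```ruby
require "json"

# Return [first_period, last_period] for every run of consecutive zero-count buckets.
def zero_runs(results)
  results
    .chunk_while { |a, b| a["count"].zero? && b["count"].zero? }
    .select { |run| run.first["count"].zero? }
    .map { |run| [run.first["timePeriod"], run.last["timePeriod"]] }
end

# Shortened sample: only a few of the buckets shown in the question.
body = <<~JSON
  {"results": [
    {"timePeriod": "202301070000", "count": 15},
    {"timePeriod": "202301080000", "count": 0},
    {"timePeriod": "202301090000", "count": 0},
    {"timePeriod": "202301100000", "count": 0},
    {"timePeriod": "202301160000", "count": 195}
  ]}
JSON

gaps = zero_runs(JSON.parse(body)["results"])
p gaps
```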

Vega-Lite Visualization interpreting dates from Google Sheet as long numbers

Pulling data into Google Data Studio from a Google Sheet with dates stored in yyyy-mm-dd format. The dates look correct and calculate correctly with formulas and adjustments everywhere except in a Gantt chart using the Vega-Lite Community Visualization, which shows the date in a long-number format (e.g. 20210520), and is unable to display the data when using "type": "temporal" or using "timeUnit": "utcyearmonthdatehours".
I've run various tests, including:
1. Changing the date format for the date columns to plain text, yyyyddmm, yymmdd, and yyyy/mm/dd formats.
2. Replacing the current date columns with new columns using the alternate formats in point 1 (above).
3. Changing the date field formats directly in Google Data Studio to the formats in point 1 (above).
4. Creating a secondary set of date columns in plain text, using an Arrayformula and the Text() function to reformat the actual dates to plain text.
So far, options 2 and 4 are the only ways I've been able to get the Gantt chart to render correctly, reading the data in date format. But option 2 makes the other charts in GDS unusable, as they cannot translate the plain text to usable dates.
Option 4 does work, but isn't the ideal route, given the redundant data. I'd prefer to have just one column for the Start Date and another for the End Date, rather than two columns for each. It feels like I may be missing something obvious here. Is there a way to format the dates in Google Sheets so that they work with both the GDS date fields and Vega-Lite, or is there a way to parse the date data in Vega-Lite without needing a second set of plain-text columns?
Report replicating the issue: Project Tracking (debug report)
Edit: below is the code for the Vega-lite visualizations using the date fields from Google Sheets, which Vega-lite is not interpreting as dates.
Without timeunit or temporal field type:
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"description": "A bar chart with highlighting on hover and selecting on click. (Inspired by Tableau's interaction style.)",
"config": {
"background": null,
"view": {
"stroke": "transparent"
}
},
"layer": [
{
"layer": [
{
"params": [
{
"name": "grid",
"select": "interval",
"bind": "scales"
}
],
"mark": {
"type": "bar",
"cursor": "pointer",
"tooltip": true,
"point": true,
"cornerRadiusEnd": 5,
"opacity": 0.8
},
"encoding": {
"color": {
"field": "$dimension3",
"title": "$dimension3.name"
}
}
}
],
"encoding": {
"x": {
"field": "$dimension0",
"axis": {
"title": null,
"grid": true
}
},
"y": {
"field": "$dimension1",
"title": "$dimension1.name",
"type": "nominal",
"sort": "x",
"axis": {
"title": null,
"grid": true,
"tickBand": "extent"
}
},
"x2": {
"field": "$dimension2"
},
"yOffset": {
"field": "$dimension3"
}
}
}
]
}
With timeunit and field type temporal:
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"description": "A bar chart with highlighting on hover and selecting on click. (Inspired by Tableau's interaction style.)",
"config": {
"background": null,
"view": {
"stroke": "transparent"
}
},
"layer": [
{
"layer": [
{
"params": [
{
"name": "grid",
"select": "interval",
"bind": "scales"
}
],
"mark": {
"type": "bar",
"cursor": "pointer",
"tooltip": true,
"point": true,
"cornerRadiusEnd": 5,
"opacity": 0.8
},
"encoding": {
"color": {
"field": "$dimension3",
"title": "$dimension3.name"
}
}
}
],
"encoding": {
"x": {
"field": "$dimension0",
"type": "temporal",
"timeUnit": "utcyearmonthdatehours",
"axis": {
"title": null,
"grid": true
}
},
"y": {
"field": "$dimension1",
"title": "$dimension1.name",
"type": "nominal",
"sort": "x",
"axis": {
"title": null,
"grid": true,
"tickBand": "extent"
}
},
"x2": {
"field": "$dimension2"
},
"yOffset": {
"field": "$dimension3"
}
}
}
]
}

Highcharts - show only years in xAxis

I have grouped data by year and want to show only 2 bars (2014 + 2015).
Any ideas why highchart also includes months between the bars?
"chart": {
"type": "column"
},
"plotOptions": {
"column": {
"stacking": "normal"
},
},
"series": [
{
"data": [
[1388534400000,88]
],
"name": "First",
"id": "series-8"
},
{
"data": [
[1388534400000,39]
],
"name": "2nd",
"id": "series-9"
},
{
"data": [
[1420070400000,34]
],
"name": "3rd",
"id": "series-10"
}
],
"xAxis": {
"type": "datetime",
"dateTimeLabelFormats": {
"year": "%Y"
}
}
jsfiddle: http://jsfiddle.net/747hs83s/
I would like to have only 2014 and 2015 labels on x-axis.
You can define pointRange as 1 year (365 * 24 * 3600 * 1000, time in milliseconds) and set tickInterval to the same value.
"plotOptions": {
"column": {
pointRange: 365 * 24 * 3600 * 1000,
"stacking": "normal"
},
},
"xAxis": {
tickInterval: 365 * 24 * 3600 * 1000,
"type": "datetime",
"dateTimeLabelFormats": {
"year": "%Y"
}
}
Example: http://jsfiddle.net/3d3jzywq/
The answer to "why does the chart show months" is simply that there are months between your data points to be shown, and the chart has no way of knowing that you don't want it to show them.
There are probably numerous solutions, but the one I would use is the pointRange property - tell the chart that each column represents a year, and it will display it accordingly:
pointRange: 86400000 * 365 // one year
Example:
http://jsfiddle.net/jlbriggs/747hs83s/2/
By default, it will show the years as labels. If you show/hide series, that may change. There are a number of ways to format the label as well.
Label formatting references:
http://api.highcharts.com/highcharts#xAxis.labels.format
http://api.highcharts.com/highcharts#xAxis.labels.formatter
http://api.highcharts.com/highcharts#Highcharts.dateFormat
http://api.highcharts.com/highcharts#Highcharts.dateFormats
http://api.highcharts.com/highcharts#xAxis.dateTimeLabelFormats
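The one-year value used for pointRange and tickInterval in the answers above, and the years behind the two timestamps in the question, can be double-checked with a short Ruby snippet. This is only a sanity check of the arithmetic; the constant and variable names are illustrative.

```ruby
ONE_DAY_MS  = 24 * 3600 * 1000
ONE_YEAR_MS = 365 * ONE_DAY_MS # same value as 86400000 * 365

# The two distinct x values used in the question's series data (milliseconds).
timestamps = [1_388_534_400_000, 1_420_070_400_000]

years = timestamps.map { |ms| Time.at(ms / 1000).utc.year }
gap   = timestamps[1] - timestamps[0]

p years
puts ONE_YEAR_MS
puts gap == ONE_YEAR_MS
```

Note that a fixed 365-day interval is an approximation: across a leap year the distance between two 1 January timestamps is 366 days, so ticks based on this value can drift on longer ranges.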

Elastic Search- Searching Multiple Queries in Single Field

I'm new to Elasticsearch. I have a field named clearance in my users table and I'm trying to filter my results based on it.
match: {
clearance: {
query: 'None',
type: 'phrase'
}
}
With the above match query I get 3 results. What I want is to pass one more string along with None. For example, I want to find the users whose clearance is either None or First Level.
I tried this:
multi_match: {
clearance: {
query: 'None OR First Level',
type: 'phrase'
}
}
But it ended up in an error. Please help, and correct me if my question is wrong.
One way would be to map clearance as a not_analyzed field and use a terms filter.
Example:
PUT test
{
"mappings": {
"e1":{
"properties": {
"clearance":{
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
Some test data:
PUT test/e1/1
{
"clearance":"None"
}
PUT test/e1/2
{
"clearance":"First Level"
}
PUT test/e1/3
{
"clearance":"Second Level"
}
Now query part:
GET test/e1/_search
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"terms": {
"clearance": [
"None",
"First Level"
],
"execution": "or"
}
}
}
}
}
Result verification:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 1,
"hits": [
{
"_index": "test",
"_type": "e1",
"_id": "1",
"_score": 1,
"_source": {
"clearance": "None"
}
},
{
"_index": "test",
"_type": "e1",
"_id": "2",
"_score": 1,
"_source": {
"clearance": "First Level"
}
}
]
}
}
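If the request body is built from Ruby, the terms lookup can be expressed as a plain hash. This is a sketch, not the answer's exact query: it uses a bool filter instead of the filtered query and drops the execution option, on the assumption that you are on a newer Elasticsearch version where those are no longer available. `clearance_query` is a hypothetical helper, and the field is assumed to be indexed as an exact value (not_analyzed, or keyword on newer versions).

```ruby
require "json"

# Build a query matching documents whose clearance equals any of the given values.
def clearance_query(values)
  {
    query: {
      bool: {
        filter: [
          { terms: { clearance: values } }
        ]
      }
    }
  }
end

body = clearance_query(["None", "First Level"])
puts JSON.pretty_generate(body)
```

A terms clause matches a document when the field equals any one of the listed values, which is the "None or First Level" behaviour asked for.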

Converting JSON number to_i returning 1

I've been given this hash:
{
"item": {
"icon": "http://services.runescape.com/m=itemdb_rs/4332_obj_sprite.gif?id=4798",
"icon_large": "http://services.runescape.com/m=itemdb_rs/4332_obj_big.gif?id=4798",
"id": 4798,
"type": "Ammo",
"typeIcon": "http://www.runescape.com/img/categories/Ammo",
"name": "Adamant brutal",
"description": "Blunt adamantite arrow...ouch",
"current": {
"trend": "neutral",
"price": 227
},
"today": {
"trend": "neutral",
"price": 0
},
"day30": {
"trend": "positive",
"change": "+1.0%"
},
"day90": {
"trend": "positive",
"change": "+1.0%"
},
"day180": {
"trend": "positive",
"change": "+2.0%"
},
"members": "true"
}
}
I obtain the current price like this:
class GpperxpController < ApplicationController
  def index
  end
  def cooking
    require 'open-uri'
    @sharkid = '385'
    @sharkurl = "http://services.runescape.com/m=itemdb_rs/api/catalogue/detail.json?item=#{@sharkid}"
    @sharkpage = Nokogiri::HTML(open(@sharkurl))
    @sharkinfo = JSON.parse(@sharkpage.text)
    @sharkinfo = @sharkinfo['item']['current']['price']
  end
end
<%= @sharkinfo %> in my view returns 227. However, I want to perform some math operations on it, which is why I must use .to_i. The only problem is that when I append .to_i, the value changes to 1. Why is that?
The price in the given JSON (http://services.runescape.com/m=itemdb_rs/api/catalogue/detail.json?item=385) contains a comma:
... "current":{"trend":"neutral","price":"1,844"},...
^
Remove the comma before calling String#to_i:
"1,844".to_i
# => 1
"1,844".gsub(',', '').to_i
# => 1844
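Since the price can apparently arrive either as a JSON number (227 in the hash from the question) or as a string with a thousands separator ("1,844"), a small helper that handles both may be convenient. This is a minimal sketch; `parse_price` is a hypothetical name, and it deliberately raises on anything it cannot read as a whole number instead of silently returning a wrong value the way to_i does.

```ruby
# Convert a price that may be an Integer or a String such as "1,844" to an Integer.
def parse_price(raw)
  return raw if raw.is_a?(Integer)

  # Integer() is strict: it raises ArgumentError on leftover non-digit characters,
  # whereas String#to_i stops at the first one ("1,844".to_i gives 1).
  Integer(raw.to_s.delete(","), 10)
end

puts parse_price(227)
puts parse_price("1,844")
puts "1,844".to_i
```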
Just running irb, and putting your JSON response in a variable, I had no problem getting the response to be 227, either by pulling the price out as text and then converting to an integer or by pulling the price out as an integer in one fell swoop.
So my initial code looked like:
json_text = '''
{
"item": {
"icon": "http://services.runescape.com/m=itemdb_rs/4332_obj_sprite.gif?id=4798",
"icon_large": "http://services.runescape.com/m=itemdb_rs/4332_obj_big.gif?id=4798",
"id": 4798,
"type": "Ammo",
"typeIcon": "http://www.runescape.com/img/categories/Ammo",
"name": "Adamant brutal",
"description": "Blunt adamantite arrow...ouch",
"current": {
"trend": "neutral",
"price": 227
},
"today": {
"trend": "neutral",
"price": 0
},
"day30": {
"trend": "positive",
"change": "+1.0%"
},
"day90": {
"trend": "positive",
"change": "+1.0%"
},
"day180": {
"trend": "positive",
"change": "+2.0%"
},
"members": "true"
}
}
'''
require 'json'
si = JSON.parse(json_text)
And then either of the following:
p = si['item']['current']['price']
price = p.to_i
or
price = si['item']['current']['price'].to_i
put the value of 227 in my price variable.
Something I would avoid if I were you, though, is using the same variable name for different things. If what you want to have is the integer price in @sharkinfo, then you would do well to have a temporary name (without the @ symbol) to put the price as text in, then assign the integer value to the desired variable.
Try this and see if it helps. I'll try to monitor this for a bit to see if you get anywhere. Also, at the point you pull the text out of the JSON, I believe this ceases to be a JSON problem any longer. Finally, you might include what version of ruby and what platform (Windows/Mac/Linux/etc) you are using.

Resources