Azure Data Explorer SubQuery KQL GeoJSON

I am trying to create a KQL query that filters which locations fall inside a GeoJSON polygon using the function geo_point_in_polygon.
First, I try to obtain my polygon with this query:
let polygon = adt_dh_FerrovialTwins_westeurope
| where ModelId == "dtmi:ferrovial:domain:models:v1:acopiotemploraltwin;1"
and Key == "localizacion"
| top 1 by Id
|project (Value);
polygon;
With this data, I then try another query to count how many times a location falls inside this polygon:
let polygon = adt_dh_FerrovialTwins_westeurope
| where ModelId == "dtmi:ferrovial:domain:models:v1:acopiotemploraltwin;1"
and Key == "localizacion"
| top 1 by Id
|project (Value);
adt_dh_FerrovialTwins_westeurope
| where ModelId == "dtmi:ferrovial:domain:models:v1:camiondetierrastwin;1"
and Key == "localizacion"
| extend lon=(Value).geometry.coordinates[0], lat= (Value).geometry.coordinates[1]
| project todecimal(lon), todecimal(lat), Id
| where geo_point_in_polygon(lon, lat, polygon)
| summarize count() by Id, hash = geo_point_to_s2cell(lon, lat, 7)
| project geo_s2cell_to_central_point(hash), Id, count_
| render piechart with (kind=map) // map rendering available in Kusto Explorer desktop
It says it needs a dynamic value, but it already is one. I don't know how to solve this, because if I use a hard-coded polygon variable (dynamic) it works fine:
let polygon = dynamic({
    "type": "Polygon",
    "coordinates": [
        [
            [-22.6430674134452, -69.1258109131277],
            [-22.6430533208934, -69.1250474377359],
            [-22.6453362953948, -69.1243603098833],
            [-22.6452658337868, -69.1264980409803],
            [-22.6431096910912, -69.1257803741119],
            [-22.6430674134452, -69.1258109131277]
        ]
    ]
});
adt_dh_FerrovialTwins_westeurope
| where ModelId == "dtmi:ferrovial:domain:models:v1:camiondetierrastwin;1"
and Key == "localizacion"
| extend lon=(Value).geometry.coordinates[0], lat= (Value).geometry.coordinates[1]
| project todecimal(lon), todecimal(lat), Id
| where geo_point_in_polygon(lon, lat, polygon)
| summarize count() by Id, hash = geo_point_to_s2cell(lon, lat, 7)
| project geo_s2cell_to_central_point(hash), Id, count_
| render piechart with (kind=map) // map rendering available in Kusto Explorer desktop

Here is a minimal, reproducible example:
let polygon = print dynamic({"type": "Polygon","coordinates": []});
print geo_point_in_polygon(0, 0, polygon)
Fiddle
You shared the IntelliSense error:
A value of type dynamic expected
However, you didn't share the run-time error:
Failed to resolve scalar expression named 'polygon'
As the error suggests, the function expects a scalar as an argument.
Your tabular expression can be converted to a scalar using the toscalar() function.
let polygon = toscalar(print dynamic({"type": "Polygon","coordinates": []}));
print geo_point_in_polygon(0, 0, polygon)
Fiddle
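Applied to your original query, the polygon sub-query can be wrapped in toscalar() directly. A sketch using the table and filters from the question (untested):
let polygon = toscalar(
    adt_dh_FerrovialTwins_westeurope
    | where ModelId == "dtmi:ferrovial:domain:models:v1:acopiotemploraltwin;1"
        and Key == "localizacion"
    | top 1 by Id
    | project Value);
adt_dh_FerrovialTwins_westeurope
| where ModelId == "dtmi:ferrovial:domain:models:v1:camiondetierrastwin;1"
    and Key == "localizacion"
| extend lon = todecimal(Value.geometry.coordinates[0]), lat = todecimal(Value.geometry.coordinates[1])
| where geo_point_in_polygon(lon, lat, polygon)
| summarize count() by Id, hash = geo_point_to_s2cell(lon, lat, 7)
| project geo_s2cell_to_central_point(hash), Id, count_
| render piechart with (kind=map)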

Related

Copy value from a cell to another cell if it exists in another sheet's column

I have the two sheets below; links to each sheet were also added for reference.
Posts sheet:
id | title | tags
1 | title 1 | article, sports, football, england
2 | title 2 | news, sports, spain, france
3 | title 3 | opinion, political, france
4 | title 4 | news, political, russia
5 | title 5 | article, market, Germany
Tags sheet:
location | type | category
england | article | sports
spain | news | political
germany | opinion | market
russia | | football
france |
About each sheet:
The Posts sheet consists of a list of posts, each with a title and its associated tags.
The Tags sheet consists of a list of tags categorized under readable headings.
What I am trying to do:
I need to extract each value from the tags column in the Posts sheet and add the tag to its own column, based on which heading it falls under in the Tags sheet.
Desired Output:
id | title | type | category | location
1 | title 1 | article | sports, football | england
2 | title 2 | news | sports | spain, france
3 | title 3 | opinion | political | france
4 | title 4 | news | political | russia
5 | title 5 | article | market | Germany
I made this sample Google Apps Script code that can help you sort the information. I added some comments in case you want to modify the columns or cells it works on. Here is the code:
function Split_by_tags() {
  // Get the sheets you will work with by the name of the tab
  const ss_posts = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Posts Sheet");
  const ss_tags = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Tags Sheet");
  const ss_output = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Expected Output");
  // Get the ranges of the "Posts Sheet" columns we will work with
  // range_1 is the id, range_2 is the title, range_3 is the tags
  // If you change the columns in the future, you only need to update this part
  let range_1 = ss_posts.getRange("A2:A").getValues().flat();
  let range_2 = ss_posts.getRange("B2:B").getValues().flat();
  let range_3 = ss_posts.getRange("C2:C").getValues().flat();
  // Filter the arrays down to only the cells with values for "Posts Sheet"
  // This way, you can add new rows later and they will still be picked up
  range_1 = range_1.filter((element) => element !== '');
  range_2 = range_2.filter((element) => element !== '');
  range_3 = range_3.filter((element) => element !== '');
  // The values we will compare the tags against, as arrays
  // Per the "Tags Sheet" header, column A holds locations, B types, C categories
  let range_location = ss_tags.getRange("A2:A").getValues().flat();
  let range_type = ss_tags.getRange("B2:B").getValues().flat();
  let range_category = ss_tags.getRange("C2:C").getValues().flat();
  // Filter the arrays down to only the cells with values for "Tags Sheet"
  range_location = range_location.filter((element) => element !== '');
  range_type = range_type.filter((element) => element !== '');
  range_category = range_category.filter((element) => element !== '');
  // New arrays where the information will be sorted; there is an extra "other" bucket
  // in case a tag in the tags column has a value that is not listed in "Tags Sheet"
  let type_tag = [];
  let location_tag = [];
  let category_tag = [];
  let other_tag = [];
  // Copy the id column from "Posts Sheet" to "Expected Output"
  for (let i = 0; i < range_1.length; i++) {
    ss_output.getRange(i + 2, 1).setValue(range_1[i]);
  }
  // Copy the title column from "Posts Sheet" to "Expected Output"
  for (let j = 0; j < range_2.length; j++) {
    ss_output.getRange(j + 2, 2).setValue(range_2[j]);
  }
  // Function to sort the tags from "Posts Sheet" based on "Tags Sheet"
  function Separate_value(value_array) {
    for (let k = 0; k < value_array.length; k++) {
      if (range_type.includes(value_array[k])) {
        type_tag.push(value_array[k]);
      } else if (range_location.includes(value_array[k])) {
        location_tag.push(value_array[k]);
      } else if (range_category.includes(value_array[k])) {
        category_tag.push(value_array[k]);
      } else {
        other_tag.push(value_array[k]);
      }
    }
  }
  // Function to empty the arrays for the next loop
  function Empty_value() {
    type_tag = [];
    location_tag = [];
    category_tag = [];
    other_tag = [];
  }
  // Split each tags cell, sort its values, and write them to "Expected Output"
  for (let e = 0; e < range_3.length; e++) {
    let value_array = range_3[e].split(', ');
    Separate_value(value_array);
    ss_output.getRange(e + 2, 3).setValue(type_tag.join(", "));
    ss_output.getRange(e + 2, 4).setValue(category_tag.join(", "));
    ss_output.getRange(e + 2, 5).setValue(location_tag.join(", "));
    ss_output.getRange(e + 2, 6).setValue(other_tag.join(", "));
    Empty_value();
  }
}
You can bind the script by accessing Extensions > Apps Script in your Google Sheet.
Copy and paste the sample code and run it. The first time you run the Apps Script it will ask you for permissions; accept those, and the information will get sorted.
You can also add a trigger to the Apps Script so it can sort the information automatically when new data is added.
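For example, a minimal sketch of installing such a trigger from the script itself (it assumes the Split_by_tags function from the sample above; you can also create the trigger from the Triggers menu in the Apps Script editor):
// Run this once; it installs an onChange trigger that re-runs the sort
// whenever data in the spreadsheet changes.
function createSortTrigger() {
  ScriptApp.newTrigger("Split_by_tags")
    .forSpreadsheet(SpreadsheetApp.getActiveSpreadsheet())
    .onChange()
    .create();
}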
Reference:
Create a bound Apps Script.
Create trigger.

Elm: How to Encode Nested Objects

Model
type alias Model =
    { name : String
    , poi_coordinates : Coordinates
    }

type alias Coordinates =
    { coord_type : String
    , coordinates : List Float
    }

poiFormEncoder : Model -> Encode.Value
poiFormEncoder model =
    Encode.object
        [ ( "name", Encode.string model.name )
        , ( "type", Encode.string model.poi_coordinates.coord_type )
        , ( "poi_coordinates", Encode.array Encode.float (Array.fromList model.poi_coordinates.coordinates) )
        ]
Can I ask how to encode this data type? I have no idea, and the encoder I wrote does not fill the coordinates. Any help is really appreciated. The JSON file format is below:
[
    {
        "name": "Mcd",
        "coordinates": {
            "type": "Point",
            "coordinates": [
                101.856603,
                2.924
            ]
        }
    },
    ...
]
You can nest calls to Json.Encode.object. Each time you want a new object in the output, you need another one, e.g:
poiFormEncoder : Model -> Encode.Value
poiFormEncoder model =
Encode.object
[ ( "name", Encode.string model.name )
, ( "coordinates"
, Encode.object
[ ( "type", Encode.string model.poi_coordinates.coord_type )
, ( "coordinates", Encode.list Encode.float model.poi_coordinates.coordinates )
]
)
]
This should make sense: it is a list of (key, value) pairs, and the value should be another object.
On a side note, it will depend on your use case, but your Coordinates type looks like a prime candidate for a custom Elm type, e.g:
type Coordinates
= Point { x : Float, y : Float }
| Polar { r : Float, t : Float }
| ...
If you find you are doing a lot of checking the string type value and then dealing with the coordinates accordingly, something like this might be a much nicer structure to use internally. Of course, the best representation will depend on how you are using the type.
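For illustration, a sketch of an encoder for such a custom type (Point and Polar are only the hypothetical constructors sketched above):
coordinatesEncoder : Coordinates -> Encode.Value
coordinatesEncoder coords =
    case coords of
        Point p ->
            Encode.object
                [ ( "type", Encode.string "Point" )
                , ( "coordinates", Encode.list Encode.float [ p.x, p.y ] )
                ]

        Polar p ->
            Encode.object
                [ ( "type", Encode.string "Polar" )
                , ( "coordinates", Encode.list Encode.float [ p.r, p.t ] )
                ]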

Why is AWSDynamoDBScanExpression returning empty when using sort key as filterExpression?

I have the following DynamoDB table:
BoardId | DateTime | Data | Type
-------------------------------------------------
1 | 20180424T123508Z | 68.1 | U
1 | 20181026T143233Z | 38.2 | T
1 | 20190108T120150Z | 38.1 | T
2 | 20180425T092311Z | 63.4 | U
"BoardId" is the partition key and "DateTime" is the sort key.
I want to fetch the entry with "BoardId"="1" and "DateTime" containing "2019".
I used AWSDynamoDBObjectMapper#scan to do this with the following code:
let dbObjMapper = AWSDynamoDBObjectMapper.default()
let scanExpression = AWSDynamoDBScanExpression()
scanExpression.limit = 10
scanExpression.filterExpression = "#id = :id AND contains(#dt, :dt)"
scanExpression.expressionAttributeNames = [
    "#id" : "BoardId",
    "#dt" : "DateTime"
]
scanExpression.expressionAttributeValues = [
    ":id" : "1",
    ":dt" : "2019"
]
dbObjMapper
    .scan(Event.self, expression: scanExpression)
    .continueWith(block: { (task: AWSTask<AWSDynamoDBPaginatedOutput>!) -> Any? in
        if let error = task.error as NSError? {
            print("The request failed. Error: \(error)")
        } else if let paginatedOutput = task.result {
            print("The request was successful.")
            print(paginatedOutput.items.count)
            for event in paginatedOutput.items as! [Event] {
                print(event)
            }
        }
        return ()
    })
But I am getting an empty result. There is no error ("The request was successful."), yet paginatedOutput.items.count prints 0. I expected to get the same result as when doing the same Scan from the DynamoDB web console.
What is wrong with my usage of AWSDynamoDBScanExpression?
I tried using other Scan configurations:
do not set filterExpression => OK, returns up to 10 items
#id = :id => OK, returns up to 10 items with BoardId=1
contains(#dt, :dt) => no error but also returns empty results
set the scanExpression.indexName to some Index where "DateTime" is not the sort key (ex. "BoardId" is the partition key and "Type" is the sort key) => OK, returns the correct item same as the web console
Is it not allowed to use the sort key in a filter expression?
There is no mention of this in the AWS SDK for iOS docs or in the AWS Working with Scan docs (it is even stated there that "With Scan, you can specify any attributes in a filter expression—including partition key and sort key attributes").
I would try removing scanExpression.limit = 10 from your scan parameters.
From https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Scan.html#DDB-Scan-request-Limit: "The maximum number of items to evaluate (not necessarily the number of matching items)."
So what is probably happening is that the scan looks at 10 items, then applies the filter; the filter matches none of them, and you get no results.
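In practice that just means dropping the limit line from the setup in the question. A sketch of the adjusted expression, otherwise unchanged:
let scanExpression = AWSDynamoDBScanExpression()
// No limit set here: with a limit, DynamoDB evaluates at most that many items
// before the filter is applied, which can easily yield zero matches per page.
scanExpression.filterExpression = "#id = :id AND contains(#dt, :dt)"
scanExpression.expressionAttributeNames = ["#id": "BoardId", "#dt": "DateTime"]
scanExpression.expressionAttributeValues = [":id": "1", ":dt": "2019"]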

Aerospike: how to bulk load a list of integers into a bin?

I'm trying to use the Aerospike bulk loader to seed a cluster with data from a tab-separated file.
The source data looks like this:
set key segments
segment 123 10,20,30,40,50
segment 234 40,50,60,70
The third column, 'segments', contains a comma separated list of integers.
I created a JSON template:
{
    "version": "1.0",
    "input_type": "csv",
    "csv_style": { "delimiter": " ", "n_columns_datafile": 3, "ignore_first_line": true },
    "key": { "column_name": "key", "type": "integer" },
    "set": { "column_name": "set", "type": "string" },
    "binlist": [
        {
            "name": "segments",
            "value": { "column_name": "segments", "type": "list" }
        }
    ]
}
... and ran the loader:
java -cp aerospike-load-1.1-jar-with-dependencies.jar com.aerospike.load.AerospikeLoad -c template.json data.tsv
When I query the records in aql, they seem to be a list of strings:
aql> select * from test
+--------------------------------+
| segments |
+--------------------------------+
| ["10", "20", "30", "40", "50"] |
| ["40", "50", "60", "70"] |
+--------------------------------+
The data I'm trying to store is a list of integers. Is there an easy way to convert the objects stored in this bin to a list of integers (possibly a Lua UDF) or perhaps there's a tweak that can be made to the bulk loader template?
Update:
I attempted to solve this by creating a Lua UDF to convert the list from strings to integers:
function convert_segment_list_to_integers(rec)
    for i = 1, table.maxn(rec['segments']) do
        rec['segments'][i] = math.floor(tonumber(rec['segments'][i]))
    end
    aerospike:update(rec)
end
... registered it:
aql> register module 'convert_segment_list_to_integers.lua'
... and then tried executing against my set:
aql> execute convert_segment_list_to_integers.convert_segment_list_to_integers() on test.segment
I enabled some more verbose logging and noticed that the UDF is throwing an error. Apparently, it expects a table and was passed userdata:
Dec 04 2015 23:23:34 GMT: DEBUG (udf): (udf_rw.c:send_result:527) FAILURE when calling convert_segment_list_to_integers convert_segment_list_to_integers ...rospike/usr/udf/lua/convert_segment_list_to_integers.lua:2: bad argument #1 to 'maxn' (table expected, got userdata)
Dec 04 2015 23:23:34 GMT: DEBUG (udf): (udf_rw.c:send_udf_failure:407) Non-special LDT or General UDF Error(...rospike/usr/udf/lua/convert_segment_list_to_integers.lua:2: bad argument #1 to 'maxn' (table expected, got userdata))
It seems that maxn cannot be applied to a userdata object.
Can you see what needs to be done to fix this?
To convert your lists of string values into lists of integer values, you can run the following record UDF:
function convert_segment_list_to_integers(rec)
    local list_with_ints = list()
    for value in list.iterator(rec['segments']) do
        local int_value = math.floor(tonumber(value))
        list.append(list_with_ints, int_value)
    end
    rec['segments'] = list_with_ints
    aerospike:update(rec)
end
When you edit your existing lua module, make sure to re-run register module 'convert_segment_list_to_integers.lua'.
The cause of this issue lies within the aerospike-loader tool: it always assumes/enforces strings, as you can see in the following Java code:
case LIST:
    /*
     * Assumptions
     * 1. Items are separated by a colon ','
     * 2. Item value will be a string
     * 3. List will be in double quotes
     *
     * No support for nested maps or nested lists
     *
     */
    List<String> list = new ArrayList<String>();
    String[] listValues = binRawText.split(Constants.LIST_DELEMITER, -1);
    if (listValues.length > 0) {
        for (String value : listValues) {
            list.add(value.trim());
        }
        bin = Bin.asList(binColumn.getBinNameHeader(), list);
    } else {
        bin = null;
        log.error("Error: Cannot parse to a list: " + binRawText);
    }
    break;
Source on Github: http://git.io/vRAQW
If you prefer, you can modify this code and re-compile it to always assume integer list values. Change lines 266 and 270 to something like this (untested):
List<Integer> list = new ArrayList<Integer>();
list.add(Integer.parseInt(value.trim()));
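Put together, the modified block might look roughly like this (a sketch under the same assumptions as the original loader code, still untested):
List<Integer> list = new ArrayList<Integer>();
String[] listValues = binRawText.split(Constants.LIST_DELEMITER, -1);
if (listValues.length > 0) {
    for (String value : listValues) {
        // parse each item as an integer instead of keeping it as a string
        list.add(Integer.parseInt(value.trim()));
    }
    bin = Bin.asList(binColumn.getBinNameHeader(), list);
} else {
    bin = null;
    log.error("Error: Cannot parse to a list: " + binRawText);
}
break;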

Recursive neo4j query

I have a graph which has categories, sub-categories, and sub-sub-categories to an indefinite level.
How can I get all this hierarchical data in one Cypher query?
I currently have this query :
START category=node:categoryNameIndex(categoryName = "category")
MATCH path = category <- [rel:parentCategory] - subcategory
RETURN category, collect(subcategory);
Which gives me following result:
| category | collect(subcategory) |
==> +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
==> | Node[26]{categoryName:"Test2",categoryDescription:"testDesc",imageUrl:"testUrl",imageName:"imageName"} | [Node[25]{categoryName:"Test1",categoryDescription:"testDesc",imageUrl:"testUrl",imageName:"imageName"}] |
==> | Node[1]{categoryName:"Test1",categoryDescription:"testDesc",imageUrl:"testUrl",imageName:"imageName"} | [Node[26]{categoryName:"Test2",categoryDescription:"testDesc",imageUrl:"testUrl",imageName:"imageName"},Node[2]{categoryName:"Test2",categoryDescription:"testDesc",imageUrl:"testUrl",imageName:"imageName"}] |
I am using node-neo4j.
I will give an example of what I want in JSON format:
[{
    "categoryName": "Test2",
    "categoryDescription": "testDesc",
    "imageUrl": "testUrl",
    "children": [{
        "categoryName": "Test1",
        "categoryDescription": "testDesc",
        "imageUrl": "testUrl",
        "children": [{
            "categoryName": "Test1",
            "categoryDescription": "testDesc",
            "imageUrl": "testUrl"
        }]
    }]
}]
Is this possible? I know I can always do it programmatically or by using multiple queries, but it would be very helpful if it could be done in a single query.
You can match paths of arbitrary depth by adding a * after the relationship type:
START category=node:categoryNameIndex(categoryName = "category")
MATCH path = category <-[rel:parentCategory*]- subcategory
RETURN category, collect(subcategory);
Optionally, you can also specify a minimum and/or maximum path length:
START category=node:categoryNameIndex(categoryName = "category")
MATCH path = category <-[rel:parentCategory*2..5]- subcategory
RETURN category, collect(subcategory);
See reference here:
http://docs.neo4j.org/chunked/milestone/query-match.html#match-variable-length-relationships
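If you need the nesting itself rather than a flat collection, one option is to also return each subcategory's depth and rebuild the tree in node-neo4j. A sketch, still using the legacy START/index syntax from the question:
START category=node:categoryNameIndex(categoryName = "category")
MATCH path = category <-[rel:parentCategory*]- subcategory
RETURN category, subcategory, length(path) AS depth
ORDER BY depth;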
