XQuery: best way to convert node sequence to array - saxon

I am using Saxon, version 9.8.0.6 with this input document:
<simple>
<hello>Hello World!</hello>
<number>42</number>
<keyword>abc</keyword>
<keyword>def</keyword>
<keyword>ghi</keyword>
</simple>
And this query
xquery version "3.1";
fn:serialize(map{
'greeting': data(/simple/hello),
'number': number(/simple/number),
'keywords': array{ for $k in /simple/keyword return data($k) }
}, map{'method':'json', 'indent':true()})
Output is (as expected):
{
"number":42,
"keywords": [
"abc",
"def",
"ghi"
],
"greeting":"Hello World!"
}
Question:
'keywords': array{ for $k in /simple/keyword return data($k) } seems a little clumsy to me. Is this the way to do it? Any suggestions for improving it?

Since data() atomizes every node in the sequence it is given, you could reduce
array{ for $k in /simple/keyword return data($k) }
to
array{data(/simple/keyword)}

Related

Loop through a nested json object in OPA

I am very new to OPA and am trying to loop through the input data below
input = {
"data": {
"list": [
{"Name": "abc", "Status": "Done"},
{"Name": "def", "Status": "Done"},
{"Name": "ghi", "Status": "pending"},
{"Name": "jkl", "Status": ""},
{"Name": "mno", "Status": null}
]
}
}
and return two lists based on that input: one with the names of all objects whose status is 'Done', and another with the names of those whose status is not 'Done'. Here is the Rego code I was trying. I used somewhat Python-like syntax to convey the idea, as I was not sure what the OPA syntax would look like.
package play
default done_list := []
default other_list := []
done_list {
some i
input.data.list[i].Status == "Done"
done_list.append(input.data.list[i].Name) #python syntax and looking for something similar in opa
}
other_list{
some j
input.data.list[j].Status != "Done"
other_list.append(input.data.list[j].Name)
}
#or maybe use a list comprehension like this?
#not even sure if this makes sense but ...
done_list = [x.Name | x := input.data.list[_]; x.Status = "Done"]
other_list = [x.Name | x := input.data.list[_]; x.Status != "Done"]
test_return{
"done_list": done_list,
"other_list": other_list
}
and the output I'm looking for is something like
{
"done_list": ["abc", "def"],
"other_list": ["ghi","jkl","mno"]
}
Any help is greatly appreciated. Thanks in advance.
I think you're best off using a list comprehension, as you guessed :)
package play
doneStatus := "Done"
result := {
"done_list": [e |
item := input.data.list[_]
item.Status == doneStatus
e := item.Name
],
"other_list": [e |
item := input.data.list[_]
item.Status != doneStatus
e := item.Name
],
}
I've created a Rego playground which shows this: https://play.openpolicyagent.org/p/dBKUXFO3v2
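Since the question reached for Python syntax, here is a rough Python sketch of the same partition, just to show the shape of the two comprehensions side by side. The names `items` and `DONE` are made up for this illustration; the data is copied from the question.

```python
# Input copied from the question; JSON null becomes None in Python.
items = [
    {"Name": "abc", "Status": "Done"},
    {"Name": "def", "Status": "Done"},
    {"Name": "ghi", "Status": "pending"},
    {"Name": "jkl", "Status": ""},
    {"Name": "mno", "Status": None},
]

DONE = "Done"  # hypothetical constant, mirrors doneStatus in the Rego answer

# One comprehension per list: iterate, filter on Status, keep the Name.
done_list = [item["Name"] for item in items if item["Status"] == DONE]
other_list = [item["Name"] for item in items if item["Status"] != DONE]

result = {"done_list": done_list, "other_list": other_list}
print(result)
```

Each comprehension has the same three parts as the Rego version: an iteration over the list, a condition on Status, and the value to collect.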

How to pass different values to Pipeline Parameters

Suppose I am doing hyperparameter tuning on one of my models. Say I am using AdaBoostClassifier() and want to try different base_estimator values, so I pass SVC and DecisionTreeClassifier as estimators:
_parameters=[
{
'mdl':[AdaBoostClassifier(random_state=23)],
'mdl__learning_rate':np.linspace(0,1,20),
'mdl__base_estimator':[SVC(),DecisionTreeClassifier()]
}
]
Now I want to pass different values to ccp_alpha of DecisionTreeClassifier, something like this:
'mdl__base_estimator':[LinearRegression(),DecisionTreeClassifier(ccp_alpha=[0.1,0.2,0.3,0.4])]
How can I do that? I tried passing it like this, but it is not working. Here is my entire code:
pipeline=Pipeline(
[
('scal',StandardScaler()),
('mdl','passthrough')
]
)
_parameters=[
{
'mdl':[DecisionTreeClassifier(random_state=42)] ,
'mdl__max_depth':np.linspace(2,30,2),
'mdl__min_samples_split':np.linspace(1,10,1),
'mdl__max_features':np.linspace(1,100,1),
'mdl__ccp_alpha':np.linspace(0,1,10)
}
,{
'mdl':[AdaBoostClassifier(random_state=23)],
'mdl__learning_rate':np.linspace(0,1,20),
'mdl__base_estimator':[SVC(),DecisionTreeClassifier(ccp_alpha=[0.3,0.4,0.5,0.7])]
}
]
grid_search=GridSearchCV(pipeline,_parameters,cv=3,n_jobs=-1,scoring='f1')
grid_search.fit(x,y)
This kind of splitting is why param_grid can be a list of dicts, as in your outer split; but it cannot easily handle the nested disjunction you have. Two approaches come to mind.
More disjoint grids:
_parameters=[
{
'mdl': [DecisionTreeClassifier(random_state=42)],
'mdl__max_depth': np.linspace(2,30,2),
'mdl__min_samples_split': np.linspace(1,10,1),
'mdl__max_features': np.linspace(1,100,1),
'mdl__ccp_alpha': np.linspace(0,1,10),
},
{
'mdl': [AdaBoostClassifier(random_state=23)],
'mdl__learning_rate': np.linspace(0,1,20),
'mdl__base_estimator': [SVC()],
},
{
'mdl': [AdaBoostClassifier(random_state=23)],
'mdl__learning_rate': np.linspace(0,1,20),
'mdl__base_estimator': [DecisionTreeClassifier()],
'mdl__base_estimator__ccp_alpha': [0.3,0.4,0.5,0.7],
},
]
Or list comprehension:
_parameters=[
{
'mdl': [DecisionTreeClassifier(random_state=42)],
'mdl__max_depth': np.linspace(2,30,2),
'mdl__min_samples_split': np.linspace(1,10,1),
'mdl__max_features': np.linspace(1,100,1),
'mdl__ccp_alpha': np.linspace(0,1,10),
},
{
'mdl': [AdaBoostClassifier(random_state=23)],
'mdl__learning_rate': np.linspace(0,1,20),
'mdl__base_estimator': [SVC()] + [DecisionTreeClassifier(ccp_alpha=a) for a in [0.3,0.4,0.5,0.7]],
},
]
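To see that the two approaches cover the same candidates, here is a small pure-Python sketch that mimics the Cartesian-product expansion of a list of grids. It does not call scikit-learn: `expand` is a hypothetical helper, the strings stand in for estimator objects, and the learning-rate list is shortened for illustration.

```python
from itertools import product


def expand(grids):
    """Yield every candidate from a list of dict grids, one grid at a time."""
    for grid in grids:
        keys = sorted(grid)
        # Cartesian product of the value lists within a single grid.
        for values in product(*(grid[key] for key in keys)):
            yield dict(zip(keys, values))


alphas = [0.3, 0.4, 0.5, 0.7]
rates = [0.5, 1.0]  # shortened stand-in for the learning-rate values

# Approach 1: one grid per base estimator, nested parameter spelled out.
disjoint = [
    {"base_estimator": ["SVC"], "learning_rate": rates},
    {
        "base_estimator": ["DecisionTree"],
        "base_estimator__ccp_alpha": alphas,
        "learning_rate": rates,
    },
]

# Approach 2: pre-configured estimators built with a list comprehension.
comprehension = [
    {
        "base_estimator": ["SVC"]
        + [f"DecisionTree(ccp_alpha={a})" for a in alphas],
        "learning_rate": rates,
    },
]

n_disjoint = len(list(expand(disjoint)))
n_comprehension = len(list(expand(comprehension)))
print(n_disjoint, n_comprehension)
```

Both forms enumerate the same number of candidates. In the first, ccp_alpha shows up as its own key in each candidate; in the second, it is folded into the estimator entry.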

How to write a map to a YAML file in Dart

I have a map of key value pairs in Dart. I want to convert it to YAML and write into a file.
I tried using the YAML package from the Dart library, but it only provides methods to load YAML data from a file. Nothing is mentioned about how to write it back to a YAML file.
Here is an example:
void main() {
var map = {
"name": "abc",
"type": "unknown",
"internal":{
"name": "xyz"
}
};
print(map);
}
Expected output:
example.yaml
name: abc
type: unknown
internal:
  name: xyz
How to convert the dart map to YAML and write it to a file?
This is a bit of a late response, but for anyone else looking at this question, I have written this class. It may not be perfect, but it works for what I'm doing and I haven't found anything wrong with it yet. I might make it a package eventually, after writing tests.
class YamlWriter {
/// The amount of spaces for each level.
final int spaces;
/// Initialize the writer with the amount of [spaces] per level.
YamlWriter({
this.spaces = 2,
});
/// Write a dart structure to a YAML string. [yaml] should be a [Map] or [List].
String write(dynamic yaml) {
return _writeInternal(yaml).trim();
}
/// Write a dart structure to a YAML string. [yaml] should be a [Map] or [List].
String _writeInternal(dynamic yaml, { int indent = 0 }) {
String str = '';
if (yaml is List) {
str += _writeList(yaml, indent: indent);
} else if (yaml is Map) {
str += _writeMap(yaml, indent: indent);
} else if (yaml is String) {
str += "\"${yaml.replaceAll("\"", "\\\"")}\"";
} else {
str += yaml.toString();
}
return str;
}
/// Write a list to a YAML string.
/// Pass the list in as [yaml] and indent it to the [indent] level.
String _writeList(List yaml, { int indent = 0 }) {
String str = '\n';
for (var item in yaml) {
str += "${_indent(indent)}- ${_writeInternal(item, indent: indent + 1)}\n";
}
return str;
}
/// Write a map to a YAML string.
/// Pass the map in as [yaml] and indent it to the [indent] level.
String _writeMap(Map yaml, { int indent = 0 }) {
String str = '\n';
for (var key in yaml.keys) {
var value = yaml[key];
str += "${_indent(indent)}${key.toString()}: ${_writeInternal(value, indent: indent + 1)}\n";
}
return str;
}
/// Create an indented string for the level with the spaces config.
/// [indent] is the level of indent whereas [spaces] is the
/// amount of spaces that the string should be indented by.
String _indent(int indent) {
return ''.padLeft(indent * spaces, ' ');
}
}
Usage:
final writer = YamlWriter();
String yaml = writer.write({
'string': 'Foo',
'int': 1,
'double': 3.14,
'boolean': true,
'list': [
'Item One',
'Item Two',
true,
'Item Four',
],
'map': {
'foo': 'bar',
'list': ['Foo', 'Bar'],
},
});
File file = File('/path/to/file.yaml');
file.createSync();
file.writeAsStringSync(yaml);
Output:
string: "Foo"
int: 1
double: 3.14
boolean: true
list:
  - "Item One"
  - "Item Two"
  - true
  - "Item Four"
map:
  foo: "bar"
  list:
    - "Foo"
    - "Bar"
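For comparison, the same recursive idea can be sketched in a few lines of Python. This is only an illustration under some assumptions: `to_yaml`, `to_yaml_lines` and `_is_block` are made-up names, keys are assumed to be plain strings that need no quoting, and scalars are emitted with `json.dumps`, which works for ordinary strings, numbers, booleans and null because JSON scalars are valid YAML.

```python
import json


def _is_block(value):
    """Non-empty dicts and lists are written as indented blocks."""
    return isinstance(value, (dict, list)) and len(value) > 0


def to_yaml_lines(value, level=0, spaces=2):
    """Return the YAML lines for value, indented to the given level."""
    pad = " " * (level * spaces)
    if not _is_block(value):
        # Scalars and empty containers: reuse the JSON literal.
        return [pad + json.dumps(value)]
    lines = []
    if isinstance(value, dict):
        for key, item in value.items():
            if _is_block(item):
                lines.append(f"{pad}{key}:")
                lines.extend(to_yaml_lines(item, level + 1, spaces))
            else:
                lines.append(f"{pad}{key}: {json.dumps(item)}")
    else:
        for item in value:
            if _is_block(item):
                lines.append(f"{pad}-")
                lines.extend(to_yaml_lines(item, level + 1, spaces))
            else:
                lines.append(f"{pad}- {json.dumps(item)}")
    return lines


def to_yaml(value, spaces=2):
    """Serialize a dict or list to a YAML string ending in a newline."""
    return "\n".join(to_yaml_lines(value, 0, spaces)) + "\n"


# The map from the question.
config = {"name": "abc", "type": "unknown", "internal": {"name": "xyz"}}
print(to_yaml(config))
```

Like the Dart class above, this quotes every string, which is more conservative than the unquoted output in the question but still parses to the same values.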
package:yaml does not have YAML writing features. You may have to look for another package that does that – or write your own.
As a stopgap, remember that JSON is valid YAML, so you can always write out JSON to a .yaml file and it should work with any YAML parser.
I ran into the same issue and ended up hacking together a simple writer:
// Save the updated configuration settings to the config file
void saveConfig() {
var file = _configFile;
// truncate existing configuration
file.writeAsStringSync('');
// Write out new YAML document from JSON map
final config = configToJson();
config.forEach((key, value) {
if (value is Map) {
file.writeAsStringSync('\n$key:\n', mode: FileMode.writeOnlyAppend);
value.forEach((subkey, subvalue) {
file.writeAsStringSync(' $subkey: $subvalue\n',
mode: FileMode.writeOnlyAppend);
});
} else {
file.writeAsStringSync('$key: $value\n',
mode: FileMode.writeOnlyAppend);
}
});
}

swift - how to parse this JSON object

// Get the #1 app name from iTunes and SwiftyJSON
DataManager.getTopAppsDataFromItunesWithSuccess { (iTunesData) -> Void in
let json = JSON(data: iTunesData)
println(json)
How do I access all the elements of ["venues"]["pubcity"]["venue"]?
{
"venues":{
"cityuser":"Beirut",
"venue-usernewplace":{
"star":[
],
"idcat":[
],
"namecat":[
],
"name":[
],
"id":[
],
"phone":[
],
"address":[
],
"crossStreet":[
],
"lat":[
],
"lng":[
],
"cc":[
]
},
"placesofpeople":{
"star":"false",
"nameplace":"B0 18",
"idplace":"4b52670df964a520847b27e3",
"count":"4",
"cc":"LB",
"phone":"01580018",
"crossStreet":"Main Highway",
"lat":"33.898404713314",
"lng":"35.534128372291",
"address":"Karantina"
},
"pubcity":{
"venue":[
{
"id":"4fe75b17e4b032d653ce50fd",
"idcat":"4bf58dd8d48988d11e941735",
"name":"Cl\u00e9 Cafe-Lounge Bar",
"phone":"71200712",
"address":"Mohammed Abdel Baki Street, Clemenceau",
"crossStreet":"Hamra, Facing Najjar Hospital",
"lat":"33.897185328966",
"lng":"35.487202808518",
"cc":"LB",
"count":"0",
"namecat":"Cocktail Bar",
"star":"false"
},
{
"id":"4e3e7533fa76455375c56a33",
"idcat":"4bf58dd8d48988d11f941735",
"name":"Skybar",
"phone":"03939191",
"address":"Biel",
"crossStreet":"Downtown Beirut",
"lat":"33.90610643966",
"lng":"35.510663636771",
"cc":"LB",
"count":"0",
"namecat":"Nightclub",
"star":"false"
},
{
"id":"4b52670df964a520847b27e3",
"idcat":"4bf58dd8d48988d11f941735",
"name":"B 018",
"phone":"01580018",
"address":"Karantina",
"crossStreet":"Main Highway",
"lat":"33.898404713314",
"lng":"35.534128372291",
"cc":"LB",
"count":"0",
"namecat":"Nightclub",
"star":"false"
},
There are two methods I know of.
1. Paste that wall of text into a JSON formatter to make the blob more readable. You can then inspect which keys you can pull out of it.
2. Check the documentation.
Using SwiftyJSON:
if let venues = json["venues"]["pubcity"]["venue"].array {
// venues is an array of dictionaries, one per venue.
for venue in venues {
// Just printing the name, but the whole dictionary for each venue is available here.
println(venue["name"].string!)
}
}
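The same key path can be sketched with Python's standard json module, which makes the nesting easy to check outside of Xcode. The payload below is a trimmed copy of the sample from the question (only id and name kept), and the variable names are made up for this illustration.

```python
import json

# Trimmed copy of the sample payload; a raw string so the JSON parser
# (not Python) decodes the \u00e9 escape.
raw = r'''
{"venues": {"cityuser": "Beirut", "pubcity": {"venue": [
  {"id": "4fe75b17e4b032d653ce50fd", "name": "Cl\u00e9 Cafe-Lounge Bar"},
  {"id": "4e3e7533fa76455375c56a33", "name": "Skybar"},
  {"id": "4b52670df964a520847b27e3", "name": "B 018"}
]}}}
'''

doc = json.loads(raw)

# Walk venues -> pubcity -> venue, falling back to an empty list
# if any level is missing.
venues = doc.get("venues", {}).get("pubcity", {}).get("venue", [])

names = [venue["name"] for venue in venues]
for name in names:
    print(name)
```

The fallback defaults play the same role as the `if let ... .array` check in the Swift snippet: the loop simply does nothing when the path is absent.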

Group and sum collection in Groovy

I have a collection of objects that I want to group by month and name and sum total:
def things = [
[id:1, name:"fred", total:10, date: "2012-01-01"],
[id:2, name:"fred", total:10, date: "2012-01-03"],
[id:3, name:"jane", total:10, date: "2012-01-04"],
[id:4, name:"fred", total:10, date: "2012-02-11"],
[id:5, name:"jane", total:10, date: "2012-01-01"],
[id:6, name:"ted", total:10, date: "2012-03-21"],
[id:7, name:"ted", total:10, date: "2012-02-09"]
];
I would like the output to be:
[
"fred":[[total:20, month:"January"],[total:10, month:"February"]],
"jane":[[total:20,month:"January"]],
"ted" :[[total:10, month:"February"],[total:10, month:"March"]]
]
or something along those lines. What is the best way to accomplish this using groovy/grails?
The following lines
things.inject([:].withDefault { [:].withDefault { 0 } } ) {
map, v -> map[v.name][Date.parse('yyyy-MM-dd', v.date).format('MMMM')] += v.total; map
}
will give you this result:
[fred:[January:20, February:10], jane:[January:20], ted:[March:10, February:10]]
(works with Groovy >= 1.8.7 and 2.0)
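The nested withDefault trick has a close analogue in Python's defaultdict, which may help to see what the one-liner is doing. This is only a sketch: the variable names are made up, and the month names come from strftime("%B"), which assumes the default (English) locale.

```python
from collections import defaultdict
from datetime import datetime

things = [
    {"id": 1, "name": "fred", "total": 10, "date": "2012-01-01"},
    {"id": 2, "name": "fred", "total": 10, "date": "2012-01-03"},
    {"id": 3, "name": "jane", "total": 10, "date": "2012-01-04"},
    {"id": 4, "name": "fred", "total": 10, "date": "2012-02-11"},
    {"id": 5, "name": "jane", "total": 10, "date": "2012-01-01"},
    {"id": 6, "name": "ted", "total": 10, "date": "2012-03-21"},
    {"id": 7, "name": "ted", "total": 10, "date": "2012-02-09"},
]

# Outer map defaults to an inner map, inner map defaults to 0,
# so the += below never hits a missing key.
totals = defaultdict(lambda: defaultdict(int))
for thing in things:
    month = datetime.strptime(thing["date"], "%Y-%m-%d").strftime("%B")
    totals[thing["name"]][month] += thing["total"]

# Convert to plain dicts for printing.
result = {name: dict(months) for name, months in totals.items()}
print(result)
```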
I ended up with
things.collect {
// get the map down to name, total and month
it.subMap( ['name', 'total' ] ) << [ month: Date.parse( 'yyyy-MM-dd', it.date ).format( 'MMMM' ) ]
// Then group by name first and month second
}.groupBy( { it.name }, { it.month } ).collectEntries { k, v ->
// Then for the names, collect
[ (k):v.collectEntries { k2, v2 ->
// For each month, the sum of the totals
[ (k2): v2.total.sum() ]
} ]
}
To get the same result as Andre's much shorter, much better answer ;-)
Edit
A bit shorter, but still not as good...
things.groupBy( { it.name }, { Date.parse( 'yyyy-MM-dd', it.date ).format( 'MMMM' ) } ).collectEntries { k, v ->
[ (k):v.collectEntries { k2, v2 ->
[ (k2): v2.total.sum() ]
} ]
}
Here's a solution to do the same thing as the other solutions, but in parallel using GPars. There may be a tighter solution, but this one does work with the test input.
@Grab(group='org.codehaus.gpars', module='gpars', version='1.0.0')
import static groovyx.gpars.GParsPool.*
//def things = [...]
withPool {
def mapInner = { entrylist ->
withPool{
entrylist.getParallel()
.map{[Date.parse('yyyy-MM-dd', it.date).format('MMMM'), it.total]}
.combine(0) {acc, v -> acc + v}
}
}
//for dealing with bug when only 1 list item
def collectSingle = { entrylist ->
def first = entrylist[0]
return [(Date.parse('yyyy-MM-dd', first.date).format('MMMM')) : first.total]
}
def result = things.parallel
.groupBy{it.name}.getParallel()
.map{ [(it.key) : (it.value?.size())>1?mapInner.call(it.value):collectSingle.call(it.value) ] }
.reduce([:]) {a, b -> a + b}
println "result = $result"
}

Resources