How to get a sub-field of a struct type map, in the search response of YQL query in Vespa? - yql

Sample Data:
"fields": {
"key1":0,
"key2":"no",
"Lang": {
"en": {
"firstName": "Vikrant",
"lastName":"Thakur"
},
"ch": {
"firstName": "维克兰特",
"lastName":"塔库尔"
}
}
}
Expected Response:
"fields": {
"Lang": {
"en": {
"firstName": "Vikrant",
"lastName":"Thakur"
}
}
}
I have added the following in my search-definition demo.sd:
struct lang {
field firstName type string {}
field lastName type string {}
}
field Lang type map <string, lang> {
indexing: summary
struct-field key {
indexing: summary | index | attribute
}
}
I want to write a yql query something like this (This doesn't work):
http://localhost:8080/search/?yql=select Lang.en from sources demo where key2 contains 'no';
My temporary workaround approach
I have implemented a custom searcher in MySearcher.java, through which I am able to extract the required sub-field and set a new field 'defaultLang', and remove the 'Lang' field. The response generated by the searcher:
"fields": {
"defaultLang": {
"firstName": "Vikrant",
"lastName":"Thakur"
}
}
I have written the following in MySearcher.java:
for (Hit hit: result.hits()) {
String language = "en"; //temporarily hard-coded
StructuredData Lang = (StructuredData) hit.getField("Lang");
Inspector o = Lang.inspect();
for (int j=0;j<o.entryCount();j++){
if (o.entry(j).field("key").asString("").equals(language)){
SlimeAdapter value = (SlimeAdapter) o.entry(j).field("value");
hit.setField("defaultLang",value);
break;
}
}
hit.removeField("Lang");
}
Edit-1: A more efficient way instead is to make use of the Inspectable interface and Inspector, like above (Thanks to #Jo Kristian Bergum)
But, in the above code, I am having to loop through all the languages to filter out the required one. I want to avoid this O(n) time-complexity and make use of the map structure to access it in O(1). (Because the languages may increase to 1000, and this would be done for each hit.)
All this is due to the StructuredData data type I am getting in the results. StructureData doesn't keep the Map Structure and rather gives an array of JSON like:
[{
"key": "en",
"value": {
"firstName": "Vikrant",
"lastName": "Thakur"
}
}, {
"key": "ch",
"value": {
"firstName": "维克兰特",
"lastName": "塔库尔"
}
}]
Please, suggest a better approach altogether, or any help with my current one. Both are appreciated.

The YQL sample query I guess is to illustrate what you want as that syntax is not valid. Picking a given key from the field Lang of type map can be done as you do in your searcher but deserializing into JSON and parsing the JSON is probably inefficient as StructuredData implements the Inspectable interface and you can inspect it directly without the need to go through JSON format. See https://docs.vespa.ai/documentation/reference/inspecting-structured-data.html

Related

Can't place GraphQL custom type as a Postman variable

Has anyone had luck with placing a GraphQL custom type argument as a Postman or Graphql variable? I'm kinda spinning in circles right now, I hope a fresh pair of eyes could point me in the right direction.
What I'm trying to do is to send a mutation request using Postman. The problem I'm having is that the method I'm calling is taking a custom type as an argument. Placing the content of that variable as GraphQL variable or Postman variable is giving me a headache. I can't embedd pictures yet, so here are the links (they are safe).
Schema
This custom type is a JSON-like structure, consisting of two enums and a set of primitive types (strings, ints...). I can screenshot the entire thing but basically that's it: two enums followed by strings, ints...
Custom type definition
What I've tried so far:
Simply hardcoding the request in Postman works but I wish to send multiple requests with varying data
Placing it in a GraphQL variable results in error message
{
"errors": [
{
"message": "Bad request - invalid request body.",
"locations": []
}
],
"data": null
}
Placing the custom type content as a Postman environment variable works, but I'm getting a syntax error (although the request passes...).
Request body is below. Hardcoding it and using a Postman variable produces the same request body, apart from the syntax error.
query: "mutation {
createApplication(request: {
applicationKind: NEW_ISSUANCE,
documentKind: REGULAR_PASSPORT,
personalData: {
timestamp: null,
firstname: "NAME",
lastname: "LASTNAME",
middlename: "MIDDLENAME",
dateOfBirth: "2011-09-28",
citizenshipCountryCode: "USA",
gender: MALE,
personalNumber: "3344",
placeOfBirth: "CHICAGO",
municipalityOfBirth: "SOUTH",
countryCodeOfBirth: "USA"},
addressData:{
street: "WEST",
municipality: "EAST",
place: "CHICAGO",
country: {
code: "USA",
name: null
},
entrance: "б",
flat: "13",
number: "35"}
})
{
__typename
... on AsyncTaskStatus {
taskID
state
payload {
... on ApplicationUpdated {
applicationID
applicationNumber
__typename
}
__typename
}
__typename
}
... on Error {
...errorData
__typename
}
}
}
fragment errorData on Error {
__typename
code
message
}"
Postman variable with a squiggly line
I'm spinning in circles right now. Has anyone had any luck with Postman requests of this kind?
I can post more info, screenshots...just let me know. I'll be watching this topic closely and provide feedback.
Thank you for your time.
please add a the variable in variable section as :
{
"request": {{request}}
}
and then refer this in your query as
$request

Zapier - add data to JSON response (App development)

We are creating a Zapier app to expose our APIs to the public, so anyone can use it. The main endpoint that people are using returns a very large and complex JSON object. Zapier, it looks like, has a really difficult time parsing nested complex JSON. But it does wonderful with a very simple response object such as
{ "field": "value" }
Our data that is being returned has this structure and we want to move some of the fields to the root of the response so it's easily parsed by Zapier.
"networkSections": [
{
"identifier": "Deductible",
"label": "Deductible",
"inNetworkParameters": [
{
"key": "Annual",
"value": " 600.00",
"message": null,
"otherInfo": null
},
{
"key": "Remaining",
"value": " 600.00",
"message": null,
"otherInfo": null
}
],
"outNetworkParameters": null
},
So, can we do something to return for example the remaining deductible?
I got this far (adding outputFields) but this returns an array of values. I'm not sure how to parse through this array either in the Zap or in the App.
{key: 'networkSections[]inNetworkParameters[]key', label: 'xNetworkSectionsKey',type: 'string'},
ie this returns an array of "Annual", "Remaining", etc
Great question. In this case, there's a lot going on, and outputFields can't quite handle it all. :(
In your example, inNetworkParameters contains an array of objects. Throughout our documentation, we refer to these as line items. These lines items can be passed to other actions, but the different expected structures presents a bit of a problem. The way we've handled this is by letting users map line-items from one step's output to another step's input per field. So if step 1 returns
{
"some_array": [
{
"some_key": "some_value"
}
]
}
and the next step needs to send
{
"data": [
{
"some_other_key": "some_value"
}
]
}
users can accomplish that by mapping some_array.some_key to data.some_other_key.
All of that being said, if you want to always return a Remaining Deductible object, you'll have to do it by modifying the result object itself. As long as this data is always in that same order, you can do something akin to
var data = z.JSON.parse(bundle.response.content);
data["Remaining Deductible"] = data.networkSections[0].inNetworkParameters[1].value;
return data;
If the order differs, you'll have to implement some sort of search to find the objects you'd like to return.
I hope that all helps!
Caleb got me where I wanted to go. For completeness this is the solution.
In the creates directory I have a js file for the actual call. The perform part is below.
perform: (z, bundle) => {
const promise = z.request({
url: 'https://api.example.com/API/Example/' + bundle.inputData.elgRequestID,
method: 'GET',
headers: {
'content-type': 'application/json',
}
});
return promise.then(function(result) {
var data = JSON.parse(result.content);
for (var i=0; i<data.networkSections.length; i++) {
for (var j=0; j<data.networkSections[i].inNetworkParameters.length; j++) {
// DEDUCT
if (data.networkSections[i].identifier == "Deductible" &&
data.networkSections[i].inNetworkParameters[j].key == "Annual")
data["zAnnual Deductible"] = data.networkSections[i].inNetworkParameters[j].value;
} // inner for
} // outer for
return data;
});

Extracting value of a node in Java using contains with JsonPath in RestAssured

I have to extract value of book title using JsonPath in RestAssured in Java from following json response
{
"spec": {
"groups": [
{
"name": "book",
"title": "classic-books:1.0.2"
},.......
]
}
}
I am looking to use contains to get the book with a specific title.Please help.
Assume you have response with JSON in it:
response.body().jsonPath().get("spec.groups[i].title");

Elastic Search - Performing Geo Location Query on Bulk imported JSON

I have a file modified.json containing JSON documents which contain the following:
{"index":{"_index": "trial", "_type": "trial", "_id":"1"}}
{"clinics" : [{"price": 1048, "city": "Bangalore", "Location": {"lon": 77.38381692924742, "lat": 12.952155989068519}}, {"price": 1048, "city": "Bangalore", "Location": {"lon": 77.38381692924742, "lat": 12.952155989068519}}]
.... And more similar documents. I am running a bulk insert through this command:
curl -s -XPOST 'http://localhost:9200/_bulk' --data-binary #modified.json
Next, through the Sense (Google Chrome) plugin, I am issuing the following request
Server: localhost:9200/trial/trial
PUT _search
{
"mappings": {
"clinics": {
"properties": {
"Location": {"type": "geo_point"}
}
}
}
}
I don't know whether or not this is creating a mapping for the Location fields.
After issuing a search request, like this:
Server: localhost:9200/trial/trial
GET _search
{
"query": {
"filtered" : {
"query" : {
"match_all" : {}
},
"filter" : {
"geo_distance" : {
"distance" : "20km",
"Location" : {
"lat" : 12.958487287824958,
"lon" : 77.69648881178146
}
}
}
}
}
}
I am getting an error saying:
failed to find geo_point field [Location]]; }]",
Please help me regarding this. Also, if possible also guide me on how to do faceted search on this data, to show clinics within a range of distance: like between 20 and 30 kilometers
You need to reverse your ordering. PUT your mapping mapping first and then do the bulk insert.
By putting the map before the insert you are letting ES know what the datatype are so that it can index them correctly. ES allows you to add mappings to an existing index but the data that's already there isn't going to get re-indexed automatically. (This is a known issue).

How to make elasticsearch add the timestamp field to every document in all indices?

Elasticsearch experts,
I have been unable to find a simple way to just tell ElasticSearch to insert the _timestamp field for all the documents that are added in all the indices (and all document types).
I see an example for specific types:
http://www.elasticsearch.org/guide/reference/mapping/timestamp-field/
and also see an example for all indices for a specific type (using _all):
http://www.elasticsearch.org/guide/reference/api/admin-indices-put-mapping/
but I am unable to find any documentation on adding it by default for all documents that get added irrespective of the index and type.
Elasticsearch used to support automatically adding timestamps to documents being indexed, but deprecated this feature in 2.0.0
From the version 5.5 documentation:
The _timestamp and _ttl fields were deprecated and are now removed. As a replacement for _timestamp, you should populate a regular date field with the current timestamp on application side.
You can do this by providing it when creating your index.
$curl -XPOST localhost:9200/test -d '{
"settings" : {
"number_of_shards" : 1
},
"mappings" : {
"_default_":{
"_timestamp" : {
"enabled" : true,
"store" : true
}
}
}
}'
That will then automatically create a _timestamp for all stuff that you put in the index.
Then after indexing something when requesting the _timestamp field it will be returned.
Adding another way to get indexing timestamp. Hope this may help someone.
Ingest pipeline can be used to add timestamp when document is indexed. Here, is a sample example:
PUT _ingest/pipeline/indexed_at
{
"description": "Adds indexed_at timestamp to documents",
"processors": [
{
"set": {
"field": "_source.indexed_at",
"value": "{{_ingest.timestamp}}"
}
}
]
}
Earlier, elastic search was using named-pipelines because of which 'pipeline' param needs to be specified in the elastic search endpoint which is used to write/index documents. (Ref: link) This was bit troublesome as you would need to make changes in endpoints on application side.
With Elastic search version >= 6.5, you can now specify a default pipeline for an index using index.default_pipeline settings. (Refer link for details)
Here is the to set default pipeline:
PUT ms-test/_settings
{
"index.default_pipeline": "indexed_at"
}
I haven't tried out yet, as didn't upgraded to ES 6.5, but above command should work.
You can make use of default index pipelines, leverage the script processor, and thus emulate the auto_now_add functionality you may know from Django and DEFAULT GETDATE() from SQL.
The process of adding a default yyyy-MM-dd HH:mm:ss date goes like this:
1. Create the pipeline and specify which indices it'll be allowed to run on:
PUT _ingest/pipeline/auto_now_add
{
"description": "Assigns the current date if not yet present and if the index name is whitelisted",
"processors": [
{
"script": {
"source": """
// skip if not whitelisted
if (![ "myindex",
"logs-index",
"..."
].contains(ctx['_index'])) { return; }
// don't overwrite if present
if (ctx['created_at'] != null) { return; }
ctx['created_at'] = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(new Date());
"""
}
}
]
}
Side note: the ingest processor's Painless script context is documented here.
2. Update the default_pipeline setting in all of your indices:
PUT _all/_settings
{
"index": {
"default_pipeline": "auto_now_add"
}
}
Side note: you can restrict the target indices using the multi-target syntax:
PUT myindex,logs-2021-*/_settings?allow_no_indices=true
{
"index": {
"default_pipeline": "auto_now_add"
}
}
3. Ingest a document to one of the configured indices:
PUT myindex/_doc/1
{
"abc": "def"
}
4. Verify that the date string has been added:
GET myindex/_search
An example for ElasticSearch 6.6.2 in Python 3:
from elasticsearch import Elasticsearch
es = Elasticsearch(hosts=["localhost"])
timestamp_pipeline_setting = {
"description": "insert timestamp field for all documents",
"processors": [
{
"set": {
"field": "ingest_timestamp",
"value": "{{_ingest.timestamp}}"
}
}
]
}
es.ingest.put_pipeline("timestamp_pipeline", timestamp_pipeline_setting)
conf = {
"settings": {
"number_of_shards": 2,
"number_of_replicas": 1,
"default_pipeline": "timestamp_pipeline"
},
"mappings": {
"articles":{
"dynamic": "false",
"_source" : {"enabled" : "true" },
"properties": {
"title": {
"type": "text",
},
"content": {
"type": "text",
},
}
}
}
}
response = es.indices.create(
index="articles_index",
body=conf,
ignore=400 # ignore 400 already exists code
)
print ('\nresponse:', response)
doc = {
'title': 'automatically adding a timestamp to documents',
'content': 'prior to version 5 of Elasticsearch, documents had a metadata field called _timestamp. When enabled, this _timestamp was automatically added to every document. It would tell you the exact time a document had been indexed.',
}
res = es.index(index="articles_index", doc_type="articles", id=100001, body=doc)
print(res)
res = es.get(index="articles_index", doc_type="articles", id=100001)
print(res)
About ES 7.x, the example should work after removing the doc_type related parameters as it's not supported any more.
first create index and properties of the index , such as field and datatype and then insert the data using the rest API.
below is the way to create index with the field properties.execute the following in kibana console
`PUT /vfq-jenkins
{
"mappings": {
"properties": {
"BUILD_NUMBER": { "type" : "double"},
"BUILD_ID" : { "type" : "double" },
"JOB_NAME" : { "type" : "text" },
"JOB_STATUS" : { "type" : "keyword" },
"time" : { "type" : "date" }
}}}`
the next step is to insert the data into that index:
curl -u elastic:changeme -X POST http://elasticsearch:9200/vfq-jenkins/_doc/?pretty
-H Content-Type: application/json -d '{
"BUILD_NUMBER":"83","BUILD_ID":"83","JOB_NAME":"OMS_LOG_ANA","JOB_STATUS":"SUCCESS" ,
"time" : "2019-09-08'T'12:39:00" }'

Resources