How come KSQLDB push query outputs chunked json? - ksqldb

I'm new to KsqlDB so I might be missing something obvious. My question is related to the chunked JSON output of a never-ending push-query not being valid JSON. Let me elaborate.
In short, my setup is as follows. From a TypeScript/Node process I first created a ksqlDB stream:
CREATE STREAM events (id VARCHAR, timestamp VARCHAR, location VARCHAR, events ARRAY<VARCHAR>) WITH (kafka_topic='mytopic', value_format='json', partitions=1);
The push query itself is created as a long-running REST stream (using axios):
const response = await axios.post(
  `http://ksqldb-server:8088/query-stream`,
  {
    sql: `SELECT * FROM events EMIT CHANGES;`,
    streamsProperties: {}
  },
  {
    headers: {
      'Content-Type': 'application/vnd.ksql.v1+json',
      Accept: 'application/vnd.ksql.v1+json',
    },
    responseType: 'stream',
  }
);
This works. When run, I first get the header row:
[{"header":{"queryId":"transient_EVENTS_2815830975103425962","schema":"`ID` STRING, `TIMESTAMP` STRING, `LOCATION` STRING, `EVENTS` ARRAY<STRING>"}}
Followed by new rows coming in one-by-one based on real-world events:
{"row":{"columns":["b82baad7-a87e-4617-b18a-1782b4cb49ce","2022-05-16 08:03:03","Home",["EventA","EventD"]]}},\n
Now, if this query ever completed, the concatenated output would probably end up as valid JSON (although the header row is missing a , at the end). Since it's a push query, however, it never completes, and as such I will never receive the closing ] - which means the output will never be valid JSON. Also, I'm looking to process events in real time; otherwise I could have written a pull query instead.
My expectation was that each new row would be parseable by itself using JSON.parse(). Instead, I've ended up having to JSON.parse(data.slice(0, -2)) to get rid of the trailing ,\n. However, it does not feel right to put this into production.
What is the rationale behind outputting chunked JSON on push queries? It seems an illogical format for any use case.
And is there a way to alter the output of ksql events to what I would expect? Maybe some header or attribute I'm missing?
Thanks for your insights!

You explicitly set application/vnd.ksql.v1+json as your desired response format in the headers:
headers: {
  'Content-Type': 'application/vnd.ksql.v1+json',
  Accept: 'application/vnd.ksql.v1+json',
},
application/vnd.ksql.v1+json means that the complete response will be one valid JSON document.
As you pointed out, this is impractical for a push query that never completes. You should remove the headers or set them explicitly to the default, application/vnd.ksqlapi.delimited.v1, which means that every returned row is valid JSON on its own.
See https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-rest-api/streaming-endpoint/#executing-pull-or-push-queries for more details.
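With the delimited content type, each line of the response is a complete JSON value, so a plain line-by-line parse is enough. A minimal sketch (in Python for brevity; the sample lines below only approximate ksqlDB's delimited output, where the first line is a header object and each later line is a row array):

```python
import json

# Hypothetical sample of what the delimited endpoint emits:
# one complete JSON value per line.
chunks = [
    '{"queryId":"q1","columnNames":["ID","LOCATION"],"columnTypes":["STRING","STRING"]}',
    '["b82baad7","Home"]',
    '["c93cbbe8","Work"]',
]

rows = []
for line in chunks:
    value = json.loads(line)  # each line parses on its own
    if isinstance(value, list):  # row arrays; the dict is the header
        rows.append(value)
```

The same idea applies in Node: buffer the stream, split on newlines, and JSON.parse each complete line.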

Related

Lua scripting language: modify a response body in API gateway

I would like to modify the response body returned by the backend.
As background I'll detail my specific problem (but I don't require a solution to the specific problem, just the method for manipulating a response body). I want to insert/add a key value pair to the response body based on the status code of the response and I want to transform snake_case keys into camelCase keys.
For example, given a response with
status code: 401
body: {'detail_message': 'user is not logged in'}
I want to transform it to a response with
status code: 401
body: {'success': False, 'detailMessage': 'user is not logged in'}
The rule for success would be True for anything below 400 and False for anything at or above 400.
Lua scripting can be used in my API gateway, which is KrakenD:
https://www.krakend.io/docs/endpoints/lua/
The documentation only includes examples of printing the response body and modifying headers, not of modifying the response body.
I have no experience with Lua and only need it for this one task. I haven't been able to find online examples of response body manipulation that I could play with.
What methods do I need in order to add a key-value pair to a response body and to manipulate the keys in the response body?
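Independently of KrakenD's Lua API, the transformation itself is simple to express. A sketch of the key-renaming and success-flag logic (written in Python just to illustrate the rule; the function names are made up, and the same logic would need to be ported to KrakenD's Lua hooks):

```python
import re

def snake_to_camel(key):
    # 'detail_message' -> 'detailMessage'
    return re.sub(r'_([a-z])', lambda m: m.group(1).upper(), key)

def transform_body(status_code, body):
    # Rename all keys, then add the success flag derived from the status code.
    out = {snake_to_camel(k): v for k, v in body.items()}
    out["success"] = status_code < 400
    return out

print(transform_body(401, {"detail_message": "user is not logged in"}))
# {'detailMessage': 'user is not logged in', 'success': False}
```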

ODATA javascript client libraries (what is their value over simple Fetch or AJAX)?

Currently we are using client-side javascript fetch to connect to our ODATA V4 ERP server:
const BaseURL = 'https://pwsepicorapp.com/ERP10.2/api/v1/Erp.BO.JobEntrySvc/'
const fetchJobNum = (async () => {
  let url = BaseURL + 'GetNextJobNum'
  const reply = await fetch(url, {
    method: 'POST',
    mode: 'cors',
    headers: {
      'Accept': 'application/json',
      'Authorization': 'Basic xxxx',
      'x-api-key': '0HXJZgldKZjKIXNgIycD4c4DPqSrzn2UFCPHbiR1aY7IW',
      'Access-Control-Allow-Origin': '*',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({})
  })
  let rsp = await reply.json()
  let job = rsp.parameters.opNextJobNum
  return job
})
And this works fine for us. We recently started looking at javascript ODATA libraries (Apache OLINGO, O.js, JayData (or other ones suggested at: https://www.odata.org/libraries/)
But what I don't see is an objective guide for a developer to understand why to use these libraries and what they provide.
I.e., I think they read the metadata for the particular OData service. Fine, but what power does that add?
Perhaps my mental block is that we are only:
searching only JSON data
Not doing any nested queries (only simple $filter, $select)
Just doing simple GET, POST, PATCH
Or perhaps these libraries were needed for functionality that was missing before ODATA V4
Can anyone give a succinct description of the features for these libraries and their UNIQUE VALUE PROPOSITIONS (to borrow a Venture Capital Term) to developers? I bet others would find this useful.
Short Answer
You are right. If all you are doing is simple operations, you don't need any of these libraries, as at the end of the day they are just REST calls that follow some specific conventions (i.e. the OData specification).
Long Answer
The reason we have all these client-side APIs is that OData offers/defines a lot more stuff.
Let's try to go through it with an example. The example I am using is Batch Requests in OData. In the simplest of terms, OData defines a way to club multiple HTTP requests into one. It has a well-defined syntax for it, which looks something like this:
POST /service/$batch HTTP/1.1
Host: host
OData-Version: 4.0
Content-Type: multipart/mixed; boundary=batch_36522ad7-fc75-4b56-8c71-56071383e77b
Content-Length: ###
--batch_36522ad7-fc75-4b56-8c71-56071383e77b
Content-Type: application/http
GET /service/Customers('ALFKI')
Host: host
--batch_36522ad7-fc75-4b56-8c71-56071383e77b
Content-Type: application/http
GET /service/Products HTTP/1.1
Host: host
--batch_36522ad7-fc75-4b56-8c71-56071383e77b--
Now there are quite a few things here.
You have to start each part of the batch request with --batch_<unique identifier> and separate individual HTTP requests with that batch boundary; when you are done, you end the body with --batch_<unique identifier>--.
You set the batch identifier in the Content-Type header and send the additional headers (like Content-Type and Content-Length) properly, as you can see in the example above.
Now, coming back to your original question: sure, you can use a lot of string concatenation in your JavaScript code to generate the right payload, make an ajax call, and then parse back a similar kind of response. But as an application developer, all you care about is batching your GET, POST, PUT and DELETE requests and performing the operation you desire.
Now, if you use a client library, the code should look something like this (the example is generic and might differ from library to library):
OData.request({
    requestUri: "http://ODataServer/Myservice.svc/$batch",
    method: "POST",
    data: { __batchRequests: [
      { requestUri: "Customers('ALFKI')", method: "GET" },
      { requestUri: "Products", method: "GET" }
    ]}
  },
  function (data, response) {
    // success handler
  }, undefined, OData.batchHandler);
So, in purely business-proposition terms, libraries like these can save you quite a few man-hours (depending on your application's size) that would otherwise be consumed generating the right payload strings or URL strings (in the case of filters, navigation properties, etc.) and debugging through the code when you miss a bracket or misspell a header name. That time can instead be spent building the core logic of the application/product, while the library does the standardized, repetitive and boring (opinionated thought) work for you.
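To make the string-concatenation point concrete, here is a rough sketch of assembling such a multipart batch body by hand (the boundary value and helper name are illustrative, not from any particular library):

```python
def build_batch(requests, boundary="batch_36522ad7"):
    # Assemble an OData-style multipart batch body by hand.
    # Each (method, path) pair becomes one part between boundary markers.
    parts = []
    for method, path in requests:
        parts.append(f"--{boundary}\r\n"
                     "Content-Type: application/http\r\n\r\n"
                     f"{method} {path} HTTP/1.1\r\n"
                     "Host: host\r\n")
    parts.append(f"--{boundary}--\r\n")  # closing boundary ends the batch
    return "\r\n".join(parts)

body = build_batch([("GET", "/service/Customers('ALFKI')"),
                    ("GET", "/service/Products")])
```

Every bracket, CRLF and boundary here is a chance for a typo; that bookkeeping is exactly what a client library takes off your hands.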

Graph call to change planner task returning 204 (and not making the change)

I'm using Python to make calls to the Graph API regarding Planner and tasks. Whenever I use PATCH to try and update a task, I get a 204 response back and the task remains unchanged. According to Microsoft's documentation here, this request should always return either a 200 or a 400-level error.
I have tried changing the data that I send to the server, to change the title rather than the dates; however, I get the same 204 response no matter what data I send or which field I attempt to change. I have no problem making other Graph calls, like updating files in OneDrive or getting data about a user.
def SetDates(task):
    '''Update planner to match the start date and due date of the passed in task'''
    tid = task["id"]
    start = task["startDateTime"]
    end = task["dueDateTime"]
    newDates = {"dueDateTime": end, "startDateTime": start}
    etag = task["#odata.etag"]
    session.headers.update({'If-Match': etag})
    response = session.patch(f"https://graph.microsoft.com/v1.0/planner/tasks/{tid}", data = newDates)
    session.headers.pop('If-Match')
    print(task["title"] + " Has been scheduled")
Based on the documentation I expect this to return a status code of 200, and for the response to contain the data of the task that was updated, and for the change to actually be applied to the task.
By default, PATCH requests return an empty response with a 204 status code. To get the updated data back, you should send the "Prefer" HTTP header with the value "return=representation":
PATCH https://graph.microsoft.com/v1.0/planner/tasks/{task-id}
Content-type: application/json
Content-length: 247
If-Match: W/"JzEtVGFzayAgQEBAQEBAQEBAQEBAQEBAWCc="
Prefer: return=representation
I have finally figured this out.
@Tarken Sevilmis mentioned that in order to get a 200 response from a PATCH request you need to add
Prefer: return=representation
to your request. In my case, the reason my changes weren't being applied was that I hadn't set the content type in the header. The Graph API didn't give an error, but this seems to have been the cause of the issue. Once I set the content type to application/json, it gave a proper error indicating that the values I gave in the body weren't being read correctly, and I realized that I had forgotten to serialize them to JSON.
Once you set the content headers appropriately and make sure to convert your data to proper JSON, everything should work as intended.
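Putting the fixes together (JSON-encode the body, set Content-Type, add Prefer), a sketch of how the corrected request pieces might be prepared; build_patch is a made-up helper, and the etag key is assumed to be @odata.etag as Graph returns it:

```python
import json

def build_patch(task):
    # Prepare the URL, headers and JSON body for the corrected PATCH call.
    url = f"https://graph.microsoft.com/v1.0/planner/tasks/{task['id']}"
    headers = {
        "Content-Type": "application/json",   # without this, the body is not read as JSON
        "If-Match": task["@odata.etag"],      # concurrency check required by Planner
        "Prefer": "return=representation",    # ask for 200 + updated task instead of 204
    }
    body = json.dumps({
        "startDateTime": task["startDateTime"],
        "dueDateTime": task["dueDateTime"],
    })
    return url, headers, body
```

With these pieces you would then call e.g. session.patch(url, headers=headers, data=body), or equivalently pass json=newDates to requests and let it set the content type for you.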

What is the expected JSON Block for a desire2learn POST Org Parent?

I believe I have tried all combinations to POST to this route and continue to get a 404. What am I doing wrong?
I want to set the parent of a courseOffering to a different, but existing, courseTemplate. The courseOffering orgUnitId is 31273; the new parent (courseTemplate) orgUnitId is 31286. The route I use is:
POST .../d2l/api/lp/1.2/orgstructure/31273/parents/
(also tried without the trailing /)
The JSON Block is:
{"OrgUnitId":31286}
I have also tried Id and Identifier instead of OrgUnitId, a string, "31286", instead of an int, and orgUnitId (lower case) - all result in 404.
FWIW, a GET with the same route works just fine.
cwt
From the Valence docs:
POST /d2l/api/lp/(version)/orgstructure/(orgUnitId)/parents/
Give the provided org unit a new parent org unit.
Parameters:
version (D2LVERSION) – API version.
orgUnitId (D2LID) – Org unit ID.
JSON Parameters:
OrgUnitId (D2LID as single JSON number) – Org unit to add as a parent.
The docs say that the expected input data for this POST is an org unit ID value provided as "a single JSON number".
When you say you're passing:
{"OrgUnitId": 31286}
what you're providing is a JSON object (composite) with a single property, named "OrgUnitId" that has a JSON number value. In fact, what you should provide is:
31286
But you should still indicate that this is a JSON body, so you should set the HTTP request Content-Type header to application/json. Here's an excerpt from an interactive session successfully making this call; we dump the various components of the HTTP response/request object (r is the response, r.request is the original request that prompted the response):
In [40]: r.request.method
Out[40]: 'POST'
In [41]: r.request.url
Out[41]: 'https://{host}/d2l/api/lp/1.1/orgstructure/8083/parents/?x_b={user_id}&x_c={app_sig}&x_a={app_id}&x_d={user_sig}&x_t=1391183251'
In [42]: r.request.headers
Out[42]: CaseInsensitiveDict({'Content-Length': '4', 'User-Agent': 'python-requests/2.2.1 CPython/3.3.3 Darwin/12.5.0', 'Content-Type': 'application/json', 'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate, compress'})
In [43]: r.request.body
Out[43]: '6984'
In [44]: r.status_code
Out[44]: 200
Course offerings and course templates have a special relationship, in that every course offering must have a course template as a parent. You might be able to first add the new course template as a parent to the course offering and then delete the old one, or the back-end service might simply not let you sever a course template from a course offering at all.
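The key point above is that a bare number is itself a complete JSON document. A quick sketch of the difference between what was sent and what the route expects:

```python
import json

org_unit_id = 31286

# What the route expects: a single JSON number as the entire body,
# still sent with Content-Type: application/json.
body = json.dumps(org_unit_id)           # "31286"
headers = {"Content-Type": "application/json"}

# What was being sent instead: a JSON object wrapping that number.
wrong_body = json.dumps({"OrgUnitId": org_unit_id})  # '{"OrgUnitId": 31286}'
```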

HTTP Response changed?

I have an HttpWebRequest that does a POST to a web server and gets an HTML page in response.
I've been asked what the best practice is for knowing whether the response has changed.
I can't rely on the web server's headers; they aren't guaranteed to be present.
This would improve performance in that I wouldn't need to parse the response all over again and could move on to the next request within half a second or so.
Thank you in advance.
You can ask your web server for the last modification date; see here. If you can't rely on that, you have to parse your response anyway. You could do that quickly using MD5: hash your current response and compare the digest with the previous one.
You shouldn't be attempting to rely on the headers for a POST request anyway, as it shouldn't emit any caching headers.
What you need to do instead is compute a hash/checksum (either CRC32 for raw performance or a "real" hash such as MD5) over the returned content (meaning everything below the \r\n\r\n that terminates the headers) and do the comparison that way.
It should be good enough to store the last request's checksum/hash and compare against that.
For example (pseudo):
int lastChecksum = 0;

bool hasChanged() {
    performWebRequest();
    string content = stripHeaders();
    int checksum = crc32string(content);
    if (checksum != lastChecksum) {
        lastChecksum = checksum;
        return true;
    }
    return false;
}
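The pseudocode above can be sketched as runnable Python using MD5 from the standard library (module-level state here just mirrors the pseudocode's single stored checksum):

```python
import hashlib

_last_digest = None  # digest of the previous response body

def has_changed(body: bytes) -> bool:
    # Hash the response body and compare with the previous request's digest.
    global _last_digest
    digest = hashlib.md5(body).hexdigest()
    changed = digest != _last_digest
    _last_digest = digest
    return changed
```

The first call always reports a change (there is no previous digest yet); identical bodies afterwards compare equal without re-parsing the HTML.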
