Error submitting a cloud dataflow job - google-cloud-dataflow

For the past few days I have been unable to submit my Dataflow jobs; they fail with the error below.
I tried submitting the simple WordCount job and it succeeded. A heavily simplified version of my own job also submits fine. But when I add more code (a GroupByKey transform), submission fails.
Does anybody have any idea what this error means?
Thanks,
G
Exception in thread "main" java.lang.RuntimeException: Failed to create a workflow job: Invalid JSON payload received. Unknown token.
{ 8r W
^
at com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner.run(DataflowPipelineRunner.java:219)
at com.google.cloud.dataflow.sdk.runners.BlockingDataflowPipelineRunner.run(BlockingDataflowPipelineRunner.java:96)
at com.google.cloud.dataflow.sdk.runners.BlockingDataflowPipelineRunner.run(BlockingDataflowPipelineRunner.java:47)
at com.google.cloud.dataflow.sdk.Pipeline.run(Pipeline.java:145)
at snippet.WordCount.main(WordCount.java:165)
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
{
"code" : 400,
"errors" : [ {
"domain" : "global",
"message" : "Invalid JSON payload received. Unknown token.\n\u001F \b\u0000\u0000\u0000\u0000\u0000\u0000\u0000 \t{ 8r\u0000 W\n^",
"reason" : "badRequest"
} ],
"message" : "Invalid JSON payload received. Unknown token.\n\u001F \b\u0000\u0000\u0000\u0000\u0000\u0000\u0000 \t{ 8r\u0000 W\n^",
"status" : "INVALID_ARGUMENT"
}

To debug this issue, we want to validate that the request that is being made is valid and find the invalid portion of the JSON payload. To do this we will:
1. Increase logging verbosity
2. Re-run the application and capture the logs
3. Find the relevant section within the logs representing the JSON payload
4. Validate the JSON payload
Increasing logging verbosity
Adding the following lines to your main method, before you construct your pipeline, tells the Java logger implementation to increase the verbosity for the "com.google.api" package. This in turn logs the HTTP requests and responses exchanged with Google APIs.
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class MyDataflowProgram {
    public static void main(String[] args) {
        // Print every log record, regardless of level, to the console.
        ConsoleHandler consoleHandler = new ConsoleHandler();
        consoleHandler.setLevel(Level.ALL);

        // Raise verbosity for the Google API client package only.
        Logger googleApiLogger = Logger.getLogger("com.google.api");
        googleApiLogger.setLevel(Level.ALL);
        // Avoid duplicate output from the root logger's handlers.
        googleApiLogger.setUseParentHandlers(false);
        googleApiLogger.addHandler(consoleHandler);

        // ... Pipeline Construction ...
    }
}
Re-run the application and capture the logs
You will want to re-run your Dataflow application and capture the logs. How you do this depends on your development environment, that is, on the OS and/or IDE you use. For example, in Eclipse the logs appear in the Console window. Saving these logs will help you maintain a record of the issue.
Find the relevant section within the logs representing the JSON payload
During re-execution of your Dataflow job, you will want to find the logs related to submission of the Dataflow job. These logs will contain the HTTP request followed by a response and will look like the following:
POST https://dataflow.googleapis.com/v1b3/projects/$GCP_PROJECT_NAME/jobs
Accept-Encoding: gzip
... Additional HTTP headers ...
... JSON request payload for creation ...
{"environment":{"clusterManagerApiService":"compute.googleapis.com","dataset":"bigquery.googleapis.com/cloud_dataflow","sdkPipelineOptions": ...
-------------- RESPONSE --------------
HTTP/1.1 200 OK
... Additional HTTP headers ...
... JSON response payload ...
You are interested in the request payload as the error you are getting indicates that it is the source of the problem.
Validate the JSON payload
There are many JSON validators available, but I prefer http://jsonlint.com/ because of its simplicity. If you are able, please share your findings by updating the question; if you get stuck, feel free to send me a private message.
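One thing worth checking before pasting the captured payload into a validator: the escaped bytes at the start of the rejected payload in the error above (\u001F, then \b, then a run of \u0000) resemble the header of a gzip stream, which begins with the bytes 0x1F 0x8B followed by 0x08. This is only an observation from the error text, not a confirmed diagnosis. The sketch below, with hypothetical class and method names, checks whether captured payload bytes look gzip-compressed and decompresses them so that the underlying text can be validated.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of the Dataflow SDK.
final class PayloadInspector {

    // A gzip stream starts with the two magic bytes 0x1F 0x8B.
    static boolean looksGzipped(byte[] payload) {
        return payload.length >= 2
                && (payload[0] & 0xFF) == 0x1F
                && (payload[1] & 0xFF) == 0x8B;
    }

    // Returns the payload as UTF-8 text, decompressing it first if it looks gzipped.
    // Uses InputStream.readAllBytes, which requires Java 9 or later.
    static String toText(byte[] payload) {
        if (!looksGzipped(payload)) {
            return new String(payload, StandardCharsets.UTF_8);
        }
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(payload))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Compresses text with gzip; only used here to build demo input.
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] captured = gzip("{\"name\":\"demo\"}");
        System.out.println("gzipped? " + looksGzipped(captured));
        System.out.println("text:    " + toText(captured));
    }
}
```

If the captured request body turns out to be compressed, the decompressed text is what should go into the JSON validator.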

Related

AWS Lambda Serverless endpoint exits without executing function

We have a POST endpoint in our serverless API which listens for a Magento 2 integration activation callback and processes the payload. The Content-Type of this callback request is application/x-www-form-urlencoded. However, when the callback arrives, the Lambda function finishes execution immediately, skipping the entire function body. All we see in the CloudWatch logs is the output below; not even console.log output is printed. (The endpoint only prints a string to the console and performs no async operations, yet the problem persists.)
2020-12-12T12:24:47.012+05:30 START RequestId: 4afba03d-54ef-4b5e-bd44-157b0b7a9f9b Version: $LATEST
2020-12-12T12:24:47.050+05:30 END RequestId: 4afba03d-54ef-4b5e-bd44-157b0b7a9f9b
2020-12-12T12:24:47.050+05:30 REPORT RequestId: 4afba03d-54ef-4b5e-bd44-157b0b7a9f9b Duration: 37.83 ms Billed Duration: 38 ms Memory Size: 128 MB Max Memory Used: 109 MB Init Duration: 893.79 ms
When we hit the same endpoint from Postman with Content-Type: application/json, the endpoint works as expected.
We therefore suspected the Content-Type header and read somewhere that adding a request mapping template would solve the problem. So we added a mapping template for content type application/x-www-form-urlencoded in the integration request of the Lambda method, trying each of the following contents in turn. Unfortunately, none of them solved the problem.
"{ "body": "$util.base64Decode($input.body)" }"
{
"formparams" : $input.json('$')
}
{
"body" : $input.json('$')
}
My question is: how can we make the endpoint print the POST request payload instead of exiting immediately?
We have been searching for a solution to this problem for a week. Any suggestions would be a great help. Thanks in advance.
Since the Content-Type of the Magento 2 Integration activation callback is application/x-www-form-urlencoded, the lambda event for that POST request was something like this.
console.log(event) -> {body: "a=var&b=other_var&c=another_var"}
The endpoint didn't print anything because I had written console.log(JSON.parse(event.body)). Since the body is not JSON, this throws a JSON parse error and the endpoint finishes execution immediately.
Once I parsed the event body as a URL-encoded query string instead of calling JSON.parse(), the problem was solved.
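To make the difference concrete, here is a minimal sketch of parsing a body such as "a=var&b=other_var&c=another_var" as URL-encoded form data rather than as JSON. It is written in Java purely for illustration, and the class and method names are hypothetical; in a Node.js handler like the one in the question, the built-in URLSearchParams class does the same job.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only.
final class FormBodyParser {

    // Splits "k1=v1&k2=v2" into decoded key/value pairs, keeping the original order.
    // If a key repeats, the last value wins. Requires Java 10+ for decode(String, Charset).
    static Map<String, String> parse(String body) {
        Map<String, String> params = new LinkedHashMap<>();
        if (body == null || body.isEmpty()) {
            return params;
        }
        for (String pair : body.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String rawKey = eq < 0 ? pair : pair.substring(0, eq);
            String rawValue = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(
                    URLDecoder.decode(rawKey, StandardCharsets.UTF_8),
                    URLDecoder.decode(rawValue, StandardCharsets.UTF_8));
        }
        return params;
    }

    public static void main(String[] args) {
        System.out.println(parse("a=var&b=other_var&c=another_var"));
    }
}
```

Wrapping the parse step in error handling that logs the raw body would also have surfaced the original failure instead of letting the function exit silently.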

how to add a status message on a grails respond

I have created a method that is called after every uncaught exception and responds with a gson view:
void handleError(){
respond([status: 500, view: "/customErr"], [
code : 500,
message : "whatever internal error",
])
}
This works fine, but my main problem is that my client (another server acting as a client) receives an HTTP response with a 500 status whose status message is null. I've checked the respond docs and I don't see a message property or anything similar.
This is what my client receives:
wslite.rest.RESTClientException: 500 null
That null is the response's status message, which Grails does not set.
How can I add a detail message to my respond call? Ideally something like this:
respond([status: 500, statusMessage: "my custom message", view: "/customErr"], [
code : 500,
message : "whatever internal error",
])
Grails uses the Servlet API's HttpServletResponse to build the response.
Now check the Javadoc for the HttpServletResponse class.
There are only two methods that define a status message:
void setStatus(int sc, String msg) Deprecated. As of version 2.1, due to ambiguous meaning of the message parameter. To set a status code use setStatus(int), to send an error with a description use sendError(int, String).
void sendError(int sc, String msg) Sends an error response to the client using the specified status and clears the buffer. The server defaults to creating the response to look like an HTML-formatted server error page containing the specified message, setting the content type to "text/html".
The first one is deprecated. The second one sends the status message but does not send the body; it is mainly used for fatal errors.
So, according to the servlet documentation, there is officially no way to send both a message and a body in the response.
The question is tricky, because according to Apache Tomcat the "custom status message" feature will be removed starting from version 9: https://tomcat.apache.org/tomcat-8.5-doc/config/systemprops.html#Other
But according to RFC2616 sec 6.1.1 : The reason phrases listed here are only recommendations -- they MAY be replaced by local equivalents without affecting the protocol.
This doesn't directly answer your question, and may not solve your issue depending on how much control you have over the application that receives the error, but you can add error messages to the header of the response which should pass through to the client (if it knows to check for them).
String errorMessage = "Whatever Internal Server Error"
response.addHeader("custom-error", errorMessage)
response.sendError(500, errorMessage)
That'll send the message in the header and as the response message... I do not know if this works with respond() or how that interacts with the response object in the controller... but if you're responding to render a view as the error page, you should be able to add the message to the model included in the respond() call and show that on the error page (at least, I'd think so).

Mark successful siesta response as error

I'm working with a really strange (and nasty) API that I have no control over, and unfortunately when an invalid request is made, instead of responding with a 4xx status, it responds with a 200 status instead.
With this response, it also changes the response body from the usual XML response to plain text, but does not change the content type header. You can imagine how annoying this is!
I've got Siesta working with the API, despite the fact that it is not actually RESTful in the slightest, but I'm unsure how to get the next part working: handling the unsuccessful requests.
How do I go about transforming a technically valid and successful 200 response, into an error response? Right now I have the following setup:
configure("/endpoint") {
$0.mutateRequests { req in
... perform some mutation to request ...
}
$0.pipeline[.parsing].add(self.XMLTransformer)
}
configureTransformer("/endpoint") {
($0.content as APIResponse)
.data()
.map(Resource.init)
}
This is working just fine when the response actually is XML, however in the scenario where the response is an error, I receive the following:
bad api request: invalid api key
or something similar to this. The XMLParser class already handles this and marks itself as having come across an error. However, I don't know how to make Siesta realise that there is an error, so that it does not call my transformer and instead marks the request as failed, letting me handle the error elsewhere.
How can I achieve what I'm after?
configureTransformer is just a common-case shortcut for the full-featured (but more verbose) arbitrary transformers Siesta’s pipeline supports. Full transformers can arbitrarily convert any response to any other, including success → failure and failure → success. The user guide discusses this a bit.
You can see this in action in the example project, which has a custom transformer that does something very similar to what you want, turning a 404 failure into a success with the content false. It is configured here and defined here. That example does a failure → success transformation, but you should find the code adaptable for your success → failure purposes.

YouTube API v3.0 CommentThreads.list processing failure issue

When I send a CommentThreads.list request to the YouTube API, I get the following exception, but not for all videos:
Google.GoogleApiException: Google.Apis.Requests.RequestError
The API server failed to successfully process the request.
While this can be a transient error, it usually indicates that the requests input is invalid. Check the structure of the commentThread resource in the request body to ensure that it is valid. [400].
For those videos I double-checked the inputs sent with the request, and when I make the request with the same data directly from the YouTube API request trial section, everything works.
I want to know why this request is valid for some video IDs but invalid for others.
I'd appreciate any help.
Here is the full written log: System.AggregateException: One or more errors occurred. ---> Google.GoogleApiException: Google.Apis.Requests.RequestError
The API server failed to successfully process the request. While this can be a transient error, it usually indicates that the requests input is invalid. Check the structure of the commentThread resource in the request body to ensure that it is valid. [400]
Errors [
Message[The API server failed to successfully process the request. While this can be a transient error, it usually indicates that the requests input is invalid. Check the structure of the commentThread resource in the request body to ensure that it is valid.] Location[body - other] Reason[processingFailure] Domain[youtube.commentThread]
]
at Microsoft.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at Microsoft.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccess(Task task)
at Google.Apis.Requests.ClientServiceRequest`1.d__0.MoveNext() in c:\ApiaryDotnet\default\Src\GoogleApis\Apis\Requests\ClientServiceRequest.cs:line 0
--- End of inner exception stack trace ---
The CommentThreads.list API doesn't need a request body, as indicated in the API reference:
Request body
Do not provide a request body when calling this method.
Passing a body may have caused the RequestError on your call. Try removing any objects passed when calling this API; hopefully that fixes the issue.

Getting HTTP transport error while executing google dataflow job

I am getting a constant error while executing a Dataflow job:
BigQuery import job "dataflow_job_838656419" failed., : BigQuery creation of import job for table "TestTable" in dataset "TestDataSet" in project "TestProject" failed., : BigQuery execution failed., : HTTP transport error: Message: Invalid value for: String is not a valid value HTTP Code: 400
It does not give any specific reason why the Google Dataflow job keeps failing.
How can I find out what mistake I am making when executing the Google Dataflow job?
The issue is the incorrect use of the BigQuery API, which is case-sensitive with respect to field type. Please specify "STRING" as the field type in the schema that you're providing.
Please see https://cloud.google.com/bigquery/docs/reference/rest/v2/tables for more details.
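Because the type name must match exactly, a quick check of the schema before submitting the job can catch this class of mistake early. The sketch below is a hypothetical, standalone helper (it does not use the BigQuery or Dataflow client libraries), and its list of type names is illustrative rather than exhaustive; consult the reference linked above for the authoritative set.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical pre-flight check; not part of any Google client library.
final class SchemaTypeCheck {

    // Illustrative subset of BigQuery field type names (assumption: not exhaustive).
    static final Set<String> KNOWN_TYPES = Set.of(
            "STRING", "BYTES", "INTEGER", "FLOAT", "BOOLEAN",
            "TIMESTAMP", "DATE", "TIME", "DATETIME", "RECORD");

    // Returns one message per field whose type is not an exact, case-sensitive match.
    // Assumes field types are non-null.
    static List<String> findBadTypes(Map<String, String> fieldTypes) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, String> field : fieldTypes.entrySet()) {
            String type = field.getValue();
            if (KNOWN_TYPES.contains(type)) {
                continue;
            }
            String upper = type.toUpperCase(Locale.ROOT);
            if (KNOWN_TYPES.contains(upper)) {
                problems.add(field.getKey() + ": \"" + type + "\" should be \"" + upper + "\"");
            } else {
                problems.add(field.getKey() + ": unrecognized type \"" + type + "\"");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, String> schema = new LinkedHashMap<>();
        schema.put("name", "String");   // wrong case, as in the error above
        schema.put("age", "INTEGER");   // exact match
        findBadTypes(schema).forEach(System.out::println);
    }
}
```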
