How to fix "java.lang.RuntimeException: Failed to create job" in a Dataflow template job that writes to BigQuery? - google-cloud-dataflow

I'm trying to use the JDBC to BigQuery Dataflow template to copy data from a Postgres database to BigQuery, but the Dataflow job fails with the error below:
java.lang.RuntimeException: Failed to create job with prefix beam_bq_job_LOAD_jdbctobigquerydataflow0releaser1025092115d7a229e9_214eff91b59f4b8d863809d3865504fa_11cbacad09f05e44363d2dd2963e9fd1_00001_00000, reached max retries: 3, last failed job: null.
at org.apache.beam.sdk.io.gcp.bigquery.BigQueryHelpers$PendingJob.runJob(BigQueryHelpers.java:200)
at org.apache.beam.sdk.io.gcp.bigquery.BigQueryHelpers$PendingJobManager.waitForDone(BigQueryHelpers.java:153)
at org.apache.beam.sdk.io.gcp.bigquery.WriteTables$WriteTablesDoFn.finishBundle(WriteTables.java:378)
I saw other Stack Overflow posts and have already tried the following:
Ensured that the Dataflow worker service account is granted the BigQuery Admin and BigQuery User roles for the dataset I'm writing to.
Confirmed the database is not too large - I'm only copying <10 rows since I'm just testing it out.
Ensured that the schema in the Postgres DB matches the BigQuery table's schema.
None of the above worked. Is there anything else I can try? Thanks!

Is the project where the dataset is located different from the one where Dataflow is running? If they are different, you will need to assign the BigQuery User role in the Dataflow project as well, because the BigQuery load job is initiated in that project.
(The BigQuery Admin role on the destination dataset stays as is.)

Related

Display info (date) about completed Sidekiq job

I'm using Rails 6 in my app with Sidekiq on board. I've got a FetchAllProductsWorker like the one below:
module Imports
  class FetchAllProductsWorker
    include Sidekiq::Worker

    sidekiq_options queue: 'imports_fetch_all', retry: 0

    def perform
      (...)
    end
  end
end
I want to check when FetchAllProductsWorker last finished successfully and display this info in my front-end. This job will be fired sporadically, but the user must have feedback about when the last database sync (which FetchAllProductsWorker is responsible for) succeeded.
I want to have this info only for this one worker. I saw a lot of useful things in the Sidekiq API docs, but none of them relate to the history of completed jobs.
You could use the Sidekiq Batches API, which provides an on_success callback, but that is mostly meant for tracking batch work and is overkill for your problem. I suggest adding your own code at the end of the perform method.
def perform
  (...) # Run the already implemented code
  # If it was successful, notify/log here.
end
The simplified default lifecycle of a Sidekiq job looks like this:
– If there is an error, the job will be retried a couple of times (read about Retries in the Sidekiq docs). During that time you can see the failing job and the error in the Sidekiq Web UI, if it is configured.
– If the job finishes successfully, it is removed from Redis and there is no information about this specific job available to the application.
That means Sidekiq does not really support querying jobs that ran successfully in the past. If you need information about past jobs, you have to build it yourself. I basically see three options for monitoring Sidekiq jobs:
1. Write useful information to your application's log. Most logging tools support monitoring for specific messages, sending notifications, or creating views for specific events. This might be enough if you just need the information for debugging reasons.
def perform
  Rails.logger.info("#{self.class.name} started")
  begin
    # job code
  rescue => exception
    Rails.logger.error("#{self.class.name} failed: #{exception.message}")
    raise # re-raise the exception to trigger Sidekiq's default retry behavior
  else
    Rails.logger.info("#{self.class.name} finished successfully")
  end
end
2. If you are mostly interested in being informed when there is a problem, I suggest looking at a tool like Dead Man's Snitch. The way those tools work is that you ping their API as the last step of a job, which is only reached when there was no error. You then configure the tool to notify you if its API hasn't been pinged in the expected timeframe. For example, if you have a daily import job, Dead Man's Snitch would send you a message only if there wasn't a successful import job in the last 24 hours; if the job was successful, it will not spam you every single day.
require 'open-uri'

def perform
  # job code
  URI.open("https://nosnch.in/#{TOKEN}") # only reached when the job code raised no error
end
3. If you want your application's users to see job statuses on a dashboard in the application, then it makes sense to store that information in the database. You could, for example, create a JobStatus ActiveRecord model with columns like job_name, status, payload, and created_at, and then create records in that table whenever it feels useful. Once the data is in the database, you can present it to the user like any other model's data.
def perform
  begin
    # job code
  rescue => exception
    JobStatus.create(job_name: self.class.name, status: 'failed', payload: exception.to_json)
    raise # re-raise the exception to trigger Sidekiq's default retry behavior
  else
    JobStatus.create(job_name: self.class.name, status: 'success')
  end
end
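A minimal migration sketch for such a table might look like this (the column names follow the description above; the jsonb type assumes PostgreSQL, so adjust as needed):
class CreateJobStatuses < ActiveRecord::Migration[6.0]
  def change
    create_table :job_statuses do |t|
      t.string   :job_name, null: false
      t.string   :status,   null: false
      t.jsonb    :payload                  # use :text if you are not on PostgreSQL
      t.datetime :created_at, null: false  # populated automatically by ActiveRecord
    end

    add_index :job_statuses, [:job_name, :created_at]
  end
end
The last successful sync for the front-end is then just JobStatus.where(job_name: 'Imports::FetchAllProductsWorker', status: 'success').maximum(:created_at).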
And, of course, you can combine all those techniques and tools for different use cases: 1. for history and statistics, 2. for admins and people on call, 3. for users of your application.

Unable to connect to Neo4j from c# driver Session to fabric database

Using the Neo4j.Driver (4.1.0), I am unable to connect a session to the server's configured fabric database. It works fine in the Neo4j Browser. Is there a trick to setting the context to a fabric database?
This times out:
var session = driver.AsyncSession(o => o.WithDatabase("fabric"));
Actual database names work fine.
Does the C# driver not support setting the session context to a fabric database?
I'm trying to execute something like the following:
use fabric.graph(0)
match ...
set...
I found a workaround by co-opting a sub-query as follows, but it seems that setting the session context would make more sense.
use fabric
call {
  use fabric.graph(0)
  match ...
  set ...
  return 0
}
return 0
I've not yet worked with fabric. But I have worked with clusters. You can only add nodes/edges to the one Neo4j database that has a WRITE role. To do this you need a small function to query the routing table and determine the write database. Here's the key query:
CALL dbms.cluster.routing.getRoutingTable({}) YIELD ttl, servers
UNWIND servers AS server
WITH server WHERE server.role = 'WRITE'
RETURN server.addresses
You then address your write query to that specific database.

How to test PG::QueryCanceled (due to timeout) error in a Rails app?

On the production server I got this error:
ActiveRecord::StatementInvalid: PG::QueryCanceled: ERROR: canceling statement due to statement timeout <SQL query here>
In this line:
Contact.where(id: contact_ids_to_delete).delete_all
The SQL query was a DELETE command with a huge list of ids, and it timed out.
I came up with a solution, which is to delete the Contacts in batches:
Contact.where(id: contact_ids_to_delete).in_batches.delete_all
The question is, how do I test my solution? Or what is the common way to test it? Or is there any gem that would make testing it convenient?
I see two possible ways to test it:
1. (Dynamically) set the timeout in the test database to a small number of seconds and write a test in which I generate a lot of Contacts and then try to run my code to delete them.
This seems to be the right way to do it, but it could potentially slow down test execution, and setting the timeout dynamically (which would be the ideal way to do it) could be tricky (a rough sketch of this approach appears below).
2. Test that the deletions happen in batches.
This could be tricky as well, because I would have to monitor the queries.
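For reference, a rough sketch of option 1 (assuming PostgreSQL, RSpec, and a FactoryBot :contact factory; the 50ms timeout and the record counts are arbitrary and would need tuning):
require 'rails_helper'

RSpec.describe 'deleting contacts in batches' do
  it 'does not hit the statement timeout' do
    contact_ids_to_delete = create_list(:contact, 1_000).map(&:id)

    # Lower the timeout for this connection only and reset it afterwards.
    ActiveRecord::Base.connection.execute("SET statement_timeout = '50ms'")
    begin
      expect {
        Contact.where(id: contact_ids_to_delete).in_batches(of: 100).delete_all
      }.not_to raise_error
    ensure
      ActiveRecord::Base.connection.execute("RESET statement_timeout")
    end
  end
end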
This is not an edge case that I would test for because it requires building and running a query that exceeds your database's built-in timeouts; the minimum runtime for this single test would be at least that time.
Even then, you may write a test for this that passes 100% of the time in your test environment but fails 100% of the time in production because of differences between the two environments that you can never fully replicate; for one, your test database is being used by a single concurrent user while your production database will have multiple concurrent users, different available resources, and different active locks. This isn't the type of issue that you write a test for because the test won't ensure it doesn't happen in production. Best practices will do that.
I recommend following the Rails best practice of using the find_in_batches or find_each methods, with the expectation that the database server can successfully act on batches of 1000 records at a time. Note that find_in_batches yields plain arrays, so you need to build a relation before calling delete_all:
Contact.where(id: contact_ids_to_delete).find_in_batches do |contacts|
  Contact.where(id: contacts.map(&:id)).delete_all
end
Or, more concisely, with in_batches, which yields relations that respond to delete_all directly:
Contact.where(id: contact_ids_to_delete).in_batches(&:delete_all)
You can tweak the batch size with batch_size if you're paranoid about your production database server not being able to act on 1000 records at a time:
Contact.where(id: contact_ids_to_delete).find_in_batches(batch_size: 500) { |contacts| Contact.where(id: contacts.map(&:id)).delete_all }

ActiveJob fails deserializing an object

To test that my mails are being sent, I'm running heroku run rails c -a my_app. Then I enqueue the job and it is enqueued fine. However, when I go to Redis and look at the queued jobs, the job is not there. Instead, it is in the "retry" set.
This is what I see:
{"retry":true,"queue":"default","class":"ActiveJob::QueueAdapters::SidekiqAdapter::JobWrapper","args":[{"job_class":"SendMailJob","job_id":"4b4ba46f-94d7-45cd-b923-ec1678c73076","queue_name":"default","arguments":["any_help",{"_aj_globalid":"gid://gemfeedapi/User/546641393834330002000000"}]}],"jid":"f89235d7ab19f605ed0461a1","enqueued_at":1424175756.9351726,"error_message":"Error while trying to deserialize arguments: \nProblem:\n Document(s) not found for class User with id(s) 546641393834330002000000.\nSummary:\n When calling User.find with an id or array of ids, each parameter must match a document in the database or this error will be raised. The search was for the id(s): 546641393834330002000000 ... (1 total) and the following ids were not found: 546641393834330002000000.\nResolution:\n Search for an id that is in the database or set the Mongoid.raise_not_found_error configuration option to false, which will cause a nil to be returned instead of raising this error when searching for a single id, or only the matched documents when searching for multiples.","error_class":"ActiveJob::DeserializationError","failed_at":1424175773.317896,"retry_count":0}
However, the object is in the database.
I've tried adding an after_create callback (Mongoid), but it doesn't make any difference.
Any idea what is happening?
Thanks.
Sidekiq is so fast that it executes your job before the database has committed the transaction. Use after_commit to create the job.
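A minimal sketch of that suggestion, assuming an ActiveRecord-backed model (the question's User model uses Mongoid, which may not support commit callbacks in the same way, so treat this purely as an illustration; SendMailJob and the 'any_help' argument are taken from the payload above):
class User < ActiveRecord::Base
  # Enqueue only after the record is committed, so the worker can find it.
  after_commit :send_mail_later, on: :create

  private

  def send_mail_later
    SendMailJob.perform_later('any_help', self)
  end
end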
OK, my fault. You need to start a Heroku worker dyno in order for Sidekiq to process jobs (it isn't started automatically).

Why can't BCP execute procedures that use temp tables (#tempTable)?

Recently I was tasked with creating a SQL Server Job to automate the creation of a CSV file. There was existing code, which was using an assortment of #temp tables.
When I set up the job to execute using BCP calling the existing code (converted into a procedure), I kept getting errors:
SQLState = S0002, NativeError = 208
Error = [Microsoft][SQL Native Client][SQL Server]Invalid object name #xyz
As described in other posts, lots of people recommend resolving the problem by converting all the #temp tables to @table variables.
However, I would like to understand WHY BCP doesn't seem to be able to use #temp tables.
When I execute the same procedure from within SSMS, it works. Why?
I did a quick and simple test using global temp tables within a procedure, and that seemed to succeed via a job using BCP, so I am assuming it is related to the scope of the #temp tables.
Thanks in advance for your responses/clarifications.
DTML
You are correct in guessing that it's a scope issue with the #temp tables.
BCP is spawned as a separate process, so the tables are no longer in scope for the new process. SSMS likely uses sub-processes, so they still have access to the #temp tables.
