Execute stored procedure at the end of Snowpipe job

Kind of a self-explanatory title ^_^
I have a Snowpipe job that ingests data files into a Snowflake staging table.
Is it possible to trigger a stored procedure call at the end of the Snowpipe job, instead of having a Snowflake task run it on a recurring schedule?
Thanks for helping,
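For reference, the pattern usually reached for here (a sketch, not an answer from this thread; the stream, task, warehouse, and procedure names are placeholders) is a stream on the staging table plus a task that calls the procedure only when the stream has data:
-- Track rows landed by Snowpipe:
CREATE OR REPLACE STREAM staging_stream ON TABLE staging_table;
-- Wake on a schedule, but run the procedure only when new rows are waiting:
CREATE OR REPLACE TASK run_after_ingest
  WAREHOUSE = my_wh
  SCHEDULE = '5 MINUTE'
  WHEN SYSTEM$STREAM_HAS_DATA('STAGING_STREAM')
AS
  CALL my_proc();  -- the procedure should read from staging_stream so its offset advances
ALTER TASK run_after_ingest RESUME;
This is still schedule-driven underneath, but runs skipped by the WHEN clause consume no warehouse credits.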

Related

Currently running query inside PostgreSQL stored procedure from application level

I have a Ruby on Rails application in which I call various stored procedures on a Postgres database. I want to know if there is a way to see the queries currently running inside a stored procedure while it executes, so I can log from the application when a stored procedure becomes a bottleneck.
I don't want to modify the stored procedure with lines like raise notice; I only want to see the queries from the application level, if possible.
I'm currently using the PG gem: connection.exec(sql_query).to_a, where connection is of type PG::Connection and sql_query is the stored procedure call (e.g. procedure1("1", true)).
I tried searching for some solutions in the pg-gem documentation but couldn't find any.
Any tips?
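One application-level avenue (a sketch, not from this thread; the connection details are placeholders) is to open a second connection and poll pg_stat_activity while the first connection runs the procedure:
require 'pg'

# Separate monitoring connection; the procedure runs on another one.
monitor = PG.connect(dbname: 'mydb')
rows = monitor.exec(<<~SQL).to_a
  SELECT pid, state, now() - query_start AS runtime, query
  FROM pg_stat_activity
  WHERE state = 'active' AND pid <> pg_backend_pid()
SQL
rows.each { |r| puts "#{r['pid']} #{r['runtime']} #{r['query']}" }
Caveat: pg_stat_activity shows the top-level CALL, not the statements nested inside the procedure; per-statement visibility typically needs server-side settings (e.g. pg_stat_statements with pg_stat_statements.track = 'all') rather than anything the PG gem can do on its own.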

Is there any prerequisite to call a Snowflake stored procedure from an Informatica mapping?

I would like to know if there is any special requirement when calling a Snowflake stored procedure from an Informatica mapping. Concretely, I have a mapping in which the target is a Snowflake table, and as Post-SQL I want to call a stored procedure that lives in the same database as my table.
I call my stored procedure in Post-SQL as follows:
CALL spname();
However, I get the following error when running:
SQL compilation error: Unknown function spname
Do you know which could be the problem here?
That error message is coming from Snowflake, so Informatica (is this PowerCenter on-prem?) is attempting to run the SP and it's getting a response back from Snowflake. Here are some things to check:
Does the Snowflake user PowerCenter runs as have the required grants to run the SP? The error message will be the same whether the SP does not exist or the user lacks privileges to run it.
Does the user running PowerCenter have the required grants on the database and schema containing the stored procedure?
You can ensure that PowerCenter is looking in the right namespace by specifying both the database and schema before the SP name, such as call "MY_DB"."MY_SCHEMA"."MY_PROC"();
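To make those grant checks concrete, this is roughly what the role PowerCenter connects with would need (a sketch; the role and object names are placeholders):
GRANT USAGE ON DATABASE my_db TO ROLE informatica_role;
GRANT USAGE ON SCHEMA my_db.my_schema TO ROLE informatica_role;
-- Procedure grants are per-signature in Snowflake:
GRANT USAGE ON PROCEDURE my_db.my_schema.my_proc() TO ROLE informatica_role;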

Automatically delete all records after 30 days

I have a concept for a Rails app that I want to make. Users can create records of a model that has a boolean attribute. After 30 days (a month), unless the record's boolean attribute is true, the record should automatically delete itself.
In Rails 5 you have access to Active Job (http://guides.rubyonrails.org/active_job_basics.html).
There are two simple ways.
After creating the record, you could schedule a job to be executed 30 days later; the job checks whether the record matches the criteria and deletes it if not, as in the sketch below.
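A sketch of that first approach (class names are illustrative; destroyable is the boolean attribute from the inline query below):
class PruneRecordJob < ApplicationJob
  queue_as :default

  def perform(record_id)
    record = MyModel.find_by(id: record_id)
    # The record may be gone already, or no longer flagged for deletion.
    record.destroy if record&.destroyable?
  end
end

# On creation, schedule the check 30 days out:
PruneRecordJob.set(wait: 30.days).perform_later(record.id)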
The other alternative is to create a job that runs every day, queries the database for every record (of this specific model) that was created 30 or more days ago, and destroys the ones that do not match the criteria. (If that check lives in the database it should be as easy as: MyModel.where(destroyable: true).where("created_at <= ?", 30.days.ago).destroy_all)
There are a couple of options for achieving this:
Whenever and run a script or rake task
Clockwork and run a script or rake task
Background jobs which in my opinion is the "rails way".
For options 1 and 2 you need to check every day whether any record is 30 days old, and delete it if its boolean isn't true (which means checking all the records, or optimizing the query to look only at 30-day-old records, etc.). For the third option, you can schedule, on record creation, a job that runs after 30 days and does the check for each record independently. It depends on how you process the jobs; for example, if you use Sidekiq you can use scheduled jobs, and if you use Resque, check resque-scheduler.
Performing the deletion is straightforward: create a class method (e.g. Record.prune) on the record class in question that performs the deletion based on a query, e.g. Record.where(retain: false).destroy_all, where retain is the boolean attribute you mention. I'd then recommend defining a rake task in lib/tasks that invokes this method, e.g.:
namespace :record do
  task prune: :environment do
    Record.prune
  end
end
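The Record.prune method itself might look like this (a sketch; retain is the attribute named above, and the 30-day cutoff comes from the question):
class Record < ApplicationRecord
  def self.prune
    where(retain: false).where("created_at <= ?", 30.days.ago).destroy_all
  end
end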
The scheduling is more difficult; a crontab entry is sufficient to provide the correct timing, but ensuring an appropriate environment (e.g. one that has loaded rbenv/rvm and any necessary environment variables) is harder. Ensuring that your deployment process produces binstubs is probably helpful here; from there, bin/rake record:prune ought to be enough. It's hard to provide a more in-depth answer without knowing more about the environments in which you hope to run this task.
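The whenever gem listed in the options above papers over exactly this crontab-environment problem; a sketch of config/schedule.rb (the time of day is arbitrary):
every 1.day, at: '3:00 am' do
  rake 'record:prune'
end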
I want to mention a non-Rails approach. It depends on your database: with MongoDB you can use its "Expire Data from Collections" (TTL) feature, and with MySQL you can use the MySQL event scheduler. You can find a good example here: What is the best way to delete old rows from MySQL on a rolling basis?
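A sketch of the MySQL event scheduler variant (table and column names are placeholders, mirroring the Rails examples above):
-- Requires the event scheduler to be enabled: SET GLOBAL event_scheduler = ON;
CREATE EVENT prune_records
  ON SCHEDULE EVERY 1 DAY
DO
  DELETE FROM records
  WHERE retain = 0 AND created_at <= NOW() - INTERVAL 30 DAY;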

Record progress for long-running ActiveJob

Based on this question How to reference active delayed_job within the actual job I'm using Delayed::Job with an additional progress text column to record progress of a long running task.
I'm now trying to update my code to use ActiveJob, so I've replaced my def before(job) hook with before_perform, but the job object passed to before_perform is an ActiveJob instance, not the Delayed::Job record passed to before. And quite rightly so, because the queue adapter is configurable and may not always be :delayed_job.
So, given that the queue adapter is configurable, is there a correct way to access (read and write) the progress column in table delayed_jobs?
Thanks.
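One adapter-neutral sketch (not from the thread; JobProgress and its columns are hypothetical): key the progress off ActiveJob's own job_id, a UUID available under every adapter, instead of writing to the delayed_jobs table directly.
class JobProgress < ApplicationRecord
  # columns: job_id:string, progress:string
end

class LongTask < ApplicationJob
  before_perform { |job| JobProgress.find_or_create_by!(job_id: job.job_id) }

  def perform(*args)
    report('halfway there')
    # ... the long-running work, calling report as it goes ...
  end

  private

  def report(text)
    JobProgress.where(job_id: job_id).update_all(progress: text)
  end
end
Reading progress back is then a lookup by the job_id captured at enqueue time (perform_later returns the job instance, so its job_id can be stored alongside whatever domain record kicked the work off).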

How to create unique delayed jobs

I have a method like this one:
def abc
  # some stuff here
end
handle_asynchronously :abc, queue: :xyz
I want to create a delayed job for this only if there isn't one already in the queue.
I really feel like this should have an easy solution
Thanks!
I know this post is old, but it hasn't been answered.
Delayed Job does not provide a way to identify jobs. https://github.com/collectiveidea/delayed_job/issues/192
My suggestion is to have the job check whether it still needs to run when it executes, for example by comparing against a database value. Inserting jobs into the table should be quick, and you might lose that if you start scanning the queue for a particular job on every enqueue.
If you still want to look for duplicates when enqueuing, this might help you.
https://gist.github.com/landovsky/8c505ecab41eb38fa1c2cd23058a6ae3
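In the same spirit as that gist, an enqueue-time guard might look like this (a sketch; the handler LIKE match is a heuristic that leans on delayed_job serializing the target method name into the handler YAML):
def enqueue_abc_once
  queued = Delayed::Job.where(queue: 'xyz', locked_at: nil)
                       .where('handler LIKE ?', '%method_name: :abc%')
                       .exists?
  abc unless queued # abc is already wrapped by handle_asynchronously
end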
