Snowflake, Tasks and Session variables problem

I have a problem in Snowflake with a Task that executes a stored procedure, where the SP reads a session variable QUERY_TAG that I want to use for logging purposes.
When the Task executes the SP, I get the error:
"Session variable '$QUERY_TAG' does not exist"
The procedure is defined with EXECUTE AS CALLER.
It doesn't matter where I try to set QUERY_TAG (in the first Task's body or in the task definition).
The Tasks and SP are created by me as SYSADMIN.
When I execute the SP in a query editor (Snowflake worksheet, DBeaver, etc.) it runs fine, so there are no coding errors in the SP:
SET QUERY_TAG = 'A nice query tag';
CALL TASK_SCHEMA.SP_TASK_ONE();
This runs fine when I call it from the Worksheet, DBeaver, or similar.
Both ways of reading the tag in the SP work (inline SQL or the getQueryTag function).
Here is the code for the Tasks and the SP:
CREATE OR REPLACE TASK TASK_SCHEMA.TASK_ONE_PRECOND
  WAREHOUSE = TASK_WH
  SCHEDULE = '2 minute'
  QUERY_TAG = 'My Query Tag'
AS
  SET QUERY_TAG = 'My Query Tag 2';

CREATE OR REPLACE TASK TASK_SCHEMA.TASK_ONE
  WAREHOUSE = TASK_WH
  AFTER TASK_SCHEMA.TASK_ONE_PRECOND
AS
  CALL TASK_SCHEMA.SP_TASK_ONE();
CREATE OR REPLACE PROCEDURE TASK_SCHEMA.SP_TASK_ONE()
RETURNS VARCHAR(50)
LANGUAGE JAVASCRIPT
EXECUTE AS CALLER
AS $$
  function getQueryTag()
  {
    var QueryTag;
    var rs_QT = snowflake.execute( { sqlText: `SELECT $QUERY_TAG;` } );
    if (rs_QT.next())
    {
      QueryTag = rs_QT.getColumnValue(1); // get the QueryTag
    }
    return QueryTag;
  }
  var qtag = getQueryTag();
  //rs = snowflake.execute( { sqlText:
  //`INSERT INTO "LOG"."TESTSESSIONLOG"
  //  ("SESSION_NAME")
  //SELECT $QUERY_TAG
  //` } );
  snowflake.execute({
    sqlText: `INSERT INTO LOG.TESTSESSIONLOG
              (SESSION_NAME)
              VALUES (?)`
    , binds: [qtag]
  });
  return "SESSION_OK";
$$;

Edit 4 Nov 2019: My answer below is not entirely correct; there is a way to pass values between a task and its successor. See the documentation on SYSTEM$SET_RETURN_VALUE.
Even if you define dependencies between tasks, that doesn't mean a task inherits anything from its predecessor in the task tree.
So if you set a session variable in one task, that variable is lost when the task finishes.
This is different from a normal session (like in the GUI), where session state is preserved between the commands you execute within the session.
Between tasks, the only thing related is the end time of the predecessor and the start time of the successor(s).
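To illustrate the mechanism from the edit above, here is a minimal sketch with hypothetical task names; it assumes SYSTEM$SET_RETURN_VALUE may be called directly in the task body (if not, wrap it in a stored procedure with EXECUTE AS CALLER, as a later answer shows):
CREATE OR REPLACE TASK TASK_SCHEMA.TASK_PRED
  WAREHOUSE = TASK_WH
  SCHEDULE = '2 minute'
AS
  CALL SYSTEM$SET_RETURN_VALUE('My Query Tag 2');  -- value handed to the successor

CREATE OR REPLACE TASK TASK_SCHEMA.TASK_SUCC
  WAREHOUSE = TASK_WH
  AFTER TASK_SCHEMA.TASK_PRED
AS
  INSERT INTO LOG.TESTSESSIONLOG (SESSION_NAME)
  SELECT SYSTEM$GET_PREDECESSOR_RETURN_VALUE();    -- read what the predecessor set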
When it comes to extracting the query tag, you should preferably ask the system for it:
function getQueryTag()
{
  var rs_QT = snowflake.execute( { sqlText: `SHOW PARAMETERS LIKE 'QUERY_TAG'` } );
  return rs_QT.next() && rs_QT.getColumnValue("value"); // get the QUERY_TAG parameter value
}
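With that version, the QUERY_TAG you set as a parameter on the task itself (as in the question's TASK_ONE_PRECOND definition) is what comes back, because session parameters declared on a task are applied to the task's own run. A sketch reusing the question's names; the key point is setting the parameter on the task that actually calls the SP, since parameters, like variables, don't carry over from a predecessor:
CREATE OR REPLACE TASK TASK_SCHEMA.TASK_ONE
  WAREHOUSE = TASK_WH
  QUERY_TAG = 'My Query Tag'         -- a session parameter, not a session variable
  AFTER TASK_SCHEMA.TASK_ONE_PRECOND
AS
  CALL TASK_SCHEMA.SP_TASK_ONE();    -- SHOW PARAMETERS inside the SP now sees the tag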

Related

Exception when running raw query for the creation of a trigger on TypeORM

I am having an issue creating triggers on a DB created using TypeORM. My entities create the tables without a problem; then I run a few queries to create virtual tables and triggers on the database, like this:
AppDataSource.initialize()
  .then(async () => {
    try {
      console.log("DB ready");
      // Await each statement so any failure surfaces in the catch block below
      await AppDataSource.manager.query(queries.parent_company);
      await AppDataSource.manager.query(queries.sales_rep);
      await AppDataSource.manager.query(queries.__parent_company___after_insert);
    }
    catch (e) {
      console.log(e); // log instead of silently swallowing the exception
    }
  })
  .catch((error) => console.log(error))
This is not the only trigger I want to create, but it serves as an example; all of them give me exceptions.
Here are some of the queries I am running: the one that creates the virtual table, and then the trigger that is giving me the issue.
export const parent_company =
"CREATE VIRTUAL TABLE IF NOT EXISTS parent_company USING FTS4(\
id,\
name,\
content='__parent_company'\
);"
export const __parent_company___after_insert =
"CREATE TRIGGER IF NOT EXISTS __parent_company___after_insert\
AFTER INSERT ON __parent_company\
BEGIN\
INSERT INTO parent_company (docid, id, name)\
VALUES (new.rowid, new.id, new.name);\
END;"

JS Neo4jError: Cannot run query in this transaction, because it has been rolled back either because of an error or explicit termination

I fire a few hundred of the query below concurrently (I have also tried running them synchronously) from the JS neo4j-driver 4.4.1. A few of the queries sometimes throw the following error in Node.js, but when my retry logic retries after some time, it works.
Query
MERGE (n0:Movie {movie_id: $movie_id})
WITH n0
CALL apoc.lock.nodes([n0])
CALL {
WITH n0
WITH n0 WHERE n0.updated_at IS NULL OR n0.updated_at < datetime($updated_at)
MERGE (n:Movie {movie_id: $movie_id})
ON CREATE SET n.movie_id = $movie_id
SET n.name = $name
SET n.downloads = $downloads
SET n.updated_at = datetime($updated_at)
RETURN count(*) AS cnt
}
RETURN n0, cnt
I run this query in separate transactions like below.
const session = driver.session();
await session.writeTransaction(async tx => {
return await tx.run(QUERY, {args});
});
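The retry logic mentioned above is not shown in the question; a hypothetical version (function name, attempt count, and backoff are assumptions) might look like this:
async function runWithRetry(query, args, attempts = 3) {
  for (let i = 0; i < attempts; i++) {
    const session = driver.session();
    try {
      return await session.writeTransaction(tx => tx.run(query, args));
    } catch (err) {
      if (i === attempts - 1) throw err;                     // give up after the last attempt
      await new Promise(r => setTimeout(r, 1000 * (i + 1))); // simple linear backoff
    } finally {
      await session.close();
    }
  }
}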
Log
Neo4jError: Cannot run query in this transaction, because it has been rolled back either because of an error or explicit termination.
I couldn't find any trace related to that query in the Neo4j logs.
Any help with this?

How to get the output of a stored procedure as task output in Snowflake

I created a stored procedure that, when executed manually, returns a string with a success message and the number of rows inserted, or an error message such as "file not found" or "data did not load". When I call the same stored procedure from a task, TASK_HISTORY shows it as succeeded, and I can't tell whether the data was loaded or not; it has to be checked manually.
I referred to the following question: Snowflake a working procedure is not being successfully executed when calling it within a scheduled task.
The procedure and the task have the same owner (the owner has the global EXECUTE TASK privilege), and the data is updated in both cases, during both the manual call and the task call of the procedure.
How do I make the return value appear in the task history, and how do I stop the successor task from executing if the stored procedure returns an error?
You can use SYSTEM$SET_RETURN_VALUE to set a return value in your first task.
In a tree of tasks, a task can call this function to set a return value. Another task that identifies this task as the predecessor task (using the AFTER keyword in the task definition) can retrieve the return value set by the predecessor task.
You can then use SYSTEM$GET_PREDECESSOR_RETURN_VALUE in your next task to condition your actions (for example, doing nothing if the return value contains an error).
The return value will appear for monitoring in TASK_HISTORY.
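For instance, the successor task can read the predecessor's value in its body (a sketch with hypothetical object names; instead of logging the value, the body could equally test it and do nothing when it contains an error):
CREATE OR REPLACE TASK MySchema.SuccessorTask
  WAREHOUSE = TASK_WH
  AFTER MySchema.PredecessorTask
AS
  INSERT INTO MySchema.TaskLog (return_value)
  SELECT SYSTEM$GET_PREDECESSOR_RETURN_VALUE();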
There are two parts to getting the return value from the stored procedure saved to the task history:
The stored procedure should EXECUTE AS CALLER.
You need to call SYSTEM$SET_RETURN_VALUE with the return value.
Example
CREATE OR REPLACE PROCEDURE MySchema.MyStoredProcedure()
RETURNS VARIANT
LANGUAGE JAVASCRIPT
EXECUTE AS CALLER
AS $$
  let command1 = `DELETE FROM MySchema.MyTable WHERE rownum > 10;`;
  let stmt1 = snowflake.createStatement({sqlText: command1});
  let rs = stmt1.execute();
  rs.next();
  let deleted = rs.getColumnValue(1); // number of rows deleted
  // This requires the SP to be run as CALLER - this return value is logged in Task History
  stmt1 = snowflake.createStatement({sqlText: `call system$set_return_value('{ "RowsDeleted": ${deleted} }');`});
  rs = stmt1.execute();
  return { "RowsDeleted": deleted };
$$;

Jedis/Redis SocketTimeout exception on Lua scripts

We are using Lua scripts to perform batch deletes of data on updates to our DB. Jedis executes the Lua script using a pipeline.
local result = redis.call('lrange',key,0,12470)
for i,k in ipairs(result) do
redis.call('del',k)
redis.call('ltrim',key,1,k)
end
try (Jedis jedis = jedisPool.getResource()) {
Pipeline pipeline = jedis.pipelined();
long len = jedis.llen(table);
String script = String.format(DELETE_LUA_SCRIPT, table, len);
LOGGER.info(script);
pipeline.eval(script);
pipeline.sync();
} catch (JedisConnectionException e) {
LOGGER.info(e.getMessage());
}
For large ranges we notice that the Lua scripts slow down and we get SocketTimeoutExceptions.
Running redis-cli slowlog displays only the Lua scripts that have taken too long to execute.
Is there a better way to do this? Is my Lua script blocking?
When I use just a pipeline to do the batch deletes, the slowlog also records slow queries.
try (Jedis jedis = jedisPool.getResource()) {
Pipeline pipeline = jedis.pipelined();
long len = jedis.llen(table);
List<String> queriesContainingTable = jedis.lrange(table,0,len);
if(queriesContainingTable.size() > 0) {
for (String query: queriesContainingTable) {
pipeline.del(query);
pipeline.lrem(table,1,query);
}
pipeline.sync();
}
} catch (JedisConnectionException e) {
LOGGER.info("CACHE INVALIDATE FAIL:"+e.getMessage());
}
The slowlog stores only the top 128 slow queries (this can be changed in redis.conf via slowlog-max-len 128). So your first model, using a Lua script, is certainly a blocking one: Redis executes a script atomically, and nothing else is served until it finishes.
Deleting that many keys (12470) one by one is also blocking, since it takes that much longer to complete. Of the two models, the second one (using a pipeline) is the better choice, because you avoid the server-side iteration; all you do is issue the DEL command n times.
You can use DEL with multiple keys, in groups of 100 or 1000 (whichever you find optimal after a little testing), and group them into a single pipeline.
Or, if you can do without atomicity, delete every 100 or 1000 keys at once in a loop, so that no single call is blocking; see the sketch below.
Try out different combinations, take the metrics, and go with the optimal one.
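A rough sketch of that batching, reusing the variables from the question's code (the batch size of 1000 is an assumption to tune against your own metrics):
try (Jedis jedis = jedisPool.getResource()) {
    long len = jedis.llen(table);
    List<String> queriesContainingTable = jedis.lrange(table, 0, len);
    Pipeline pipeline = jedis.pipelined();
    int batched = 0;
    for (String query : queriesContainingTable) {
        pipeline.del(query);
        pipeline.lrem(table, 1, query);
        if (++batched % 1000 == 0) {
            pipeline.sync(); // flush periodically so no single round-trip blocks for long
        }
    }
    pipeline.sync(); // flush the remainder
} catch (JedisConnectionException e) {
    LOGGER.info("CACHE INVALIDATE FAIL:" + e.getMessage());
}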

Making global environment access-only (Lua)

I embedded Lua and want scripts to be able to read the global table but not automatically write to it, so that two scripts can use variables with the same name without overwriting each other while still being able to add things to the global table explicitly. I can't really explain it better than this:
Script 1
var1 = "foo"
_G.var2 = "bar"
Script 2
print(var1) -- Prints nil
print(var2) -- Prints 'bar'
I tried to accomplish this by doing something like the following (the 'scripts' being functions):
newScript = function(content)
  local Script = loadstring(content) -- compile only; don't run the chunk yet
  local env = setmetatable({},{__index = _G})
  setfenv(Script,env) -- sandbox the chunk before it ever runs
  return Script
end
My Lua binding is LuaJ; for the sake of giving all the information, here is that code too:
private LuaValue newScript(String content){
LuaTable envMt = new LuaTable();
envMt.set(INDEX, _G);
LuaTable env = new LuaTable();
env.setmetatable(envMt);
LuaClosure func = (LuaClosure) _G.get("loadstring").call(valueOf(content));
thread = new LuaThread(func,env);
thread.resume(NIL);
return thread;
}
It's not __index that you want to change, it's __newindex. In addition, you can't use __index to catch access to keys that do exist in the table. The only way to make a table read-only in all situations is to defer all reads to a proxy table and throw an error on writes.
Here's a function I use to return a read-only table:
function ro_table (t)
local t = t
if t then
return setmetatable({},
{ __index=t,
__newindex= function(_,_,_) error ("Attempt to modify read-only table") end,
})
else
return nil
end
end
So for your code, you'd have the following:
newScript = function(content)
  local Script = loadstring(content) -- compile only; run after the env is set
  setfenv(Script, ro_table(_G))
  return Script
end
Note that this does not work recursively: if any global is itself a table (including the built-in library tables), its contents can still be changed; only the binding itself is protected from replacement.
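If you need the protection to extend into nested tables, one possible extension (a sketch, not part of the original answer) is to wrap tables recursively on access:
function ro_table_deep (t)
  if type(t) ~= "table" then return t end
  return setmetatable({}, {
    __index = function (_, k) return ro_table_deep(t[k]) end, -- wrap reads recursively
    __newindex = function () error("Attempt to modify read-only table") end,
  })
end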
