Create tasks dynamically in Airflow DAGs

I want to design an Airflow DAG with several tasks, and I want them to execute as shown in the example image and Gantt chart (not included here), following the description below:
A-1, B-1, C-1 must be executed sequentially
A-2 depends on A-1, B-2 depends on B-1, and C-2 depends on C-1
A-2, B-2, C-2 can be executed in parallel
I have created the desired DAG with the code below:
# Note: imports and the tid_prefix_* task-id prefix strings used below are
# defined elsewhere in the original file and omitted here.
main_task_list = ["T1", "T2", "T3"]

def decide_what_to_do(table_name, **context):
    if random.randint(0, 100) > 80:
        return tid_prefix_zip_file + table_name
    else:
        return tid_prefix_do_nothing + table_name

def create_tasks_list(table_name):
    tid_call_api = tid_prefix_call_api + table_name
    py_op_call_api = DummyOperator(
        task_id=tid_call_api
    )
    tid_branch_operator = tid_prefix_branch + table_name
    py_op_new_data_come_in = BranchPythonOperator(
        task_id=tid_branch_operator,
        python_callable=decide_what_to_do,
        op_args=[table_name]
    )
    tid_zip_file = tid_prefix_zip_file + table_name
    ssh_op_zip_file = DummyOperator(
        task_id=tid_zip_file
    )
    tid_upload_blob = tid_prefix_upload + table_name
    ssh_op_upload_file = DummyOperator(
        task_id=tid_upload_blob
    )
    tid_update_table_setting = tid_prefix_update_table + table_name
    py_update_tables_setting = DummyOperator(
        task_id=tid_update_table_setting
    )
    tid_execute_databricks = tid_prefix_call_databricks + table_name
    db_op_execute_notebook = DummyOperator(
        task_id=tid_execute_databricks
    )
    dummy_op_do_nothing = DummyOperator(
        task_id=tid_prefix_do_nothing + table_name
    )
    # branch 1
    first_pipeline = [py_op_call_api, py_op_new_data_come_in, ssh_op_zip_file,
                      ssh_op_upload_file, py_update_tables_setting, db_op_execute_notebook]
    airflow.utils.helpers.chain(*first_pipeline)
    # branch 2
    second_pipeline = [py_op_new_data_come_in, dummy_op_do_nothing]
    airflow.utils.helpers.chain(*second_pipeline)
    tasks_list = [first_pipeline, second_pipeline]
    return tasks_list
with DAG(dag_id, default_args=default_args) as dag:
    tasks_chain_list = [create_tasks_list(each) for each in main_task_list]
    start = DummyOperator(
        task_id="start"
    )
    start >> tasks_chain_list[0][0][0]
    for n in range(0, len(tasks_chain_list) - 1):
        tasks_chain_list[n][0][0] >> tasks_chain_list[n + 1][0][0]
But this code is not flexible if I want to add more branches to each task chain.
Can anyone help me improve the code?
Thanks.
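One way to make the branching more flexible (a sketch under assumptions, not the poster's original code: the task-id prefixes, branch layout, and operator choices below are invented for illustration) is to describe each branch as data, a mapping from branch name to an ordered list of task-id prefixes, and build the operators in a loop. Adding a branch then only means adding one entry to the mapping:

import random
from datetime import datetime
from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import BranchPythonOperator
from airflow.utils.helpers import chain

main_task_list = ["T1", "T2", "T3"]

# Hypothetical branch definitions: branch name -> ordered task-id prefixes.
BRANCHES = {
    "process": ["zip_file_", "upload_", "update_table_", "execute_databricks_"],
    "skip": ["do_nothing_"],
}

def pick_branch(table_name, **context):
    # Same random decision as in the question: return the task_id of the
    # first task of the chosen branch.
    branch = "process" if random.randint(0, 100) > 80 else "skip"
    return BRANCHES[branch][0] + table_name

def create_chain(table_name, dag):
    call_api = DummyOperator(task_id="call_api_" + table_name, dag=dag)
    branch_op = BranchPythonOperator(
        task_id="branch_" + table_name,
        python_callable=pick_branch,
        op_args=[table_name],
        dag=dag,
    )
    call_api >> branch_op
    # Build every branch off the shared branch operator.
    for prefixes in BRANCHES.values():
        ops = [DummyOperator(task_id=p + table_name, dag=dag) for p in prefixes]
        chain(branch_op, *ops)
    return call_api  # head of the chain, for wiring chains together

with DAG("dynamic_branches_example",
         default_args={"start_date": datetime(2020, 1, 1)}) as dag:
    start = DummyOperator(task_id="start")
    heads = [create_chain(t, dag) for t in main_task_list]
    start >> heads[0]
    for a, b in zip(heads, heads[1:]):
        a >> b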


Write code to calculate backward propagation, Deep Learning course by Andrew Ng

So I've taken the Deep Learning AI course by Andrew Ng on Coursera.
I am currently working on the last assignment in week 2.
I reached the part where I have to write the forward and backward propagation functions.
I managed to write the fwd_propagate function, which is fairly easy.
This is the code below:
def fwd_propagate(w, b, X, y):
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    J = (-1/m) * np.sum(y * np.log(A) + (1-y) * np.log(1-A))
    return J
Now I have to write the bwd_propagate function, but I don't know where or how to start.
Can someone help and explain what I should write?
This is everything I wrote so far with the tests.
import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage
%matplotlib inline
def load_dataset():
    train_dataset = h5py.File('C:/Users/Univ/Desktop/ML Intern/Logistic-Regression-with-a-Neural-Network-mindset-master/train_catvnoncat.h5', "r")
    train_set_x_orig = np.array(train_dataset["train_set_x"][:])  # your train set features
    train_set_y_orig = np.array(train_dataset["train_set_y"][:])  # your train set labels
    test_dataset = h5py.File('C:/Users/Univ/Desktop/ML Intern/Logistic-Regression-with-a-Neural-Network-mindset-master/test_catvnoncat.h5', "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:])  # your test set features
    test_set_y_orig = np.array(test_dataset["test_set_y"][:])  # your test set labels
    classes = np.array(test_dataset["list_classes"][:])  # the list of classes
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes
train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()
index = 25
plt.imshow(train_set_x_orig[index])
print ("y = " + str(train_set_y[:,index]) + ", it's a '" + classes[np.squeeze(train_set_y[:,index])].decode("utf-8") + "' picture.")
print(str(train_set_y.shape[1]) + " This is the amount of elements in the training set")
print(str(test_set_y.shape[1]) + " This is the amount of elements in the test set")
print(str(train_set_x_orig.shape[1]) + " This is the Num_px")
print(f"{train_set_x_orig.shape[1]} This is the Num_px")
print(train_set_x_orig.shape)
X_flatten1 = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T
X_flatten2 = train_set_y.reshape(train_set_y.shape[0], -1).T
X_flatten3 = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T
X_flatten4 = test_set_y.reshape(test_set_y.shape[0], -1).T
print(X_flatten1)
print(X_flatten2)
print(X_flatten3)
print(X_flatten4)
print(X_flatten1.shape)
print(X_flatten2.shape)
print(X_flatten3.shape)
print(X_flatten4.shape)
print(" Let's standardize our date")
train_set_x = X_flatten1/256
test_set_x = X_flatten3/256
print(train_set_x)
print(test_set_x)
def sigmoid(x):
    s = 1/(1 + np.exp(-x))
    return s
print ("sigmoid(0) = " + str(sigmoid(0)))
print ("sigmoid(9.2) = " + str(sigmoid(9.2)))
def initialize_with_zeros(dim):
    shp = (dim, 1)
    w = np.zeros(shp)
    b = 0
    assert(w.shape == (dim, 1))
    assert(isinstance(b, float) or isinstance(b, int))
    return w, b
dim = 2
w, b = initialize_with_zeros(dim)
print ("w = " + str(w))
print ("b = " + str(b))
def fwd_propagate(w, b, X, y):
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    J = (-1/m) * np.sum(y * np.log(A) + (1-y) * np.log(1-A))
    return J
The next step is to calculate the derivatives for back propagation (here Y denotes the same label array passed as y above, and A the activations from the forward pass):
dw = 1/m * (np.dot(X, ((A - Y).T)))
db = 1/m * (np.sum(A - Y))
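Putting those together, here is a minimal sketch of a bwd_propagate function that mirrors fwd_propagate above. It assumes the shapes used throughout this post (w of shape (dim, 1), X of shape (dim, m), y of shape (1, m)) and recomputes A, since fwd_propagate only returns the cost:

def bwd_propagate(w, b, X, y):
    # Gradients of the logistic-regression cost with respect to w and b.
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)    # recompute the activations
    dw = (1/m) * np.dot(X, (A - y).T)  # shape (dim, 1), same as w
    db = (1/m) * np.sum(A - y)         # scalar
    return dw, db

In the assignment's gradient-descent loop these gradients are then applied as w = w - learning_rate * dw and b = b - learning_rate * db.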

Convert a decision tree to a table

I'm looking for a way to convert a decision tree trained using scikit-learn into a decision table.
I would like to know how to parse the decision tree structure to find the decisions made at each step.
Then I would like ideas on how to structure this table.
Do you know a way, or have an idea of how to do it?
Building on the other answer here, the following traverses the tree in the same way but generates a pandas DataFrame as output.
import sklearn.tree
import pandas as pd

def tree_to_df(reg_tree, feature_names):
    tree_ = reg_tree.tree_
    feature_name = [
        feature_names[i] if i != sklearn.tree._tree.TREE_UNDEFINED else "undefined!"
        for i in tree_.feature
    ]

    def recurse(node, row, ret):
        if tree_.feature[node] != sklearn.tree._tree.TREE_UNDEFINED:
            name = feature_name[node]
            threshold = tree_.threshold[node]
            # Add rule to row and search left branch
            row[-1].append(name + " <= " + str(threshold))
            recurse(tree_.children_left[node], row, ret)
            # Add rule to row and search right branch
            row[-1].append(name + " > " + str(threshold))
            recurse(tree_.children_right[node], row, ret)
        else:
            # Add output rules and start a new row
            label = tree_.value[node]
            ret.append("return " + str(label[0][0]))
            row.append([])

    # Initialize
    rules = [[]]
    vals = []
    # Call recursive function with initial values
    recurse(0, rules, vals)
    # Convert to table and output
    df = pd.DataFrame(rules).dropna(how='all')
    df['Return'] = pd.Series(vals)
    return df
Here is sample code to convert a decision tree into "python" code; you can easily adapt it to make a table.
All you need to do is create a global variable that is a table sized number-of-leaves by number-of-features (or feature categories) and fill it recursively.
from sklearn.tree import _tree

def tree_to_code(tree, feature_names, classes_names):
    tree_ = tree.tree_
    feature_name = [
        feature_names[i] if i != _tree.TREE_UNDEFINED else "undefined!"
        for i in tree_.feature
    ]
    print("def tree(" + ", ".join(feature_names) + "):")

    def recurse(node, depth):
        indent = " " * depth
        if tree_.feature[node] != _tree.TREE_UNDEFINED:
            name = feature_name[node]
            threshold = tree_.threshold[node]
            print(indent + "if " + name + " <= " + str(threshold) + ":")
            recurse(tree_.children_left[node], depth + 1)
            print(indent + "else:  # if " + name + " > " + str(threshold))
            recurse(tree_.children_right[node], depth + 1)
        else:
            impurity = tree.tree_.impurity[node]
            # cast_value_to_dico is a helper from the original answer (not shown here)
            dico, label = cast_value_to_dico(tree_.value[node], classes_names)
            print(indent + "# impurity=" + str(impurity) + " count_max=" + str(dico[label]))
            print(indent + "return " + str(label))

    recurse(0, 1)
A code snippet using scikit-learn's export_text:
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.tree import export_text
iris = load_iris()
X = iris['data']
y = iris['target']
decision_tree = DecisionTreeClassifier(random_state=0, max_depth=2)
decision_tree = decision_tree.fit(X, y)
r = export_text(decision_tree, feature_names=iris['feature_names'])
print(r)
listt = [r]
print(listt)
#########OUTPUT###########################
|--- petal width (cm) <= 0.80
| |--- class: 0
|--- petal width (cm) > 0.80
| |--- petal width (cm) <= 1.75
| | |--- class: 1
| |--- petal width (cm) > 1.75
| | |--- class: 2
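For completeness, a usage sketch (assuming the tree_to_df function and the fitted decision_tree from the snippets above) that produces the decision table directly:

# Builds one row per root-to-leaf path; the 'Return' column holds the
# first entry of each leaf's value array.
df = tree_to_df(decision_tree, iris['feature_names'])
print(df)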

How to get the name of the user who updated the data from the UI in MVC 5 and Entity Framework

We have a website in ASP.NET MVC 5 with Entity Framework.
When a logged-in user makes changes in the UI (i.e. updates the data), we save/update/delete the data in SQL Server according to the operation the user performed.
We also have a trigger for audit trailing.
With the trigger, the table format for storing audit data is as below:
[AuditID]
[Type] -- Contains operation performed (Insert (I), Update (U), Delete (D))
[TableName]
[PK] -- Primary Key
[FieldName]
[OldValue]
[NewValue]
[UpdateDate]
[UserName]
We are storing SYSTEM_USER in the [UserName] column.
Is there any solution to store in [UserName] the user who actually made the changes from the UI, instead of SYSTEM_USER?
Do we have any approach to pass [UserName] from the application (UI) to the trigger?
Please share your thoughts.
One solution I have is to add an UpdatedBy column to all the tables, so that the trigger can easily get the value of UpdatedBy from the magic tables (inserted/deleted) or from the main tables.
Please suggest the best approach.
Below is the trigger used.
CREATE TRIGGER [ids].[tr_AuditEmploee]
ON Employee
FOR INSERT, UPDATE, DELETE
AS

DECLARE @bit INT,
        @field INT,
        @maxfield INT,
        @char INT,
        @fieldname VARCHAR(128),
        @TableName VARCHAR(128),
        @PKCols VARCHAR(1000),
        @sql VARCHAR(2000),
        @UpdateDate VARCHAR(21),
        @UserName VARCHAR(128),
        @Type CHAR(1),
        @PKSelect VARCHAR(1000)

-- You will need to change @TableName to match the table to be audited
SELECT @TableName = 'Employee'

-- date and user
SELECT
    @UserName = SYSTEM_USER,
    @UpdateDate = CONVERT(VARCHAR(8), GETDATE(), 112) + ' ' + CONVERT(VARCHAR(12), GETDATE(), 114)

-- Action
IF EXISTS (SELECT * FROM inserted)
    IF EXISTS (SELECT * FROM deleted)
        SELECT @Type = 'U'
    ELSE
        SELECT @Type = 'I'
ELSE
    SELECT @Type = 'D'

-- get list of columns
SELECT * INTO #ins FROM inserted
SELECT * INTO #del FROM deleted

-- Get primary key columns for full outer join
SELECT
    @PKCols = COALESCE(@PKCols + ' and', ' on')
        + ' i.' + c.COLUMN_NAME + ' = d.' + c.COLUMN_NAME
FROM
    INFORMATION_SCHEMA.TABLE_CONSTRAINTS pk,
    INFORMATION_SCHEMA.KEY_COLUMN_USAGE c
WHERE
    pk.TABLE_NAME = @TableName
    AND CONSTRAINT_TYPE = 'PRIMARY KEY'
    AND c.TABLE_NAME = pk.TABLE_NAME
    AND c.CONSTRAINT_NAME = pk.CONSTRAINT_NAME

-- Get primary key select for insert
SELECT
    @PKSelect = COALESCE(@PKSelect + '+', '')
        + '''' + COLUMN_NAME
        + '=''+convert(varchar(100),
    coalesce(i.' + COLUMN_NAME + ', d.' + COLUMN_NAME + '))+'''''
FROM
    INFORMATION_SCHEMA.TABLE_CONSTRAINTS pk,
    INFORMATION_SCHEMA.KEY_COLUMN_USAGE c
WHERE
    pk.TABLE_NAME = @TableName
    AND CONSTRAINT_TYPE = 'PRIMARY KEY'
    AND c.TABLE_NAME = pk.TABLE_NAME
    AND c.CONSTRAINT_NAME = pk.CONSTRAINT_NAME

IF @PKCols IS NULL
BEGIN
    RAISERROR('no PK on table %s', 16, -1, @TableName)
    RETURN
END

SELECT
    @field = 0,
    @maxfield = MAX(ORDINAL_POSITION)
FROM
    INFORMATION_SCHEMA.COLUMNS
WHERE
    TABLE_NAME = @TableName

WHILE @field < @maxfield
BEGIN
    SELECT @field = MIN(ORDINAL_POSITION)
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_NAME = @TableName
      AND ORDINAL_POSITION > @field
    SELECT @bit = (@field - 1) % 8 + 1
    SELECT @bit = POWER(2, @bit - 1)
    SELECT @char = ((@field - 1) / 8) + 1
    IF SUBSTRING(COLUMNS_UPDATED(), @char, 1) & @bit > 0 OR @Type IN ('I', 'D')
    BEGIN
        SELECT @fieldname = COLUMN_NAME
        FROM INFORMATION_SCHEMA.COLUMNS
        WHERE TABLE_NAME = @TableName
          AND ORDINAL_POSITION = @field
        SELECT @sql = '
            insert [audit].[AuditEmployee] ( Type,
                TableName,
                PK,
                FieldName,
                OldValue,
                NewValue,
                UpdateDate,
                UserName)
            select ''' + @Type + ''','''
                + @TableName + ''',' + @PKSelect
                + ',''' + @fieldname + ''''
                + ',convert(varchar(1000),d.' + @fieldname + ')'
                + ',convert(varchar(1000),i.' + @fieldname + ')'
                + ',''' + @UpdateDate + ''''
                + ',''' + @UserName + ''''
                + ' from #ins i full outer join #del d'
                + @PKCols
                + ' where i.' + @fieldname + ' <> d.' + @fieldname
                + ' or (i.' + @fieldname + ' is null and d.' + @fieldname + ' is not null)'
                + ' or (i.' + @fieldname + ' is not null and d.' + @fieldname + ' is null)'
        EXEC (@sql)
    END
END
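One common way to get the UI user into the trigger without adding an UpdatedBy column to every table (a sketch, assuming SQL Server 2016 or later; the key name N'UserName' is an arbitrary choice): have the application write the logged-in user's name into SESSION_CONTEXT whenever it opens a connection, and have the trigger read it back, falling back to SYSTEM_USER for changes made directly in the database:

-- Run from the application right after opening the connection
-- (e.g. from an EF connection-opened/StateChange handler):
EXEC sp_set_session_context @key = N'UserName', @value = N'jdoe';

-- In the trigger, replace the SYSTEM_USER assignment with:
SELECT @UserName = COALESCE(
    CONVERT(VARCHAR(128), SESSION_CONTEXT(N'UserName')),  -- set by the app
    SYSTEM_USER)                                          -- fallback

On versions older than SQL Server 2016, SET CONTEXT_INFO can carry the same information as a 128-byte binary value that the trigger reads via CONTEXT_INFO().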

Problems with Lua match to find a pattern

I'm struggling with this problem:
Given 2 strings:
s1 = '/foo/:bar/oof/:rab'
s2 = '/foo/lua/oof/rocks'
I would like to produce the following information:
Whether they match (the two above should match; s2 follows the pattern described in s1).
A table holding the values of s2 keyed by the corresponding names in s1. In this case we would have: { bar = "lua", rab = "rocks" }
I think this algorithm solves it, but I can't figure out how to implement it (probably with gmatch):
Store the placeholders' indexes as KEYS of a table, with the respective VALUES being the names of these placeholders.
Example with s1:
local aux1 = { ["6"] = "bar", ["15"] = "rab" }
With the keys of aux1 fetched as indexes, extract the values of s2
into another table:
local aux2 = { ["6"] = "lua", ["15"] = "rocks" }
Finally merge the two into one table (this one is easy :P):
{ bar = "lua", rab = "rocks" }
Something like this maybe:
function comp(a, b)
    -- Collect segments of the pattern string: literal segments at positive
    -- indexes, placeholder names (without the ':') at negative indexes.
    local t = {}
    local i, len_a = 0
    for w in (a..'/'):gmatch('(.-)/') do
        i = i + 1
        if w:sub(1,1) == ':' then
            t[-i] = w:sub(2)
        else
            t[i] = w
        end
    end
    len_a = i
    i = 0
    local ans = {}
    for w in (b..'/'):gmatch('(.-)/') do
        i = i + 1
        if t[i] and t[i] ~= w then
            return {}                 -- literal segment mismatch
        elseif t[-i] then
            ans[t[-i]] = w            -- capture placeholder value
        end
    end
    if len_a ~= i then return {} end  -- different number of segments
    return ans
end
s1 = '/foo/:bar/oof/:rab'
s2 = '/foo/lua/oof/rocks'
for k,v in pairs(comp(s1,s2)) do print(k,v) end
Another solution could be:
s1 = '/foo/:bar/oof/:rab'
s2 = '/foo/lua/oof/rocks'
pattern = "/([^/]+)"

function getStrngTable(_strng, _pattern)
    local t = {}
    for val in string.gmatch(_strng, _pattern) do
        table.insert(t, val)
    end
    return t
end

local r = {}
t1 = getStrngTable(s1, pattern)
t2 = getStrngTable(s2, pattern)
for k = 1, #t1 do
    if (t1[k] == t2[k]) then
        r[t1[k + 1]:match(":(.+)")] = t2[k + 1]
    end
end
The table r will have the required result.
The solution below, which is somewhat cleaner, will also give the same result:
s1 = '/foo/:bar/oof/:rab'
s2 = '/foo/lua/oof/rocks'
pattern = "/:?([^/]+)"

function getStrng(_strng, _pattern)
    local t = {}
    for val in string.gmatch(_strng, _pattern) do
        table.insert(t, val)
    end
    return t
end

local r = {}
t1 = getStrng(s1, pattern)
t2 = getStrng(s2, pattern)
for k = 1, #t1 do
    if (t1[k] == t2[k]) then
        r[t1[k + 1]] = t2[k + 1]
    end
end
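A quick check of either solution (a usage sketch; the iteration order of pairs is unspecified):

for k, v in pairs(r) do
    print(k, v)
end
-- expected pairs: bar = lua, rab = rocks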

Get a certain value from a concatenated table

Trying to allow a concatenated table to be referenced as such:
local group = table.concat(arguments, ",", 1)
where arguments = {"1,1,1"}
Currently, doing group[2] gives me the comma. How do I avoid that while still allowing for two-digit numbers?
(snippet of what I'm trying to use it for)
for i = 1, #group do
    target:SetGroup(i, tonumber(group[i]))
end
Maybe you want something like
local group = {}
local i = 1
-- here `s` is the comma-separated string, e.g. s = arguments[1]
for v in string.gmatch(s, "(%w+),*") do
    group[i] = v
    i = i + 1
end
Revised version in response to a comment, avoiding the table altogether:
local i = 1
for v in string.gmatch(s, "(%w+),*") do
    target:SetGroup(i, tonumber(v))
    i = i + 1
end
split function (you have to add it to your code):
split = function(str, delim)
    if not delim then
        delim = " "
    end
    -- Eliminate bad cases...
    if string.find(str, delim) == nil then
        return { str }
    end
    local result = {}
    local pat = "(.-)" .. delim .. "()"
    local nb = 0
    local lastPos
    for part, pos in string.gmatch(str, pat) do  -- string.gfind in Lua 5.0
        nb = nb + 1
        result[nb] = part
        lastPos = pos
    end
    -- Handle the last field
    result[nb + 1] = string.sub(str, lastPos)
    return result
end
so
local arguments = {"1,1,1"};
local group = split(arguments[1], ",");
for i = 1, #group do
    target:SetGroup(i, tonumber(group[i]))
end
also note that
local arguments = {"1,1,1"};
local group = split(arguments[1], ",");
local group_count = #group;
for i = 1, group_count do
    target:SetGroup(i, tonumber(group[i]))
end
is faster code ;)
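A quick standalone check (a sketch; print stands in for target:SetGroup, since target is not defined here) showing that split keeps multi-digit numbers intact:

local arguments = {"10,2,33"}
local group = split(arguments[1], ",")
for i = 1, #group do
    print(i, tonumber(group[i]))  -- prints: 1 10, 2 2, 3 33
end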
