Meaning of the **kwargs argument in the constructors of machine learning model classes

I am wondering about the meaning of the **kwargs argument that I often see added to the constructor of machine learning model classes. For example, consider a neural network in PyTorch:
class Model(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim, **kwargs):
Is **kwargs associated with extra parameters that are defined later?

This is not specific to machine learning model classes; it is a general Python feature.
You are indeed right: it corresponds to additional keyword arguments. It essentially collects the remaining named arguments passed in the call, those not declared in the function header, and gathers them into the dictionary variable kwargs. This variable can actually be given any name; it is simply customary to keep args for the iterable of extra positional arguments (*args) and kwargs for extra keyword arguments (**kwargs).
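For instance, here is a minimal sketch outside of any ML context (show_kwargs is just an illustrative name):
>>> def show_kwargs(a, **kwargs):
...     print(a, kwargs)
...
>>> show_kwargs(1, b=2, c=3)
1 {'b': 2, 'c': 3}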
This adds flexibility by allowing additional arguments to be defined and passed to the function without having to explicitly list their names in the header. One common use case is extending a class. Here we implement a dummy 3x3 2D convolution layer named Conv3x3, which extends the base nn.Conv2d module:
class Conv3x3(nn.Conv2d):
    def __init__(self, **kwargs):
        super().__init__(kernel_size=3, **kwargs)
As you can see, we didn't need to name all of the arguments, and we still keep the same interface as nn.Conv2d in our Conv3x3 initializer:
>>> Conv3x3(in_channels=3, out_channels=16)
Conv3x3(3, 16, kernel_size=(3, 3), stride=(1, 1))
There are a lot of nice things you can do with these two constructs.

**kwargs refers to arbitrary keyword arguments; so yes, it can be used for arguments that are added later.
See e.g. https://www.w3schools.com/python/gloss_python_function_arbitrary_keyword_arguments.asp

Related

wxMaxima: subindexed variables in functions work when written as "x_1" but not when written as "x[1]"

I'm having trouble defining a function in terms of variables with subindices. Using the makelist command I can create an unspecified function that depends on the subindexed variables x[1] and x[2]. However, when I try to give an expression to that function, wxMaxima does not allow it.
On the other hand, if I write the subindexed variables as x_1 and x_2 instead of x[1] and x[2], things do work.
What is the reason for this behavior? Aren't the two subindexing methods equivalent in terms of functions?
Only symbols can be declared function arguments. In particular, subscripted expressions are not symbols and therefore can't be function arguments.
WxMaxima displays symbols which end in a number, e.g., x_1, the same as subscripted expressions, e.g., x[1]. This is intended as a convenience, although it is confusing because it makes it difficult to distinguish the two.
You can see the internal form of an expression via ?print (note the question mark is part of the name). E.g., ?print(x_1); versus ?print(x[1]);.
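For example, a quick sketch of the difference when defining a function (f and g are arbitrary names here):
f(x_1, x_2) := x_1 + x_2;      /* accepted: x_1 and x_2 are ordinary symbols */
g(x[1], x[2]) := x[1] + x[2];  /* rejected: x[1] is a subscripted expression, not a symbol */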

Julia create array of functions with matching arguments to execute in a loop

1. It's easy to create an array of functions and execute them in a loop.
2. It's easy to provide the arguments, either in a corresponding array of the same length, or by making the array one of tuples (fn, arg).
For 2, the loop is just
for fn_ar in arr   # arr is [(myfunc, [1,2,3]), (func2, [10,11,12]), ...]
    fn_ar[1](fn_ar[2])
end
Here is the problem: the arguments I am using are arrays of very large arrays. In #2, the argument passed to the function will be the value the array had when the arg entry of the tuple was first created. What I need is to provide the array names as the arguments and defer evaluation until the corresponding function is run in the loop body.
I could provide the input arrays as an expression and eval the expression in the loop to supply the needed arguments. But eval only evaluates at global scope, so it can't see the loop's local scope.
What I did that worked (sort of) was to create a closure for each function that captured the arrays (which are really just a reference to storage). This works because the only argument to each function that varies in the loop body turns out to be the loop counter. The functions in question update the arrays in place. The array argument is really just a reference to the storage location, so each function executed in the loop body sees the latest values of the arrays. It worked. It wasn't hard to do. It is very, very slow. This is a known challenge in Julia.
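In rough terms, the closure approach looked like this (a sketch using names that appear further down, not the exact code):
# each entry captures the big arrays and only takes the layer index
ff_execstack[lr] = [
    hl -> affine!(dat.z[hl], dat.a[hl-1], nnw.theta[hl], nnw.bias[hl]),
    hl -> relu!(dat.a[hl], dat.z[hl]),
]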
I tried the recommended hints in the performance section of the manual: make sure the captured variables are typed before they are captured so the JIT knows what they are (no effect on performance), and put the definition of the curried function, together with the data for the closure, in a let block (also no effect). It's possible I implemented the hints incorrectly; I can provide a code fragment if it helps.
But, I'd rather just ask the question about what I am trying to do and not muddy the waters with my past effort, which might not be going down the right path.
Here is a small fragment that is more realistic than the above:
Just a couple of functions and arguments:
(affine!, "(dat.z[hl], dat.a[hl-1], nnw.theta[hl], nnw.bias[hl])")
(relu!, "(dat.a[hl], dat.z[hl])")
Of course, the arguments could be wrapped as an expression with Meta.parse. dat.z and dat.a are matrices used in machine learning. hl indexes the layer of the model for the linear result and non-linear activation.
A simplified version of the loop where I want to run through the stack of functions across the model layers:
function feedfwd!(dat::Union{Batch_view, Model_data}, nnw, hp, ff_execstack)
    for lr in 1:hp.n_layers
        for f in ff_execstack[lr]
            f(lr)
        end
    end
end
So, closures over the arrays are too slow, and I can't get eval to work.
Any suggestions...?
Thanks,
Lewis
I solved this with the beauty of function composition.
Here is the loop that runs through the feed forward functions for all layers:
for lr in 1:hp.n_layers
    for f in ff_execstack[lr]
        f(argfilt(dat, nnw, hp, bn, lr, f)...)
    end
end
The inner call to argfilt filters the generic list of all the inputs down to a tuple of just the arguments needed by the specific function. This also takes advantage of the beauty of method dispatch. Note that the function f is itself an input to argfilt. Functions have singleton types: each function has a unique type, as in typeof(relu!), for example. So, without any crazy if branching, method dispatch enables argfilt to return just the arguments needed. The performance cost compared to passing the arguments directly to a function is about 1.2 ns. This happens in a very hot loop that typically runs 24,000 times, so that is about 29 microseconds for the entire training pass.
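As a tiny self-contained illustration of dispatching on a function's type (these names are made up, not from my code):
f(x) = x + 1
g(x) = 2x

pick(fn::typeof(f)) = "arguments for f"
pick(fn::typeof(g)) = "arguments for g"

pick(f)   # "arguments for f"
pick(g)   # "arguments for g"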
The other great thing is that this runs in less than 1/10 of the time of the version using closures. I am getting slightly better performance than my original version that used some function variables and a bunch of if statements in the hot loop for feedfwd.
Here is what a couple of the methods for argfilt look like:
function argfilt(dat::Union{Model_data, Batch_view}, nnw::Wgts, hp::Hyper_parameters,
                 bn::Batch_norm_params, hl::Int, fn::typeof(affine!))
    (dat.z[hl], dat.a[hl-1], nnw.theta[hl], nnw.bias[hl])
end

function argfilt(dat::Union{Model_data, Batch_view}, nnw::Wgts, hp::Hyper_parameters,
                 bn::Batch_norm_params, hl::Int, fn::typeof(relu!))
    (dat.a[hl], dat.z[hl])
end
Background: I got here by reasoning that I could pass the same list of arguments to all of the functions: the union of all possible arguments--not that bad as there are only 9 args. Ignored arguments waste some space on the stack but it's teeny because for structs and arrays an argument is a pointer reference, not all of the data. The downside is that every one of these functions (around 20 or so) all need to have big argument lists. OK, but goofy: it doesn't make much sense when you look at the code of any of the functions. But, if I could filter down the arguments just to those needed, the function signatures don't need to change.
It's sort of a cool pattern. No introspection or eval needed; just functions.

Some questions about Bazel concepts

I am learning Bazel and am confused by many basic concepts.
load("//bazel/rules:build_tools.bzl", "build_tools_deps")
build_tools_deps()  # is build_tools_deps a macro or a rule?
load("@bazel_gazelle//:deps.bzl", "gazelle_dependencies")
gazelle_dependencies()  # what does the @ mean exactly? where is bazel_gazelle?
native.new_git_repository(...)  # what does native mean?
What definition is called a function? What definition is called a rule?
A macro is a regular Starlark function that wraps (and expands to) rules.
def my_macro(name = ..., ...):
    native.cc_library(...)
    android_library(...)
    native.genrule(...)
Think of macros as a way to chain and group several rules together, which allows you to pipe the output of some rules into the input of others. At this level, you don't think about how a rule is implemented, but what kinds of inputs and outputs they are associated with.
On the other hand, a rule's declaration is done using the rule() function. cc_library, android_library and genrule are all rules. The rule implementation is abstracted in a regular function that accepts a single parameter for the rule context (ctx).
my_rule = rule(
    attrs = { ... },
    implementation = _my_rule_impl,
)

def _my_rule_impl(ctx):
    outfile = ctx.actions.declare_file(...)
    ctx.actions.run(...)
    return [DefaultInfo(files = depset([outfile]))]
Think of actions as a way to chain and group several command lines together, which works at the level of individual files and running your executables to transform them (ctx.actions.run with executable, args, inputs and outputs arguments). Within a rule implementation, you can extract information from rule attributes (ctx.attr), or from dependencies through providers (e.g. ctx.attr.deps[0][DefaultInfo].files).
Note that rules can only be called in BUILD files, not WORKSPACE files.
@ is the notation for a repository namespace. @bazel_gazelle is an external repository fetched in the WORKSPACE by a repository rule (not a regular rule), typically http_archive or git_repository. This repository rule can also be called from a macro, like my_macro above or build_tools_deps in your example.
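As a sketch of what fetching it typically looks like in a WORKSPACE file (the URL and checksum below are placeholders, not real values):
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "bazel_gazelle",
    sha256 = "0000000000000000000000000000000000000000000000000000000000000000",  # placeholder
    urls = ["https://example.com/bazel-gazelle.tar.gz"],  # placeholder
)

load("@bazel_gazelle//:deps.bzl", "gazelle_dependencies")
gazelle_dependencies()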
native.<rule name> means that the rule is implemented in Java within Bazel and built into the binary, and not in Starlark.

Lua Terminology related to OOP

To get to the point: I've done Lua for a while, but never quite got the terminology down to specifics, so I've been Googling for hours and haven't come up with a definitive answer.
Related to OOP in Lua, the terminology used include:
Object
Class
Function
Method
Table
The question is, when are these properly used? Such as in the example below:
addon = { }
function addon:test_func( )
    return 'hi'
end
Q: From my understanding with Lua and OOP, addon is a table, however, I've read that it can be an object as well -- but when it is technically an object? After a function is created within that table?
Q: test_func is a function, however, I've read that it becomes a "Method" when it's placed within a table (class).
Q: The entire line addon:test_func( ), I know the colon is an operator, but what is the term for the entire line set of text? A class itself?
Finally, for this example code:
function addon:test_func( id, name )
end
Q: What is id and name, because I've seen some people identify them as arguments, but then other areas classify them as parameters, so I've stuck with parameters.
So in short, what is the proper terminology for each of these, and when do they become what they are?
Thanks
From my understanding with Lua and OOP, addon is a table, however, I've read that it can be an object as well -- but when it is technically an object? After a function is created within that table?
Object is not a well-defined term. I've seen it defined (in C) as any value whatsoever. In Lua, I would consider it synonymous with a table. You could also define it as an instance of a class.
test_func is a function, however, I've read that it becomes a "Method" when it's placed within a table (class).
You're basically right. A method is any function that is intended to be called with the colon notation. Metamethods are also methods, because, like regular methods, they define the behavior of tables.
The entire line addon:test_func( ), I know the colon is an operator, but what is the term for the entire line set of text? A class itself?
There's no name for that particular piece of code. It's just part of a method definition.
Also, I wouldn't call the colon an operator. An operator would be the plus in x + y where x and y both mean something by themselves. In addon:test_func(), test_func only has meaning inside the table addon, and it's only valid to use the colon when calling or defining methods. The colon is actually a form of syntactic sugar where the real operator is the indexing operator: []. Assuming that you're calling the method, the expansion would be: addon['test_func'](addon).
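To make the sugar concrete (continuing with your addon table):
function addon:test_func()   -- sugar for: function addon.test_func(self)
    return self
end

addon:test_func()            -- sugar for: addon.test_func(addon), i.e. addon['test_func'](addon)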
What is id and name, because I've seen some people identify them as arguments, but then other areas classify them as parameters, so I've stuck with parameters.
They're parameters. Parameters are the names that you declare in the function signature. Arguments are the values that you pass to a function.
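In your second example (with made-up values for the call):
function addon:test_func( id, name )   -- id and name (plus the implicit self) are parameters
end

addon:test_func( 42, 'player' )        -- 42 and 'player' are arguments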

Dependency injection OR configuration object?

I have the following constructor for my Class
public MyClass(File f1, File f2, File f3, Class1 c1, Class2 c2, Class3 c3)
{
..........
}
As can be seen, it has 6 parameters. On seeing this code, one of my seniors said that instead of passing 6 parameters I should rather pass a configuration object.
I wrote the code this way because I recently read about "dependency injection", which says "classes must ask for what they want". So I think that passing a configuration object would go against that principle.
Is my interpretation of "Dependency injection" correct? OR Should I take my senior's advice?
"Configuration object" is an obtuse term to apply in this situation; it frames your efforts in a purely mechanical sense. The goal is to communicate your intent to the class's consumer; let's refactor toward that.
Methods or constructors with numerous parameters indicate a loose relationship between them. The consumer generally has to make more inferences to understand the API. What is special about these 3 files together with these 3 classes? That is the information not being communicated.
This is an opportunity to create a more meaningful and intention-revealing interface by extracting an explicit concept from an implicit one. For example, if the 3 files are related because of a user, a UserFileSet parameter would clearly express that. Perhaps f1 is related to c1, f2 to c2, and f3 to c3. Declaring those associations as independent classes would halve the parameter count and increase the amount of information that can be derived from your API.
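A rough sketch of that idea (UserFileSet and its field names are assumptions about your domain, not taken from your code):
import java.io.File;

// Hypothetical type that names the relationship between the three files.
public final class UserFileSet {
    private final File first;
    private final File second;
    private final File third;

    public UserFileSet(File first, File second, File third) {
        this.first = first;
        this.second = second;
        this.third = third;
    }

    public File first()  { return first; }
    public File second() { return second; }
    public File third()  { return third; }
}
The constructor would then become MyClass(UserFileSet files, Class1 c1, Class2 c2, Class3 c3), which tells the consumer that the three files travel together.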
Ultimately, the refactoring will be highly dependent on your problem domain. Don't assume you should create a single object to fulfill a parameter list; try to refactor along the contours of the relationships between the parameters. This will always yield code which reflects the problem it solves more than the language used to solve it.
I don't think using a configuration object contradicts the dependency injection pattern. It is more about the form in which you inject your dependencies, and a general question of whether it's better to have a function (in this case the constructor) that takes 20 parameters, or to combine those parameters into a class so that they are bundled together.
You are still free to use dependency injection, i.e. construct the configuration object by some factory or a container and inject it into the constructor when creating an instance of your class. Whether or not that's a good idea also depends on the particular case; there are no silver bullets ;)

Resources