How to pass multiple arguments to dask.distributed.Client().map? - dask

import dask.distributed
def f(x, y):
    return x, y
client = dask.distributed.Client()
client.map(f, [(1, 2), (2, 3)])
This does not work; the futures are created, but each task fails when it runs:
[<Future: status: pending, key: f-137239e2f6eafbe900c0087f550bc0ca>,
<Future: status: pending, key: f-64f918a0c730c63955da91694fcf7acc>]
distributed.worker - WARNING - Compute Failed
Function: f
args: ((1, 2))
kwargs: {}
Exception: TypeError("f() missing 1 required positional argument: 'y'",)
distributed.worker - WARNING - Compute Failed
Function: f
args: ((2, 3))
kwargs: {}
Exception: TypeError("f() missing 1 required positional argument: 'y'",)

You do not quite have the signature right - perhaps the docs are not clear on this (suggestions welcome). Client.map() takes a variable number of iterables, one per positional argument of the function, not a single iterable of argument tuples. You should phrase this as
client.map(f, (1, 2), (2, 3))
or, if you wanted to stay closer to your original pattern
client.map(f, *[(1, 2), (2, 3)])
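Be careful with that starred form, though: unpacking a list of per-task tuples only produces the intended calls when the argument "matrix" happens to be symmetric, as it is in this example. In general, transpose the pairs with zip first (a small sketch; pairs is just an illustrative name):
pairs = [(1, 2), (3, 4)]
# zip(*pairs) transposes [(x1, y1), (x2, y2)] into (x1, x2), (y1, y2),
# the per-argument layout client.map expects; this yields the calls
# f(1, 2) and f(3, 4).
futures = client.map(f, *zip(*pairs))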

Ok, the documentation is definitely a bit confusing on this one. And I couldn't find an example that clearly demonstrated this problem. So let me break it down below:
def test_fn(a, b, c, d, **kwargs):
    return a + b + c + d + kwargs["special"]
futures = client.map(test_fn, *[[1, 2, 3, 4], (1, 2, 3, 4), (1, 2, 3, 4), (1, 2, 3, 4)], special=100)
output = [f.result() for f in futures]
# output = [104, 108, 112, 116]
futures = client.map(test_fn, [1, 2, 3, 4], (1, 2, 3, 4), (1, 2, 3, 4), (1, 2, 3, 4), special=100)
output = [f.result() for f in futures]
# output = [104, 108, 112, 116]
Things to note:
Doesn't matter if you use lists or tuples. And like I did above, you can mix them.
You have to group arguments by their position. So if you're passing in 4 sets of arguments, the first list will contain the first argument from all 4 sets. (In this case, the "first" call to test_fn gets a=b=c=d=1.)
Extra **kwargs (like special) are passed through to the function. But it'll be the same value for all function calls.
Now that I think about it, this isn't that surprising. I think it's just following Python's concurrent.futures.ProcessPoolExecutor.map() signature.
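For comparison, the standard-library executors use the same convention - each iterable feeds one positional argument. A quick sketch with an illustrative add function:
from concurrent.futures import ProcessPoolExecutor

def add(a, b):
    return a + b

if __name__ == "__main__":
    with ProcessPoolExecutor() as ex:
        # Runs add(1, 10), add(2, 20), add(3, 30)
        results = list(ex.map(add, [1, 2, 3], [10, 20, 30]))
    # results == [11, 22, 33]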
PS. Note that even though the documentation says "Returns: List, iterator, or Queue of futures, depending on the type of the inputs.", you can actually get this error: "Dask no longer supports mapping over Iterators or Queues. Consider using a normal for loop and Client.submit".

Related

Why does torch.nn.Conv2d give different results for '(n, n)' and 'n' arguments?

input = torch.randn(8, 3, 50, 100)
m = nn.Conv2d(3, 3, kernel_size=(3, 3), padding=(1, 1))
m2 = nn.Conv2d(3, 3, kernel_size=3, padding=1)
output = m(input)
output2 = m2(input)
torch.equal(output, output2)  # False
I suppose the m and m2 Conv2d layers above should produce exactly the same output values, but in practice they do not. What is the reason?
You have initialized two nn.Conv2d layers with identical settings, that's true. The initialization of the weights, however, is done randomly! You have here two different layers m and m2. Namely, m.weight and m2.weight have different components, and the same goes for m.bias and m2.bias.
One way to get the same results is to copy the underlying parameters of the model:
>>> m.weight = m2.weight
>>> m.bias = m2.bias
Which, of course, results in torch.equal(m(input), m2(input)) being True.
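Equivalently, you can copy everything in one go with the standard nn.Module state-dict API:
>>> m.load_state_dict(m2.state_dict())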
The "problem" here isn't related to int vs tuple. In fact, if you print m and m2 you'll see
>>> m
Conv2d(3, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
>>> m2
Conv2d(3, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
that the integer got expanded as the documentation promises.
What actually differs is the initial weights, which are drawn randomly. You can view them via m.weight and m2.weight. These will differ every time you create a new Conv2d, even if you use the same arguments.
You can initialize the weights if you want to play around with these objects in a predictable way, see
How to initialize weights in PyTorch?
e.g.
m.weight.data.fill_(0.01)
m2.weight.data.fill_(0.01)
m.bias.data.fill_(0.1)
m2.bias.data.fill_(0.1)
and they should now be identical.
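Another option is to seed the global random number generator before constructing each layer, so both draw identical initial weights - a small sketch:
torch.manual_seed(0)
m = nn.Conv2d(3, 3, kernel_size=(3, 3), padding=(1, 1))
torch.manual_seed(0)
m2 = nn.Conv2d(3, 3, kernel_size=3, padding=1)
# Both layers now start from the same random draws, so
# torch.equal(m(input), m2(input)) holds.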

arguments and function call of LSTM in pytorch

Could anyone please explain the code below to me:
import torch
import torch.nn as nn
input = torch.randn(5, 3, 10)
h0 = torch.randn(2, 3, 20)
c0 = torch.randn(2, 3, 20)
rnn = nn.LSTM(10,20,2)
output, (hn, cn) = rnn(input, (h0, c0))
print(input)
When calling rnn(input, (h0, c0)) we pass the arguments h0 and c0 in parentheses. What is that supposed to mean? If (h0, c0) represents a single value, then what is that value, and what is the third argument passed here?
However, in the line rnn = nn.LSTM(10,20,2) we pass the arguments to nn.LSTM without such parentheses.
Can anyone explain how this function call works?
The assignment rnn = nn.LSTM(10, 20, 2) instantiates a new nn.Module using the nn.LSTM class. Its first three arguments are input_size (here 10), hidden_size (here 20), and num_layers (here 2).
On the other hand, rnn(input, (h0, c0)) corresponds to actually calling the class instance, i.e. running its __call__ method, which is roughly equivalent to the forward function of that module. The __call__ method of nn.LSTM takes two arguments: input (shaped (sequence_length, batch_size, input_size)) and a tuple of two tensors (h_0, c_0), both shaped (num_layers, batch_size, hidden_size) in the basic use case of nn.LSTM.
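For the example above, the shapes therefore work out as follows:
>>> output.shape  # (sequence_length, batch_size, hidden_size)
torch.Size([5, 3, 20])
>>> hn.shape      # (num_layers, batch_size, hidden_size)
torch.Size([2, 3, 20])
>>> cn.shape
torch.Size([2, 3, 20])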
Please refer to the PyTorch documentation whenever using its built-ins; there you will find the exact definition of the parameter list (the arguments used to initialize the class instance), as well as the input/output specifications (for when you run inference with that module).
You might be confused with the notation, here's a small example that could help:
tuple as input:
def fn1(x, p):
    a, b = p  # unpack input
    return a*x + b
>>> fn1(2, (3, 1))
7
tuple as output
def fn2(x):
    return x, (3*x, x**2)  # the output is a tuple of an int and a tuple
>>> fn2(2)
(2, (6, 4))
>>> x, (a, b) = fn2(2)  # unpacking
>>> x, a, b
(2, 6, 4)

Grouping dask.bag items into distinct partitions

I was wondering if somebody could help me understand the way Bag objects handle partitions. Put simply, I am trying to group items currently in a Bag so that each group is in its own partition. What's confusing me is that the Bag.groupby() method asks for a number of partitions. Shouldn't this be implied by the grouping function? E.g., two partitions if the grouping function returns a boolean?
>>> a = dask.bag.from_sequence(range(20), npartitions = 1)
>>> a.npartitions
1
>>> b = a.groupby(lambda x: x % 2 == 0)
>>> b.npartitions
1
I'm obviously missing something here. Is there a way to group Bag items into separate partitions?
Dask bag may put several groups within one partition.
In [1]: import dask.bag as db
In [2]: b = db.range(10, npartitions=3).groupby(lambda x: x % 5)
In [3]: partitions = b.to_delayed()
In [4]: partitions
Out[4]:
[Delayed(('groupby-collect-f00b0aed94fd394a3c61602f5c3a4d42', 0)),
Delayed(('groupby-collect-f00b0aed94fd394a3c61602f5c3a4d42', 1)),
Delayed(('groupby-collect-f00b0aed94fd394a3c61602f5c3a4d42', 2))]
In [5]: for part in partitions:
...: print(part.compute())
...:
[(0, [0, 5]), (3, [3, 8])]
[(1, [1, 6]), (4, [4, 9])]
[(2, [2, 7])]
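If you want to nudge each group into its own partition, you can pass npartitions to groupby itself. A sketch - note that placement is hash-based, so distinct groups can still share a partition, as noted above; requesting one partition per expected group raises the odds of a one-to-one layout but does not guarantee it:
import dask.bag as db

# Request as many output partitions as there are expected groups.
b = db.range(10, npartitions=3).groupby(lambda x: x % 2, npartitions=2)
for part in b.to_delayed():
    print(part.compute())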

How to compute the mean over rows till a variable changes and repeat?

Given a very huge table of the following format (e.g. snippet):
Subject, Condition, VPH, Task, Round, Item, Decision, Self, Other, RT
1, 1, 1, SVO, 0, 0, 4, 2.5, 2.0, 8.598
1, 1, 1, SVO, 1, 5, 3, 4.1, 3.4, 7.785
1, 1, 1, SVO, 2, 4, 3, 3.2, 3.4, 15.713
2, 2, 1, SVO, 0, 0, 4, 2.5, 2.0, 15.439
2, 2, 1, SVO, 1, 2, 7, 4.9, 2.3, 30.777
2, 2, 1, SVO, 2, 3, 8, 4.3, 4.3, 13.549
3, 3, 1, SVO, 0, 0, 5, 2.8, 1.5, 9.066
... (And so on)
Needed: compute the mean of Self and Other over all rounds for each subject.
What I have so far:
I sorted the roughly 100 MB .txt file using bash sort, so each subject and its related rounds appear after each other (as the example shows). After that I imported the .txt file into SPSS 24. Right now I have no idea how to write a function that computes, for each subject, the mean of the variables Self and Other over the three rounds. E.g. (some pseudo-code):
for n = 1 to last_subject do:
    get the Self values of all rows where Subject is n
    compute the mean over these values
    write the result as a new variable Self_mean after variable RT at line n
    increase n by one
As I am totally new to SPSS, I would really appreciate detailed help. I would also be happy with references that specifically address computation over rows (I found lots of material about columns).
Thank you very much!
Edit: example output
After computing the table should look like this:
Subject, Mean_Self, Mean_Others
1, 3.27, 2.9
2, ..., ...
3,
... (And so on)
So now we computed the Mean_Self from the top example like so:
mean(2.5, 4.1, 3.2) = (2.5 + 4.1 + 3.2) / 3 ≈ 3.27
where:
2.5 was used from line 1 of Variable Self
4.1 was used from line 2 of Variable Self
3.2 was used from line 3 of Variable Self
2.5 from line 4 of Variable Self was not used, because Variable Subject changed; therefore we repeat the process with the new Subject (here 2) until it changes again. The results should form a table like the one above. Same procedure for Variable Other.
If I understand right, what you need is the AGGREGATE command. AGGREGATE can create a new dataset/file with your aggregated data, or add the aggregated data to your active dataset, as you described above:
AGGREGATE
/OUTFILE=* MODE=ADDVARIABLES
/BREAK=Subject
/Self_mean=MEAN(Self)
/Other_mean=MEAN(Other).
In order to get the new variables in a new, separate table, look up the other AGGREGATE options: e.g. /OUTFILE=* (removing MODE=ADDVARIABLES) will result in the new aggregated data replacing the original file in the window, while /OUTFILE="path/filename" will save the aggregated data to a file.
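For readers doing this outside SPSS, the same aggregation is straightforward in Python with pandas - a sketch, assuming the snippet above is saved as data.txt with a header row (the file name is illustrative):
import pandas as pd

df = pd.read_csv("data.txt", skipinitialspace=True)
# Average Self and Other within each Subject group.
means = df.groupby("Subject", as_index=False)[["Self", "Other"]].mean()
means = means.rename(columns={"Self": "Mean_Self", "Other": "Mean_Others"})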

torch7: Setting Variable Learning Rates for Different Conv-net Layers

I am trying to fine-tune a conv-net. It has the following structure (adapted from OverFeat):
net:add(SpatialConvolution(3, 96, 7, 7, 2, 2))
net:add(nn.ReLU(true))
net:add(SpatialMaxPooling(3, 3, 3, 3))
net:add(SpatialConvolutionMM(96, 256, 7, 7, 1, 1))
net:add(nn.ReLU(true))
net:add(SpatialMaxPooling(2, 2, 2, 2))
net:add(SpatialConvolutionMM(256, 512, 3, 3, 1, 1, 1, 1))
net:add(nn.ReLU(true))
net:add(SpatialConvolutionMM(512, 512, 3, 3, 1, 1, 1, 1))
net:add(nn.ReLU(true))
net:add(SpatialConvolutionMM(512, 1024, 3, 3, 1, 1, 1, 1))
net:add(nn.ReLU(true))
net:add(SpatialConvolutionMM(1024, 1024, 3, 3, 1, 1, 1, 1))
net:add(nn.ReLU(true))
net:add(SpatialMaxPooling(3, 3, 3, 3))
net:add(SpatialConvolutionMM(1024, 4096, 5, 5, 1, 1))
net:add(nn.ReLU(true))
net:add(SpatialConvolutionMM(4096, 4096, 1, 1, 1, 1))
net:add(nn.ReLU(true))
net:add(SpatialConvolutionMM(4096, total_classes, 1, 1, 1, 1))
net:add(nn.View(total_classes))
net:add(nn.LogSoftMax())
And I'm using SGD as the optimization method with the following parameters:
optimState = {
    learningRate = 1e-3,
    weightDecay = 0,
    momentum = 0,
    learningRateDecay = 1e-7
}
optimMethod = optim.sgd
I am training it as follows:
optimMethod(feval, parameters, optimState)
where:
-- 'feval' is the function with the forward and backward passes on the current batch
parameters, gradParameters = net:getParameters()
From my references, I have learned that while fine-tuning a pre-trained network, it is recommended that the lower (convolutional) layers should have lower learning rates and the higher layers should have relatively higher learning rates.
I referred to torch7's documentation of optim/sgd to set different learning rates for each layer. From there, I gather that by setting config.learningRates, i.e. a vector of individual learning rates, I can achieve what I want. I am new to Torch, so please pardon me if this seems like a silly question, but it would be really helpful if someone could explain how and where to create/use this vector to serve my purpose.
Thanks in advance.
I don't know if you still need an answer, as you posted this question one year ago.
Anyway, just in case someone sees this, I've written a post here about how to set different learning rates for different layers in torch.
The solution is to use net:parameters() instead of net:getParameters(). Instead of returning two tensors, it returns two tables of tensors, containing the parameters (and the gradParameters) for each layer in separate tensors.
In this way, you can run an sgd() step (with a different learning rate) for each layer. You can find the full code by clicking the above link.
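For what it's worth, modern PyTorch exposes this pattern directly: torch.optim optimizers accept per-parameter-group options, so per-layer learning rates need no manual loop. A sketch in PyTorch, not the torch7 API the answer describes:
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 96, 7, stride=2),  # pre-trained, lower learning rate
    nn.ReLU(),
    nn.Conv2d(96, 256, 7),          # fine-tuned, higher learning rate
    nn.ReLU(),
)
optimizer = torch.optim.SGD([
    {"params": model[0].parameters(), "lr": 1e-4},
    {"params": model[2].parameters(), "lr": 1e-3},
], momentum=0.9)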

Resources