Accessing ghosted chunks with dask

Using dask, I would like to break up an image array into overlapping tiles, perform a computation (on all the tiles simultaneously), and then stitch the results back into an image.
The following works, but feels clumsy:
from dask import array as da
from dask.array import ghost
import numpy as np
test_data = np.random.random((50, 50))
x = da.from_array(test_data, chunks=(10, 10))
depth = {0: 1, 1: 1}
g = ghost.ghost(x, depth=depth, boundary='reflect')
# Calculate the shape of the array in terms of chunks
chunk_shape = [len(c) for c in g.chunks]
chunk_nr = np.prod(chunk_shape)
# Allocate a list for results (as many entries as there are chunks)
blocks = [None,] * chunk_nr
def pack_block(block, block_id):
    """Store `block` at the correct position in `blocks`,
    according to its `block_id`.
    E.g., with ``block_id == (0, 3)``, the block will be stored at
    ``blocks[3]``.
    """
    idx = np.ravel_multi_index(block_id, chunk_shape)
    blocks[idx] = block
    # We don't really need to return anything, but this will do
    return block
# Run `pack_block` on every chunk so that `blocks` gets filled
g.map_blocks(pack_block).compute()
# Do some operation on the blocks; this is an over-simplified example.
# Typically, I want to do an operation that considers *all*
# blocks simultaneously, hence the need to first unpack into a list.
blocks = [b**2 for b in blocks]
def retrieve_block(_, block_id):
    """Fetch the correct block from the results set, `blocks`."""
    idx = np.ravel_multi_index(block_id, chunk_shape)
    return blocks[idx]
result = g.map_blocks(retrieve_block)
# Slice off excess from each computed chunk
result = ghost.trim_internal(result, depth)
result = result.compute()
Is there a cleaner way to achieve the same end result?

The user-facing API for this is the map_overlap method:
>>> x = np.array([1, 1, 2, 3, 3, 3, 2, 1, 1])
>>> x = da.from_array(x, chunks=5)
>>> def derivative(x):
...     return x - np.roll(x, 1)
>>> y = x.map_overlap(derivative, depth=1, boundary=0)
>>> y.compute()
array([ 1, 0, 1, 1, 0, 0, -1, -1, 0])
Two additional notes for your use case:
1. Avoid hashing costs by supplying name=False to from_array. This skips hashing the data, which runs at about 400MB/s assuming you don't have any fancy hashing libraries around.
x = da.from_array(x, name=False)
2. Be careful when computing in place. Dask doesn't guarantee correct behavior if user functions mutate data in place. In this particular case it's probably fine, since we're copying for ghosting anyway, but it's something to be aware of.
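For the two-dimensional image in the question, map_overlap also accepts a per-axis depth. The following is only a minimal sketch under stated assumptions: neighbour_sum is a hypothetical stand-in for the real per-tile computation, and the installed dask version is assumed to provide Array.map_overlap with the depth and boundary keyword arguments.

```python
import numpy as np
import dask.array as da

def neighbour_sum(block):
    """Sum of the four direct neighbours of every pixel (hypothetical example).

    np.roll wraps around at the block edges, so the outermost ring of the
    returned block is wrong; map_overlap trims that ring away (depth=1).
    """
    return (np.roll(block, 1, axis=0) + np.roll(block, -1, axis=0)
            + np.roll(block, 1, axis=1) + np.roll(block, -1, axis=1))

test_data = np.random.random((50, 50))
x = da.from_array(test_data, chunks=(10, 10))

# Each tile gets a halo of 1 pixel per side; the array edges are reflected
result = x.map_overlap(neighbour_sum, depth={0: 1, 1: 1}, boundary='reflect')
result = result.compute()
print(result.shape)  # (50, 50)
```

The function receives each tile with its halo attached and must return a block of the same shape; the halo is sliced off automatically afterwards, so no explicit call to a trimming helper is needed.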
Second answer
Given the comment by @stefan-van-der-walt, we'll try another solution.
Consider using the .to_delayed() method to get an array of chunks as dask.delayed objects:
depth = {0: 1, 1: 1}
g = ghost.ghost(x, depth=depth, boundary='reflect')
blocks = g.to_delayed()
This gives you a numpy array of dask.delayed objects, each of which points to a block. You can now perform arbitrary parallel computations on these blocks. If I wanted them all to arrive at the same function then I might call the following:
result = dask.delayed(f)(blocks.tolist())
The function f will then get a list of lists of numpy arrays, each of which corresponds to one block in the dask.array g.
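As a small illustration of what f receives, here is a sketch on a tiny array without ghosting. The function body is hypothetical (it only sums each block); any computation that needs all blocks at once could take its place.

```python
import numpy as np
import dask
import dask.array as da

x = da.from_array(np.arange(16.0).reshape(4, 4), chunks=(2, 2))
blocks = x.to_delayed()  # 2x2 numpy object array of Delayed blocks

def f(block_grid):
    """Hypothetical function that sees every block at once.

    block_grid is a list of lists of numpy arrays; here we simply
    return the sum of each block, keeping the grid layout.
    """
    return [[float(b.sum()) for b in row] for row in block_grid]

result = dask.delayed(f)(blocks.tolist()).compute()
print(result)  # [[10.0, 18.0], [42.0, 50.0]]
```

Because all blocks are passed to a single call, f itself runs as one task; only the loading of the blocks is parallel.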


How to take a derivative of one of the outputs of a neural network (involving batched inputs) with respect to inputs?

I am solving a PDE using a neural network. My neural network is as follows:
def f(params, inputs):
    for w, b in params:
        outputs = jnp.dot(inputs, w) + b
        inputs = jnn.swish(outputs)
    return outputs
The layer architecture of the network is [1, 5, 2], so I have one input neuron and two output neurons. If I pass a batch of 10 inputs, I should get a (10, 2) array as output. Let the output neurons be called 'p' and 'q' respectively. How do I find dp/dx and dq/dx? I don't want to pick values out of Jacobians and Hessians; I want something more explicit, like this:
p = lambda inputs: f(params, inputs)[:,0].reshape(-1,1)
q = lambda inputs: f(params, inputs)[:,1].reshape(-1,1)
p_x = lambda inputs: vmap(jacfwd(p,argnums=0))(inputs)
q_x = lambda inputs: vmap(jacfwd(q,argnums=0))(inputs)
k_p_x = lambda inputs: kappa(inputs).reshape(-1,1) * p_x(inputs)
##And other calculations proceed..
When I execute p(inputs) it's working as expected (as it should), but as soon as I execute p_x(inputs) I am getting an error: IndexError: Too many indices for array: 2 non-None/Ellipsis indices for dim 1.
How do I get around this?
The reason you are seeing an index error is that your p function expects a two-dimensional input, and when you wrap it in vmap it means you are effectively passing the function a single one-dimensional row at a time.
You can fix this by changing your function so that it accepts a one-dimensional input, and then use vmap as appropriate to compute the batched result.
Here is a complete example with the modified versions of your functions:
import jax
import jax.numpy as jnp
from jax import nn as jnn
from jax import vmap, jacfwd
def f(params, inputs):
    for w, b in params:
        outputs = jnp.dot(inputs, w) + b
        inputs = jnn.swish(outputs)
    return outputs
# Some example inputs and parameters
inputs_x = jnp.ones((10, 1))
params = [
    (jnp.ones((1, 5)), 1),
    (jnp.ones((5, 2)), 1)
]
inputs = jnp.arange(10.0).reshape(10, 1)
# p and q map a length-1 input to a length-1 output
p = lambda inputs: f(params, inputs)[0].reshape(1)
q = lambda inputs: f(params, inputs)[1].reshape(1)
p_batched = vmap(p)
q_batched = vmap(q)
p_x = lambda inputs: vmap(jacfwd(p,argnums=0))(inputs)
q_x = lambda inputs: vmap(jacfwd(q,argnums=0))(inputs)
print(p_batched(inputs).shape)
# (10, 1)
print(q_batched(inputs).shape)
# (10, 1)
# Note: since p and q map a size 1 input to size-1 output,
# p_x and q_x compute a sequence of 10 1x1 jacobians.
print(p_x(inputs).shape)
# (10, 1, 1)
print(q_x(inputs).shape)
# (10, 1, 1)
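If the trailing 1x1 Jacobian dimensions are unwanted, one possible alternative (not part of the answer above, so treat it as a sketch) is to make p and q scalar-to-scalar functions and use jax.grad instead of jacfwd. The names p_scalar, q_scalar, dp_dx and dq_dx are hypothetical, and the parameters are the same illustrative ones as in the example.

```python
import jax
import jax.numpy as jnp
from jax import nn as jnn

def f(params, inputs):
    # Same network as above, applied to a single sample of shape (1,)
    for w, b in params:
        outputs = jnp.dot(inputs, w) + b
        inputs = jnn.swish(outputs)
    return outputs

params = [(jnp.ones((1, 5)), 1.0), (jnp.ones((5, 2)), 1.0)]

# Scalar-in, scalar-out versions of the two network outputs
p_scalar = lambda x: f(params, jnp.reshape(x, (1,)))[0]
q_scalar = lambda x: f(params, jnp.reshape(x, (1,)))[1]

# grad differentiates a single sample; vmap maps it over the batch
dp_dx = jax.vmap(jax.grad(p_scalar))
dq_dx = jax.vmap(jax.grad(q_scalar))

xs = jnp.arange(10.0)   # batch of 10 scalar inputs
print(dp_dx(xs).shape)  # (10,)
print(dq_dx(xs).shape)  # (10,)
```

The result can be reshaped to (10, 1) with .reshape(-1, 1) if later calculations expect a column.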

Z3 - how to count matches?

I have a finite set of pairs of type (int a, int b). The exact values of the pairs are explicitly present in the knowledge base. For example it could be represented by a function (int a, int b) -> (bool exists) which is fully defined on a finite domain.
I would like to write a function f with signature (int b) -> (int count), representing the number of pairs containing the specified b value as its second member. I would like to do this in z3 python, though it would also be useful to know how to do this in the z3 language
For example, my pairs could be:
(0, 0)
(0, 1)
(1, 1)
(1, 2)
(2, 1)
then f(0) = 1, f(1) = 3, f(2) = 1
This is a bit of an odd thing to do in z3: If the exact values of the pairs are in your knowledge base, then why do you need an SMT solver? You can just search and count using your regular programming techniques, whichever language you are in.
But perhaps you have some other constraints that come into play, and want a generic answer. Here's how one would code this problem in z3py:
from z3 import *
pairs = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 1)]
def count(snd):
    return sum([If(snd == p[1], 1, 0) for p in pairs])
s = Solver()
searchFor = Int('searchFor')
result = Int('result')
s.add(Or(*[searchFor == d[0] for d in pairs]))
s.add(result == count(searchFor))
while s.check() == sat:
    m = s.model()
    print("f(" + str(m[searchFor]) + ") = " + str(m[result]))
    s.add(searchFor != m[searchFor])
When run, this prints:
f(0) = 1
f(1) = 3
f(2) = 1
as you predicted.
Again, if your pairs are exactly known (i.e., they are concrete numbers), don't use z3 for this problem: simply write a program to count as needed. If the database values, however, are not necessarily concrete but have other constraints, then the above would be the way to go.
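For the concrete case, a plain-Python sketch of "a program to count as needed" could look like the following. It uses only the standard library; the name counts is an illustrative choice.

```python
from collections import Counter

pairs = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 1)]

# Count how often each value occurs as the second member of a pair
counts = Counter(b for _, b in pairs)

def f(b):
    """Number of pairs whose second member equals b (0 if none)."""
    return counts[b]

for b in sorted(counts):
    print(f"f({b}) = {f(b)}")
# f(0) = 1
# f(1) = 3
# f(2) = 1
```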
To find out how this is coded in SMTLib (the native language z3 speaks), you can insert print(s.sexpr()) in the program before the while loop starts. That's one way. Of course, if you were writing this by hand, you might want to code it differently in SMTLib; but I'd strongly recommend sticking to higher-level languages instead of SMTLib as it tends to be hard to read/write for anyone except machines.

Map Dask bincount over 2d array columns

I am trying to use bincount over a 2D array. Specifically I have this code:
import numpy as np
import dask.array as da
def dask_bincount(weights, x):
    da.bincount(x, weights)
idx = da.random.random_integers(0, 1024, 1000)
weight = da.random.random((1000, 2))
bin_count = da.apply_along_axis(dask_bincount, 1, weight, idx)
The idea is that the bincount can be made with the same idx array on each one of the weight columns. That would return an array of size (np.amax(x) + 1, 2) if I am correct.
However when doing this I get this error message:
AttributeError Traceback (most recent call last)
<ipython-input-17-5b8eed89ad32> in <module>
----> 1 bin_count = da.apply_along_axis(dask_bincount, 1, weight, idx)
~/.local/lib/python3.9/site-packages/dask/array/ in apply_along_axis(func1d, axis, arr, dtype, shape, *args, **kwargs)
454 if shape is None or dtype is None:
455 test_data = np.ones((1,), dtype=arr.dtype)
--> 456 test_result = np.array(func1d(test_data, *args, **kwargs))
457 if shape is None:
458 shape = test_result.shape
<ipython-input-14-34fd0eb9b775> in dask_bincount(weights, x)
1 def dask_bincount(weights, x):
----> 2 da.bincount(x, weights)
~/.local/lib/python3.9/site-packages/dask/array/ in bincount(x, weights, minlength, split_every)
670 raise ValueError("Input array must be one dimensional. Try using x.ravel()")
671 if weights is not None:
--> 672 if weights.chunks != x.chunks:
673 raise ValueError("Chunks of input array x and weights must match.")
AttributeError: 'numpy.ndarray' object has no attribute 'chunks'
I thought that when dask arrays are created the library automatically assigns them chunks, so the error does not tell me much. How can I fix this?
I made a script that does it in numpy with map.
idx_np = np.random.randint(0, 1024, 1000)
weight_np = np.random.random((1000,2))
f = lambda y: np.bincount(idx_np, weight_np[:,y])
result = map(f, [i for i in range(2)])
array([[0.9885341 , 0.9977873 , 0.24937023, ..., 0.31024526, 1.40754883,
[1.77406303, 0.84787723, 0.14591474, ..., 0.54584068, 0.38357015,
I would like to do the same, but with dask.
There are multiple problems at play.
Weights should be (2, 1000)
You discover this by trying to write the same function in numpy using apply_along_axis.
idx_np = np.random.random_integers(0, 1024, 1000)
weight_np = np.random.random((2, 1000)) # <- transposed
# This gives the same result as the code you provided
np.apply_along_axis(lambda weight, idx: np.bincount(idx, weight), 1, weight_np, idx_np)
da.apply_along_axis applies the function to numpy arrays
You're getting the error
AttributeError: 'numpy.ndarray' object has no attribute 'chunks'
This suggests that what makes it into the da.bincount method is actually a numpy array. The fact is that da.apply_along_axis actually takes each row of weight and sends it to the function as a numpy array.
Your function should therefore actually be a numpy function:
def bincount(weights, x):
    return np.bincount(x, weights)
However, if you try this, you will still get an error. I believe that happens for an entirely different reason, though:
Dask doesn't know what the output shape will be and tries to infer it
In the code and/or documentation for apply_along_axis, we can see that Dask tries to infer the output shape and dtype by calling the function on the test array [1] (related question). This is a problem, since bincount cannot accept such an argument: the length-1 test array does not match the 1000-element idx.
What we can do instead is provide shape and dtype to the method so that Dask doesn't have to infer it.
The problem here is that bincount's output shape depends on the maximum value of the input array. Unless you know it beforehand, you will sadly need to compute it. The whole operation therefore won't be fully lazy.
This is the full answer:
import numpy as np
import dask.array as da
idx = da.random.random_integers(0, 1024, 1000)
weight = da.random.random((2, 1000))
def bincount(weights, x):
    return np.bincount(x, weights)
m = idx.max().compute()
# bincount returns max(idx) + 1 bins, so each output row has length m + 1
da.apply_along_axis(bincount, 1, weight, idx, shape=(m + 1,), dtype=weight.dtype)
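To check the expected shape on a small case, here is a NumPy-only sketch with made-up data. It keeps the original (n, 2) column layout from the question and shows that each column yields max(idx) + 1 bins.

```python
import numpy as np

idx = np.array([0, 2, 2, 5])
weight = np.array([[1.0, 10.0],
                   [2.0, 20.0],
                   [3.0, 30.0],
                   [4.0, 40.0]])  # shape (4, 2), one column per weight set

n_bins = int(idx.max()) + 1       # bincount returns max(idx) + 1 bins

# One weighted bincount per column, stacked back into columns
result = np.stack(
    [np.bincount(idx, weights=weight[:, j], minlength=n_bins)
     for j in range(weight.shape[1])],
    axis=1,
)
print(result.shape)  # (6, 2)
```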
Appendix: randint vs random_integers
Be careful, because these are subtly different:
randint takes integers from low (inclusive) to high (exclusive)
random_integers takes integers from low (inclusive) to high (inclusive)
Thus you have to call randint with high + 1 to cover the same range of values.
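The difference can be checked empirically. This sketch avoids calling random_integers itself and instead compares randint with and without the + 1; the Generator.integers call with endpoint=True is shown as another way to get an inclusive upper bound, assuming NumPy 1.17 or later.

```python
import numpy as np

low, high = 0, 3

# randint excludes `high`, so pass high + 1 to make the range inclusive
np.random.seed(0)
exclusive = np.random.randint(low, high, size=10_000)
inclusive = np.random.randint(low, high + 1, size=10_000)

# Generator.integers has an explicit switch for the same choice
rng = np.random.default_rng(0)
also_inclusive = rng.integers(low, high, size=10_000, endpoint=True)

print(exclusive.max(), inclusive.max(), also_inclusive.max())  # 2 3 3
```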

Why does RMSE increase with horizon when using the timeslice method in caret's trainControl function?

I'm using the timeslice method in caret's trainControl function to perform cross-validation on a time series model. I've noticed that RMSE increases with the horizon argument.
I realise this might happen for several reasons, e.g., if explanatory variables are being forecast and/or there's autocorrelation in the data such that the model can better predict nearer vs. farther ahead observations. However, I'm seeing the same behaviour even when neither is the case (see trivial reproducible example below).
Can anyone explain why RMSEs are increasing with horizon?
library(caret)
# Make data
X = data.frame(matrix(rnorm(1000 * 3), ncol = 3))
X$y = rowSums(X) + rnorm(nrow(X))
# Iterate over different forecast horizons and record RMSEs
forecast_horizons = c(1, 3, 10, 50, 100)
rmses = numeric(length(forecast_horizons))
for (i in 1:length(forecast_horizons)) {
  ctrl = trainControl(method = 'timeslice', initialWindow = 500, horizon = forecast_horizons[i], fixedWindow = T)
  rmses[i] = train(y ~ ., data = X, method = 'lm', trControl = ctrl)$results$RMSE
}
print(rmses) #0.7859786 0.9132649 0.9720110 0.9837384 0.9849005

Find frequency of elements in a dask array without losing information about the array shape?

I need to find the frequency of every element in the array while keeping the information about the array shape. This is because I'll need to iterate over it later on.
I tried this solution as well as this one. It works well for numpy; however, it doesn't seem to work in dask, because dask arrays need to know their size for most operations.
import dask.array as da
arr = da.from_array([1, 1, 1, 2, 3, 4, 4])
unique, counts = da.unique(arr, return_counts=True)
# dask.array<getitem, shape=(nan,), dtype=int64, chunksize=(nan,)>
# dask.array<getitem, shape=(nan,), dtype=int64, chunksize=(nan,)>
I am looking for something similar to this:
import dask.array as da
arr = da.from_array([1, 1, 1, 2, 3, 4, 4])
# {1: 3, 2: 1, 3:1, 4:2}
I found that this solution was the fastest for a large amount (~37.5 Billion elements) of data with many unique values (>50k).
import dask
import dask.array as da
arr = da.from_array(some_large_array)
bincount = da.bincount(arr)
bincount = bincount[bincount != 0] # Remove elements not in the initial array
unique = da.unique(arr)
# Allows to have the shape of the arrays
unique, counts = dask.compute(unique, bincount)
unique = da.from_array(unique)
counts = da.from_array(counts)
frequency = da.transpose(
    da.vstack([unique, counts])
)
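The reason the non-zero bincount entries line up with the unique values can be seen in a NumPy-only sketch on the small array from the question. It assumes non-negative integer data, which bincount requires.

```python
import numpy as np

arr = np.array([1, 1, 1, 2, 3, 4, 4])

bincount = np.bincount(arr)       # [0 3 1 1 2], where index = value
counts = bincount[bincount != 0]  # drop values that never occur
unique = np.unique(arr)           # sorted values that do occur

# Both arrays are ordered by value, so they pair up row by row
frequency = np.vstack([unique, counts]).T
print(frequency.tolist())  # [[1, 3], [2, 1], [3, 1], [4, 2]]
```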
Perhaps you can call dask.compute directly after creating the frequency counts. Presumably at this point your dataset is small, and now would be a good time to transition away from Dask Array and back to NumPy:
import dask
import dask.array as da
arr = da.from_array([1, 1, 1, 2, 3, 4, 4])
unique, counts = da.unique(arr, return_counts=True)
unique, counts = dask.compute(unique, counts)
result = dict(zip(unique, counts))
# {1: 3, 2: 1, 3: 1, 4: 2}
