I would like to use dask.array.map_overlap to apply a SciPy interpolation function chunk by chunk. However, I keep running into errors that I cannot understand, and I am hoping someone can explain them to me.
Here is the error message I get when I call .compute().
ValueError: could not broadcast input array from shape (1070,0) into shape (1045,0)
To investigate, I used .to_delayed() to check the output of each partition, and this is what I found.
The following is my Python code.
Step 1. Load the netCDF file with xarray, then convert to dask arrays with chunk size (400, 400)
df = xr.open_dataset('./Brazil Sentinal2 Tile/' + data_file +'.nc')
lon, lat = df['lon'].data, df['lat'].data
slon = da.from_array(df['lon'], chunks=(400,400))
slat = da.from_array(df['lat'], chunks=(400,400))
data = da.from_array(df.isel(band=0), chunks=(400,400))
Step 2. Declare a function to pass to da.map_overlap
def sumsum2(lon, lat, data, hex_res=10):
    hex_col = 'hex' + str(hex_res)
    lon_max, lon_min = lon.max(), lon.min()
    lat_max, lat_min = lat.max(), lat.min()
    b = box(lon_min, lat_min, lon_max, lat_max, ccw=True)
    b = transform(lambda x, y: (y, x), b)
    b = mapping(b)
    target_df = pd.DataFrame(h3.polyfill(b, hex_res), columns=[hex_col])
    target_df['lat'] = target_df[hex_col].apply(lambda x: h3.h3_to_geo(x)[0])
    target_df['lon'] = target_df[hex_col].apply(lambda x: h3.h3_to_geo(x)[1])
    tlon, tlat = target_df[['lon', 'lat']].values.T
    abc = lNDI(points=(lon.ravel(), lat.ravel()),
               values=data.ravel())(tlon, tlat)
    target_df['out'] = abc
    print(np.stack([tlon, tlat, abc], axis=1).shape)
    return np.stack([tlon, tlat, abc], axis=1)
Step 3. Apply the da.map_overlap
b = da.map_overlap(sumsum2, slon[:1200,:1200], slat[:1200,:1200], data[:1200,:1200], depth=10, trim=True, boundary=None, align_arrays=False, dtype='float64',
Step 4. Using to_delayed() to test output shape
print(b.to_delayed().flatten()[0].compute().shape, )
(1065, 3)
(1045, 0)
(1090, 3)
(1070, 0)
This shows that with trim=True the blocks coming out of da.map_overlap have shapes (1045, 0) and (1070, 0), i.e. they are empty arrays with zero columns, while the arrays I return inside the function are 2-D with shapes (1065, 3) and (1090, 3).
In addition, if I turn off the trim argument (trim=False), which is
c = da.map_overlap(sumsum2,
print(c.to_delayed().flatten()[0].compute().shape, )
The output becomes
(1065, 3)
(1065, 3)
(1090, 3)
(1090, 3)
Does this mean that with trim=True everything gets cut out?
#-- print out the values
(1065, 3)
array([], shape=(1045, 0), dtype=float64)
#-- print out the values
array([[ -47.83683837, -18.98359832, 1395.01848583],
[ -47.8482856 , -18.99038681, 2663.68391094],
[ -47.82800624, -18.99207069, 1465.56517187],
[ -47.81897323, -18.97919009, 2769.91556363],
[ -47.82066663, -19.00712956, 1607.85927095],
[ -47.82696896, -18.97167714, 2110.7516765 ],
[ -47.81562653, -18.98302933, 2662.72112163],
[ -47.82176881, -18.98594465, 2201.83205114],
[ -47.84567 , -18.97512514, 1283.20631652],
[ -47.84343568, -18.97270783, 1282.92117225]])
Any thoughts on this?
Thank you.
I think I found the answer. Please let me know if I am wrong.
I cannot use trim=True because my function changes the shape of the output array (after searching online, I noticed that the output array is expected to have the same shape as the input array). Since I change the shape, dask does not know how to handle it and returns an empty array (weird).
With trim=False instead, the buffer zone is not cut off, so the returned values come through intact (although I still don't know why dask cannot concatenate the chunked arrays, I believe that is also related to the shape).
The solution is to wrap da.concatenate in delayed, which is
delayed(da.concatenate)([e.to_delayed().flatten()[idx] for idx in range(len(e.to_delayed().flatten()))])
This way, we do not rely on the concatenation inside map_overlap but use our own concatenation to combine the outputs we want.
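The empty arrays can be reproduced with a little slicing arithmetic. The sketch below is plain Python, not dask itself, and the helper names trimmed_length and trimmed_shape are made up for illustration. It assumes that trimming slices `depth` elements off both ends of every axis, which is consistent with the shapes printed above: the 3-column axis is much shorter than 2 * depth, so nothing is left of it.

```python
def trimmed_length(n, depth):
    """Length left after slicing `depth` elements off both ends of an axis of length n.

    Assumes depth > 0, as in the map_overlap call above (depth=10).
    """
    return len(range(n)[depth:-depth])


def trimmed_shape(shape, depth):
    """Apply the same trimming to every axis of a block shape."""
    return tuple(trimmed_length(n, depth) for n in shape)


# Shapes returned by the function for two of the blocks
for block_shape in [(1065, 3), (1090, 3)]:
    print(block_shape, "->", trimmed_shape(block_shape, depth=10))
```

Under that assumption, any function whose output has an axis shorter than 2 * depth will come back empty along that axis when trim=True.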
I am following this link to train an RNN classifier on a small dataset to check that the code works.
Running the command
rnn.predict(data_test, 'answer.csv') throws an exception:
AttributeError: 'tuple' object has no attribute 'ndim'
Here is the predict function
def predict(self, data_test, answer_filename):
    word_matrix, char_matrix, additional_features_matrix = data_test
    print("Test example: ")
    preds = self.model.predict([word_matrix, char_matrix, additional_features_matrix],
                               batch_size=self.batch_size, verbose=1)
    index_to_author = {0: "EAP", 1: "HPL", 2: "MWS"}
    submission = pd.DataFrame({"id": test["id"], index_to_author[0]: preds[:, 0],
                               index_to_author[1]: preds[:, 1], index_to_author[2]: preds[:, 2]})
    submission.to_csv(answer_filename, index=False)
The word_matrix, char_matrix and additional_features_matrix are of variable length. In my case, their dimensions are (80,), (80, 30) and (1153, 15) respectively. I searched online and found that I should add padding to the input numpy arrays.
But the code in the link worked fine, and I cannot understand what I am doing wrong. Can somebody help me with this?
I found out my own mistake. If you follow this link then you will find the following line of code:
_, additional_features_matrix_test = collect_additional_features(x.iloc[idx_train], x_test)
The function collect_additional_features returns a tuple of two ndarrays. My mistake was that I left out the leading "_," and hence the line of code became:
additional_features_matrix_test = collect_additional_features(x.iloc[idx_train], x_test)
Thus additional_features_matrix_test became a tuple instead of an ndarray, and passing it to the LSTM raised the error AttributeError: 'tuple' object has no attribute 'ndim'.
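The difference is easy to reproduce in isolation. In the sketch below, collect_features is a hypothetical stand-in for collect_additional_features (only the "returns a tuple of two ndarrays" behaviour is taken from the post), and the array sizes are arbitrary.

```python
import numpy as np


def collect_features(train_rows, test_rows):
    # Hypothetical stand-in: returns a tuple of two arrays (train, test)
    return np.zeros((len(train_rows), 15)), np.zeros((len(test_rows), 15))


# Without unpacking, the name is bound to the whole tuple
wrong = collect_features(range(4), range(2))

# With "_, name = ...", the first array is discarded and the name gets the second
_, right = collect_features(range(4), range(2))

print(type(wrong).__name__, hasattr(wrong, "ndim"))
print(type(right).__name__, right.ndim)
```

A quick type() or hasattr(..., "ndim") check on each input before calling predict would point at the offending argument.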
In my experiment, MXNet seems to skip saving some of the parameters of my network.
I am studying MXNet's gluoncv package. To learn programming skills from its engineers, I manually build an SSD with ‘gluoncv.model_zoo.ssd.SSD’. The parameters that I use to initialize this class are the same as those of the official ‘ssd_512_resnet50_v1_voc’ network, except for ‘classes=('car', 'pedestrian', 'truck', 'trafficLight', 'biker')’.
from gluoncv.model_zoo.ssd import SSD
import mxnet as mx
name = 'resnet50_v1'
base_size = 512
features=['stage3_activation5', 'stage4_activation2']
filters=[512, 512, 256, 256]
sizes=[51.2, 102.4, 189.4, 276.4, 363.52, 450.6, 492]
ratios=[[1, 2, 0.5]] + [[1, 2, 0.5, 3, 1.0/3]] * 3 + [[1, 2, 0.5]] * 2
steps=[16, 32, 64, 128, 256, 512]
classes=('car', 'pedestrian', 'truck', 'trafficLight', 'biker')
net = SSD(network=name, base_size=base_size, features=features,
          num_filters=filters, sizes=sizes, ratios=ratios, steps=steps,
          pretrained=pretrained, classes=classes)
I tried to feed synthetic data x to this network, and it gives the following error.
x = mx.nd.zeros(shape=(batch_size,3,base_size,base_size))
cls_preds, box_preds, anchors = net(x)
RuntimeError: Parameter 'ssd0_expand_trans_conv0_weight' has not been initialized. Note that you should initialize parameters and create Trainer with Block.collect_params() instead of Block.params because the later does not include Parameters of nested child Blocks
This is reasonable. The SSD uses the function ‘gluoncv.nn.feature.FeatureExpander’ to add new layers on top of ‘resnet50_v1’, and I forgot to initialize them. So I used the following code.
Oho, it gives me a lot of warnings.
v.initialize(None, ctx, init, force_reinit=force_reinit)
C:\Users\Bird\AppData\Local\conda\conda\envs\ssd\lib\site-packages\mxnet\gluon\ UserWarning: Parameter 'ssd0_resnetv10_stage4_batchnorm9_running_mean' is already initialized, ignoring. Set force_reinit=True to re-initialize.
v.initialize(None, ctx, init, force_reinit=force_reinit)
C:\Users\Bird\AppData\Local\conda\conda\envs\ssd\lib\site-packages\mxnet\gluon\ UserWarning: Parameter 'ssd0_resnetv10_stage4_batchnorm9_running_var' is already initialized, ignoring. Set force_reinit=True to re-initialize.
v.initialize(None, ctx, init, force_reinit=force_reinit)
The ‘resnet50_v1’ base of the SSD is pre-trained, so these parameters are already initialized and are not initialized again. However, these warnings are annoying.
How can I turn them off?
Here, though, comes the first problem. I would like to save the parameters of the network.
net.save_params('F:/Temps/Models_tmp/' +'myssd.params')
The parameter file of ‘resnet50_v1’ (‘resnet50_v1-c940b1a0.params’) is 97.7MB; however, my parameter file is only 9.96MB. Is there some magic technique that compresses these parameters?
To test this, I opened a new console and rebuilt the same network. Then I loaded the saved parameters and fed data to it.
net.load_params('F:/Temps/Models_tmp/' +'myssd.params')
x = mx.nd.zeros(shape=(batch_size,3,base_size,base_size))
The initialization error comes again.
RuntimeError: Parameter 'ssd0_expand_trans_conv0_weight' has not been initialized. Note that you should initialize parameters and create Trainer with Block.collect_params() instead of Block.params because the later does not include Parameters of nested child Blocks
This cannot be right, because the saved file 'myssd.params' should contain all the initialized parameters of my network.
To find the block ‘ssd0_expand_trans_conv0’, I looked deeper into ‘gluoncv.nn.feature.FeatureExpander’. I used ‘mxnet.gluon.nn.Conv2D’ to replace ‘mx.sym.Convolution’ in the ‘FeatureExpander’ function.
y = mx.sym.Convolution(
    y, num_filter=num_trans, kernel=(1, 1), no_bias=use_bn,
    name='expand_trans_conv{}'.format(i), attr={'__init__': weight_init})
Conv1 = nn.Conv2D(channels=num_trans, kernel_size=(1, 1), use_bias=use_bn, weight_initializer=weight_init)
y = Conv1(y)
Conv1.initialize(verbose=True)
y = mx.sym.Convolution(
    y, num_filter=f, kernel=(3, 3), pad=(1, 1), stride=(2, 2),
    no_bias=use_bn, name='expand_conv{}'.format(i), attr={'__init__': weight_init})
Conv2 = nn.Conv2D(channels=f, kernel_size=(3, 3), padding=(1, 1), strides=(2, 2), use_bias=use_bn, weight_initializer=weight_init)
y = Conv2(y)
Conv2.initialize(verbose=True)
These new blocks can be initialized manually. However, MXNet still reports the same error.
It seems that the manual initialization has no effect.
How can I save all the parameters of my network and restore them?
There is a tutorial on the subject of saving and loading that may be of help:
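Separately, on the side question about turning off the "already initialized" messages: they are reported as UserWarning, so Python's standard warnings module can filter them. The sketch below does not call MXNet at all. initialize_stub is a hypothetical stand-in that emits a warning with similar wording, and record=True is used only so that the effect can be inspected.

```python
import warnings


def initialize_stub():
    # Hypothetical stand-in for the initialization call that emits the warnings
    warnings.warn(
        "Parameter 'ssd0_resnetv10_stage4_batchnorm9_running_var' is already "
        "initialized, ignoring. Set force_reinit=True to re-initialize.",
        UserWarning,
    )
    return "initialized"


# Baseline: the warning is emitted and captured
with warnings.catch_warnings(record=True) as caught_default:
    warnings.simplefilter("always")
    initialize_stub()

# With a filter on the message text, the same call stays quiet
with warnings.catch_warnings(record=True) as caught_filtered:
    warnings.simplefilter("always")
    warnings.filterwarnings(
        "ignore", message=".*is already initialized.*", category=UserWarning
    )
    result = initialize_stub()

print(len(caught_default), len(caught_filtered), result)
```

In a real script, the filterwarnings call would go before the initialization call. Filtering on the message text rather than on all UserWarnings keeps other, possibly useful, warnings visible.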
I came across an error while executing stereoCalibrate in OpenCV 2.4.11, which says:
OpenCV Error: Assertion failed (!fixedSize() || ((Mat*)obj)->size.operator()() == Size(cols, rows)) in cv::_OutputArray::create,
I think this must be a size mismatch among the parameters, so I went through them one by one, but the error is still there. I hope someone can spot the error from the assembly code below. Here is the method call in my code.
double error = cv::stereoCalibrate(
    objPoints, cali0.imgPoints, cali1.imgPoints,
    camera0.intr.cameraMatrix, camera0.intr.distCoeffs,
    camera1.intr.cameraMatrix, camera1.intr.distCoeffs,
    cv::Size(1920,1080), m.rvec, m.tvec, m.evec, m.fvec,
    cv::TermCriteria(CV_TERMCRIT_ITER + CV_TERMCRIT_EPS, 100, 1e-5));
In my code, m.rvec is (3,3,CV_64F) and m.tvec is (3,1,CV_64F); m.evec and m.fvec are not preallocated, the same as in the stereoCalibrate example. intr.cameraMatrix is (3,3,CV_64F) and intr.distCoeffs is (8,1,CV_64F). objPoints is computed from the checkerboard and stores the 3D positions of the corners, with the z value of every point set to zero.
After reading the advice from @Josh, I changed the code to use plain Mat objects for the outputs (which are CV_64F), but it still throws this assertion.
cv::Mat R, t, e, f;
double error = cv::stereoCalibrate(
    objPoints, cali0.imgPoints, cali1.imgPoints,
    camera0.intr.cameraMatrix, camera0.intr.distCoeffs,
    camera1.intr.cameraMatrix, camera1.intr.distCoeffs,
    cali0.imgSize, R, t, e, f,
    cv::TermCriteria(CV_TERMCRIT_ITER + CV_TERMCRIT_EPS, 100, 1e-5));
I finally solved this problem. As a reminder, make sure the camera parameters you pass in are not of a const type.
Why go for assembly? OpenCV is open source and you can check the code you're calling here:
If you get assertion failures in OpenCV, it's usually because you've passed a matrix with an incorrect shape. OpenCV is extremely picky. The failed assertion is on an OutputArray, so, checking the function signature, there are four possible culprits:
OutputArray _Rmat, OutputArray _Tmat, OutputArray _Emat, OutputArray _Fmat
The sizing is done inside cv::stereoCalibrate here:
_Rmat.create(3, 3, rtype);
_Tmat.create(3, 1, rtype);
<-- snipped -->
if( _Emat.needed() )
{
    _Emat.create(3, 3, rtype);
    p_matE = &(c_matE = _Emat.getMat());
}
if( _Fmat.needed() )
{
    _Fmat.create(3, 3, rtype);
    p_matF = &(c_matF = _Fmat.getMat());
}
The assertion is being triggered in one of these calls. The code is here:
Try passing in plain Mat objects without preallocating their shape.
I'm trying to build a neural network. I have followed the video from
I have loaded the training set.
I am now at the training stage, and these are the lines where the code fails.
i = 0
for start, end in zip(range(0, len(trX), 128), range(128, len(trX), 128)):
    tr = trX[start:end]
    self.cost = train(tr.reshape(tr.shape[0], tr.shape[1]), trY[start:end])
I am struggling with this error message:
File "C:\Users\Bjornars\PycharmProjects\cogs-118a\Project\NN\", line 101, in training
self.cost = train(tr.reshape(128,106), trY[start:end])
File "C:\Anaconda3\lib\site-packages\theano\compile\", line 513, in call
File "C:\Anaconda3\lib\site-packages\theano\tensor\", line 169, in filter
TypeError: ('Bad input argument to theano function with name "C:\Users\Bjornars\PycharmProjects\cogs-118a\Project\NN\" at index 1(0-based)', 'Wrong number of dimensions: expected 2, got 1 with shape (128,).')
The shape of the array I'm sending in is (5000, 106).
I fixed it with the function below. The train function expected each entry of trY to be an array, not a single number.
def preprocess(self, trDmatrix, labels):
    for i in range(len(trDmatrix)):
        numbers = [0.0] * 2
        numbers[int(labels[i])] = 1.0
        labels[i] = numbers
    return trDmatrix, labels
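To make the shape change concrete, here is a minimal NumPy sketch, separate from the preprocess method above. The helper name one_hot and the sample labels are hypothetical, and it assumes the labels are the integers 0 and 1. It shows that a slice of raw labels is 1-D with shape (128,), which is what the error reports for the argument at index 1, while the one-hot version is 2-D.

```python
import numpy as np


def one_hot(labels, num_classes=2):
    """Turn integer class labels of shape (n,) into a float matrix of shape (n, num_classes)."""
    labels = np.asarray(labels).astype(int)
    # Row k of the identity matrix is the one-hot vector for class k
    return np.eye(num_classes)[labels]


raw_labels = np.arange(5000) % 2   # made-up labels: 0, 1, 0, 1, ...
batch = raw_labels[0:128]          # 1-D slice, like trY[start:end] before the fix
encoded = one_hot(batch)           # 2-D, one row per sample

print(batch.shape)
print(encoded.shape)
```

Encoding the whole label array once before the training loop means every trY[start:end] slice is already 2-D.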