Expanding a list contained in a column so that each element of the list gets its own column and is represented as a binary variable

I have a dataframe that looks like this:
skill_list                    name                 profile                    561 904 468 875 737 402 882 ...
[561, 564, 632, 859]          Aaron Weidele        wordpress developer          0   0   0   0   0   0   0
[737, 399, 882, 1086, 5...]   Abdelrady Tantawy    full stack developer         0   0   0   0   0   0   0
[904, 468, 783, 1120, 8...]   Abhijeet A Mulgund   machine learning dev...      0   0   0   0   0   0   0
[468]                         Abhijeet Tiwari      salesforce programmi...      0   0   0   0   0   0   0
[518, 466, 875, 445, 402..]   Abhimanyu Veer A...  machine learning devel...    0   0   0   0   0   0   0
The skill_list column contains a list of encoded skills, which correspond to a developer. I would like to expand each list contained within the skill_list column, so that each encoded skill is represented within its own column as a binary variable (1 for on and 0 for off). Expected output would be:
skill_list                    name                 profile                    561 904 468 875 737 402 882 ...
[561, 564, 632, 859]          Aaron Weidele        wordpress developer          1   0   0   0   0   0   0
[737, 399, 882, 1086, 5...]   Abdelrady Tantawy    full stack developer         0   0   0   0   1   0   1
[904, 468, 783, 1120, 8...]   Abhijeet A Mulgund   machine learning dev...      0   1   1   0   0   0   0
[468]                         Abhijeet Tiwari      salesforce programmi...      0   0   1   0   0   0   0
[518, 466, 875, 445, 402..]   Abhimanyu Veer A...  machine learning devel...    0   0   0   1   0   1   0
I've tried:
for index, row in df_vector_matrix["skill_list"].items():
    for item in row:
        for col in df_vector_matrix.columns:
            if item == col:
                df_vector_matrix.loc[item, col] = "1"
            else:
                0
I would really appreciate the help!

You can use MultiLabelBinarizer from scikit-learn. The example below might help:
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

lb = MultiLabelBinarizer()
lb_res = lb.fit_transform(df_vector_matrix['skill_list'])
# convert the binarized result into a dataframe with one column per skill
res = pd.DataFrame(lb_res, columns=lb.classes_)
# concatenate the result with the original dataframe
df_vector_matrix = pd.concat([df_vector_matrix, res], axis=1)
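One thing to watch for (an assumption about your data, since the index isn't shown in the question): pd.concat(..., axis=1) aligns rows on the index, so if df_vector_matrix does not have a default 0..n-1 RangeIndex, pass the original index to the new frame before concatenating:
res = pd.DataFrame(lb_res, columns=lb.classes_, index=df_vector_matrix.index)  # keep rows aligned with the original index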
Below is a complete run with a small example dataframe, where the col column holds list values.
>>> import pandas as pd
>>> from sklearn.preprocessing import MultiLabelBinarizer
>>> d ={'col':[[1,2,3],[2,3,4,5],[2]],'name':['abc','vdf','rt']}
>>> df = pd.DataFrame(d)
>>> df
col name
0 [1, 2, 3] abc
1 [2, 3, 4, 5] vdf
2 [2] rt
>>> lb = MultiLabelBinarizer()
>>> lb_res = lb.fit_transform(df['col'])
>>> res = pd.DataFrame(lb_res,columns=lb.classes_)
>>> pd.concat([df,res],axis=1)
col name 1 2 3 4 5
0 [1, 2, 3] abc 1 1 1 0 0
1 [2, 3, 4, 5] vdf 0 1 1 1 1
2 [2] rt 0 1 0 0 0

Related

Yolov5 model not able to train

I'm making a model to detect potholes in an image. I've done everything right, or so it seems to me, but I can't train the model for some reason. What might be the problem here?
!python train.py --img 640 --cfg yolov5m.yaml --hyp data/hyps/hyp.scratch-med.yaml --batch 20 --epochs 300 --data data/potholeData.yaml --weights yolov5m.pt --workers 4 --name yolo_pothole_det_m
This is the final line of the code, which outputs the following.
train: weights=yolov5m.pt, cfg=yolov5m.yaml, data=data/potholeData.yaml, hyp=data/hyps/hyp.scratch-med.yaml, epochs=300, batch_size=20, imgsz=640, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, noplots=False, evolve=None, bucket=, cache=None, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=4, project=runs/train, name=yolo_pothole_det_m, exist_ok=False, quad=False, cos_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, seed=0, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias=latest
github: up to date with https://github.com/ultralytics/yolov5 ✅
YOLOv5 🚀 v7.0-23-g5dc1ce4 Python-3.9.13 torch-1.13.0 CPU
hyperparameters: lr0=0.01, lrf=0.1, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.3, cls_pw=1.0, obj=0.7, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.9, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.1, copy_paste=0.0
ClearML: run 'pip install clearml' to automatically track, visualize and remotely train YOLOv5 🚀 in ClearML
Comet: run 'pip install comet_ml' to automatically track and visualize YOLOv5 🚀 runs in Comet
TensorBoard: Start with 'tensorboard --logdir runs/train', view at http://localhost:6006/
Overriding model.yaml nc=80 with nc=1
from n params module arguments
0 -1 1 5280 models.common.Conv [3, 48, 6, 2, 2]
1 -1 1 41664 models.common.Conv [48, 96, 3, 2]
2 -1 2 65280 models.common.C3 [96, 96, 2]
3 -1 1 166272 models.common.Conv [96, 192, 3, 2]
4 -1 4 444672 models.common.C3 [192, 192, 4]
5 -1 1 664320 models.common.Conv [192, 384, 3, 2]
6 -1 6 2512896 models.common.C3 [384, 384, 6]
7 -1 1 2655744 models.common.Conv [384, 768, 3, 2]
8 -1 2 4134912 models.common.C3 [768, 768, 2]
9 -1 1 1476864 models.common.SPPF [768, 768, 5]
10 -1 1 295680 models.common.Conv [768, 384, 1, 1]
11 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
12 [-1, 6] 1 0 models.common.Concat [1]
13 -1 2 1182720 models.common.C3 [768, 384, 2, False]
14 -1 1 74112 models.common.Conv [384, 192, 1, 1]
15 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
16 [-1, 4] 1 0 models.common.Concat [1]
17 -1 2 296448 models.common.C3 [384, 192, 2, False]
18 -1 1 332160 models.common.Conv [192, 192, 3, 2]
19 [-1, 14] 1 0 models.common.Concat [1]
20 -1 2 1035264 models.common.C3 [384, 384, 2, False]
21 -1 1 1327872 models.common.Conv [384, 384, 3, 2]
22 [-1, 10] 1 0 models.common.Concat [1]
23 -1 2 4134912 models.common.C3 [768, 768, 2, False]
24 [17, 20, 23] 1 24246 models.yolo.Detect [1, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [192, 384, 768]]
Isn't it supposed to train the model after that? What am I doing wrong that makes it stop right here?
In the output you can see that it never read any image dataset. Make sure your potholeData.yaml file is in the right location and that the paths inside it are correct. In that file you need something like:
train: ../train/images   # path to training images
val: ../valid/images     # path to validation images
nc: 1                    # number of classes
names: ['pothole']       # class names
After this you can run the command again and training should continue.
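If you want to confirm the paths before launching a long training run, a minimal sketch like the following can check that the image directories referenced in the YAML actually exist. It assumes the file is at data/potholeData.yaml and that PyYAML is installed; note that YOLOv5's own path resolution may differ slightly (it can use a 'path' key in the YAML), so this only resolves relative paths against the YAML's own directory as a rough sanity check:
import pathlib
import yaml  # PyYAML

cfg_path = pathlib.Path("data/potholeData.yaml")  # adjust to where your dataset YAML lives
cfg = yaml.safe_load(cfg_path.read_text())

for key in ("train", "val"):
    p = pathlib.Path(cfg[key])
    if not p.is_absolute():
        p = (cfg_path.parent / p).resolve()
    print(key, p, "exists" if p.exists() else "MISSING")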

CNN for non-image data

I am trying to create a model based on this https://machinelearningmastery.com/cnn-models-for-human-activity-recognition-time-series-classification/ example. For now it takes 3 inputs (just for debugging; eventually there will be thousands), each an array of shape (17, 40):
[[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 5 5 5]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]
[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]
[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]]
the output is a single integer between 0 and 8:
[[6]
[3]
[1]]
I use a CNN as follows:
X_train, X_test, y_train, y_test = train_test_split(Xo, Yo)
print("Xtrain", X_train)
print("Y_train", y_train)
print("Xtest", X_test)
print("Y_test", y_test)
print("X_train.shape[1]", X_train.shape[1])
print("X_train.shape[2]", X_train.shape[2])
#print("y_train.shape[1]", y_train.shape[1])
verbose, epochs, batch_size = 1, 10, 10
n_timesteps, n_features, n_outputs = X_train.shape[1], X_train.shape[2], 1
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_timesteps,n_features)))
model.add(Conv1D(filters=64, kernel_size=2, activation='relu'))
model.add(Dropout(0.5))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(10, activation='relu'))
model.add(Dense(1, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
it gives me the following error:
ValueError: You are passing a target array of shape (6, 1)
But the model should only produce a single value as output. Why do I get this error message when it is only supposed to output one value?
The size of the softmax layer should equal the number of classes, but your softmax layer has only 1 output.
For this classification problem you should first convert your targets to a one-hot encoded format, and then change the size of the softmax layer to the number of classes.
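A minimal sketch of that fix, assuming 9 classes (labels 0 through 8, as in the question) and tensorflow.keras (adjust the imports if you use standalone Keras); the layer sizes other than the output and the variables X_train, y_train, n_timesteps, n_features, epochs, batch_size, and verbose are reused from the question's code:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, Dropout, MaxPooling1D, Flatten, Dense
from tensorflow.keras.utils import to_categorical

n_classes = 9                                                # labels 0..8
y_train_oh = to_categorical(y_train, num_classes=n_classes)  # one-hot encoded targets

model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_timesteps, n_features)))
model.add(Conv1D(filters=64, kernel_size=2, activation='relu'))
model.add(Dropout(0.5))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(10, activation='relu'))
model.add(Dense(n_classes, activation='softmax'))            # one output per class
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train_oh, epochs=epochs, batch_size=batch_size, verbose=verbose)
Alternatively, you can keep the integer labels as they are and compile with loss='sparse_categorical_crossentropy', which avoids the explicit one-hot step.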

CIFilter/CIContext gives different results on Simulator and Device

I am using (relatively) simple filters on ios using coreimage.
However, on the Simulator I get my expected results but the device gives slightly different output.
The filter causing the issue is CIEdgeWork. But it is combined with some other filters.
The code just adds a set of CIFilters to a CIImage (created by loading PNG data).
The filters are: CIColorClamp -> CIColorInvert -> composited over CIConstantColorGenerator (white) -> CIEdgeWork
The resulting CIImage is rendered out using CIContext.pngRepresentation(…) and displayed in a UIImageView.
Outputting the debug descriptions of the CIImage shows only one (minor?) difference.
Simulator
<CIImage: 0x600000ae4d20 extent [0 0 240 240]>
crop [0 0 240 240] extent=[0 0 240 240]
colorkernel _edgeWorkContrast(src,contrast=15.3272) extent=[infinite]
kernel _cubicUpsample10(src,scale=[0.25 0.25 0 0]) extent=[infinite]
kernel _gaussianBlur3(src,offset0=[0 0 0 0]) extent=[infinite]
kernel _gaussianBlur3(src,offset0=[0 0 0 0]) extent=[infinite]
kernel _gaussianReduce4(src,scale=[1 4 0 1]) extent=[infinite]
kernel _gaussianReduce4(src,scale=[4 1 1 0]) extent=[infinite]
colorkernel _edgeWork(src,blurred) extent=[infinite]
colorkernel _srcOver(src,dst) extent=[infinite] <0>
premultiply extent=[0 0 240 240]
colorkernel _colorClampAP(c,lo=[0 0 0 0],hi=[0 0 0 1]) extent=[0 0 240 240]
unpremultiply extent=[0 0 240 240]
affine [1 0 0 -1 0 240] extent=[0 0 240 240]
colormatch "sRGB IEC61966-2.1"_to_workingspace extent=[0 0 240 240]
IOSurface 0x6000033ec410(1) seed:0 RGBA8 alpha_unpremul extent=[0 0 240 240]
fill [1 1 1 1 devicergb] extent=[infinite][0 0 1 1] opaque
kernel _cubicUpsample10(src,scale=[0.25 0.25 0 0]) extent=[infinite]
kernel _gaussianBlur3(src,offset0=[0 0 0 0]) extent=[infinite]
kernel _gaussianBlur3(src,offset0=[0 0 0 0]) extent=[infinite]
kernel _gaussianReduce4(src,scale=[1 4 0 1]) extent=[infinite]
kernel _gaussianReduce4(src,scale=[4 1 1 0]) extent=[infinite]
<0>
Device
<CIImage: 0x280743ba0 extent [0 0 240 240]>
crop [0 0 240 240] extent=[0 0 240 240]
colorkernel _edgeWorkContrast(src,contrast=15.3272) extent=[infinite]
kernel _cubicUpsample10(src,scale=[0.25 0.25 0 0]) extent=[infinite]
kernel _gaussianBlur3(src,offset0=[0 0 0 0]) extent=[infinite]
kernel _gaussianBlur3(src,offset0=[0 0 0 0]) extent=[infinite]
kernel _gaussianReduce4(src,scale=[1 4 0 1]) extent=[infinite]
kernel _gaussianReduce4(src,scale=[4 1 1 0]) extent=[infinite]
colorkernel _edgeWork(src,blurred) extent=[infinite]
colorkernel _srcOver(src,dst) extent=[infinite] <0>
premultiply extent=[0 0 240 240]
colorkernel _colorClampAP(c,lo=[0 0 0 0],hi=[0 0 0 1]) extent=[0 0 240 240]
unpremultiply extent=[0 0 240 240]
affine [1 0 0 -1 0 240] extent=[0 0 240 240]
colormatch "sRGB IEC61966-2.1"_to_workingspace extent=[0 0 240 240]
IOSurface 0x28074d5c0(70) seed:1 RGBA8 extent=[0 0 240 240]
fill [1 1 1 1 devicergb] extent=[infinite][0 0 1 1] opaque
kernel _cubicUpsample10(src,scale=[0.25 0.25 0 0]) extent=[infinite]
kernel _gaussianBlur3(src,offset0=[0 0 0 0]) extent=[infinite]
kernel _gaussianBlur3(src,offset0=[0 0 0 0]) extent=[infinite]
kernel _gaussianReduce4(src,scale=[1 4 0 1]) extent=[infinite]
kernel _gaussianReduce4(src,scale=[4 1 1 0]) extent=[infinite]
<0>
The lines that differ are:
Simulator
IOSurface 0x6000033ec410(1) seed:0 RGBA8 alpha_unpremul extent=[0 0 240 240]
Device
IOSurface 0x28074d5c0(70) seed:1 RGBA8 extent=[0 0 240 240]
I don't know enough about graphics to say whether this has an effect.
What have I tried?
My first (uneducated) guess was that the Simulator was using the CPU and the device was using the GPU, so I used the following code to try to force the device to use the CPU (it had no effect):
// ... Create CIImage with filters, set to `outputCIImage`
guard let colourSpace = CGColorSpace(name: CGColorSpace.sRGB) else { throw RenderError.failedToCreateColourSpace }
let context = CIContext(options: [CIContextOption.useSoftwareRenderer : true])
guard let png = context.pngRepresentation(of: outputCIImage, format: .RGBA8, colorSpace: colourSpace, options: [:]) else { throw RenderError.failedToCreatePNGData }
// ... return `png` as Data, then used in `UIImage(data: ...)`

Iterations vs. Kernel Size in Morphological Operations (OpenCV)

I've been using morphological opening in OpenCV to reduce noise outside of my ROI in images, and until now, whenever I need a higher degree of noise reduction I just randomly increase the kernel size or the number of iterations until I'm happy. Is there a significant difference in results depending on which you increase, and how would you decide which to change in a given situation? I'm trying to come up with a better approach to which parameter I change (and by how much) other than guess-and-check.
It depends on the kernel type. For dilating or eroding with an odd-square kernel, there is no difference whether you increase the size or increase the iterations (assuming values which would make them equal are used). For example:
>>> import numpy as np
>>> import cv2
>>> M = np.zeros((7,7), dtype=np.uint8)
>>> M[3,3] = 1
>>> k1 = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
>>> M1 = cv2.dilate(M, k1, iterations=2)
>>> k2 = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
>>> M2 = cv2.dilate(M, k2, iterations=1)
>>> M1
[[0 0 0 0 0 0 0]
[0 1 1 1 1 1 0]
[0 1 1 1 1 1 0]
[0 1 1 1 1 1 0]
[0 1 1 1 1 1 0]
[0 1 1 1 1 1 0]
[0 0 0 0 0 0 0]]
>>> M2
[[0 0 0 0 0 0 0]
[0 1 1 1 1 1 0]
[0 1 1 1 1 1 0]
[0 1 1 1 1 1 0]
[0 1 1 1 1 1 0]
[0 1 1 1 1 1 0]
[0 0 0 0 0 0 0]]
And this is fairly intuitive. A 3x3 rectangular kernel for dilating will find any white pixel, and turn the neighboring pixels white. So it's easy to see that doing this twice will make any single white pixel turn into a 5x5 block of white pixels. Here we're assuming the center pixel is the one that is compared---the anchor---but this could be changed, which could affect the result. For example, suppose you were comparing two iterations of a (2, 2) kernel with a single iteration of a (3, 3) kernel:
>>> M = np.zeros((5, 5), dtype=np.uint8)
>>> M[2,2] = 1
>>> k1 = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
>>> M1 = cv2.dilate(M, k1, iterations=2)
>>> k2 = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
>>> M2 = cv2.dilate(M, k2, iterations=1)
>>> M1
[[0 0 0 0 0]
[0 0 0 0 0]
[0 0 1 1 1]
[0 0 1 1 1]
[0 0 1 1 1]]
>>> M2
[[0 0 0 0 0]
[0 1 1 1 0]
[0 1 1 1 0]
[0 1 1 1 0]
[0 0 0 0 0]]
You can see that while it creates the shape (intuitive), they're not in the same place (non-intuitive). And that's because the anchor of a (2, 2) kernel cannot be in the center of the kernel---in this case we see that with a centered pixel, the neighbors that dilate are only to the bottom-right, since it has to choose a direction because it can only expand the single pixel to fill up a (2, 2) square.
Things become even more tricky with non-rectangular shaped kernels. For example:
>>> M = np.zeros((5, 5), dtype=np.uint8)
>>> M[2,2] = 1
>>> k1 = cv2.getStructuringElement(cv2.MORPH_CROSS, (3, 3))
>>> M1 = cv2.dilate(M, k1, iterations=2)
>>> k2 = cv2.getStructuringElement(cv2.MORPH_CROSS, (5, 5))
>>> M2 = cv2.dilate(M, k2, iterations=1)
>>> M1
[[0 0 1 0 0]
[0 1 1 1 0]
[1 1 1 1 1]
[0 1 1 1 0]
[0 0 1 0 0]]
>>> M2
[[0 0 1 0 0]
[0 0 1 0 0]
[1 1 1 1 1]
[0 0 1 0 0]
[0 0 1 0 0]]
The first pass of M1 creates a small cross 3 pixels high, 3 pixels wide. But then each one of those pixels creates a cross at their location, which actually creates a diamond pattern.
So to sum up for basic morphological operations, with rectangular kernels, at least odd-dimensioned ones, the result is the same---but for other kernels, the result is different. You can apply the other morphological operations to simple examples like this to get the hang of how they behave, which you should be using, and how to increase their effects.
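As a starting point for that kind of experiment, here is a small sketch (the image path and threshold are placeholders, not from the question) that compares opening with more iterations of a small kernel against opening once with a larger kernel on the same noisy mask; for odd rectangular kernels the two should remove the same noise, which you can verify by counting surviving pixels:
import cv2
import numpy as np

# hypothetical input: a binary mask with speckle noise outside the ROI
mask = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)        # placeholder path
_, mask = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)

k3 = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
k7 = cv2.getStructuringElement(cv2.MORPH_RECT, (7, 7))

# same total "reach": 3 iterations of a 3x3 kernel vs 1 iteration of a 7x7 kernel
opened_iter = cv2.morphologyEx(mask, cv2.MORPH_OPEN, k3, iterations=3)
opened_big  = cv2.morphologyEx(mask, cv2.MORPH_OPEN, k7, iterations=1)

print("pixels removed (3x3, 3 iterations):", int((mask > 0).sum() - (opened_iter > 0).sum()))
print("pixels removed (7x7, 1 iteration): ", int((mask > 0).sum() - (opened_big > 0).sum()))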

Using scipy.ndimage.uniform_filter to find stars in astro photo, but puzzled by results

I am searching for stars in an image of the night sky. After making a mask, I use scipy.ndimage.uniform_filter at different sizes to find the stars. It seems to work reasonably well, but I expected that once I used a small enough size I would simply get more hits as I reduced the size further. It doesn't do this consistently, and I am baffled by it.
There is an extract from around one of the hit areas at the bottom.
The code below gives me:
size: 3, len: 621
size: 4, len: 340
size: 5, len: 200
size: 6, len: 0
size: 7, len: 0
size: 8, len: 24
size: 9, len: 8
size: 10, len: 0
size: 11, len: 0
size: 12, len: 0
Why do size 6 & 7 give zero hits? This seems totally bizarre to me!
def __init__(self, filename):
    self.good = False
    self.img = scipy.ndimage.imread(filename, flatten=True)

def checkcandidates(self, meanfact=3.0, maxwindow=25):
    mask = self.img > self.img.mean()*meanfact
    for wsize in range(3, maxwindow):
        m2 = scipy.ndimage.uniform_filter(mask, size=wsize)
        xc, yc = m2.nonzero()
        print("size: %d, len: %d" % (wsize, len(xc)))
Here's part of the mask centred on one of the stars:
>>> sc1.showCoords(1360,493,10,usemask=True)
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0
0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
This looks like a bug, or at least a nasty implementation detail that will result in bugs in users' code.
First, read the note in the uniform_filter docstring:
The multi-dimensional filter is implemented as a sequence of
one-dimensional uniform filters. The intermediate arrays are stored
in the same data type as the output. Therefore, for output types
with a limited precision, the results may be imprecise because
intermediate results may be stored with insufficient precision.
So let's look at how one row of your input array is processed by uniform_filter1d for different size filters.
Here's a small one-dimensional input:
In [416]: x
Out[416]: array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
Apply uniform_filter1d with increasing sizes:
In [417]: from scipy.ndimage.filters import uniform_filter1d
In [418]: uniform_filter1d(x, 3)
Out[418]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0])
In [419]: uniform_filter1d(x, 4)
Out[419]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0])
In [420]: uniform_filter1d(x, 5)
Out[420]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0])
In [421]: uniform_filter1d(x, 6)
Out[421]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
In [422]: uniform_filter1d(x, 7)
Out[422]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
In [423]: uniform_filter1d(x, 8)
Out[423]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0])
In [424]: uniform_filter1d(x, 9)
Out[424]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
Like your example, the output is all zeros when the size is 6 or 7.
I suspect this is a floating point precision problem. Note what happens when we make the input an array of floating point values:
In [439]: f = uniform_filter1d(x.astype(float), 6)
In [440]: f
Out[440]:
array([ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
1.66666667e-01, 3.33333333e-01, 5.00000000e-01,
6.66666667e-01, 8.33333333e-01, 1.00000000e+00,
1.00000000e+00, 1.00000000e+00, 1.00000000e+00,
8.33333333e-01, 6.66666667e-01, 5.00000000e-01,
3.33333333e-01, 1.66666667e-01, 5.55111512e-17,
5.55111512e-17, 5.55111512e-17])
In [441]: f.max()
Out[441]: 0.99999999999999989
So the intermediate values computed using floating point do not give the expected value of 1 in the "middle" of that output. When this array is converted back to the input data type (int), the result is all zeros:
In [442]: f.astype(int)
Out[442]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
Given that behavior, I recommend converting your input array to floating point before calling uniform_filter, and adding a final step that converts the result back to integers in a way that you control, and that matches how you want to classify a "hit". Or even use a different function altogether.
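A minimal sketch of that workaround, applied to the checkcandidates loop from the question (the 0.99 threshold is an assumption; tune it to match what you want to count as a "hit"):
import scipy.ndimage

def checkcandidates(self, meanfact=3.0, maxwindow=25, thresh=0.99):
    # work in floating point so intermediate results keep their precision
    mask = (self.img > self.img.mean() * meanfact).astype(float)
    for wsize in range(3, maxwindow):
        m2 = scipy.ndimage.uniform_filter(mask, size=wsize)
        # threshold explicitly instead of relying on the implicit cast back to int
        hits = m2 >= thresh
        xc, yc = hits.nonzero()
        print("size: %d, len: %d" % (wsize, len(xc)))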