I am trying to load an image dataset using the PyTorch DataLoader, but the resulting images come out tiled instead of being center-cropped as I expect.
transform = transforms.Compose([transforms.Resize(224),
                                transforms.CenterCrop(224),
                                transforms.ToTensor()])
dataset = datasets.ImageFolder('ml-models/downloads/', transform=transform)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)
images, labels = next(iter(dataloader))
import matplotlib.pyplot as plt
plt.imshow(images[6].reshape(224, 224, 3))
The resulting image is tiled, not center-cropped (see the Jupyter snapshot: https://i.stack.imgur.com/HtrIa.png).
Is there something wrong with the provided transformation?
PyTorch stores image tensors in channel-first format, so a 3-channel image is a tensor of shape (3, H, W). Matplotlib expects data in channel-last format, i.e. (H, W, 3). Reshaping does not rearrange the dimensions; for that you need Tensor.permute.
plt.imshow(images[6].permute(1, 2, 0))
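To see why reshape produces the tiling effect while permute does not, here is a minimal sketch on a tiny tensor (illustrative only, not from the original answer):
import torch
x = torch.arange(6).reshape(2, 3)  # tensor([[0, 1, 2], [3, 4, 5]])
x.reshape(3, 2)                    # reinterprets the flat data: [[0, 1], [2, 3], [4, 5]]
x.permute(1, 0)                    # swaps the axes:             [[0, 3], [1, 4], [2, 5]]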
How can I iterate over all the pixels of this binary image?
I want to skeletonize or thin this white line, but I still do not know how to iterate over all the pixels.
I'm not sure exactly what kind of skeletonization or thinning you mean.
This is a simple way to thin the line using erosion:
import cv2
import numpy as np

img = cv2.imread("B0xTI.png")
kernel = np.ones((3, 3), np.uint8)               # 3x3 structuring element
thinning = cv2.erode(img, kernel, iterations=2)  # erode twice to thin the white line
cv2.imwrite("thinning.jpg", thinning)
output image:
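If the goal is also to iterate over the white pixels themselves (as asked), here is a minimal sketch using NumPy, assuming the result has been converted to a single-channel binary mask:
gray = cv2.cvtColor(thinning, cv2.COLOR_BGR2GRAY)
ys, xs = np.where(gray > 0)   # coordinates of every white pixel
for y, x in zip(ys, xs):
    # process pixel (y, x) here, e.g. gray[y, x]
    pass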
I want to train MNIST on VGG16.
The MNIST image size is 28*28 and I set the input size to 32*32 in the Keras VGG16. When I train I get good metrics, but I'm not sure what really happens. Is Keras padding with empty space, or is the image being expanded linearly, as in a zoom function? Does anyone understand how I can get a test accuracy of +95% after 60 epochs?
Here I define target size:
target_size = (32, 32)
This is where I define my flow_from_dataframe generator:
train_df = pd.read_csv("cv1_train.csv", quoting=3)
train_df_generator = train_image_datagen.flow_from_dataframe(
    dataframe=train_df,
    directory="../../../MNIST",
    target_size=target_size,
    class_mode='categorical',
    batch_size=batch_size,
    shuffle=False,
    color_mode="rgb",
    classes=["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
)
Here I define my input size:
model_base = VGG16(weights=None, include_top=False,
                   input_shape=(32, 32, 3), classes=10)
The images are simply resized to the specified target_size. This is clearly stated in the documentation:
target_size: tuple of integers (height, width), default: (256, 256). The dimensions to which all images found will be resized.
You can also inspect the source code and find the relevant part in the load_img function. The default interpolation method used to resize the images is nearest. You can find more information about the various interpolation methods in the MATLAB or PIL documentation.
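If you want a smoother resize than the default nearest-neighbour, flow_from_dataframe also accepts an interpolation argument; here is a minimal sketch reusing the generator from the question (the choice of "bilinear" is illustrative):
train_df_generator = train_image_datagen.flow_from_dataframe(
    dataframe=train_df,
    directory="../../../MNIST",
    target_size=(32, 32),
    interpolation="bilinear",  # instead of the default "nearest"
    class_mode='categorical',
    color_mode="rgb",
    classes=["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
)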
I'm doing binary classification with CNNs and the data is imbalanced (positive medical images : negative medical images = 0.4 : 0.6), so I want to use SMOTE to oversample the positive class before training.
However, the data is 4-dimensional, (761, 64, 64, 3), which causes the error
Found array with dim 4. Estimator expected <= 2
So, I reshape my train_data:
X_res, y_res = smote.fit_sample(X_train.reshape(X_train.shape[0], -1), y_train.ravel())
And it works fine. Before feeding it to the CNN, I reshape it back:
X_res = X_res.reshape(X_res.shape[0], 64, 64, 3)
Now, I'm not sure whether this is a correct way to oversample, and whether the reshape operation will change the images' structure.
I had a similar issue. I used the reshape function to flatten the images:
X_train.shape
# (8000, 250, 250, 3)
ReX_train = X_train.reshape(8000, 250 * 250 * 3)
ReX_train.shape
# (8000, 187500)
smt = SMOTE()
Xs_train, ys_train = smt.fit_sample(ReX_train, y_train)
Although this approach is painfully slow, it helped to improve the performance.
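As a quick check that flattening and reshaping back does not alter the pixel data of the original samples (a sketch, not from the original answer):
import numpy as np
flat = X_train.reshape(len(X_train), -1)
restored = flat.reshape(-1, 250, 250, 3)
assert np.array_equal(X_train, restored)  # the round trip is lossless for the original images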
As soon as you flatten an image you lose localized spatial information; this is one of the reasons convolutions are used in image-based machine learning.
A shape of 8000x250x250x3 has inherent meaning: 8000 image samples, each 250 pixels wide and 250 pixels high, with 3 channels. After an 8000x(250*250*3) reshape it is just a bunch of numbers, and unless you use some kind of sequence network, training on that is a bad idea.
Oversampling is a poor fit for image data; instead you can use image augmentations (crops, introducing noise such as Gaussian blur, rotations, translations, etc.), as sketched below.
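A minimal augmentation sketch with Keras' ImageDataGenerator (the specific parameter values are illustrative, not from the original answer):
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=15,       # random rotations
    width_shift_range=0.1,   # horizontal translations
    height_shift_range=0.1,  # vertical translations
    zoom_range=0.1,          # random zoom, similar in effect to random cropping
    horizontal_flip=True,
)
# X_train has shape (761, 64, 64, 3); the generator yields augmented batches on the fly
train_gen = datagen.flow(X_train, y_train, batch_size=32)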
First, flatten the images.
Apply SMOTE on this flattened image data and its labels.
Reshape the flattened images back to RGB images:
from imblearn.over_sampling import SMOTE

sm = SMOTE(random_state=42)
train_rows = len(X_train)
X_train = X_train.reshape(train_rows, -1)   # e.g. (80, 30000) for 100x100x3 images
X_train, y_train = sm.fit_resample(X_train, y_train)
X_train = X_train.reshape(-1, 100, 100, 3)  # back to (>80, 100, 100, 3)
I am trying to detect the number of contours in this image. Ideally it should be 3, but due to noise I was not getting the ideal result. Hence I tried to blur the image before thresholding it, as below:
import numpy as np
import cv2

img = cv2.imread('Inkedblueimagewithdot.jpg')
cv2.imshow('original', img)
blur = cv2.pyrMeanShiftFiltering(img, 21, 49)  # edge-preserving smoothing to reduce noise
gray_image = cv2.cvtColor(blur, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray_image, 70, 255, cv2.THRESH_BINARY)
_, contours, hierarchy = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)  # 3-value return (OpenCV 3.x)
print(len(contours))
contourimage = cv2.drawContours(img, contours, -1, (255, 255, 255), 20)
cv2.imshow('contours', contourimage)
cv2.waitKey(0)
cv2.destroyAllWindows()
output is:
2
This is the input image:
This is the output image:
In order to obtain 3 contours, you could use cv2.RETR_LIST. It lists all the contours present in the binary image irrespective of any hierarchy, as mentioned in the OpenCV documentation; see the sketch just below.
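For example (the [-2] indexing is a small portability trick, since OpenCV 3.x and 4.x return a different number of values from findContours):
contours = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)[-2]
print(len(contours))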
To answer the second question, you could try setting an area constraint so that contours below a certain area are discarded. For the image provided I set an area threshold of 4000:
for i, c in enumerate(contours):
    if cv2.contourArea(c) > 4000:        # keep only sufficiently large contours
        x, y, w, h = cv2.boundingRect(c)
        roi = image[y:y + h, x:x + w]    # "image" is the original, unmodified input image
        cv2.imshow('cropped_region', roi)
        cv2.waitKey(0)
Expected result:
I'm trying to fit data with the following shapes to the pretrained Keras VGG16 model.
The image input shape is (32383, 96, 96, 3)
The label shape is (32383, 17)
and I got this error:
expected block5_pool to have 4 dimensions, but got array with shape (32383, 17)
at this line
model.fit(x=X_train, y=Y_train, validation_data=(X_valid, Y_valid),
          batch_size=64, verbose=2, epochs=epochs, callbacks=callbacks, shuffle=True)
Here's how I define my model
model = VGG16(include_top=False, weights='imagenet', input_tensor=None, input_shape=(96,96,3),classes=17)
How did max pooling give me a 2D tensor and not a 4D tensor? I'm using the original model from keras.applications.vgg16. How can I fix this error?
Your problem comes from VGG16(include_top=False, ...), since this loads only the convolutional part of VGG. That is why Keras complains that it got a 2-dimensional array instead of a 4-dimensional one (the 4 dimensions come from the fact that the convolutional output has shape (nb_of_examples, width, height, channels)). To overcome this you need to either set include_top=True or add layers which squash the convolutional output down to 2D, e.g. Flatten, GlobalMaxPooling2D or GlobalAveragePooling2D, followed by a set of Dense layers with a final Dense of size 17 and a softmax activation. A sketch of the second option is shown below.
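A minimal sketch of adding such a classification head on top of the convolutional base (the hidden layer size of 256 is illustrative, not from the original answer):
from keras.applications.vgg16 import VGG16
from keras.layers import GlobalAveragePooling2D, Dense
from keras.models import Model

base = VGG16(include_top=False, weights='imagenet', input_shape=(96, 96, 3))
x = GlobalAveragePooling2D()(base.output)  # (None, 3, 3, 512) -> (None, 512)
x = Dense(256, activation='relu')(x)       # illustrative hidden layer
out = Dense(17, activation='softmax')(x)   # 17 classes, matching the label shape
model = Model(inputs=base.input, outputs=out)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])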