To practice Wiener deconvolution, I'm trying to perform a simple deconvolution:
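    # Imports assumed; the original post does not show them.
    import numpy as np
    from numpy.fft import fft2, ifft2
    from scipy import signal

    def div(img1, img2):
        # Element-wise inverse of img2, with a fallback value where |img2| is close to zero.
        # Note: img1 is accepted but never used.
        res = np.zeros(img2.shape, dtype=complex)
        for i in range(img2.shape[0]):
            for j in range(img2.shape[1]):
                if np.abs(img2[i][j]) > 0.001:
                    res[i][j] = 1 / img2[i][j]
                else:
                    res[i][j] = 0.001
        return res

    filtre = np.asarray([[1, 1, 1],
                         [1, 1, 1],
                         [1, 1, 1]]) * 1/9
    filtre_freq = fft2(filtre)
    v = signal.convolve(img, filtre)
    F = div(1, filtre_freq)
    f = ifft2(F)
    res = signal.convolve(v, f)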
I am trying to compute the inverse filter in the frequency domain, bring it back to the spatial domain, and convolve the blurred image with it. On paper it's pretty simple, even if I have to handle the divisions by 0 without really knowing how to do it.
But my results seem really inconsistent.
If anyone can enlighten me on this ... Thanks in advance and have a great evening.
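For comparison, here is a minimal sketch of one common way to handle the near-zero divisions: do the whole division in the frequency domain at the image's size and add a small constant to the denominator. It assumes a circular-convolution blur model and no noise; the function name wiener_like_deconvolve, the eps parameter and the random test image are made up for this illustration, not taken from the post.

```python
import numpy as np

def wiener_like_deconvolve(blurred, kernel, eps=1e-3):
    """Regularized inverse filter (assumes circular convolution)."""
    # Zero-pad the kernel to the image size so both spectra have the same shape.
    H = np.fft.fft2(kernel, s=blurred.shape)
    G = np.fft.fft2(blurred)
    # conj(H) / (|H|^2 + eps) is close to 1/H where |H| is large
    # and stays bounded where H is close to zero.
    F_hat = np.conj(H) * G / (np.abs(H) ** 2 + eps)
    return np.real(np.fft.ifft2(F_hat))

rng = np.random.default_rng(0)
img = rng.random((32, 32))        # stand-in image (assumption)
kernel = np.ones((3, 3)) / 9.0    # 3x3 mean filter

# Blur with a circular convolution so the model above matches exactly.
blurred = np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(kernel, s=img.shape)))
restored = wiener_like_deconvolve(blurred, kernel, eps=1e-6)

print("error before:", np.mean((blurred - img) ** 2))
print("error after: ", np.mean((restored - img) ** 2))
```

With real images and noise, eps usually has to be larger, and the borders need care because signal.convolve does not wrap around the way the FFT does.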


Modifying the loss in ppo in stable-baselines3

I'm trying to implement an addition to the loss function of the PPO algorithm in stable-baselines3. For this I collected additional observations for the states s(t-10) and s(t+1), which I can access in the train function of the PPO class as part of the rollout_buffer.
I'm using a 3-layer-mlp as my network architecture and need the outputs of the second layer for the triplet (s(t-α), s(t), s(t+1)) to use them to calculate L = max(d(s(t+1) , s(t)) − d(s(t+1) , s(t−α)) + γ, 0), where d is the L2-distance.
Finally I want to add this term to the old loss, so loss = loss + 0.3 * L
This is my implementation starting with the original loss in line 242:
    loss = policy_loss + self.ent_coef * entropy_loss + self.vf_coef * value_loss

    # Reuses the policy MLP's modules, dropping the last child module
    net1 = nn.Sequential(*list(self.policy.mlp_extractor.policy_net.children())[:-1])

    L_losses = []
    a = 0
    obs = rollout_data.observations
    obs_alpha = rollout_data.observations_alpha
    obs_plusone = rollout_data.observations_plusone
    inds = rollout_data.inds
    for i in inds:
        if i > alpha:  # only use observations for which L can be calculated
            fs_t = net1(obs[a])
            fs_talpha = net1(obs_alpha[a])
            fs_tone = net1(obs_plusone[a])
            L = max(
                th.norm(th.subtract(fs_tone, fs_t)) - th.norm(th.subtract(fs_tone, fs_talpha)) + 1.0, 0.0)
            L_losses.append(L)
        a += 1
    L_loss = th.mean(th.FloatTensor(L_losses))
    loss += 0.3 * L_loss
So with net1 I tried to get a clone of the original network with the outputs from the second layer. I am unsure if this is the right way to do this.
I have some questions about my approach, because the resulting performance is slightly worse than without the added term, although it should be slightly better:
Is my way of getting the outputs of the second layer of the mlp network working?
When loss.backward() is called can the gradient be calculated correctly (with the new term included)?
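On the gradient question, a minimal sketch of how the extra term could be written so that it stays in the autograd graph. Building a new tensor with th.FloatTensor(...) from a list of values does not carry gradient history, so this version works on the whole batch with tensor operations instead. The function name triplet_term, the layer sizes and the random inputs are assumptions for illustration only; the i > alpha mask from the question is left out.

```python
import torch as th
import torch.nn as nn

def triplet_term(net, obs, obs_alpha, obs_plusone, margin=1.0):
    """Batched max(d(s(t+1), s(t)) - d(s(t+1), s(t-alpha)) + margin, 0), averaged."""
    fs_t = net(obs)
    fs_talpha = net(obs_alpha)
    fs_tone = net(obs_plusone)
    d_near = th.linalg.norm(fs_tone - fs_t, dim=1)      # L2 distance per sample
    d_far = th.linalg.norm(fs_tone - fs_talpha, dim=1)
    # th.clamp keeps the result a tensor that is connected to the graph.
    return th.clamp(d_near - d_far + margin, min=0.0).mean()

# Stand-in for the truncated policy network (hypothetical sizes).
net = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 8), nn.Tanh())
obs, obs_alpha, obs_plusone = (th.randn(16, 4) for _ in range(3))

L_loss = triplet_term(net, obs, obs_alpha, obs_plusone)
L_loss.backward()  # gradients now reach the layers inside net
```

In the train function the result would be added as loss = loss + 0.3 * L_loss before loss.backward() is called.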

Need a vectorized solution in pytorch

I'm doing an experiment using face images in the PyTorch framework. The input x is a face image of size 5 * 5 (height * width) with 192 channels.
Objective: to obtain patches of x of size patch_size (given as an argument).
I have obtained the required result with two for loops, but I want a vectorized solution so that the computational cost is much lower than with the loops.
Used: PyTorch 0.4.1, (12 GB) Nvidia TitanX GPU.
The following is my implementation using two for loops:
    def extractpatches(x, patch_size):  # x is bs x 192 x 5 x 5
        patches = x.unfold(2, patch_size, 1).unfold(3, patch_size, 1)
        bs, c, pi, pj, _, _ = patches.size()  # bs, 192, ...

        cnt = 0
        p = torch.empty((bs, pi*pj, c, patch_size, patch_size)).to(device)
        s = torch.empty((bs, pi*pj, c*patch_size*patch_size)).to(device)
        # Want a vectorized method instead of the two for loops below
        for i in range(pi):
            for j in range(pj):
                p[:, cnt, :, :, :] = patches[:, :, i, j, :, :]
                s[:, cnt, :] = p[:, cnt, :, :, :].view(-1, c*patch_size*patch_size)
                cnt = cnt + 1
        return s
Thanks for your help in advance.
I think you can try the following. I used some parts of your code for my experiment and it worked for me. Here l and f are lists of tensor patches:
    l = [patches[:, :, i // pj, i % pj, :, :] for i in range(pi * pj)]
    f = [l[i].contiguous().view(-1, c*patch_size*patch_size) for i in range(pi * pj)]
You can verify the above code using toy input values.
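A loop-free variant is also possible. The sketch below assumes the same layout as in the question (x is bs x c x h x w, stride 1) and moves the two patch-index axes in front of the channel axis before flattening; the name extract_patches and the random input are only for illustration.

```python
import torch

def extract_patches(x, patch_size):
    # patches: bs x c x pi x pj x patch_size x patch_size
    patches = x.unfold(2, patch_size, 1).unfold(3, patch_size, 1)
    bs, c, pi, pj, _, _ = patches.size()
    # Reorder to bs x pi x pj x c x ps x ps, then merge (pi, pj) and (c, ps, ps).
    return patches.permute(0, 2, 3, 1, 4, 5).reshape(
        bs, pi * pj, c * patch_size * patch_size)

x = torch.randn(2, 192, 5, 5)
s = extract_patches(x, 3)
print(s.shape)  # bs x (pi*pj) x (c*patch_size*patch_size)
```

The row order matches the loop version: patch (i, j) ends up at index i * pj + j.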

Efficient way to extract and collect a random subsample of a generator in Julia

Consider a generator in Julia that would take a lot of memory if collected:
    g=(x^2 for x=1:9999999999999999)
I want to take a small random subsample (say 1%) of it, but I do not want to collect() the object because that would take a lot of memory.
Until now, the trick I have been using is this:
    temp=collect((( rand()>0.01 ? nothing : x ) for x in g))
    random_sample= temp[temp.!=nothing]
But this is not efficient for generators with a lot of elements; collecting something with so many nothing elements doesn't seem right.
Any idea is highly appreciated. I guess the trick is to get random elements from the generator without having to allocate memory for all of it.
Thank you very much
You can use a generator with an if condition like this:
    [v for v in g if rand() < 0.01]
or, if you want a slightly faster but more verbose approach (I have hardcoded 0.01 and the element type of g, and I assume that your generator supports length; otherwise you can remove the sizehint! line):
    function collect_sample(g)
        r = Int[]
        sizehint!(r, round(Int, length(g) * 0.01))
        for v in g
            if rand() < 0.01
                push!(r, v)
            end
        end
        return r
    end
Here are examples of a self-avoiding sampler and a reservoir sampler, both giving you a fixed output size. The smaller the fraction of the input you want, the better it is to use the self-avoiding sampler:
    function self_avoiding_sampler(source_size, ith, target_size)
        rng = 1:source_size
        idx = rand(rng)
        x1 = ith(idx)
        r = Vector{typeof(x1)}(undef, target_size)
        r[1] = x1
        s = Set{Int}(idx)
        sizehint!(s, target_size)
        for i = 2:target_size
            # redraw until we hit an index that has not been used yet
            while idx in s
                idx = rand(rng)
            end
            @inbounds r[i] = ith(idx)
            push!(s, idx)
        end
        return r
    end

    function reservoir_sampler(g, target_size)
        r = Vector{Int}(undef, target_size)
        for (i, v) in enumerate(g)
            if i <= target_size
                @inbounds r[i] = v
            else
                # keep element i with probability target_size / i
                j = rand(1:i)
                if j <= target_size
                    @inbounds r[j] = v
                end
            end
        end
        return r
    end
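To make the reservoir idea easy to try on toy input, here is the same scheme as a short Python sketch (Python rather than Julia only to keep it quick to run; the names reservoir_sample, k and the seeded generator are choices made for this illustration). It keeps exactly k elements in memory no matter how long the input is.

```python
import random

def reservoir_sample(iterable, k, rng=random):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    reservoir = []
    for i, v in enumerate(iterable, start=1):
        if i <= k:
            reservoir.append(v)          # fill the reservoir first
        else:
            j = rng.randint(1, i)        # uniform in 1..i, both ends inclusive
            if j <= k:                   # happens with probability k / i
                reservoir[j - 1] = v
    return reservoir

g = (x * x for x in range(1, 100_001))   # lazy, never materialized as a list
sample = reservoir_sample(g, 10, random.Random(0))
print(sample)
```

Unlike the rand() < 0.01 filter, the output size here is fixed in advance rather than being 1% on average.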

Shape of input to calculate information gain

I want to calculate the information gain on the 20_newsgroup data set.
I am using the code here (I have also put a copy of it at the bottom of the question).
As you can see, the input to the algorithm is X, y.
My confusion is that X is going to be a matrix with documents as rows and features as columns (for 20_newsgroup it is 11314 x 1000 if I only consider 1000 features).
But according to the concept of information gain, it should be calculated for each feature.
(So I was expecting the code to loop through each feature, with the input to the function being a matrix where rows are features and columns are classes.)
But X here is not indexed by feature; its rows stand for documents, and I cannot see the part of the code that takes care of this! (I mean considering each document and then going through each feature of that document, i.e. looping through the rows while also looping through the columns, since the features are stored in columns.)
I have read this and this and many similar questions, but they are not clear about the shape of the input matrix.
This is the code for reading 20_newsgroup:
    newsgroup_train = fetch_20newsgroups(subset='train')
    X, y = newsgroup_train.data, newsgroup_train.target
    cv = CountVectorizer(max_df=0.99, min_df=0.001, max_features=1000, stop_words='english', lowercase=True, analyzer='word')
    X_vec = cv.fit_transform(X)
X_vec.shape is (11314, 1000), i.e. documents by features, not features by classes. Am I calculating information gain in an incorrect way?
This is the code for information gain:
    def information_gain(X, y):

        def _calIg():
            # Information gain of the current feature, from the counters below
            entropy_x_set = 0
            entropy_x_not_set = 0
            for c in classCnt:
                probs = classCnt[c] / float(featureTot)
                entropy_x_set = entropy_x_set - probs * np.log(probs)
                probs = (classTotCnt[c] - classCnt[c]) / float(tot - featureTot)
                entropy_x_not_set = entropy_x_not_set - probs * np.log(probs)
            for c in classTotCnt:
                if c not in classCnt:
                    probs = classTotCnt[c] / float(tot - featureTot)
                    entropy_x_not_set = entropy_x_not_set - probs * np.log(probs)
            return entropy_before - ((featureTot / float(tot)) * entropy_x_set
                                     + ((tot - featureTot) / float(tot)) * entropy_x_not_set)

        # Class counts and entropy over all documents
        tot = X.shape[0]
        classTotCnt = {}
        entropy_before = 0
        for i in y:
            if i not in classTotCnt:
                classTotCnt[i] = 1
            else:
                classTotCnt[i] = classTotCnt[i] + 1
        for c in classTotCnt:
            probs = classTotCnt[c] / float(tot)
            entropy_before = entropy_before - probs * np.log(probs)

        # nz[0]: feature index, nz[1]: document index of each nonzero entry
        nz = X.T.nonzero()
        pre = 0
        classCnt = {}
        featureTot = 0
        information_gain = []
        for i in range(0, len(nz[0])):
            if (i != 0 and nz[0][i] != pre):
                # features that never appear get an information gain of 0
                for notappear in range(pre+1, nz[0][i]):
                    information_gain.append(0)
                ig = _calIg()
                information_gain.append(ig)
                pre = nz[0][i]
                classCnt = {}
                featureTot = 0
            featureTot = featureTot + 1
            yclass = y[nz[1][i]]
            if yclass not in classCnt:
                classCnt[yclass] = 1
            else:
                classCnt[yclass] = classCnt[yclass] + 1
        ig = _calIg()
        information_gain.append(ig)

        return np.asarray(information_gain)
Well, after going through the code in detail, I learned more about X.T.nonzero().
It is correct that information gain has to be computed per feature.
It is also correct that the matrix scikit-learn gives us here is documents by features.
The code calls X.T.nonzero(), which returns two index arrays (row indices and column indices) for the nonzero entries of the transposed matrix. The loop on the next line then runs over the length of those arrays: range(0, len(X.T.nonzero()[0])).
Because X is transposed first, X.T.nonzero()[0] holds the feature index of every nonzero entry and X.T.nonzero()[1] holds the matching document index, which is how the loop gets at the features :)
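To make the "one value per feature" point concrete, here is a small dense sketch that takes the same documents-by-features input and loops over the columns explicitly. It is not the code from the question: the names entropy and information_gain_dense and the toy matrix are made up for illustration, and it treats a feature as present whenever its count is nonzero.

```python
import numpy as np

def entropy(labels):
    """Entropy (natural log) of the class distribution in labels."""
    if labels.size == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

def information_gain_dense(X, y):
    """X: documents x features, y: class per document. Returns one gain per feature."""
    X = np.asarray(X)
    y = np.asarray(y)
    h_before = entropy(y)
    gains = np.empty(X.shape[1])
    for f in range(X.shape[1]):           # loop over features (columns)
        present = X[:, f] != 0            # documents that contain feature f
        w = present.mean()                # fraction of documents with the feature
        gains[f] = (h_before
                    - w * entropy(y[present])
                    - (1 - w) * entropy(y[~present]))
    return gains

X = np.array([[1, 0, 1],
              [1, 0, 0],
              [0, 1, 1],
              [0, 1, 0]])
y = np.array([0, 0, 1, 1])
gains = information_gain_dense(X, y)
print(gains)
```

In the toy data the first two features split the classes perfectly and the third carries no information about the class.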

Linear Regression not optimizing for non linear data

I am new to ML and am trying my hand at linear regression. I am using this dataset. The data and my "optimized" model look like this:
I am modifying the data like this:
    X = np.vstack((np.ones((X.size)), X, X**2))
    Y = np.log10(Y)
    # I have also tried roots of Y and a degree-3 feature
Initial cost: 0.8086672720475084
Optimized cost: 0.7282965408177141
I am unable to optimize further, no matter the number of runs.
Increasing the learning rate causes the cost to increase.
The rest of my algorithm seems fine, since I am able to optimize for a simpler dataset, shown below:
Sorry if this is something basic, but I can't seem to find a way to optimize my model for the original data.
Please have a look at my code; I don't know why it's not working.
    def GradientDescent(X, Y, theta, alpha):
        m = X.shape[1]  # number of samples (X is features x samples)
        h = Predict(X, theta)
        gradient = np.dot(X, (h - Y))
        gradient.shape = (gradient.size, 1)
        gradient = gradient/m
        theta = theta - alpha*gradient
        cost = CostFunction(X, Y, theta)
        return theta, cost

    def CostFunction(X, Y, theta):
        m = X.shape[1]
        h = Predict(X, theta)
        cost = h - Y
        cost = np.sum(np.square(cost))/(2*m)
        return cost

    def Predict(X, theta):
        h = np.transpose(X).dot(theta)
        return h
X has shape (2, 333)
Y has shape (333, 1)
I tried debugging it again but I can't find the problem. Please help me.
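One thing that may be worth checking, although nothing in the post confirms it is the cause here: when X and X**2 have very different scales, gradient descent only stays stable with a tiny learning rate and then converges very slowly. The sketch below uses the same features-by-samples layout as the code above on made-up synthetic data (the data, the standardize helper and the learning rate are all assumptions) and standardizes the non-constant features before running gradient descent.

```python
import numpy as np

def predict(X, theta):
    return X.T.dot(theta)

def cost_function(X, Y, theta):
    m = X.shape[1]
    return float(np.sum(np.square(predict(X, theta) - Y)) / (2 * m))

def gradient_step(X, Y, theta, alpha):
    m = X.shape[1]
    gradient = X.dot(predict(X, theta) - Y) / m
    return theta - alpha * gradient

def standardize(v):
    # zero mean, unit variance
    return (v - v.mean()) / v.std()

# Synthetic stand-in data (not the dataset from the question).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 100.0, size=333)
Y = (0.5 + 0.02 * x + 0.001 * x**2 + rng.normal(0, 0.1, size=333)).reshape(-1, 1)

# Bias row plus standardized x and x**2, shape (3, 333).
X = np.vstack((np.ones(x.size), standardize(x), standardize(x**2)))

theta = np.zeros((3, 1))
initial_cost = cost_function(X, Y, theta)
for _ in range(5000):
    theta = gradient_step(X, Y, theta, alpha=0.1)
final_cost = cost_function(X, Y, theta)

print("initial cost:", initial_cost)
print("final cost:  ", final_cost)
```

If scaling does not help on the real data, comparing the final cost with a closed-form least-squares fit (for example np.linalg.lstsq) shows whether the optimizer or the model is the limiting factor.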
