"warning: basinhopping: local minimization failure" when doing maximum entropy optimization - machine-learning

I'm minimizing a multiple-variable function using scipy.optimize.basinhopping as part of maximum entropy optimization. It works but occasionally I get local minimization failure warning messages. I'm trying to figure out whether they can be ignored:
basinhopping step 0: f 4.46112e+06
basinhopping step 1: f 4.4611e+06 trial_f 4.4611e+06 accepted 1 lowest_f 4.4611e+06
found new global minimum on step 1 with function value 4.4611e+06
basinhopping step 2: f 4.4611e+06 trial_f 4.4611e+06 accepted 1 lowest_f 4.4611e+06
found new global minimum on step 2 with function value 4.4611e+06
basinhopping step 3: f 4.4611e+06 trial_f 4.4611e+06 accepted 1 lowest_f 4.4611e+06
found new global minimum on step 3 with function value 4.4611e+06
basinhopping step 4: f 4.4611e+06 trial_f 4.4611e+06 accepted 1 lowest_f 4.4611e+06
found new global minimum on step 4 with function value 4.4611e+06
basinhopping step 5: f 4.4611e+06 trial_f 4.4611e+06 accepted 1 lowest_f 4.4611e+06
found new global minimum on step 5 with function value 4.4611e+06
basinhopping step 6: f 4.4611e+06 trial_f 4.4611e+06 accepted 1 lowest_f 4.4611e+06
found new global minimum on step 6 with function value 4.4611e+06
warning: basinhopping: local minimization failure
basinhopping step 7: f 4.4611e+06 trial_f 4.4611e+06 accepted 1 lowest_f 4.4611e+06
found new global minimum on step 7 with function value 4.4611e+06
<snip>
At the end of the optimization, I get this:
fun: 4461103.852803631
lowest_optimization_result: fun: 4461103.852803631
hess_inv: <8x8 LbfgsInvHessProduct with dtype=float64>
jac: array([ 0.021769, 0.028649, -0.000128, -0.056923, -0.002773, -0.045319, -0.013719, -0.02872 ])
message: 'ABNORMAL_TERMINATION_IN_LNSRCH'
nfev: 20
nit: 0
njev: 20
status: 2
success: False
x: array([ -31870.962047, -24718.132626, -29847.253508, -141243.035888, 172068.711537, 48290.798176, -7548.406846, -736.399023])
message: ['requested number of basinhopping iterations completed successfully']
minimization_failures: 5
nfev: 1230
nit: 100
njev: 1230
x: array([ -31870.962047, -24718.132626, -29847.253508, -141243.035888, 172068.711537, 48290.798176, -7548.406846, -736.399023])
It seems the global search worked, but the local optimization failed 5 times.
Here's a link to the optimization result. The red curve is the maximum entropy optimization result, using the green curve as the initial guess. Note the artifact at 69 keV.
I suspect the artifact is related to the 'local minimization failure' warning. If so, what could be done to address the local optimization failure? I would appreciate any thoughts on how to investigate this.
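One way to start investigating, offered as a sketch rather than a diagnosis: ABNORMAL_TERMINATION_IN_LNSRCH from L-BFGS-B is commonly associated with a gradient that disagrees with the function, or with badly scaled variables. The snippet below checks an analytic gradient with scipy.optimize.check_grad and records every local minimization through the basinhopping callback. The quadratic objective, its gradient, and the names fun_and_grad and record are hypothetical stand-ins, not the entropy functional from the question.

```python
import numpy as np
from scipy.optimize import basinhopping, check_grad

# Hypothetical stand-in objective; replace with the real entropy functional.
def objective(x):
    return float(np.sum((x - 1.0) ** 2))

# Hypothetical analytic gradient of the stand-in objective.
def gradient(x):
    return 2.0 * (x - 1.0)

def fun_and_grad(x):
    # With jac=True the minimizer expects (value, gradient) from one call.
    return objective(x), gradient(x)

x0 = np.zeros(8)

# Step 1: compare the analytic gradient with a finite-difference estimate.
# A large value here suggests the gradient and the function disagree.
grad_error = check_grad(objective, gradient, x0)
print("gradient check error:", grad_error)

# Step 2: record the outcome of every basinhopping step.
history = []

def record(x, f, accept):
    history.append((f, accept))

result = basinhopping(
    fun_and_grad,
    x0,
    niter=20,
    minimizer_kwargs={"method": "L-BFGS-B", "jac": True},
    callback=record,
)

local = result.lowest_optimization_result
print("failures:", result.minimization_failures)
print("best local run:", local.success, local.message)
```

If the gradient check passes and the failures persist, rescaling the variables so they have comparable magnitudes may be worth trying, since the x values in the question span several orders of magnitude.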

Performing a one to many join in R dplyr

How can I do a one-to-many join without any keys in R using dplyr?
I have two tables:
origin<-tribble(~"o",
1,2)
destination<-tribble(~"d",
5,
6,
7)
I want to merge both of them without any keys like the following:
od<- tribble(~"o",~"d",
1,5,
1,6,
1,7,
2,5,
2,6,
2,7)
Can anyone help me out with this?
You can use slice and rep to repeat the rows in origin based on the length of destination. Then, inside of bind_cols, we can create a list and repeat the values in destination based on the length of origin; then, bind them together.
library(tidyverse)
origin %>%
slice(rep(1:n(), each = nrow(destination[, 1]))) %>%
bind_cols(., d = unlist(rep(
c(destination[, 1]), times = nrow(origin)
)))
Output
# A tibble: 6 × 2
o d
<dbl> <dbl>
1 1 5
2 1 6
3 1 7
4 2 5
5 2 6
6 2 7
tidyr::crossing and expand_grid can give you a cross join of two dataframes.
tidyr::crossing(origin, destination)
#tidyr::expand_grid(origin, destination)
# o d
# <dbl> <dbl>
#1 1 5
#2 1 6
#3 1 7
#4 2 5
#5 2 6
#6 2 7
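For intuition, a join without keys like this is the Cartesian product of the rows of the two tables. A minimal Python sketch of the same idea using only the standard library follows; the names origin, destination and od mirror the question, and this illustrates the concept rather than how tidyr implements it.

```python
from itertools import product

origin = [1, 2]          # column "o"
destination = [5, 6, 7]  # column "d"

# Pair every origin row with every destination row; origin varies slowest.
od = [{"o": o, "d": d} for o, d in product(origin, destination)]

for row in od:
    print(row["o"], row["d"])
```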

Interacting in agda-mode with agda?

It feels super awkward to interact with agda.
Consider the proof state:
_ = begin
5 ∸ 3
≡⟨⟩
4 ∸ 2 ≡⟨⟩
3 ∸ 1 ≡⟨⟩
2 ∸ 0 ≡⟨⟩ { 2 <cursor-goes-here> }0
When I type C-c C-l (type-check), it says
?0 : 2 ∸ 0 ≡ _y_131
_y_131 : ℕ [ at /home/bollu/work/plfa/src/plfa/part1/Naturals.lagda.md:586,5-10 ]
which doesn't seem like a great error? Nor does a refine (C-c C-r) give me a good error message: It only tells me:
cannot refine
How do I get Agda to tell me:
You've finished the proof, except for a missing \qed
In general, what is the "preferred mode of interaction" when building proofs?
The overall issue
Your post starts by the following assumption:
It feels super awkward to interact with agda.
The reason that could explain your feeling is that you seem to assume that Agda can infer both a term and its type, in other words, both the property you wish to prove and a proof of it. Agda can often do one of these, but asking for both does not make much sense. As a comparison, imagine being on a bench in a park when a complete stranger comes and sits next to you, saying nothing. You can see he would very much like to ask you something but, despite your efforts to make him speak, he remains silent. After a few minutes, the stranger yells at you that, despite him being thirsty, you did not bring the drink he was expecting. In this metaphor, the stranger is you, and you are Agda. There is no way you could have known he was thirsty, let alone brought him his drink.
Concretely
You gave the following piece of code:
_ = begin
5 ∸ 3 ≡⟨⟩
4 ∸ 2 ≡⟨⟩
3 ∸ 1 ≡⟨⟩
2 ∸ 0 ≡⟨⟩ { 2 <cursor-goes-here> }0
This piece of code lacks a type signature, which would allow Agda to help you more. Agda tells you so when you type-check, by providing you with the inferred type of the goal:
?0 : 2 ∸ 0 ≡ _y_131
_y_131 : ℕ [ at /home/bollu/work/plfa/src/plfa/part1/Naturals.lagda.md:586,5-10 ]
Here Agda says that your proof goal is that 2 ∸ 0 is equal to some unknown natural number y. Since this number is unknown, there is very little chance Agda can help you go further in your proof effort, because it does not even know what you wish to prove. As far as it knows, your goal could turn out to be 5 ∸ 3 ≡ 3, for which there exists no proof term.
Getting back to our metaphor, you lack the statement "I am thirsty". Should the stranger provide this piece of information, you could possibly react, which means Agda can try and help.
The solution
I'm assuming you wish to prove that the result of your subtraction is two, in which case the code is as follows:
test : 5 ∸ 3 ≡ 2
test = begin
5 ∸ 3 ≡⟨⟩
4 ∸ 2 ≡⟨⟩
3 ∸ 1 ≡⟨⟩
2 ∸ 0 ≡⟨⟩ {!!}
In this case, you can interact with Agda in various ways, which all lead to Agda providing you with a sound proof term:
You can call Agsy to solve the problem for you (CTRL-c CTRL-a), which leads to:
test : 5 ∸ 3 ≡ 2
test = begin
5 ∸ 3 ≡⟨⟩
4 ∸ 2 ≡⟨⟩
3 ∸ 1 ≡⟨⟩
2 ∸ 0 ≡⟨⟩ refl
You can try and refine the goal directly (CTRL-c CTRL-r), asking Agda if there exists any unique constructor which has the right type, which leads to the same:
test : 5 ∸ 3 ≡ 2
test = begin
5 ∸ 3 ≡⟨⟩
4 ∸ 2 ≡⟨⟩
3 ∸ 1 ≡⟨⟩
2 ∸ 0 ≡⟨⟩ refl
If you wish to wrap up your proof using \qed, you can input _∎ into the hole, after which refining (CTRL-c CTRL-r) gives:
test : 5 ∸ 3 ≡ 2
test = begin
5 ∸ 3 ≡⟨⟩
4 ∸ 2 ≡⟨⟩
3 ∸ 1 ≡⟨⟩
2 ∸ 0 ≡⟨⟩ {!!} ∎
Calling Agsy in the resulting goal naturally gives:
test : 5 ∸ 3 ≡ 2
test = begin
5 ∸ 3 ≡⟨⟩
4 ∸ 2 ≡⟨⟩
3 ∸ 1 ≡⟨⟩
2 ∸ 0 ≡⟨⟩ 2 ∎

Display summary statistics in date format using frmttable

I am trying to use the community-contributed command frmttable in Stata to generate a table of summary statistics for date variables.
However, when I execute the command, the summary statistics are displayed as integers rather than in a date format. I would like them to be displayed in an MDY format: %tdNN/DD/CCYY
The problem is shown below:
Step Dates
-------------------
Step Date
-------------------
Step 1 17,206
Step 2 17,241
Step 3 17,258
Step 4 17,619
Step 5 17,958
Step 6 18,401
Step 7 18,464
Step 8 18,976
Step 9 18,965
Step 10 19,243
Step 11 19,064
-------------------
I am not considering other table exporting commands since frmttable gives me the most flexibility. I am also trying to export the table into LaTeX.
Example data can be found below:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double Step_n float Date
2 17206
2 17234
3 17241
3 17339
4 17258
4 17626
5 17619
5 17619
5 18155
6 17958
6 19339
7 18401
7 18662
8 18464
8 19001
8.5 18976
8.5 19267
9 18965
9.5 19243
10 19064
10 20227
end
format %tdNN/DD/CCYY Date
The code I used is the following:
matrix m1 = J(11,1,.)
local i = 1
foreach s of numlist 2/8 8.5 9 9.5 10 {
quietly summarize Date if Step_n==`s'
matrix m1[`i',1]=r(min)
local i = `i' + 1
}
matrix rownames m1 = "Step 1" "Step 2" "Step 3" "Step 4" ///
"Step 5" "Step 6" "Step 7" "Step 8" "Step 9" "Step 10" "Step 11"
matrix list m1, format(%tdNN/DD/CCYY)
frmttable using m1.tex, statmat(m1) title("Step Dates") ///
sdec(0) ctitle("Step","Date") replace tex
The community-contributed command frmttable is used to produce tables for summary statistics, the format of which can be specified by the sfmt() option.
However, as its help file suggests, in its current version this does not support date formats:
"...fmtgrid has the form fmt[,fmt...] [\ fmt[,fmt...] ...]], where fmt is either e, f, fc, g, or gc..."
An attempt to run frmttable with such a format specified confirms this:
. frmttable, statmat(m1) sfmt(%tdNN/DD/CCYY)
sfmt contains elements other than "e","f","g","fc", and "gc"
r(198);
The community-contributed command esttab offers an out-of-the-box solution:
esttab matrix(m1, fmt(%tdNN/DD/CCYY)), nomtitles ///
collabel("Date") ///
title("Step Dates") ///
tex
\begin{table}[htbp]\centering
\caption{Step Dates}
\begin{tabular}{l*{1}{c}}
\hline\hline
& Date \\
\hline
Step 1 & 02/09/2007\\
Step 2 & 03/16/2007\\
Step 3 & 04/02/2007\\
Step 4 & 03/28/2008\\
Step 5 & 03/02/2009\\
Step 6 & 05/19/2010\\
Step 7 & 07/21/2010\\
Step 8 & 12/15/2011\\
Step 9 & 12/04/2011\\
Step 10 & 09/07/2012\\
Step 11 & 03/12/2012\\
\hline\hline
\end{tabular}
\end{table}
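As a sanity check on the arithmetic: Stata daily dates (%td) count days since 01jan1960, so the integers that frmttable prints can be converted by hand. A small Python sketch under that assumption, where the helper name stata_td_to_mdy is hypothetical:

```python
from datetime import date, timedelta

STATA_EPOCH = date(1960, 1, 1)  # day 0 on Stata's %td scale

def stata_td_to_mdy(days):
    """Format a Stata daily date value as MM/DD/YYYY."""
    return (STATA_EPOCH + timedelta(days=days)).strftime("%m/%d/%Y")

# The first three values from the integer table in the question.
for value in (17206, 17241, 17258):
    print(value, stata_td_to_mdy(value))
```

The converted values agree with the first three rows of the esttab output above.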

GLMM glmer and glmmADMB - comparison error

I am trying to test whether the number of seeds obtained differs among five populations under different treatments, with maternal plant and paternal plant as random effects. First I tried to fit a glmer model.
dat <-dat [,c(12,7,6,13,8,11)]
dat$parents<-factor(paste(dat$mother,dat$father,sep="_"))
compareTreat <- function(d)
{
d$treatment <-factor(d$treatment)
print (tapply(d$pop,list(d$pop,d$treatment),length))
print(summary(fit<-glmer(seed_no~treatment+(1|pop/mother)+
(1|pop/father),data=d,family="poisson")))
}
Then, I compared two treatments in two populations (pop 64 and pop 121, in this case). The other populations do not have these particular treatments, so I get NA values for those.
compareTreat(subset(dat,treatment%in%c("IE 5x","IE 7x")&pop%in%c(64,121)))
This is the output:
IE 5x IE 7x
10 NA NA
45 NA NA
64 31 27
121 33 28
144 NA NA
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: poisson ( log )
Formula: seed_no ~ treatment + (1 | pop/mother) + (1 | pop/father)
Data: d
AIC BIC logLik deviance df.resid
592.5 609.2 -290.2 580.5 113
Scaled residuals:
Min 1Q Median 3Q Max
-1.8950 -0.8038 -0.2178 0.4440 1.7991
Random effects:
Groups Name Variance Std.Dev.
father.pop (Intercept) 3.566e-01 5.971e-01
mother.pop (Intercept) 9.456e-01 9.724e-01
pop (Intercept) 1.083e-10 1.041e-05
pop.1 (Intercept) 1.017e-10 1.008e-05
Number of obs: 119, groups: father:pop, 81; mother:pop, 24; pop, 2
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.74664 0.24916 2.997 0.00273 **
treatmentIE 7x -0.05789 0.17894 -0.324 0.74629
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr)
tretmntIE7x -0.364
It seems there are no differences between treatments. But as there are many zeros in the data, a zero-inflated model would be worth trying. I tried glmmADMB and wrote the script like this:
compareTreat<-function(d)
{
d$treatment<-factor(d$treatment)
print(tapply(d$pop,list(d$pop,d$treatment), length))
print(summary(fit_zip<-glmmadmb(seed_no~treatment + (1|pop/mother)+
(1|pop/father),data=d,family="poisson", zeroInflation=TRUE)))
}
Then I compared again the treatments. Here I have not changed the code.
compareTreat(subset(dat,treatment%in%c("IE 5x","IE 7x")&pop%in%c(64,121)))
But in that case, the output is
IE 5x IE 7x
10 NA NA
45 NA NA
64 31 27
121 33 28
144 NA NA
Error in pop:father : NA/NaN argument
In addition: Warning messages:
1: In pop:father :
numerical expression has 119 elements: only the first used
2: In pop:father :
numerical expression has 119 elements: only the first used
3: In eval(parse(text = x), data) : NAs introduced by coercion
Called from: eval(parse(text = x), data)
I tried to change everything I came up with, but I still don't know where the problem is.
If I remove the (1|pop/father) term from the glmmadmb call, the model runs, but that does not feel correct. I wonder whether the mistake is in the code prior to the glmmadmb call (although it worked fine with the glmer model), or in the comparison itself after the model. I also tried removing NAs with na.omit in case that was the issue, but it made no difference. Why does the script stop instead of continuing to run?
I am a beginner with RStudio; my R version is 3.4.2, called Short Summer. If someone with experience could point me in the right direction I would be very grateful!
H.

Kdb/Q Group By Minimum gives infinity

Kdb returns infinity for a group whose values are all null when a group-by minimum is performed.
t:([]a: 1 1 2;b: 3 2 0n)
select min b by a from t
a
1 2.0
2 0w
0w is infinity.
Is there any way I can get null (0n) for 2?
From Jeff Borror's q for mortals:
q)min 0N 5 0N 1 3 / nulls are ignored
1
q)min 0N 0N / infinity if all null
0W
http://code.kx.com/q/ref/stats-aggregates/#min-minimum
That's the expected result; you need to update afterwards:
update b:?[0w=b;0n;b] from select min b by a from t
You should be careful when operating with nulls. Note the following as additional information:
q)max 0N 0N
-0W
q)min 0N 0N
0W
q)0N+2
0N
q)sum 0N 2
2
q)sum 0N 0N
0
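The behaviour is easier to see once min is treated as an aggregate that skips nulls: with nothing left to compare, it falls back to its identity element, positive infinity. The Python sketch below mimics that rule and the fix applied afterwards; the helper q_like_min and the use of NaN as the null marker are illustrative assumptions, not q's implementation.

```python
import math
from collections import defaultdict

# Mirrors table t from the question: columns a and b, with NaN as the null.
rows = [(1, 3.0), (1, 2.0), (2, math.nan)]

groups = defaultdict(list)
for a, b in rows:
    groups[a].append(b)

def q_like_min(values):
    """Minimum that skips nulls; returns +infinity when no values remain."""
    present = [v for v in values if not math.isnan(v)]
    return min(present, default=math.inf)

# Step 1: grouped minimum; the group with only nulls yields infinity.
raw = {a: q_like_min(vs) for a, vs in groups.items()}

# Step 2: map infinity back to null, like the update statement above.
fixed = {a: (math.nan if math.isinf(m) else m) for a, m in raw.items()}

print(raw)
print(fixed)
```

Note that this replacement, like the update in q, would also turn a genuine infinity in the data into a null.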
