Update model: 3-way interaction terms not dropped - nlme

My question is highly related to this one:
R update() interaction term not dropped
However, I don't have multiple categories in my predictor variables, so I don't understand how my issue relates to the answer. Maybe I'm just not understanding it...
I'd like to remove the insignificant 3-way interaction terms in a model reduction process one at a time.
However, the following happens:
model1 <- lme(sum.leafmass ~ stand.td.Sept.2017*stand.wtd.Sept.2017*I((stand.td.Sept.2017)^2)*I((stand.wtd.Sept.2017)^2), random = ~1|block/fence, method="ML", data=subset(Total.CiPEHR, species=="EV"), na.action=na.omit)
model2 <- update(model1,.~.-stand.td.Sept.2017:stand.wtd.Sept.2017:I((stand.td.Sept.2017)^2):I((stand.wtd.Sept.2017)^2))
summary(model2) ## correctly eliminates the non-significant 4-way interaction
DF t-value p-value
(Intercept) 4 3.849259 0.0183
stand.td.Sept.2017 4 -1.436666 0.2242
stand.wtd.Sept.2017 4 -2.921806 0.0432
I((stand.td.Sept.2017)^2) 4 4.594303 0.0101
I((stand.wtd.Sept.2017)^2) 4 -0.313197 0.7698
stand.td.Sept.2017:stand.wtd.Sept.2017 4 -1.301935 0.2629
stand.td.Sept.2017:I((stand.td.Sept.2017)^2) 4 1.853451 0.1374
stand.wtd.Sept.2017:I((stand.td.Sept.2017)^2) 4 4.354757 0.0121
stand.td.Sept.2017:I((stand.wtd.Sept.2017)^2) 4 -0.028199 0.9789
stand.wtd.Sept.2017:I((stand.wtd.Sept.2017)^2) 4 1.598564 0.1852
I((stand.td.Sept.2017)^2):I((stand.wtd.Sept.2017)^2) 4 -1.683214 0.1676
stand.td.Sept.2017:stand.wtd.Sept.2017:I((stand.td.Sept.2017)^2) 4 1.972616 0.1198
stand.td.Sept.2017:stand.wtd.Sept.2017:I((stand.wtd.Sept.2017)^2) 4 -1.635314 0.1773
stand.td.Sept.2017:I((stand.td.Sept.2017)^2):I((stand.wtd.Sept.2017)^2) 4 2.190518 0.0936
stand.wtd.Sept.2017:I((stand.td.Sept.2017)^2):I((stand.wtd.Sept.2017)^2) 4 -0.968249 0.3877
##attempt to remove insignificant 3-way interaction
model3 <- update(model2,.~.,-stand.wtd.Sept.2017:I((stand.td.Sept.2017)^2):I((stand.wtd.Sept.2017)^2))
summary(model3)
DF t-value p-value
(Intercept) 4 3.849259 0.0183
stand.td.Sept.2017 4 -1.436666 0.2242
stand.wtd.Sept.2017 4 -2.921806 0.0432
I((stand.td.Sept.2017)^2) 4 4.594303 0.0101
I((stand.wtd.Sept.2017)^2) 4 -0.313197 0.7698
stand.td.Sept.2017:stand.wtd.Sept.2017 4 -1.301935 0.2629
stand.td.Sept.2017:I((stand.td.Sept.2017)^2) 4 1.853451 0.1374
stand.wtd.Sept.2017:I((stand.td.Sept.2017)^2) 4 4.354757 0.0121
stand.td.Sept.2017:I((stand.wtd.Sept.2017)^2) 4 -0.028199 0.9789
stand.wtd.Sept.2017:I((stand.wtd.Sept.2017)^2) 4 1.598564 0.1852
I((stand.td.Sept.2017)^2):I((stand.wtd.Sept.2017)^2) 4 -1.683214 0.1676
stand.td.Sept.2017:stand.wtd.Sept.2017:I((stand.td.Sept.2017)^2) 4 1.972616 0.1198
stand.td.Sept.2017:stand.wtd.Sept.2017:I((stand.wtd.Sept.2017)^2) 4 -1.635314 0.1773
stand.td.Sept.2017:I((stand.td.Sept.2017)^2):I((stand.wtd.Sept.2017)^2) 4 2.190518 0.0936
stand.wtd.Sept.2017:I((stand.td.Sept.2017)^2):I((stand.wtd.Sept.2017)^2) 4 -0.968249 0.3877
##3-way interaction term still there.
Why won't the interaction term drop? The predictor variables are continuous and so should be independent of each other, right?
Could someone please explain if I'm missing something basic here...

Solved my own question.
Silly syntax error: I had a stray comma after the .~. portion.
###Incorrect syntax.
model3 <- update(model2,.~.,-stand.wtd.Sept.2017:I((stand.td.Sept.2017)^2):I((stand.wtd.Sept.2017)^2))
###Correct syntax.
model3 <- update(model2,.~.-stand.wtd.Sept.2017:I((stand.td.Sept.2017)^2):I((stand.wtd.Sept.2017)^2))
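With the stray comma, .~. is taken as the complete replacement formula ("keep everything as is"), and the -term expression is passed along as a separate argument that never touches the formula, which is why model3 came back identical to model2. A minimal sketch of the two calls on a toy lm() fit (df, x1, x2 are made-up names):
df <- data.frame(y = rnorm(20), x1 = rnorm(20), x2 = rnorm(20))
fit <- lm(y ~ x1 * x2, data = df)
## correct: '- x1:x2' is part of the formula, so the interaction is dropped
fit2 <- update(fit, . ~ . - x1:x2)
formula(fit2)   ## y ~ x1 + x2
## incorrect: the comma ends the formula argument at '.~.' ("no change"),
## so '- x1:x2' becomes a separate, unrelated argument
## fit3 <- update(fit, . ~ ., - x1:x2)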

Related

Thoughts: time series modeling with fable and cross validation

I am building a time series model using fable and cross validation to determine the best model definition to use. Is there a risk in modeling
model(ETS(GDP))
vs
model(ETS(GDP ~ error('A') + trend('A') + season('A'))) and other explicit ETS specifications?
I am asking this because when I perused the mable from model(ETS(GDP)), the chosen model differed across some .id values. For example, ETS(A,A,A) for .id = 1, ETS(A,Ad,A) for .id = 2, etc. If this is the case, is it correct to define all the variants of ETS explicitly in order to ensure consistency?
Here is a mable I am referring to:
# A mable: 7 x 5
# Key: .id, LOB [7]
.id LOB ETS ETS_Exponential ARIMA_Exponential
<int> <chr> <model> <model> <model>
1 1 LG <ETS(A,N,N)> <ETS(A,N,N)> <ARIMA(0,0,1) w/ mean>
2 2 LG <ETS(M,N,N)> <ETS(A,N,N)> <ARIMA(0,0,1) w/ mean>
3 3 LG <ETS(A,N,N)> <ETS(A,N,N)> <ARIMA(0,0,1) w/ mean>
4 4 LG <ETS(A,N,N)> <ETS(A,N,N)> <ARIMA(0,0,1) w/ mean>
5 5 LG <ETS(A,N,N)> <ETS(M,N,N)> <ARIMA(0,0,1) w/ mean>
6 6 LG <ETS(A,N,N)> <ETS(M,N,N)> <ARIMA(0,0,0) w/ mean>
7 7 LG <ETS(A,N,N)> <ETS(M,N,N)> <ARIMA(0,0,0) w/ mean>
Thanks.
Why would you want the models to be the same? For example, if you wanted to compare model parameters for some reason, then you might want to fit the same model to all series. But if you just want good forecasts, you are probably better off having different models for different series -- some will be trended, some will be seasonal, etc., and you probably need to allow for that.
If in doubt, you could try both approaches and see which one gives the best forecasts (assuming that is what your ultimate purpose is here).
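A minimal sketch of trying both approaches inside a stretch_tsibble() cross-validation (gdp_ts is a hypothetical tsibble holding the GDP series; the .init and .step values are arbitrary):
library(fable)
library(tsibble)
library(dplyr)

cv_folds <- gdp_ts %>%
  stretch_tsibble(.init = 24, .step = 4)   ## expanding CV windows, keyed by .id

fits <- cv_folds %>%
  model(
    auto  = ETS(GDP),                                          ## spec re-selected per fold
    fixed = ETS(GDP ~ error("A") + trend("A") + season("A"))   ## one fixed spec for all folds
  )

fits %>%
  forecast(h = 4) %>%
  accuracy(gdp_ts)   ## compare out-of-sample accuracy of the two strategies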

Prime factorization of integers with Maxima

I want to use Maxima to get the prime factorization of a random positive integer, e.g. 12=2^2*3^1.
What I have tried so far:
a:random(20);
aa:abs(a);
fa:ifactors(aa);
ka:length(fa);
ta:1;
pfza: for i:1 while i<=ka do ta:ta*(fa[i][1])^(fa[i][2]);
ta;
This will be implemented in STACK for Moodle as part of an online exercise for students, so the exact implementation will be a little bit different from this, but I broke it down to these 7 lines.
I generate a random number a, make sure that it is a positive integer by using aa=|a|+1, and want to use the ifactors command to get the prime factors of aa. ka tells me the number of pairwise distinct prime factors, which I then use for the while loop in pfza. If I let this piece of code run, it returns everything fine, except for simplifying ta; that is, I don't get ta as a product of primes with exponents but rather just ta=aa.
I then tried to turn off the simplifier, manually simplifying everything else that I need:
simp:false$
a:random(20);
aa:ev(abs(a),simp);
fa:ifactors(aa);
ka:ev(length(fa),simp);
ta:1;
pfza: for i:1 while i<=ka do ta:ta*(fa[i][1])^(fa[i][2]);
ta;
This, however, does not run; I assume the problem is somewhere in the line for pfza, but I don't know why.
Any input on how to fix this? Or another method of getting the factorization in non-simplified form?
(1) The for-loop fails because adding 1 to i requires 1 + 1 to be simplified to 2, but simplification is disabled. Here's a way to make the loop work without requiring arithmetic.
(%i10) for f in fa do ta:ta*(f[1]^f[2]);
(%o10) done
(%i11) ta;
(%o11) ((1*2^2)*2^2)*3^1
Hmm, that's strange, again because of the lack of simplification. How about this:
(%i12) apply ("*", map (lambda ([f], f[1]^f[2]), fa));
(%o12) 2^2*3^1
In general I think it's better to avoid explicit indexing anyway.
(2) But maybe you don't need that at all. factor returns an unsimplified expression of the kind you are trying to construct.
(%i13) simp:true;
(%o13) true
(%i14) factor(12);
(%o14) 2^2*3
I think it's conceptually inconsistent for factor to return an unsimplified expression, but anyway it seems to work here.
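Applying point (2) to the original task, the whole thing may reduce to a couple of lines (the +1 shift is one way to guarantee a positive argument; your STACK setup may differ):
a : 1 + random(20);   /* random integer in 1..20, always positive */
factor(a);            /* e.g. for a = 12 this displays 2^2*3 */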

Postgres vs MongoDB on modeling recursive multi-child relations

I am considering using either of the following stack for a personal project:
Nodes.js/MongoDB (learning)
Rails/Postgres (more familiar)
I would like to give MongoDB a try for learning purposes, but I am unsure whether it is suitable for this problem. I would like to hear about the trade-offs, with examples, based on the following problem description; some specific questions are at the bottom:
There is a list of products, say p1, p2, p3, and each product has fields for some environmental impact metrics, say A, B, C.
p1 -> p3, p4, p5
p2 -> p3, p6
p3 -> p7, p8
p4 -> p2, p9
p5 -> p10, p11
(each arrow points from a product to the components it is made of)
p1.A = p3.A + p4.A + p5.A
p1.B = p3.B + p4.B + p5.B
p3.A = p7.A + p8.A
The Product table would look something like this:
id  A   B   C   parents  children
1   4   5   6   []       [3, 4, 5]
2   10  11  12  [4]      [3, 6]
3   6   7   8   [1, 2]   [7, 8]
4   3   9   6   [1]      [2, 9]
5   3   3   10  [1]      [10, 11]
6   3   1   2   [2]      []
7   4   5   0   [3]      []
...
The update process would look like this:
p1 is made of p2 and p3.
p2 is also made of p3.
If p3's A, B, or C updates, it triggers a recalculation of p1's A, B, and C, possibly still using p2's old values. When p3's update then propagates to p2, p2's update triggers p1 to recalculate again. Depending on the ordering there may be some redundant update operations; I am guessing that is OK.
Since the environmental impact is not critical data, I only need the data to become eventually consistent.
In terms of scale, maybe tens of thousands of products at some point.
Questions:
1) I need a way to prevent infinite update cycles in a cyclic graph.
2) Can MongoDB handle this kind of two-way association easily, where a product has parents that are products and children that are products?
3) How else could I structure my data, instead of parent and child arrays, and design this update process efficiently? If one product's update triggers another update, which triggers another, and the chain goes on, that could make for a very long web request cycle.
Thanks.
Your model is best described as a directed graph G = (V, E), where the vertices V are your products and the edges E ⊆ V × V are the parent-child relations. Therefore neither PostgreSQL nor MongoDB is really a good fit for your use case.
The biggest advantage of MongoDB compared to a traditional RDBMS like PostgreSQL is its dynamic schema: you can add new records with varying structure without redefining the database schema.
But the model you describe looks pretty static to me, so that argument does not apply to your problem.
As far as I am concerned, the best technology choice in your case is a graph database like Neo4j.
For alternative inspiration, you could take a look at graph data structures; one classic way to model a graph is an adjacency matrix.
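Regarding question 1), whichever store you choose, the usual guard against infinite update cycles is a visited set per propagation pass. A toy TypeScript/Node sketch, assuming hypothetical Product records already loaded into a map (only metric A shown):
interface Product {
  id: number;
  A: number;              // one environmental-impact metric
  parents: number[];
  children: number[];
}

function propagateUpdate(startId: number, byId: Map<number, Product>): void {
  const visited = new Set<number>([startId]);   // never revisit a node in this pass
  const queue = [...(byId.get(startId)?.parents ?? [])];
  while (queue.length > 0) {
    const id = queue.shift()!;
    if (visited.has(id)) continue;              // breaks cycles in the graph
    visited.add(id);
    const p = byId.get(id);
    if (!p) continue;
    // recompute A from the children's current (possibly stale) values
    p.A = p.children.reduce((sum, c) => sum + (byId.get(c)?.A ?? 0), 0);
    queue.push(...p.parents);                   // parents depend on this node
  }
}
Each pass recalculates every affected product at most once; a later pass, triggered by the next write, picks up any values that were still stale, which matches your eventual-consistency requirement.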

Generating means of a variable using dummy variables & foreach in Stata

My dataset includes TWO main variables X and Y.
Variable X represents distinct codes (e.g. 001X01, 001X02, etc) for multiple computer items with different brands.
Variable Y represents the tax charged for each code of variable X (e.g. 15 = 15% for 001X01) at a store.
I've created categories for these computer items using dummy variables (e.g. an HD dummy variable for hard drives, which takes the value 1 when variable X represents an HD, etc.). I have a list of over 40 variables (two of them representing X and Y, and the rest a bunch of dummy variables for the different categories of computer items I've created).
I would like to display the averages of all these categories using a loop in Stata, but I'm not sure how to do this.
For example the code:
mean Y if HD == 1
Mean estimation Number of obs = 5
--------------------------------------------------------------
| Mean Std. Err. [95% Conf. Interval]
-------------+------------------------------------------------
Tax | 7.1 2.537716 1.154172 15.24583
gives me the mean Tax for the category representing Hard Drives. How can I use a loop in Stata to automatically display all the mean Taxes charged for each category? I would do it by hand without a problem, but I want to repeat this process for multiple years, so I would like to use a loop for each year in order to come up with this output.
My goal is to create a separate Excel file with each of the computer categories I've created (38 total) and the average tax for each category by year.
Why bother with the loop and creating the indicator variables? If I understand correctly, your initial dataset allows the use of a simple collapse:
clear all
set more off
input ///
code tax str10 categ
1 0.15 "hd"
2 0.25 "pend"
3 0.23 "mouse"
4 0.29 "pend"
5 0.16 "pend"
6 0.50 "hd"
7 0.54 "monitor"
8 0.22 "monitor"
9 0.21 "mouse"
10 0.76 "mouse"
end
list
collapse (mean) tax, by(categ)
list
To take the results to Excel you can try export excel or putexcel.
Run help collapse and help export for details.
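For example, a minimal export of the collapsed means (the filename is made up):
* export the collapsed category means to a spreadsheet
export excel using "mean_tax_by_categ.xlsx", firstrow(variables) replace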
Edit
Because you insist, below is an example that gives the same result using loops. I assume the same data input as before. Some testing with this example database, using expand 1000000, shows that speed is virtually the same. But almost surely, you (including your future you) and your readers will prefer collapse. It is much clearer, cleaner, and more concise. It is even prettier.
levelsof categ, local(parts)
gen mtax = .
quietly {
foreach part of local parts {
summarize tax if categ == "`part'", meanonly
replace mtax = r(mean) if categ == "`part'"
}
}
bysort categ: keep if _n == 1
keep categ mtax
Stata has features that make it quite different from other languages. Once you start getting the hang of it, you will find that many things done with loops elsewhere can be done loop-lessly in Stata. In many cases, the latter style will be preferred.
See the corresponding help files using help <command>, and if you are not familiar with saved results (e.g. r(mean)), type help return.
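As a tiny illustration of saved results, with the tax variable from the example data:
summarize tax, meanonly
display r(mean)    // the mean, pulled from summarize's saved results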
A supplement to Roberto's excellent answer: after collapse, you will need a loop to export the results to Excel.
levelsof categ, local(levels)
foreach x of local levels {
    export excel using "`x'.xlsx" if categ == "`x'", replace
}
I prefer to use numeric codes for variables such as your category variable, and then assign them value labels. Here's a version of Roberto's code which does this and which, for closer correspondence to your problem, adds a "year" variable:
input code tax categ year
1 0.15 1 1999
2 0.25 2 2000
3 0.23 3 2013
4 0.29 1 2010
5 0.16 2 2000
6 0.50 1 2011
7 0.54 4 2000
8 0.22 4 2003
9 0.21 3 2004
10 0.76 3 2005
end
#delim ;
label define catl
1 hd
2 pend
3 mouse
4 monitor
;
#delim cr
label values categ catl
collapse (mean) tax, by(categ year)
levelsof categ, local(levels)
foreach x of local levels {
export excel using "`:label (categ) `x''.xlsx" if categ == `x', replace
}
The #delim ; command makes it possible to list each code on a separate line. The "label" function in the export statement is an extended macro function that inserts a value label into the file name.

Minimal number of D flip-flops required for first seven Fibonacci numbers

I encountered a problem while preparing for a test.
What is the minimal number of D flip-flops required, along with combinational logic, to design a counter circuit that outputs the first seven Fibonacci numbers and then wraps around?
A) 3
B) 4
C) 5
D) 6
E) 7
My answer: B.
Seven Fibonacci numbers => 1, 1, 2, 3, 5, 8, 13.
To count to 13, we need 4 flip-flops, hence 4 was my choice.
But the correct answer given in the solutions was A.
Could someone please explain?
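Presumably the intended reasoning is that the circuit only needs to distinguish seven states, and ceil(log2(7)) = 3, so three flip-flops suffice for the state; the combinational logic then decodes the 3-bit state into the (up to 4-bit) Fibonacci output, so no flip-flop ever has to hold 13 itself. A toy simulation of that structure (TypeScript, purely illustrative):
// 3 flip-flops hold a state 0..6; combinational logic maps state -> output
const fib = [1, 1, 2, 3, 5, 8, 13];     // 7 outputs => 7 states
let state = 0;                           // fits in 3 bits: ceil(log2(7)) = 3
for (let tick = 0; tick < 14; tick++) {
  console.log(`state=${state.toString(2).padStart(3, "0")} output=${fib[state]}`);
  state = (state + 1) % 7;               // wrap around after the 7th value
}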
