reduce_max function in tensorflow - machine-learning

>>> boxes = tf.random_normal([5])
>>> with s.as_default():
...     s.run(boxes)
...     s.run(keras.backend.argmax(boxes, axis=0))
...     s.run(tf.reduce_max(boxes, axis=0))
...
array([ 0.37312034, -0.97431135,  0.44504794,  0.35789603,  1.2461706 ],
      dtype=float32)
3
0.856236
Why am I getting 0.856236? I expect the value to be 1.2461706, since that is the largest element, right?
I get the correct answer if I use tf.constant, but not while using random_normal.

A new boxes is generated every time you call s.run() with random_normal, so your three results come from three different tensors. If you want consistent results, you should call s.run() only once and fetch all three values together:
result = s.run([boxes, keras.backend.argmax(boxes, axis=0), tf.reduce_max(boxes, axis=0)])
print(result[0])
print(result[1])
print(result[2])
# prints
[ 0.69957364  1.3192859  -0.6662426  -0.5895929   0.22300807]
1
1.3192859
In addition, please post code as text rather than as a screenshot.

TensorFlow (in graph mode) is different from NumPy because TF uses symbolic operations. When you instantiate random_normal, you don't get numeric values but a symbolic tensor representing a normal distribution, so each time you evaluate it you get a fresh draw.
Every operation that consumes this tensor triggers a new evaluation with new numbers, and that explains the results you see.
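A minimal sketch of both fixes, assuming TF 1.x graph-mode semantics (tf.random_normal and tf.Session were removed in TF 2.x):

import tensorflow as tf  # assumes TF 1.x graph mode

boxes = tf.random_normal([5])

# Fix 1: fetch everything in a single run() call, so all three results
# come from the same evaluation of the random tensor.
with tf.Session() as s:
    vals, idx, mx = s.run([boxes,
                           tf.argmax(boxes, axis=0),
                           tf.reduce_max(boxes, axis=0)])
    print(vals, idx, mx)  # mx == vals[idx] is guaranteed here

# Fix 2: store one draw in a Variable; later run() calls reuse that draw.
frozen = tf.Variable(tf.random_normal([5]))
with tf.Session() as s:
    s.run(tf.global_variables_initializer())  # samples exactly once
    print(s.run(frozen))                      # same values on every run()
    print(s.run(tf.reduce_max(frozen)))       # consistent with the line above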

Related

Splitting a string variable delimited list into individual binary variables in SPSS

I have a string variable created from a checkbox question ("Which of the following assets do you own?").
I am trying to create individual binary variables for each type of asset, based on whether that asset's number is present in the string list.
The syntax I am using cannot differentiate between 1 and 11.
do repeat wrd="1," ",2," ",3," ",4," ",5," ",6," ",7,"/NewVar= W3_CG_asset_TV_1 W3_CG_asset_radio_2 W3_CG_asset_payTV_3 W3_CG_asset_tel_4 W3_CG_asset_cellphone_5
W3_CG_asset_fridge_6 W3_CG_asset_freezer_7.
compute NewVar=char.index(W3_CG_HouseExpen1, wrd)>0.
end repeat.
do repeat wrd= ",8," ",9," ",10," ",11," ",12," ",13," ",14," ",15," ",16," ",17," ",18," ",19," /NewVar= W3_CG_asset_electricstove_8 W3_CG_asset_primusstove_9
W3_CG_asset_gasstove_10 W3_CG_asset_electrickettle_11 W3_CG_asset_microwave_12 W3_CG_asset_computer_13 W3_CG_asset_electricity_14 W3_CG_asset_geyser_15
W3_CG_asset_washingmachine_16 W3_CG_asset_workingvehicle_17 W3_CG_asset_bicycle_18 W3_CG_asset_donkeyhorse_19.
compute NewVar=char.index(W3_CG_HouseExpen1, wrd)>0.
end repeat.
I have tested this on SPSS 28.
Make sure the column W3_CG_HouseExpen1 is a string and that its length is long enough to hold the data.
I also added an EXECUTE at the end:
data list list/W3_CG_HouseExpen1 (a50).
begin data
"1,2,11,12,"
"2,12,"
"1,2,"
"1,11,12,"
end data.
do repeat
wrd="1," ",2," ",3," ",4," ",5," ",6," ",7," ",8," ",9," ",10," ",11," ",12," ",13," ",14," ",15," ",16," ",17," ",18," ",19,"
/NewVar = W3_CG_asset_TV_1 W3_CG_asset_radio_2 W3_CG_asset_payTV_3 W3_CG_asset_tel_4 W3_CG_asset_cellphone_5 W3_CG_asset_fridge_6 W3_CG_asset_freezer_7
W3_CG_asset_electricstove_8 W3_CG_asset_primusstove_9 W3_CG_asset_gasstove_10 W3_CG_asset_electrickettle_11 W3_CG_asset_microwave_12 W3_CG_asset_computer_13 W3_CG_asset_electricity_14 W3_CG_asset_geyser_15
W3_CG_asset_washingmachine_16 W3_CG_asset_workingvehicle_17 W3_CG_asset_bicycle_18 W3_CG_asset_donkeyhorse_19.
compute NewVar=char.index(W3_CG_HouseExpen1, wrd)>0.
end repeat.
EXECUTE.
My suggestion is to run through this in reverse, erasing the values you've already recognized. So once you've found "11" and erased it, a later search for "1" won't match inside an "11".
I created a tiny example dataset to demonstrate on (EDIT: improved example):
data list list/W3_CG_HouseExpen1 (a50) .
begin data
"1,2,11,12,"
"11,12,"
"2,11,"
end data.
Now I do the whole process on a copy of the original W3_CG_HouseExpen1 variable, so I can eat it away without damaging the original data:
string #temp(a50).
compute #temp=W3_CG_HouseExpen1.
do repeat wrd="12," "11," "2," "1," /NewVar= W3_12 W3_11 W3_2 W3_1.
compute NewVar=char.index(#temp, wrd)>0.
compute #temp=replace(#temp, wrd, ""). /*deleting the search string from the full string.
end repeat.
exe.
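For readers outside SPSS, here is the same reverse-erase idea as a minimal Python sketch (illustration only; the function name flags is mine):

def flags(full, tokens):
    """Test longest tokens first and delete each match, so that e.g.
    '1,' can no longer be found inside an already-consumed '11,'."""
    found = {}
    for tok in sorted(tokens, key=len, reverse=True):
        found[tok] = tok in full
        full = full.replace(tok, "")  # erase the recognized value
    return found

print(flags("1,2,11,12,", ["1,", "2,", "11,", "12,"]))
# {'11,': True, '12,': True, '1,': True, '2,': True}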

Error when trying to get tidy summaries with broomstick for multiple randomForest models

I'm grouping my data frame and fitting each group's data with a random forest model, and then using broomstick to get tidy outputs for each group's model. I'm running into trouble when I get to tidy and augment.
I can group the data and fit the models.
library(tidyverse)
library(broomstick)
library(randomForest)
data <- data.frame(y = rep(rep(c(1, 0), each = 100), 5),
                   group = rep(c("A", "B", "C", "D", "E"), each = 200),
                   x1 = rnorm(2000),
                   x2 = rnorm(2000),
                   x3 = rnorm(2000),
                   x4 = rnorm(2000),
                   x5 = rnorm(2000))

GroupModels <- data %>%
  nest(data = -group) %>%
  mutate(fit = map(data, ~ randomForest(y ~ ., ntree = 101, mtry = 2, data = .x, importance = TRUE)))
I then map glance to the fitted models and that works. I get mse and rsq for each group.
GroupModels%>%
mutate(glanced = map(fit, glance))%>%
unnest(glanced)%>%
select(-data, -fit)%>%
as.data.frame()
If I map tidy to the fitted models I get an output plus a deprecation warning, and I don't understand where tibble::as_tibble() should come into play.
GroupModels%>%
mutate(tidied = map(fit, tidy))%>%
unnest(tidied)%>%
select(-data, -fit)%>%
as.data.frame()
1: Problem with `mutate()` column `tidied`.
ℹ `tidied = map(fit, tidy)`.
ℹ This function is deprecated as of broom 0.7.0 and will be removed from a future release. Please see tibble::as_tibble().
If I map augment to the models I get an error and I'm not sure what to do with that.
GroupModels%>%
mutate(augmented = map(fit, augment))%>%
unnest(augmented)%>%
select(-data, -fit)%>%
as.data.frame()
Error: Problem with `mutate()` column `augmented`.
ℹ `augmented = map(fit, augment)`.
✖ argument must be coercible to non-negative integer
I can now get augment to work using map2(). I didn't know about it, but it's handy when you need to pass both the fit and the data to a function; augment() presumably needs the original data because the randomForest fit doesn't store it. I guess I'll worry about the deprecation warning when it happens.
GroupModels%>%
mutate(augmented = map2(fit, data, augment))%>%
unnest(augmented)%>%
select(-data, -fit)%>%
as.data.frame()

Can't interpret SPSS Error Message in MATRIX code

A program to run a Schmid-Leiman transformation using SPSS's MATRIX language was published in 2005 by Woolf & Preising in Behavior Research Methods, volume 37, pages 48-58. It is probably not important for you to know what a Schmid-Leiman transformation is, but I'll explain in the comments if necessary.
In modifying the program for my own data, I'm getting an error I can't figure out:
Error # 12302 in column 12. Text: ,
Syntax error.
Execution of this command stops.
Error in RIGHT HAND SIDE of COMPUTE command.
The MATRIX statement skipped.
Here is the beginning of the code; the error is reported as coming from line 6:
* Encoding: UTF-8.
* Schmid-Leiman Solution for 2 level higher-order Factor analysis.
Matrix.
* ENTER YOUR SPECIFICATIONS HERE.
* Enter first-order pattern matrix.
Compute F1={.461, .253, -.058, -.069;
.241, .600, .143, .033;
.582, .047, -.077, -.125;
.327, .297, -.120, -.166;
.176, .448, -.240, -.099;
.680, .069, -.036, -.138;
.415, .228, -.091, -.153;
.
.
.
.390, .205, .002, -.098;
.164, .369, -.170, -.047
}.
As shown above, the text generating the error is reported as a comma (,), but the actual text in column 12 (following the COMPUTE statement) is an opening brace ({). So I have no idea what is going on. Can someone help?
For reference, the original code as proposed by Woolf & Preising (2005) is found here; the Woolf & Preising article itself is found here.
PS: The sample program given in the link above does run on my copy of SPSS. Here's the beginning of that code:
* Schmid-Leiman Solution for 2 level higher-order Factor analysis.
Matrix.
* ENTER YOUR SPECIFICATIONS HERE.
* Enter first-order pattern matrix.
Compute F1={0.099, 0.5647, -0.1521;
0.0124, 0.9419, -0.1535;
-0.1501, 0.6177, 0.4218;
0.7441, -0.0882, 0.1425;
0.6241, 0.2793, -0.1137;
0.8693, -0.0331, 0.0289;
-0.0154, -0.2706, 0.6262;
-0.0914, 0.0995, 0.7216;
0.1502, 0.0835, 0.398}.

dask equivalent of df.loc[df.index.intersection(mylabels)]

When I run df.loc[mylabels] in dask I get a warning linking to the pandas documentation:
Warning: Starting in 0.21.0, using .loc or [] with a list with one or more missing labels is deprecated, in favor of .reindex.
This page also says:
Alternatively, if you want to select only valid keys, the following is idiomatic and efficient; it is guaranteed to preserve the dtype of the selection.
In [106]: labels = [1, 2, 3]
In [107]: s.loc[s.index.intersection(labels)]
Out[107]:
1 2
2 3
dtype: int64
Dask indexes do not have an intersection method.
So what is the recommended way to achieve the above effect in dask?
The problem with df.loc[mylabels] is that mylabels contains items not in df.index.
For now it looks like you should continue calling df.loc[labels].
Things have changed upstream and dask.dataframe probably needs to follow; I recommend submitting a bug report at https://github.com/dask/dask/issues/new
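In the meantime, one possible workaround (my own sketch, not an official dask API; it assumes mylabels fits comfortably in memory) is to apply the pandas intersection idiom per partition with map_partitions:

import pandas as pd
import dask.dataframe as dd

pdf = pd.DataFrame({"x": range(6)}, index=[1, 2, 3, 4, 5, 6])
df = dd.from_pandas(pdf, npartitions=2)
mylabels = [2, 3, 99]  # 99 is not in df.index

# Each partition is a plain pandas DataFrame, so the idiomatic
# pandas trick works partition by partition without a warning.
result = df.map_partitions(lambda part: part.loc[part.index.intersection(mylabels)])
print(result.compute())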

Syntax for counting cases

I work with SPSS and am having difficulty writing syntax for counting cases.
I have about 120 cases and five dichotomous variables. I need to know the count/proportion of cases in which just one, more than one, or all five of the variables have a value of 1. Then I need to compute a new dichotomous variable that flags the cases just described.
For example case number one: var1=1, var2=1, var3=1, var4=0, var5=0 --> newvariable=1.
Case number two: var1=0, var2=0, var3=0, var4=0, var5=0 --> newvariable=1.
And so on...
Can anybody help me with a syntax?
Help would much appreciated!
Here we can use the sum of the variables to determine your conditions: with a scratch variable holding the sum, we can test whether it equals 1, is greater than 1, or equals 5 (all five in your example).
compute #sum = SUM(var1 to var5).
compute just_one = (#sum = 1).
compute more_one = (#sum > 1).
compute all_one = (#sum = 5).
Similarly, all_one could be computed by negating an ANY test for zeroes, i.e. compute all_one = (ANY(0, var1 to var5) = 0). These snippets assume var1 to var5 are contiguous in the dataset; if not, replace var1 to var5 with var1, var2, var3, var4, var5 wherever it appears.
You could read up on the logical function ANY in the Command Syntax Reference manual; negating a test for ANY with 0 is effectively a test for all 1s. The COUNT command would be another approach.
