I am new to Tableau and need help creating a calculated field.
I have a large dataset in Excel format.
A sample of the data is shown below. There are two model runs for the same units, and the two models produce different values.
Unit  Latitude  Longitude  Model_Name  Value
1     43.12     9.12       Model_1     53
1     43.12     9.12       Model_2     42
2     52.09     10.11      Model_1     105
2     52.09     10.11      Model_2     206
I want to create a calculated field in Tableau that gives me the difference between the values for the same unit under different models (here, Model_1 minus Model_2).
It should look like this:
Unit  Latitude  Longitude  Difference
1     43.12     9.12       11
2     52.09     10.11      -101
Could anyone please help me to create this calculation to plot a graph/visualization?
Thank you.
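To make the target concrete, here is a minimal Python sketch of the per-unit calculation, run outside Tableau on the four sample rows above. The function name model_difference and the tuple layout are assumptions made for illustration; this is not Tableau syntax, only a statement of the arithmetic the calculated field needs to reproduce.

```python
from collections import defaultdict

# Sample rows from the question: (unit, latitude, longitude, model_name, value)
rows = [
    (1, 43.12, 9.12, "Model_1", 53),
    (1, 43.12, 9.12, "Model_2", 42),
    (2, 52.09, 10.11, "Model_1", 105),
    (2, 52.09, 10.11, "Model_2", 206),
]


def model_difference(rows, first="Model_1", second="Model_2"):
    """Return {(unit, lat, lon): value under `first` minus value under `second`}."""
    by_unit = defaultdict(dict)
    for unit, lat, lon, model, value in rows:
        # Collect both model values under the same unit key.
        by_unit[(unit, lat, lon)][model] = value
    return {key: vals[first] - vals[second] for key, vals in by_unit.items()}


differences = model_difference(rows)
for (unit, lat, lon), diff in differences.items():
    print(unit, lat, lon, diff)
```

The key step is putting both model values on the same unit before subtracting, which is what any Tableau solution would also have to do.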
For example, a variable has the values 1 2 3 5 10 11 12 13 14 20 21 ...
I want to replace them with 1 2 3 4 5 6 7 8 9 10 11 ...
The old variable is district, and I want to replace its values with the correct sequential values.
I was using this command, but it is not giving the desired results:
levelsof district, local(district_new)
foreach i in `district_new'{
replace district= mod(_n-1,707)+1
}
I'm not fully sure what you are trying to do, but does this solve it?
sort district
replace district = _n
This will replace the values in district with 1 for the lowest current value, 2 for the second lowest value etc. This might not be a good solution if your variable may have duplicates.
I agree with @TheIceBear but more can be said that won't fit easily into comments.
The particular code posted boils down to a single statement repeated
replace district = mod(_n-1,707) + 1
as that action is repeated regardless of the values of district. In a dataset with 707 or fewer observations, that in turn would be equivalent to
replace district = _n
as @TheIceBear points out. If there were duplicate observations on any district, this would definitely be a bad idea, and something like
egen newid = group(district), label
would be a better idea. For more, see https://www.stata.com/support/faqs/data-management/creating-group-identifiers/
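To illustrate why a group-style identifier copes with duplicates while a running observation number does not, here is a minimal Python sketch of the same idea. It is not Stata code, the helper name group_ids is made up for this example, and the duplicated 10 in the data is an assumption added to show the difference.

```python
def group_ids(values):
    """Map each distinct value to its 1-based rank among the sorted distinct values."""
    rank = {v: i for i, v in enumerate(sorted(set(values)), start=1)}
    return [rank[v] for v in values]


# Values from the question, with one duplicate (10) added for illustration.
district = [1, 2, 3, 5, 10, 10, 11, 12, 13, 14, 20, 21]

new_id = group_ids(district)
running_number = list(range(1, len(district) + 1))  # analogue of replacing with _n

print(new_id)
print(running_number)
```

Both observations with district 10 get the same new identifier, whereas the running number gives them different values.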
Since Tableau does not have a function for p-values (correct me if I'm wrong here), I created a spreadsheet with all possible sample sizes under two different alphas/significance levels, and I need to connect the appropriate p-value to a calculated field from the main data source (an aggregate count of people). I assumed I could easily match numbers with a condition to bring back the p-value in a calculated field, yet I'm hitting a brick wall. The biggest issue seems to be that the field I want to join the p-value reference table to is an aggregated integer. Also, I do not have any extensions, and my end result needs to be an integer, not a graph.
Any secret tricks here?
Seems I cannot blend the reference table in nor join it to an aggregate?
Thanks!
I found a workaround for calculating the critical value for a two-tailed t-test in Tableau. However, I didn't figure out how to join on an aggregated calculated field. Workaround: I used a conditional statement, copying and pasting about 100 critical values keyed on degrees of freedom (sample size - 2) into a calculated field. To save time, use Excel to fill the conditions down to 120. Worked like a charm!
Here is the conditional logic for alpha = .2 (80%) in two tailed t-test (replace the ## line with about 117 rows):
IF [degrees of freedom] = 1 THEN 3.08
ELSEIF [degrees of freedom] = 2 THEN 1.89
ELSEIF [degrees of freedom] = 3 THEN 1.64
##ELSEIF [...calculate down to 120] = ... then ...
ELSEIF [degrees of freedom] > 120 THEN 1.28
END
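The same lookup logic can be sketched in Python as a table plus a large-sample fallback. This is only an illustration of the structure: the names CRITICAL_T_ALPHA_20 and critical_value are assumptions, and the table holds only the three rows quoted above, since the remaining rows are elided in the answer as well.

```python
# Critical values for alpha = .2, two-tailed, keyed by degrees of freedom.
# Only the rows quoted in the answer are included; the rest are left out here too.
CRITICAL_T_ALPHA_20 = {1: 3.08, 2: 1.89, 3: 1.64}
LARGE_DF_VALUE = 1.28  # value used once degrees of freedom exceed the table


def critical_value(df, table=CRITICAL_T_ALPHA_20, max_df=120):
    """Look up the critical value for `df`, falling back to the large-sample value."""
    if df > max_df:
        return LARGE_DF_VALUE
    # Returns None for rows that have not been filled in, much like an
    # IF without an ELSE branch yields NULL.
    return table.get(df)


print(critical_value(2))
print(critical_value(500))
```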
I am using weights when running the data with SPSS Custom Tables.
As expected, the displayed cell values may therefore not add up to the row total, column total, or table total, because the weighted counts are rounded.
sample table result:
                          variable 2
                          category 1   category 2   Total
variable 1   category 1   45           52           97
             category 2   60           56           115
             Total        105          107          211
Is there a way to force SPSS to output the correct row, column, or table totals?
expected table output:
                          variable 2
                          category 1   category 2   Total
variable 1   category 1   45           52           97
             category 2   60           56           116
             Total        105          108          213
If you are using the CROSSTABS procedure to produce these figures, then you should use the ASIS option.
To be clear: the total displayed by CTABLES is mathematically correct. However, if you instead want the total to be the sum of the displayed (rounded) values in each row, the only way to do this is to use the STATS TABLE CALC extension command to recompute the totals from the rounded values.
Here is how to do that.
First, you need to create a Python module named customcalc.py with the following contents
def custom(datacells, ncells, roworcol):
    '''Calculate sum of formatted values'''
    total = sum(float(datacells.GetValueAt(roworcol, i)) for i in range(ncells))
    return total
This file should be saved in the python\lib\site-packages directory under your Statistics installation or anywhere else that Python can find it.
Then, after your CTABLES command, run this syntax
STATS TABLE CALC SUBTYPE="customtable" PROCESS=PRECEDING
/TARGET custommodule="customcalc"
FORMULA="customcalc.custom(datacells, ncells, roworcol)" DIMENSION=COLUMNS LEVEL = -2 LOCATION="Total"
LABEL="Rounded Count".
That custom function adds up the formatted values in each row instead of the full-precision values. If you have suppressed the default statistic name, Count, so that "Total" is the innermost label, use LEVEL=-1 instead of LEVEL=-2 in the syntax above.
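The difference between the two kinds of total can be reproduced with a few lines of plain Python. The weighted counts 59.6 and 55.6 are hypothetical values chosen so that the displayed cells match the second row of the question's table (60 and 56); the real full-precision counts are not given in the thread.

```python
def row_totals(cells):
    """Return (displayed cells, rounded full-precision total, sum of displayed cells)."""
    displayed_cells = [round(c) for c in cells]
    full_precision_total = round(sum(cells))  # the total computed before rounding
    sum_of_displayed = sum(displayed_cells)   # what the custom function computes
    return displayed_cells, full_precision_total, sum_of_displayed


# Hypothetical weighted counts behind the displayed cells 60 and 56.
displayed, full_total, rounded_total = row_totals([59.6, 55.6])
print(displayed, full_total, rounded_total)
```

Here the full-precision total rounds to 115 while the displayed cells add up to 116, which is exactly the mismatch in the question. Note that Python's round() rounds exact halves to the nearest even integer, so values ending in .5 may display differently than in SPSS.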
I have a lot of columns in SPSS and for a calculation, I need to get the sum of each and every one of them. Is there a way to do this in SPSS?
An example of what I mean is shown below:
age   gender   question 1   question 2
--------------------------------------
25    m        2            3
19    f        4            2
20    f        3            4
               ----------   ----------
               need sum     need sum
If you just need an output table with the results, then see the DESCRIPTIVES command.
Alternatively, if you need the results in an output dataset for further processing then see the AGGREGATE command.
Use Analyse > Reports > Summaries in Columns and add your columns.
Posted as a new question
The report works fine when I select one country and view the data within the 12- or 36-month date range.
The problem comes when I 'Select All' countries. What I want is for the totals across all countries to be shown on the graphs.
This is what the output looks like:
country   yyyy-mm   Population   Employed   12months   36months
uk        2016-06   56           43         y          y
france    2016-06   40           22         y          y
Germany   2016-06   73           32         y          y
uk        2015-06   45           10         n          y
france    2015-06   30           11         n          y
Germany   2015-06   76           56         n          y
AND SO ON......
All help appreciated, thank you.
Based on what you've described, I think something like this will work for you.
You are correct that you need two parameters: one for the country and one for the period. For the second parameter, specify two entries in the report designer. Give them the labels '12 months' and '36 months' and the values 12 and 36 respectively. Now change your dataset query as shown in the example below (obviously my table/column names won't be the same as yours):
select country, [yyyy-mm], Total
from #datatable
where country = @country
and ((@period = 12 and [12months] = 'Y') or (@period = 36 and [36months] = 'Y'));
The last line is where the magic happens. By testing the value of the @period parameter in the where clause, we can make parts of the clause conditional.
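The conditional where clause can be mimicked in Python to see how the period parameter switches which flag column is tested. This is a sketch only: the function name filter_rows and the dictionary keys are assumptions, the rows are taken from the sample output in the question, and the flag comparison is made case-insensitive because the sample shows lowercase y while the query tests 'Y'.

```python
# Rows from the question's sample output, reduced to the columns the filter needs.
rows = [
    {"country": "uk", "yyyy_mm": "2016-06", "m12": "y", "m36": "y"},
    {"country": "france", "yyyy_mm": "2016-06", "m12": "y", "m36": "y"},
    {"country": "Germany", "yyyy_mm": "2016-06", "m12": "y", "m36": "y"},
    {"country": "uk", "yyyy_mm": "2015-06", "m12": "n", "m36": "y"},
    {"country": "france", "yyyy_mm": "2015-06", "m12": "n", "m36": "y"},
    {"country": "Germany", "yyyy_mm": "2015-06", "m12": "n", "m36": "y"},
]


def filter_rows(rows, country, period):
    """Keep rows for `country` whose flag for the chosen period (12 or 36) is set."""
    return [
        r for r in rows
        if r["country"] == country
        and (
            (period == 12 and r["m12"].upper() == "Y")
            or (period == 36 and r["m36"].upper() == "Y")
        )
    ]


print(len(filter_rows(rows, "uk", 12)))
print(len(filter_rows(rows, "uk", 36)))
```

Only one of the two bracketed conditions can be true for a given period value, so the other flag column is effectively ignored.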