In Tableau how do I use RANK to calculate an "OTHER" field? - tableau-desktop

I'm new to Tableau so this may be an easy question about computations using RANK. I can't find any tableau HELP or other stack-overflow answer to this. Maybe this is a GROUP question. Maybe it's about OTHER.
I have a data set of 160 countries ( rows ) with a field for jetfuel consumption for each country.
I just want to make a bar chart like the attached image, showing the 20 highest fuel-consumption countries by name, ranked by jetfuel_consumption (I can do that much), AND a 21st row with a computed country name, "Rest of world", that sums the remaining 140 countries together as if they were just another country, like the bottom of this model.
I have a working valid computed field labelled "myrank" = RANK(AVG([Jetfuel Consumption]),'desc')
My thought was to simply calculate a new text field that would equal the country name for rank < 21 and then be the string "Rest of World" otherwise.
Such as:
IF ( [therank] < 11 ) [Country] ELSE "Rest of World" END
But that is not valid for an unspecified reason. I know I'm confused already about how to just specify the value of a field without something like SUM or AVG or AGG wrapping it, but this is a larger question.
What's the right way to make this view?

I've created a simple dataset:
And I want to group the TOP 3 countries by Consumption.
To do that, create a set (click on Country under Dimensions) and select the TOP 3 by SUM(Consumption):
Then create a calculated field that shows the country name for countries IN the set and "Others" for the rest.
In the IF, [Country Set] is a boolean expression meaning "the country is IN the set".
Drag and Drop corresponding fields and configure sort, for example:
Sets are a convenient way to dynamically change, expand and customize any visualization. More details: https://help.tableau.com/current/pro/desktop/en-us/sortgroup_sets_topn.htm
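The same "top N plus everything else" grouping can be sketched outside Tableau. This is only an illustration of the logic in Python, not Tableau syntax; the function name and the sample data are hypothetical and do not come from the original post.

```python
def top_n_with_others(consumption, n, other_label="Others"):
    """Keep the n largest entries and fold the remainder into one bucket."""
    # Sort country/value pairs from highest to lowest value.
    ranked = sorted(consumption.items(), key=lambda kv: kv[1], reverse=True)
    result = list(ranked[:n])
    # Only add the bucket when something is actually left over.
    if len(ranked) > n:
        result.append((other_label, sum(value for _, value in ranked[n:])))
    return result


# Hypothetical sample data.
sample = {"A": 50, "B": 40, "C": 30, "D": 20, "E": 10}
rows = top_n_with_others(sample, 3)
print(rows)
```

The set plays the role of the `ranked[:n]` slice here, and the calculated field plays the role of the relabelling step.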

Related

Tableau incorrect average field value on drill up level

I am very new to Tableau and am working on a report with a two-level drill-down. I have three fields named 'Sales', 'Quantity' and 'Avg-Price'. I need the correct value of 'Avg-Price' (?) when I drill up to the company level. See the picture below.
For the field 'Avg-Price', I use a calculated field with the formula '[Sales]/[Quantity]'.
Any suggestion?
Try
SUM([Sales]) / SUM([Quantity])
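A likely reason this helps: a row-level ratio that is aggregated afterwards is not the same as the ratio of the aggregated totals. The Python sketch below shows the difference with made-up numbers; it assumes the drilled-up price should be total sales divided by total quantity.

```python
# Hypothetical (sales, quantity) rows for one company.
rows = [(100.0, 10), (300.0, 100)]

# Ratio of sums: the analogue of SUM([Sales]) / SUM([Quantity]).
ratio_of_sums = sum(s for s, _ in rows) / sum(q for _, q in rows)

# Mean of row-level ratios: what averaging [Sales]/[Quantity] per row gives.
mean_of_ratios = sum(s / q for s, q in rows) / len(rows)

print(ratio_of_sums, mean_of_ratios)
```

The two values differ whenever the rows have different quantities, which is why the drilled-up figure looked wrong.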

Query + Transpose based on value in Column B if Column A contains certain text

I am currently working with Google Forms and want to rearrange the way the responses are being displayed on the "Response Sheet". The only way I can think of doing this is by importing or moving the data to another sheet that would select and transpose certain columns if Column A contains key value.
This is what I'm seeing as part of the input and would like to see as the output if Column A Contains certain text:
Input & Output
Thank you in advance for your help!
O.K.
1. I repeat the headings, A2:E2.
2. I take the whole of the first five columns without headings, A3:E6.
3. I display the contents of columns A, B, F, G, H for all the rows that have 'A1' in column A.
4. I stack the tables built in points 2 and 3 and sort them by the first column.
My solution is here:
https://docs.google.com/spreadsheets/d/1n7Ppd8v75mb3qrnJz_Jh_b4HNaj4i56X9wRGnz0l6i8/copy
={A2:E2;
sort({A3:E6;
query(A3:H6,"select A,B,F,G,H where A ='A1'",0)})
}
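For readers who prefer to see the steps outside of a spreadsheet, here is a rough Python sketch of the same stack-and-sort idea. The sample rows are invented, and the sort is assumed to be by the first column only.

```python
# Hypothetical response rows with eight columns, A to H.
rows = [
    ["A1", "x1", "c", "d", "e", "f1", "g1", "h1"],
    ["B1", "x2", "c", "d", "e", "f2", "g2", "h2"],
]

# The first five columns (A:E) of every row.
base = [r[:5] for r in rows]

# Columns A, B, F, G, H for rows whose column A equals 'A1'.
extra = [[r[0], r[1], r[5], r[6], r[7]] for r in rows if r[0] == "A1"]

# Stack both tables and sort by the first column (the sort is stable).
stacked = sorted(base + extra, key=lambda r: r[0])
for line in stacked:
    print(line)
```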

How to automatically get hyperlinks based on a condition?

OS: Windows 10
Here's the basic idea:
I have a list of foods that are on the left side, and when I decide which foods I want, I then decide what order I want them in. After they are in the right order, G7-J7 (Linked Foods) should automatically place a hyperlink based on what food I selected in the Order of Food. How it pulls these links should be from the List of Foods. Then, at the bottom, there are 4 images that will automatically be shown based on what link is in the Linked Food boxes.
Basically, what I did, was I made the Linked Food 1 formula =IF(G3 = "Ramen", D2, "No Link Found").
And then for the Food 1 image, I did =iferror(arrayformula(image(G7)),"")
The image should automatically be there for whatever link I put in the G7 box now, but the G7 box is the main issue I'm having.
Everything works, but this is only an example. In my REAL project I'm doing, I have tons and tons of "foods" and I can't just put =IF(G3 = "Ramen", "Lemonade", "Tofu", "Fruit Punch", ...and the hundreds of others.
SO...What I'm wondering, is if there's an easier way to make these links automatically change, without having to manually put every single item from the List of Foods into the formula.
Any help is appreciated!
It is somewhat unclear whether the question is about Google Sheets or Excel, but the solution is basically the same either way.
What you want here is called "VLOOKUP". You will be replacing this line:
=IF(G3 = "Ramen", D2, "No Link Found")
with
=VLOOKUP(G3, C2:D8, 2, FALSE)
This takes the value of cell G3 (Ramen in the example) and searches the first column of the range C2:D8 (the food names on the left of your table) until it finds a match. It then returns the value from the second column of the range (C = 1, D = 2) in the same row. The FALSE at the end requests an exact match, which is what you want here; it also means the list does not need to be sorted.
Note that your range needs to cover your entire foods list, so change D8 to DX, where X is the last row of your list.
You can read the full syntax for VLOOKUP at: https://support.google.com/docs/answer/3093318?hl=en
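As an illustration of what an exact-match lookup does, here is a small Python sketch. The table contents and URLs are placeholders, and the "No Link Found" fallback mirrors the original IF formula; it is not something VLOOKUP does by itself.

```python
def lookup_link(food, table, default="No Link Found"):
    """Return the link of the first row whose name equals food exactly."""
    for name, link in table:
        if name == food:
            return link
    return default


# Hypothetical two-column list of foods (name, link).
foods = [
    ("Ramen", "https://example.com/ramen.png"),
    ("Tofu", "https://example.com/tofu.png"),
]
print(lookup_link("Ramen", foods))
print(lookup_link("Pizza", foods))
```

The list can grow to hundreds of rows without the lookup itself changing, which is the point of replacing the hard-coded IF.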

Reference Specific Row in Named Range within another Named Range

I'm writing a spreadsheet to keep track of a small business' financials. They operate a few Rooms for rent, and the structure of the document is made so that each sheet holds a year's worth of booking for all the rooms.
Essentially, each row defines a specific date, while each room spans a few columns (the reason is that they don't just want to track whether or not a room is booked, but also record names of clients & other remarks), among which is the daily calculated income (some factors alter the daily rate each room will generate).
So this is all fine and dandy, and I've created named ranges for each month of the year, and for each room.
For example, rows 6:36 will represent the month of January, while columns C:I will represent Room 1. Room 2 will span J:P and so forth.
Now, in another sheet, I wanted to make a dashboard which lists the earning for each room, per month. It's a very simple table with 12 rows (one for each month) and 10 columns (1 for each room) where I planned to sum up all the earnings.
So my issue is that I can't find a way to retrieve a specific column of a named range for a room ('vertical named range'), which is also limited in a named range for a month ('horizontal named range'). I had read about using ARRAYFORMULA(INDEX(named_range, ,wished_column)) but that only works for a single named range. My knowledge of these two functions being non-existent, I didn't manage to extend it to a 2-named-range version...
(I mean I did try something along the lines of ARRAYFORMULA(INDEX(January, , INDEX(Room1, , 3))) but that didn't work)
So because there isn't a one-to-one relation from the Dashboard cells to the Rooms cells, my current only solution is to manually reference everything, which you'll understand is inefficient and time-consuming...
My question, in fine, is: How can I retrieve the range that results from the intersection of 2 (or more) named ranges? Once I have that resulting range, I know it will be very easy to use INDEX().
Define a named range Base as
A:Z
Define a range named Horizontal as
6:36
Define a range named Vertical as
C:I
Then the intersection of the vertical and horizontal ranges is given by:
index(Base,row(Horizontal),COLUMN(Vertical)):index(Base,row(Horizontal)+rows(Horizontal)-1,COLUMN(Vertical)+columns(Vertical)-1)
This can be verified by using it in a function e.g.
=countblank(index(Base,row(Horizontal),COLUMN(Vertical)):index(Base,row(Horizontal)+rows(Horizontal)-1,COLUMN(Vertical)+columns(Vertical)-1))
gives the result 7 * 31 = 217 in my sheet because I haven't filled in any of the cells.
The Offset version of this would be:
=countblank(offset(A1,row(Horizontal)-1,COLUMN(Vertical)-1):offset(A1,row(Horizontal)+rows(Horizontal)-2,COLUMN(Vertical)+columns(Vertical)-2))
or more simply:
=countblank(offset(A1,row(Horizontal)-1,COLUMN(Vertical)-1,rows(Horizontal),COLUMNS(Vertical)))
So this works well in OP's case where you have two fully overlapping ranges like this:
Partial Overlap
Suppose you have two partially overlapping ranges like this:
You can use a variation on the standard overlap formula (This is one of the early references to it as used with a date range)
max(start1,start2) to min(end1,end2)
So the previous formula becomes
=countblank(index(Base,max(row(index(Partial1,1,1)),row(index(Partial2,1,1))),max(COLUMN(index(Partial1,1,1)),column(index(Partial2,1,1)))):
index(Base,min(row(index(Partial1,1,1))+rows(Partial1)-1,row(index(Partial2,1,1))+rows(Partial2)-1),min(COLUMN(index(Partial1,1,1))+columns(Partial1)-1,column(index(Partial2,1,1))+columns(Partial2)-1)))
and the offset version is
=countblank(offset(A1,max(row(offset(Partial1,0,0)),row(offset(Partial2,0,0)))-1,max(COLUMN(offset(Partial1,0,0)),column(offset(Partial2,0,0)))-1):
offset(A1,min(row(offset(Partial1,0,0))+rows(Partial1)-2,row(offset(Partial2,0,0))+rows(Partial2)-2),min(COLUMN(offset(Partial1,0,0))+columns(Partial1)-2,column(offset(Partial2,0,0))+columns(Partial2)-2)))
I have tested this on ranges C2:F10 and D3:G11 which gives the result 24 as expected.
However, if there is no overlap, this can still give a non-zero result, so a suitable test needs adding to the formula:
=if(and(max(row(index(Partial1,1,1)),row(index(Partial2,1,1)))<=min(row(index(Partial1,1,1))+rows(Partial1)-1,row(index(Partial2,1,1))+rows(Partial2)-1),
max(column(index(Partial1,1,1)),column(index(Partial2,1,1)))<=min(column(index(Partial1,1,1))+columns(Partial1)-1,column(index(Partial2,1,1))+columns(Partial2)-1)),"Overlap","No overlap")
Perhaps the best approach in Google Sheets is to go back to the full version of the Offset call OFFSET(cell_reference, offset_rows, offset_columns, [height], [width]) . Although this is rather long, it will return a #Value! error if there is no overlap:
=Countblank(offset(A1,
max(row(offset(Partial1,0,0)),row(offset(Partial2,0,0)))-1,
max(COLUMN(offset(Partial1,0,0)),column(offset(Partial2,0,0)))-1,
min(row(offset(Partial1,0,0))+rows(Partial1),row(offset(Partial2,0,0))+rows(Partial2))-max(row(offset(Partial1,0,0)),row(offset(Partial2,0,0))),
min(COLUMN(offset(Partial1,0,0))+columns(Partial1),column(offset(Partial2,0,0))+columns(Partial2))-max(COLUMN(offset(Partial1,0,0)),column(offset(Partial2,0,0)))
))
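The max(start) to min(end) rule behind these formulas can be checked with a short Python sketch. The Rect type and helper names are hypothetical; rows and columns are 1-based and inclusive, and the Room 1 range is given an arbitrary height of 1000 rows for the example.

```python
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    top: int     # first row
    left: int    # first column (A = 1)
    bottom: int  # last row
    right: int   # last column


def intersect(a: Rect, b: Rect) -> Optional[Rect]:
    """Overlap of two ranges: max of the starts to min of the ends."""
    top, left = max(a.top, b.top), max(a.left, b.left)
    bottom, right = min(a.bottom, b.bottom), min(a.right, b.right)
    if top > bottom or left > right:
        return None  # the ranges do not overlap
    return Rect(top, left, bottom, right)


def cell_count(r: Rect) -> int:
    return (r.bottom - r.top + 1) * (r.right - r.left + 1)


partial = intersect(Rect(2, 3, 10, 6), Rect(3, 4, 11, 7))    # C2:F10 and D3:G11
january_room1 = intersect(Rect(6, 1, 36, 26), Rect(1, 3, 1000, 9))  # 6:36 and C:I
disjoint = intersect(Rect(1, 1, 2, 2), Rect(4, 4, 5, 5))     # A1:B2 and D4:E5
print(cell_count(partial), cell_count(january_room1), disjoint)
```

The explicit None branch corresponds to the "No overlap" test in the formulas above.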
Notes
Why did I have to introduce some more indexes (indices?) in the second formula to make it work? Because if you use the row function with a range in an array context, you get an array of row numbers which isn't what I want. As it happens, in the first formula you are not using it in an array context, so you just get the first row and column of the given range which is fine. In the second formula, Max and Min try to evaluate all the rows in the array, which gives the wrong answer, so I have used Index(range,1,1) to force it to look only at the top left hand corner of each range. The other thing is that both index and offset return a reference, so it is valid to use the construct Index(...):Index(...) or Offset(...):Offset(...) to define a new range.
I have also tested the above in Excel (where as mentioned the Index version would be preferable). In this case Base would be set to $1:$1048576.
In Excel, though, you have the intersection operator (a single space), so it's not necessary to use an Index or Offset formula at all; e.g. the first example above would simply be:
=COUNTBLANK(Vertical Horizontal)
and if there is no overlap the formula returns a #NULL! error.
"I've created named ranges for each month of the year, and for each room. For example, rows 6:36 will represent the month of January, while columns C:I will represent Room 1. Room 2 will span J:P and so forth."
What I suggest is that if "January" is defined for columns C to whatever (the last column of the last room), then that's all you need.
You haven't shown us the layout of the dashboard. But let's assume that at the very least you're interested in the income generated by each room.
=query({January},"select sum(Col3) label sum(Col3)'' ")
In this image, the range called "January" is highlighted. Note that it does NOT include the header. Note also that it can be many columns wide; in this example, I've just made up a few columns, but your range should cover all the columns for rooms 1 to n.
Syntax: QUERY(data, query, [headers])
Data: This formula queries the range called "January". That range can be on the same sheet, or on another sheet (such as your Dashboard). Reminder: in this screenshot, my version of "January" is highlighted.
Query to sum the income earned: "select sum(Col3) label sum(Col3)'' "
Query to count Number of People: "select count(Col2) label count(Col2)'' "
Col2 & Col4 = Number of People for Room#1 and Room#2 respectively.
Col3 & Col5 = Income for Room#1 and Room#2 respectively.
[headers]: You can ignore them.
This formula delivers just the value of the query; even though it includes a "label", the label will not print.
Modify and adapt these formulae to create the other information required for your Dashboard.
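To make the column roles concrete, here is a rough Python analogue of the two queries. The rows are invented, None stands for a blank cell, and the assumption is that count skips blank cells.

```python
# Hypothetical "January" rows: date, people room 1, income room 1,
# people room 2, income room 2 (Col1 to Col5).
january = [
    ["2021-01-01", 2, 80.0, None, None],
    ["2021-01-02", None, None, 1, 50.0],
    ["2021-01-03", 3, 120.0, 2, 90.0],
]

# sum(Col3): total income for room 1, ignoring blank cells.
income_room1 = sum(row[2] for row in january if row[2] is not None)

# count(Col2): number of non-blank entries in room 1's people column.
entries_room1 = sum(1 for row in january if row[1] is not None)

print(income_room1, entries_room1)
```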

How to assign a unique ID to a google form input?

Google Forms - I have set up a Google Form and I want to assign a unique ID to each of the completed incoming form inputs. My intention is to use the unique ID as an input for another Google Form I have created, which I will use to link the two completed forms. Is there another, easier way to do this?
I'm not a programmer but I have programming resources available to me if needed.
I was also banging my head at this and finally found a solution.
I compose a 6-digit number that gets generated automatically for every row and is composed of:
3 digits of the row number - that gives the uniqueness (you can use more if you expect more than 998 responses), concatenated with
3 digits of the timestamp converted to a number - that prevents guessing the number
Follow these instructions:
Create an additional column in the spreadsheet linked to your form, let's call it: "unique ID"
Row number 1 should be populated with column titles automatically
In row number 2, under column "Unique ID", add the following formula:
=arrayformula( if( len(A2:A), "" & text(row(A2:A) - row(A2) + 2, "000") & RIGHT(VALUE(A2:A), 3), iferror(1/0) ) )
Note: An array formula applies automatically to the entire column.
Make sure you never delete that row, even if you clear up all the results from the form
Once a new submission is populated, its "Unique ID" will appear automatically
Formula explanation:
Column A should normally hold the timestamp. If the timestamp is not empty, then this gives the row number: row(A2:A) - row(A2) + 2
Using TEXT with the format "000", I zero-pad it to three digits.
Then I concatenate it with the timestamp converted to a number using VALUE and trim it to the three right-most digits using RIGHT
Voila! A number that is both unique and hard-to-guess (as the submitter has no access to the timestamp).
If you would like more confidence, obviously you could use more digits for each of the parts.
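The composition of the ID can be sketched in Python as follows. The function name is hypothetical, and the timestamp is passed as text on the assumption that it matches the characters the spreadsheet sees when it converts the timestamp to a number; the sample value is made up.

```python
def unique_id(row_number: int, timestamp_text: str) -> str:
    """Zero-padded row number followed by the last three timestamp characters."""
    row_part = f"{row_number:03d}"   # like TEXT(row, "000")
    ts_part = timestamp_text[-3:]    # like RIGHT(VALUE(timestamp), 3)
    return row_part + ts_part


# Hypothetical numeric form of a timestamp in sheet row 2.
first_id = unique_id(2, "44197.5208333333")
print(first_id)
```

The row part guarantees uniqueness and the timestamp part only makes the ID harder to guess.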
You can apply unique ID numbers using an arrayformula next to the form data. In row 1 of the first rightmost empty column you can use something like
=arrayformula(if(row(A1:A)=1,"UNIQUE ID",if(len(A1:A)>0,98+row(A1:A),iferror(1/0))))
A few comments regarding the explanation provided by @Ying, which I will try to expand on, as it is very good.
> Column A should normally hold the timestamp.
In my case, it is date+time stamp.
> 4. Make sure you never delete that row, even if you clear up all the results from the form
That issue can easily be avoided by placing the formula in the header like this
={"calculated_id";arrayformula( if( len(C2:C); "" & text(row(C2:C) - row(C2) + 2; "000") & RIGHT(VALUE(C2:C); 3); iferror(1/0) ) )}
This formula provides a string for one cell and a formula for the next one, which happens to be an array formula that covers all the cells below.
Note: Depending on your language settings you may need to use ";" or "," as separator among parameters.
> 5. Once a new submission is populated, its "Unique ID" will appear automatically
Issue
And here is the issue I see with this solution.
If the Google Form allows responders to edit their responses, the date+time stamp will change, and so will the calculated_id.
A workaround is to have 2 columns: one is calculated_id and the other is static_id.
static_id takes whatever is in calculated_id only if it has no data itself; otherwise it stays as it is.
That way we have an ID that will not change no matter how many updates the response experiences.
The short formula for static_id is
=IF(AND(IFERROR(K2)<>0;K2<>"");K2;L2)
The long one is
={"static_id";ArrayFormula(IF(AND(IFERROR(M2:M)<>0;M2:M<>"");M2:M;L2:L))
}
M or K -> static_id
L -> calculated_id
Remember to put this last one on the header of the column. I tend to change the color to purple when it has a formula behind, so I don't mess with it by mistake.
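The "keep the first value, ignore later changes" rule can be illustrated with a short Python sketch. It only models the intended behaviour, treating empty strings, 0 and None as "no data"; it is not a translation of the spreadsheet formula, and the names and values are hypothetical.

```python
def update_static_ids(static_ids, calculated_ids):
    """Keep an existing static ID; otherwise adopt the calculated one."""
    return [
        static if static not in ("", 0, None) else calculated
        for static, calculated in zip(static_ids, calculated_ids)
    ]


# First pass: nothing stored yet, so the calculated IDs are adopted.
stored = update_static_ids(["", ""], ["002333", "003871"])
# Later an edited response changes the first calculated ID.
stored_after_edit = update_static_ids(stored, ["002905", "003871"])
print(stored, stored_after_edit)
```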
Extra info.
The numeric value of the stamp differs depending on whether it comes from both a date and a time or from just one of them. Here are some examples.
Note that the number of digits on the fractional part differ quite a lot depending on the case.

Resources