How to create two histograms on one plot with shared axis?

I tried plotting the arrival delay and departure delay columns separately, and it's clear the distributions are different.
I would like to show them on the same plot, but whenever I try, both plots show one identically shaped distribution even though I'm plotting two different columns! What am I doing wrong?
Thank you for your help in advance.

You need a Departure Delay (bin) field. You can create one by selecting Departure Delay in the data pane on the left sidebar and selecting Create bin.
Once you have that new field, you can place it on the Columns shelf next to the other bin field and put SUM([Number of Records]) on the Rows shelf, removing both CNT() fields.
That should let you see both histograms.
To answer your question about why your previous approach yielded the same chart: you were binning the data by Arrival Delay in both cases.
The CNT([xxx]) fields are misleading. CNT([xxx]) just counts the number of records that have a non-null value in the [xxx] field. If [xxx] always has a value, it's equivalent to SUM([Number of Records]). The bin field is what matters.
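As a rough illustration of why the bin field matters more than the counted field, here is a minimal Python sketch. The sample records and the bin size of 10 are made up for the example and are not taken from the asker's data.

```python
from collections import Counter

def histogram(values, bin_size):
    """Count records per bin; each bin is labelled by its lower edge."""
    return Counter((v // bin_size) * bin_size for v in values if v is not None)

# Hypothetical sample data: (arrival_delay, departure_delay) per flight record.
records = [(5, 0), (12, 3), (18, 7), (25, 9), (31, 14), (44, 2)]

# Correct approach: each column gets its own bin field.
arrival_hist = histogram([a for a, _ in records], bin_size=10)
departure_hist = histogram([d for _, d in records], bin_size=10)

# Mistaken approach: count non-null departure delays but still group by the
# ARRIVAL bin. Because every record has a departure delay, this is just a
# record count per arrival bin, i.e. the arrival histogram again.
mistaken_hist = Counter((a // 10) * 10 for a, d in records if d is not None)
```

Here `mistaken_hist` equals `arrival_hist`, while `departure_hist` has a different shape, which mirrors the behaviour described in the question.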

How do I create a list of non-repeating cells/numbers in Google Sheets?

I’m trying to emulate Minesweeper in Google Sheets, and for this I want to create a second map adjacent to the first with all of the correct values already in it. To randomize bomb positions, I need a list of random numbers or cells (cells would be preferable). However, I cannot figure out how to do this without ending up with repeated numbers. The result would ideally be a vertical array of cell coordinates. Thank you!
Answer
The following formula should produce the result you desire:
=SORTN(FLATTEN(MAKEARRAY(10,10,LAMBDA(row,col,ADDRESS(row,col)))),20,,RANDARRAY(100),)
In =MAKEARRAY, change the first 10 to adjust how many rows to randomly choose from, or the second 10 to adjust how many columns to choose from. The value in =RANDARRAY must be equal to the product of the number of rows and the number of columns. (e.g. in the above example, 10*10=100).
Change the 20 to adjust how many randomly chosen values to return.
Explanation
=MAKEARRAY is used to generate an array of every possible row and column combination. It accepts a =LAMBDA, which in this case is just the =ADDRESS function. The first two arguments of =MAKEARRAY determine how large the array should be, which is why changing them adjusts how many rows/columns to randomly pick from.
Then, the result of =MAKEARRAY is squashed into a single column using the =FLATTEN formula.
Finally, the entire thing is sorted randomly using =SORTN combined with =RANDARRAY. =SORTN also limits the number of results returned according to its second argument, which is why changing it adjusts how many results are returned.
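For readers who think more easily in code, the same idea (build every address, flatten, shuffle, keep the first few) can be sketched in Python. This is only an analogy to the formula, not how Sheets evaluates it; the helper names `column_letters` and `random_cells` are made up for this example.

```python
import random

def column_letters(col):
    """Convert a 1-based column number to letters (1 -> A, 27 -> AA)."""
    letters = ""
    while col > 0:
        col, rem = divmod(col - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters

def random_cells(rows, cols, count, rng=random):
    """Return `count` distinct absolute cell addresses from a rows x cols grid."""
    # Every row/column combination, already flattened into a single list.
    addresses = [
        f"${column_letters(c)}${r}"
        for r in range(1, rows + 1)
        for c in range(1, cols + 1)
    ]
    # Sampling without replacement plays the role of sorting by random keys
    # and keeping only the first `count` results, so nothing repeats.
    return rng.sample(addresses, count)

# 20 unique bomb positions on a 10 x 10 board.
bombs = random_cells(10, 10, 20)
```

Because the sample is drawn without replacement, no address can appear twice, which is the property the question asks for.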
If you want information on how to "freeze" the value of =RANDARRAY so it doesn't recalculate each time you change something, check out this question by player0.
Functions used:
=MAKEARRAY
=LAMBDA
=ADDRESS
=FLATTEN
=SORTN
=RANDARRAY

Tableau: Subset multiple time dependent histograms into multiple rows and columns to fit the screen

I am trying to replicate the plot below (done with ggplot in R) using Tableau:
However, I can't see how I can subset the plot so it fits the screen using Tableau. Using Tableau, this is what I get:
I've attempted adding the following but it stops plotting the histograms and ends up messier:
Row Divider (Discrete):
INT((INDEX()-1)/(ROUND(SQRT(SIZE()))))
Columns Divider (Discrete):
(INDEX()-1)%(ROUND(SQRT(SIZE())))
How can I achieve the plot in R using Tableau?
P.S.: The datasets are different in case you were wondering why Monday doesn't look the same.
You're on the right path using the row and column dividers, but you need to go a step further and use the small multiples technique.
For instance, you need to move WEEKDAY to Detail on the Marks card and then use the column and row dividers on the Columns and Rows shelves.
Doing so, you'll also need to right-click on CNT(Ride Id Hash) and compute it using WEEKDAY.
Here's a cool guide by a Tableau Zen Master showing how to work with this technique: https://www.vizwiz.com/2016/03/tableau-tip-tuesday-how-to-create-small.html
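To see what the two divider calculations from the question produce, here is a small Python sketch of the same arithmetic. It assumes INDEX() runs from 1 to SIZE() over the weekdays and that ROUND rounds half up; the weekday order is only illustrative.

```python
import math

def grid_position(index, size):
    """Map a 1-based partition index to (row, column) in a near-square grid."""
    # Grid width = ROUND(SQRT(SIZE())). int(x + 0.5) rounds half up, which
    # avoids the banker's rounding of Python's built-in round().
    width = int(math.sqrt(size) + 0.5)
    row = (index - 1) // width   # INT((INDEX()-1) / width)
    col = (index - 1) % width    # (INDEX()-1) % width
    return row, col

weekdays = ["Sunday", "Monday", "Tuesday", "Wednesday",
            "Thursday", "Friday", "Saturday"]
layout = {day: grid_position(i, len(weekdays))
          for i, day in enumerate(weekdays, start=1)}
```

With seven weekdays the width is 3, so the panels fill a 3-column grid row by row, with the last weekday alone on the third row.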

Google Sheets: How to make a stacked/aggregate chart

I have made a bar chart which aggregates my data, but is there any way I can split each bar based on the data it is aggregating - similar to how a stacked bar chart would look?
Here is a bad artist's impression (thick blue lines mine). The idea is that it's important to know from looking at the graph whether I sold 5 at £1, or 1 at £5.
Ideally this would work even if the price for each item is variable, but that is not essential (eg: if there is a 'hack' with hardcoding Apple = 3, I can live with that.)
I'm also fine inputting helper columns etc, within reason, but I would want to be able to easily continue to add things to the list on the left without having to add new helper columns each time (calculated ones are fine, of course.)
Thanks in advance.
UPDATE: With thanks to Kin Siang below, I ended up implementing a slightly modified version of their solution, which I am posting here for completeness.
I added a very large (but finite) number of helper columns to the right, with a formula in each cell which would look for the nth occurrence of the item in the main list (wrapped in an iferror to make the unused cells blank).
=iferror(index(FILTER($A:$B,$A:$A=$D2),E$1,2))
Theoretically it could run out of space one day, but I have made it suitably large that this should not be an issue. It has the advantage over the other solution that I do not need to sort or otherwise manipulate the input range and can continue trickling in data to the main list and have the chart automatically update.
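The helper-column lookup can be pictured with a short Python sketch: for each item, take the n-th matching price from the unsorted list, or leave the cell blank. The sample sales list and the limit of three helper columns are invented for illustration.

```python
def nth_price(rows, item, n):
    """Return the price of the n-th occurrence (1-based) of `item`, or "" if absent."""
    # Equivalent in spirit to FILTER on the item name followed by INDEX(..., n, 2).
    matches = [price for name, price in rows if name == item]
    # The IFERROR wrapper becomes an explicit bounds check.
    return matches[n - 1] if 1 <= n <= len(matches) else ""

# Hypothetical main list, in the order the sales were entered.
sales = [("Apple", 3), ("Pear", 2), ("Apple", 3), ("Orange", 1), ("Pear", 4)]
items = ["Apple", "Pear", "Orange"]
max_occurrences = 3  # number of helper columns (assumed)

table = {
    item: [nth_price(sales, item, n) for n in range(1, max_occurrences + 1)]
    for item in items
}
```

Each inner list corresponds to one row of helper columns, and each position in it becomes one segment of that item's stacked bar.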
Yes, it is possible to display the chart in your case, but the data needs to be transposed first. Let me show you an example with a dataset.
Assuming this is your original data:
First sort the data alphabetically, then enter this formula in a new column:
=if(G39="",1,if(G40=G39,I39+1,if(G40<>G39,1)))
Next, add a new column for the category label, using concatenation:
="Price"&I40
In the transformed data for the chart, enter this formula to spread the prices across rows and columns, with one row per product and one column per price label:
=sumifs($H$40:$H$47,$G$40:$G$47,$A41,$J$40:$J$47,B$40)
After that, I selected a stacked bar chart and made sure the prices are under Series. If you have trouble setting the prices as series correctly in 23, you can use the 33 data to create the stacked bar chart and then update the data range again; that also works.
Here is the chart you expected; please accept the answer if it helps :)
*When a fruit has fewer price records, it is advisable to fill in 0, since every row of the data table needs the same columns (see the orange price 3), although I did not test whether blanks work.
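The steps above (sort, number each occurrence, label it "PriceN", then pivot with a conditional sum) can be sketched in Python as follows. The sample data and the function name `stacked_table` are assumptions for the example and do not reproduce the answerer's sheet.

```python
from collections import defaultdict

def stacked_table(rows):
    """Pivot (name, price) rows into {name: [Price1, Price2, ...]} padded with 0."""
    ordered = sorted(rows, key=lambda r: r[0])  # sort alphabetically by name
    counts = defaultdict(int)                   # running occurrence count per name
    cells = defaultdict(dict)
    for name, price in ordered:
        counts[name] += 1
        label = f"Price{counts[name]}"          # the "Price" & count label
        # SUMIFS-style accumulation for this (name, label) pair.
        cells[name][label] = cells[name].get(label, 0) + price
    widest = max(counts.values(), default=0)
    labels = [f"Price{i}" for i in range(1, widest + 1)]
    # Missing price records are filled with 0 so every row has the same columns.
    return {name: [cells[name].get(label, 0) for label in labels] for name in cells}

# Hypothetical sales list.
sales = [("Apple", 3), ("Pear", 2), ("Apple", 3), ("Orange", 1), ("Pear", 4)]
chart_data = stacked_table(sales)
```

Each list in `chart_data` is one bar, and each position in the list is one stacked series.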

Tableau map tooltip not displaying average reference line

I created a chart visualising the cost of living in different cities and entered a line indicating the average. When integrating this sheet into the tooltip of my map, the line is not representing the average anymore but the actual cost of living for each city. I have been trying a lot but can't seem to figure it out. Thankful for any tip!
That's because the tooltip, triggered by the click/hover, takes into consideration only one city at a time, so the average is equal to the sum for that specific city: you're averaging over just one city.
In order to compute the correct reference value you should create a calculated field like this using LOD:
{ FIXED : SUM([Cost Of Living])} / { FIXED : COUNTD([City])}
Then you could use that calculated field in a dual axis chart.
Doing so, since FIXED acts before dimension filters, you will be able to preserve your average across City even though the tooltip triggers a filter.
Take a look at this simple example made with Superstore and keep an eye on the red line (LOD v2), which relates to the calculated field above.
As you can see there's also a blue line which relates to the previous calculated field I wrote (LOD v1):
{ EXCLUDE [State] : AVG( { FIXED [State] : SUM([Sales])})}
Once we move to our main worksheet triggering the viz in tooltip, you'll see that the red line still keeps the correct value calculated on all data, while the blue line takes into consideration only the data that passes the filter.
In fact, FIXED is the only LOD calculation that acts before the dimension filters, so it is able to bypass the filtering triggered by the tooltip.
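The order-of-operations point can be illustrated outside Tableau with a small Python sketch: an average computed before the tooltip's city filter stays constant, while one computed after the filter collapses to the single city's value. The city names and costs are invented for the example.

```python
# Hypothetical cost of living per city.
cost_by_city = {"Berlin": 1200, "Lisbon": 900, "Oslo": 1800}

def average_cost(data):
    """Total cost divided by the number of distinct cities."""
    return sum(data.values()) / len(data)

# FIXED-style value: computed once on all data, before any filter is applied.
fixed_average = average_cost(cost_by_city)

def tooltip_view(city):
    """Simulate the viz in tooltip, which filters the data to one city."""
    filtered = {c: v for c, v in cost_by_city.items() if c == city}
    return {
        "city_cost": filtered[city],
        # Computed after the filter: averages over a single city.
        "filtered_average": average_cost(filtered),
        # Computed before the filter: unchanged by the tooltip.
        "fixed_average": fixed_average,
    }

oslo = tooltip_view("Oslo")
```

In `oslo`, the filtered average equals Oslo's own cost, whereas the fixed average is still the average over all three cities.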

Highcharts compare different dates ranges

I'd like to use highstock to compare two different time ranges together.
For example, given two data sets, one showing the max temp for each day in Jan and the other for each day in Feb, I'd like them to be shown one above the other, with the x-axis being the "same" one for both.
I can't do it with categories, because the data is being fed automatically, so each data point has its own time, so the x-axis is datetime.
I wanted to know if it was possible to simply have two graphs overlapping, with one graph having the normal x-axis at the bottom and the other one having it on top of the graph, so even when the data is for different times, it's shown overlapping. I can't find this problem anywhere.
Found the answer on this thread. Hope it helps!
Overlay 2 series of data of different length with highcharts
Essential Chart can be used with different date ranges with multiple axes. example source
The community license provides the whole suite of products for free if you qualify.
Note: I work for Syncfusion.
