Alteryx: Creating multiple Histograms from one dataset - histogram

I have a data set that contains the following information - Date, item # and the unit price for that item on that date.What I would like to create is one histogram per item (my dataset has 17 unique items), charting the frequency of the unit prices? Is this possible in Alteryx?

What you really want is the ability to group by items within your data set. I think the closest thing to this for your specific use case is the summarize tool. You can group by item and then use the percentile operation to generate several points within the data range to add to a histogram.

Related

Tableau has two data tables, how can i make a filter only apply to single data table

I currently have two data tables each linked to a date table. The data tables are from salesforce. I can calculate the number of a certain case type per quarter without issue. I can also calculate the running sum over quarters to show instrument install base increasing. I want to divide the number of cases per qtr by the install base. This calculation works, but when I apply a filter to see different types of cases per instrument, the filter impacts the install base as well. I would like to keep the install base consistent. I tried different LOD, but no luck. Any suggestions on filters and LOD and where to place in tableau would be beneficial.
One option is to use a parameter for filtering and then having a calculated field that changes based on parameter values in one table but not in the other table. However, this type of filter would affect all worksheets that use the same data.

How to generate a distribution based bar chart on row_numbers

I have a SQL query that acts as a data source in my tableau desktop:
SELECT
row_number() over (order by sales) as rn,
article_number,
country,
SUM(sold_items) as si,
SUM(sales) as sales
FROM data.sales
WHERE sales.order_date between '2021-01-01' and '2021-12-31'
GROUP BY 2, 3
On tableau I dragged rn to column and sales to row to generate a bar chart. The following is the output:
I want to convert this into a 0-100% distribution chart so that I can get the following result:
How can I achieve this? Also, I want the user to filter by country level so even if the # of records increase or decrease, the distribution should always be consistent with the filtered data.
You can do this with nested table calcs.
For example, the following uses the Superstore sample data set, and then first computes a running total of SUM(Sales) per day, then converts that to a percent of total. Notice the edit table calc dialog box - applying two back to back calculations in this case.
The x-axis in this example is Order-Date, and in your question, the the x-axis is a percentage somehow - so its not exactly what you requested but still shows that table calcs are an easy way to do these types of operations.
Also, realize you can just connect to the sales table directly, the custom sql isn’t adding any value, and in fact can defeat query optimizations that Tableau normally makes.
The tableau help docs explains table calculations. Pay attention to the discussion on partitioning and addressing.

Tableau - Dynamic Datasets

I'm not sure what what best way would be to describe the problem I'm trying to solve.
Basically, my datasets are a model output which are generated in the same format on the daily basis.
I have build a dashboard around one dataset but want to create a dynamic filter which check for the output files in a folder and update visuals for the dataset I select.
I can create data connections for the existing datasets and that will make it work but since the datasets get updated on daily basis, is there a way to create such a filter?
I don't know how to let the user select any arbitrary data source without knowing the choices in advance. Maybe there is a way. But maybe you don't need to do that.
Suppose you generate new datasets each day and they are always called by the same names, such as data-monday, data-tuesday, data-wednesday, etc. So you always have exactly the same seven datasets to pick from.
Then each dataset could have a field in it corresponding to the name, such as "WHAT-SET" with values, say, "monday", "tuesday', "wednesday", etc.
Then your data step could import all seven data sets and UNION them together, and your user could use a parameter to filter on "WHAT-SET" to pick the desired one.

Extract Last Value as Metric from Table Calculation in Tableau?

I have raw data in Tableau that looks like:
Month,Total
2021-08,17
2021-09,34
2021-10,41
2021-11,26
2021-12,6
And by using the following calculation
RUNNING_SUM(
COUNTD(IF [Inserted At]>=[Parameters].[Start Date]
AND [Inserted At]<=[End Date]
THEN [Id] ELSE NULL END
))
/
LOOKUP(RUNNING_SUM(
COUNTD(IF [Inserted At]>=[Parameters].[Start Date]
AND [Inserted At]<=[End Date]
THEN [Id] ELSE NULL END
)),-1)*100-100
I get
Month,My_Calc
2021-08,NULL
2021-09,200
2021-10,80.4
2021-11,28.3
2021-12,5.1
And all I really want is 5.1 (last monthly value) as one big metric (% Month-Over-Month Growth).
How can I accomplish this?
I'm relatively new to Tableau and don't know how to use calculated fields in conjunction with the date groupings aspect to express I want to calculate month-over-month growth. I've tried the native year-over-year growth running total table calculation but that didn't end with the same result since I think my calculation method is different.
First a brief table calc intro, and then the answer at the end.
Most calculations in Tableau are actually performed by the data source (e.g. database server), and the results are then returned to Tableau (i.e. the client) for presentation. This separation of responsibilities allows high performance, even when facing very large data sets.
By contrast, table calculations operate on the table of query results that were returned from the server. They are executed late in the order of operations pipeline. That is why table calcs operate on aggregated data -- i.e. you have to ask for WINDOW_SUM(SUM([Sales)) and not WINDOW_SUM([Sales])
Table calcs give you an opportunity to make final passes of calculations over the query results returned from the data source before presentation to the user. You can for instance calculate a running total or make the visualization layout dynamically depend in part on the contents of the query results. This flexibility comes at a cost, the calculation is only one part of defining a table calc. You also have to specify how to apply the calculation to the table of summary results, known as partitioning and addressing. The Tableau on-line help has a useful definition of partitioning and addressing.
Essentially, table calcs are applied to blocks of summary data at a time, aka vectors or windows. Partitioning is how you tell Tableau how you wish to break up the summary query results into windows for purposes of applying your table calc. Addressing is how you specify the order in which you wish to traverse those partitions. Addressing is important for some table calcs, such as RUNNING_SUM, and unimportant for others, such as WINDOW_SUM.
Besides understanding partitioning and addressing very well, it is also helpful to learn about the functions INDEX(), SIZE(), FIRST(), LAST(), WINDOW_SUM(), LOOKUP() and (eventually) PREVIOUS_VALUE() to really understand table calcs. If you really understand them, you'll be able to implement all of these functions using just two of them as the fundamental ones.
Finally, to partially address your question:
You can use the boolean formula LAST() = 0 to tell if you are at the last value of your partition. If you use that formula as a filter, you can hide all the other values. You'll have to get partitioning and addressing specified correctly. You would essentially be fetching a batch of data from your server, using it in calculations on the client side, but only displaying part of it. This can be a bit brittle depending on which fields are on which shelves, but it can work.
Normally, it is more efficient to use a calculation that can be performed server-side, such as LOD calc, if that allows you to avoid fetching data only for client side calculations. But if the data is already fetched for another purpose, or if the calculation requires table calc features, such as the ability to depend on the order of the values, then table calcs are a good tool.
However you do it, the % month-to-month change from 2021.11 (a value of 26) to the value for 2021.12 (a value of 6) is not 5.1%.
It's (( 6 - 26 ) / 26) * 100 = -76.9 %
OK, starting from scratch, this works for me: ( I don't know how to get exactly the table format I want without using ShowMe and Flip, but it works. Anyone else? )
drag Date to rows, change it to combined Month(Date)
drag sales to column shelf
in showme select TEXT-TABLES
flip rows for columns using tool bar
that gets a table like the one you show above
Drag Sales to color (This is a trick to simply hold it for a minute ),
click the down-arrow on the new SALES pill in the mark card,
select "Add a table calculation",
select Running Total, of SUM, compute using Table(down), but don't close this popup window yet.
click Add Secondary Calculation checkbox at the bottom
select Percent Different From
compute using table down
relative to Previous
Accept your work by closing the popup (x).
NOW, change the new pill in the mark card from color to text
you can see the 5.1% at the bottom. Almost done.
Reformat again by clicking table in ShowMe
and flipping axes.
click the sales column header and hide it
create a new calculated field
label 'rows-from-bottom'
formula = last()
close the popup
drag the new pill rows-from-bottom to the filters shelf
select range 0 to 0
close the popup.
Done.
For the next two weeks you can see the finished workbook here
https://public.tableau.com/app/profile/wade.schuette/viz/month-to-month/hiderows?publish=yes

Conditional Median combining different values in Google Sheet Formula

Is there a way to calculate the Median of a set of values in one column depending on whether the adjacent column contains a value that is within a set of values?
Below is a table sample:
I would like to get the median of all the Revenues from the US (combine Team US East and West).
First you have to filter this table according to your criteria and then extract median from new range.
Filtering may be obtained using QUERY function, and then you use built in MEDIAN formula.
I've prepared my example which uses two conditions - like yours.
=median(query(B2:C11,"select B where C ='a' or C='b'"))
I think the easiest way is with Filter and Regexmatch:
=median(filter(B2:B,regexmatch(C2:C,"^Team US")))
or in case there are more teams like Team US North and you don't want to include them:
=median(filter(B2:B,regexmatch(C2:C,"^Team US East|^Team US West")))

Resources