Histogram not showing properly in Jupyter

I hope you can help me with a minor problem in Jupyter.
I have a large dataset about people who have bought newspapers, and I want to show it in a histogram. The dataset has 68,000 rows and 1 column. I want the x-axis to run from 100 to 1200, because the lowest number of papers sold in a month was 1 and the highest was 1188, and I want the values grouped so the x-axis shows 100, 200, 300, 400, 500 and so on up to 1200. Instead, the x-axis shows my row numbers, which go from 1 to 68,000.
import matplotlib.pyplot as plt

# df is assumed to be a pandas DataFrame that is already loaded
plt.figure(figsize=(20, 10))
plt.xlabel('day')
plt.ylabel('Frequency')
plt.title('subscription_days')
x = df[['subscription_days']]
plt.hist(x)
plt.grid(True)
plt.show()
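One thing that might be worth trying is passing explicit bin edges to the histogram, so the x-axis is grouped in steps of 100 regardless of how many rows there are. The sketch below is only an illustration under assumptions: the helper names hundred_bin_edges and plot_subscription_histogram are made up here, and the range 0 to 1200 is chosen so that the reported minimum (1) and maximum (1188) both fall inside a bin.

```python
def hundred_bin_edges(low=0, high=1200, width=100):
    """Return bin edges low, low + width, ..., high (inclusive)."""
    return list(range(low, high + width, width))


def plot_subscription_histogram(values, edges):
    """Draw a histogram of `values` using explicit bin edges (hypothetical helper)."""
    # Imported here so the edge helper above has no plotting dependency.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(20, 10))
    ax.hist(values, bins=edges, edgecolor="black")
    ax.set_xticks(edges)  # one tick per bin edge: 0, 100, ..., 1200
    ax.set_xlabel("subscription_days")
    ax.set_ylabel("Frequency")
    ax.set_title("subscription_days")
    ax.grid(True)
    return fig, ax


edges = hundred_bin_edges()
print(edges)  # [0, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200]
```

In the notebook this could be called as plot_subscription_histogram(df['subscription_days'], edges). Note the single brackets, which select the column as a Series rather than a one-column DataFrame. Whether that difference explains the plot in the question cannot be confirmed from the post alone.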

Related

Tableau - Find median of top 10 in a category

Based on my data, I want to find the median PRICE of the top 10 IDs, ranked by ENR value, in every county. In the attached file I have already found the top 10 IDs by ENR value, but I cannot figure out how to find the median of these top 10 PRICE values in a county.
Any help is appreciated.
Thanks.
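To pin down the calculation being asked for, here is the same logic written as a small Python sketch rather than as a Tableau calculation. It is only an illustration: the field names county, ENR and PRICE follow the question, the function name median_price_of_top_n is made up, and the sample rows are invented toy data (with n=2 instead of 10 to keep it short).

```python
from collections import defaultdict
from statistics import median


def median_price_of_top_n(rows, n=10):
    """For each county, take the n rows with the highest ENR and return the median PRICE."""
    by_county = defaultdict(list)
    for row in rows:
        by_county[row["county"]].append(row)

    result = {}
    for county, items in by_county.items():
        top = sorted(items, key=lambda r: r["ENR"], reverse=True)[:n]
        result[county] = median(r["PRICE"] for r in top)
    return result


# Invented toy data, not taken from the question's attachment.
rows = [
    {"county": "A", "ID": 1, "ENR": 5, "PRICE": 10},
    {"county": "A", "ID": 2, "ENR": 9, "PRICE": 30},
    {"county": "A", "ID": 3, "ENR": 7, "PRICE": 20},
    {"county": "B", "ID": 4, "ENR": 1, "PRICE": 8},
]
result = median_price_of_top_n(rows, n=2)
print(result)
```

In county A the two highest ENR values are 9 and 7, so the median is taken over the prices 30 and 20.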

Tableau - Data disappear when changing 'date'

I'm working with some data and essentially need to take an 'average' of an 'average'. I have a daily snapshot that I am averaging over the month; I then want to sum the averages and find the average of that. I know, I know, the boss says it's a requirement.
So the data looks fine when I have 'Year/Month' on the rows shelf. The issue I am running into is that when I remove the 'month' pill, the data disappears. I've narrowed it down to my calc and the part where I'm trying to define how to average at the lower level.
Any ideas if there is a way to calc it at that lower level to use in the second part where I need to sum those averages then average that?
(Screenshots: Tableau calc, table)

Random select in with a bias towards certain outcomes (ie 60/40)

Let's say I have 2 lists and I would like to randomly select a winner between them, but the winner should come from list A 60% of the time and from list B 40% of the time. How can that be done in Google Sheets?
You can randomly select names from a list using this formula:
=INDEX(A2:A, RANDBETWEEN(1, COUNTA(A2:A)))
Without knowing more about your setup, here is a general formula that does what you're describing:
=IF(RAND()<=0.6,INDEX(A2:A, RANDBETWEEN(1, COUNTA(A2:A))),INDEX(B2:B, RANDBETWEEN(1, COUNTA(B2:B))))
Essentially it rolls a random number between 0 and 1. If the number is less than or equal to 0.6 (which happens about 60% of the time), it selects a random name from column A; otherwise (the remaining 40%) it selects from column B.
You can also replace the "0.6" with a cell reference such as A1 (as long as that cell is outside the name ranges) to make the weight dynamic. Setting that cell to 75%, for example, makes the formula check whether the random value is less than or equal to 0.75.
EDIT: The image shows the wrong condition. I was corrected: you want less than or equal to 0.6, not greater than. I had the weights flipped.
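For anyone who wants to check the 60/40 logic outside of Sheets, the same idea can be sketched in Python. This is an illustration only: pick_winner is a made-up name, the two lists are placeholder names, and the fixed seed is there just to make the run repeatable.

```python
import random


def pick_winner(list_a, list_b, weight_a=0.6, rng=random):
    """Pick from list_a with probability weight_a, otherwise from list_b."""
    source = list_a if rng.random() < weight_a else list_b
    return rng.choice(source)


# Placeholder data; the seed is an assumption made for repeatability.
rng = random.Random(0)
list_a = ["Ann", "Bob", "Cid"]
list_b = ["Dee", "Eve"]

draws = [pick_winner(list_a, list_b, 0.6, rng) for _ in range(10_000)]
share_a = sum(name in list_a for name in draws) / len(draws)
print(f"share of winners from list A: {share_a:.3f}")
```

Over many draws the share coming from list A should settle close to the chosen weight, which is the behaviour the IF(RAND()<=0.6, ...) formula relies on.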

Tableau: Subset multiple time dependent histograms into multiple rows and columns to fit the screen

I am trying to replicate the plot below (done with ggplot in R) using Tableau:
However, I can't see how I can subset the plot so it fits the screen using Tableau. Using Tableau, this is what I get:
I've attempted adding the following but it stops plotting the histograms and ends up messier:
Row Divider (Discrete):
INT((INDEX()-1)/(ROUND(SQRT(SIZE()))))
Columns Divider (Discrete):
(INDEX()-1)%(ROUND(SQRT(SIZE())))
How can I achieve the plot in R using Tableau?
P.S.: The datasets are different in case you were wondering why Monday doesn't look the same.
You're on the right path using the row and column dividers, but you need to go a step further and use the small multiples technique.
For instance, you need to move WEEKDAY to Detail on the Marks card and then put the column and row dividers on the Columns and Rows shelves.
Having done so, you'll also need to right-click CNT(Ride Id Hash) and compute it using WEEKDAY.
Here's a cool guide by a Tableau Zen Master showing how to work with this technique: https://www.vizwiz.com/2016/03/tableau-tip-tuesday-how-to-create-small.html
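To see what the two divider calculations from the question do, here is a small Python sketch of the same arithmetic. It is only an illustration: grid_position is a made-up name, index stands in for INDEX() (1-based) and size for SIZE(), and 7 is used because there are seven weekdays.

```python
def grid_position(index, size):
    """Map a 1-based index to (row, column) in a roughly square grid.

    Mirrors INT((INDEX()-1)/ROUND(SQRT(SIZE()))) for the row and
    (INDEX()-1)%ROUND(SQRT(SIZE())) for the column.
    """
    cols = round(size ** 0.5)
    return (index - 1) // cols, (index - 1) % cols


positions = [grid_position(i, 7) for i in range(1, 8)]
print(positions)
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0)]
```

With seven weekdays the grid comes out three columns wide, so the days fill two full rows and the seventh starts a third row. The dividers only produce this layout when they are computed along the dimension being split.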

How to create a histogram in Google Sheets with a log scale on x-axis

I need to create a histogram in Google Sheets, and I need it to have a log scale on the x-axis. This is because there are a few very high numbers in my column, while most numbers are clustered at the low end.
The option shows up for the y-axis, but not for the x-axis. I think it showed up for a while when I was trying different options, but now it has disappeared.
Please help!
Try a normal chart (bar or line) and build the histogram table manually.
Use the FREQUENCY() formula for this. That way you can define your own classes however you like and then make whatever chart you like.
Take a look at my solution: a line chart with a logarithmic y-scale.
As far as I can see, the x-scale is not available for manipulation, but you can use your own values and treat them as text.
Example dataset: 100 random values from 0 to 35.
Classes are powers of 2 (increasing by 1/2 with each step).
Here is my example file. See if it helps:
https://docs.google.com/spreadsheets/d/13xVVwhUrMcDj-ec7xpTJv-8cDjlh8zXT46zrqVVLnk0/copy
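The manual histogram table described above can be sketched in Python to show what the classes and counts look like. This is an illustration under assumptions: I am reading "increase by 1/2 with each step" as the exponent of 2 growing by 0.5 per class, the function names are made up, the counting rule is my understanding of how FREQUENCY() assigns values to classes, and the 100 values are generated here rather than taken from the example file.

```python
import bisect
import random


def power_of_two_edges(max_value, step=0.5):
    """Upper class limits 2**0, 2**0.5, 2**1, ... until max_value is covered."""
    edges = []
    exponent = 0.0
    while True:
        edges.append(2 ** exponent)
        if edges[-1] >= max_value:
            return edges
        exponent += step


def frequency(values, edges):
    """Count values per class: counts[i] holds values above edges[i-1] and at
    most edges[i]; the extra last slot holds values above the final edge."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect.bisect_left(edges, v)] += 1
    return counts


# Generated stand-in data: 100 random values from 0 to 35.
rng = random.Random(1)
values = [rng.uniform(0, 35) for _ in range(100)]

edges = power_of_two_edges(35)
counts = frequency(values, edges)
for upper, count in zip(edges, counts):
    print(f"<= {upper:6.2f}: {count}")
```

Because the class limits grow geometrically, plotting the counts against the class labels as text gives evenly spaced categories, which is what stands in for a log-scaled x-axis in this workaround.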
