Summation Function in Google Sheets - google-sheets

I have data which can be used to find the amount of snowfall in a particular month.
MONTH SNOWFALL INDEX
Jan 0.25
Feb 0.1
Mar 0.6
Apr 0.99
May 0.2
Jun 0.2
Jul 0.01
Aug 0.09
Sep 1.0
Oct 0.5
Nov 0.8
Dec 0.39
To calculate how much snow falls in each month, I have the following formula:
snowfall_amount = (130 - snowfall_index) / 90
I want to write a formula which adds up the amount of snowfall between the months of March and April. Normally, I would create a third column and make the formula:
=(130 - $B2) / 90
and then drag that formula down. Then my solution would be:
=SUM($C4:$C5)
However here I am looking for a one-cell solution. Intuitively it seems like this is the job for a Summation but I don't see any way to do that through formulas.

Try
=ArrayFormula(sum((130-index(B2:B,match(C2,A2:A,0)):index(B2:B,match(D2,A2:A,0)))/90))
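As a sanity check on the formula above, here is the arithmetic it should reproduce. This assumes C2 holds the start month (Mar) and D2 the end month (Apr), which the answer implies but does not state; s_m is just shorthand for the snowfall index of month m.

```latex
\sum_{m \in \{\text{Mar},\,\text{Apr}\}} \frac{130 - s_m}{90}
  = \frac{130 - 0.6}{90} + \frac{130 - 0.99}{90}
  = \frac{258.41}{90}
  \approx 2.8712
```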

Related

Does XGBoost Regressor handle missing timesteps?

I have a dataframe of daily item sales; the goal is to forecast future sales in order to plan warehouse supply. I'm using XGBoost as the regressor.
date                 qta      prezzo  year  day  dayofyear  month  week  dayofweek  festivo
2014-01-02 00:00:00  6484.8   1       2014  2    2          1      1     3          1
2014-01-03 00:00:00  5300     1       2014  3    3          1      1     4          1
2014-01-04 00:00:00  2614.9   1.1     2014  4    4          1      1     5          1
2014-01-07 00:00:00  114.3    1.1     2014  7    7          1      2     1          0
2014-01-09 00:00:00  11490    1       2014  9    9          1      2     3          0
The date is also the index of my dataframe. qta is the label (the dependent variable) and all the other columns are the features.
As you can see, the data is sampled daily, but some days are missing (e.g. the 5th, 6th, and 8th).
Could this be a problem when fitting the model and predicting future days?
Am I supposed to fill the missing days with qta = 0?

Averaging a Data Series in a Google Sheet to a single entry per period regardless of the number of samples in the larger period?

I have a small data set of ~200 samples taken over twenty years, with two columns of data that sometimes have multiple entries per period (i.e. age or date). When I plot it, even though the data spans 20 years, the graph heavily reflects the number of samples in each period rather than the period itself. For example, there may be 2 or 3 samples during age 23, 1 for age 24, 20 for age 25, and 10 for age 35. The number of samples depended entirely on the need for additional data at the time, so the sample rate is not consistent.
How do I get a max or an average for each period (age) and ensure there is only one entry per period in the sheet (about one entry per year), without having to create a separate sheet full of separate queries and charting off of that?
What I have tried in Google Sheets (where my data is) is choosing "aggregate" on the chart's x-axis series (the age period), which helps flatten the graph a bit but doesn't reduce the series.
A read-only link to the spreadsheet is HERE for reference.
The data looks something like this:
3/27/2013 36.4247 2.5 29.3
4/10/2013 36.4630 1.8 42.8
4/15/2013 36.4767 2.2 33.9
5/2/2013 36.5233 2.2 33.9
5/21/2013 36.5753 1.91 39.9
5/29/2013 36.5973 1.94 39.2
7/29/2013 36.7644 1.98 38.3
10/25/2013 37.0055 1.7 45.6
2/28/2014 37.3507 1.85 50 41.3
6/1/2014 37.6055 1.98 38 38.1
12/1/2014 38.1068 37
6/1/2015 38.6055 2.18 34 33.9
12/11/2015 39.1342 3.03 23 23.1
12/14/2015 39.1425 3.18 22 21.9
12/15/2015 39.1452 3.44 20 20.0
12/17/2015 39.1507 3.61 19 18.9
12/21/2015 39.1616 3.62 19 18.8
12/23/2015 39.1671 3.32 21 20.8
12/25/2015 39.1726 3.08 23 22.7
12/28/2015 39.1808 3.12 22 22.4
12/29/2015 39.1836 2.97 24 23.7
12/30/2015 39.1863 3.57 19 19.1
12/31/2015 39.1890 3.37 20 20.5
1/1/2016 39.1918 3.37 20 20.5
1/3/2016 39.1973 2.65 27 27.0
1/4/2016 39.2000 2.76 26 25.8
try:
=QUERY(SORTN(SORT({YEAR($A$6:$A), B6:B}, 1, 0, 2, 0), 9^9, 2, 1, 1),
"where Col1 <> 1899")
demo spreadsheet
and build a chart from there

Google sheets importHTML removes zero and treats commas as decimal

I'm trying to import a table where the commas are the 1000 separator,
example: 32,100 is 32100 but it is treating it as 32.1 instead.
This is a similar table (first one / top left):
https://en.wikipedia.org/wiki/Demographics_of_the_world
imgur for screenshots:
https://imgur.com/a/hJR9tox
I want it to say:
Year million
1500 458
1600 580
1700 682
1750 791
1800 978
1850 1262
1900 1650
1950 2521
1999 5978
2008 6707
2011 7000
2015 7350
2018 7600
2020 7750
But it comes out as:
Year million
1500 458
1600 580
1700 682
1750 791
1800 978
1850 1,262
1900 1,65
1950 2,521
1999 5,978
2008 6,707
2011 7
2015 7,35
2018 7,6
2020 7,75
This is the function I'm using:
=IMPORTHTML("https://en.wikipedia.org/wiki/Demographics_of_the_world"; "table"; 1)
I have also tried using this function:
=IMPORTXML("https://en.wikipedia.org/wiki/Demographics_of_the_world"; "//*[@id='mw-content-text']/div/table[1]/tbody")
But the output of that is extremely hard to read, since it looks like this and still removes the zeros:
World Population[1][2] Yearmillion 1500458 1600580 1700682 1750791 1800978 18501,262 19001,65 19502,521 19995,978 20086,707 20117 20157,35 20187,6 20207,75
Other things I have tried:
Forcing it to always print three decimals. That won't work, since it appends extra digits to every number.
The main and easiest solution is to change your spreadsheet's locale setting to one that uses , as the thousands separator.
As an alternative, if changing this setting is really not a possibility, you could create a script that uses UrlFetchApp to retrieve the page's contents and parses the values, taking into consideration the use of , as the thousands separator.

Highcharts Plotline feature

I was wondering whether it'd be possible to draw multiple y axis plotlines on specific dates using highcharts.
I have 3 lines with different colors: one from 16 Apr to 30 Apr, the second from 30 Apr to 28 May, and the third from 28 May to 25 Jun.
something similar to this :
Thanks for your help.
H

Rails: Why, and how, does adding apparently equal values to equal dates give different results in this example?

I see that typing 100.days gives me [edit: seems to give me] a Fixnum 8640000:
> 100.days.equal?(8640000)
=> true
I would have thought those two values were interchangeable, until I tried this:
> x = Time.now.to_date
=> Wed, 31 Oct 2012
> [x + 100.days, x + 8640000]
=> [Fri, 08 Feb 2013, Mon, 07 May 25668]
Why, and how, does adding apparently equal values to equal dates give different results?
The above results are from the Rails console, using Rails version 3.1.3 and Ruby version 1.9.2p320. (I know, I should upgrade to the latest version...)
100.days doesn't return a Fixnum, it returns an ActiveSupport::Duration, which tries pretty hard to look like an integer under most operations.
Date#+ and Time#+ are overridden to detect whether a Duration is being added, and if so they do the calculation properly rather than just adding the integer value. (Time#+ expects a number of seconds, so + 86400 advances by 1 day, while Date#+ expects a number of days, so + 86400 advances by 86400 days.)
In addition, some special cases are covered, like adding a day on the day daylight saving time comes into effect. This also allows Time.now + 1.month to advance by 1 calendar month irrespective of the number of days in the current month.
Besides what Frederick's answer supplies, adding 8640000 to a Date isn't the same as adding 8640000 to a Time, nor is 100.days the correct designation for 100 days.
Think of 100.days meaning "give me the number of seconds in 100 days", not "This value represents days". Rails used to return the number of seconds, but got fancy/smarter and changed it to a duration so the date math could do the right thing - mostly. That fancier/smarter thing causes problems like you encountered by masking what's really going on, and makes it harder to debug until you do know.
Date math assumes day values, not seconds, whereas Time wants seconds. So, working with 100 * 24 * 60 * 60 = 8640000:
100 * 24 * 60 * 60 => 8640000
date = Date.parse('31 Oct 2012') => Wed, 31 Oct 2012
time = Time.new(2012, 10, 31) => 2012-10-31 00:00:00 -0700
date + 8640000 => Mon, 07 May 25668
time + 8640000 => 2013-02-08 00:00:00 -0700
date + 100 => Fri, 08 Feb 2013
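The same contrast can be reproduced in plain Ruby without Rails. This is only a minimal sketch: the helper shifted_values and the constant names are made up for illustration, and Time.utc is used (an assumption, not what the session above did) so that daylight saving changes don't shift the result.

```ruby
require 'date'

SECONDS_PER_DAY = 24 * 60 * 60
HUNDRED_DAYS_IN_SECONDS = 100 * SECONDS_PER_DAY # 8_640_000

# Hypothetical helper: applies the same integers to a Date and a Time.
# Date#+ interprets the integer as days, Time#+ interprets it as seconds.
def shifted_values(date, time)
  {
    date_plus_100: date + 100,                           # 100 days later
    time_plus_seconds: time + HUNDRED_DAYS_IN_SECONDS,   # also 100 days later
    date_plus_seconds: date + HUNDRED_DAYS_IN_SECONDS    # 8,640,000 days later
  }
end

result = shifted_values(Date.new(2012, 10, 31), Time.utc(2012, 10, 31))
result.each { |label, value| puts "#{label}: #{value}" }
```

The first two values land on 8 Feb 2013, while the third ends up in the year 25668, matching the console output in the question.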
It's a pain sometimes dealing with Times and Dates, and you're sure to encounter bugs in code you've written where you forget. That's where the ActiveSupport::Duration part helps, by handling some of the date/time offsets for you. The best tactic is to use either Date/DateTime or Time, and not mix them unless absolutely necessary. If you do have to mix them, then bottleneck the code into methods so you have a single place to look if a problem crops up.
I use Date and DateTime if I need to handle larger ranges than Time can handle, plus DateTime has some other useful features, otherwise I use Time because it's more closely coupled to the OS and C. (And I revealed some of my roots there.)
