First: I am aware of the distinct() function, but that's not what I want.
My problem: imagine a series of sensor readings that barely change, e.g.:
[2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4, 3, 3, 5, 5, 2]
In my application this series is very long (thousands of entries), and I would like to visualize it in a diagram (on Android, but that doesn't matter).
What I'd like to achieve:
I would like to get only the values where the series changes, e.g.:
[2, 3, 4, 3, 5, 2]
along with their respective timestamps and tags, of course.
With the distinct() function the result would look like this:
[2, 3, 4, 5]
Thanks!
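To make the desired behaviour concrete, here is a minimal Python sketch of the "keep only the change points" idea. It is independent of any database or query language; the function name change_points is made up for illustration, and the list index stands in for the timestamp that a real record would carry.

```python
def change_points(readings):
    """Return (index, value) for every reading that differs from the one before it."""
    result = []
    previous = object()  # sentinel that never equals a real reading
    for index, value in enumerate(readings):
        if value != previous:
            result.append((index, value))
            previous = value
    return result


series = [2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4, 3, 3, 5, 5, 2]
changes = change_points(series)
values = [value for _, value in changes]
print(values)  # [2, 3, 4, 3, 5, 2]
```

Unlike distinct(), this compares each value only with its immediate predecessor, so a value that reappears later (the second 3, the final 2) is kept.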
I've got a spreadsheet tracking my partner's and my Wordle scores, along with an average.
Win Turns, Me, Them
1, 0, 0
2, 1, 2
3, 4, 9
4, 12, 8
5, 5, 9
6, 2, 6
And the formula to calculate the average looks like this:
=((C2*A2)+(C3*A3)+(C4*A4)+(C5*A5)+(C6*A6)+(C7*A7))/sum(C2:C7)
I'm using Google Sheets. Is there a better way to write this?
If you want something short, you can use:
=INDEX(SUM(C2:C*A2:A)/SUM(C2:C))
or:
=AVERAGE.WEIGHTED(A2:A, C2:C)
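As a sanity check outside the spreadsheet, the same weighted average can be sketched in Python. The lists below copy column A (win turns) and column C ("Them") from the table in the question; the variable names are only illustrative.

```python
turns = [1, 2, 3, 4, 5, 6]   # column A: number of turns needed to win
wins = [0, 2, 9, 8, 9, 6]    # column C: how often each turn count occurred

# Same arithmetic as SUM(C2:C*A2:A)/SUM(C2:C)
weighted_sum = sum(t * w for t, w in zip(turns, wins))
average = weighted_sum / sum(wins)
print(round(average, 3))  # 4.235
```

Each turn count is weighted by how many games ended on that turn, which is what all three formulas compute.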
Suppose we have an array like [1, 2, 3, 4]. If I create a segment tree for that array, we get something like [null, 10, 3, 7, 1, 2, 3, 4], so each node holds exactly the sum of the subarray it covers.
However, if our input array is like [1, 2, 3], our segment tree would be something like: [null, 6, 3, 3, 1, 2, 3, 0], with the trailing 0 since we don't have a complete binary tree due to 3 (the array's length) not being a power of 2.
Unlike in the first example, since our binary tree isn't complete, we run into duplicate ranges. In our tree [null, 6, 3, 3, 1, 2, 3, 0], the second-to-last 3 and the last 3 represent the same range, since the right subtree contributes only the padding 0.
Is there any way to distinguish between these duplicate ranges? Or should I be using another data structure for a problem that's susceptible to the kind of duplicate-range issue I'm having with my second example?
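One way to look at the question, sketched in Python under the same layout (1-indexed array, padded with zeros to a power of two): each node can be identified by its index, and in padded coordinates the two nodes cover different ranges. The helper names build and node_range are hypothetical and not taken from any library.

```python
def build(values):
    """Build a sum segment tree stored as a 1-indexed array, zero-padded to a power of two."""
    size = 1
    while size < len(values):
        size *= 2
    tree = [None] + [0] * (2 * size - 1)
    for i, v in enumerate(values):
        tree[size + i] = v  # leaves live at positions size .. 2*size - 1
    for node in range(size - 1, 0, -1):
        tree[node] = tree[2 * node] + tree[2 * node + 1]
    return tree, size


def node_range(node, size):
    """Inclusive (lo, hi) range of padded leaf positions covered by a node."""
    lo = hi = node
    while lo < size:  # walk down to the leftmost and rightmost leaf
        lo, hi = 2 * lo, 2 * hi + 1
    return lo - size, hi - size


tree, size = build([1, 2, 3])
print(tree)                 # [None, 6, 3, 3, 1, 2, 3, 0]
print(node_range(3, size))  # (2, 3): the real element 3 plus the padding slot
print(node_range(6, size))  # (2, 2): only the real element 3
```

Seen this way, node 3 and node 6 are distinct nodes with distinct padded ranges; they merely have equal sums because the padding slot contributes 0.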
I have a timestamp for every record in the data set.
I heard about time-based splitting but don't know anything about it.
Normal cross-validation
You have a set of data points:
data_points = [2, 4, 5, 8, 6, 9]
Then, if you do a 2-fold split, your data points will get randomly assigned to 2 different groups.
For example:
split_1 = [2, 5, 9]
split_2 = [4, 8, 6]
However, this assumes that there is no need to keep the order of your data points.
You can train your model with split_1 and test it with split_2.
Time-based splitting
However, this assumption isn't always correct for time series prediction.
For example, given the same data points:
data_points = [2, 4, 5, 8, 6, 9]
They may be ordered by time.
You could then have a model that looks back 3 time steps to predict the next number (e.g. to predict the number after 9, it takes [8, 6, 9] as input). This means the order in which the data points appear is important. Because of that, you cannot randomly split your data points to test your model; the order in which they appear needs to be kept.
So if you do a 2-fold split, you could get the following splits:
split_1 = [2, 4, 5, 8]
split_2 = [5, 8, 6, 9]
Implementation
scikit-learn provides an implementation of time-based cross-validation: TimeSeriesSplit.
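To illustrate the general idea without depending on any library, here is a hand-written Python sketch of an expanding-window split. It is not scikit-learn's code; the function name expanding_window_splits and the choice of test size are assumptions made for this example. Note that these folds keep the test block strictly after the training block, so they do not overlap the way the look-back example above does.

```python
def expanding_window_splits(n_samples, n_splits):
    """Yield (train_indices, test_indices) where each test block follows its training data in time."""
    test_size = n_samples // (n_splits + 1)  # assumed: equal-sized test blocks
    splits = []
    for k in range(n_splits):
        test_start = n_samples - (n_splits - k) * test_size
        train_idx = list(range(test_start))                          # everything before the test block
        test_idx = list(range(test_start, test_start + test_size))   # the next block in time
        splits.append((train_idx, test_idx))
    return splits


data_points = [2, 4, 5, 8, 6, 9]
folds = [
    ([data_points[i] for i in train], [data_points[i] for i in test])
    for train, test in expanding_window_splits(len(data_points), 2)
]
for train, test in folds:
    print(train, test)
# [2, 4] [5, 8]
# [2, 4, 5, 8] [6, 9]
```

The model is always trained on the past and tested on the future, which is the property a random split does not guarantee.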
I'm ordering the data by priority (ascending), where '0' should be ignored when prioritising.
Below is the Rails Query:
Profile.where(active: true).order(:priority).pluck(:priority)
This query returns an ordered list of priorities that starts with '0':
[0, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 7]
Could you help me figure out how to order the data so that records with priority "0" come last, as in the example below?
Example: [1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 7, 0]
You can pass a string to #order to use raw SQL so you could say:
Profile.where(active: true)
.order('case priority when 0 then 1 else -1 end, priority')
.pluck(:priority)
to force the priority zero entries to the end. You don't have to use 1 and -1 as the numbers, of course. You could use anything that is readable to you and sorts in the right order; you could even use strings (assuming they sort properly):
.order("case priority when 0 then 'last' else 'first' end, priority")
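The CASE expression works as a two-level sort key: first "is this a zero?", then the priority itself. The same idea can be sketched outside the database in Python, purely to show the ordering logic (this is not Rails code and not a replacement for the query):

```python
priorities = [0, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 7]

# First key: False (non-zero) sorts before True (zero); second key: the priority itself.
ordered = sorted(priorities, key=lambda p: (p == 0, p))
print(ordered)  # [1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 7, 0]
```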
I am new to Highcharts. I need to display a line chart.
Here are the categories:
["9/7/14", "9/8/14", "9/9/14", "9/10/14", "9/11/14", "9/12/14", "9/13/14", "9/14/14", "9/15/14", "9/16/14", "9/17/14", "9/18/14", "9/19/14", "9/20/14", ...]
Here is the data series:
[1, 4, 0, 2, 1, 1, 1, 5, 3, 1, 0, 0, 6, 8, ... ]
What I hope to achieve is to group every three dates and their total and display it accordingly. Something like this:
["9/7/14", "9/10/14", ...]
[5, 4, ... ]
Is this something Highcharts can do out of the box, and if so, how?
Thanks and regards.
It's not possible when using categories; in that case you need to calculate the groups on your own.
In Highstock, that feature is called dataGrouping. However, it doesn't work with categories.
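The "calculate this on your own" step could look roughly like the following Python sketch, run before the data is handed to the chart. It uses only the 14 values visible in the question (the elided "..." entries are left out), and the function name group_sum is made up for illustration.

```python
def group_sum(categories, data, size):
    """Keep the first category of each group and sum the values in that group."""
    labels = categories[::size]
    totals = [sum(data[i:i + size]) for i in range(0, len(data), size)]
    return labels, totals


categories = ["9/7/14", "9/8/14", "9/9/14", "9/10/14", "9/11/14", "9/12/14", "9/13/14",
              "9/14/14", "9/15/14", "9/16/14", "9/17/14", "9/18/14", "9/19/14", "9/20/14"]
data = [1, 4, 0, 2, 1, 1, 1, 5, 3, 1, 0, 0, 6, 8]

labels, totals = group_sum(categories, data, 3)
print(labels)  # ['9/7/14', '9/10/14', '9/13/14', '9/16/14', '9/19/14']
print(totals)  # [5, 4, 9, 1, 14]
```

The last group here holds only two values because the visible sample stops at 14 entries; with the full series, the final group may or may not be complete.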