I have a number of fruit baskets. Each holds a random number of apples and has its own set of properties.
arrayOfBaskets = [
["basketId": 1, "typeOfPesticidesUsed": 1, "fromCountry":1, "numberOfApples": 5],
["basketId": 2, "typeOfPesticidesUsed": 1, "fromCountry":1, "numberOfApples": 6],
["basketId": 3, "typeOfPesticidesUsed": 2, "fromCountry":1, "numberOfApples": 3],
["basketId": 4, "typeOfPesticidesUsed": 2, "fromCountry":1, "numberOfApples": 7],
["basketId": 5, "typeOfPesticidesUsed": 1, "fromCountry":2, "numberOfApples": 8],
["basketId": 6, "typeOfPesticidesUsed": 1, "fromCountry":2, "numberOfApples": 4],
["basketId": 7, "typeOfPesticidesUsed": 2, "fromCountry":2, "numberOfApples": 9],
["basketId": 8, "typeOfPesticidesUsed": 2, "fromCountry":2, "numberOfApples": 5]
]
In this case, how do I formulate an algorithm that outputs an array like this:
uniquePairingOfBasketProperties = [
["typeOfPesticidesUsed": 1, "fromCountry":1],
["typeOfPesticidesUsed": 2, "fromCountry":1],
["typeOfPesticidesUsed": 1, "fromCountry":2],
["typeOfPesticidesUsed": 2, "fromCountry":2]
]
My main goal is to let my UITableView know how many rows it should have, which in this case is 4 rather than the total number of baskets.
Huh? You have an array of dictionaries. You want to divide those dictionaries into "buckets" where each bucket has a unique combination of pesticide type and country of origin?
Assuming that's the case, how about this:
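let kNumberOfCountries = 2
// Assumes basket is a [String: Int]; dictionary subscripts return optionals, hence the force unwraps.
let uniqueValue = basket["typeOfPesticidesUsed"]! * kNumberOfCountries + basket["fromCountry"]!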
uniqueValue jumps in steps of kNumberOfCountries with the type of pesticide, and changes by 1 with the country of origin. (Think of a rectangular grid where the country number starts at 1 on the left and increases to the right, and the pesticide number starts at 1 at the top and increases as you go down. The unique value is lowest in the top-left square, counts up to the right, then wraps around to the next row and keeps counting up by 1s.)
You can then group your table view based on uniqueValue.
If you want to know how many unique pairings you have, create an empty set of integers. Loop through your array of baskets, calculate the uniqueValue for each basket, and add it to the set (a set only keeps one entry for each value). Once you are done looping, the number of entries in the set is the number of unique pairings you have. If you use an NSCountedSet, you can even get the number of baskets with each pairing. (I don't know if Swift has a native counted set collection. It didn't last time I checked.)
EDIT:
It looks like Swift does NOT have a native counted set collection (at least not yet). There is, however, at least one open source Swift counted set (aka a bag) on GitHub.
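As a rough illustration of the counting approach described above, here is a minimal sketch in Python rather than Swift. It makes two swaps that are not part of the answer: `collections.Counter` stands in for a counted set, and the (pesticide, country) tuple is used directly as the key instead of the computed uniqueValue. The snake_case names are made up for this example.

```python
from collections import Counter

baskets = [
    {"basketId": 1, "typeOfPesticidesUsed": 1, "fromCountry": 1, "numberOfApples": 5},
    {"basketId": 2, "typeOfPesticidesUsed": 1, "fromCountry": 1, "numberOfApples": 6},
    {"basketId": 3, "typeOfPesticidesUsed": 2, "fromCountry": 1, "numberOfApples": 3},
    {"basketId": 4, "typeOfPesticidesUsed": 2, "fromCountry": 1, "numberOfApples": 7},
    {"basketId": 5, "typeOfPesticidesUsed": 1, "fromCountry": 2, "numberOfApples": 8},
    {"basketId": 6, "typeOfPesticidesUsed": 1, "fromCountry": 2, "numberOfApples": 4},
    {"basketId": 7, "typeOfPesticidesUsed": 2, "fromCountry": 2, "numberOfApples": 9},
    {"basketId": 8, "typeOfPesticidesUsed": 2, "fromCountry": 2, "numberOfApples": 5},
]

# Count how many baskets share each (pesticide, country) pairing.
pairing_counts = Counter(
    (basket["typeOfPesticidesUsed"], basket["fromCountry"]) for basket in baskets
)

# One dictionary per distinct pairing, in order of first appearance.
unique_pairings = [
    {"typeOfPesticidesUsed": pesticide, "fromCountry": country}
    for pesticide, country in pairing_counts
]

# The number of table rows is the number of distinct pairings.
number_of_rows = len(pairing_counts)
```

Here `pairing_counts[(1, 1)]` gives the number of baskets for one pairing, which is the role the counted set plays in the answer.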
Suppose we have an array like [1, 2, 3, 4]. If I created a segment tree for that array, we'd get something like: [null, 10, 3, 7, 1, 2, 3, 4], so every entry in the segment tree corresponds to exactly one subarray sum.
However, if our input array is like [1, 2, 3], our segment tree would be something like: [null, 6, 3, 3, 1, 2, 3, 0], with the trailing 0 since we don't have a complete binary tree due to 3 (the array's length) not being a power of 2.
Unlike in the first example, since our binary tree isn't complete, we run into duplicate ranges. In our tree: [null, 6, 3, 3, 1, 2, 3, 0], the 2nd last 3 and the last 3 represent the same range, since the right tree has a 0.
Is there any way to distinguish between this duplicate range? Or should I be using another data structure for a problem that's susceptible to this kind of segment tree duplicate range issue that I'm having with my second example?
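To make the two layouts in the question concrete, here is a small Python sketch that builds the same 1-indexed array representation, padding with zeros up to the next power of two. The function names `build_segment_tree` and `node_range` are hypothetical, and `None` plays the role of `null`. The second helper only reports which padded index range a node position covers; it is an illustration of the layout, not a claim about the best fix.

```python
def build_segment_tree(values):
    """Build a 1-indexed sum segment tree, padding leaves with 0 to a power of two."""
    size = 1
    while size < len(values):
        size *= 2
    tree = [None] + [0] * (2 * size - 1)  # index 0 is unused
    for i, value in enumerate(values):
        tree[size + i] = value            # leaves live at positions size..2*size-1
    for node in range(size - 1, 0, -1):
        tree[node] = tree[2 * node] + tree[2 * node + 1]
    return tree


def node_range(node, size):
    """Return the inclusive (start, end) range of padded leaf indices under a node."""
    depth = node.bit_length() - 1
    width = size >> depth
    start = (node - (1 << depth)) * width
    return start, start + width - 1


full_tree = build_segment_tree([1, 2, 3, 4])
padded_tree = build_segment_tree([1, 2, 3])
```

In `padded_tree`, positions 3 and 6 both hold the value 3, but `node_range(3, 4)` is (2, 3) while `node_range(6, 4)` is (2, 2), so the position in the array still tells the two nodes apart even though their sums match.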
I have a timestamp for every record in the data set.
I heard about time-based splitting but don't know anything about it.
Normal cross-validation
You have a set of data points:
data_points = [2, 4, 5, 8, 6, 9]
Then, if you do a 2-fold split, your data points will get randomly assigned to 2 different groups.
For example:
split_1 = [2, 5, 9]
split_2 = [4, 8, 6]
However, this assumes that there is no need to keep the order of your data points.
You can train your model with split_1 and test it with split_2.
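A random 2-fold split like the one above could be sketched as follows. The helper name `two_fold_split` and the `seed` parameter are assumptions for this example; which points land in which split depends on the shuffle.

```python
import random


def two_fold_split(data_points, seed=None):
    """Shuffle a copy of the data and cut it into two halves."""
    shuffled = list(data_points)
    random.Random(seed).shuffle(shuffled)  # order is deliberately discarded
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


data_points = [2, 4, 5, 8, 6, 9]
split_1, split_2 = two_fold_split(data_points, seed=0)
```

Every data point ends up in exactly one of the two splits, but nothing about the original order survives.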
Time based splitting
However, this assumption isn't always correct for time series prediction.
For example, given the same data points:
data_points = [2, 4, 5, 8, 6, 9]
It can be that they are arranged by time.
You could then have a model that looks back 3 time steps to predict the next number (e.g. to predict the number after 9, it will have [8, 6, 9] as input). This means that the order in which the data points appear is important. Because of that, you cannot randomly split your data points in order to test your model. The order in which they appear needs to be kept.
So if you do a 2-fold split, you could get the following splits:
split_1 = [2, 4, 5, 8]
split_2 = [5, 8, 6, 9]
Implementation
scikit-learn provides an implementation of time-based cross-validation: TimeSeriesSplit.
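To show the idea without depending on any library, here is a plain-Python sketch of an expanding-window split: each fold trains on everything before a cut point and tests on the block right after it. This is not scikit-learn's code, and the name `expanding_window_splits` is made up for this example. Unlike the splits shown above, the train and test parts here do not overlap.

```python
def expanding_window_splits(data_points, n_splits):
    """Return (train, test) pairs that respect the order of the data."""
    fold_size = len(data_points) // (n_splits + 1)
    if fold_size == 0:
        raise ValueError("not enough data points for the requested number of splits")
    splits = []
    for k in range(1, n_splits + 1):
        # Everything before train_end is training data; the next block is the test set.
        train_end = len(data_points) - (n_splits - k + 1) * fold_size
        train = data_points[:train_end]
        test = data_points[train_end:train_end + fold_size]
        splits.append((train, test))
    return splits


data_points = [2, 4, 5, 8, 6, 9]
splits = expanding_window_splits(data_points, n_splits=2)
# First fold:  train [2, 4],       test [5, 8]
# Second fold: train [2, 4, 5, 8], test [6, 9]
```

The test points always come later in time than the training points, so the model is never evaluated on data that precedes what it was trained on.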
I'm arranging the data by priority (ascending order), where '0' should be ignored when prioritising.
Below is the Rails Query:
Profile.where(active: true).order(:priority).pluck(:priority)
This query returns an ordered list of priorities that starts from '0':
[0, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 7]
Could you help me figure out how to order the data so that records with "0" come last in the query result, as in the example below?
Example: [1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 7, 0]
You can pass a string to #order to use raw SQL, so you could say:
Profile.where(active: true)
.order('case priority when 0 then 1 else -1 end, priority')
.pluck(:priority)
to force the priority zero entries to the end. You don't have to use 1 and -1 as the numbers, of course. You could use anything that is readable to you and sorts in the right order, even strings (assuming they sort properly):
.order("case priority when 0 then 'last' else 'first' end, priority")
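The two-level ordering in that CASE expression (first "is it zero?", then the priority itself) can be mirrored outside SQL. The following Python sketch is only an illustration of the sort logic on the sample list from the question, not a replacement for doing the ordering in the database.

```python
priorities = [0, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 7]

# Sort by a pair: (is the priority zero?, the priority itself).
# False sorts before True, so non-zero entries come first.
zero_last = sorted(priorities, key=lambda priority: (priority == 0, priority))
```

The first element of the key plays the same role as the `case ... end` term, and the second element is the plain `priority` tiebreaker.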
I am trying to generate random numbers but only certain numbers. I know to generate a random number between 0 and 10 you'd use:
arc4random_uniform(11)
But what if I wanted to generate a random number between a selection of, say 3, 5, 8, and 10?
Vacawama is right and should be given credit.
A little more thought:
Choose the numbers you want and put them into an array, then use a random index into the array to get the number.
[3, 5, 8, 10]
Array indices start at zero, so: [0: 3, 1: 5, 2: 8, 3: 10].
Passing 4 to arc4random_uniform gives you a random index between 0 and 3.
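The same index-based idea, sketched in Python instead of Swift: `random.randrange(n)` is used here as the counterpart of `arc4random_uniform(n)`, since both return an integer from 0 up to but not including n. The variable names are made up for this example.

```python
import random

allowed_numbers = [3, 5, 8, 10]

# Pick a random index in the range 0..3, then look up the number at that index.
index = random.randrange(len(allowed_numbers))
number = allowed_numbers[index]
```

Using the array's length instead of a hard-coded 4 keeps the code correct if the list of allowed numbers changes.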
I am attempting to create a stacked column chart with an unequal number of "sub-groups".
For example, given the following data:
Category#1 : [SubCategory1: 2, SubCategory2: 4, SubCategory3: 3],
Category#2 : [SubCategory4: 5, SubCategory5: 3],
Category#3 : [SubCategory6: 4, SubCategory7: 3, SubCategory8: 3, SubCategory9: 5]
...
I want to create a column chart where the first column is comprised of three stacked segments and has a total height of 9,
the second column has a stack of two segments with total height of 8,
and the third column has four segments with a total height of 15.
After having worked for a little while with the HighCharts API and generally getting good results, I believe what I want to accomplish is probably doable and I am likely just missing some combination of options or structuring my data incorrectly. Does anyone know what I need to do in order to create such a chart?
Two of the ways you can solve this are:
Giving each point in your series a specific x index that relates to the category index.
Example of a series (JSFiddle):
series: [{
    name: 'John',
    data: [{x: 0, y: 5}, {x: 3, y: 7}, {x: 4, y: 2}]
}]
Here we skip the 2nd and 3rd category (index 1 and 2), so they will not have a value.
Using null values in your series to skip having it appear in a category.
Example of a series (JSFiddle):
series: [{
    name: 'John',
    data: [5, null, null, 7, 2]
}]
This series also skips the 2nd and 3rd category, like the one above.
Your choice of solution may depend on how many null values you would end up with. If it is only a few, then null values might be the most lightweight solution. If it is a lot, then using Point objects with x values may be more suitable and cleaner.
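Applied to the data from the question, the x-index approach could look roughly like the sketch below. It is written in Python and only builds the options structure as a dictionary that could be serialized to JSON and passed to the chart. The helper `build_series` is hypothetical, and the use of one series per sub-category with `stacking` set to `'normal'` is an assumption about how the chart would be configured, not something stated in the answer.

```python
import json

categories = {
    "Category#1": {"SubCategory1": 2, "SubCategory2": 4, "SubCategory3": 3},
    "Category#2": {"SubCategory4": 5, "SubCategory5": 3},
    "Category#3": {"SubCategory6": 4, "SubCategory7": 3, "SubCategory8": 3, "SubCategory9": 5},
}


def build_series(categories):
    """Create one series per sub-category, with a single point at its category's x index."""
    series = []
    for x, subcategories in enumerate(categories.values()):
        for name, value in subcategories.items():
            series.append({"name": name, "data": [{"x": x, "y": value}]})
    return series


options = {
    "chart": {"type": "column"},
    "xAxis": {"categories": list(categories)},
    "plotOptions": {"column": {"stacking": "normal"}},
    "series": build_series(categories),
}

options_json = json.dumps(options, indent=2)
```

Each sub-category only has a point in its own column, so the three columns stack 3, 2, and 4 segments respectively without any null placeholders.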