I have a 2D Mat that is indexed by (x, y) and holds a z value as its data.
I would like to create a Mat indexed by (x, z)
that holds the y values as its data.
Yes, I understand I would have to limit the range and datatype of z.
I can do it element by element. Is there something faster?
Sample

Array in, (X,Y) = Z:
Y
3 | 3 1 3
2 | 1 3 4
1 | 2 2 1
  +------- X
    1 2 3

Array out, (X,Z) = Y:
Z
4 | 0 0 2
3 | 3 2 3
2 | 1 1 0
1 | 2 3 1
  +------- X
    1 2 3
The resulting matrix will actually be very sparse.
The original will be roughly 100x20 (x100,y20).
The result will be probably 100x1000 with the 20 y values
in each x column. So that is pretty sparse!
I don't know if that matters in the selection of available tools.
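One possible way to avoid the element-by-element loop is a single indexed assignment. The sketch below is only an illustration in Python, assuming the Mat is available as a NumPy array (as it is in OpenCV's Python bindings), that z values are integers from 1 to z_max, and that 0 can mark an empty cell. The function name invert_xy_to_xz is hypothetical.

```python
import numpy as np

def invert_xy_to_xz(src, z_max):
    """Scatter a (y, x) -> z array into a (z, x) -> y array.

    src[j, i] holds the z value (1..z_max) at x = i + 1, y = j + 1.
    The result has out[z - 1, i] = y, with 0 marking empty cells.
    """
    n_y, n_x = src.shape
    out = np.zeros((z_max, n_x), dtype=np.int32)
    ys, xs = np.indices((n_y, n_x))   # row (y) and column (x) index grids
    out[src - 1, xs] = ys + 1         # one vectorized scatter, no Python loop
    return out

# Sample from the question, stored with row 0 = Y 1.
src = np.array([[2, 2, 1],
                [1, 3, 4],
                [3, 1, 3]])
out = invert_xy_to_xz(src, z_max=4)
print(out[::-1])   # flip so Z = 4 is printed on top, as in the question
```

If one x column contains the same z value twice, only one of the y values survives, so that case would need a separate rule. For a 100x1000 result a dense array is still small, so the sparsity probably does not force a special container.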
I'm working with a golf data set and I'm looking for a way to filter holes based on the result of a previous hole. In the end, I want this range to be able to get the average score of the golfer following a bogey or worse.
I've made a few attempts with FILTER(), OFFSET(), and even INDIRECT(), but I can't figure out how to properly use values from a different row as the condition for my filter.
=FILTER(A2:D10, OFFSET(D2:D10, -1, 0) >= 1, ROW(D2:D10) <> 2) (errors with "FILTER has mismatched range sizes.")
=INDIRECT("D"&FILTER(ROW(A2:D10)+1, D2:D10 >= 1, ROW(D2:D10) <> 2)) (only returns the first value)
Sample Data:
    | A     B    C      D
----+----------------------------
  1 | Hole  Par  Score  ScoreDiff
  2 | 1     4    5      1
  3 | 2     4    4      0
  4 | 3     4    3      -1
  5 | 4     5    6      1
  6 | 5     3    3      0
  7 | 6     5    6      1
  8 | 7     3    4      1
  9 | 8     4    5      1
 10 | 9     4    4      0
Desired outcome: only the holes directly following a bogey or worse (where ScoreDiff >= 1)
    | A     B    C      D
----+----------------------------
  1 | 2     4    4      0
  2 | 5     3    3      0
  3 | 7     3    4      1
  4 | 8     4    5      1
  5 | 9     4    4      0
Simpler option:
=FILTER(A3:D11,D2:D10>=1)
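Try prepending a blank cell to the condition column, so that each row is tested against the ScoreDiff of the row above it: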
=FILTER(A2:D10, {""; D2:D9} >= 1, ROW(D2:D10) <> 2)
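Both formulas rely on the same idea: test each row against the ScoreDiff of the row before it. As a rough illustration outside of Sheets, the same shift can be sketched in Python by pairing every row with its predecessor. The tuple layout and variable names here are assumptions made for the example.

```python
# Rows from the sample sheet: (hole, par, score, score_diff).
rows = [
    (1, 4, 5, 1), (2, 4, 4, 0), (3, 4, 3, -1),
    (4, 5, 6, 1), (5, 3, 3, 0), (6, 5, 6, 1),
    (7, 3, 4, 1), (8, 4, 5, 1), (9, 4, 4, 0),
]

# Pair each row with the one before it and keep the later row
# when the earlier row's score_diff is a bogey or worse (>= 1).
after_bogey = [cur for prev, cur in zip(rows, rows[1:]) if prev[3] >= 1]

holes = [r[0] for r in after_bogey]
average_score = sum(r[2] for r in after_bogey) / len(after_bogey)
print(holes)          # [2, 5, 7, 8, 9]
print(average_score)  # 4.0
```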
I have been given matrices filled with alphanumerical values excluding lower case letters like so:
XX11X1X
XX88X8X
Y000YYY
ZZZZ789
ABABABC
and have been tasked with counting the repetitions in each row and then tallying up a score depending on the ranking of the character being repeated. I used {⍺ (≢⍵)}⌸¨ ↓ m to help me. For the example above I would get something like this:
 X 4   X 4   Y 4   Z 4   A 3
 1 3   8 3   0 3   7 1   B 3
                   8 1   C 1
                   9 1
This is great, but now I need to write a function that multiplies the numbers with each letter. I can access the first matrix with ⊃ but then I am completely lost on how to access the other ones. I can simply write ⊃w[2] and ⊃w[3] and so forth, but I need a way to change every matrix at the same time in one function. For this example, the ranking array is as follows: ZYXWVUTSRQPONMLKJIHGFEDCBA9876543210, so for the first array XX11X1X
which corresponds to:
X 4
1 3
So the X is 3rd in the array so it corresponds to a 3 and 1 is 35th so it's a 35. The final scoring would be something like (3×10⁴)+(35×10³). My biggest problem is not necessarily the scoring part but being able to access each matrix individually in one function. So for this nested array:
 X 4   X 4   Y 4   Z 4   A 3
 1 3   8 3   0 3   7 1   B 3
                   8 1   C 1
                   9 1
if I do arr[1] it gives me the scalar
X 4
1 3
and ⍴ arr[1] gives me nothing, confirming it, so I can do ⊃arr[1] to get the matrix itself and have access to each column individually. This is where I'm stuck. I'm trying to write a function that does the math for each matrix and then saves those results to an array. I can easily do the math for the first matrix but I can't do it for all of them. I might have made a mistake by using {⍺ (≢⍵)}⌸¨ ↓ m to get those matrices. Thanks.
Using your example arrangement:
⎕ ← arranged ← ⌽ ⎕D , ⎕A
ZYXWVUTSRQPONMLKJIHGFEDCBA9876543210
So now, we can get the index values:
1 ⌷ m
XX11X1X
∪ 1 ⌷ m
X1
arranged ⍳ ∪ 1 ⌷ m
3 35
While you could compute the intermediary step first, it is much simpler to include most of the final formula in Key's operand:
{ ( arranged ⍳ ⍺ ) × 10 * ≢⍵ }⌸¨ ↓m
┌───────────┬───────────┬───────────┬─────────────────┬───────────────┐
│30000 35000│30000 28000│20000 36000│10000 290 280 270│26000 25000 240│
└───────────┴───────────┴───────────┴─────────────────┴───────────────┘
Now we just need to sum each:
+/¨ { ( arranged ⍳ ⍺ ) × 10 * ≢⍵ }⌸¨ ↓m
65000 58000 56000 10840 51240
In fact, we can combine the summation with the application of Key to avoid a double loop:
{ +/ { ( arranged ⍳ ⍺ ) × 10 * ≢⍵ }⌸ ⍵}¨ ↓m
65000 58000 56000 10840 51240
For completeness, here is a way to use the intermediary result. Let's start by working on just the first matrix (you can get the second one with 2⊃ instead of ⊃ ― for details, see Problems when trying to use arrays in APL. What have I missed?):
⊃{⍺ (≢⍵)}⌸¨ ↓m
X 4
1 3
We can insert a function between the left column elements and the right column elements with reduction:
{⍺ 'foo' ⍵}/ ⊃{⍺ (≢⍵)}⌸¨ ↓m
┌─────────┬─────────┐
│┌─┬───┬─┐│┌─┬───┬─┐│
││X│foo│4│││1│foo│3││
│└─┴───┴─┘│└─┴───┴─┘│
└─────────┴─────────┘
So now we simply have to modify the placeholder function with one that looks up the left argument in the arranged items, and multiplies by ten to the power of the right argument:
{ ( arranged ⍳ ⍺ ) × 10 * ⍵ }/ ⊃{⍺ (≢⍵)}⌸¨ ↓m
30000 35000
Instead of applying this to only the first matrix, we apply it to each matrix:
{ ( arranged ⍳ ⍺ ) × 10 * ⍵ }/¨ {⍺ (≢⍵)}⌸¨ ↓m
┌───────────┬───────────┬───────────┬─────────────────┬───────────────┐
│30000 35000│30000 28000│20000 36000│10000 290 280 270│26000 25000 240│
└───────────┴───────────┴───────────┴─────────────────┴───────────────┘
Now we just need to sum each:
+/¨ { ( arranged ⍳ ⍺ ) × 10 * ⍵ }/¨ {⍺ (≢⍵)}⌸¨ ↓m
65000 58000 56000 10840 51240
However, this is a much more circuitous approach, and is only provided here for reference.
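For readers who want to cross-check the numbers outside APL, the same scoring rule (rank of the character times ten to the power of its count, summed per row) can be sketched in Python. This is only an illustrative translation, not part of the APL solution. The names ARRANGED and row_score are made up for the example.

```python
from collections import Counter
import string

# Ranking from the question: Z is 1st, ..., A is 26th, 9 is 27th, ..., 0 is 36th.
ARRANGED = (string.digits + string.ascii_uppercase)[::-1]

def row_score(row):
    """Sum rank * 10**count over the distinct characters of one row."""
    counts = Counter(row)
    return sum((ARRANGED.index(ch) + 1) * 10 ** n for ch, n in counts.items())

m = ["XX11X1X", "XX88X8X", "Y000YYY", "ZZZZ789", "ABABABC"]
scores = [row_score(row) for row in m]
print(scores)  # [65000, 58000, 56000, 10840, 51240]
```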
I have a filter/kernel like
          | 1 1 1 |
H = 1/m * | 1 n 1 |
          | 1 1 1 |
I want to know what the relationship between m and n is in this filter, and how this relationship
affects the image under convolution.
There doesn't have to be any relationship between n and m, but if you want the convolution to be normalized, you need the sum of the kernel to be 1. In that case
m = 8 + n
The wiki page on kernels also explains that
Normalization ensures that the pixel values in the output image are of
the same relative magnitude as those in the input image.
Otherwise if m < 8 + n they will be brighter, or if m > 8 + n they will be dimmer.
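The brightness statement can be checked by summing the kernel weights, since on a flat image region the output equals the input multiplied by that sum. The following is a small sketch using exact fractions. The function names kernel and gain are hypothetical, and the sample values of n and m are chosen only for illustration.

```python
from fractions import Fraction

def kernel(n, m):
    """3x3 kernel from the question: ones around a centre n, scaled by 1/m."""
    one = Fraction(1, m)
    return [[one, one, one],
            [one, Fraction(n, m), one],
            [one, one, one]]

def gain(n, m):
    """Sum of the kernel weights, i.e. the factor applied to a flat region."""
    return sum(sum(row) for row in kernel(n, m))

print(gain(n=1, m=9))    # 1    (m = 8 + n, normalized box blur)
print(gain(n=4, m=8))    # 3/2  (m < 8 + n, output brighter)
print(gain(n=4, m=16))   # 3/4  (m > 8 + n, output dimmer)
```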
NOTE
As pointed out by BЈовић, changing n changes the action of the filter significantly (see comments on this question).
Given a complete dense graph (over 250,000 nodes), what is the quickest way to determine the number of k-length paths from node A to B?
I understand this is an old post, but I had the exact same question and could not find the answer.
I like to think of this problem as a "permutation without repetition", as the order of the nodes visited matters (permutation) and we aren't backtracking (no repetitions). The number of permutations without repetition is: n!/(n-r)!
For a complete graph with N nodes, there are N - 2 remaining nodes to choose from when creating a path between a given A and B. To create a path of length K, K - 1 nodes must be chosen from the remaining nodes after A and B are excluded. Therefore, in this context, n = N - 2 and r = K - 1.
Plugging into the above formula yields:
(N-2)!/(N-K-1)!
Example: for N = 5, with nodes 0,1,2,3,4 the following paths are possible from 0 to 1:
0 1
0 2 1
0 2 3 1
0 2 3 4 1
0 2 4 1
0 2 4 3 1
0 3 1
0 3 2 1
0 3 2 4 1
0 3 4 1
0 3 4 2 1
0 4 1
0 4 2 1
0 4 2 3 1
0 4 3 1
0 4 3 2 1
This yields 1 path of length 1, 3 paths of length 2, 6 paths of length 3, and 6 paths of length 4.
This appears to work for any N>=2 and K<=N-1.
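The formula can be checked against a direct enumeration on small graphs. The sketch below assumes Python 3.8 or later for math.perm. The function names and the choice of A = 0 and B = 1 are only for illustration.

```python
from math import perm

def count_paths(n_nodes, k):
    """Simple paths with k edges between two fixed nodes of a complete graph."""
    return perm(n_nodes - 2, k - 1)   # (N-2)! / (N-K-1)!

def brute_force(n_nodes, k, a=0, b=1):
    """Count the same paths by depth-first search (small graphs only)."""
    def extend(node, visited, edges_left):
        if edges_left == 0:
            return 1 if node == b else 0
        if node == b:                 # reached b too early, path cannot continue
            return 0
        return sum(extend(nxt, visited | {nxt}, edges_left - 1)
                   for nxt in range(n_nodes) if nxt not in visited)
    return extend(a, {a}, k)

for k in range(1, 5):
    print(k, count_paths(5, k), brute_force(5, k))
```

For N = 5 both columns give 1, 3, 6 and 6, matching the listing above.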
You can use dynamic programming: for each node Y and path length k, you can compute the number of paths from A to Y of length k if you know the number of paths from A to X of length k-1 for all nodes X. Note that paths counted this way may revisit nodes. In a complete graph the sum over all X is the total for length k-1 minus the count for Y itself, so the total complexity is O(KV), where K is the path length you are computing for and V is the number of vertices.
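A minimal sketch of this dynamic programme for a complete graph follows. It counts walks, so a node can appear more than once, which is why its results differ from the factorial formula in the other answer. The function name count_walks and the node numbering are assumptions for the example.

```python
def count_walks(n_nodes, k, a=0, b=1):
    """Walks with k edges from a to b in a complete graph; nodes may repeat."""
    counts = [0] * n_nodes
    counts[a] = 1                     # one walk of length 0: stay at a
    for _ in range(k):
        total = sum(counts)
        # Every node is adjacent to all others, so the walks reaching y
        # are all walks of the previous length except those ending at y.
        counts = [total - c for c in counts]
    return counts[b]

print([count_walks(5, k) for k in range(1, 5)])  # [1, 3, 13, 51]
```

Each of the K rounds does O(V) work, which is where the O(KV) bound comes from.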
I have two classes:
x={-3,-2,1} //represented by *
y={0,5,6,7} //represented by x
If k=3, how do you determine the decision boundary?
          *   *       x   *               x   x   x
  |   |   |   |   |   |   |   |   |   |   |   |   |
 -5  -4  -3  -2  -1   0   1   2   3   4   5   6   7
Supposedly the correct answer is 1.5, between 1 and 2. How does that work?
The KNN algorithm classifies new observations by looking at the K nearest neighbors, looking at their labels, and assigning the majority (most popular) label to the new observation.
For KNN with K=3, anything < 1.5 will be classified as * and anything > 1.5 will be classified as x.
You can see this by trying out a few examples. Suppose you need to classify a value of 1. The three nearest neighbors are the * at 1, the x at 0, and the * at -2. Since there are two *'s and one x, 1 will be classified as *.
Now suppose you want to classify 2. Here, the three nearest neighbors are the x at 0, the * at 1, and the x at 5. So 2 would get classified as x.
The KNN process implicitly defines a decision boundary. The best way to determine it that I'm aware of is to try a bunch of examples and look for the transition boundary where observation classifications change from one class to another class. In your example this would look like this:
-5 -> *
-4 -> *
-3 -> *
-2 -> *
-1 -> *
0 -> *
1 -> *
2 -> x
3 -> x
4 -> x
You can see this in your example - the decision boundary is somewhere between 1 and 2. Hence the 1.5 answer.
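The trial-and-error procedure described above can be sketched in a few lines of Python. This is an illustrative brute-force scan, not an optimized KNN implementation. The names POINTS and classify and the step size of 0.01 are arbitrary choices for the example.

```python
from collections import Counter

# Training points from the question: (position, label).
POINTS = [(-3, "*"), (-2, "*"), (1, "*"), (0, "x"), (5, "x"), (6, "x"), (7, "x")]

def classify(value, k=3):
    """Majority label among the k training points closest to value."""
    nearest = sorted(POINTS, key=lambda p: abs(p[0] - value))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

for value in range(-5, 5):
    print(value, "->", classify(value))

# Scan in small steps to locate where the predicted label flips.
steps = [i / 100 for i in range(100, 201)]   # 1.00, 1.01, ..., 2.00
flip = next(v for v in steps if classify(v) == "x")
print("first x at", flip)
```

The flip happens where the third-nearest neighbour changes from the * at -2 to the x at 5, which is the point equidistant from both, (-2 + 5) / 2 = 1.5. Exactly at 1.5 the two are tied, so the label there depends on how ties are broken.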