Expanding arrays of intervals in Arrayfire - vectorization

I have three Arrayfire arrays that look like this:
Array 1   Array 2   Array 3
20        5         9
3         0         0
9         4         8
0         20        22
...       ...       ...
Using Arrayfire, I would like to generate 2 new arrays. The first should contain values from Array 1. Each value should be repeated a number of times dictated by the interval between the corresponding values in Array 2 (inclusive) and Array 3 (exclusive). The second array should contain an expansion of the values within each interval for each value from Array 1. Sorry if that's not clear. Here's the desired output to hopefully clarify:
Array 1   Array 2
20        5
20        6
20        7
20        8
9         4
9         5
9         6
9         7
0         20
0         21
...       ...
The order of the output doesn't matter.
Thanks, in advance, from an Arrayfire novice.
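The expansion described above can be sketched as a prefix sum followed by an independent lookup per output slot, which is the usual shape of a data-parallel solution. This is plain Python, not ArrayFire code; the function name `expand_intervals` and its argument names are made up for illustration, and the loop stands in for what would be one vectorized gather.

```python
from bisect import bisect_right
from itertools import accumulate


def expand_intervals(values, starts, stops):
    """Repeat values[i] once per integer in [starts[i], stops[i])."""
    # Length of each interval; empty intervals contribute nothing.
    counts = [max(stop - start, 0) for start, stop in zip(starts, stops)]
    # Inclusive prefix sum: ends[i] is one past the last output slot of row i.
    ends = list(accumulate(counts))
    total = ends[-1] if ends else 0

    out_values, out_points = [], []
    # Each output slot k is computed independently of the others.
    for k in range(total):
        row = bisect_right(ends, k)        # source row that owns slot k
        first = ends[row] - counts[row]    # first output slot of that row
        out_values.append(values[row])
        out_points.append(starts[row] + (k - first))
    return out_values, out_points


vals, points = expand_intervals([20, 3, 9, 0], [5, 0, 4, 20], [9, 0, 8, 22])
print(vals)
print(points)
```

Because every output slot depends only on the prefix sum and its own position, the loop body maps naturally onto a scan plus a gather in an array library.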

Related

Calculate Positional Difference based on row for string values for two tables

Table 1:
Position   Team
1          MCI
2          LIV
3          MAN
4          CHE
5          LEI
6          AST
7          BOU
8          BRI
9          NEW
10         TOT
Table 2:
Position   Team
1          LIV
2          MAN
3          MCI
4          CHE
5          AST
6          LEI
7          BOU
8          TOT
9          BRI
10         NEW
The output I'm looking for is positional difference = 10, which is the total of the individual positional differences. How can I do this in Excel/Google Sheets? The positional difference is always positive, whether a team moves up or down. Think of it as a league table.
Table 2 New (using a formula to find the positional difference):
Position   Team   Positional Difference
1          LIV    1
2          MAN    1
3          MCI    2
4          CHE    0
5          AST    1
6          LEI    1
7          BOU    0
8          TOT    2
9          BRI    1
10         NEW    1
Try this, assuming that Table 1 is in columns A:B and (as the references to D2 and E2 imply) Table 2 is in columns D:E:
=IFNA(ABS(INDEX(A:B,MATCH(E2,B:B,0),1)-D2),"-")
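To check the arithmetic outside a spreadsheet, here is a small Python sketch of the same lookup-and-subtract logic. The names `table1`, `table2`, and `positional_differences` are illustrative only; teams missing from the first table are simply skipped, whereas the formula shows "-" for them.

```python
# Teams listed in finishing order; position is index + 1.
table1 = ["MCI", "LIV", "MAN", "CHE", "LEI", "AST", "BOU", "BRI", "NEW", "TOT"]
table2 = ["LIV", "MAN", "MCI", "CHE", "AST", "LEI", "BOU", "TOT", "BRI", "NEW"]


def positional_differences(before, after):
    """Absolute change in position for each team present in both tables."""
    old_position = {team: pos for pos, team in enumerate(before, start=1)}
    return {
        team: abs(old_position[team] - pos)
        for pos, team in enumerate(after, start=1)
        if team in old_position
    }


diffs = positional_differences(table1, table2)
total = sum(diffs.values())
print(diffs)
print(total)
```

The per-team values match the "Positional Difference" column above, and their sum is the total the question asks for.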

Cycling through a sequence 1-12 at different offsets

In Sheets I would like a fixed sequence of 1-12. I have set it up as =SEQUENCE(3,4), and I would like it to roll and wrap when I change the first number.
Apologies in advance for the formatting. The starting array is 1-12, but when I change the first number to 4, I would like the sequence to run from there and wrap around back to 1.
1 2 3 4
5 6 7 8
9 10 11 12
But if I start at 4 I would like it to read
4 5 6 7
8 9 10 11
12 1 2 3
Say your start number is in A1:
=ArrayFormula(MOD(SEQUENCE(3,4,A1-1,1),12)+1)
This uses MOD to cycle through the sequence.
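The same modular arithmetic can be sketched in Python to see why shifting by one before and after the MOD keeps the values in 1-12 rather than 0-11. The function name `rolled_sequence` and its parameters are illustrative, not part of Sheets.

```python
def rolled_sequence(start, rows=3, cols=4, period=12):
    """Grid of values start, start+1, ... wrapping from `period` back to 1."""
    # Shift to 0-based, wrap with the modulus, then shift back to 1-based.
    flat = [((start - 1 + k) % period) + 1 for k in range(rows * cols)]
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


for row in rolled_sequence(4):
    print(row)
```

With a start of 1 this reproduces the plain 1-12 grid, and with a start of 4 it gives the wrapped grid from the question.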

Debugging APL code: how to use `@` (at) and `⊢` (right tack) together?

I am attempting to read Aaron Hsu's thesis, "A data parallel compiler hosted on the GPU", where I have landed at some APL code I am unable to fix. I've attached a screenshot of the offending page (page 74 as per the thesis numbering at the bottom).
The transcribed code is as follows:
d ← 0 1 2 3 1 2 3 3 4 1 2 3 4 5 6 5 5 6 3 4 5 6 5 5 6 3 4
This makes sense: create an array named d.
⍳≢d
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
This too makes sense. Count the number of elements in d and create a sequence of that length.
⍉↑d,¨⍳≢d
0 1 2 3 1 2 3 3 4 1 2 3 4 5 6 5 5 6 3 4 5 6 5 5 6 3 4
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
This is slightly challenging, but let me break it down:
zip the sequence ⍳≢d = 1..27 with the d array using the ,¨ idiom, which zips the two arrays using a catenation.
Then, split into two rows using ↑ and transpose to get columns using ⍉
Now the biggie:
(⍳≢d)@(d,¨⍳≢d)⊢7 27⍴' '
INDEX ERROR
(⍳≢d)@(d,¨⍳≢d)⊢7 27⍴' '
Attempting to break it down:
⍳≢d counts number of elements in d
(d,¨⍳≢d) creates an array of pairs (d, index of d)
7 27⍴' ' creates a 7 x 27 grid: presumably 7 because that's the max value of d + 1, for indexing reasons.
Now I'm flummoxed about how the use of ⊢ works: as far as I know, it just ignores everything to the left! So I'm missing something about the parsing of this expression.
I presume it is parsed as:
(⍳≢d)@((d,¨⍳≢d)⊢(7 27⍴' '))
which according to me should be evaluated as:
(⍳≢d)@((d,¨⍳≢d)⊢(7 27⍴' '))
= (⍳≢d)@((7 27⍴' ')) [using a⊢b = b]
= not the right thing
As I was writing this down, I managed to fix the bug by sheer luck: if we increment d to be d + 1 so we are 1-indexed, the bug no longer manifests:
d ← d + 1
d
1 2 3 4 2 3 4 4 5 2 3 4 5 6 7 6 6 7 4 5 6 7 6 6 7 4 5
then:
(⍳≢d)@(d,¨⍳≢d)⊢7 27⍴' '
1
2 5 10
3 6 11
4 7 8 12 19 26
9 13 20 27
14 16 17 21 23 24
15 18 22 25
However, I still don't understand how this works! I presume the context will be useful for others attempting to read the thesis, so I'm going to leave the rest of it up.
Please explain what (⍳≢d)@(d,¨⍳≢d)⊢7 27⍴' ' does!
I've attached the raw screenshot to make sure I didn't miss something:
I'm happy to see that you found the off-by-one error. It stems from Aaron Hsu working with index origin 0. If you set ⎕IO←0 then his code will work.
Some dyadic operators can take an array operand, giving the sequence OPERATOR operand argument, e.g. in -@(1 2 3)(4 5 6 7). This poses a problem because both the operand and the argument are arrays, and juxtaposition of arrays forms a new array with those arrays as elements by a process known as stranding. Compare:
(1 2 3)(4 5 6 7)
┌─────┬───────┐
│1 2 3│4 5 6 7│
└─────┴───────┘
However, in the case of the operator with its array operand, we want to "break" this strand so the left part can act as operand while the right part acts as argument. One way to break up the strand is to apply a function to the argument, giving the sequence OPERATOR operand Function argument. Now, we don't actually need any transformation of the argument, so an identity function will do: -@(1 2 3)⊢(4 5 6 7).
As for what (⍳≢d)@(d,¨⍳≢d)⊢7 27⍴' ' actually does:
7 27⍴' ' creates a blank matrix.
(⍳≢d) are indices to insert into specified slots in the matrix.
@(d,¨⍳≢d) indicates at which locations in the matrix the above should replace the existing values.
⊢ serves solely to separate (d,¨⍳≢d) from 7 27⍴' '. The code could also have been written as ((⍳≢d)@(d,¨⍳≢d))7 27⍴' ' with parentheses serving to "bind" the operand to the operator.
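For readers who do not have an APL interpreter to hand, the effect of the expression can be sketched in Python with 0-based indexing, which corresponds to ⎕IO←0. The function name `place_indices` is made up for this illustration; it is a loose analogue of the APL, not a translation of the thesis code.

```python
d = [0, 1, 2, 3, 1, 2, 3, 3, 4, 1, 2, 3, 4, 5, 6,
     5, 5, 6, 3, 4, 5, 6, 5, 5, 6, 3, 4]


def place_indices(depths):
    """Blank grid with index i written at row depths[i], column i."""
    n = len(depths)
    rows = max(depths) + 1                    # 7 rows for depths 0..6
    grid = [[" "] * n for _ in range(rows)]   # counterpart of 7 27⍴' '
    for i, depth in enumerate(depths):        # pairs (depth, i), like d,¨⍳≢d
        grid[depth][i] = i                    # value i replaces the blank
    return grid


grid = place_indices(d)
for row in grid:
    print(" ".join(f"{cell:>2}" for cell in row))
```

Each row of the printed grid lists the node indices that sit at that depth, which is the same picture as the APL output above, shifted down by one because of the index origin.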

why the result of method mostSimilarItems in mahout is not order by the weight?

I have the following code:
ItemSimilarity itemSimilarity = new UncenteredCosineSimilarity(dataModel);
recommender = new GenericItemBasedRecommender(dataModel,itemSimilarity);
List<RecommendedItem> items = recommender.mostSimilarItems(10, 5);
My data model looks like this:
userid   itemid   score
1        6        5
1        10       3
1        11       5
1        12       4
1        13       5
2        2        3
2        6        5
2        10       3
2        12       5
When I run the code above, the result looks like this:
13
6
11
2
12
I debugged the code and found that the items returned by recommender.mostSimilarItems(10, 5) all have the same score, namely 1!
So I have a problem. In my opinion, mostSimilarItems should consider the item co-occurrence matrix:
      2   6  10  11  12  13
 2    0   1   1   0   1   0
 6    1   0   2   1   2   1
10    1   2   0   1   2   1
11    0   1   1   0   1   1
12    1   2   2   1   0   1
13    0   1   1   1   1   0
Going by the matrix above, the items most similar to item 10 should be [6,12,11,13,2], because items 6 and 12 co-occur with item 10 more often than the other items do, shouldn't they?
Can anyone explain this to me? Thanks!
In your matrix you have much more data than in your input. In particular you seem to be imputing 0 values that are not in the data. That is why you are likely getting answers different from what you expect.
Mahout expects your IDs to be contiguous Integers starting from 0. This is true of your row and column ids. Your matrix looks like it has missing ids. Just having Integers is not enough.
Could this be the problem? Not sure what Mahout would do with the input above.
I always keep a dictionary to map Mahout IDs to/from my own.
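The dictionary approach mentioned above can be sketched as a two-way mapping between your own IDs and dense, zero-based integers. This is a generic illustration in Python, not Mahout code; the class name `IdMapper` and its methods are hypothetical.

```python
class IdMapper:
    """Two-way mapping between arbitrary IDs and contiguous ints from 0."""

    def __init__(self):
        self._to_dense = {}      # original ID -> dense ID
        self._to_original = []   # dense ID -> original ID

    def to_dense(self, original_id):
        # Assign the next free dense ID the first time an ID is seen.
        if original_id not in self._to_dense:
            self._to_dense[original_id] = len(self._to_original)
            self._to_original.append(original_id)
        return self._to_dense[original_id]

    def to_original(self, dense_id):
        return self._to_original[dense_id]


items = IdMapper()
item_ids = [6, 10, 11, 12, 13, 2, 6, 10, 12]   # item column from the question
dense_ids = [items.to_dense(i) for i in item_ids]
print(dense_ids)
print([items.to_original(i) for i in dense_ids])
```

Results that come back as dense IDs can be translated to the original item IDs with `to_original` before they are shown to users.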

Using COUNTIFS on 3 different columns and then need to SUM a 4th column?

I have written the formula below. I do not know how to change it so that it adds up the numbers I have in column AB (AB2:AB552). As it is, the formula counts the number of cells in that range that have numbers in them, but I need it to total those numbers as my final result. Any help would be great.
=COUNTIFS(Cases!B2:B552,"1",Cases!G2:G552,"c*",Cases!X2:X552,"No",Cases!AB2:AB552,">0")
Assuming you don't actually need the intermediate counts, the SUMIFS function should give you the final result:
=SUMIFS(Cases!AB2:AB552,Cases!B2:B552,1,Cases!G2:G552,"c",Cases!X2:X552,"No",Cases!AB2:AB552,">0")
Testing this with some limited data:
Row   B   G   X     AB
2     2   a   No    10
3     1   c   No    24
4     2   c   No    4
5     1   c   No    0
6     1   a   Yes   9
7     2   c   No    12
8     2   c   No    6
9     2   b   No    0
10    1   b   No    0
11    1   a   No    10
12    2   c   No    6
13    1   c   No    20
14    1   c   No    4
15    1   b   Yes   22
16    1   b   Yes   22
The formula above returned 48, the sum of AB3, AB13, and AB14, which were the only rows matching all 4 criteria.
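The filter-then-sum logic can be reproduced in Python on the same test rows to confirm the total. The names `rows` and `sum_matching` are illustrative, and the comparison on column G is an exact, case-sensitive match, which is stricter than the spreadsheet's text matching.

```python
# (row number, B, G, X, AB) copied from the test data above.
rows = [
    (2, 2, "a", "No", 10), (3, 1, "c", "No", 24), (4, 2, "c", "No", 4),
    (5, 1, "c", "No", 0), (6, 1, "a", "Yes", 9), (7, 2, "c", "No", 12),
    (8, 2, "c", "No", 6), (9, 2, "b", "No", 0), (10, 1, "b", "No", 0),
    (11, 1, "a", "No", 10), (12, 2, "c", "No", 6), (13, 1, "c", "No", 20),
    (14, 1, "c", "No", 4), (15, 1, "b", "Yes", 22), (16, 1, "b", "Yes", 22),
]


def sum_matching(data):
    """Sum AB over rows where B = 1, G = 'c', X = 'No' and AB > 0."""
    return sum(
        ab for _, b, g, x, ab in data
        if b == 1 and g == "c" and x == "No" and ab > 0
    )


matching_rows = [r for r, b, g, x, ab in rows
                 if b == 1 and g == "c" and x == "No" and ab > 0]
print(matching_rows)
print(sum_matching(rows))
```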
