I am trying to implement a weighted sampler for a very imbalanced data set. There are 182 different classes. Here is an array of the bin counts per class:
array([69487, 5770, 5753, 138, 4308, 10, 1161, 29, 5611,
350, 7, 183, 218, 4, 3, 3872, 5, 950,
33, 3, 443, 16, 20, 330, 4353, 186, 19,
122, 546, 6, 44, 6, 3561, 2186, 3, 48,
8440, 338, 9, 610, 74, 236, 160, 449, 72,
6, 37, 1729, 2255, 1392, 12, 1, 3426, 513,
44, 3, 28, 12, 9, 27, 5, 75, 15,
3, 21, 549, 7, 25, 871, 240, 128, 28,
253, 62, 55, 12, 8, 57, 16, 99, 6,
5, 150, 7, 110, 8, 2, 1296, 70, 1927,
470, 1, 1, 511, 2, 620, 946, 36, 19,
21, 39, 6, 101, 15, 7, 1, 90, 29,
40, 14, 1, 4, 330, 1099, 1248, 1146, 7414,
934, 156, 80, 755, 3, 6, 6, 9, 21,
70, 219, 3, 3, 15, 15, 12, 69, 21,
15, 3, 101, 9, 9, 11, 6, 32, 6,
32, 4422, 16282, 12408, 2959, 3352, 146, 1329, 1300,
3795, 90, 1109, 120, 48, 23, 9, 1, 6,
2, 1, 11, 5, 27, 3, 7, 1, 3,
70, 1598, 254, 90, 20, 120, 380, 230, 180,
10, 10])
Some classes have as few as 1 instance. I am trying to use torch's WeightedRandomSampler for this dataset. However, because the class imbalance is so large, when I calculate the weights using
count_occr = np.bincount(dataset.y)
lbl_weights = 1. / count_occr
weights = np.array(lbl_weights)
weights = torch.from_numpy(weights)
sampler = WeightedRandomSampler(weights.type('torch.DoubleTensor'), len(weights*2))
I get two error messages:
RuntimeWarning: divide by zero encountered in true_divide
and
RuntimeError: invalid multinomial distribution (encountering probability entry = infinity or NaN)
Does anyone have a workaround for this? I was considering multiplying lbl_weights by some scalar, but I am not sure if this is a viable option.
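One possible workaround, sketched here in plain Python rather than NumPy/torch so it runs on its own: np.bincount returns a count of 0 for any label value that never occurs, and 1 / 0 is what produces the infinity. Counting only the labels that are actually present avoids that. Note also that WeightedRandomSampler expects one weight per sample, not one per class. The helper name per_sample_weights and the toy labels are made up for illustration.

```python
import random
from collections import Counter


def per_sample_weights(labels):
    """Return one weight per sample: 1 / (number of samples in that sample's class)."""
    # Counter only holds classes that actually occur, so no count is ever zero.
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


# Toy labels (an assumption, not the real dataset): class 2 never occurs.
labels = [0] * 90 + [1] * 9 + [3] * 1
weights = per_sample_weights(labels)

# The weights of each class sum to 1, so every class that is present
# is drawn equally often in expectation.
random.seed(0)
drawn = random.choices(range(len(labels)), weights=weights, k=3000)
drawn_per_class = Counter(labels[i] for i in drawn)
print(drawn_per_class)
```

With torch, the same list could presumably be passed as WeightedRandomSampler(weights, num_samples=len(weights), replacement=True). Multiplying the weights by a scalar should not change anything, since only their relative sizes matter.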
I'm migrating from MySQL to PostgreSQL, but I'm getting the following error:
PG::TooManyArguments: ERROR: cannot pass more than 100 arguments to a function
when running queries like this:
Project.where(id: ids)
Which is translated to
"SELECT \"projects\".* FROM \"projects\" WHERE \"projects\".\"id\" IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100) ORDER BY FIELD(projects.id, '1','2','3','4','5','6','7','8','9','10','11','12','13','14','15','16','17','18','19','20','21','22','23','24','25','26','27','28','29','30','31','32','33','34','35','36','37','38','39','40','41','42','43','44','45','46','47','48','49','50','51','52','53','54','55','56','57','58','59','60','61','62','63','64','65','66','67','68','69','70','71','72','73','74','75','76','77','78','79','80','81','82','83','84','85','86','87','88','89','90','91','92','93','94','95','96','97','98','99','100')"
For me it's a common use case to query by specific IDs and it worked pretty well with MySQL. Is there any way to make this work with PostgreSQL?
I'm using PostgreSQL 13.2 in a Docker container.
According to the error you get, the cause is the function, not the query itself. You can pass 32K arguments to the query and it will work (2-byte int limit). Functions, however, have a limit of 100 arguments by default in Postgres (set at compile time). You could compile from source and set that number to a higher value (I don't recommend doing that unless you really understand the consequences).
The best approach would probably be to look into how to replace the FIELD() function that is executed, and modify it so that you don't run into the problem. Can you change your system so that you can use a column in the DB to sort by? That way you don't need to pass those IDs for sorting. Or, if you have to use IDs, what about using CASE for sorting, like in this SO question: Simulating MySQL's ORDER BY FIELD() in Postgresql
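To illustrate the CASE idea, here is a rough Python sketch that builds such an ORDER BY clause from a list of IDs. The function name order_by_case is hypothetical, and the sketch assumes the column name is a trusted identifier and the list is non-empty (an empty list would produce invalid SQL).

```python
def order_by_case(column, ids):
    """Build an ORDER BY clause that sorts rows in the order given by ids."""
    # int() guards against anything non-numeric being interpolated into the SQL.
    whens = " ".join(f"WHEN {int(v)} THEN {pos}" for pos, v in enumerate(ids))
    return f"ORDER BY CASE {column} {whens} END"


clause = order_by_case("projects.id", [3, 1, 2])
print(clause)
# ORDER BY CASE projects.id WHEN 3 THEN 0 WHEN 1 THEN 1 WHEN 2 THEN 2 END
```

Because CASE is an expression and not a function call, it should not run into the 100-argument limit.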
The only "fix" I could find was downgrading the PostgreSQL Docker image to 11.11, where this error does not happen.
I need to create an array of the numbers 1 to n with a single line of code in Ruby.
I have tried it using a while loop, but I'm sure there is a simpler way of doing this in Ruby.
a = []
b = 1
while b < 100 do
a << b
b += 1
end
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
Convert a range into an array.
(1..n).to_a
Another way:
You can just splat a range:
[*1..n]
Example:
[*1..10]
=>[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Or
a = Array(1..10)
p a # => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
I am trying to write a function to match ORB features. I am not using the default matchers (BFMatcher, FLANN matcher) because I just want to match specific features in one image with features in another image.
I saw that the ORB descriptor is a binary array.
My question is how to match 2 features, i.e. how to find the Hamming distance between 2 descriptors?
ORB descriptors:
descriptor1 =[34, 200, 96, 158, 75, 208, 158, 230, 151, 85, 192, 131, 40, 142, 54, 64, 75, 251, 147, 195, 78, 11, 62, 245, 49, 32, 154, 59, 21, 28, 52, 222]
descriptor2 =[128, 129, 2, 129, 196, 2, 168, 101, 60, 35, 83, 18, 12, 10, 104, 73, 122, 13, 2, 176, 114, 188, 1, 198, 12, 0, 154, 68, 5, 8, 177, 128]
Thanks.
ORB descriptors are just 32-byte uchar Mats.
The brute-force and FLANN matchers do some more work than just comparing descriptors, but if that's all you want for now, it would be a straight norm:
Mat descriptor1, descriptor2;
double dist = norm( descriptor1, descriptor2, NORM_HAMMING);
// NORM_HAMMING2 or even NORM_L1 would make sense, too.
// dist is a double, but ofc. you'd only get integer values in this case.
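For reference, what the Hamming norm computes can be sketched in a few lines of plain Python: XOR the descriptors byte by byte and count the bits that differ. This is only an illustration of the idea, not how OpenCV implements it; the function name hamming_distance is made up here.

```python
def hamming_distance(d1, d2):
    """Count the differing bits between two equal-length byte sequences."""
    if len(d1) != len(d2):
        raise ValueError("descriptors must have the same length")
    # a ^ b has a 1 bit exactly where the two bytes differ.
    return sum(bin(a ^ b).count("1") for a, b in zip(d1, d2))


# First two bytes of descriptor1 and descriptor2 from the question.
print(hamming_distance([34, 200], [128, 129]))  # 6
```

The same function works unchanged on the full 32-byte lists from the question.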
Posting the code loses the formatting that is causing the issue; copying what I put in the post will actually do what it should. Too bad that isn't an option when using the script, so I uploaded the script file here with an example of the text built in that is causing the issue. I will still try to convey what the issue is.
I am pulling text from mail.app. The emails I am parsing have within them a list of dates (amongst other things):
5/27/2012
5/28/2012
5/29/2012
5/30/2012
5/31/2012
6/1/2012
6/3/2012
6/4/2012
6/5/2012
6/6/2012
Now I'm trying to get the dates into a list. No problem I thought...
The following did NOT work:
Using paragraphs did NOT work; it returned the entire thing as one paragraph.
set AppleScript's text item delimiters to (ASCII character 13) -- (Carriage Return)
set AppleScript's text item delimiters to (ASCII character 10) -- (LF)
Neither of the delimiters worked. I wondered what exactly the ASCII code of the 'return' was so I made the following:
set rundates to "5/27/2012
5/28/2012
5/29/2012
5/30/2012
5/31/2012
6/1/2012
6/3/2012
6/4/2012
6/5/2012
6/6/2012
6/7/2012
6/8/2012
6/10/2012
6/11/2012"
set mylist to {}
repeat with z from 1 to count of characters of rundates
copy (ASCII number (character z of rundates)) to end of mylist
end repeat
--return mylist ---{53, 47, 50, 55, 47, 50, 48, 49, 50, 13, 53, 47, 50, 56, 47, 50, 48, 49, 50, 13, 53, 47, 50, 57, 47, 50, 48, 49, 50, 13, 53, 47, 51, 48, 47, 50, 48, 49, 50, 13, 53, 47, 51, 49, 47, 50, 48, 49, 50, 13, 54, 47, 49, 47, 50, 48, 49, 50, 13, 54, 47, 51, 47, 50, 48, 49, 50, 13, 54, 47, 52, 47, 50, 48, 49, 50, 13, 54, 47, 53, 47, 50, 48, 49, 50, 13, 54, 47, 54, 47, 50, 48, 49, 50, 13, 54, 47, 55, 47, 50, 48, 49, 50, 13, 54, 47, 56, 47, 50, 48, 49, 50, 13, 54, 47, 49, 48, 47, 50, 48, 49, 50, 13, 54, 47, 49, 49, 47, 50, 48, 49, 50}
---===== Notice the 13s? So this should work right? ====---
So my delimiter using 13 should have worked, but it doesn't.
Anyone have any ideas?
I get different results from the ASCII numbers in your post. Now that AppleScript is Unicode-based, we use "id" instead of "ASCII number". It seems your character is id 8232 (U+2028, the Unicode line separator), so use this in your code before you get the text items:
set AppleScript's text item delimiters to character id 8232
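The same effect can be reproduced outside AppleScript. As a small Python illustration (the sample string is made up, with the dates assumed to be joined by U+2028 as in the mail text), splitting on a carriage return finds nothing, while splitting on code point 8232 separates the dates:

```python
# Sample text (an assumption): dates separated by U+2028 LINE SEPARATOR.
text = "5/27/2012\u20285/28/2012\u20285/29/2012"

by_cr = text.split("\r")          # no carriage returns, so one single item
by_ls = text.split(chr(8232))     # chr(8232) is the U+2028 character

print(len(by_cr))  # 1
print(by_ls)       # ['5/27/2012', '5/28/2012', '5/29/2012']
```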