Q: Creating a combined plot based on a ts object - time-series

R beginner here.
I have created a dataframe (called combined_ts2) based on a dataset I was given. See this link for the dataframe.
Based on the dataframe I made this ts object (find code at bottom of post).
I am supposed to make a plot based on the TS object. I'm thinking this can be achieved with a ts.plot function, but I can't figure out how to use this function.
The result I want is to get a bar graph for capacity and a line graph for fixtures. Does anyone know how to achieve this, and if I can actually achieve this from a ts object?
dput for dataframe
structure(list(week = c(26, 28, 29, 30, 31, 32, 33, 34, 35, 36,
37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48), capacity = c(45000L,
39000L, 495500L, 855300L, 1318300L, 1301885L, 8211550L, 18515400L,
32282950L, 31568400L, 35410200L, 29867500L, 34809050L, 36420050L,
33960520L, 33987550L, 33465500L, 24599000L, 11597000L, 4553000L,
1375000L, 545000L), fixtures = c(2L, 4L, 12L, 13L, 18L, 29L,
161L, 338L, 393L, 405L, 439L, 386L, 442L, 406L, 413L, 421L, 326L,
180L, 84L, 23L, 6L, 3L)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -22L))
Time series code
weekly_timeseries <- ts(combined_ts2, start=c(2019,26), frequency = 52)

darknet mask and anchor values for yolov4

In the README.md of darknet repo https://github.com/AlexeyAB/darknet we have this sentence about anchor boxes:
But you should change indexes of anchors masks= for each [yolo]-layer, so for YOLOv4 the 1st-[yolo]-layer has anchors smaller than 30x30, 2nd smaller than 60x60, 3rd remaining.
It looks like the default anchor boxes for yolov4-sam-mish.cfg are
12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401
and the first yolo layer has config:
mask = 0,1,2
Do I understand correctly that this will use those anchors?
(12, 16), (19, 36), (40, 28)
If yes, this seems to contradict the statement, or am I misunderstanding it?
I'm asking because for my dataset and my image size (256, 96) I got these anchors from calc_anchors in darknet:
15, 56, 22, 52, 28, 48, 23, 62, 26, 59, 39, 43, 31, 57, 29, 66, 37, 64
and I'm trying to figure out how I should set the masks.
Looks good to me.
12, 16,
19, 36,
40, 28,
36, 75,
76, 55,
72, 146,
142, 110,
192, 243,
459, 401
You may leave the masks as they are. The current config you show will yield higher mAP; supporting documentation here:
https://github.com/WongKinYiu/PartialResidualNetworks/issues/2
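To make the mask indexing concrete, here is a small Python sketch (not darknet code; the helper name anchors_for_mask is hypothetical). It assumes the anchors line is read as consecutive (width, height) pairs and that each mask value is an index into that list of pairs, which matches the reading in the question.

```python
# Illustrative only: mimics how a [yolo] layer's mask selects anchor pairs.
anchors_line = "12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401"

def anchors_for_mask(anchors_line, mask):
    """Return the (width, height) anchor pairs selected by the mask indices."""
    values = [int(v) for v in anchors_line.split(",")]
    # Group the flat list into consecutive (width, height) pairs.
    pairs = list(zip(values[0::2], values[1::2]))
    return [pairs[i] for i in mask]

first_layer = anchors_for_mask(anchors_line, [0, 1, 2])
print(first_layer)  # [(12, 16), (19, 36), (40, 28)]
```

Running the same helper on the anchors computed by calc_anchors shows which pairs a given mask would pick before editing the cfg.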

Can't query for more than 100 IDs with PostgreSQL

I'm migrating from MySQL to PostgreSQL, but I'm getting the following error:
PG::TooManyArguments: ERROR: cannot pass more than 100 arguments to a function
when running queries like this:
Project.where(id: ids)
Which is translated to
"SELECT \"projects\".* FROM \"projects\" WHERE \"projects\".\"id\" IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100) ORDER BY FIELD(projects.id, '1','2','3','4','5','6','7','8','9','10','11','12','13','14','15','16','17','18','19','20','21','22','23','24','25','26','27','28','29','30','31','32','33','34','35','36','37','38','39','40','41','42','43','44','45','46','47','48','49','50','51','52','53','54','55','56','57','58','59','60','61','62','63','64','65','66','67','68','69','70','71','72','73','74','75','76','77','78','79','80','81','82','83','84','85','86','87','88','89','90','91','92','93','94','95','96','97','98','99','100')"
For me it's a common use case to query by specific IDs and it worked pretty well with MySQL. Is there any way to make this work with PostgreSQL?
I'm using PostgreSQL 13.2 on a docker container.
According to the error you have, the cause is the function, not the query itself. You can pass 32K arguments to the query and it will work (2-byte int limit). Functions, however, have a limit of 100 arguments by default in Postgres (set at compile time). You could compile from source and raise that number, but I don't recommend doing that unless you really understand the consequences.
The best approach would probably be to replace the FIELD() function call so that you don't run into the problem. Can you change your system so that you can sort by a column in the DB? That way you don't need to pass those IDs for sorting. Or, if you have to use IDs, what about using CASE for sorting, as in this SO question: Simulating MySQL's ORDER BY FIELD() in Postgresql
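As a rough illustration of the CASE idea, the sketch below builds the ordering expression as a string. It is written in Python only to show the shape of the SQL; the helper name order_by_case is hypothetical, and the assumption is that a CASE expression is not subject to the function argument limit because it is not a function call. The column name is inserted as-is, so it must come from trusted code, not user input.

```python
def order_by_case(column, ids):
    """Build a CASE expression that sorts rows in the order given by ids."""
    # int() guards against non-numeric values being spliced into the SQL.
    whens = " ".join(f"WHEN {int(v)} THEN {pos}" for pos, v in enumerate(ids))
    return f"CASE {column} {whens} END"

print(order_by_case("projects.id", [3, 1, 2]))
# CASE projects.id WHEN 3 THEN 0 WHEN 1 THEN 1 WHEN 2 THEN 2 END
```

The resulting string would go after ORDER BY in place of the FIELD(...) call.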
The only "fix" I could find was downgrading PostgreSQL docker image to 11.11 where this error does not happen.

Can't use .len of a bidimensional array

I have this simple code that doesn't compile.
const s = [_][_]int {
[_]int{08, 02, 22, 97, 38, 15, 00},
[_]int{49, 49, 99, 40, 17, 81, 18},
[_]int{81, 49, 31, 73, 55, 79, 14},
[_]int{52, 70, 95, 23, 04, 60, 11},
[_]int{22, 31, 16, 71, 51, 67, 63},
[_]int{24, 47, 32, 60, 99, 03, 45},
[_]int{32, 98, 81, 28, 64, 23, 67},
[_]int{67, 26, 20, 68, 02, 62, 12},
[_]int{24, 55, 58, 05, 66, 73, 99},
[_]int{21, 36, 23, 09, 75, 00, 76}
};
pub fn main() void
{
const w = s[0].len;
const h = s.len;
}
The compiler says:
./a.zig:1:14: error: inferred array size invalid here
const s = [_][_]int {
^
./a.zig:16:15: note: referenced here
const w = s[0].len;
What is the problem?
I'd be interested to know if there's a deeper reason, but my simple understanding is that the current syntax [N]T allows the array size to be elided using _, but not for more than one dimension.
So you can fix your problem using the following (N.B. I've used u8 because I'm unsure what your int is):
const s = [_][7]u8{
// Your items
}
I suspect this is because of the way the parsing rules are applied, so [7]u8 would be the type your nested array would hold, and will be used by the compiler to check contents are all of type [7]u8; you can confirm this by modifying one of your rows to have 6 elements and examining the resulting error.
If you want a variable number of items, you could start to look into an array of slices: [_][]u8, but I don't think that's what you're currently after.

Fitting a Support Vector Classifier in scikit-learn with image data produces error

I'm trying to train an SVC classifier for image data. Yet, when I run this code:
classifier = svm.SVC(gamma=0.001)
classifier.fit(train_set, train_set_labels)
I get this error:
ValueError: setting an array element with a sequence.
I produced the images into an array with Matplotlib: plt.imread(image).
The error suggests the data isn't in an array, yet when I check the types of the data and the labels they're both lists (I manually add to a list for the labels data):
print(type(train_set))
print(type(train_set_labels))
<class 'list'>
<class 'list'>
If I do a plt.imshow(items[0]) then the image shows correctly in the output.
I also called train_test_split from scikit-learn:
train_set, test_set = train_test_split(items, test_size=0.2, random_state=42)
Example input:
train_set[0]
array([[[212, 134, 34],
[221, 140, 48],
[240, 154, 71],
...,
[245, 182, 51],
[235, 175, 43],
[242, 182, 50]],
[[230, 152, 51],
[222, 139, 47],
[236, 147, 65],
...,
[246, 184, 49],
[238, 179, 43],
[245, 186, 50]],
[[229, 150, 47],
[205, 122, 28],
[220, 129, 46],
...,
[232, 171, 28],
[237, 179, 35],
[244, 188, 43]],
...,
[[115, 112, 103],
[112, 109, 102],
[ 80, 77, 72],
...,
[ 34, 25, 28],
[ 55, 46, 49],
[ 80, 71, 74]],
[[ 59, 56, 47],
[ 66, 63, 56],
[ 48, 45, 40],
...,
[ 32, 23, 26],
[ 56, 47, 50],
[ 82, 73, 76]],
[[ 29, 26, 17],
[ 41, 38, 31],
[ 32, 29, 24],
...,
[ 56, 47, 50],
[ 59, 50, 53],
[ 84, 75, 78]]], dtype=uint8)
Example label:
train_set_labels[0]
'Picasso'
I'm not sure what step I'm missing to get the data in the form that the classifier needs in order to train it. Can anyone see what may be needed?
The error message you are receiving:
ValueError: setting an array element with a sequence,
normally results when you are trying to put a list somewhere that a single value is required. This would suggest to me that your train_set is made up of a list of multidimensional elements, although you do state that your inputs are lists. Would you be able to post an example of your inputs and labels?
UPDATE
Yes, it's as I thought. The first element of your training data, train_set[0], corresponds to a long list (I can't tell how long), each element of which consists of a list of 3 elements. You are therefore calling the classifier on a list of lists of lists, when the classifier requires a list of lists (m rows corresponding to the number of training examples with each row made up of a list of n features). What else is in your train_set array? Is the full data set in train_set[0]? If so, you would need to create a new array with each element corresponding to each of the subelements of train_set[0], and then I believe your code should run, although I am not too familiar with that classifier. Alternatively you could try running the classifier with train_set[0].
UPDATE 2
I don't have experience with scikit-learn's SVC, so I can't tell you the best way to preprocess the data so that the algorithm accepts it. One method would be to do as I said previously: for each element of train_set, which is composed of lists of lists, loop through it and place all the elements of each sublist into the list above. For example:
new_train_set = []
for i in range(len(train_set)):
    for j in range(len(train_set[i])):
        # train_set is a list, so index it in two steps rather than with [i, j]
        new_train_set.append(train_set[i][j])
I would then train with new_train_set and the training labels.
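An alternative sketch, assuming every image has the same shape: flatten each image into a single feature row with NumPy, so that the result has one row per image and stays aligned with train_set_labels. The helper name flatten_images is hypothetical, and the tiny arrays below only stand in for the output of plt.imread.

```python
import numpy as np

def flatten_images(images):
    """Stack equally sized images and flatten each one into a single feature row."""
    stacked = np.stack(images)  # shape: (n_samples, height, width, channels)
    return stacked.reshape(len(images), -1)

# Two stand-in "images" (2x2 pixels, 3 channels each).
demo = [np.zeros((2, 2, 3), dtype=np.uint8), np.ones((2, 2, 3), dtype=np.uint8)]
X = flatten_images(demo)
print(X.shape)  # (2, 12)
# classifier.fit(X, train_set_labels) would then receive one row per image.
```

If the images differ in size, np.stack raises a ValueError, so they would need to be resized to a common shape first.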

Making a function to turn quality strings into a list of Phred scores

I'm new to Python coding, and I am having trouble making a function that turns a quality string into a list of PHRED-scaled quality scores. Hoping to get some assistance.
Here is a FASTQ read:
@SEQ_ID
AAGCGTCTGATCGGCAGAGGATACACATGCCGCACGTCGAGTATCTCGGC
+
=3:AAF>FGD1FCGGGGGFBGGGGCGGG1FE>>>E<:>/<9:CDGFG@GG
This is the function definition:
def quality_to_list(quality_string):
BioPython has a couple of good examples and documentation on Phred scores.
from Bio import SeqIO

# Write the example read to a temporary FASTQ file (the header line starts with '@').
with open('tmp.fastq', 'w') as f:
    f.write("""@SEQ_ID
AAGCGTCTGATCGGCAGAGGATACACATGCCGCACGTCGAGTATCTCGGC
+
=3:AAF>FGD1FCGGGGGFBGGGGCGGG1FE>>>E<:>/<9:CDGFG@GG""")

# Parse it back; BioPython decodes the quality string into Phred scores.
for record in SeqIO.parse("tmp.fastq", "fastq"):
    print("ID: {0}\nPhred scores: {1}".format(record.id, record.letter_annotations['phred_quality']))
Output:
ID: SEQ_ID
Phred scores: [28, 18, 25, 32, ..., 34, 35, 38, 37, 38, 31, 38, 38]
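If you want to write quality_to_list yourself without BioPython, a minimal sketch follows. It assumes the standard Sanger (Phred+33) encoding, where each score is the character's ASCII code minus 33; the offset parameter is an addition for illustration and is not part of the original function definition.

```python
def quality_to_list(quality_string, offset=33):
    """Convert a FASTQ quality string into a list of Phred scores."""
    # Each character encodes one score as chr(score + offset).
    return [ord(char) - offset for char in quality_string]

print(quality_to_list("=3:AAF>F"))  # [28, 18, 25, 32, 32, 37, 29, 37]
```

The first four values match the start of the BioPython output above.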
