Ruby script to extract numbers [closed] - ruby-on-rails

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 8 years ago.
Improve this question
I have a .txt file with characters that look like this:
7 3 5 7 3 3 3 3 3 3 3 6 7 5 5 22 1 4 23 16 18 5 13 34 24 17 50 30 42 35 29 27 52 35 44 52 36 39 25 40 50 52 40 2 52 52 31 35 30 19 32 46 50 43 36 15 21 16 36 25 7 3 5 17 3 3 3 3 23 3 3 46 1 2
I want to extract numbers >10 only if 7 or more of the next 15 numbers are greater than 10 too.
In this case, I would have the output:
22 1 4 23 16 18 5 13 34 24 17 50 30 42 35 29 27 52 35 44 52 36 39 25 40 50 52 40 2 52 52 31 35 30 19 32 46 50 43 36 15 21 16 36 25
Note that in this output there's numbers <10, but they pass the condition of having 7 or more of the next 15 numbers >10.

Sounds like a homework question, but I'll give you an attempted answer just for fun anyway.
numbers = "7 3 5 7 3 3 3 3 3 3 3 6 7 5 5 22 1 4 23 16 18 5 13 34 24 17 50 30 42 35 29 27 52 35 44 52 36 39 25 40 50 52 40 2 52 52 31 35 30 19 32 46 50 43 36 15 21 16 36 25 7 3 5 17 3 3 3 3 23 3 3 46 1 2"
numbers.split.each_cons(16).map{|x| x[0] if x[1..15].count{|y| y.to_i > 10} >= 7}.compact

num_string = "7 3 5 7 3 3 3 3 3 3 3 6 7 5 5 22 1 4 23 16 18 5 13 34 24 17 50 30 42 35 29 27 52 35 44 52 36 39 25 40 50 52 40 2 52 52 31 35 30 19 32 46 50 43 36 15 21 16 36 25 7 3 5 17 3 3 3 3 23 3 3 46 1 2"
num_arr = num_string.split(" ")
def next_ones(arr)
counter = 0
arr.each do |num|
if num.to_i > 10
counter += 1
end
end
if counter >= 7
arr[0]
end
end
def processor(arr)
answer = []
arr.each_with_index do |num, index|
if num.to_i > 10
answer << next_ones(arr[index...(index + 15)])
end
end
answer.compact.join(" ")
end
processor(num_arr)
A bit verbose, and with bad naming, but it should give you some ideas.

Related

How T Transpose Multiple Columns Values by Groups between groups delimiters in adjacent Column Google Sheets?

I have the following minimal example data (in reality 100's of groups) in range A1:P9 (same data in range A14:A22):
With Sample A1:AR9:
2
61
219
2
4
2
:
61
219
26
26
26
94
21
33
4
26
26
26
94
2
2
:
154
26
40
19
3
2
21
33
14
1
2
3
:
87
39
54
38
26
32
38
26
32
87
39
54
38
26
23
23
4
6
28
2
154
26
2
2
40
19
14
87
39
54
38
26
32
38
26
32
87
39
54
38
26
1
23
2
23
4
4
3
6
20
28
Or Sample A14:AQ22:
2
61
219
2
:
61
219
4
:
26
26
26
94
2
:
21
33
4
26
26
26
94
2
:
154
26
2
:
40
19
3
2
21
33
14
:
87
39
54
38
26
32
38
26
32
87
39
54
38
26
1
:
23
2
:
23
4
:
3
6
20
2
154
26
2
2
40
19
14
87
39
54
38
26
32
38
26
32
87
39
54
38
26
1
23
2
23
4
4
3
6
20
28
I need the output as shown in range Q1:AR3 or as in range Q14:AQ16.
Basically, at each group delimited/inbetween values in Column A, I would need:
The intemediary adjacent values in Column B to be transposed horizontally
And the adjacent content of Columns C to P (14 Columns, at least) to be "joined" together horizontaly an sequencialy "per group", including the content of the delimiter's row (in Column A).
As a bonus it would be really nice to have the Transposed data followed by a :, and each sub Content of Columns C to P to be also separated by a | (as shown in screenshot Q1:AR3 or Q14:AR16).
(Or if it's more feasible, alternatively, the simpler to read 2nd model as in A14:AQ22).
I have a really hard time putting together a formula to come to the expected result.
All I could think of was:
Transposing Column B's content by getting the rows of the adjacent Cells with values in column A,
Concatenating with the Column letter,
Duplicating it in a new column, and Filtering out the blank intermediary cells,
Then shifting the duplicated column 1 cell up,
Then concatenating within a TRANSPOSE formula to get the range of the groups,
Then finally transposing all the groups from Columns B in a new Colum
(very convoluted but I couldn't find better way).
To get to that input:
=TRANSPOSE(B1:B3)
=TRANSPOSE(B4:B5)
=TRANSPOSE(B7:B9)
That was already a very manual and error prone process, and still I could not successfully think of how to do the remaining content joining of Column C to P in a formula.
I tested the following approach but it's not working and would be very tedious process to fix to go and to implement on large datasets:
=TRANSPOSE(B1:B3)&": "&JOIN( " | " , FILTER(C1:P1, NOT(C2:P2 = "") ))&JOIN( " | " , FILTER(C2:P2, NOT(C2:P2 = "") ))&JOIN( " | " , FILTER(C43:P3, NOT(C3:P3 = "") ))
=TRANSPOSE(B4:B5)&": "&JOIN( " | " , FILTER(C4:P4, NOT(C4:P4 = "") ))&JOIN( " | " , FILTER(C5:P5, NOT(C5:P5 = "") ))
=TRANSPOSE(B6:B9)&": "&JOIN( " | " , FILTER(C6:P6, NOT(C6:P6 = "") ))&JOIN( " | " , FILTER(C7:P7, NOT(C7:P7 = "") ))&JOIN( " | " , FILTER(C8:P8, NOT(C8:P8 = "") ))&JOIN( " | " , FILTER(C8:P8, NOT(C9:P9 = "") ))
What better approach to favor toward the expected result? Preferably with a Formula, or if not possible with a script.
Any help is greatly appreciated.
For Sample 1 try this out:
=LAMBDA(norm,MAP(UNIQUE(norm),LAMBDA(ζ,{TRANSPOSE(FILTER(B1:B9,norm=ζ)),":",SPLIT(BYROW(TRANSPOSE(FILTER(BYROW(C1:P9,LAMBDA(r,TEXTJOIN("ζ",1,r))),norm=ζ)),LAMBDA(rr,TEXTJOIN("γ|γ",1,rr))),"ζγ")})))(SORT(SCAN(,SORT(A1:A9,ROW(A1:A9),),LAMBDA(a,c,IF(c="",a,c))),ROW(A1:A9),))

Proper way to set corARMA() correlation structure for lme {nlme}

I have a time series with temperature measures every 5-minutes over ca. 5-7 days. I'm looking to set the correlation structure for my model as I have considerable temporal autocorrelation. I've decided that moving averages would be the best form, but I am unsure what to specify within the correlation = corARMA(q=?) part of the model. Here is the following output for ACF(m1):
lag ACF
1 0 1.000000000
2 1 0.906757430
3 2 0.782992821
4 3 0.648405513
5 4 0.506600300
6 5 0.369248402
7 6 0.247234208
8 7 0.139716028
9 8 0.059351579
10 9 -0.009968973
11 10 -0.055269347
12 11 -0.086383590
13 12 -0.108512009
14 13 -0.114441343
15 14 -0.104985321
16 15 -0.089398656
17 16 -0.070320370
18 17 -0.051427604
19 18 -0.028491302
20 19 0.005331508
21 20 0.044325557
22 21 0.083718759
23 22 0.121348020
24 23 0.143549745
25 24 0.151540265
26 25 0.146369313
It appears that there is highly significant autocorrelation in the first ca. 7 lags. See also the attached images: 1[Residuals] & 2[Model]
Would this mean I set the correlation = corARMA(q=7)?

torch / lua: retrieving n-best subset from Tensor

I have following code now, which stores the indices with the maximum score for each question in pred, and convert it to string.
I want to do the same for n-best indices for each question, not just single index with the maximum score, and convert them to string. I also want to display the score for each index (or each converted string).
So scores will have to be sorted, and pred will have to be multiple rows/columns instead of 1 x nqs. And corresponding score value for each entry in pred must be retrievable.
I am clueless as to lua/torch syntax, and any help would be greatly appreciated.
nqs=dataset['question']:size(1);
scores=torch.Tensor(nqs,noutput);
qids=torch.LongTensor(nqs);
for i=1,nqs,batch_size do
xlua.progress(i, nqs)
r=math.min(i+batch_size-1,nqs);
scores[{{i,r},{}}],qids[{{i,r}}]=forward(i,r);
end
tmp,pred=torch.max(scores,2);
answer=json_file['ix_to_ans'][tostring(pred[{i,1}])]
print(answer)
Here is my attempt, I demonstrate its behavior using a simple random scores tensor:
> scores=torch.floor(torch.rand(4,10)*100)
> =scores
9 1 90 12 62 1 62 86 46 27
7 4 7 4 71 99 33 48 98 63
82 5 73 84 61 92 81 99 65 9
33 93 64 77 36 68 89 44 19 25
[torch.DoubleTensor of size 4x10]
Now, since you want the N best indexes for each question (row), let's sort each row of the tensor:
> values,indexes=scores:sort(2)
Now, let's look at what the return tensors contain:
> =values
1 1 9 12 27 46 62 62 86 90
4 4 7 7 33 48 63 71 98 99
5 9 61 65 73 81 82 84 92 99
19 25 33 36 44 64 68 77 89 93
[torch.DoubleTensor of size 4x10]
> =indexes
2 6 1 4 10 9 5 7 8 3
2 4 1 3 7 8 10 5 9 6
2 10 5 9 3 7 1 4 6 8
9 10 1 5 8 3 6 4 7 2
[torch.LongTensor of size 4x10]
As you see, the i-th row of values is the sorted version (in increasing order) of the i-th row of scores, and each row in indexes gives you the corresponding indexes.
You can get the N best values/indexes for each question (i.e. row) with
> N_best_indexes=indexes[{{},{indexes:size(2)-N+1,indexes:size(2)}}]
> N_best_values=values[{{},{values:size(2)-N+1,values:size(2)}}]
Let's see their values for the given example, with N=3:
> return N_best_indexes
7 8 3
5 9 6
4 6 8
4 7 2
[torch.LongTensor of size 4x3]
> return N_best_values
62 86 90
71 98 99
84 92 99
77 89 93
[torch.DoubleTensor of size 4x3]
So, the k-th best value for question j is N_best_values[{{j},{values:size(2)-k+1}]], and its corresponding index in the scores matrix is given by this row, column values:
row=j
column=N_best_indexes[{{j},indexes:size(2)-k+1}}].
For example, the first best value (k=1) for the second question is 99, which lies at the 2nd row and 6th column in scores. And you can see that values[{{2},values:size(2)}}] is 99, and that indexes[{{2},{indexes:size(2)}}] gives you 6, which is the column index in the scores matrix.
Hope that I explained my solution well.

how to join two pandas dataframe on specific column

I have 1st pandas dataframe which looks like this
order_id buyer_id caterer_id item_id qty_purchased
387 139 1 7 3
388 140 1 6 3
389 140 1 7 3
390 36 1 9 3
391 79 1 8 3
391 79 1 12 3
391 79 1 7 3
392 72 1 9 3
392 72 1 9 3
393 65 1 9 3
394 65 1 10 3
395 141 1 11 3
396 132 1 12 3
396 132 1 15 3
397 31 1 13 3
404 64 1 14 3
405 146 1 15 3
And the 2nd dataframe looks like this
item_id meal_type
6 Veg
7 Veg
8 Veg
9 NonVeg
10 Veg
11 Veg
12 Veg
13 NonVeg
14 Veg
15 NonVeg
16 NonVeg
17 Veg
18 Veg
19 NonVeg
20 Veg
21 Veg
I want to join this two data frames on item_id column. So that the final data frame should contain item_type where it has a match with item_id.
I am doing following in python
pd.merge(segments_data,meal_type,how='left',on='item_id')
But it gives me all nan values
You have to check types by dtypes of both columns (names) to join on.
If there are different, you can cast them, because you need same dtypes. Sometimes numeric columns are string columns, but looks like numbers.
If there are both same string types, maybe help cast both of them to int. Problem can be some whitespaces:
segments_data['item_id'] = segments_data['item_id'].astype(int)
meal_type['item_id'] = meal_type['item_id'].astype(int)
pd.merge(segments_data,meal_type,how='left',on='item_id')

How do I capture images in OpenCV and saving in pgm format?

I am brand new to programming in general, and am working on a project for which I need to capture images from my webcam (possibly using OpenCV), and save the images as pgm files.
What's the simplest way to do this? Willow Garage provides this code for image capturing:
http://opencv.willowgarage.com/wiki/CameraCapture
Using this code as a base, how might I modify it to:
capture an image from the live cam every 2 seconds
save the images to a folder in pgm format
Thanks so much for any help you can provide!
First of all, please use newer site - opencv.org. Using outdated references leads to chain effect, when new users see old references, read old docs and post old links again.
There's actually no reason to use old C API. Instead, you can use newer C++ interface, which, among other things, handles capturing video gracefully. Here's shortened version of example from docs on VideoCapture:
#include "opencv2/opencv.hpp"
using namespace cv;
int main(int, char**)
{
VideoCapture cap(0); // open the default camera
if(!cap.isOpened()) // check if we succeeded
return -1;
for(;;)
{
Mat frame;
cap >> frame; // get a new frame from camera
// do any processing
imwrite("path/to/image.png", frame);
if(waitKey(30) >= 0) break; // you can increase delay to 2 seconds here
}
// the camera will be deinitialized automatically in VideoCapture destructor
return 0;
}
Also, if you are new to programming, consider using Python interface to OpenCV - cv2 module. Python is often considered simpler than C++, and using it you can play around with OpenCV functions right in an interactive console. Capturing video with cv2 looks something like this (adopted code from here):
import numpy as np
import cv2
cap = cv2.VideoCapture(0)
while(True):
# Capture frame-by-frame
ret, frame = cap.read()
# do what you want with frame
# and then save to file
cv2.imwrite('path/to/image.png', frame)
if cv2.waitKey(30) & 0xFF == ord('q'): # you can increase delay to 2 seconds here
break
# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()
Since ffriend's answer is only partially true, I'll add some more to it (in C++). The author of this question asks explicitly for exporting to PGM (PXM file format that stores each pixel in 8 bits) and not PNG (as ffriend describes in his/her reply). The main issue here is that the official documentation for imwrite is omho not clear about this matter at all:
For PPM, PGM, or PBM, it can be a binary format flag ( CV_IMWRITE_PXM_BINARY ), 0 or 1. Default value is 1.
If we read the sentence in normal English, we have a list of options: CV_IMWRITE_PXM_BINARY, 0 or 1. There is no mention that those can and actually are supposed to be combined! I had to experiment a little bit (I also needed to store 8-bit images for my project) and finally got to the desired solution:
std::vector<int> compression_params; // Stores the compression parameters
compression_params.push_back(CV_IMWRITE_PXM_BINARY); // Set to PXM compression
compression_params.push_back(0); // Set type of PXM in our case PGM
const std::string imageFilename = "myPGM.pgm"; // Some file name - C++ requires an std::string
cv::imwrite(imageFilename, myImageMatrix, compression_params); // Write matrix to file
My investigation was also fueled by this question where the author was (maybe still is) struggling with the very same issue and also by some basic information on the PXM format, which you can find here.
The result (only part of the image) is displayed below:
P2
32 100
255
121 237 26 102 88 143 67 224 160 164 238 8 119 195 138 16 176 244 72 106 72 211 168 45 250 161 37 1 96 130 74 8
126 122 227 86 106 120 102 150 185 218 164 232 111 230 207 191 39 222 236 78 137 71 174 96 146 122 117 175 34 245 6 125
124 121 241 67 225 203 118 209 227 168 175 40 90 19 197 190 40 254 68 90 88 242 136 32 123 201 37 35 99 179 198 163
97 161 129 35 48 140 234 237 98 73 105 77 211 234 88 176 152 12 68 93 159 184 238 5 172 252 38 68 200 130 194 216
160 188 21 53 16 79 71 54 124 234 34 92 246 49 0 17 145 102 72 42 105 252 81 63 161 146 81 16 72 104 66 41
189 100 252 37 13 91 71 40 123 246 33 157 67 96 71 59 17 196 96 110 109 116 253 30 42 203 69 53 97 188 90 68
101 36 84 5 41 59 80 8 107 160 168 9 194 8 71 55 152 132 232 102 12 96 213 24 134 208 1 55 64 43 74 22
92 77 30 44 139 96 70 152 160 146 142 8 87 243 11 91 49 196 104 250 72 67 159 44 240 225 69 29 34 115 42 2
109 176 145 90 137 172 65 25 162 57 169 92 214 211 72 94 149 20 104 56 27 67 218 17 203 182 5 124 138 2 130 48
121 225 25 106 89 76 69 189 34 25 173 8 114 83 72 52 145 154 64 40 91 2 251 53 251 237 20 124 82 2 194 42 ...
Which is exactly what is required in this case. You can see the "P2" marking at the top and also the values are clearly from 0 to 255, which is exactly 8 bits per pixel.
Read most of the answers but none of them could satisfy my requirement. Here's how I implemented it.
This program will use webcam as a camera and clicks picture when you press 'c' - we can change the condition, then make it to click pictures automatically after certain interval
# Created on Sun Aug 12 12:29:05 2018
# #author: CodersMine
import cv2
video_path = 0 # 0 internal cam, 1 external webcam
cap = cv2.VideoCapture(video_path)
img_ctr = 0 # To Generate File Names
while(True):
ret, frame = cap.read()
cv2.imshow("imshow",frame)
key = cv2.waitKey(1)
if key==ord('q'): # Quit
break
if key==ord('c'): # Capture
cv2.imshow("Captured",frame)
flag = cv2.imwrite(f"image{img_ctr}.png", frame)
print(f"Image Written {flag}")
img_ctr += 1
# Release the Camera
cap.release()
cv2.destroyAllWindows()
If you don't need superaccurate 2seconds then simply put a sleep(2) or Sleep(2000) in the while(1) loop to wait fro 2seconds before each grab,
Write images with cvSaveImage() just put the extention .pgm on the filename and it will use pgm.
I believe that the format is chosen from the extension of the filename - so assuming your opencv lib's are linked against the appropriate libs you can do something like: (this is from memory, might not be correct.)
CvCapture* capture = cvCaptureFromCam(0);
IplImage* frame = cvQueryFrame(capture);
while (1) {
frame = cvQueryFrame(capture);
cvSaveImage("foo.pgm", &frame);
sleep(2);
}
cvReleaseImage(&frame);
cvReleaseCapture(&capture);

Resources