gsub('$0\n','') isn't working
I would prefer something similar. I want:
(note the 10 and 20 have to work with 0 not being replaced in them).
If I have:
23
12
0
15
9
0
10
20
0
I want:
23
12
15
9
10
20
You may want to convert this to an array to re-process it, but the same thing can be done with a regular expression:
string.gsub(/^\s+0+$/m, '')
The /m part is key and it makes the expression operate in multi-line mode, that is ^ and $ refer to the beginning and ending of a line, not the beginning and ending of the string as is usually the case.
Related
In Sheets I would like fixed sequence of 1-12. I have set it as =sequence(3,4) and I would like it to roll and wrap when I change the first number
Apologies in advance for formatting. I would like the array to roll and wrap when I change the first number in the sequence. So, the starting array is 1-12, but when I change the first number to 4 I would like the sequence to run from there and wrap around back to 1.
1 2 3 4
5 6 7 8
9 10 11 12
But if I start at 4 I would like it to read
4 5 6 7
8 9 10 11
12 1 2 3
Say your start number is in A1:
=ArrayFormula(MOD(SEQUENCE(3,4,A1-1,1),12)+1)
This uses MOD to cycle through the sequence.
I am attempting to read Aaron Hsu's thesis on A data parallel compiler hosted on the GPU, where I have landed at some APL code I am unable to fix. I've attached both a screenshot of the offending page (page number 74 as per the thesis numbering on the bottom):
The transcribed code is as follows:
d ← 0 1 2 3 1 2 3 3 4 1 2 3 4 5 6 5 5 6 3 4 5 6 5 5 6 3 4
This makes sense: create an array named d.
⍳≢d
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
This too makes sense. Count the number of elements in d and create a sequence of
that length.
⍉↑d,¨⍳≢d
0 1 2 3 1 2 3 3 4 1 2 3 4 5 6 5 5 6 3 4 5 6 5 5 6 3 4
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
This is slightly challenging, but let me break it down:
zip the sequence ⍳≢d = 1..27 with the d array using the ,¨ idiom, which zips the two arrays using a catenation.
Then, split into two rows using ↑ and transpose to get columns using ⍉
Now the biggie:
(⍳≢d)#(d,¨⍳≢d)⊢7 27⍴' '
INDEX ERROR
(⍳≢d)#(d,¨⍳≢d)⊢7 27⍴' '
Attempting to break it down:
⍳≢d counts number of elements in d
(d,¨⍳≢d) creates an array of pairs (d, index of d)
7 27⍴' ' creates a 7 x 27 grid: presumably 7 because that's the max value of d + 1, for indexing reasons.
Now I'm flummoxed about how the use of ⊢ works: as far as I know, it just ignores everything to the left! So I'm missing something about the parsing of this expression.
I presume it is parsed as:
(⍳≢d)#((d,¨⍳≢d)⊢(7 27⍴' '))
which according to me should be evaluated as:
(⍳≢d)#((d,¨⍳≢d)⊢(7 27⍴' '))
= (⍳≢d)#((7 27⍴' ')) [using a⊢b = b]
= not the right thing
As I was writing this down, I managed to fix the bug by sheer luck: if we increment d to be d + 1 so we are 1-indexed, the bug no longer manifests:
d ← d + 1
d
1 2 3 4 2 3 4 4 5 2 3 4 5 6 7 6 6 7 4 5 6 7 6 6 7 4 5
then:
(⍳≢d)#(d,¨⍳≢d)⊢7 27⍴' '
1
2 5 10
3 6 11
4 7 8 12 19 26
9 13 20 27
14 16 17 21 23 24
15 18 22 25
However, I still don't understand how this works! I presume the context will be useful
for others attempting to leave the thesis, so I'm going to leave the rest of it up.
Please explain what (⍳≢d)#(d,¨⍳≢d)⊢7 27⍴' ' does!
I've attached the raw screenshot to make sure I didn't miss something:
I'm happy to see that you found the the off-by-one error. It stems from Aaron Hsu working with index origin 0. If you set ⎕IO←0 then his code will work.
Some dyadic operators can take an array operand, giving the sequence OPERATOR operand argument, e.g. in -#(1 2 3)(4 5 6 7). This poses a problem because both the operand and the argument are arrays, and juxtaposition of arrays forms a new array with those arrays as elements by a process known as stranding. Compare:
(1 2 3)(4 5 6 7)
┌─────┬───┐
│1 2 3│4 5│
└─────┴───┘
However, in the case of the operator with its array operand, we want to "break" this strand so the left part can act as operand while the right part acts as argument. One way to break the stranding up is by applying a function to the argument, giving the sequence OPERATOR operand Function argument. Now, we don't actually need any transformation of the argument, so an identity function will do: -#(1 2 3)⊢(4 5 6 7).
As for what (⍳≢d)#(d,¨⍳≢d)⊢7 27⍴' ' actually does:
7 27⍴' ' creates a blank matrix.
(⍳≢d) are indices to insert into specified slots in the matrix.
#(d,¨⍳≢d) indicates at which locations in the matrix the above should replace the existing values
⊢ serves solely to separate (d,¨⍳≢d) from 7 27⍴' '. The code could also have been written as ((⍳≢d)#(d,¨⍳≢d))7 27⍴' ' with parentheses serving to "bind" the operand to the operator.
I have two files with thousands of lines:
file1:
COL22A1 LCT 1 12 0.149667616334 2.16226378401
GPRIN2 TP53 12 170 0.0455368539793 44.2359753827
MUC3A TP53 12 170 0.0455368539793 44.2359753827
file2:
COL22A1 LCT 12 41 23 0.0296296296296 0.101234567901 0.0567901234568 2.36563
MEGF10 SORCS1 10 21 39 0.0246913580247 0.0518518518519 0.0962962962963 2.30599
I want to compare first two columns of these files and if they match I want to print whole line of second file and last column of first file:
output:
COL22A1 LCT 12 41 23 0.0296296296296 0.101234567901 0.0567901234568 2.36563 2.16226378401
I tried awk, grep, join but it always gives me output of just one file
Could you please try following and let us know then.
awk 'FNR==NR{a[$1,$2]=$NF;next} a[$1,$2]{print $0,a[$1,$2]}' Input_file1 Input_file2
I have a large text file as below imported in MATLAB:
Run Lat Long Time
1 32 32 34
1 23 22 21
2 23 12 11
2 11 11 11
2 33 11 12
up to 10 runs etc.
So I'm trying to break up each section in the file: section 1, section 2, etc and write it to 10 different text files. File 1 will have data from Run 1. File 2 will have data from Run 2.
What you're looking for is Matlab's textread function. I'll give you the pieces you need and frame out the logic, but you'll need to connect the pieces yourself :)
Your read would look something like this
[head1, head2, head3, head4] = textread(file_name,'%s %s %s %s',1);
[run, lat, long, time] = textread(file_name,'%u %u %u %u');
and your write method would use a loop to iterate over the values in
unique(run)
creating a file with
fout = fopen([base_file_name_out num2str(run_number)]);
and writing to it the values contained in
lat_this_run=Lat(run==run_number);
using the method
fprintf(fout,'%u %u %u\n', lat_this_run, long_this_run, time_this_run)
If your data is already loaded into matlab and named A, you could do:
>> a = max(A(:,1));
>> AA={};
>> for i = 1:a
AA{i}=A(find(A(:,1)==i),:)
name=sprintf('%d.txt',i);
dlmwrite(name,AA{i},'\t');
end
The output will be .txt files containing tab-delimited data.
below is my string
local Amount =[[
Customer Details Net Amount
# Seq Name
Amount NTR
1 CDABCDEFGHIJ00564
0,1234
2 CDABCDEFGHIJ00565
0,0361
3 CDABCDEFGHIJ00566
0,0361
4 CDABCDEFGHIJ00567
0,0722
5 CDABCDEFGHIJ00568
0,0000
6 CDABCDEFGHIJ00569
0,0000
7 CDABCDEFGHIJ00570
0,0000
8 CDABCDEFGHIJ00571
0,7091
9 CDABCDEFGHIJ00572
1,4240
10 CDABCDEFGHIJ00573
0,0361
11 CDABCDEFGHIJ00574
0,5790
12 CDABCDEFGHIJ00575
0,4060
13 CDABCDEFGHIJ00576
0,3610
14 CDABCDEFGHIJ00577
0,6859
15 CDABCDEFGHIJ00578
0,2888
16 CDABCDEFGHIJ00579
0,0000
17 CDABCDEFGHIJ00580
0,0000
18 CDABCDEFGHIJ00581
0,0000
19 CDABCDEFGHIJ00582
0,0000
20 CDABCDEFGHIJ00583
0,0000
21 CDABCDEFGHIJ00584
0,0000
22 CDABCDEFGHIJ00585
0,8978
23 CDABCDEFGHIJ00586
0,0000
24 CDABCDEFGHIJ00587
2,3882
25 CDABCDEFGHIJ00588
0,0000
26 CDABCDEFGHIJ00589
2,0216
27 CDABCDEFGHIJ00590
1,7540
28 CDABCDEFGHIJ00591
0,0000
29 CDABCDEFGHIJ00592
0,0722
30 CDABCDEFGHIJ00593
0,0361
31 CDABCDEFGHIJ00594
0,0000
32 CDABCDEFGHIJ00595
0,0000
Total NAT files
11,9269
Direct inquiries to:
]]
by executing the code below
local ptrn = '\n([%d%p]+)\n'
for val1, val2 in string.gmatch(Amount, ptrn) do
print ("val1:=\t" .. (val1 or '').."\tval2:=\t"..(val2 or ''))
end
basically from the above string I want to fetch the last 5 digits of the string which is 00564 in val1 and the amount which is 0,1234 in val2 variable, but all this should in one pattern. This is a record, every record is starting with a number like this is 1 record or row
1 CDABCDEFGHIJ00564
0,1234
and this is 2nd record or row and so on
2 CDABCDEFGHIJ00565
0,0361
plese help....
It seems to me that %d+%s+%a+(%d+)\n%s*([%d,]+) should do the trick: the first %d+ will catch the row number, %s+ to match the white space after. %a+(%d+) will match CDABCDEFGHIJ00592 and capture the digits in the end (no way to specify that you want exactly five digits though). \n%s* will match the newline and any white space on the next line and ([%d,]+) will capture the last number with the comma.