Google Spreadsheet: Use column + n in formula - google-sheets

In my spreadsheet, column X contains the following formula:
=ImportRange('_keys'!$B$2;"2015!A200:A203")
Now I'd like to copy this formula to column X+n (in this case X+2) so that it looks like:
=ImportRange('_keys'!$B$2;"2015!C200:C203")
But the column doesn't change when I copy the formula, and I have to change it by hand.
Is it possible to write this formula so that it always uses the column the formula is in?

You can use the COLUMN() function to get the column of the current cell as a number. Using ADDRESS() you can turn it into a cell reference string. See the docs for COLUMN and ADDRESS.
Your code becomes
=ImportRange('_keys'!$B$2;
CONCATENATE("2015!"; ADDRESS(200; COLUMN()-Y; 4);
":"; ADDRESS(203; COLUMN()-Y; 4))
)
where Y is the offset between column A and column X (where this formula is located). The third argument of ADDRESS (4 here) makes both the row and column relative (without the $). Note that ADDRESS takes its arguments in row-then-column order, annoyingly.

My solution:
I wrote a simple custom function that converts numbers into letters.
/**
* Converts number of column into column letter
*
* @param {Number} aNumber Number of column
* @return {String} Letter of column
* @customfunction
*/
function COL_NR2LETTER(aNumber) {
var letterArray = ['-', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'AA', 'AB', 'AC', 'AD', 'AE', 'AF', 'AG', 'AH', 'AI', 'AJ', 'AK', 'AL', 'AM', 'AN', 'AO', 'AP', 'AQ', 'AR', 'AS', 'AT', 'AU', 'AV', 'AW', 'AX', 'AY', 'AZ'];
if (aNumber < 1 || aNumber >= letterArray.length)
throw "column index out of bound error";
return letterArray[aNumber];
}
Now it's possible to copy
=ImportRange('_keys'!$B$2;
"2015!" & COL_NR2LETTER(Column(A1)) &"200:"& COL_NR2LETTER(Column(A1)) &"203")
from Column X into a column X+n.
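The lookup array above stops at column AZ. As a hedged sketch of the same idea without a fixed table, the conversion can be written as a base-26 loop that has no zero digit. It is shown in Python for brevity rather than as an Apps Script custom function, and `column_letter` and `shifted_range` are illustrative names, not part of any Sheets API.

```python
def column_letter(number):
    """Convert a 1-based column number to its letter (1 -> A, 27 -> AA)."""
    if number < 1:
        raise ValueError("column index out of bounds")
    letters = ""
    while number > 0:
        # Shift to 0-based so that 26 maps to 'Z' instead of rolling over.
        number, remainder = divmod(number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def shifted_range(sheet, column_number, first_row, last_row):
    """Build a range string such as '2015!C200:C203' for a single column."""
    letter = column_letter(column_number)
    return f"{sheet}!{letter}{first_row}:{letter}{last_row}"
```

Calling `shifted_range("2015", 3, 200, 203)` builds the same string the question wants for the column two to the right of the original.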

Related

CountVectorizer skips letters but returns count of words

I have a list of words like below.
words = ['john', 'i', 'romeo', 'i', 'john', 'steve', 'k']
I apply CountVectorizer to get the count of words as below.
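from sklearn.feature_extraction.text import CountVectorizer
vec = CountVectorizer().fit(words)
word_library = vec.transform(words)
sum_words = word_library.sum(axis=0)
word_counts = [(word, sum_words[0, idx]) for word, idx in vec.vocabulary_.items()]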
It returns
[('john', 2), ('romeo', 1),
('steve', 1)]
I would like to get the counts of single letters too; they should not vanish in the process.
[('john', 2), ('i', 2), ('romeo', 1),
('steve', 1), ('k', 1)]
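A likely cause is the tokenizer: CountVectorizer's documented default `token_pattern` is `(?u)\b\w\w+\b`, which only matches tokens of two or more characters. The sketch below reproduces that effect with the standard library alone, so it does not call scikit-learn; `count_tokens` is a hypothetical helper, and the second pattern is the kind of value one would pass as `token_pattern=` to keep single letters.

```python
import re
from collections import Counter

DEFAULT_PATTERN = r"(?u)\b\w\w+\b"   # documented CountVectorizer default
SINGLE_CHAR_PATTERN = r"(?u)\b\w+\b"  # also accepts one-character tokens


def count_tokens(docs, token_pattern):
    """Count lower-cased tokens matched by token_pattern across all docs."""
    pattern = re.compile(token_pattern)
    counts = Counter()
    for doc in docs:
        counts.update(pattern.findall(doc.lower()))
    return counts


words = ['john', 'i', 'romeo', 'i', 'john', 'steve', 'k']
default_counts = count_tokens(words, DEFAULT_PATTERN)
full_counts = count_tokens(words, SINGLE_CHAR_PATTERN)
```

With the default pattern, 'i' and 'k' never become tokens; with the relaxed pattern they are counted like any other word.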

multiple select take ages with snowflake

I have a table with 6M rows, and my query seems to take ages.
I am trying to calculate values for the two previous months on a rolling basis.
Input:
Period     | ID  | Tag | Name  | Program   | Total Cost
-----------|-----|-----|-------|-----------|-----------
2017-06-01 | ID1 | X   | User1 | Program 1 | 438
2020-12-01 | ID2 | A   | User2 | Program 2 | 118
2020-12-01 | ID3 | X   | User3 | Program 3 | 380
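Wanted output:
Period     | ID  | Tag | Name  | Program   | Total Cost | Period M-1 | Total Cost M-1 | Period M-2 | Total Cost M-2
-----------|-----|-----|-------|-----------|------------|------------|----------------|------------|---------------
2017-06-01 | ID1 | X   | User1 | Program 1 | 438        | 2017-05-01 | 372            | 2017-04-01 | 340
2020-12-01 | ID2 | A   | User2 | Program 2 | 118        | 2020-11-01 | 103            | 2020-10-01 | 98
2020-12-01 | ID3 | X   | User3 | Program 3 | 380        | 2020-11-01 | 362            | 2020-10-01 | 334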
Where am I going wrong? The query below is very slow.
WITH month_M AS (
SELECT "Period","ID","Tag","Name","Program","Cost USD",
DATEADD(MONTH, -1, "Period" ) AS "Period M-1 ",
DATEADD(MONTH, -2, "Period" ) AS "Period M-2"
FROM "ARROWSPHERE_PROD_DB"."PBI_SCH"."Revenue_Dashboard"
), month_M1 AS (
SELECT "Period","ID","Tag","Name","Program","Cost USD"
FROM "ARROWSPHERE_PROD_DB"."PBI_SCH"."Revenue_Dashboard"
), month_M2 AS (
SELECT "Period","ID","Tag","Name","Program","Cost USD"
FROM "ARROWSPHERE_PROD_DB"."PBI_SCH"."Revenue_Dashboard"
)
SELECT M."Period",M."ID",M."Tag",M."Name",M."Program",M."Cost USD",
M."Period M-1 ",M1."Cost USD" AS "Total Cost M-1",M."Period M-2",M2."Cost USD" AS "Total Cost M-2"
FROM month_M AS M,month_M1 AS M1, month_M2 AS M2
WHERE M."Period M-1 "=M1."Period" AND M."Period M-2"=M2."Period"
AND M."ID"=M1."ID" AND M."ID"=M2."ID"
AND M."Tag"=M1."Tag" AND M."Tag"=M2."Tag"
AND M."Name"=M1."Name" AND M."Name"=M2."Name"
AND M."Program"=M1."Program" AND M."Program"=M2."Program"
You can achieve your goal with a window function such as LAG. It drastically reduces the complexity of the SQL and of the execution plan, which I guess will then need only a single table scan (https://docs.snowflake.com/en/sql-reference/functions/lag.html)
CREATE OR REPLACE TEMPORARY TABLE TMP_TEST (
Period TIMESTAMP,
ID VARCHAR,
Tag VARCHAR,
Name VARCHAR,
Program VARCHAR,
TotalCost NUMERIC
);
INSERT INTO TMP_TEST
VALUES
('2020-10-01', 'ID2', 'A', 'User2', 'Program 2', 98),
('2020-11-01', 'ID2', 'A', 'User2', 'Program 2', 103),
('2020-12-01', 'ID2', 'A', 'User2', 'Program 2', 118),
('2020-10-01', 'ID3', 'X', 'User3', 'Program 3', 334),
('2020-11-01', 'ID3', 'X', 'User3', 'Program 3', 362),
('2020-12-01', 'ID3', 'X', 'User3', 'Program 3', 380);
SELECT * ,
DATEADD(MONTH, -1, Period) AS "Period M-1",
LAG(TotalCost, 1, 0) OVER (PARTITION BY Id, Tag, Name ORDER BY Period) AS "TotalCost M-1",
DATEADD(MONTH, -2, Period) AS "Period M-2",
LAG(TotalCost, 2, 0) OVER (PARTITION BY Id, Tag, Name ORDER BY Period) AS "TotalCost M-2"
FROM TMP_TEST
ORDER BY Id, Tag, Name, Period;
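To make the window semantics concrete, here is a small Python sketch of what LAG does within each partition, using the TMP_TEST sample rows reduced to three columns. `with_lags` is a hypothetical helper, not Snowflake code. One thing it makes visible: the lag is positional, so it returns the previous row in the partition, which is the previous month only if no month is missing.

```python
from collections import defaultdict

# Same sample rows as TMP_TEST, reduced to (period, id, total_cost).
ROWS = [
    ("2020-10-01", "ID2", 98),
    ("2020-11-01", "ID2", 103),
    ("2020-12-01", "ID2", 118),
    ("2020-10-01", "ID3", 334),
    ("2020-11-01", "ID3", 362),
    ("2020-12-01", "ID3", 380),
]


def with_lags(rows, offsets=(1, 2), default=0):
    """Append lagged costs per id, ordered by period, like LAG(...) OVER (...)."""
    partitions = defaultdict(list)
    for period, row_id, cost in rows:
        partitions[row_id].append((period, cost))

    result = []
    for row_id in sorted(partitions):
        ordered = sorted(partitions[row_id])  # ISO dates sort chronologically
        for position, (period, cost) in enumerate(ordered):
            # Look back `offset` rows; fall back to the default at the start.
            lagged = tuple(
                ordered[position - offset][1] if position >= offset else default
                for offset in offsets
            )
            result.append((period, row_id, cost) + lagged)
    return result


LAGGED = with_lags(ROWS)
```

Each partition is sorted and walked once, which is why a single pass over the table can be enough for this approach.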
This is valid SQL, so it's not "wrong", but since there are no predicates Snowflake must do a full table scan of 6 million records, process them, and return about as many rows, which is a lot of work to do.
If you can't just temporarily use a bigger warehouse, then you will have to dig into the Query Profile to find the bottleneck by clicking the query_id and then the "Profile" tab from the Worksheet UI.
First look at the Profile Overview and look at the breakdown of Remote IO to Processing.
You can reduce Remote IO by selecting fewer columns (if possible) or by using a predicate (such as one year at a time, or users that start with X; you may have to experiment). You can click on a step to see how much was pruned.
You can reduce processing by doing less :) which won't be easy but you could try a left join (example below) or a window query.
WITH rev_dash as (select $1 "Period", $2 "ID", $3 "Tag", $4 "Name", $5 "Program", $6 "Cost USD" from values
('2017-06-01', 'ID1', 'X', 'User1', 'Program 1', '438'),
('2020-12-01', 'ID2', 'A', 'User2', 'Program 2', '118'),
('2020-12-01', 'ID3', 'X', 'User3', 'Program 3', '380'),
('2017-05-01', 'ID1', 'X', 'User1', 'Program 1', '438'),
('2020-11-01', 'ID2', 'A', 'User2', 'Program 2', '118'),
('2020-11-01', 'ID3', 'X', 'User3', 'Program 3', '380'),
('2017-04-01', 'ID1', 'X', 'User1', 'Program 1', '438'),
('2020-10-01', 'ID2', 'A', 'User2', 'Program 2', '118'),
('2020-10-01', 'ID3', 'X', 'User3', 'Program 3', '380')
)
, month_M AS (
SELECT "Period","ID","Tag","Name","Program","Cost USD",
DATEADD(MONTH, -1, "Period" ) AS "Period M-1 ",
DATEADD(MONTH, -2, "Period" ) AS "Period M-2"
FROM rev_dash
), month_M1 AS (
SELECT "Period","ID","Tag","Name","Program","Cost USD"
FROM rev_dash
), month_M2 AS (
SELECT "Period","ID","Tag","Name","Program","Cost USD"
FROM rev_dash
)
SELECT M."Period",M."ID",M."Tag",M."Name",M."Program",M."Cost USD", M."Period M-1 ",M1."Cost USD" AS "Total Cost M-1",M."Period M-2",M2."Cost USD" AS "Total Cost M-2"
FROM month_M AS M
LEFT JOIN month_M1 AS M1
ON M."Period M-1 "=M1."Period" AND M."ID"=M1."ID" AND M."Tag"=M1."Tag"
AND M."Name"=M1."Name" AND M."Program"=M1."Program"
LEFT JOIN month_M2 AS M2
ON M."Period M-2"=M2."Period" AND M."ID"=M2."ID" AND M."Tag"=M2."Tag"
AND M."Name"=M2."Name" AND M."Program"=M2."Program"
where "Total Cost M-2" is not null;
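For comparison with a window function, the self-join matches on the actual calendar month rather than on row position. A minimal Python sketch of that lookup, using the ID1 figures from the question's wanted output, might look like the following; `months_back` and `join_previous_months` are hypothetical names, and periods are assumed to always be the first day of a month.

```python
from datetime import date


def months_back(period, count):
    """Return the first day of the month `count` months before `period`."""
    index = period.year * 12 + (period.month - 1) - count
    return date(index // 12, index % 12 + 1, 1)


def join_previous_months(costs):
    """costs maps (id, period) -> cost; returns rows with M-1 and M-2 costs."""
    rows = []
    for (row_id, period), cost in sorted(costs.items()):
        m1 = months_back(period, 1)
        m2 = months_back(period, 2)
        # dict.get returns None for a missing month, like an unmatched LEFT JOIN.
        rows.append(
            (row_id, period, cost, costs.get((row_id, m1)), costs.get((row_id, m2)))
        )
    return rows


COSTS = {
    ("ID1", date(2017, 4, 1)): 340,
    ("ID1", date(2017, 5, 1)): 372,
    ("ID1", date(2017, 6, 1)): 438,
}
JOINED = join_previous_months(COSTS)
```

Rows whose earlier months are absent come back with None in those positions, which corresponds to the rows that the final `is not null` filter removes.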

how to align lists of words using biopython pairwise2

When I run the script below, output is getting split into single chars. Any idea why? It looks like the second argument gets split into single chars.
I am trying to align the word sequences.
I will have many words, so I cannot simply map each word to a single letter.
from Bio.Seq import Seq
from Bio.pairwise2 import format_alignment
fruits = ["orange","pear", "apple","pear","orange"]
fruits1 = ["pear","apple"]
from Bio import pairwise2
alignments = pairwise2.align.localms(fruits,fruits1,2,-1,-0.5,-0.1, gap_char=["-"])
for a in alignments:
    print(format_alignment(*a))
Output:
['orange', 'r', 'a', 'e', 'p', 'e', 'l', 'p', 'p', 'a', 'pear', 'orange']
|||||||||
['-', 'r', 'a', 'e', 'p', 'e', 'l', 'p', 'p', 'a', '-', '-']
Score=4
You are passing a list to localms, which expects a string or a Seq object. Also, gap_char should be a string, not a list.
Try the following snippet:
import Bio.pairwise2 as pairwise2
fruits = ["orange", "pear", "apple", "pear", "orange"]
fruits1 = ["pear", "apple"]
for f0 in fruits:
    for f1 in fruits1:
        print('Aligning {0} and {1}'.format(f0, f1))
        alignments = pairwise2.align.localms(f0, f1, 2, -1, -0.5, -0.1, gap_char="-")
        for a in alignments:
            print(pairwise2.format_alignment(*a))
Output
Aligning orange and pear
orange
|
pear--
Score=2
Aligning orange and apple
orange
|
-apple
Score=2
orange-
|
--apple
Score=2
Aligning pear and pear
pear
||||
pear
Score=8
[...]
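If the goal is to line up whole words rather than the characters inside each word, one hedged alternative is the standard library's `difflib.SequenceMatcher`, which compares any sequences of hashable items, including lists of strings. This swaps in a different technique from pairwise2: it finds matching runs of words but does not apply the match, mismatch and gap scores used above.

```python
from difflib import SequenceMatcher

fruits = ["orange", "pear", "apple", "pear", "orange"]
fruits1 = ["pear", "apple"]

# Compare the two lists word by word; autojunk is disabled for predictability.
matcher = SequenceMatcher(None, fruits, fruits1, autojunk=False)

# Drop the zero-length sentinel block that get_matching_blocks() always appends.
blocks = [tuple(block) for block in matcher.get_matching_blocks() if block.size]

for start_a, start_b, size in blocks:
    print(fruits[start_a:start_a + size], "matches", fruits1[start_b:start_b + size])
```

Each block is a triple of start index in the first list, start index in the second list, and run length, so the words stay intact.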

Why does char array insert trailing characters when converting to string in Objective C?

I'm trying to write a quick category on NSString to base64 encode the string's contents. Everything seems okay, except for extra characters showing up on the trailing end of the generated string. Can anybody explain why the following code produces the output below?
Source:
const char base64CharSet[64] = {
'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H',
'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P',
'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X',
'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f',
'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
'o', 'p', 'q', 'r', 's', 't', 'u', 'v',
'w', 'x', 'y', 'z', '0', '1', '2', '3',
'4', '5', '6', '7', '8', '9', '+', '/'
};
const char *input = "Hello, World!";
int length = strlen(input);
int outlen = (length / 3) * 4;
int modlen = length % 3;
int rawlen = length - modlen;
if (modlen != 0)
outlen += 4;
char output[outlen];
char inbuf[3], outbuf[4];
int inpos = 0, outpos = 0;
for (outpos = 0, inpos = 0; inpos < rawlen; inpos += 3) {
for (int i = 0; i < 3; i++) {
int j = inpos + i;
inbuf[i] = j < length ? input[j] : 0;
}
outbuf[0] = (inbuf[0] & 0xFC) >> 2;
outbuf[1] = ((inbuf[0] & 0x03) << 4) | ((inbuf[1] & 0xF0) >> 4);
outbuf[2] = ((inbuf[1] & 0x0F) << 2) | ((inbuf[2] & 0xC0) >> 6);
outbuf[3] = (inbuf[2] & 0x3F);
output[outpos++] = base64CharSet[outbuf[0]];
output[outpos++] = base64CharSet[outbuf[1]];
output[outpos++] = base64CharSet[outbuf[2]];
output[outpos++] = base64CharSet[outbuf[3]];
}
if (modlen > 0) {
char modbuf[3] = {0, 0, 0};
for (int i = 0; i < modlen; i++) {
int j = rawlen + i;
modbuf[i] = input[j];
}
outbuf[0] = (modbuf[0] & 0xFC) >> 2;
outbuf[1] = ((modbuf[0] & 0x03) << 4) | ((modbuf[1] & 0xF0) >> 4);
outbuf[2] = ((modbuf[1] & 0x0F) << 2) | ((modbuf[2] & 0xC0) >> 6);
outbuf[3] = (modbuf[2] & 0x3F);
output[outpos++] = base64CharSet[outbuf[0]];
output[outpos++] = base64CharSet[outbuf[1]];
output[outpos++] = modlen == 2 ? base64CharSet[outbuf[2]] : '=';
output[outpos++] = '=';
}
NSLog(@"Input: '%s', Length: %zd", input, strlen(input));
NSLog(@"Output: '%s', Length: %zd, Expected Length: %d", output, strlen(output), outlen);
Output:
2013-03-19 14:46:51.568 Sandbox[19195:c07] Input: 'Hello, World!', Length: 13
2013-03-19 14:46:51.569 Sandbox[19195:c07] Output: 'SGVsbG8sIFdvcmxkIQ==wä]', Length: 23, Expected Length: 20
The goober on the end is because you didn't NULL terminate the output buffer. C strings require the character after the last character in the string to be 0 (all 0 bits, not ASCII "0" :).
... appending to a full array would raise an exception ...
Welcome to C! The language is akin to running with scissors. Even when you fall down, you might not get hurt. Might not.
In this case, you aren't actually writing the NULL byte and, thus, the printing of the C string is just reading whatever happens to be on the stack after your string array. I didn't audit the code to determine if the buffer is even of the right size.
Assuming all your math is correct, you could allocate the buffer to be one byte longer than needed for your encoding and drop the terminator there.
char output[outlen + 1];
output[outlen] = 0;
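The length arithmetic can be checked outside C. The Python sketch below repeats the question's `outlen` calculation, compares it with the standard `base64` module, and uses a `ctypes` buffer of `outlen + 1` bytes to show where the terminating zero sits. It is only an illustration of the sizing, not a port of the encoder.

```python
import base64
import ctypes

data = b"Hello, World!"

# Same arithmetic as the C code: 4 output chars per 3 input bytes, rounded up.
outlen = (len(data) // 3) * 4 + (4 if len(data) % 3 else 0)

encoded = base64.b64encode(data)

# One extra byte for the terminator; create_string_buffer zero-fills the buffer.
buffer = ctypes.create_string_buffer(outlen + 1)
buffer.value = encoded  # copies the bytes; the final byte stays zero
```

The valid indices of the buffer run from 0 to `outlen`, so the terminator belongs at index `outlen`, the last slot.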

Intersection of two strings/ sets

Coming from Python, I'm looking for something equivalent to this Python code (sets) in Delphi 5:
>>> x = set("Hello")
>>> x
set(['H', 'e', 'l', 'o'])
>>> y = set("Hallo")
>>> y
set(['a', 'H', 'l', 'o'])
>>> x.intersection(y)
set(['H', 'l', 'o'])
var
a, b, c: set of byte;
begin
a := [1, 2, 3, 4];
b := [3, 4, 5, 6];
c := a*b; // c is the intersection of a and b, i.e., c = [3, 4]
But beware:
var
a, b, c: set of integer;
will not even compile; instead, you get the 'Sets may have at most 256 elements' error. Please see the documentation for more information on Delphi sets.
Update
Sorry, forgot to mention the 'obvious' (from the point of view of a Delphi programmer):
var
a, b, c: set of char;
begin
a := ['A', 'B', 'C', 'D'];
b := ['C', 'D', 'E', 'F'];
c := a*b; // c is the intersection of a and b, i.e., c = ['C', 'D']
But your chars will all be byte chars -- that is, forget about Unicode (Delphi 5 doesn't support Unicode, so in this case this isn't really a restriction)!
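The 256-element limit makes more sense once a set is pictured as a bit field with one bit per possible element, which, as far as I know, is how Delphi stores them. The Python sketch below mimics that representation for the question's strings, with intersection as a bitwise AND; `char_set` and `members` are hypothetical helpers for illustration only.

```python
def char_set(text):
    """Pack the byte-sized characters of `text` into a 256-bit integer mask."""
    mask = 0
    for ch in text:
        code = ord(ch)
        if code > 255:
            raise ValueError("set elements must fit in one byte")
        mask |= 1 << code
    return mask


def members(mask):
    """Unpack a mask back into a list of characters in code order."""
    return [chr(code) for code in range(256) if (mask >> code) & 1]


# Intersection is a bitwise AND of the two masks, like a * b in Delphi.
common = members(char_set("Hello") & char_set("Hallo"))
```

Here `common` holds the characters that appear in both "Hello" and "Hallo", the same elements as the `x.intersection(y)` call in the question.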
