Find Unique in google sheets - first n characters - google-sheets

I have a google sheet with a column (A) of urls.
Xttps://tXco/008wnbebbw
Xttps://tXco/00lR1FNKBo
Xttps://tXco/00lR1Fw9cO
Xttps://tXco/00UwZwgh2h
Xttps://tXco/00UwZwxSqR
Xttps://tXco/00UwZwxSqR
Xttps://tXco/044TcIFl72
In column B I need to find all unique urls up to the 18th character. For instance column B should show:
Xttps://tXco/008wnbebbw
Xttps://tXco/00lR1FNKBo
Xttps://tXco/00UwZwgh2h
Xttps://tXco/044TcIFl72
I have this formula which I was trying to adapt for it (not sure if it helps at all). I was trying to adapt this to use with =UNIQUE( ?
=SUMPRODUCT(--(LEFT($A$1:$A$15,18)=LEFT(A1,18)))>1
If it helps to take a look at the sheet, here it is: https://docs.google.com/spreadsheets/d/1tG7TpHNvNY86PRiePsKyKfxnuEZah6T7ZDL7dXOIcEA/edit?usp=sharing
Thanks in advance!

You may try this formula:
=ArrayFormula(vlookup(
UNIQUE(FILTER(LEFT(A2:A,17),A2:A<>"")),
FILTER({LEFT(A2:A,17),A2:A},A2:A<>""),
2,0))
How it works
it will first find unique left N chars:
UNIQUE(FILTER(LEFT(A2:A,17),A2:A<>"")
then get left N chars and original strings:
FILTER({LEFT(A2:A,17),A2:A},A2:A<>"")
and then use vlookup to get top first entry for uniques.

Try this instead without the extra column. Put it in B1:
=unique(arrayformula(if(left(A1:A,18)=left(A1:A,18),A1:A,"")))

Try this: =unique(arrayformula(left(A1:A,18)))

Related

Array formula is not working with the combination of Vlookup and Indirect-Match?

I am trying to generate an arrayformula for the whole column but it does not work, when I use the same formula cell by cell, it generates correct value. Here is the formula for F1 cell:
=if(Iferror(vlookup($D1, INDIRECT("$"&"A"&MATCH(E1,$B:$B,0)+1&":$B"),2,false),"")=E1,"",Iferror(vlookup($D1, INDIRECT("$"&"A"&MATCH(E1,$B:$B,0)+1&":$B"),2,false),""))
I am trying to convert it into ArrayFormula like this and put it in F1:
=ARRAYFORMULA(if(Iferror(vlookup($D:$D, INDIRECT("$"&"A"&MATCH($E:$E,$B:$B,0)+1&":$B"),2,false),"")=$E:$E,"",Iferror(vlookup($D:$D, INDIRECT("$"&"A"&MATCH($E:$E,$B:$B,0)+1&":$B"),2,false),"")))
But this does not work and returns empty column, here is the sheet if you want to test it, you can find formula in column F:
https://docs.google.com/spreadsheets/d/13XLZvvdzK_mqr4Ous50cIEfernw2XrPJWvVgt1hFxtk/edit?usp=sharing
Please share your knowledge how to go about it as I am trying to learn formulas starting from basic level? Thank you.
when working with more than one condition, you can usually use FILTER for it. In this case, I used MAP to refer to both columns D and E, and used FILTER to narrow to the matches in A column, and then to those values that were further than the row of the MATCH of E value. Then, with INDEX I chose the first value of that resulting range.
=MAP(D:D,E:E,LAMBDA(d,e,IFERROR(IF(d="","",INDEX(FILTER(B:B,A:A=d,ROW(B:B)>MATCH(e,B:B,0)),1)))))
Unclear of the full scope but from your expected result scenario(in Column F) you seem to be pulling off second match(if any). well in that case try:
=BYROW(D:D,LAMBDA(dx,IF(dx="",,IFNA(FILTER(ARRAY_CONSTRAIN(FILTER(B:B,A:A=dx),2,1),{0;1})))))

Splitting Text and Formatting it - Google Sheets

I have a log file with three fields, the first two are userIDs and the third is an item description. Can I use a single formula to Split the text, format the first 2 items, and leave the 3rd alone?
Sample Data:
<#!9812391203019>; <#!98120319283> ; Short Description
Formula Currently used:
=ARRAYFORMULA(REGEXEXTRACT(TRIM(SPLIT(A3,";")),"[0-9]+"))
This works to extract the userID from the unneeded in the first two columns but will only give me numbers for the third.
Previous Formula was within the REGEXEXTRACT but it left the unneeded "<#!" and ">" in the columns which I wanted to remove.
Any help and suggestions would be appreciated! Thanks in advance!
Does this work?
=Split(RegexReplace(A1,"[<#!>]",),"; ",0)
try:
=INDEX(IFERROR(REGEXREPLACE(TRIM(SPLIT(A1:A5, ";")), "[#!<>]", )))

i give up what the formula for B2:B7 any one knows?

enter image description here
Hi im stuck more than 6 hours finding formula for this simple table, what the best formula for B2:B7 ? already use the filter-search, using vlookup also not work :
=filter(E:E,SEARCH(E2,A:A))
=VLOOKUP(A2;$D$2:$E$7;2;FALSE)
Please use the data on column D to show on column B
Try
=index(RegexExtract(A2:A7,".*\s(.*)"))
Or this
=index(RegexExtract(A2:A7,"(?i)\b"&join("\b|\b",D2:D4)&"\b"))
The first formula extracts the last word, the second one extracts the first word in A2:A7 that matches any of the words in D2:D4.

Q: Transpose -> Merge(?) on google sheets

Trying to transpose data such that rows transpose into a single column stacking on top of each other.
=ARRAYFORMULA({TRANSPOSE(A1:C1);TRANSPOSE(A2:C2);TRANSPOSE(A3:C3)})
This formula essentially does what I want but what if I have many more rows? Would I need to enter; TRANSPOSE(Col(x):Col(y)) for every single row?
Any help is appreciated.
Please try:
=TRANSPOSE(SPLIT(TEXTJOIN(",",1,A:C),","))
Notes:
textjoin will join text and skip blanks. Add spaces in column C to have an empty row.
limit of join function is 50000 characters
Max Makhrov's answer is good, but indeed subject to the 50k limit. To get around that, I have recently found another method which is explained in my
interlacing answer to another question
In your case this would look something like this (up to arbitrary 9 rows):
=query(
sort(
{arrayformula({row(A1:A9)*3, A1:A9});
arrayformula({row(B1:B9)*3+1, B1:B9});
arrayformula({row(C1:C9)*3+2, C1:C9})}
),
"select Col2")
Am I missing something, or why does nobody suggest Flatten?
FLATTEN(A1:C3)
And you can use Filter as usual to filter out blank cells, e.g.
=FILTER(FLATTEN(A1:C3);FLATTEN(A1:C3)<>"")

Countif with len in Google Spreadsheet

I have a column XXX like this :
XXX
A
Aruin
Avolyn
B
Batracia
Buna
...
I would like to count a cell only if the string in the cell has a length > 1.
How to do that?
I'm trying :
COUNTIF(XXX1:XXX30, LEN(...) > 1)
But what should I write instead of ... ?
Thank you in advance.
For ranges that contain strings, I have used a formula like below, which counts any value that starts with one character (the ?) followed by 0 or more characters (the *). I haven't tested on ranges that contain numbers.
=COUNTIF(range,"=?*")
To do this in one cell, without needing to create a separate column or use arrayformula{}, you can use sumproduct.
=SUMPRODUCT(LEN(XXX1:XXX30)>1)
If you have an array of True/False values then you can use -- to force them to be converted to numeric values like this:
=SUMPRODUCT(--(LEN(XXX1:XXX30)>1))
Credit to #greg who posted this in the comments - I think it is arguably the best answer and should be displayed as such. Sumproduct is a powerful function that can often to be used to get around shortcomings in countif type formulae.
Create another list using an =ARRAYFORMULA(len(XXX1:XXX30)>1) and then do a COUNTIF based on that new list: =countif(XXY1:XXY30,true()).
A simple formula that works for my needs is =ROWS(FILTER(range,LEN(range)>X))
The Google Sheets criteria syntax seems inconsistent, because the expression that works fine with FILTER() gives an erroneous zero result with COUNTIF().
Here's a demo worksheet
Another approach is to use the QUERY function.
This way you can write a simple SQL like statement to achieve this.
For example:
=QUERY(XXX1:XXX30,"SELECT COUNT(X) WHERE X MATCHES '.{1,}'")
To explain the MATCHES criteria:
It is a regex that matches every cell that contains 1 or more characters.
The . operator matches any character.
The {1,} qualifies that you only want to match cells that have at 1 or more characters in them.
Here is a link to another SO question that describes this method.

Resources