Merge two lines in a fixed length file - fixed-length-record

I have a fixed-length file, but a few records are randomly split across two lines (see the 3rd and 4th lines below). How can I join the two lines whose lengths do not match the standard record size?
Input:
10030114U65701698421090000
10030115U65715565405110000
10030116U658036
77717050000
10030117U65810298016060000
10030118U66105431422030000
Output needed:
10030114U65701698421090000
10030115U65715565405110000
10030116U65803677717050000
10030117U65810298016060000
10030118U66105431422030000
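One possible approach, sketched in Python: keep appending lines to a buffer until the buffer reaches the full record length, then emit it. This assumes every complete record is exactly 26 characters (the length of the intact sample lines) and that the pieces of a broken record sit on consecutive lines. The function name `merge_split_records` and its `record_length` parameter are made up for illustration.

```python
def merge_split_records(lines, record_length=26):
    """Rejoin records that were broken across consecutive lines.

    Assumption: every complete record is exactly `record_length` characters
    and the pieces of a broken record appear on consecutive lines, in order.
    """
    merged = []
    pending = ""  # pieces of a record that is still shorter than a full record
    for raw in lines:
        piece = raw.rstrip("\r\n")
        if not piece:
            continue  # skip blank lines
        pending += piece
        if len(pending) >= record_length:
            merged.append(pending)
            pending = ""
    if pending:
        # A leftover fragment at the end of the input is kept, not dropped.
        merged.append(pending)
    return merged


sample = [
    "10030114U65701698421090000",
    "10030115U65715565405110000",
    "10030116U658036",
    "77717050000",
    "10030117U65810298016060000",
    "10030118U66105431422030000",
]
for record in merge_split_records(sample):
    print(record)
```

For a real file, the same function can be fed an open file object, since iterating over it yields one line at a time. A record split into more than two pieces is also handled, because the buffer keeps growing until it reaches the record length.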

Related

How to specify a range in a Google Sheets formula, where the end row number is a reference to another cell?

I have several Google spreadsheets with different numbers of records (rows), let's say:
file 1: 200.000 records (rows)
file 2: 350.000 records (rows)
file 3: 246.000 records (rows)
etc.
I use a lot of formulas (20-30) that reference entire columns from file 1:
sumif(a$2:a$200000,">3")
countif(b$2:b$200000, "=n")
etc.
I want to reuse the already created formulas for the other files, but since the number of records there is different, I would have to replace the 200.000 with 350.000 for file 2 in 20-30 cells, with 246.000 for file 3 in 20-30 cells etc.
That would be too much work.
Is there a way to specify the end point of the range not with a constant but by pointing to a cell that contains the number of rows?
e.g.
I would add in cell z1 the number of rows: 200000
The other formulas would contain something like
sumif(a$2:a$ (something that tells sheets to use as row number the number from z1) )
This way I would need to only replace the number in z1, and all formulas would be updated correctly. Any ideas?
I tried using indirect:
="a"&indirect("z1")
where z1 contains 200000
This returns
a200000
But if I try using it in a range, it's not recognized as a range
=sum(a1:"a"&indirect("z1"))
Any ideas how to do that correctly?
why not just skip it... instead of:
=sumif(a$2:a$200000,">3")
use:
=sumif(a$2:a,">3")
to answer your question about INDIRECT, the correct syntax would be:
=sum(INDIRECT("a1:a"&z1))
You don't need to specify the last row number in this case.
Just use sumif(A$2:A,">3") and it will read the whole of column A starting from row 2.

Google Sheets: join text from multiple columns

On a new tab, for each row, I want to join the text of all the columns from my dataset tab that contain the word "WORD" in their 2nd row.
I cannot directly target the column letter, and the number and place of columns containing "WORD" will change over time.
I've tried HLOOKUP and QUERY, but I can't get there.
Example
dataset:
#    | Another header | Another header
xxxx | WORD           | WORD
1    | contentA       | contentC
2    | contentB       | contentD

new tab:
# | ALL WORD
1 | contentA ContentC
2 | contentB ContentD
use:
=FLATTEN(QUERY(TRANSPOSE(A1:B);;9^9))
or:
=INDEX(TRIM(FLATTEN(QUERY(TRANSPOSE(A1:B);;9^9))))
update:
=INDEX(TRIM(FLATTEN(QUERY(QUERY(TRANSPOSE(FILTER(
dataset!A2:99999; REGEXMATCH(dataset!1:1; "(?:)WORD")));;9^9)))))
use:
=ARRAYFORMULA(TRANSPOSE(TRIM(QUERY(TRANSPOSE(FILTER(dataset!3:100000,dataset!2:2="WORD")),,9^9))))
the use of the number 100000 is intentional: it should be more rows than you'd ever have.
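To make the logic of these formulas easier to follow, here is a rough Python sketch of the same idea: find the columns whose 2nd-row cell equals "WORD", then join the values of those columns row by row. The function name `join_marked_columns` and its parameters are hypothetical, and the sheet is assumed to be available as a list of rows.

```python
def join_marked_columns(rows, marker="WORD", marker_row=1, first_data_row=2):
    """Join, per data row, the cells of every column flagged by `marker`.

    Assumption: `rows` is a list of lists; `marker_row` and `first_data_row`
    are 0-based indexes (so 1 is the sheet's 2nd row).
    """
    # Column positions whose cell in the marker row equals the marker text.
    selected = [i for i, cell in enumerate(rows[marker_row]) if cell == marker]
    joined = []
    for row in rows[first_data_row:]:
        # Skip empty cells so no double spaces appear in the result.
        parts = [row[i] for i in selected if i < len(row) and row[i]]
        joined.append(" ".join(parts))
    return joined


dataset = [
    ["#", "Another header", "Another header"],
    ["xxxx", "WORD", "WORD"],
    ["1", "contentA", "contentC"],
    ["2", "contentB", "contentD"],
]
print(join_marked_columns(dataset))
```

Because the columns are located by their marker rather than by letter, the result keeps working when "WORD" columns are added, removed, or moved.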

Using ArrayFormula with a Dynamic Number of Column Header Names

My goal is to use ArrayFormula with the SPLIT() function, and name the headers of each column.
My problem is that the formula below only works when the number of headers declared exactly matches the number of elements split from the first row. For example, if 3 elements are split on the first row, the formula needs 3 named headers (g1, g2, g3), but if any row has more than 3 elements to split, it gives an error.
Is there a way to make the column header names dynamic in number, so that the number of elements to split can be, say, from 0-10? The elements to be split will always be separated by a comma and no spaces.
=ArrayFormula({"g1", "g2", "g3";if(A2:A="","",split(A2:A,","))})
link to example: https://docs.google.com/spreadsheets/d/1c2pskSYsGs12Yjbn-5gORQ22mDSaC9cSnp1nWeULlf4/edit?usp=sharing
You can try:
=index(iferror({"g"&sequence(1,max(len(substitute(
transpose(query(transpose(if(iferror(split(A2:A,","))="",,"z")),,9^9)),
" ",))));split(A2:A,",")}))
If we can use the Orders column, it's as simple as:
=index(iferror({"g"&sequence(1,max(B:B));split(A2:A,",")}))
You can achieve it by combining the index function, the sequence function and the max function. Here is the thought process behind it:
The max function will retrieve the maximum value of the orders column.
The sequence function will generate a series starting at 1 and ending at that maximum value.
The index function will distribute the elements of the sequence (with a "g" in front) across as many cells as there are elements in the sequence.
If you combine those, you get:
=INDEX("g"&SEQUENCE(1,MAX(B:B)))

How can I generate a three column list of unique "combos"?

I have three columns of information. For example: color, model, year.
Can I use the UNIQUE function to generate each unique combination of color, model, and year in three new columns, with one value per column?
ex.
color model year
red sedan 2016
red sedan 2020
black truck 2018
Thanks!
Suppose your three headers are in A1, B1 and C1 with your data running A2:C. And suppose you want the unique combinations in E:G. First, be sure that the entire range E:G is empty. Then place the following formula in E1:
=ArrayFormula({A1:C1;SPLIT(FLATTEN(UNIQUE(FILTER(A2:A,A2:A<>""))&"|"&TRANSPOSE(FLATTEN(UNIQUE(FILTER(B2:B,B2:B<>""))&"|"&TRANSPOSE(UNIQUE(FILTER(C2:C,C2:C<>"")))))),"|")})
The formula first reproduces the headers from A1:C1.
The combinations are formed by first concatenating each UNIQUE model (from a list that is FILTERed to remove blanks) with each UNIQUE year (from a list that is also FILTERed to remove blanks), with a pipe symbol between each as a separator that SPLIT will later use.
That grid of combinations is FLATTENed into a single column and then concatenated once more with a UNIQUE and FILTERed list of the colors leading off, and again with a pipe symbol as a separator. Once more, the entire grid of results is FLATTENed into a single column.
Finally, SPLIT acts on the pipe symbols to separate the three pieces into their own columns under the headers.
try:
=INDEX({A1:C1; UNIQUE(QUERY(SPLIT(FLATTEN(FLATTEN(A2:A&"×"&
TRANSPOSE(B2:B))&"×"&TRANSPOSE(C2:C)), "×"),
"where Col3 is not null"))})
the task is simple: take column A and combine it with transposed column B. flatten the output into one single column, combine it with transposed column C, and again flatten it into one single column. then split it and query out all combinations that have fewer than 3 columns. next, run it through unique to remove duplicates.
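Both formulas pair every distinct color with every distinct model and every distinct year, which is a cross product. A small Python sketch of that step follows, together with a second helper for the other possible reading of the question, where only the combinations that actually occur in the data are wanted. All function names here are hypothetical.

```python
from itertools import product


def unique_non_blank(values):
    """Distinct non-empty values, in order of first appearance."""
    return list(dict.fromkeys(v for v in values if v != ""))


def all_combinations(colors, models, years):
    """Every color x model x year pairing (what the formulas above build)."""
    return list(product(unique_non_blank(colors),
                        unique_non_blank(models),
                        unique_non_blank(years)))


def observed_combinations(rows):
    """Only the distinct (color, model, year) rows present in the data."""
    return list(dict.fromkeys(tuple(row) for row in rows))


rows = [
    ("red", "sedan", "2016"),
    ("red", "sedan", "2020"),
    ("black", "truck", "2018"),
]
colors, models, years = zip(*rows)
print(len(all_combinations(colors, models, years)))  # 2 colors * 2 models * 3 years
print(observed_combinations(rows))
```

With the sample data the cross product has 12 rows, while only 3 combinations actually occur, so it is worth deciding which of the two results is needed.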

TextPad Replace Character and Line Feed with Nothing

How do I replace a line in TextPad that contains only ' with nothing (i.e., delete lines with just that one character)?
I have an Excel Spreadsheet containing three columns:
Column A - single quote
Column B - some number
Column C - single quote plus a comma
There are over 90,000 rows on this spreadsheet with data in column B. There are over one million rows with just a single quote in column A because I did a "Ctrl+D" on that column to copy the value in that column (a single quote) down to all rows.
When I copy and paste these three columns into TextPad, I end up with over one million lines. I replaced the tabs with nothing using the F8/Replace dialog.
(Replace: tab with: empty string)
The majority of what is left are lines that contain only a single quote. I want to delete these 900,000 extra lines.
How do I specify a Replace (delete) of single quote + line feed? I do not want to delete any of the single quotes from the lines that include a number that came from column B.
I just figured it out. The backslash n is the line feed.
If I check Regular Expression and enter this Find what:
'\n
(Keeping empty string for Replace with) and Replace All, I have deleted those extra lines.
I also experienced the same...it did not work for me until I did this:
uncheck the regular expression first before entering \n in the find box and replacing with whatever you chose to (in my case, it was ',').
Your result might be an entire list becoming transposed (that's what happened to my data).
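For anyone who prefers to do the cleanup outside TextPad, the same replacement can be sketched in Python. Unlike the `'\n` pattern above, this version anchors the match to the start of a line (an added assumption, so a quote that merely ends a data line is never touched) and also accepts Windows line endings. The function name `drop_quote_only_lines` is made up.

```python
import re

# A line consisting solely of a single quote, followed by its line ending.
# MULTILINE makes ^ match at the start of every line, not just the text.
QUOTE_ONLY_LINE = re.compile(r"^'\r?\n", flags=re.MULTILINE)


def drop_quote_only_lines(text):
    """Remove every line that contains nothing but a single quote."""
    return QUOTE_ONLY_LINE.sub("", text)


pasted = "'123',\n'\n'456',\n'\n'\n"
print(drop_quote_only_lines(pasted), end="")
```

A final quote-only line with no trailing line break is left alone by this pattern, so it may need to be removed by hand.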
