How can I concatenate two char columns within a perform screen?
Example:
table sample
  col1 char(1)
  col2 char(1)
after edit/add of sample
let label_3 = sample.col1 + sample.col2
This didn't work. I even tried using subscripts for the two columns, but no dice!
There isn't a simple way to do it. Your closest approach would be a custom C function to do the concatenation:
LET label_3 = CONCATENATE(sample.col1, sample.col2)
That, of course, relies on you having a custom Perform runner with a concatenate function added to it.
Perform pre-dates the addition of the '||' string concatenation operator into SQL and does not support it.
The alternative is to use an Informix 4GL (I4GL) program instead. You can do a lot of things in I4GL that you cannot do in ISQL - at the cost of writing the code.
When I perform this function, it works:
=Query(CRM!1:1085,"Select B where D contains 'FirstName LastName' ",4)
When I perform this one, where B31 is where the text "FirstName LastName" is located in the sheet, the output is only ONE of the many results:
=Query(CRM!1:1085,"Select B where D contains '&B31&' ",4)
I want to be able to use the cell rather than write the quoted text in the formula. This is because I need to quickly replicate the formula for many different values. Also because it needs to be automated in case of changes in the source.
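B31 is interpreted as a literal string when you write just '&B31&', because the reference is still inside the double-quoted query text. You need to close the double quotes before the reference and reopen them after it, keeping the single quotes around the value, and concatenate: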
'"&B31&"'
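The quoting can be checked outside Sheets. Here is a minimal Python sketch of the same string assembly, assuming a hypothetical variable b31 that stands in for the contents of cell B31. It only shows what text QUERY would receive in each case and does not run a query.

```python
# Hypothetical stand-in for the contents of cell B31.
b31 = "FirstName LastName"

# The original formula: the cell reference never leaves the quoted string,
# so QUERY receives the characters &B31& as part of the text to look for.
literal_query = "Select B where D contains '&B31&' "

# The '"&B31&"' version: close the string, splice in the cell value,
# then reopen the string for the closing single quote.
built_query = "Select B where D contains '" + b31 + "' "

print(literal_query)
print(built_query)
```

The second line printed is the same text as the hard-coded query that worked.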
I need to use QUERY in a Google Sheets spreadsheet to return text in column B of a given length, along with various other conditions. To make it easy, below is a simplified version that concentrates solely on the LEN() function. It seems simple enough, but unfortunately it does not work.
=QUERY(Valuations!B1:B,"select B where LEN(B)>3 ")
I'm aware that SQL uses LEN(), whereas MySQL uses LENGTH().
Is my syntax incorrect, or is it not possible to check string length in a Google Sheets QUERY?
You can do it using a filter
=filter(B:B,len(B:B)>=3)
And then if you want to combine that with other conditions, you can put it in a query e.g.
=query(filter(A:B,len(B:B)>=3),"select Col1,Col2 where Col1>1")
A regular expression can be used:
=QUERY(Valuations!B1:B, "select B where B matches '.{3,}'")
The regular expression explained:
. matches any character
{3,} matches the preceding symbol (the .) 3 or more times
You could also search for a specific length by modifying the expression to ^.{3}$
OR a range ^.{3,10}$
OR a maximum ^.{0,10}$
^ is the start of the string
$ is the end of the string
regex101.com is a valuable resource for regular expressions.
I am not associated with the site in any way but I use it all the time.
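To see what these length patterns select, here is a small Python sketch. It uses Python's re module as a stand-in for the Sheets regex engine, and re.fullmatch to imitate whole-value matching. Both are assumptions for illustration, and the sample strings are made up.

```python
import re

# Made-up sample values with lengths 2, 3, 10 and 11.
samples = ["ab", "abc", "abcdefghij", "abcdefghijk"]

# Length patterns from the answer; fullmatch anchors both ends,
# so the explicit ^ and $ are not needed here.
patterns = {
    "3 or more": r".{3,}",
    "exactly 3": r".{3}",
    "3 to 10": r".{3,10}",
    "at most 10": r".{0,10}",
}

def matching(pattern, values):
    """Return the values that the pattern matches in full."""
    return [v for v in values if re.fullmatch(pattern, v)]

results = {label: matching(p, samples) for label, p in patterns.items()}
for label, hits in results.items():
    print(label, hits)
```

Writing the lower bound explicitly, as in {0,10}, keeps the pattern portable across regex engines.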
I have a column XXX like this :
XXX
A
Aruin
Avolyn
B
Batracia
Buna
...
I would like to count a cell only if the string in the cell has a length > 1.
How to do that?
I'm trying :
COUNTIF(XXX1:XXX30, LEN(...) > 1)
But what should I write instead of ... ?
Thank you in advance.
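For ranges that contain strings, I have used a formula like the one below, which counts any value that starts with one character (the ?) followed by 0 or more characters (the *). I haven't tested it on ranges that contain numbers.
=COUNTIF(range,"=?*")
As written, this counts strings of length 1 or more. To require a length greater than 1, use two question marks: "=??*".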
To do this in one cell, without needing to create a separate column or use arrayformula{}, you can use sumproduct.
=SUMPRODUCT(LEN(XXX1:XXX30)>1)
If you have an array of True/False values then you can use -- to force them to be converted to numeric values like this:
=SUMPRODUCT(--(LEN(XXX1:XXX30)>1))
Credit to #greg who posted this in the comments - I think it is arguably the best answer and should be displayed as such. SUMPRODUCT is a powerful function that can often be used to get around shortcomings in COUNTIF-type formulae.
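The idea behind the SUMPRODUCT trick (build a list of True/False flags, convert them to 1/0, add them up) can be sketched in Python. The cell values below are the visible part of the sample column from the question. The variable names are made up for illustration.

```python
# Visible sample values from the question's column.
cells = ["A", "Aruin", "Avolyn", "B", "Batracia", "Buna"]

# One True/False flag per cell, like LEN(range)>1 evaluated over the range.
flags = [len(c) > 1 for c in cells]

# Convert each flag to 1 or 0 and sum, like SUMPRODUCT(--(...)).
count = sum(int(f) for f in flags)

print(flags)
print(count)
```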
Create another list using an =ARRAYFORMULA(len(XXX1:XXX30)>1) and then do a COUNTIF based on that new list: =countif(XXY1:XXY30,true()).
A simple formula that works for my needs is =ROWS(FILTER(range,LEN(range)>X))
The Google Sheets criteria syntax seems inconsistent, because the expression that works fine with FILTER() gives an erroneous zero result with COUNTIF().
Another approach is to use the QUERY function.
This way you can write a simple SQL-like statement to achieve this.
For example:
=QUERY(XXX1:XXX30,"SELECT COUNT(X) WHERE X MATCHES '.{1,}'")
To explain the MATCHES criteria:
It is a regex that matches every cell that contains 1 or more characters.
The . operator matches any character.
The {1,} quantifier specifies that you only want to match cells that have 1 or more characters in them. To count only cells longer than one character, as asked, use {2,} instead.
I am analyzing an electronic survey I made using Google Forms and I have the following problem.
One of the questions can take multiple answers in the form of Checkboxes as shown in the picture below. The question is in Greek so I have added some Choice1, Choice2, Choice3 etc next to each answer in order to facilitate my question.
In my data, when someone chooses, let's say, Choice1 and Choice2,
I will have an answer which is the concatenation of the strings they checked, separated by commas.
In this case it would be:
Choice1, Choice2
If someone else checked Choice1, Choice2 and Choice4
his answer in my data would be:
Choice1, Choice2, Choice4
The problem is that SPSS has no way of separating the substrings (separated by commas) and understanding which Choices each case has in common. Or maybe there is a way but I don't know it :)
When I, for example, do a simple frequency analysis for this question it produces a table that perceives
Choice1, Choice2
as a completely different case from
Choice1, Choice2, Choice4
Ideally I would like to somehow tell SPSS to count the frequency of each unique Choice (Choice1, Choice2, Choice3 etc etc) rather than each unique combination of those Choices.
Is that possible? And if it is can you point me to the documentation I need to study to make it happen?
Thx a lot!
Imagine you are working with the following data, which is a CSV file you have downloaded from your online form. Copy and paste the text below and save it to a text file named "CourseInterestSurvey.CSV".
Timestamp,Which courses are you interested in?,What software do you use?
12/28/2012 11:57:56,"Research Methods, Data Visualization","Gnumeric, SPSS, R"
12/28/2012 11:58:09,Data Visualization,"SPSS, Stata, R"
12/28/2012 11:59:09,"Research Dissemination, Graphic Design",Adobe InDesign
12/28/2012 11:59:27,"Data Analysis, Data Visualization, Graphic Design","Excel, OpenOffice.org/Libre Office, Stata"
12/28/2012 11:59:44,Data Visualization,"R, Adobe Illustrator"
Read it into SPSS using the following syntax:
GET DATA
/TYPE=TXT
/FILE="path\to\CourseInterestSurvey.CSV"
/DELCASE=LINE
/DELIMITERS=","
/QUALIFIER='"'
/ARRANGEMENT=DELIMITED
/FIRSTCASE=2
/IMPORTCASE=ALL
/VARIABLES=
Timestamp A19
CourseInterest A49
Software A41.
CACHE.
EXECUTE.
DATASET NAME DataSet2 WINDOW=FRONT.
LIST.
It currently looks like the image below--three columns (one timestamp, and two with the data we want):
Working with some syntax from here, we can split the cells up as follows:
* We know the string does not exceed 50 characters.
* We got that information while we were reading our data in.
STRING #temp(a50).
* We're going to work on the "CourseInterest" variable.
COMPUTE #temp=CourseInterest.
* We're going to create 3 new variables with the prefix "CourseInterest".
* You should modify this according to the actual number of options your data has
* and the maximum length of one of the strings in your data.
VECTOR CourseInterest(3, a25).
* Here's where the actual variable creation takes place.
LOOP #i = 1 TO 3.
. COMPUTE #index=index(#temp,",").
. DO IF #index GT 0.
. COMPUTE CourseInterest(#i)=LTRIM(substr(#temp,1, #index-1)).
. COMPUTE #temp=substr(#temp, #index+1).
. ELSE.
. COMPUTE CourseInterest(#i)=LTRIM(#temp).
. COMPUTE #temp=''.
. END IF.
END LOOP IF #index EQ 0.
LIST.
The result:
This only addresses one column at a time, and I'm not familiar enough to modify it to work over multiple columns. However, if you were to switch over to R, I already have some readymade functions to help deal with exactly these kinds of situations.
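For readers who want to follow the loop logic outside SPSS, here is a rough Python sketch of the same idea: cut the string at each comma, trim leading blanks, and fill a fixed number of fields. The function name split_fixed and its parameters are made up for this illustration. It is not a translation of any existing R or SPSS helper.

```python
def split_fixed(value, n_fields=3, sep=","):
    """Split value on sep into exactly n_fields strings.

    Leading blanks are trimmed (like LTRIM), extra items beyond
    n_fields are dropped, and missing items are left empty.
    """
    parts = [p.lstrip() for p in value.split(sep)]
    parts = parts[:n_fields]
    return parts + [""] * (n_fields - len(parts))

# Two rows from the CourseInterest column of the sample CSV.
print(split_fixed("Data Analysis, Data Visualization, Graphic Design"))
print(split_fixed("Data Visualization"))
```

Because the column is just a parameter here, the same function could be applied to the Software column as well.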
Unfortunately there is no easy "built-in" way to achieve this, but it is certainly achievable with spreadsheet formulae, or Google Apps Script.
Using formulae, assuming your check box question lands in column D, this will produce a "normalised" list:
=ArrayFormula(TRANSPOSE(SPLIT(CONCATENATE(D2:D&",");",")))
and you can turn that into a two-column list and QUERY it to return a table of frequencies:
=ArrayFormula(QUERY(TRANSPOSE(SPLIT(CONCATENATE(D2:D&",");","))&{"",""};"select Col1, count(Col2) group by Col1 label Col1 'Item', count(Col2) 'Frequency'";0))
If your locale uses a comma as a decimal separator, replace {"",""} with {""\""}.
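The normalise-then-count idea behind these formulae can also be sketched in Python, in case it is easier to pre-process the export before loading it into a statistics package. This is a minimal sketch. The function name choice_frequencies is made up, and the answers list reuses the course-interest column from the sample CSV shown earlier in this thread.

```python
from collections import Counter

# Course-interest answers from the sample CSV, one string per respondent.
answers = [
    "Research Methods, Data Visualization",
    "Data Visualization",
    "Research Dissemination, Graphic Design",
    "Data Analysis, Data Visualization, Graphic Design",
    "Data Visualization",
]

def choice_frequencies(rows, sep=","):
    """Count how often each individual choice occurs across all rows."""
    counts = Counter()
    for row in rows:
        for item in row.split(sep):
            item = item.strip()
            if item:  # skip empty pieces from blank answers
                counts[item] += 1
    return counts

freq = choice_frequencies(answers)
for item, n in freq.most_common():
    print(item, n)
```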
It is easy to split the fields into separate variables as described above. Now define these variables as a multiple response set (Analyze > Tables > Multiple Response Sets), and you can analyze these with the CTABLES or MULT RESPONSE procedures and graph them using the Chart Builder.
I have a column in OpenOffice like this:
abc-23
abc-32
abc-1
Now I need to get the sum of only the numbers 23, 32 and 1, using a formula and regular expressions in Calc.
How do I do that?
I tried
=SUMIF(F7:F16,"([:digit:].)$")
But somehow this does not work.
Starting with LibreOffice 6.4, you can use the newly added REGEX function to generically extract all numbers from a cell / text using a regular expression:
=REGEX(A1;"[^[:digit:]]";"";"g")
Replace A1 with the cell-reference you want to extract numbers from.
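To illustrate what this replacement does to the sample column, and how the results can then be summed, here is a Python sketch. It uses Python's re.sub as a stand-in for the REGEX function. The helper name digits_only is made up, and the character class [^0-9] is assumed to be equivalent to [^[:digit:]] for this plain ASCII data.

```python
import re

# Sample column from the question.
cells = ["abc-23", "abc-32", "abc-1"]

def digits_only(text):
    """Remove every character that is not a decimal digit."""
    return re.sub(r"[^0-9]", "", text)

# Convert the remaining digits to numbers and add them up;
# cells without any digits are skipped.
total = sum(int(digits_only(c)) for c in cells if digits_only(c))

print([digits_only(c) for c in cells])
print(total)
```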
Explanation of REGEX function arguments:
Arguments are separated by a semicolon ;
A1: Value to extract numbers from. Can be a cell reference (like A1) or a quoted text value (like "123abc"). The following regular expression will be applied to this cell / text.
"[^[:digit:]]": Match every character which is not a decimal digit. See also list of regular expressions in LibreOffice
  The outer square brackets [] encapsulate the list of characters to search for
  ^ adds a NOT, meaning that every character not included in the search list is matched
  [:digit:] represents any decimal digit
"": replace matching characters (every non-digit) with nothing = remove them
"g": replace all matches (don't stop after the first non-digit character)
Unfortunately Libre-Office only supports regex in find/replace and in search.
If this is a once-only deal, I would copy column A to column B, then use [data] [text to columns] on B with - as the separator, leaving you with all the text in column B and the numbers in column C.
Alternatively, you could use =VALUE(RIGHT(A1;LEN(A1)-FIND("-";A1))) in column B, then sum column B.
I think that this is not exactly what you want, but maybe it can help you or others.
It is all about substrings (the MID function in Calc):
First: choose your cell (for example, one containing "abc-23").
Second: enter the start position (for "british", start position 4 gives "tish").
After that: to return all the remaining text, pass the LEN function (length) of your cell as the length argument.
The formula now looks like this:
D15="abc-23"
=MID(D15; 5; LEN(D15))
And the output is: 23
When you edit the numbers (23 in this example), there is no problem. However, if you change anything before them (the text "abc-"), the formula breaks, because the start position is hard-coded as 5.
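One way around the fixed start position is to look up where the separator sits instead of counting characters by hand. Here is a Python sketch of that idea. The function name after_separator is made up, and it assumes every value contains a single "-" before the number.

```python
def after_separator(text, sep="-"):
    """Return everything after the first sep, or "" if sep is absent."""
    pos = text.find(sep)
    if pos == -1:
        return ""
    return text[pos + len(sep):]

# The prefix can now change length without breaking the extraction.
print(after_separator("abc-23"))
print(after_separator("abcdef-7"))

# Summing the sample column from the question.
cells = ["abc-23", "abc-32", "abc-1"]
total = sum(int(after_separator(c)) for c in cells)
print(total)
```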
Paste the string in a cell and open the search and replace dialog (ctrl + f). Under the extended search options, mark "regular expression", then search for ([\s,0-9])([^0-9\s])+ and replace it with $1
Adjust the regex to your needs.
I didn't figure out how to do this in OpenOffice/LibreOffice directly. After frustrations in searching online and trying various formulas, I realised my sheet was a simple CSV format, so I opened it up in vim and used vim's built-in sed-like feature to find/replace the text in vim command mode:
:%s/abc-//g
This only worked for me because there were no other columns with this matching text. If there are other columns with the same text, then the solution would be a bit more complex.
If your sheet is not a CSV, you could copy the column out to a text file and use vim to find/replace, and then paste the data back into the spreadsheet. For me, this was a lot less frustrating than trying to figure this out in LibreOffice...
I won't bother with a solution without knowing if there really is interest, but, you could write a macro to do this. Extract all the numbers and then implement the sum by checking for contained numbers in the text.