I am working on a CRM and need to find a way to identify people who fit a variety of different words.
For example.
Column A is Dessert and options may be Chocolate, Vanilla, Strawberry or None.
=REGEXMATCH(F2, "chocolate")
Is what I have, but I cannot figure out how to get it to show true for two options
I thought it would be
=REGEXMATCH(F2, "chocolate"|"strawberry")
but that doesn't appear to be it. I am sure I am just not seeing the forest for the trees. Any guidance?
try:
=REGEXMATCH(F2, "chocolate|strawberry")
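The fix works because the `|` alternation belongs inside a single pattern string rather than between two quoted strings. As a rough illustration of the same regex behaviour outside Sheets, here is a minimal Python sketch; the sample values are hypothetical stand-ins for column F.

```python
import re

# Hypothetical sample values standing in for the cells in column F.
desserts = ["chocolate", "vanilla", "strawberry", "none"]

# The alternation lives inside ONE pattern string: "chocolate|strawberry".
pattern = re.compile(r"chocolate|strawberry")

# re.search looks for the pattern anywhere in the text, much like REGEXMATCH.
matches = [bool(pattern.search(value)) for value in desserts]
print(matches)
```

Matching is case-sensitive both here and in REGEXMATCH, so a cell containing "Chocolate" would not match the lowercase pattern unless the text is lowercased first, for example with LOWER(F2).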
Related
I have a question that I don't really know how to ask. For context, I am making a sheet to organize a large collection of cards in various different locations. Then I have other lists that I want to "gather all the required items" for from those various locations. However, it's possible to have multiple instances of items in the various locations. (Eg, location four could have the same card as location one, but I only need to know about location one). So I have conditional formatting set up to color code each thing I need based on which location it's in (with a preferred order since certain locations are "better" to pull from). Sheet for reference. Check the "Deck Compare" tab
Image of the "part in question"
On to what I want to achieve. This conditional formatting is nice, but I'd really like for the lists to be grouped by which collection the item is in! Ideally, in the same order as the key. I know that "sort by color" is a thing, but since the lists are dynamic and based on a QUERY, sorting doesn't really work.
Does anyone have any ideas? I'm still fairly new to spreadsheeting and still learning how to go about solving these types of problems. If my question is unclear, please let me know!
I admit this is a strange request. Essentially, another person who speaks Mandarin and I need to work on scheduling asynchronously through a spreadsheet. If either of us enters something in our respective section, it should update the other person's section to match. So if I changed Order 1 on Day 1 from Apple to Butter, it should look up the translated text for Butter in Chinese and update the other person's dropdown list entry for Order 1 on Day 1 from Apple to Butter.
Unfortunately, it doesn't seem like there's any way to add formulas to dropdown lists. Any advice here?
I created a super simplified spreadsheet of what I'm looking for: Spreadsheet
There is a GOOGLETRANSLATE formula, and there is also DETECTLANGUAGE, which outputs the language code.
Neither of them is supported under ARRAYFORMULA (DETECTLANGUAGE is able to work with vertical arrays only), so you will need to drag them down or across. It's also worth mentioning that formulas are always one-directional: you can have a dropdown that gets translated, but that translated output can't be used directly as the input for a back-translation, because that would create a circular dependency. With a scripted solution, you may have more flexibility, though.
I have a table in the link below:
https://docs.google.com/spreadsheets/d/1EOALaBVzHijUP_8dM1Sr7KTutdTah8b9Q0xDRoNHBLo/edit#gid=0
What do you do if the text is split first and then checked? For example, "Kebumen District Office" vs. "District Head Office of Kebumen District": we would then need 7x7 columns = 49 columns, because each word is matched against each word: 1-1, 1-2, 1-3, 1-4, 2-1, 2-2, 2-3, 2-4, etc.
The text in column B is split, and each word is then checked against the text in column A. If many of the words in column B are not found, the texts are not similar.
I am still confused about how to write the formula, though. Please give me a solution. Thanks.
The matching patterns are very different in your case and I see no solution based on formulas (regular expressions).
You may need to look for articles about fuzzy VLOOKUP.
Here's what I found for Google Sheets (not tested):
Add-on: Find Fuzzy Matches
This problem is also common in Excel, where there are solutions based on VBA.
As I said, a single formula won't solve your task because you have many cases. The first example, Mc Donald vs. McDonald, is checked easily with a formula:
=SUBSTITUTE(A1, " ", "") = SUBSTITUTE(B1, " ", "")
Your next samples are different. You may use some code, but even that won't give the expected results. My suggestion: split the task into small cases and try to solve them separately. Investigate each case, or ask a new question for it.
Your second and third lines are case 2. In this case, you need to check that all the words in A are also in B. You'll need to try solving it and ask another question if needed. And so on.
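To make the two cases concrete, here is a minimal Python sketch of both checks. The function names are hypothetical, and lowercasing the words before comparing them is an assumption of this sketch, not something stated above.

```python
def same_ignoring_spaces(a: str, b: str) -> bool:
    """Case 1: the texts are equal once spaces are removed."""
    return a.replace(" ", "") == b.replace(" ", "")


def all_words_present(a: str, b: str) -> bool:
    """Case 2: every word of `a` also appears among the words of `b`.

    Assumption: the comparison ignores letter case.
    """
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    return words_a <= words_b  # subset test


print(same_ignoring_spaces("Mc Donald", "McDonald"))
print(all_words_present("Kebumen District Office",
                        "District Head Office of Kebumen District"))
```

Each case stays a small, separately testable rule, which is the point of splitting the task up.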
Fuzzy matching is definitely the way to go. Different algorithms have different strengths and weaknesses. My suggestion is that you visit the G Suite marketplace and look for Flookup or simply follow this link:
Flookup for Google Sheets
It'll allow you to look for matches ranging from 0% to 100% similarity. The basic formula is:
FLOOKUP(lookupValue, tableArray, lookupCol, indexNum, [threshold], [rank])
Find out more from the official website.
Edit: I'm the creator of Flookup.
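To give a feel for what a similarity threshold means, here is a minimal Python sketch using the standard library's difflib. This is not Flookup's algorithm, which is not described here; the function names and the 0.6 default threshold are assumptions made purely for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity score between 0.0 and 1.0, ignoring letter case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(lookup_value, candidates, threshold=0.6):
    """Return the most similar candidate, or None if nothing reaches the threshold."""
    if not candidates:
        return None
    score, candidate = max((similarity(lookup_value, c), c) for c in candidates)
    return candidate if score >= threshold else None


print(best_match("Mc Donald", ["McDonald", "Burger King"]))
```

Raising the threshold makes the lookup stricter; lowering it accepts looser matches at the cost of more false positives.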
I am having a hard time generating precisely the frequency table I am looking for using SPSS.
The data in question: cases (n = ~800) with categorical variables DX_n (n = 1-15), each containing ICD9 codes, many of which are the same code. I would like to create a frequency table that groups the DX_n variables such that I can view frequency of every diagnosis in this sample of cases.
The next step is to test the hypothesis that the clustering of diagnoses in this sample is different than that of another. If you have any advice as to how to test this, that would be really appreciated as well!
Thanks!
Edit: My attempts:
1) Analyze -> Descriptive Statistics -> Frequencies; then add variables DX_n (1-15) and display frequency charts. The output is frequencies of each ICD9 code per DX_n variable (so 15 tables are generated - I'm hoping to just have one grouped table).
2) I tried adjusting the output format to organize by variable and also to compare variables but neither option gives the output I'm looking for.
I think what you are looking for is CTABLES. It can do parallel columns of frequencies, and it includes a column proportions test that can check whether the distributions differ.
Thank you, JKP! You set me on exactly the right track. I'm not sure how I overlooked that menu. Just to clarify in case anyone else comes along needing to figure this out:
Group diagnosis variables into a multiple response set using Analyze > Custom Tables > Multiple Response Sets. Code the variables as categories.
http:// i.imgur.com/ipE9suf.png
Create a custom table with your new multiple response set as a row and the subsets to compare as columns. I set summary statistics to compute from rows and added the column n% column (sorted descending).
http:// i.imgur.com/hptIkfh.png
Under test statistics, include a column proportions z-test as JKP suggested.
http:// i.imgur.com/LYI6ZRl.png
Behold, your results:
http:// i.imgur.com/LgkBA8X.png
Thanks again, and best of luck to anyone else who runs across this.
-GCH
p.s. Sorry everyone, I was going to post images but don't have enough reputation points yet. Images detailing the steps in the GUI can be found at the obfuscated links above.
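For anyone who wants to sanity-check the pooled counts outside SPSS, the idea behind the multiple response set can be sketched in a few lines of Python. The cases and codes below are hypothetical sample data, and counting each code at most once per case is an assumption of this sketch.

```python
from collections import Counter

# Hypothetical cases: each row holds one case's DX_1..DX_n codes,
# with None marking an empty diagnosis slot.
cases = [
    ["250.00", "401.9", None],
    ["401.9", "272.4", "250.00"],
    ["401.9", None, None],
]

# Pool all DX_n variables into one table, counting a code once per case.
counts = Counter()
for dx_codes in cases:
    counts.update({code for code in dx_codes if code})

# Print code, count, and percent of cases, most frequent first.
for code, n in counts.most_common():
    print(f"{code}\t{n}\t{100 * n / len(cases):.1f}%")
```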
I am using Google Spreadsheets, and I'm trying to have multiple sheets, each containing a list of words. On the final sheet, I would like to create a summary list that combines all the values in those columns. I got it sort of working using =CONCATENATE(), but it turned everything into a single string. Is there any way to keep it as a column list?
Here is an example as columns:
Sheet1
apple
orange
banana
Sheet2
pineapple
strawberry
peach
FinalSheet
apple
orange
banana
pineapple
strawberry
peach
Updated Answer
I was right: there is a much better solution. It's been posted below, but I'm copying it here so it's in the top answer:
=unique({A:A;B:B})
Caveat: This will include one blank cell in certain scenarios (such as if there's one at the end of the first list).
If you're not concerned with ordering or with a trailing blank cell, a simple sort() will clean things up:
=sort(unique({A:A;B:B}))
Otherwise a filter() can remove the blanks like so:
=filter(unique({A:A;B:B}),NOT(ISBLANK(unique({A:A;B:B}))))
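The three formulas above are really three steps: stack the columns, drop duplicates, then drop blanks. A minimal Python sketch of the same steps, using hypothetical column contents with a trailing blank in each list, looks like this:

```python
# Hypothetical column contents; "" stands for a blank cell.
col_a = ["apple", "orange", "banana", ""]
col_b = ["pineapple", "strawberry", "peach", "apple", ""]

# Step 1: stack the columns vertically, like {A:A;B:B}.
stacked = col_a + col_b

# Step 2: drop duplicates while keeping first-seen order, like UNIQUE.
# One blank entry survives this step, which is the caveat noted above.
unique_values = list(dict.fromkeys(stacked))

# Step 3: remove the remaining blank, like FILTER(..., NOT(ISBLANK(...))).
final = [value for value in unique_values if value != ""]
print(final)
```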
The following is the old deprecated answer
I'm confident that this is "The Wrong Way To Do It", as this seems such an absurdly simple and common task that I feel I must be missing something as it should not require such an overwrought solution.
But this works:
=UNIQUE(TRANSPOSE(SPLIT(JOIN(";",A:A,B:B),";")))
If your data contains any ';' characters you'll naturally need to change the delimiter.
The basic way is just to do it as arrays, like so:
={A1:A10;B1:B10...etc}
The problem with this method, as I found out, is that it's very time-consuming if you have lots of columns.
I've done some searching around and have come across this article:
Joining Multiple Columns Into One Sorted Column in Google Spreadsheets
The core formula is
=transpose(split(arrayformula(concatenate(if(len(A:Z)>0,A:Z&";",""))),";"))
Obviously you'd replace the A:Z to whatever range you want to use.
And if you want to do some sorting or remove duplicates, you'd simply wrap the above formula in a SORT() and/or UNIQUE() function, like so:
=sort(unique(transpose(split(arrayformula(concatenate(if(len(A:Z)>0,A:Z&";",""))),";"))))
Hope this helps.
Happy coding everyone :)
You can use this:
=unique({A1:A;B1:B})
Works perfect here!
The unique() function gets rid of the blank cells, but it wasn't helpful for me because some of my rows repeat and I need to keep the duplicates. Instead, I first filter each column by len() to remove blank cells, then combine the filtered columns in the same way.
={filter(A:A, len(A:A)); filter(B:B, len(B:B))}
Much more simple:
={sheetone!A2:A;sheettwo!A2:A}
Use flatten, e.g. flatten(A1:B2). More details in this article.
If the 2d range is not in one piece, one can be created first with the ampersand or similar techniques. Afterwards flatten can be called on the resulting 2d range. The below example is a bit overkill but it is nice when working with dynamic 2d ranges, where the basic solution can't be easily used.
flatten(ARRAYFORMULA(SPLIT(ARRAYFORMULA(A1:A2&";"&C3:C4), ";")))
The article also shows how to easily unflatten a range using the likewise undocumented skipping clause in a query.
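For orientation, flatten reads a 2D range across each row before moving down to the next one. A minimal Python sketch of that ordering, with placeholder strings standing in for the cells of A1:B2:

```python
# Placeholder values standing in for the 2x2 range A1:B2.
grid = [
    ["a1", "b1"],  # row 1
    ["a2", "b2"],  # row 2
]

# Row-major flattening: every cell of row 1, then every cell of row 2.
flat = [cell for row in grid for cell in row]
print(flat)
```

So the result interleaves the columns rather than listing all of column A first; stacking with {A1:A2;B1:B2} gives the column-by-column order instead.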
=TRANSPOSE(SPLIT(TEXTJOIN("#",TRUE,TRANSPOSE(A:C),TRANSPOSE(D1:D5)),"#",FALSE,FALSE))
use a preferred delimiter absent in the data (instead of #) if needed
the second parameter (TRUE, or 1) means IGNORE EMPTY, which is very important in this case
the A:C and D1:D5 are the ranges to combine
all values remain there - not using UNIQUE
Try using your CONCATENATE argument with
=ArrayFormula(EXPAND(...))