Exclude certain text from keyword analysis in Google Sheets - google-sheets

I'm trying to do a little bit of analysis on the topics of emails I receive. I have the emails in a Google Sheet in the format below. I'm trying to count how often 'privacy' or 'confidentiality' are mentioned. My challenge is that pretty much every email signature mentions one of those words, so when I use SEARCH every cell returns TRUE.
Most email signatures start with similar phrases, so I tried deleting anything after those phrases with this formula:
=ArrayFormula(TRIM(LEFT(B1:B,MIN(IFERROR(FIND({" This email and any","IMPORTANT NOTICE", " Important notice","The information in this email"," The contents of this message"," Information in this email including"," This electronic mail message"," this message and any attachments"," This message is intended for the addressee only"," This email is CONFIDENTIAL"},B1:B),LEN(L2))))))
Column B holds the email body text.
However that seems to be deleting text that follows words that aren't in my search (deleting everything after 'not' instead of 'IMPORTANT NOTICE' for instance).
Could anyone advise on either:
what's wrong with my above search
an alternate way of searching for 'privacy' and 'confidentiality' without including text from email signatures.
Example table:
|email title|email body|
|-----------|----------|
|Do you want to buy my stuff| Hi there, I'd like to know if you'd like to buy this thing I want to sell you. IMPORTANT: this email is private|
|two-for-the-price-of-one| I've a great offer for you! This email and attachments are private & confidential|
|Last chance to buy stuff!| Can we have a private call about whether you want to buy my stuff yet?|
In the example above I want to count row 3, but not rows 1 & 2, as the 'private' and 'confidential' mentions in 1 & 2 are in the signature.
Thanks!

I think I understand the error you've described. MIN collapses the whole FIND result (every marker checked against every row) into a single number. So once the formula finds one of the values you are using to identify an email signature, such as " Important notice", and the earliest such position anywhere in the column is, let's say, 96, it then uses 96 for all of the cells, like this: LEFT(B1:B,96). That is why wrapping an aggregate like MIN inside the ArrayFormula does not give you a separate cut-off for each row.
Using the formula on a single cell like this, entered in row 2 of a spare column (not in B2 itself, which would be a circular reference) and dragged down, should work though:
=ArrayFormula(TRIM(LEFT(B2,MIN(IFERROR(
FIND({" This email and any","IMPORTANT NOTICE", " Important notice","The information in this email"," The contents of this message"," Information in this email including"," This electronic mail message"," this message and any attachments"," This message is intended for the addressee only"," This email is CONFIDENTIAL"},B2),
LEN(L2))))))
Note: I'm not sure what value is in your L2.
But for the overall approach, it really depends on how well your terms to identify email signatures work, so as to exclude them from your final full text searches.
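The per-row logic that the dragged-down formula performs can also be sketched outside Sheets. The Python below is only an illustration under assumptions: the marker list is shortened to two phrases chosen to fit the example table, the keyword list uses 'private' and 'confidential' alongside 'privacy' so that it matches the example rows, and the function names are hypothetical.

```python
# Assumed, shortened marker list; the real one would hold the phrases from the question.
SIGNATURE_MARKERS = ["IMPORTANT:", "This email and"]
# Assumed keyword list, matched case-insensitively like SEARCH.
KEYWORDS = ["privacy", "private", "confidential"]


def strip_signature(body, markers=SIGNATURE_MARKERS):
    """Cut the body at the earliest marker found in this body only.

    str.find is case-sensitive, like FIND in Sheets, and returns -1 on a miss.
    """
    hits = [pos for pos in (body.find(m) for m in markers) if pos != -1]
    cut = min(hits) if hits else len(body)  # no marker: keep the whole body
    return body[:cut].strip()


def mentions_keyword(body, markers=SIGNATURE_MARKERS, keywords=KEYWORDS):
    """True if any keyword appears before the signature starts."""
    text = strip_signature(body, markers).lower()
    return any(k in text for k in keywords)


bodies = [
    "Hi there, I'd like to know if you'd like to buy this thing I want to sell you. IMPORTANT: this email is private",
    "I've a great offer for you! This email and attachments are private & confidential",
    "Can we have a private call about whether you want to buy my stuff yet?",
]
flags = [mentions_keyword(b) for b in bodies]
```

The point of the sketch is that the minimum is taken per body, not across the whole column, which is the difference between the dragged-down formula and the original array version.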

Related

Auto email when a cell is populated

I will explain since coding is many years behind me but I feel this can be done.
We have a form that our employees fill out when they need to return to a job. This works fine, sends them an email that it was received and creates the sheet in google.
We then have the CSR fill out their name once they set up the callback.
Simple right?
Yes and no.
The employees know it was received but not that column c2 (or whatever) is now populated with the CSR name and was set.
What I am looking to do is have column c2 trigger an email back to the employee who filled out the form with certain data from the google sheet so they know it was addressed, by who and when.
I have seen similar code but nothing that does exactly this. I can play with the column names etc. but I cannot get it started.
Yes, I'm brand new to Google Sheets, and the last programming I did was DOS and Advanced Revelation about 20 years ago.
For argument's sake here are the headers:
Date employee_email Customer Address Reason date_needed set_up_date CSR (CSR would be the trigger, and it would send the customer, address, set_up_date and CSR name to the employee_email address.)
It sounds so simple yet after 6 hours at my EMT shift (less two calls) I have not gotten far.
Thank you again.
Any help, direction or solution would be great!

Google Sheets - merge duplicates "join" alternative

CONTEXT:
I have a script that fetches some information from Gmail into the "Emails Received" sheet: from, cc, subject, to.
Sometimes the email subject isn't exactly the same, but what matters is the reference that always appears in the subject, e.g. 9052/18.6T8TER in the subject "Re: payment proof 9052/18.6T8TER".
I've created another sheet called "Emails Joined" that extracts that reference from the subjects into Column A and merges the remaining data from all emails that share the same reference in their subjects.
With "Join" as it is now, an array formula doesn't work, and it makes the script a lot slower while it's fetching the emails because, I assume, it's always trying to join the results.
WHAT I'M LOOKING FOR
A better alternative to Join, keeping in mind that I don't want to match the full subject but instead the reference that's contained within the email subjects
An alternative that doesn't make the script so slow but if that's not possible, I also saw some posts about using If to "stop" a formula and maybe that's the way to go so the merge doesn't interfere with the email fetching
Can anyone point me in a better direction?
Thanks in advance.
Test Spreadsheet
Take a look at the new tab called MK.help. There are 3 formulas, in B2, D2 and E2. This is the one for B2; it's based on a concept I learned here on Stack from @Player0.
=ARRAYFORMULA(SPLIT(TRANSPOSE(TRIM(QUERY(QUERY({REGEXEXTRACT('Emails Received'!C2:C&"|";"[A-Z]\D*([ \d|/\(.|_)\d A-Z+]+)\[?")\CHAR(9679)&TO_TEXT('Emails Received'!D2:D)&CHAR(10)};"select MAX(Col2) group by Col2 pivot Col1");;9^9)));"|"))
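For readers who want to see the underlying idea outside Sheets (pull the reference out of each subject, then merge everything that shares it), here is a minimal Python sketch. It is not a translation of the formula above. The regular expression is an assumption based only on the single sample reference 9052/18.6T8TER, and the function name, row layout and addresses are hypothetical.

```python
import re
from collections import defaultdict

# Assumed reference shape: digits, "/", two digits, ".", then letters/digits.
REFERENCE = re.compile(r"\d+/\d{2}\.\w+")


def join_by_reference(rows, separator=", "):
    """Group (subject, value) pairs by the reference found in the subject.

    Subjects without a reference are skipped; duplicate values are kept once.
    """
    grouped = defaultdict(list)
    for subject, value in rows:
        match = REFERENCE.search(subject)
        if match is None:
            continue
        if value not in grouped[match.group()]:
            grouped[match.group()].append(value)
    return {ref: separator.join(values) for ref, values in grouped.items()}


rows = [
    ("Re: payment proof 9052/18.6T8TER", "a@example.com"),
    ("Fwd: 9052/18.6T8TER invoice", "b@example.com"),
    ("Re: payment proof 9052/18.6T8TER", "a@example.com"),
    ("Lunch on Friday?", "c@example.com"),
]
joined = join_by_reference(rows)
```

Grouping in one pass like this avoids re-joining on every new row, which is the cost the question suspects is slowing the fetch.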

Formula (Array, etc.) for automatic Google Sheets Indexing using inputs from Google Forms

I'm hoping that someone can help me tweak (or even substitute) a formula that I'm using in Google Sheets to automatically populate columns with information based on inputs from a Google form.
Simply put, I am using the INDEX function to match the name that is selected from a drop-down menu in the Google Form, and arrives in Column E of the Google Sheet receiving the responses, against an identical list of names in Column A of 'Sheet 2'. The INDEX formula takes information from 'Sheet 2' relating to that name (e.g. Registration Number, Email Address) and places it in the 'Formresponses 1' sheet alongside the inputs from the Google Form (including, of course, the name that appears in Column E).
I have been using (variations on) the following formula without any issues, but I have to manually drag it down the relevant column in 'Formresponses 1' each time a new entry/name arrives from the Google Form: =index(Sheet2!$B$2:$B,match(E2, Sheet2!$A$2:$A,0),1)
I have successfully used array formulas to automatically carry out other functions on data arriving from a Google Form (e.g. adding up individual numbers to arrive at an overall total), but in this case I cannot figure out how to create a formula that will automatically take each new name that arrives in column E and insert it into the relevant indexing formula at the end of that new row.
Any suggestions - or solutions! - would be greatly welcome!
Thanks,
A.
Cheers I'-'I,
I've used I'-'I's response to my original question here as a starting point and, with a bit of research, I've come up with the following working formula:
= ArrayFormula(vlookup(E2:E, Sheet2!A:E, {1,2,3,5},FALSE))
[The curly brackets simply indicate the columns in Sheet 2 from which I want to pull pieces of data relating to each name that is matched up in the 'front end' sheet receiving the responses from the Google Form.] As with my previous problems with array formulas, I found the following website really useful, so full credit has to go to it: benlcollins.com
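To make the effect of the {1,2,3,5} column list concrete, here is a rough Python sketch of what that lookup computes. It is an illustration only: the function name and sample data are hypothetical, and a missing name is shown as None where Sheets would show #N/A.

```python
def lookup_columns(names, table, columns=(1, 2, 3, 5)):
    """For each name, return the requested 1-based columns of the first
    table row whose first cell equals the name (exact match, like FALSE)."""
    first_row_for = {}
    for row in table:
        first_row_for.setdefault(row[0], row)  # keep the first match only
    results = []
    for name in names:
        row = first_row_for.get(name)
        if row is None:
            results.append([None] * len(columns))
        else:
            results.append([row[c - 1] for c in columns])
    return results


# Hypothetical 'Sheet2' contents: name, registration, email, unused, group.
sheet2 = [
    ["Ann", "R-001", "ann@example.com", "skip", "Group A"],
    ["Bob", "R-002", "bob@example.com", "skip", "Group B"],
]
filled = lookup_columns(["Bob", "Zed", "Ann"], sheet2)
```

Each incoming name produces one output row, which is why the array version no longer needs to be dragged down as new form responses arrive.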

Text replacement

I have a small business that uses Google Sheets to track our employees' cases. When they record that they are handling a case, they just type their initials. I would like to use the Reminders add-on to make sure that cases are not forgotten. The add-on requires an email address for the person to contact so that it can send them a reminder. However, it is faster to simply type the initials of the case worker.
How can I use the built-in functions, or write a custom one, so that all initials in a certain column are replaced with the corresponding email?
For example:
=IF(B2="ABC", SUBSTITUTE(B2,"ABC","ABC@123.com"))
This places ABC's email in a new cell if the initials ABC are found. However, I can't expand the function to replace all the employees' initials in one formula, as IFELSE is not recognized as a valid function.
You could do this with a simple VLOOKUP.
Make a new sheet with two columns: the first containing the initials, the second containing the email addresses.
In a new column in Sheet1 you would then write something like this:
=VLOOKUP(B2,Sheet2!$A$1:$B$25,2,FALSE)
Where B2 is the cell with the initials and Sheet2!$A$1:$B$25 is the range where you store the email addresses. The final FALSE asks for an exact match, so the initials don't have to be sorted, and the $ signs keep the range fixed when the formula is copied down.

User input parsing - city / state / zipcode / country

I'm looking for advice on parsing input from a user in multiple combinations of City / State / Zip Code / Country.
A common example would be what Google maps does.
Some examples of input would be:
"City, State, Country"
"City, Country"
"City, Zip Code, Country"
"City, State, Zip Code"
"Zip Code"
What would be an efficient and correct way to parse this input from a user?
If you are aware of any example implementations please share :)
The first step would be to break up the text into individual tokens using spaces or commas as the delimiting characters. For scalability, you can then hand each token to a thread or server (if using a MapReduce-like architecture) to figure out what each token is. For instance:
If we have numbers in the pattern, then it's probably a zip code.
Is the item in the list of known states?
Countries are also fairly easy to handle like states, there's a limited number.
What order are the tokens in compared to the common ways of writing an address? Most input will probably follow the local post office custom for address formats.
Once you have the individual token results, you can glue the parts back together to get a full address. In the cases where there are questions, you can prompt the user what they really meant (like Google maps) and add that information to a learned list.
The easiest way to add that support to an application, assuming you're not trying to build a map system, is to query Google or Yahoo and ask them to parse the data for you.
I am myself very fascinated with how Google handles that. I do not remember seeing anything similar anywhere else.
I believe you split the input string into words using various delimiters (space, comma, semicolon, etc.), which gives you several candidate combinations. For each combination, you take each word and match it against a country, city, town and postal code database. Then you define a metric to evaluate the overall match for each combination. There should also be cross rules: for example, if the postal code does not match well, but country, city and town match well and together refer to a valid address, then the metric still yields a high mark.
It is certainly difficult and not an evening coding exercise. It also requires substantial computational resources: shared hosting would probably crack under just 10 requests, but a data center could serve it well.
I'm not sure if there is an example implementation. Many geographical services are offered on a paid basis. Something as sophisticated as Google Maps would likely cost a fortune.
Correct me if I'm wrong.
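A toy version of that "try several delimiters, score each combination" idea might look like the Python below. Everything in it is an assumption made for illustration: the tiny lookup sets stand in for a real gazetteer database, the five-digit postal pattern is US-only, the scoring rule is one arbitrary choice of metric, and it has none of the cross rules described above.

```python
import re

# Tiny stand-in lists; a real system would query a place-name database.
KNOWN = {
    "country": {"usa", "united states", "france"},
    "state": {"ca", "california", "texas"},
    "city": {"san francisco", "paris", "austin"},
}
POSTAL = re.compile(r"^\d{5}$")  # assumed US-style five-digit code


def classify(token):
    """Return the kind of place a token looks like, or None."""
    t = token.strip().lower()
    if POSTAL.match(t):
        return "zip"
    for kind, names in KNOWN.items():
        if t in names:
            return kind
    return None


def score(tokens):
    """Reward distinct recognised kinds, penalise unrecognised tokens."""
    kinds = [classify(t) for t in tokens]
    matched = [k for k in kinds if k is not None]
    return len(set(matched)) - (len(kinds) - len(matched))


def best_split(text, delimiters=(",", ";", " ")):
    """Split the text with each delimiter and keep the best-scoring result."""
    candidates = []
    for d in delimiters:
        tokens = [t.strip() for t in text.split(d) if t.strip()]
        candidates.append((score(tokens), tokens))
    return max(candidates, key=lambda c: c[0])[1]
```

For example, best_split("San Francisco, CA, USA") keeps the comma split because all three pieces are recognised, while best_split("Paris France") falls back to splitting on spaces.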
I found a simple PHP implementation
http://www.eotz.com/2008/07/parsing-location-string-php/
Yahoo seems to have a webservice that offers the functionality (sort of)
http://developer.yahoo.com/geo/placemaker/
Openstreetmap seems to offer the same search functionality on its homepage
http://www.openstreetmap.org/
Assuming you're only dealing with those four fields (City, Zip, State, Country), every field except City has a finite set of values, and even City is finite if you have a big enough city list. So just split the input by comma, then check each piece against each field's list.
Assuming we're talking US addresses:
Zip is most obvious, so check for that first.
State has 50x2 options (California or CA), check that next.
Country has ~190x2 options, depending on how encompassing you want to be (US, United States, USA).
Whatever is left over is probably your City.
As far as efficiency goes, it might make sense to check a handful of 'standard' formats first, like Dan suggests.
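The check order described in this answer (zip, then state, then country, with the remainder treated as the city) can be sketched in a few lines of Python. This is a minimal sketch under assumptions: the state and country dictionaries are deliberately tiny placeholders, the zip pattern covers only US five-digit and ZIP+4 codes, and parse_location is a hypothetical name.

```python
import re

US_ZIP = re.compile(r"^\d{5}(-\d{4})?$")
# Placeholder lists; a real version would hold all states and countries.
STATES = {"california": "CA", "ca": "CA", "texas": "TX", "tx": "TX"}
COUNTRIES = {"us": "US", "usa": "US", "united states": "US", "canada": "CA"}


def parse_location(text):
    """Split on commas, then classify each piece: zip, state, country, city."""
    result = {"city": None, "state": None, "zip": None, "country": None}
    leftovers = []
    for part in (p.strip() for p in text.split(",")):
        if not part:
            continue
        key = part.lower()
        if result["zip"] is None and US_ZIP.match(part):
            result["zip"] = part
        elif result["state"] is None and key in STATES:
            result["state"] = STATES[key]
        elif result["country"] is None and key in COUNTRIES:
            result["country"] = COUNTRIES[key]
        else:
            leftovers.append(part)  # whatever is left is probably the city
    if leftovers:
        result["city"] = ", ".join(leftovers)
    return result
```

Because each field is filled at most once, an input with an ambiguous piece (a city that shares its name with a state, say) can still be misread, which is where prompting the user or checking standard formats first would help.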

Resources