How can I properly parse an email address with name?

I'm reading email headers (in Node.js, for those keeping score) and they are VERY varied. E-mail addresses in the To field look like:
"Jake Smart" <jake@smart.com>, jack@smart.com, "Development, Business" <bizdev@smart.com>
and a variety of other formats. Is there any way to parse all of this out?
Here's my first stab:
Run a split() on , to break up the different people into an array
For each item, see if there's a < or ".
If there's a <, then parse out the email
If there's a ", then parse out the name
For the name, if there's a ,, then split to get Last, First names.
If I first do a split on the ,, then the Development, Business will cause a split error. Spaces are also inconsistent. Plus, there may be more e-mail address formats that come through in headers that I haven't seen before. Is there any way (or maybe an awesome Node.js library) that will do all of this for me?

There's an npm module for this: mimelib (or mimelib-noiconv if you are on Windows or don't want to compile node-iconv)
npm install mimelib-noiconv
And the usage would be:
var mimelib = require("mimelib-noiconv");
var addressStr = 'jack@smart.com, "Development, Business" <bizdev@smart.com>';
var addresses = mimelib.parseAddresses(addressStr);
console.log(addresses);
// [{ address: 'jack@smart.com', name: '' },
// { address: 'bizdev@smart.com', name: 'Development, Business' }]

The actual address format is pretty complicated (see https://www.rfc-editor.org/rfc/rfc2822#page-15), but here is a regex that works for the examples in the question. I can't promise it will always work, though.
const str = "...";
const pat = /(?:"([^"]+)")? ?<?(.*?@[^>,]+)>?,? ?/g;
let m;
while (m = pat.exec(str)) {
const name = m[1];
const mail = m[2];
// Do whatever you need.
}
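To see how the pattern above behaves, here is a minimal sketch that wraps it in a function and runs it on the header from the question. The function name parseAddressList is hypothetical, and the sketch assumes addresses use the usual @ separator.

```javascript
// Hypothetical helper wrapping the regex from this answer.
function parseAddressList(header) {
  // Create the regex per call so the /g lastIndex state is never shared.
  const pat = /(?:"([^"]+)")? ?<?(.*?@[^>,]+)>?,? ?/g;
  const result = [];
  let m;
  while ((m = pat.exec(header)) !== null) {
    // m[1] is the quoted display name (if any), m[2] is the address.
    result.push({ name: m[1] || "", address: m[2] });
  }
  return result;
}

const parsed = parseAddressList(
  '"Jake Smart" <jake@smart.com>, jack@smart.com, "Development, Business" <bizdev@smart.com>'
);
console.log(parsed);

// An unquoted display name is not handled: the name leaks into the address.
console.log(parseAddressList("Jake Smart <jake@smart.com>"));
```

The second call shows one of the cases the answer warns about: with an unquoted display name, the name and the opening angle bracket end up in the address field, so a real parser library is still the safer choice.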

I'd try and do it all in one iteration (performance). Just threw it together (limited testing):
var header = "\"Jake Smart\" <jake@smart.com>, jack@smart.com, \"Development, Business\" <bizdev@smart.com>";
console.log(header); // alert() is not available in Node.js
var info = [];
var current = [];
var state = -1;
var temp = "";
for (var i = 0; i < header.length + 1; i++) {
var c = header[i];
if (state == 0) {
if (c == "\"") {
current.push(temp);
temp = "";
state = -1;
} else {
temp += c;
}
} else if (state == 1) {
if (c == ">") {
current.push(temp);
info.push (current);
current = [];
temp = "";
state = -1;
} else {
temp += c;
}
} else {
if (c == "<"){
state = 1;
} else if (c == "\"") {
state = 0;
}
}
}
console.log("INFO: \n" + JSON.stringify(info)); // stringify keeps the [name, address] pairs visible

For something complete, you should port this to JS: http://cpansearch.perl.org/src/RJBS/Email-Address-1.895/lib/Email/Address.pm
It gives you all the parts you need. The tricky bit is just the set of regexps at the start.


Why is my binary search implementation returning -1?

This is my main.dart:
import 'edgecases.dart';
main () {
var card = edgecases(0)['input']['cards'];
var query = edgecases(0)['input']['query'];
var result = locate_card(edgecases(0)['input']['cards'], edgecases(0)['input']['query']);
var output = edgecases(0)['output'];
print("Cards:- $card");
print("Query:- $query");
print("Output:- $result");
print("Actual answer:- $output");
}
And this is my edgecases.dart:
edgecases ([edgecasenumber = null]) { //You may make it required, I provided a null as default to check if my syntax is going right.
List tests = [];
var edge1 = {'input': {
'cards': [13, 11, 10, 7, 4, 3, 1, 0],
'query': 1
}, 'output': 6};
tests.addAll([edge1]);
if (edgecasenumber == null){ // This if is useless here so you may
return 'Null type object coud not be found.';
} else {
return tests.elementAt(edgecasenumber); // Indexing in dart also starts with 0.
}
}
locate_card (List cards, int query){
int lo = 0;
int hi = cards.length - 1;
print('$lo $hi');
while (lo <= hi) {
//print('hello'); Uncomment to see if it is entering the loop
var mid = (lo + hi) ~/ 2;
var mid_number = cards[mid];
print("lo:$lo ,hi:$hi, mid:$mid, mid_number:$mid_number");
if (mid_number == query){
return mid;
} else if (mid_number < query) {
hi = mid - 1;
} else if (mid_number > query) {
lo = mid + 1;
};
return -1; //taking about this line
};
}
[I have cut short the code here so you may find some things as unnecessary so just ignore it XD]
Actually, I am trying to implement binary search here. (I have previously implemented it successfully in Python; I am implementing it in Dart to learn the language.)
On testing it with the first edge case (that is, running the command dart main.dart), I found that it returns -1, which is wrong. So I tried commenting out the return -1; line in the edgecases.dart file to see what happens; that line was meant to handle another edge case (an empty list, which I have removed here for simplicity). I don't understand why it returns -1 when it gives the right value with that line commented out. Any possible explanations and solutions?
Thanks in advance!
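You almost did it right. As written, return -1; sits inside the while loop, so the function returns after the first iteration unless the very first midpoint happens to match. Just place the return -1; after the while loop's closing brace at the very end of locate_card.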
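A minimal sketch of the corrected control flow, written in JavaScript rather than Dart. The name locateCard is a hypothetical port of locate_card, and the list is assumed to be sorted in descending order, as in the question's test case.

```javascript
// Binary search over a list sorted in DESCENDING order.
// Returns the index of query, or -1 if it is not present.
function locateCard(cards, query) {
  let lo = 0;
  let hi = cards.length - 1;
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    const midNumber = cards[mid];
    if (midNumber === query) {
      return mid;
    } else if (midNumber < query) {
      hi = mid - 1; // larger values are to the left in a descending list
    } else {
      lo = mid + 1; // smaller values are to the right
    }
  }
  // Reached only after the whole search range has been exhausted.
  return -1;
}

console.log(locateCard([13, 11, 10, 7, 4, 3, 1, 0], 1)); // 6
console.log(locateCard([], 1)); // -1
```

With the return placed after the loop, the empty-list edge case is still covered, because the loop body never runs when the list has no elements.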

Extract URL from copied text in Google Sheets [duplicate]

I have a sheet where a hyperlink is set in a cell, but not through a formula. When I click on the cell, the "fx" bar only shows the value.
I searched the web, but everywhere the advice is to extract the hyperlink by using getFormula().
But in my case there is no formula set at all.
I can see the hyperlink, as you can see in the image, but it's not there in the "formula/fx" bar.
How can I get the hyperlink of that cell using Apps Script or any formula?
When a cell has only one URL, you can retrieve the URL from the cell using the following simple script.
var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Sheet1");
var url = sheet.getRange("A2").getRichTextValue().getLinkUrl();
Source: https://gist.github.com/tanaikech/d39b4b5ccc5a1d50f5b8b75febd807a6
This situation can also arise when an Excel file containing cells with hyperlinks is converted to a Google Spreadsheet. In my case, I retrieve the URLs using the Sheets API. A sample script follows. There may be several solutions, so please think of this as just one of them.
When you use this script, please enable the Sheets API at Advanced Google Services and in the API console. See the documentation on advanced Google services for how to enable it.
Sample script:
var spreadsheetId = "### spreadsheetId ###";
var res = Sheets.Spreadsheets.get(spreadsheetId, {ranges: "Sheet1!A1:A10", fields: "sheets/data/rowData/values/hyperlink"});
var sheets = res.sheets;
for (var i = 0; i < sheets.length; i++) {
var data = sheets[i].data;
for (var j = 0; j < data.length; j++) {
var rowData = data[j].rowData;
for (var k = 0; k < rowData.length; k++) {
var values = rowData[k].values;
for (var l = 0; l < values.length; l++) {
Logger.log(values[l].hyperlink) // You can see the URL here.
}
}
}
}
Note:
Please set spreadsheetId.
Sheet1!A1:A10 is a sample. Please set the range for your situation.
In this case, each element of rowData corresponds to a row index, and each element of values corresponds to a column index.
References:
Method: spreadsheets.get
If this was not what you want, please tell me. I would like to modify it.
Hey all,
I hope this helps you save some dev time, as it was a rather slippery one to pin down...
This custom function will take all hyperlinks in a Google Sheets cell, and return them as text formatted based on the second parameter as either [JSON|HTML|NAMES_ONLY|URLS_ONLY].
Parameters:
cellRef : You must provide an A1 style cell reference to a cell.
Hint: To do this within a cell without hard-coding
a string reference, you can use the CELL function.
eg: "=convertCellLinks(CELL("address",C3))"
style : Defines the formatting of the output string.
Valid arguments are : [JSON|HTML|NAMES_ONLY|URLS_ONLY].
Sample Script
/**
* Custom Google Sheet Function to convert rich-text
* links into Readable links.
* Author: Isaac Dart ; 2022-01-25
*
* Params
* cellRef : You must provide an A1 style cell reference to a cell.
* Hint: To do this within a cell without hard-coding
* a string reference, you can use the CELL function.
* eg: "=convertCellLinks(CELL("address",C3))"
*
* style : Defines the formatting of the output string.
* Valid arguments are : [JSON|HTML|NAMES_ONLY|URLS_ONLY].
*
*/
function convertCellLinks(cellRef = "H2", style = "JSON") {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var sheet = SpreadsheetApp.getActiveSheet();
var cell = sheet.getRange(cellRef).getCell(1,1);
var runs = cell.getRichTextValue().getRuns();
var ret = "";
var lf = String.fromCharCode(10);
runs.map(r => {
var _url = r.getLinkUrl();
var _text = r.getText();
if (_url !== null && _text !== null) {
_url = _url.trim(); _text = _text.trim();
if (_url.length > 0 && _text.length > 0) {
switch(style.toUpperCase()) {
case "HTML": ret += '<a href="' + _url + '">' + _text + '</a>' + lf; break;
case "TEXT": ret += _text + ' : "' + _url + '"' + lf; break;
case "NAMES_ONLY" : ret += _text + lf; break;
case "URLS_ONLY" : ret += _url + lf; break;
//JSON default : ...
default: ret += (ret.length>0?(','+ lf): '') +'{name : "' + _text + '", url : "' + _url + '"}' ; break;
}
ret += lf;
}
}
});
if (style.toUpperCase() == "JSON") ret = '[' + ret + ']';
//Logger.log(ret);
return ret;
}
Cheers,
Isaac
I tried solution 2:
var urls = sheet.getRange('A1:A10').getRichTextValues().map( r => r[0].getLinkUrl() ) ;
I got some links, but most of them yielded null.
I made a shorter version of solution 1, which yielded all the links.
const id = SpreadsheetApp.getActive().getId() ;
let res = Sheets.Spreadsheets.get(id,
{ranges: "Sheet1!A1:A10", fields: "sheets/data/rowData/values/hyperlink"});
var urls = res.sheets[0].data[0].rowData.map(r => r.values[0].hyperlink) ;

Get count of characters for translation in Kentico Cloud

Is there a way to tell the count of characters of all text fields in some of our content items? We need to estimate a translation price for our content items.
You can use the Delivery API to retrieve your items and run a quick JavaScript snippet to count the characters for you. First, get all your items (or a subset, depending on what you need) with a call that excludes all the modular content (linked items), like this:
https://deliver.kenticocloud.com/<projectid>/items?depth=0
Then you can use browser console to run this piece of code:
var response = JSON.parse(document.getElementsByTagName("BODY")[0].textContent);
var noOfChars = 0;
for (var x = 0; x < response.items.length; x++) {
var p = response.items[x].elements;
for (var key in p) {
if (p[key].type=='rich_text' || p[key].type=='text') {
noOfChars += strip(p[key].value).length;
}
}
}
noOfChars;
function strip(html)
{
var tmp = document.createElement("DIV");
tmp.innerHTML = html;
return tmp.textContent || tmp.innerText || "";
}
And hit Enter. The console then prints the total character count.
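Outside the browser, the same count can be sketched in Node.js on an already-parsed response. This is only an illustration: the function names and the sample data are hypothetical, and the regex-based tag stripping is a rough stand-in for the DOM-based strip() above (it does not decode HTML entities).

```javascript
// Rough tag stripper for use outside the browser; unlike the DOM-based
// strip() above, it does not decode HTML entities such as &amp;.
function stripTags(html) {
  return String(html).replace(/<[^>]*>/g, "");
}

// Sums the lengths of all text and rich_text element values.
function countTextCharacters(response) {
  let total = 0;
  for (const item of response.items) {
    for (const element of Object.values(item.elements)) {
      if (element.type === "text" || element.type === "rich_text") {
        total += stripTags(element.value).length;
      }
    }
  }
  return total;
}

// Made-up sample mimicking the response shape used by the snippet above.
const sampleResponse = {
  items: [
    {
      elements: {
        title: { type: "text", value: "Hello" },
        body: { type: "rich_text", value: "<p>Hi <b>there</b></p>" },
        rating: { type: "number", value: 5 },
      },
    },
  ],
};

console.log(countTextCharacters(sampleResponse)); // 13
```

Only text and rich_text elements contribute to the total; the number element in the sample is skipped, just as in the browser snippet.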

Different Breaking in Textarea vs. Inline?

I am working on an extended Textarea like http://podio.github.com/jquery-mentions-input/
There you can see a transparent Textarea with an element in background simulating the highlighting.
You can see the problem there also: type some long text like "iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii " (attention to space at the end)
and then type "@ke" and choose the first contact.
You will see that the background breaks differently from the text in the textarea.
I figured out that this is not caused by different sizes!
Any ideas how to avoid that?
P.S.: I don't want to use contenteditable.
For testing I used Chrome (test with points!) and Firefox.
I think this technique is also often used for auto-calculating a textarea's height, so those implementations must have the same problem?!
I found a different solution myself: count line-breaks manually.
I modified and improved the line-break-adder from this thread: finding "line-breaks" in textarea that is word-wrapping ARABIC text
The big difference: this function only returns the value with the line breaks inserted, without applying them to the textarea, because it works on a temporary copy of the element.
I think it could help someone else!
function getApplyLineBreaks(strTextAreaId)
{
var strRawValue = $('#' + strTextAreaId).val();
var measureClone = $('#' + strTextAreaId).clone();
measureClone.attr('id', 'value_break_mess_clone');
measureClone.val('');
measureClone.css('overflow', 'hidden');
measureClone.removeAttr('onchange').removeAttr('onclick').removeAttr('onkeydown').removeAttr('onkeyup').removeAttr('onblur').removeAttr('onfocus');
measureClone.height(10);
measureClone.insertAfter('#' + strTextAreaId);
var lastScrollWidth = measureClone[0].scrollWidth;
var lastScrollHeight = measureClone[0].scrollHeight;
var lastWrappingIndex = -1;
var tolerancePixels = 5; //should be smaller than font-size
var addedSpace = false;
var debug_c = 0;
for (var i = 0; i < strRawValue.length; i++)
{
var curChar = strRawValue.charAt(i);
if (curChar == ' ' || curChar == '-' || curChar == '+')
lastWrappingIndex = i;
measureClone.val(measureClone.val() + curChar);
addedSpace = false;
if (i != strRawValue.length - 1 && strRawValue.charAt(i + 1) != "\n")
{
measureClone.val(measureClone.val() + ' '); //this is only 90% zero-width breaker unnoticed
addedSpace = true;
}
if (((measureClone[0].scrollWidth - tolerancePixels) > lastScrollWidth) || ((measureClone[0].scrollHeight - tolerancePixels) > lastScrollHeight))
{
if (addedSpace)
measureClone.val(measureClone.val().substr(0, measureClone.val().length - 1));
var buffer = "";
if (lastWrappingIndex >= 0)
{
for (var j = lastWrappingIndex + 1; j < i; j++)
buffer += strRawValue.charAt(j);
lastWrappingIndex = -1;
}
buffer += curChar;
measureClone.val(measureClone.val().substr(0, measureClone.val().length - buffer.length));
if (curChar == "\n")
{
if (i == strRawValue.length - 1)
measureClone.val(measureClone.val() + buffer + "\n");
else
measureClone.val(measureClone.val() + buffer);
}
else
{
measureClone.val(measureClone.val() + "\n" + buffer);
}
lastScrollHeight = measureClone[0].scrollHeight;
}
else if (addedSpace)
{
measureClone.val(measureClone.val().substr(0, measureClone.val().length - 1));
}
}
var returnText = measureClone.val();
measureClone.remove();
return returnText;
}
Only thing: it's slow on long texts. Ideas for optimization are welcome.

Hash of a cell text in Google Spreadsheet

How can I compute a MD5 or SHA1 hash of text in a specific cell and set it to another cell in Google Spreadsheet?
Is there a formula like =ComputeMD5(A1) or =ComputeSHA1(A1)?
Or is it possible to write custom formula for this? How?
Open Tools > Script Editor then paste the following code:
function MD5 (input) {
var rawHash = Utilities.computeDigest(Utilities.DigestAlgorithm.MD5, input);
var txtHash = '';
for (var i = 0; i < rawHash.length; i++) {
var hashVal = rawHash[i];
if (hashVal < 0) {
hashVal += 256;
}
if (hashVal.toString(16).length == 1) {
txtHash += '0';
}
txtHash += hashVal.toString(16);
}
return txtHash;
}
Save the script after that and then use the MD5() function in your spreadsheet while referencing a cell.
This script is based on Utilities.computeDigest() function.
Thanks to gabhubert for the code.
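Utilities.computeDigest returns signed bytes, which is why the loop adds 256 to negative values and pads single hex digits. Here is a minimal sketch of just that conversion, runnable in Node.js. The helper name signedBytesToHex is hypothetical, and Node's crypto module stands in for Utilities.computeDigest, which is only available inside Apps Script.

```javascript
const crypto = require("crypto");

// Convert signed bytes (-128..127) to a lowercase, zero-padded hex string.
function signedBytesToHex(bytes) {
  return bytes
    .map((b) => ((b + 256) % 256).toString(16).padStart(2, "0"))
    .join("");
}

// Simulate a signed byte array by reinterpreting Node's unsigned digest bytes.
const digest = crypto.createHash("md5").update("hello").digest();
const signedBytes = Array.from(digest, (b) => (b > 127 ? b - 256 : b));

console.log(signedBytesToHex(signedBytes)); // 5d41402abc4b2a76b9719d911017c592
console.log(signedBytesToHex([-1, 0, 15, 127, -128])); // ff000f7f80
```

The result matches the conventional hex form of the digest, which is what most other tools print for MD5.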
This is the SHA1 version of that code (very simple change)
function GetSHA1(input) {
var rawHash = Utilities.computeDigest(Utilities.DigestAlgorithm.SHA_1, input);
var txtHash = '';
for (var j = 0; j < rawHash.length; j++) {
var hashVal = rawHash[j];
if (hashVal < 0)
hashVal += 256;
if (hashVal.toString(16).length == 1)
txtHash += "0";
txtHash += hashVal.toString(16);
}
return txtHash;
}
OK, got it.
You need to create a custom function as explained in
http://code.google.com/googleapps/appsscript/articles/custom_function.html
And then use the APIs as explained in
http://code.google.com/googleapps/appsscript/service_utilities.html
I need to type the complete function name by hand so that I can see the result in the cell.
The following sample code gave me the Base64-encoded hash of the text:
function getBase64EncodedMD5(text)
{
return Utilities.base64Encode( Utilities.computeDigest(Utilities.DigestAlgorithm.MD5, text));
}
The differences between this solution and the others are:
It fixes an issue some of the above solutions have with offsetting the output of Utilities.computeDigest (it offsets by 128 instead of 256).
It fixes an issue that causes some other solutions to produce the same hash for different inputs, by calling JSON.stringify() on the input before passing it to Utilities.computeDigest().
function MD5(input) {
var result = "";
var byteArray = Utilities.computeDigest(Utilities.DigestAlgorithm.MD5, JSON.stringify(input));
for (var i = 0; i < byteArray.length; i++) {
result += (byteArray[i] + 128).toString(16) + "-";
}
result = result.substring(0, result.length - 1); // remove trailing dash
return result;
}
To get hashes for a range of cells, add this next to gabhubert's function (called GetMD5Hash here):
function RangeGetMD5Hash(input) {
if (input.map) { // Test whether input is an array.
return input.map(RangeGetMD5Hash); // Recurse over array if so.
} else {
return GetMD5Hash(input)
}
}
and use it in a cell this way:
=RangeGetMD5Hash(A5:X25)
It returns a range of the same dimensions as the source one; values spread down and right from the cell containing the formula.
It's a universal method for converting a single-value function into a range function (ref), and it's much faster than separate formulas for each cell. In this form it also works for a single cell, so it may be worth rewriting the source function this way.
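The single-value-to-range conversion described above can be sketched as a generic wrapper. This is an illustration that runs outside Sheets; the names mapRange and double are hypothetical, and it relies on a range argument arriving as a two-dimensional array of rows while a single cell arrives as a plain value.

```javascript
// Apply fn to a single value, or recursively to every cell of a nested array.
function mapRange(input, fn) {
  return Array.isArray(input)
    ? input.map((cell) => mapRange(cell, fn))
    : fn(input);
}

// Stand-in for a single-value function such as a hash.
const double = (x) => x * 2;

console.log(mapRange([[1, 2], [3, 4]], double)); // [ [ 2, 4 ], [ 6, 8 ] ]
console.log(mapRange(5, double)); // 10
```

Wrapping the cell callback in an arrow function keeps the extra index and array arguments that map() supplies from reaching fn.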
Based on @gabhubert's answer, but using array operations to get the hexadecimal representation:
function sha(str){
return Utilities
.computeDigest(Utilities.DigestAlgorithm.SHA_1, str) // string to digested array of integers
.map(function(val) {return val<0? val+256 : val}) // correct the offset
.map(function(val) {return ("00" + val.toString(16)).slice(-2)}) // add padding and encode
.join(''); // join in a single string
}
Using @gabhubert's answer, you could do this if you want to get the results for a whole column of values. Run it from the script editor.
function GetMD5Hash(value) {
var rawHash = Utilities.computeDigest(Utilities.DigestAlgorithm.MD5, value);
var txtHash = '';
for (var j = 0; j < rawHash.length; j++) {
var hashVal = rawHash[j];
if (hashVal < 0)
hashVal += 256;
if (hashVal.toString(16).length == 1)
txtHash += "0";
txtHash += hashVal.toString(16);
}
return txtHash;
}
function straightToText() {
var ss = SpreadsheetApp.getActiveSpreadsheet().getSheets();
var r = 1;
var n_rows = 9999;
var n_cols = 1;
var column = 1;
var sheet = ss[0].getRange(r, column, n_rows, n_cols).getValues(); // get first sheet, a1:a9999
var results = [];
for (var i = 0; i < sheet.length; i++) {
var hashmd5= GetMD5Hash(sheet[i][0]);
results.push(hashmd5);
}
var dest_col = 3;
for (var j = 0; j < results.length; j++) {
var row = j+1;
ss[0].getRange(row, dest_col).setValue(results[j]); // write output to c1:c9999 as text
}
}
Then, from the Run menu, just run the function straightToText() to get your results and avoid the "too many calls to a function" error.
I was looking for an option that would provide a shorter result. What do you think about this? It only returns 4 characters. The unfortunate part is that it uses i's and o's which can be confused for L's and 0's respectively; with the right font and in caps it wouldn't matter much.
function getShortMD5Hash(input) {
var rawHash = Utilities.computeDigest(Utilities.DigestAlgorithm.MD5, input);
var txtHash = '';
for (var j = 0; j < 16; j += 8) {
var hashVal = (rawHash[j] + rawHash[j+1] + rawHash[j+2] + rawHash[j+3]) ^ (rawHash[j+4] + rawHash[j+5] + rawHash[j+6] + rawHash[j+7]);
if (hashVal < 0)
hashVal += 1024;
if (hashVal.toString(36).length == 1)
txtHash += "0";
txtHash += hashVal.toString(36);
}
return txtHash.toUpperCase();
}
I needed to get a hash across a range of cells, so I run it like this:
function RangeSHA256(input)
{
return Array.isArray(input) ?
input.map(row => row.map(cell => SHA256(cell))) :
SHA256(input);
}
