I need to create a customized formula generated from numbers (1-100) and operators (+, -, *, /). The precedence of operations has to be dynamic: the user chooses the numbers and operators one by one. How should I decide where to put the brackets so that the result is a valid expression? For example:
(21*430)-17+((45/5)+135)+243-16
((97+74)+242-(169+13)*4)/76-21+(36*4012)
We need to take the numbers and operators as inputs, but their order and type can be dynamic and random. The only limitation is that the total count of numbers in an expression is fixed (8 numbers in the first example above, 10 in the second).
Please suggest a feasible solution to this.
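One possible approach, sketched here in Python, is to treat the expression as a binary tree: pick a random split point in the list of numbers, use the operator at that position as the root, and recurse on both halves. Every tree built this way renders as a validly bracketed expression. The function name random_expression and the fully bracketed output style are assumptions made for illustration; the examples above use fewer brackets.

```python
import random

def random_expression(numbers, operators, rng):
    """Bracket numbers/operators (kept in input order) via a random binary tree.

    Assumes len(operators) == len(numbers) - 1, where operators[i] sits
    between numbers[i] and numbers[i + 1].
    """
    if len(numbers) == 1:
        return str(numbers[0])
    # Choose which operator becomes the root of this sub-expression.
    split = rng.randint(1, len(numbers) - 1)
    left = random_expression(numbers[:split], operators[:split - 1], rng)
    right = random_expression(numbers[split:], operators[split:], rng)
    return f"({left}{operators[split - 1]}{right})"

if __name__ == "__main__":
    nums = [21, 43, 17, 45, 5, 13, 24, 16]
    ops = ["*", "-", "+", "/", "+", "+", "-"]
    print(random_expression(nums, ops, random.Random()))
```

Redundant brackets could be stripped afterwards by comparing the precedence of each node with that of its parent, and a division whose right-hand side evaluates to zero would still need to be rejected separately.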
I have a sheet I use as a database of scientific papers. I copy journal article titles from different sources (some could be from an email, others are links on a web page, or just the title from the article page). I have conditional formatting set to let me know if I'm adding a title that is already in the list. I've noticed that there are some titles that are "ignoring" the conditional formatting, and it looks like there are hyphens in all of the offenders. If I remove the hyphens, the conditional formatting works. So there is some 'difference' in the hyphens originating from the same title that is preventing the conditional formatting from viewing them as identical.
Shared sheet
Examples of offending titles:
End-to-end continuous bioprocessing: impact on facility design, cost of goods and cost of development for monoclonal antibodies
End‐to‐end continuous bioprocessing: impact on facility design, cost of goods and cost of development for monoclonal antibodies
End‐to‐end continuous bioprocessing: Impact on facility design, cost of goods, and cost of development for monoclonal antibodies
What is this difference, and is there a way to fix it? Do I need to write a script to find/replace the hyphens to get this to work?
TIA
Just because characters appear identical does not mean that they are identical. You have fallen foul of the similarity between the hyphen and dashes. Visually they are almost identical; dashes are slightly wider than the hyphen.
Dashes are regarded as "special characters" (i.e. they aren't keys on the keyboard), but they are used widely in HTML. So if, for instance, you copied an item from a website, you might unwittingly have copied dashes rather than hyphens.
You can identify the exact nature of a character by using the CODE function.
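Outside the spreadsheet, the same check can be sketched in Python, where ord plays the role of CODE. The two sample strings and the helper name describe are made up for illustration. Note that Unicode officially names character 8208 (U+2010) HYPHEN and character 45 HYPHEN-MINUS, although this answer refers to 8208 as a dash.

```python
import unicodedata

def describe(ch):
    """Return the numeric code and the official Unicode name of a character."""
    return ord(ch), unicodedata.name(ch)

keyboard_title = "End-to-end"        # typed on a keyboard
web_title = "End\u2010to\u2010end"   # as it might arrive from a web page

print(describe(keyboard_title[3]))   # (45, 'HYPHEN-MINUS')
print(describe(web_title[3]))        # (8208, 'HYPHEN')
print(keyboard_title == web_title)   # False
```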
You ask "What is this difference, and is there a way to fix it? Do I need to write a script to find/replace the hyphens to get this to work?"
WHAT IS THIS DIFFERENCE?
It's important to recognise that though these examples appear identical, there are other differences that are more than just hyphens vs dashes.
Example#1 - Hyphen - CODE returns "45"
Example#2 - Dash - CODE returns "8208"
Example#3 - Dash - CODE returns "8208".
But in Example#3 other factors also prevent the conditional formatting rule from triggering:
Length = 128 (vs 127 for the other examples). There is an additional comma (after "cost of goods")
the word "Impact" is spelled with an upper case "I" (lower case for the other examples)
MOVING FORWARD
Do you need a script? No (IMHO)
Is there a way to fix it? As outlined above, there are more differences than just hyphens and dashes. And, as time goes by, the number and type of differences might increase. However, there is a solution to the "Hyphen Vs Dash" problem, which is the focus of this question.
FORMULA AND FORMATTING
Your data is currently in Column A and Column A is also subject to conditional formatting.
Remove the conditional formatting rules from Column A
Insert this formula in cell B2
=arrayformula(if(LEN($A2:A)-LEN(SUBSTITUTE($A2:A, char(8208), ""))=0,A2:A,arrayformula(substitute(A2:A,char(8208),char(45)))))
Conditional Formatting for Column B
select the range in Column B
select Format > Conditional Formatting.
Select "Custom Formula is" and enter this formula: =countif($B$2:$B2,B2)>1
Select a preferred Formatting Style and then click Done.
FORMULA LOGIC
arrayformula enables the formula to automatically populate all the relevant cells in the column.
LEN($A2:A)-LEN(SUBSTITUTE($A2:A, char(8208), ""))=0
a test for dashes in a string. It substitutes an empty string for any/all instances of a dash (char(8208)), then compares the original length to the adjusted length. If the difference is zero, then there are no dashes in the string.
IF: Test for any dashes,
if the string doesn't contain any dashes then use that value
else, the string must contain dashes, so replace the dashes with hyphens and use the substituted value
arrayformula(substitute(A2:A,char(8208),char(45)))
The conditional formatting rule then looks for duplicate values in the column, and formats any/all duplicate values.
You'll note that Example#3 is not flagged as a duplicate despite containing dashes. This is because of the spelling of "Impact" and the extra comma after "cost of goods".
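For anyone who wants to see the logic outside the spreadsheet, here is a minimal Python sketch of the same two steps: replace character 8208 with character 45, then flag a title if the normalised value has already appeared higher up the list. The function names are hypothetical, and the comparison is an exact string match, which is an assumption and may not mirror COUNTIF's matching rules in every respect.

```python
HYPHEN = chr(45)    # the keyboard hyphen
DASH = chr(8208)    # the look-alike character found in the copied titles

def normalise(title):
    """Mirror SUBSTITUTE(A2:A, CHAR(8208), CHAR(45))."""
    return title.replace(DASH, HYPHEN)

def flag_duplicates(titles):
    """Mirror COUNTIF($B$2:$B2, B2) > 1: True if the value occurred earlier."""
    seen = set()
    flags = []
    for title in titles:
        key = normalise(title)
        flags.append(key in seen)
        seen.add(key)
    return flags

TAIL = " continuous bioprocessing: impact on facility design, cost of goods"
titles = [
    "End-to-end" + TAIL,
    "End\u2010to\u2010end" + TAIL,
    "End\u2010to\u2010end" + TAIL + ",",  # extra comma, as in Example#3
]
print(flag_duplicates(titles))  # [False, True, False]
```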
Sample
I have these data in Google Sheets
$71,675_x000d_
$80,356_x000d_
$107,361_x000d_
$123,393_x000d_
$116,878
I want them to be split into different columns.
However, when I do so using Data > Split Data into Different Columns, it separates $71 and 675_x000d_, but I need $71,675 with the _x000d_ removed.
Please note that the last number doesn't have those extra characters.
Please help.
Your post says you want to "remove the x000d" (that is, extract only the dollar amounts). That said, let's say your raw data starts in A2 (i.e., the data is in A2:A). Place the following formula into the first cell of another otherwise empty column (e.g., B1):
=ArrayFormula({"Extracted";IF(A2:A="",,IFERROR(REGEXEXTRACT(SUBSTITUTE(A2:A&"_",",",""),"\d+")))})
How It Works:
ArrayFormula(...) signifies that we'll be processing an entire range and not just one cell.
The outer curly brackets {...} signify that a virtual array will be formed from non-like or non-contiguous pieces.
The first piece of the virtual array is the header. Here, that is "Extracted"; but you can change it as you like.
The semicolon means "place the next information below the previous part."
IF(A2:A="",, ...) is a standard check that basically says "Don't try to process any blank cells in Column A"; or alternatively worded, "If any cell in A2:A is blank/null, do nothing."
Skipping the REGEXEXTRACT for now, A2:A&"_" appends an underscore to every entry in A2:A. This allows entries in A2:A that are just a dollar amount (e.g., from the post, $116,878) to have a consistent symbol following them if not already there. (And adding the underscore to anything that already has an underscore won't matter, because we won't be extracting that far out.)
Now that we've got the new strings, we SUBSTITUTE every comma for a null (i.e., delete all commas).
Finally, REGEXEXTRACT will take all of the virtually modified strings and extract \d+, which means only digits (\d) in an unbroken sequence of any length greater than 0 (+). Note that REGEXEXTRACT will only return the first such match it encounters as written, so 000 will not be extracted.
An IFERROR wrap is placed around the REGEXEXTRACT, just in case you have any situations in real life that don't have any sequence of numbers at all. In these cases, nothing will be returned (whereas, without the IFERROR, an error would have been returned).
Once the extraction is done, you can apply Format > Number > Currency (rounded) to the entire column.
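The same chain of steps can be sketched in Python to make the order of operations easier to follow: append an underscore, delete the commas, take the first unbroken run of digits, and return nothing when no digits are found. The function name extract_amount is hypothetical, and returning an integer rather than text is a choice made for this sketch.

```python
import re

def extract_amount(cell):
    """Return the first run of digits in a cell as an int, or None."""
    if cell == "":
        return None                            # IF(A2:A="",, ...)
    cleaned = (cell + "_").replace(",", "")    # SUBSTITUTE(A2:A&"_", ",", "")
    match = re.search(r"\d+", cleaned)         # REGEXEXTRACT(..., "\d+")
    return int(match.group()) if match else None   # IFERROR(...)

print(extract_amount("$71,675_x000d_"))  # 71675
print(extract_amount("$116,878"))        # 116878
```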
Addendum:
After an additional comment (below), it appears that the raw data is in Column T, that all five entries are in one cell and that the OP would like all five amounts extracted across each row. That being the case, assuming that Columns U:Y are empty to start, place the following in cell U1 (not U2):
=ArrayFormula({"Val1","Val2","Val3","Val4","Val5";IF(T2:T="",,IFERROR(REGEXEXTRACT(SUBSTITUTE(T2:T&"_",",",""),REPT("\$(\d+)[^\$]*",5))))})
This works much the same way as the previous formula. The differences:
There are five headers now.
You'll see REPT(...,5) here. This is an easy way to repeat the same extraction five times.
That repeated extraction is now the following:
\$(\d+)[^\$]*
The backslash in front of the dollar signs means to treat those symbols as literals instead of as their usual meaning (i.e., end-of-string). So the extraction reads as follows:
\$ anything that starts with a dollar sign
(\d+) extract what is between the ( ), which is any group of digits
[^\$]* followed by any number (including 0) of characters that are not dollar signs
As I said, the REPT will repeat this five times; so five groups matching this pattern will be extracted.
Understand that if a cell doesn't follow the pattern closely enough to yield five matching extractions, nothing will be returned for it.
Be sure to format U:Y as currency rounded, or you will wind up with some of those numbers translating as raw dates and therefore being completely off.
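As a rough illustration of what the repeated pattern does, here is a Python sketch that builds the same five-fold pattern by string repetition (the equivalent of REPT) and pulls out the five capture groups. The function name extract_five and the sample cell are assumptions based on the data in the question.

```python
import re

# REPT("\$(\d+)[^\$]*", 5): the same one-group pattern, five times in a row.
PATTERN = re.compile(r"\$(\d+)[^\$]*" * 5)

def extract_five(cell):
    """Return five amounts from one cell, or None if five are not found."""
    cleaned = (cell + "_").replace(",", "")
    match = PATTERN.search(cleaned)
    return [int(group) for group in match.groups()] if match else None

cell = "$71,675_x000d_\n$80,356_x000d_\n$107,361_x000d_\n$123,393_x000d_\n$116,878"
print(extract_five(cell))               # [71675, 80356, 107361, 123393, 116878]
print(extract_five("$71,675_x000d_"))   # None, only one amount present
```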
Please use the following formula and format cells to your needs.
=ArrayFormula(IFERROR(SPLIT(REGEXREPLACE(A2:A,"\n|_x000d_","√"),"√")))
The big advantage of the above formula compared to others is that it works for any number of lines included within a single cell (as shown in the image below).
Functions used:
ArrayFormula
IFERROR
SPLIT
REGEXREPLACE
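For comparison, the same idea can be sketched in Python: treat both the line break and the _x000d_ marker as separators and drop the empty pieces left between them. The function name split_cell is hypothetical, and discarding empty pieces is an assumption about the desired output.

```python
import re

def split_cell(cell):
    """Split one cell on newlines or _x000d_ markers, dropping empty parts."""
    parts = re.split(r"\n|_x000d_", cell)
    return [part for part in parts if part]

cell = "$71,675_x000d_\n$80,356_x000d_\n$116,878"
print(split_cell(cell))  # ['$71,675', '$80,356', '$116,878']
```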
You can use the SPLIT function:
=ArrayFormula(IF(LEN(A:A),SPLIT(A:A,"_x000d_",FALSE),""))
I want to check whether the numeric PIN a user entered is too simple. Three cases should fail: repeating digits like "1111", increasing ones like "1234", and decreasing ones like "4321". Is there a regex that could check these restrictions?
A regex can match a specific text pattern, but it can't understand its context. You want to check for increasing numbers, but there is no such thing as an increasing pattern in regex.
You can check for repeated numbers using ^(\d)\1+$
But to check for increasing or decreasing numbers, you would have to parse the string to an int and check manually whether the digits are in increasing or decreasing order using the % and / operations.
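A minimal Python sketch of such a manual check follows. It compares neighbouring digits character by character instead of using % and / on an int, which is a substitution made here for brevity. The function name is_too_simple is hypothetical, and the sketch assumes the PIN contains only digits and that wrap-around runs such as "8901" do not count as increasing.

```python
def is_too_simple(pin):
    """True if all digits repeat, or rise by 1 each step, or fall by 1 each step."""
    digits = [int(ch) for ch in pin]
    # Collect the distinct differences between neighbouring digits.
    steps = {later - earlier for earlier, later in zip(digits, digits[1:])}
    return steps in ({0}, {1}, {-1})

for pin in ("1111", "1234", "4321", "1357", "2580"):
    print(pin, is_too_simple(pin))
```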
I was wondering how to implement the following problem: Say I have a 'set' of Strings and I wish to know which one is the most related to a given value.
Example:
String value= "ABBCCE";
Set contains: {"JJKKLL", "ABBCC", "AAPPFFEE", "AABBCCDD", "ABBCEE", "AABBCCEE"}
By 'most related' I assume there could be many options (a valid one could be either of the last 2), but at least we can ignore some items (JJKKLL).
What should be the approach to solve this kind of problem (such that, at minimum, a result like AABBCCEE would be acceptable)?
Any java code would be appreciated :-)
You could try using the Levenshtein Distance between your "target" string (e.g. "ABBCCE") and each element in your set. Pick a maximum threshold above which you will consider items to be unrelated (in your example here, a threshold of one or two perhaps), and reject everything in the set that has a Levenshtein Distance greater than that from the target string.
An example implementation of the Levenshtein Distance computation in Java can be found here.
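To show how the threshold idea plays out on the strings from the question, here is a small Python sketch (Python rather than Java, purely for brevity). It uses the standard dynamic-programming formulation of the Levenshtein Distance; the helper name related and the threshold of 2 are assumptions for illustration.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def related(target, candidates, max_distance):
    """Return (distance, candidate) pairs within the threshold, closest first."""
    scored = [(levenshtein(target, c), c) for c in candidates]
    return sorted(pair for pair in scored if pair[0] <= max_distance)

candidates = ["JJKKLL", "ABBCC", "AAPPFFEE", "AABBCCDD", "ABBCEE", "AABBCCEE"]
print(related("ABBCCE", candidates, max_distance=2))
# [(1, 'ABBCC'), (1, 'ABBCEE'), (2, 'AABBCCEE')]
```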
You may be interested in the Levenshtein distance metric, which measures the similarity between two strings by counting the insertions, removals, and substitutions needed to turn one into the other.
List Comprehension is a very useful code mechanism that is found in several languages, such as Haskell, Python, and Ruby (just to name a few off the top of my head). I'm familiar with the construct.
I find myself working on an Open Office Spreadsheet and I need to do something fairly common: I want to count all of the values in a range of cells that fall between a lower and an upper bound. I instantly thought that list comprehension would do the trick, but I can't find anything analogous in Open Office. There is a function called "COUNTIF", and it does something similar, but not quite what I need.
Is there a construct in Open Office that could be used for list comprehension?
CountIf can count values equal to a chosen one. Unfortunately, it seems that there is no good candidate for such a function. Alternatively, you can use an additional column with If to display 1 or 0 depending on whether the value fits in the range:
=If(AND({list_cell}>=MinVal; {list_cell}<=MaxVal); 1; 0)
Then the only thing left is to sum up this additional column.
Assuming:
your range is A1:A10
your lower bound is at B1
your upper bound is at B2
then what you want can be achieved by:
=COUNTIFS(A1:A10, ">" & B1, A1:A10, "<" & B2)
(you might need to change commas into semicolons, depending on your language preference for decimal point)
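For readers coming from the list-comprehension side, this is roughly what the formula computes, written as a Python sketch. The function name count_between and the sample values are made up; the strict comparisons match the ">" and "<" criteria in the formula.

```python
def count_between(values, lower, upper):
    """Count values strictly between lower and upper, like the COUNTIFS above."""
    return sum(1 for value in values if lower < value < upper)

cells = [1, 5, 10, 15, 20]                       # stands in for A1:A10
print(count_between(cells, lower=5, upper=20))   # 2 (10 and 15)
```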
Quoting from the installed OpenOffice documentation:
The logical relation between criteria can be defined as logical AND (conjunction). In other words, if and only if all given criteria are met, a value from the corresponding cell of the given Func_Range is taken into calculation.
This function is part of the Open Document Format for Office Applications (OpenDocument) standard Version 1.2. (ISO/IEC 26300:2-2015)