How to escape ( parenthesis in stata - invalid '(' error r(196) - path

I have to replicate a do file of a colleague who used a macro for his file names. The problem is that the pathname contains a parenthesis, which causes problems:
*setting directory
cd "D:/Dropbox (Center for Child Well-being and Development)/2020/Playground"
*setup
sysuse auto
save "/Dropbox (Center for Child Well-being and Development)/example", replace
*problem
global path "/Dropbox (Center for Child Well-being and Development)"
local file "/example.dta"
global data "$path`file'""
disp "$data"
use $data
I get the following output
. disp "$data"
/Dropbox (Center for Child Well-being and Development)/example.dta
. use $data
invalid '('
r(198);
I know that calling the macro within quotations as use "$data" would do the job, but as it is not my do file I would like to try to avoid changing every occurrence where the macro is used.
I tried to escape the parenthesis with \( and add various numbers of quotations at any position I could imagine while constructing the global. Also I tried to add escaped quotations \" which did work neither.

Related

End of line lex

I am writing an interpreter for assembly using lex and yacc. The problem is that I need to parse a word that will strictly be at the end of the file. I've read that there is an anchor $, which can help. However it doesn't work as I expected. I've wrote this in my lex file:
ABC$ {printf("QWERTY\n");}
The input file is:
ABC
without spaces or any other invisible symbols. So I expect the outputput to be QWERTY, however what I get is:
ABC
which I guess means that the program couldn't parse it. Then I thought, that $ might be a regular symbol in lex, so I changed the input file into this:
ABC$
So, if $ isn't a special symbol, then it will be parsed as a normal symbol, and the output will be QWERTY. This doesn't happen, the output is:
ABC$
The question is whether $ in lex is a normal symbol or special one.
In (f)lex, $ matches zero characters followed by a newline character.
That's different from many regex libraries where $ will match at the end of input. So if your file does not have a newline at the end, as your question indicates (assuming you consider newline to be an invisible character), it won't be matched.
As #sepp2k suggests in a comment, the pattern also won't be matched if the input file happens to use Windows line endings (which consist of the sequence \r\n), unless the generated flex file was compiled for Windows. So if you created the file on Windows and run the flex-generated scanner in a Unix environment, the \r will also cause the pattern to fail to match. In that case, you can use (f)lex's trailing context operator:
ABC/\r?\n { puts("Matched ABC at the end of a line"); }
See the flex documentation for patterns for a full description of the trailing context operator. (Search for "trailing context" on that page; it's roughly halfway down.) $ is exactly equivalent to /\n.
That still won't match ABC at the very end of the file. Matching strings at the very end of the file is a bit tricky, but it can be done with two patterns if it's ok to recognise the string other than at the end of the file, triggering a different action:
ABC/. { /* Do nothing. This ABC is not at the end of a line or the file */ }
ABC { puts("ABC recognised at the end of a line"); }
That works because the first pattern will match as long as there is some non-newline character following ABC. (. matches any character other than a newline. See the above link for details.) If you also need to work with Windows line endings, you'll need to modify the trailing context in the first pattern.

Replacing part of LaTeX command using BBedit grep

How can I use the BBedit grep option to replace LaTeX commands like
\textcolor{blue}{Some text}
by the contents of the second set of braces, so
Some text
?
The BBEdit Grep Tutorial gives a lot of information and good examples on using the grep option in BBEdit. What you are trying to achieve is actually a variation of one of the examples. The solution is to enter the following:
Find: \\textcolor\{blue\}\{([^\}]*)\}
Replace: \1
The relevant part is the "Find" section. The first part: \\textcolor\{blue\}\{ basically searches for the content \textcolor{blue}{. You need the \s to escape special characters.
Next, we have the cryptic sequence ([^\}]*): The (...) saves everything inside the parentheses into the variable \1, which you can use in the "Replace" section to insert the content. The [^\}]* consists of ^\} which means match all characters which are not ^ a closing brace \}. With [...]* we say, match any number of "not brace" characters. Overall, this expression makes the grep match all characters which are not closing braces, and saves them into \1.
Finally, the expression ends with a \}, i.e. a closing brace, which is the end of what we want to find.
The "Replace" only contains \1, which is everything inside the parentheses (...) in the "Find" field.

Invalid '`' error when using local macro

I am following instructions from this link on how to append Stata files via a foreach loop. I think that it's pretty straightforward.
However, when I try to refer to each f in datafiles in my foreach loop, I receive the error:
invalid `
I've set my working directory and the data is in a subfolder called csvfiles. I am trying to call each file f in the csvfiles subfolder using my local macro datafiles and then append each file to an aggregate Stata dataset called data.dta.
I've included the code from my do file below:
clear
local datafiles: dir "csvfiles" files "*.csv"
foreach f of local datafiles {
preserve
insheet using “csvfiles\`f'”, clear
** add syntax here to run on each file**
save temp, replace
restore
append using temp
}
rm temp
save data.dta, replace
The backslash character has meaning to Stata: it will prevent the interpretation of any following character that has a special meaning to Stata, in particular the left single quote character
`
will not be interpreted as indicating a reference to a macro.
But all is not lost: Stata will allow you to use the forward slash character in path names on any operating system, and on Windows will take care of doing what must be done to appease Windows. Replacing your insheet command with
insheet using “csvfiles/`f'”, clear
should solve your problem.
Note that the instructions you linked to do exactly that; some of the code includes backslashes in path names, but where a macro is included, forward slashes are used instead.

Remove \text generated by TeXForm

I need to remove all \text generated by TeXForm in Mathematica.
What I am doing now is this:
MyTeXForm[a_]:=StringReplace[ToString[TeXForm[a]], "\\text" -> ""]
But the result keeps the braces, for example:
for a=fx,
the result of TeXForm[a] is \text{fx}
the result of MyTeXForm[a] is {fx}
But what I would like is it to be just fx
You should be able to use string patterns. Based on http://reference.wolfram.com/mathematica/tutorial/StringPatterns.html, something like the following should work:
MyTeXForm[a_]:=StringReplace[ToString[TeXForm[a]], "\\text{"~~s___~~"}"->s]
I don't have Mathematica handy right now, but this should say 'Match "\text{" followed by zero or more characters that are stored in the variable s, followed by "}", then replace all of that with whatever is stored in s.'
UPDATE:
The above works in the simplest case of there being a single "\text{...}" element, but the pattern s___ is greedy, so on input a+bb+xx+y, which Mathematica's TeXForm renders as "a+\text{bb}+\text{xx}+y", it matches everything between the first "\text{" and last "}" --- so, "bb}+\text{xx" --- leading to the output
In[1]:= MyTeXForm[a+bb+xx+y]
Out[1]= a+bb}+\text{xx+y
A fix for this is to wrap the pattern with Shortest[], leading to a second definition
In[2]:= MyTeXForm2[a_] := StringReplace[
ToString[TeXForm[a]],
Shortest["\\text{" ~~ s___ ~~ "}"] -> s
]
which yields the output
In[3]:= MyTeXForm2[a+bb+xx+y]
Out[3]= a+bb+xx+y
as desired.
Unfortunately this still won't work when the text itself contains a closing brace. For example, the input f["a}b","c}d"] (for some reason...) would give
In[4]:= MyTeXForm2[f["a}b","c}d"]]
Out[4]= f(a$\$b},c$\$d})
instead of "f(a$\}$b,c$\}$d)", which would be the proper processing of the TeXForm output "f(\text{a$\}$b},\text{c$\}$d})".
This is what I did (works fine for me):
MyTeXForm[a_] := ToString[ToExpression[StringReplace[ToString[TeXForm[a]], "\\text" -> ""]][[1]]]
This is a really late reply, but I just came up against the same issue and discovered a simple solution. Put a space between the variables in the Mathematica expression that you wish to convert using TexForm.
For the original poster's example, the following code works great:
a=f x
TeXForm[a]
The output is as desired: f x
Since LaTeX will ignore that space in math mode, things will format correctly.
(As an aside, I was having the same issue with subscripted expressions that have two side-by-side variables in the subscript. Inserting a space between them solved the issue.)

Regular expression in Ruby

Could anybody help me make a proper regular expression from a bunch of text in Ruby. I tried a lot but I don't know how to handle variable length titles.
The string will be of format <sometext>title:"<actual_title>"<sometext>. I want to extract actual_title from this string.
I tried /title:"."/ but it doesnt find any matches as it expects a closing quotation after one variable from opening quotation. I couldn't figure how to make it check for variable length of string. Any help is appreciated. Thanks.
. matches any single character. Putting + after a character will match one or more of those characters. So .+ will match one or more characters of any sort. Also, you should put a question mark after it so that it matches the first closing-quotation mark it comes across. So:
/title:"(.+?)"/
The parentheses are necessary if you want to extract the title text that it matched out of there.
/title:"([^"]*)"/
The parentheses create a capturing group. Inside is first a character class. The ^ means it's negated, so it matches any character that's not a ". The * means 0 or more. You can change it to one or more by using + instead of *.
I like /title:"(.+?)"/ because of it's use of lazy matching to stop the .+ consuming all text until the last " on the line is found.
It won't work if the string wraps lines or includes escaped quotes.
In programming languages where you want to be able to include the string deliminator inside a string you usually provide an 'escape' character or sequence.
If your escape character was \ then you could write something like this...
/title:"((?:\\"|[^"])+)"/
This is a railroad diagram. Railroad diagrams show you what order things are parsed... imagine you are a train starting at the left. You consume title:" then \" if you can.. if you can't then you consume not a ". The > means this path is preferred... so you try to loop... if you can't you have to consume a '"' to finish.
I made this with https://regexper.com/#%2Ftitle%3A%22((%3F%3A%5C%5C%22%7C%5B%5E%22%5D)%2B)%22%2F
but there is now a plugin for Atom text editor too that does this.

Resources