Grep with BBEdit - grep

I do have one large text file with lot of the following patterns;
because of,this
or that,has
or,not
Of course I want to change the following
because of, this
or that, has
or, not
To make myself clear: i would like to insert a space after each ,
How can i do that with BBEdit Find/Replace/Grep?
Find works ok with
[\,](\w)
but i can't figure out the coresponding part for replace.

Find: (\,)(\w)
Replace: \1 \2
Note: You need to hit the spacebar between \1 and \2. Works on my computer with BBedit v13

I complicates matters.
Pattern like Letter,Letter can also be found with ,\b.
\b is at the beginning or end of a word. \b in a regular expression means "word boundary".
The replacement is then done with ,_
Nota Bene: _ is a "Space" after ,

you could just replace all commas with a comma-space and then replace all space-space with space

Related

Replacing part of LaTeX command using BBedit grep

How can I use the BBedit grep option to replace LaTeX commands like
\textcolor{blue}{Some text}
by the contents of the second set of braces, so
Some text
?
The BBEdit Grep Tutorial gives a lot of information and good examples on using the grep option in BBEdit. What you are trying to achieve is actually a variation of one of the examples. The solution is to enter the following:
Find: \\textcolor\{blue\}\{([^\}]*)\}
Replace: \1
The relevant part is the "Find" section. The first part: \\textcolor\{blue\}\{ basically searches for the content \textcolor{blue}{. You need the \s to escape special characters.
Next, we have the cryptic sequence ([^\}]*): The (...) saves everything inside the parentheses into the variable \1, which you can use in the "Replace" section to insert the content. The [^\}]* consists of ^\} which means match all characters which are not ^ a closing brace \}. With [...]* we say, match any number of "not brace" characters. Overall, this expression makes the grep match all characters which are not closing braces, and saves them into \1.
Finally, the expression ends with a \}, i.e. a closing brace, which is the end of what we want to find.
The "Replace" only contains \1, which is everything inside the parentheses (...) in the "Find" field.

Easiest way to remove Latex tag (but not its content)?

I am using TeXnicCenter to edit a LaTeX document.
I now want to remove a certain tag (say, emph{blabla}} which occurs multiple times in my document , but not tag's content (so in this example, I want to remove all emphasization).
What is the easiest way to do so?
May also be using another program easily available on Windows 7.
Edit: In response to regex suggestions, it is important that it can deal with nested tags.
Edit 2: I really want to remove the tag from the text file, not just disable it.
Using a regular expression do something like s/\\emph\{([^\}]*)\}/\1/g. If you are not familiar with regular expressions this says:
s -- replace
/ -- begin match section
\\emph\{ -- match \emph{
( -- begin capture
[^\}]* -- match any characters except (meaning up until) a close brace because:
[] a group of characters
^ means not or "everything except"
\} -- the close brace
and * means 0 or more times
) -- end capture, because this is the first (in this case only) capture, it is number 1
\} -- match end brace
/ -- begin replace section
\1 -- replace with captured section number 1
/ -- end regular expression, begin extra flags
g -- global flag, meaning do this every time the match is found not just the first time
This is with Perl syntax, as that is what I am familiar with. The following perl "one-liners" will accomplish two tasks
perl -pe 's/\\emph\{([^\}]*)\}/\1/g' filename will "test" printing the file to the command line
perl -pi -e 's/\\emph\{([^\}]*)\}/\1/g' filename will change the file in place.
Similar commands may be available in your editor, but if not this will (should) work.
Crowley should have added this as an answer, but I will do that for him, if you replace all \emph{ with { you should be able to do this without disturbing the other content. It will still be in braces, but unless you have done some odd stuff it shouldn't matter.
The regex would be a simple s/\\emph\{/\{/g but the search and replace in your editor will do that one too.
Edit: Sorry, used the wrong brace in the regex, fixed now.
\renewcommand{\emph}[1]{#1}
any reasonably advanced editor should let you do a search/replace using regular expressions, replacing emph{bla} by bla etc.

Regular expression in Ruby

Could anybody help me make a proper regular expression from a bunch of text in Ruby. I tried a lot but I don't know how to handle variable length titles.
The string will be of format <sometext>title:"<actual_title>"<sometext>. I want to extract actual_title from this string.
I tried /title:"."/ but it doesnt find any matches as it expects a closing quotation after one variable from opening quotation. I couldn't figure how to make it check for variable length of string. Any help is appreciated. Thanks.
. matches any single character. Putting + after a character will match one or more of those characters. So .+ will match one or more characters of any sort. Also, you should put a question mark after it so that it matches the first closing-quotation mark it comes across. So:
/title:"(.+?)"/
The parentheses are necessary if you want to extract the title text that it matched out of there.
/title:"([^"]*)"/
The parentheses create a capturing group. Inside is first a character class. The ^ means it's negated, so it matches any character that's not a ". The * means 0 or more. You can change it to one or more by using + instead of *.
I like /title:"(.+?)"/ because of it's use of lazy matching to stop the .+ consuming all text until the last " on the line is found.
It won't work if the string wraps lines or includes escaped quotes.
In programming languages where you want to be able to include the string deliminator inside a string you usually provide an 'escape' character or sequence.
If your escape character was \ then you could write something like this...
/title:"((?:\\"|[^"])+)"/
This is a railroad diagram. Railroad diagrams show you what order things are parsed... imagine you are a train starting at the left. You consume title:" then \" if you can.. if you can't then you consume not a ". The > means this path is preferred... so you try to loop... if you can't you have to consume a '"' to finish.
I made this with https://regexper.com/#%2Ftitle%3A%22((%3F%3A%5C%5C%22%7C%5B%5E%22%5D)%2B)%22%2F
but there is now a plugin for Atom text editor too that does this.

Removing accents/diacritics from string while preserving other special chars (tried mb_chars.normalize and iconv)

There is a very similar question already. One of the solutions uses code like this one:
string.mb_chars.normalize(:kd).gsub(/[^x00-\x7F]/n, '').to_s
Which works wonders, until you notice it also removes spaces, dots, dashes, and who knows what else.
I'm not really sure how the first code works, but could it be made to strip only accents? Or at the very least be given a list of chars to preserve? My knowledge of regexps is small, but I tried (to no avail):
/[^\-x00-\x7F]/n # So it would leave the dash alone
I'm about to do something like this:
string.mb_chars.normalize(:kd).gsub('-', '__DASH__').gsub
(/[^x00-\x7F]/n, '').gsub('__DASH__', '-').to_s
Atrocious? Yes...
I've also tried:
iconv = Iconv.new('UTF-8', 'US-ASCII//TRANSLIT') # Also tried ISO-8859-1
iconv.iconv 'Café' # Throws an error: Iconv::IllegalSequence: "é"
Help please?
it also removes spaces, dots, dashes, and who knows what else.
It shouldn't.
string.mb_chars.normalize(:kd).gsub(/[^x00-\x7F]/n, '').to_s
You've mistyped, there should be a backslash before the x00, to refer to the NUL character.
/[^\-x00-\x7F]/n # So it would leave the dash alone
You've put the ‘-’ between the ‘\’ and the ‘x’, which will break the reference to the null character, and thus break the range.
I'd use the transliterate method. See http://api.rubyonrails.org/classes/ActiveSupport/Inflector.html#method-i-transliterate
It's not as neat as Iconv, but does what I think you want:
http://snippets.dzone.com/posts/show/2384

Customising word separators in vi

vi treats dash - and space as word separators for commands such as dw and cw.
Is there a way to add underscore _ as well?
I quite often want to change part of a variable name containing underscores, such as changing src_branch to dest_branch. I end up counting characters and using s (like 3sdest), but it would be much easier to use cw (like cwdest).
Is there a way to add underscore _ as well?
:set iskeyword-=_
What is, and is not a member character to keywords depends on the language. For help on iskeyword use :help iskeyword.
In case you're using vim, you can change that by setting the iskeyword option (:he iskeyword). If that is not an option, you can always use ct_ instead of counting.
One other good option in such cases is to use camelcasemotion plugin.
It adds new motions ,b, ,e, and ,w, which work analogously with b, e, and w, except that they recognize CamelCase and snake_case words. With it you can use
c,edest
and this will replace "src_branch" with "dest_branch" if your cursor was on first character of "src_branch".
You could type cf_dest_ and save the counting part.
Edit: or as suggested: ct_ changes text until right before the underline character. (I'm using the f motion more, so it came more naturally to me)
Or you could redefine 'iskeyword' (:help iskeyword for details).
I was just looking at this myself and added this to my .vimrc:
set iskeyword=!-~,^*,^45,^124,^34,192-255,^_
My .vimrc had issues with ^| and ^", which was part of the default iskeyword for my setup, so I changed to their ascii values and it works fine. My main modification was to add "^_" to the end of the default setting to keep vim from seeing underscore as being part of a word.
To delete to the next underscore enter "df_"
To change to the next underscore enter "cf_"
NOTE: don't include the double quotes.

Resources