I am using Atom to find and replace certain letters in different lines in a file.
However, I am not able to select several separate lines and then apply the "Only in Selection" option.
I used Cmd to select the lines in which I want to replace the letters.
Thanks for any help!
I am working on a multi-language book, and I need to find and replace all hyphens with en-dashes that occur between numbers in the citations section. I need to avoid all hyphens that exist between roman letters.
If I use a GREP [0-9]-[0-9] it selects the numbers before and after the hyphen and I have to manually select the hyphen and replace it with an en-dash. This is labor intensive.
Is there a way for me to find the hyphen that exists between the numbers, but EXCLUDE the numbers themselves from being highlighted? That way I can run Find and Replace instead of making what will probably be 1000+ manual changes.
I tried using GREP [0-9]-[0-9] to find the hyphens, but then couldn't find a way to have the find and replace keep the existing numbers.
That's what lookaheads and lookbehinds are for:
(?<=[0-9])-(?=[0-9])
If you have selected GREP, another way could be to use 2 capture groups and use those 2 groups in the replacement with an en-dash in between.
([0-9])-([0-9])
Replace with $1–$2
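For anyone who wants to try both patterns outside a GREP dialog, here is a minimal sketch in Python's `re` module. The function names and the sample sentence are made up for illustration, and Python writes the group references as `\g<1>` and `\g<2>` rather than `$1` and `$2`.

```python
import re

EN_DASH = "\u2013"

def dash_lookaround(text: str) -> str:
    # Match only the hyphen; the digits on either side are asserted, not consumed.
    return re.sub(r"(?<=[0-9])-(?=[0-9])", EN_DASH, text)

def dash_groups(text: str) -> str:
    # Match digit-hyphen-digit, then put the two captured digits back.
    return re.sub(r"([0-9])-([0-9])", rf"\g<1>{EN_DASH}\g<2>", text)

# Hypothetical citation text: hyphens between letters must stay untouched.
sample = "See pp. 12-15 in the well-known co-authored volume, 1998-2001."
print(dash_lookaround(sample))
print(dash_groups(sample))
```

One difference worth knowing: on a chained range such as `1-2-3`, the capture-group version consumes the `2` in its first match and so leaves the second hyphen alone, while the lookaround version replaces both.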
I have a PDF with tabular data that runs over 50+ pages, and I want to extract this table into an Excel file using Automation Anywhere (I am using the community version of AA 11.3). I watched videos about the PDF integration command but haven't had any success using it for tabular data.
Requesting assistance.
Thanks.
I am afraid that your case will be quite challenging, mainly because of the values that span multiple lines. You can still achieve what you need, and with good performance, but the code itself will not be pretty. You will also face challenges with Automation Anywhere, since it does not really provide the right tools for this, and you may need to resort to scripting (VBScript) or MetaBots.
Solution 1
This one will try to use purely text extraction and Regular expressions. Mainly standard functionality, nothing too "dirty".
First you need to understand what the exported data looks like. You can export to either Plain or Structured text.
The Plain one is not useful at all as the data is all over the place, without any clear pattern.
The Structured one is much better as the data structure resembles the data from the original document. From looking at the data you can make these observations:
Each row contains 5 columns
All columns are always filled (at least in the visible sample set)
The last two columns can serve as a pattern "anchor" (identifier), because they contain a clear pattern (a number followed by minimum of two spaces followed by a dollar sign and another number)
Rows with data are separated by a blank row
The text columns may contain a multiline value, which will duplicate the rows (this one thing makes it especially tricky)
First you need to ensure that the Structured data contains only the table and nothing else. You can probably use the Before-After string command for that.
Then you need to check whether you can reliably identify the character width of every column. You can try this for yourself: copy the text into Excel, use Text to Columns with the Fixed Width option, and play around with the sliders.
Then you need to find a way to reliably identify each row and prepare it for the Split command in AA. For that you need a delimiter. But since each data row can actually consist of multiple text rows, you need to create a delimiter of your own. I used the Replace function with the Regular Expression option to replace a specific pattern with a delimiter (pipe).
Now that you have added a custom delimiter, you can use the Split command to add each row into a list and loop through it.
Because each data row may consist of several text rows, you will need to use Split again, this time with [ENTER] as the delimiter. Then loop through each text line of a single data row, use the Substring function to extract the data based on column width, and concatenate the pieces into a single value that you store somewhere else.
All in all, a painful process.
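To make the steps of Solution 1 easier to follow, here is a rough sketch of the same logic in Python rather than AA commands. Everything specific in it is an assumption: the column boundaries in `COLUMNS`, the sample text in `SAMPLE`, and the function name `parse_records` are invented for illustration, and a real export will need its own measured widths. Instead of inserting a pipe delimiter, the sketch splits directly on the blank row between data rows.

```python
import re

# Hypothetical column boundaries (start, end) in characters; measure your own.
COLUMNS = [(0, 4), (4, 16), (16, 32), (32, 38), (38, None)]

# Anchor from the observations above: a number, two or more spaces, "$", a number.
ANCHOR = re.compile(r"\d+\s{2,}\$\d")

# Made-up stand-in for the Structured export: two data rows, the first spans two lines.
SAMPLE = "\n".join([
    "A1  Widget      Small blue      3     $4.50",
    "    Deluxe      widget",
    "",
    "B22 Gasket      Rubber seal     12    $0.99",
])

def parse_records(structured_text):
    records = []
    # Data rows are separated by a blank row.
    for block in re.split(r"\n\s*\n", structured_text.strip("\n")):
        lines = block.splitlines()
        # Skip blocks without the anchor pattern (headers, stray text).
        if not any(ANCHOR.search(line) for line in lines):
            continue
        cells = [[] for _ in COLUMNS]
        for line in lines:
            # Cut each text line by column width (the Substring step).
            for i, (start, end) in enumerate(COLUMNS):
                piece = line[start:end].strip()
                if piece:
                    cells[i].append(piece)
        # Concatenate the pieces of a multiline value into a single cell.
        records.append([" ".join(parts) for parts in cells])
    return records

records = parse_records(SAMPLE)
for record in records:
    print(record)
```

With the sample above, the first record comes out as `['A1', 'Widget Deluxe', 'Small blue widget', '3', '$4.50']`, which shows how the continuation line is folded back into its columns.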
Solution 2
This may not be applicable, but it's worth a try: open the PDF in Microsoft Word. It will give you a warning; ignore it. Word will attempt to open the document and, if you're lucky, it will recognise your table as a table. If it works, it will make the data extraction much easier and you will be able to use Macros/VBA or even simple copy and paste. I tried it on a random PDF of my own and it works quite well.
I have a problem that requires me to write a regex that finds a line containing exactly 3 groups of characters (they could be words or numbers) and ending with another specific word. The way I had in mind was to find a pattern that ends in a space and look for it 3 times. Assuming this is the correct way to go about it, I do not know how to find a space, but I thought it would look like .*"find a space"{3} endword$. Is this the way it would be done? Even if it is not the way to do it, how do you find a space? Any suggestions?
Assuming by three groups of words you would accept any non-space character, you could write:
/^\s*(?:\S+\s+){3}endword$/
The initial caret is to make sure you have exactly 3 non-space groups on the line.
Of course you need to consider whether things like control characters could appear, and adjust accordingly.
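As a quick check of the `^\s*(?:\S+\s+){3}endword$` pattern, here is a small Python sketch. The test lines are invented, and `endword` is just the placeholder from the question, to be replaced with the real word.

```python
import re

# "endword" is the placeholder from the question; substitute the real word.
PATTERN = re.compile(r"^\s*(?:\S+\s+){3}endword$")

lines = [
    "alpha beta gamma endword",        # three groups, then endword
    "  alpha 42 gamma   endword",      # leading and repeated spaces are accepted
    "alpha beta endword",              # only two groups
    "alpha beta gamma delta endword",  # four groups
    "alpha beta gamma endwords",       # wrong final word
]

matches = [line for line in lines if PATTERN.search(line)]
for line in matches:
    print(repr(line))
```

Only the first two lines match. Note that `\s` also matches newlines, so this sketch tests one line at a time rather than a whole multi-line string.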
Depending on your flavor, something like the below would do it:
\b.+?\b.+?\b.+?\bendword$
This makes use of the word boundary mark (\b) and non-greedy repetitions (+?), so it may be slightly different in your specific implementation, especially if you're using something old like grep.
I can't seem to find a package, plugin, or setting for highlighting matching LINES in Sublime Text 2.2. If you highlight a single word it will circle all the other matching words, but I need that for the entire line. See the attached image from EditPlus; that's what I need. Do you know of anything like this? Thanks!
After selecting the line of interest, Ctrl+D will select the next instance of the line.
If you want to select all matching lines, use Ctrl+Super+G on OS X (I think it's Alt+F3 on Windows).
No additional package is needed to find matching lines.
I want to do what I've done for the first 30 lines in this picture:
Except I don't want to do it manually, because I have 1838 lines of text.
After this I could easily delete duplicates, and all lines which don't contain URLs,
but right now it's kind of a mess.
In your particular example, where all links are YouTube videos as shown in the screenshot, the following should do:
Search:
(.+?)\b(https?://www.youtube.com(?:/watch\?v=\w*)?)\b(.+)
Replace:
$1\n$2\n$3
Don't forget to select the Regular Expression Search Mode radio button first.
If you need a more generic approach that matches any URL, the regex gets much more complicated.
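If you would rather script this than use the editor dialog, the same search and replace can be sketched in Python. This is a lightly adapted version of the search pattern above, not the original: the dots are escaped and the video ID part allows hyphens, both of which are my assumptions about what the links look like. The sample text and the name `split_links` are made up. Python writes the replacement as `\1\n\2\n\3` instead of `$1\n$2\n$3`.

```python
import re

# Adapted from the search pattern above: dots escaped, hyphens allowed in the ID.
LINK = re.compile(r"(.+?)\b(https?://www\.youtube\.com(?:/watch\?v=[\w-]*)?)\b(.+)")

def split_links(text: str) -> str:
    # Put the text before the link, the link, and the text after it on separate lines.
    return LINK.sub(r"\1\n\2\n\3", text)

# Hypothetical input: one line with an embedded link, one line without.
text = "Great song https://www.youtube.com/watch?v=abc-123_XYZ uploaded 2012\nno link here"
result = split_links(text)
print(result)

# Follow-up cleanup described in the question: keep only URL lines, drop duplicates.
urls = list(dict.fromkeys(line for line in result.splitlines() if line.startswith("http")))
print(urls)
```

Because the pattern requires at least one character on each side of the link, a line that consists only of a URL is left unchanged, which is harmless here since it is already on its own line.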