String input in Forth

I need a user to input two words from the keyboard and then do things with their letters. I found this example in a book, and it works for one word, but when I try to enter two words I immediately get: Stack underflow >invitation<
I'm really new to Forth. I've tried to find examples in books, but I couldn't find any. How can I solve my problem?
CREATE word1 14 ALLOT
: .wordik1 word1 14 -TRAILING TYPE ;
: .getword1 word1 14 BLANK word1 14 EXPECT ;
CREATE word2 15 ALLOT
: .wordik2 word2 15 -TRAILING TYPE ;
: .getword2 word2 15 BLANK word2 EXPECT ;
: invitation
." First word:" CR
.getword1
." Second word:" CR
.getword2
;
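The likely culprit is the definition of .getword2: EXPECT takes an address and a maximum length ( addr +n -- ), but that definition pushes only the address, so EXPECT pops the missing length from an empty stack and underflows. A minimal sketch of the fix, mirroring .getword1:
: .getword2 word2 15 BLANK word2 15 EXPECT ;  \ supply the length to EXPECT
(In ANS Forth, ACCEPT, which also returns the number of characters actually read, is preferred over the obsolescent EXPECT.)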

Related

Removing multiple specific lines from a txt with Batch

I am trying to remove multiple lines from a text file that have been parsed out of a PDF.
What the file looks like:
word1
word2
word3
b
word4
word5
b
word6
B
b
word7
word8
word9
b
Now the results I am looking for:
word1
word2
word3
word4
word5
word6
B (a user's initial, which should remain)
word7
word8
word9
Issues:
I can't get the batch script to be case-sensitive, and when I do somewhat get it working, it removes all the b's from inside words as well.
I keep running into issues trying to achieve this in batch. I have no example script because I did not make any progress. Does anyone have a way to do this properly?
If possible, I would like to have it working 100% in batch with no dependencies, please.
Using findstr's regular expression support will help you here. findstr is case-sensitive by default (pass /I for case-insensitive matching), so \<b\> and \<B\> are distinct patterns. To exclude all lowercase standalone b's you can do:
(findstr /V /RC:"\<b\>" filename.txt)>output.txt
Or to find only the uppercase standalone B's and no other text:
(findstr /RC:"\<B\>" filename.txt)>output.txt
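If this needs to live in a .bat file, a minimal sketch (input.txt and output.txt are placeholder names):
@echo off
rem /V inverts the match, /R treats the /C: string as a regex, and
rem \< \> are word boundaries; findstr is case-sensitive by default.
findstr /V /RC:"\<b\>" "input.txt" > "output.txt"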

Lua patterns - why does custom set '[+-_]' match alphanumeric characters?

I was playing around with some patterns today to try to match some specific characters in a string, and ran into something unusual that I'm hoping someone can explain.
I had created a set looking for a list of characters within some strings, and noticed I was getting back some unexpected results. I eliminated the characters in the set until I got down to just three, and it seems to be these three that are responsible:
string = "alpha.5dc1704B40bc7f.beta.123456789.gamma.987654321.delta.abc123ABC321"
result = ""
for a in string.gmatch(string, '[+-_]') do
result = result .. a .. " "
end
> print(result)
. 5 1 7 0 4 B 4 0 7 . . 1 2 3 4 5 6 7 8 9 . . 9 8 7 6 5 4 3 2 1 . . 1 2 3 A B C 3 2 1
Why are these characters getting returned here (looks like any number or uppercase letter, plus dots)? I note that if I change up the order of the set, I don't get the same output - '[_+-]' or '[-_+]' or '[+_-]' or '[-+_]' all return nothing, as expected.
What is it about '[+-_]' that's causing a match here? I can't figure out what I'm telling lua that is being interpreted as instructions to match these characters.
When a - appears between two other characters inside square brackets, it denotes a range: every character between those two, by byte value. For example, [a-z] is all of the lowercase letters, and [A-F] is A, B, C, D, E, and F. [+-_] therefore means every ASCII character from + (43) through _ (95), which includes the dot, all the digits, all the uppercase letters, and a lot of other punctuation.
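If the goal was to match the literal characters +, - and _, escape + and - with % (or place - first or last in the set so it cannot form a range). A small sketch with a made-up string:
s = "key_1+key-2"
for c in string.gmatch(s, "[%+%-_]") do  -- %+ and %- are literals; _ needs no escape
  io.write(c, " ")                       -- prints: _ + -
end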

Merging >2 files with AWK or JOIN?

Merging 2 files using AWK is a well-covered topic on Stack Overflow. However, the technique of reading 3 files into arrays gets more complicated. As I'm formatting the output to go into an R script, I need to add lots of syntax, so I don't think I can use JOIN. Here is a simplistic version I have working so far:
awk 'FNR==1{f++}
f==1{a[FNR]=$1;next}
f==2{b[FNR]=$1;next}
{print a[FNR], "<- c(", b[FNR], ",", $1, ")"}' words.txt x.txt y.txt
Where:
$ cat words.txt
word1
word2
word3
$ cat x.txt
1
2
3
$ cat y.txt
11
22
33
The output is then
word1 <- c(1, 11)
word2 <- c(2, 22)
word3 <- c(3, 33)
The best way I can summarize this technique is:
Create a variable f to keep track of which file you're processing
For file 1 read the values into array a
For file 2 read the values into array b
Fall through to file three, where you concatenate your final output
As a beginner to AWK this works, but I find it a bit awkward, and I worry that when I come back to the code in 6 months I'll no longer understand it. Is this the best way to merge these 3 files in AWK? Could JOIN actually handle this level of formatting for the final output?
A variation of @RavinderSingh13's solution:
$ paste {words,x,y}.txt | awk '{print $1, "<- c(" $2 ", " $3 ")"}'
EDIT: Could you please try the following.
paste words.txt x.txt y.txt | awk '{$2="<- c("$2", "$3")";$3="";sub(/ +$/,"")} 1'
Output will be as follows.
word1 <- c(1, 11)
word2 <- c(2, 22)
word3 <- c(3, 33)
In case you simply want to combine the 3 files' contents column-wise, then try the following.
paste words.txt x.txt y.txt
word1 1 11
word2 2 22
word3 3 33
If it's for readability, you can change the file-checking method, as well as the variable names.
Try these:
awk 'ARGIND==1{words[FNR]=$1;}
ARGIND==2{xcol[FNR]=$1;}
ARGIND==3{print words[FNR], "<- c(", xcol[FNR], ",", $1, ")"}' words.txt x.txt y.txt
The file-checking method above (ARGIND) is GNU awk only.
Changed to another method, and with the file-reading order changed as well:
awk 'FILENAME=="words.txt"{print $1, "<- c(", xcol[FNR], ",", ycol[FNR], ")";}
FILENAME=="x.txt"{xcol[FNR]=$1;}
FILENAME=="y.txt"{ycol[FNR]=$1;}' x.txt y.txt words.txt
As you can also see here, the file-reading order and the block order can differ.
Since words.txt holds the first column (the main column, so to speak), it's sensible to read it last.
You can also use FILENAME==ARGV[1], FILENAME==ARGV[2], etc. to check files, and put comments inside (an awk script file loaded with awk -f scriptfile works better for comments):
awk 'FILENAME==ARGV[1]{xcol[FNR]=$1;} # Read column B, the x column
FILENAME==ARGV[2]{ycol[FNR]=$1;} # Read column C, the y column
FILENAME==ARGV[3]{print $1, "<- c(", xcol[FNR], ",", ycol[FNR], ")";}' x.txt y.txt words.txt
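As for the question of whether JOIN could handle this: join needs a shared key, but you can manufacture one from line numbers with nl, chain two joins, and leave the R-style formatting to a final awk. A sketch assuming bash (for the process substitutions) and GNU coreutils; note that join expects lexically sorted keys, so this is only safe for small line counts like the toy input here:
join <(nl -ba words.txt) <(nl -ba x.txt) | join - <(nl -ba y.txt) \
  | awk '{print $2, "<- c(" $3 ", " $4 ")"}'
The line number becomes field 1 of each joined record, so the awk at the end simply drops it and formats the rest.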

ANTLR4 lexer rules not matching correct block of text

I am trying to understand how ANTLR4 works based on lexer and parser rules, but I am missing something in the following example:
I am trying to parse a file and match all mathematical additions (e.g. 1+2+3). My file contains the following text:
start
4 + 5 + 22 + 1
other text other text test test
test test other text
55 other text
another text 2 + 4 + 255
number 44
end
and I would like to match
4 + 5 + 22 + 1
and
2 + 4 + 255
My grammar is as follows:
grammar Hello;
hi : expr+ EOF;
expr : NUM (PLUS NUM)+;
PLUS : '+' ;
NUM : [0-9]+ ;
SPACE : [\n\r\t ]+ ->skip;
OTHER : [a-z]+ ;
My abstract syntax tree is visualized as follows:
[parse tree image]
Why does rule 'expr' match the text 'start'? I also get the error "extraneous input 'start' expecting NUM".
If I make the following change in my grammar
OTHER : [a-z]+ ->skip;
the error is gone. In addition, in the image above, the text '55 other text another text' matches the expression as a node in the AST. Why is this happening?
Does all of the above have to do with the way the lexer matches the input? I know that the lexer looks for the longest matching rule, but how can I change my grammar so that it matches only the additions?
Why does rule 'expr' match the text 'start'?
It doesn't. When a token shows up red in the tree, that indicates an error. The token did not match any of the possible alternatives, so an error was produced and the parser continued with the next token.
In addition in the image above text '55 other text another text' matches the expression as a node in the AST. Why is this happening?
After you skipped the OTHER tokens, your input basically looks like this:
4 + 5 + 22 + 1 55 2 + 4 + 255 44
4 + 5 + 22 + 1 can be parsed as an expression, no problem. After that the parser either expects a + (continuing the expression) or a number (starting a new expression). So when it sees 55, that indicates the start of a new expression. Now it expects a + (because the grammar says that PLUS NUM must appear at least once after the first number in an expression). What it actually gets is the number 2. So it produces an error and ignores that token. Then it sees a +, which is what it expected. And then it continues that way until the 44, which again starts a new expression. Since that isn't followed by a +, that's another error.
Does all of the above have to do with the way the lexer matches the input?
Not really. The token sequence for "start 4 + 5" is OTHER NUM PLUS NUM, or just NUM PLUS NUM if you skip the OTHERs. The token sequence for "55 skippedtext 2 + 4" is NUM NUM PLUS NUM. I assume that's exactly what you'd expect.
Instead, what seems to be confusing you is how ANTLR recovers from errors (or perhaps the fact that it recovers from errors at all).
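To actually match only the additions, one option (a sketch that keeps your token names, not the only way to do it) is to skip OTHER and let the start rule accept stray numbers, so lone tokens like 55 and 44 no longer trigger error recovery:
grammar Hello;
hi    : (expr | NUM)* EOF ;    // stray NUMs are consumed without error
expr  : NUM (PLUS NUM)+ ;
PLUS  : '+' ;
NUM   : [0-9]+ ;
SPACE : [\n\r\t ]+ -> skip ;
OTHER : [a-z]+ -> skip ;
With this grammar, 4 + 5 + 22 + 1 and 2 + 4 + 255 come out as expr subtrees, while 55 and 44 are matched as bare NUM tokens at the top level.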

Recursive grep with wildcard and a pattern in the middle

I have 4 patterns of lines in files, in current directory and subdirectories:
type bed
type bed 1
type bed 1 +
type bed 1 .
type bed 2
type bed 2 +
type bed 2 .
etc., where the pattern is that the number (1 - 15) after "bed" increases, followed by a "+", a ".", or nothing at all.
I need to corral output for only the files with lines of the form
type bed 1 +
type bed 2 +
type bed 3 +
and I do not want to see files like
type bed
type bed 1
type bed 1 .
type bed 2
type bed 2 .
etc.
What I've tried:
grep -r -E 'type bed [1-15]' *
But I can't figure out how to limit my search to only include matches that are followed by a "+".
And I need help defining [1-15]: it seems only [1-9] works, in that it returns the numbers 1 through 9, but as soon as I try double-digit numbers, my results are not what I would expect.
Ideas or links to related posts much appreciated!
A bracket expression can't express the number range 1 - 15: [1-15] just means the characters 1 and 5. Use an alternation for the two-digit cases instead, and escape the + with \ to match a literal + at the end:
grep -r -E 'type bed ([1-9]|1[0-5]) \+' *
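A quick way to sanity-check the range part, feeding some of the sample lines straight to grep:
printf '%s\n' 'type bed' 'type bed 1' 'type bed 10 +' 'type bed 2 .' 'type bed 15 +' \
  | grep -E 'type bed ([1-9]|1[0-5]) \+'
This prints only 'type bed 10 +' and 'type bed 15 +'.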
