I'm trying to create a grammar. Here's my code so far:
use Text::Table::Simple; # zef install Text::Table::Simple
my $desc = q:to"FIN";
record person
name string;
age int;
end-record
FIN
grammar rec {
token TOP { <ws>* 'record' \s+ <rec-name> <field-descriptors> <ws> 'end-record' <ws> }
token rec-name { \S+ }
token field-descriptors { <field-descriptor>* }
token field-descriptor { <ws>* <field-name> <ws>+ <field-type> <ws>* ';' }
token field-name { \S+ }
token field-type { <[a..z]>+ }
token ws { <[\r\n\t\ ]> }
}
class recActions {
method field-descriptors($/) { $/.make: $/; }
method field-descriptor($/) { $/.make: $/; }
method field-name($/) { $/.make: $/ }
method field-type($/) { $/.make: $/ }
}
my $r = rec.parse($desc, :actions(recActions));
#say $r;
my $inp = q:to"FIN";
adam 26
joe 23
mark 51
FIN
sub splitter($line) {
my @lst = split /\s+/, $line;
}
sub matrixify(&splitter, $data)
{
my @d = (split /\n/, (trim-trailing $data)).map( -> $x { splitter $x ; } );
#@d.say;
#my @cols = <name age>;
#say lol2table(@cols, @d).join("\n");
@d;
}
#my @cols =<A B>;
#my @rows = ([1,2], [3,4]);
#say lol2table(@cols, @rows).join("\n");
my @m = matrixify &splitter, $inp;
sub tabulate($rec-desc, @matrix)
{
my $fds = $rec-desc<field-descriptors>;
#say %fds<field-name>;
say $fds;
my @cols = $rec-desc.<field-descriptors>.map( -> $fd { say $fd; $fd.<field-name> ; 1;} );
#say $rec-desc.<field-descriptors>;
#say @cols;
}
tabulate $r, @m ;
I really just want the grammar to create a tree of lists/hash tables from the input. The output from the code is:
「
name string;
age int;」
field-descriptor => 「
name string;」
ws => 「
」
ws => 「 」
field-name => 「name」
ws => 「 」
field-type => 「string」
field-descriptor => 「
age int;」
ws => 「
」
ws => 「 」
field-name => 「age」
ws => 「 」
ws => 「 」
field-type => 「int」
which looks fairly good. perl6 seems to be decoding the fact that field-descriptors is composed of multiple field-descriptor, but it doesn't actually seem to put them into a list. I can do say $fds;, but I can't do say $fds[0];. Why does the former "work", but the latter doesn't?
I must admit to having a fairly weak grasp of what's going on. Would I be better off using rules instead of tokens? Do I really need an actions class; can't I just get perl to "automagically" populate the parse tree for me without having to specify a class of actions?
Update: possible solution
Suppose we just want to parse:
my $desc = q:to"FIN";
record person
name string;
age int;
end-record
FIN
and report on the field names and types that we find. I'm going to make a slight simplification to the grammar I wrote above:
grammar rec {
token TOP { <ws>* 'record' \s+ <rec-name> <field-descriptor>+ <ws> 'end-record' <ws> }
token rec-name { \S+ }
token field-descriptor { <ws>* <field-name> <ws>+ <field-type> <ws>* ';' }
token field-name { \S+ }
token field-type { <[a..z]>+ }
token ws { <[\r\n\t\ ]> }
}
Let's eschew actions completely, and just parse it into a tree:
my $r1 = rec.parse($desc);
Let's now inspect our handiwork, and print out the name and type for each field that we have parsed:
for $r1<field-descriptor> -> $fd { say "Name: $fd<field-name>, Type: $fd<field-type>"; }
Our output is as we expect:
Name: name, Type: string
Name: age, Type: int
I know you're now all set but here's an answer to wrap things up for others reading things later.
How do I just create a parse tree with perl6 grammar?
It's as simple as it can get: just use the return value from calling one of the built in parsing routines.
(Provided parsing is successful, parse and its cousins return a parse tree.)
The output from the code ... looks fairly good. perl6 seems to be decoding the fact that field-descriptors is composed of multiple field-descriptor, but it doesn't actually seem to put them into a list. I can do say $fds;, but I can't do say $fds[0];. Why does the former "work", but the latter doesn't?
See my answer to the SO question "How do I access the captures within a match?".
Would I be better off using rules instead of tokens?
The only difference between a token and a rule is the default for interpreting bare whitespace that you include within the token/rule.
(Bare whitespace within a token is completely ignored. Bare whitespace within a rule denotes "there can be whitespace at this point in the input".)
Do I really need an actions class[?]
No.
Only bother with an actions class if you want to systematically post process the parse tree.
can't I just get perl to "automagically" populate the parse tree for me without having to specify a class of actions?
Yes. Any time you call parse and the parse is successful its return value is a parse tree.
Update: possible solution
Let's eschew actions completely, and just parse it into a tree:
Right. If all you want is the parse tree then you don't need an actions class and you don't need to call make or made.
Conversely, if you want another tree, such as an Abstract Syntax Tree, then you will probably find it convenient to use the built in make and made routines. And if you use make and made you may well find it appropriate to use them in conjunction with a separate actions class rather than just embedding them directly in the grammar's rules/tokens/regexes.
I have used Rails filter parameters to not print out the cities and States in the logs.
Rails.application.config.filter_parameters += [:city, :state]
However, I have a custom log that is getting incorrectly filtered and I need it to get printed:
Rails.logger.info("Transaction state has changed.", state: transaction.state)
How can I force this hash to bypass the parameter filtering?
I don't believe there is a way to bypass specific filters, but you can make the filters less generic. They match more broadly than it seems.
Each entry in Rails.application.config.filter_parameters is actually an argument to ActiveSupport::ParameterFilter.new.
This takes two types of filters: Regexps and Procs. Everything else is converted to a Regexp.
Rails.application.config.filter_parameters += [:city, :state] is really the very generic Rails.application.config.filter_parameters += [/city/i, /state/i]. That will match state_of_mind and specificity which is probably not what you had in mind.
pf = ActiveSupport::ParameterFilter.new([:city, :state])
p pf.filter({specificity: 42, state_of_mind: "confused"})
# {:specificity=>"[FILTERED]", :state_of_mind=>"[FILTERED]"}
You can match specific models with model.param.
Rails.application.config.filter_parameters += ["user.city", "user.state"]
This will match { user: { state: "..."} } (user.state) and not { transaction: { state: "..." } } (transaction.state).
However, these are still unanchored regex matches, so it will match { admin_user: { state_of_mind: "..." } } (admin_user.state_of_mind).
To avoid this, use a full regex.
Rails.application.config.filter_parameters += [/^user\.(city|state)$/]
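The difference between the unanchored and the anchored form can be checked in plain Ruby. This is only a sketch of the matching behaviour described above: it does not call ActiveSupport, and `as_pattern` is a hypothetical helper that mimics how a non-Regexp entry becomes a case-insensitive pattern.

```ruby
# Hypothetical helper (not part of Rails): mimics turning a symbol or
# string filter entry into an unanchored, case-insensitive pattern.
def as_pattern(entry)
  return entry if entry.is_a?(Regexp)
  Regexp.new(Regexp.escape(entry.to_s), Regexp::IGNORECASE)
end

loose  = as_pattern("user.state")
strict = as_pattern(/^user\.(city|state)$/)

# Dotted key paths of the kind a nested params hash would produce.
paths = ["user.state", "transaction.state", "admin_user.state_of_mind"]

loose_hits  = paths.select { |path| loose.match?(path) }
strict_hits = paths.select { |path| strict.match?(path) }

p loose_hits   # ["user.state", "admin_user.state_of_mind"]
p strict_hits  # ["user.state"]
```

Neither pattern touches transaction.state, which is the key the custom log line needs to keep.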
You can also use a function for more control. This takes the key, value, and original parameters. These are a bit awkward.
The "key" is only the immediate key so { transaction: { state: "..." } } will only get :state.
The value itself must be changed.
Rails.application.config.filter_parameters << ->(key,value,original_params) do
p key
p value
p original_params
value.replace("[FILTERED]") if key.to_s.match?(/^(city|state)$/i)
end
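Why the value has to be changed in place, rather than returned, is easier to see in isolation. The following is a plain-Ruby sketch with no Rails involved. The names `mask` and `params` are made up for illustration, and the loop only imitates a filter walking a hash.

```ruby
# Hypothetical stand-in for a filter proc: its return value is ignored,
# so masking only works by mutating the string that was passed in.
mask = ->(key, value) do
  if value.is_a?(String) && key.to_s.match?(/\A(city|state)\z/i)
    value.replace("[FILTERED]")
  end
end

# .dup gives mutable strings even when string literals are frozen.
params = { city: "Austin".dup, amount: "12".dup }

# Call the proc once per pair, discarding whatever it returns.
params.each { |key, value| mask.call(key, value) }

puts params[:city]    # [FILTERED]
puts params[:amount]  # 12
```

Because String#replace modifies the receiver, the hash sees the masked value even though nothing is assigned back to it.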
I am writing a simple "language" in ANTLR4/JavaScript which can associate numbers with variables and print them. This works fine, but after extending the print statement to take one or more variables I can't figure out how to get the count of them. (I am using a visitor, not a listener, but am interested in both.)
Grammar:
print : 'print' ID (',' ID)* ';' ;
How do I find out how many ID tokens there are?
Currently I hacked something together as follows:
visitPrint( ctx ) {
let i = 0;
let c = undefined;
while( (c = ctx.ID(i)) ) {
let val = ctx.ID(i++).getText();
print( this.variables[val] );
}
}
Shouldn't there be a better way to do this, like some count() method?
Thanks for your response!
In your visitPrint method you get a PrintContext with a member ID(). Called without a parameter, it returns an array of all the ID tokens, so you can simply use ctx.ID().length to get the ID count.
If you create an id parser rule:
id
: ID
;
and then use this id rule in all other parser rules instead of the ID token, then you can override the visitId function:
visitId(ctx) {
// Check ctx.ID() here
}
I have two different DSLs. They are linked in their grammars in such a way that "hobbies" defined in the first DSL can be referenced in the second DSL. I want to have a validation rule that checks that all the hobbies are referenced in the second DSL.
How can I obtain all the defined hobbies of the first DSL in the validation file of the 2nd DSL?
The first DSL called "MyDsl.xtext" looks like:
'I' 'am' name=ID
greetings+=HelloGreeting* 'Hobbies' hobbies+=Hobbie+
('I_dont_like' detests+=[Hobbie|QualifiedName])?
;
Hobbie:
'eg' name=ID
;
HelloGreeting:
'Hello' person=[Person] '!'
;
The second DSL called "MyDsl1.xtext" looks like:
JustGreetings:
greetings+=HiGreeting* stuff=Stuff
;
HiGreeting:
'Hi' person=[imported::Person] '!'
;
Stuff:
'I_also_like_to_do:' hobbies+=[imported::Hobbie|QualifiedName]+
;
The validation I am trying to do in the validation file looks like:
import org.eclipse.xtext.validation.Check;
import org.xtext.example.mydsl.myDsl.Person
import org.xtext.example.mydsl1.myDsl1.Stuff
class MyDsl1Validator extends AbstractMyDsl1Validator {
public static val INVALID_NAME = 'invalidName'
boolean found
@Check
def checkWalking(Stuff stuffs) {
var myHobbies = stuffs.hobbies
var definedHobbies = getPersons().hobbies
for (i: 0..definedHobbies.size) {
found = false
for (j: 0..myHobbies.size) {
if (definedHobbies.get(i).name == myHobbies.get(j).name) {
found = true
}
}
if (found == false) {
error("you are missing hobbie" + myHobbies.get(i).name + '.', null)
}
}
}
}
Current result: Right now I can import the Person type in the imports, and that gives no error, but I do not know how to get all the instances.
Expected Result:
I can get a list of Persons defined in the other DSL, and use it to compare.
Edit:
Example inputs:
definition.mydsl
I am A
Hello B !
Hobbies eg walking
usage.mydsl1
Hi A!
I_also_like_to_do: A.walking
I have a nested JSON file, consisting of keys and values which are string only. But the structure of the JSON file is not fixed, so sometimes it could be nested 3 levels, sometimes only 2 levels.
I wonder how I could serialize this in strict mode?
"live" : {
"host" : "localhost",
"somevalue" : "nothing",
"anobject" : {
"one" : "two",
"three" : "four",
"five" : {
"six" : "seven"
}
}
}
If I knew the structure of the JSON, I would simply write my own class for it, but since the keys are not fixed, and the nesting could go several levels deep, I really wonder how I could put such an object into a specific type.
Any help or hints appreciated
I think invariants will serve you well here. First off, it might be helpful to know that you can type a keyed tree strictly in Hack:
<?hh // strict
class KeyedTree<+Tk as arraykey, +T> {
public function __construct(
private Map<Tk, KeyedTree<Tk, T>> $descendants = Map{},
private ?T $v = null
) {}
}
(It must be a class because cyclic shape definitions are sadly not allowed)
I haven't tried it yet, but type_structures and Fred Emmott's TypeAssert look to also be of interest. If some part of your JSON blob is known to be fixed, then you could isolate the nested, uncertain part and build a tree out of it with invariants. In the limiting case where the whole blob is unknown, then you could excise the TypeAssert since there's no interesting fixed structure to assert:
use FredEmmott\TypeAssert\TypeAssert;
class JSONParser {
const type Blob = shape(
'live' => shape(
'host' => string, // fixed
'somevalue' => string, // fixed
'anobject' => KeyedTree<arraykey, mixed> // nested and uncertain
)
);
public static function parse_json(string $json_str): this::Blob {
$json = json_decode($json_str, true);
invariant(
array_key_exists('live', $json) && array_key_exists('anobject', $json['live']),
'JSON is not properly formatted.'
);
// replace the uncertain array with a `KeyedTree`
$json['live']['anobject'] = self::DFS($json['live']['anobject']);
return TypeAssert::matchesTypeStructure(
type_structure(self::class, 'Blob'),
$json
);
}
public static function DFS(array<arraykey, mixed> $tree): KeyedTree<arraykey, mixed> {
$descendants = Map{};
foreach($tree as $k => $v) {
if(is_array($v))
$descendants[$k] = self::DFS($v);
else
$descendants[$k] = new KeyedTree(Map{}, $v); // leaf node
}
return new KeyedTree($descendants);
}
}
Down the road, you'll still have to supplement containsKey invariants on the KeyedTree, but that's the reality with unstructured data in Hack.
Assume I have:
visit(p) {
case ...
default:
println("This should not happen. All elements should be catched. Check: <x>");
};
How can I print out (in this case as x) what could not be matched?
I tried:
x:default:
\x:default:
default:x:
\default:x:
Tx,
Jos
We have a library named Traversal that allows you to get back the context of a match. So, you can do something like this:
import Traversal;
import IO;
void doit() {
m = (1:"one",2:"two",3:"three");
bottom-up visit(m) {
case int n : println("<n> is an int");
default: {
tc = getTraversalContext();
println("Context is: <tc>");
println("<tc[0]> is not an int");
if (str s := tc[0]) {
println("<s> is a string");
}
}
}
}
tc is then a list of all the nodes back to the top of the term -- in this case, it will just be the current value, like "three", and the entire value of map m (or the entire map, which will also be a match for the default case). If you had something structured as a tree, such as terms formed using ADTs or nodes, you would get all the intervening structure from the point of the match back to the top (which would be the entire term).
For some reason, though, default is matching the same term multiple times. I've filed this as bug report https://github.com/cwi-swat/rascal/issues/731 on GitHub.
You could also try this idiom:
visit(x) {
case ...
case ...
case value x: throw "default case should not happen <x>";
}
The value pattern will catch everything but only after the others are tried.