If I process a simple string with a regex, I expect I can extract variables. The examples in the manual states that the extraction results in a stored variable. This does not work as expected. If I do the following regex:
/\w*<subtext:text>\w*/ := "myfulltextstring"
I would expect the variable subtext to contain the string text. But it is undeclared. If I declare subtext before executing, it is empty. What is the simple way to do this extraction?
The scope of the variable subtext is not global, but "to the right of the match":
/\w*<subtext:text>\w*/ := "myfulltextstring" && bprintln(subtext)
or
if (/\w*<subtext:text>\w*/ := "myfulltextstring") {
println(subtext);
}
or
str x = "";
if (/\w*<subtext:text>\w*/ := "myfulltextstring") {
x = subtext;
}
Just having a declaration for subtext in an outer scope is not enough, since it would be masked by the regex variable.
Related
I working on a language similar to ruby called gaiman and I'm using PEG.js to generate the parser.
Do you know if there is a way to implement heredocs with proper indentation?
xxx = <<<END
hello
world
END
the output should be:
"hello
world"
I need this because this code doesn't look very nice:
def foo(arg) {
if arg == "here" then
return <<<END
xxx
xxx
END
end
end
this is a function where the user wants to return:
"xxx
xxx"
I would prefer the code to look like this:
def foo(arg) {
if arg == "here" then
return <<<END
xxx
xxx
END
end
end
If I trim all the lines user will not be able to use a string with leading spaces when he wants. Does anyone know if PEG.js allows this?
I don't have any code yet for heredocs, just want to be sure if something that I want is possible.
EDIT:
So I've tried to implement heredocs and the problem is that PEG doesn't allow back-references.
heredoc = "<<<" marker:[\w]+ "\n" text:[\s\S]+ marker {
return text.join('');
}
It says that the marker is not defined. As for trimming I think I can use location() function
I don't think that's a reasonable expectation for a parser generator; few if any would be equal to the challenge.
For a start, recognising the here-string syntax is inherently context-sensitive, since the end-delimiter must be a precise copy of the delimiter provided after the <<< token. So you would need a custom lexical analyser, and that means that you need a parser generator which allows you to use a custom lexical analyser. (So a parser generator which assumes you want a scannerless parser might not be the optimal choice.)
Recognising the end of the here-string token shouldn't be too difficult, although you can't do it with a single regular expression. My approach would be to use a custom scanning function which breaks the here-string into a series of lines, concatenating them as it goes until it reaches a line containing only the end-delimiter.
Once you've recognised the text of the literal, all you need to normalise the spaces in the way you want is the column number at which the <<< starts. With that, you can trim each line in the string literal. So you only need a lexical scanner which accurately reports token position. Trimming wouldn't normally be done inside the generated lexical scanner; rather, it would be the associated semantic action. (Equally, it could be a semantic action in the grammar. But it's always going to be code that you write.)
When you trim the literal, you'll need to deal with the cases in which it is impossible, because the user has not respected the indentation requirement. And you'll need to do something with tab characters; getting those right probably means that you'll want a lexical scanner which computes visible column positions rather than character offsets.
I don't know if peg.js corresponds with those requirements, since I don't use it. (I did look at the documentation, and failed to see any indication as to how you might incorporate a custom scanner function. But that doesn't mean there isn't a way to do it.) I hope that the discussion above at least lets you check the detailed documentation for the parser generator you want to use, and otherwise find a different parser generator which will work for you in this use case.
Here is the implementation of heredocs in Peggy successor to PEG.js that is not maintained anymore. This code was based on the GitHub issue.
heredoc = "<<<" begin:marker "\n" text:($any_char+ "\n")+ _ end:marker (
&{ return begin === end; }
/ '' { error(`Expected matched marker "${begin}", but marker "${end}" was found`); }
) {
const loc = location();
const min = loc.start.column - 1;
const re = new RegExp(`\\s{${min}}`);
return text.map(line => {
return line[0].replace(re, '');
}).join('\n');
}
any_char = (!"\n" .)
marker_char = (!" " !"\n" .)
marker "Marker" = $marker_char+
_ "whitespace"
= [ \t\n\r]* { return []; }
EDIT: above didn't work with another piece of code after heredoc, here is better grammar:
{ let heredoc_begin = null; }
heredoc = "<<<" beginMarker "\n" text:content endMarker {
const loc = location();
const min = loc.start.column - 1;
const re = new RegExp(`^\\s{${min}}`, 'mg');
return {
type: 'Literal',
value: text.replace(re, '')
};
}
__ = (!"\n" !" " .)
marker 'Marker' = $__+
beginMarker = m:marker { heredoc_begin = m; }
endMarker = "\n" " "* end:marker &{ return heredoc_begin === end; }
content = $(!endMarker .)*
I have such string:
"k1=v1; k2=v2; k3=v3"
Is there any simple way to make a map[string]string from it?
You will need to use a couple of calls to strings.Split():
s := "k1=v1; k2=v2; k3=v3"
entries := strings.Split(s, "; ")
m := make(map[string]string)
for _, e := range entries {
parts := strings.Split(e, "=")
m[parts[0]] = parts[1]
}
fmt.Println(m)
The first call will separate the different entries in the supplied string while the second will split the key/values apart. A working example can be found here.
I have a string like A=B&C=D&E=F, how to parse it into map in golang?
Here is example on Java, but I don't understand this split part
String text = "A=B&C=D&E=F";
Map<String, String> map = new LinkedHashMap<String, String>();
for(String keyValue : text.split(" *& *")) {
String[] pairs = keyValue.split(" *= *", 2);
map.put(pairs[0], pairs.length == 1 ? "" : pairs[1]);
}
Maybe what you really want is to parse an HTTP query string, and url.ParseQuery does that. (What it returns is, more precisely, a url.Values storing a []string for every key, since URLs sometimes have more than one value per key.) It does things like parse HTML escapes (%0A, etc.) that just splitting doesn't. You can find its implementation if you search in the source of url.go.
However, if you do really want to just split on & and = like that Java code did, there are Go analogues for all of the concepts and tools there:
map[string]string is Go's analog of Map<String, String>
strings.Split can split on & for you. SplitN limits the number of pieces split into like the two-argument version of split() in Java does. Note that there might only be one piece so you should check len(pieces) before trying to access pieces[1] say.
for _, piece := range pieces will iterate the pieces you split.
The Java code seems to rely on regexes to trim spaces. Go's Split doesn't use them, but strings.TrimSpace does something like what you want (specifically, strips all sorts of Unicode whitespace from both sides).
I'm leaving the actual implementation to you, but perhaps these pointers can get you started.
import ( "strings" )
var m map[string]string
var ss []string
s := "A=B&C=D&E=F"
ss = strings.Split(s, "&")
m = make(map[string]string)
for _, pair := range ss {
z := strings.Split(pair, "=")
m[z[0]] = z[1]
}
This will do what you want.
There is a very simple way provided by golang net/url package itself.
Change your string to make it a url with query params text := "method://abc.xyz/A=B&C=D&E=F";
Now just pass this string to Parse function provided by net/url.
import (
netURL "net/url"
)
u, err := netURL.Parse(textURL)
if err != nil {
log.Fatal(err)
}
Now u.Query() will return you a map containing your query params. This will also work for complex types.
Here is a demonstration of a couple of methods:
package main
import (
"fmt"
"net/url"
)
func main() {
{
q, e := url.ParseQuery("west=left&east=right")
if e != nil {
panic(e)
}
fmt.Println(q) // map[east:[right] west:[left]]
}
{
u := url.URL{RawQuery: "west=left&east=right"}
q := u.Query()
fmt.Println(q) // map[east:[right] west:[left]]
}
}
https://golang.org/pkg/net/url#ParseQuery
https://golang.org/pkg/net/url#URL.Query
In lua, is there any way to read an interface file to extract name/methods/args?
I have an .idl file like this:
interface
{
name = myInterface,
methods = {
testing = {
resulttype = "double",
args = {{direction = "in",
type = "double"},
}
}
}
This is equal to the code bellow (easier to read):
interface myInterface {
double testing (in double a);
};
I can read file, load as string and parse with gmatch for example to extract information, but is there any easy mode to parse this info?
At the end i want something (a table for example) with the interface name, their methods, result types and args. Just to know the interface that i`m working.
Lua has several facilities to interpret chunks of code. Namely, dofile, loadfile and loadstring. Luckily, your input file is almost valid Lua code (assuming those braces were matched). The only thing that is problematic is interface {.
All of the above functions effectively create a function object with a file's or a string's contents as their code. dofile immediately executes that function, while the others return a function, which you can invoke whenever you like. Therefore, if you're free to change the files, replace interface in the first line with return. Then you can do:
local interface = dofile("input.idl")
And interface will be a nice table, just as you have specified it in the file. If you cannot change those files to your liking, you will have to load the file into the string, perform some string manipulation (specifically, replace the first interface with return) and then use loadstring instead:
io.input("input.idl")
local input = io.read("*all")
input = string.gsub(input, "^interface", "return") -- ^ marks beginning of string
local f = loadstring(input)
local interface = f()
In both cases this is what you will get:
> require"pl.pretty".dump(interface)
{
name = "myInterface",
methods = {
testing = {
args = {
{
type = "double",
direction = "in"
}
},
resulttype = "double"
}
}
}
> print(interface.methods.testing.args[1].type)
double
EDIT:
I just realised, in your example input myInterface is not enclosed in " and therefore not a proper string. Is that also a mistake in your input file or is that what your files actually look like? In the latter case, you would need to change that as well. Lua is not going to complain if it's a name it doesn't know, but you also won't get the field in that case.
I have an integer field in a ClientDataSet and I need to compare to some values, something like this:
I can use const
const
mvValue1 = 1;
mvValue2 = 2;
if ClientDataSet_Field.AsInteger = mvValue1 then
or enums
TMyValues = (mvValue1 = 1, mvValue2 = 2);
if ClientDataSet_Field.AsInteger = Integer(mvValue1) then
or class const
TMyValue = class
const
Value1 = 1;
Value2 = 2;
end;
if ClientDataSet_Field.AsInteger = TMyValues.Value1 then
I like the class const approach but it seems that is not the delphi way, So I want to know what do you think
Declaration:
type
TMyValues = class
type TMyEnum = (myValue1, myValue2, myValue3, myValue4);
const MyStrVals: array [TMyEnum] of string =
('One', 'Two', 'Three', 'Four');
const MyIntVals: array [TMyEnum] of integer =
(1, 2, 3, 4);
end;
Usage:
if ClientDataSet_Field.AsInteger = TMyValues.MyIntVals[myValue1] then
A cast would generally be my last choice.
I wouldn't say that class consts are not the Delphi way. It's just they have been introduced to Delphi quite recently, and a lot of books and articles you'll find on the internet were written before their introduction, and thus you won't see them widely used. Many Delphi developers (I'd say the majority) will have started using Delphi before they were made available, and thus they're not the first thing that one thinks about.
One thing to consider is backwards compatibility - class constants are relatively new to Delphi so if your code has to be sharable with previous versions than they are out.
I typically use enumerated types, with the difference from yours is that my first enumeration is usually an 'undefined' item to represent NULL or 0 in an int field.
TmyValues = (myvUndefined, myvDescription1, myvDescription2)
if ClientDataSet_Field.AsInteger = Ord(myvDescription1) then...
To use a little bit of Jim McKeeth's answer - if you need to display to the user a text viewable version, or if you need to convert their selected text into the enumerated type, then an array comes in handy in conjuction with the type:
const MYVALS: array [TmyValues ] of string = ('', 'Description1', 'Description2');
You can then have utility functions to set/get the enumerated type to/from a string:
Function MyValString(const pMyVal:TmyValues):string;
begin
result := MYVALS[Ord(pMyVal)];
end;
Function StringToMyVal(const pMyVal:String):TMyValues;
var i:Integer;
begin
result := myvUndefined;
for i := Low(MYVALS) to High(MYVALS) do
begin
if SameText(pMyVal, MYVALS[i]) then
begin
result := TMyValues(i);
break;
end;
end;
end;
Continuing on... you can have scatter routine to set a combo/list box:
Procedure SetList(const DestList:TStrings);
begin
DestList.Clear;
for i := Low(MYVALS) to High(MYVALS) do
begin
DestList.Insert(MYVALS[i]);
end;
end;
In code: SetList(Combo1.Items) or SetList(ListBox1.Items)..
Then if you are seeing the pattern here... useful utility functions surrounding your enumeration, then you add everything to it's own class and put this class into it's own unit named MyValueEnumeration or whaterver. You end up with all the code surrounding this enumeration in one place and keep adding the utility functions as you need them. If you keep the unit clean - don't mix in other unrelated functionality then it will stay very handy for all projects related to that enumeration.
You'll see more patterns as time goes and you use the same functionality over and over again and you'll build a better mousetrap again.
When using constants I recommend assigning the type when the data type is a numeric float.
Delphi and other languages will not always evaluate values correctly if the types do not match...
TMyValue = class
const
// will not compare correctly to float values.
Value1 = 1; // true constant can be used to supply any data type value
Value2 = 2; // but should only be compared to similar data type
// will not compare correctly to a single or double.
Value3 = 3.3; // default is extended in debugger
// will not compare correctly to a single or extended.
Value1d : double = Value1; // 1.0
Value2d : double = Value2; // 2.0
end;
Compared float values in if () and while () statements should be compared to values of the same data type, so it is best to define a temporary or global variable of the float type used for any comparison statements (=<>).
When compared to the same float data type this format is more reliable for comparison operators in any programming language, not just in Delphi, but in any programming language where the defined float types vary from variable to constant.
Once you assign a type, Delphi will not allow you to use the variable to feed another constant, so true constants are good to feed any related data type, but not for comparison in loops and if statements, unless they are assigned and compared to integer values.
***Note: Casting a value from one float type to another may alter the stored value from what you entered for comparison purposes, so verify with a unit test that loops when doing this.
It is unfortunate that Delphi doesn't allow an enumeration format like...
TController : Integer = (NoController = 0, ncpod = 1, nextwave = 2);
or enforce the type name for access to the enumeration values.
or allow a class constant to be used as a parameter default in a call like...
function getControllerName( Controller : TController = TController.NoController) : string;
However, a more guarded approach that provides both types of access would be to place the enumeration inside a class.
TController = class
//const
//NoController : Integer = 1;
//ncpod : Integer = 2;
//nextwave : Integer = 3;
type
Option = (NoController = 0, ncpod = 1, nextwave = 2);
public
Class function Name( Controller : Option = NoController) : string; static;
end;
implementation
class function TController.Name( Controller : Option = NoController) : string;
begin
Result := 'CNC';
if (Controller = Option.nextwave) then
Result := Result + ' Piranha'
else if (Controller = Option.ncpod) then
Result := Result + ' Shark';
Result := Result + ' Control Panel';
end;
This approach will effectively isolate the values, provide the static approach and allow access to the values using a for () loop.
The access to the values from a floating function would be like this...
using TControllerUnit;
function getName( Controller : TController.Option = TController.Option.NoController) : string;
implementation
function getName( Controller : TController.Option = TController.Option.NoController) : string;
begin
Result := 'CNC';
if (Controller = TController.Option.nextwave) then
Result := Result + ' Piranha'
else if (Controller = TController.Option.ncpod) then
Result := Result + ' Shark';
Result := Result + ' Control Panel';
end;
so many options! :-) i prefer enums and routinely use them as you describe. one of the parts i like is that i can use them with a "for" loop. i do use class constants as well but prefer enums (even private enums) depending on what i'm trying to achieve.
TMyType=class
private const // d2007 & later i think
iMaxItems=1; // d2007 & later i think
private type // d2007 & later i think
TMyValues = (mvValue1 = 1, mvValue2 = 2); // d2007 & later i think
private
public
end;
An option you haven't thought of is to use a lookup table in the database and then you can check against the string in the database.
eg.
Select value, Description from tbl_values inner join tbl_lookup_values where tbl_values.Value = tbl_lookup_values.value
if ClientDataSet_Field.AsString = 'ValueIwant' then