cannot traverse the nodes of an AST, while assigning each node an ID - rascal

This is more of a simple personal attempt to understand what goes on inside Rascal. There must be a better (if not already supported) solution.
Here's the code:
fileLoad = |home:///PHPAnalysis/systems/ApilTestScripts/simple1.php|;
fileAST = loadPHPFile(fileLoad, true, false);
// assign a simple id to each node
public map[value,int] assignID12(node N)
{
    myID = ();
    visit(N)
    {
        case node M:
        {
            name = getName(M);
            myID[name] = 999;
        }
    }
    return myID;
}
ids = assignID12(fileAST);
gives me
|stdin:///|(92,4,<1,92>,<1,96>): Expected str, but got value
loadPHPFile returns a node of type: list[Stmt], where each Stmt is one of the many types of statements that could occur in a program (PHP, in my case). Without going into why I'd do this, why doesn't the above code work? Especially frustrating because a very simple example is worked out in the online documentation. See: http://tutor.rascal-mpl.org/Recipes/Basic/Basic.html#/Recipes/Common/CountConstructors/CountConstructors.html

I started a new console, and it seems to work. Of course, I changed the return type from map[value,int] to map[str,int], as it was originally in the example.
The problem I was having was that I may have erroneously defined the function previously. Although I quickly fixed the apparent problem, it kept giving me errors. I realized that in Rascal, once you've started a console and imported certain definitions, it seems to be impossible to overwrite those definitions. The interpreter keeps referring to the very first definition that you provided. This could just be the interpreter performing a type check and preventing unintentional and/or incompatible assignments further down the road. That makes sense for variables (in the typical program sense), but it doesn't seem like the best idea to enforce that on functions (or methods). I feel it becomes cumbersome, because a user typically has to go through some iterations before he/she is satisfied with a function definition. Just my opinion though...

Most likely you already had the name ids in scope as having type map[str,int], which would be the direct source of the error. You can look in script https://github.com/cwi-swat/php-analysis/blob/master/src/lang/php/analysis/cfg/LabelState.rsc at the function labelScript to see how this is done in PHP AiR (so you don't need to write this code yourself). What this will give you is a script where all the expressions and statements have an assigned ID, as well as the label state, which just keeps track of some info used in this labeling operation (mainly the counter to generate a unique ID).
As for the earlier response, the best thing to do is to give your definitions in modules which you can import. If you do that, any changes to types, etc will be picked up (automatically if the module is already imported, since Rascal will reimport the module for you if it has changed, or when you next import the module). However, if you define something directly in the console, this won't happen. Think of the console as one large module that you keep adding to. Since we can have overloads of functions, if you define the function again you are really defining a new alternative to the function, but this may not work like you expect.
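The labeling idea described above (walk the tree once, hand out IDs from a counter) can be sketched outside Rascal as well. The following is a minimal Python illustration, not the PHP AiR implementation: the Node class and assign_ids function are hypothetical names made up for this sketch, and the map is keyed by the fresh ID rather than by the node, so that two structurally equal nodes do not collide.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Node:
    """Hypothetical stand-in for an AST node: a constructor name plus children."""
    name: str
    children: list["Node"] = field(default_factory=list)


def assign_ids(root: Node) -> dict[int, Node]:
    """Walk the tree in pre-order and give every node a fresh integer ID."""
    counter = count(1)  # the only state the labeling needs
    ids: dict[int, Node] = {}

    def walk(node: Node) -> None:
        ids[next(counter)] = node
        for child in node.children:
            walk(child)

    walk(root)
    return ids


# A tiny made-up tree with two structurally identical statements.
tree = Node("script", [
    Node("echo", [Node("scalar")]),
    Node("echo", [Node("scalar")]),
])
labels = assign_ids(tree)
print({i: n.name for i, n in labels.items()})
```

Keying by name, as in the question, would leave one entry per constructor name; keying by ID keeps one entry per node.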

Related

Why is using final, with no type, considered good practice in Dart? ie `final foo = config.foo;`?

I see this recommended in the dart style guide, and copied in tons of tutorials and flutter source.
final foo = config.foo;
I don't understand it, how is this considered best practice when the readability is so poor? I have no clue what foo is here, surely final String foo = config.foo is preferable if we really want to use final?
This seems the equivalent to using var, which many consider a bad practice because it prevents the compiler from spotting errors and is less readable.
Where am I wrong?
In a lot of cases it does not really matter what type you are using, as long as the specific type can be statically determined by the compiler. When you use var/final in Dart, it is not that Dart does not know the type; it figures it out automatically from the context. The type must still be known when the program is compiled, so types will never be dynamic based on runtime behavior. If you want truly dynamic types, you can use the dynamic keyword, where you tell Dart "trust me, I know what I am doing with these types".
So types should still be written out where they matter most, e.g. for return and argument types of methods and for class variables. The common factor for these types is that they define the interface for using the method or class.
But when you are writing code inside a method, you are often not that interested in the specific types of the variables used inside it. Instead the focus should be on the logic itself, and you can often make it even clearer by using descriptive variable names. As long as the Dart analyzer can figure out the type, you will get autocomplete from your IDE, and you can still see the specific type in your IDE (e.g. Ctrl+Q on the variable in IntelliJ) if you end up in a situation where you want to know it.
This is particularly the case with generics, where it can be really tiresome to write out the full type, especially if you are nesting generics, e.g. Map<String, List<String>>.
In my experience, Dart is really good at figuring out very specific types and will complain loudly if the types in your code cannot be determined statically. In the coming future, Dart will introduce non-null by default, which will make the Dart compiler and analyzer even more powerful: it will make sure a variable cannot be null unless that is something you specifically want, and will make sure you test for null before passing a value to methods that are specified not to expect null.
About "Where am I wrong?": a lot of languages have a feature similar to Dart's var/final, with the same design principle that the type should still be statically determined by a compiler or runtime. Even Java has introduced this feature. As an experienced Java and Dart programmer, I have come to the conclusion for myself that writing types inside methods is really not that important in a lot of cases, as long as I can still easily figure out the specific type with an IDE when it really matters.
But it does make it more important to name your variables so that they clearly describe their purpose. But I am hoping you are already doing that. :)

D/Dlang: Lua interface, any way to force users to have no access to intermediate objects?

Status: Sort of solved. Switching Lua.Ref (a close equivalent to LuaD's LuaObject) to a struct, as suggested in the answer, has solved most issues related to freeing references, and I changed back to a mechanism similar to the one LuaD uses. More about this at the end.
In one of my projects, I am working with a Lua interface. I have mainly borrowed the ideas from LuaD. The mechanism in LuaD uses lua_ref & lua_unref to be able to move Lua table/function references into D space, but this causes heavy problems because neither the calls to destructors nor their order are guaranteed. LuaD usually segfaults at least at program exit.
Because it seems that LuaD is not maintained anymore, I decided to write my own interface for my purposes. My Lua interface class is here: https://github.com/mkoskim/games/blob/master/engine/util/lua.d
Usage examples can be found here:
https://github.com/mkoskim/games/blob/master/demo/luasketch/luademo.d
And in case you need, the Lua script used by the example is here:
https://github.com/mkoskim/games/blob/master/demo/luasketch/data/test.lua
The interface works like this:
Lua.opIndex pushes global table and index key to stack, and return Top object. For example, lua["math"] pushes _G and "math" to stack.
Further accesses go through Top object. Top.opIndex goes deeper in the table hierarchy. Other methods (call, get, set) are "final" methods, which perform an operation with the table and key at the top of the stack, and clean the stack afterwards.
Nearly everything works fine, except that this mechanism has a nasty quirk/bug that I have no idea how to solve. If you don't call any of those "final" methods, Top will leave the table and key on the stack:
lua["math"]["abs"].call(-1); // Works. Final method (call) called.
lua["math"]["abs"]; // table ref & key left to stack :(
What I know for sure, is that playing with Top() destructor does not work, as it is not called immediately when object is not referenced anymore.
NOTE: If there is some sort of operator to be called when object is accessed as rvalue, I could replace call(), set() and get() methods with operator overloads.
Questions:
Is there any way to prevent users from writing such expressions (getting a Top object without calling any of the "final" methods)? I really don't want users to write e.g. luafunc = lua["math"]["abs"] and then later try to call it, because it won't work at all. Not without starting to play with lua_ref & lua_unref and fighting the same issues that LuaD has.
Is there any kind of opAccess operator overloading, that is, overloading what happens when an object is used as an rvalue? That is, expression "a = b" -> "a.opAssign(b.opAccess)"? opCast does not work; it is called only with explicit casts.
Any other suggestions? I have a feeling that I am approaching the solution from the wrong direction. I feel that the problem resides in the realm of metaprogramming: I am trying to "scope" things at expression level, which I feel is not that suitable for classes and objects.
So far, I have tried to preserve the LuaD look'n'feel at interface user's side, but I think that if I could change the interface to something like following, I could get it working:
lua.call(["math", "abs"], 1); // call lua.math.abs(1)
lua.set(["table", "x", "y", "z"], 2); // lua table.x.y.z = 2
...
Syntactically that would ensure that reference to lua object fetched by indexing is finally used for something in the expression, and the stack would be cleaned.
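To make the shape of that path-based interface concrete, here is a minimal Python sketch. It does not touch the real Lua C API: the Sandbox class is hypothetical, and nested dicts stand in for Lua tables. The point it illustrates is only that lookup and use happen inside a single call, so no intermediate object is ever handed to the caller.

```python
from typing import Any, Sequence


class Sandbox:
    """Hypothetical stand-in for the Lua state: globals are a nested dict."""

    def __init__(self, globals_: dict[str, Any]) -> None:
        self._globals = globals_

    def _resolve(self, path: Sequence[str]) -> tuple[dict[str, Any], str]:
        """Walk to the table holding the last key; the table never leaves this class."""
        if not path:
            raise ValueError("path must contain at least one key")
        table = self._globals
        for key in path[:-1]:
            table = table[key]
        return table, path[-1]

    def get(self, path: Sequence[str]) -> Any:
        table, key = self._resolve(path)
        return table[key]

    def set(self, path: Sequence[str], value: Any) -> None:
        table, key = self._resolve(path)
        table[key] = value

    def call(self, path: Sequence[str], *args: Any) -> Any:
        table, key = self._resolve(path)
        return table[key](*args)


# Made-up globals: Python's abs plays the role of math.abs.
lua = Sandbox({"math": {"abs": abs}, "table": {"x": {"y": {}}}})
result = lua.call(["math", "abs"], -1)
lua.set(["table", "x", "y", "z"], 2)
print(result, lua.get(["table", "x", "y", "z"]))
```

The cost of this design is the loss of the chained-index syntax; the gain is that there is no expression a user can write that leaves a half-finished lookup behind.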
UPDATE: As said, changing Lua.Ref to a struct solved the problems related to dereferencing, and I am again using a reference mechanism similar to LuaD's. I personally feel that this mechanism suits the LuaD-style syntax I am using, too, and it can be quite a challenge to make the syntax work correctly with other mechanisms. I am still open to hearing ideas for making it work.
The system I sketched to replace references (to tackle the problem with objects holding references living longer than lua sandbox) would probably need different kind of interface, something similar I sketched above.
You also have an issue when people do
auto math_abs = lua["math"]["abs"];
math_abs.call(1);
math_abs.call(3);
This will double pop.
Make Top a struct that holds the stack index of what they are referencing. That way you can use its known scoping and destruction behavior to your advantage. Make sure you handle this(this) correctly as well.
Only pop in the destructor when the value is the actual top value. You can use a bitset in LuaInterface to track which stack positions are in use and put the values in it using lua_replace if you are worried about excessive stack use.

Using Dart as a DSL

I am trying to use Dart to tersely define entities in an application, following the idiom of code = configuration. Since I will be defining many entities, I'd like to keep the code as trim and concise and readable as possible.
In an effort to keep boilerplate as close to 0 lines as possible, I recently wrote some code like this:
// man.dart
part of entity_component_framework;
var _man = entity('man', (entityBuilder) {
  entityBuilder.add([TopHat, CrookedTeeth]);
});
// test.dart
part of entity_component_framework;
var man = EntityBuilder.entities['man']; // null, since _man wasn't ever accessed.
The entity method associates the entityBuilder passed into the function with a name ('man' in this case). var _man exists because only variable assignments can be top-level in Dart. This seems to be the most concise way possible to use Dart as a DSL.
One thing I wasn't counting on, though, is lazy initialization. If I never access _man -- and I had no intention to, since the entity function neatly stored all the relevant information I required in another data structure -- then the entity function is never run. This is a feature, not a bug.
So, what's the cleanest way of using Dart as a DSL given the lazy initialization restriction?
So, as you point out, it's a feature that Dart doesn't run any code until it's told to. So if you want something to happen, you need to do it in code that runs. Some possibilities:
Put your calls to entity() inside the main() function. I assume you don't want to do that, and probably that you want people to be able to add more of these in additional files without modifying the originals.
If you're willing to incur the overhead of mirrors, which is probably not that much if they're confined to this library, use them to find all the top-level variables in that library and access them. Or define them as functions or getters. But I assume that you like the property that variables are automatically one-shot. You'd want to use a MirrorsUsed annotation.
A variation on that would be to use annotations to mark the things you want to be initialized. Though this is similar in that you'd have to iterate over the annotated things, which I think would also require mirrors.

Why MEF has [ImportMany] and not just [Import]

I just hunted down a problem in my MEF application: I had an [Import] instead of an [ImportMany] on my IEnumerable<IFoo> property. I started to wonder why. MEF can see that the injection target is a "collection" and could determine that a collection is needed instead of a single element. At least Ninject works this way.
Does anyone have insight into why [ImportMany] is required? The only reason I can think of is that one might want to [Export(typeof(IEnumerable<IBar>))] public IEnumerable<Bar> { get; }, but is this really the reason for this design? I bet I'm not the only one who has been debugging this kind of error.
It's not the same ;)
[Import] indicates that you want to import a single thing according to a contract. In MEF, a contract is just a string, and when you import a type (like IEnumerable<IBar>), you're really importing according to a contract which is just the name of that type.
In MEF, cardinality is very important, so when you state that you wish to import a single instance of something that fits the stated contract, there can only be a single source. If multiple exports are found, an exception is thrown because of cardinality mismatch.
The [Import] functionality doesn't contain special logic to handle IEnumerable<T>, so from its perspective, it's just a contract like everything else.
The [ImportMany] attribute, however, exists especially to bridge that gap. It accepts zero to any number of exports for the stated contract. This means that instead of having a single export of IEnumerable<IBar> you can have many exports of IBar scattered across multiple assemblies, and there's never going to be a cardinality mismatch.
In the end it's a design philosophy. MEF could have had special, built-in knowledge about IEnumerable<T>. Autofac (and apparently Ninject) do that and call it a Relationship Type.
However, special-casing like that implies that somewhere the implementing code violates the Liskov Substitution Principle, which again can lead to POLA violations, so in this case I tend towards taking side with the MEF designers. Going for a more explicit API may decrease discoverability, but may be a bit safer.
To simplify the above answer slightly:
[Import] will throw an exception if there is more than one matching export.
[ImportMany] will load more than one matching export without throwing an error.
If I have an IDataAccessLayer that I want to import, there should only ever be ONE export available. I'm never going to be writing to 2 databases simultaneously, so I use [Import] to ensure that only one will exist.
If I want to load up many different BusinessObjects, I will use [ImportMany] because I want lots of different types of BusinessObjects.
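The cardinality rule described in these answers can be sketched in a few lines of Python. This is an illustration of the idea only, not MEF's API: Container, import_one, import_many and CardinalityError are hypothetical names, contracts are plain strings as in the explanation above, and treating zero matches as an error in import_one is an assumption of this sketch.

```python
from collections import defaultdict
from typing import Any


class CardinalityError(Exception):
    """Raised when a single import does not match exactly one export."""


class Container:
    """Hypothetical composition container keyed by contract name."""

    def __init__(self) -> None:
        self._exports: dict[str, list[Any]] = defaultdict(list)

    def export(self, contract: str, value: Any) -> None:
        self._exports[contract].append(value)

    def import_one(self, contract: str) -> Any:
        """Analogue of [Import]: exactly one export must match the contract."""
        matches = self._exports.get(contract, [])
        if len(matches) != 1:
            raise CardinalityError(
                f"expected exactly 1 export for {contract!r}, found {len(matches)}"
            )
        return matches[0]

    def import_many(self, contract: str) -> list[Any]:
        """Analogue of [ImportMany]: zero or more exports are all acceptable."""
        return list(self._exports.get(contract, []))


container = Container()
container.export("IDataAccessLayer", "sql-dal")
container.export("IBar", "bar-a")
container.export("IBar", "bar-b")

print(container.import_one("IDataAccessLayer"))
print(container.import_many("IBar"))
try:
    container.import_one("IBar")
except CardinalityError as exc:
    print(exc)
```

Note that import_one never inspects whether the requested contract names a collection type; it only counts matches, which mirrors the point that [Import] has no special logic for IEnumerable<T>.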

How do programmers practice code reuse

I've been a bad programmer because I copy and paste. For example, every time I connect to a database and retrieve a recordset, I copy the previous code and edit it, then copy the code that sets up the datagridview and edit that. I am aware of the phrase "code reuse", but I have not actually practiced it. How can I utilize code reuse so that I don't have to copy and paste the database code and the datagridview code?
The essence of code reuse is to take a common operation and parameterize it so it can accept a variety of inputs.
Take humble printf, for example. Imagine if you did not have printf, and only had write, or something similar:
//convert theInt to a string and write it out.
char c[24];
itoa(theInt, c, 10);
puts(c);
Now this sucks to have to write every time, and is actually kind of buggy. So some smart programmer decided he was tired of this and wrote a better function that, in one fell swoop, prints stuff to stdout.
printf("%d", theInt);
You don't need to get as fancy as printf with its variadic arguments and format string. Even just a simple routine such as:
void print_int(int theInt)
{
    char c[24];
    itoa(theInt, c, 10);
    puts(c);
}
would do the trick nicely. This way, if you want to change print_int to always print to stderr you could update it to be:
void print_int(int theInt)
{
    fprintf(stderr, "%d", theInt);
}
and all your integers would now magically be printed to standard error.
You could even then bundle that function and others you write up into a library, which is just a collection of code you can load in to your program.
Following the practice of code reuse is why you even have a database to connect to: someone created some code to store records on disk, reworked it until it was usable by others, and decided to call it a database.
Libraries do not magically appear. They are created by programmers to make their lives easier and to allow them to work faster.
Put the code into a routine and call the routine whenever you want that code to be executed.
Check out Martin Fowler's book on refactoring, or some of the numerous refactoring related internet resources (also on stackoverflow), to find out how you could improve code that has smells of duplication.
At first, create a library with reusable functions. They can be linked with different applications. It saves a lot of time and encourages reuse.
Also be sure the library is unit tested and documented. So it is very easy to find the right class/function/variable/constant.
A good rule of thumb: if you use the same piece of code three times, and it's obviously possible to generalize it, then make it a procedure/function/library.
However, as I am getting older, and also more experienced as a professional developer, I am more inclined to see code reuse as not always the best idea, for two reasons:
It's difficult to anticipate future needs, so it's very hard to define APIs that you will really use next time. It can cost you twice as much time: you make the code more general once, and then rewrite it the second time anyway. It seems to me that Java projects of late are especially prone to this; they always seem to be rewritten in the framework du jour, just to be "easier to integrate" or whatever in the future.
In a larger organization (I am a member of one), if you have to rely on some external team (either in-house or 3rd party), you can have a problem. Your future then depends on their funding and their resources. So it can be a big burden to use foreign code or a foreign library. In a similar fashion, if you share a piece of code with some other team, they can then expect that you will maintain it.
Note however, these are more like business reasons, so in open source, it's almost invariably a good thing to be reusable.
To get code reuse you need to become a master of...
Giving things names that capture their essence. This is really, really important.
Making sure that each thing only does one thing. This really comes back to the first point: if you can't name it by its essence, then often it's doing too much.
Locating the thing somewhere logical. Again this comes back to being able to name things well and capturing their essence...
Grouping it with things that build on a central concept. Same as above, but said differently :-)
The first thing to note is that by using copy-and-paste, you are reusing code - albeit not in the most efficient way.
You have recognised a situation where you have come up with a solution previously.
There are two main scopes that you need to be aware of when thinking about code reuse. Firstly, code reuse within a project and, secondly, code reuse between projects.
The fact that you have a piece of code that you can copy and paste within a project should be a cue that the piece of code that you're looking at is useful elsewhere. That is the time to make it into a function, and make it available within the project.
Ideally you should replace all occurrences of that code with your new function, so that it (a) reduces redundant code and (b) ensures that any bugs in that chunk of code only need to be fixed in one function instead of many.
The second scope, code reuse across projects, requires some more organisation to get the maximum benefit. This issue has been addressed in a couple of other SO questions eg. here and here.
A good start is to organise code that is likely to be reused across projects into source files that are as self-contained as possible. Minimise the amount of supporting, project-specific code that is required, as this will make it easier to reuse entire files in a new project. This means minimising the use of project-specific data types, minimising the use of project-specific global variables, etc.
This may mean creating utility files that contain functions that you know are going to be useful in your environment. eg. Common database functions if you often develop projects that depend on databases.
I think the best answer to your problem is to create a separate assembly for your important functions. That way you can create extension methods or modify the helper assembly itself. Think of a function like this:
ExportToExcel(List date, string filename)
This method can be used for your future Excel export functions, so why not store it in your own helper assembly? That way you just add a reference to that assembly.
The answer can change depending on the size of the project.
For a smaller project I would recommend setting up a DatabaseHelper class that does all your DB access. It would just be a wrapper around opening/closing connections and execution of the DB code. Then at a higher level you can just write the DBCommands that will be executed.
A similar technique could be used for a larger project, but would need some additional work, interfaces need to be added, DI, as well as abstracting out what you need to know about the database.
You might also try looking into ORM, DAAB, or over to the Patterns and Practices Group
As far as how to prevent the ole C&P? - Well as you write your code, you need to periodically review it, if you have similar blocks of code, that only vary by a parameter or two, that is always a good candidate for refactoring into its own method.
Now for my pseudo code example:
Function GetCustomer(ID) As Customer
    Dim CMD As New DBCmd("SQL or Stored Proc")
    CMD.Parameters.Add("CustID", DBType, Length).Value = ID
    Dim DHelper As New DatabaseHelper
    Dim DR = DHelper.GetDataReader(CMD)
    Dim RtnCust As New Customer(DR)
    Return RtnCust
End Function
Class DatabaseHelper
    Public Function GetDataTable(cmd) As DataTable
        ' Write the DB access code here:
        ' GetConnectionString
        ' OpenConnection
        ' Do DB Operation
        ' Close Connection
    End Function
    Public Function GetDataReader(cmd) As DataReader
    Public Function GetDataSet(cmd) As DataSet
    ' ... and so on ...
End Class
For the example you give, the appropriate solution is to write a function that takes as parameters whatever it is that you edit whenever you paste the block, then call that function with the appropriate data as parameters.
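As a concrete illustration of that advice, here is a minimal Python sketch using the standard-library sqlite3 module. The fetch_rows name, the customers table and its rows are all made up for the example; the assumption is that the SQL text and its parameters are the parts that were being edited after each paste, so they become the function's arguments.

```python
import sqlite3
from typing import Any, Sequence


def fetch_rows(conn: sqlite3.Connection, sql: str,
               params: Sequence[Any] = ()) -> list[tuple]:
    """Run one query and return all rows; only the SQL and parameters vary per call."""
    cursor = conn.execute(sql, params)
    try:
        return cursor.fetchall()
    finally:
        cursor.close()


# Hypothetical in-memory database, set up only for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [("Ann", "Oslo"), ("Bob", "Rome"), ("Cid", "Oslo")],
)

# Two different queries, one shared routine, no pasted boilerplate.
oslo = fetch_rows(conn, "SELECT name FROM customers WHERE city = ? ORDER BY name", ("Oslo",))
everyone = fetch_rows(conn, "SELECT id, name FROM customers ORDER BY id")
print(oslo)
print(everyone)
```

The code that fills a grid from the returned rows could be factored out the same way, with the grid and the rows as its parameters.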
Try and get into the habit of using other people's functions and libraries.
You'll usually find that your particular problem has a well-tested, elegant solution.
Even if the solutions you find aren't a perfect fit, you'll probably gain a lot of insight into the problem by seeing how other people have tackled it.
I'll do this at two levels. First within a class or namespace, put that code piece that is reused in that scope in a separate method and make sure it is being called.
Second is something similar to the case that you are describing. That is a good candidate to be put in a library or a helper/utility class that can be reused more broadly.
It is important to evaluate everything that you are doing from the perspective of whether it can be made available to others for reuse. This should be a fundamental approach to programming, though most of us don't realize it.
Note that anything that is to be reused needs to be documented in more detail. Its naming should be distinct, and all the parameters, return values and any constraints/limitations/prerequisites should be clearly documented (in code or help files).
It depends somewhat on what programming language you're using. In most languages you can
Write a function, parameterize it to allow variations
Write a function object, with members to hold the varying data
Develop a hierarchy of (function object?) classes that implement even more complicated variations
In C++ you could also develop templates to generate the various functions or classes at compile time
Easy: whenever you catch yourself copy-pasting code, take it out immediately (i.e., don't do it after you've already CP'd code several times) into a new function.

Resources