ObjectPascal identifier naming style on other languages - delphi

I learned to program with delphi, and i always liked the object pascal code style, looks very intuitive and clean.
When you look at the variable declaration, you know what are you dealing with..
A fast summary:
Exception E EMyError
Classes and Types T TMyClass
Fields in classes f fVisible
Events On OnMouseDown
Pointer types P PMyRecord
Property Get Something Set SetSomething
It's too bad to use this identifier naming style in C++ C# Java, or any other language code?

Aside from taste and cultural issues (as already pointed by Mason)
There might be reasons why a convention is tied to a certain language, and the other languages might also have reasons for theirs.
I can only quickly think of a few examples though:
On languages that don't require a pointer type to be defined before use (like most non-Borland Pascals, C etc), the "P" one is usually rarely necessary.
Other languages might also have additional means of disambiguating (like in C where often types are upper cased, and the variables or fields get the lowercase identifier), and does not need "T". (strictly speaking Delphi doesn't neither at least for fields, since identifiers are somewhat context dependantly looked up (likeseparate namespaces for fields and types), but the convention is older than that feature)
BTW, you forget "I" for interface, and enum names being prefixed with some prefix derived from the base type name (e.g.
TStringsDefined = set of (sdDelimiter, sdQuoteChar, sdNameValueSeparator,
sdLineBreak, sdStrictDelimiter)
)
etc.
Hmm, this is another language specific bit, since Object Pascal always adds enum names to the global space (instead of requiring enumtype.enumname). With a prefix there is less polution of the global space.
That is one of my pet peeves with Delphi btw, the lack of import control (Modula2 style IMPORT QUALIFIED , FROM xxx IMPORT. Extended Pascal also has some of this)

As far as I know, the T, E, F, and P prefixes are only commonly used in Delphi programming. They're a standard part of the idiom here, but in C# or Java they'd look out of place.
Get and Set are pretty standard across object-oriented programming. Not sure about the On prefix, but it wouldn't surprise me to find that that's common in any event-driven framework.

Related

No F# generics with constant "template arguments"?

It just occurred to me, that F# generics do not seem to accept constant values as "template parameters".
Suppose one wanted to create a type RangedInt such, that it behaves like an int but is guaranteed to only contain a sub-range of integer values.
A possible approach could be a discriminated union, similar to:
type RangedInt = | Valid of int | Invalid
But this is not working either, as there is no "type specific storage of the range information". And 2 RangedInt instances should be of different type, if the range differs, too.
Being still a bit C++ infested it would look similar to:
template<int low,int high>
class RangedInteger { ... };
Now the question, arising is two fold:
Did I miss something and constant values for F# generics exist?
If I did not miss that, what would be the idiomatic way to accomplish such a RangedInt<int,int> in F#?
Having found Tomas Petricek's blog about custom numeric types, the equivalent to my question for that blog article would be: What if he did not an IntegerZ5 but an IntegerZn<int> custom type family?
The language feature you're requesting is called Dependent Types, and F# doesn't have that feature.
It's not a particularly common language feature, and even Haskell (which most other Functional programming languages 'look up to') doesn't really have it.
There are languages with Dependent Types out there, but none of them I would consider mainstream. Probably the one I hear about the most is Idris.
Did I miss something and constant values for F# generics exist?
While F# has much strong type inference than other .NET languages, at its heart it is built on .NET.
And .NET generics only support a small subset of what is possible with C++ templates. All type arguments to generic types must be types, and there is no defaulting of type arguments either.
If I did not miss that, what would be the idiomatic way to accomplish such a RangedInt in F#?
It would depend on the details. Setting the limits at runtime is one possibility – this would be the usual approach in .NET. Another would be units of measure (this seems less likely to be a fit).
What if he did not an IntegerZ5 but an IntegerZn<int> custom type family?
I see two reasons:
It is an example, and avoiding generics keeps things simpler allowing focus on the point of the example.
What other underlying type would one use anyway? On contemporary systems smaller types (byte, Int16 etc.) are less efficient (unless space at runtime is the overwhelming concern); long would add size without benefit (it is only going to hold 5 possible values).

Is there a relationship between untyped/typed code quotations in F# and macro hygiene?

I wonder if there is a relationship between untyped/typed code quotations in F# and the hygiene of macro systems. Do they solve the same issues in their respective languages or are they separate concerns?
The meta-programming aspect is the only similarity, and even in that regard, there is a big difference. You can think of the macro's transformer as a function from syntax to syntax like you can manipulate quotations, but the transformers are globally coordinated so that names used as binders follow a specific protocol:
1) Binders may not be the same as any free name in input to the macro (unless you use an unhygienic escape hatch)
2) Names bound in a macro definition's context that are free in the macro's expansion must point to the same thing at macro use time. (this needs global coordination)
Choices for names are made so that expansion does not fail if you used the wrong name (unless it turns out that name is unbound).
Transformers of typed quotations do not have this definition time context idea. You manipulate quotations to form a program that does not refer to any names in your program. They are not meant to provide a syntactic abstraction mechanism. Arbitrary shapes of syntax? Nope. It all has to be core AST shapes.
Open code in typed quotation systems can be closed with anything that fits the type structure of the expected context - there is no coordinated composition of several open components into a coherent structure.
Quotations are a form of meta-programming. They allow you to manipulate abstract syntax trees programmatically, which can be in turned spliced into code, and evaluated.
Typed quotations embed the reified type of the AST in the host language's type system, so they ensure you cannot generate ill-typed fragments of code. Untyped quotations do not offer that guarantee (it may fail with a runtime error).
As an aside, typed quotations are strongly similar to Template Haskell quasiquotations.
Hygenic macros in Lisp-like languages are related, in that they exist to support meta-programming. The hygiene however is for simple name capture confusion, something that typed quasi quotations already avoid (and more).
So yes, they are similar, in that they are mechanisms for meta-programming in typed and untyped languages, respectively. Both typed quasi quotes and hygenic macros add additional safety to fully untyped, unsound meta programming. The level of guarantee they offer the programmer though is different. The typed quotes are strictly stronger.

Alpha renaming in many languages

I have what I imagine will be a fairly involved technical challenge: I want to be able to reliably alpha-rename identifiers in multiple languages (as many as possible). This will require special consideration for each language, and I'm asking for advice for how to minimize the amount of work I need to do by sharing code. Something like a unified parsing or abstract syntax framework that already has support for many languages would be great.
For example, here is some python code:
def foo(x):
def bar(y):
return x+y
return bar
An alpha renaming of x to y changes the x to a y and preserves semantics. So it would become:
def foo(y):
def bar(y1):
return y+y1
return bar
See how we needed to rename y to y1 in order to keep from breaking the code? That is why this is a hard problem. It seems like the program would have to have a pretty good knowledge of what constitutes a scope, rather than just doing, say, a string search and replace.
I would also like to preserve as much of the formatting as possible: comments, spacing, indentation. But that is not 100% necessary, it would just be nice.
Any tips?
To do this safely, you need to be able to to determine
all the identifiers (and those things that are not, e.g., the middle of a comment) in your code
the scopes of validity for each identifer
the ability to substitute a new identifier for an old one in the text
the ability to determine if renaming an identifier causes another name to be shadowed
To determine identifiers accurately, you need a least a langauge-accurate lexer. Identifiers in PHP look different than the do in COBOL.
To determine scopes of validity, you have to be determine program structure in practice, since most "scopes" are defined by such structure. This means you need a langauge-accurate parser; scopes in PHP are different than scopes in COBOL.
To determine which names are valid in which scopes, you need to know the language scoping rules. Your language may insist that the identifier X will refer to different Xes depending on the context in which X is found (consider object constructors named X with different arguments). Now you need to be able to traverse the scope structures according to the naming rules. Single inheritance, multiple inheritance, overloading, default types all will pretty much require you to build a model of the scopes for the programs, insert the identifiers and corresponding types into each scope, and then climb from the point of encounter of an identifier in the program text through the various scopes according to the language semantics. You will need symbol tables, inheritance linkages, ASTs, and the ability to navigage all of these. These structures are different from PHP and COBOL, but they share lots of common ideas so you likely need a library with the common concept support.
To rename an identifier, you have to modify the text. In a million lines of code, you need to point carefully. Modifying an AST node is one way to point carefully. Actually, you need to modify all the identifiers that correspond to the one being renamed; you have to climb over the tree to find them all, or record in the AST where all the references exist so they can be found easily. After modifyingy the tree you have to regenerate the source text after modifying the AST. That's a lot of machinery; see my SO answer on how to prettyprint ASTs preseriving all of the stuff you reasonably suggest should be preserved.
(Your other choice is to keep track in the AST of where the text for the string is,
and the read/patch/write the file.)
Before you update the file, you need to check that you haven't shadowed something. Consider this code:
{ local x;
x=1;
{local y;
y=2;
{local z;
z=y
print(x);
}
}
}
We agree this code prints "1". Now we decide to rename y to x.
We've broken the scoping, and now the print statement which referred
conceptually to the outer x refers to an x captured by the renamed y. The code now prints "2", so our rename broke it. This means that one must check all the other identifiers in scopes in which the renamed variable might be found, to see if the new name "captures" some name we weren't expecting. (This would be legal if the print statement printed z).
This is a lot of machinery.
Yes, there is a framework that has almost all of this as well as a number of robust language front ends. See our DMS Software Reengineering Toolkit. It has parsers producing ASTs, prettyprinters to produce text back from ASTs, generic symbol table management machinery (including support for multiple inheritance), AST visiting/modification machinery. Ithas prettyprinting machinery to turn ASTs back into text. It has front ends for C, C++, COBOL and Java that implement name and type resolution (e.g. instanting symbol table scopes and identifier to symbol table entry mappings); it has front ends for many other langauges that don't have scoping implemented yet.
We've just finished an exercise in implementing "rename" for Java. (All the above issues of course appeared). We about about to start one for C++.
You could try to create Xtext based implementations for the involved languages. The Xtext framework provides reliable infrastructure for cross language rename refactoring. However, you'll have to provide a grammar a at least a "good enough" scope resolution for each language.
Languages mostly guarantee tokens will be unique, whatever the context. A naive first approach (and this will break many, many pieces of code) would be:
cp file file.orig
sed -i 's/\b(newTokenName)\b/TEMPTOKEN/g' file
sed -i 's/\b(oldTokenName)\b/newTokenName/g' file
With GNU sed, this will break on PHP. Rewriting \b to a general token match, like ([^a-zA-Z~$-_][^a-zA-Z0-9~$-_]) would work on most C, Java, PHP, and Python, but not Perl (need to add # and % to the token characters. Beyond that, it would require a plugin architecture that works for any language you wanted to add. At some point, there will be two languages whose variable and function naming rules will be incompatible, and at that point, you'll need to do more and more in the plugin.

F# instance syntax

What indicator do you use for member declaration in F#? I prefer
member a.MethodName
this is to many letters and x is used otherwise.
I do almost always use x as the name of this instance. There is no logic behind that, aside from the fact that it is shorter than other options.
The options that I've seen are:
member x.Foo // Simply use (short) 'x' everywhere
member ls.Foo // Based on type name as Benjol explains
member this.Foo // Probably comfortable for C# developers
member self.Foo // I'm not quite sure where this comes from!
member __.Foo // Double underscore to resemble 'ignore pattern'
// (patterns are not allowed here, but '__' is identifier)
The option based on the type name makes some sense (and is good when you're nesting object expressions inside a type), but I think it could be quite difficult to find reasonable two/three abbreviation for every type name.
Don't have a system. Wonder if I should have one, and I am sure there will be a paradigm with its own book some day soon. I tend to use first letter(s) of the type name, like Benjol.
This is a degree of freedom in F# we could clearly do without. :)
I tend to use some kind of initials which represent the type so:
type LaserSimulator =
member ls.Fire() =
I largely tend to use self.MethodName, for the single reason that self represents the current instance by convention in the other language I use most: Python. Come to think of it, I used Delphi for some time and they have self as well instead of this.
I have been trying to convert to a x.MethodName style, similar to the two books I am learning from: Real World Functional Programming and Expert F#. So far I am not succeeding, mainly because referring to x rather than self (or this) in the body of the method still confuses me.
I guess what I am saying is that there should be a meaningful convention. Using this or self has already been standardised by other languages. And I personally don't find the three letter economy to be that useful.
Since I work in the .NET world, I tend to use "this" on the assumption that most of the .NET people who encounter F# will understand its meaning. Of course, the other edge of that sword is that they might get the idea that "this" is the required form.
.NET self-documentation concerns aside, I think I would prefer either: "x" in general, or -- like Benjol -- some abbreviation of the class name (e.g. "st" for SuffixTrie, etc.).
The logic I use is this: if I'm not using the instance reference inside the member definition, I use a double underscore ('__'), a la let-binding expressions. If I am referencing the instance inside the definition (which I don't do often), I tend to use 'x'.

what does underscore mean in Delphi4

I come across the following in code.
_name1
_name2
smeEGiGross:
In general, what does _name1 underscore mean in Delphi 4?
I think it's just a common practice to begin variable names with underscores.
Rules for variable (and component) names in Delphi:
Name can be any length but Delphi uses only first 255 characters.
First character must be letter or underscore not a number.
You cannot use any special characters such as a question mark (?), but you can
use an underscore (_).
No spaces area allowed in the name of a variable.
Reserved words (such as begin, end, if, program) cannot be used as variables.
Delphi is case insensitive – it does not matter whether capital letters are
used or not. Just make sure the way variables or components are used is
consistent throughout the program.
It's a convention to help determine scope of a variable by its name, like private class members. The original author probably uses C++ as well.
In Delphi, I prefer to prefix fields with "F", method parameters with "a" (argument) and local variables with "l".
Update:
Another place you might see underscores is after certain identifiers in code generated with WSDLImp or TLBImp to avoid confusion with existing Delphi identifiers. For example, unless you specify otherwise, "Name" is renamed (no pun intended) to "Name_".
To add to the answer of phoenix:
* You can use reserved words as identifiers, but you must add a & sign: &then,
&unit.
I only use this if another name ís not apropriate.
I associate leading underscores with C/C++ not with Delphi. The C syntax is more character oriented (like {, }, || and &&) so the underscores fit perfectly. While the Delphi/Pascal syntax is more text oriented (begin, end, or, and) so the underscores look a bit strange as you don't expect them there.
It's regularly used for scope.
I declare all my private variables with a _ at the beginning, which is more common in C#/C++.
It's for readability and doesn't necessarily mean anything.
I cannot say what the author of the code you have in mind was thinking, but an underscore prefix has a fairly well recognised convention in relation to COM.
Members in a COM interface that have an underscore prefix are (or were) hidden by default by interface browsers/inspectors, such as VB's library browser etc. This is used when members are published by necessity but are not intended to be used directly.
The _AddRef and _Release members of the IUnknown interface are perhaps the most obvious example of this.
I've adopted this convention myself, so that for example if I declare a unit implementation variable that is exposed through the unit interface via an accessor function I will name the variable with an underscore prefix. It's not strictly necessary in that sort of example but it acts as documentation for myself (and anyone else reading my code that is aware of the convention, which is fairly self-evident from the context).
Another example is when I have a function pointer that may refer to different functions according to runtime conditions so the actual functions that the pointer may refer to are not intended to be called directly but are instead intended to be invoked via the function pointer.
So speaking for myself, I use it as a warning/reminder.... Proceed with caution you are not expected to reference this symbol directly.

Resources