Coding Standards for Gremlin/Cypher - neo4j

We predominantly work with Neo4j graph databases in our project, and we are developing a review tool for Gremlin/Cypher scripts to reduce manual review effort and deliver quality code.
Is there a list of coding standards (formatting, performance tips, etc.) for Gremlin and Cypher scripts that can be used as a checklist when reviewing these scripts?

I don't think you're going to find one specific answer, as discussing coding standards can lead to very subjective (and debate-laden) answers. That said, I'll go with something more objective.
The first step would be to decide on Gremlin vs. Cypher, since they are neither the same thing nor the same style. When making that decision (and maybe the decision is to use both), you should take a close look at Neo4j 2.0 development (currently at Milestone 4), as Cypher is maturing rapidly and a lot of work is being put into it, from both an expressiveness and a performance point of view.
Assuming you go with Cypher, I'd suggest you look at the samples being published by Neo Technology, especially the Cypher learning module. I don't know of any published guidelines, but I'd think most of the guidelines would be similar to any scripting guidelines you already have (such as naming conventions, spacing, etc.). Going further, you're likely going to use Cypher via code as well as through the console. So you'll want to continue using your traditional programming style guidelines, as well as specifying the language-specific library you'll be using.

I can only give you an answer related to Gremlin. First, it is important to note that almost all examples in the Gremlin wiki, GremlinDocs, the gremlin-users mailing list, etc. are meant for the REPL. Example traversals from these sources tend to be lengthy one-liners, written that way for easy transfer via copy/paste to the command line for execution. It is somewhat satisfying to get your answer in one line in the REPL, but for production code that requires maintenance over time, resist that temptation unless some specific reason dictates it.
Second, from a style/formatting perspective, Gremlin is a DSL built on Groovy. Whatever style you like for Groovy should generally work for Gremlin. Start with the Groovy recommendations and tweak them to your needs/liking. I would expect that a tool like CodeNarc would help with general style checks and identifying common Groovy coding problems.

Also see the message from the mailing list https://groups.google.com/forum/#!searchin/neo4j/coding$20standards/neo4j/JYz2APHV-_k/V1BKRwyv5hAJ

What's the purpose and mechanism of Ontology in D3WEB

In the expert system D3WEB, it is possible to insert, develop, and use ontologies. However, I don't understand the purpose of introducing ontologies in D3WEB.
The nice example on this page, https://www.d3web.de/Wiki.jsp?page=Demo%20-%20Ontology , shows how to develop an ontology in D3WEB. In my opinion, it can be developed more efficiently using Protégé. If the contents change in a real application, for instance an ontology about 'dog' where the real application has instances dog A, B, C, and D, it might not be feasible to 'insert' the instances into the D3WEB knowledge base. And if the ontology changes over time, how would the ontology then be used in D3WEB?
In my opinion, the best way is to develop an ontology outside of D3WEB using Java code. However, I believe the designers of D3WEB had a good reason to introduce ontologies in D3WEB. I would appreciate it if someone could explain it.
This is a somewhat common question we get regarding d3web-KnowWE. One reason might be that our naming is somewhat misleading, so let me explain.
First, there is d3web, the Java framework to run knowledge bases with strong problem-solving knowledge, including rules, decision trees, flow charts, covering lists, cost-benefit dialog strategies, time-based reasoning, and so on. This framework in its core does not provide any GUIs, but is meant to integrate problem-solving capabilities into other applications/expert systems. It also does not provide a way to properly create/author the knowledge bases it runs, aside maybe from doing it in Java code on an API level.
To also provide proper means to author and develop a knowledge base, including some basic dialogs to run, demo, test, and debug the authored knowledge bases, we began working on the wiki system KnowWE, which today is basically a heavily extended JSPWiki. The page d3web.de itself for example is also just a build of KnowWE with specific content.
While we were working on and with KnowWE, we began to really like the approach of editing and authoring large knowledge bases in this 'wiki way', where you automatically support multiple distributed users working on the same knowledge base, have automatic versioning, can add nice documentation directly beside the actual formal knowledge, can generate knowledge using scripts (because it's all just simple text markup), and so forth. Also, the underlying architecture of KnowWE became quite good and mature over the years.
So after some time of this, we found ourselves needing to author large ontologies as well. And yes, Protégé is a nice tool to develop ontologies, but for our use cases it was just not well suited, and we also found that it did not scale very well. So we began to implement some simple markups to also allow developing ontologies in KnowWE. After recognizing that authoring ontologies the 'wiki way' indeed works pretty nicely, we decided to again share these tools with everybody else on d3web.de. And that is why today you can author/develop both d3web knowledge bases and ontologies in KnowWE, although there is no actual connection/interoperability between the two as of now. That would be nice of course, and maybe we will add this in the future, but for now KnowWE is just a development environment for these two knowledge representations.
Maybe you can see KnowWE as similar to an IDE like Eclipse or IntelliJ, where the same application can be used to develop in many different programming languages. KnowWE does the same for different knowledge representations.
A problem may be that, historically, we didn't differentiate very well between KnowWE and d3web, because KnowWE was used almost exclusively to build d3web knowledge bases. We also like to call KnowWE and its distribution package d3web-KnowWE, for example. But maybe this should change...
Thanks for pointing this out. I will try to correct/clarify this on d3web.de.

OWL/RDF Automated Planner

Is there any software that acts as an intersection between contemporary OWL/RDF reasoners and the older STRIPS-style automated planners and schedulers? Both systems make use of RETE-based pattern matching, but only the automated planners seem to formalise the concept of an "action". Unfortunately, all the projects I've found that implemented automated planning, like Graphplan or SOAR, seem to be dead or dying, and never seemed to scale well to begin with. Current data stores are implemented on RDBMSs and can scale to and reason over millions of triples, but I haven't found any that specifically try to reason over actions. I can envision how the concept of actions might be represented in traditional RDF, but I'm sure it would still be very complicated and hackish without official support. Unfortunately, I can't find much prior art. Has this been done before?
Drools Planner (open source, Java, ASL) sits on top of the RETE-based rule engine Drools Expert and formalizes the concept of a Move, which might or might not be the action you're looking for. It excels at scaling out, both in data and in planning constraints. It is production-ready and has a complete reference manual.
There is some research going on to do OWL with Drools Expert, but I don't know how far that is at this point.

Appropriate uses for yacc/byacc/bison and lex/flex

Most of the posts that I read pertaining to these utilities suggest using some other method to obtain the same effect. For example, questions mentioning these tools usually have at least one answer containing some of the following:
Use the boost library (insert appropriate boost library here)
Don't create a DSL use (insert favorite scripting language here)
Antlr is better
Assuming the developer ...
... is comfortable with the C language
... does know at least one scripting language (e.g., Python, Perl, etc.)
... must write some parsing code in almost every project worked on
So my questions are:
What are appropriate situations which are well suited for these utilities?
Are there any (reasonable) situations where there is not a better alternative to a problem than yacc and lex (or derivatives)?
How often in actual parsing problems can one expect to run into any shortcomings in yacc and lex which are better addressed by more recent solutions?
For a developer which is not already familiar with these tools is it worth it for them to invest time in learning their syntax/idioms? How do these compare with other solutions?
The reasons why lex/yacc and derivatives seem so ubiquitous today are that they have been around for much longer than other tools, that they have far more coverage in the literature and that they traditionally came with Unix operating systems. It has very little to do with how they compare to other lexer and parser generator tools.
No matter which tool you pick, there is always going to be a significant learning curve. So once you have used a given tool a few times and become relatively comfortable in its use, you are unlikely to want to incur the extra effort of learning another tool. That's only natural.
Also, in the 1970s, when lex and yacc were created, hardware limitations posed a serious challenge to parsing. The table-driven LR parsing method used by yacc was the most suitable at the time because it could be implemented with a small memory footprint by using a relatively small general program logic and by keeping state in files on tape or disk. Code-driven parsing methods such as recursive-descent LL parsing had a larger minimum memory footprint because the parser program's code itself represents the grammar and therefore needs to fit entirely into RAM to execute, and it keeps state on the stack in RAM.
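To make "the parser program's code itself represents the grammar" concrete, here is a minimal sketch of a code-driven (recursive-descent) parser in Python. The arithmetic grammar, the class name Parser, and the function names tokenize and evaluate are assumptions chosen purely for illustration; nothing here comes from lex/yacc. Each grammar rule becomes one method, and the parse state lives on the call stack rather than in a table.

```python
import re

# Hypothetical grammar, chosen only for illustration:
#   expr   -> term (('+' | '-') term)*
#   term   -> factor (('*' | '/') factor)*
#   factor -> NUMBER | '(' expr ')'
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split the input into number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """One method per grammar rule: the code mirrors the grammar."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.take() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.take() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    def factor(self):
        token = self.take()
        if token == "(":
            value = self.expr()  # recursion keeps the nesting state on the call stack
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

With this sketch, evaluate("2 + 3 * (4 - 1)") returns 11. A table-driven generator such as yacc would instead encode the same rules as data that a small, generic driver loop interprets.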
When memory became more plentiful a lot more research went into different parsing methods such as LL and PEG and how to build tools using those methods. This means that many of the alternative tools that have been created after the lex/yacc family use different types of grammars. However, switching grammar types also incurs a significant learning curve. Once you are familiar with one type of grammar, for example LR or LALR grammars, you are less likely to want to switch to a tool that uses a different type of grammar, for example LL grammars.
Overall, the lex/yacc family of tools is generally more rudimentary than more recent arrivals which often have sophisticated user interfaces to graphically visualise grammars and grammar conflicts or even resolve conflicts through automatic refactoring.
So, if you have no prior experience with any parser tools and you have to learn a new tool anyway, then you should probably look at other factors such as graphical visualisation of grammars and conflicts, auto-refactoring, availability of good documentation, the languages in which the generated lexers/parsers can be output, and so on. Don't pick a tool simply because "this is what everybody else seems to be using".
Here are some reasons I could think of for using lex/yacc or flex/bison:
the developer is already familiar with lex/yacc or flex/bison
the developer is most familiar and comfortable with LR/LALR grammars
the developer has plenty of books covering lex/yacc but no books covering others
the developer has a prospective job offer coming up and has been told that lex/yacc skills would increase their chances of getting hired
the developer could not get buy-in from project members/stakeholders for the use of other tools
the environment has lex/yacc installed and for some reason it is not feasible to install other tools
Whether it's worth learning these tools or not will depend heavily (almost entirely) on how much parsing code you write, or how interested you are in writing more code of that general kind. I've used them quite a bit, and find them extremely useful.
The tool you use doesn't really make as much difference as many would have you believe. For about 95% of the inputs I've had to deal with, there's little enough difference between one and another that the best choice is simply the one with which I'm most familiar and comfortable.
Of course, lex and yacc produce (and demand that you write your actions in) C (or C++). If you're not comfortable with those languages, a tool that uses and produces a language you prefer (e.g. Python or Java) will undoubtedly be a much better choice. I, for one, would not advise trying to use a tool like this with a language with which you're unfamiliar or uncomfortable. In particular, if you write code in an action that produces a compiler error, you'll probably get considerably less help from the compiler than usual in tracking down the problem, so you really need to be familiar enough with the language to recognize the problem with only a minimal hint about where the compiler noticed something wrong.
In a previous project, I needed a way to generate queries on arbitrary data that was easy for a relatively non-technical person to use. The data was CRM-type stuff (First Name, Last Name, Email Address, etc.), but it was meant to work against a number of different databases, all with different schemas.
So I developed a little DSL for specifying the queries (e.g. [FirstName]='Joe' AND [LastName]='Bloggs' would select everybody called "Joe Bloggs"). It had some more complicated options, for example there was the "optedout(medium)" syntax which would select all people who had opted-out of receiving messages on a particular medium (email, sms, etc). There was "ingroup(xyz)" which would select everybody in a particular group, etc.
Basically, it allowed us to specify queries like "ingroup('GroupA') and not ingroup('GroupB')" which would be translated to an SQL query like this:
SELECT
    *
FROM
    Users
WHERE
    Users.UserID IN (SELECT UserID FROM GroupMemberships WHERE GroupID=2) AND
    Users.UserID NOT IN (SELECT UserID FROM GroupMemberships WHERE GroupID=3)
(As you can see, the queries aren't as efficient as possible, but that's what you get with machine generation, I guess.)
I didn't use flex/bison for it, but I did use a parser generator (the name of which has escaped me at the moment...).
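For readers who want to see the shape of such a translation, here is a minimal hand-written sketch in Python. It is not the original implementation (which used a parser generator) and it covers only ingroup(...), and, or, not, and parentheses. The GROUP_IDS lookup table, the Translator class, and the translate function are all hypothetical names introduced for illustration; the post does not say how group names were mapped to IDs.

```python
import re

# Hypothetical lookup table: the original post does not say how names map to IDs.
GROUP_IDS = {"GroupA": 2, "GroupB": 3}

# Keywords, single-quoted strings, and parentheses are the only tokens handled.
TOKEN = re.compile(r"\s*(?:(and|or|not|ingroup)\b|'([^']*)'|([()]))", re.IGNORECASE)


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at position {pos}")
        keyword, string, paren = match.groups()
        if keyword is not None:
            tokens.append(("kw", keyword.lower()))
        elif string is not None:
            tokens.append(("str", string))
        else:
            tokens.append(("paren", paren))
        pos = match.end()
    return tokens


class Translator:
    """Recursive descent: expr -> term ('or' term)*, term -> factor ('and' factor)*."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else (None, None)

    def take(self, kind=None, value=None):
        tok_kind, tok_value = self.peek()
        if tok_kind is None or (kind and tok_kind != kind) or (value and tok_value != value):
            raise SyntaxError(f"unexpected token {tok_value!r}")
        self.pos += 1
        return tok_value

    def expr(self):
        sql = self.term()
        while self.peek() == ("kw", "or"):
            self.take()
            sql = f"({sql} OR {self.term()})"
        return sql

    def term(self):
        sql = self.factor()
        while self.peek() == ("kw", "and"):
            self.take()
            sql = f"{sql} AND {self.factor()}"
        return sql

    def factor(self):
        kind, value = self.peek()
        if (kind, value) == ("kw", "not"):
            self.take()
            return f"NOT ({self.factor()})"
        if (kind, value) == ("kw", "ingroup"):
            self.take()
            self.take("paren", "(")
            name = self.take("str")
            self.take("paren", ")")
            if name not in GROUP_IDS:
                raise ValueError(f"unknown group {name!r}")
            # Only integer IDs from the lookup table reach the SQL text.
            return ("Users.UserID IN (SELECT UserID FROM GroupMemberships "
                    f"WHERE GroupID={GROUP_IDS[name]})")
        if (kind, value) == ("paren", "("):
            self.take()
            sql = self.expr()
            self.take("paren", ")")
            return sql
        raise SyntaxError(f"unexpected token {value!r}")


def translate(query):
    translator = Translator(tokenize(query))
    where = translator.expr()
    if translator.peek() != (None, None):
        raise SyntaxError("unexpected trailing input")
    return f"SELECT * FROM Users WHERE {where}"
```

Calling translate("ingroup('GroupA') and not ingroup('GroupB')") yields a query logically equivalent to the one above, although this sketch writes the negation as NOT (... IN ...) rather than NOT IN.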
I think it's pretty good advice to eschew the creation of new languages just to support a Domain specific language. It's going to be a better use of your time to take an existing language and extend it with domain functionality.
If you are trying to create a new language for some other reason, perhaps for research into language design, then these tools are a bit outdated. Newer generators such as ANTLR, or even newer implementation languages like ML, make language design a much easier affair.
If there's a good reason to use these tools, it's probably because of their legacy. You might already have a skeleton of a language you need to enhance, which is already implemented in one of these tools. You might also benefit from the huge volumes of tutorial information written about these old tools, for which there is not so great a corpus written for newer and slicker ways of implementing languages.
We have a whole programming language implemented in my office. We use it for that. I think it's meant to be a quick and easy way to write interpreters for things. You could conceivably write almost any sort of text parser using them, but a lot of times it's either A) easier to write it yourself quick or B) you need more flexibility than they provide.

Does Pair programming mean you don't need design documentation? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
In pair programming, the experience of every member of the team can be spread to new members. This experience is always in sync with the code, because the "senior" of the pair knows how the code works and what the design is.
So what is the utility of design documentation in this case?
UPDATE
I don't imply no design, I imply no documentation.
With a team that practices pair programming, I think everybody is disposable, because everybody knows the code. If the senior developer leaves, I think there is always at least one person who knows the code, because the experience was shared before.
What if your team is larger than 2 persons?
Just because two people know a part of a system does not mean it shouldn't be documented.
And I would be glad to know that I don't have to remember every tiny detail of a system just because it's stored nowhere else than in my head.
For a small system this might work, but as the system gets larger, you're limiting yourself and your colleagues. I'd rather use the memory capacity for a new system than to remember everything about the old system.
Have you ever played "telephone?" I don't think you should play it with your codebase.
What if the senior programmer leaves the company/project?
The set of deliverables should be decided independently of whether you use pair programming or not.
Six months or two years later, all the people involved could be in a different project (or a different company). Do you want to be able to come back and use the design documentation? Then, produce it. If you don't want to come back, or the design is simple enough that with the specs and the code you can understand it without the aid of an explicit design document, then you may skip it.
But don't rely on the two people explaining the design to you one year later.
Maintenance. You can't expect the team to remain static, for there to be no new members or loss of old members. Design documentation ensures that those who are new to the project, that have to maintain it years down the line, have information on decisions that were taken, why the approach was chosen, and how it was to be implemented. It's very important for the long term success of a project to have this documentation, which can be provided via a combination of traditional documents, source comments, unit tests, and various other methods.
I don't see that pair programming makes design documentation obsolete. I immediately have to think about the Truck factor. Sure, the senior may know what the design is. But what happens when he is ill? What happens when he gets hit by a truck? What if he is fired?
Pair programming does spread knowledge, but it never hurts to document that knowledge.
Who knows about the first-written code? The answer is nobody knows, because it hasn't been written. The reason it hasn't been written is because nobody knows what to do, hence the need for a design document.
Pair programming is just two people sharing one computer. By itself, it says nothing about what kind of design methodology the pair(s) uses.
Pair programming, when taking as part of "Extreme Programming", means following the Extreme Programming guidelines for design. This typically involves gathering and coding to "user stories". These stories would then stand in place of other design documentation.
The experience of people may be in sync with the code, as you say. But the design decisions are not all captured in the code - only the choices made are there.
In my experience, to really understand why code is designed the way it is, you need to know about the design choices that were not selected, the approaches that were tried and failed, etc. You can hope that the "Chinese whispers" chain transmits that correctly, given that there's no record of this in the code to refresh memories or correct errors...
... or you can write some documentation on the design and how it was arrived at. That way, you avoid being taken down a dark alley by the maintenance programmers in future.
Depends what you mean by "design documentation".
If you have functional tests, especially behaviour-driven development (BDD) tests or FitNesse or FIT tests, then they're certainly a form of "active documentation"... and they certainly have value beyond being regression tests.
If you write user stories and break them down into tasks and write those tasks on cards for pairs to do then you're doing a form of documentation...
Those are the two main forms of documentation I've used in XP teams that pair on all production code.
The only other document that I find quite handy is a half-page or so set of bullet points showing people how to set up the build environment for a development machine. You're supposed to maintain the list as you go along using it.
The code base may be so large you can't humanly remember every detail of what you were intending to implement. A reference is useful in this case.
Also, you need a design if you are interacting with other components etc.
Well, if you want a spreadsheet program instead of a word processor, a design doc is useful :-)
XP, pair programming, agile, etc. do not mean you do not have a plan; it is just a far less detailed plan (at the micro level) of what is going on. The use cases that the user picks are more of the design, and it is more of a living document than with other styles of design/programming.
Do not fall into the trap of thinking that because you are doing something "cool" you no longer need good practices. Indeed, this style of programming requires more discipline rather than less to be successful.
Pair programming is an opportunity for the team to avoid having to spend a large proportion of the project time on documenting everything. But the need for documentation depends on how good you are at remembering the important stuff and how good your code is. You may still want lots of documentation if the code is difficult to work with.
You could try some experiments:
Document a couple of small parts of the design and note how often you have to refer to it.
Document stuff that is always a pain to work with.
No. Nor does a lack of pair programming mean you need documentation. Documentation is needed! What it looks like may surprise you!
An agile team will decide when and what documentation is needed. A good rule of thumb: if no one is going to read it, don't write it. Don't get caught up in waterfall artifact thinking by providing artifacts because the Project Manager says so.
Most think of documentation as something you do with Word. If an agile team is working properly, the code itself, with TDD (test-driven development), will have a set of automated tests that document and enforce the requirements. Imagine: documentation that is in sync with the code ... and it stays that way.
Having said that, pairing does help domain, application, practice and skill knowledge propagate through the team very quickly. Pairing also helps ensure that the team follows the engineering practices, including TDD and other automated tests. The result is that the application remains healthy and future change is easy to bring about.
So, bottom line, pair programming produces better documentation. It does not eliminate documentation (although you might not be able to find a Word document).
I am an advocate and a fan of documentation. Pair programming does not require "one senior developer". In my experience with pair programming, developers of all levels are paired together for the purpose of rapid development. There are many times I worked with junior developers and would trade off on the keyboard. There are many times I worked with senior architects and would trade off on the keyboard. Documentation is still necessary, especially for your core components and database.
Pair Programming only enables your coding and logical aspect.
But documentation is good practice. Always do documentation...

Hands-on study plan for learning Grails

I'm a Java developer trying to learn Grails, and I'd like to get exposure to as many parts of the Grails framework as possible. Preferably doing so by solving small real-world problems using the "Grails way to do it" (DRY, convention-over-configuration, and so on).
Three examples could be:
Learn about GORM by creating a few classes (say Movie, Actor, etc.) and relations/mappings between them (1:1, 1:N, N:N, etc.).
Learn about the layout support (SiteMesh) by using it to generate headers and footers common to all GSPs on the site.
Learn about the filter support by using it to make sure all accesses to a certain controller come from authenticated users.
My question goes to all Grails developers out there - what would you include in a "Grails curriculum" and in what order?
All input appreciated!
Here are some examples, but be warned that they're fairly trivial and don't really show you how the system works together. One of the strengths of Grails is that the different parts all combine to reduce your code complexity and speed development. I recommend doing a single project of moderate size (like blogging software or a photo gallery) that forces you to touch virtually everything. I'm currently working on a ticket management application, and I've had to learn basically everything in the framework. It's really not that much material, actually.
That being said, here's my list of "must study", along with some examples:
Groovy, especially closures, maps, and properties. If you're coming from Java, closures might seem a little strange at first. However, once you wrap your head around them, it'll be hard to go back to a language that doesn't use them. Maps and properties use ideas that might be familiar, but the syntax and usage is different enough that it's worth studying them closely. Grails uses these three things ALL THE TIME, all throughout the framework. For a good example, examine the "BeanBuilder" that instantiates the Spring beans defined in resources.groovy. Also, run through the Groovy documentation at groovy.codehaus.org. A couple of hours there will save you DAYS down the road.
MVC programming. The "MVC" model in Grails pretty closely matches the one used in Rails, but it's significantly different than the "MVC" model used in Java desktop applications. Basically, all incoming URL requests are a message to a controller, which returns a view. Domain objects are the data that you want to store, view, and manipulate through views and controllers. Do an input form that validates the user's input using constraints, and then manipulates it somehow using a controller. Something like a page that takes in your birthday, and returns your Zodiac sign and Chinese Zodiac animal. See if you can get it to return errors to the user when bad input is given.
GORM. GORM is super-important, but you'll be forced to learn it with virtually any project you pick. Give the documentation a once-over, just so you know what its capabilities are.
Filters and Services. These are "the grails way" to do a lot of DRY programming. Authentication is a canonical example, and it's perfect for learning filters. For services, write something that will send out email. There's a great example of a simple emailer service on the Grails website.
Groovy Server Pages. If you've worked with a templating engine before, then this should seem familiar. Get to know the GSP tag library, it's a huge help. Practical examples include: virtually anything. Every application needs a front-end. Try and make it pretty. NOTE: This spills into a lot of stuff that isn't Grails-specific, like JavaScript, CSS, etc. Unless you have that knowledge already, prepare for a bit of a learning curve.
Your "conf" directory. Get to know every file in there, especially UrlMappings.groovy. Play with UrlMappings so that you have an app that takes meaningful information from the URL. Something like /myapp/calculate/36/times/145, where the app returns an answer.
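As a language-neutral illustration of what the URL exercise asks for, here is a small Python sketch of the logic behind a mapping like /myapp/calculate/36/times/145. It is not Grails code and does not use UrlMappings.groovy; the ROUTE pattern, the calculate function, and the operation names other than "times" are assumptions made up for this example.

```python
import re
from operator import add, mul, sub

# Hypothetical operation names; only "times" appears in the example URL.
OPERATIONS = {"plus": add, "minus": sub, "times": mul}

# Plays the role of a URL mapping: /myapp/calculate/<left>/<op>/<right>
ROUTE = re.compile(r"^/myapp/calculate/(?P<left>\d+)/(?P<op>[a-z]+)/(?P<right>\d+)$")


def calculate(path):
    """Extract the operands and operation from the path and return the result."""
    match = ROUTE.match(path)
    if match is None:
        raise ValueError(f"no mapping for {path!r}")
    operation = OPERATIONS.get(match.group("op"))
    if operation is None:
        raise ValueError(f"unknown operation {match.group('op')!r}")
    return operation(int(match.group("left")), int(match.group("right")))
```

Here calculate("/myapp/calculate/36/times/145") returns 5220. In a Grails app, the named parts of the URL would presumably arrive as request parameters in a controller action instead of regex groups.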
I'd say those are the basics, but there's a lot of other topics like webflows, i18n, testing, session handling, and so on. The best way to learn those is by building a decent sized project. While you're doing that, you'll probably find yourself thinking, "Gosh, I wish that Grails did ____". Read through the excellent documentation on Grails.org, and you'll probably find a built-in capability or plugin that does what you want. The reference PDF lives on my desktop, and I've found it invaluable during my learning experience.
Oh, and look at the scaffolding code that Grails generates. You'll probably end up pitching it all out, but it'll give you a good idea of how the system works.
Have fun, and happy hacking!
Step 1 - Learn Groovy
If you already know Java, I highly recommend Programming Groovy. It's a lot more concise and up-to-date than the otherwise excellent Groovy in Action. Neither of these books covers the significant language changes in Groovy 1.6, so you should also read this page.
Step 2 - Learn Grails
The Definitive Guide to Grails is probably the standard Grails reference - make sure you get the second edition. Grails in Action is slightly more recent, but I haven't read it so can't comment further on it. I found TDGTG a little light on GORM, so you might also consider checking out Grails Persistence with GORM and GSQL. It's a very short book, but worth its weight in gold.
Step 3 - Dive In
Try and modify the sample app in the Grails book, or build your own from scratch. The Groovy console is a great way to experiment with snippets of Groovy code.
If the audience is not familiar with programming in Groovy, there should be an introduction to that. Java will work, but it gets the juices going when you see how much less verbose the code is in Groovy. When discussing GORM, include constraints and how they influence validation. Scaffolding is a real time saver when starting a new project, so be sure to include it. One of the features of Grails that has really helped me is plug-ins. Pick a few and show how they provide solutions while saving development time. A security plug-in would fit right into the filter topic you mention. Testing: can there ever be enough testing?
I would really recommend reading The Definitive Guide to Grails, Second Edition. It covers everything you need to know about writing applications in Grails. It probably lacks the "what happens under the hood" knowledge, but you should get the hang of it. You can buy it as a PDF and start reading it immediately.
You should also have a list of plug-ins to use - Grails has some really nice ones that come in handy. I can tell you some of the ones I use, but that may be a good question here, too. :-)
