I'm trying to analyze Java source files with Clojure but I couldn't find a way to do that.
First, I thought using Eclipse AST plugin(by copying necessary JAR's to my Clojure project) but I gave up after seeing Eclipse AST's API(visitor based walker).
Then I've tried creating a Java parser with ANTLR. I can only find one Java 1.6 grammar for ANTLR( http://openjdk.java.net/projects/compiler-grammar/antlrworks/Java.g ) and it doesn't compile with latest ANTLR(here's the errors I'm getting ).
Now I have no idea how can I do that. At worst I'll try to go with Eclipse AST.
Does anyone know a better way to parse Java files with Clojure?
Thanks.
Edit: To clarify my point:
I need to find some specific method calls in Java projects and inspect it's parameters(we have multiple definitions of the method, with different type of parameters). Right now I have a simple solution written in Java(Eclipse AST) but I want to use Clojure in this project as much as possible.
... and it doesn't compile with latest ANTLR ...
I could not reproduce that.
Using ANTLR v3.2, I got some warnings, but no errors. Using both ANTLR v3.3 and v3.4 (latest version), I have no problems generating a parser.
You didn't mention how you're (trying) to generate a lexer/parser, but here's how it works for me:
java -cp antlr-3.4.jar org.antlr.Tool Java.g
EDIT 1
Here's my output when running the commands:
ls
wget http://www.antlr.org/download/antlr-3.4-complete.jar
wget http://openjdk.java.net/projects/compiler-grammar/antlrworks/Java.g
java -cp antlr-3.4-complete.jar org.antlr.Tool Java.g
ls
As you can see, the .java files of the lexer and parser are properly created.
EDIT 2
Instead of generating a parser yourself (from a grammar), you could use an existing parser like this one (Java 1.5 only AFAIK) and call it from your Clojure code.
It depends a bit on what you want to do - what are you hoping to get from the analysis?
If you want to actually compile Java or at least build an AST, then you probably need to go the ANTLR or Eclipse AST route. Java isn't that bad of a language to parse, but you still probably don't want to be reinventing too many wheels..... so you might as well build on the Eclipse and OpenJDK work.
If however you are just interesting in parsing the basic syntax and analysing certain features, it might be easier to use a simpler general purpose parser combinator library. Options to explore:
fnparse (Clojure, not sure how well maintained)
jparsec (Java, but can probably be used quite easily from Clojure)
Related
I'd like to derive exactly that subset of the sources of a dart comiler (dart2js or dartdevc or other) or of a dart analyser that can 1. transform a string of dart code (or better a list of strings each representing a compilation unit) into a typed syntax tree, 2. be translated into js, 3. be run in the browser. Is there a marked subset that fulfills these requirements, which is it, and how can I find it, in general.
Accomplishing #1 is fairly simple using package:analyzer, which is the same static analyzer used to provide IDE hinting and autocomplete, etc. The Dart Team is currently working on unifying their compiler frontends behind on main API, but for now, analyzer should definitely take care of most of what you need.
Here's an example of getting a syntax tree and running analysis on it: https://github.com/thosakwe/analyzer_examples/blob/master/analyze_a_file/analyze_a_file.dart
As for #2, you'll likely have to fork the dart-lang/sdk repo and make your own adjustments to dart2js. It's not published as a standalone package. Otherwise, you can write your own compiler, which is probably not going be fun.
I suppose you'd have to figure out how to get #2 up and running, but hypothetically, if you could compile a JavaScript source, you could just eval it after compilation.
To answer your final question, no, AFAIK, there is no subset of dart2js available that lets you create your own Dart-to-JavaScript compiler.
I'd really like to be able to run some Rascal's program from outside the REPL (e.g. as part of a script, or called from another program). What I'm using Rascal for is and intermediate stage in a larger framework so I am wondering what the best way to go about integrating executing the Rascal code from another program.
Right now the best way is to package your code together with the Rascal shell executable jar. There is a convenience class JavaToRascal for calling into Rascal code. Sometimes it requires some thinking to add your own modules to the Rascal search path using IRascalSearchPathContributors, but if you include a RASCAL.MF file with the right properties it all should go automatically.
If you are thinking of an Eclipse plugin, then the best way is to let your plugin depend on the rascal-eclipse plugin and use ProjectEvaluatorFactory to get access to the interpreter.
Caveat: since we are moving to a compiled system, the code you write for this kind of integration will change. This is the reason we haven't documented the API for calling Rascal from Java yet.
As I was pondering the same question, this is the (maybe trivial) answer I came up with:
java -jar /path/to/rascal-shell-stable.jar path/to/myProgram.rsc
You have to be aware that Rascal calculates module names from the current directory (don't know if it supports something like Java's CLASS_PATH), so in the above example myProgram.rsc should have a module declaration of module path::to::myProgram. Calculated and declared module name have to match.
I'm thinking it might be easiest if I modify the Java syntax used in Rascal to better fit our Java-like language.
Is there a way I can build Rascal from the source? I've cloned the repo from Github and imported it as a project into Eclipse but there are some compilation errors regarding org.eclipse.imp. Before I head down the rabbit-hole of trying to get this all to work in Eclipse I thought I would post here to see if there is an easy way to handle this.
Thanks!
Sure you could build Rascal from scratch; following the developer instructions at https://github.com/cwi-swat/rascal/wiki/Rascal-Developers-Setup---Step-by-Step
On the other hand, if you wish to simply adapt the Java syntax definition it would be better to clone it into your own files. Grammars may look modular, but in reality there are complex interactions between different parts of the grammar. Better to clone and manage the whole thing as your own than depend on two co-evolving definitions.
If you clone the Java grammar Rascal will generate new parsers for you on-the-fly. If this generation becomes cumbersome, a "cached parser" can help you to optimize the deployment of your tools. Please contact us if you need help with that.
This link describes how bytecode can be generated from an AST tree. Basically, it shows how the parsing phase of compilation can be bypassed and the AST be picked up by the java compiler to produce bytecode.
This works well but I would like to be able to generate the AST using javac the way it is without changing its source code and without any framework. Is this possible and has there been anything done like this before?
Thanks in advance for your reply.
So it turns out you cannot compile a tree created by the user using the arbitrary implementations of com.sun.source.tree.*. What can be done though is to print the AST to a string and compile the string in memory using the Java 6 Compiler API.
Is there anyway to use the llvm-clang parser in an incremental/online manner?
Say I'm writing an editor and I want to be able to parse the C++ code I have in front of me.
I don't want to write my own hacked up parser.
I'd like to use something full featured, like llvm-clang.
Is there an easy way to hijack the llvm-clang parser? (And is it fast enough to run it continuously in the background)?
Thanks!
I don't think clang can incrementally parse C++ files, but it's one of this project goals: http://clang.llvm.org/features.html
I've written something similar for my final year project. It wasn't C++ editor, but a Visual Studio plugin, which main task was improving C++ intellisense (like Visual Assist X).
When I was writing this project I've been also thinking about C++ incremental parser, but I haven't found any suitable solution. To solve the C++ intellisense problem I used normal C++ parser from GCC. However it was to slow, to parse file after each code completion request (ctrl+space), just try including boost::spirit. To make this project work properly I parsed files in the background and after each code completion request I compared current file with it's previous version (via diff) to detect changes made from last parsing. Having those changes I updated syntax tree, mostly by adding or removing variables.
Except incremental parsing, there is also another problem with projects like this. Mostly you'll be parsing C++ code which is being edited so it's invalid code. Given the complex C++ grammar, sometimes parser won't be able to recover from syntax errors, so it won't detect correctly some symbols in code.
Another issue are C++ parsers / compilers differences. Let's say I'm using working in Visual Studio and I have used some VC++ compiler specific contruction in my code. Clang parser won't be able to parse it correctly.
For writing something similair to IntelliSense, I would advise you to write your own parser using the LALR parsing algorithm. Since you can save its state in each line so you don't have to reparse the whole file when a file has been editted, which is very fast!
Note that C++ can't be fully expressed in BNF, but I think you could get pretty far with some adjustments. It's ofcourse a lot more work than using Clang's frontend, but you could still use Clang for analysing header files in coöperation with you own written parser.