How to extract readable source code from a .web file? - latex

I'm trying to dig into the source for Donald Knuth's Metafont compiler. However, I am getting bogged down in his toolchain. What is the best way to extract readable, navigable source code from one of Knuth's .web files? I am toying with doing a reimplementation in another language, and I want to look at the geometric algorithms and so on, so that I have some idea of what I am getting into.
The .web -> .tex -> .dvi route left me with a huge document without an index, that is terribly slow to render, at least in evince.
The .web -> .p file resulted in source code that was stripped of all comments, and deliberately packed without any consideration for readability.
Should I start messing around with Pascal pretty-printers? Or use a Pascal-to-C converter, like modern LaTeX does, and then pretty-print and explore that?

The idea of WEB is that the program source code is readable in the .web source itself, or in the documentation (produced using weave *.web -> pdftex *.tex).
The program code (generated by tangle *.web) is intended just for the computer, not for humans.
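If the woven document is too heavy to browse, a rough outline can be pulled straight from the .web source. The following is a minimal Python sketch, not part of Knuth's toolchain. It assumes that module markers ("@ " for ordinary modules, "@*" for starred ones) sit at the start of a line, as they conventionally do, and the names split_web_modules and starred_titles are made up for illustration.

```python
import re

# Assumption: a module begins on a line that starts with "@ ", "@<tab>", "@*",
# or consists of a lone "@". Other control codes (@p, @d, @<...@>) do not match.
MODULE_START = re.compile(r"^@(?:[ \t*]|$)")


def split_web_modules(text):
    """Split WEB source into numbered modules; text before the first module (limbo) is skipped."""
    modules = []
    current = None
    for line in text.splitlines():
        if MODULE_START.match(line):
            current = {
                "number": len(modules) + 1,
                "starred": line.startswith("@*"),
                "lines": [line],
            }
            modules.append(current)
        elif current is not None:
            current["lines"].append(line)
    return modules


def starred_titles(modules):
    """Return (module number, title) for each starred module; the title runs up to the first period."""
    titles = []
    for module in modules:
        if module["starred"]:
            title = module["lines"][0][2:].split(".", 1)[0].strip()
            titles.append((module["number"], title))
    return titles
```

Reading mf.web into a string and printing starred_titles(split_web_modules(text)) should give a table of contents keyed by module number, which can then be used to jump around the .web file in an ordinary editor.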

Related

Lua Bytecode to Lua human "readable"

I just got a script that I want to make some changes to, and I'm looking for someone to take on a freelance job: making the loadstring that I will provide readable for editing.
The Lua code is like this:
------------------------- ENGINE -----------------------------
code='\27\76\117\97\81\0\1\4\4\4\8\0\56\0\0\0\64\67\58\92\85\115\101\114\115\92\74\101\
I want it to be turned into human-readable code. I already searched on the subject and found that there are tools like ChunkSpy, LuaDec51 and unluac that can do this job. However, I have never programmed in Lua before and have no compiler knowledge to do it myself.
I'm looking for someone to help me, I have no idea how I'll do it.
Thanks anyway
Links to the two archives:
http://www.4shared.com/file/uQguRL4D/Avani_Dice_Script_1.html
http://www.4shared.com/file/FSLbD9tA/Avani_Dice_Script_2.html
luac -l will print out the Lua bytecode in human-readable form. With a basic understanding of Lua's instruction format, this is fairly easy to manually turn into source code.
As with other languages, automatic decompilers will rarely produce source code which is useful for understanding or editing the code.
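Before reaching for luac -l or a decompiler, it can help to confirm what the blob actually is. The following is a minimal Python sketch under two assumptions: the string uses only decimal escapes, as in the excerpt from the question, and the chunk is Lua 5.1, whose 12-byte header layout is used here. The function names are illustrative.

```python
import re


def decode_decimal_escapes(escaped):
    """Turn a string of Lua decimal escapes such as \\27\\76 into raw bytes."""
    return bytes(int(number) for number in re.findall(r"\\(\d{1,3})", escaped))


def parse_lua51_header(data):
    """Decode the 12-byte header of a Lua 5.1 bytecode chunk."""
    if len(data) < 12 or data[:4] != b"\x1bLua":
        raise ValueError("not a Lua bytecode chunk")
    version = data[4]
    if version != 0x51:
        raise ValueError("header layout below is only valid for Lua 5.1")
    return {
        "version": f"{version >> 4}.{version & 0xF}",
        "format": data[5],                  # 0 means the official format
        "little_endian": data[6] == 1,
        "sizeof_int": data[7],
        "sizeof_size_t": data[8],
        "sizeof_instruction": data[9],
        "sizeof_lua_number": data[10],
        "integral_numbers": data[11] == 1,
    }
```

Applied to the first twelve escapes of the string in the question, this reports a Lua 5.1 chunk, little-endian, with 4-byte int, size_t and instructions and 8-byte numbers. That tells you which version of luac or which decompiler to use.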

Document preparation in Isabelle

I want to use isabelle build -D xxx to produce a LaTeX .tex file out of an Isabelle .thy file.
But Isabelle checks all the theory dependencies, and all the related .thy files must be involved.
Is it possible to simply use a .thy file that has syntax errors to produce a .tex file? In fact I only need a part of it to write a paper.
Does that mean you want to write a paper based on a faulty or incomplete formal theory?
The Isabelle document preparation system was intended to publish formal theories that actually work out, with nice typography so that this does not look like "code". So all the defaults are for producing LaTeX from well-formed and checked theories.
Nonetheless, there are numerous ways to get unofficial LaTeX output from the system. A very basic mechanism is the latex print mode. Various diagnostic commands of Isabelle allow such print mode specifications in round parentheses, e.g. like this:
thm (latex) exI exE
or
print_statement (latex) exI exE
You can do this interactively and copy-paste the output into your raw tex file. You need to ensure that it gets proper surroundings with environments from the isabelle.sty file.
To the best of my knowledge, no. The LaTeX generation requires the file to be processed successfully, e.g. due to notation (latex) commands, and due to antiquotations.
If you only need parts of your file, simply copy’n’paste it from the generated .tex file or, if you want something more automated, have a look at the Generate TeX Snippets wiki page.

Decompiling an old Program

I have been asked to update a program written in 1987 in Delphi (I guess). I have no documentation for this program, only a few side notes the programmer took, which don't make much sense.
The CD contains these files:
Size | Filename
19956 | VP.DTA
142300 | VP.LEX
404 | VP.NDX
126502 | VP.RCS
131016 | VP.SCR
150067 | VP.XEL
101791 | vp.exe
Is any one of these files a database? If so, can I access its data?
I tried several code decompilers but they show a message saying it was not a Win32 compatible application.
The program runs in MS-DOS.
Is it possible to obtain the source code? Can I use this code in any way to build a new application?
Update01: I can run the program in MS-DOS. The program conjugates verbs and shows an example sentence where the verb can be used. The GUI is a little bit confusing and there is no help menu, so I can't see all the capabilities of the program.
Update02: In conversation with the owner of the program, we found another solution. He asked me if it was possible to host the program on a server so that clients could log in with a username and a password and run the program in a terminal. I have an account on my university's server, which I can access through ssh and use to compile and run C programs. The server runs Linux, so I couldn't try the program on it. If I set up a Windows server, can multiple people access and run the program in a terminal? The program is an .exe. Doesn't this raise some security issues?
Delphi is from the mid-nineties, so that probably means Delphi's ancestor Turbo Pascal, not Delphi.
Some extensions sound familiar, as shortened versions of words:
ndx = index
dta = data
scr = screen (?)
lex = lexicon (list of words or deduped strings in general) (?)
Screen files were sometimes used for e.g. help screens, a medieval form of help files; they are typically ANSI screens that can be loaded directly into screen memory.
There is a fair chance that this is something handcrafted, especially if the date of 1987 and the general assumption "Pascal" are true, and not generated by some known database package at all.
Reverse engineering the file format might be more worthwhile than trying to reverse engineer the app.
A good start would be to run the Unix "file" command to see if it can recognize the file types. (The file command searches for signatures inside files, and there are Windows ports. I use Cygwin's.)
A developer experienced in such matters can also see a lot from a hexdump (especially the first parts of a file).
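If no hexdump tool is at hand, the first bytes of each file can be inspected with a few lines of Python. This is a minimal sketch of a generic hexdump, not a format detector; the function name and the 16-byte default width are arbitrary choices.

```python
def hexdump(data, width=16):
    """Format bytes as lines of: offset, hex bytes, printable ASCII (dots for the rest)."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        text_part = "".join(chr(byte) if 32 <= byte < 127 else "." for byte in chunk)
        # Pad the hex column so the ASCII column lines up on a short final row.
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


# Example use on one of the files listed in the question (path assumed):
#   with open("VP.NDX", "rb") as handle:
#       print(hexdump(handle.read(256)))
```

Readable words in the ASCII column of VP.LEX, or fixed-size repeating records in VP.DTA, would support the guess that these are handcrafted lexicon and data files rather than the output of a database package.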
Is it possible to obtain the source code?
Probably not. You may want to look at a disassembler like IDA Pro, possibly combined with a decompiler such as Hex-Rays, which turns the disassembly into C-like pseudocode.
Do you know what the application is supposed to be?
If it's MS-DOS, you're probably better off just drawing up new requirements and doing new development.
Look for DeDe to reverse engineer a Delphi-compiled program. But as far as I know, Delphi is a real compiler, so there is no way to decompile it. If you are able to read assembler code, you can try to decompile it by hand. Clipper and FoxPro (DOS version) are another story, because they are not real compilers.
This is definitely not Delphi. It might be one of the database-centric languages like Clipper. .SCR probably means "screen" and defines I/O masks. .NDX is a table index and .DTA means "data".
If it is Clipper, you might actually be lucky, because as far as I remember these programs were p-code, so it could be possible to decompile it.
It looks like Clipper (NDX and SCR). If you have a DBF file then it's Clipper for sure. But some people renamed the DBF to something like DAT. If it is Clipper, I believe there was a decompiler named Valkyrie.

How to print Smalltalk code from Pharo/Squeak?

What is the best way to print - syntax colored and well formatted - code from Pharo/Squeak on paper?
1) Is there a way to print directly from within Pharo/Squeak? (I use it on Mac OS X.)
2) Is there a way to export syntax colored, well formatted code from Pharo/Squeak?
3) Are there external tools to color and format a filed out piece of code?
For the appendix in my master's thesis I used the Pier CMS-to-LaTeX converter in the Pier-Documentation package. However, this plugin only takes class comments and method comments into consideration; it does not print the source code. Pier also provides a package ShoutPier for syntax highlighting of Smalltalk code, so I guess it would require little work to bring the two together. You can find the mentioned extension packages at http://source.lukas-renggli.ch/pieraddons.html.
Pharo browsers seem to use syntax highlighting.
What difficulty are you having reading Smalltalk code using the browsers and senders/implementors?
Edit: Would something that produces UML give the overview you're looking for? The Dandelion website only shows downloads for old Squeak versions - I don't know if they would work with Pharo.
And perhaps this GSoC project "Generate UML diagrams from Smalltalk code for Pharo" suggests not.
Here's how I did it on my Mac; I think this should work on other platforms too.
1) Save your categories to a local Monticello folder on your disk. See the Pharo manual on how to do this: http://book.pharo-project.org/book/PharoTools/Monticello/?_s=hdGOLc_FXsvVY1iR&_k=YYH-Ln8f5mtWZ8z2&_n&148
2) Browse to this folder and unzip the .mcz file.
3) You'll see all your code in the snapshot/source.st file.
4) You'll need to edit this a bit, e.g. to remove the ! characters. There might be a tool to do this?
-Eric.
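The ! characters in source.st come from the Smalltalk chunk format, where a single ! ends a chunk and a doubled !! stands for a literal exclamation mark. As a rough alternative to editing by hand, here is a minimal Python sketch that splits such a file into chunks. It assumes only that quoting rule and nothing else about the file; the function name split_chunks is made up for illustration.

```python
def split_chunks(text):
    """Split chunk-format Smalltalk source into chunks, undoing the !! escape."""
    chunks = []
    current = []
    index = 0
    while index < len(text):
        char = text[index]
        if char == "!":
            if index + 1 < len(text) and text[index + 1] == "!":
                # A doubled bang is an escaped literal "!".
                current.append("!")
                index += 2
                continue
            # A single bang terminates the current chunk.
            chunks.append("".join(current).strip())
            current = []
        else:
            current.append(char)
        index += 1
    chunks.append("".join(current).strip())
    # Drop the empty chunks produced by sequences such as "! !".
    return [chunk for chunk in chunks if chunk]
```

Each returned chunk is then either a header such as a methodsFor: line or a method body, which can be written out in whatever layout suits printing.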
There is the webdoc project, which allows you to navigate code in a web browser:
http://ss3.gemstone.com/ss/webdoc.html
(And of course you can print code from your favorite web browser.)
1) Install Shout from www.squeaksource.com.
2) I don't know. Maybe you can customize Shout.
3) In GNU Smalltalk you have a Smalltalk mode for Emacs. But I am not sure I understand what you are looking for.

llvm-clang: incremental or online parser?

Is there any way to use the llvm-clang parser in an incremental/online manner?
Say I'm writing an editor and I want to be able to parse the C++ code I have in front of me.
I don't want to write my own hacked up parser.
I'd like to use something full featured, like llvm-clang.
Is there an easy way to hijack the llvm-clang parser? (And is it fast enough to run it continuously in the background)?
Thanks!
I don't think clang can incrementally parse C++ files, but it's one of the project's goals: http://clang.llvm.org/features.html
I've written something similar for my final year project. It wasn't a C++ editor but a Visual Studio plugin, whose main task was improving C++ IntelliSense (like Visual Assist X).
When I was writing this project I also thought about a C++ incremental parser, but I didn't find any suitable solution. To solve the C++ IntelliSense problem I used the normal C++ parser from GCC. However, it was too slow to parse the file after each code completion request (ctrl+space); just try including boost::spirit. To make this project work properly, I parsed files in the background, and after each code completion request I compared the current file with its previous version (via diff) to detect changes made since the last parse. Having those changes, I updated the syntax tree, mostly by adding or removing variables.
Besides incremental parsing, there is another problem with projects like this. Mostly you'll be parsing C++ code which is being edited, so it's invalid code. Given the complex C++ grammar, sometimes the parser won't be able to recover from syntax errors, so it won't correctly detect some symbols in the code.
Another issue is the differences between C++ parsers / compilers. Let's say I'm working in Visual Studio and I have used some VC++ compiler-specific construction in my code. The Clang parser won't be able to parse it correctly.
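The "compare with the previous version via diff" step can be sketched with Python's standard difflib. This is only an illustration of the idea, not the plugin's actual code. It works on whole lines, and the function name changed_line_ranges is hypothetical.

```python
import difflib


def changed_line_ranges(old_text, new_text):
    """Return the non-equal diff opcodes between two versions of a file.

    Each entry is (tag, old_start, old_end, new_start, new_end) with
    0-based, end-exclusive line indices; tag is 'replace', 'delete' or 'insert'.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    return [opcode for opcode in matcher.get_opcodes() if opcode[0] != "equal"]
```

A background parser could keep the syntax tree from the last full parse and, on each completion request, revisit only the declarations whose lines fall inside the reported ranges.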
For writing something similar to IntelliSense, I would advise you to write your own parser using the LALR parsing algorithm, since you can save its state at each line. That way you don't have to reparse the whole file when it has been edited, which is very fast!
Note that C++ can't be fully expressed in BNF, but I think you could get pretty far with some adjustments. It's of course a lot more work than using Clang's frontend, but you could still use Clang for analysing header files in coöperation with your own parser.

Resources