console output formatting - stdout

Are there any conventions for formatting console output from a command line app for readability and consistency? For instance, do you indent sub-information, when do you print a blank line, if ever, how should you accent important statements.
I've found output can quickly degenerate into a chaotic blur. I'm interested in hearing about what other people do.
Update: Really this is for embedded software which spits debug status out a terminal, but it's pretty much like a console app, and I figured everyone would be more familiar with that. Thanks so far.

I'd differentiate two kinds of programs:
Do you print information that might be used by a script (i.e. it should be parseable)? Then define a pretty strict format and use only that (for example fixed field separators).
Do you print information that need not be parsed by a script (or is there an alternative script-parseable format already)? Then write what comes natural:
My suggestions:
write it so that you would like to read it
indent sub-information 2 or 4 spaces, definitely not more
separate blocks of information by one empty line at most
respect the COLUMN environment variable (and possible ROWS if it applies to your output).

If this is for a *nix environment, then I'd recommend reading Basics of Unix Philosophy. It's not specific to output but there are some good guidelines for command line programs in general.
Expect the output of every program to become the input to another, as yet unknown, program. Don't clutter output with extraneous information. Avoid stringently columnar or binary input formats. Don't insist on interactive input.

Related

Erlang and Elixir's colorful REPL shells

How does Learn some Erlang or IEx colorize the REPL shell? Is kjell a stable drop-in replacement?
The way this is done in LYSE is to use a javascript plugin called highlight.js, so LYSE isn't actually doing it, your browser is. There are plugins/modes for most mainstream(ish) languages available for highlight.js. If the web is what you are interested in, this is one way to do it (except for when a user can't use JS or has it turned off).
This isn't actually the shell being highlighted at all, nor is it useful anywhere outside of browsers. I've been messing around with a way to do this more generically, initially by inserting static formatting in HTML and XML documents (feed it a document, and it outputs one with Erlang syntax highlighted a certain way whenever this is detected/tagged). I don't yet have a decent project to publish for this (very low on my priority list atm), but I can point you in the direction of some solid inspiration: the source for wx:demo.
Pay particular attention to the function demo:code_area/1. There you will see how the tokenization routines are used to provide highlight hints for the source code text display area in the wx:demo application. This can provide a solid foundation to build your own source highlighting/display utility. (I think it wouldn't be impossible, considering every terminal in common use today responds correctly to ANSI color codes, to write a plugin to the shell that highlights terminal input directly -- not that there is a big clamor for this feature at the moment.)
EDIT (Prompted by a comment by Fred the Magic Wonder Dog)
On the subject of ANSI color codes, if this is what you are actually after, they are easy to implement as a prepend to any string value you are returning within a terminal. The terminal escapes them, so you won't see the characters, but will perform whatever action the code represents. There is no termination (its not like a markup tag that encloses the text) and typically no concept of "default color to go back to" (though there are a gajillion-jillion extensions to telnet and terminal modes that enable all sorts of nonsense like this).
An example of basic colorization is the telcon:greet/0 and telcon:sys_help/0 functions in the v0.1 code of ErlMUD (along with a slew of other places -- colorization in games is sort of a thing). What you see there is a pre-built list per color, but this could be represented any way that would get those values at the front of the string. (I just happened to remember the code value sequences, but didn't remember the characters that make them up; the next version of the code represents this somewhat differently.) Here is a link to a list of ANSI color codes and a discussion about colorizing the shell. Play around! Its nerdy fun, 1980's style!
Oh, I almost forgot... if you really want to go down the rabbit hole without silly little child toys like ncurses to help you, take a look at termcap.
I don't know if kjell is a stable drop-in replacement for Erl but it wouldn't be for IEx.
As far as how the colors are done; to the best of my knowledge it's done with ANSI Escape Sequences.

Determine Cobol coding style

I'm developing an application that parses Cobol programs. In these programs some respect the traditional coding style (programm text from column 8 to 72), and some are newer and don't follow this style.
In my application I need to determine the coding style in order to know if I should parse content after column 72.
I've been able to determine if the program start at column 1 or 8, but prog that start at column 1 can also follow the rule of comments after column 72.
So I'm trying to find rules that will allow me to determine if texts after column 72 are comments or valid code.
I've find some but it's hard to tell if it will work everytime :
dot after column 72, determine the end of sentence but I fear that dot can be in comments too
find the close character of a statement after column 72 : " ' ) }
look for char at columns 71 - 72 - 73, if there is not space then find the whole word, and check if it's a key word or a var. Problem, it can be a var from a COPY or a replacement etc...
I'd like to know what do you think of these rules and if you have any ideas to help me determine the coding style of a Cobol program.
I don't need an API or something just solid rules that I will be able to rely on.
I think you need to know the COBOL compiler for each program. Its documentation should tell you what conventions/configurations/switches it uses to decide if the source code ends at column 72 or not.
So.... which compiler(s)?
And if you think the column 72 issue is a pain, wait till you get around to actually parsing the COBOL itself. If you are not well prepared to handle the lexical issues of the language, you are probably very badly prepared to handle the syntactic ones.
There is no absolutely reliable way to determine if a COBOL program
is in fixed or free format based only on the source code. Heck it is sometimes difficult to identify
the programming language based only on source code. Check out
this classic polyglot - it is valid under 8 different language compilers. That
said, you could try a few heuristics that might yield
the correct answer more often than not.
Compiler directives imbedded in source code
Watch for certain compiler directives that determine code format.
Unfortunately, every compiler vendor uses their own flavour of directive.
For example, Microfocus COBOL uses the
SOURCEFORMAT directive. This directive will appear near the top of the program so a short pre-scan
could be used to find it. On the other hand, OpenCobol uses >>SOURCE FORMAT IS FREE and
>>SOURCE FORMAT IS FIXED to toggle between free and fixed format, different parts of the same program
could be formatted differently!
The bottom line here is that you will have to support the conventions of multiple COBOL compilers.
Compiler switches
Source code format can be also be specified using a compiler switch. In this case, there are no concrete
clues to go on. However, you can be reasonably sure that the entire source program will be either
fixed or free. All you can do here is guess. Unless the programmer is out to "mess with
your head" (and some will), a program in free format will have the keywords IDENTIFICATION DIVISION or ID DIVISION, starting before column 8.
Every COBOL program will begin with these keywords so you can use them as the anchor point for determining code format in the
absence of imbedded compiler directives.
Warning - this is far from fool proof, but might be a good start.
There won't be an algorithm to do this with 100% certainty, because if comments can be anything, they can also be compilable COBOL code. So you could theoretically write a program that means one thing if the comments are ignored, and something else entirely if the comments are treated as part of the COBOL.
But that's extremely unlikely. What's most likely to happen is that if you try to compile the code under the wrong convention, it will simply fail. So the only accurate way to do this is to try compiling/parsing the program one way, and if you come to a line that can't make sense, switch to the other style. You could also support passing an argument to the compiler when the style is already known.
You can try using heuristics like what you've described, but that will never be totally accurate. The most they can give you is a probability that the code is one or the other style, which will increase as they examine more and more lines of code. They could be useful for helping you guess the style before you start compiling, or for figuring out when the problem is really just a typo in the code.
EDIT:
Regarding ideas for heuristics, it's hard to say. If there were a standard comment sigil like // or # in other languages, this would be a lot easier (actually, there is, but it sounds like your code doesn't follow this convention). The only thing I can think of would be to check whether every line (or maybe 99% of lines, and not counting empty lines or lines commented with *) has a period somewhere before position 72.
One thing you DON'T want to do is apply any heuristics to the part after position 72. That is, you don't want to be checking the comments to see if they're valid COBOL. You want to check what you know is COBOL first, and see if that works by itself. There are several reasons for this:
Comments written in English are likely to have periods and quotes in them, so your first and second bullet points are out.
Natural languages are WAY harder to parse than something like COBOL.
The comments could easily have COBOL in them (maybe someone commented out the previous version of the line).
An important rule for comments is that they should never affect what the program does. If changing the comments can change how the program is compiled, you violate that.
All that in mind, my opinion is that you shouldn't use heuristics at all. You should always try to compile the program under both conventions unless one is explicitly specified. There's a chance that code will compile successfully under both conventions, and then you'll have two different programs and no way to tell which one is correct.
If that happens, you need to compare the two results (perhaps with a hash or something) to see if they're the same program. If they're the same, great, but if not, you'll need to force the user to explicitly choose a convention.
Most COBOL compilers will allow you to generate and analyze the post text manipulation phase.
The text preprocessor output can be seen (using OpenCOBOL for the example)
cobc -E program.cob
The text manipulation processor deals with any COPY ... REPLACING compiler directives, as well as converting SOURCE FORMAT IS FIXED (with line continuations, string literal concatenations, comment line removal, among other things) to the actual free format that the compiler lexical analyzer needs. A lot of the OpenCOBOL toolkits (Cross referencer and Animator, to name two) use source code AFTER the preprocessor pass. I don't think you'll lose any street cred if your parser program relies on post processed source code files.

Tex command which affects the next complete word

Is it possible to have a TeX command which will take the whole next word (or the next letters up to but not including the next punctuation symbol) as an argument and not only the next letter or {} group?
I’d like to have a \caps command on certain acronyms but don’t want to type curly brackets over and over.
First of all create your command, for example
\def\capsimpl#1{{\sc #1}}% Your main macro
The solution to catch a space or punctuation:
\catcode`\#=11
\def\addtopunct#1{\expandafter\let\csname punct#\meaning#1\endcsname\let}
\addtopunct{ }
\addtopunct{.} \addtopunct{,} \addtopunct{?}
\addtopunct{!} \addtopunct{;} \addtopunct{:}
\newtoks\capsarg
\def\caps{\capsarg{}\futurelet\punctlet\capsx}
\def\capsx{\expandafter\ifx\csname punct#\meaning\punctlet\endcsname\let
\expandafter\capsend
\else \expandafter\continuecaps\fi}
\def\capsend{\expandafter\capsimpl\expandafter{\the\capsarg}}
\def\continuecaps#1{\capsarg=\expandafter{\the\capsarg#1}\futurelet\punctlet\capsx}
\catcode`\#=12
#Debilski - I wrote something similar to your active * code for the acronyms in my thesis. I activated < and then \def<#1> to print the acronym, as well as the expansion if it's the first time it's encountered. I also went a bit off the deep end by allowing defining the expansions in-line and using the .aux files to send the expansions "back in time" if they're used before they're declared, or to report errors if an acronym is never declared.
Overall, it seemed like it would be a good idea at the time - I rarely needed < to be catcode 12 in my actual text (since all my macros were in a separate .sty file), and I made it behave in math mode, so I couldn't foresee any difficulties. But boy was it brittle... I don't know how many times I accidentally broke my build by changing something seemingly unrelated. So all that to say, be very careful activating characters that are even remotely commonly-used.
On the other hand, with XeTeX and higher unicode characters, it's probably a lot safer, and there are generally easy ways to type these extra characters, such as making a multi (or compose) key (I usually map either numlock or one of the windows keys to this), so that e.g. multi-!-! produces ¡). Or if you're running in emacs, you can use C-\ to switch into TeX input mode briefly to insert unicode by typing the TeX command for it (though this is a pain for actually typing TeX documents, since it intercepts your actual \'s, and please please don't try defining your own escape character!)
Regarding whitespace after commands: see package xspace, and TeX FAQ item Commands gobble following space.
Now why this is very difficult: as you noted yourself, things like that can only be done by changing catcodes, it seems. Catcodes are assigned to characters when TeX reads them, and TeX reads one line at a time, so you can not do anything with other spaces on the same line, IMHO. There might be a way around this, but I do not see it.
Dangerous code below!
This code will do what you want only at the end of the line, so if what you want is more "fluent" typing without brackets, but you are willing to hit 'return' after each acronym (and not run any auto-indent later), you can use this:
\def\caps{\begingroup\catcode`^^20 =11\mcaps}
\def\mcaps#1{\def\next##1 {\sc #1##1\catcode`^^20 =10\endgroup\ }\next}
One solution might be setting another character as active and using this one for escaping. This does not remove the need for a closing character but avoids typing the \caps macro, thus making it overall easier to type.
Therefore under very special circumstances, the following works.
\catcode`\*=\active
\def*#1*{\textsc{\MakeTextLowercase{#1}}}
Now follows an *Acronym*.
Unfortunately, this makes uses of \section*{} impossible without additional macro definitions.
In Xetex, it seems to be possible to exploit unicode characters for this, so one could define
\catcode`\•=\active
\def•#1•{\textsc{\MakeTextLowercase{#1}}}
Now follows an •Acronym•.
Which should reduce the effects on other commands but of course needs to have the character ‘•’ mapped to the keyboard somewhere to be of use.

Define every symbol as a command in LaTeX

I'm working on a large project involving multiple documents typeset in LaTeX. I want to be consistent in my use of symbols, so it might be a nice idea to define a command for every symbol that has a specific meaning throughout the project. Does anyone have any experience with this? Are there issues I should pay attention to?
A little more specific. Say that, throughout the document I would denote something called permability by a script P, would it be an idea to define
\providecommand{\permeability}{\mathscr{P}}
or would this be more like the case "defining a command for $n$"?
A few tips:
Using \providecommand will define that command only if it's not been previously defined. So if you're not getting the results you expected, you may be trying to define a command that's been defined elsewhere.
If you wrap the math in your commands with \ensuremath, it will do the right thing regardless of whether you're in math mode when you issue the command:
\providecommand{\permeability}{\ensuremath{\mathscr{P}}}
Now I can easily use \permeability in text or $\permeability$ in math mode.
Using your own commands allows you to easily change the typographical representation of something later. For instance:
\newcommand{\vect}[1]{\ensuremath{\mathbf{#1}}}
would print \vect{x} as a boldfaced x. If you later decide you prefer arrows above your vectors, you could change the command to:
\newcommand{\vect}[1]{\ensuremath{\vec{#1}}}
I have been doing this for anything that has a specific meaning and is longer than a single symbol, mostly to save typing:
\newcommand{\objId}{\mbox{$\mathit{objId}$}\xspace}
\newcommand{\insOp}[1]{#1\mbox{$^+$}\xspace}
\newcommand{\delOp}[1]{#1\mbox{$^-$}\xspace}
However then I noticed that I stopped making inconsistency errors (objId vs ObjId vs ObjID), so I agree that it is a nice idea.
However I am not sure if it is a good idea in case symbols in the output are, well, single Latin symbols, as in:
\newcommand{\numOfObjs}{$n$}
It is too easy to type a single symbol and forget about it even though a command was defined for it.
EDIT: using your example IMHO it'd be a good idea to define \permeability because it is more than a single P that you have to type in without the command. But it's a close call.

Is it possible to write own "packages" for LaTeX?

As a programmer, I wonder if I could create my own package for LaTeX. I need something like that famous "listings" package, but something that is much more capable for my needs. I look for a listings solution that would watch out for a comment line like
// BEGIN LISTING 3122
// END LISTING 3122
No syntax highlighting, but intelligent support for tab indents. That package then would be used with a file name or path, walk through the lines and copy out just the snippets of interest.
I am 100% sure there is absolutely nothing like this on the market. So I want to program it for LaTeX. If that's possible. I have no idea how and what programming language / IDE. Where would I start looking?
This is certainly possible, but it is non-trivial in the TeX programming language. I don't have time to code it up at the moment but here's an algorithm; I suggest asking on comp.text.tex for more specific LaTeX programming advice.
Locally set all catcodes of special chars to "other" (\dospecials) and start reading in the input file line by line (\read)
for each line compare the first however many tokens of the line (some iterative use of \if or \ifx; there might be a package to make this easier such as stringstrings or xstring)
in the default state throw away the input line and read the next
unless it's // BEGIN LISTING, in which case save each line into a macro (something like \g#addto#macro)
until it's // END LISTING, obviously
keep going until the end of the file (\ifeof)
TeX by Topic is a good reference guide for this sort of work.
The rather simple texments package shows how code can be piped into pdflatex: by writing your shell-invocable filter, you should be able to do something similar with your idea.
I'm pretty certain you can't do this in LaTeX. Basically you can go nuts with anything that's either a command (\foo) or an environment (\begin{foo} ... \end{foo}) but not in the way you are describing here. Within environments or commands it is possible to turn off LaTeX's processing and handle everything yourself in some way. This is how verbatim and listings work. It's not very pretty though.
Basically, I think it might be possible, if you make ‘/’ an active character (there is \makeactive for example but I imagine there are more solutions) and then invent some good magic around it. (You will need to emulate/create an environment with your logic.) Similar things are done in some of the internationalisation packages in order to ease the input of letters with diacritics.
For a character like ‘/’ this might be even harder as this one could have been written in other places of your text, too. So of course you’d have to take special care for that.

Resources