Disclaimer: I kept this because some of it may be useful to others; however, it does not solve what I initially set out to do.
Right now, I'm trying to solve the following:
Given something like {a, B, {c, D}} I want to scan through Erlang forms given to parse_transform/2 and find each use of the send operator (!). Then I want to check the message being sent and determine whether it would fit the pattern {a, B, {c, D}}.
Therefore, consider finding the following form:
{op,17,'!',
{var,17,'Pid'},
{tuple,17,[{atom,17,a},{integer,17,5},{var,17,'SomeVar'}]}}
Since the message being sent is:
{tuple,17,[{atom,17,a},{integer,17,5},{var,17,'SomeVar'}]}
which is an encoding of {a, 5, SomeVar}, this would match the original pattern of {a, B, {c, D}}.
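The scanning step itself can be sketched with erl_syntax_lib:fold/3, which visits every node of a form. This is only a minimal sketch: the module name send_scan_sketch and the helper names send_messages/1 and collect/2 are made up for illustration, and it assumes the forms are plain erl_parse forms such as those handed to parse_transform/2.

```erlang
-module(send_scan_sketch).
-export([send_messages/1, demo/0]).

%% Collect the right-hand side (the message) of every `Pid ! Msg`
%% expression found anywhere in the given list of forms.
send_messages(Forms) ->
    lists:flatmap(
      fun(Form) -> erl_syntax_lib:fold(fun collect/2, [], Form) end,
      Forms).

%% Called once per syntax node; keeps the message of each send expression.
collect(Node, Acc) ->
    case erl_syntax:type(Node) of
        infix_expr ->
            Op = erl_syntax:infix_expr_operator(Node),
            case erl_syntax:operator_name(Op) of
                '!' -> [erl_syntax:infix_expr_right(Node) | Acc];
                _   -> Acc
            end;
        _ ->
            Acc
    end.

%% Parses a small function and returns the sorted node types of the
%% messages it sends.
demo() ->
    Src = "f(Pid, SomeVar) -> Pid ! {a, 5, SomeVar}, Pid ! hello.",
    {ok, Tokens, _} = erl_scan:string(Src),
    {ok, Form} = erl_parse:parse_form(Tokens),
    Msgs = send_messages([Form]),
    lists:sort([erl_syntax:type(M) || M <- Msgs]).
```

Each collected message is still an abstract form, so it can be inspected further with erl_syntax:type/1 and friends.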
I'm not exactly sure how I'm going to go about this but do you know of any API functions which could help?
Turning the given {a, B, {c, D}} into a form is possible by first substituting the variables with something, e.g. strings (and taking a note of this), else they'll be unbound, and then using:
> erl_syntax:revert(erl_syntax:abstract({a, "B", {c, "D"}})).
{tuple,0,
[{atom,0,a},
{string,0,"B"},
{tuple,0,[{atom,0,c},{string,0,"D"}]}]}
I was thinking that after getting them in the same format like this, I could analyze them together:
> erl_syntax:type({tuple,0,[{atom,0,a},{string,0,"B"},{tuple,0,[{atom,0,c},{string,0,"D"}]}]}).
tuple
%% check whether send argument is also a tuple.
%% then, since it's a tuple, use erl_syntax:tuple_elements/1 and keep comparing in this way, matching anything when you come across a string which was a variable...
I think I'll end up missing something out (and for example recognizing some things but not others ... even though they should have matched).
Are there any API functions which I could use to ease this task? And as for a pattern match test operator or something along those lines, that does not exist, right? (I.e. it was only suggested here: http://erlang.org/pipermail/erlang-questions/2007-December/031449.html.)
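To make the comparison idea concrete, here is a rough sketch of a conservative "could these two forms ever match?" check that works on the abstract forms directly, without the string substitution. The names may_match/2, shape/1 and parse/1 are hypothetical, and the sketch makes simplifying assumptions: a variable on either side is a wildcard, repeated variables are not tracked, lists are not compared element by element, and anything it does not understand (function calls, arithmetic, ...) is assumed to possibly match.

```erlang
-module(may_match_sketch).
-export([may_match/2, demo/0]).

%% Reduce a syntax node to a coarse "shape" used for comparison.
shape(Node) ->
    case erl_syntax:type(Node) of
        variable   -> wildcard;
        underscore -> wildcard;
        tuple      -> tuple;
        atom       -> atom;
        integer    -> integer;
        char       -> integer;   % $a is just an integer
        float      -> float;
        list       -> list;
        nil        -> list;
        string     -> list;      % "abc" is a list of integers
        _          -> unknown    % calls, operators, etc.
    end.

%% Returns false only when the two forms can never denote matching terms.
may_match(Pattern, Msg) ->
    case {shape(Pattern), shape(Msg)} of
        {wildcard, _} -> true;
        {_, wildcard} -> true;
        {unknown, _}  -> true;
        {_, unknown}  -> true;
        {tuple, tuple} ->
            Ps = erl_syntax:tuple_elements(Pattern),
            Ms = erl_syntax:tuple_elements(Msg),
            length(Ps) =:= length(Ms) andalso
                lists:all(fun({P, M}) -> may_match(P, M) end,
                          lists:zip(Ps, Ms));
        {list, list} -> true;    % not analysed further in this sketch
        {Same, Same} ->
            erl_syntax:concrete(Pattern) =:= erl_syntax:concrete(Msg);
        {_, _} -> false
    end.

%% Helper for the demo: turn source text into a single expression form.
parse(Str) ->
    {ok, Tokens, _} = erl_scan:string(Str ++ "."),
    {ok, [Form]} = erl_parse:parse_exprs(Tokens),
    Form.

demo() ->
    P = parse("{a, B, {c, D}}"),
    [may_match(P, parse(S))
     || S <- ["{a, 5, SomeVar}", "{a, X, {c, 50}}", "Msg",
              "{a, b}", "50", "[6, 4, 3]"]].
```

Because unknown constructs are treated as possible matches, such a check can only rule sends out; it never proves that a send will match at runtime.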
Edit: (Explaining things from the beginning this time)
Using erl_types as Daniel suggests below is probably doable if you play around with the erl_type() returned by t_from_term/1. Since t_from_term/1 takes a term with no free variables, you'd have to change something like {a, B, {c, D}} into {a, '_', {c, '_'}} (i.e. fill the variables), call t_from_term/1, and then go through the returned data structure and change the '_' atoms to variables using the module's t_var/1 or something similar.
Before explaining how I ended up going about it, let me state the problem a bit better.
Problem
I'm working on a pet project (ErlAOP extension) which I'll be hosting on SourceForge when ready. Basically, another project already exists (ErlAOP) through which one can inject code before/after/around/etc... function calls (see doc if interested).
I wanted to extend this to support injection of code at the send/receive level (because of another project). I've already done this but before hosting the project, I'd like to make some improvements.
Currently, my implementation simply finds each use of the send operator or receive expression and injects a function before/after/around (receive expressions have a little gotcha because of tail recursion). Let's call this function dmfun (dynamic match function).
The user will be specifying that when a message of the form e.g. {a, B, {c, D}} is being sent, then the function do_something/1 should be evaluated before the sending takes place. Therefore, the current implementation injects dmfun before each use of the send op in the source code. Dmfun would then have something like:
case Arg of
    {a, B, {c, D}} -> do_something(Arg);
    _ -> continue
end
where Arg can simply be passed to dmfun/1 because you have access to the forms generated from the source code.
So the problem is that any send operator will have dmfun/1 injected before it (and the send op's message passed as a parameter). But when sending messages like 50, {a, b}, [6, 4, 3] etc... these messages will certainly not match {a, B, {c, D}}, so injecting dmfun/1 at sends with these messages is a waste.
I want to be able to pick out plausible send operations like e.g. Pid ! {a, 5, SomeVar}, or Pid ! {a, X, SomeVar}. In both of these cases, it makes sense to inject dmfun/1 because if at runtime, SomeVar = {c, 50}, then the user supplied do_something/1 should be evaluated (but if SomeVar = 50, then it should not, because we're interested in {a, B, {c, D}} and 50 does not match {c, D}).
I wrote the following prematurely. It doesn't solve the problem I had. I ended up not including this feature. I left the explanation anyway, but if it were up to me, I'd delete this post entirely... I was still experimenting and I don't think what there is here will be of any use to anyone.
Before the explanation, let:
msg_format = the user supplied message format which will determine which messages being sent/received are interesting (e.g. {a, B, {c, D}}).
msg = the actual message being sent in the source code (e.g. Pid ! {a, X, Y}).
I gave the explanation below in a previous edit, but later found out that it wouldn't match some things it should. E.g. when msg_format = {a, B, {c, D}}, msg = {a, 5, SomeVar} wouldn't match when it should (by "match" I mean that dmfun/1 should be injected).
Let's call the "algorithm" outlined below Alg. The approach I took was to execute both Alg(msg_format, msg) and Alg(msg, msg_format). The explanation below only goes through one of these; the other repeats the same steps with a different matching function (matching_fun(msg_format) instead of matching_fun(msg)). dmfun/1 is injected only if at least one of Alg(msg_format, msg) or Alg(msg, msg_format) returns true, so that it ends up injected wherever the desired message can actually be generated at runtime.
1. Take the message form you find in the [Forms] given to parse_transform/2, e.g. let's say you find: {op,24,'!',{var,24,'Pid'},{tuple,24,[{atom,24,a},{var,24,'B'},{var,24,'C'}]}}
So you would take {tuple,24,[{atom,24,a},{var,24,'B'},{var,24,'C'}]}, which is the message being sent (bind it to Msg).
2. Do fill_vars(Msg) where:
    -define(VARIABLE_FILLER, "_").
    -spec fill_vars(erl_parse:abstract_form()) -> erl_parse:abstract_form().
    %% @doc This function takes an abstract_form() and replaces all {var, LineNum, Variable} forms with
    %% {string, LineNum, ?VARIABLE_FILLER}.
    fill_vars(Form) ->
        erl_syntax:revert(
            erl_syntax_lib:map(
                fun(DeltaTree) ->
                    case erl_syntax:type(DeltaTree) of
                        variable ->
                            erl_syntax:string(?VARIABLE_FILLER);
                        _ ->
                            DeltaTree
                    end
                end,
                Form)).
3. Do form_to_term/1 on 2's output, where:
    form_to_term(Form) -> element(2, erl_eval:exprs([Form], [])).
4. Do term_to_str/1 on 3's output, where:
    -define(inject_str(FormatStr, TermList), lists:flatten(io_lib:format(FormatStr, TermList))).
    term_to_str(Term) -> ?inject_str("~p", [Term]).
5. Do gsub(v(4), "\"_\"", "_"), where v(4) is 4's output and gsub is: (taken from here)
    gsub(Str, Old, New) ->
        RegExp = "\\Q" ++ Old ++ "\\E",
        re:replace(Str, RegExp, New, [global, multiline, {return, list}]).
6. Bind a variable (e.g. M) to matching_fun(v(5)), where:
    matching_fun(StrPattern) ->
        form_to_term(
            str_to_form(
                ?inject_str(
                    "fun(MsgFormat) ->
                         case MsgFormat of
                             ~s ->
                                 true;
                             _ ->
                                 false
                         end
                     end.", [StrPattern])
            )
        ).
    str_to_form(MsgFStr) ->
        {_, Tokens, _} = erl_scan:string(end_with_period(MsgFStr)),
        {_, Exprs} = erl_parse:parse_exprs(Tokens),
        hd(Exprs).
    end_with_period(String) ->
        case lists:last(String) of
            $. -> String;
            _ -> String ++ "."
        end.
7. Finally, take the user supplied message format (which is given as a string), e.g. MsgFormat = "{a, B, {c, D}}", and do: MsgFormatTerm = form_to_term(fill_vars(str_to_form(MsgFormat))). Then you can call M(MsgFormatTerm).
e.g. with user supplied message format = {a, B, {c, D}}, and Pid ! {a, B, C} found in code:
2> weaver_ext:fill_vars({tuple,24,[{atom,24,a},{var,24,'B'},{var,24,'C'}]}).
{tuple,24,[{atom,24,a},{string,0,"_"},{string,0,"_"}]}
3> weaver_ext:form_to_term(v(2)).
{a,"_","_"}
4> weaver_ext:term_to_str(v(3)).
"{a,\"_\",\"_\"}"
5> weaver_ext:gsub(v(4), "\"_\"", "_").
"{a,_,_}"
6> M = weaver_ext:matching_fun(v(5)).
#Fun<erl_eval.6.13229925>
7> MsgFormatTerm = weaver_ext:form_to_term(weaver_ext:fill_vars(weaver_ext:str_to_form("{a, B, {c, D}}"))).
{a,"_",{c,"_"}}
8> M(MsgFormatTerm).
true
9> M({a, 10, 20}).
true
10> M({b, "_", 20}).
false
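As an aside, the round trip through strings (term_to_str/1, gsub/3, str_to_form/1) is not strictly needed to obtain a matching fun. A different technique, sketched below under my own assumptions, is to build the fun's abstract form with erl_syntax and evaluate it with erl_eval:expr/2; unbound variables in a clause pattern already match anything, so no filler is required for this step. The module name matching_fun_sketch is made up, and this matching_fun/1 takes a form rather than a string, unlike the one above.

```erlang
-module(matching_fun_sketch).
-export([matching_fun/1, demo/0]).

%% Build `fun(Pattern) -> true; (_) -> false end` from a pattern form
%% and evaluate it into a real fun.
matching_fun(PatternForm) ->
    Hit  = erl_syntax:clause([PatternForm], none,
                             [erl_syntax:atom(true)]),
    Miss = erl_syntax:clause([erl_syntax:underscore()], none,
                             [erl_syntax:atom(false)]),
    FunForm = erl_syntax:revert(erl_syntax:fun_expr([Hit, Miss])),
    {value, Fun, _Bindings} = erl_eval:expr(FunForm, erl_eval:new_bindings()),
    Fun.

%% Uses the message form from the walkthrough above ({a, B, C}).
demo() ->
    Msg = {tuple,24,[{atom,24,a},{var,24,'B'},{var,24,'C'}]},
    M = matching_fun(Msg),
    {M({a, 10, 20}), M({a, "_", {c, "_"}}), M({b, "_", 20})}.
```

This only replaces the construction of M; the term passed to M still has to be ground, so the user supplied format would still need its variables filled in first.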
There is functionality for this in erl_types (HiPE).
I'm not sure you have the data in the right form for using this module though. I seem to remember that it takes Erlang terms as input. If you figure out the form issue you should be able to do most of what you need with erl_types:t_from_term/1 and erl_types:t_is_subtype/2.
It was a long time ago that I last used these, and I only ever did my testing at runtime, as opposed to compile time. If you want to take a peek at the usage pattern in my old code (not working any more), you can find it on GitHub.
I don't think this is possible at compile time in the general case. Consider:
send_msg(Pid, Msg) ->
    Pid ! Msg.
Msg will look like a var, which is a completely opaque type. You can't tell if it is a tuple or a list or an atom, since anyone could call this function with anything supplied for Msg.
This would be much easier to do at run time instead. Every time you use the ! operator, you'll need to call a wrapper function instead, which tries to match the message you are trying to send, and executes additional processing if the pattern is matched.
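A minimal sketch of such a wrapper, assuming the pattern of interest is {a, B, {c, D}} as in the question. The module name send_wrapper_sketch and the function send/2 are made up for illustration, and do_something/1 stands in for whatever the user supplies.

```erlang
-module(send_wrapper_sketch).
-export([send/2, demo/0]).

%% Drop-in replacement for `Pid ! Msg`: run the extra processing first
%% when the message matches the pattern, then send as usual.
send(Pid, Msg) ->
    case Msg of
        {a, _B, {c, _D}} -> do_something(Msg);
        _                -> continue
    end,
    Pid ! Msg.

%% Placeholder for the user supplied advice function.
do_something(Msg) ->
    io:format("about to send ~p~n", [Msg]).

%% Sends one matching and one non-matching message to ourselves.
demo() ->
    Hit  = send(self(), {a, 1, {c, 2}}),
    Miss = send(self(), {a, 1, 50}),
    {Hit, Miss}.
```

Like the ! operator, send/2 returns the message, so a parse transform could rewrite every Pid ! Msg into a call to it without changing the value of the expression.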
I have scripts that contain one-liners or short scripts written in other languages. How can I have the LaTeX listings package detect this and switch the syntax-highlighting language within the script? I believe this would be especially useful for awk within bash.
Bash
#!/bin/bash
echo "hello world"
R --vanilla << EOF
# Data on motor octane ratings for various gasoline blends
x <- c(88.5,87.7,83.4,86.7,87.5,91.5,88.6,100.3,
95.6,93.3,94.7,91.1,91.0,94.2,87.5,89.9,
88.3,87.6,84.3,86.7,88.2,90.8,88.3,98.8,
94.2,92.7,93.2,91.0,90.3,93.4,88.5,90.1,
89.2,88.3,85.3,87.9,88.6,90.9,89.0,96.1,
93.3,91.8,92.3,90.4,90.1,93.0,88.7,89.9,
89.8,89.6,87.4,88.9,91.2,89.3,94.4,92.7,
91.8,91.6,90.4,91.1,92.6,89.8,90.6,91.1,
90.4,89.3,89.7,90.3,91.6,90.5,93.7,92.7,
92.2,92.2,91.2,91.0,92.2,90.0,90.7)
x
length(x)
mean(x);var(x)
stem(x)
EOF
perl -n -e '
@t = split(/\t/);
%t2 = map { $_ => 1 } split(/,/,$t[1]);
$t[1] = join(",",keys %t2);
print join("\t",@t); ' knownGeneFromUCSC.txt
awk -F'\t' '{
n = split($2, t, ","); _2 = x
split(x, _) # use delete _ if supported
for (i = 0; ++i <= n;)
_[t[i]]++ || _2 = _2 ? _2 "," t[i] : t[i]
$2 = _2
}-3' OFS='\t' infile
Python
#!/usr/local/bin/python
import os
print "Hello World"
os.system("""
VAR=even;
sed -i "s/$VAR/odd/" testfile;
for i in `cat testfile` ;
do echo $i; done;
echo "now the tr command is removing the vowels";
cat testfile |tr 'aeiou' ' '
""")
UPDATE:
These are my current Listings settings in the preamble:
% This gives syntax highlighting in the python environment
\renewcommand{\lstlistlistingname}{Code Listings}
\renewcommand{\lstlistingname}{Code Listing}
\definecolor{gray}{gray}{0.5}
\definecolor{key}{rgb}{0,0.5,0}
\lstloadlanguages{Fortran,C++,C,[LaTeX]TeX,Python,bash,R, Perl}
\lstnewenvironment{python}[1][]{
\lstset{
language=python,
basicstyle=\ttfamily\small,
otherkeywords={1, 2, 3, 4, 5, 6, 7, 8 ,9 , 0, -, =, +, [, ], (, ), \{, \}, :, *, !},
keywordstyle=\color{blue},
stringstyle=\color{red},
showstringspaces=false,
emph={class, pass, in, for, while, if, is, elif, else, not, and, or,
def, print, exec, break, continue, return},
emphstyle=\color{black}\bfseries,
emph={[2]True, False, None, self},
emphstyle=[2]\color{key},
emph={[3]from, import, as},
emphstyle=[3]\color{blue},
upquote=true,
morecomment=[s]{"""}{"""},
commentstyle=\color{gray}\slshape,
rulesepcolor=\color{blue},#1
}
}{}
\lstnewenvironment{bash}{%
\lstset{%
language=bash,
otherkeywords={=, +, [, ], (, ), \{, \}, *},
% bash commands from:
%http://www.math.montana.edu/Rweb/Rhelp/00Index.html
emph={addgroup,adduser,alias,
ant,
apropos,apt-get,aptitude,aspell,awk,
basename,bash,bc,bg,break,builtin,bzip2,cal,case,cat,cd,cfdisk,chgrp,
chkconfig,chmod,chown,chroot,cksum,clear,cmp,comm,command,continue,
cp,cron,crontab,csplit,cut,date,dc,dd,ddrescue,declare,df,diff,diff3,
dig,dir,dircolors,dirname,dirs,dmesg,du,echo,egrep,eject,enable,env,
ethtool,eval,exec,exit,expand,expect,export,expr,false,fdformat,
fdisk,fg,fgrep,file,find,fmt,fold,for,format,free,fsck,ftp,function,
fuser,gawk,getopts,
git,
grep,groups,gzip,
gunzip,
hash,head,help,history,hostname,
id,if,ifconfig,ifdown,ifup,import,install,
java, java6, java_cur,
join,kill,killall,less,
let,ln,local,locate,logname,logout,look,lpc,lpr,lprint,lprintd,
lprintq,lprm,ls,lsof,make,man,mkdir,mkfifo,mkisofs,mknod,mmv,more,
mount,mtools,mtr,mv,
mysql,
netstat,nice,nl,nohup,notify-send,
noweb,noweave,
nslookup,op,
open,passwd,paste,pathchk,ping,pkill,popd,pr,printcap,printenv,
printf,ps,pushd,pwd,quota,quotacheck,quotactl,ram,rcp,read,
readarray,readonly,reboot,remsync,rename,renice,return,rev,rm,rmdir,
rsync,scp,screen,sdiff,sed,select,seq,set,sftp,shift,shopt,shutdown,
sleep,slocate,sort,source,split,ssh,strace,su,sudo,sum,
svn, svn2git,
symlink,sync,
tail,tar,tee,test,time,times,top,touch,tr,traceroute,trap,true,
tsort,tty,type,ulimit,umask,umount,unalias,uname,unexpand,uniq,
units,
unrar,
unset,unshar,until,useradd,usermod,users,uudecode,uuencode,
vdir,vi,vmstat,watch,wc,wget,whereis,which,while,who,whoami,write,
zcat},
breaklines=true,
keywordstyle=\color{blue},
stringstyle=\color{red},
emphstyle=\color{black}\bfseries,
commentstyle=\color{gray}\slshape,
}
}{}
\lstnewenvironment{latexCode}[1]{\lstset{language=[latex]tex} \lstset{#1}}{}
\lstnewenvironment{Rcode}{
\lstset{%
language={R},
basicstyle=\small, % print whole listing small
keywordstyle=\color{black}, % style for keyword
% Function list from:
% http://www.math.montana.edu/Rweb/Rhelp/00Index.html
emph={abbreviate, abline,
abs, acos, acosh, all, all.names,
all.vars, anova, anova.glm, anova.lm, any,
aperm, append, apply, approx, approxfun,
apropos, Arg, args, Arithmetic, array,
arrows, as.array, as.call, as.character, as.complex,
as.data.frame, as.double, as.expression, as.factor, asin,
asinh, as.integer, as.list, as.logical, as.matrix,
as.na, as.name, as.null, as.numeric, as.ordered,
as.qr, as.real, assign, as.ts, as.vector,
atan, atan2, atanh, attach, attr,
attributes, autoload, .AutoloadEnv, axis, backsolve,
barplot, beta, binomial, box, boxplot,
boxplot.stats, break, browser, bw.bcv, bw.sj,
bw.ucv, bxp, c, .C, call,
cat, cbind, ceiling, character, charmatch,
chisq.test, chol, chol2inv, choose, class,
class<-, codes, coef, coefficients, coefficients.glm,
coefficients.lm, co.intervals, col, colnames, colors,
colours, Comparison, complete.cases, complex, Conj,
contour, contrasts, contr.helmert, contr.poly, contr.sum,
contr.treatment, convolve, cooks.distance, coplot, cor,
cos, cosh, count.fields, cov, covratio,
crossprod, cummax, cummin, cumprod, cumsum,
curve, cut, D, data, data.class,
data.entry, dataentry, data.frame, data.matrix, dbeta,
dbinom, dcauchy, dchisq, de, debug,
delay, demo, de.ncols, density, deparse,
de.restore, deriv, deriv.default, deriv.formula, de.setup,
detach, deviance, deviance.glm, deviance.lm, device,
Devices, dev.off, dexp, df, dfbetas,
dffits, df.residual, df.residual.glm, df.residual.lm, dgamma,
dgeom, dget, dhyper, diag, diff,
digamma, dim, dim<-, dimnames, dimnames<-,
dlnorm, dlogis, dnbinom, dnchisq, dnorm,
do.call, dotplot, double, dpois, dput,
drop, dt, dump, dunif, duplicated,
dweibull, dyn.load, edit, effects.glm, effects.lm,
eigen, else, emacs, end, environment,
environment<-, eval, exists, exp, expression,
Extract, factor, family, family.glm, fft,
finite, fitted, fitted.values, fitted.values.glm, fitted.values.lm,
fivenum, fix, floor, for, formals,
format, formatC, format.default, formula.default, formula.formula,
formula.terms, .Fortran, frame, frequency, function,
Gamma, gamma, gaussian, gc, gcinfo,
get, getenv, gl, glm, glm.control,
glm.fit, .GlobalEnv, graphics.off, gray, grep,
grid, gsub, hat, heat.colors, help,
hist, hsv, identify, if, ifelse,
Im, image, \%in\%, influence.measures, inherits,
integer, interactive, .Internal, inverse.gaussian, invisible,
invisible, IQR, is.array, is.atomic, is.call,
is.character, is.complex, is.data.frame, is.double, is.environment,
is.expression, is.factor, is.function, is.integer, is.language,
is.list, is.loaded, is.logical, is.matrix, is.na,
is.name, is.null, is.numeric, is.ordered, is.qr,
is.real, is.recursive, is.single, is.ts, is.unordered,
is.vector, lapply, lbeta, lchoose, legend,
length, LETTERS, letters, levels, levels<-,
lgamma, .lib.loc, .Library, library, library.dynam,
license, lines, lines.default, list, lm,
lm.fit, lm.influence, lm.wfit, load, locator,
log, log10, log2, Logic, logical,
lower.tri, lowess, ls, ls.diag, lsfit,
lsf.str, ls.print, ls.str, .Machine, Machine,
machine, macintosh, mad, match, match.arg,
match.call, matlines, mat.or.vec, matplot, matpoints,
matrix, max, mean, median, menu,
methods, min, missing, Mod, mode,
mode<-, model.frame, model.frame.default, model.matrix, model.matrix.default,
month.abb, month.name, mtext, mvfft, NA,
na.action, na.action.default, na.fail, names, na.omit,
nargs, nchar, NCOL, ncol, next,
NextMethod, nextn, nlevels, nlm, [.noquote,
noquote, NROW, nrow, NULL, numeric,
objects, on.exit, optimize, options, order,
ordered, outer, pairs, palette, par,
parse, paste, pbeta, pbinom, pcauchy,
pchisq, pentagamma, pexp, pf, pgamma,
pgeom, phyper, pi, pictex, piechart,
plnorm, plogis, plot, plot.default, plot.density,
plot.ts, plot.xy, pmatch, pmax, pmin,
pnbinom, pnchisq, pnorm, points, points.default,
poisson, polygon, polyroot, postscript, ppoints,
ppois, pretty, print, print.anova.glm, print.anova.lm,
print.data.frame, print.default, print.density, print.formula, print.glm,
print.lm, print.noquote, print.plot, print.summary.glm, print.summary.lm,
print.terms, print.ts, proc.time, prod, prompt,
prompt.default, prop.test, provide, .Provided, ps.options,
pt, punif, pweibull, q, qbeta,
qbinom, qcauchy, qchisq, qexp, qf,
qgamma, qgeom, qhyper, qlnorm, qlogis,
qnbinom, qnchisq, qnorm, qpois, qqline,
qqnorm, qqplot, qr, qr.coef, qr.fitted,
qr.Q, qr.qty, qr.qy, qr.R, qr.resid,
qr.solve, qr.X, qt, quantile, quasi,
quit, qunif, quote, qweibull, rainbow,
.Random.seed, range, rank, rbeta, rbind,
rbinom, rcauchy, rchisq, Re, readline,
read.table, real, rect, remove, rep,
repeat, replace, require, resid, residuals,
residuals.glm, residuals.lm, return, rev, rexp,
rf, rgamma, rgb, rgeom, rhyper,
RLIBS, rlnorm, rlogis, rm, rnbinom,
rnchisq, rnorm, round, row, row.names,
rownames, rpois, rstudent, rt, runif,
rweibull, sample, sapply, save, save.plot,
scale, scan, sd, segments, seq,
sequence, sign, signif, sin, sinh,
sink, solve, solve.qr, sort, source,
spline, splinefun, split, sqrt, start,
stem, stop, storage.mode, storage.mode<-, str,
str.data.frame, str.default, strheight, stripplot, strsplit,
structure, strwidth, sub, Subscript, substitute,
substr, substring, sum, summary, summary.glm,
summary.lm, svd, sweep, switch, symbol.C,
symbol.For, symnum, sys.call, sys.calls, sys.frame,
sys.frames, sys.function, sys.nframe, sys.on.exit, sys.parent,
sys.parents, system, system.date, system.time, t,
table, tabulate, tan, tanh, tapply,
tempfile, terms, terms.default, terms.formula, terms.terms,
terrain.colors, tetragamma, text, time, title,
topo.colors, trace, traceback, trigamma, trunc,
ts, tsp, t.test, typeof, unclass,
undebug, unique, uniroot, unlink, unlist,
untrace, update, update.formula, update.glm, update.lm,
upper.tri, UseMethod, var, vector, Version,
version, vi, warning, weighted.mean, weights.lm,
while, window, windows, write, x11,
xedit, xemacs, xinch, xor, xy.coords,
yinch}, % define a list of words to emphasize
stringstyle=\color{red},
emphstyle=\color{black}\bfseries, % define how to emphasize
showspaces=false, % show the space in code, or not
stringstyle=\ttfamily, % style of the string (like "hello world")
showstringspaces=false, % show the space in string, or not #1
commentstyle=\color{gray}\slshape,
tabsize=2, % sets default tabsize to 2 spaces
breaklines=true, % sets automatic line breaking
breakatwhitespace=false, % sets if automatic breaks should only happen at whitespace
}
}{}
\lstnewenvironment{Perl}{
\lstset{%
language={perl},
basicstyle=\small, % print whole listing small
keywordstyle=\color{black}, % style for keyword
emph={% From http://www.sdsc.edu/~moreland/courses/IntroPerl/docs/manual/pod/perlfunc.html
-X, run, abs, absolute, accept, accept, alarm, schedule, atan2,
arctangent, bind, binds, binmode, prepare, bless, create, caller,
get, chdir, change, chmod, changes, chomp, remove, chop, remove,
chown, change, chr, get, chroot, make, close, close, closedir, close,
connect, connect, continue, optional, cos, cosine, crypt, one-way,
dbmclose, breaks, dbmopen, create, defined, test, delete, deletes,
die, raise, do, turn, dump, create, each, retrieve, endgrent, be,
endhostent, be, endnetent, be, endprotoent, be, endpwent, be,
endservent, be, eof, test, eval, catch, exec, abandon, exists, test,
exit, terminate, exp, raise, fcntl, file, fileno, return, flock,
lock, fork, create, format, declare, formline, internal, getc, get,
getgrent, get, getgrgid, get, getgrnam, get, gethostbyaddr, get,
gethostbyname, get, gethostent, get, getlogin, return, getnetbyaddr,
get, getnetbyname, get, getnetent, get, getpeername, find, getpgrp,
get, getppid, get, getpriority, get, getprotobyname, get,
getprotobynumber, get, getprotoent, get, getpwent, get, getpwnam,
get, getpwuid, get, getservbyname, get, getservbyport, get,
getservent, get, getsockname, retrieve, getsockopt, get, glob,
expand, gmtime, convert, goto, create, grep, locate, hex, convert,
import, patch, int, get, ioctl, system-dependent, join, join, keys,
retrieve, kill, send, last, exit, lc, return, lcfirst, return,
length, return, link, create, listen, register, local, create,
localtime, convert, log, retrieve, lstat, stat, m//, match, map,
apply, mkdir, create, msgctl, SysV, msgget, get, msgrcv, receive,
msgsnd, send, my, declare, next, iterate, no, unimport, oct, convert,
open, open, opendir, open, ord, find, pack, convert, package,
declare, pipe, open, pop, remove, pos, find, print, output, printf,
output, prototype, get, push, append, q/STRING/, singly, qq/STRING/,
doubly, quotemeta, quote, qw/STRING/, quote, qx/STRING/, backquote,
rand, retrieve, read, fixed-length, readdir, get, readlink,
determine, recv, receive, redo, start, ref, find, rename, change,
require, load, reset, clear, return, get, reverse, flip, rewinddir,
reset, rindex, right-to-left, rmdir, remove, s///, replace, scalar,
force, seek, reposition, seekdir, reposition, select, reset, semctl,
SysV, semget, get, semop, SysV, send, send, setgrent, prepare,
sethostent, prepare, setnetent, prepare, setpgrp, set, setpriority,
set, setprotoent, prepare, setpwent, prepare, setservent, prepare,
setsockopt, set, shift, remove, shmctl, SysV, shmget, get, shmread,
read, shmwrite, write, shutdown, close, sin, return, sleep, block,
socket, create, socketpair, create, sort, sort, splice, add, split,
split, sprintf, formatted, sqrt, square, srand, seed, stat, get,
study, optimize, sub, declare, substr, get, symlink, create, syscall,
execute, sysread, fixed-length, system, run, syswrite, fixed-length,
tell, get, telldir, get, tie, bind, time, return, times, return,
tr///, transliterate, truncate, shorten, uc, return, ucfirst, return,
umask, set, undef, remove, unlink, remove, unpack, convert, unshift,
prepend, untie, break, use, load, utime, set, values, return, vec,
test, wait, wait, waitpid, wait, wantarray, get, warn, print, write,
print, y///, transliterate}, % define a list of words to emphasize
stringstyle=\color{red},
emphstyle=\color{black}\bfseries, % define how to emphasize
showspaces=false, % show the space in code, or not
stringstyle=\ttfamily, % style of the string (like "hello world")
showstringspaces=false, % show the space in string, or not #1
commentstyle=\color{gray}\slshape,
tabsize=2, % sets default tabsize to 2 spaces
breaklines=true, % sets automatic line breaking
breakatwhitespace=false, % sets if automatic breaks should only happen at whitespace
}
}{}
\lstnewenvironment{plaintext}{
\lstset{
tabsize=2, % sets default tabsize to 2 spaces
breaklines=true, % sets automatic line breaking
breakatwhitespace=false, % sets if automatic breaks should only happen at whitespace
basicstyle=\normalfont\ttfamily,
}
}{}
It is almost certainly easier to modify the Bash/Python highlighters than to write a context-sensitive highlighter. I'm guessing that just adding the keywords to the other highlighters should give acceptable results.
Modifying Pygments doesn't look too difficult, judging from Pygments' "Write your own lexer" documentation.