Mathematica function foo that can distinguish foo[.2] from foo[.20] - parsing

Suppose I want a function that takes a number and returns it as a string, exactly as it was given. The following doesn't work:
SetAttributes[foo, HoldAllComplete];
foo[x_] := ToString[Unevaluated@x]
The output for foo[.2] and foo[.20] is identical.
The reason I want to do this is that I want a function that can understand dates with dots as delimiters, e.g., f[2009.10.20]. I realize that's a bizarre abuse of Mathematica, but I'm making a domain-specific language and want to use Mathematica as the parser for it by just doing an eval (ToExpression). I can actually make this work if I can rely on double-digit days and months, like 2009.01.02, but I also want to allow 2009.1.2, and that ends up boiling down to the above question.
I suspect the only answer is to pass the thing in as a string and then parse it, but perhaps there's some trick I don't know. Note that this is related to this question: Mathematica: Unevaluated vs Defer vs Hold vs HoldForm vs HoldAllComplete vs etc etc

I wouldn't rely on Mathematica's float-parsing. Instead I'd define rules on MakeExpression for foo. This allows you to intercept the input, as boxes, prior to it being parsed into floats. This pair of rules should be a good starting place, at least for StandardForm:
MakeExpression[RowBox[{"foo", "[", dateString_, "]"}], StandardForm] :=
  With[{args = Sequence @@ Riffle[StringSplit[dateString, "."], ","]},
    MakeExpression[RowBox[{"foo", "[", "{", args, "}", "]"}], StandardForm]]
MakeExpression[RowBox[{"foo", "[", RowBox[{yearMonth_, day_}], "]"}], StandardForm] :=
  With[{args = Sequence @@ Riffle[Append[StringSplit[yearMonth, "."], day], ","]},
    MakeExpression[RowBox[{"foo", "[", "{", args, "}", "]"}], StandardForm]]
I needed the second rule because the notebook interface will "helpfully" insert a space if you try to put a second decimal place in a number.
EDIT: In order to use this from the kernel, you'll need to use a front end, but that's often pretty easy in version 7. If you can get your expression as a string, use UsingFrontEnd in conjunction with ToExpression:
UsingFrontEnd[ToExpression["foo[2009.09.20]", StandardForm]]
EDIT 2: There are a lot of possibilities if you want to play with $PreRead, which allows you to apply special processing to the input, as strings, before it is parsed.

$PreRead = If[$FrontEnd =!= Null, #1,
StringReplace[#,x:NumberString /; StringMatchQ[x,"*.*0"] :>
StringJoin[x, "`", ToString[
StringLength[StringReplace[x, "-" -> ""]] -
Switch[StringTake[StringReplace[x,
"-" -> ""], 1], "0", 2, ".", 1, _,
1]]]]] & ;
will display foo[.20] as foo[0.20]. The InputForm of it will be
foo[0.2`2.]
I find parsing and displaying number formats in Mathematica more difficult than
it should be...

Floats are, IIRC, parsed by Mathematica into actual Floats, so there's no real way to do what you want.
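The point about float parsing is not specific to Mathematica. The sketch below uses Python purely as a neutral illustration (it is not Mathematica code, and parse_dotted_date is a hypothetical helper name): once the text has become a float, .2 and .20 can no longer be told apart, whereas splitting the original string on dots keeps every field.

```python
def parse_dotted_date(text):
    """Split a dotted date such as '2009.1.2' into (year, month, day) integers.

    Hypothetical helper for illustration only; it does no range checking.
    """
    parts = text.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected year.month.day, got {text!r}")
    year, month, day = (int(p) for p in parts)
    return year, month, day


# Once parsed as floats, the trailing zero is gone.
print(float(".2") == float(".20"))      # True

# Working on the string keeps the distinction between the fields.
print(parse_dotted_date("2009.10.20"))  # (2009, 10, 20)
print(parse_dotted_date("2009.1.2"))    # (2009, 1, 2)
```

This is the "pass it in as a string and parse it" route from the question; the MakeExpression and $PreRead answers achieve the same thing by intercepting the input before it is turned into a number.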

Related

How to output text from lists of variable sizes using printf?

I'm trying to automate some output using printf but I'm struggling to find a way to pass to it the list of arguments expr_1, ..., expr_n in
printf (dest, string, expr_1, ..., expr_n)
I thought of using something like Javascript's spread operator but I'm not even sure I should need it.
For instance, say I have a list of strings to be output
a:["foo","bar","foobar"];
a string of appropriate format descriptors, say
s: "~a ~a ~a ~%";
and an output stream, say os. How can I invoke printf using these things in such a way that the result will be the same as writing
printf(os,s,a[1],a[2],a[3]);
Then I could generalize it to output lists of variable size.
Any suggestions?
Thanks.
EDIT:
I just learned about apply and, using the conditions I posed in my OP, the following seems to work wonderfully:
apply(printf,append([os,s],a));
Maxima printf implements most or maybe all of the formatting operators from Common Lisp FORMAT, which are quite extensive; see: http://www.lispworks.com/documentation/HyperSpec/Body/22_c.htm See also ? printf in Maxima to get an abbreviated list of formatting operators.
In particular for a list you can do something like:
printf (os, "my list: ~{~a~^, ~}~%", a);
to get the elements of a separated by ,. Here "~{...~}" tells printf to expect a list, and ~a is how to format each element, ~^ means omit the inter-element stuff after the last element, and , means put that between elements. Of course , could be anything.
There are many variations on that; if that's not what you're looking for, maybe I can help you find it.
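For readers unfamiliar with the ~{...~} iteration directive, its effect as described above is close to joining the elements with a separator. A rough Python analogue follows (Python is used only for illustration, format_list is a hypothetical name, and this is not how Maxima implements printf):

```python
def format_list(items, separator=", ", prefix="my list: "):
    """Mimic the "~{~a~^, ~}~%" directive: format each item, put the
    separator between items only, and end with a newline."""
    return prefix + separator.join(str(item) for item in items) + "\n"


a = ["foo", "bar", "foobar"]
print(format_list(a), end="")  # my list: foo, bar, foobar
```

The separator appears only between elements, which is the job ~^ does in the format string.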

Writing a text file containing LaTeX code from maxima expressions

Suppose in a (wx)Maxima session I have the following
f:sin(x);
df:diff(f,x);
Now I want to have it output a text file containing something like, for example
If $f(x)=\sin(x)$, then $f^\prime(x)=\cos(x)$.
I found the tex and tex1 functions but I think I need some additional string processing to be able to do what I want.
Any help appreciated.
EDIT: Further clarifications.
Auto Multiple Choice is software that helps you create and manage questionnaires. To declare questions one may use LaTeX syntax. From AMC's documentation, a question looks like this:
\element{geographie}{
\begin{question}{Cameroon}
Which is the capital city of Cameroon?
\begin{choices}
\correctchoice{Yaoundé}
\wrongchoice{Douala}
\wrongchoice{Abou-Dabi}
\end{choices}
\end{question}
}
As can be seen, it is just LaTeX. Now, with a little modification, I can turn this example into a math question
\element{derivatives}{
\begin{question}{trig_fun_diff_1}
If $f(x)=\sin(x)$ then $f^\prime(0)$ is
\begin{choices}
\correctchoice{$1$}
\wrongchoice{$-1$}
\wrongchoice{$0$}
\end{choices}
\end{question}
}
This is the sort of output I want. I'll have, say, a list of functions then execute a loop calculating their derivatives and so on.
OK, in response to your updated question. My advice is to work with questions and answers as expressions -- build up your list of questions first, and then when you have the list in the structure that you want, then output the TeX file as the last step. It is generally much clearer and simpler to work with expressions than with strings.
E.g. Here is a simplistic approach. I'll use defstruct to define a structure so that I can refer to its parts by name.
defstruct (question (name, datum, item, correct, incorrect));
myq1 : new (question);
myq1@name : "trig_fun_diff_1";
myq1@datum : f(x) = sin(x);
myq1@item : 'at ('diff (f(x), x), x = 0);
myq1@correct : 1;
myq1@incorrect : [0, -1];
You can also write
myq1 : question ("trig_fun_diff_1", f(x) = sin(x),
'at ('diff (f(x), x), x = 0), 1, [0, -1]);
I don't know which form is more convenient for you.
Then you can make an output function similar to this:
tex_question (q, output_stream) :=
  (printf (output_stream, "\\begin{question}{~a}~%", q@name),
   printf (output_stream, "If $~a$, then $~a$ is:~%", tex1 (q@datum), tex1 (q@item)),
   printf (output_stream, "\\begin{choices}~%"),
   /* make a list comprising correct and incorrect here */
   /* shuffle the list (see random_permutation) */
   /* output each correct or incorrect here */
   printf (output_stream, "\\end{choices}~%"),
   printf (output_stream, "\\end{question}~%"));
where output_stream is an output stream as returned by openw (which see).
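To make the three commented-out steps concrete, here is a minimal sketch of the same "structure first, text last" idea in Python. It is an illustration under assumptions rather than a translation: the field values are plain LaTeX strings instead of Maxima expressions, the result is returned as a string instead of being written to a stream, and the names Question and tex_question simply mirror the Maxima code above.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Question:
    # Field names mirror the defstruct above; values are LaTeX strings here,
    # whereas the Maxima version stores expressions and calls tex1 on them.
    name: str
    datum: str
    item: str
    correct: str
    incorrect: list = field(default_factory=list)


def tex_question(q, rng=random):
    """Return the AMC question environment for q as one string."""
    # Make a list comprising the correct and incorrect choices, then shuffle it.
    choices = [("correctchoice", q.correct)]
    choices += [("wrongchoice", w) for w in q.incorrect]
    rng.shuffle(choices)

    lines = [
        f"\\begin{{question}}{{{q.name}}}",
        f"If ${q.datum}$, then ${q.item}$ is:",
        "\\begin{choices}",
    ]
    # Output each choice with the macro that marks it correct or wrong.
    lines += [f"\\{kind}{{${value}$}}" for kind, value in choices]
    lines += ["\\end{choices}", "\\end{question}"]
    return "\n".join(lines) + "\n"


q1 = Question("trig_fun_diff_1", r"f(x)=\sin(x)", r"f^\prime(0)", "1", ["0", "-1"])
print(tex_question(q1, random.Random(0)), end="")
```

Passing a seeded random.Random instance makes the shuffled order reproducible, which can help when regenerating the same exam.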
It may take a little bit of trying different stuff to get derivatives to be output in just the format you want. My advice is to put the logic for that into the output function.
A side effect of working with expressions is that it is straightforward to output some representations other than TeX (e.g. plain text, XML, HTML). That might or might not become important for your project.
Well, tex is the TeX output function. It can be customized to some extent via texput (which see).
As to post-processing via string manipulation, I don't recommend it. However, if you want to go down that road, there are regex functions which you can access via load(sregex). Unfortunately it's not yet documented; see the comment header of sregex.lisp (somewhere in your Maxima installation) for examples.

Can I match against a string that contains non-ASCII characters?

I am writing a program in which I am dealing with strings of the form, e.g., "\001SOURCE\001". That is, the strings contain alphanumeric text with an ASCII character of value 1 at each end. I am trying to write a function to match strings like these. I have tried a match like this:
handle(<<1,"SOURCE",1>>) -> ok.
But the match does not succeed. I have tried a few variations on this theme, but all have failed.
Is there a way to match a string that contains mostly alphanumeric text, with the exception of a non-alpha character at each end?
You can also do the following
[1] ++ "SOURCE" ++ [1] == "\001SOURCE\001".
Or convert to binary using list_to_binary and pattern match as
<<1,"SOURCE",1>> == <<"\001SOURCE\001">>.
Strings are syntactic sugar for lists. Lists are a type and binaries are a different type, so your match isn't working out because you're trying to match a list against a binary (same problem if you tried to match {1, "STRING", 1} to it, tuples aren't lists).
Remembering that strings are lists, we have a few options:
handle([1,83,84,82,73,78,71,1]) -> ok.
This will work just fine. Another, more readable (but uglier, sort of) way is to use character literals:
handle([1, $S,$T,$R,$I,$N,$G, 1]) -> ok.
Yet another way would be to strip the non-character values, and then pass that on to a handler:
handle(String) -> dispatch(string:strip(String, both, 1)).
dispatch("STRING") -> do_stuff();
dispatch("OTHER") -> do_other_stuff().
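The strip-and-dispatch clauses above follow a pattern that exists in most languages. A small Python sketch of the same idea is given here for comparison only (it is not Erlang, and the handler names and return values are hypothetical stand-ins for do_stuff and do_other_stuff):

```python
def do_stuff():
    return "stuff"


def do_other_stuff():
    return "other stuff"


# Hypothetical handlers keyed by the payload between the \x01 delimiters.
HANDLERS = {"STRING": do_stuff, "OTHER": do_other_stuff}


def handle(message):
    """Strip the leading and trailing \\x01 characters, then dispatch on the rest."""
    payload = message.strip("\x01")
    handler = HANDLERS.get(payload)
    if handler is None:
        raise ValueError(f"no handler for {payload!r}")
    return handler()


print(handle("\x01STRING\x01"))  # stuff
print(handle("\x01OTHER\x01"))   # other stuff
```

Separating "remove the framing characters" from "decide what to do with the payload" keeps each clause readable, which is the main attraction of this option.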
And, if at all possible, the best case is if you just stop using strings for text values entirely (if that's feasible) and process binaries directly instead. The syntax of binaries is much friendlier, they take up way fewer resources, and quite a few binary operations are significantly more efficient than their string/list counterparts. But that doesn't fit every case! (But it's awesome when dealing with sockets...)

Remove \text generated by TeXForm

I need to remove all \text generated by TeXForm in Mathematica.
What I am doing now is this:
MyTeXForm[a_]:=StringReplace[ToString[TeXForm[a]], "\\text" -> ""]
But the result keeps the braces, for example:
for a=fx,
the result of TeXForm[a] is \text{fx}
the result of MyTeXForm[a] is {fx}
But what I would like is it to be just fx
You should be able to use string patterns. Based on http://reference.wolfram.com/mathematica/tutorial/StringPatterns.html, something like the following should work:
MyTeXForm[a_]:=StringReplace[ToString[TeXForm[a]], "\\text{"~~s___~~"}"->s]
I don't have Mathematica handy right now, but this should say 'Match "\text{" followed by zero or more characters that are stored in the variable s, followed by "}", then replace all of that with whatever is stored in s.'
UPDATE:
The above works in the simplest case of there being a single "\text{...}" element, but the pattern s___ is greedy, so on input a+bb+xx+y, which Mathematica's TeXForm renders as "a+\text{bb}+\text{xx}+y", it matches everything between the first "\text{" and last "}" --- so, "bb}+\text{xx" --- leading to the output
In[1]:= MyTeXForm[a+bb+xx+y]
Out[1]= a+bb}+\text{xx+y
A fix for this is to wrap the pattern with Shortest[], leading to a second definition
In[2]:= MyTeXForm2[a_] := StringReplace[
ToString[TeXForm[a]],
Shortest["\\text{" ~~ s___ ~~ "}"] -> s
]
which yields the output
In[3]:= MyTeXForm2[a+bb+xx+y]
Out[3]= a+bb+xx+y
as desired.
Unfortunately this still won't work when the text itself contains a closing brace. For example, the input f["a}b","c}d"] (for some reason...) would give
In[4]:= MyTeXForm2[f["a}b","c}d"]]
Out[4]= f(a$\$b},c$\$d})
instead of "f(a$\}$b,c$\}$d)", which would be the proper processing of the TeXForm output "f(\text{a$\}$b},\text{c$\}$d})".
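The greedy versus shortest behaviour described above is the same in regular expressions, so it can be reproduced outside Mathematica. The following Python sketch is an analogy only: Python's re module is a different engine from Mathematica's string patterns, and strip_text is a hypothetical helper. The greedy pattern plays the role of s___, and the lazy pattern plays the role of Shortest.

```python
import re

GREEDY = re.compile(r"\\text\{(.*)\}")   # like "\\text{" ~~ s___ ~~ "}"
LAZY = re.compile(r"\\text\{(.*?)\}")    # like Shortest[...] around the same pattern


def strip_text(pattern, tex):
    """Replace each match of pattern with the text captured inside the braces."""
    return pattern.sub(lambda m: m.group(1), tex)


tex = r"a+\text{bb}+\text{xx}+y"
print(strip_text(GREEDY, tex))  # a+bb}+\text{xx+y
print(strip_text(LAZY, tex))    # a+bb+xx+y

# An escaped closing brace inside the text still ends the lazy match early.
print(strip_text(LAZY, r"f(\text{a$\}$b},\text{c$\}$d})"))  # f(a$\$b},c$\$d})
```

Handling the last case properly needs something that understands brace nesting or escaping, which a single shortest-match pattern cannot express.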
This is what I did (works fine for me):
MyTeXForm[a_] := ToString[ToExpression[StringReplace[ToString[TeXForm[a]], "\\text" -> ""]][[1]]]
This is a really late reply, but I just came up against the same issue and discovered a simple solution. Put a space between the variables in the Mathematica expression that you wish to convert using TeXForm.
For the original poster's example, the following code works great:
a=f x
TeXForm[a]
The output is as desired: f x
Since LaTeX will ignore that space in math mode, things will format correctly.
(As an aside, I was having the same issue with subscripted expressions that have two side-by-side variables in the subscript. Inserting a space between them solved the issue.)

REBOL path operator vs division ambiguity

I've started looking into REBOL, just for fun, and as a fan of programming languages, I really like seeing new ideas and even just alternative syntaxes. REBOL is definitely full of these. One thing I noticed is the use of '/' as the path operator which can be used similarly to the '.' operator in most object-oriented programming languages. I have not programmed in REBOL extensively, just looked at some examples and read some documentation, but it isn't clear to me why there's no ambiguity with the '/' operator.
x: 4
y: 2
result: x/y
In my example, this should be division, but it seems like it could just as easily be the path operator if x were an object or function refinement. How does REBOL handle the ambiguity? Is it just a matter of an overloaded operator and the type system so it doesn't know until runtime? Or is it something I'm missing in the grammar and there really is a difference?
UPDATE Found a good piece of example code:
sp: to-integer (100 * 2 * length? buf) / d/3 / 1024 / 1024
It appears that arithmetic division requires whitespace, while the path operator requires no whitespace. Is that it?
This question deserves an answer from the syntactic point of view. In Rebol, there is no "path operator", in fact. The x/y is a syntactic element called path. As opposed to that the standalone / (delimited by spaces) is not a path, it is a word (which is usually interpreted as the division operator). In Rebol you can examine syntactic elements like this:
length? code: [x/y x / y] ; == 4
type? first code ; == path!
type? second code ; == word!
and so on.
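The whitespace rule can be mimicked with a toy tokenizer. The Python sketch below is a deliberately crude model, not Rebol's actual lexer: it splits on whitespace only and ignores numbers, strings, brackets, dates, URLs, and every other lexical form. The classify function is a hypothetical helper that just looks at where the slash sits in a token.

```python
def classify(token):
    """Very rough guess at the Rebol datatype of one whitespace-delimited token."""
    if token == "/":
        return "word!"        # a lone slash is a word (usually division)
    if token.startswith("/"):
        return "refinement!"  # e.g. /foo
    if "/" in token:
        return "path!"        # e.g. x/y or data/x/y
    return "word!"


code = "x/y x / y"
tokens = code.split()
print(len(tokens))                    # 4
print([classify(t) for t in tokens])  # ['path!', 'word!', 'word!', 'word!']
```

The four tokens line up with the length? result of 4 shown above: x/y is a single path, while x / y is three separate words.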
The code guide says:
White-space is used in general for delimiting (for separating symbols).
This is especially important because words may contain characters such as + and -.
http://www.rebol.com/r3/docs/guide/code-syntax.html
One acquired skill of being a REBOler is to get the hang of inserting whitespace in expressions where other languages usually do not require it :)
Spaces are generally needed in Rebol, but there are exceptions here and there for "special" characters, such as those delimiting series. For instance:
[a b c] is the same as [ a b c ]
(a b c) is the same as ( a b c )
[a b c]def is the same as [a b c] def
Some fairly powerful tools for doing introspection of syntactic elements are type?, quote, and probe. The quote operator prevents the interpreter from giving behavior to things. So if you tried something like:
>> data: [x [y 10]]
>> type? data/x/y
>> probe data/x/y
The "live" nature of the code would dig through the path and give you an integer! of value 10. But if you use quote:
>> data: [x [y 10]]
>> type? quote data/x/y
>> probe quote data/x/y
Then you wind up with a path! whose value is simply data/x/y, it never gets evaluated.
In the internal representation, a PATH! is quite similar to a BLOCK! or a PAREN!. It just has this special distinctive lexical type, which allows it to be treated differently. Although you've noticed that it can behave like a "dot" by picking members out of an object or series, that is only how it is used by the DO dialect. You could invent your own ideas, let's say you make the "russell" command:
russell [
x: 10
y: 20
z: 30
x/y/z
(
print x
print y
print z
)
]
Imagine that in my fanciful example, this outputs 30, 10, 20...because what the russell function does is evaluate its block in such a way that a path is treated as an instruction to shift values. So x/y/z means x=>y, y=>z, and z=>x. Then any code in parentheses is run in the DO dialect. Assignments are treated normally.
When you want to make up a fun new riff on how to express yourself, Rebol takes care of a lot of the grunt work. So for example the parentheses are guaranteed to have matched up to get a paren!. You don't have to go looking for all that yourself, you just build your dialect up from the building blocks of all those different types...and hook into existing behaviors (such as the DO dialect for basics like math and general computation, and the mind-bending PARSE dialect for some rather amazing pattern matching muscle).
But speaking of "all those different types", there's yet another weirdo situation for slash that can create another type:
>> type? quote /foo
This is called a refinement!, and happens when you start a lexical element with a slash. You'll see it used in the DO dialect to call out optional parameter sets to a function. But once again, it's just another symbolic LEGO in the parts box. You can ascribe meaning to it in your own dialects that is completely different...
While I didn't find any written definitive clarification, I did also find that +,-,* and others are valid characters in a word, so clearly it requires a space.
x*y
Is a valid identifier
x * y
Performs multiplication. It looks like the path operator is just another case of this.
