Can you "teach" computers to do algebra using variable expressions (eg aX+bX=(a+b)X) - machine-learning

Let's say in the example lower case is constant and upper case is variable.
I'd like to have programs that can "intelligently" do specified tasks like algebra, but teaching the program new methods should be easy using symbols understood by humans. For example if the program told these facts:
aX+bX=(a+b)X
if a=bX then X=a/b
Then it should be able to perform these operations:
2a+3a=5a
3x+3x=6x
3x=1 therefore x=1/3
4x+2x=1 -> 6x=1 therefore x= 1/6
I was trying to do similar things with Prolog as it can easily "understand" variables, but then I had too many complications, mainly because two describing a relationship both ways results in a crash. (not easy to sort out)
To summarise: I want to know if a program which can be taught algebra by using mathematic symbols only. I'd like to know if other people have tried this and how complicated it is expected to be. The purpose of this is to make programming easier (runtime is not so important)

It depends on what do you want machine to do and how intelligent it should be.
Your question is mostly about AI but not ML. AI deals with formalization of "human" tasks while ML (though being a subset of AI) is about building models from data.
Described program may be implemented like this:
Each fact form a pattern. Program given with an expression and some patterns can try to apply some of them to expression and see what happens. If you want your program to be able to, for example, solve quadratic equations given rule like ax² + bx + c = 0 → x = (-b ± sqrt(b²-4ac))/(2a) then it'd be designed as follows:
Somebody gives a set of rules. Rule consists of a pattern and an outcome (solution or equivalent form). Think about the pattern as kind of a regular expression.
Then the program is asked to show some intelligence and prove its knowledge via doing something with a given expression. Here comes the major part:
you build a graph of expressions by applying possible rules (if a pattern is applicable to an expression you add new vertex with the corresponding outcome).
Then you run some path-search algorithm (A*, for example) to find sequence of transformations leading to the form like x = ...

I think this is an interesting question, although it off topic in SO (tool recommendation)
But nevertheless, because it captured my imagination, I wrote couple of function using R that can solve stuff like that quite easily
First, you'll have to install R, after words you'll need to download package called stringr
So in R console run
install.packages("stringr")
library(stringr)
And then you can define the following functions that I wrote
FirstFunc <- function(temp){
paste0(eval(parse(text = gsub("[A-Z]", "", temp))), unique(str_extract_all(temp, "[A-Z]")[[1]]))
}
SecondFunc <- function(temp){
eval(parse(text = strsplit(temp, "=")[[1]][2])) / eval(parse(text = gsub("[[:alpha:]]", "", strsplit(temp, "=")[[1]][1])))
}
Now, the first function will solve equations like
aX+bX=(a+b)X
While the second will solve equations like
4x+2x=1
For example
FirstFunc("3X+6X-2X-3X")
will return
"4X"
Now this functions is pretty primitive (mostly for the propose of illustration) and will solve equation that contain only one variable type, something like FirstFunc("3X-2X-2Y") won't give the correct result (but the function could be easily modified)
The second function will solve stuff like
SecondFunc("4x-2x=1")
will return
0.5
or
SecondFunc("4x+2x*3x=1")
will return
0.1
Note that this function also works only for one unknown variable (x) but could be easily modified too

Related

simplify equations/expressions using Javacc/jjtree

I have created a grammar to read a file of equations then created AST nodes for each rule.My question is how can I do simplification or substitute vales on the equations that the parser is able to read correctly. in which stage? before creating AST nodes or after?
Please provide me with ideas or tutorials to follow.
Thank you.
I'm assuming you equations are something like simple polynomials over real-value variables, like X^2+3*Y^2
You ask for two different solutions to two different problems that start with having an AST for at least one equation:
How to "substitute values" into the equation and compute the resulting value, e.g, for X==3 and Y=2, substitute into the AST for the formula above and compute 3^2+3*2^2 --> 21
How to do simplification: I assume you mean algebraic simplification.
The first problem of substituting values is fairly easy if yuo already have the AST. (If not, parse the equation to produce the AST first!) Then all you have to do is walk the AST, replacing every leaf node containing a variable name with the corresponding value, and then doing arithmetic on any parent nodes whose children now happen to be numbers; you repeat this until no more nodes can be arithmetically evaluated. Basically you wire simple arithmetic into a tree evaluation scheme.
Sometimes your evaluation will reduce the tree to a single value as in the example, and you can print the numeric result My SO answer shows how do that in detail. You can easily implement this yourself in a small project, even using JavaCC/JJTree appropriately adapted.
Sometimes the formula will end up in a state where no further arithmetic on it is possible, e.g., 1+x+y with x==0 and nothing known about y; then the result of such a subsitution/arithmetic evaluation process will be 1+y. Unfortunately, you will only have this as an AST... now you need to print out the resulting AST in order for the user to see the result. This is harder; see my SO answer on how to prettyprint a tree. This is considerably more work; if you restrict your tree to just polynomials over expressions, you can still do this in small project. JavaCC will help you with parsing, but provides zero help with prettyprinting.
The second problem is much harder, because you must not only accomplish variable substitution and arithmetic evaluation as above, but you have to somehow encode knowledge of algebraic laws, and how to match those laws to complex trees. You might hardwire one or two algebraic laws (e.g., x+0 -> x; y-y -> 0) but hardwiring many laws this way will produce an impossible mess because of how they interact.
JavaCC might form part of such an answer, but only a small part; the rest of the solution is hard enough so you are better off looking for an alternative rather than trying to build it all on top of JavaCC.
You need a more organized approach for this: a Program Transformation System (PTS). A typical PTS will allow you specify
a grammar for an arbitrary language (in your case, simply polynomials),
automatically parses instance to ASTs and can regenerate valid text from the AST. A good PTS will let you write source-to-source transformation rules that the PTS will apply automatically the instance AST; in your case you'd write down the algebraic laws as source-to-source rules and then the PTS does all the work.
An example is too long to provide here. But here I describe how to define formulas suitable for early calculus classes, and how to define algebraic rules that simply such formulas including applying some class calculus derivative laws.
With sufficient/significant effort, you can build your own PTS on top of JavaCC/JJTree. This is likely to take a few man-years. Easier to get a PTS rather than repeat all that work.

Source code logic evaluation

I was given a fragment of code (a function called bubbleSort(), written in Java, for example). How can I, or rather my program, tell if a given source code implements a particular sorting algorithm the correct way (using bubble method, for instance)?
I can enforce a user to give a legitimate function by analyzing function signature: making sure the the argument and return value is an array of integers. But I have no idea how to determine that algorithm logic is being done the right way. The input code could sort values correctly, but not in an aforementioned bubble method. How can my program discern that? I do realize a lot of code parsing would be involved, but maybe there's something else that I should know.
I hope I was somewhat clear.
I'd appreciate if someone could point me in the right direction or give suggestions on how to tackle such a problem. Perhaps there are tested ways that ease the evaluation of program logic.
In general, you can't do this because of the Halting problem. You can't even decide if the function will halt ("return").
As a practical matter, there's a bit more hope. If you are looking for a bubble sort, you can decide that it has number of parts:
a to-be-sorted datatype S with a partial order,
a container data type C with single instance variable A ("the array")
that holds the to-be-sorted data
a key type K ("array index") used to access the container that has a partial order
such that container[K] is type S
a comparison of two members of container, using key A and key B
such that A < B according to the key partial order, that determines
if container[B]>container of A
a swap operation on container[A], container[B] and some variable T of type S, that is conditionaly dependent on the comparison
a loop wrapped around the container that enumerates keys in according the partial order on K
You can build bits of code that find each of these bits of evidence in your source code, and if you find them all, claim you have evidence of a bubble sort.
To do this concretely, you need standard program analysis machinery:
to parse the source code and build an abstract syntax tree
build symbol tables (ST) that know the type of each identifier where it is used
construct a control flow graph (CFG) so that you check that various recognized bits occur in appropriate ordering
construct a data flow graph (DFG), so that you can determine that values recognized in one part of the algorithm flow properly to another part
[That's a lot of machinery just to get started]
From here, you can write ad hoc code procedural code to climb over the AST, ST, CFG, DFG, to "recognize" each of the individual parts. This is likely to be pretty messy as each recognizer will be checking these structures for evidence of its bit. But, you can do it.
This is messy enough, and interesting enough, so there are tools which can do much of this.
Our DMS Software Reengineering Toolkit is one. DMS already contains all the machinery to do standard program analysis for several languages. DMS also has a Dataflow pattern matching language, inspired by Rich and Water's 1980's "Programmer's Apprentice" ideas.
With DMS, you can express this particular problem roughly like this (untested):
dataflow pattern domain C;
dataflow pattern swap(in out v1:S, in out v2:S, T:S):statements =
" \T = \v1;
\v1 = \v2;
\v2 = \T;";
dataflow pattern conditional_swap(in out v1:S, in out v2:S,T:S):statements=
" if (\v1 > \v2)
\swap(\v1,\v2,\T);"
dataflow pattern container_access(inout container C, in key: K):expression
= " \container.body[\K] ";
dataflow pattern size(in container:C, out: integer):expression
= " \container . size "
dataflow pattern bubble_sort(in out container:C, k1: K, k2: K):function
" \k1 = \smallestK\(\);
while (\k1<\size\(container\)) {
\k2 = \next\(k1);
while (\k2 <= \size\(container\) {
\conditionalswap\(\container_access\(\container\,\k1\),
\container_access\(\container\,\k2\) \)
}
}
";
Within each pattern, you can write what amounts to the concrete syntax of the chosen programming language ("pattern domain"), referencing dataflows named in the pattern signature line. A subpattern can be mentioned inside another; one has to pass the dataflows to and from the subpattern by naming them. Unlike "plain old C", you have to pass the container explicitly rather than by implicit reference; that's because we are interested in the actual values that flow from one place in the pattern to another. (Just because two places in the code use the same variable, doesn't mean they see the same value).
Given these definitions, and ask to "match bubble_sort", DMS will visit the DFG (tied to CFG/AST/ST) to try to match the pattern; where it matches, it will bind the pattern variables to the DFG entries. If it can't find a match for everything, the match fails.
To accomplish the match, each of patterns above is converted essentially into its own DFG, and then each pattern is matched against the DFG for the code using what is called a subgraph isomorphism test. Constructing the DFG for the patter takes a lot of machinery: parsing, name resolution, control and data flow analysis, applied to fragments of code in the original language, intermixed with various pattern meta-escapes. The subgraph isomorphism is "sort of easy" to code, but can be very expensive to run. What saves the DMS pattern matchers is that most patterns have many, many constraints [tech point: and they don't have knots] and each attempted match tends to fail pretty fast, or succeed completely.
Not shown, but by defining the various bits separately, one can provide alternative implementations, enabling the recognition of variations.
We have used this to implement quite complete factory control model extraction tools from real industrial plant controllers for Dow Chemical on their peculiar Dowtran language (meant building parsers, etc. as above for Dowtran). We have version of this prototyped for C; the data flow analysis is harder.

How to convert Latex formula to C/C++ code?

I need to convert a math formula written in the Latex style to the function of a C/C++ code.
For example:
y = sin(x)^2 would become something like
double y = sin(x) * sin(x);
or
double y = pow(sin(x), 2);
where x is a variable defined somewhere before.
I mean that it should convert the latex formula to the C/C++ syntax. So that if there is a function y = G(x, y)^F(x) it doesn't matter what is G(x,y) and F(x),
it is a problem of the programmer to define it. It will just generate
double y = pow(G(x, y), F(x));
When the formula is too complicated it will take some time to make include it in the C/C++ formula. Is there any way to do this conversion?
Emacs' built-in calculator calc-mode can do this (and much more). Your examples can be converted like this:
Put the formula in some emacs buffer
$ y = sin(x)^2 $
With the cursor in the formula, activate calc-embedded mode
M-x calc-embedded
Switch the display language to C:
M-x calc-c-language
There you are:
$ y == pow(sin(x), 2) $
Note that it interprets the '=' sign in latex as an equality, which results in '==' for C. The latex equivalent to Cs assignment operator '=' would be '\gets'.
More on this topic on Turong's blog
I know the question is too old, but I'll just add a reply anyway as a think it might help someone else later. The question popped up a lot for me in my searches.
I'm working on a tool that does something similar, in a public git repo
You'll have to put some artificial limitations on your latex input, that's out of question.
Currently the tool I wrote only supports mul, div, add, sub, sqrt, pow, frac and sum as those are the only set of operations I need to handle, and the imposed limitations can be a bit loose by providing a preprocessor (see preproc.l for an [maybe not-so-good] example) that would clean away the raw latex input.
A mathematical equation, such as the ones in LaTeX, and a C expression are not interchangeable. The former states a relation between two terms, the latter defines an entity that can be evaluated, unambiguously yielding one value. a = b in C means 'take the value in variable b and store it in variable a', wheres in Math, it means 'in the current context, a and b are equal'. The first describes a computation process, the second describes a static fact. Consequently, the Math equation can be reversed: a = b is equivalent to b = a, but doing the same to the C equation yields something quite different.
To make matters worse, LaTeX formulae only contain the information needed to render the equations; often, this is not enough to capture their meaning.
Of course some LaTeX formulae, like your example, can be converted into C computations, but many others cannot, so any automated way of doing so would only make limited sense.
I'm not sure there is a simple answer, because mathematical formulaes (in LaTeX documents) are actually ambiguous, so to automate their translation to some code requires automating their understanding.
And the MathML standard has, IIRC, two forms representing formulaes (one for displaying, another for computing) and there is some reason for that.

What are advantages and disadvantages of "point free" style in functional programming?

I know that in some languages (Haskell?) the striving is to achieve point-free style, or to never explicitly refer to function arguments by name. This is a very difficult concept for me to master, but it might help me to understand what the advantages (or maybe even disadvantages) of that style are. Can anyone explain?
The point-free style is considered by some author as the ultimate functional programming style. To put things simply, a function of type t1 -> t2 describes a transformation from one element of type t1 into another element of type t2. The idea is that "pointful" functions (written using variables) emphasize elements (when you write \x -> ... x ..., you're describing what's happening to the element x), while "point-free" functions (expressed without using variables) emphasize the transformation itself, as a composition of simpler transforms. Advocates of the point-free style argue that transformations should indeed be the central concept, and that the pointful notation, while easy to use, distracts us from this noble ideal.
Point-free functional programming has been available for a very long time. It was already known by logicians which have studied combinatory logic since the seminal work by Moses Schönfinkel in 1924, and has been the basis for the first study on what would become ML type inference by Robert Feys and Haskell Curry in the 1950s.
The idea to build functions from an expressive set of basic combinators is very appealing and has been applied in various domains, such as the array-manipulation languages derived from APL, or the parser combinator libraries such as Haskell's Parsec. A notable advocate of point-free programming is John Backus. In his 1978 speech "Can Programming Be Liberated From the Von Neumann Style ?", he wrote:
The lambda expression (with its substitution rules) is capable of
defining all possible computable functions of all possible types
and of any number of arguments. This freedom and power has its
disadvantages as well as its obvious advantages. It is analogous
to the power of unrestricted control statements in conventional
languages: with unrestricted freedom comes chaos. If one
constantly invents new combining forms to suit the occasion, as
one can in the lambda calculus, one will not become familiar with
the style or useful properties of the few combining forms that
are adequate for all purposes. Just as structured programming
eschews many control statements to obtain programs with simpler
structure, better properties, and uniform methods for
understanding their behavior, so functional programming eschews
the lambda expression, substitution, and multiple function
types. It thereby achieves programs built with familiar
functional forms with known useful properties. These programs are
so structured that their behavior can often be understood and
proven by mechanical use of algebraic techniques similar to those
used in solving high school algebra problems.
So here they are. The main advantage of point-free programming are that they force a structured combinator style which makes equational reasoning natural. Equational reasoning has been particularly advertised by the proponents of the "Squiggol" movement (see [1] [2]), and indeed use a fair share of point-free combinators and computation/rewriting/reasoning rules.
[1] "An introduction to the Bird-Merteens Formalism", Jeremy Gibbons, 1994
[2] "Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire", Erik Meijer, Maarten Fokkinga and Ross Paterson, 1991
Finally, one cause for the popularity of point-free programming among Haskellites is its relation to category theory. In category theory, morphisms (which could be seen as "transformations between objects") are the basic object of study and computation. While partial results allow reasoning in specific categories to be performed in a pointful style, the common way to build, examine and manipulate arrows is still the point-free style, and other syntaxes such as string diagrams also exhibit this "pointfreeness". There are rather tight links between the people advocating "algebra of programming" methods and users of categories in programming (for example the authors of the banana paper [2] are/were hardcore categorists).
You may be interested in the Pointfree page of the Haskell wiki.
The downside of pointfree style is rather obvious: it can be a real pain to read. The reason why we still love to use variables, despite the numerous horrors of shadowing, alpha-equivalence etc., is that it's a notation that's just so natural to read and think about. The general idea is that a complex function (in a referentially transparent language) is like a complex plumbing system: the inputs are the parameters, they get into some pipes, are applied to inner functions, duplicated (\x -> (x,x)) or forgotten (\x -> (), pipe leading nowhere), etc. And the variable notation is nicely implicit about all that machinery: you give a name to the input, and names on the outputs (or auxiliary computations), but you don't have to describe all the plumbing plan, where the small pipes will go not to be a hindrance for the bigger ones, etc. The amount of plumbing inside something as short as \(f,x,y) -> ((x,y), f x y) is amazing. You may follow each variable individually, or read each intermediate plumbing node, but you never have to see the whole machinery together. When you use a point-free style, all the plumbing is explicit, you have to write everything down, and look at it afterwards, and sometimes it's just plain ugly.
PS: this plumbing vision is closely related to the stack programming languages, which are probably the least pointful programming languages (barely) in use. I would recommend trying to do some programming in them just to get of feeling of it (as I would recommend logic programming). See Factor, Cat or the venerable Forth.
I believe the purpose is to be succinct and to express pipelined computations as a composition of functions rather than thinking of threading arguments through. Simple example (in F#) - given:
let sum = List.sum
let sqr = List.map (fun x -> x * x)
Used like:
> sum [3;4;5]
12
> sqr [3;4;5]
[9;16;25]
We could express a "sum of squares" function as:
let sumsqr x = sum (sqr x)
And use like:
> sumsqr [3;4;5]
50
Or we could define it by piping x through:
let sumsqr x = x |> sqr |> sum
Written this way, it's obvious that x is being passed in only to be "threaded" through a sequence of functions. Direct composition looks much nicer:
let sumsqr = sqr >> sum
This is more concise and it's a different way of thinking of what we're doing; composing functions rather than imagining the process of arguments flowing through. We're not describing how sumsqr works. We're describing what it is.
PS: An interesting way to get your head around composition is to try programming in a concatenative language such as Forth, Joy, Factor, etc. These can be thought of as being nothing but composition (Forth : sumsqr sqr sum ;) in which the space between words is the composition operator.
PPS: Perhaps others could comment on the performance differences. It seems to me that composition may reduce GC pressure by making it more obvious to the compiler that there is no need to produce intermediate values as in pipelining; helping make the so-called "deforestation" problem more tractable.
While I'm attracted to the point-free concept and used it for some things, and agree with all the positives said before, I found these things with it as negative (some are detailed above):
The shorter notation reduces redundancy; in a heavily structured composition (ramda.js style, or point-free in Haskell, or whatever concatenative language) the code reading is more complex than linearly scanning through a bunch of const bindings and using a symbol highlighter to see which binding goes into what other downstream calculation. Besides the tree vs linear structure, the loss of descriptive symbol names makes the function hard to intuitively grasp. Of course both the tree structure and the loss of named bindings also have a lot of positives as well, for example, functions will feel more general - not bound to some application domain via the chosen symbol names - and the tree structure is semantically present even if bindings are laid out, and can be comprehended sequentially (lisp let/let* style).
Point-free is simplest when just piping through or composing a series of functions, as this also results in a linear structure that we humans find easy to follow. However, threading some interim calculation through multiple recipients is tedious. There are all kinds of wrapping into tuples, lensing and other painstaking mechanisms go into just making some calculation accessible, that would otherwise be just the multiple use of some value binding. Of course the repeated part can be extracted out as a separate function and maybe it's a good idea anyway, but there are also arguments for some non-short functions and even if it's extracted, its arguments will have to be somehow threaded through both applications, and then there may be a need for memoizing the function to not actually repeat the calculation. One will use a lot of converge, lens, memoize, useWidth etc.
JavaScript specific: harder to casually debug. With a linear flow of let bindings, it's easy to add a breakpoint wherever. With the point-free style, even if a breakpoint is somehow added, the value flow is hard to read, eg. you can't just query or hover over some variable in the dev console. Also, as point-free is not native in JS, library functions of ramda.js or similar will obscure the stack quite a bit, especially with the obligate currying.
Code brittleness, especially on nontrivial size systems and in production. If a new piece of requirement comes in, then the above disadvantages get into play (eg. harder to read the code for the next maintainer who may be yourself a few weeks down the line, and also harder to trace the dataflow for inspection). But most importantly, even something seemingly small and innocent new requirement can necessitate a whole different structuring of the code. It may be argued that it's a good thing in that it'll be a crystal clear representation of the new thing, but rewriting large swaths of point-free code is very time consuming and then we haven't mentioned testing. So it feels that the looser, less structured, lexical assignment based coding can be more quickly repurposed. Especially if the coding is exploratory, and in the domain of human data with weird conventions (time etc.) that can rarely be captured 100% accurately and there may always be an upcoming request for handling something more accurately or more to the needs of the customer, whichever method leads to faster pivoting matters a lot.
To the pointfree variant, the concatenative programming language, i have to write:
I had a little experience with Joy. Joy is a very simple and beautiful concept with lists. When converting a problem into a Joy function, you have to split your brain into a part for the stack plumbing work and a part for the solution in the Joy syntax. The stack is always handled from the back. Since the composition is contained in Joy, there is no computing time for a composition combiner.

Backtracking in Erlang

First of all sorry for my English.
I would like to use a backtracking algorithm in Erlang. It would serve as a guessing to solve partially filled sudokus. A 9x9 sudoku is stored as a list of 81 elements, where every element stores the possible number which can go into that cell.
For a 4x4 sudoku my initial solution looks like this:
[[1],[3],[2],[4],[4],[2],[3],[1],[2,3],[4],[1],[2,3],[2,3],[1],[4],[2,3]]
This sudoku has 2 solutions. I have to write out both of them. After that initial solution reached, I need to implement a backtracking algorithm, but I don't know how to make it.
My thought is to write out the fixed elements into a new list called fixedlist which will change the multiple-solution cells to [].
For the above mentioned example the fixedlist looks like this:
[[1],[3],[2],[4],[4],[2],[3],[1],[],[4],[1],[],[],[1],[4],[]]
From here I have a "sample", I look for the lowest length in the solutionlist which is not equal to 1, and I try the first possible number of this cell and I put it to that fixedlist. Here I have an algorithm to update the cells and checks if it is still a solvable sudoku or not. If not, I don't know how to step back one and try a new one.
I know the pseudo code of it and I can use it for imperative languages but not for erlang. (prolog actually implemented backtrack algorithm, but erlang didn't)
Any idea?
Re: My bactracking functions.
These are the general functions which provide a framework for handling back-tracking and logical variables similar to a prolog engine. You must provide the function (predicates) which describe the program logic. If you write them as you would in prolog I can show you how to translate them into erlang. Very briefly you translate something like:
p :- q, r, s.
in prolog into something like
p(Next0) ->
Next1 = fun () -> s(Next0) end,
Next2 = fun () -> r(Next1) end,
q(Next2).
Here I am ignoring all other arguments except the continuations.
I hope this gives some help. As I said if you describe your algorithms I can help you translate them, I have been looking for a good example. You can, of course, just as well do it by yourself but this provides some help.

Resources