Primitive operations in proofs

For learning dependent types, I'm rewriting my old Haskell game in Idris. Currently the game "engine" uses builtin integral types such as Word8. I'd like to prove some lemmas involving numerical properties of those numbers (for example, that double negation is the identity). However, it's not possible to prove anything about the behaviour of the primitive arithmetic operations. What would be better: to just use believe_me or other handwaving (at least for the most basic properties), or to rewrite my code using Nat, Fin and other "high-level" numerical types?

I'd suggest using postulate for any of the primitive properties you need, being careful to postulate only things which are actually true for the numerical types in question, of course (which basically just means being careful about overflow). So you can say things like:
postulate add_commutes : (x, y : Int) -> x + y = y + x
believe_me is best avoided unless you need some computational behaviour of the proof. But, to be honest, we're still trying to work this stuff out when reasoning about primitives...

I believe that for the moment it's generally better to use Nat when you can. The Idris devs are planning, eventually, to implement a general mechanism for replacing proof-friendly types with fast primitive ones in compilation, but for now that only happens for Nat. You could push things through with believe_me if you really wanted, but you'd end up with functions that are not so easy to work with in proofs. Note that if you decide to play with believe_me, you should also consider really_believe_me, which apparently makes it more believable to the type checker.

Related

Simplify equations/expressions using JavaCC/JJTree

I have created a grammar to read a file of equations and then created AST nodes for each rule. My question is: how can I do simplification or substitute values in the equations that the parser is able to read correctly? In which stage? Before creating AST nodes or after?
Please provide me with ideas or tutorials to follow.
Thank you.
I'm assuming your equations are something like simple polynomials over real-valued variables, like X^2+3*Y^2
You ask for two different solutions to two different problems that start with having an AST for at least one equation:
How to "substitute values" into the equation and compute the resulting value, e.g, for X==3 and Y=2, substitute into the AST for the formula above and compute 3^2+3*2^2 --> 21
How to do simplification: I assume you mean algebraic simplification.
The first problem of substituting values is fairly easy if you already have the AST. (If not, parse the equation to produce the AST first!) Then all you have to do is walk the AST, replacing every leaf node containing a variable name with the corresponding value, and then doing arithmetic on any parent nodes whose children now happen to be numbers; you repeat this until no more nodes can be arithmetically evaluated. Basically you wire simple arithmetic into a tree evaluation scheme.
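To make that walk concrete, here is a minimal sketch of the substitute-then-fold idea, written in F# with discriminated unions rather than JavaCC/JJTree node classes; the names (Expr, substitute, fold) are purely illustrative:

type Expr =
    | Num of float
    | Var of string
    | Add of Expr * Expr
    | Mul of Expr * Expr

// Replace variable leaves with known values; unknown variables are left alone.
let rec substitute (env : Map<string, float>) expr =
    match expr with
    | Var name when env.ContainsKey name -> Num env.[name]
    | Add (l, r) -> Add (substitute env l, substitute env r)
    | Mul (l, r) -> Mul (substitute env l, substitute env r)
    | other -> other

// Do arithmetic on any parent node whose children are now plain numbers.
let rec fold expr =
    match expr with
    | Add (l, r) ->
        (match fold l, fold r with
         | Num a, Num b -> Num (a + b)
         | l', r' -> Add (l', r'))
    | Mul (l, r) ->
        (match fold l, fold r with
         | Num a, Num b -> Num (a * b)
         | l', r' -> Mul (l', r'))
    | leaf -> leaf

// 2*X + Y with X = 3 reduces to Add (Num 6.0, Var "Y"), i.e. "6 + Y".
let example = fold (substitute (Map.ofList ["X", 3.0]) (Add (Mul (Num 2.0, Var "X"), Var "Y")))

If the whole tree collapses to a single Num you just print that number; otherwise you are left with the residual AST discussed next.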
Sometimes your evaluation will reduce the tree to a single value as in the example, and you can print the numeric result. My SO answer shows how to do that in detail. You can easily implement this yourself in a small project, even using JavaCC/JJTree appropriately adapted.
Sometimes the formula will end up in a state where no further arithmetic on it is possible, e.g., 1+x+y with x==0 and nothing known about y; then the result of such a substitution/arithmetic evaluation process will be 1+y. Unfortunately, you will only have this as an AST... now you need to print out the resulting AST in order for the user to see the result. This is harder; see my SO answer on how to prettyprint a tree. This is considerably more work; if you restrict your trees to just polynomials over expressions, you can still do this in a small project. JavaCC will help you with parsing, but provides zero help with prettyprinting.
The second problem is much harder, because you must not only accomplish variable substitution and arithmetic evaluation as above, but you have to somehow encode knowledge of algebraic laws, and how to match those laws to complex trees. You might hardwire one or two algebraic laws (e.g., x+0 -> x; y-y -> 0) but hardwiring many laws this way will produce an impossible mess because of how they interact.
JavaCC might form part of such an answer, but only a small part; the rest of the solution is hard enough so you are better off looking for an alternative rather than trying to build it all on top of JavaCC.
You need a more organized approach for this: a Program Transformation System (PTS). A typical PTS will allow you to specify:
a grammar for an arbitrary language (in your case, simply polynomials),
automatic parsing of instances to ASTs, with the ability to regenerate valid text from the AST. A good PTS will let you write source-to-source transformation rules that the PTS will apply automatically to the instance AST; in your case you'd write down the algebraic laws as source-to-source rules and then the PTS does all the work.
An example is too long to provide here. But here I describe how to define formulas suitable for early calculus classes, and how to define algebraic rules that simplify such formulas, including applying some classic calculus derivative laws.
With sufficient/significant effort, you can build your own PTS on top of JavaCC/JJTree. This is likely to take a few man-years. It's easier to get a PTS than to repeat all that work.

How to guarantee referential transparency in F# applications?

So I'm trying to learn FP and I'm trying to get my head around referential transparency and side effects.
I have learned that making all effects explicit in the type system is the only way to guarantee referential transparency:
The idea of “mostly functional programming” is unfeasible. It is impossible to make imperative programming languages safer by only partially removing implicit side effects. Leaving one kind of effect is often enough to simulate the very effect you just tried to remove. On the other hand, allowing effects to be “forgotten” in a pure language also causes mayhem in its own way.
Unfortunately, there is no golden middle, and we are faced with a classic dichotomy: the curse of the excluded middle, which presents the choice of either (a) trying to tame effects using purity annotations, yet fully embracing the fact that your code is still fundamentally effectful; or (b) fully embracing purity by making all effects explicit in the type system and being pragmatic - Source
I have also learned that not-pure FP languages like Scala or F# cannot guarantee referential transparency:
The ability to enforce referential transparency is pretty much incompatible with Scala's goal of having a class/object system that is interoperable with Java. - Source
And that in not-pure FP it is up to the programmer to ensure referential transparency:
In impure languages like ML, Scala or F#, it is up to the programmer to ensure referential transparency, and of course in dynamically typed languages like Clojure or Scheme, there is no static type system to enforce referential transparency. - Source
I'm interested in F# because I have a .NET background, so my next question is:
What can I do to guarantee referential transparency in an F# application if it is not enforced by the F# compiler?
The short answer to this question is that there is no way to guarantee referential transparency in F#. One of the big advantages of F# is that it has fantastic interop with other .NET languages but the downside of this, compared to a more isolated language like Haskell, is that side-effects are there and you will have to deal with them.
How you actually deal with side effects in F# is a different question entirely.
There is actually nothing to stop you from bringing effects into the type system in F# in very much the same way as you might in Haskell although effectively you are 'opting in' to this approach rather than it being enforced upon you.
All you really need is some infrastructure like this:
/// A value of type IO<'a> represents an action which, when performed (e.g. by calling the IO.run function), does some I/O which results in a value of type 'a.
type IO<'a> =
    private
    | Return of 'a
    | Delay of (unit -> 'a)

/// Pure IO Functions
module IO =
    /// Runs the IO actions and evaluates the result
    let run io =
        match io with
        | Return a -> a
        | Delay (a) -> a()
    /// Return a value as an IO action
    let return' x = Return x
    /// Creates an IO action from an effectful computation, this simply takes a side effecting function and brings it into IO
    let fromEffectful f = Delay (f)
    /// Monadic bind for IO action, this is used to combine and sequence IO actions
    let bind x f =
        match x with
        | Return a -> f a
        | Delay (g) -> Delay (fun _ -> run << f <| g())
return' brings a value within IO.
fromEffectful takes a side-effecting function unit -> 'a and brings it within IO.
bind is the monadic bind function and lets you sequence effects.
run runs the IO to perform all of the enclosed effects. This is like unsafePerformIO in Haskell.
You could then define a computation expression builder using these primitive functions and give yourself lots of nice syntactic sugar.
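For example, a minimal builder over the IO type above might look like this (a sketch only; the io name and the console example are mine, just to show the sugar, not part of any library):

type IOBuilder() =
    member this.Return(x) = IO.return' x
    member this.Bind(x, f) = IO.bind x f
    member this.Zero() = IO.return' ()
    member this.Delay(f) = IO.fromEffectful (fun () -> IO.run (f ()))

let io = IOBuilder()

// Describe an interaction without performing it; nothing runs until IO.run is called.
let shoutBack =
    io {
        let! line = IO.fromEffectful (fun () -> System.Console.ReadLine())
        do! IO.fromEffectful (fun () -> printfn "%s!" (line.ToUpper()))
    }

// IO.run shoutBack   // performs the enclosed effects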
Another worthwhile question to ask is, is this useful in F#?
A fundamental difference between F# and Haskell is that F# is an eager by default language while Haskell is lazy by default. The Haskell community (and I suspect the .NET community, to a lesser extent) has learnt that when you combine lazy evaluation and side-effects/IO, very bad things can happen.
When you work in the IO monad in Haskell, you are (generally) guaranteeing something about the sequential nature of IO and ensuring that one piece of IO is done before another. You are also guaranteeing something about how often and when effects can occur.
One example I like to pose in F# is this one:
let rnd = System.Random()

let randomSeq = Seq.init 4 (fun _ -> rnd.Next())
let sortedSeq = Seq.sort randomSeq
printfn "Sorted: %A" sortedSeq
printfn "Random: %A" randomSeq
At first glance, this code might appear to generate a sequence, sort the same sequence and then print the sorted and unsorted versions.
It doesn't. It generates two sequences, one of which is sorted and one of which isn't. They can, and almost certainly do, have completely distinct values.
This is a direct consequence of combining side effects and lazy evaluation without referential transparency. You could gain back some control by using Seq.cache which prevents repeat evaluation but still doesn't give you control over when, and in what order, effects occur.
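For comparison, here is the same snippet with the Seq.cache mitigation (a sketch reusing the rnd value from above):

// Seq.cache evaluates each element at most once, so both prints now observe the
// same four numbers, but you still don't control when that single evaluation happens.
let randomSeq' = Seq.init 4 (fun _ -> rnd.Next()) |> Seq.cache
let sortedSeq' = Seq.sort randomSeq'
printfn "Sorted: %A" sortedSeq'
printfn "Random: %A" randomSeq'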
By contrast, when you're working with eagerly evaluated data structures, the consequences are generally less insidious so I think the requirement for explicit effects in F# is vastly reduced compared to Haskell.
That said, a large advantage of making all effects explicit within the type system is that it helps to enforce good design. The likes of Mark Seemann will tell you that the best strategy for designing a robust system, whether it's object oriented or functional, involves isolating side-effects at the edge of your system and relying on a referentially transparent, highly unit-testable, core.
If you are working with explicit effects and IO in the type system and all of your functions are ending up being written in IO, that's a strong and obvious design smell.
Going back to the original question of whether this is worthwhile in F# though, I still have to answer with "I don't know". I have been working on a library for referentially transparent effects in F# to explore this possibility myself. There is more material on this subject, as well as a much fuller implementation of IO, there if you are interested.
Finally, I think it's worth remembering that the Curse of the Excluded Middle is probably targeted at programming language designers more than your typical developer.
If you are working in an impure language, you will need to find a way of coping with and taming your side effects; the precise strategy you follow to do this is open to interpretation and to what best suits the needs of yourself and/or your team, but I think that F# gives you plenty of tools to do this.
Finally, my pragmatic and experienced view of F# tells me that actually, "mostly functional" programming is still a big improvement over its competition almost all of the time.
I think you need to read the source article in an appropriate context - it is an opinion piece coming from a specific perspective and it is intentionally provocative - but it is not a hard fact.
If you are using F#, you will get referential transparency by writing good code. That means writing most logic as a sequence of transformations and performing effects to read the data before running the transformations & running effects to write the results somewhere after. (Not all programs fit into this pattern, but those that can be written in a referentially transparent way generally do.)
In my experience, you can live perfectly happily in the "middle". That means, write referentially transparent code most of the time, but break the rules when you need to for some practical reason.
To respond to some of the specific points in the quotes:
It is impossible to make imperative programming languages safer by only partially removing implicit side effects.
I would agree it is impossible to make them "safe" (if by safe we mean they have no side-effects), but you can make them safer by removing some side effects.
Leaving one kind of effect is often enough to simulate the very effect you just tried to remove.
Yes, but simulating an effect to provide a theoretical proof is not what programmers do. If achieving the effect is sufficiently discouraged, you'll tend to write code in other (safer) ways.
I have also learned that not-pure FP languages like Scala or F# cannot guarantee referential transparency:
Yes, that's true - but "referential transparency" is not what functional programming is about. For me, it is about having better ways to model my domain and having tools (like the type system) that guide me along the "happy path". Referential transparency is one part of that, but it is not a silver bullet. Referential transparency is not going to magically solve all your problems.
As Mark Seemann has confirmed in the comments, "Nothing in F# can guarantee referential transparency. It's up to the programmer to think about this."
I have been doing some searching online and I found that "discipline is your best friend", along with some recommendations for keeping the level of referential transparency in your F# applications as high as possible:
Don't use mutable variables, for or while loops, the ref keyword, etc.
Stick with purely immutable data structures (discriminated unions, lists, tuples, maps, etc.).
If you need to do IO at some point, architect your program so that it is separated from your purely functional code (see the sketch after this list). Don't forget functional programming is all about limiting and isolating side-effects.
Use algebraic data types (ADTs), a.k.a. discriminated unions, instead of objects.
Learn to love laziness.
Embrace the monad.
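A tiny sketch of that pure-core/impure-shell separation (the file name and helper functions here are hypothetical, just to show the shape of the program):

// Pure core: referentially transparent, easy to unit test.
let parseScores (lines : string list) =
    lines |> List.choose (fun l ->
        match System.Int32.TryParse l with
        | true, n -> Some n
        | _ -> None)

let summarise scores =
    match scores with
    | [] -> "no scores"
    | _ -> sprintf "%d scores, best %d" (List.length scores) (List.max scores)

// Impure shell: all I/O lives at the edge of the program.
[<EntryPoint>]
let main _ =
    let lines = System.IO.File.ReadLines "scores.txt" |> List.ofSeq
    printfn "%s" (summarise (parseScores lines))
    0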

Can you "teach" computers to do algebra using variable expressions (eg aX+bX=(a+b)X)

Let's say that in the examples below, lower case letters are constants and upper case letters are variables.
I'd like to have programs that can "intelligently" do specified tasks like algebra, where teaching the program new methods should be easy using symbols understood by humans. For example, if the program is told these facts:
aX+bX=(a+b)X
if a=bX then X=a/b
Then it should be able to perform these operations:
2a+3a=5a
3x+3x=6x
3x=1 therefore x=1/3
4x+2x=1 -> 6x=1 therefore x= 1/6
I was trying to do similar things with Prolog as it can easily "understand" variables, but then I had too many complications, mainly because describing a relationship both ways results in a crash. (Not easy to sort out.)
To summarise: I want to know whether a program can be taught algebra using mathematical symbols only. I'd like to know if other people have tried this and how complicated it is expected to be. The purpose of this is to make programming easier (runtime is not so important).
It depends on what you want the machine to do and how intelligent it should be.
Your question is mostly about AI, not ML. AI deals with the formalization of "human" tasks, while ML (though a subset of AI) is about building models from data.
The described program may be implemented like this:
Each fact forms a pattern. Given an expression and some patterns, the program can try to apply some of them to the expression and see what happens. If you want your program to be able to, for example, solve quadratic equations given a rule like ax² + bx + c = 0 → x = (-b ± sqrt(b²-4ac))/(2a), then it'd be designed as follows:
Somebody gives a set of rules. A rule consists of a pattern and an outcome (a solution or an equivalent form). Think of the pattern as a kind of regular expression.
Then the program is asked to show some intelligence and prove its knowledge by doing something with a given expression. Here comes the major part:
you build a graph of expressions by applying possible rules (if a pattern is applicable to an expression, you add a new vertex with the corresponding outcome).
Then you run some path-search algorithm (A*, for example) to find a sequence of transformations leading to a form like x = ...
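A rough F# sketch of that design (the rule encoding and the plain breadth-first search here stand in for a real pattern matcher and A*; the names are illustrative only):

type Expr =
    | Num of float
    | Sym of string
    | Add of Expr * Expr
    | Mul of Expr * Expr
    | Eq of Expr * Expr

// A rule is a partial rewrite: it either applies to an expression or it doesn't.
let rules : (Expr -> Expr option) list =
    [ // aX + bX = (a+b)X, applied to the left-hand side of an equation
      (fun e ->
          match e with
          | Eq (Add (Mul (Num a, Sym x), Mul (Num b, Sym y)), rhs) when x = y ->
              Some (Eq (Mul (Num (a + b), Sym x), rhs))
          | _ -> None)
      // if bX = a then X = a/b
      (fun e ->
          match e with
          | Eq (Mul (Num b, Sym x), Num a) -> Some (Eq (Sym x, Num (a / b)))
          | _ -> None) ]

// Breadth-first search over rule applications until a solved form "x = number" appears.
let solve start =
    let solved = function Eq (Sym _, Num _) -> true | _ -> false
    let rec loop frontier =
        match frontier with
        | [] -> None
        | e :: _ when solved e -> Some e
        | e :: rest -> loop (rest @ (rules |> List.choose (fun rule -> rule e)))
    loop [start]

// 4x + 2x = 1  ->  6x = 1  ->  x = 1/6
let solution = solve (Eq (Add (Mul (Num 4.0, Sym "x"), Mul (Num 2.0, Sym "x")), Num 1.0))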
I think this is an interesting question, although it is off topic on SO (tool recommendation).
But nevertheless, because it captured my imagination, I wrote a couple of functions using R that can solve this kind of thing quite easily.
First, you'll have to install R; afterwards you'll need to download the package called stringr.
So in R console run
install.packages("stringr")
library(stringr)
And then you can define the following functions that I wrote
# Evaluate the numeric part (with the variable letters stripped out), then re-append the variable letter
FirstFunc <- function(temp){
  paste0(eval(parse(text = gsub("[A-Z]", "", temp))), unique(str_extract_all(temp, "[A-Z]")[[1]]))
}

# Evaluate the right-hand side of the equation and divide by the evaluated coefficient expression on the left
SecondFunc <- function(temp){
  eval(parse(text = strsplit(temp, "=")[[1]][2])) / eval(parse(text = gsub("[[:alpha:]]", "", strsplit(temp, "=")[[1]][1])))
}
Now, the first function will solve equations like
aX+bX=(a+b)X
While the second will solve equations like
4x+2x=1
For example
FirstFunc("3X+6X-2X-3X")
will return
"4X"
Now this function is pretty primitive (mostly for the purpose of illustration) and will solve equations that contain only one variable type; something like FirstFunc("3X-2X-2Y") won't give the correct result (but the function could be easily modified)
The second function will solve stuff like
SecondFunc("4x-2x=1")
will return
0.5
or
SecondFunc("4x+2x*3x=1")
will return
0.1
Note that this function also works only for one unknown variable (x) but could be easily modified too

What are advantages and disadvantages of "point free" style in functional programming?

I know that in some languages (Haskell?) the striving is to achieve point-free style, or to never explicitly refer to function arguments by name. This is a very difficult concept for me to master, but it might help me to understand what the advantages (or maybe even disadvantages) of that style are. Can anyone explain?
The point-free style is considered by some author as the ultimate functional programming style. To put things simply, a function of type t1 -> t2 describes a transformation from one element of type t1 into another element of type t2. The idea is that "pointful" functions (written using variables) emphasize elements (when you write \x -> ... x ..., you're describing what's happening to the element x), while "point-free" functions (expressed without using variables) emphasize the transformation itself, as a composition of simpler transforms. Advocates of the point-free style argue that transformations should indeed be the central concept, and that the pointful notation, while easy to use, distracts us from this noble ideal.
Point-free functional programming has been available for a very long time. It was already known by logicians who have studied combinatory logic since the seminal work by Moses Schönfinkel in 1924, and it has been the basis for the first study of what would become ML type inference, by Robert Feys and Haskell Curry in the 1950s.
The idea to build functions from an expressive set of basic combinators is very appealing and has been applied in various domains, such as the array-manipulation languages derived from APL, or parser combinator libraries such as Haskell's Parsec. A notable advocate of point-free programming is John Backus. In his 1978 speech "Can Programming Be Liberated From the Von Neumann Style?", he wrote:
The lambda expression (with its substitution rules) is capable of defining all possible computable functions of all possible types and of any number of arguments. This freedom and power has its disadvantages as well as its obvious advantages. It is analogous to the power of unrestricted control statements in conventional languages: with unrestricted freedom comes chaos. If one constantly invents new combining forms to suit the occasion, as one can in the lambda calculus, one will not become familiar with the style or useful properties of the few combining forms that are adequate for all purposes. Just as structured programming eschews many control statements to obtain programs with simpler structure, better properties, and uniform methods for understanding their behavior, so functional programming eschews the lambda expression, substitution, and multiple function types. It thereby achieves programs built with familiar functional forms with known useful properties. These programs are so structured that their behavior can often be understood and proven by mechanical use of algebraic techniques similar to those used in solving high school algebra problems.
So here they are. The main advantage of point-free programming is that it forces a structured combinator style which makes equational reasoning natural. Equational reasoning has been particularly advertised by the proponents of the "Squiggol" movement (see [1] [2]), which indeed uses a fair share of point-free combinators and computation/rewriting/reasoning rules.
[1] "An introduction to the Bird-Merteens Formalism", Jeremy Gibbons, 1994
[2] "Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire", Erik Meijer, Maarten Fokkinga and Ross Paterson, 1991
Finally, one cause for the popularity of point-free programming among Haskellites is its relation to category theory. In category theory, morphisms (which could be seen as "transformations between objects") are the basic object of study and computation. While partial results allow reasoning in specific categories to be performed in a pointful style, the common way to build, examine and manipulate arrows is still the point-free style, and other syntaxes such as string diagrams also exhibit this "pointfreeness". There are rather tight links between the people advocating "algebra of programming" methods and users of categories in programming (for example the authors of the banana paper [2] are/were hardcore categorists).
You may be interested in the Pointfree page of the Haskell wiki.
The downside of pointfree style is rather obvious: it can be a real pain to read. The reason why we still love to use variables, despite the numerous horrors of shadowing, alpha-equivalence etc., is that it's a notation that's just so natural to read and think about. The general idea is that a complex function (in a referentially transparent language) is like a complex plumbing system: the inputs are the parameters, they get into some pipes, are applied to inner functions, duplicated (\x -> (x,x)) or forgotten (\x -> (), pipe leading nowhere), etc. And the variable notation is nicely implicit about all that machinery: you give a name to the input, and names on the outputs (or auxiliary computations), but you don't have to describe all the plumbing plan, where the small pipes will go not to be a hindrance for the bigger ones, etc. The amount of plumbing inside something as short as \(f,x,y) -> ((x,y), f x y) is amazing. You may follow each variable individually, or read each intermediate plumbing node, but you never have to see the whole machinery together. When you use a point-free style, all the plumbing is explicit, you have to write everything down, and look at it afterwards, and sometimes it's just plain ugly.
PS: this plumbing vision is closely related to stack programming languages, which are probably the least pointful programming languages (barely) in use. I would recommend trying to do some programming in them just to get a feeling for it (as I would recommend logic programming). See Factor, Cat or the venerable Forth.
I believe the purpose is to be succinct and to express pipelined computations as a composition of functions rather than thinking of threading arguments through. Simple example (in F#) - given:
let sum = List.sum
let sqr = List.map (fun x -> x * x)
Used like:
> sum [3;4;5]
12
> sqr [3;4;5]
[9;16;25]
We could express a "sum of squares" function as:
let sumsqr x = sum (sqr x)
And use like:
> sumsqr [3;4;5]
50
Or we could define it by piping x through:
let sumsqr x = x |> sqr |> sum
Written this way, it's obvious that x is being passed in only to be "threaded" through a sequence of functions. Direct composition looks much nicer:
let sumsqr = sqr >> sum
This is more concise and it's a different way of thinking of what we're doing; composing functions rather than imagining the process of arguments flowing through. We're not describing how sumsqr works. We're describing what it is.
PS: An interesting way to get your head around composition is to try programming in a concatenative language such as Forth, Joy, Factor, etc. These can be thought of as being nothing but composition (Forth : sumsqr sqr sum ;) in which the space between words is the composition operator.
PPS: Perhaps others could comment on the performance differences. It seems to me that composition may reduce GC pressure by making it more obvious to the compiler that there is no need to produce intermediate values as in pipelining; helping make the so-called "deforestation" problem more tractable.
While I'm attracted to the point-free concept and have used it for some things, and agree with all the positives said before, I found these things to be negatives (some are detailed above):
The shorter notation reduces redundancy, but in a heavily structured composition (ramda.js style, or point-free in Haskell, or whatever concatenative language) reading the code is more complex than linearly scanning through a bunch of const bindings and using a symbol highlighter to see which binding goes into what other downstream calculation. Besides the tree vs linear structure, the loss of descriptive symbol names makes the function hard to intuitively grasp. Of course both the tree structure and the loss of named bindings also have a lot of positives as well; for example, functions will feel more general - not bound to some application domain via the chosen symbol names - and the tree structure is semantically present even if bindings are laid out, and can be comprehended sequentially (lisp let/let* style).
Point-free is simplest when just piping through or composing a series of functions, as this also results in a linear structure that we humans find easy to follow. However, threading some interim calculation through multiple recipients is tedious. All kinds of wrapping into tuples, lensing and other painstaking mechanisms go into just making some calculation accessible, which would otherwise be just the multiple use of some value binding. Of course the repeated part can be extracted out as a separate function and maybe it's a good idea anyway, but there are also arguments for some non-short functions, and even if it's extracted, its arguments will have to be somehow threaded through both applications; then there may be a need for memoizing the function so as not to actually repeat the calculation. One will use a lot of converge, lens, memoize, useWith etc.
JavaScript specific: it is harder to casually debug. With a linear flow of let bindings, it's easy to add a breakpoint wherever. With the point-free style, even if a breakpoint is somehow added, the value flow is hard to read, e.g. you can't just query or hover over some variable in the dev console. Also, as point-free is not native in JS, library functions of ramda.js or similar will obscure the stack quite a bit, especially with the obligatory currying.
Code brittleness, especially on nontrivially sized systems and in production. If a new piece of requirement comes in, then the above disadvantages come into play (e.g. it is harder to read the code for the next maintainer, who may be yourself a few weeks down the line, and also harder to trace the dataflow for inspection). But most importantly, even a seemingly small and innocent new requirement can necessitate a whole different structuring of the code. It may be argued that this is a good thing in that it'll be a crystal clear representation of the new thing, but rewriting large swaths of point-free code is very time consuming, and then we haven't mentioned testing. So it feels that the looser, less structured, lexical-assignment-based coding can be more quickly repurposed. Especially if the coding is exploratory, and in the domain of human data with weird conventions (time etc.) that can rarely be captured 100% accurately and there may always be an upcoming request for handling something more accurately or more to the needs of the customer, whichever method leads to faster pivoting matters a lot.
Regarding the point-free variant, the concatenative programming languages, I have to write:
I had a little experience with Joy. Joy is a very simple and beautiful concept with lists. When converting a problem into a Joy function, you have to split your brain into one part for the stack plumbing work and one part for the solution in the Joy syntax. The stack is always handled from the back. Since composition is built into Joy, there is no computing time for a composition combiner.

Explaining pattern matching vs switch

I have been trying to explain the difference between switch statements and pattern matching (F#) to a couple of people, but I haven't really been able to explain it well... most of the time they just look at me and say "so why don't you just use if..then..else".
How would you explain it to them?
EDIT! Thanks everyone for the great answers, I really wish I could mark multiple right answers.
Having formerly been one of "those people", I don't know that there's a succinct way to sum up why pattern-matching is such tasty goodness. It's experiential.
Back when I had just glanced at pattern-matching and thought it was a glorified switch statement, I think that I didn't have experience programming with algebraic data types (tuples and discriminated unions) and didn't quite see that pattern matching was both a control construct and a binding construct. Now that I've been programming with F#, I finally "get it". Pattern-matching's coolness is due to a confluence of features found in functional programming languages, and so it's non-trivial for the outsider-looking-in to appreciate.
I tried to sum up one aspect of why pattern-matching is useful in the second of a short two-part blog series on language and API design; check out part one and part two.
Patterns give you a small language to describe the structure of the values you want to match. The structure can be arbitrarily deep and you can bind variables to parts of the structured value.
This allows you to write things extremely succinctly. You can illustrate this with a small example, such as a derivative function for a simple type of mathematical expressions:
type expr =
    | Int of int
    | Var of string
    | Add of expr * expr
    | Mul of expr * expr;;

let rec d(f, x) =
    match f with
    | Var y when x=y -> Int 1
    | Int _ | Var _ -> Int 0
    | Add(f, g) -> Add(d(f, x), d(g, x))
    | Mul(f, g) -> Add(Mul(f, d(g, x)), Mul(g, d(f, x)));;
Additionally, because pattern matching is a static construct for static types, the compiler can (i) verify that you covered all cases, (ii) detect redundant branches that can never match any value, and (iii) provide a very efficient implementation (with jumps etc.).
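For example (a sketch reusing the expr type above), forgetting a case is caught immediately:

// Missing the Mul case: the compiler emits warning FS0025
// ("Incomplete pattern matches on this expression...") at the match.
let rec countLeaves f =
    match f with
    | Int _ | Var _ -> 1
    | Add(l, r) -> countLeaves l + countLeaves r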
Excerpt from this blog article:
Pattern matching has several advantages over switch statements and method dispatch:
Pattern matches can act upon ints, floats, strings and other types as well as objects.
Pattern matches can act upon several different values simultaneously: parallel pattern matching. Method dispatch and switch are limited to a single value, e.g. "this".
Patterns can be nested, allowing dispatch over trees of arbitrary depth. Method dispatch and switch are limited to the non-nested case.
Or-patterns allow subpatterns to be shared. Method dispatch only allows sharing when methods are from classes that happen to share a base class. Otherwise you must manually factor out the commonality into a separate function (giving it a name) and then manually insert calls from all appropriate places to your unnecessary function.
Pattern matching provides redundancy checking which catches errors.
Nested and/or parallel pattern matches are optimized for you by the F# compiler. The OO equivalent must be written by hand and constantly reoptimized by hand during development, which is prohibitively tedious and error prone, so production-quality OO code tends to be extremely slow in comparison.
Active patterns allow you to inject custom dispatch semantics.
Off the top of my head:
The compiler can tell if you haven't covered all possibilities in your matches
You can use a match as an assignment
If you have a discriminated union, each match can have a different 'type'
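A small F# illustration of those last two points (the Shape type here is just an example):

type Shape =
    | Circle of radius : float
    | Rect of width : float * height : float

// The match is an expression, so its result binds directly to a value,
// and each case destructures a differently-shaped piece of data.
let area shape =
    match shape with
    | Circle r -> System.Math.PI * r * r
    | Rect (w, h) -> w * h

let biggest = [ Circle 1.0; Rect (2.0, 3.0) ] |> List.maxBy area
let label = match biggest with Circle _ -> "a circle" | Rect _ -> "a rectangle"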
Tuples have "," and Variants have Ctor args .. these are constructors, they create things.
Patterns are destructors, they rip them apart.
They're dual concepts.
To put this more forcefully: the notion of a tuple or variant cannot be described merely by its constructor: the destructor is required or the value you made is useless. It is these dual descriptions which define a value.
Generally we think of constructors as data, and destructors as control flow. Variant destructors are alternate branches (one of many), tuple destructors are parallel threads (all of many).
The parallelism is evident in operations like
(f * g) . (h * k) = (f . h) * (g . k)
If you think of control flowing through a function, tuples provide a way to split up a calculation into parallel threads of control.
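In F# terms, a sketch of that law (the both combinator here plays the role of the * above; it is not a built-in):

let both f g (x, y) = (f x, g y)

let show (n : int) = sprintf "%d" n
let addOne n = n + 1
let double n = n * 2

// Composing two pairwise maps gives the same result as pairwise-mapping the compositions:
// (both f g) >> (both h k)  =  both (f >> h) (g >> k)
let lhs = (both show addOne >> both (sprintf "<%s>") double) (42, 10)
let rhs = both (show >> sprintf "<%s>") (addOne >> double) (42, 10)
// lhs = rhs = ("<42>", 22)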
Looked at this way, expressions are ways to compose tuples and variants to make complicated data structures (think of an AST).
And pattern matches are ways to compose the destructors (again, think of an AST).
Switch is the two front wheels.
Pattern-matching is the entire car.
Pattern matches in OCaml, in addition to being more expressive in the several ways that have been described above, also give some very important static guarantees. The compiler will prove for you that the case-analysis embodied by your pattern-match statement is:
exhaustive (no cases are missed)
non-redundant (no cases that can never be hit because they are pre-empted by a previous case)
sound (no patterns that are impossible given the datatype in question)
This is a really big deal. It's helpful when you're writing the program for the first time, and enormously useful when your program is evolving. Used properly, match-statements make it easier to change the types in your code reliably, because the type system points you at the broken match statements, which are a decent indicator of where you have code that needs to be fixed.
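As a small illustration of that workflow, in F# syntax (the same checks exist in OCaml), extending the earlier expr example with a new case immediately points at every match that needs updating:

type expr =
    | Int of int
    | Var of string
    | Add of expr * expr
    | Mul of expr * expr
    | Sub of expr * expr   // newly added case

// Every existing match over expr now triggers an incomplete-match warning (FS0025)
// until it handles Sub, e.g. the derivative gains exactly one extra branch:
let rec d (f, x) =
    match f with
    | Var y when x = y -> Int 1
    | Int _ | Var _ -> Int 0
    | Add (f, g) -> Add (d (f, x), d (g, x))
    | Sub (f, g) -> Sub (d (f, x), d (g, x))
    | Mul (f, g) -> Add (Mul (f, d (g, x)), Mul (g, d (f, x)))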
If-Else (or switch) statements are about choosing different ways to process a value (input) depending on properties of the value at hand.
Pattern matching is about defining how to process a value given its structure, (also note that single case pattern matches make sense).
Thus pattern matching is more about deconstructing values than making choices, this makes them a very convenient mechanism for defining (recursive) functions on inductive structures (recursive union types), which explains why they are so abundantly used in languages like Ocaml etc.
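A quick F# sketch of that point about single-case matches (the EmailAddress type is just an example):

type EmailAddress = EmailAddress of string

// One constructor, one destructor: the pattern in the parameter position simply
// takes the value apart; there is no branching at all.
let domain (EmailAddress addr) = addr.Split('@').[1]

// The same idea with tuples: a single-case match used as a plain binding.
let (width, height) = (1920, 1080)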
PS: You might know the pattern-match and If-Else "patterns" from their ad-hoc use in math;
"if x has property A then y else z" (If-Else)
"some term in p1..pn where .... is the prime decomposition of x.." ((single case) pattern match)
Perhaps you could draw an analogy with strings and regular expressions? You describe what you are looking for, and let the compiler figure out how for itself. It makes your code much simpler and clearer.
As an aside: I find that the most useful thing about pattern matching is that it encourages good habits. I deal with the corner cases first, and it's easy to check that I've covered every case.

Resources