Conceptually I see a value as a single element. I can understand that at the lowest level of hardware the value returned is zero or one. I just see a "value" as returning a single unit. I see a procedure as a multiple unit. For example, a procedure (+ x x) to me seems like it should return "(", ")", "+" , "x". In this example, the value of lambda is the procedure.
What am I missing here?
Scheme is primarily a functional programming language. Functional languages deal with expressions1 (as opposed to statements); expressions are at the core of functional languages pretty much like classes are at the core of object-oriented languages.
In Scheme, functions are expressed as lambda expressions. Since Scheme primarily deals with expressions, and since lambda expressions themselves are expressions, Scheme deals with functions just like any other expression. Therefore, functions are first-class citizens of the language.
I don't think one should feel overly concerned about how exactly all of this translates under the hood in terms of bits and bytes. What plays as a strength in some languages (C/C++) can quickly turn against you here: imperative thinking in Scheme will only get you frustrated, and bounce you right back out to mainstream paradigms and languages.
What functional languages are really about is abstraction, metaprogramming (many Schemes feature powerful syntactic macros), and more abstraction. There is this well-known quote from Peter Deutsch: "Lisp ... made me aware that software could be close to executable mathematics.", I think it sums it up very well.
1 In Scheme (and other dialects of LISP), s-expressions are used to denote expressions. They give the language that distinctive parenthesized syntax.
When the program is compiled the code for a procedure is generated. When you run the program the code for a procedure is stored at a given address. A procedure value will in most implementations consist of a record/struct containing the name of the procedure, the address where the procedure us stored (when you call the procedure, the cpu jumps to this address) and finally in the case of procedure created by a lambda expression a table of the free values.
In Scheme everything is passed by value. When that said values that are not able to be stored in the actual address space of a machine word are pointers to data structures.
An internal procedure (primitive) is treated specially such that it might be just a value while a evaluated lambda expression is multi value object. That object has an address which is the "value" of the procedure. When evaluating a lambda form it turns into such a structure. Example:
;; a lambda form is evaluated into a closure and then
;; that object is the value of variable x
(define x (lambda y y)) ; ==> undefined
;; x is evaluated and turns out to be a closure. Scheme
;; evalautes rest of arguments before applying.
(x 1 2 3) ; ==> (1 2 3)
;; same only that all the arguments also evaluates to the same closure.
(x x x x) ; ==> (#<closure> #<closure> #<closure>)
Related
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed last month.
Improve this question
I'm reading Crafting Interpreters. It's very readable. Now I'm reading
chapter 17 Compiling Expressions and find algorithm:
Vaughan Pratt’s “top-down operator precedence parsing”. The implementation is very brief
and I don't understand it why it works.
So I read Vaughan Pratt’s “top-down operator precedence parsing” paper. It's so old
and not easy to read. I read related blogs about it and spend days reading the
original paper.
related blogs :
https://abarker.github.io/typped/pratt_parsing_intro.html
https://journal.stuffwithstuff.com/2011/03/19/pratt-parsers-expression-parsing-made-easy/
https://matklad.github.io/2020/04/13/simple-but-powerful-pratt-parsing.html
I am now more confident that I can write an implementation. But I still can't see the
trick behind the magic. Here are some of my questions, if I can describe them clealy:
What grammar can Pratt’s parser handle? Floyd Operator Grammar? I even take a look
at Floyd's paper, but it's very abstract. Or Pratt’s parser can handle any Language
as long as it meets the restrictions on page 44
These restrictions on the language, while slightly irksome,...
On page 45,Theorem 2, Proof.
First assign even integers (to make room for the followin~terpolations) to the data type classes.
Then to each argument position assign an integer lying strictly (where possible) between the integers
corresponding to the classes of the argument and result types.
On page 44,
The idea is to assign data types to classes and then to totally order
the classes.
An example might be, in ascending order,Outcomes (e.g., the pseudo-result of “print”), Booleans,
Graphs (e.g. trees, lists, plexes), Strings, Algebraics (e.g. integers, complex nos, polynomials)...
I can't figure out what the term "data type" means in this paper. If it means primitive
data types in a programming language, like boolean , int , char in Java, then the following exmaple
may be Counterexample
1 + 2 * 3
for +, it's argument type is number, say assign integer 2 to data type number class. +'s result data type is
also a number. so + must have the integer 2. But the same is for *. In this way + and * would have the same
binding power.
I guess data type in this paper is the AST Node type. So +'s result type is term, *'s result type is factor,
which will have a bigger integger than +'s. But I can't be sure.
By data type, Pratt meant, roughly, "primitive data type in the programming language being parsed". (Although he included some things which are often not thought about as types, unless you program in C or C++, and I don't think he contemplated user-defined types.)
But Pratt is not really talking about a parsing algorithm on page 44. What he's talking about is language design. He wants languages to be designed in such a way that a simple semantic rule involving argument and result types can be used to determine operator precedence. He wants this rule, apparently, in order to save the programmer the task of memorizing arbitrary operator precedence tables (or to have to deal with different languages ordering operators in different ways.)
These are indeed laudable goals, but I'm afraid that the train had already left the station when Pratt was writing that paper. Half a century later, it's evident that we're never going to achieve that level of interlanguage consistency. Fortunately, we can always use parentheses, and we can even write compilers which nag at you about not using redundant parentheses until you give up and write them.
Anyway, that paragraph probably contravened SO's no-opinions policy, so let's get back to Pratt's paper. The rule he proposes is that all of a languages primitive datatypes be listed in a fixed total ordering, with the hope that all language designers will choose the same ordering. (I use the word "dominates" to describe this ordering: type A dominates another type B if A comes later in Pratt's list than B. A type does not dominate itself.)
At one end of the ordering is the null type, which is the result type of an operator which doesn't have a return value. Pratt calls this type "Outcome", since an operator which doesn't return anything must have had some side-effect --its "outcome"-- in order to not be pointless. At the other end of the ordering is what C++ calls a reference type: something which can be used as an argument to an assignment operator. And he then proposes a semantic rule: no operator can produce a result whose type dominates the type of one or more of its arguments, unless the operator's syntax unambiguously identifies the arguments.
That last exception is clearly necessary, since there will always be operators which produce types subordinate to the types of their arguments. Pratt's example is the length operator, which in his view must require parentheses because the Integer type dominates the String and Collection types, so length x, which returns an Integer given a String, cannot be legal. (You could write length(x) or |x| (provided | is not used for other purposes), because those syntaxes are unambiguous.)
It's worth noting that this rule must also apply to implicit coercions, which is equivalent to saying that the rule applies to all overloads of a single operator symbol. (C++ was still far in the future when Pratt was writing, but implicit coercions were common.)
Given that total ordering on types and the restriction (which Pratt calls "slightly irksome") on operator syntax, he then proposes a single simple syntactic rule: operator associativity is resolved by eliminating all possibilities which would violate type ordering. If that's not sufficient to resolve associativity, it can only be the case that there is only one type between the result and argument types of the two operators vying for precedence. In that case, associativity is to the left.
Pratt goes on to prove that this rule is sufficient to remove all ambiguity, and furthermore that it is possible to derive Floyd's operator precedence relation from type ordering (and the knowledge about return and argument types of every operator). So syntactically, Pratt's languages are similar to Floyd's operator precedence grammars.
But remember that Pratt is talking about language design, not parsing. (Up to that point of the paper.) Floyd, on the other hand, was only concerned with parsing, and his parsing model would certainly allow a prefix length operator. (See, for example, C's sizeof operator.) So the two models are not semantically equivalent.
This reduces the amount of memorization needed by someone learning the language: they only have to memorize the order of types. They don't need to try to memorize the precedence between, say, a concatenation operator and a division operator, because they can just work it out from the fact that Integer dominates String. [Note 1]
Unfortunately, Pratt has to deal with the fact that "left associative unless you have a type mismatch" really does not cover all the common expectations for expression parsing. Although not every language complies, most of us would be surprised to find that a*4 + b*6 was parsed as ((a * 4) + b) * 6, and would demand an explanation. So Pratt proposes that it is possible to make an exception by creating "pseudotypes". We can pretend that the argument and return types of multiplication and division are different from (and dominate) the argument and return types of addition and subtraction. Then we can allow the Product type to be implicitly coerced to the Sum type (conceptually, because the coercion does nothing), thereby forcing the desired parse.
Of course, he has now gone full circle: the programmer needs to memorise both the type ordering rules, but also the pseudotype ordering rules, which are nothing but precedence rules in disguise.
The rest of the paper describes the actual algorithm, which is quite clever. Although it is conceptually identical to Floyd's operator precedence parsing, Pratt's algorithm is a top-down algorithm; it uses the native call stack instead of requiring a separate stack algorithm, and it allows the parser to interpolate code with production parsing without waiting for the production to terminate.
I think I've already deviated sufficiently from SO's guidelines in the tone of this answer, so I'll leave it at that, with no other comment about the relative virtues of top-down and bottom-up control flows.
Notes
Integer dominates String means that there is no implicit coercion from a String to an Integer, the same reason that the length operator needs to be parenthesised. There could be an implicit coercion from Integer to String. So the expression a divide b concatenate c must be parsed as (a divide b) concatenate c. a divide (b concatenate c) is disallowed and so the parser can ignore the possibility.
I was curious that are there any ternary operator being used in programming language except ?: operator. And could found only 2 from wikipedia
Is it only operator we have been used? Are there any more than these?
Element update
Another useful class of ternary operator, especially in functional languages, is the "element update" operation. For example, OCaml expressions have three kinds of update syntax:
a.b<-c means the record a where field b has value c
a.(b)<-c means the array a where index b has value c
a.[b]<-c means the string a where index b has value c
Note that these are not "update" in the sense of "assignment" or "modification"; the original object is unchanged, and a new object is yielded that has the stated properties. Consequently, these operations cannot be regarded as a simple composition of two binary operators.
Similarly, the Isabelle theorem prover has:
a(|b := c|) meaning the record a where field b has value c
Array slice
Yet another sort of ternary operator is array slice, for example in Python we have:
a[b:c] meaning an array whose first element is a[b] and last element is a[c-1]
In fact, Python has a quaternary form of slice:
a[b:c:d] meaning an array whose elements are a[b + n*d] where n ranges from 0 to the largest value such that b + n*d < c
Bash/ksh variable substitution
Although quite obscure, bash has several forms of variable expansion (apparently borrowed from ksh) that are ternary:
${var:pos:len} is a maximum of len characters from $var, starting at pos
${var/Pattern/Replacement} is $var except the first substring within it that matches Pattern is replaced with Replacement
${var//Pattern/Replacement} is the same except all matches are replaced
${var/#Pattern/Replacement} is like the first case except Pattern has to match a prefix of $var
${var/%Pattern/Replacement} is like the previous except for matching a suffix
These are borderline in my opinion, being close to ordinary functions that happen to accept three arguments, written in the sometimes baroque style of shell syntax. But, I include them as they are entirely made of non-letter symbols.
Congruence modulo
In mathematics, an important ternary relation is congruence modulo:
a ≡ b (mod c) is true iff a and b both belong to the same equivalence class in c
I'm not aware of any programming language that has this, but programming languages often borrow mathematical notation, so it's possible it exists in an obscure language. (Of course, most programming languages have mod as a binary operator, allowing the above to be expressed as (a mod c) == (b mod c).) Furthermore, unlike the bash variable substitution syntax, if this were introduced in some language, it would not be specific to that language since it is established notation elsewhere, making it more similar to ?: in ubiquity.
Excluded
There are some categories of operator I've chosen to exclude from the category of "ternary" operator:
Operations like function application (a(b,c)) that could apply to any number of operators.
Specific named functions (e.g., f(a,b,c)) that accept three arguments, as there are too many of them for any to be interesting in this context.
Operations like SUM (Σ) or let that function as a binding introduction of a new variable, since IMO a ternary operator ought to act on three already-existing things.
One-letter operators in languages like sed that happen to accept three arguments, as these really are like the named function case, and the language just has a very terse naming convention.
Well it’s not a ternary operator per-say but I do think the three way comparison operator is highly underrated.
The ternary operator is appropriate when a computation has to take place, even if I cannot use the effect inside of an if/else statement or switch statement Consequently, 0 or the DEFAULT VALUE is treated as a DEFAULT VALUE when I try the computation.
The if/else or switch statements require me to enumerate every case that can take place and are only helpful if the ability to discriminate between a numeric value and a branching choice can help. In some cases, it is clear why it won't help if a condition test exists, simply because I am either too early or too late to do something about the condition, even though some other function is unable to test for the given condition.
The ternary operator forces a computation to have the effect that can pass through the code that follows it. Other types of condition tests resort to using, for instance, the && and || operators which can't guarantee a pass through.
I am getting the user to input a function, e.g. y = 2x^2 + 3, as a string. What I am looking to do is to enter that string into TChart and for TChart to graph the function.
As far as I know, TChart/TeeChart will only accept X values that are assigned values, e.g. -10 to 10 for X, so the X value would need to be calculated each time - this isn't an issue.
The issue is getting each part of the inputted function and substituting the X-values into each part. The workaround I have found is to get the user to enter the degree for each part of the function, e.g. 2 for X^2, 3 for X^3, etc. but is there a cleaner way of doing this?
If I could convert the inputted string into a Mathematical formula which TeeChart would accept, that would be the ideal outcome.
Saying that you can't use external units effectively makes your question unanswerable in the SO format, because the topic is far broader (and deeper) that can comfortably be dealt with in SO's Q&A format. So the following is at best an outline:
If you want to, or have to, write a DIY expression evaluator, one way to do it is to proceed as follows:
Write yourself a class that takes a string as input and snips it up into a series of symbols, aka "tokens" which represent the component parts of the expression, e.g, numbers, operators, parentheses, names of functions, names of variables, etc; these tokens might themselves be records or class instances and need to include a mechanisms for storing values associated with particular symbols (e.g. the tokens that represent numbers in the input). This step is called "tokenisation" or "lexing". Store the resulting list of symbols in a list or similiar structure. This class needs to implement a mechanism to retrieve the next symbol from the list (usually, this method is called something like "NextToken") and indicate whether there are any symbols left. This class also needs a mechanism to "put back" a symbol (or, equivalently, "peek" the symbol following the current one).
Then, write yourself a s/ware machine which takes the tokenised symbols and "evaluates" the list of symbols to produce the (mathematical) result you're after. This step is an order of magnitude or two more difficult than the tokenisation step. There are numerous ways to do it. As I said an a comment earlier, a recursive descent parser is probably the most tractable approach if you've never done anything like this before. There are countless examples in textbooks, but here's a link to an article about a Delphi implementation that should be understandable as an intro:
http://www8.umoncton.ca/umcm-deslierres_michel/Calcs/ParsingMathExpr-1.html
That article begins by noting that there are numerous pre-existing Delphi expression evaluators but makes the point that they are not necessarily the best place to start for someone wanting to learn how to write an evaluator/parser rather than just use one. Instead it goes through the coding of an evaluator to implement this simple expression grammar:
expression : term | term + term | term − term
term : factor | factor * factor | factor / factor
factor : number | ( expression ) | + factor | − factor
(the vertical bar | denotes ‘or’)
The article has a link to a second part which shows had to add exponentiation to the evaluator - this is trickier than it might sound and involves issues of ambiguity: e.g. how to evaluate - and what does it mean to write - an expression like
x^y^z
? This relates to the issue of "associativity": most operators are "left associative" which means that they bind more tightly to what's on the left of them than what's on their right. The exponentiation operator is an example of the reverse, where the operator binds more tightly to what's on its right.
Have fun!
By the way, you used to see suggestions to implement an evaluator using the "shunting yard algorithm"
http://en.wikipedia.org/wiki/Shunting-yard_algorithm
to convert an "infix" expression where the operators are between the operands, as in 1 + 3 * 4 to RPN (reverse Polish notation), as used on older HP calculators. The reason to do that was that RPN makes for much more efficient evaluation of an expression that the infix equivalent. Ymmv, but personally I found that implementing the SY algorithm properly was actually trickier than learning how to write an evaluator in the expression/term/factor style.
Fwiw, RPN is the basis of the Forth programming language, http://en.wikipedia.org/wiki/Forth_%28programming_language%29, so you could write a Forth implementation in Delphi if you wanted!
Why is it that functions in F# and OCaml (and possibly other languages) are not by default recursive?
In other words, why did the language designers decide it was a good idea to explicitly make you type rec in a declaration like:
let rec foo ... = ...
and not give the function recursive capability by default? Why the need for an explicit rec construct?
The French and British descendants of the original ML made different choices and their choices have been inherited through the decades to the modern variants. So this is just legacy but it does affect idioms in these languages.
Functions are not recursive by default in the French CAML family of languages (including OCaml). This choice makes it easy to supercede function (and variable) definitions using let in those languages because you can refer to the previous definition inside the body of a new definition. F# inherited this syntax from OCaml.
For example, superceding the function p when computing the Shannon entropy of a sequence in OCaml:
let shannon fold p =
let p x = p x *. log(p x) /. log 2.0 in
let p t x = t +. p x in
-. fold p 0.0
Note how the argument p to the higher-order shannon function is superceded by another p in the first line of the body and then another p in the second line of the body.
Conversely, the British SML branch of the ML family of languages took the other choice and SML's fun-bound functions are recursive by default. When most function definitions do not need access to previous bindings of their function name, this results in simpler code. However, superceded functions are made to use different names (f1, f2 etc.) which pollutes the scope and makes it possible to accidentally invoke the wrong "version" of a function. And there is now a discrepancy between implicitly-recursive fun-bound functions and non-recursive val-bound functions.
Haskell makes it possible to infer the dependencies between definitions by restricting them to be pure. This makes toy samples look simpler but comes at a grave cost elsewhere.
Note that the answers given by Ganesh and Eddie are red herrings. They explained why groups of functions cannot be placed inside a giant let rec ... and ... because it affects when type variables get generalized. This has nothing to do with rec being default in SML but not OCaml.
One crucial reason for the explicit use of rec is to do with Hindley-Milner type inference, which underlies all staticly typed functional programming languages (albeit changed and extended in various ways).
If you have a definition let f x = x, you'd expect it to have type 'a -> 'a and to be applicable on different 'a types at different points. But equally, if you write let g x = (x + 1) + ..., you'd expect x to be treated as an int in the rest of the body of g.
The way that Hindley-Milner inference deals with this distinction is through an explicit generalisation step. At certain points when processing your program, the type system stops and says "ok, the types of these definitions will be generalised at this point, so that when someone uses them, any free type variables in their type will be freshly instantiated, and thus won't interfere with any other uses of this definition."
It turns out that the sensible place to do this generalisation is after checking a mutually recursive set of functions. Any earlier, and you'll generalise too much, leading to situations where types could actually collide. Any later, and you'll generalise too little, making definitions that can't be used with multiple type instantiations.
So, given that the type checker needs to know about which sets of definitions are mutually recursive, what can it do? One possibility is to simply do a dependency analysis on all the definitions in a scope, and reorder them into the smallest possible groups. Haskell actually does this, but in languages like F# (and OCaml and SML) which have unrestricted side-effects, this is a bad idea because it might reorder the side-effects too. So instead it asks the user to explicitly mark which definitions are mutually recursive, and thus by extension where generalisation should occur.
There are two key reasons this is a good idea:
First, if you enable recursive definitions then you can't refer to a previous binding of a value of the same name. This is often a useful idiom when you are doing something like extending an existing module.
Second, recursive values, and especially sets of mutually recursive values, are much harder to reason about then are definitions that proceed in order, each new definition building on top of what has been already defined. It is nice when reading such code to have the guarantee that, except for definitions explicitly marked as recursive, new definitions can only refer to previous definitions.
Some guesses:
let is not only used to bind functions, but also other regular values. Most forms of values are not allowed to be recursive. Certain forms of recursive values are allowed (e.g. functions, lazy expressions, etc.), so it needs an explicit syntax to indicate this.
It might be easier to optimize non-recursive functions
The closure created when you create a recursive function needs to include an entry that points to the function itself (so the function can recursively call itself), which makes recursive closures more complicated than non-recursive closures. So it might be nice to be able to create simpler non-recursive closures when you don't need recursion
It allows you to define a function in terms of a previously-defined function or value of the same name; although I think this is bad practice
Extra safety? Makes sure that you are doing what you intended. e.g. If you don't intend it to be recursive but you accidentally used a name inside the function with the same name as the function itself, it will most likely complain (unless the name has been defined before)
The let construct is similar to the let construct in Lisp and Scheme; which are non-recursive. There is a separate letrec construct in Scheme for recursive let's
Given this:
let f x = ... and g y = ...;;
Compare:
let f a = f (g a)
With this:
let rec f a = f (g a)
The former redefines f to apply the previously defined f to the result of applying g to a. The latter redefines f to loop forever applying g to a, which is usually not what you want in ML variants.
That said, it's a language designer style thing. Just go with it.
A big part of it is that it gives the programmer more control over the complexity of their local scopes. The spectrum of let, let* and let rec offer an increasing level of both power and cost. let* and let rec are in essence nested versions of the simple let, so using either one is more expensive. This grading allows you to micromanage the optimization of your program as you can choose which level of let you need for the task at hand. If you don't need recursion or the ability to refer to previous bindings, then you can fall back on a simple let to save a bit of performance.
It's similar to the graded equality predicates in Scheme. (i.e. eq?, eqv? and equal?)
I have been trying to explain the difference between switch statements and pattern matching(F#) to a couple of people but I haven't really been able to explain it well..most of the time they just look at me and say "so why don't you just use if..then..else".
How would you explain it to them?
EDIT! Thanks everyone for the great answers, I really wish I could mark multiple right answers.
Having formerly been one of "those people", I don't know that there's a succinct way to sum up why pattern-matching is such tasty goodness. It's experiential.
Back when I had just glanced at pattern-matching and thought it was a glorified switch statement, I think that I didn't have experience programming with algebraic data types (tuples and discriminated unions) and didn't quite see that pattern matching was both a control construct and a binding construct. Now that I've been programming with F#, I finally "get it". Pattern-matching's coolness is due to a confluence of features found in functional programming languages, and so it's non-trivial for the outsider-looking-in to appreciate.
I tried to sum up one aspect of why pattern-matching is useful in the second of a short two-part blog series on language and API design; check out part one and part two.
Patterns give you a small language to describe the structure of the values you want to match. The structure can be arbitrarily deep and you can bind variables to parts of the structured value.
This allows you to write things extremely succinctly. You can illustrate this with a small example, such as a derivative function for a simple type of mathematical expressions:
type expr =
| Int of int
| Var of string
| Add of expr * expr
| Mul of expr * expr;;
let rec d(f, x) =
match f with
| Var y when x=y -> Int 1
| Int _ | Var _ -> Int 0
| Add(f, g) -> Add(d(f, x), d(g, x))
| Mul(f, g) -> Add(Mul(f, d(g, x)), Mul(g, d(f, x)));;
Additionally, because pattern matching is a static construct for static types, the compiler can (i) verify that you covered all cases (ii) detect redundant branches that can never match any value (iii) provide a very efficient implementation (with jumps etc.).
Excerpt from this blog article:
Pattern matching has several advantages over switch statements and method dispatch:
Pattern matches can act upon ints,
floats, strings and other types as
well as objects.
Pattern matches can act upon several
different values simultaneously:
parallel pattern matching. Method
dispatch and switch are limited to a single
value, e.g. "this".
Patterns can be nested, allowing
dispatch over trees of arbitrary
depth. Method dispatch and switch are limited
to the non-nested case.
Or-patterns allow subpatterns to be
shared. Method dispatch only allows
sharing when methods are from
classes that happen to share a base
class. Otherwise you must manually
factor out the commonality into a
separate function (giving it a
name) and then manually insert calls
from all appropriate places to your
unnecessary function.
Pattern matching provides redundancy
checking which catches errors.
Nested and/or parallel pattern
matches are optimized for you by the
F# compiler. The OO equivalent must
be written by hand and constantly
reoptimized by hand during
development, which is prohibitively
tedious and error prone so
production-quality OO code tends to
be extremely slow in comparison.
Active patterns allow you to inject
custom dispatch semantics.
Off the top of my head:
The compiler can tell if you haven't covered all possibilities in your matches
You can use a match as an assignment
If you have a discriminated union, each match can have a different 'type'
Tuples have "," and Variants have Ctor args .. these are constructors, they create things.
Patterns are destructors, they rip them apart.
They're dual concepts.
To put this more forcefully: the notion of a tuple or variant cannot be described merely by its constructor: the destructor is required or the value you made is useless. It is these dual descriptions which define a value.
Generally we think of constructors as data, and destructors as control flow. Variant destructors are alternate branches (one of many), tuple destructors are parallel threads (all of many).
The parallelism is evident in operations like
(f * g) . (h * k) = (f . h * g . k)
if you think of control flowing through a function, tuples provide a way to split up a calculation into parallel threads of control.
Looked at this way, expressions are ways to compose tuples and variants to make complicated data structures (think of an AST).
And pattern matches are ways to compose the destructors (again, think of an AST).
Switch is the two front wheels.
Pattern-matching is the entire car.
Pattern matches in OCaml, in addition to being more expressive as mentioned in several ways that have been described above, also give some very important static guarantees. The compiler will prove for you that the case-analysis embodied by your pattern-match statement is:
exhaustive (no cases are missed)
non-redundant (no cases that can never be hit because they are pre-empted by a previous case)
sound (no patterns that are impossible given the datatype in question)
This is a really big deal. It's helpful when you're writing the program for the first time, and enormously useful when your program is evolving. Used properly, match-statements make it easier to change the types in your code reliably, because the type system points you at the broken match statements, which are a decent indicator of where you have code that needs to be fixed.
If-Else (or switch) statements are about choosing different ways to process a value (input) depending on properties of the value at hand.
Pattern matching is about defining how to process a value given its structure, (also note that single case pattern matches make sense).
Thus pattern matching is more about deconstructing values than making choices, this makes them a very convenient mechanism for defining (recursive) functions on inductive structures (recursive union types), which explains why they are so abundantly used in languages like Ocaml etc.
PS: You might know the pattern-match and If-Else "patterns" from their ad-hoc use in math;
"if x has property A then y else z" (If-Else)
"some term in p1..pn where .... is the prime decomposition of x.." ((single case) pattern match)
Perhaps you could draw an analogy with strings and regular expressions? You describe what you are looking for, and let the compiler figure out how for itself. It makes your code much simpler and clearer.
As an aside: I find that the most useful thing about pattern matching is that it encourages good habits. I deal with the corner cases first, and it's easy to check that I've covered every case.