Hyperledger documentation questions

I am going through the Hyperledger documentation on architecture. I am not able to follow some of the symbols and text. For example,
Both V and N contain a special element \bot,
What does \bot mean?
for k\in K and v\in V,
What do k\in and v\in mean?

Note that the architecture section on the Hyperledger documentation is highly mathematical. Some of the symbols (or intended symbols) may have come from another source and are not properly rendered in the document.
The \bot symbol in LaTeX represents the empty type or falsum: ⊥. In the particular case of the state mapping K → (V × N), the author is just trying to explain that all values and versions are empty when a key is initialized.
The \in symbol represents the relation "is a member of": k ∈ K and v ∈ V.
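For intuition only (this is not Hyperledger's actual API, just a toy model), here is a tiny Python sketch of the state as a map from keys K to (value, version) pairs, with None standing in for ⊥:

# Illustrative only: the state maps each key to a (value, version) pair,
# with None standing in for the "empty" element ⊥.
state = {}

def get(key):
    # an uninitialized key has value ⊥ and version ⊥
    return state.get(key, (None, None))

def put(key, value, version):
    state[key] = (value, version)

print(get("k1"))          # (None, None): both value and version are "empty"
put("k1", "hello", 1)
print(get("k1"))          # ('hello', 1)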

Related

How to calculate Follow set of a production rule that has only ONE symbol on the right side

So I have this grammar:
S -> (D)
D -> EF
E -> a|b|S
F -> *D | +D | ε
First of all, the book's solution uses the rule "for P -> pBq, First(q) - {ε} is a subset of FOLLOW(B)" for the production D -> EF, but that right-hand side has only two symbols. Do we assume an ε in front of E (ε being the p in pBq)?
And secondly, I can't understand how to calculate Follow(E).
FOLLOW(E) consists of every terminal symbol which can immediately follow E in some derivation step. That's the precise definition; it's not very complicated.
For a simple grammar, you should be able to figure out all the FOLLOW sets just by looking at the grammar and applying a little bit of common sense. It would probably be a good idea to do that, since it will give you a better idea of how the algorithm works.
As a side note, it's maybe worth mentioning that ε is not a thing. Or at least, it's not a grammar symbol. It's one of several conventions used to make the empty sequence visible, just like 0 is a way to make nothing visible. Sometimes that's useful, but it's important to not let it confuse you. (Abuse of notation is endemic in mathematics, which can be frustrating.)
So, what can follow E? E only appears in one place on the right-hand side of that grammar, in the production D → E F. So clearly any symbol which could be the first symbol of F must be in FOLLOW(E). The symbols which could be at the start of F are + and *, since as mentioned, ε is not a grammar symbol. (Many definitions of FIRST allow ε to be a member of that set, along with any actual terminal symbol. That's an example of the abuse of notation I was talking about in the previous paragraph, since it makes it look like ε is a terminal symbol. But it isn't. It's nothing.)
F is what we call a "nullable" non-terminal, because it can derive the empty sequence (which was written as ε so that you can see it). In other words, it's possible for F to disappear completely in a derivation step. And if it does disappear, then E is effectively at the end of the production D → E F. If E is at the end of D, then it can be followed by anything which could follow D, which includes ). D can also appear at the end of F's productions, which means that D could be followed by anything which could follow F; but since everything in FOLLOW(F) comes from FOLLOW(D) in the first place, that adds no new information.
So it's easy to see that FOLLOW(E) = {*, +, )}, and you can use that to check your understanding of any algorithm to compute follow sets.
Now, I don't know what book you are referring to (and it would have been courteous to mention that in your original question; sources should always be correctly cited). But the book I happen to have in front of me –the Dragon Book– has a pretty similar algorithm. The Dragon book uses a simple set of conventions for writing statements like that. Probably your book does too, but it might not be the same convention. You should check what it says and make sure that you copied the statement correctly, respecting whatever formatting is used to indicate what the symbols stand for.
In the Dragon book, some of the conventions include:
Lower-case letters at the start of the alphabet (a, b, c, …) are terminals (as well as actual symbols like * and +).
Upper-case letters at the start of the alphabet (A, B, C, …) are non-terminals.
S is the start symbol.
Upper-case letters at the end of the alphabet (X, Y, Z) stand for arbitrary grammar symbols (either terminals or non-terminals).
$ is the marker used to indicate the end of the input.
Lower-case Greek letters (α, β, γ, …) are possibly-empty strings of grammar symbols.
The phrase "possibly empty" is very important, so I'm repeating it.
With that convention, they write the rules for computing the FOLLOW set:
Place $ in FOLLOW(S).
For every production A → αBβ, copy everything from FIRST(β) except ε into FOLLOW(B).
If there is a production A → αB or a production A → αBβ where FIRST(β) contains ε, place everything in FOLLOW(A) into FOLLOW(B).
As mentioned above, α is a possibly-empty string of grammar symbols. So it might not be visible.
Keep doing steps 2 and 3 until no new symbols are added to any follow set.
I'm pretty sure that the algorithm in your book differs only in notation conventions.
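In case it helps to see those rules executed, here is a rough Python sketch (my own illustration, not taken from any book) that applies them to the grammar in your question and confirms FOLLOW(E) = {*, +, )}:

# Illustrative only: the rules above, applied to the question's grammar.
# [] stands for the empty (ε) production.
grammar = {
    "S": [["(", "D", ")"]],
    "D": [["E", "F"]],
    "E": [["a"], ["b"], ["S"]],
    "F": [["*", "D"], ["+", "D"], []],
}
nonterminals = set(grammar)

# Which non-terminals can derive the empty sequence?
nullable = set()
changed = True
while changed:
    changed = False
    for A, prods in grammar.items():
        if A not in nullable and any(all(X in nullable for X in p) for p in prods):
            nullable.add(A)
            changed = True

# FIRST sets of the non-terminals (a terminal's FIRST is just itself).
first = {A: set() for A in grammar}
changed = True
while changed:
    changed = False
    for A, prods in grammar.items():
        for p in prods:
            for X in p:
                add = first[X] if X in nonterminals else {X}
                if not add <= first[A]:
                    first[A] |= add
                    changed = True
                if X not in nullable:
                    break

def first_of_seq(seq):
    # FIRST of an arbitrary (possibly empty) sequence of grammar symbols
    out = set()
    for X in seq:
        out |= first[X] if X in nonterminals else {X}
        if X not in nullable:
            break
    return out

# FOLLOW: place $ in FOLLOW(S), then apply steps 2 and 3 until nothing changes.
follow = {A: set() for A in grammar}
follow["S"].add("$")
changed = True
while changed:
    changed = False
    for A, prods in grammar.items():
        for p in prods:
            for i, B in enumerate(p):
                if B not in nonterminals:
                    continue
                beta = p[i + 1:]
                new = first_of_seq(beta)
                if all(X in nullable for X in beta):   # beta is empty or can vanish
                    new |= follow[A]
                if not new <= follow[B]:
                    follow[B] |= new
                    changed = True

print(follow["E"])   # {'*', '+', ')'} (set order may vary)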

What is `where .force`?

I've been playing around with the idea of writing programs that run on Streams and properties with them, but I feel that I am stuck even with the simplest of things. When I look at the definition of repeat in Codata/Streams in the standard library, I find a construction that I haven't seen anywhere in Agda: λ where .force →.
Here, an excerpt of a Stream defined with this weird feature:
repeat : ∀ {i} → A → Stream A i
repeat a = a ∷ λ where .force → repeat a
Why does where appear in the middle of the lambda function definition, and what is the purpose of .force if it is never used?
I might be asking something that is in the documentation, but I can't figure out how to search for it.
Also, is there a place where I can find documentation to use "Codata" and proofs with it? Thanks!
Why does where appear in the middle of the lambda function definition?
Quoting the docs:
Anonymous pattern matching functions can be defined using one of the
two following syntaxes:
\ { p11 .. p1n -> e1 ; … ; pm1 .. pmn -> em }
\ where p11 .. p1n -> e1 … pm1 .. pmn -> em
So λ where introduces an anonymous pattern matching function. force is the field of Thunk, and .force is a copattern in postfix notation (originally I said nonsense here, but thanks to @Cactus it's now fixed; see his answer below).
Also, is there a place where I can find documentation to use "Codata" and proofs with it? Thanks!
Check out these papers:
Normalization by Evaluation in the Delay Monad: A Case Study for Coinduction via Copatterns and Sized Types
Equational Reasoning about Formal Languages in Coalgebraic Style
Guarded Recursion in Agda via Sized Types
As one can see in the definition of Thunk, force is the field of the Thunk record type:
record Thunk {ℓ} (F : Size → Set ℓ) (i : Size) : Set ℓ where
  coinductive
  field force : {j : Size< i} → F j
So in the pattern-matching lambda, .force is not a dot pattern (why would it be? there is nothing prescribing the value of the parameter), but instead is simply syntax for the record field selector force. So the above code is equivalent to making a record with a single field called force with the given value, using copatterns:
repeat a = a ∷ as
  where
    force as = repeat a
or, which is actually where the .force syntax comes from, using postfix projection syntax:
repeat a = a ∷ as
  where
    as .force = repeat a
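If an operational picture helps, here is a rough Python analogy (illustrative only; Python has no copatterns, so the field selection becomes an explicit call). The thunk wraps the tail so it is only computed when forced, which is what keeps the corecursion productive:

# Rough analogy of Thunk/Stream in Python; not Agda semantics.
class Thunk:
    def __init__(self, force):
        self.force = force            # zero-argument function producing the value

class Stream:
    def __init__(self, head, tail):
        self.head = head              # the element in front
        self.tail = tail              # a Thunk whose force() yields the rest

def repeat(a):
    # the tail is not built here; it is built only when someone forces it
    return Stream(a, Thunk(lambda: repeat(a)))

s = repeat(1)
print(s.head, s.tail.force().head, s.tail.force().tail.force().head)   # 1 1 1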

Agda: How to apply +-right-identity to match types

Somewhere in my code, I have a hole which expects a natural number; let's call it n for our purposes. I have a function which returns me an n + 0.
Data.Nat.Properties.Simple contains a proof +-right-identity of the following type:
+-right-identity : ∀ n → n + 0 ≡ n
I'm not familiar enough with the Agda syntax and stdlib yet to know how to easily use this proof to convince the type checker that I can use my value.
More generally, how do I use a relation x ≡ y to transform a given x into y?
I found an answer in this thread: Agda Type-Checking and Commutativity / Associativity of +
For future readers, the keyword I was looking for was rewrite.
By appending rewrite +-right-identity n to the pattern matching (before the = sign), Agda "learned" about this equality.

Determining FOLLOW sets of CFG

On this page the author explains how to determine the FOLLOW sets of a CFG. Under the headline Syntax Analysis Goal: FOLLOW Sets he states:
Steps to Make the Follow Set
Conventions: a, b, and c represent a terminal or non-terminal. a* represents zero or more terminals or non-terminals (possibly both). a+ represents one or more... D is a non-terminal.
Place an End of Input token ($) into the starting rule's follow set.
Suppose we have a rule R → a*Db. Everything in First(b) (except for ε) is added to Follow(D). If First(b) contains ε then everything in Follow(R) is put in Follow(D).
Finally, if we have a rule R → a*D, then everything in Follow(R) is placed in Follow(D).
The Follow set of a terminal is an empty set.
So far so good. But in the box below this item, we read:
[...] Step 2 on rule 1 (N → V = E) indicates that first(=) is in Follow(V).
Now this is the part I don't understand. When he says that First(=) is in Follow(V), he obviously maps = to b and V to D (b and D from the explanation in the first box). But (a*)(D)(b) does not match ()(V)(=)(E).
Am I reading this completely wrong, or did the author maybe write a*Db instead of a*Dba*?
(Especially if you read this on wikipedia: "FOLLOW(I) of an Item I [A → α • B β, x] is the set of terminals that can appear after nonterminal B, where α, β are arbitrary symbol strings, and x is an arbitrary lookahead terminal.")
Yes, he meant:
R → a* D b*
and since b* could be zero symbols, i.e. ε, the final rule (for R → a*D) is unneeded. Remember that FIRST is defined on arbitrary sequences of symbols.
In other words, for:
A → α B β
Every terminal in FIRST(β) is in FOLLOW(B), and
If β ⇒* ε, then everything in FOLLOW(A) is in FOLLOW(B).
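To make those two rules concrete on the production from your question, N → V = E, here is a tiny Python sketch (illustrative only; the FIRST sets for E and V below are hypothetical placeholders, since the full grammar is on the linked page). Here α is empty, B is V, and β is the sequence "= E", so FIRST(β) = {=} and = lands in FOLLOW(V):

# Illustrative only; FIRST sets for E and V are hypothetical placeholders.
def first_of_sequence(seq, first, nullable):
    # FIRST of an arbitrary sequence of grammar symbols
    out = set()
    for sym in seq:
        out |= first.get(sym, {sym})   # a terminal's FIRST is just itself
        if sym not in nullable:
            break
    return out

first = {"E": {"x", "*"}, "V": {"x", "*"}}   # hypothetical
nullable = set()

rhs = ["V", "=", "E"]                  # right-hand side of N -> V = E
beta = rhs[rhs.index("V") + 1:]        # the symbols after V, i.e. ['=', 'E']
print(first_of_sequence(beta, first, nullable))   # {'='}  ->  '=' goes into FOLLOW(V)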
Here's what Aho, Sethi & Ullman say in the dragon book:
Formally, we say LR(1) item [A → α·β, a] is valid for a viable prefix γ if there is a derivation S ⇒* δAw ⇒ δαβw,
where γ = δα, and either a is the first symbol of w, or w is ε and a is $.
(The ⇒'s above are marked rm, meaning right-most derivation; in other words, in every derivation step, the right-most non-terminal is substituted with one of its productions. Consequently, w only contains terminals.)
In other words, the LR(1) item is valid (could apply) if we've reached some point where we've decided that A might be the next reduction and a might follow A; at the current point in the parse, we've read α. So if a follows β, then the reduction is possible. We don't yet know that, unless β is the empty sequence, but we need to remember the fact in case it turns out that β can derive the empty sequence.
I hope that helps. It's late here and I'm too tired to check it again. Maybe tomorrow...

What is a "Production" in plain English?

I can read the formal definition of a Production on Wikipedia; however, when you start reading that article, it makes assumptions about prior knowledge.
Wikipedia defines it as follows:
A production or production rule in computer science is a rewrite rule specifying a symbol substitution that can be recursively performed to generate new symbol sequences.
This assumes that I know and understand what a rewrite rule is. I don't, and if I click the link, I get into another fairly technical explanation.
Can someone explain to me in plain English what a Production actually is?
Note: I have made many attempts to understand this, but I don't think I've succeeded. From what I can tell it rewrites the given string in terms of grammar rules. Not sure if I'm correct.
To explain what a production is I'd like to introduce a bit of context first.
The Dragon book states that a context-free grammar has four components:
a set of terminal symbols (tokens)
a set of non-terminal symbols (syntactic variables)
a set of productions of the form: non-term --> sequence of terminals and non-terminals
a non-terminal symbol designated as the start symbol
It is also said that parsing is the problem of taking a string of terminals (the source code) and figuring out what steps are required to derive this string of terminals from the start symbol of the grammar.
Now that this has been said, a production is essentially one possible (intermediate) rewriting step. I say possible because some symbols can derive into different sequences.
For example, let's make a simple grammar to represent arbitrarily long sequences of a's ending with a b. The four components of this grammar would be:
Terminals: a, b
Non-terminals: S, X
Rules: S --> X, X --> aX, X --> ab
Start symbol: S
From the description I gave above, "aaaab" should be derivable from this grammar. Let's see if that holds up. We start from the start symbol, and then apply productions until either a) we get the final sequence, or b) we exhaust all possibilities without succeeding (meaning the sequence is not "grammatically correct").
S
X (after applying S --> X)
aX (after applying X --> aX)
aaX (after applying X --> aX)
aaaX (after applying X --> aX)
aaaab (after applying X --> ab)
And we're done: we got the original sequence. So, as you can see, we rewrote the non-terminal symbols by applying rules (one of them applied recursively), which transformed the sequence into a new sequence of symbols at every step, until we had the final sequence.
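Here is a small Python sketch (my own illustration, not part of the grammar above) that replays the same derivation mechanically: starting from S, it keeps expanding the leftmost non-terminal until only terminals remain:

# Illustrative only: mechanically replay the derivation of "aaaab".
rules = {"S": ["X"], "X": ["aX", "ab"]}     # S -> X, X -> aX, X -> ab

def derive(target, current="S"):
    # returns the list of derivation steps from `current` to `target`, or None
    if not any(ch in rules for ch in current):      # only terminals left
        return [current] if current == target else None
    if len(current) > len(target):                  # no rule shrinks the string, so prune
        return None
    for i, ch in enumerate(current):
        if ch in rules:                             # expand the leftmost non-terminal
            for replacement in rules[ch]:
                rest = derive(target, current[:i] + replacement + current[i + 1:])
                if rest is not None:
                    return [current] + rest
            return None

print(derive("aaaab"))   # ['S', 'X', 'aX', 'aaX', 'aaaX', 'aaaab']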
A rewrite rule is a method of replacing subterms of a formula with other terms. In its most basic form, it consists of a set of objects, plus relations on how to transform those objects.
An example of a rewrite rule could look like:
A → B
Now, as for what this actually does: you are right in your note. Take, for example, a list of things (and two rewrite rules):
X, Y, Z
X → Y
Y → Z
Which would result in:
Z, Z, Z
A production rule is a rewrite rule because it is a method of replacing subterms of a formula (probably a string in your case). A production rule could look like:
X, Y, Z
X → aX
By using the rule in such a way, it becomes possible to apply recursion (to create new sequences), as the rule will keep replacing itself:
aX, Y, Z
aaX, Y, Z
aaaX, Y, Z
As for the question you are asking, you could say: "A production rule is a replacement rule for formulas that can be applied recursively to create new sequences".
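As a minimal sketch of the rewriting idea above (illustrative only), here is the X, Y, Z example in Python:

# Illustrative only: keep applying the rewrite rules until nothing changes.
rules = {"X": "Y", "Y": "Z"}          # X -> Y, Y -> Z

def rewrite(items):
    changed = True
    while changed:
        changed = False
        for i, item in enumerate(items):
            if item in rules:
                items[i] = rules[item]
                changed = True
    return items

print(rewrite(["X", "Y", "Z"]))       # ['Z', 'Z', 'Z']

The recursive production X → aX would, by contrast, never reach a fixed point under a loop like this, which is exactly why it can generate arbitrarily long sequences.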
