I built a simple ontology to test how SWRL rules infer new relations between individuals in an ontology, but it didn't work. My rule is:
hasFather(?x, ?y) ∧ hasMother(?x, ?z) → spouseOf(?y, ?z)
which may be read as: x has father y and x has mother z → y is spouse of z.
There are three individuals in my ontology: Husband, Wife, and Son. I asserted that Son hasMother Wife and hasFather Husband, and then expected the rule to infer that Husband is spouseOf Wife. I used the Jess plugin to test the rule, but got no result. Why doesn't the rule work? Is there something wrong with my rule, or something wrong with Jess on Protege 3.3?
The rule in your ontology is not the rule you wrote in this question. Your ontology contains the following rule:
hasFather(?y, ?x) ∧ hasMother(?z, ?x) → spouseOf(?y, ?z)
In the RDF/XML file, swap swrl:argument1 with swrl:argument2 in each atom and it will work.
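For illustration, here is a hand-written sketch of what one corrected atom might look like in the RDF/XML serialization; the property and variable resource names are my guesses based on the question, not copied from your actual file:
<swrl:IndividualPropertyAtom>
  <swrl:propertyPredicate rdf:resource="#hasFather"/>
  <!-- swrl:argument1 must be the child variable, swrl:argument2 the father -->
  <swrl:argument1 rdf:resource="#x"/>
  <swrl:argument2 rdf:resource="#y"/>
</swrl:IndividualPropertyAtom>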
Sonvx, the Pellet reasoner can also be used to test SWRL rules; it provides Java APIs for doing so. You can download Pellet here.
Let me know if you need anything else.
So I have this grammar:
S -> (D)
D -> EF
E -> a|b|S
F -> *D | +D | ε
First of all, the book's solution uses the rule "for P -> pBq, FIRST(q) - {ε} is a subset of FOLLOW(B)" on the production D -> EF. But that production's right-hand side has only two symbols, so do we assume an ε in front of E (the ε being the p in pBq)?
And secondly, I can't understand how to calculate FOLLOW(E).
FOLLOW(E) consists of every terminal symbol which can immediately follow E in some derivation step. That's the precise definition; it's not very complicated.
For a simple grammar, you should be able to figure out all the FOLLOW sets just by looking at the grammar and applying a little bit of common sense. It would probably be a good idea to do that, since it will give you a better idea of how the algorithm works.
As a side note, it's maybe worth mentioning that ε is not a thing. Or at least, it's not a grammar symbol. It's one of several conventions used to make the empty sequence visible, just like 0 is a way to make nothing visible. Sometimes that's useful, but it's important to not let it confuse you. (Abuse of notation is endemic in mathematics, which can be frustrating.)
So, what can follow E? E appears in only one place on the right-hand side of that grammar, in the production D → E F. So clearly any symbol which could be the first symbol of F must be in FOLLOW(E). The symbols which could be at the start of F are + and *, since, as mentioned, ε is not a grammar symbol. (Many definitions of FIRST allow ε to be a member of that set, along with actual terminal symbols. That's an example of the abuse of notation I was talking about in the previous paragraph, since it makes it look like ε is a terminal symbol. But it isn't. It's nothing.)
F is what we call a "nullable" non-terminal, because it can derive the empty sequence (which was written as ε so that you can see it). In other words, it's possible for F to disappear completely in a derivation step. And if it does disappear, then E might be at the end of the production D → E F. If E is at the end of D, then it can be followed by anything which could follow D, which includes ). D can also appear at the end of F (in the productions F → *D and F → +D), which means that D could be followed by anything which could follow F; but FOLLOW(F) in turn only receives symbols from FOLLOW(D), so that circularity adds no new information whatsoever.
So it's easy to see that FOLLOW(E) = {*, +, )}, and you can use that to check your understanding of any algorithm to compute FOLLOW sets.
Now, I don't know which book you are referring to (it would have been courteous to mention that in your original question; sources should always be correctly cited). But the book I happen to have in front of me, the Dragon Book, has a pretty similar algorithm. The Dragon Book uses a simple set of conventions for writing statements like that. Probably your book does too, but it might not be the same conventions. You should check what it says and make sure that you copied the statement correctly, respecting whatever formatting is used to indicate what the symbols stand for.
In the Dragon book, some of the conventions include:
Lower-case characters at the start of the alphabet (a, b, c, …) are terminals (as well as actual symbols like * and +).
Upper-case characters at the start of the alphabet (A, B, C, …) are non-terminals.
S is the start symbol.
Upper-case characters at the end of the alphabet (X, Y, Z) stand for arbitrary grammar symbols (either terminals or non-terminals).
$ is the marker used to indicate the end of the input.
Lower-case Greek letters (α, β, γ, …) are possibly-empty strings of grammar symbols.
The phrase "possibly empty" is very important, so I'm repeating it.
With those conventions, they write the rules for computing FOLLOW sets:
Place $ in FOLLOW(S).
For every production A → αBβ, copy everything from FIRST(β) except ε into FOLLOW(B).
If there is a production A → αB or a production A → αBβ where FIRST(β) contains ε, place everything in FOLLOW(A) into FOLLOW(B).
As mentioned above, α is a possibly-empty string of grammar symbols. So it might not be visible.
Keep doing steps 2 and 3 until no new symbols are added to any follow set.
I'm pretty sure that the algorithm in your book differs only in notation conventions.
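To make the fixed-point iteration concrete, here is a minimal Python sketch of those rules, applied to the grammar from the question. The encoding (dicts of productions, an empty list for the ε-production) and all names are my own illustrative choices, not taken from any book:
EPSILON = "ε"  # stands for the empty string; not a real grammar symbol
END = "$"      # end-of-input marker

# S -> (D)   D -> EF   E -> a | b | S   F -> *D | +D | ε
grammar = {
    "S": [["(", "D", ")"]],
    "D": [["E", "F"]],
    "E": [["a"], ["b"], ["S"]],
    "F": [["*", "D"], ["+", "D"], []],  # [] encodes the ε-production
}
nonterminals = set(grammar)

def first_of_seq(seq, first):
    """FIRST of a (possibly empty) sequence of grammar symbols."""
    out = set()
    for sym in seq:
        f = first[sym] if sym in nonterminals else {sym}
        out |= f - {EPSILON}
        if EPSILON not in f:
            return out
    out.add(EPSILON)  # every symbol was nullable, so the sequence is too
    return out

def compute_first():
    first = {nt: set() for nt in nonterminals}
    changed = True
    while changed:  # iterate until no FIRST set grows
        changed = False
        for nt, prods in grammar.items():
            for prod in prods:
                add = first_of_seq(prod, first)
                if not add <= first[nt]:
                    first[nt] |= add
                    changed = True
    return first

def compute_follow(first, start="S"):
    follow = {nt: set() for nt in nonterminals}
    follow[start].add(END)  # rule 1: place $ in FOLLOW(S)
    changed = True
    while changed:  # rule 4: repeat until nothing new is added
        changed = False
        for nt, prods in grammar.items():
            for prod in prods:
                for i, sym in enumerate(prod):
                    if sym not in nonterminals:
                        continue
                    beta = first_of_seq(prod[i + 1:], first)
                    add = beta - {EPSILON}  # rule 2: FIRST(β) minus ε
                    if EPSILON in beta:     # rule 3: β absent or nullable
                        add |= follow[nt]
                    if not add <= follow[sym]:
                        follow[sym] |= add
                        changed = True
    return follow

follow = compute_follow(compute_first())
print(follow["E"])  # the set {'*', '+', ')'}, matching the derivation above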
In a generated parsing function, we use an algorithm which peeks at the next token in the token list and chooses a rule (alternative) based on the current non-terminal's FIRST set. If that set contains epsilon (the rule is nullable), the FOLLOW set is checked as well.
Consider the following grammar [not LL(1)]:
B : A term
A : N1 | N2
N1 :
N2 :
During calculation of the FOLLOW sets, the terminal term will be propagated from A to both N1 and N2, so the FOLLOW set won't help us decide between them.
On the other hand, if there is exactly one nullable alternative, we know for sure how to continue execution even when the current token doesn't match anything in the FIRST set (by choosing the epsilon production).
If the above statements are true, the FOLLOW set is redundant. Is it needed only for error handling?
Yes, it is not necessary.
I was asked precisely this question in a colloquium, and my answer, namely that the FOLLOW set is used
to check that the grammar is LL(1), and
to fail immediately when an error occurs, instead of dragging the ill-formed token to some later production, where the generated failure message may be unclear,
and for nothing else,
was accepted.
While you can certainly find grammars for which FOLLOW is unnecessary (i.e., it plays no role in the construction of the parsing table), in general it is necessary.
For example, consider the grammar
S : A | C
A : B a
B : b | epsilon
C : D c
D : d | epsilon
You need to know that
Follow(B) = {a}
Follow(D) = {c}
to calculate
First(A) = {b, a}
First(C) = {d, c}
in order to make the correct choice at S.
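To see exactly where FOLLOW enters, here is a small Python sketch (my own illustration, with FIRST and FOLLOW hard-coded for brevity rather than computed) that builds the LL(1) parsing table for this grammar; the ε-productions are precisely the entries that need FOLLOW:
EPSILON = "ε"

grammar = {
    "S": [["A"], ["C"]],
    "A": [["B", "a"]],
    "B": [["b"], []],  # [] is the ε-production
    "C": [["D", "c"]],
    "D": [["d"], []],
}

# Precomputed with the usual fixed-point algorithms:
first = {"S": {"b", "a", "d", "c"}, "A": {"b", "a"},
         "B": {"b", EPSILON}, "C": {"d", "c"}, "D": {"d", EPSILON}}
follow = {"S": {"$"}, "A": {"$"}, "B": {"a"}, "C": {"$"}, "D": {"c"}}

def first_of_seq(seq):
    """FIRST of a sequence, treating unknown symbols as terminals."""
    out = set()
    for sym in seq:
        f = first.get(sym, {sym})
        out |= f - {EPSILON}
        if EPSILON not in f:
            return out
    out.add(EPSILON)
    return out

table = {}
for nt, prods in grammar.items():
    for prod in prods:
        f = first_of_seq(prod)
        for tok in f - {EPSILON}:
            table[nt, tok] = prod      # predict on FIRST
        if EPSILON in f:
            for tok in follow[nt]:     # ε-production: predict on FOLLOW
                table[nt, tok] = prod

print(table["B", "a"])  # []: choose B -> ε because a is in FOLLOW(B)
print(table["S", "a"])  # ['A']: a is in FIRST(A) only because B is nullable
Without FOLLOW(B) there would be no table entry telling the parser to erase B when it sees a.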
Goal: find a way to formally define a grammar that recognizes each element of a set zero or one times, in any order. Subsequently, I want to parse it and generate an AST as well.
For example: say the set of elements in my language is {A, B, C}. I want to define a grammar that recognizes all permutations of every subset of those elements.
Syntactically valid strings would include:
(the empty string)
A,
B A, and
C A B
Syntactically invalid strings would include:
A A, and
B A C B
To be clear, defining all possible permutations explicitly in a CFG is unacceptable for my purposes, since larger sets would be impossible to maintain.
From what I understand, for a fixed set the language is finite (and therefore technically regular and context-free), but no regular or context-free grammar of manageable size seems to exist for it.
Update
What I'm after is called a "permutation language", which Benedek Nagy has done some theoretical work on as an extension to context free languages.
Regarding a parser generator, I've only found talk of implementing parsers with permutation phrases (link). Permutation phrases evidently have an exponential lower bound on the size of the resulting CFG, and I haven't found any parser generators that support them anyhow.
A sort-of solution to this problem was written in ANTLR. It uses semantic predicates to 'code around' the issue.
Assuming that the set of alternative strings is fixed and known in advance, say of size n, one can come up with a (non context-free) grammar of size O(n!). This is not asymptotically smaller than enumerating all permutations, so I suppose it cannot be considered a good solution. I believe that this grammar can be reformulated as a context-sensitive grammar (although in the form I'm suggesting below it is not).
For the example {a, b, c} mentioned in the question, one such grammar is the following. I'm using lower case letters for terminal symbols and upper case letters for non-terminals, as is customary. S is the initial non-terminal symbol.
S ::= X a b c Y
X a b c Y ::= a X b c Y | b X a c Y | c X a b Y
X a b Y ::= a b | b a
X a c Y ::= a c | c a
X b c Y ::= b c | c b
Non-terminals X and Y enclose the substring in the production which has not been finalized yet; this substring will eventually be replaced by a permutation of the terminals that are given between X and Y (in some arbitrary order).
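As a point of comparison, the semantic-check route mentioned in the question is trivial once you step outside the grammar formalism. Here is a minimal Python sketch (all names are mine), assuming the input has already been tokenized:
def is_permutation_phrase(tokens, alphabet=frozenset({"a", "b", "c"})):
    """Accept any sequence using each symbol of `alphabet` at most once."""
    return set(tokens) <= alphabet and len(set(tokens)) == len(tokens)

assert is_permutation_phrase([])               # the empty string is valid
assert is_permutation_phrase(["c", "a", "b"])
assert not is_permutation_phrase(["a", "a"])   # repetition is rejected
assert not is_permutation_phrase(["b", "a", "c", "b"])
This accepts exactly the strings the question lists as valid and rejects the invalid ones; building an AST is then a separate walk over the accepted token list.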
I am studying grammars in Prolog and I have a little doubt about converting classic BNF grammars to the Prolog DCG grammar form.
For example I have the following BNF grammar:
<s> ::= a b
<s> ::= a <s> b
that, by rewriting, generates all strings of type:
ab
aabb
aaabbb
aaaabbbb
.....
.....
a^n b^n
Looking at Ivan Bratko's book Prolog Programming for Artificial Intelligence, he converts this BNF grammar into a DCG in this way:
s --> [a],[b].
s --> [a],s,[b].
At first look this seems very similar to the classic BNF grammar form, but I have one doubt related to the , symbol used in the DCG: this is not Prolog's logical conjunction operator, but only a separator between the items of the generated sequence.
Is that right?
You can read the , in DCGs as "and then" or "concatenated with":
s -->
[a],
[b].
and
t -->
[a,b].
is the same:
?- phrase(s,X).
X = [a, b].
?- phrase(t,X).
X = [a, b].
It is different from the , in a non-DCG (regular) Prolog rule, which means logical conjunction (AND):
a.
b.
u :-
a,
b.
?- u.
true.
i.e. u is true if a and b are true (which is the case here).
Another difference is that the predicate s/0 does not exist:
?- s.
ERROR: Undefined procedure: s/0
ERROR: However, there are definitions for:
ERROR: s/2
false.
The reason for this is that the grammar rule s is translated to a Prolog predicate, but that predicate needs additional arguments. The intended way to evaluate a grammar rule is to use phrase/2 as above (phrase(startrule, List)). If you like, I can add an explanation of the translation from DCG to plain rules, but I don't know if that would be too confusing if you are a beginner in Prolog.
Addendum:
An even better example would have been to define t as:
t -->
[b],
[a].
Where the evaluation with phrase results in the list [b,a] (which is definitely different from [a,b]):
?- phrase(t,X).
X = [b, a].
But if we reorder the goals in an ordinary rule, the cases in which the predicate is true never change (*), so in our case, defining
v :-
b,
a.
is equivalent to u.
(*) Because Prolog uses depth-first search to find a solution, after reordering it may need to try infinitely many candidates before it finds a solution. (In more technical terms: the solutions don't change, but your search might not terminate if you reorder goals.)
Is there a (simple) way, within a parsing expression grammar (PEG), to express an "unordered sequence"? A rule such as
Rule <- A B C
requires A, B and C to match in order. A rule such as
Rule <- (A B C) / (B C A) / (C A B) / (A C B) / (C B A) / (B A C)
allows them to match in any order (which is what we want) but it is cumbersome and inapplicable in practice with more terms in the sequence.
Is the only solution to use a syntactically looser rule such as
Rule <- (A / B / C){3}
and semantically check that each rule matches only once?
The fact that, e.g., RELAX NG Compact Syntax has an "unordered list" operator for parsing XML makes me suspect that there is no obvious solution.
Last question: do you think the addition of such an operator would bring ambiguity to PEG?
Grammar rules express precisely the sequence of forms that you want, regardless of parsing engine (e.g., PEG, LALR, LL(k), ...) that you choose.
The only way to express that you want all possible orderings of a set of sub-phrases using BNF rules is the big ugly rule you proposed.
The standard solution is to simply define:
rule <- (A | B | C)*
(or whatever syntax your parser generator accepts for lists) and semantically check that exactly 3 forms are provided and that they are unique.
Often people building parser generators add special "extended BNF" notations to describe special circumstances; you gave an example, using {3} as special syntax implying that you want exactly "3 of", under the assumption that the parser generator accepts this notation and does the appropriate enforcement. One can imagine an extension notation {unique} for describing your situation, but I've never seen a parser generator that implemented that idea.
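For illustration, here is a tiny hand-rolled Python sketch of that idea (my own code, not tied to any particular parser generator): match (A / B / C)* syntactically, then enforce the count and uniqueness semantically.
def match_unordered(tokens, alternatives=("A", "B", "C")):
    """Greedily match any of the alternatives, then check the semantics."""
    matched = []
    i = 0
    while i < len(tokens) and tokens[i] in alternatives:  # syntactic phase
        matched.append(tokens[i])
        i += 1
    # Semantic phase: each alternative must occur exactly once.
    if sorted(matched) != sorted(alternatives):
        return None
    return matched, tokens[i:]  # parse result plus remaining input

print(match_unordered(["B", "C", "A"]))  # (['B', 'C', 'A'], [])
print(match_unordered(["A", "A", "B"]))  # None: 'A' occurs twice
A hypothetical {unique} operator would bake that semantic phase into the grammar formalism itself.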