What is the best way to convert a regular expression to an automaton?

I have a question about how to convert a regular expression to an automaton. I have heard about the Glushkov algorithm, but I couldn't find a good document about it.
Example: I have a regular expression like (a*|b*) U (a*a|c*)* and I want to convert it into an automaton with a simple algorithm.
Please help me.
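
One possible starting point is the Glushkov (position automaton) construction: number every symbol occurrence in the expression, work out which positions can start a match, which can end one, and which can follow which, then use the positions as states. The Python sketch below is only an illustration. It assumes the expression has already been parsed into nested tuples, and it reads U in the example as union; the tuple encoding and the names glushkov and accepts are made up for this example.

```python
from itertools import count

# Hypothetical AST encoding for this sketch:
#   ("sym", "a"), ("cat", r, s), ("alt", r, s), ("star", r)

def glushkov(regex):
    """Build a position automaton: state 0 is the start, 1..n are symbol positions.
    Returns (start, finals, delta) with delta[(state, symbol)] = set of states."""
    counter = count(1)
    symbol_at = {}   # position -> symbol
    follow = {}      # position -> positions that may come directly after it

    def walk(node):
        """Return (nullable, first, last) for a sub-expression."""
        kind = node[0]
        if kind == "sym":
            p = next(counter)
            symbol_at[p] = node[1]
            follow[p] = set()
            return False, {p}, {p}
        if kind == "star":
            _, first, last = walk(node[1])
            for p in last:                 # loop back: last positions -> first positions
                follow[p] |= first
            return True, first, last
        n1, f1, l1 = walk(node[1])
        n2, f2, l2 = walk(node[2])
        if kind == "alt":
            return n1 or n2, f1 | f2, l1 | l2
        if kind == "cat":
            for p in l1:                   # end of the left part -> start of the right part
                follow[p] |= f2
            first = f1 | f2 if n1 else f1
            last = l1 | l2 if n2 else l2
            return n1 and n2, first, last
        raise ValueError(f"unknown node kind: {kind}")

    nullable, first, last = walk(regex)
    delta = {}
    for q in first:
        delta.setdefault((0, symbol_at[q]), set()).add(q)
    for p, nexts in follow.items():
        for q in nexts:
            delta.setdefault((p, symbol_at[q]), set()).add(q)
    finals = set(last) | ({0} if nullable else set())
    return 0, finals, delta

def accepts(automaton, word):
    """Simulate the (possibly nondeterministic) automaton on a word."""
    start, finals, delta = automaton
    current = {start}
    for ch in word:
        current = set().union(*(delta.get((s, ch), set()) for s in current))
    return bool(current & finals)

# (a*|b*) U (a*a|c*)*, with U read as union
a, b, c = ("sym", "a"), ("sym", "b"), ("sym", "c")
example = ("alt",
           ("alt", ("star", a), ("star", b)),
           ("star", ("alt", ("cat", ("star", a), a), ("star", c))))
automaton = glushkov(example)
```

The result has no epsilon transitions and one state per symbol occurrence plus the start state. It is generally nondeterministic, so a subset construction would still be needed if a deterministic automaton is wanted.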

Related

How to parse and evaluate a logical expression from a string in Swift?

I'm currently parsing out XForm XML data and displaying it in my app. Sometimes, a question input will have a logical expression attribute that I have to parse and evaluate.
At the simplest level, I’m given a string like this:
"selected(Key,'AnswerToQuestion')"
Here, I have to parse out the Key and AnswerToQuestion and check my answer dictionary to see if the value of Key is equal to 'AnswerToQuestion'. This is simple enough.
However, in other situations I am given a more complex expression such as:
"not(selected(Key,'AnswerToQuestion') OR selected(Key2,'AnswerToQuestion2'))"
I only need to support "and", "not" and "or" statements.
I am lost on how to tackle this in an efficient way. Something tells me a simple regex won’t cut it and that I’ll need to recursively evaluate it (recursive programming has never been my strong suit). On top of that, I'm given the expression in a string format, which makes things more complicated.
Has anyone done this before? Or can anyone offer some insight?
Thanks
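
A common way to handle this kind of input is a small tokenizer plus a recursive-descent parser that evaluates as it goes. The sketch below is in Python only to keep it short; the same structure should carry over to Swift. The grammar, the names tokenize, Parser and evaluate, and the choice to give "and" higher precedence than "or" are assumptions for illustration, not something stated in the question.

```python
import re

# Tokens: punctuation, single-quoted strings, and bare words (keys, and/or/not, selected).
TOKEN = re.compile(r"\s*(?:([(),])|'([^']*)'|(\w+))")

def tokenize(text):
    # Normalise typographic quotes, which often sneak into copied expressions.
    text = text.replace("\u2018", "'").replace("\u2019", "'").strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        punct, string, word = match.groups()
        if punct is not None:
            tokens.append(("punct", punct))
        elif string is not None:
            tokens.append(("string", string))
        else:
            tokens.append(("word", word))
        pos = match.end()
    return tokens

class Parser:
    def __init__(self, tokens, answers):
        self.tokens, self.pos, self.answers = tokens, 0, answers

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else (None, None)

    def expect(self, kind, text=None):
        got_kind, got_text = self.peek()
        if got_kind != kind or (text is not None and got_text != text):
            raise ValueError(f"expected {text or kind}, got {got_text!r}")
        self.pos += 1
        return got_text

    def at_keyword(self, name):
        kind, text = self.peek()
        return kind == "word" and text.lower() == name

    def expr(self):    # expr := term ("or" term)*
        result = self.term()
        while self.at_keyword("or"):
            self.expect("word")
            rhs = self.term()      # always parse the right side, then combine
            result = result or rhs
        return result

    def term(self):    # term := factor ("and" factor)*
        result = self.factor()
        while self.at_keyword("and"):
            self.expect("word")
            rhs = self.factor()
            result = result and rhs
        return result

    def factor(self):  # factor := not(expr) | selected(key,'value') | (expr)
        if self.at_keyword("not"):
            self.expect("word")
            self.expect("punct", "(")
            value = self.expr()
            self.expect("punct", ")")
            return not value
        if self.at_keyword("selected"):
            self.expect("word")
            self.expect("punct", "(")
            key = self.expect("word")
            self.expect("punct", ",")
            wanted = self.expect("string")
            self.expect("punct", ")")
            return self.answers.get(key) == wanted
        self.expect("punct", "(")
        value = self.expr()
        self.expect("punct", ")")
        return value

def evaluate(expression, answers):
    parser = Parser(tokenize(expression), answers)
    result = parser.expr()
    if parser.pos != len(parser.tokens):
        raise ValueError("trailing input after expression")
    return result
```

Each grammar rule becomes one method, and nesting in the input turns into ordinary nested method calls, so the recursion never has to be managed by hand. For example, evaluate("selected(Key,'AnswerToQuestion')", answers) looks up Key in the answers dictionary and compares it with the quoted value.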

Adding semantics using Prolog

To compile the syntax portion of a language, we need to parse it lexically and syntactically.
Then comes a step that is not trivial: adding semantics.
For the first two steps I use a DCG with some predicates in Prolog, but now I want to know how it is done for the semantics.
How should I approach this task?
Should it be separate from the parser, or mixed in with it?
EDIT:
Usually, the semantics of a simple expression can be written with a DCG as:
expr(Z) --> term(X), "+", expr(Y), {Z is X + Y}.
But if the language is based on logical formulas, as in the B Method, the problem becomes more complicated,
and what I need is some tips to overcome this problem.
Sorry about the wrong expressions.
I would very much appreciate your opinions.

Feed the Stanford Parser with a formatted text

I have a phrase in the format "Word_POS-TAG_Lemma Word_POS-TAG_Lemma Word_POS-TAG_Lemma Word_POS-TAG_Lemma....." Is there a way to feed the Stanford Parser with this kind of formatted input? Moreover, is there a way to obtain a tree in the standard dependencies format?
Thank you in advance
See the FAQ: Can I give the parser part-of-speech (POS) tagged input and force the parser to use those tags?
It's definitely possible, though it would probably help to strip off / ignore the lemma forms to make things easier.
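
Stripping the lemmas can be done as a small preprocessing step before the text reaches the parser. The Python sketch below assumes every token has exactly the shape Word_TAG_Lemma with underscores as separators; the function names and the word/TAG output with a slash are choices made for this illustration, and the input convention the parser actually accepts is the one described in the FAQ entry.

```python
def strip_lemmas(line, sep="_"):
    """Turn 'Word_TAG_Lemma ...' into a list of (word, tag) pairs, dropping the lemma."""
    pairs = []
    for token in line.split():
        # Split from the right so a word that itself contains the separator survives.
        parts = token.rsplit(sep, 2)
        if len(parts) != 3:
            raise ValueError(f"token is not in Word{sep}TAG{sep}Lemma form: {token!r}")
        word, tag, _lemma = parts
        pairs.append((word, tag))
    return pairs

def to_tagged_text(pairs, out_sep="/"):
    """Join (word, tag) pairs as 'word/TAG word/TAG ...' (the separator is an assumption)."""
    return " ".join(f"{word}{out_sep}{tag}" for word, tag in pairs)
```

For instance, strip_lemmas("Dogs_NNS_dog bark_VBP_bark") gives the pairs (Dogs, NNS) and (bark, VBP), which can then be written out in whatever tagged format the parser is configured to read.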

Predictive editor for Rascal grammar

I'm trying to write a predictive editor for a grammar written in Rascal. The heart of this would be a function taking as input a list of symbols and returning as output a list of symbol types, such that an instance of any of those types would be a syntactically legal continuation of the input symbols under the grammar. So if the input list was [4,+] the output might be [integer]. Is there a clever way to do this in Rascal? I can think of imperative programming ways of doing it, but I suspect they don't take proper advantage of Rascal's power.
That's a pretty big question. Here are some leads toward an answer, but the full answer would mean implementing it for you completely :-)
Reify an original grammar for the language you are interested in as a value using the # operator, so that you have a concise representation of the grammar which can be queried easily. The representation is defined over the modules Type, ParseTree which extends Type and Grammar.
Construct the same representation for the input query. This could be done in many ways. A kick-ass, language-parametric, way would be to extend Rascal's parser algorithm to return partial trees for partial input, but I believe this would be too much hassle now. An easier solution would entail writing a grammar for a set of partial inputs, i.e. the language grammar with at specific points shorter rules. The grammar will be ambiguous but that is not a problem in this case.
Use tags to tag the "short" rules so that you can find them easily later: syntax E = #short E "+";
Parse with the extended and now ambiguous grammar;
The resulting parse trees will contain the same representation as in ParseTree that you used to reify the original grammar, except that in that one the rules are longer, as in prod(E, [E,+,E],...).
Then select the trees which serve you best for the goal of completion (those which use the #short tag), and extract their productions "prod", which look like this: prod(E,[E,+],...). For example, using the / operator: [candidate : /candidate:prod(_,_,/"short") := trees]. You could also use a cursor position to find candidates which are close by instead of all the short trees in there.
Use list matching to find prefixes in the original grammar, like if (/match:prod(_,[*prefix, predicted, *postfix],_) := grammar) ..., where prefix is your query as extracted from the #short rules, predicted is your answer, and postfix is whatever would come after.
Yield the predicted symbol back as a type for the user to read: "<type(predicted, ())>" (this will pretty-print it nicely even if it's some complex regexp type, and it does the quoting right, etc.).
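
The prefix-matching step can also be illustrated outside Rascal. The following is a rough Python stand-in, not Rascal code: productions are modelled as plain (lhs, rhs) pairs, and the function predict and the toy grammar are hypothetical. It mirrors the list pattern [*prefix, predicted, *postfix] by checking whether a rule's right-hand side starts with the query and, if so, reporting the next symbol.

```python
def predict(prefix, productions):
    """Return the symbols that can directly follow `prefix` at the start of some rule."""
    n = len(prefix)
    predicted = []
    for _lhs, rhs in productions:
        # rhs == prefix + [next_symbol] + postfix
        if len(rhs) > n and rhs[:n] == prefix and rhs[n] not in predicted:
            predicted.append(rhs[n])
    return predicted

# Toy grammar for this sketch: E -> E + E | E * E | int
toy_grammar = [
    ("E", ["E", "+", "E"]),
    ("E", ["E", "*", "E"]),
    ("E", ["int"]),
]
```

With this toy grammar, predict(["E", "+"], toy_grammar) reports E as the only legal continuation, while predict(["E"], toy_grammar) reports the two operators.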

How to do part-of-speech tagging of texts, containing mathematical expressions?

The goal is syntactic parsing of scientific texts. First I need to do part-of-speech tagging of the sentences in such texts. The texts are from arxiv.org, so they are originally in LaTeX. When extracting text from LaTeX documents, math expressions can be converted into MathML (or maybe some other format, but I prefer MathML because this work is being done to create a specific web app, and MathML is a convenient tool for this).
The only idea I have is to substitute the mathematical expressions with some natural-language phrases and then use an existing POS-tagging implementation. So the question is how to implement these substitutions or, in general, how to implement POS tagging of texts with mathematics in them.
I have implemented a formula substitution algorithm on top of the Stanford tagger and it works quite nicely. The way to go is, as abecadel has written, to replace every formula with a unique but new word; I used a combination of a word and a hash, e.g. 'formula-duwkziah'.
Replacing each mathematical formula with a single, unique word seems to be the way to go.
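
The substitution idea can be sketched roughly as follows in Python. This assumes the formulas appear in the text as <math>...</math> elements and that a regular expression is good enough to find them; the function name, the SHA-1 digest, and the mapping of hex digits to letters (so the placeholder looks like an ordinary word rather than a number) are all choices made for this illustration, not details taken from the answers above.

```python
import hashlib
import re

# Non-greedy match of one <math ...>...</math> element, across line breaks.
MATH = re.compile(r"<math\b.*?</math>", re.DOTALL)

# Map hex digits to letters so the placeholder contains no digits.
HEX_TO_LETTERS = str.maketrans("0123456789abcdef", "abcdefghijklmnop")

def substitute_formulas(text):
    """Replace each formula with a 'formula-xxxxxxxx' word; return (new_text, mapping)."""
    mapping = {}

    def replace(match):
        formula = match.group(0)
        digest = hashlib.sha1(formula.encode("utf-8")).hexdigest()[:8]
        placeholder = "formula-" + digest.translate(HEX_TO_LETTERS)
        mapping[placeholder] = formula   # kept so formulas can be restored after tagging
        return placeholder

    return MATH.sub(replace, text), mapping
```

The returned mapping makes it possible to put the original formulas back once the tagger has run. Identical formulas get the same placeholder, which may or may not be what is wanted.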
