I have a statement parser. When I run the program with it as the parser argument, all the errors are displayed. To be able to parse several statements, I have defined a new parser; however, when there are errors, it does not display them.
Here is this parser:
(* -------- Program -------- *)
let pprog, pprogimpl = createParserForwardedToRef ()
pprogimpl := attempt (many pstatement |>> Program)
pstatement is defined by another parser that covers all possible statements.
I would like to know why errors are not displayed with the pprog parser. Did I make a mistake? Forget something?
Edit
I was finally able to solve the problem by replacing many with manyTill. Apparently, the problem came from the fact that many couldn't handle the "listed" errors. If you know more, I'd be curious to hear it.
let pprog, pprogimpl = createParserForwardedToRef ()
pprogimpl := (attempt (manyTill (pstatement) eof) |>> Program)
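A likely explanation, sketched here with Haskell's Parsec rather than FParsec (the combinators are close analogues, but treat the mapping as an assumption): many stops as soon as its argument fails without consuming input and then succeeds with what it has collected, so the failure is never reported. Requiring eof afterwards forces that failure to surface. The statement parser below is a hypothetical stand-in for pstatement.

```haskell
import Text.Parsec (char, eof, letter, many, many1, manyTill, parse, spaces)
import Text.Parsec.String (Parser)

-- Hypothetical statement: a run of letters terminated by ';'.
statement :: Parser String
statement = many1 letter <* char ';' <* spaces

-- Stops quietly at the first statement that fails without consuming input.
programLenient :: Parser [String]
programLenient = many statement

-- Keeps parsing statements until end of input, so a bad statement is an error.
programStrict :: Parser [String]
programStrict = manyTill statement eof
```

With the input "a; 1;", parse programLenient "" returns a successful result containing only "a", while parse programStrict "" returns a Left with the error at the 1.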
I am trying to parse a String using parsec in Haskell, however every attempt throws another type of error.
import Text.ParserCombinators.Parsec
csvFile = endBy line eol
line = sepBy cell (char ',')
cell = many (noneOf ",\n")
eol = char '\n'
parseCSV :: String -> Either ParseError [[String]]
parseCSV input = parse csvFile "(unknown)" input
This code, when run through stack ghci produces an error saying "non type-variable argument in the constraint: Text.Parsec.Prim.Stream"
Basically, I am wondering what the most straightforward way is to parse a String into tokens based on commas in Haskell. It seems like a very straightforward concept and I assumed that it would be a great learning experience, but so far it has produced nothing but errors.
The error I see when entering char '\n' in ghci is:
<interactive>:4:1: error:
• Non type-variable argument
in the constraint: Text.Parsec.Prim.Stream s m Char
(Use FlexibleContexts to permit this)
• When checking the inferred type
it :: forall s (m :: * -> *) u.
Text.Parsec.Prim.Stream s m Char =>
Text.Parsec.Prim.ParsecT s u m Char
The advice about FlexibleContexts is accurate. You can turn on FlexibleContexts like so:
*Main> :set -XFlexibleContexts
Unfortunately, the next error is • No instance for (Show (Text.Parsec.Prim.ParsecT s0 u0 m0 Char)) (basically, we can't print a function) so you'll still need to apply the parser to some input to actually run it.
Like the commenters, I find that parseCSV can be used without any language extensions.
There are a few things going on here:
In the context of the whole program, the type of eol is constrained by the type signature on parseCSV. That doesn't happen when typing eol = char '\n' into GHCi.
GHCi's :t is permissive - it's willing to print some types that use language features that aren't turned on.
GHC has grown by adding a large number of language extensions, which can be turned on by the programmer on a per-module basis. Some are widely used by production-ready libraries, others are new & experimental.
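One way to sidestep the inference problem, sketched under the assumption that the input is always a plain String: give every parser an explicit signature using the Parser alias from Text.Parsec.String. Each definition then has a concrete type even when loaded on its own, and no extension is needed.

```haskell
import Text.Parsec (ParseError, char, endBy, many, noneOf, parse, sepBy)
import Text.Parsec.String (Parser)

-- A file is a sequence of lines, each terminated by a newline.
csvFile :: Parser [[String]]
csvFile = line `endBy` eol

-- A line is a comma-separated sequence of cells.
line :: Parser [String]
line = cell `sepBy` char ','

-- A cell is any run of characters other than comma or newline.
cell :: Parser String
cell = many (noneOf ",\n")

eol :: Parser Char
eol = char '\n'

parseCSV :: String -> Either ParseError [[String]]
parseCSV = parse csvFile "(unknown)"
```

In GHCi, parseCSV "a,b\nc,d\n" then evaluates to Right [["a","b"],["c","d"]].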
The example code below appears to work nicely:
open FParsec
let capitalized : Parser<unit,unit> =(asciiUpper >>. many asciiLower >>. eof)
let inverted : Parser<unit,unit> =(asciiLower >>. many asciiUpper >>. eof)
let capsOrInvert =choice [capitalized;inverted]
You can then do:
run capsOrInvert "Dog";;
run capsOrInvert "dOG";;
and get a success or:
run capsOrInvert "dog";;
and get a failure.
Now that I have a ParserResult, how do I do things with it? For example, print the string backwards?
There are several notable issues with your code.
First off, as noticed in @scrwtp's answer, your parser returns unit. Here's why: operator (>>.) returns only the result returned by the right inner parser. On the other hand, (.>>) would return the result of the left parser, while (.>>.) would return a tuple of both left and right ones.
So, parser1 >>. parser2 >>. eof is essentially (parser1 >>. parser2) >>. eof.
The code in parens completely ignores the result of parser1, and the second (>>.) then ignores the entire result of the parser in parens. Finally, eof returns unit, and this value is being returned.
You may need some meaningful data returned instead, e.g. the parsed string. The easiest way is:
let capitalized = (asciiUpper .>>. many asciiLower .>> eof)
Mind the operators.
The code for inverted can be done in a similar manner.
This parser would be of type Parser<(char * char list), unit>, a tuple of first character and all the remaining ones, so you may need to merge them back. There are several ways to do that, here's one:
let mymerge (c1: char, cs: char list) = c1 :: cs // a simple cons
let pCapitalized = capitalized |>> mymerge
The beauty of this code is that your mymerge is a normal function, working with normal chars; it knows nothing about parsers. It just works with the data, and the (|>>) operator does the rest. (The (>>=) operator is for the different case where the function itself returns a parser.)
Note, pCapitalized is also a parser, but it returns a single char list.
Nothing stops you from applying further transformations. As you mentioned printing the string backwards:
let pCapitalizedAndReversed =
    capitalized
    |>> mymerge
    |>> List.rev
I have written the code this way on purpose. On the separate lines you see a gradual transformation of your domain data, still within the Parser paradigm. This is an important consideration, because a subsequent step may "decide" that the data is bad for some reason and fail the parse, for example (such a step returns a parser and is chained with (>>=) instead). Or, alternatively, it may be combined with another parser.
As soon as your domain data (a parsed-out word) is complete, you extract the result as mentioned in another answer.
A minor note: choice is superfluous for only two parsers. Use (<|>) instead. From experience, choosing parser combinators carefully is important, because a wrong choice deep inside your core parser logic can easily make your parsers dramatically slow.
See FParsec Primitives for further details.
ParserResult is a discriminated union. You simply match the Success and Failure cases.
let r = run capsOrInvert "Dog"
match r with
| Success(result, _, _) -> printfn "Success: %A" result
| Failure(errorMsg, _, _) -> printfn "Failure: %s" errorMsg
But this is probably not what you find tricky about your situation.
The thing about your Parser<unit, unit> type is that the parsed value is of type unit (the first type argument to Parser). What this means is that this parser doesn't really produce any sensible output for you to use - it can only tell you whether it can parse a string (in which case you get back a Success ((), _, _) - carrying the single value of type unit) or not.
What do you expect to get out of this parser?
Edit: This sounds close to what you want, or at least you should be able to pick up some pointers from it. capitalized accepts capitalized strings, inverted accepts capitalized strings that have been reversed and reverses them as part of the parser logic.
let reverse (s: string) =
    System.String(Array.rev (Array.ofSeq s))
let capitalized : Parser<string,unit> =
    (asciiUpper .>>. manyChars asciiLower)
    |>> fun (upper, lower) -> string upper + lower
let inverted : Parser<string,unit> =
    (manyChars asciiLower .>>. asciiUpper)
    |>> fun (lower, upper) -> reverse (lower + string upper)
let capsOrInvert = choice [capitalized;inverted]
run capsOrInvert "Dog"
run capsOrInvert "doG"
run capsOrInvert "dog"
I want to read values(strings) from console in a loop until a certain value is entered.
What is the code for that?
With Haskell there are a multitude of ways of writing such a loop, and the one you choose will depend on context, i.e. what larger program is this loop part of?
To get you started with some simple imperative-style loops, both the Haskell Wikibook and the Haskell Wiki have some good resources:
Haskell Simple Input and Output
IO for Imperative Programmers
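As a starting point, here is a minimal sketch of one such loop, written as plain recursion in a monad. The names collectUntil and readUntil are made up for this example, and returning the collected lines as a list is an assumption about what the larger program needs.

```haskell
-- Run an action repeatedly, collecting its results until one satisfies `stop`.
-- The stopping value itself is not included in the result.
collectUntil :: Monad m => (a -> Bool) -> m a -> m [a]
collectUntil stop action = go []
  where
    go acc = do
      x <- action
      if stop x
        then return (reverse acc)
        else go (x : acc)

-- Read lines from the console until the sentinel value is entered.
readUntil :: String -> IO [String]
readUntil sentinel = collectUntil (== sentinel) getLine
```

For example, readUntil "quit" keeps reading lines and returns everything typed before the line quit.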
Update
From your comment it appears you want to write a "command processor". Have a look at this SO question and answer:
Number guessing game error and keeping count of guesses
Alternatively, if your bool expression type has a Show instance how about using the REPL in ghci?
ghci> :load your_code
ghci> let e = ...initial bool expression...
ghci> e
...e is displayed...
ghci> let f = e || blah
ghci> f
...f is displayed...
ghci> it && whatever -- it refers to the last expression
...some output...
ghci> not it
...
it is a variable maintained by ghci which always refers to the last evaluated expression.
I'm totally new to Haskell and trying to implement a "Lambda calculus" parser, that will be used to read the input to a lambda reducer .. It's required to parse bindings first "identifier = expression;" from a text file, then at the end there's an expression alone ..
So far it can parse bindings only, and it reports errors when it encounters an expression on its own. When I try to use the try or option functions, it gives a type mismatch error:
Couldn't match type `[Expr]'
with `Text.Parsec.Prim.ParsecT s0 u0 m0 [[Expr]]'
Expected type: Text.Parsec.Prim.ParsecT
s0 u0 m0 (Text.Parsec.Prim.ParsecT s0 u0 m0 [[Expr]])
Actual type: Text.Parsec.Prim.ParsecT s0 u0 m0 [Expr]
In the second argument of `option', namely `bindings'
bindings weren't supposed to return anything, but I tried to add a return statement and it also returned a type mismatch error:
Couldn't match type `[Expr]' with `Expr'
Expected type: Text.Parsec.Prim.ParsecT
[Char] u0 Data.Functor.Identity.Identity [Expr]
Actual type: Text.Parsec.Prim.ParsecT
[Char] u0 Data.Functor.Identity.Identity [[Expr]]
In the second argument of `(<|>)', namely `expressions'
Don't use <|> if you want to allow both
Your program parser does its main work with
program = do
    spaces
    try bindings <|> expressions
    spaces >> eof
This <|> is choice - it does bindings if it can, and if that fails, expressions, which isn't what you want. You want zero or more bindings, followed by expressions, so let's make it do that.
Sadly, even when this works, the last line of your parser is eof, which returns (), so the parse results are discarded unless you return them explicitly.
First, let's allow zero bindings, since they're optional, then let's get both the bindings and the expressions:
bindings = many binding
program = do
    spaces
    bs <- bindings
    es <- expressions
    spaces >> eof
    return (bs,es)
This error would be easier to find with plenty more <?> "binding" type hints so you can see more clearly what was expected.
endBy doesn't need many
The error message you have stems from the line
expressions = many (endBy expression eol)
which should be
expressions :: Parser [Expr]
expressions = endBy expression eol
endBy works like sepBy - you don't need to use many on it because it already parses many.
This error would have been easier to find with a stronger data type tree, so:
Use try to deal with common prefixes
One of the hard-to-debug problems you've had is when you get the error expecting space or "=" whilst parsing an expression. If we think about that, the only place we expect = is in a binding, so it must be part way through parsing a binding when we've given it an expression. This only happens if our expression starts with an identifier, just like a binding does.
binding sees the first identifier and says "It's OK guys, I've got this" but then finds no = and gives you an error, where we wanted it to backtrack and let expression have a go. The key point is we've already used the identifier input, and we want to unuse it. try is right for that.
Encase your binding parser with try so if it fails, we'll go back to the start of the line and hand over to expression.
binding = try (do
    (Var id) <- identifier
    _ <- char '='
    spaces
    exp <- expression
    spaces
    eol <?> "end of line"
    return $ Eq id exp
  <?> "binding")
It's important that as far as possible each parser starts with matching something unique to avoid this problem. (try is backtracking, hence inefficient, so should be avoided if possible.)
In particular, avoid starting parsers with spaces, but instead make sure you finish them all with spaces. Your main program can start with spaces if you like, since it's the only alternative.
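The effect of try on a shared prefix can be seen in a cut-down sketch. The grammar here is a hypothetical toy (a line is either "name = name" or a bare "name"), not your lambda calculus grammar, and Stmt is an invented type for the example.

```haskell
import Text.Parsec (char, eof, letter, many1, parse, spaces, try, (<|>))
import Text.Parsec.String (Parser)

-- Toy result type, invented for this example.
data Stmt = Binding String String | Expression String
  deriving (Eq, Show)

-- Finish with spaces rather than starting with them.
identifier :: Parser String
identifier = many1 letter <* spaces

binding :: Parser Stmt
binding = Binding <$> identifier <* char '=' <* spaces <*> identifier

expression :: Parser Stmt
expression = Expression <$> identifier

-- binding consumes the identifier before failing on the missing '=',
-- so <|> never gets to try expression.
lineNoTry :: Parser Stmt
lineNoTry = (binding <|> expression) <* eof

-- try rewinds the input when binding fails, so expression gets its turn.
lineWithTry :: Parser Stmt
lineWithTry = (try binding <|> expression) <* eof
```

On the input "x", parse lineNoTry "" gives a Left complaining about the missing "=", while parse lineWithTry "" gives Right (Expression "x"). Both accept "x = y" as a Binding.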
Use types for most productions - better structure & readability
My first piece of general advice is that you could do with a more fine-grained data type, and should annotate your parsers with their type. At the moment, everything's wrapped up in Expr, which means you can only get error messages about whether you have an Expr or a [Expr]. The fact that you had to add Eq to Expr is a sign you're pushing the type too far.
Usually it's worth making a data type for quite a lot of the productions, and if you import Control.Applicative hiding ((<|>),(<$>),many) you can use <$> and <*> so that the production, the datatype and the parser are all the same structure:
--<program> ::= <spaces> [<bindings>] <expressions>
data Program = Prog [Binding] [Expr]
program = spaces >> Prog <$> bindings <*> expressions
-- <expression> ::= <abstraction> | factors
data Expression = Ab Abstraction | Fa [Factor]
expression = Ab <$> abstraction <|> Fa <$> factors <?> "expression"
Don't do this with letters for example, but for important things. What counts as important things is a matter of judgement, but I'd start with Identifiers. (You can use <* or *> to not include syntax like = in the results.)
Amended code:
Before refactoring types and using Applicative here
And afterwards here
Not sure if this is possible (or recommended), but I am essentially trying to search for a sequence of characters in file using Parsec. Example file:
START (name)
junk
morejunk=junk;
dontcare
foo ()
bar
care_about this (stuff in here i dont care about);
don't care about this
or this
foo = bar;
also_care
about_this
(dont care whats in here);
and_this too(only the names
at the front
do i care about
);
foobar
may hit something = perhaps maybe (like this);
foobar
END
And here is my attempt at getting it working:
careAbout :: Parser (String, String)
careAbout = do
    name1 <- many1 (noneOf " \n\r")
    skipMany space
    name2 <- many1 (noneOf " (\r\n")
    skipMany space
    skipMany1 parens
    skipMany space
    char ';'
    return (name1, name2)

parens :: Parser ()
parens = do
    char '('
    many (parens <|> skipMany1 (noneOf "()"))
    char ')'
    return ()

parseFile = do
    manyTill (do
        try careAbout <|>
            anyChar >> return ("", "")) (try $ string "END")
I'm trying to brute force the search by looking for careAbout, and if that doesn't work, eat one character and try again. I could parse all the junk in the middle (I know what it could be), but I don't care about what it is (so why bother parsing it), and it's potentially complicated.
Problem is, my solution doesn't quite work. anyChar ends up consuming everything, and the searching for END never gets a chance. Also, somewhere in the careAbout we hit eof and some Exception is thrown because of it.
This is probably the exact wrong way of doing it, and I would like to know of a way, or even better, the Right Way™, of doing it.
If not for the parens parser, this would be a good fit for a regular language parser, such as regex-applicative. This is because regular language parsers are much more "smart" about "backtracking" (in fact there's no backtracking going on at all, and yet every possible branch is explored).
However, as you probably know, matching parentheses is not a regular language. If you can relax your grammar to become regular, give regex-applicative a try.
I can't really tell from OP's post which parts of the file we care about or don't, so I'm not going to post a specific solution. But in general, for searching through a file for patterns which match a recursive parser, one can use replace-megaparsec.
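For comparison, the same "try the pattern here, otherwise skip one character" idea can be sketched in plain Parsec, without replace-megaparsec. The call pattern (a name immediately followed by a balanced parenthesised group) and the helper findAll are assumptions made up for this example, not the OP's exact grammar, and the approach re-scans input on every failed attempt, so it can be slow on large files.

```haskell
import Data.Maybe (catMaybes)
import Text.Parsec (alphaNum, anyChar, char, many, many1, noneOf, parse,
                    skipMany, skipMany1, try, (<|>))
import Text.Parsec.String (Parser)

-- Balanced parentheses; the contents are ignored.
parens :: Parser ()
parens = char '(' *> skipMany (parens <|> skipMany1 (noneOf "()")) <* char ')'

-- Hypothetical pattern of interest: a name directly followed by a
-- parenthesised group. Only the name is returned.
call :: Parser String
call = many1 (alphaNum <|> char '_') <* parens

-- At each position, try the pattern; if it fails, rewind and skip one character.
findAll :: Parser a -> Parser [a]
findAll p = catMaybes <$> many (Just <$> try p <|> Nothing <$ anyChar)
```

For instance, parse (findAll call) "" "junk foo(a(b)) x=1; bar()" evaluates to Right ["foo","bar"].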