Haskell Date Parsing with Custom Separator - parsing

I'm trying to print out something like this:
Here's my code:
import System.Environment
import Data.Time
main = do
    args <- getArgs
    let year = read $ args !! 0
    let month = read $ args !! 1
    let day = read $ args !! 2
    let greg = fromGregorian year month day
    print $ showDateFormat $ toGregorian $ addDays 10 $ greg
    print $ showDateFormat $ toGregorian $ addDays 100 $ greg
    print $ showDateFormat $ toGregorian $ addDays 1000 $ greg
    print $ showDateFormat $ toGregorian $ addDays 10000 $ greg
showDateFormat :: (Integer,Int,Int) -> String
showDateFormat (y,m,d) = y ++ "/" ++ m ++ "/" ++ d ++ "\n"
I can't figure out what's wrong.
This is the error I got:

Haskell will not implicitly convert a value of one type to another; you have to do such things explicitly. In this case, you can use show to convert an Int or an Integer to a string containing its base-10 representation.
showDateFormat :: (Integer,Int,Int) -> String
showDateFormat (y,m,d) = show y ++ "/" ++ show m ++ "/" ++ show d ++ "\n"
The error message is literally telling you that it expects y to be a value of type [Char] (aka String), because ++ (receiving "/" :: String as one argument) expects a String as the other, but you have passed an Integer value for y instead.
(Note that String is the expected type because it is a valid type for ++, while Integer is not. Otherwise, the type checker would use the left-hand argument to fix the type. [1::Int] ++ "foo", for example, fails because "foo" does not have type [Int], not because [1] does not have type [Char].)
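As a minimal sketch of the same idea with a configurable separator: the name showDateWith is made up for this illustration and is not part of the original code. It converts each component with show and joins the pieces with intercalate from Data.List.

```haskell
import Data.List (intercalate)

-- Hypothetical helper: render a (year, month, day) triple with any separator.
-- Each numeric component is converted explicitly with show before joining.
showDateWith :: String -> (Integer, Int, Int) -> String
showDateWith sep (y, m, d) = intercalate sep [show y, show m, show d]

main :: IO ()
main = do
  putStrLn (showDateWith "/" (2015, 3, 17))  -- prints 2015/3/17
  putStrLn (showDateWith "-" (2015, 3, 17))  -- prints 2015-3-17
```

putStrLn is used here instead of print, since print on a String shows it with quotes and escapes any newline.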

Related

Why is $ allowed but $$, or <$> disallowed as an operator (FS0035) and what makes $ special?

$ is allowed as a custom operator, but if you try to use $$, <$> or, for instance, ~$% as an operator name, you will receive the following error:
error FS0035: This construct is deprecated: '$' is not permitted as a character in operator names and is reserved for future use
$ clearly also has the '$' in the name, but works, why?
I.e.:
let inline ( $ ) f y = f y
// using it works just fine:
let test =
    let add x = x + 1
    add $ 12
I see $ a lot in online examples, apparently as a particular kind of operator. What is this special treatment or role for $ (i.e. in Haskell or OCaml), and what should <$> do if it were allowed (edit)?
Trying to fool the system by defining a function like op_DollarDollar doesn't work either, because the syntax check is done at the call site as well. As an example, though, this trick does work with other (legal) operators:
// works
let inline op_BarQmark f y = f y
let test =
    let add x = x + 1
    add |? 12
// also works:
let inline op_Dollar f y = f y
let test =
    let add x = seq { yield x + 1 }
    add $ 12
There's some inconsistency in the F# specification around this point. Section 3.7 of the F# spec defines symbolic operators as
regexp first-op-char = !%&*+-./<=>@^|~
regexp op-char = first-op-char | ?
token quote-op-left =
    | <@ <@@
token quote-op-right =
    | @> @@>
token symbolic-op =
    | ?
    | ?<-
    | first-op-char op-char*
    | quote-op-left
    | quote-op-right
(and $ also doesn't appear as a symbolic keyword in section 3.6), which would indicate that it's wrong for the compiler to accept ( $ ) as an operator.
However, section 4.4 (which covers operator precedence) includes these definitions:
infix-or-prefix-op :=
    +, -, +., -., %, &, &&
prefix-op :=
    infix-or-prefix-op
    ~ ~~ ~~~ (and any repetitions of ~)
    !OP (except !=)
infix-op :=
    infix-or-prefix-op
    -OP +OP || <OP >OP = |OP &OP ^OP *OP /OP %OP !=
    (or any of these preceded by one or more ‘.’)
    :=
    ::
    $
    or
    ?
and the following table of precedence and associativity does contain $ (but no indication that $ can appear as one character in any longer symbolic operator). Consider filing a bug so that the spec can be made consistent one way or the other.

Return integer parsed in Parser integer >> eof using bind

Using the trifecta library, I am supposed to parse an integer string that does not contain trailing letters and return the integer parsed:
Prelude> parseString (yourFuncHere) mempty "123"
Success 123
Prelude> parseString (yourFuncHere) mempty "123abc"
Failure (interactive):1:4: error: expected: digit,
end of input
123abc<EOF>
I have been able to do this using do notation like so:
x <- decimal
eof
return x
But I have been unsuccessful in translating this to bind/lambdas.
This does not keep the parsed number, but is correct otherwise:
decimal >> eof
I guess I should start like this
decimal >>= \x -> eof
but after this, every permutation I have tried does not work. How do I return the number parsed and check for eof using bind syntax instead of do?
You would need to do
decimal >>= (\x -> (eof >> return x))
The eof combinator does not return anything useful (its result is just ()), so you have to return the value you want yourself.
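To see the three spellings side by side without installing trifecta, here is a small sketch that swaps in ReadP from base as a stand-in parser. The names integer, integerDo, integerBind and integerApplicative are made up for this illustration; the desugaring itself is the same for any monadic parser.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP (ReadP, eof, munch1, readP_to_S)

-- Stand-in for trifecta's decimal: one or more digits read as an Integer.
integer :: ReadP Integer
integer = read <$> munch1 isDigit

-- The do-notation version from the question.
integerDo :: ReadP Integer
integerDo = do
  x <- integer
  eof
  return x

-- The same parser written with explicit bind and a lambda.
integerBind :: ReadP Integer
integerBind = integer >>= \x -> eof >> return x

-- An applicative alternative: run both parsers, keep the left result.
integerApplicative :: ReadP Integer
integerApplicative = integer <* eof

main :: IO ()
main = do
  print (readP_to_S integerBind "123")     -- [(123,"")]
  print (readP_to_S integerBind "123abc")  -- [] because eof fails
```

The lambda body extends as far to the right as possible, so the parentheses in decimal >>= (\x -> (eof >> return x)) are optional.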

Generating a parser given a list of tokens

Background
I'm trying to implement a date printing and parsing system using Parsec.
I have successfully implemented a printing function of type
showDate :: String -> Date -> Parser String
It parses a formatting string and creates a new string based on the tokens found in that formatting string.
For example
showDate "%d-%m-%Y" $ Date 2015 3 17
has the output Right "17-3-2015"
I already wrote a tokenizer to use in the showDate function, so I thought that I could just use the output of that to somehow generate a parser using the function readDate :: [Token] -> Parser Date. My idea quickly came to a halt as I realised I had no idea how to implement this.
What I want to accomplish
Assume we have the following functions and types (the implementation doesn't matter):
data Token = DayNumber | Year | MonthNumber | DayOrdinal | Literal String
-- Parses four digits and returns an integer
pYear :: Parser Integer
-- Parses two digits and returns an integer
pMonthNum :: Parser Int
-- Parses two digits and returns an integer
pDayNum :: Parser Int
-- Parses two digits and an ordinal suffix and returns an integer
pDayOrd :: Parser Int
-- Parses a string literal
pLiteral :: String -> Parser String
The parser readDate [DayNumber,Literal "-",MonthNumber,Literal "-",Year] should be equivalent to
do
    d <- pDayNum
    pLiteral "-"
    m <- pMonthNum
    pLiteral "-"
    y <- pYear
    return $ Date y m d
Similarly, the parser readDate [Literal "~~", MonthNumber,Literal "hello",DayNumber,Literal " ",Year] should be equivalent to
do
    pLiteral "~~"
    m <- pMonthNum
    pLiteral "hello"
    d <- pDayNum
    pLiteral " "
    y <- pYear
    return $ Date y m d
My intuition suggests there's some kind of concat/map/fold using monad bindings that I can use for this, but I have no idea.
Questions
Is parsec the right tool for this?
Is my approach convoluted or ineffective?
If not, how do I achieve this functionality?
If so, what should I try to do instead?
Your Tokens are instructions in a small language for date formats; a program in that language is a [Token].
import Data.Functor
import Text.Parsec
import Text.Parsec.String
data Date = Date Int Int Int deriving (Show)
data Token = DayNumber | Year | MonthNumber | Literal String
In order to interpret this language, we need a type that represents the state of the interpreter. We start off not knowing any of the components of the Date and then discover them as we encounter DayNumber, Year, or MonthNumber. The following DateState represents the state of knowing or not knowing each of the components of the Date.
data DateState = DateState {dayState :: (Maybe Int), monthState :: (Maybe Int), yearState :: (Maybe Int)}
We will start interpreting a [Token] with DateState Nothing Nothing Nothing.
Each Token will be converted into a function that reads the DateState and produces a parser that computes the new DateState.
readDateToken :: Token -> DateState -> Parser DateState
readDateToken (DayNumber) ds =
    do
        day <- pNatural
        return ds {dayState = Just day}
readDateToken (MonthNumber) ds =
    do
        month <- pNatural
        return ds {monthState = Just month}
readDateToken (Year) ds =
    do
        year <- pNatural
        return ds {yearState = Just year}
readDateToken (Literal l) ds = string l >> return ds
pNatural :: Num a => Parser a
pNatural = fromInteger . read <$> many1 digit
To read a date interpreting a [Token] we will first convert it into a list of functions that decide how to parse a new state based on the current state with map readDateToken :: [Token] -> [DateState -> Parser DateState]. Then, starting with a parser that succeeds with the initial state return (DateState Nothing Nothing Nothing) we will bind all of these functions together with >>=. If the resulting DateState doesn't completely define the Date we will complain that the [Token] was invalid. We also could have checked this ahead of time. If you want to include invalid date errors as parsing errors this would also be the place to check that the Date is valid and doesn't represent a non-existent date like April 31st.
readDate :: [Token] -> Parser Date
readDate tokens =
    do
        dateState <- foldl (>>=) (return (DateState Nothing Nothing Nothing)) . map readDateToken $ tokens
        case dateState of
            DateState (Just day) (Just month) (Just year) -> return (Date day month year)
            _ -> fail "Date format is incomplete"
We will run a few examples.
runp p s = runParser p () "runp" s
main = do
    print . runp (readDate [DayNumber,Literal "-",MonthNumber,Literal "-",Year]) $ "12-3-456"
    print . runp (readDate [Literal "~~", MonthNumber,Literal "hello",DayNumber,Literal " ",Year]) $ "~~3hello12 456"
    print . runp (readDate [DayNumber,Literal "-",MonthNumber,Literal "-",Year,Literal "-",Year]) $ "12-3-456-789"
    print . runp (readDate [DayNumber,Literal "-",MonthNumber]) $ "12-3"
This results in the following outputs. Notice that when we asked to read the Year twice, the second of the two years was used in the Date. You can choose a different behavior by modifying the definitions for readDateToken and possibly modifying the DateState type. When the [Token] didn't specify how to read one of the date fields we get the error Date format is incomplete with a slightly incorrect description; this could be improved upon.
Right (Date 12 3 456)
Right (Date 12 3 456)
Right (Date 12 3 789)
Left "runp" (line 1, column 5):
unexpected end of input
expecting digit
Date format is incomplete
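The fold of binds used in readDate can also be written with foldM from Control.Monad, which threads the state through each token in turn. The sketch below is not the code from this answer: it swaps Parsec for ReadP from base so that it runs without extra packages, and the names step, natural and readState are made up for the illustration.

```haskell
import Control.Monad (foldM)
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP (ReadP, eof, munch1, readP_to_S, string)

data Token = DayNumber | MonthNumber | Year | Literal String

-- Day, month and year, each unknown until its token has been read.
data DateState = DateState (Maybe Int) (Maybe Int) (Maybe Int)
  deriving (Eq, Show)

-- One or more digits read as an Int.
natural :: ReadP Int
natural = read <$> munch1 isDigit

-- Interpret one token: parse some input and update the state.
step :: DateState -> Token -> ReadP DateState
step (DateState _ m y) DayNumber   = (\d -> DateState (Just d) m y) <$> natural
step (DateState d _ y) MonthNumber = (\m -> DateState d (Just m) y) <$> natural
step (DateState d m _) Year        = (\y -> DateState d m (Just y)) <$> natural
step ds                (Literal l) = ds <$ string l

-- foldM plays the role of foldl (>>=) (return initial) . map readDateToken.
readState :: [Token] -> ReadP DateState
readState = foldM step (DateState Nothing Nothing Nothing)

main :: IO ()
main = print (readP_to_S (readState fmt <* eof) "12-3-456")
  where fmt = [DayNumber, Literal "-", MonthNumber, Literal "-", Year]
```

Running main prints [(DateState (Just 12) (Just 3) (Just 456),"")]. Checking that all three fields are Just, as readDate does above, would be the next step.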

Haskell: Traverse through a String/Text File

I am trying to read a script file then process and output it to a html file. In my script file, whenever there is a #title(this is a title), I will add tag [header] this is a title [/header] in my html output. So my approach is to first read the script file, write the content to a string, process the string, then write the string to html file.
In order to recognize the #title, I will need to read the string character by character. When I read '#', I will need to check whether the next characters are t i t l e.
QUESTION: How do I traverse through a string (which is a list of char) in Haskell?
You could use a simple recursion trick, for example
findTag [] = -- end of list code.
findTag ('#':xs)
    | take 5 xs == "title" = -- your code for #title
    | otherwise = findTag xs
findTag (_:xs) = findTag xs
Basically, you pattern match on whether the next character (the head of the list) is '#', and then check whether the next five characters form "title". If so, you can continue with your parsing code. If the next character isn't '#', you simply continue recursing. Once the list is empty, you reach the first pattern match.
Someone else might have a better solution.
I hope this answers your question.
edit:
For a bit more flexibility, if you want to find a specific tag you could do this:
findTag [] _ = -- end of list code.
findTag ('#':xs) tagName
    | take (length tagName) xs == tagName = -- your code for the tag
    | otherwise = findTag xs tagName
findTag (_:xs) tagName = findTag xs tagName
This way if you do
findTag text "title"
You'll specifically look for the title, and you can always change the tagname to whatever you want.
Another edit:
findTag :: String -> String -> String
findTag [] _ = "" -- end of list: no tag was found
findTag ('#':xs) tagName
    | take tLength xs == tagName = getTagContents tLength xs
    | otherwise = findTag xs tagName
    where tLength = length tagName
findTag (_:xs) tagName = findTag xs tagName
getTagContents :: Int -> String -> String
getTagContents len = takeWhile (/=')') . drop (len + 1)
To be honest, it's getting a bit messy, but here's what's happening:
You first drop the length of the tagName, then one more for the open bracket, and then you finish off by using takeWhile to take the characters until the closing bracket.
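Putting these pieces together, a sketch of the whole replacement using only base might look like the following. The name renderTitles is made up for this illustration, and the choice to leave an unclosed #title( untouched is an assumption, not something the question specifies.

```haskell
import Data.List (stripPrefix)

-- Hypothetical helper: rewrite every "#title(...)" as "[header]...[/header]".
-- Anything that is not a complete tag is copied through unchanged.
renderTitles :: String -> String
renderTitles [] = []
renderTitles s@(c:cs) =
  case stripPrefix "#title(" s of
    -- The tag only counts if a closing bracket follows the title text.
    Just rest
      | (title, ')':after) <- break (== ')') rest ->
          "[header]" ++ title ++ "[/header]" ++ renderTitles after
    -- Not a tag (or an unclosed one): keep the character and move on.
    _ -> c : renderTitles cs

main :: IO ()
main = putStrLn (renderTitles "aldsfj#title(this is a title)sdlfkj")
```

stripPrefix replaces the take/length comparison, and break splits the remaining text at the first closing bracket.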
Evidently your problem falls into the parsing category. As wisely stated by Daniel Wagner, for maintainability reasons you're much better off approaching it generally with a parser.
Another thing is if you want to work with textual data efficiently, you're better off using Text instead of String.
Here's how you could solve your problem using the Attoparsec parser library:
-- For autocasting of hardcoded strings to `Text` type
{-# LANGUAGE OverloadedStrings #-}
-- Import a way more convenient prelude, excluding symbols conflicting
-- with the parser library. See
-- http://hackage.haskell.org/package/classy-prelude
import ClassyPrelude hiding (takeWhile, try)
-- Exclude the standard Prelude
import Prelude ()
import Data.Attoparsec.Text
-- A parser and an inplace converter for title
title = do
    string "#title("
    r <- takeWhile $ notInClass ")"
    string ")"
    return $ "[header]" ++ r ++ "[/header]"
-- A parser which parses the whole document to parts which are either
-- single-character `Text`s or modified titles
parts =
    (try endOfInput >> return []) ++
    ((:) <$> (try title ++ (singleton <$> anyChar)) <*> parts)
-- The topmost parser which concats all parts into a single text
top = concat <$> parts
-- A sample input
input = "aldsfj#title(this is a title)sdlfkj#title(this is a title2)"
-- Run the parser and output result
main = print $ parseOnly top input
This outputs
Right "aldsfj[header]this is a title[/header]sdlfkj[header]this is a title2[/header]"
P.S. ClassyPrelude reimplements ++ as an alias for Monoid's mappend, so you can replace it with mappend, <> or Alternative's <|> if you want.
For pattern search-and-replace, you can use streamEdit from the replace-megaparsec package.
import Control.Monad (void)
import Data.Void (Void)
import Replace.Megaparsec
import Text.Megaparsec
import Text.Megaparsec.Char
title :: Parsec Void String String
title = do
    void $ string "#title("
    someTill anySingle $ string ")"
editor t = "[header]" ++ t ++ "[/header]"
streamEdit title editor " #title(this is a title) "
" [header]this is a title[/header] "

This F# code is not working

This is not working...
I get error FS0001: The type 'string' is not compatible with the type 'seq<string>'
for the last line. Why?
let rec Parse (charlist) =
    match charlist with
    | head :: tail ->
        printf "%s " head
        Parse tail
    | [] -> None
Parse (Seq.toList "this is a sentence.") |> ignore
The problem is that printf "%s " head means that head must be a string, but you actually want it to be a char, so you'll see that Parse has inferred type string list -> 'a option. Therefore, F# expects Seq.toList to be applied to a string seq, not a string.
The simple fix is to change the line doing the printing to printf "%c " head.

Resources