Replace text in a given file with FAKE (F# Make)

I am new to FAKE and am trying to implement the following:
I have a file with more than 100 lines, and I want to change a few of them. For example, say I want to change the second line from IFR.SIIC._0.12 to IFR.SIIC._0.45.
How do I do this?
Should I use ReplaceInFile or RegexReplaceInFileWithEncoding?

There are many functions that could help you: which one you'll pick will depend on how you'd prefer to write your code. For example, ReplaceInFile wants you to supply it with a function, while RegexReplaceInFileWithEncoding wants you to give it a regular expression (in string form, not as a Regex object). Depending on what text you want to replace, one might be easier than the other. For example, you could use ReplaceInFile like so:
Target "ChangeText" (fun _ ->
    // Note *no* !! operator to change a single file.
    // The @ makes this a verbatim string, so the "\n" in "\new" is not read as a newline.
    @"D:\Files\new\oneFile.txt"
    |> ReplaceInFile (fun input ->
        match input with
        | "IFR.SIIC._0.12" -> "IFR.SIIC._0.45"
        | "another string" -> "its replacement"
        | s -> s // Anything else gets returned unchanged
    )
)
That would be useful if, for example, you have a set of specific strings that you want to match, in just a single file. However, there's a simpler function called ReplaceInFiles (note the plural) which allows you to replace text in multiple files at once. Also, instead of taking a function as its parameter, ReplaceInFiles takes a sequence of (old,new) pairs. This is often easier to write:
let stringsToReplace = [
    ("IFR.SIIC._0.12", "IFR.SIIC._0.45") ;
    ("another string", "its replacement")
]
Target "ChangeText" (fun _ ->
    // Verbatim string (@) so that "\n" in "\new" is not read as a newline
    !! @"D:\Files\new\*.txt"
    |> ReplaceInFiles stringsToReplace
)
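As a rough illustration of the (old, new) pair idea only, here is a minimal Python sketch. It is not FAKE code: `replace_pairs` is a hypothetical helper, and applying the pairs one after another is an assumption of this sketch, not a statement about how ReplaceInFiles works internally.

```python
def replace_pairs(text, pairs):
    """Apply each (old, new) pair to the text in order and return the result."""
    for old, new in pairs:
        text = text.replace(old, new)
    return text


# Same pairs as in the F# list above
strings_to_replace = [
    ("IFR.SIIC._0.12", "IFR.SIIC._0.45"),
    ("another string", "its replacement"),
]

line = "version = IFR.SIIC._0.12 (another string)"
print(replace_pairs(line, strings_to_replace))
```

Because the pairs are applied in order in this sketch, a later pair could also rewrite text produced by an earlier one, so it is worth keeping the search strings distinct.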
If you want to specify your search and replacement strings in the form of a regular expression, then you'd want RegexReplaceInFileWithEncoding or RegexReplaceInFilesWithEncoding (note the plural: the former takes a single file while the latter takes multiple files). I'll just show you an example of the multiple-files version:
Target "ChangeText" (fun _ ->
    !! @"D:\Files\new\*.txt"
    |> RegexReplaceInFilesWithEncoding @"(?<part1>\w+)\.(?<part2>\w+)\._0\.12"
                                       @"${part1}.${part2}._0.45"
                                       System.Text.Encoding.UTF8
)
That would allow you to change IFR.SIIC._0.12 to IFR.SIIC._0.45 and ABC.WXYZ._0.12 to ABC.WXYZ._0.45.
Which one of these you'll want to use depends on how many files you have, and how many different replacement strings you need (and how hard it would be to write them as a regex).
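If you'd like to sanity-check the named-group pattern before pointing it at real files, a quick sketch in Python can help. This is not FAKE or .NET code: Python spells named groups as (?P<name>...) and group references as \g<name>, and `bump_version` is a hypothetical helper name.

```python
import re

# Python spelling of the .NET pattern: (?<name>...) becomes (?P<name>...)
PATTERN = re.compile(r"(?P<part1>\w+)\.(?P<part2>\w+)\._0\.12")
# .NET's ${name} replacement syntax becomes \g<name> in Python
REPLACEMENT = r"\g<part1>.\g<part2>._0.45"


def bump_version(text):
    """Rewrite every '<word>.<word>._0.12' occurrence so it ends in '._0.45'."""
    return PATTERN.sub(REPLACEMENT, text)


sample = "IFR.SIIC._0.12\nABC.WXYZ._0.12\nunrelated line\n"
print(bump_version(sample))
```

Lines that do not match the pattern pass through unchanged, which is the behaviour you would want when running the replacement over a whole file.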

Related

Can I match against a string that contains non-ASCII characters?

I am writing a program in which I am dealing with strings of the form, e.g., "\001SOURCE\001". That is, the strings contain alphanumeric text with an ASCII character of value 1 at each end. I am trying to write a function to match strings like these. I have tried a match like this:
handle(<<1,"SOURCE",1>>) -> ok.
But the match does not succeed. I have tried a few variations on this theme, but all have failed.
Is there a way to match a string that contains mostly alphanumeric text, with the exception of a non-alpha character at each end?
You can also do the following
[1] ++ "SOURCE" ++ [1] == "\001SOURCE\001".
Or convert to binary using list_to_binary and pattern match as
<<1,"SOURCE",1>> == <<"\001SOURCE\001">>.
Strings are syntactic sugar for lists. Lists are a type and binaries are a different type, so your match isn't working out because you're trying to match a list against a binary (same problem if you tried to match {1, "STRING", 1} to it, tuples aren't lists).
Remembering that strings are lists, we have a few options:
handle([1,83,84,82,73,78,71,1]) -> ok.
This will work just fine. Another, more readable (but uglier, sort of) way is to use character literals:
handle([1, $S,$T,$R,$I,$N,$G, 1]) -> ok.
Yet another way would be to strip the non-character values, and then pass that on to a handler:
handle(String) -> dispatch(string:strip(String, both, 1)).
dispatch("STRING") -> do_stuff();
dispatch("OTHER") -> do_other_stuff().
And, if at all possible, the best case is if you just stop using strings for text values entirely (if that's feasible) and process binaries directly instead. The syntax of binaries is much friendlier, they take up far fewer resources, and quite a few binary operations are significantly more efficient than their string/list counterparts. But that doesn't fit every case! (But it's awesome when dealing with sockets...)
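The two ideas above (a value of one type never matches a value of another type, and stripping the framing characters before dispatching) can be sketched outside Erlang too. The following Python sketch is only an analogy: Python's str/bytes split stands in for Erlang's list/binary split, and `handle` and the handler table are hypothetical names.

```python
SOH = "\x01"  # the ASCII character with value 1 used as a frame marker


def handle(message):
    """Strip the SOH framing characters, then dispatch on the payload."""
    payload = message.strip(SOH)
    handlers = {
        "STRING": lambda: "did stuff",
        "OTHER": lambda: "did other stuff",
    }
    if payload not in handlers:
        raise ValueError("no handler for payload %r" % payload)
    return handlers[payload]()


# A text string never equals a byte string, even with the same characters,
# much like an Erlang list never matches a binary.
print("\x01STRING\x01" == b"\x01STRING\x01")
print(handle("\x01STRING\x01"))
```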

Haskell/Parsec: How do you use the functions in Text.Parsec.Indent?

I'm having trouble working out how to use any of the functions in the Text.Parsec.Indent module provided by the indents package for Haskell, which is a sort of add-on for Parsec.
What do all these functions do? How are they to be used?
I can understand the brief Haddock description of withBlock, and I've found examples of how to use withBlock, runIndent and the IndentParser type here, here and here. I can also understand the documentation for the four parsers indentBrackets and friends. But many things are still confusing me.
In particular:
(1) What is the difference between withBlock f a p and
do aa <- a
   pp <- block p
   return (f aa pp)
Likewise, what's the difference between withBlock' a p and do {a; block p}?
(2) In the family of functions indented and friends, what is ‘the level of the reference’? That is, what is ‘the reference’?
(3) Again, with the functions indented and friends, how are they to be used? With the exception of withPos, it looks like they take no arguments and are all of type IParser () (IParser defined like this or this) so I'm guessing that all they can do is to produce an error or not and that they should appear in a do block, but I can't figure out the details.
I did at least find some examples on the usage of withPos in the source code, so I can probably figure that out if I stare at it for long enough.
(4) <+/> comes with the helpful description “<+/> is to indentation sensitive parsers what ap is to monads” which is great if you want to spend several sessions trying to wrap your head around ap and then work out how that's analogous to a parser. The other three combinators are then defined with reference to <+/>, making the whole group unapproachable to a newcomer.
Do I need to use these? Can I just ignore them and use do instead?
(5) The ordinary lexeme combinator and whiteSpace parser from Parsec will happily consume newlines in the middle of a multi-token construct without complaining. But in an indentation-style language, sometimes you want to stop parsing a lexical construct or throw an error if a line is broken and the next line is indented less than it should be. How do I go about doing this in Parsec?
(6) In the language I am trying to parse, ideally the rules for when a lexical structure is allowed to continue on to the next line should depend on what tokens appear at the end of the first line or the beginning of the subsequent line. Is there an easy way to achieve this in Parsec? (If it is difficult then it is not something which I need to concern myself with at this time.)
So, the first hint is to take a look at IndentParser
type IndentParser s u a = ParsecT s u (State SourcePos) a
I.e. it's a ParsecT keeping an extra close watch on SourcePos, an abstract container which can be used to access, among other things, the current column number. So, it's probably storing the current "level of indentation" in SourcePos. That'd be my initial guess as to what "level of reference" means.
In short, indents gives you a new kind of Parsec which is context sensitive—in particular, sensitive to the current indentation. I'll answer your questions out of order.
(2) The "level of reference" is the current parser state's "belief" about where this indentation level starts. To make that clearer, let me give some test cases under (3).
(3) In order to start experimenting with these functions, we'll build a little test runner. It'll run the parser with a string that we give it and then unwrap the inner State part using an initialPos which we get to modify. In code:
import Text.Parsec
import Text.Parsec.Pos
import Text.Parsec.Indent
import Control.Monad.State

testParse :: (SourcePos -> SourcePos)
          -> IndentParser String () a
          -> String -> Either ParseError a
testParse f p src = fst $ flip runState (f $ initialPos "") $ runParserT p () "" src
(Note that this is almost runIndent, except I gave a backdoor to modify the initialPos.)
Now we can take a look at indented. By examining the source, I can tell it does two things. First, it'll fail if the current SourcePos column number is less-than-or-equal-to the "level of reference" stored in the SourcePos stored in the State. Second, it somewhat mysteriously updates the State SourcePos's line counter (not column counter) to be current.
Only the first behavior is important, to my understanding. We can see the difference here.
>>> testParse id indented ""
Left (line 1, column 1): not indented
>>> testParse id (spaces >> indented) " "
Right ()
>>> testParse id (many (char 'x') >> indented) "xxxx"
Right ()
So, in order to have indented succeed, we need to have consumed enough whitespace (or anything else!) to push our column position out past the "reference" column position. Otherwise, it'll fail saying "not indented". Similar behavior exists for the next three functions: same fails unless the current position and reference position are on the same line, sameOrIndented fails if the current column is strictly less than the reference column, unless they are on the same line, and checkIndent fails unless the current and reference columns match.
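To make the column and line comparisons concrete, here is a toy model in Python of three of the checks as described above. It is only a sketch of the comparisons, not the indents library: `Pos` and the three predicates are hypothetical stand-ins that return True where the real parser would succeed.

```python
from typing import NamedTuple


class Pos(NamedTuple):
    """A source position; both fields are 1-based, as in Parsec."""
    line: int
    column: int


def indented(current, reference):
    # Fails when the current column is <= the reference column.
    return current.column > reference.column


def same(current, reference):
    # Succeeds only when both positions are on the same line.
    return current.line == reference.line


def check_indent(current, reference):
    # Succeeds only when both positions are in the same column.
    return current.column == reference.column


# Mirror the three test runs above: consume 0, 1 and 4 characters on line 1.
reference = Pos(line=1, column=1)
for consumed in (0, 1, 4):
    current = Pos(line=1, column=1 + consumed)
    print(consumed, indented(current, reference))
```

With nothing consumed, the current column equals the reference column and the check fails, which corresponds to the "not indented" error in the first test run.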
withPos is slightly different. It's not just an IndentParser, it's an IndentParser combinator: it transforms the input IndentParser into one that thinks the "reference column" (the SourcePos in the State) is exactly where it was when we called withPos.
This gives us another hint, btw. It lets us know we have the power to change the reference column.
(1) So now let's take a look at how block and withBlock work using our new, lower level reference column operators. withBlock is implemented in terms of block, so we'll start with block.
-- simplified from the actual source
block p = withPos $ many1 (checkIndent >> p)
So, block resets the "reference column" to be whatever the current column is and then consumes one or more parses of p, as long as each one starts at exactly this newly set "reference column". Now we can take a look at withBlock:
withBlock f a p = withPos $ do
    r1 <- a
    r2 <- option [] (indented >> block p)
    return (f r1 r2)
So, it resets the "reference column" to the current column, parses a single a parse, tries to parse an indented block of ps, then combines the results using f. Your implementation is almost correct, except that you need to use withPos to choose the correct "reference column".
Then, once you have withBlock, withBlock' = withBlock (\_ bs -> bs).
(5) So, indented and friends are exactly the tools for doing this: they'll cause a parse to immediately fail if it's indented incorrectly with respect to the "reference position" chosen by withPos.
(4) Yes, don't worry about these guys until you learn how to use Applicative style parsing in base Parsec. It's often a much cleaner, faster, simpler way of specifying parses. Sometimes they're even more powerful, but if you understand Monads then they're almost always completely equivalent.
(6) And this is the crux. The tools mentioned so far can only do indentation failure if you can describe your intended indentation using withPos. Quickly, I don't think it's possible to specify withPos based on the success or failure of other parses... so you'll have to go another level deeper. Fortunately, the mechanism that makes IndentParsers work is obvious—it's just an inner State monad containing SourcePos. You can use lift :: MonadTrans t => m a -> t m a to manipulate this inner state and set the "reference column" however you like.
Cheers!

Compilation error complaining about value not being a function

I am experimenting with F# for one of the utility tools we need, in which we want to trawl through a folder of XML files and look for a particular tag. If it is found, we insert another, similar tag along with it. Finally, we output all the filenames for which such additional tags have been inserted. But I am getting a compilation error that I can't make much sense of.
let configFile =
    Directory.GetFiles(Path.Combine("rootdir", "relativepath"), @"*.xml")
    |> Seq.map(fun configFileName ->
        let xmlNavigator = XPathDocument(configFileName).CreateNavigator()
        let node = xmlNavigator.SelectSingleNode(@"Product/ABc[@type='xyz']")
        match node with
        | null -> "not configuration present"
        | _ ->
            let nodeAppender() = node.InsertAfter("<Risk type=""abc1"" methodology=""xyz1""/>")
            let parentNode = node.SelectAncestors(XPathNodeType.Root, false)
            parentNode.Current.OuterXml)
    |> Seq.iter (printfn "%s")
The compilation error is as below:
This value is not a function and cannot be applied
Your string is escaped improperly. It should be:
node.InsertAfter("<Risk type=\"abc1\" methodology=\"xyz1\"/>")
EDIT: Apparently I was typing this as Brian posted his answer. Either escaping each quote char or prefixing the string as-is with @ will work.
It would help to point out what line/column the error location is at.
At a glance, in nodeAppender, it looks like you left off the @ on the string literal, which means it is five strings in a row (rather than one string with escaped quotes), which may be the cause of the error.

What's with "Uppercase variable identifiers should not generally be used in patterns..."?

This compiler like:
let test Xf Yf = Xf + Yf
This compiler no like:
let test Xfd Yfd = Xfd + Yfd
Warning:
Uppercase variable identifiers should not generally be used in patterns, and may indicate a misspelt pattern name.
Maybe I'm not googling properly, but I haven't managed to track down anything which explains why this is the case for function parameters...
I agree that this error message looks a bit mysterious, but there is a good motivation for it. According to the F# naming guidelines, cases of discriminated unions should be named using PascalCase, and the compiler is trying to make sure that you don't accidentally misspell the name of a case in pattern matching.
For example, if you have the following union:
type Side =
    | Left
    | Right
You could write the following function that prints "ok" when the argument is Left and "wrong!" otherwise:
let foo a =
    match a with
    | Lef -> printfn "ok"
    | _ -> printfn "wrong!"
There is a typo in the code - I wrote just Lef - but the code is still valid, because Lef can be interpreted as a new variable and so the matching assigns whatever side to Lef and always runs the first case. The warning about uppercase identifiers helps to avoid this.
F# tries to enforce case rules for active patterns. Consider what this code does:
let f X =
    match X with
    | X -> 1
    | _ -> 2
This is quite confusing. Also, function parameters are similar to patterns; for example, you can write
let f (a,b,_) = a,b
I'm not quite sure why the third letter triggers the warning, though.

String splitting problems in Erlang

I've been playing around with the splitting of atoms and have a problem with strings. The input data will always be an atom that consists of some letters and then some numbers, for instance ms444, r64 or min1. Since the function lists:splitwith/2 takes a list the atom is first converted into a list:
24> lists:splitwith(fun (C) -> is_atom(C) end, [m,s,4,4,4]).
{[m,s],[4,4,4]}
25> lists:splitwith(fun (C) -> is_atom(C) end, atom_to_list(ms444)).
{[],"ms444"}
26> atom_to_list(ms444).
"ms444"
I want to separate the letters from the numbers and I've succeeded in doing that when using a list, but since I start out with an atom I get a "string" as result to put into my splitwith function...
Is it interpreting each item in the list as a string or what is going on?
You might want to have a look at the string module documentation:
http://www.erlang.org/doc/man/string.html
The following function might interest you:
tokens(String, SeparatorList) -> Tokens
Since strings in Erlang are just a list() of integer(), the fun tests whether each item is an atom() when it is in fact an integer(). If the test is changed to look for lower-case letters, it works:
29> lists:splitwith(fun (C) -> (C >= $a) and (C =< $z) end, atom_to_list(ms444)).
{"ms","444"}
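The same "take the leading letters, keep the rest" split can be sketched in Python, which may make the character-code view easier to see. This is an illustration only, not Erlang: `split_letters` is a hypothetical helper, and it assumes the letters are lower-case ASCII as in the examples from the question.

```python
from itertools import takewhile


def split_letters(name):
    """Split a name such as 'ms444' into its leading letters and the remainder."""
    letters = "".join(takewhile(lambda c: "a" <= c <= "z", name))
    return letters, name[len(letters):]


# Each character is just an integer code underneath, as in an Erlang string.
print([ord(c) for c in "ms444"])
print(split_letters("ms444"))
```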
An atom in Erlang is a named constant and not a variable (or at least not what a variable is in an imperative language).
You should really not create atoms dynamically (that is, don't convert things to atoms at runtime).
They are used more in pattern matching and send/receive code:
Pid ! {matchthis, X}
receive
    {foobar,Y} -> doY(Y);
    {matchthis,X} -> doX(X);
    Other -> doother(Other)
end
A variable, like X, could be set to an atom. For example X=if 1==1 -> ok; true -> fail end. I could be suffering from poor imagination, but I can't think of a reason why you would want to parse an atom. You should be in charge of what atoms you write and not use list_to_atom(CharIntegerList).
Can you perhaps give more of an overview of what you'd like to accomplish?
A "string" in Erlang is not a primitive type: it is just a list() of integers(). So if you want to "separate" the letters from the digits, you'll have to do comparison with the integer representation of the characters.

Resources