F# Workflows/Development Process

(I'm using the word "workflow" - not in the sense of async workflows - but rather in the "git workflow" sense, that is, how you use it as part of your development)
Having played around with F# for a while, I've started developing my first F# app. I come from C#/VB. Having watched various demos and talks, I've (rightly or wrongly) started off using fsi as the main development "engine" and work things up within it. If I hit a problem that I need to debug, I tend to break the problematic function into smaller pieces and check that those work.
However, in order to keep the amount of code in fsi manageable, once I am happy with what I have done, I then move it into a .fs file and #load that file back into fsi. As the app gets bigger, this begins to feel a bit clunky: when I need to refactor, I end up bringing content back in from the .fs file, changing it, and running things until something works again, before pushing the code back out into the .fs file. Further, this style isn't really a test-first approach, so I am not getting the benefit of building up a set of tests. (I also miss the ability to set breakpoints and step through the code, which, it seems to me, in certain situations such as recursion can be quicker for diagnosing errors than breaking out parts of a function, though maybe this is available in VS11 and I'm not set up right.) So I think I'm perhaps not doing things optimally, or else not thinking about things in the right way.
I was wondering if others could describe how they develop apps. Do you primarily use fsi, or do you start with TDD? Should the TDD approach be the primary dev vehicle, with fsi used more selectively to aid in, say, the implementation of more complex algorithms, data exploration, etc.?
I have looked at this question and obviously it helpfully points to various TDD frameworks for F#, but I'd still be interested to find out the workflow of seasoned F# developers.
Many thx
S

I think you're on the right track.
Development process is a matter of taste. I'll share my approach anyway.
Start with a few fs files. Each file represents a module, which consists of a group of functions closely related to each other. It doesn't have to be precise from the beginning; you often move things between modules.
Create a few fsx files for quick testing once the skeleton of the modules is ready.
Create a test project and set up NuGet packages. I often use NUnit and FsUnit together.
Whenever fsx scripting gives correct results, move them to test cases. Do this repeatedly.
Include a Program.fs in the main project and compile to an executable in order to debug if needed.
In general, the F# REPL is the main development engine. It gives me instant feedback and allows incremental changes, which are very helpful in prototyping. In my experience TDD is less critical in F#, since the bug rate tends to be lower than in other languages. And I don't test everything; I focus on the main functionality and aim for high test coverage there. The TestDriven.Net add-in or Visual Studio 2012 Premium or Ultimate can give you useful statistics on test coverage.
Using the F# REPL and TDD, I almost never have to use the debugger. Whenever there is wrong behaviour, I stop and think. Since your code doesn't have side effects, you can reason about it easily. Often reasoning and a few print statements give me the right answer.
You can use TDD in the F# REPL with Unquote and FsCheck. The former offers testing via quotations, which is quite impressive. The latter uses a random testing approach, which is attractive for handling corner cases in your code. I find it really useful when your programs have to satisfy certain properties. However, it may take time to learn to use these frameworks properly.

pad gave a great answer that is very practical and useful for a person new to F#. I will give a different means so that others don't think there is only one way F#'ers do it.
Note: If you are very new to programming, then stick with pad's answer; it is much better for a new programmer.
In the Object Oriented world one thinks with objects and in such languages I would start with writing objects down on paper and working with various diagrams such as use-case, state transition, sequence diagram, etc., until I felt I had enough to start creating objects in C# cs files, fleshing out the objects with methods, properties, events, etc.
In the functional world I typically start with abstract concepts and convert them into discriminated unions (DU) in an F# fs file, skipping the use of a REPL, i.e. F# Interactive, and then start adding a few functions. After I have a few functions I set up a test project using NUnit and FsUnit via NuGet. Since the DU are abstract, the test cases are typically harder to write, so I create printers for the DU and then insert them into the test case where I capture result output from the printer in the NUnit tool, for cut and paste back into the test case making changes as necessary. See these for examples of why I don't write them from scratch by hand.
Once I have the abstract DU done, I then can move onto the code to convert the human/concrete form into the abstract DU and convert the abstract DU into human/concrete form. In some cases these would be parsers and pretty printers.
The main point I am trying to make is that I don't focus on the tools I use but on the abstract concept of the problem and bring the tools in when needed.
I will note that I also program in PROLOG and there I do start with the REPL and move the code to a store once the logic works. So I am not opposed to using a REPL, it's just a different way of approaching the problem.
EDIT
Per a request by Ken for an example.
See: Discriminated Unions (F#) and look for the section
Using Discriminated Unions Instead of Object Hierarchies
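So instead of a base type of Shape with inherited types of Circle, EquilateralTriangle, Square and Rectangle, one would create a discriminated union as noted:
type Shape =
    // The value here is the radius.
    | Circle of float
    // The value here is the side length.
    | EquilateralTriangle of double
    // The value here is the side length.
    | Square of double
    // The values here are the height and width.
    | Rectangle of double * double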
As your question would make for a much better independent question and get answers with much better detail than I can give, I would suggest you ask it.
Also if you search for info on this also search with the following substitutions for discriminated union (DU):
Algebraic data type
Generalized algebraic data type (GADT)
Tagged union
Variant
variant record
disjoint union
sum type

Related

Is there a sandboxable compiled programming language similar to Lua

I'm working on a crowd simulator. The idea is people walking around a city in 2D. Think gray rectangles for the buildings and colored dots for the people. Now I want these people to be programmable by other people, without giving them access to the core back end.
I also don't want them to be able to use anything other than the methods I provide for them. Meaning no file access, internet access, RNG, nothing.
They will receive events like "You have just been instructed to go to X" or "You have arrived at P" and such.
The script should then allow them to do things like move_forward or how_many_people_are_in_front_of_me and such.
Now I have found out that Lua and Python are both thousands of times slower than compiled languages (I figured it would be on the order of tens of times slower), which is way too slow for my simulation.
So here's my question: Is there a programming language that is FOSS, lets me sandbox the entire language so that a script can use only the functions I provide, is reasonably fast (something like <10x slower than Java), lets me send events to objects inside that language, and lets me load new classes/objects on the fly?
Don't you think that if there were a scripting language faster than Lua and Python, then it'd be talked about at least as much as they are?
The speed of a scripting language is a rather vague term. Scripting languages are essentially converted to a series of calls to functions written in fast compiled languages. But those functions are usually written to be general, with lots of checks and fail-safes, rather than to be fast. For some problems, few redundant actions stack up and the script translates to essentially the same machine code as the compiled program would have. For other problems, a person knowledgeable about the language might coerce it to translate to essentially the same machine code. For still other problems, the price of convenience stays with the script forever.
If you look at the timings of benchmark tasks, you'll find that there's no consistent winner across them. For one task a language is fastest, for another it is way behind.
It would make sense to gauge language speed at your task by looking at similar tasks in benchmarks. So, which of those problems maps the closest to yours? My guess would be: none.
Now, onto the question of user programs inside your program.
That's how script languages came to existence in the first place. You can read up on why such a language may be slow for example in SICP.
If you evaluate what you expect people to write in their programs, you might decide that you don't need to give them a whole programming language. Then you may give them a simple set of instructions they can use to describe a few branching decisions and value lookups. Your own very performant program will then construct an object that encompasses the described logic. This trick is described here and there.
However, if you keep adding more and more complex commands for users to invoke, you'll just end up inventing your own language. At that point you'll likely wish you'd gone with Lua from the very beginning.
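As a rough illustration of that idea, here is a minimal Python sketch in which user logic is plain data (a list of rules) that the host validates once and turns into a function. Everything named here (Rule, compile_rules, the "people_in_front" lookup, the action names) is hypothetical and only loosely modelled on the function names in the question; a real host would define its own vocabulary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical host vocabulary: the only names a user rule may mention.
ALLOWED_LOOKUPS = {"people_in_front"}
ALLOWED_ACTIONS = {"move_forward", "move_to_side"}
COMPARATORS: Dict[str, Callable[[int, int], bool]] = {
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
    "==": lambda a, b: a == b,
}


@dataclass(frozen=True)
class Rule:
    lookup: str      # which value to read from the simulation state
    op: str          # comparison operator, a key of COMPARATORS
    threshold: int   # value to compare against
    action: str      # action to take when the comparison holds


def compile_rules(rules: List[Rule], default_action: str) -> Callable[[Dict[str, int]], str]:
    """Validate the rules once, then return a plain function the host can call."""
    for rule in rules:
        if rule.lookup not in ALLOWED_LOOKUPS:
            raise ValueError(f"unknown lookup: {rule.lookup}")
        if rule.op not in COMPARATORS:
            raise ValueError(f"unknown operator: {rule.op}")
        if rule.action not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown action: {rule.action}")
    if default_action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {default_action}")

    def decide(state: Dict[str, int]) -> str:
        # First matching rule wins; a rule can only read values from `state`.
        for rule in rules:
            if COMPARATORS[rule.op](state[rule.lookup], rule.threshold):
                return rule.action
        return default_action

    return decide


decide = compile_rules([Rule("people_in_front", ">", 3, "move_to_side")], "move_forward")
print(decide({"people_in_front": 5}))
print(decide({"people_in_front": 1}))
```

Because users never supply code, only data that is checked against a whitelist, there is nothing to sandbox; the cost is that every new capability has to be added to the vocabulary by hand.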
That being said, I don't think the snippet below will run significantly different in compiled code, your own interpreter object, or any embedded scripting language:
if event == "You have just been instructed to go to X":
    set_front_of_me(X)  # call to your function
    n = how_many_people_are_in_front_of_me()  # call to your function
    if n > 3:
        move_to_side()  # call to a function provided by you
    else:
        move_forward()  # call to a function provided by you
Now, if the users needed to do complex computer-sciency stuff, solve NP-hard problems, do machine learning or other matrix multiplications, then yes, that would be slow, provided someone actually troubled themselves with implementing that.
If you get to that point, it seems that there are at least some possibilities to sandbox compiled DLLs (at least in some languages). Or you could compile the users' code yourself to control the functionality they invoke and then plug it in as a library.

Is there a way to preprocess Ruby code and find errors that would occur at runtime?

We have a huge code base and we are generating issues that would have been caught at compile time in typed languages such as Java, but that we are not catching until runtime in Ruby. This is bad, since most of the bugs we generate this way are typos or refactorings that leave some invalid code behind.
Example:
def mysuperfunc
  # some code goes here
  # this was a valid call but not anymore since enforcesecurity
  # signature changed
  #system.enforcesecurity
end
I mean, IDEs can do it, but some guys use Atom or Sublime, so we need something to "compile" and report that kind of issue so they don't reach deployment. What have you been using?
This is generating a small percentage of our bug reports, but since we are forced to produce at a ridiculous pace we don't have 100% code coverage. If there is no tool to help, I'll just make sure everybody uses an IDE and run the reports with tools such as RubyMine.
Our stack includes RSpec, minitest and SimpleCov. We enforce code reviews and multistack deployments (dev, qa, pre-prod, sandbox, prod). And still some issues are reaching the higher levels and making us programmers look bad. I'm not looking for magic, just a little automation that might help a bit.
Unfortunately, the Halting Problem, Rice's Theorem, and all the other Undecidability and Uncomputability Results tell us that it is simply impossible in the general case to statically determine any "interesting" property about the runtime behavior of a program. We cannot even statically determine something as simple as "will it halt", so how are we going to determine "is bug-free"?
There are certain things that can be statically determined, and there are certain restricted programs for which some interesting properties can be statically determined, but largely, this is not possible. And even to the small extent that it is possible, it generally requires the language to be specifically designed to be easy to statically analyze (which Ruby isn't).
That being said, there are certain tools that contain certain heuristics to point out code that may have problems. There are certain coding standards that may help avoid bugs, and there are tools to enforce those coding standards. Keywords to search for are "code quality tools", "linter", "static analyzer", etc. You have already been given examples in the other answers and comments, and given those examples and these keywords, you'll likely find more.
However, I also wanted to discuss something you wrote:
"we are forced to produce at a ridiculous pace we don't have 100% code coverage"
That's a problem, which has to be approached from two sides:
Practice, practice, practice. You need to practice testing and writing high-quality code until it is so natural to you that not doing it actually ends up being harder and slower. It should become second nature to you, such that under pressure, when your mind goes blank, the only thing you know is to write tests and write well-designed, well-factored, high-quality code. Note: I'm talking about deliberate practice, which means setting time aside to really practice … and practice is practice: it's not work, it's not fun, it's not a hobby. If you don't delete the code you wrote immediately after you have written it, you are not practicing, you are working.
Sustainable Pace. You should never develop faster than the pace you could sustain indefinitely while still producing well-tested, well-designed, well-factored, high-quality code, having a fulfilling social life, no stress, plenty of free time, etc. This is something that has to be backed and supported and understood by management.
I'm unaware of anything exactly like you want. However, there are a few gems that will analyze code and warn you about some errors and/or bad practices. Try these:
https://github.com/bbatsov/rubocop
https://github.com/railsbp/rails_best_practices
FLAY
https://rubygems.org/gems/flay
Via the repo https://github.com/seattlerb/flay:
DESCRIPTION:
Flay analyzes code for structural similarities. Differences in literal
values, variable, class, method names, whitespace, programming style,
braces vs do/end, etc are all ignored. Making this totally rad.
[FEATURES:]
Reports differences at any level of code.
Adds a score multiplier to identical nodes.
Differences in literal values, variable, class, and method names are ignored.
Differences in whitespace, programming style, braces vs do/end, etc are ignored.
Works across files.
Add the flay-persistent plugin to work across large/many projects.
Run --diff to see an N-way diff of the code.
Provides conservative (default) and --liberal pruning options.
Provides --fuzzy duplication detection.
Language independent: Plugin system allows other languages to be flayed.
Ships with .rb and .erb.
javascript and others will be available separately.
Includes FlayTask for Rakefiles.
Uses path_expander, so you can use:
dir_arg -- expand a directory automatically
#file_of_args -- persist arguments in a file
-path_to_subtract -- ignore intersecting subsets of files/directories
Skips files matched via patterns in .flayignore (subset format of .gitignore).
Totally rad.
FLOG
https://rubygems.org/gems/flog
Via the repo https://github.com/seattlerb/flog:
DESCRIPTION:
Flog reports the most tortured code in an easy to read pain report.
The higher the score, the more pain the code is in.
[FEATURES:]
Easy to read reporting of complexity/pain.
Uses path_expander, so you can use:
dir_arg – expand a directory automatically
#file_of_args – persist arguments in a file
-path_to_subtract – ignore intersecting subsets of files/directories
SYNOPSIS:
% ./bin/flog -g lib
Total Flog = 1097.2 (17.4 flog / method)
323.8: Flog total
85.3: Flog#output_details
61.9: Flog#process_iter
53.7: Flog#parse_options
...
There is a Ruby gem called Guard that does automated testing. You can set your own custom rules.
For example, you can set it up so that any time you modify certain files, the test framework runs automatically.
Here is the link for guard

What can F# offer for managing nondeterminism?

Using non-deterministic functions is unavoidable in applications that talk to the real world. Making a clear separation between deterministic and non-deterministic is important.
Haskell has the IO monad, which marks the impure context; looking at it, we know that everything outside of it is pure. That is nice, if you ask me: when it comes to unit testing, one can tell which part of the code is ultimately testable and which is not.
I could not find anything that allows separating the two in F#. Does it mean there is just no way to do that?
The distinction between deterministic and non-deterministic function is not captured by the F# type system, but a typical F# system that needs to deal with non-determinism would use some structure (or "design pattern") that clearly separates the two.
If your core model is some computation that does not interact with the world (you only need to collect inputs and run the computation), then you can write most of your code as functional transformations on immutable data structures and then invoke these from some "main" I/O loop.
If you're writing some highly interactive or reactive application then you can use F# agents (here is an introductory article) and structure your application so that the non-determinism is safely contained in individual agents (see more about agent-based architectures).
F# is based on OCaml, and much like OCaml it isn't pure FP. I don't believe there is a way to accomplish your goal in either language.
One way to manage it could be to make up a nominal type that represents the notion of the real world and make sure each non-deterministic function takes its singleton as a parameter. This way all dependent functions would have to pass it along the line. This makes a strong distinction between the two at the cost of some discipline and a bit of extra typing. The good thing about this approach is that it can be verified by the compiler, given that the necessary conditions are met.

Relationship between parsing, highlighting and completion

For some time now I've been thinking about designing a small toy language from scratch, nothing that will "Rule The World", but mostly as an exercise. I realize there is a lot to learn in order to accomplish this.
This question is about three different concepts (parsing, code highlighting and completion) that strike me as extremely similar. Of course, parsing and ASTgen is part of the compilation, while code highlighting and completion is more of a feature of the IDE, yet I wonder what are the similarities and differences.
I need some hints from someone more experienced in this topic. What code can be shared between these concepts and what are the architecture considerations that could help in this sense?
What you want is a syntax-directed structure editor. This is one that combines parsing with AST building and either uses the parser to predict what you can type next (syntax completion), or has a tie to the compiler's last run, so that it can interpret the edit point and see what valid identifiers might come next by inspecting the compiler's symbol table that was last relevant at that point in the code.
The most difficult part is offering the user a seamless experience; she pretty much has to believe she is editing text or (experience with structure editors shows) she will reject it as awkward.
This is a lot of machinery to coordinate and quite a big effort. The good news is that you need a parser anyway for the compiler; if editing also parses, the AST needed by the compiler is essentially available. (Of course you have to worry about batch compiling, too). The compiler has to build a symbol table; so you can use that in the editing completion process. The more difficult news is that the parsers are a lot harder to build; they can't just declare a user-visible syntax error and quit; rather they have to be tolerant of a number of errors extant at the same moment, hold partial ASTs for the pieces, and stitch them together as the errors are removed by the user.
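To make the "tolerant parser" point concrete, here is a minimal Python sketch for a made-up toy language of statements shaped like `name = number ;`. Instead of stopping at the first syntax error, it records an error node for the broken statement, skips to the next `;` (panic-mode recovery, one common technique, not necessarily what any particular editor does) and carries on, so the AST for the healthy statements is still available. The node shapes and the names tokenize, expect and parse are all illustrative assumptions.

```python
import re

# One token per match: identifier, number, '=' or ';', or any other single character.
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[=;]|\S)")


class ParseError(Exception):
    pass


def tokenize(src):
    return TOKEN.findall(src)


def expect(tokens, pos, predicate, what):
    """Return (token, next position) or raise ParseError without consuming input."""
    if pos < len(tokens) and predicate(tokens[pos]):
        return tokens[pos], pos + 1
    found = tokens[pos] if pos < len(tokens) else "end of input"
    raise ParseError(f"expected {what}, found {found!r}")


def parse(src):
    """Parse every statement; broken ones become ('error', message, tokens) nodes."""
    tokens, nodes, pos = tokenize(src), [], 0
    while pos < len(tokens):
        start = pos
        try:
            name, pos = expect(tokens, pos, str.isidentifier, "identifier")
            _, pos = expect(tokens, pos, lambda t: t == "=", "'='")
            value, pos = expect(tokens, pos, str.isdigit, "number")
            _, pos = expect(tokens, pos, lambda t: t == ";", "';'")
            nodes.append(("assign", name, int(value)))
        except ParseError as exc:
            # Recovery: skip to just past the next ';' and keep going.
            while pos < len(tokens) and tokens[pos] != ";":
                pos += 1
            pos += 1
            nodes.append(("error", str(exc), tokens[start:pos]))
    return nodes


nodes = parse("x = 1; y = ; z = 3;")
for node in nodes:
    print(node)
```

An editor built this way can keep highlighting and completing `x` and `z` while the user is still typing the middle statement; a real implementation would also need incremental reparsing, which this sketch leaves out.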
The Berkeley Harmonia people are doing good work in this area. It is well worth your trouble to read some of their papers to get a detailed sense of the problems and one approach to handling them.
The other major approach people (notably Intentional Programming and XText) seem to be trying is object-oriented editors, where you attach editing actions to each AST node, and associate every point on the screen with an AST node. Then editing actions invoke AST-node-specific actions (insert-character, go right, go up, ...) and the node can decide how to act and how to modify the screen. Arguably you can make these editors do anything; it's a little harder in practice. I've used these editors; they don't feel like text editors. There are some enthusiastic users, but YMMV.
I think you probably ought to choose between trying to build such an editor and trying to define a new language. Doing both at once is likely to overwhelm you with troubles.

Suggestions on how to make a configurable parser

I want to build a parser for a C-like language. The interesting aspect about it is that I want to build it in such a way that someone who has access to the source can easily modify it to extend the language (a new expression type, for instance), with the extensions being runtime configurable (they can be turned on and off).
My current intent is to build a recursive descent parser as an object. Each production will be a method of an object. The method of extension will be to derive classes from this base, replacing methods (and production definitions) as needed. I'm still trying to figure out how to mix and match extensions. One idea is to play games with the v-tbl. Objects would be constructed with a v-tbl that is a copy of the base but with methods replaced from derived classes.
Aside from the bit-twiddling nature of the solution, the only issues I have with it are
a reasonable way to do the v-tbl mixup
what to do when 2 extensions alter the same productions (as most replacements will end up calling the original having one replacement call the other would work but the mechanics of setting this up are the issue)
how to allow the extension of extensions (this might end up looking like a standard MI system, but I've never got how they work)
Another solution (a slightly more mundane version of the same approach) would be to use static member variables to store function pointers and call them for the same effect.
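A minimal Python sketch of that function-pointer variant, under the assumption that each production lives in a per-parser table and that an extension wraps whatever is currently installed and falls back to it. The tiny grammar (numbers and `+`) and the names Parser, extend, parens_extension and negate_extension are invented for illustration. Because each wrapper receives the previous entry as `original`, two extensions altering the same production chain naturally, and turning an extension on or off at runtime is just a matter of whether extend is called.

```python
def parse_primary(parser, tokens, pos):
    # Base production: primary := NUMBER
    if pos < len(tokens) and tokens[pos].isdigit():
        return int(tokens[pos]), pos + 1
    raise ValueError(f"unexpected token at position {pos}")


def parse_expr(parser, tokens, pos):
    # Base production: expr := primary ('+' primary)*
    left, pos = parser.parse("primary", tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        right, pos = parser.parse("primary", tokens, pos + 1)
        left = ("+", left, right)
    return left, pos


class Parser:
    def __init__(self):
        # Per-instance production table, playing the role of the copied v-tbl.
        self.productions = {"primary": parse_primary, "expr": parse_expr}

    def parse(self, name, tokens, pos):
        return self.productions[name](self, tokens, pos)

    def extend(self, name, wrapper):
        # Install `wrapper` in front of whatever currently handles `name`.
        original = self.productions[name]
        self.productions[name] = lambda p, t, i: wrapper(original, p, t, i)


def parens_extension(original, parser, tokens, pos):
    # Adds: primary := '(' expr ')'
    if pos < len(tokens) and tokens[pos] == "(":
        node, pos = parser.parse("expr", tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("expected ')'")
        return node, pos + 1
    return original(parser, tokens, pos)


def negate_extension(original, parser, tokens, pos):
    # Adds: primary := '-' primary
    if pos < len(tokens) and tokens[pos] == "-":
        node, pos = parser.parse("primary", tokens, pos + 1)
        return ("neg", node), pos
    return original(parser, tokens, pos)


extended = Parser()
extended.extend("primary", parens_extension)
extended.extend("primary", negate_extension)
tree, end = extended.parse("expr", ["1", "+", "-", "(", "2", "+", "3", ")"], 0)
print(tree)
```

Note that the order of the extend calls decides which extension sees a token first, so two extensions that claim the same leading token still need a policy for who wins.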
Edit: I have already built a system that lets me build productions from BNF definitions. I can alter it to support whatever I decide on.
These are some of the challenges the Perl 6 design effort has faced. You may find it worthwhile looking into some of the solutions they came up with. Or you may find that to be gross overkill.
I made a configurable parser and uploaded it some time ago at
http://code.google.com/p/compparser/
The project there is not up-to-date but is working fine.
If I recall my university courses correctly, recursive descent parsers have some limitations that might bite you, especially since you're allowing extensions: somebody else's language extension could cause issues.
A proper compiler toolkit - such as the open source ANTLR - might make things easier, and might also provide some different approaches for you.
Another option is to express the parsing rules in XML or something similar, instead of in code; it is less efficient, but far more dynamically configurable. Each language or variant can just use its own (XML) file, and even include or reference other files as 'base' files.
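A minimal Python sketch of rules-as-data, using JSON rather than XML purely to keep the example short (that substitution, the rule format and every name below are assumptions). Each rule maps to a list of alternatives tried in order; a symbol that is neither NUMBER nor a rule name is treated as a literal token. A variant grammar is simply merged over the base one, which stands in for the "include a base file" idea. The sketch only recognises input; building an AST is left out.

```python
import json

# Base grammar, as it might sit in its own file.
BASE_JSON = '{"expr": [["term", "+", "expr"], ["term"]], "term": [["NUMBER"]]}'
# Variant that overrides "term" to also allow parenthesised expressions.
VARIANT_JSON = '{"term": [["NUMBER"], ["(", "expr", ")"]]}'


def match(grammar, symbol, tokens, pos):
    """Return the position after matching `symbol` at `pos`, or None on failure."""
    if symbol == "NUMBER":  # built-in token class
        return pos + 1 if pos < len(tokens) and tokens[pos].isdigit() else None
    if symbol not in grammar:  # anything that is not a rule name is a literal
        return pos + 1 if pos < len(tokens) and tokens[pos] == symbol else None
    for alternative in grammar[symbol]:  # ordered choice: first success wins
        current = pos
        for part in alternative:
            current = match(grammar, part, tokens, current)
            if current is None:
                break
        else:
            return current
    return None


def accepts(grammar, tokens, start="expr"):
    return match(grammar, start, tokens, 0) == len(tokens)


base = json.loads(BASE_JSON)
variant = {**base, **json.loads(VARIANT_JSON)}  # variant rules override base rules

print(accepts(base, ["1", "+", "2"]))
print(accepts(base, ["(", "1", ")"]))
print(accepts(variant, ["(", "1", "+", "2", ")"]))
```

Interpreting the grammar on every call is where the "less efficient" part comes from; the rule table could be compiled to closures once at load time if that matters.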
Frankly, I am not even sure I understood everything you wrote... :-)
But when I see parser and flexibility, I think about LPeg - Parsing Expression Grammars For Lua. It might not fit your needs but it is well worth a look... ;-)

Resources