Breaking public member functions into lots of private member functions

When I write a class's public member function that does several things, like:
void Level::RunLogic(...);
In that function I find myself splitting the work into several private member functions. There's no point in splitting the public member function up into several public functions, because you wouldn't do one thing without the other, and I don't want the user worrying about what to call in what order. Rather, the RunLogic() function would look something like this:
void Level::RunLogic(...) {
    DoFirstThing();
    DoSecondThing();
    DoThirdThing();
}
With the DoThing functions being private member functions. In Code Complete, Steve McConnell recommends reducing the number of functions in a class, but I'd rather not just put all that code into one function. My assumption is that his real point is that a class shouldn't have too much functionality, but I'm wondering what other programmers think regarding this.
In addition, I've been moving towards exposing less and less implementation details in my public member functions, and moving most of the work to small private member functions. Obviously this creates more functions...but that's where the question lies.
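For concreteness, here is a minimal sketch of what the corresponding class declaration might look like (Level, RunLogic, and the DoThing names come from the question; everything else is assumed):
// level.h -- sketch only: callers see one entry point, the steps stay private.
class Level {
public:
    void RunLogic();      // the only function users need to call

private:
    void DoFirstThing();  // each helper does one small, well-named piece of work
    void DoSecondThing();
    void DoThirdThing();
};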

You are right to want to keep the public method simple, and split its functionality into multiple private methods.
McConnell is right that you should reduce the number of methods you keep in a class.
Those two goals are not really at odds. I don't think McConnell would advocate making your functions longer to reduce the number of them. Rather, you should consider pushing some of those private methods into one or more utility classes that can be used by the public class.
The best way to accomplish this will depend on the details of your code, of course.
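As a rough illustration of that suggestion (all names here are hypothetical; the right split depends on the actual code), some of the helpers can be moved into a small utility class that Level delegates to:
// Sketch only: two former private helpers now live in a utility class.
class LogicSteps {
public:
    void UpdatePhysics() { /* ...first piece of the work... */ }
    void UpdateAi()      { /* ...second piece of the work... */ }
};

class Level {
public:
    void RunLogic() {
        steps_.UpdatePhysics();
        steps_.UpdateAi();
        ApplyResults();
    }

private:
    void ApplyResults() { /* ...the part that stays in Level... */ }
    LogicSteps steps_;   // fewer member functions live in Level itself
};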

I recommend breaking them into separate methods, where each method takes care of one small task, and giving each private method a descriptive name. Breaking the methods up and naming them descriptively will make the logic more readable! Compare:
double average(double[] numbers) {
    double sum = 0;
    for (double n : numbers) {
        sum += n;
    }
    return sum / numbers.length;
}
to:
double sum(double[] numbers) {
    double sum = 0;
    for (double n : numbers) sum += n;
    return sum;
}

double average(double[] numbers) {
    return sum(numbers) / numbers.length;
}
Code Complete addresses the interface that each class exposes, not the implementation.
It may make more sense to make those smaller methods package-protected, so you can easily unit test them, instead of only being able to test the complicated RunLogic.

I agree with splitting the functionality.
I've always been taught, and have stuck to the principle, that a function should perform a single encapsulated task; if it seems to be doing more than one thing, it may be worth refactoring. A class then encapsulates similar or related functions together.
Splitting these tasks up while still exposing a single public member allows, in my opinion, a class to perform that important task in the way it was intended, while making it easier to maintain. I also often find that there are multiple similar sections of code in the same complex method which can be refactored into a single generic function with parameters, improving both readability and maintainability and even reducing the amount of code.
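A small sketch of that last point (the names and the duplicated logic are invented purely for illustration):
// Before: the same clamp-and-assign logic was repeated for health and stamina.
// After: one generic helper with parameters, called from both places.
double ClampToRange(double value, double lo, double hi) {
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}

struct Player {
    double health  = 100.0;
    double stamina = 50.0;

    void Update(double healthDelta, double staminaDelta) {
        health  = ClampToRange(health + healthDelta, 0.0, 100.0);
        stamina = ClampToRange(stamina + staminaDelta, 0.0, 100.0);
    }
};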

One rule that I use is the rule of three. If I am doing the same thing three times in different places, it's worth having a separate (possibly private) member function that encapsulates that shared behavior.
The only other rule that I generally use in this situation is that of readability. If I'm looking at a member function that fills more than a full screen in my IDE, I find it hard to trace all the code paths and possible wrong turns. In that case, it's perfectly reasonable to break one monolithic member function into two or more private helper functions.
Regarding the Code Complete comment about the number of functions in a class, remember that the greatest risk involved in the number of functions is primarily linked to the complexity of the externally available interface. The more entry points you provide for other people to work with, the greater the likelihood that something will go wrong (e.g., due to a bug in one of the public members, incorrect input, etc). Adding a private member function does not increase the complexity of the public API.

You are unintentionally following this rule:
A class should have only one reason to change (the Single Responsibility Principle)
Which is a good thing.
By properly separating the responsibilities, you are making sure that your app won't easily break due to change.

I don't see any advantage specific to public member functions in keeping them as small as possible (measured in lines of code). Generally, long functions tend to be less readable, so in most cases splitting up the implementation of big functions improves readability, no matter whether they are part of a class or not.
As to the good advice of keeping classes as small as possible, this is especially important for the public part of the interface and for the amount of data encapsulated. The reason is that classes with lots of data members and public functions undo the advantages of data encapsulation.

A 'rule of thumb' that I use is that no single function should take up more than one screenful of space. Although rather simple-minded, it helps when you can 'take it all in' one screen at a time. It also limits (by definition) the complexity of any one method. When you are expected to hand over your work to a co-worker for further maintenance, upgrades and bug fixes(!), keeping it simple is good. With that in mind, I agree that keeping your public methods simple is almost always the correct choice.

Related

Performance impact of using Colon vs Dot for function declarations in large lua tables

I've developed the habit of declaring almost all of my modules' functions with a colon rather than a dot, but I don't use much OOP and almost never use "self".
It seems redundant that self gets passed as a parameter every time I call a function, especially if the tables are quite large.
Is there any performance impact with this? Is it worth changing all my function declarations to use a dot?
There's not much of a performance impact to passing a single additional table reference to a function. This is regardless of the table size, as the table doesn't get copied.
Rather than performance, this seems to be a question of programming style. It is very uncommon to use the colon-syntax for module functions, as this idiom is clearly meant to be used for actual method calls. As such, a library that uses it where it's not necessary will look very confusing to any other Lua programmer.

Using F# to Build a Highly Debuggable Business Rules Engine

The Problem
I have code in F# representing a logical tree. It’s a Business Rules Engine with some fairly simple mathematical functions. I would like to be able to run the rules of the tree many times and see how many times each specific route through the tree is taken.
The requirements are that the base rules should not be changed too much from the simple match statements I'm using at the moment. Tagging the important functions with an attribute would be fine, but adding a call to a logging function at every node is not. I want to be able to run the code in two modes, a highly performant standard mode which just gives answers, and then an "exploratory mode" which gives more detail behind each call. While I don't mind complicated code to dynamically load and profile the rules, the rules code itself must look simple. Ideally I'd like not to rely on third-party libraries (the F# PowerPack is OK). The solution must also target the .NET 4.0 runtime.
Potential Solutions
Add a logging call to every function with the function name and arguments. I don’t like this because even if I could disable it in some kind of release mode, it still clutters the rules and means all new code has to be written in an unnatural way.
Have each function return its result together with a list containing the names of the methods called so far. I don't like this because it would look unnatural and would carry a performance hit. I'm sure I could use a computation expression to do a lot of the plumbing, but that violates the requirement to keep the rules simple.
Parse the rules tree using quotations, and then build a new expression which is the old expression with a call to a logging function injected into the site of each tagged function. This is the best thing I’ve got so far, but I’m worried about compiling the resulting quotation so I can run it. I understand (please correct me if I’m wrong) that not all quotations can be compiled. I’d rather not have an unstable process that limits the rules code to a subset of the F# language. If the rules compile, I would like my solution to be able to deal with them.
I know this is a difficult problem with a fairly strict set of requirements, but if anyone has any inspiration for a solution, I would be very grateful.
Edit: Just to give an example of the sort of rules I might be using: if I owned a widget factory producing products A and B, the following simple code might be used. I don't want to lose the readability and simplicity of the formulas by adorning this layer with helper functions and hooks.
type ProductType = | ProductA | ProductB

let costOfA quantity =
    100.0 * quantity

let costOfB quantity =
    if quantity < 100.0 then
        20.0 * quantity
    else
        15.0 * quantity

let calculateCostOfProduct productType quantity =
    match productType with
    | ProductA -> costOfA quantity
    | ProductB -> costOfB quantity

Linked list single class vs multiple classes

In my second term as a computer science student, we focused almost the whole term on writing linked lists in different variations (stack, queue, ...). The design of these lists always came down to this:
class List<T> {
    class ListElement {
        T value;
        ListElement next;
    }
    ListElement root;
}
with variations to which methods were implemented and how they worked (I have left out constructors and properties for simplicity here).
Some day I started learning Scala and focusing on functional programming. This also reached the point where a linked list was written, but with a different style of implementation:
class List[T]( head: T, tail: List[T])
Despite the different syntax and the immutability, this is in my opinion a different approach.
And I thought to myself, "Well, you could have implemented lists the same way in C# or Java, with one class less than the approach you learned."
I can see why you would implement a linked list like that in a functional language, where recursion is not as dangerous as in C# or Java, because, at least to my way of thinking, a recursive implementation of all the usual linked-list methods is very intuitive for this design.
What I do not understand is why linked lists in C# or Java are typically implemented in the first fashion when you could implement them the other way with less code and equal clarity. (I am not talking about the implementation of lists in the languages' libraries, but about the lists you typically write as a programmer-to-be.)
The only benefit I can see with the first approach is that you can hide the implementation from the user a bit better, but is this the reason, and is it worth the additional class?
I wouldn't even need to expose my implementation to the user, as I could still implement my list differently internally, and perhaps only choose to provide a constructor like that along with functionality to retrieve the first element of the list as head and the rest as tail.
The reasons linked lists are "implemented in the first fashion", as you put it, include:
Performance
Time and space complexity are the two most important concerns when writing algorithms or implementing data structures that support operations like search and sort. As you mentioned, lists created the recursive way aren't mutable, and a major purpose of creating a list is to support fast operations on it. So designers prefer the 'first fashion'.
Object orientation
When solving real-world problems, the initial object-oriented analysis and design (OOAD) matters a lot. With an object model that resembles real-world objects/things as closely as possible, designers can achieve better solutions. The recursive approach seems to miss out on this aspect.
Scalability
Designers of APIs/libraries keep scalability in mind when they draft their designs. Code written in the 'first fashion' is more scalable and easier to comprehend.
Other design concerns
This is not an exhaustive list of the reasons in any way. There are many other factors, and lessons from experience in programming folklore, that lead to the choice of the first fashion.

Is there an idiomatic way to order function arguments in Erlang?

Seems like it's inconsistent in the lists module. For example, split has the number as the first argument and the list as the second, but sublist has the list as the first argument and the length as the second argument.
OK, a little history as I remember it and some principles behind my style.
As Christian has said the libraries evolved and tended to get the argument order and feel from the impulses we were getting just then. So for example the reason why element/setelement have the argument order they do is because it matches the arg/3 predicate in Prolog; logical then but not now. Often we would have the thing being worked on first, but unfortunately not always. This is often a good choice as it allows "optional" arguments to be conveniently added to the end; for example string:substr/2/3. Functions with the thing as the last argument were often influenced by functional languages with currying, for example Haskell, where it is very easy to use currying and partial evaluation to build specific functions which can then be applied to the thing. This is very noticeable in the higher order functions in lists.
The only influence we didn't have was from the OO world. :-)
Usually we at least managed to be consistent within a module, but not always; see lists again. We did try to have some consistency, so the argument order in the higher order functions in dict/sets matches that of the corresponding functions in lists.
The problem was also aggravated by the fact that we, especially me, had a rather cavalier attitude to libraries. I just did not see them as a selling point for the language, so I wasn't that worried about it. "If you want a library which does something then you just write it" was my motto. This meant that my libraries were structured, just not always with the same structure. :-) That was how many of the initial libraries came about.
This, of course, creates unnecessary confusion and breaks the law of least astonishment, but we have not been able to do anything about it. Any suggestions of revising the modules have always been met with a resounding "no".
My own personal style is usually structured, though I don't know if it conforms to any written guidelines or standards.
I generally have the thing or things I am working on as the first arguments, or at least very close to the beginning; the order depends on what feels best. If there is a global state which is chained through the whole module, which there usually is, it is placed as the last argument and given a very descriptive name like St0, St1, ... (I belong to the church of short variable names). Arguments which are chained through functions (both input and output) I try to keep the same argument order as return order. This makes it much easier to see the structure of the code. Apart from that I try to group together arguments which belong together. Also, where possible, I try to preserve the same argument order throughout a whole module.
None of this is very revolutionary, but I find if you keep a consistent style then it is one less thing to worry about and it makes your code feel better and generally be more readable. Also I will actually rewrite code if the argument order feels wrong.
A small example which may help:
fubar({f,A0,B0}, Arg2, Ch0, Arg4, St0) ->
    {A1,Ch1,St1} = foo(A0, Arg2, Ch0, St0),
    {B1,Ch2,St2} = bar(B0, Arg4, Ch1, St1),
    Res = baz(A1, B1),
    {Res,Ch2,St2}.
Here Ch is a local chained through variable while St is a more global state. Check out the code on github for LFE, especially the compiler, if you want a longer example.
This became much longer than it should have been, sorry.
P.S. I used the word thing instead of object to avoid confusion about what I was talking about.
No, there is no consistently-used idiom in the sense that you mean.
However, there are some useful relevant hints that apply especially when you're going to be making deeply recursive calls. For instance, keeping whichever arguments will remain unchanged during tail calls in the same order/position in the argument list allows the virtual machine to make some very nice optimizations.

What is the difference between a monad and a closure?

I am kind of confused reading the definitions of the two. Can they actually intersect in terms of definition, or am I completely lost? Thanks.
Closures, as the word tends to be used, are just functions (or blocks of code, if you like) that you can treat like a piece of data and pass to other functions, etc. (the "closed" bit is that wherever you eventually call it, it behaves just as it would if you called it where it was originally defined). A monad is (roughly) more like a context in which functions can be chained together sequentially, and controls how data is passed from one function to the next.
They're quite different, although monads will often use closures to capture logic.
Personally I would try to get solid on the definition of closures (essentially a piece of logic which also captures its environment, i.e. local variables etc) before worrying about monads. They can come later :)
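As a minimal sketch of that closure definition, here is a C++ lambda capturing a local variable (the language choice and the counter example are just for illustration):
#include <iostream>

int main() {
    int counter = 0;

    // The lambda "closes over" counter: it captures the local variable by
    // reference and can be stored or passed around like any other value.
    auto increment = [&counter]() { return ++counter; };

    increment();
    increment();
    std::cout << counter << '\n';  // prints 2
}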
There are various questions about closures on Stack Overflow - the best one to help you will depend on what platform you're working on. For instance, there's:
What are closures in .NET?
Function pointers, closures and lambda
Personally I'm only just beginning to "grok" monads (thanks to the book I'm helping out on). One day I'll get round to writing an article about them, when I feel I understand them well enough :)
A "closure" is an object comprising 1) a function, and 2) the values of its free variables where it's constructed.
A "monad" is a class of functions that can be composed in a certain way, i.e. by using associated bind and return higher-order function operators, to produce other functions.
I think monads are a little more complicated than closures because closures are just blocks of code that remember something from the point of their definitions and monads are a construct for "twisting" the usual function composition operation.
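To make the "composed in a certain way" part a little more concrete, here is a rough sketch of a bind-style operation over std::optional in C++. It only illustrates the chaining-with-failure-propagation pattern, not a full treatment of monads, and all names are made up:
#include <iostream>
#include <optional>
#include <string>

// A hand-rolled "bind": if the optional holds a value, feed it to f (which
// itself returns an optional); if it is empty, propagate the emptiness.
template <typename T, typename F>
auto Bind(const std::optional<T>& value, F f) -> decltype(f(*value)) {
    if (value) return f(*value);
    return std::nullopt;
}

std::optional<int> ParseInt(const std::string& s) {
    try { return std::stoi(s); } catch (...) { return std::nullopt; }
}

std::optional<int> Half(int n) {
    if (n % 2 != 0) return std::nullopt;  // fail on odd numbers
    return n / 2;
}

int main() {
    // The sequencing, with failure passed along automatically between steps,
    // is the monad-like part; the functions being chained are ordinary code.
    auto result = Bind(Bind(ParseInt("100"), Half), Half);
    if (result) std::cout << *result << '\n';  // prints 25 (100 -> 50 -> 25)
}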

Resources