Avoiding all DI antipatterns for types requiring asynchronous initialization - dependency-injection

I have a type Connections that requires asynchronous initialization. An instance of this type is consumed by several other types (e.g., Storage), each of which also requires asynchronous initialization (static, not per-instance, and these initializations also depend on Connections). Finally, my logic types (e.g., Logic) consume these storage instances. I'm currently using Simple Injector.
I've tried several different solutions, but there's always an antipattern present.
Explicit Initialization (Temporal Coupling)
The solution I'm currently using has the Temporal Coupling antipattern:
public sealed class Connections
{
Task InitializeAsync();
}
public sealed class Storage : IStorage
{
public Storage(Connections connections);
public static Task InitializeAsync(Connections connections);
}
public sealed class Logic
{
public Logic(IStorage storage);
}
public static class GlobalConfig
{
public static async Task EnsureInitialized()
{
var connections = Container.GetInstance<Connections>();
await connections.InitializeAsync();
await Storage.InitializeAsync(connections);
}
}
I've encapsulated the Temporal Coupling into a method, so it's not as bad as it could be. But still, it's an antipattern and not as maintainable as I'd like.
Abstract Factory (Sync-Over-Async)
A common proposed solution is an Abstract Factory pattern. However, in this case we're dealing with asynchronous initialization. So, I could use Abstract Factory by forcing the initialization to run synchronously, but this then adopts the sync-over-async antipattern. I really dislike the sync-over-async approach because I have several storages and in my current code they're all initialized concurrently; since this is a cloud application, changing this to be serially synchronous would increase startup time, and parallel synchronous is also not ideal due to resource consumption.
Asynchronous Abstract Factory (Improper Abstract Factory Usage)
I can also use Abstract Factory with asynchronous factory methods. However, there's one major problem with this approach. As Mark Seemann comments here, "Any DI Container worth its salt will be able to auto-wire an [factory] instance for you if you register it correctly." Unfortunately, this is completely untrue for asynchronous factories: AFAIK there is no DI container that supports this.
So, the Abstract Asynchronous Factory solution would require me to use explicit factories, at the very least Func<Task<T>>, and this ends up being everywhere ("We personally think that allowing to register Func delegates by default is a design smell... If you have many constructors in your system that depend on a Func, please take a good look at your dependency strategy."):
public sealed class Connections
{
private Connections();
public static Task<Connections> CreateAsync();
}
public sealed class Storage : IStorage
{
// Use static Lazy internally for my own static initialization
public static Task<Storage> CreateAsync(Func<Task<Connections>> connections);
}
public sealed class Logic
{
public Logic(Func<Task<IStorage>> storage);
}
This causes several problems of its own:
All my factory registrations have to pull dependencies out of the container explicitly and pass them to CreateAsync. So the DI container is no longer doing, you know, dependency injection.
The results of these factory calls have lifetimes that are no longer managed by the DI container. Each factory is now responsible for lifetime management instead of the DI container. (With the synchronous Abstract Factory, this is not an issue if the factory is registered appropriately).
Any method actually using these dependencies would need to be asynchronous - since even the logic methods must await for the storage/connections initialization to complete. This is not a big deal for me on this app since my storage methods are all asynchronous anyway, but it can be a problem in the general case.
Self Initialization (Temporal Coupling)
Another, less common, solution is to have each member of a type await its own initialization:
public sealed class Connections
{
private Task InitializeAsync(); // Use Lazy internally
// Used to be a property BobConnection
public async Task<X> GetBobConnectionAsync()
{
await InitializeAsync();
return BobConnection;
}
}
public sealed class Storage : IStorage
{
public Storage(Connections connections);
private static Task InitializeAsync(Connections connections); // Use Lazy internally
async Task<Y> IStorage.GetAsync() // explicit interface implementations take no access modifier
{
await InitializeAsync(_connections);
var connection = await _connections.GetBobConnectionAsync();
return await connection.GetYAsync();
}
}
public sealed class Logic
{
public Logic(IStorage storage);
public async Task<Y> GetAsync()
{
return await _storage.GetAsync();
}
}
The problem here is that we're back to Temporal Coupling, this time spread out throughout the system. Also, this approach requires all public members to be asynchronous methods.
So, there are really two DI design perspectives that are at odds here:
Consumers want to be able to inject instances that are ready to use.
DI containers push hard for simple constructors.
The problem is - particularly with asynchronous initialization - that if DI containers take a hard line on the "simple constructors" approach, then they are just forcing the users to do their own initialization elsewhere, which brings its own antipatterns. E.g., why Simple Injector won't consider asynchronous functions: "No, such feature does not make sense for Simple Injector or any other DI container, because it violates a few important ground rules when it comes to dependency injection." However, playing strictly "by the ground rules" apparently forces other antipatterns that seem much worse.
The question: is there a solution for asynchronous initialization that avoids all antipatterns?
Update: Complete signature for AzureConnections (referred to above as Connections):
public sealed class AzureConnections
{
public AzureConnections();
public CloudStorageAccount CloudStorageAccount { get; }
public CloudBlobClient CloudBlobClient { get; }
public CloudTableClient CloudTableClient { get; }
public async Task InitializeAsync();
}

This is a long answer. There's a summary at the end. Scroll down to the summary if you're in a hurry.
The problem you have, and the application you're building, is atypical. It’s atypical for two reasons:
you need (or rather want) asynchronous start-up initialization, and
your application framework (Azure Functions) supports asynchronous start-up initialization (or rather, there seems to be little framework surrounding it).
This makes your situation a bit different from a typical scenario, which might make it a bit harder to discuss common patterns.
However, even in your case the solution is rather simple and elegant:
Extract initialization out of the classes that hold it, and move it into the Composition Root. At that point you can create and initialize those classes before registering them in the container and feed those initialized classes into the container as part of registrations.
This works well in your particular case, because you want to do some (one-time) start-up initialization. Start-up initialization is typically done before you configure the container (or sometimes after if it requires a fully composed object graph). In most cases I’ve seen, initialization can be done beforehand, as it can in your case.
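A minimal sketch of that idea, written as Pure DI so it runs without a container. The types are simplified stand-ins for the ones in the question, and OpenBobConnectionAsync and PrepareStorageAsync are hypothetical placeholders for whatever asynchronous work the real initialization performs.

```csharp
using System.Threading.Tasks;

// Simplified stand-in: a plain constructor that receives ready-to-use values.
public sealed class Connections
{
    public Connections(string bobConnection) => BobConnection = bobConnection;

    public string BobConnection { get; }
}

public interface IStorage
{
    Task<string> GetAsync();
}

public sealed class Storage : IStorage
{
    private readonly Connections connections;

    public Storage(Connections connections) => this.connections = connections;

    public Task<string> GetAsync() =>
        Task.FromResult("value via " + connections.BobConnection);
}

public sealed class Logic
{
    private readonly IStorage storage;

    public Logic(IStorage storage) => this.storage = storage;

    public Task<string> GetAsync() => storage.GetAsync();
}

public static class CompositionRoot
{
    public static async Task<Logic> ComposeAsync()
    {
        // 1. Do all asynchronous initialization first, outside the classes.
        string bob = await OpenBobConnectionAsync();
        var connections = new Connections(bob);
        await PrepareStorageAsync(connections);

        // 2. Compose afterwards. With a container, the initialized instances
        //    would be registered here as singletons instead of newed up.
        return new Logic(new Storage(connections));
    }

    // Hypothetical placeholder for real I/O such as opening a connection.
    private static async Task<string> OpenBobConnectionAsync()
    {
        await Task.Delay(10);
        return "bob";
    }

    // Hypothetical placeholder for one-time static storage setup.
    private static Task PrepareStorageAsync(Connections connections) => Task.Delay(10);
}
```

From an async entry point this would be used as `Logic logic = await CompositionRoot.ComposeAsync();`, after which every consumer receives an instance that is ready to use and no constructor does any I/O.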
As I said, your case is a bit peculiar, compared to the norm. The norm is:
Start-up initialization is synchronous. Frameworks (like ASP.NET Core¹) typically do not support asynchronous initialization in the start-up phase.
Initialization often needs to be done per-request and just-in-time rather than per-application and ahead-of-time. Often components that need initialization have a short lifetime, which means we typically initialize such instances on first use (in other words: just-in-time).
There is usually no real benefit of doing start-up initialization asynchronously. There is no practical performance benefit because, at start-up time, there will only be a single thread running anyway (although we might parallelize this, that obviously doesn’t require async). Also note that although some application types might deadlock on doing sync-over-async, in the Composition Root we know exactly which application type we are using and whether this will be a problem. A Composition Root is always application-specific. In other words, when we have initialization in the Composition Root of a non-deadlocking application (e.g. ASP.NET Core, Azure Functions, etc.), there is typically no benefit of doing start-up initialization asynchronously, except perhaps for the sake of sticking to the advised patterns and practices.
Because you know whether sync-over-async is a problem in your Composition Root, you could even decide to do the initialization on first use and synchronously. Because the amount of initialization is finite (compared to per-request initialization) there is no practical performance impact on doing it on a background thread with synchronous blocking if you wish. All you have to do is define a Proxy class in your Composition Root that makes sure that initialization is done on first use. This is pretty much the idea that Mark Seemann proposed in his answer.
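Such a proxy could look roughly like the sketch below. IConnections, RealConnections and LazyConnectionsProxy are hypothetical names, and the sketch assumes an application type where blocking on a task cannot deadlock. Note that Lazy<T> in its default mode caches an exception thrown by the factory, so a failed initialization is not retried.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical abstraction over the connections from the question.
public interface IConnections
{
    string BobConnection { get; }
}

public sealed class RealConnections : IConnections
{
    private RealConnections(string bobConnection) => BobConnection = bobConnection;

    public string BobConnection { get; }

    public static async Task<RealConnections> CreateAsync()
    {
        await Task.Delay(10).ConfigureAwait(false); // placeholder for real I/O
        return new RealConnections("bob");
    }
}

// Lives in the Composition Root, the only place that knows blocking is safe.
public sealed class LazyConnectionsProxy : IConnections
{
    private readonly Lazy<IConnections> connections;

    public LazyConnectionsProxy(Func<Task<IConnections>> factory)
    {
        // Lazy<T> runs the factory once, on first access. Task.Run moves the
        // asynchronous work to the thread pool; the caller then blocks on it.
        connections = new Lazy<IConnections>(
            () => Task.Run(() => factory()).GetAwaiter().GetResult());
    }

    public string BobConnection => connections.Value.BobConnection;
}
```

Consumers depend on IConnections only; the Composition Root hands them `new LazyConnectionsProxy(async () => await RealConnections.CreateAsync())`, so nothing is initialized until the first member access.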
I was not familiar at all with Azure Functions, so this is actually the first application type (except Console apps of course) that I know of that actually supports async initialization. In most framework types, there is no way for users to do this start-up initialization asynchronously at all. Code running inside an Application_Start event in an ASP.NET application or in the Startup class of an ASP.NET Core application, for instance, cannot be async. Everything has to be synchronous.
On top of that, application frameworks don’t allow you to build their framework root components asynchronously. So even if DI Containers would support the concept of doing asynchronous resolves, this wouldn’t work because of the ‘lack’ of support of application frameworks. Take ASP.NET Core’s IControllerActivator for instance. Its Create(ControllerContext) method allows you to compose a Controller instance, but the return type of the Create method is object, not Task<object>. In other words, even if DI Containers would provide us with a ResolveAsync method, it would still cause blocking because ResolveAsync calls would be wrapped behind synchronous framework abstractions.
In the majority of cases, you’ll see that initialization is done per-instance or at runtime. A SqlConnection, for instance, is typically opened per request, so each request needs to open its own connection. When you want to open the connection ‘just in time’, this inevitably results in application interfaces that are asynchronous. But be careful here:
If you create an implementation that is synchronous, you should only make its abstraction synchronous in case you are sure that there will never be another implementation (or proxy, decorator, interceptor, etc.) that is asynchronous. If you invalidly make the abstraction synchronous (i.e. have methods and properties that do not expose Task<T>), you might very well have a Leaky Abstraction at hand. This might force you to make sweeping changes throughout the application when you get an asynchronous implementation later on.
In other words, with the introduction of async you have to take even more care of the design of your application abstractions. This holds for your specific case as well. Even though you might only require start-up initialization now, are you sure that the abstractions you defined (and AzureConnections as well) will never need just-in-time synchronous initialization? In case the synchronous behavior of AzureConnections is an implementation detail, you will have to make it async right away.
Another example of this is your INugetRepository. Its members are synchronous, but that is clearly a Leaky Abstraction, because the reason it is synchronous is because its implementation is synchronous. Its implementation, however, is synchronous because it makes use of a legacy NuGet package that only has a synchronous API. It’s pretty clear that INugetRepository should be completely async, even though its implementation is synchronous, because implementations are expected to communicate over the network, which is where asynchronicity makes sense.
In an application that applies async, most application abstractions will have mostly async members. When this is the case, it would be a no-brainer to make this kind of just-in-time initialization logic async as well; everything is already async.
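To illustrate the point about Leaky Abstractions, here is a sketch of an asynchronous abstraction with a synchronous implementation behind it. IPackageRepository and its member are hypothetical (they are not the actual INugetRepository members), and the dictionary lookup merely stands in for a legacy synchronous API.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical abstraction: asynchronous because implementations are
// expected to talk to the network, whatever today's implementation does.
public interface IPackageRepository
{
    Task<string> GetLatestVersionAsync(string packageId);
}

// Implementation on top of a synchronous API (simulated by a dictionary).
public sealed class LegacyPackageRepository : IPackageRepository
{
    private readonly Dictionary<string, string> versions =
        new Dictionary<string, string> { { "Example.Package", "1.2.3" } };

    public Task<string> GetLatestVersionAsync(string packageId)
    {
        // The synchronous call completes immediately; Task.FromResult wraps
        // its result so the synchronous nature stays an implementation detail.
        string version = versions.TryGetValue(packageId, out var found)
            ? found
            : "unknown";
        return Task.FromResult(version);
    }
}
```

Consumers simply await GetLatestVersionAsync, so replacing this class with a truly asynchronous implementation later does not ripple through the application.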
Summary
In case you need start-up initialization: do it before or after configuring the container. This makes composing object graphs itself fast, reliable, and verifiable.
Doing initialization before configuring the container prevents Temporal Coupling, but might mean you will have to move initialization out of the classes that require it (which is actually a good thing).
Async start-up initialization is impossible in most application types. In the other application types it is typically unnecessary.
In case you require per-request or just-in-time initialization, there is no way around having asynchronous interfaces.
Be careful with synchronous interfaces if you’re building an asynchronous application; you might be leaking implementation details.
Footnotes
¹ ASP.NET Core actually does allow async start-up initialization, but not from within the Startup class. There are several ways to achieve this: either you implement and register hosted services that contain (or delegate to) the initialization, or trigger the async initialization from within the async Main method of the Program class.

While I'm fairly sure the following isn't what you're looking for, can you explain why it doesn't address your question?
public sealed class AzureConnections
{
private readonly Task<CloudStorageAccount> storage;
public AzureConnections()
{
this.storage = Task.Factory.StartNew(InitializeStorageAccount);
// Repeat for other cloud
}
private static CloudStorageAccount InitializeStorageAccount()
{
// Do any required initialization here...
return new CloudStorageAccount( /* Constructor arguments... */ );
}
public CloudStorageAccount CloudStorageAccount
{
get { return this.storage.Result; }
}
}
In order to keep the design clear, I only implemented one of the cloud properties, but the two others could be done in a similar fashion.
The AzureConnections constructor will not block, even if it takes significant time to initialise the various cloud objects.
It will, on the other hand, start the work, and since .NET tasks behave like promises, the first time you try to access the value (using Result) it's going to return the value produced by InitializeStorageAccount.
I get the strong impression that this isn't what you want, but since I don't understand what problem you're trying to solve, I thought I'd leave this answer so at least we'd have something to discuss.

It looks like you are trying to do what I am doing with my proxy singleton class.
services.AddSingleton<IWebProxy>((sp) =>
{
//Notice the GetService outside the Task. It was locking when it was inside
var data = sp.GetService<IData>();
return Task.Run(async () =>
{
try
{
var credentials = await data.GetProxyCredentialsAsync();
if (credentials != null)
{
return new WebHookProxy(credentials);
}
else
{
return (IWebProxy)null;
}
}
catch(Exception ex)
{
throw;
}
}).Result; //Back to sync
});

Related

PerRequestLifetimeManager and Task.Factory.StartNew - Dependency Injection with Unity

How to manage new tasks with PerRequestLifeTimeManager?
Should I create another container inside a new task? (I wouldn't like to change PerRequestLifeTimeManager to PerResolveLifetimeManager/HierarchicalLifetimeManager)
[HttpPost]
public ActionResult UploadFile(FileUploadViewModel viewModel)
{
var cts = new CancellationTokenSource();
CancellationToken cancellationToken = cts.Token;
Task.Factory.StartNew(() =>
{
// _fileService = DependencyResolver.Current.GetService<IFileService>();
_fileService.ProcessFile(viewModel.FileContent);
}, cancellationToken);
}
You should read this article about DI in multi-threaded applications. Although it is written for a different DI library, you'll find most of the information applicable to the concept of DI in general. To quote a few important parts:
Dependency injection forces you to wire all dependencies together in a
single place in the application: the Composition Root. This means that
there is a single place in the application that knows about how
services behave, whether they are thread-safe, and how they should be
wired. Without this centralization, this knowledge would be scattered
throughout the code base, making it very hard to change the behavior
of a service.
In a multi-threaded application, each thread should get its own object
graph. This means that you should typically call
[Resolve<T>()] once at the beginning of the thread’s
execution to get the root object for processing that thread (or
request). The container will build an object graph with all root
object’s dependencies. Some of those dependencies will be singletons;
shared between all threads. Other dependencies might be transient; a
new instance is created per dependency. Other dependencies might be
thread-specific, request-specific, or with some other lifestyle. The
application code itself is unaware of the way the dependencies are
registered and that’s the way it is supposed to be.
The advice of building a new object graph at the beginning of a
thread, also holds when manually starting a new (background) thread.
Although you can pass on data to other threads, you should not pass on
container-controlled dependencies to other threads. On each new
thread, you should ask the container again for the dependencies. When
you start passing dependencies from one thread to the other, those
parts of the code have to know whether it is safe to pass those
dependencies on. For instance, are those dependencies thread-safe?
This might be trivial to analyze in some situations, but prevents you
to change those dependencies with other implementations, since now you
have to remember that there is a place in your code where this is
happening and you need to know which dependencies are passed on. You
are decentralizing this knowledge again, making it harder to reason
about the correctness of your DI configuration and making it easier to
misconfigure the container in a way that causes concurrency problems.
So you should not spin off new threads from within your application code itself. And you should definitely not create a new container instance, since this can cause all sorts of performance problems; you should typically have just one container instance per application.
Instead, you should pull this infrastructure logic into your Composition Root, which allows your controller's code to be simplified. Your controller code should not be more than this:
[HttpPost]
public ActionResult UploadFile(FileUploadViewModel viewModel)
{
_fileService.ProcessFile(viewModel.FileContent);
}
On the other hand, you don't want to change the IFileService implementation, because it shouldn't be its concern to do multi-threading. Instead we need some infrastructural logic that we can place in between the controller and the file service, without them having to know about this. The way to do this is by implementing a proxy class for the file service and placing it in your Composition Root:
private sealed class AsyncFileServiceProxy : IFileService {
private readonly ILogger logger;
private readonly Func<IFileService> fileServiceFactory;
public AsyncFileServiceProxy(ILogger logger, Func<IFileService> fileServiceFactory)
{
this.logger = logger;
this.fileServiceFactory = fileServiceFactory;
}
void IFileService.ProcessFile(FileContent content) {
// Run on a new thread
Task.Factory.StartNew(() => {
this.BackgroundThreadProcessFile(content);
});
}
private void BackgroundThreadProcessFile(FileContent content) {
// Here we run on a different thread and the
// services should be requested on this thread.
var fileService = this.fileServiceFactory.Invoke();
try {
fileService.ProcessFile(content);
}
catch (Exception ex) {
// logging is important, since we run on a
// different thread.
this.logger.Log(ex);
}
}
}
This class is a small piece of infrastructural logic that allows processing files on a background thread. The only thing left is to configure the container to inject our AsyncFileServiceProxy instead of the real file service implementation. There are multiple ways to do this. Here's an example:
container.RegisterType<ILogger, YourLogger>();
container.RegisterType<RealFileService>();
// RegisterType does not accept a delegate; register the factory delegate as an instance instead.
container.RegisterInstance<Func<IFileService>>(() => container.Resolve<RealFileService>());
container.RegisterType<IFileService, AsyncFileServiceProxy>();
One part, however, is missing from the equation, and this is how to deal with scoped lifestyles, such as the per-request lifestyle. Since you are running stuff on a background thread, there is no HttpContext and this basically means that you need to start some 'scope' to simulate a request (since your background thread is basically its own new request). This is however where my knowledge about Unity stops. I'm very familiar with Simple Injector and with Simple Injector you would solve this using a hybrid lifestyle (that mixes a per-request lifestyle with a lifetime-scope lifestyle) and you explicitly wrap the call to BackgroundThreadProcessFile in such a scope. I imagine the solution in Unity to be very close to this, but unfortunately I don't have enough knowledge of Unity to show you how. Hopefully somebody else can comment on this, or add an extra answer to explain how to do this in Unity.

Ninject interception in multithreaded environment

I'm trying to create an interceptor using Ninject.Extensions.Interception.DynamicProxy to log method completion times.
In a single threaded environment something like this works:
public class TimingInterceptor : SimpleInterceptor
{
readonly Stopwatch _stopwatch = new Stopwatch();
private bool _isStarted;
protected override void BeforeInvoke(IInvocation invocation)
{
_stopwatch.Restart();
if (_isStarted) throw new Exception("resetting stopwatch for another invocation => false results");
_isStarted = true;
invocation.Proceed();
}
protected override void AfterInvoke(IInvocation invocation)
{
Debug.WriteLine(_stopwatch.Elapsed);
_isStarted = false;
}
}
In multithreaded scenarios, however, this would not work because the Stopwatch is shared between invocations. How can I pass an instance of Stopwatch from BeforeInvoke to AfterInvoke so that it is not shared between invocations?
This should work just fine in a multi-threaded application, because each thread should get its own object graph. So when you start processing some task, you start with resolving a new graph and graphs should not be passed from thread to thread. This allows keeping the knowledge of what is thread-safe (and what not) centralized to the one single place in the application that wires everything up: the composition root.
When you work like this, it means that when you use this interceptor to monitor classes that are singletons (and used across threads), each thread will still get its own interceptor (when it's registered as transient), because every time you resolve you get a new interceptor (even though you reuse the same 'intercepted' instance).
This however does mean that you have to be very careful where you inject this intercepted component into, because if you inject this intercepted object into another singleton, you will be in trouble again. This particular sort of 'trouble' is called a captive dependency, a.k.a. lifestyle mismatch. It's really easy to accidentally misconfigure your container and get yourself into trouble this way, and unfortunately Ninject lacks the possibility to warn you about this.
Do note though, that your problems will disappear in case you start using decorators, instead of interceptors, because with a decorator you can keep everything in a single method. This means that even the decorator can be a singleton, without causing any threading issues. Example:
// Timing cross-cutting concern for command handlers
public class TimingCommandHandlerDecorator<TCommand> : ICommandHandler<TCommand>
{
private readonly ICommandHandler<TCommand> decoratee;
public TimingCommandHandlerDecorator(ICommandHandler<TCommand> decoratee)
{
this.decoratee = decoratee;
}
public void Handle(TCommand command)
{
var stopwatch = Stopwatch.StartNew();
this.decoratee.Handle(command);
Debug.WriteLine(stopwatch.Elapsed);
}
}
Of course, the use of decorators is often only possible when you correctly applied the SOLID principles to your design, because you often need to have some clear generic abstractions to be able to apply decorators to a large range of classes in your system. It can be daunting to use decorators efficiently in a legacy code base.
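To see why a decorator with a method-local Stopwatch is safe as a singleton, consider the following sketch. SleepCommand, SleepHandler, RecordingTimingDecorator and TimingDemo are hypothetical names introduced for the demonstration; the decorator records its timings in a ConcurrentBag instead of writing to Debug so the result can be inspected.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public interface ICommandHandler<TCommand>
{
    void Handle(TCommand command);
}

// Hypothetical command and handler, used only to have something to time.
public sealed class SleepCommand
{
    public int Milliseconds { get; set; }
}

public sealed class SleepHandler : ICommandHandler<SleepCommand>
{
    public void Handle(SleepCommand command) => Thread.Sleep(command.Milliseconds);
}

public sealed class RecordingTimingDecorator<TCommand> : ICommandHandler<TCommand>
{
    private readonly ICommandHandler<TCommand> decoratee;

    public RecordingTimingDecorator(ICommandHandler<TCommand> decoratee)
    {
        this.decoratee = decoratee;
    }

    // Thread-safe collection; the only state shared between invocations.
    public ConcurrentBag<TimeSpan> Timings { get; } = new ConcurrentBag<TimeSpan>();

    public void Handle(TCommand command)
    {
        // The Stopwatch is a local variable, so each invocation has its own.
        var stopwatch = Stopwatch.StartNew();
        this.decoratee.Handle(command);
        this.Timings.Add(stopwatch.Elapsed);
    }
}

public static class TimingDemo
{
    // Calls one shared decorator instance from several threads at once
    // and returns the number of timings that were recorded.
    public static int Run()
    {
        var handler = new RecordingTimingDecorator<SleepCommand>(new SleepHandler());
        Parallel.For(0, 8, i => handler.Handle(new SleepCommand { Milliseconds = 5 }));
        return handler.Timings.Count;
    }
}
```

Every one of the eight concurrent calls gets its own measurement, because nothing except the thread-safe bag is shared between invocations.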

How to make the Controller a single instance per application in ASP.NET MVC?

Over time controllers develop a lot of dependencies, and creating an instance of controller becomes too expensive for each request (especially with DI). Is there any solution to make controllers singletons?
Creating instances of controllers is a pretty fast and simple operation. What becomes too expensive is creating the dependencies for each request. So, what you really need is many controllers that share the same instances of their dependencies.
E.g. you have the following controller:
public class SalesController : Controller
{
private IProductRepository productRepository;
private IOrderRepository orderRepository;
public SalesController(IProductRepository productRepository,
IOrderRepository orderRepository)
{
this.productRepository = productRepository;
this.orderRepository = orderRepository;
}
// ...
}
You should configure your dependency injection framework to use the same repository instances for the whole application (keep in mind that you can then have synchronization problems). Now creating dependencies is not expensive any more. All dependencies are instantiated only once, and reused for all requests.
If you have many dependencies and you are worried about the cost of getting a reference to each dependency instance and providing these references to the controller instance (which I don't think will be very expensive), then you can group your dependencies (something like the Introduce Parameter Object refactoring):
public class SalesController : Controller
{
private ISalesService salesService;
public SalesController(ISalesService salesService)
{
this.salesService = salesService;
}
// ...
}
public class SalesService : ISalesService
{
private IProductRepository productRepository;
private IOrderRepository orderRepository;
public SalesService(IProductRepository productRepository,
IOrderRepository orderRepository)
{
this.productRepository = productRepository;
this.orderRepository = orderRepository;
}
// ...
}
Now you have a single dependency, which will be injected very quickly. If you configure your dependency injection framework to use a singleton SalesService, then all SalesControllers will reuse the same instance of the service. Creating controllers and providing their dependencies will be very fast.
So first an answer to the original question:
public void ConfigureServices(IServiceCollection services) {
// put other services bindings here
// bind all Controller classes as singletons
services.AddSingleton<HomeController, HomeController>();
// tell framework to obtain Controller instances from ServiceProvider.
services.AddMvc().AddControllersAsServices();
}
As stated in the original question, if controllers have big dependency trees consisting mainly of request Scoped or Transient dependencies then creating them separately for each request may have some footprint on scalability of your application (in Java for example Servlet instances are singletons by default exactly for this reason). While usually CPU and real time needed to create even a big dependency tree is negligible (unless you have some heavy computations or network communication in constructors of your components, which almost never is a good idea for transient or request scoped components), the memory usage footprint is something to reckon with. In case of common DB-Web apps memory is the main factor limiting number of concurrent requests that a single machine-node can handle. If every request has a separate copy of a big dependency tree, together they may consume a significant amount of memory (the other thing to watch for is initial stack size for a new thread, by the way).
The accepted answer 1220560 solves the problem as well, but I would consider it an ugly hack and it has some drawbacks: you need to create this artificial singleton service that will be used by your Controllers either as a service locator or a proxy for other services. If you have just one such singleton object for all your controllers then you are effectively hiding the real dependencies of your Controller: for example if someone wants to write a unit-test for your Controller he needs to analyse its implementation carefully to see which dependencies it actually uses, so that he knows what mocks/fakes he needs to provide in his test setup. If later you change your Controller and as a result of your change the subset of services your controller uses changes as well, it is very easy to forget to update the test setup also. This may sometimes lead to bugs that are hard to track. Contrary to this, if your dependencies are declared explicitly as constructor params, you will get a compiler error in the test setup right away. Another thing you can do is to have a separate singleton proxy/service locator for each controller, but that is a lot of hassle.
Regardless of whether you use the solution proposed by me or the one from answer #1220560, you must be careful when injecting request Scoped dependencies into singleton objects as described in https://learn.microsoft.com/en-us/aspnet/core/fundamentals/dependency-injection#registering-your-own-services right at the end of the "registering-your-own-services" section. You can find possible solutions to this problem here: how to use scoped dependency in a singleton in C# / ASP
Another thing to watch for is concurrency: singleton objects may be accessed concurrently by several threads handling different concurrent requests, so make sure to add proper synchronization to any non-thread-safe resources your singleton uses.
edit:
I've just realized the original question was about ASP.NET and this answer is for ASP.NET Core, so it probably won't work for "non-Core".

Why does one use dependency injection?

I'm trying to understand dependency injections (DI), and once again I failed. It just seems silly. My code is never a mess; I hardly write virtual functions and interfaces (although I do once in a blue moon) and all my configuration is magically serialized into a class using json.net (sometimes using an XML serializer).
I don't quite understand what problem it solves. It looks like a way to say: "hi. When you run into this function, return an object that is of this type and uses these parameters/data."
But... why would I ever use that? Note I have never needed to use object as well, but I understand what that is for.
What are some real situations in either building a website or desktop application where one would use DI? I can come up with cases easily for why someone may want to use interfaces/virtual functions in a game, but it's extremely rare (rare enough that I can't remember a single instance) to use that in non-game code.
First, I want to explain an assumption that I make for this answer. It is not always true, but quite often:
Interfaces are adjectives; classes are nouns.
(Actually, there are interfaces that are nouns as well, but I want to generalize here.)
So, e.g. an interface may be something such as IDisposable, IEnumerable or IPrintable. A class is an actual implementation of one or more of these interfaces: List or Map may both be implementations of IEnumerable.
To get to the point: Often your classes depend on each other. E.g. you could have a Database class which accesses your database (hah, surprise! ;-)), but you also want this class to do logging about accessing the database. Suppose you have another class Logger, then Database has a dependency on Logger.
So far, so good.
You can model this dependency inside your Database class with the following line:
var logger = new Logger();
and everything is fine. It is fine up to the day when you realize that you need a bunch of loggers: Sometimes you want to log to the console, sometimes to the file system, sometimes using TCP/IP and a remote logging server, and so on ...
And of course you do NOT want to change all your code (meanwhile you have gazillions of it) and replace all lines
var logger = new Logger();
by:
var logger = new TcpLogger();
First, this is no fun. Second, this is error-prone. Third, this is stupid, repetitive work for a trained monkey. So what do you do?
Obviously it's a quite good idea to introduce an interface ICanLog (or similar) that is implemented by all the various loggers. So step 1 in your code is that you do:
ICanLog logger = new Logger();
Now the declared type no longer changes; you always have one single interface to develop against. The next step is that you do not want to write new Logger() over and over again. So you move the responsibility for creating new instances to a single, central factory class, and you get code such as:
ICanLog logger = LoggerFactory.Create();
The factory itself decides what kind of logger to create. Your code doesn't care any longer, and if you want to change the type of logger being used, you change it once: Inside the factory.
Now, of course, you can generalize this factory, and make it work for any type:
ICanLog logger = TypeFactory.Create<ICanLog>();
Somewhere, this TypeFactory needs configuration data telling it which actual class to instantiate when a specific interface type is requested, so you need a mapping. Of course you can do this mapping inside your code, but then a type change means recompiling. You could also put this mapping in an XML file, for example. This allows you to change the class actually used even after compile time (!), that is, dynamically, without recompiling!
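To make the mapping idea concrete, here is a minimal sketch of such a generalized factory. It is in Java rather than C#, and everything beyond ICanLog and TypeFactory (the register method, ConsoleLogger, TcpLogger, and the String return value of log) is a hypothetical name chosen for illustration. The mapping is done in code here; reading it from an XML file would replace the register calls.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

interface ICanLog {
    // Returns the formatted line so the effect is easy to observe.
    String log(String message);
}

class ConsoleLogger implements ICanLog {
    public String log(String message) { return "console: " + message; }
}

class TcpLogger implements ICanLog {
    public String log(String message) { return "tcp: " + message; }
}

final class TypeFactory {
    // Maps a requested type to the code that creates its implementation.
    private static final Map<Class<?>, Supplier<?>> mappings = new HashMap<>();

    static <T> void register(Class<T> type, Supplier<? extends T> creator) {
        mappings.put(type, creator);
    }

    static <T> T create(Class<T> type) {
        Supplier<?> creator = mappings.get(type);
        if (creator == null) {
            throw new IllegalStateException("No mapping registered for " + type.getName());
        }
        return type.cast(creator.get());
    }
}
```

Calling code only ever asks for ICanLog; swapping ConsoleLogger for TcpLogger is a one-line change to the registration.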
To give you a useful example for this: Think of a software that does not log normally, but when your customer calls and asks for help because he has a problem, all you send to him is an updated XML config file, and now he has logging enabled, and your support can use the log files to help your customer.
And now, when you replace names a little bit, you end up with a simple implementation of a Service Locator, which is one of two patterns for Inversion of Control (since you invert control over who decides what exact class to instantiate).
All in all this reduces dependencies in your code, but now all your code has a dependency to the central, single service locator.
Dependency injection is now the next step in this line: Just get rid of this single dependency on the service locator. Instead of various classes asking the service locator for an implementation of a specific interface, you - once again - invert control over who instantiates what.
With dependency injection, your Database class now has a constructor that requires a parameter of type ICanLog:
public Database(ICanLog logger) { ... }
Now your database always has a logger to use, but it does not know any more where this logger comes from.
And this is where a DI framework comes into play: You configure your mappings once again, and then ask your DI framework to instantiate your application for you. As the Application class requires an ICanPersistData implementation, an instance of Database is injected - but for that it must first create an instance of the kind of logger which is configured for ICanLog. And so on ...
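What a DI framework does at this point can be sketched in a few lines. The following is a toy container in Java, not how any real framework is implemented: TinyContainer, map, resolve, MemoryLogger, and the method bodies are hypothetical names, and the sketch assumes every class has exactly one constructor.

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

interface ICanLog { void log(String message); }
interface ICanPersistData { String save(String data); }

class MemoryLogger implements ICanLog {
    final List<String> lines = new ArrayList<>();
    public void log(String message) { lines.add(message); }
}

class Database implements ICanPersistData {
    private final ICanLog logger;
    Database(ICanLog logger) { this.logger = logger; }
    public String save(String data) {
        logger.log("saving " + data);
        return "saved " + data;
    }
}

class Application {
    private final ICanPersistData store;
    Application(ICanPersistData store) { this.store = store; }
    String run() { return store.save("order-1"); }
}

final class TinyContainer {
    // Configured mappings: requested abstraction -> class to instantiate.
    private final Map<Class<?>, Class<?>> mappings = new HashMap<>();

    <T> void map(Class<T> abstraction, Class<? extends T> implementation) {
        mappings.put(abstraction, implementation);
    }

    // Creates the requested type, first resolving each constructor parameter recursively.
    <T> T resolve(Class<T> type) {
        Class<?> concrete = mappings.getOrDefault(type, type);
        Constructor<?>[] ctors = concrete.getDeclaredConstructors();
        if (ctors.length == 0) {
            throw new IllegalStateException("No mapping for " + type.getName());
        }
        try {
            Constructor<?> ctor = ctors[0];
            ctor.setAccessible(true);
            Class<?>[] paramTypes = ctor.getParameterTypes();
            Object[] args = new Object[paramTypes.length];
            for (int i = 0; i < paramTypes.length; i++) {
                args[i] = resolve(paramTypes[i]);
            }
            return type.cast(ctor.newInstance(args));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create " + concrete.getName(), e);
        }
    }
}
```

Asking the container for Application makes it create a MemoryLogger, then a Database that receives it, then the Application itself. None of the three classes references the container.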
So, to cut a long story short: Dependency injection is one of two ways to remove dependencies in your code. It is very useful for configuration changes after compile-time, and it is a great thing for unit testing (as it makes it very easy to inject stubs and / or mocks).
In practice, there are things you can not do without a service locator (e.g., if you do not know in advance how many instances you do need of a specific interface: A DI framework always injects only one instance per parameter, but you can call a service locator inside a loop, of course), hence most often each DI framework also provides a service locator.
But basically, that's it.
P.S.: What I described here is a technique called constructor injection, there is also property injection where not constructor parameters, but properties are being used for defining and resolving dependencies. Think of property injection as an optional dependency, and of constructor injection as mandatory dependencies. But discussion on this is beyond the scope of this question.
I think a lot of times people get confused about the difference between dependency injection and a dependency injection framework (or a container as it is often called).
Dependency injection is a very simple concept. Instead of this code:
public class A {
private B b;
public A() {
this.b = new B(); // A *depends on* B
}
public void DoSomeStuff() {
// Do something with B here
}
}
public static void Main(string[] args) {
A a = new A();
a.DoSomeStuff();
}
you write code like this:
public class A {
private B b;
public A(B b) { // A now takes its dependencies as arguments
this.b = b; // look ma, no "new"!
}
public void DoSomeStuff() {
// Do something with B here
}
}
public static void Main(string[] args) {
B b = new B(); // B is constructed here instead
A a = new A(b);
a.DoSomeStuff();
}
And that's it. Seriously. This gives you a ton of advantages. Two important ones are the ability to control functionality from a central place (the Main() function) instead of spreading it throughout your program, and the ability to more easily test each class in isolation (because you can pass mocks or other faked objects into its constructor instead of a real value).
The drawback, of course, is that you now have one mega-function that knows about all the classes used by your program. That's what DI frameworks can help with. But if you're having trouble understanding why this approach is valuable, I'd recommend starting with manual dependency injection first, so you can better appreciate what the various frameworks out there can do for you.
As the other answers stated, dependency injection is a way to create your dependencies outside of the class that uses it. You inject them from the outside, and take control about their creation away from the inside of your class. This is also why dependency injection is a realization of the Inversion of control (IoC) principle.
IoC is the principle, whereas DI is the pattern. In my experience, the reason that you might "need more than one logger" never actually comes up; the actual reason is that you really need DI whenever you test something. An example:
My Feature:
When I look at an offer, I want to mark that I looked at it automatically, so that I don't forget to do so.
You might test this like this:
[Test]
public void ShouldUpdateTimeStamp()
{
// Arrange
var formdata = { . . . }
// System under Test
var weasel = new OfferWeasel();
// Act
var offer = weasel.Create(formdata);
// Assert
offer.LastUpdated.Should().Be(new DateTime(2013,01,13,13,01,0,0));
}
So somewhere in the OfferWeasel, it builds you an offer Object like this:
public class OfferWeasel
{
public Offer Create(Formdata formdata)
{
var offer = new Offer();
offer.LastUpdated = DateTime.Now;
return offer;
}
}
The problem here is that this test will most likely always fail, since the date being set will differ from the date being asserted. Even if you just put DateTime.Now in the test code, it might be off by a couple of milliseconds, so the test would still fail. A better solution is to create an interface that allows you to control what time will be set:
public interface IGotTheTime
{
DateTime Now {get;}
}
public class CannedTime : IGotTheTime
{
public DateTime Now {get; set;}
}
public class ActualTime : IGotTheTime
{
public DateTime Now {get { return DateTime.Now; }}
}
public class OfferWeasel
{
private readonly IGotTheTime _time;
public OfferWeasel(IGotTheTime time)
{
_time = time;
}
public Offer Create(Formdata formdata)
{
var offer = new Offer();
offer.LastUpdated = _time.Now;
return offer;
}
}
The Interface is the abstraction. One is the REAL thing, and the other one allows you to fake some time where it is needed. The test can then be changed like this:
[Test]
public void ShouldUpdateTimeStamp()
{
// Arrange
var date = new DateTime(2013, 01, 13, 13, 01, 0, 0);
var formdata = { . . . }
var time = new CannedTime { Now = date };
// System under test
var weasel= new OfferWeasel(time);
// Act
var offer = weasel.Create(formdata);
// Assert
offer.LastUpdated.Should().Be(date);
}
This way, you have applied the "inversion of control" principle by injecting a dependency (getting the current time). The main reason to do this is easier isolated unit testing. There are other ways of doing it: for example, an interface and a class are unnecessary here, since in C# functions can be passed around as variables, so instead of an interface you could use a Func<DateTime> to achieve the same. Or, if you take a dynamic approach, you just pass any object that has the equivalent method (duck typing), and you don't need an interface at all.
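The function-instead-of-interface variant might look roughly like this. The sketch is in Java, where Supplier<Instant> plays the role that Func<DateTime> would play in C#; the field names and the parameterless create method are simplifications made for illustration.

```java
import java.time.Instant;
import java.util.function.Supplier;

class Offer {
    Instant lastUpdated;
}

class OfferWeasel {
    // The clock is injected as a plain function instead of an interface.
    private final Supplier<Instant> now;

    OfferWeasel(Supplier<Instant> now) {
        this.now = now;
    }

    Offer create() {
        Offer offer = new Offer();
        offer.lastUpdated = now.get();
        return offer;
    }
}
```

Production code would pass Instant::now, while a test passes a lambda that returns a fixed instant, so no CannedTime or ActualTime class is needed.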
You will hardly ever need more than one logger. Nonetheless, dependency injection is essential in statically typed languages such as Java or C#.
And...
It should also be noted that an object can only properly fulfill its purpose at runtime, if all its dependencies are available, so there is not much use in setting up property injection. In my opinion, all dependencies should be satisfied when the constructor is being called, so constructor-injection is the thing to go with.
I think the classic answer is to create a more decoupled application, which has no knowledge of which implementation will be used during runtime.
For example, we're a central payment provider, working with many payment providers around the world. However, when a request is made, I have no idea which payment processor I'm going to call. I could program one class with a ton of switch cases, such as:
class PaymentProcessor{
private String type;
public PaymentProcessor(String type){
this.type = type;
}
public void authorize(){
if (type.equals(Consts.PAYPAL)){
// Do this;
}
else if(type.equals(Consts.OTHER_PROCESSOR)){
// Do that;
}
}
}
Now imagine having to maintain all this code in a single class because it's not decoupled properly: for every new processor you support, you'll need to add a new if/switch case to every method, and this only gets more complicated. However, by using Dependency Injection (or Inversion of Control, as it's sometimes called, meaning that whoever controls the running of the program is known only at runtime, and not at compilation), you could achieve something very neat and maintainable.
class PaypalProcessor implements PaymentProcessor{
public void authorize(){
// Do PayPal authorization
}
}
class OtherProcessor implements PaymentProcessor{
public void authorize(){
// Do other processor authorization
}
}
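class PaymentFactory{
public static PaymentProcessor create(String type){
switch(type){
case Consts.PAYPAL:
return new PaypalProcessor();
case Consts.OTHER_PROCESSOR:
return new OtherProcessor();
default:
// Fail loudly on an unsupported processor type
throw new IllegalArgumentException("Unknown processor type: " + type);
}
}
}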
interface PaymentProcessor{
void authorize();
}
** The code won't compile, I know :)
The main reason to use DI is that you want to put the responsibility for knowing the implementation where that knowledge actually lives. The idea of DI is very much in line with encapsulation and design by interface.
If the front end asks the back end for some data, then it is unimportant to the front end how the back end resolves that question. That is up to the request handler.
That has been common in OOP for a long time. Many times it means creating code pieces like:
I_Dosomething x = new Impl_Dosomething();
The drawback is that the implementation class is still hardcoded, hence the front end knows which implementation is used. DI takes design by interface one step further, so that the only thing the front end needs to know is the interface.
In between DIY and DI is the service locator pattern, where the front end has to provide a key (present in the registry of the service locator) to have its request resolved.
Service locator example:
I_Dosomething x = ServiceLocator.returnDoing(String pKey);
DI example:
I_Dosomething x = DIContainer.returnThat();
One of the requirements of DI is that the container must be able to find out which class is the implementation of which interface. Hence a DI container requires a strongly typed design and only one implementation for each interface at the same time. If you need more implementations of an interface at the same time (like a calculator), you need the service locator or factory design pattern.
D(b)I: Dependency Injection and Design by Interface.
This restriction is not a very big practical problem though. The benefit of using D(b)I is that it serves communication between the client and the provider. An interface is a perspective on an object or a set of behaviours. The latter is crucial here.
I prefer the administration of service contracts together with D(b)I in coding. They should go together. The use of D(b)I as a technical solution without organizational administration of service contracts is not very beneficial in my point of view, because DI is then just an extra layer of encapsulation. But when you can use it together with organizational administration you can really make use of the organizing principle D(b)I offers.
It can help you in the long run to structure communication with the client and other technical departments on topics such as testing, versioning and the development of alternatives. When you have an implicit interface, as in a hardcoded class, it is much less communicable over time than when you make it explicit using D(b)I. It all boils down to maintenance, which is over time and not at a time. :-)
Quite frankly, I believe people use these Dependency Injection libraries/frameworks because they just know how to do things in runtime, as opposed to load time. All this crazy machinery can be substituted by setting your CLASSPATH environment variable (or other language equivalent, like PYTHONPATH, LD_LIBRARY_PATH) to point to your alternative implementations (all with the same name) of a particular class. So in the accepted answer you'd just leave your code like
var logger = new Logger() //sane, simple code
And the appropriate logger will be instantiated because the JVM (or whatever other runtime or .so loader you have) would fetch it from the class configured via the environment variable mentioned above.
No need to make everything an interface, no need to have the insanity of spawning broken objects to have stuff injected into them, no need to have insane constructors with every piece of internal machinery exposed to the world. Just use the native functionality of whatever language you're using instead of coming up with dialects that won't work in any other project.
P.S.: This is also true for testing/mocking. You can very well just set your environment to load the appropriate mock class, in load time, and skip the mocking framework madness.

DI Container and custom-scoped state in legacy system

I believe I understand the basic concepts of DI / IoC containers, having written a couple of applications using them and read a lot of Stack Overflow answers as well as Mark Seemann's book. There are still some cases that I have trouble with, especially when it comes to integrating a DI container into a large existing architecture where the DI principle hasn't really been used (think big ball of mud).
I know the ideal scenario is to have a single composition root / object graph per operation but in a legacy system this might not be possible without major refactoring (only the new and some select refactored old parts of the code could have dependencies injected through constructor and the rest of the system using the container as a service locator to interact with the new parts). This effectively means that a stack trace deep within an operation might include several object graphs with calls being made back and forth between new subsystems (single object graph until exiting into an old segment) and traditional subsystems (service locator call at some point to code under DI container).
With the (potentially faulty, I might be overthinking this or be completely wrong in assuming this kind of hybrid architecture is a good idea) assumptions out of the way, here's the actual problem:
Let's say we have a thread pool executing scheduled jobs of various types defined in database (or any external place). Each separate type of scheduled job is implemented as a class inheriting a common base class. When the job is started, it gets fed the information about which targets it should write its log messages to and the configuration it should use. The configuration could probably be handled by just passing the values as method parameters to whatever class needs them but if the job implementation gets larger than say 10-20 classes, it doesn't seem very handy.
Logging is the larger problem. Subsystems the job calls probably also need to write things to the log and usually in examples this is done by just requesting instance of ILog in the constructor. But how does that work in this case when we don't know the details / implementation until runtime? Since:
Due to (non DI container controlled) legacy system segments in the call chain (-> there potentially being multiple separate object graphs), child container cannot be used to inject the custom logger for specific sub-scope
Manual property injection would basically require the complete call chain (including all legacy subsystems) to be updated
A simplified example to help better perceive the problem:
class JobXImplementation : JobBase {
// through constructor injection
ILoggerFactory _loggerFactory;
JobXExtraLogic _jobXExtras;
public void Run(JobConfig configurationFromDatabase)
{
ILog log = _loggerFactory.Create(configurationFromDatabase.targets);
// if there were no legacy parts in the call chain, I would register log as instance to a child container and Resolve next part of the call chain and everyone requesting ILog would get the correct logging targets
// do stuff
_jobXExtras.DoStuff(configurationFromDatabase, log);
}
}
class JobXExtraLogic {
public void DoStuff(JobConfig configurationFromDatabase, ILog log) {
// call to legacy sub-system
var old = new OldClass(log, configurationFromDatabase.SomeRandomSetting);
old.DoOldStuff();
}
}
class OldClass {
public void DoOldStuff() {
// moar stuff
var old = new AnotherOldClass();
old.DoMoreOldStuff();
}
}
class AnotherOldClass {
public void DoMoreOldStuff() {
// call to a new subsystem
var newSystemEntryPoint = DIContainerAsServiceLocator.Resolve<INewSubsystemEntryPoint>();
newSystemEntryPoint.DoNewStuff();
}
}
class NewSubsystemEntryPoint : INewSubsystemEntryPoint {
public void DoNewStuff() {
// want to log something...
}
}
I'm sure you get the picture by this point.
Instantiating old classes through DI is a non-starter since many of them use (often multiple) constructors to inject values instead of dependencies and would have to be refactored one by one. The caller basically implicitly controls the lifetime of the object and this is assumed in the implementations (the way they handle internal object state).
What are my options? What other kinds of problems could you possibly see in a situation like this? Is trying to only use constructor injection in this kind of environment even feasible?
Great question. In general, I would say that an IoC container loses a lot of its effectiveness when only a portion of the code is DI-friendly.
Books like Working Effectively with Legacy Code and Dependency Injection in .NET both talk about ways to tease apart objects and classes to make DI viable in code bases like the one you described.
Getting the system under test would be my first priority. I'd pick a functional area to start with, one with few dependencies on other functional areas.
I don't see a problem with moving beyond constructor injection to setter injection where it makes sense, and it might offer you a stepping stone to constructor injection. Adding a property is usually less invasive than changing an object's constructor.
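As a rough sketch of that stepping stone, here is what adding an injectable property to a legacy class could look like. It is written in Java, so the property becomes a setter; NullLog, RecordingLog, setLog, and the method bodies are hypothetical and only loosely mirror the OldClass from the question.

```java
import java.util.Objects;

interface ILog {
    void write(String message);
}

// Default used when nobody injects a logger: existing callers keep working unchanged.
class NullLog implements ILog {
    public void write(String message) { }
}

// Simple in-memory logger, useful in tests.
class RecordingLog implements ILog {
    final StringBuilder text = new StringBuilder();
    public void write(String message) { text.append(message).append('\n'); }
}

class OldClass {
    private ILog log = new NullLog();

    // Setter injection: the constructor signature stays as it was.
    void setLog(ILog log) {
        this.log = Objects.requireNonNull(log);
    }

    void doOldStuff() {
        log.write("old stuff done");
    }
}
```

Because the default is a no-op logger, call sites that still use new OldClass() are unaffected, and the setter can later be replaced by a constructor parameter once the callers have been migrated.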

Resources