Factory classes vs closures in Zend Framework 2

Is it better to use factory classes or closures in Zend Framework 2, and why?
I know that closures cannot be serialized, but if you return them from Module#getServiceConfig(), this will not affect the caching of the rest of your configuration data, and the closures would be cached in your opcode cache anyway.
How does performance differ in constructing a factory class vs executing a closure? Does PHP wrap and instantiate closures only when you execute them, or would it do this for every closure defined in your configuration file on every request?
Has anyone compared the execution times of each method?
See also:
Dependency management in Zend Framework 2 MVC applications
Passing forms vs raw input to service layer

PHP will convert the anonymous functions in your configuration to instances of the Closure class at compile time, so it would do this on every request. This is unlike create_function, which creates the function at run time. However, since this happens at compile time, the closures should end up in your opcode cache and therefore should not matter.
In terms of the performance impact of constructing a service using a factory vs a closure, first remember that the service will only ever be constructed once per request, regardless of how many times you ask for it. I ran a quick benchmark of fetching a service using a closure and a factory, and here is what I got (I ran it a few times and the results were always about the same):
Closure: 0.026999999999999ns
Factory: 0.30200000000002ns
Those are nanoseconds, i.e. 10⁻⁹ seconds. Basically, the performance difference is so small that there is no effective difference.
Also, ZF2 can't cache my whole module's configuration if it contains closures. If I use only factories, then my whole configuration can be merged and cached, and a single file can be read on each request, rather than having to load and merge configuration files every time. I've not measured the performance impact of this, but I'd guess it is marginal in any case.
However, I prefer factories mainly for readability and maintainability. With factories you don't end up with some massive configuration file with closures all over the place.
Sure, closures are great for rapid development, but if you want your code to be readable and maintainable, I'd say stick with factories.
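To make the comparison concrete, here is a minimal sketch of the same hypothetical service wired up both ways (the Mailer names are invented; the FactoryInterface contract is ZF2's own):

// Closure style, returned from Module::getServiceConfig(). Closures are not
// serializable, so this part of the merged configuration cannot be cached:
return array(
    'factories' => array(
        'MyModule\Service\Mailer' => function ($services) {
            return new \MyModule\Service\Mailer($services->get('Config'));
        },
    ),
);

// Factory-class style. The configuration only holds a class-name string,
// so it stays fully serializable and cacheable:
namespace MyModule\Service;

use Zend\ServiceManager\FactoryInterface;
use Zend\ServiceManager\ServiceLocatorInterface;

class MailerFactory implements FactoryInterface
{
    public function createService(ServiceLocatorInterface $services)
    {
        return new Mailer($services->get('Config'));
    }
}

// module.config.php:
// 'factories' => array('MyModule\Service\Mailer' => 'MyModule\Service\MailerFactory'),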

Related

Simple Injector, dependency injection of services and multiple instances

I am using Simple Injector as my DI library. I have controllers in my ASP.NET MVC site that take services in their constructors via this library. When I look at the Diagnostic Tools in Visual Studio and inspect the managed memory, I see multiple instances of the same service.
var container = new Container();
container.Options.DefaultScopedLifestyle = new WebRequestLifestyle();
container.RegisterWebApiControllers(GlobalConfiguration.Configuration);
container.RegisterMvcControllers(Assembly.GetExecutingAssembly());
container.RegisterMvcIntegratedFilterProvider();
RegisterComponents(container);
container.Verify();
DependencyResolver.SetResolver(new SimpleInjectorDependencyResolver(container));
My question is: is this by design? I figured one IPaymentsService would be used throughout all the controllers, but I have a count of 187? I would think it should be 1.
I am thinking about adding the line below. It seems to be working fine, and now I see 700,000 KB less memory used and a 10+ second faster load time on the site. Is there any downside to this?
container.Options.DefaultLifestyle = Lifestyle.Scoped;
Let me start with the basics, most of which you will likely be familiar with, but let’s do it for completeness and correctness.
With Simple Injector, Transient means short-lived and not cached. If service X is registered as Transient and multiple components depend on X, each gets its own new instance. Even if X implements IDisposable, it is neither tracked nor disposed of. After creation, Simple Injector forgets about Transients immediately.
Within a specifically defined scope (e.g. a web request), Simple Injector will only create a single instance of a service registered as Scoped. If service X is Scoped and multiple components depend on X, all components that are created within that same scope get the same instance of X. Scoped instances are tracked, and if they implement IDisposable (or IAsyncDisposable), Simple Injector calls their Dispose method when the scope is disposed of. In the context of a web request, disposal of scopes (and therefore of Scoped components) is managed for you by the Simple Injector integration packages.
With Singleton, Simple Injector will ensure there is at most one instance of that service within a single Container instance. If you have multiple Container instances (which you typically wouldn't have in production, but are more likely to have during testing), each Container instance gets its own cache of Singletons.
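Expressed as registrations, a minimal sketch (the service and implementation names here are hypothetical):

var container = new Container();
container.Options.DefaultScopedLifestyle = new WebRequestLifestyle();

// Transient: a new instance every time IOrderParser is injected.
container.Register<IOrderParser, OrderParser>(Lifestyle.Transient);

// Scoped: at most one instance per web request; tracked and disposed with the scope.
container.Register<IUnitOfWork, EfUnitOfWork>(Lifestyle.Scoped);

// Singleton: at most one instance for the lifetime of this Container.
container.Register<IClock, SystemClock>(Lifestyle.Singleton);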
The described behavior is specific to Simple Injector. Other DI Containers might have different behavior and definitions when it comes to these lifestyles:
Simple Injector considers a Transient component short-lived and possibly stateful, while the ASP.NET Core DI container (MS.DI) considers Transient components to be stateless, living for any possible duration. Because of this different view, with MS.DI Transients can be injected into Singletons, while with Simple Injector they can't: Simple Injector will give a diagnostic error when you call Verify().
Transients are not disposed of by Simple Injector. That's why you get a diagnostic error when you register a Transient that implements IDisposable. Again, this is very different with some other DI Containers. With MS.DI, Transients are tracked and disposed of when the scope from which they are resolved is disposed of. There are pros and cons to this. An important con is that this leads to accidental memory leaks when resolving disposable Transients from the root container, because MS.DI will store those Transients forever.
About choosing the right lifestyle for a component:
Choosing the correct lifestyle (Transient/Scoped/Singleton) for a component is a delicate matter.
If a component contains state that should be reused throughout the whole application, you might want to register that component as Singleton -or- move the state out of the component and hide it behind a service that can itself be registered as Singleton.
The expected lifetime of a component should never exceed that of its consumers. Transient is the shortest lifestyle, while Singleton is the longest. This means that a Singleton component should only depend on other Singletons, while Transient components can depend on Transient, Scoped, and Singleton components alike. Scoped components should typically only have Scoped and Singleton dependencies.
The previous rule, however, is quite strict, which is why with Simple Injector v5 we decided to allow Scoped components to take a dependency on Transient components as well. This will typically be fine when scopes live for a short period of time (as they do with web requests in web applications). If, however, you have long-running operations that operate within a single scope but regularly call back into the container to resolve new instances, having Scoped instances depend on (stateful) Transients could certainly cause trouble; that is, however, not a very common use case.
Failing to adhere to the rule of “components should only depend on equal or longer-lived components,” results in Captive Dependencies. Simple Injector calls them “Lifestyle Mismatches,” but it’s the same thing.
A stateless component with no dependencies can be registered as Singleton.
The lifestyle of a stateless component with dependencies depends on the lifestyle of its dependencies. If it has dependencies that are either Scoped or Transient, it should itself be either Scoped or Transient. If it were registered as Singleton, its dependencies would become Captive Dependencies (see the sketch below).
If a stateless component only has Singleton dependencies, it can be registered as Singleton as well.
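For example, the following hypothetical registration would produce such a Captive Dependency, which Simple Injector reports as a Lifestyle Mismatch when you call Verify():

// ReportGenerator (Singleton) takes an IUnitOfWork (Scoped) in its constructor:
container.Register<IUnitOfWork, EfUnitOfWork>(Lifestyle.Scoped);
container.Register<IReportGenerator, ReportGenerator>(Lifestyle.Singleton);

// container.Verify() now reports a Lifestyle Mismatch: the single ReportGenerator
// would hold on to one IUnitOfWork instance long after its web request has ended.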
When it comes to selecting the correct lifestyles for your components in the application, there are two basic Composition Models to choose from, namely the Ambient Composition Model and the Closure Composition Model. I wrote a series of five blog posts on this starting here. With the Ambient Composition Model you’ll make all components in your application stateless and store state outside of the object graphs. This allows you to register almost all your components as Singleton, but it does lead to complications and likely a somewhat different design of your application.
It's, therefore, more likely that you are applying the second Composition Model: the Closure Composition Model. This is the most common model, used by most developers and pushed by most application frameworks (e.g. ASP.NET Core DI). With the Closure Composition Model, you would typically register most of your components as Transient. Only the few components in your application that do contain state would be registered as either Scoped or Singleton. Although you could certainly "tune" the composition by looking at the consumers and dependencies of components and decide to promote a lifestyle (to Scoped or even Singleton) to prevent unneeded creation of instances, the downside of this is that it is more fragile.
For instance, if you have a stateless component X that depends on a Singleton component Y, you can make component X a Singleton as well. But once Y requires a Scoped or Transient dependency of its own, you will not only have to adjust the lifestyle of Y, but that of X as well. This could cascade up the dependency chain. So instead, with the Closure Composition Model, it is typically normal to keep things "Transient unless."
About performance:
Simple Injector is highly optimized, and it would typically not make much difference if a few extra components are created, especially if they are stateless. When running a 32-bit process, such a class consumes 8 + (4 * number-of-dependencies) bytes of memory. In other words, a stateless component with 1 dependency consumes 12 bytes of memory, while a component with 5 dependencies consumes 28 bytes (assuming a 32-bit process; multiply this by 2 under 64-bit).
On the other hand, managing and composing Scoped instances comes with its own overhead. Although Simple Injector is highly tuned in this regard, Scoped instances need to be cached and resolved from the scope’s internal dictionary. This comes at a cost. This means that creating a component with no dependencies a few times as Transient in a graph is likely faster than having it resolved as Scoped.
Under normal conditions, you wouldn’t have to worry about the amount of extra memory and the amount of extra CPU it takes to produce those extra Transient instances. But perhaps you are not under normal conditions. The following abnormal conditions could cause trouble:
If you violate the simple-injection-constructors rule: When a component's constructor does more than simply storing its supplied dependencies (for instance calling them, doing I/O, or doing something CPU- or memory-intensive), creating extra Transient instances can hurt a lot. You should absolutely try to stay away from this situation whenever possible.
Your application creates massive object graphs: If object graphs are really big, you'll likely see certain components being depended upon multiple (or even many) times in the graph. If the graph is massive (thousands of individual instances), this could lead to the creation of hundreds or even thousands of extra objects, especially when those components have Transient dependencies of their own. This situation often happens when components have many dependencies. If, for instance, your application's components regularly have more than 5 dependencies, you'll quickly see the size of the object graph explode. Important to note here is that this is typically caused by a violation of the Single Responsibility Principle. Components get many dependencies when they are doing too much, taking on too many responsibilities. This easily causes them to have many dependencies, and when their dependencies have many dependencies of their own, things can explode quite easily. The real solution in that case is to make your components smaller. For instance, if you have classes like "OrderService" and "CustomerService", they will likely have a hodgepodge of functionality and a big list of dependencies. This causes a myriad of problems, big object graphs being one of them. Fixing this in an existing application, however, is typically not easy; it requires a different design and a different mindset.
In these kinds of scenarios changing a component’s lifestyle can be beneficial for the performance of the application. You already seem to have established this in your application. In general, changing the lifestyle from Transient to Scoped is a pretty safe change. This is why Simple Injector v5 doesn’t complain anymore when you inject Transient dependencies into Scoped consumers.
This will not be the case, however, when you have a stateful Transient component whose consumers each expect to get their own state; in that case, changing it to Scoped would in fact break your application. However, this is not a design that I typically endorse. Although I've seen this type of composition a few times in the past, I never do this in my applications; IMO it leads to unneeded complexity.
TL;DR
Long story short, there are a lot of factors to consider, and perhaps a lot of places in the application where the design could be improved, but in general (especially in the context of web requests) changing the lifestyle of stateless components from Transient to Scoped is usually pretty safe. If this results in a big performance win in your application, you can certainly consider making Scoped the default lifestyle.

How to pass thread local variable in Project Reactor

I started using Project Reactor. Does anyone know how I can pass thread-local variables from one thread to another? I saw some methods on Hooks.java but could not figure out what the recommended way of doing this is. Can someone point me to some documentation or a code snippet on how to do it? Thanks.
I have a working example in this github repository based on the spring-cloud-sleuth's implementation: https://github.com/gumartinm/JavaForFun/tree/master/SpringJava/WebReactive/spring-webreactive-reactor-context-enrich
The key classes are: ContextCoreSubscriber.java, SubscriberContext.java, ThreadContextEnrichmentAutoConfiguration.java and UsernameFilter.java
ContextCoreSubscriber.java:
Enables you to fill the Mapped Diagnostic Context: MDC
SubscriberContext.java:
Helper class for inserting data in the Reactor's Context.
ThreadContextEnrichmentAutoConfiguration.java:
In charge of configuring the Reactor's Hooks: Hooks.onEachOperator
UsernameFilter.java:
Example where we want to register the username information based on some HTTP header.
Reactor doesn't guarantee that the processing done by a Flux or Mono chain of operators will stick to a single thread. On the contrary, it performs work-stealing and lets the user switch execution contexts.
As such, using ThreadLocal is not a good fit for Reactor.
There is currently some work being done in 3.1.0 toward providing an equivalent, at least for library authors that use Reactor, but nothing definite is in place yet.
Keep your eyes peeled for 3.1.0; that should be the main theme of that release (and will probably be the focus of the second upcoming milestone, M2).
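For reference, the equivalent that eventually shipped is the Reactor Context, which travels with the subscription rather than the thread. A minimal sketch using the current operator names (deferContextual/contextWrite; in 3.1 this first surfaced as subscriberContext):

import reactor.core.publisher.Mono;

public class ContextDemo {
    public static void main(String[] args) {
        String greeting = Mono.deferContextual(ctx ->
                // read a value carried by the subscription, not by the thread
                Mono.just("Hello " + ctx.get("username")))
            // the Context is written downstream and propagates upstream
            .contextWrite(ctx -> ctx.put("username", "alice"))
            .block();
        System.out.println(greeting); // prints "Hello alice"
    }
}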

Dagger @Reusable scope vs @Singleton

From the User's Guide:
Sometimes you want to limit the number of times an @Inject-constructed class is instantiated or a @Provides method is called, but you don't need to guarantee that the exact same instance is used during the lifetime of any particular component or subcomponent.
Why would I use that instead of @Singleton?
Use @Singleton if you rely on singleton behavior and guarantees. Use @Reusable if an object would only be a @Singleton for performance reasons.
@Reusable bindings have much more in common with unscoped bindings than @Singleton bindings: You're telling Dagger that you'd be fine creating a brand-new object, but if there's a convenient object already created then Dagger may use that one. In contrast, @Singleton objects guarantee that you will always receive the same instance, which can be much more expensive to enforce.
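In code, the difference is just the scope annotation on the binding. A rough sketch with hypothetical types (JsonMapper and ConnectionPool are invented):

import dagger.Module;
import dagger.Provides;
import dagger.Reusable;
import javax.inject.Singleton;

@Module
public class AppModule {
  // Stateless mapper: any instance is as good as any other, so instances
  // may be conserved without the cost of double-checked locking.
  @Provides @Reusable
  static JsonMapper provideJsonMapper() {
    return new JsonMapper();
  }

  // Stateful pool: "exactly one instance" is a real requirement here.
  @Provides @Singleton
  static ConnectionPool provideConnectionPool() {
    return new ConnectionPool();
  }
}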
In general, Dagger and DI prefer unscoped objects: Creating a new object is a great way to keep state tightly contained, and it allows objects to be garbage-collected as soon as their dependent object can be. Dagger shows some of this preference built in: In Dagger, unscoped objects can be mixed into any component or module, regardless of whether the component is scope-annotated. This type of unscoped binding is also useful for stateless objects like injectable (mockable) utility classes and implementations of strategy, command, and other polymorphic behavioral design patterns: The objects should be bound globally and injected for testing/overrides, but the instances don't keep any state and are short-lived or disposable.
However, in Android and other performance- and memory-constrained environments, it goes against performance recommendations to create a lot of temporary objects (as described on android.com but removed since January 2022), because instance creation and garbage collection are both more-expensive processes than on desktop VMs. This leads to the pragmatic solution of marking an object @Singleton, not because it's important to always get the same instance, but just to conserve instances. This works, but is semantically weak, and also has memory and speed implications: Your short-lived util or strategy pattern object now has to exist as long as your application exists, and must be accessed through double-checked locking, or else you risk violating the "one instance only" @Singleton guarantee that is unnecessary here. This can be a source of increased memory usage and synchronization overhead.
The compromise is in @Reusable bindings, which have instance-conserving properties like @Singleton but are excepted from the scope-matching @Component rule just like unscoped bindings—which gives you more flexibility about where you install them. (See tests.) They have a lifespan only as long as the outermost component that uses them directly, and will opportunistically use an instance from an ancestor to conserve further, but without double-checked locking to save on creation costs. (Consequently, several instances of a @Reusable object may exist simultaneously in your object graph, particularly if they were requested on multiple threads at the same time.) Finally, and most importantly, they're a signal to you and future developers about the way the class is intended to be used.
Though the cost is lower, it's not zero: As Ron Shapiro notes on Medium, "Reusable has many of the same costs as Singleton. It saves synchronization, but it still forces extra classes to be loaded at app startup. The real suggestion here is to never scope unless you’ve profiled and you’ve seen a performance improvement by scoping." You'll have to evaluate the speed and memory effects yourself: @Reusable is another useful tool in the toolbox, but that doesn't mean it's always or obviously a good choice.
In short, @Singleton would work, but @Reusable has some distinct performance advantages if the whole point is performance instead of object lifecycle. Don't forget to measure performance before and after you mark an instance @Reusable, to make sure @Reusable is really a benefit for your use case.
Follow-up question from saiedmomen: "Just to be 100% clear, things like okhttpclient, retrofit and gson should be declared @Reusable, right?"
Yes, in general I think it'd be good to declare stateless utilities and libraries as @Reusable. However, if they secretly keep some state—like handling connection limits or batching across all consumers—you might want to make them @Singleton, and if they are used very infrequently from a long-lived component it might still make sense to make them scopeless so they can be garbage-collected. It's really hard to make a general statement here that works for all cases and libraries: You'll have to decide based on library functionality, memory weight, instantiation cost, and expected lifespan of the objects involved.
OkHttpClient in particular does manage its own connection and thread pools per instance, as Wyko points out in the comments, and Albert Vila Calvo likewise notes Retrofit's intended-singleton behavior. That would make those good candidates for @Singleton over @Reusable. Thanks Wyko and Albert!

Recommendations to test API request layer in iOS apps using NSOperations and Coredata

I develop an iOS app that uses a REST API. The iOS app requests data in worker threads and stores the parsed results in core data. All views use core data to visualize the information. The REST API changes rapidly and I have no real control over the interface.
I am looking for advice on how to perform integration tests for the app as easily as possible. Should I test against the API or against mock data? But how do you mock GET requests properly if you can create resources with POST or modify them with PUT?
What frameworks do you use for these kinds of problems? I played with Frank, which looks nice but is complicated due to the rapid UI changes in the iOS app. How would you test the "API request layer" in the app? Worker threads are NSOperations in a queue; everything is built asynchronously. Any recommendations?
I would strongly advise you to mock the server. Servers go down, the behavior changes, and if a test failure implies "maybe my code still works", you have a problem on your hands, because your test doesn't tell you whether or not the code is broken, which is the whole point.
As for how to mock the server, for a unit test that does this:
first_results = list_things()
delete_first_thing()
results_after_delete = list_things()
I have a mock data structure that looks like this:
{ list_things_request : [first_results, results_after_delete],
delete_thing_request: [delete_thing_response] }
It's keyed on your request, and the value is an array of responses for that request in the order that they were seen. Thus you can support repeatedly running the same request (like listing the things) and getting a different result. I use this format because in my situation it is possible for my API calls to run in a slightly different order than it did last time. If your tests are simpler, you might be able to get away with a simple list of request/response pairs.
I have a flag in my unit tests that indicates whether I am in "record" mode (that is, talking to a real server and recording this data structure to disk) or in "playback" mode (talking to the data structure). When I need to work with a test, I "record" the interactions with the server and then play them back.
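A rough Objective-C sketch of such a playback stub (the names are mine, not from a real library):

// Replays recorded responses per request key, in recorded order.
@interface PlaybackStub : NSObject
@property (nonatomic, strong) NSDictionary *recordings;     // request key -> NSArray of responses
@property (nonatomic, strong) NSMutableDictionary *cursors; // request key -> next index to replay
@end

@implementation PlaybackStub
- (id)responseForRequestKey:(NSString *)key
{
    NSUInteger index = [self.cursors[key] unsignedIntegerValue]; // a missing cursor reads as 0
    self.cursors[key] = @(index + 1); // advance, so a repeated request gets the next response
    return self.recordings[key][index];
}
@end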
I use the little-known SenTestCaseDidStartNotification to track which unit test is running and isolate my data files appropriately.
The other thing to keep in mind is that instability is the root of all evil. If you have code that does things with unordered sets, or gets the current date, and so on, this tends to change the requests and responses, which does not work in an offline scenario. So be careful with those.
(Since nobody has stepped in yet and given you a complete walkthrough.) My humble advice: step back a bit, take the magic out of async, regard everything as sync (API calls, parsing, persistence), and isolate each step as a consumer/producer. After all, you don't want to unit-test NSURLConnection, or JSONKit, or whatever (they should have been tested if you use them); you want to test YOUR code. Your code takes some input and produces output, unaware of the fact that the input was in fact output generated in a background thread somewhere. You can do the isolated tests all sync.
Can we agree on the fact that your Views don't care about how their model data was provided? If yes, well, test your View with mock objects.
Can we agree on the fact that your parser doesn't care about how the data was provided? If yes, well, test your parser with mock data.
Network layer: the same applies as described above; in the end you'll get an NSDictionary of headers and some NSData or NSString of content. I don't think you want to unit-test NSURLConnection or any 3rd-party networking API you trust (asihttp, afnetworking, ...?), so in the end, what's to be tested?
You can mock up URLs, request headers and POST data for each use-case you have, and setup test cases for expected responses.
In the end, IMHO, it's all about "normalizing" out async.
Take a look at Nocilla
For more info, check this other answer to a similar question

MVC componentization vs parallel data retrieval

This question describes two approaches to solving a sophisticated architectural problem related to ASP.NET MVC. Unfortunately our team is quite new to this technology, and we haven't found any solid sources of information on this particular topic (except overviews where it's said that MVC is more about separation than componentization). So as of now we are hesitating: either our solution is appropriate, or there is a different, obvious way to solve this problem.
We have a requirement to produce an ASP.NET MVC-based design with componentization in mind. The Razor view engine is also a requirement for us. The key feature here is that any level of controller nesting is expected (obviously through the Html.Action directive within .cshtml). Any controller could potentially obtain its data through a web service call (the final design can break this limitation, as described below).
The issue is that the data must be obtained in async and maximum parallel fashion. E.g. if two backend calls within the controllers are independent they must be performed in parallel.
At first glance the usage of async MVC controllers could solve all the problems. But there is a hidden caveat: a nested controller can only be specified within .cshtml (within a view), and a .cshtml view is parsed after the original controller has finished its own async execution. So all the async operations within the nested controller will be performed in a separate async slot and therefore not in parallel with its parent controller. This is a limitation of the synchronous nature of .cshtml processing.
After a deep investigation we revealed that two options are available.
1) Have only one parent async controller which will retrieve all the data and put it into a container (a dictionary or whatever). The nested controllers aren't allowed to perform any backend calls. Instead, they receive a reference to the initialized container with the results of all the backend calls. But this way the consumer of the framework must differentiate between parent and child controllers, which is not a brilliant solution.
2) Retrieve all the data from the backends within a special async HttpModule. This module will initialize the same kind of container, which would reside, for instance, within HttpContext. Obviously, in this case no controller will be allowed to perform any backend calls either, but they will all have a unified internal structure (in comparison with option #1).
As of now we think that option #2 is more desirable, but we are more interested in a solid, community-adopted way to solve this problem in real enterprise-level MVC projects.
Literally any links/comments are welcomed.
[UPD] The requirement of any level of controller nesting came from our customer, who wants a system in which fully reusable MVC components are available. They could be combined in any sequence, with any level of nesting, as is already done in the existing WebForms-based implementation. It is a business rule of the existing app that the components can be combined in any way, so we're not aiming to break this rule. As of now we think that such a component is a combination of "controller + view + metadata", where the "metadata" part describes the backend calls to be performed in scenario 1 or 2.
Why are you considering async calls here? Keep in mind that if your async calls exist so that the ASP.NET threads don't all get used up while the database takes a while to return, then as soon as new requests come in, they too will go to the database, increasing its workload while gaining nothing in return.
To be honest though, I'm having a hard time following exactly what you have in mind here. Nested controllers for...?
"The key feature here is that any level of controller’s nesting is expected"
I think I (we?) need a bit more information on that part here.
However, the warning on async still stands :)
E.g. if two backend calls within the controllers are independent they must be performed in parallel.
If they are truly independent, you might be able to use async JavaScript calls from the client and achieve some degree of parallelism that way.
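For instance, a hypothetical sketch with jQuery (the endpoints and render functions are invented): both requests are issued immediately, so they run in parallel, and the done callback fires once both have completed.

$.when(
    $.getJSON('/orders/summary'),
    $.getJSON('/customers/summary')
).done(function (ordersResult, customersResult) {
    // with multiple Deferreds, each argument is an array: [data, statusText, jqXHR]
    renderOrders(ordersResult[0]);
    renderCustomers(customersResult[0]);
});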
