Sharing DAO across queues, separating reads/writes (MassTransit, ASP.NET MVC)

I saw an example of how to create an ASP.NET MVC application using MassTransit messaging.
What I do understand:
ASP.NET MVC controllers are meant to publish events;
responses come from another (RabbitMQ) queue hosted in a separate assembly, via SignalR or some other transport;
each queue (jobs, db writes, responses) is hosted in a separate console project/Windows service.
What I don't understand:
Is the separation of reads and writes necessary?
The example uses MongoDB, which I could still use, so even separating reads and writes wouldn't be problematic. But what about SQL Server/EF? How would you build/share database access across multiple assemblies? Should each queue assembly host its own DbContext?

If you were to use SQL Server and EF, then yes, each process would have its own DbContext instance to access the database. It would be the same shared type across all processes; that's not a problem at all. I would, however, define the EF domain in a shared assembly.
Breaking apart reads and writes isn't really necessary, especially at first. However, it can become useful as you scale up and have other concerns mixed in, or need additional scale. Separating reads and writes is just a useful tool in modeling the software you need to write; it's not a requirement at all.
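A rough analogue of that advice in Python, using sqlite3 in place of EF/SQL Server and threads standing in for the separate queue processes (all names here are illustrative): the schema lives in one shared definition, while each worker builds its own context instance over it.

```python
import os
import sqlite3
import tempfile
import threading

# Shared "domain" definition, analogous to an EF model in a shared assembly.
SCHEMA = "CREATE TABLE IF NOT EXISTS jobs (id INTEGER PRIMARY KEY, payload TEXT)"

DB_PATH = os.path.join(tempfile.gettempdir(), "shared_demo.db")
if os.path.exists(DB_PATH):
    os.remove(DB_PATH)

def make_context():
    """Each process/worker builds its own context over the shared schema."""
    conn = sqlite3.connect(DB_PATH)
    conn.execute(SCHEMA)
    return conn

def write_worker():
    ctx = make_context()          # this worker's own "DbContext"
    with ctx:                     # commits the transaction on success
        ctx.execute("INSERT INTO jobs (payload) VALUES (?)", ("hello",))
    ctx.close()

def read_worker(out):
    ctx = make_context()          # a separate instance of the same shared type
    out.extend(row[0] for row in ctx.execute("SELECT payload FROM jobs"))
    ctx.close()

t = threading.Thread(target=write_worker)
t.start()
t.join()

results = []
read_worker(results)
print(results)  # ['hello']
```

The point of the sketch is only that multiple independent contexts over one shared model definition coexist without conflict, which is the same reason per-process DbContext instances are unproblematic.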

Related

FireDAC multiple pools of connections

I'm in the process of upgrading our client-server ERP app to multi-tier. We want to offer our customers the possibility of having their databases in the cloud (hosted on our server). The clients are written in Delphi; the server is an HTTP IOCP server, also written in Delphi (from the mORMot framework); for the databases we use Firebird Embedded.
Our customers (let's say 200) can each have 25-30 Firebird databases (5000-6000 databases in total), accessed by 4-5 users per customer. This does not all happen at once: one user can work in one db while two others work in another, but all the dbs should be available and online. So I can have 800-1000 users working against 700-900 dbs. The databases are not big, typically 20-30 MB each, but they can grow to 200 MB.
This is not data sharding, so please don't suggest merging all the databases together; I really need them kept individual, with the possibility to backup/restore/replace each one of them.
So I need multiple pools of connections: for every database, a pool of, say, 2 connections. I read about FireDAC connection pooling, and it seems that TFDManager should be perfect for me. I define multiple ConnectionDefs with Pooled=True, and it can maintain multiple pools of connections (each connection lasting until some minutes of inactivity).
Questions:
Do I have to create all the ConnectionDefs before the server starts serving requests?
Can TFDManager handle requests (and time out connections on inactivity) while, in another thread, I create a new database, so that I automatically need to create a new pool of connections and start serving requests from the newly created database? Practically: can I call FDManager.AddConnectionDef(..) while other pools are in use?
AFAIK Firebird Embedded does not have any "connection". As its name states, it is embedded within the same process, so there is no connection pool needed. Connection pools are needed when several clients connect/disconnect to the same DB over the network, whereas here everything is embedded and you have direct access to the Firebird engine.
As such, you may:
Define one "connection" per Firebird Embedded database;
Protect your SOA code with a mutex (aka critical section) per DB. mORMot's HTTP IOCP server runs incoming requests from a thread pool, so ensure that all DB access is made safely.
Ensure you use at least Firebird 2.5, since the embedded version is stated to be thread-safe only since version 2.5 (see the release notes).
Instead of FireDAC, consider using ZDBC/Zeos (in the latest 7.2/7.3 branch), which has nice features, together with the native mORMot SynDB libraries.
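The "mutex per DB" suggestion above is language-agnostic; here is a minimal, hedged Python illustration (all names are made up) of keeping one lazily created critical section per database file so that requests from a thread pool serialize per database rather than globally.

```python
import threading
from collections import defaultdict

# One lock (critical section) per database path, created lazily.
_db_locks = defaultdict(threading.Lock)
_db_locks_guard = threading.Lock()  # protects the dict itself

def db_lock(db_path):
    with _db_locks_guard:
        return _db_locks[db_path]

executed = []

def handle_request(db_path, work):
    """A thread-pool request handler: access to each DB is serialized,
    but requests against different DBs proceed in parallel."""
    with db_lock(db_path):
        executed.append((db_path, work))

threads = [threading.Thread(target=handle_request, args=("db%d.fdb" % (i % 3), i))
           for i in range(9)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(executed))  # 9
```

This per-DB granularity is what makes the scheme scale better than a single global critical section.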
Looking at the FireDAC sources, it seems that everything related to adding connection definitions and acquiring connections in pooled mode is thread-safe.
Adding a connection definition or matching one is guarded by a TMultiReadExclusiveWriteSynchronizer and acquiring a connection from the pool is guarded by a TCriticalSection.
So, answers:
I don't have to create all the ConnectionDefs before the server starts serving requests.
Yes, I can safely call FDManager.AddConnectionDef(..) while other pools are in use.
Using FireDAC, acquiring a connection for any of those databases is guarded by one TCriticalSection. The solution proposed by @Arnaud Bouchez offers more fine-grained access by creating one TCriticalSection per database, and I think it will scale better, but you should be aware of a bug when using multiple TCriticalSections, especially as they will all be initiated at once:
https://www.delphitools.info/2011/11/30/fixing-tcriticalsection/
That article presents a very simple fix for this bug.
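The locking scheme described in this answer can be sketched in Python terms (a toy analogue, not FireDAC's actual implementation): one synchronizer guards the definition registry, and each pool gets its own critical section, which is why AddConnectionDef is safe to call while other pools are in use.

```python
import threading

class PoolManager:
    """Toy analogue of TFDManager's locking scheme (illustrative only)."""
    def __init__(self):
        self._defs_guard = threading.Lock()   # stands in for the MREW synchronizer
        self._defs = {}                       # name -> (params, pool_lock, pool)

    def add_connection_def(self, name, params):
        with self._defs_guard:                # "exclusive write" on the registry
            self._defs[name] = (params, threading.Lock(), [])

    def acquire(self, name):
        with self._defs_guard:                # "read" to match the definition
            params, pool_lock, pool = self._defs[name]
        with pool_lock:                       # per-pool critical section
            return pool.pop() if pool else "conn(%s)" % name

mgr = PoolManager()
mgr.add_connection_def("db1", {"pooled": True})

def use_db1():
    for _ in range(100):
        mgr.acquire("db1")

t = threading.Thread(target=use_db1)
t.start()
mgr.add_connection_def("db2", {"pooled": True})  # safe while db1's pool is busy
t.join()
print(mgr.acquire("db2"))  # conn(db2)
```

Because the registry lock is held only briefly while matching a definition, adding a new definition never blocks behind a long-running acquisition on another pool.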

Orleans Architecture Design - Silo projects combined or separate?

There is an IoT data collection project, and an IoT data processing project. They are separately developed and maintained. However, it would be beneficial to share common grains between them in an Orleans silo (or silo cluster). How would the architecture look in a self-hosted scenario - a monolithic silo with references to both projects for communication within the silo or two separate silos communicating externally? If in a single silo, can a silo dynamically discover grain .dll's?
There will probably be better answers, but until then:
There are some trade-offs. Performance-wise, it's better to spread all your grains (of all services) across the cluster. That way every grain communicates with other grains via the Orleans infrastructure (I guess that's binary-serialized messages over TCP), without any additional overhead. But when every service (or project) has its own silo, you will need a gateway, maybe an HTTP listener, in addition to Orleans. However, in the first setup your services become coupled: you cannot deploy a new version of a service as long as there is a silo running an older version of it (otherwise there might be two grains for the same entity). But if you shut down that silo, you are shutting down the rest of the services. This is a very non-trivial issue.
If in a single silo, can a silo dynamically discover grain .dll's
Not sure what you mean. When a silo boots up, it recursively searches for dlls inside its folder and, if it finds grains, loads them.
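That boot-time discovery behaviour (scan a folder, load whatever grain assemblies are found) can be sketched with Python's importlib; this is only an analogy to the silo's dll scan, not Orleans itself, and all names are invented.

```python
import importlib.util
import pathlib
import tempfile

def load_plugins(folder):
    """Recursively scan a folder and import every module found, the way a
    silo scans its directory for grain dlls at boot."""
    plugins = {}
    for path in pathlib.Path(folder).rglob("*.py"):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        plugins[path.stem] = module
    return plugins

# Demo: drop a "grain" module into a folder, then discover it.
with tempfile.TemporaryDirectory() as folder:
    (pathlib.Path(folder) / "hello_grain.py").write_text("GREETING = 'hi'")
    plugins = load_plugins(folder)
    print(plugins["hello_grain"].GREETING)  # hi
```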

How do I call Web API from MVC without latency?

I'm thinking about moving my DAL which uses DocumentDb and Azure Table Storage to a separate Web API and host it as a cloud service on Azure.
The primary purpose of doing this is to keep a high-performance DAL that can scale up easily and independently of my front-end application -- currently ASP.NET MVC 5 running as a cloud service on Azure, though I'll definitely add mobile apps as well. With DocumentDb and Azure Table Storage I find myself doing a lot of data handling in my C# code, so I think it would be a good idea to keep that separate from my front-end application.
However, I'm very concerned about latency issues introduced by HTTP calls from one cloud service to another which would defeat the purpose of separating DAL into its own application/cloud service.
What is the best way to separate my DAL from my front-end application without introducing any latency issues?
I think the trade-off between scaling out/partitioning resources and network latency is unavoidable. That said, you may find the trade-off well worth it for many reasons (e.g. enabling parallel execution of application tasks, increased reliability) when working with large-scale systems.
Here are some general tips to help you minimize the hit on network latency:
Use caching to avoid cross-service calls whenever possible.
Batch cross-service calls and reuse connections whenever possible to minimize the cost of traversing the NAT out of one cloud service and through the load balancer into another. Note: your application must also be able to handle dropped connections (inevitable in cloud architecture).
Monitor performance metrics as much as possible to take measurements and identify bottlenecks.
Co-locate your application's layers within the same datacenter to keep cross-service latency to a minimum.
You may also find the following literature useful: http://azure.microsoft.com/en-us/documentation/articles/best-practices-performance/
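The first two tips (cache, and batch cross-service calls) can be sketched as follows; `remote_fetch_many` here is a made-up stand-in for an HTTP call into the other cloud service, and the class name is purely illustrative.

```python
import time

calls = {"count": 0}

def remote_fetch_many(keys):
    """Pretend cross-service HTTP call; one round trip serves a whole batch."""
    calls["count"] += 1
    return {k: "value-of-%s" % k for k in keys}

class CachingBatchingClient:
    """Avoids cross-service calls via a TTL cache, and batches the misses."""
    def __init__(self, ttl_seconds=30):
        self._ttl = ttl_seconds
        self._cache = {}  # key -> (value, expiry)

    def get_many(self, keys):
        now = time.monotonic()
        hits = {k: v for k, (v, exp) in self._cache.items()
                if k in keys and exp > now}
        misses = [k for k in keys if k not in hits]
        if misses:  # one batched round trip instead of one call per key
            fetched = remote_fetch_many(misses)
            for k, v in fetched.items():
                self._cache[k] = (v, now + self._ttl)
            hits.update(fetched)
        return hits

client = CachingBatchingClient()
client.get_many(["a", "b", "c"])   # one network call covers three keys
client.get_many(["a", "b"])        # served from cache, zero calls
print(calls["count"])  # 1
```

Five lookups cost a single round trip here; against a real cross-service HTTP hop, that difference dominates the latency budget.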
I recently split out my DAL to a WebAPI that serves data from DocumentDB for both the MVC website and mobile applications for the same reasons stated by the questioner.
The statements from aliuy are valid performance considerations, generally accepted as good practice.
But more specifically: in order to call a Web API from MVC with minimal latency on Azure cloud services, one should specify the same affinity group for each resource (websites, cloud services, etc.).
Affinity groups are a way you can group your cloud services by proximity to each other in the Azure datacenter in order to achieve optimal performance. When you create an affinity group, it lets Azure know to keep all of the services that belong to your affinity group as physically close to each other as possible.
https://azure.microsoft.com/en-us/documentation/articles/virtual-networks-migrate-to-regional-vnet/

MassTransit in ASP.NET MVC site?

I'd like to decouple a number of business objects that my website uses to support user actions.
My website is a SaaS/B2B site, and I do not anticipate a need for "mega scale". My primary issue is the need to decouple business objects from each other, and to perform occasional longer-running operations asynchronously, outside the threads that handle user traffic.
Having said that, I really do not want to run a separate set of servers to process my messages; I would prefer the web servers to just host MassTransit (or other bus software) internally, in memory. Assured message delivery is also not yet my most important concern. I plan to "outsource" a number of supporting business actions to the bus so that they do not pollute my main business services/objects.
Is this possible? Do I need Loopback as a transport for now, or do I need full RabbitMQ? Will RabbitMQ require me to install yet another set of servers to host it?
TIA
Loopback is just for testing. Installing RabbitMQ is the right path. You don't NEED different servers for it, but I would suggest them: if you offload work to a bus, you don't really want it contending for resources with the website. That given, you can run RabbitMQ locally without any issue. If message volume is low, so is resource usage in RabbitMQ. When you reach higher volumes, IO can become a problem with RabbitMQ (or any MQ).
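For the in-process setup the asker describes, a loopback-style bus is essentially a queue plus a consumer thread inside the web process. A minimal Python sketch of the idea (this is not MassTransit's API; every name here is invented for illustration):

```python
import queue
import threading

class InMemoryBus:
    """Tiny in-process bus: handlers run on a background thread, keeping
    longer-running work off the request-handling threads. Messages are
    lost on process exit, i.e. no assured delivery."""
    def __init__(self):
        self._q = queue.Queue()
        self._handlers = {}
        threading.Thread(target=self._consume, daemon=True).start()

    def subscribe(self, message_type, handler):
        self._handlers.setdefault(message_type, []).append(handler)

    def publish(self, message):
        self._q.put(message)  # returns immediately to the caller

    def _consume(self):
        while True:
            msg = self._q.get()
            for handler in self._handlers.get(type(msg), []):
                handler(msg)
            self._q.task_done()

class OrderPlaced:
    def __init__(self, order_id):
        self.order_id = order_id

processed = []
bus = InMemoryBus()
bus.subscribe(OrderPlaced, lambda m: processed.append(m.order_id))
bus.publish(OrderPlaced(42))   # a controller would do this and return at once
bus._q.join()                  # wait for the background handler (demo only)
print(processed)  # [42]
```

This matches the trade-off in the answer above: in-memory is trivial to host inside the web process, but moving to RabbitMQ is what buys durability and moves the contention off the web servers.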

FoundationDB, the layer: Is it hosted on client application or server nodes?

Recently I was reading about the concept of layers in FoundationDB. I like their idea: the decoupling of storage on one side from access to it on the other.
There are some unclear points regarding the implementation of the layers, especially how they communicate with the storage engine. There are two possible answers: they are part of the server nodes and communicate with the storage via fast native API calls (e.g. as linked modules hosted in the server process), or they are hosted inside the client application and communicate through a network protocol. For example, the SQL layer of many RDBMSs is hosted on the server. How do things stand with FoundationDB?
PS: These two cases differ from a performance point of view, especially when the client-server communication is high-latency.
To expand on what Eonil said: the answer rests on the distinction between two different senses of "client" and "server".
Layers are not run within the database server processes. They use the FDB client API to make requests of the database, and do not (with one exception*) get to pierce the transactional key-value abstraction.
However, there is nothing stopping you from running the layers on the same physical (or virtual) machines as the database server processes. And, as that post from the community site mentions, there are use cases where you might very much wish to do this in order to minimize latencies.
*The exception is the Locality API, which is mostly useful in exactly those cases where you want to co-locate client-side layers with the data on which they operate.
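To make the client-side point concrete: a layer is ordinary application code that only uses the database's client API. Here is a hedged Python sketch of a tiny "counter layer" built on a stand-in key-value client (the `KVClient` class is invented for illustration; FDB's real Python bindings differ):

```python
class KVClient:
    """Stand-in for a key-value client API (not the real FDB bindings)."""
    def __init__(self):
        self._data = {}
    def get(self, key):
        return self._data.get(key)
    def set(self, key, value):
        self._data[key] = value

class CounterLayer:
    """A 'layer': richer semantics built purely on the client-side API.
    It never touches server internals, only get/set."""
    def __init__(self, client, prefix):
        self._client = client
        self._prefix = prefix
    def increment(self, name, delta=1):
        key = "%s/%s" % (self._prefix, name)
        new_value = (self._client.get(key) or 0) + delta
        self._client.set(key, new_value)
        return new_value

db = KVClient()
counters = CounterLayer(db, "counters")
counters.increment("page_views")
total = counters.increment("page_views")
print(total)  # 2
```

Since the layer sits entirely on the client side, where it physically runs is a deployment choice, which is exactly why co-locating it with the database servers (as the answer above suggests) is possible but not required.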
Layers are built on top of the client-side library.
Cited from http://community.foundationdb.com/questions/153/what-layers-do-you-want-to-see-first
That's a good question. One reason that it doesn't always make sense to run layers on the server is that in a distributed database the data is scattered: the servers themselves are a network hop away from a random piece of data, just like the client.
Of course, for something like an analytics layer which is aware of what data each server contains, it makes sense to run a distributed version co-located with each of the machines in the FDB cluster.
