Why are read-only nodes called read-only in the case of data store replication? - serverless

I was going through the article, https://learn.microsoft.com/en-us/azure/architecture/patterns/cqrs which says, "If separate read and write databases are used, they must be kept in sync". One obvious benefit I can understand from having separate read replicas is that they can be scaled horizontally. However, I have some doubts:
It says, "Updating the database and publishing the event must occur in a single transaction". My understanding is that there is no guarantee that the updated data will be available immediately on the read-only nodes because it depends on when the event will be consumed by the read-only nodes. Did I get it correctly?
Data must be first written to read-only nodes before it can be read i.e. write operations are also performed on the read-only nodes. Why are they called read-only nodes? Is it because the write operations are performed on these nodes not directly by the data producer application; but rather by some serverless function (e.g. AWS Lambda or Azure Function) that picks up the event from the topic (e.g. Kafka topic) to which the write-only node has sent the event?
Is the data sharded across the read-only nodes or does every read-only node have the complete set of data?

All of these have "it depends"-like answers...
Yes, usually, although some implementations might choose to (try to) update read models transactionally with the update. With multiple nodes you're quickly forced to learn the CAP theorem, though, and so in many CQRS contexts, eventual consistency is just accepted as a feature, as the gains from tolerating it usually significantly outweigh the losses.
I suspect the bit you quoted anyway refers to transactionally updating the write store with publishing the event. Even this can be difficult to achieve, and is one of the problems event sourcing seeks to solve.
Yes. It's trivially obvious - in this context - that data must be written before it can be read, but your apps as consumers of the data see them as read-only.
Both are valid outcomes. Usually this part is less an application concern and is more delegated to the capabilities of your chosen read-model infrastructure (Mongo, Cosmos, Dynamo, etc).

Related

How to do Neo4j Cache-based Sharding?

I've been reading Neo4j's Operational Manual on Cache Sharding, and posts all over the web, however I can hardly find any detailed example on how to configure HAProxy for cache sharding(yes the one on Operation Manual is rather brief) on a real-world graph, which may contain multiple node labels.
Has anyone ever done this before? Would be lovely if you could share your experience.
Moreover, I'm a bit confused on the mechanism of the way to shard the graph using HAProxy. How do sub-graphs get cached on certain slaves, merely by providing rules in HAProxy? It surprised me to learn that cache sharding isn't handled by Neo4j.
The goal is to send queries hitting the same region of your graph always to the same instance. This of course means that the request data indicates the region. What to use as "region indicator" is heavily depending on the structure and shape of your graph.
In a lot of cases of customer facing applications people successfully used the current user id and set it as additional http header which is then evaluated by haproxy.

Should instances of a horizontally scaled microservice share DB?

Given a microservice that owns a relational database and needs to scale horizontally, I see two approaches to provisioning of the database server:
provide each instance of the service with it's own DB server instance with a coupled process lifecycle
OR
have the instances connect to a shared (by identical instances of the same service) independent db server or cluster
With an event driven architecture and the former approach, each instance of the microservice would need to process each event and take the appropriate action to mutate its own isolated state. This seems inefficient.
Taking the latter approach, only one instance has to process the event to achieve the same effect but as a mutation of the shared state. One must ensure each event is processed by only one instance of the given microservice (is this trivial?) to avoid conflict.
Is there consensus on preferred approach here? What lessons has your experience taught you on this?
I would go with the first approach, a service local DB. Each instance has its own DB instance. This enables to change the persistence layer between versions of the service.
Changing the ER model otherwise would lead to conflicts. You would also be able to change to a NoSQL solution with this approach easily.
With the event driven design, I can recommend this book: Designing Event Driven Systems
As I see it, a service receives an request that leads to an Event. This Event is consumed by the other instances of the service, therefore the request doesn't need to be processed again, but the result has to be copied to the instances state.

Apache Samza local storage - OrientDB / Neo4J graph instead of KV store

Apache Samza uses RocksDB as the storage engine for local storage. This allows for stateful stream processing and here's a very good overview.
My use case:
I have multiple streams of events that I wish to process taken from a system such as Apache Kafka.
These events create state - the state I wish to track is based on previous messages received.
I wish to generate new stream events based on the calculated state.
The input stream events are highly connected and a graph such as OrientDB / Neo4J is the ideal medium for querying the data to create the new stream events.
My question:
Is it possible to use a non-KV store as the local storage for Samza? Has anyone ever done this with OrientDB / Neo4J and is anyone aware of an example?
I've been evaluating Samza and I'm by no means an expert, but I'd recommend you to read the official documentation, and even read through the source code—other than the fact that it's in Scala, it's remarkably approachable.
In this particular case, toward the bottom of the documentation's page on State Management you have this:
Other storage engines
Samza’s fault-tolerance mechanism (sending a local store’s writes to a replicated changelog) is completely decoupled from the storage engine’s data structures and query APIs. While a key-value storage engine is good for general-purpose processing, you can easily add your own storage engines for other types of queries by implementing the StorageEngine interface. Samza’s model is especially amenable to embedded storage engines, which run as a library in the same process as the stream task.
Some ideas for other storage engines that could be useful: a persistent heap (for running top-N queries), approximate algorithms such as bloom filters and hyperloglog, or full-text indexes such as Lucene. (Patches accepted!)
I actually read through the code for the default StorageEngine implementation about two weeks ago to gain a better sense of how it works. I definitely don't know enough to say much intelligently about it, but I can point you at it:
https://github.com/apache/samza/tree/master/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv
https://github.com/apache/samza/tree/master/samza-kv/src/main/scala/org/apache/samza/storage/kv
The major implementation concerns seem to be:
Logging all changes to a topic so that the store's state can be restored if a task fails.
Restoring the store's state in a performant manner
Batching writes and caching frequent reads in order to save on trips to the raw store.
Reporting metrics about the use of the store.
Do the input stream events define one global graph, or multiple graphs for each matching Kafka/Samza partition? That is important as Samza state is local not global.
If it's one global graph, you can update/query a separate graph system from the Samza task process method. Titan on Cassandra would one such graph system.
If it's multiple separate graphs, you can use the current RocksDB KV store to mimic graph database operations. Titan on Cassandra does just that - uses Cassandra KV store to store and query the graph. Graphs are stored either via matrix (set [i,j] to 1 if connected) or edge list. For each node, use it as the key and store its set of neighbors as the value.

Is Erlang bad language for this app?

I am building framework for realtime web applications. I started to do it in Elixir, because
it is modern way how to develop application for Erlang VM. Erlang should be good if you need concurrency, fault tolerant, scalable apps (something like web server etc.). That is exactly what i need.
Question: Realtime framework always need for instance keep information about who is interested in what. This will be accomplished by using publish/subscribe pattern. So i will have 1000 clients subscribing to topic "newest-message". I need to save those clients (pid of process representing each client) somewhere to later access them if content for topic "newest-message" appears.
This is where i am confused if Erlang is really good for my framework.
ETS is probably the only option where to store shared data, but ETS is always copying everything if you save/access records. So that means copy 1000 pids always when i need to access them (instead of just iterating over some list, if i will do it for instance in c/java/python).
This will be probably great bottleneck if still copying many and many records from ETS (many clients, many subscriptions etc), i am right?
Sharing the state may be a sign of bad design. You can for example have process for each queue/topic and it will store its own list of subscribers. You send a message to that topic process and it in turn sends the message to clients. This way, you don't copy entire subscriber list.
If you need to process them in parallel, you can split the subscriber list between more processes.
The fault tolerance of Erlang is achieved, because it doesn't let you share state and you have to put more thought to the design, that will not involve state sharing, but will be efficient. This will pay off in the long run, so Erlang/Elixir is definitely good language for this kind of apps. Just look at RabbitMQ.
In my opnion, if you plan to save states like "who is interested in what" Erlang alone may not be a good idea. Of course, sometimes it is very convenient to pass everything in signals (like you'd do in Erlang), but when there is much content to store - lack of state in Erlang starts to hinder you rather than help.
On the other hand, you can keep a broad piece of convenience of Erlang and use it with a Java application, for example. Erlangs interface for Java enables you to connect both technologies quite easily, and at the same time you can use a Java app to store information for you (and save them somewhere, when necessary) and Erlang for the whole concurrent signaling real time part. Even better than that: you can still implement OTP with architecture like that, so you can create quite a lightweight application (because real-time logic is done by Erlang for you) being able to access stored data easily (because Java helps you here).

What is Mnesia replication strategy?

What strategy does Mnesia use to define which nodes will store replicas of particular table?
Can I force Mnesia to use specific number of replicas for each table? Can this number be changed dynamically?
Are there any sources (besides the source code) with detailed (not just overview) description of Mnesia internal algorithms?
Manual. You're responsible for specifying what is replicated where.
Yes, as above, manually. This can be changed dynamically.
I'm afraid (though may be wrong) that none besides the source code.
In terms of documenation the whole Erlang distribution is hardly the leader
in the software world.
Mnesia does not automatically manage the number of replicas of a given table.
You are responsible for specifying each node that will store a table replica (hence their number). A replica may be then:
stored in memory,
stored on disk,
stored both in memory and on disk,
not stored on that node - in this case the table will be accessible but data will be fetched on demand from some other node(s).
It's possible to reconfigure the replication strategy when the system is running, though to do it dynamically (based on a node-down event for example) you would have to come up with the solution yourself.
The Mnesia system events could be used to discover a situation when a node goes down; given you know what tables were stored on that node you could check the number of their online replicas based on the nodes which were still online and then perform a replication if needed.
I'm not aware of any application/library which already manages this kind of stuff and it seems like a quite an advanced (from my point of view, at least) endeavor to make one.
However, Riak is a database which manages data distribution among it's nodes transparently from the user and is configurable with respect to the options you mentioned. That may be the way to go for you.

Resources