Calling specific instances of a docker service - asp.net-mvc

I'm not exactly sure how to ask this question, or whether this is a valid approach. I am learning all about Docker, containers, etc. From what I have read, Docker is great for creating individual microservices that perform various tasks, such as a BasketService or CartService, each contained in its own Docker container on a VM. If the host is a Linux VM, I think the URL calls from my UI would be something along the lines of https://MyLinuxVM/BasketService/{controller}.
My Question:
Now let's say I have only one service, call it MyService, that needs to have multiple instances. So I could have 4 instances, e.g. MyService1, MyService2, MyService3, and MyService4, all exactly the same. From my client, would the following assumption be correct?
Can I call https://MyLinuxVM/MyService1/{controller} or https://MyLinuxVM/MyService2/{controller} to send a request to a specific container instance?
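For illustration, the calling pattern I have in mind would look something like this from the client side (a minimal Python sketch; the controller route and payload are hypothetical, and a reverse proxy on the VM would have to map each path prefix to a fixed container):

    import requests

    BASE = "https://MyLinuxVM"

    def send_to_instance(instance: str, payload: dict) -> int:
        # e.g. POST https://MyLinuxVM/MyService1/Torque/Set
        # (Torque/Set is a made-up controller route; a reverse proxy must
        # route each path prefix to one specific container.)
        resp = requests.post(f"{BASE}/{instance}/Torque/Set", json=payload)
        return resp.status_code

    print(send_to_instance("MyService1", {"toolId": 3, "torque": 42}))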
Why:
I feel this may help explain why I am doing this, and possibly help everyone understand my problem in the first place. I have 4 physical devices I need to communicate with; we will call them Device1, Device2, Device3, and Device4. Each device has its own IP address and its own set of "Tools" connected to it on various ports of the device (10-20 ports per device).
From our UI, a user can click a button that sets some torque values for the tool in their hand. The data is sent to the MVC backend, which forwards it to the "correct" background worker/container; that worker transforms the data into a byte[] and passes it along to its dedicated device. I am not sure whether I need multiple background workers in a single container, or just a single configurable container with a single background worker that gets deployed multiple times, depending on the number of devices we have running in the shop.
I have read a lot about creating different worker services that do different tasks, but I need multiple instances of one worker service that can be configured (preferably from DB tables) to send to a specific device; a sketch of that idea follows below.
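Something like the following is what I am picturing for the "single configurable worker, deployed once per device" option (a minimal Python sketch; the DEVICE_ID variable, the devices table schema, and the SQLite file are all made-up placeholders):

    import os
    import socket
    import sqlite3

    # Each deployed container is told which device it owns, e.g. via an
    # environment variable set at deploy time (DEVICE_ID is hypothetical).
    device_id = int(os.environ["DEVICE_ID"])

    # Look up the device's address in a config table; assumed schema:
    # devices(id INTEGER, ip TEXT, port INTEGER).
    conn = sqlite3.connect("config.db")
    ip, port = conn.execute(
        "SELECT ip, port FROM devices WHERE id = ?", (device_id,)
    ).fetchone()

    def send_to_device(payload: bytes) -> None:
        # Forward the transformed byte[] to this worker's dedicated device.
        with socket.create_connection((ip, port), timeout=5) as sock:
            sock.sendall(payload)

    send_to_device(bytes([0x01, 0x2A]))  # example torque payload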

Related

Using a load balancer to dispatch messages from Redis pub/sub

I have several Python applications that all connect to a Redis server and consume messages using the pub/sub mechanism. I have containerized the applications with Docker, and I would like to scale each application by replicating the number of container instances. The challenge is that I don't want each container to act as an independent subscriber to Redis: I would essentially like to load-balance the network traffic so that, when a message is published, only one container per service receives it.
Let's take the simple example of two services, Service A and Service B. Both services need to be subscribed to the same topic, so that each is notified when a message is published to that topic. Each service will process the message differently; in other words, the same message triggers two different outcomes, one executed by Service A and one by Service B. Now, I am trying to imagine an architecture in which these services consist of replicated containers; let's call them workers. Say Service A consists of two workers, A1 and A2, and Service B consists of three workers, B1, B2, and B3 (maybe it requires more processing power per message than Service A, so it needs more workers for the same message load). Both services must subscribe to the same topic so that they both receive updates as they come in, but I only want one worker per service to handle each message. Imagine that a message comes in and worker A1 handles it for Service A while B3 handles it for Service B.
Overall this feels like it should be pretty straightforward: I essentially have multiple applications, each of which needs to scale horizontally and should handle network traffic as if it were sitting behind a load balancer.
I am intending to deploy these applications with something like Amazon ECS, where each application is essentially a service with task replication, and all services connect to a centralized Redis cache acting as a message broker. In a situation like this, from the limited research I've done, it would be nice to just put a network load balancer in front of each service, so that published messages would be directed to what looks like a single subscriber but is, behind the scenes, a collection of workers acting as if they were pulling off a task queue.
I haven’t had much luck finding examples of this kind of architecture, or for that matter any examples of tasks that use something like Redis in the way I’m imagining. This is an architecture I’ve more or less dreamed up, so I could just be thinking about this all wrong, but at the same time it doesn’t seem like a crazy use case to me. I’m looking for any advice about how this could be accomplished and/or if what I’m talking about just sounds insane and there’s a better way.
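A minimal sketch of one way to get this behaviour, using Redis Streams consumer groups instead of plain pub/sub: every group gets its own copy of each stream entry, but within a group each entry is delivered to exactly one consumer, which matches the one-worker-per-service requirement. Stream, group, and consumer names below are placeholders (redis-py):

    import redis

    r = redis.Redis()
    STREAM, GROUP, CONSUMER = "topic", "service-a", "worker-a1"

    # Create the consumer group once; mkstream=True creates the stream if
    # it does not exist yet. A second service would create its own group
    # (e.g. "service-b") on the same stream.
    try:
        r.xgroup_create(STREAM, GROUP, id="0", mkstream=True)
    except redis.ResponseError:
        pass  # group already exists

    while True:
        # Within a group, each entry goes to exactly one consumer.
        for _, messages in r.xreadgroup(
            GROUP, CONSUMER, {STREAM: ">"}, count=1, block=5000
        ):
            for msg_id, fields in messages:
                print("handling", msg_id, fields)
                r.xack(STREAM, GROUP, msg_id)  # acknowledge when done

Publishing then becomes r.xadd(STREAM, {...}) instead of r.publish(...).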

Is there a way to add a separate graph for each host on a Datadog dashboard, when the hosts frequently change?

I'm trying to make a dashboard to monitor a process that runs on 5 remote machines simultaneously. I want the dashboard to display the metrics for each machine separately; basically, I want to create five separate graphs, one for each machine that runs the process. My problem is that the remote machines are reassigned periodically, so I have no way of knowing the name of the host at any given time.
I've tried creating five separate graphs, each one filtered by a different host name tag, but the graphs do not seem to pick up the new host when the lease for the process changes. I also know you can split out one graph per host using the Metrics Explorer, but I haven't found any way to do that automatically on a dashboard. Does anyone know if this is possible? Leases for the process are assigned through AWS, if that is helpful.
Thanks in advance for any suggestions.

Share storage/volume between worker nodes in Kubernetes?

Is it possible to have centralized storage/a volume that can be shared between two pods/instances of an application that run on different worker nodes in Kubernetes?
So to explain my case:
I have a Kubernetes cluster with 2 worker nodes, and on each of them I run 1 instance of app X. This means I have 2 instances of app X running in total at the same time.
Both instances subscribe to the topic topicX, which has 2 partitions, and are part of an Apache Kafka consumer group called groupX.
As I understand it, the message load will be split among the partitions, and thereby among the consumers in the consumer group. So far so good, right? A sketch of this setup follows below.
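For reference, that consumer-group setup corresponds to something like this (a minimal kafka-python sketch; the broker address is a placeholder):

    from kafka import KafkaConsumer

    # Two instances of app X running this same code with the same group_id
    # split the two partitions of topicX between them (one partition each).
    consumer = KafkaConsumer(
        "topicX",
        group_id="groupX",
        bootstrap_servers=["localhost:9092"],  # placeholder broker address
    )

    for message in consumer:
        print(message.partition, message.value)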
So to my problem:
In my whole solution I have a hierarchical division with a unique constraint on the combination of country and ID. Each combination of country and ID has a pickled model (a Python machine-learning model), which is stored in a directory accessed by the application. For each combination of country and ID, I receive one message per minute.
At the moment I have 2 countries, so to be able to scale properly I wanted to split the load between two instances of app X, each one handling its own country.
The problem is that Kafka may balance the messages across the different instances, and since an instance cannot know in advance which country a message belongs to, I have to store the pickle files in both instances.
Is there a way to solve this? I would rather keep the setup as simple as possible so it is easy to scale and add a third, fourth and fifth country later.
Keep in mind that this is an overly simplified way of explaining the problem. The number of instances is much higher in reality etc.
Yes, it's possible. If you look at the access modes table in the Kubernetes documentation, any PV (PersistentVolume) type that supports ReadWriteMany will let you share the same data store between your Kafka workers. In summary, these are:
AzureFile
CephFS
Glusterfs
Quobyte
NFS
VsphereVolume - (works when pods are collocated)
PortworxVolume
In my opinion, NFS is the easiest to implement. Note that AzureFile, Quobyte, and Portworx are paid solutions.
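As a concrete example, requesting such a shared volume from Python could look like this (a sketch using the official kubernetes client library; the claim name, namespace, size, and storage class are assumptions, and the cluster must actually offer an RWX-capable storage class such as NFS):

    from kubernetes import client, config

    config.load_kube_config()  # use config.load_incluster_config() in a pod

    # A PVC with access mode ReadWriteMany can be mounted read-write by
    # pods on different worker nodes; names below are placeholders.
    pvc = client.V1PersistentVolumeClaim(
        metadata=client.V1ObjectMeta(name="shared-models"),
        spec=client.V1PersistentVolumeClaimSpec(
            access_modes=["ReadWriteMany"],
            storage_class_name="nfs",  # assumed RWX-capable storage class
            resources=client.V1ResourceRequirements(
                requests={"storage": "1Gi"}
            ),
        ),
    )
    client.CoreV1Api().create_namespaced_persistent_volume_claim("default", pvc)

Both app X instances can then mount the claim and read the pickle files for every country from the same place.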

Telling apart instances of a scaled app on Mesos

I am running server apps inside Docker on Mesos; some apps are scaled to multiple instances. When I collect data inside the app, I want the app to store some type of identifier, so that later, when I read the data, I know which instance it was collected from. For example, if I scale an app to 3 instances on Mesos, I want to see from the data whether it came from app_1, app_2 or app_3. I thought of using the host IP, but scaled instances are sometimes spawned on the same node. I cannot use something like the PID, because it changes when the app restarts.
So I tried to find environment variables that could help me distinguish between them, but they are all the same across the platform except HOSTNAME, so I wonder if anyone has other ideas. Thank you very much.
You can use the MESOS_TASK_ID task-level environment variable for this; see the Marathon docs for more details. Also note that a Marathon application instance corresponds to a Mesos task.
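For instance, stamping each collected record with that variable might look like this (a sketch; the record layout is made up):

    import os

    # Marathon/Mesos sets MESOS_TASK_ID in each task's environment, giving
    # every running instance a unique identifier.
    instance_id = os.environ.get("MESOS_TASK_ID", "unknown")

    record = {"instance": instance_id, "value": 42}  # hypothetical payload
    print(record)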

Is this the right way of building an Erlang network server for multi-client apps?

I'm building a small network server for a multi-player board game using Erlang.
This network server uses a local instance of Mnesia DB to store a session for each connected client app. Inside each client's record (session) stored in this local Mnesia, I store the client's PID and NODE (the node where a client is logged in).
I plan to deploy this network server on at least 2 connected servers (Node A & B).
So in order to allow a Client A who is logged in on Node A to search (query Mnesia) for a Client B who is logged in on Node B, I replicate the Mnesia session table from Node A to Node B and vice versa.
After Client A queries the PID and NODE of Client B, Client A and B can communicate with each other directly.
Is this the right way of establishing connection between two client apps that are logged-in on two different Erlang nodes?
Creating a system where two or more nodes are perfectly in sync is by definition impossible. In practice, however, you might get close enough that it works for your particular problem.
You don't say the exact reason for running on two nodes, so I'm going to assume it is for scalability. With many nodes, your system will also be more available and fault-tolerant, if you get it right. However, the problem could be simplified if you know you will only ever run on a single node and need the other node as a hot standby to take over if the master becomes unavailable.
To establish a connection between two processes on two different nodes, you need some global addressing (e.g. user id 123 maps to pid <0.123.0>). If you also care that only one process runs for User A at a time, you need a lock, or must allow only unique registrations in the addressing. If you also want to grow, you need a way to add more nodes, either while your system is running or while it is stopped.
Now, there are already some solutions out there that help solve your problem, with different trade-offs:
gproc in global mode allows registering a process under a given key (which gives you addressing and locking). This is distributed to the entire cluster, with no single point of failure; however, the leader election (at least when I last looked at it) works only for nodes that were available when the system started. Adding new nodes requires an experimental version of gen_leader, or stopping the system. Within your own code, if you know two players are only ever going to talk to each other, you could start them on the same node.
riak_core allows you to build on top of the well-tested and proven architecture used in Riak KV and Riak Search. It maps keys into buckets in a fashion that allows you to add new nodes and have the keys redistributed. You can plug into this mechanism and move your processes. This approach does not let you decide where to start your processes, so if they communicate a lot with each other, that traffic will go across the network.
Using mnesia with distributed transactions allows you to guarantee that every node has the data before the transaction is committed. This would give you distribution of the addressing and locking, but you would have to do everything else on top of it (like releasing the lock). Note: I have never used distributed transactions in production, so I cannot tell you how reliable they are. Also, because they are distributed, expect latency. Note 2: You should check exactly how you would add more nodes and have the tables replicated, for example whether it is possible without stopping mnesia.
ZooKeeper / Doozer / roll your own provides a centralized, highly available database which you may use to store the addressing. In this case you would need to handle unregistering yourself. Adding nodes while the system is running is easy from the addressing point of view, but you need some way for your application to learn about the new nodes and start spawning processes there.
Also, it is not necessary to store the node, as the pid contains enough information to send the messages directly to the correct node.
As a cool trick which you may already be aware of: pids may be serialized to a binary, as may all data within the VM. Use term_to_binary/1 and binary_to_term/1 to convert between the actual pid inside the VM and a binary which you can store in anything that accepts binary data without mangling it.
