I have a question about inter-zone egress charges on Google Cloud Run (managed). As I understand it, there is no control over which zones Cloud Run chooses. So when deploying several microservices that talk to each other, there could potentially be significant charges.
In Kubernetes this can be alleviated via service topology (preferring the same zone, or even the same host if available). Is there any way to achieve this with Cloud Run?
https://kubernetes.io/docs/concepts/services-networking/service-topology/
According to the Cloud Run pricing and internet egress pricing pages, the cost stays the same regardless of whether the apps are within the same zone or not.
Now, if you plan to have heavy traffic between your apps, you should consider a different setup. Either GKE or Cloud Run for Anthos will allow you to set up communication between your apps through internal IP addresses, which is free of charge assuming they are in the same zone. Refer to this table.
We are currently operating a backend stack in Central Europe, Japan and Taiwan and are preparing our stack to transition to Docker Swarm.
We are working with real-time data streams from sensor networks to issue fast disaster warnings, which means that latency is critical for some services. Therefore, we currently have brokers (RabbitMQ) running on dedicated servers in each region, as well as a backend instance digesting the data that is sent across these brokers.
I'm uncertain how best to achieve a comparable topology using Docker Swarm. Is it possible to group nodes, let's say by country, and then deploy latency-critical service stacks to each of these groups? Should I create separate swarms for each region (which feels conceptually contradictory to Docker Swarm)?
The swarm managers should be in a low latency zone. Swarm workers can be anywhere. You can use a node label to indicate the location of the node, and restrict your workloads to a particular label as needed.
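As a rough sketch of the node-label approach, the following Python snippet only builds the Docker CLI commands one might run on a manager. The label key `region`, the node names, the service names and the image `example/ingest:latest` are all made up for illustration; only `docker node update --label-add` and the `--constraint node.labels.<key>==<value>` placement syntax come from Docker itself.

```python
import shlex
from typing import List


def label_node_cmd(node: str, region: str) -> List[str]:
    """Build the command that tags a swarm node with a (hypothetical) region label."""
    return ["docker", "node", "update", "--label-add", f"region={region}", node]


def regional_service_cmd(name: str, image: str, region: str) -> List[str]:
    """Build the command that creates a service restricted to nodes with that label."""
    return [
        "docker", "service", "create",
        "--name", name,
        "--constraint", f"node.labels.region=={region}",
        image,
    ]


if __name__ == "__main__":
    # Hypothetical node names mapped to hypothetical region labels.
    nodes = {"worker-eu-1": "eu", "worker-jp-1": "jp", "worker-tw-1": "tw"}

    for node, region in nodes.items():
        print(shlex.join(label_node_cmd(node, region)))

    # One latency-critical service per region, pinned via the label.
    for region in sorted(set(nodes.values())):
        cmd = regional_service_cmd(f"ingest-{region}", "example/ingest:latest", region)
        print(shlex.join(cmd))
```

The same constraint can also be written in a stack file under `deploy.placement.constraints`, which is probably more convenient when deploying whole service stacks per region.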
Latency critical considerations on the container-to-container network across large regional boundaries may be relevant depending on your required data path. If the only latency-critical data path is to the rabbitmq service that is external to the swarm, then you won't need to worry about the container-to-container latency.
It is also a valid pattern to have one swarm per region. If you need to be able to lose any region without impacting services in another region, then you'd want to split it up. If you have multiple low-latency regions, then you can spread the manager nodes across those.
I have a fully managed Google Cloud Run service running in Frankfurt. I was not able to choose a zone, only a region, so I took "europe-west3". For my Google Cloud SQL server I can, and have to, choose a zone. I want to select the same data center for my SQL server and my service to keep the distance short and connections fast, but I don't know which zone I should use (a, b, c). Do you know a way to determine which zone is the best fit for a fully managed Cloud Run service?
Unfortunately, you cannot choose a zone to deploy your Cloud Run service; control only goes down to the region level. However, this is not something that you should be worried about, as you can see in this documentation:
A zone is a deployment area for Google Cloud resources within a region
That means that even though the resources might not be in the same cluster or VM, they are still very close geographically and very likely to be in the same data center, and as mentioned in the same documentation:
Locations within regions (Zones) tend to have round-trip network latencies of under <1ms on the 95th percentile.
So you are looking at very low latency between your resources anyway, to the point that it might not even be noticeable.
Google App Engine flexible allows you to deploy docker containers... how does scaling manifest itself?
Will a new VM be spun up each time the application needs to scale or can it spin up new container instances on an existing VM?
Can individual containers scale independently of each other? E.g. the product container is under load but the customer container is not, so only a new product container is spun up?
I realize GKE would be a better option for scaling containers, but I need to understand how this works on GAE for a multitude of reasons.
App Engine flex will only run one of your app containers per VM instance. If it needs to scale up, it'll always create a new VM to run the new container.
As per your example, if you want to scale "product" and "customer" containers separately, you'll need to define them as separate App Engine services. Each service will have its own scaling set up and act independently.
If you have containers, you can have a look at Cloud Run, which scales to 0 and can scale up very quickly (there is no new VM to provision, which can take several seconds on App Engine flex).
However, long-running requests aren't supported (they are limited to 15 minutes). It all depends on your requirements in terms of features, portability and scalability.
Provide more details if you want more advice.
Google App Engine is a fully managed serverless platform, where you basically submit your code and GAE manages the underlying infrastructure and the runtime environment (for example, the version of the Python interpreter). You can also customize the runtime environment with Dockerfiles.
In contrast, GKE provides more fine-grained control over your cluster infrastructure. You can configure your compute resources, network, security, how the services are exposed, custom scaling policies, etc. GKE can be considered a managed container orchestration platform.
An alternative to GKE that can provide even more control is creating the resources you need in GCE and configuring Kubernetes by yourself.
Both GKE and GAE are built on, and priced by, Compute Engine instances. Google Cloud Functions, however, is a more recent event-driven serverless service. GCF is great if you want to execute code on an event-driven basis (for example, sending a confirmation email after a user registers).
In terms of complexity and control over your code's environment I would order the different Google services as:
GCE(Compute Engine) > GKE(Kubernetes Engine) > GAE(App Engine) > GCF(Cloud Functions)
One point to consider is that the more low-level you go the easier it is to migrate your service to another platform.
Given that you seem to be deploying only containerized applications, I would recommend giving GKE a try, especially if you want to have a cluster of multiple services that interact with each other.
In terms of scaling, GAE will scale only VM instances and you have only one app per VM instance.
In GKE you have two types of scaling: container scaling and VM instance scaling. You can have multiple containers in one instance and those containers can be different apps. Based on limits you define (such as the CPU used in an app) GKE will try to efficiently allocate the containers across the instances of your cluster.
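To make the difference concrete, here is a toy first-fit placement in Python. It is emphatically not the real Kubernetes scheduler, just an illustration of several containers from different apps sharing one instance, with a new instance only being added when nothing fits. The container names, CPU figures and the 2-CPU node size are invented for the example.

```python
from typing import Dict, List


def first_fit(requests: Dict[str, float], node_cpu: float) -> List[Dict[str, float]]:
    """Place containers on nodes by declared CPU, adding a node only when none has room."""
    nodes: List[Dict[str, float]] = []
    # Largest containers first, which usually packs more tightly.
    for name, cpu in sorted(requests.items(), key=lambda kv: kv[1], reverse=True):
        if cpu > node_cpu:
            raise ValueError(f"{name} needs more CPU than a single node offers")
        for node in nodes:
            if sum(node.values()) + cpu <= node_cpu:
                node[name] = cpu  # container scaling: reuse an existing VM
                break
        else:
            nodes.append({name: cpu})  # VM scaling: no existing node had room
    return nodes


if __name__ == "__main__":
    # Hypothetical containers and their declared CPU needs.
    declared = {"product-1": 1.0, "product-2": 1.0, "customer-1": 0.5, "cart-1": 1.5}
    for i, node in enumerate(first_fit(declared, node_cpu=2.0)):
        print(f"node {i}: {node}")
```

On App Engine flex the equivalent picture would be one container per VM, so the four containers above would always occupy four instances.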
I'm considering setting up Prometheus monitoring stack using federation to handle issues with poor connectivity. My use case is the following:
I have N separate small setups on-prem, each consisting of several machines and docker containers running on them
Each of those setups is connected to the cloud, but connection is poor and can be lost sometimes
I need to have a single Prometheus instance in the cloud which would aggregate data from all those "small setups" on-prem
My idea was that local prom servers will scrape my jobs/machines etc and then using federation those metrics will land in the "central" prom instance.
Now, I'm not sure what will happen if I lose connectivity between cloud and on-prem. Am I going to download all samples from the local Prometheus servers once connectivity is back? Or will the central server never learn about what happened during such a network outage?
My question is related to microservices & service discovery of a service which is spread between several hosts.
The setup is as follows:
2 docker hosts (host A & host B)
a Consul server (service discovery)
Let’s say that I have 2 services:
service A
service B
Service B is deployed 10 times (with random ports): 5 times on host A and 5 times on host B.
When service A communicates with service B, for example, it sends a request to serviceB.example.com (hard coded).
In order to get an IP and a port, service A should query the Consul server for an SRV record.
It will get 10 ip:port pairs, for which the client should apply some load-balancing logic.
Is there a simpler way to handle this without me developing a client resolver (+LB) library for it?
Is there anything like that already implemented somewhere?
Am I doing it all wrong?
There are a few options:
Load balance on the client, as you suggest, for which you'll need to either find a ready-built service discovery library that works with SRV records and handles load balancing and circuit breaking, or build your own. Another answer suggested Netflix's Ribbon, which I have not used and which will only be interesting if you are on the JVM. Note that if you are building your own, you might find it simpler to just use Consul's HTTP API for discovering services rather than DNS SRV records. That way you can "watch" for changes too, rather than caching the list and letting it get stale.
If you don't want to reinvent that particular wheel, another popular and simple option is to use an HAProxy instance as the load balancer. You can integrate it with Consul via consul-template, which will automatically watch for new/failed instances of your services and update the LB config. HAProxy then provides robust load balancing and health checking with a lot of options (HTTP/TCP, different balancing algorithms, etc). One possible setup is to have a local HAProxy instance on each Docker host and a fixed port assigned statically to each logical service (you can store it in Consul KV), so you connect to localhost:1234 for service A, for example, and localhost:2345 for service B. A local instance means you don't pay for an extra round trip to a load balancer instance and then on to the actual service instance, but this might not be an issue for you.
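For the first option, a minimal client-side sketch in Python might look like the following. It assumes a Consul agent reachable at `http://localhost:8500` and uses the `/v1/health/service/<name>?passing` endpoint of Consul's HTTP API; the service name, addresses and ports in the sample are invented, and plain round-robin stands in for whatever balancing logic you actually need (there is no circuit breaking here).

```python
import itertools
import json
import urllib.request
from typing import Iterator, List, Tuple

Endpoint = Tuple[str, int]


def parse_health_response(body: str) -> List[Endpoint]:
    """Extract (address, port) pairs from a /v1/health/service/<name> JSON response."""
    endpoints = []
    for entry in json.loads(body):
        service = entry["Service"]
        # Service.Address can be empty, in which case the node's address applies.
        address = service.get("Address") or entry["Node"]["Address"]
        endpoints.append((address, service["Port"]))
    return endpoints


def fetch_endpoints(service: str, consul: str = "http://localhost:8500") -> List[Endpoint]:
    """Ask Consul for the instances of a service that currently pass health checks."""
    url = f"{consul}/v1/health/service/{service}?passing"
    with urllib.request.urlopen(url, timeout=2) as resp:
        return parse_health_response(resp.read().decode("utf-8"))


def round_robin(endpoints: List[Endpoint]) -> Iterator[Endpoint]:
    """Cycle through the known endpoints forever."""
    if not endpoints:
        raise ValueError("no healthy endpoints available")
    return itertools.cycle(endpoints)


if __name__ == "__main__":
    # Made-up sample response so the demo runs without a Consul agent.
    sample = json.dumps([
        {"Node": {"Address": "10.0.0.1"}, "Service": {"Address": "", "Port": 32768}},
        {"Node": {"Address": "10.0.0.2"}, "Service": {"Address": "10.0.0.22", "Port": 32769}},
    ])
    picker = round_robin(parse_health_response(sample))
    for _ in range(3):
        print(next(picker))
```

This sketch does not implement the "watch" behaviour: a real client would re-fetch the list periodically or use Consul's blocking queries so that failed instances drop out of rotation.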
I suggest you check out Kontena. It solves this kind of problem out of the box. Every service gets an internal DNS name that you can use for communication between services. Kontena also has a built-in load balancer that is very easy to use, which makes it very easy to create and scale microservices.
There are also lots of built-in features that help with developing containerized applications, like a private image registry, VPN access to running services, secrets management, stateful services, etc.
Kontena is an open source project and the code is available on GitHub.
If you are looking for a minimal setup, you can wrap the values you receive from Consul with Ribbon, Netflix's client-based load balancer.
You will find it as a module for Spring Cloud.
I didn't find an up-to-date standalone example, only this link to chrisgray's dropwizard-consul implementation that is using it in a Dropwizard context. But it might serve as a starting point for you.