Exporting Cypress memory and CPU usage to an external API

Use Case
I am using Cypress on multiple servers to run tests. While the Cypress tests are running, I would like to record the CPU/memory used by each instance of Cypress.
Cypress provides a debug flag to log CPU/memory usage to the console: https://docs.cypress.io/guides/references/troubleshooting#Log-memory-and-CPU-usage
DEBUG=cypress:server:util:process_profiler cypress run
Is there any way to send this information to an external API? I want to record the CPU and memory usage of each server.
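Since the profiler output is ordinary debug logging, one approach is to wrap cypress run in a small script that captures that output and forwards the profiler lines to whatever API you use. A minimal sketch in Python, assuming the requests library is available and a hypothetical collection endpoint https://metrics.example.com/cypress (replace with your own):

import os
import subprocess
import requests  # assumed HTTP client; any other works

METRICS_URL = "https://metrics.example.com/cypress"  # hypothetical endpoint
SERVER_NAME = os.environ.get("HOSTNAME", "unknown")

# Enable the process profiler, exactly as in the Cypress docs.
env = dict(os.environ, DEBUG="cypress:server:util:process_profiler")

# The debug output normally goes to stderr, so merge it into stdout.
proc = subprocess.Popen(
    ["npx", "cypress", "run"],
    env=env,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    text=True,
)

for line in proc.stdout:
    print(line, end="")  # keep the normal console output visible
    if "process_profiler" in line:  # forward only the profiler lines
        try:
            requests.post(METRICS_URL, json={"server": SERVER_NAME, "line": line.strip()}, timeout=5)
        except requests.RequestException:
            pass  # don't fail the test run because metrics couldn't be sent

proc.wait()
raise SystemExit(proc.returncode)

The same idea works as a shell pipe or a Node wrapper; the point is that the profiler lines are plain log output you can parse per server and ship to any collector.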

Related

ML serving service architecture with Docker

I am in the early stages of developing an image segmentation service. Currently, I have a simple Flask server that is responsible for receiving data and running a Docker container with an AI model on the local GPU server. But I am also considering something asynchronous like FastAPI or Node.js to implement a scheduler for prediction tasks. Which is better: a) the server calls the Docker container over ssh, and the container runs only when called (predicts the images, saves the results, and stops), or b) running an API server inside the AI container? Each container is around 5-10 GB. Running all containers looks more expensive, but I am not sure which practice is better.
I tried to call the container each time and stop it after work was done.
You should avoid approaches based on dynamically starting containers and approaches based on ssh. I'd recommend a long-running process that accepts some network input, like your existing Flask server, and either always has the ML model running or launches it as a subprocess.
If you can use a subprocess, that could be a good match here. When the subprocess exits, all of its memory resources will be automatically cleaned up, so you won't pay the cost of the subprocess when it's not being used. If the container happens to exit, the subprocess will be cleaned up with it. Subprocesses are also basic Unix functionality, so you can develop your service locally without needing any particularly complex setup.
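As a rough illustration of that subprocess approach (not your actual code: run_model.py, the file paths, and the timeout below are placeholders), the Flask handler could look something like this:

import subprocess
import uuid
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    job_id = uuid.uuid4().hex
    input_path = f"/data/in/{job_id}.png"
    output_path = f"/data/out/{job_id}.png"
    request.files["image"].save(input_path)

    # Run the model as a subprocess; its memory is reclaimed when it exits.
    result = subprocess.run(
        ["python", "run_model.py", "--input", input_path, "--output", output_path],
        capture_output=True,
        text=True,
        timeout=600,
    )
    if result.returncode != 0:
        return jsonify({"error": result.stderr[-2000:]}), 500
    return jsonify({"job_id": job_id, "result": output_path})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)

If model start-up time dominates each request, the other variant (keeping the model loaded in the long-running process) trades idle memory for lower latency.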
Dynamically launching containers comes with many challenges. It ties your application to the Docker API, which will make it harder to run, even in local development. Using that API grants unrestricted root-level access to the host system (you can very easily run a container that compromises the host). You need to remember to clean up after your own containers. The setup may not work in other container systems like Kubernetes that don't make a Docker socket available.
An ssh-based system presents different complexities. You need to distribute credentials to various places. If you're trying to run an ssh daemon inside a Docker container, that is difficult to configure securely (what creates the host keys? how do you provision users and private keys?). You also need to think about various failure cases around the ssh transport that might not be present in a purely-local system.

How to configure Cypress in Docker-Compose project

Context
I am trying to configure Cypress as an E2E test runner in my current employer's codebase, and will utilize this for Snapshot tests at a later point (TBD based on our experience w/ Cypress). We currently utilize docker-compose to manage our Frontend (FE), Backend (BE) and Database (DB) services (images).
FE tech-stack
NextJS and React, Recoil and yarn pkg manager
Problem
I am having a difficult time configuring Cypress; here is a list of things that are hindering this effort:
Am I supposed to run my E2E tests in their own Docker service/image, separate from the FE image?
I am having a tough time getting Cypress to run in my Docker container on my M1 Mac due to CPU architecture issues (the Docker image uses the Linux x64 architecture, which fails when I try to run Cypress on my Mac but works fine when I run it in the cloud on a Debian box). This is a known issue with Cypress.
There is a workaround: Cypress works when installed globally on the local machine itself, outside the container. So instead of running the tests inside the container (the ideal), I'm having to run them locally from the FE directory root.
If I need to run snapshot tests with Cypress, do I need to configure them separately from my E2E tests and place that suite of tests within my FE image, since I will need the FE components to be mounted for Cypress to test them?
Goal
The goal here is to configure Cypress in a way that works INSIDE the Docker container, both in the cloud (CI/CD and Production/Staging) and on local M1 Mac machines. Furthermore (this is a good-to-have, not necessary), have Cypress live in a place where it can be used for both snapshot and E2E tests, within docker-compose.
Any help, advice or links are appreciated, I'm a bit out of my depth here. Thanks!

How to do UAT on our containerized Spring-batch application

We have a Spring Batch (3.0.8) application using DB2 for data persistence.
We have built Docker image of the application and are trying to figure out how to test it using Jenkins pipeline. We launch the application using CommandLineJobRunner in a format similar to this:
docker run -v /home/bluecost/config:/home/bluecost/config -v /home/bluecost/data:/home/bluecost/data -v /home/bluecost/logs:/home/bluecost/logs bluecost com.mycomp.cloud.cost.LoadBMSData CommandLineJobRunner load-bms-job-xml LoadBMSJob ../data/input/CSVMapping-Mar2018.csv
The results of the job are recorded in the DB2 database table. I'm having trouble figuring out how to test this containerized application that doesn't expose a RESTful interface to the outside world.
The goal of the testing is user acceptance. The test scenarios are written with Cucumber (Features > Scenarios > Tests). The testing must check the results of the job run (a DB table) against the expected outcomes.
Question: Do we have to write an integration layer around the jobs so we can launch them and retrieve the results using REST, or is there some other way?
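One possible shape for this, sketched purely as an illustration (Python's behave and ibm_db stand in for whichever Cucumber implementation and DB2 client you actually use; the table name, DSN, and step wording are placeholders): the step definitions can run the same docker run command the pipeline uses and then assert directly against the DB2 table, with no REST layer in between.

import shlex
import subprocess
import ibm_db  # assumed DB2 client; swap in your own
from behave import when, then

DOCKER_CMD = (
    "docker run "
    "-v /home/bluecost/config:/home/bluecost/config "
    "-v /home/bluecost/data:/home/bluecost/data "
    "-v /home/bluecost/logs:/home/bluecost/logs "
    "bluecost com.mycomp.cloud.cost.LoadBMSData CommandLineJobRunner "
    "load-bms-job-xml LoadBMSJob ../data/input/CSVMapping-Mar2018.csv"
)
DB2_DSN = "DATABASE=BLUECOST;HOSTNAME=db2host;PORT=50000;PROTOCOL=TCPIP;UID=user;PWD=secret"  # placeholder

@when("the LoadBMSJob batch job is run")
def step_run_job(context):
    # Drive the existing container exactly as the pipeline does.
    context.result = subprocess.run(shlex.split(DOCKER_CMD), capture_output=True, text=True)
    assert context.result.returncode == 0, context.result.stderr

@then("the BMS_DATA table contains {expected:d} rows")
def step_check_rows(context, expected):
    # Verify the job's outcome straight from the DB2 table.
    conn = ibm_db.connect(DB2_DSN, "", "")
    stmt = ibm_db.exec_immediate(conn, "SELECT COUNT(*) AS N FROM BMS_DATA")
    row = ibm_db.fetch_assoc(stmt)
    ibm_db.close(conn)
    assert row["N"] == expected, f"expected {expected} rows, found {row['N']}"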

Running Scrapy in a docker container

I am setting up a new application which I would like to package using docker-compose. Currently, in one container I have a Flask-Admin application which also exposes an API for interacting with the database. I will then have lots of scrapers that need to run once a day. These scrapers should scrape the data, reformat it, and then send it to the API. I expect I should have another Docker container running for the scrapers.
Currently, on my local machine I run scrapy runspider myspider.py to run each spider.
What would be the best way to have multiple scrapers in one container and have them scheduled to run at various points during the day?
You could configure the Docker container that holds the scrapers to use cron to fire off the spiders at the appropriate times. Here's an example: "Run a cron job with Docker".
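To keep several spiders in one container, the thing cron invokes can be a single driver script. A minimal sketch (run_spiders.py, the spider file names, and the crontab line in the comment are all placeholders):

# run_spiders.py -- a driver script the container's cron entry could invoke, e.g.:
#   0 3 * * * python /app/run_spiders.py >> /var/log/scrapers.log 2>&1
import subprocess
import sys

SPIDERS = [
    "spiders/myspider.py",
    "spiders/other_spider.py",
]

def main() -> int:
    failures = 0
    for spider in SPIDERS:
        # Same command you run locally, just driven from one place.
        result = subprocess.run(["scrapy", "runspider", spider])
        if result.returncode != 0:
            failures += 1
    return failures

if __name__ == "__main__":
    sys.exit(main())

Alternatively, each spider can get its own crontab line; the tradeoff is only where you want the schedule to live.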

How does Mesos/Marathon handle application data persistence?

I have been exploring the Mesos/Marathon framework for deploying applications. I am unsure how Marathon handles application files when an application is killed.
For example, we run Jenkins through Marathon; if the Jenkins server fails, it will be restarted by Marathon, but the previously defined jobs will be lost.
My question is: how can I ensure that when an application restarts, the old application jobs are still available?
Thanks.
As of right now Mesos/Marathon is great at supporting stateless applications, but support for stateful applications is increasing.
By default, task data is written into the sandbox and hence will be lost when a task fails or is restarted. Note that usually only a small percentage of tasks fail (e.g., only the tasks on a failed node).
Now let us have a look at different failure scenarios.
Recovering from slave process failures:
When only the Mesos slave process fails (or is upgraded), the framework can use slave checkpointing to reconnect to the running executors.
Executor failures (e.g. Jenkins process failures):
In this case the framework could persist its own metadata on some persistent medium and use it to restart. Note that this is highly application specific, and hence Mesos/Marathon cannot offer a generic way to do it (and I am actually not sure what that would look like in the case of Jenkins). Persistent data could be written to HDFS or Cassandra, or you could have a look at the concept of dynamic reservations.

Resources