We have a service that picks up custom tests in XML and converts them to CodedUI tests. We then start an MSTest process to load the tests into the Test Controller, which distributes them across various Agents. We run regression tests at night, so no one is around to fix a system if something goes wrong. When certain exceptions occur in the program under test, an error window pops open and no more tests can run on that system. Subsequent tests are loaded onto the agent and fail immediately because they cannot perform their assigned tasks. Thousands of tests that should take all night across multiple systems now fail in minutes.
We can detect that an error occurred by how quickly a test is returned, but we don't know how to disable the agent so that it stops picking up further tests.
addendum:
If a test has failed so badly that no subsequent test can run successfully (as noted, we may not have an action to handle some new, unexpected popup), then we want to disable that agent, because no more tests should run on it: they will all fail. Since we have many agents running concurrently, if one fails (and gets disabled), the load can still be distributed without a long string of failures. The remaining regression tests still have a chance to succeed (everything works) or fail (did we miss another popup, or is this an actual regression failure?).
2000 failures in 20 seconds don't tell us anything except that 1 system had a problem no one anticipated, and now we've wasted a whole night of tests. 2 failures (1 natural, 1 caused by fallout from the previous failure) and 1 system down means the total night's run might be extended by an hour or two, and we have useful data on how to start the day: fix 1 test and rerun both failures.
One would need to abort the test run in that case. If you are running MSTest yourself, you would need to inject a Ctrl+C into the command-line process. But if no one is around to fix it, why does it matter that the subsequent tests fail? If it's just to quickly see which test caused the error, why not add a CodedUI check for the message box and mark the test inconclusive with Assert.Inconclusive? The causing test would stand out like a flag.
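A minimal sketch of that check, assuming the stray dialog is a plain window whose title you know (the "Unexpected Error" title and the helper name here are placeholders):

// Call this at the start of each test (e.g. from [TestInitialize]).
// If a leftover error dialog from an earlier test is still on screen,
// mark the current test inconclusive instead of letting it fail.
using Microsoft.VisualStudio.TestTools.UITesting.WinControls;
using Microsoft.VisualStudio.TestTools.UnitTesting;

public static class ErrorWindowGuard
{
    public static void SkipIfErrorWindowPresent()
    {
        var errorWindow = new WinWindow();
        // Placeholder title; use whatever your application's crash dialog shows.
        errorWindow.SearchProperties[WinWindow.PropertyNames.Name] = "Unexpected Error";

        if (errorWindow.TryFind())
        {
            Assert.Inconclusive("Error window left over from a previous test; skipping.");
        }
    }
}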
If you can detect the point at which you want to disable the agent, you can disable it by running "TestAgentConfig.exe delete", which resets the agent to an unconfigured state.
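For example, a small helper on the agent machine could shell out to that command once the fast-fail pattern is detected. This is only a sketch; the install path is an assumption and depends on your Visual Studio Agents version:

using System.Diagnostics;

static class AgentDisabler
{
    // Assumed install location; adjust to wherever TestAgentConfig.exe lives on your agents.
    const string ConfigExe =
        @"C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE\TestAgentConfig.exe";

    public static void DisableLocalAgent()
    {
        var psi = new ProcessStartInfo(ConfigExe, "delete") { UseShellExecute = false };
        using (var process = Process.Start(psi))
        {
            process.WaitForExit();
        }
    }
}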
Context
I have a Kubeflow pipeline running multiple stages with Python scripts. In one of the inner stages, I use kfp.dsl.ParallelFor to run 5-6 deep learning models, and in the next stage I choose the best one with respect to a validation metric.
Problem
The issue is that if one of the models fails, the whole pipeline fails: it complains that the dependencies of the next stage are not satisfied. However, if model A fails while model B is still running, the pipeline state stays Running for as long as model B runs, and only changes once all models in that stage have finished.
Question
How can I allow partial failures in a stage? As long as at least one of the models succeeds, the next stage can work. How do I make that happen in Kubeflow? For comparison, the CI I have set up in GitLab supports this.
If that is not possible, I want the pipeline to fail immediately as soon as one model fails, rather than wait for the others only to fail later, which can be much later depending on the configuration.
Obviously, one way to avoid the failure is to put a top-level try/except in the Python script so it always returns exit code 0. However, then there is no visual indication that one (or more) models failed. It can be recovered from the logs, but logs are rarely monitored for a scheduled pipeline whose overall run status is successful.
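A rough sketch of that workaround, assuming each ParallelFor branch runs a script like the one below (train_model and the output file name are placeholders for your own component code):

import json
import sys
import traceback


def train_model():
    # Placeholder for the real training code; replace with your own.
    return 0.0


def main():
    result = {"status": "succeeded", "metric": None}
    try:
        result["metric"] = train_model()
    except Exception:
        # Keep the stack trace in the logs and in the output artifact,
        # but do not propagate the exception, so the exit code stays 0.
        result["status"] = "failed"
        result["error"] = traceback.format_exc()

    with open("result.json", "w") as f:
        json.dump(result, f)

    # Always exit 0 so the next stage's dependencies are satisfied;
    # the selection stage must then ignore entries with status == "failed".
    sys.exit(0)


if __name__ == "__main__":
    main()

The selection stage then has to filter out results with status == "failed" and fail itself only when every branch failed.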
I'm trying to improve some of the testing procedures at work, and since I'm not an expert on Jenkins, I was hoping you could point me in the right direction.
Our current situation is like this: we have a huge suite of E2E tests that run regularly. These tests rely on a pool of limited resources (AWS VMs that are used to run each test). We have two test suites: a full-blown regression that consumes, at its peak, about 80% of those resources, and a much lighter smoke run that uses only 15% or so.
Right now I'm using the Lockable Resources plugin. When the Test Run step comes, it checks whether you are running a regression, and if you are, it requests the single lock. If the lock is available, all good; if not, it waits until it becomes available before continuing. This lets me make sure that no more than one regression is running at any time, but it has a lot of gaps. For example, a regression could be running while several smoke runs get triggered, which would exhaust the resource pool.
What I would like, in the best case, is some sort of conditional rules that decide whether the test execution step can go forward, based on something like this:
Only 1 regression can be running at the same time.
If a regression is running allow only 1 smoke run to be run in parallel.
If no regression is running then allow up to 5 or 6 smoke tests.
If 2 or more smoke tests are running do not allow a regression to launch.
Would something like that be possible from a Jenkins pipeline? In this case I'm using a declarative pipeline with a bunch of helper Groovy code I've put together over time. My first idea is to see if there's a way to check whether a lockable resource is available (without actually requesting it yet) and then go through a bunch of if/then/else to set up the logic, as in the sketch below. But again, I'm not sure if there's a way to check a lockable resource's state or how many of a kind have already been requested.
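For reference, something along these lines can peek at resource state from a pipeline script, but it relies on the plugin's internals rather than a supported API, needs script approval in a sandboxed pipeline, and the exact method names may vary between plugin versions (treat this as an assumption to verify):

import org.jenkins.plugins.lockableresources.LockableResourcesManager

// Count resources carrying a given label that are neither locked nor reserved.
def countFreeResources(String label) {
    def manager = LockableResourcesManager.get()
    def labelled = manager.resources.findAll { it.labels?.contains(label) }
    return labelled.count { !it.isLocked() && !it.isReserved() }
}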
Honestly, something this complex may be outside of what Jenkins is meant to handle, but I'm not sure and figured asking here would be a good start.
Thanks!
Create a declarative pipeline with steps that build individual jobs. Don't allow people to run the jobs ad-hoc, or when changes are pushed to the repository, and force a build schedule.
How can this solve your issue:
Only 1 regression can be running at the same time.
Put all these jobs in sequence in a declarative pipeline.
If a regression is running allow only 1 smoke run to be run in parallel.
Put smoke tests that are related to the regression test in sequence, just after the regression build, but run the smoke tests in parallel, prior to the next regression build.
If no regression is running then allow up to 5 or 6 smoke tests.
See previous
If 2 or more smoke tests are running do not allow a regression to launch.
It will never happen if you run things in sequence.
Here is an ugly picture explaining what I am talking about.
You can manually create the pipeline, or use the coolness of Blue Ocean, which gives you a graphical interface to arrange the steps in sequence or in parallel:
https://jenkins.io/doc/tutorials/create-a-pipeline-in-blue-ocean/
The downside is that if one of those jobs fails, it will stop the build, but that is not necessarily a bad thing if the jobs are highly correlated.
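A minimal declarative-pipeline sketch of the sequence-then-parallel layout described above (job names, stage names, and the cron schedule are all placeholders):

pipeline {
    agent any
    triggers { cron('H 0 * * *') }   // forced schedule instead of ad-hoc runs
    stages {
        stage('Regression') {
            steps {
                build job: 'regression-suite'
            }
        }
        stage('Smoke') {
            parallel {
                stage('Smoke A') { steps { build job: 'smoke-a' } }
                stage('Smoke B') { steps { build job: 'smoke-b' } }
            }
        }
    }
}

Because the smoke stages sit between one regression run and the next, no more than one regression and its related smoke runs can ever be active at the same time.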
Completely forgot to update this, but after reading and experimenting a bit more with the Lockable Resources plugin, I found out you can have several resources under the same label and request a set quantity whenever a specific job starts.
I defined 5 resources and set up the Jenkinsfile to check whether the test suite is running with the regression parameter. A full regression tries to request 4 locks, while a smoke test requests only 1. This way, when there aren't enough locks available, the job waits until either enough become available or the timeout expires.
Here's a snippet from my Jenkinsfile:
stage('Test') {
    steps {
        lock(resource: null, label: 'server-farm-tokens', quantity: getQuantityBySuiteType()) {
            <<whatever I do to run my tests here>>
        }
    }
}
resource has to be null due to a bug in Jenkins' declarative pipeline. If you're using the scripted pipeline you can ignore that parameter.
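The getQuantityBySuiteType() helper isn't shown in the snippet; a rough sketch of it might look like this (the SUITE_TYPE parameter name and the lock counts are assumptions about how the job flags a regression):

// Returns how many 'server-farm-tokens' locks to request for this run.
def getQuantityBySuiteType() {
    return params.SUITE_TYPE == 'regression' ? 4 : 1
}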
We have a max execution time set for tests, but frankly this option is about as much use as a chocolate teapot.
When the execution time exceeds this limit, the whole build fails and all subsequent steps are aborted, so the "Publish Test Results" step never executes, and you get absolutely no information to help you work out WHY it exceeded the timeout period.
Can anyone suggest an alternative?
I was thinking of maybe trying to implement the timeout as part of the test code itself - does anyone know if this is possible? If I launch a thread that monitors for the timeout, and if it is hit, then...?
Could I just have the test terminate its own process?
Build job timeout in minutes
Specify the maximum time a build job is allowed to execute on an agent before being canceled by the server. Leave it empty or at zero if you want the job to never be canceled by the server.
This timeout covers all tasks in your build definition. If the test step runs out of time, the whole build definition is canceled by the server, so of course the whole build fails and all subsequent steps are aborted.
Given your requirement, I suggest leaving this value at 0 and setting "Continue on error" for the test step, as the comment suggested. With this, if the step fails, all following steps still execute and the step/overall build partially succeeds, so you get the information you need to troubleshoot the failed task.
If your test does not itself judge whether the execution timed out, another option is to create a custom build task that collects the execution time of your test task (by reading the build log) and then passes or fails that custom step using Logging Commands.
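A small sketch of that idea as a PowerShell script step; the threshold and the way the elapsed time is obtained are placeholders, and only the ##vso logging-command syntax is the actual mechanism:

$elapsedMinutes = 45     # replace with the value parsed from the build log
$limitMinutes   = 30     # your own threshold

if ($elapsedMinutes -gt $limitMinutes) {
    Write-Host "##vso[task.logissue type=warning]Tests took $elapsedMinutes min, over the $limitMinutes min limit."
    Write-Host "##vso[task.complete result=SucceededWithIssues;]Test step exceeded the time limit."
}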
I'm writing a few tests in ExUnit to illustrate how different Supervisor strategies work. I had planned to test results by intentionally causing spawned processes to fail, and then testing the restarted processes' output.
Thus far I have been unsuccessful in creating passing tests, as the initial process failure causes the test to fail. I have tried capturing the errors (try/catch) in both the Supervisor/GenServer implementation and the test implementation, but I have not been able to capture any of them or avoid the test failure.
Is there any way to capture these errors so that they do not trigger a test failure?
Is there a better/different means of testing different supervisor strategies?
Thanks!
You need to be careful with your links. When you start a supervisor, it is linked to the current process, so if you crash the supervisor (or any other linked process), it will cause the test to crash too.
You can change this behaviour by setting Process.flag(:trap_exit, true); then links won't trigger crashes and instead you will find messages of the format {:EXIT, pid, reason} in your mailbox.
This is a fine approach for testing, but for production or in general you likely want to set up some sort of monitor.
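A minimal ExUnit sketch of that, using a deliberately trivial crash (the spawned function is just a stand-in for a supervised process dying):

defmodule SupervisorCrashTest do
  use ExUnit.Case

  test "a linked crash arrives as a message instead of killing the test" do
    Process.flag(:trap_exit, true)

    pid = spawn_link(fn -> exit(:boom) end)

    # Because we trap exits, the test process survives and the exit
    # shows up in the mailbox instead.
    assert_receive {:EXIT, ^pid, :boom}
  end
end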
Because I was intentionally causing a process to fail and wanted to ignore this failure within the ExUnit test, I ended up using catch_exit/1 to prevent the test process from failing.
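A small sketch of the catch_exit/1 approach; the Agent here is just a stand-in for whatever process is being crashed on purpose:

defmodule CatchExitTest do
  use ExUnit.Case

  test "the exit from calling a dead process doesn't fail the test" do
    {:ok, pid} = Agent.start(fn -> :ok end)
    :ok = Agent.stop(pid)

    # Calling a stopped process exits the caller; catch_exit/1 turns that
    # exit into a value the test can inspect instead of a failure.
    assert {:noproc, _} = catch_exit(Agent.get(pid, & &1))
  end
end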
We used to run our Grails integration tests against an in-memory HSQLDB database, but at the failure point it was difficult to investigate because the data was lost. We migrated to running the tests against a physical database (Postgres), and all is well when the tests pass. If a test fails at any point, we want the data committed to the database for post-mortem analysis of why the test failed.
To summarize, we want the tests to run in rollback mode as long as they pass, so that one test doesn't affect another, and on the first test failure, commit the data at that point and stop.
We spend a considerable amount of time investigating integration test failures and would like to know whether Grails has any option to stop at the first integration test failure with the data preserved in the database for investigation. I searched a little and didn't find any suitable pointers. If you follow any other practice for troubleshooting integration tests that is worth sharing, please let us know.
Simple hack you could try:
Set a global flag on failure and check the flag in each test; if the flag is set, skip the test (a rough sketch follows).
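A rough Groovy sketch of that hack using JUnit 4 constructs, assuming your integration tests can share a common base class (all class and field names here are made up):

import org.junit.Assume
import org.junit.Before
import org.junit.Rule
import org.junit.rules.TestWatcher
import org.junit.runner.Description

class FailFastState {
    static volatile boolean failureSeen = false
}

abstract class FailFastIntegrationTests {

    // Records the first failure in a static flag shared by all tests.
    @Rule
    public TestWatcher failureFlag = new TestWatcher() {
        @Override
        protected void failed(Throwable e, Description description) {
            FailFastState.failureSeen = true
        }
    }

    // Once the flag is set, later tests are skipped (reported as assumption
    // failures) instead of running and producing spurious failures.
    @Before
    void bailOutIfAnythingFailed() {
        Assume.assumeFalse(FailFastState.failureSeen)
    }
}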
Recently I came across the Grails Guard Plugin and I think it can be useful in this case because, besides running integration tests faster, it preserves the data saved after the tests run.