Running Camunda in clustered environment locks further deployment - docker

I have a problem with my clustered Camunda environment. I am trying to run multiple Camunda instances on my OpenShift cluster, all of them connected to a single Oracle DB instance.
The deployment of the first instance works as expected. However, as soon as I scale the pods to e.g. 3 instances, at least one of them fails and remains stuck on the following output:
{"timestamp":"2020-07-15 14:04:39.503","level":"DEBUG","thread":"main","logger":"org.camunda.bpm.engine.cmd","message":"ENGINE-13009 opening new command context","context":"default"}
{"timestamp":"2020-07-15 14:01:00.741","level":"DEBUG","thread":"main","logger":"org.camunda.bpm.engine.impl.persistence.entity.PropertyEntity.lockDeploymentLockProperty","message":"==> Preparing: SELECT VALUE_ FROM ACT_GE_PROPERTY WHERE NAME_ = 'deployment.lock' for update ","context":"default"}
{"timestamp":"2020-07-15 14:01:00.748","level":"DEBUG","thread":"main","logger":"org.camunda.bpm.engine.impl.persistence.entity.PropertyEntity.lockDeploymentLockProperty","message":"==> Parameters: ","context":"default"}
As the logs show, it has something to do with the locking of process deployments. After further investigation I came across this article in the official Camunda documentation:
https://docs.camunda.org/manual/7.13/user-guide/process-engine/deployments/
I have also seen the corresponding lock entries in the database.
Problem: I understand why the deployments are locked, but the lock remains there forever and never gets released. I would appreciate any help!

Are you using autodeployment?! The mentioned article describes a weird situation where multiple nodes try to deploy the same resources. In my opinion, this should only happen when each node tries to autodeploy the resources itself.
An explicit deployment (performed after the nodes are started) should be executed on a single node only.
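For example, assuming the REST API is exposed (the host, deployment name, and BPMN file below are placeholders), a one-off explicit deployment against a single node could look roughly like this:
curl -F "deployment-name=my-deployment" \
     -F "enable-duplicate-filtering=true" \
     -F "data=@my-process.bpmn" \
     http://camunda-node-1:8080/engine-rest/deployment/create
With enable-duplicate-filtering set, re-running the same command should not create a new deployment if the resources are unchanged.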
KR, Joachim

Related

Configuration Item already exists [XL Deploy]

I'm trying to deploy a DAR to a server using XL Deploy, and I get the following error message:
A Configuration Item with ID [CONFIGURATION_ITEM_ID] already exists.
How can I fix it?
You may have to be a little bit more specific... do you get the error before the deployment even starts?
If so, it may be because your application tries to deploy an artefact with the same name to the same container as another one, which results in children with the same ID. You'll have to use distinct names for your deployables across different applications, or deploy to different containers in your infrastructure.

CF - Working with Apps and changing env variables

I'm new to this technology (studying it for now) and I have a question about how to properly work with apps.
For example, I have an app-gateway with 2 instances (Traefik), and in it I'm using env variables like RESTRICTED_NETWORK_01 and RESTRICTED_NETWORK_02.
I want to replace the IP value of RESTRICTED_NETWORK_02 and apply the change without any impact on the gateway/redirect service for users.
Should I just use the command:
cf set-env app-gateway RESTRICTED_NETWORK_02 [ipvalue]/24
then
cf restart-app-instance app-gateway 0
wait until the instance restarts, and then
cf restart-app-instance app-gateway 1
Are these the right steps to follow in this situation?
Any help?
cf restart-app-instance app-gateway 0
This will probably not do what you want, at least not reliably. This command works by terminating your application instance, which then allows it to be restarted. It's typically useful if you have an app instance that's in a bad state but hasn't crashed or failed health checks (or for demos where you need to make an app crash).
In terms of env variables, it doesn't work so well. This is because the env variable changes you've made reside in Cloud Controller, but your application runs on the scheduler (Diego). The scheduler has a copy of data from the Cloud Controller that tells it how to run the app, including environment variables that should be set. The command you're referencing doesn't result in Cloud Controller updating the application definition that it has sent to the scheduler (Diego). It simply results in the process being terminated and restarted by the scheduler using the same definition.
Your mileage may vary on this one, as I've heard some reports of people saying that cf restart-app-instance works for updating env variables (I don't really see how it would, but I haven't investigated their claims either). You could try it and see, but the stories I've seen have typically ended with problems. tl;dr: it doesn't consistently work.
To reliably update env variables, you need to use cf restart. This will restart all application instances.
What you should do to restart app instances without downtime is use the cf restart --strategy=rolling option (available in version 7 of the cf CLI). This will roll out the changes and, as long as you have multiple application instances, should not result in downtime.
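Put together for your example (assuming cf CLI v7; the IP value is a placeholder, as in your question), the sequence would look something like:
cf set-env app-gateway RESTRICTED_NETWORK_02 [ipvalue]/24
cf restart app-gateway --strategy rolling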

Unable to acquire Singleton lock in Azure WebJob after removing SingletonAttribute

I have a WebJob containing functions with a TimerTriggerAttribute. They also have a SingletonAttribute on them, so they don't get executed in parallel.
Deployed in the App Service, everything works as expected.
When running it locally, it worked as expected for a while, and then the output of the job host reported:
Development settings applied
Found the following functions:
MyNamespace.WebJobs.Functions.MyFunctionAsync
Unable to acquire Singleton lock (bd99766f202e4ac4ba230557b7180bd2/MyNamespace.WebJobs.Functions.MyFunctionAsync.Listener).
I removed the SingletonAttribute, but the message kept showing up. I also rebuilt the project and restarted the machine. Nothing helped.
I renamed the function, and it was again scheduled as expected. Now I could also re-add the SingletonAttribute and everything worked as expected. When I renamed it back to the original, the error came back.
What can I do in order to be able to acquire the lock again from the Job host?
By reading this article, I figured out that on top of the SingletonAttribute I set explicitly, the TimerTriggerAttribute uses another singleton lock behind the scenes. Thank you, Janley Zhang.
The ID of the host is, by default, constant across deployments. So when multiple developers who use the same storage account run the code locally, they all try to acquire the same blob lease, and only the first one succeeds.
I ended up using a distinct host ID for each developer, like this:
var jobHostConfiguration = new JobHostConfiguration();
if (jobHostConfiguration.IsDevelopment)
{
    jobHostConfiguration.UseDevelopmentSettings();
    // give each developer machine its own host ID so they don't compete for the same blob lease
    jobHostConfiguration.HostId = Environment.UserName.ToLowerInvariant();
}
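If you want to see which singleton locks currently exist (for example, to confirm that two hosts are competing for the same lease), the lease blobs should live in the azure-webjobs-hosts container of the job's storage account, under a locks/ prefix (paths from memory; the account name below is a placeholder):
az storage blob list --account-name mystorageaccount --container-name azure-webjobs-hosts --prefix locks/ --output table
If a lease is genuinely stale, az storage blob lease break on the corresponding blob should release it.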

Gitlab Pages delivers random content

I am experiencing weird behavior with the Pages feature of the GitLab Omnibus package running on an Ubuntu 16.04 virtual machine. Some projects use Pages with Jekyll sites built by GitLab CI, which has been working as expected since the feature was first published with GitLab CE.
For a couple of days now, visiting any of the homepages of those sites shows the content of just one of the projects. Each of them should of course show different content, but they all show the same one. Even stranger: the content shown on each of the sites changes over time to that of one of the other projects, and I cannot tell whether this is deterministic.
Restarting the build processes of each of the projects did not fix this; neither did gitlab-ctl reconfigure, stop and start, nor rebooting the entire VM.
To investigate the issue, I edited what I assume is the resulting file of the build process at /var/opt/gitlab/gitlab-rails/shared/pages/www/www.domain.org/public/index.html. The edits did not show up on the webpage right away, but only later on, during the "rotation" of content described above.
So what is going on there? Is this some caching issue? Is it a misconfiguration? Is it a bug? Please help me find and fix the problem, as these are production websites.
Looks like this is actually an issue

WSS caches old Workflow version

I'm currently developing three workflows that are supposed to handle the status of items in different lists.
Each Workflow is attached to a separate list.
When I deploy and debug in my development environment, everything works fine,
except for the case when an item is created via an incoming mail.
I already figured out that I have to restart some services and then it will work, but I'm still not sure which of the services is caching the workflow.
Afterwards, I build a .wsp file, which I deploy to a server.
Each time I deploy the solution, I retract and delete the solution first.
After deployment, I recreate the workflows on the lists.
It seems to me that this has no effect: an older version of the workflow is still triggered if I create a new item in the list.
I already restarted the whole server, still with no result.
Has anyone an idea what else I could try in order to get this working?
Thanks in advance.
If the Timer Service is the one that calls your code, then restart the Windows SharePoint Services Timer (OWSTIMER.EXE).
When a workflow waits on something, it gets serialized (dehydrated). When the event happens, OWSTIMER.EXE deserializes (rehydrates) it and continues the workflow execution.
So the timer is the one that wakes the workflow up.
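For completeness, restarting the timer service from an elevated command prompt on the server looks like this (on WSS 3.0 the service name should be SPTimerV3, i.e. "Windows SharePoint Services Timer"):
net stop SPTimerV3
net start SPTimerV3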
So this problem kind of resolved itself.
I was reading an article on Kirk Evans' blog about an issue with the development of workflows in VS2008 for WSS.
I had not realized that I still had an illegal reference in my project properties.
I removed the reference. The second thing I tried was deploying with -upgradesolution rather than doing a retract-delete-add-deploy...
I don't know which of the two did the trick, but I can finally see the new workflows kicking in.
Thanks for your help.
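For reference, the -upgradesolution deployment I switched to looks roughly like this (the solution name is a placeholder; run it where stsadm is on the PATH):
stsadm -o upgradesolution -name MyWorkflows.wsp -filename MyWorkflows.wsp -immediate -allowgacdeployment
stsadm -o execadmsvcjobs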
