juju destroy-service does not remove failed services

I deployed a service through Juju in my local environment, but it failed due to a bug in my charm. Now I want to remove this failed service from the environment. I executed three commands in the following order:
juju destroy-service servicename
juju destroy-unit servicename/0
juju destroy-machine machinename
The first two commands executed without any error, but the third command gave me an error like this (not exact, as I lost that console):
Failed to remove machine "machinename". It is associated with unit "servicename/0".
Now I cannot remove this machine through Juju. The related service is also still shown in the Juju UI and in the status command output.
Is the above order correct?
Is there any workaround to remove this service and its associated machine through Juju? If I remove the machine with lxc-destroy, it leaves Juju in an inconsistent state.

You won't be able to destroy the machine gracefully until the services and units associated with that machine have been removed. Running juju destroy-service serviceName will eventually destroy all of the associated units but will leave the machines around (by default). You can then follow up with juju destroy-machine machineName, which will destroy the machine.
If you are not able to destroy the machine using destroy-machine, you can append --force to force it to destroy. This simply sends the terminate command to the provider, so only use it as a last resort: juju destroy-machine machineName --force.
To find out whether a unit is still associated with the machine, run juju status in the console; once cleanup is finished it should only list empty machines.
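Putting that together, a typical cleanup run looks roughly like this (servicename and machinename are the placeholders from the question):
juju destroy-service servicename           # units are removed eventually, machines are kept
juju status                                # repeat until the machine no longer lists the unit
juju destroy-machine machinename           # succeeds once the machine is empty
juju destroy-machine machinename --force   # last resort: sends terminate straight to the provider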
Note: You will probably have better luck getting juju related questions answered if you ask on https://askubuntu.com/

Related

Rsyslog can't start inside of a docker container

I've got a Docker container running a service, and I need that service to send logs to rsyslog. It's an Ubuntu image running a set of services in the container. However, the rsyslog service cannot start inside this container, and I cannot determine why.
Running service rsyslog start (this image uses upstart, not systemd) returns only the output start: Job failed to start. There is no further information provided, even when I use --verbose.
Furthermore, there are no error logs from this failed startup process. Because rsyslog is the service that can't start, it's obviously not running, so nothing is getting logged. I'm not finding anything relevant in Upstart's logs either: /var/log/upstart/ only contains the logs of a few things that successfully started, plus dmesg.log, which simply contains dmesg: klogctl failed: Operation not permitted. From what I can tell that is due to a Docker limitation that cannot really be fixed, and it's unknown whether it's even related to the issue.
Here's the interesting bit: I have the exact same container running on a different host, and it's not suffering from this issue. Rsyslog is able to start and run in the container just fine on that host. So obviously the cause is some difference between the hosts. But I don't know where to begin with that: there are LOTS of differences between the hosts (the working one is my local Windows system, the failing one is a virtual machine running in a cloud environment), so I wouldn't even know which differences could cause this issue and which couldn't.
I've exhausted everything I know to check. My only option left is to come to Stack Overflow and ask for ideas.
Two questions here, really:
Is there any way to get more information out of the failure to start? start itself is a binary file, not a script, so I can't open it up and edit it. I'm reliant solely on the output of that command, and it's not logging anything anywhere useful.
What could possibly be different between these two hosts that could cause this issue? Are there any smoking guns or obvious candidates to check?
Regarding the container itself, unfortunately it's a container provided by a third party that I'm simply modifying. I can't really change anything fundamental about the container, such as the fact that its entrypoint is /sbin/init (which is a very bad practice for Docker containers, and is the root cause of all of my troubles). This is also causing some issues with the Docker logging driver, which is why I'm stuck using syslog as the logging solution instead.
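One thing worth trying (not something from the original post, just a hedged suggestion): bypass upstart entirely and run the daemon by hand inside the container, so any startup error ends up on your console instead of in a log that never gets written:
rsyslogd -N1    # config check only: parses the config, prints errors, and exits
rsyslogd -dn    # run in the foreground with debug output instead of via 'service rsyslog start'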

gitlab on kubernetes/docker: pipeline failing: Error cleaning up configmap: resource name may not be empty

We run gitlab-ee-12.10.12.0 under Docker and use Kubernetes to manage the gitlab-runner.
All of a sudden a couple of days ago, all my pipelines, in all my projects, stopped working. NOTHING CHANGED except I pushed some code. Yet ALL projects (even those with no repo changes) are failing. I've looked at every certificate I can find anywhere in the system and they're all good so it wasn't a cert expiry. Disk space is at 45% so it's not that. Nobody logged into the server. Nobody touched any admin screens. One code push triggered the pipeline successfully, next one didn't. I've looked at everything. I've updated the docker images for gitlab and gitlab-runner. I've deleted every kubernetes pod I can find in the namespace and let them get relaunched (my go-to for solving k8s problems :-) ).
Every pipeline run in every project now says this:
Running with gitlab-runner 14.3.2 (e0218c92)
on Kubernetes Runner vXpkH225
Preparing the "kubernetes" executor
00:00
Using Kubernetes namespace: gitlab
Using Kubernetes executor with image lxnsok01.wg.dir.telstra.com:9000/broadworks-build:latest ...
Using attach strategy to execute scripts...
Preparing environment
00:00
ERROR: Error cleaning up configmap: resource name may not be empty
ERROR: Job failed (system failure): prepare environment: setting up build pod: error setting ownerReferences: configmaps "runner-vxpkh225-project-47-concurrent-0-scripts9ds4c" is forbidden: User "system:serviceaccount:gitlab:gitlab" cannot update resource "configmaps" in API group "" in the namespace "gitlab". Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information
That URL talks about bash logout scripts containing bad things. But nothing changed. At least we didn't change anything.
I believe the second error, implying that the user doesn't have permissions, is not correct. It seems to just be saying that the user couldn't do it; the primary error appears to be the previous one about the configmap cleanup. Again, no serviceaccounts, roles, rolebindings, etc. have changed in any way.
So I'm trying to work out what may CAUSE that error. What does it MEAN? What resource name is empty? Where can I find out?
I've checked the output from "docker container logs " and it says exactly what's in the error above. No more, no less.
The only thing I can think of is that perhaps gitlab-runner 14.3.2 doesn't like my k8s or the config. Going back and checking, it seems the runner version has changed: previous working pipelines ran on 14.1.
So two questions then: 1) any ideas how to fix the problem (e.g. update some config, clear some crud, whatever), and 2) how do I get GitLab to use a runner other than :latest?
Turns out something DID change: gitlab-runner changed, and Kubernetes pulled gitlab/gitlab-runner:latest between runs. It seems gitlab-runner 14.3 has a problem with my Kubernetes. I went back through my pipelines and the last successful one was using 14.1.
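For anyone retracing this: the image the runner deployment is actually using can be read straight from the Deployment spec. The names below (namespace gitlab, deployment gitlab-runner) are assumptions based on the log output above; adjust them to your setup:
kubectl -n gitlab get deployment gitlab-runner \
  -o jsonpath='{.spec.template.spec.containers[0].image}'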
So, after a day of working through it, I edited the relevant k8s deployment to redefine the image tag used for gitlab-runner to :v14.1.0, which is the last one that worked for me.
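The pin itself can be done with kubectl edit, or in one shot with kubectl set image; again, the deployment and container names here are guesses, not taken from the original setup:
kubectl -n gitlab set image deployment/gitlab-runner \
  gitlab-runner=gitlab/gitlab-runner:v14.1.0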
Maybe I'll wait a few weeks and try a later one (now that I know how to easily change that tag) and see if the issue gets fixed. And perhaps go raise an issue on gitlab-runner.

How do I get docker service errors

I did a docker service update to add a mount with "--mount-add". I got an error that was cut off in the terminal. While I've solved this particular issue myself, how do I get full error messages when this happens in the future?
Is there a way to get that error either from the docker service update command itself (not cut off) or from the local logs of the manager node (if the error likely occurred on worker nodes)?
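One approach (offered as a hedged suggestion, not from the original post): ask the manager for the per-task state without truncation. The service name below is a placeholder:
docker service ps myservice --no-trunc                                    # full CurrentState and Error columns per task
docker service ps myservice --no-trunc --format '{{.Node}}: {{.Error}}'   # just the node and the error message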

Run e2e test with simulation of k8s

We want to create e2e tests (integration tests) for our applications on k8s, and we want to use minikube, but it seems there is no proper (maintained or official) Dockerfile for minikube; at least I didn't find any. In addition, I see k3s and I'm not sure which is better for running e2e tests on k8s.
I found this Dockerfile, but when I build it, it fails with a –no-install-recommends error:
https://aspenmesh.io/2018/01/building-istio-with-minikube-in-a-container-and-jenkins/
Any ideas?
Currently there's no official way to run minikube from within a container. Here's a two-month-old quote from one of minikube's contributors:
It is on the roadmap. For now, it is VM based.
If you decide to go with using a VM image containing minikube, there are some guides on how to do it out there. Here's one called "Using Minikube as part of your CI/CD flow".
Alternatively, there's a project called MicroK8s, backed by Canonical. In Kubernetes Podcast ep. 39 from February, Dan Lorenc mentions this:
MicroK8s is really exciting. That's based on some new features of recent Ubuntu distributions to let you run a Kubernetes environment in an isolated fashion without using a virtual machine. So if you happen to be on one of those Ubuntu distributions and can take advantage of those features, then I would definitely recommend MicroK8s.
I don't think he's referring to running minikube in a container though, but I'm not fully sure: I'd enter an Ubuntu container, try to install microk8s as a package, then see what happens.
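If you do try that experiment, a minimal sketch could look like the lines below. Note that microk8s is distributed as a snap, and snapd normally expects systemd, so this may well fail inside a plain container; the point is simply to see what happens:
docker run -it --rm ubuntu:18.04 bash
# inside the container:
apt-get update && apt-get install -y snapd
snap install microk8s --classic    # likely to complain that it cannot talk to snapd without systemd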
That said, unless there's a compelling reason you want to run Kubernetes from within a container and you are ready to spend the time going down a possible rabbit hole, I think these days running minikube, k3s, or microk8s from within a VM is the safest bet if you want to get a CI/CD pipeline up and running relatively quickly.
As to the problem you encountered when building an image from this particular Dockerfile (the –no-install-recommends error), notice that:
--no-install-recommends install
and
–no-install-recommends install
are two completely different strings, so the error you get:
E: Invalid operation –no-install-recommends
is the result of copying the content of your Dockerfile from the blog post. You should copy it from GitHub instead (you can even click the Raw button there to be 100% sure you're copying plain text, without any additional formatting, changed encoding, etc.).
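A quick way to see the difference is to run both variants by hand; ca-certificates here is just an arbitrary example package, not one from the Dockerfile:
apt-get -y --no-install-recommends install ca-certificates   # two ASCII hyphens: parsed as an option
apt-get -y –no-install-recommends install ca-certificates    # leading en-dash: apt-get treats it as the operation
# => E: Invalid operation –no-install-recommends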

How do you install something that needs restart in a Dockerfile?

Suppose I have installation instructions as follows:
Do something.
Reboot your machine.
Do something else.
How do I express that in a Dockerfile?
This entirely depends on why they require a reboot. On Linux, rebooting a machine typically indicates a kernel modification, though it's possible it's for something simpler, like a change in user permissions (which would normally be handled by logging out and back in again). If the install is trying to make an OS-level change to the kernel, it should fail when done inside a container. By default, containers isolate and restrict what the application can do to the running host OS, since such changes would impact the host or other running containers.
If the reboot is to force the application service to restart, realize that this design doesn't map well to a container, since each RUN command runs just that command in an isolated environment. Running only that command also means that any OS services that would normally be started on OS bootup (cron, sendmail, or your application) will not be started in the container. Therefore, you'll need to find a way to run the installation command in addition to restarting any dependent services.
The last scenario I can think of is that they want different user permissions to take effect for the logged-in user. In that case, the next RUN command will run the requested command with any access changes made by prior RUN commands. So there's no need to take any specific action to do a reboot; simply perform the install steps as if there were a complete restart between each step.
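As a rough sketch of the last two points (the package, setup command, and daemon names are made up for illustration, not taken from the question): split "install, reboot, configure" into separate RUN instructions, and start the service when the container starts rather than expecting a reboot at build time:
FROM ubuntu:22.04
# step 1 ("do something"): the install itself
RUN apt-get update && apt-get install -y --no-install-recommends example-package \
    && rm -rf /var/lib/apt/lists/*
# step 2: the part the instructions expect to run "after the reboot";
# each RUN starts from the filesystem state the previous RUN left behind
RUN example-setup --accept-defaults
# step 3: the service is started when the container runs, not at build time
CMD ["example-daemon", "--foreground"]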
