Kubernetes CronJob Pod status remains Running - docker

I want to run a Kubernetes CronJob for a PHP script. The job executes properly, but the status of the pod remains Running and after a few minutes it becomes Error. It should reach Completed status. I have tried different options but have not been able to resolve the issue.
Here is my CronJob YAML file:
Here is the output of kubectl get pods:
Here is the log output from inside the container:
Ignore the PHP exception; the issue occurs regardless of the exception.

The state of the pod is set to Completed when the running process / application in the container returns exit code 0.
If it returns a non-zero exit code, the pod is usually set to the Error state.
If you want the pod to reach Completed status, just make sure the application returns exit code 0 when it finishes.
OPINION: This is something that in usual cases should be/is handled by the application itself.
I'm attaching the docs for Kubernetes Jobs.
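As an illustration only (not the asker's actual manifest), here is a minimal CronJob sketch in which the container runs the PHP script once and exits; the image name, script path, and schedule are assumptions. The important part is that the command runs the script with the PHP CLI and terminates, so its exit code (0 on success) lets the pod reach Completed instead of staying Running:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: php-report            # hypothetical name
spec:
  schedule: "*/5 * * * *"     # assumed schedule
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never        # Jobs require Never or OnFailure
          containers:
          - name: php-report
            image: php:8.2-cli        # assumed image; must not start a long-running server
            # php exits with the script's exit code; 0 marks the pod Completed
            command: ["php", "/app/report.php"]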

Related

For Kubernetes pods how to find cause of exit code 2

I have pods of kind CronJob running in parallel. They complete their task and run again after a fixed interval of 20 minutes, as per the cron expression. I noticed that some pods are restarting 2-3 times before completing the task.
I checked the details with the kubectl describe pod command and found that the pod exits with code 2 when it restarts due to some error:
Last State: Terminated
Reason: Error
Exit Code: 2
I read that exit code 2 indicates misuse of a shell builtin command. How can I find which shell builtin is misused? How do I debug the cause of exit code 2?
Thanks in advance.
An exit code of 2 indicates either that the application chose to return that error code, or (by convention) there was a misuse of a shell built-in. Check your pod’s command specification to ensure that the command is correct. If you think it is correct, try running the image locally with a shell and run the command directly.
Refer to this link for more information.
You can get logs with
kubectl logs my-pod
Post output here if you can't fix it.
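A rough sketch of those debugging steps; pod and image names are placeholders:
# logs of the previous (crashed) container instance
kubectl logs my-pod --previous
# events and termination details, including the exit code
kubectl describe pod my-pod
# reproduce locally: open a shell in the same image and run the command by hand
docker run -it --rm --entrypoint sh my-registry/my-image:tag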

Docker Wildfly 26 constantly restarts itself

I'm having a problem building a custom WildFly 26 (26.1.2) image.
The server starts normally, but as soon as it's fully started, it redeploys all the EARs and so restarts itself.
I've narrowed the problem down to the deployment scanner: I am mounting my EARs from my Windows host into the container via the -v flag. Everything works as expected and the server starts with the message WildFly Full 26.1.2.Final started in 106408ms. After that it immediately destroys everything and tries to redeploy the EARs due to an update action. I can't figure out where this update action is coming from.
Debugging the server does show a message that it redeploys the EARs, but not why:
DEBUG [org.jboss.as.server.deployment.scanner] (DeploymentScanner-threads - 2) Deployment scan of [/opt/jboss/wildfly/deploy/.] found update action [{
    "operation" => "redeploy",
    "address" => [("deployment" => "something.ear")],
    "owner" => [
        ("subsystem" => "deployment-scanner"),
        ("scanner" => "dev.profile.scanner")
    ]
}]
My Deployment-scanner config:
/subsystem=deployment-scanner/scanner=dev.profile.scanner:add(path=./,relative-to="deploy.dir",scan-interval=5000,auto-deploy-zipped="true",auto-deploy-exploded="true", deployment-timeout="18000")
If I copy the EARs to another directory within the container in the entrypoint, everything works as it should and the server does not behave like that.
I am guessing it's related to the volume mount from my host to the container.
Any help is appreciated.
If I copy the EARs to another directory within the container in the entrypoint, everything works as it should. The problem is that the deployment scanner then checks the sources in the copy destination, which never change because the copy command is only run once at startup (in the entrypoint). In that case the deployment scanner is effectively useless.
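For reference, a minimal sketch of that copy-on-entrypoint workaround, assuming the host volume is mounted at /opt/jboss/wildfly/deploy-src and the scanner watches a container-local directory; all paths are assumptions:
#!/bin/sh
# entrypoint.sh: copy the mounted EARs into a container-local directory,
# then start WildFly so the scanner only ever sees the local copy
cp /opt/jboss/wildfly/deploy-src/*.ear /opt/jboss/wildfly/standalone/deployments/
exec /opt/jboss/wildfly/bin/standalone.sh -b 0.0.0.0
As noted above, the trade-off is that changes to the mounted EARs are no longer picked up while the container is running.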

kubectl set image throws error "the server doesn't have a resource type deployment"

Environment: Windows 10 Home, gcloud SDK v240.0 with kubectl added as a gcloud SDK component, Jenkins 2.169
I am running a Jenkins pipeline in which I call a windows batch file as a post-build action.
In that batch file, I am running:
kubectl set image deployment/py-gmicro py-gmicro=%IMAGE_NAME%
I get this
error: the server doesn't have a resource type deployment
However, if I run the batch file directly from the command prompt, it works fine. It looks like there is an issue only when I run it from Jenkins.
I looked at a similar thread on Stack Overflow, however that user was using Bitbucket (instead of Jenkins).
Also, there was no accepted answer on that thread, and I cannot continue on it since I am not allowed to comment (50 reputation required).
This was just answered on this thread:
I've had this error fixed by explicitly setting the namespace as an argument, e.g.:
kubectl set image -n foonamespace deployment/ms-userservice.....
Reference:
https://www.mankier.com/1/kubectl-set-image#--namespace
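A sketch of the batch step with the namespace passed explicitly; the namespace name is a placeholder, and the kubeconfig path only matters if the Jenkins service runs under a different Windows account than your command prompt:
REM hypothetical namespace; use the one your deployment actually lives in
kubectl set image -n foonamespace deployment/py-gmicro py-gmicro=%IMAGE_NAME%
REM if Jenkins runs under another account, point kubectl at the right kubeconfig
kubectl --kubeconfig C:\path\to\kubeconfig set image -n foonamespace deployment/py-gmicro py-gmicro=%IMAGE_NAME%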

Continuous deployment using LFTP gets "stuck" temporarily after about 10 files

I am using GitLab Community Edition and a GitLab Runner CI setup to deploy (synchronize) a bunch of JSON files on a server using LFTP. This job, however, seems to "freeze" for a few minutes roughly every 10 files. Since it sometimes has to synchronize roughly 400 files, the job simply crashes because it can take more than an hour to complete. The JSON files are all 1 KB. Neither the source nor the target server should have any firewall rate-limiting the FTP traffic. Both are hosted at OVH.
The following LFTP command is executed in order to synchronize everything:
lftp -v -c "set sftp:auto-confirm true; open sftp://$DEVELOPMENT_DEPLOY_USER:$DEVELOPMENT_DEPLOY_PASSWORD@$DEVELOPMENT_DEPLOY_HOST:$DEVELOPMENT_DEPLOY_PORT; mirror -Rev ./configuration_files configuration/configuration_files --exclude .* --exclude .*/ --include ./*.json"
The job is run in Docker, using this container to deploy everything. What could cause this?
For those of you coming from Google: we had the exact same setup. To get LFTP to stop hanging when running in Docker or some other CI, you can use this command:
lftp -c "set net:timeout 5; set net:max-retries 2; set net:reconnect-interval-base 5; set ftp:ssl-force yes; set ftp:ssl-protect-data true; open -u $USERNAME,$PASSWORD $HOST; mirror dist / -Renv --parallel=10"
This does several things:
It makes it so it won't wait forever or get into a continuous loop when it can't run a command. This should speed things along.
It makes sure we are using SSL/TLS. If you don't need this, remove those options.
It synchronizes one folder to the new location. The options -Renv are explained here: https://lftp.yar.ru/lftp-man.html
Lastly, in the GitLab CI I set the job to retry if it fails. This will spin up a new Docker instance that gets around any open-file or connection limitations. If the first job doesn't succeed, the above LFTP command will run again, but since we are using the -n flag it will only move over the files that were missed the first time. This gets everything moved over without hassle. You can read more about CI job retries here: https://docs.gitlab.com/ee/ci/yaml/#retry
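As a sketch (the job name and image are placeholders), the retry can be configured like this in .gitlab-ci.yml:
deploy_configuration:
  image: alpine:latest   # placeholder; use an image that ships lftp
  script:
    - lftp -c "set net:timeout 5; set net:max-retries 2; set net:reconnect-interval-base 5; set ftp:ssl-force yes; set ftp:ssl-protect-data true; open -u $USERNAME,$PASSWORD $HOST; mirror dist / -Renv --parallel=10"
  retry: 2               # re-run the job up to two times if it fails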
Have you looked at using rsync instead? I'm fairly sure you can benefit from the incremental copying of files as opposed to copying the entire set over each time.
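For illustration, a possible rsync equivalent over SSH (it reuses the variables from the question and assumes key-based SSH authentication is available on the runner), which only transfers files that have changed:
rsync -avz --delete -e "ssh -p $DEVELOPMENT_DEPLOY_PORT" \
  --include='*/' --include='*.json' --exclude='*' \
  ./configuration_files/ \
  $DEVELOPMENT_DEPLOY_USER@$DEVELOPMENT_DEPLOY_HOST:configuration/configuration_files/
# --delete mirrors lftp's -e flag: files removed from the source are removed on the target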

Openshift : pods not being deleted

I am using OpenShift 3 and have been trying to get Fabric8 set up.
Things haven't been going too well, so I decided to remove all services and pods.
When I run
oc delete all -l provider=fabric8
The CLI output claims to have deleted a lot of pods; however, they are still showing in the web console, and I can run the same command again and get the exact same list of pods that the OpenShift CLI claims it deleted.
How do I actually delete these pods?
Why is this not working as designed?
Thanks
Deletion is graceful by default, meaning the pods are given an opportunity to terminate themselves. You can force a graceless delete with oc delete all --grace-period=0 ...
You can also delete the pods forcefully as below; it works fine:
#oc delete all -l provider=fabric8 --grace-period=0 --force
Sadly, Jordan's response did not work for me on OpenShift 3.6.
Instead I used the --now option, which is equivalent to --grace-period=1.
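A short sketch combining the suggestions above; the label selector is the one from the question:
# see what is still left
oc get pods -l provider=fabric8
# force an immediate, graceless delete of the stuck pods
oc delete pods -l provider=fabric8 --grace-period=0 --force
# or, where --force is not available or misbehaves, signal immediate shutdown
oc delete pods -l provider=fabric8 --now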
