I have a Jenkins build that uses Ant to do the heavy lifting.
First it fetches the code, tars it, copies it with scp, then runs sshexec to extract it and sshexec again to install it.
There are 2 production servers right now, so I used the for task from ant-contrib to run scp/sshexec in parallel. The for param is used to set a property, which is then used in scp/sshexec, to avoid issues with the @ vs $ notations.
However that's not working as expected.
I either get:
connection reset
ssh-agent not present (from production server sshd logs)
Windows sockets not found
scp writes twice the name of the server it is connecting to (but that transfer succeeds)
The build always fails at the second scp/sshexec, which is strange since the second connection should go to a different server.
Questions:
What am I doing wrong?
Or alternatively how to write that ant script differently, while still achieving parallelism?
This is the root cause:
The for param is used to set a property, which is then used in scp/sshexec, to avoid issues with the @ vs $ notations.
Ant properties are IMMUTABLE, so a property set to X in the first iteration stays X for all later iterations of that loop!
So I either had to stick to serial execution and unset each property at the end of the sequential block, or use the @{} syntax with a parallel loop if possible. sshexec did accept the @{} syntax.
I am wondering:
1) How to run the model directly in Eclipse without the GUI: just run the model the way other Java code is run in Eclipse and print out whatever I am interested in.
2) How to run it in headless mode without Eclipse at all. I plan to deploy my model on a remote server, so that the server or my own PC can run the model automatically at a specific time of day.
3) Every time I change the code, I have to launch a new GUI for the changes to take effect. It takes at least 5 seconds to open the GUI, which makes model development and debugging very inefficient. Is there a better strategy?
For headless, or batch, running of models, take a look at the Repast Batch Getting Started Guide. This can either allow you to run multiple runs without a GUI, as in (1), or, if you look at section 9.2, allow you to run from the command line without invoking Eclipse, as in your case (2). If you want more control, I'd suggest looking at the InstanceRunner class and using the complete_model.jar payload that is generated by the Batch GUI or batch_runner.jar.
Unarchive the complete_model.jar.
Then use the InstanceRunner class from the command line, like so, from within the complete_model directory:
java -Xmx512m -cp "../lib/*" repast.simphony.batch.InstanceRunner \
-pxml ../scenario.rs/batch_params.xml \
-scenario ../scenario.rs \
-id $instance \
-pinput localParamFile.txt
where the localParamFile.txt is an unrolled parameter file specifying the combination(s) of parameters to run (see the unrolledParamFile.txt within the payload for an example) and if you're running just one instance this would just be one line.
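For case (2), the command above can be wrapped in a small shell function so that a scheduler such as cron can call it at a fixed time. This is only a sketch: the function name run_instance, the JAVA_BIN override and the default instance id are assumptions for illustration, while the InstanceRunner arguments are copied from the command above.

```shell
#!/bin/sh
# Hypothetical wrapper around the InstanceRunner command shown above.
# Run it from inside the unarchived complete_model directory.
# JAVA_BIN is an assumed override so a different JVM can be selected.
JAVA_BIN="${JAVA_BIN:-java}"

run_instance() {
    # $1: instance id (defaults to 1), $2: unrolled parameter file
    instance="${1:-1}"
    param_file="${2:-localParamFile.txt}"
    "$JAVA_BIN" -Xmx512m -cp "../lib/*" repast.simphony.batch.InstanceRunner \
        -pxml ../scenario.rs/batch_params.xml \
        -scenario ../scenario.rs \
        -id "$instance" \
        -pinput "$param_file"
}
```

Ending the script with a call such as run_instance 1 and pointing a crontab entry at it would cover the scheduled run.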
I am using GitLab Community Edition with a GitLab Runner CI setup to deploy (synchronize) a bunch of JSON files to a server using LFTP. The job, however, seems to "freeze" for a few minutes roughly every 10 files. With roughly 400 files to synchronize, the job sometimes simply crashes because it takes more than an hour to complete. The JSON files are all 1 KB. Neither the source nor the target server should have a firewall rate-limiting the FTP traffic. Both are hosted at OVH.
The following LFTP command is executed in order to synchronize everything:
lftp -v -c "set sftp:auto-confirm true; open sftp://$DEVELOPMENT_DEPLOY_USER:$DEVELOPMENT_DEPLOY_PASSWORD@$DEVELOPMENT_DEPLOY_HOST:$DEVELOPMENT_DEPLOY_PORT; mirror -Rev ./configuration_files configuration/configuration_files --exclude .* --exclude .*/ --include ./*.json"
The job is run in Docker, using this container to deploy everything. What could cause this?
For those of you coming from Google: we had the exact same setup. To stop LFTP from hanging when running in Docker or some other CI environment, you can use this command:
lftp -c "set net:timeout 5; set net:max-retries 2; set net:reconnect-interval-base 5; set ftp:ssl-force yes; set ftp:ssl-protect-data true; open -u $USERNAME,$PASSWORD $HOST; mirror dist / -Renv --parallel=10"
This does several things:
It stops lftp from waiting forever or looping continuously when a command cannot complete. This should speed things along.
It makes sure SSL/TLS is used. If you don't need this, remove those options.
It synchronizes one folder to the new location. The -Renv options are explained here: https://lftp.yar.ru/lftp-man.html
Lastly, in GitLab CI I set the job to retry if it fails. This spins up a new Docker instance, which gets around any open-file or connection limits. The LFTP command above then runs again, but since we are using the -n flag it only transfers the files that were missed by the first job. This gets everything moved over without hassle. You can read more about CI job retries here: https://docs.gitlab.com/ee/ci/yaml/#retry
Have you looked at using rsync instead? I'm fairly sure you can benefit from the incremental copying of files as opposed to copying the entire set over each time.
I'm executing the following command which executes a group of scripts with each script being a curl download.
parallel --resume-failed --joblog logshd.log {1} ::: SH/*.sh
The set of files downloaded is quite large. I've noticed some files don't download.
I hoped that the --resume-failed parameter would ensure that all the downloads that fail are resumed and completed.
I'm not clear on whether that means I need to run the process a second time, or whether the retry should happen within the first run.
From the GNU documentation:
Where --resume-failed reads the commands from the command line (and
ignores the commands in the joblog), --retry-failed ignores the
command line and reruns the commands mentioned in the joblog.
I'm not clear on what "ignores the command line" or "ignores the commands in the joblog" means. Could that be clarified?
Can --resume-failed and --retry-failed be used in the same command, and if so, what is the effect?
Regards
Conteh
If we assume the download fails intermittently then your answer is --retries 10. It will run the command 10 times before giving up.
--resume-failed and --retry-failed are both used when GNU Parallel has finished, and you then figure out that you want to retry some of the jobs again.
The difference between the two is in how to retry the command.
--retry-failed will run exactly the same command as failed before. It does that by looking in the joblog for the command. This is typically what you want.
--resume-failed is used if you figure out that the failing command actually needed some other parameter: i.e. GNU Parallel should not run exactly the same command, but it should run a (typically slightly changed) command with the same parameters instead.
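To see which commands --retry-failed would pick up, the joblog can be inspected directly. The sketch below assumes the standard tab-separated joblog layout (Seq, Host, Starttime, JobRuntime, Send, Receive, Exitval, Signal, Command) and treats a non-zero Exitval or Signal as a failure. The function name failed_jobs is made up for illustration, and this only approximates what GNU Parallel does internally.

```shell
#!/bin/sh
# List the commands recorded as failed in a GNU Parallel joblog.
# Assumes the usual tab-separated columns:
# Seq Host Starttime JobRuntime Send Receive Exitval Signal Command
failed_jobs() {
    # $1: path to the joblog (defaults to logshd.log from the question)
    awk -F '\t' 'NR > 1 && ($7 != 0 || $8 != 0) { print $9 }' "${1:-logshd.log}"
}
```

If failed_jobs prints anything after a run, those are the commands that a later run with --retry-failed and the same --joblog would be expected to execute again.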
I have set up a cron job to run the script cron/cron.php once an hour.
This script simply reads a table to check which scripts should run at a given time.
So far no problem.
I just noticed that $_SERVER['DOCUMENT_ROOT'] and $_SERVER['SERVER_NAME'] are empty. The same goes for $_ENV['HOSTNAME'].
What can be the reason? I would prefer to keep my cron.php portable, so I am looking for a solution that works on every server.
Thanks in advance for any tips!
When the cron script is run, it's most likely executed by the php-cli binary and not the webserver.
$_SERVER entries are set by the webserver, here is the quote from $_SERVER page in the PHP manual:
$_SERVER is an array containing information such as headers, paths, and script locations. The entries in this array are created by the web server.
As there is no webserver involved with your cron script, these are not set. You can try this yourself by executing php on the command line:
php -r 'var_dump($_SERVER);'
It will output all settings in $_SERVER in your command-line environment; "DOCUMENT_ROOT" will most likely be an empty string and "SERVER_NAME" is not set at all.
The $_ENV superglobal contains the system's environment variables; "HOSTNAME" is simply not set as an environment variable by the cron binary.
Further Considerations
I normally suggest to not only create the PHP cron script (as you did with cron/cron.php) but also to create a shell-script that invokes the php script. Then use the shell-script in the crontab. This allows you to modify the environment easily without re-configuring the crontab or the cron.php too often. You can then set environment variables within that shell script as well as changing the working directory etc.
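A wrapper along these lines might look as follows. It is only a sketch: the path /var/www/example, the variable names APP_ROOT, PHP_BIN, CRON_DOCUMENT_ROOT and CRON_SERVER_NAME, and the function name run_cron are all placeholders, not anything cron or PHP defines. The variable names deliberately avoid DOCUMENT_ROOT, since the CLI may fill that $_SERVER entry itself; cron.php would read them with getenv().

```shell
#!/bin/sh
# Hypothetical cron wrapper: prepares the environment, then runs cron/cron.php.
# All paths and variable names below are placeholders for illustration.
APP_ROOT="${APP_ROOT:-/var/www/example}"
PHP_BIN="${PHP_BIN:-php}"

run_cron() {
    # Values a web server would normally supply; cron.php can read them via getenv().
    export CRON_DOCUMENT_ROOT="$APP_ROOT"
    export CRON_SERVER_NAME="${CRON_SERVER_NAME:-example.com}"
    # cron typically starts jobs in the user's home directory,
    # so set the working directory explicitly.
    cd "$APP_ROOT" || return 1
    "$PHP_BIN" cron/cron.php
}
```

The script would end with a call to run_cron, and the crontab would reference the wrapper instead of cron.php, so a server-specific change only touches the two defaults at the top.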
If you want to make your cron.php script more portable, figure out what the injected environment dependencies are (e.g. the document root you have) and make them configurable, e.g. with variables or a parameter object. Then create a section in your script where those variables are populated, so that the rest of your script runs on the injected values. This limits configuration changes to a very small part of your script and lets you write more reusable code.
I have the following problem: I have an ANT-task in Jenkins-CI that (apparently) needs access to OSX' window server (it needs to show a window). After doing some research, it appears that only the currently logged in user and the 'root' user (or SUDO) can access OSX' window server.
The ANT task (Adobe ADL) is one that actually 'runs' a build, so it has to popup a screen.
I'm on a MacBook running OSX 10.7.something (Lion), Jenkins 1.487, Ant 1.8.4.
What I have tried so far:
To start with, I tried the bare-bones <exec> task to invoke ADL. It works, but I get an error that means Jenkins, running as a daemon (with home directory /Users/shared/Jenkins/Home), cannot access OSX's window server.
Run Jenkins as myself, by changing USER_NAME, GROUP_NAME and JENKINS_HOME in the Jenkins launchd.conf file: https://wiki.jenkins-ci.org/display/JENKINS/Thanks+for+using+OSX+Installer
This gave a lot of errors and trouble, which I tried to solve together with the creator of Jenkins CI, but unfortunately to no avail.
Tried to have Ant run an <exec> task (running a shell script) in which I sudo with a password, using this sneaky way of passing the password on standard input: echo <password> | sudo -S <command>. That is really bad practice, but as I'm running Jenkins locally (not reachable from outside my LAN) it's not a problem.
Tried to have Ant run an <exec> task using a 'redirector' with my password as the input string. Also very bad, but I just wanted it to work, which it did not.
Tried a Jenkins SSH plugin: didn't work. I could, however, SSH to my own localhost using Terminal. The thing is, I don't know what the Jenkins SSH plugin was trying to do (how can I figure that out anyway?), so I don't know why it wouldn't work.
Tried to have Ant run an SSHEXEC task. After some hours it finally ran (Ant for Mac is broken, something about optional .jar tasks not being renamed correctly), but I'm getting a "com.jcraft.jsch.JSchException: Auth fail", which I googled and can't seem to resolve. The only applicable solution was to have sshd accept password authentication; I did that and still got the same error.
I think what I want to accomplish was NOT worth the 2 days that I have spent on this problem so far, although I learned a lot. However, I just want this to work and will not accept defeat, yet :)
My question: have you had to solve a similar problem, and how did you go about it? Are there any other methods I can try? Is there a method mentioned above that should JUST _WORK_, and I did something wrong?
[edit] I have decided to go with the Jenkins standalone app, as i think (for me) this is a nicer solution in total, as my laptop is not a build server. Also, the Jenkins app can start at startup so it actually acts as a local server.
Just a quick guess: if you don't want the interactivity of the script, and the script can do without it, you can try to set the headless mode on the java command-line:
-Djava.awt.headless=true
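When the build is started through Ant, one way to pass that flag is the ANT_OPTS environment variable, which the ant launcher script hands to the JVM. A minimal sketch, assuming the Ant step is started from a shell; whether ADL itself can run without a window is a separate question that this does not settle.

```shell
#!/bin/sh
# Append the headless flag to any JVM options already set for Ant.
HEADLESS_FLAG="-Djava.awt.headless=true"
export ANT_OPTS="${ANT_OPTS:+$ANT_OPTS }$HEADLESS_FLAG"
```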
I have decided to go with the Jenkins standalone app, as i think (for me) this is a nicer solution anyway, as my laptop is not a (headless) build server. Also, the Jenkins app can start at startup so it acts as a server too.