auto restart a Tika server - apache-tika

I am building a web service where users submit PDF files, and the text content of these files is extracted using Tika. I am running Tika in server mode on the same machine that hosts my Django website.
My question is: is there a way to automatically restart the Tika server when it shuts down for any reason? How can I build and run a script so that whenever the Tika server goes down, this is detected and the server is restarted? My ultimate goal is to avoid checking from the console every day whether Tika is down, and to avoid finding out that the service is down only when a user complains that her PDF does not get extracted.

Since you're using a recent copy of Ubuntu, your easiest option is probably to create a custom Upstart job for it. On other unixes, you'd want something similar for their init system, and on Windows I think something with Apache Commons Daemon to wrap it as a Windows service is likely the best bet.
As covered in this post over on Ask Ubuntu, the key thing you'll want is the respawn option, to tell upstart to re-launch the Tika server if it happens to fail, and a limit in case it gets really broken for some reason.
You'll want to create a file /etc/init/tika-server.conf, with contents along the lines of:
description "Apache Tika Server"
start on filesystem or runlevel [2345]
stop on shutdown
respawn
respawn limit 3 12
exec java -jar /path/to/tika/tika-server-1.10-SNAPSHOT.jar
Tweak the path to your Tika Server jar, and add any options / parameters you want to the end.
With that done, run init-checkconf /etc/init/tika-server.conf to check it's valid, then service tika-server start to start it.
At that point, you can head to http://localhost:9998/ and see it running! If it dies, upstart will restart it for you.
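Upstart takes care of the restart itself. To also avoid learning about an outage from a user, the Django side could probe the server before handing it a file. The following is only a minimal sketch: the function name tika_is_up is hypothetical, and the host and port defaults are taken from the URL above. It only checks that something accepts TCP connections on that port, not that Tika is answering requests correctly.

```python
import socket


def tika_is_up(host="localhost", port=9998, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError on refusal, timeout or DNS failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("Tika reachable" if tika_is_up() else "Tika is not reachable")
```

A view could call tika_is_up() first and show a "try again shortly" message while upstart is respawning the server.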

Related

How to interact with already running instance via terminal in Mongooseim?

I am using MongooseIM 3.2.0, built from source, on an Ubuntu server. My concerns are below:
What is the best way to run MongooseIM as a service so that it automatically restarts if MongooseIM crashes or the system restarts?
How can I interact via terminal with an already running MongooseIM instance on the Ubuntu server, as with "mongooseimctl live"? My guess is that running "mongooseimctl live" will try to create another instance. I just want to see the live logs and interact with the server, without scrolling through long log files.
I apologize if the answers to the above are obvious, but I just want to follow the best guidance.
mongooseimctl live or mongooseimctl foreground is mostly useful for development or smoke testing a deployment (unless you're running inside a container). For real world use cases you should start the server in the background with mongooseimctl start.
Back to the container - the best approach for containerised applications is to run them in the foreground, therefore in a container startup script use mongooseimctl foreground.
Once the server is running (no matter how it was started), you can attach a shell to troubleshoot issues with mongooseimctl debug. This is the command to use when you get the Protocol 'inet_tcp': the name mongooseim@localhost seems to be in use by another Erlang node error. Be careful if it's a production environment - you can easily take the server down with access to this shell.
If you're just interested in watching logs, with no interactive access to the server internals that the shell offers, a simple tail -f /your-configured-mongooseim-log-dir/* should be enough.
Ubuntu nowadays uses systemd for managing its services' lifetimes. A systemd .service file can be found at https://github.com/esl/MongooseIM/blob/master/tools/pkg/platforms/debian_stretch/files/build/mongooseim.service - we use it for packaging into Debian/Ubuntu .deb packages.

Restart a process inside a Docker container whenever the config file changes

I have a Dockerfile that starts 2 processes in a single Docker container, each using a jar file with a config file as an argument:
java -jar process1.jar process1.cfg &
java -jar process2.jar process2.cfg
process1.cfg and process2.cfg reside in mounted directories. Whenever one of the cfg files changes, I need to restart the corresponding process for the change to take effect. All of this should be done programmatically in Java, in a REST microservice that updates the config file and restarts the process. Any ideas on how to go about it?
The problem can be generically solved by your Java app starting a config change monitoring service/thread, which manages the actual business service/thread(s) by starting it in the beginning and restarting on any change (if the change actually needs a restart). File change monitoring is standard Java functionality. The solution does not need any REST, it is not bound to microservice architecture (although it is more sensible within it) and it is not limited by or to docker containers.
If you do not want any file-based configs, do the same, but the monitoring bit can be e.g. a vert.x-based web server listening for external REST requests supplying configs, on start or for any update. The rest remains the same.
In my current workplace we actually have a module that works in exactly this way. It is deployed in a Docker container and uses both file system monitoring and a vert.x web server for config changes.
You can even go further and make the monitoring bit start multiple instances internally if multiple configs need to be supported.
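The supervisor pattern described in this answer can be sketched as follows. Note the assumptions: the answer refers to Java's built-in file watching, while this sketch is written in Python and uses plain modification-time polling instead, purely to illustrate the start/restart logic. The class and function names are hypothetical.

```python
import os
import subprocess
import time


class ConfigWatchedProcess:
    """Run a command and restart it whenever its config file changes."""

    def __init__(self, command, config_path):
        self.command = command  # e.g. ["java", "-jar", "process1.jar", "process1.cfg"]
        self.config_path = config_path
        self.process = None
        self.last_mtime = None

    def start(self):
        # Remember the config's modification time at launch
        self.last_mtime = os.stat(self.config_path).st_mtime_ns
        self.process = subprocess.Popen(self.command)

    def stop(self):
        if self.process is not None and self.process.poll() is None:
            self.process.terminate()
            try:
                self.process.wait(timeout=10)
            except subprocess.TimeoutExpired:
                self.process.kill()  # did not exit in time, force it
                self.process.wait()
        self.process = None

    def config_changed(self):
        return os.stat(self.config_path).st_mtime_ns != self.last_mtime

    def poll_once(self):
        """Restart the child if the config changed; return True if restarted."""
        if self.config_changed():
            self.stop()
            self.start()
            return True
        return False


def supervise(watched, interval=1.0):
    """Poll forever; intended to run as the container's foreground process."""
    for w in watched:
        w.start()
    try:
        while True:
            for w in watched:
                w.poll_once()
            time.sleep(interval)
    finally:
        for w in watched:
            w.stop()
```

Passing one ConfigWatchedProcess per jar/config pair to supervise() would cover both processes from the question, and each is restarted independently of the other.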

Sandbox command execution with docker via Ajax

I'm looking for help with the following: what options do I have if I want to sandbox the execution of commands that are typed into a website? I would like to create an online interpreter for a programming language.
I've been looking at docker, how would I use it? Is this the best option?
codecube.io does this. It's open source: https://github.com/hmarr/codecube
The author wrote up his rationale and process. Here's how the system works:
A user types some code in to a box on the website, and specifies the language the code is written in
They click “Run”, the code is POSTed to the server
The server writes the code to a temporary directory, and boots a docker container with the temporary directory mounted
The container runs the code in the mounted directory (how it does this varies according to the code’s language)
The server tails the logs of the running container, and pushes them down to the browser via server-sent events
The code finishes running (or is killed if it runs for too long), and the server destroys the container
The Docker container's entrypoint is entrypoint.sh, which inside a container runs:
prog=$1
<...create user and set permissions...>
sudo -u codecube /bin/bash /run-code.sh $prog
Then run-code.sh checks the extension and runs the relevant compiler or interpreter:
extension="${prog##*.}"
case "$extension" in
"c")
gcc $prog && ./a.out
;;
"go")
go run $prog
;;
<...cut...>
The server that accepts the code examples from the web, and orchestrates the Docker containers was written in Go. Go turned out to be a pretty good choice for this, as much of the server relied on concurrency (tailing logs to the browser, waiting for containers to die so cleanup could happen), which Go makes joyfully simple.
The author also details how he implemented resource limiting and isolation, along with his thoughts on security.
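The workflow in the steps above (write the code to a temporary directory, run a container with it mounted, kill it if it runs too long) could look roughly like this. This is a sketch in Python rather than the Go the author used, and it is not the codecube implementation: the image name codecube-runner, the /code mount point, the limits and both function names are assumptions. It also returns the output at the end instead of streaming it to the browser.

```python
import subprocess
import tempfile
import uuid
from pathlib import Path


def build_docker_command(code_dir, filename, name, image="codecube-runner"):
    """Build a `docker run` argv that mounts code_dir and runs one file."""
    return [
        "docker", "run",
        "--rm",                   # remove the container when it exits
        "--name", name,           # so it can be killed by name on timeout
        "--network", "none",      # no network access for untrusted code
        "--memory", "128m",       # cap memory use (assumed limit)
        "-v", f"{code_dir}:/code",
        image,                    # hypothetical image name
        f"/code/{filename}",      # handed to the image's entrypoint
    ]


def run_sandboxed(source, extension, timeout=10):
    """Write source to a temp dir, run it in a container, return its output."""
    name = f"sandbox-{uuid.uuid4().hex}"
    with tempfile.TemporaryDirectory() as code_dir:
        filename = f"prog.{extension}"
        Path(code_dir, filename).write_text(source)
        command = build_docker_command(code_dir, filename, name)
        try:
            result = subprocess.run(
                command, capture_output=True, text=True, timeout=timeout
            )
        except subprocess.TimeoutExpired:
            # The timeout only stops the docker client; kill the container too
            subprocess.run(["docker", "kill", name], capture_output=True)
            return "killed: ran for too long"
        return result.stdout + result.stderr
```

Calling run_sandboxed('package main\n...', "go") requires Docker and a suitable image on the host. Killing the container by name matters because stopping the docker client alone does not necessarily stop the container it started.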

Change Default 'home' Path in Erlang to Resolve RabbitMQ Start Up Error

I'm new to rabbitmq and by association new to erlang. I'm running into a problem where I cannot start rabbitmq as the 'home' location for the .erlang.cookie has been changed. I've run the command
init:get_argument(home).
which returns
{ok,[["H:\\"]]}
This is an issue, as H: is a network drive I do not always have access to. I need to be able to change the 'home' directory to something local.
When I run
rabbitmqctl status
it gives me the following error:
{error_logger,{{2013,7,5},{14,47,10}},"Failed to create cookie file 'h:/.erlang.cookie': enoent",[]}
which again leads me to believe that there is an issue with the home argument. I need to be able to change this location to something local.
Versions:
Erlang R16B01 32 bit
RabbitMQ 3.1.3
Running on Win7
I have uninstalled and reinstalled multiple times hoping to resolve this. I am looking for a way to change the 'home' location in erlang so rabbitmq can properly start.
The solution I came up with was to not bother with the installed service. I used rabbitmq-server.bat to start the server, adding SET HOMEDRIVE=C: at the start of the file. I'm planning to run this from a parent service so that I can install it on servers.
Final note to the Erlang and RabbitMQ developers: using pre-existing environment variables for your own purposes is just wrong. You should create your own, or better yet put this stuff in a configuration file. Telling people to talk to their system administrators to change the HOMEDRIVE and APPDATA variables is arrogant, to say the least.
You need to set the correct values for the HOMEDRIVE and HOMEPATH environment variables. These links should help:
Permanently Change Environment Variables in Windows
Overriding HOMEDRIVE and HOMEPATH as a Windows 7 user
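If the server is launched from a parent process, as the first answer plans, the variables can be overridden for that child only, without editing the batch file or changing system settings. A minimal sketch in Python, assuming the parent is free to choose the environment. The function names and the default path \Users\rabbit are hypothetical, and the location of rabbitmq-server.bat is left to the caller.

```python
import os
import subprocess


def env_with_local_home(drive="C:", path="\\Users\\rabbit", base_env=None):
    """Copy an environment and point HOMEDRIVE/HOMEPATH at a local folder."""
    env = dict(os.environ if base_env is None else base_env)
    env["HOMEDRIVE"] = drive
    env["HOMEPATH"] = path  # hypothetical local directory
    return env


def start_rabbitmq(bat_path):
    """Start rabbitmq-server.bat with the overridden home variables."""
    # cmd /c runs the batch file; only this child sees the modified env
    return subprocess.Popen(["cmd", "/c", bat_path], env=env_with_local_home())
```

The directory that the two variables point at has to exist and be writable, since that is where the .erlang.cookie file from the error message gets created.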

Stop daemon with server in Ruby on Rails

I have a daemon that I'm starting along with the server using an initializer file.
I want to stop this daemon once the server stops, but I'm not sure where to put a script that would run when the server stops.
Initializers get automatically loaded when the server starts. Is there a similar "destroyers" folder? Where would I put code that I want to run when the server stops?
Thanks!
Here's a link that might be of interest, http://github.com/costan/daemonz
