Using Docker and Scrapy Splash on Heroku

I have a Scrapy spider that uses Splash, which runs in Docker on localhost:8050, to render JavaScript before scraping. I am trying to run this on Heroku but have no idea how to configure Heroku to start Docker and run Splash before running my web: scrapy crawl abc dyno. Any guidance is greatly appreciated!

From what I gather you're expecting:
Splash instance running on Heroku via Docker container
Your web application (Scrapy spider) running in a Heroku dyno
Splash instance
As seen in Heroku's Container Registry - Pushing existing image(s):
Ensure docker CLI and heroku CLI are installed
heroku container:login
docker tag scrapinghub/splash registry.heroku.com/<app-name>/web
docker push registry.heroku.com/<app-name>/web
To test the application, run heroku open -a <app-name>. This should show the Splash UI at the app's Heroku URL. Heroku routes incoming traffic to the port it assigns in $PORT, not to 8050.
You may need to make the container listen on $PORT, because the EXPOSE instruction in the Dockerfile is not respected (https://devcenter.heroku.com/articles/container-registry-and-runtime#dockerfile-commands-and-runtime).
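The push steps above can be collected into one small shell function. This is only a sketch: the name deploy_splash is made up for illustration, the docker pull step is added so the image exists locally, and the final heroku container:release step is borrowed from the Keycloak question further down rather than from this answer. The stock image still listens on 8050, so the $PORT caveat still applies.

```shell
#!/bin/sh
# Sketch: push the public Splash image to Heroku's container registry.
# deploy_splash is a hypothetical helper name; pass your Heroku app name.
deploy_splash() {
    app="$1"
    if [ -z "$app" ]; then
        echo "usage: deploy_splash <app-name>" >&2
        return 1
    fi
    target="registry.heroku.com/$app/web"

    # Each step runs only if the previous one succeeded.
    heroku container:login &&
        docker pull scrapinghub/splash &&
        docker tag scrapinghub/splash "$target" &&
        docker push "$target" &&
        heroku container:release web -a "$app"
}
```

Usage would be deploy_splash <app-name>, with both CLIs installed and logged in.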
Running Dyno Scrapy Web App
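Configure your application to point to the Splash app's host (<app-host-name>). Heroku serves the app on the standard HTTP and HTTPS ports, so port 8050 is not part of the public URL. The Scrapy spider should then be able to send requests to the Splash instance deployed above.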
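Before pointing the spider at the deployed instance, it may help to request one rendered page by hand. The sketch below only builds the request URL for Splash's render.html endpoint. The helper name splash_render_url and the SPLASH_URL environment variable in the usage comment are assumptions for illustration, and the page URL is passed through without URL-encoding.

```shell
#!/bin/sh
# Sketch: build a Splash render.html URL for a quick smoke test.
# splash_render_url is a hypothetical helper, not part of Splash or Scrapy.
splash_render_url() {
    base="${1%/}"      # Splash base URL, one trailing slash removed
    page="$2"          # page to render; assumed to need no URL-encoding
    delay="${3:-0.5}"  # seconds Splash waits after the page has loaded
    printf '%s/render.html?url=%s&wait=%s\n' "$base" "$page" "$delay"
}

# Example (needs network access and a running Splash instance):
#   curl -fsS "$(splash_render_url "$SPLASH_URL" "http://example.com")"
```

If the curl call returns rendered HTML, the spider should be able to reach the same base URL.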

I ran into the same problem. In the end, I successfully deployed the Splash Docker image on Heroku.
This is my solution:
I cloned the Splash project from GitHub and changed the Dockerfile.
Removed the EXPOSE instruction, because Heroku does not support it.
Replaced ENTRYPOINT with a CMD instruction:
CMD python3 /app/bin/splash --proxy-profiles-path /etc/splash/proxy-profiles --js-profiles-path /etc/splash/js-profiles --filters-path /etc/splash/filters --lua-package-path /etc/splash/lua_modules/?.lua --port $PORT
Notice that I added the option --port $PORT. This makes Splash listen on the port specified by Heroku instead of the default (8050).
A fork of the project with this change is available here
You just need to build the Docker image and push it to Heroku's registry, as you did before.
You can test it locally first, but you must pass the PORT environment variable when running the container:
sudo docker run -p 80:80 -e PORT=80 mynewsplashimage
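One possible variation, not part of the fork described above, is to fall back to Splash's default port when PORT is unset, so the same image also starts locally without -e PORT. The sketch below only prints the resulting command. The function names are hypothetical and the other Splash options are left out for brevity.

```shell
#!/bin/sh
# Sketch of a start wrapper; splash_port and splash_command are
# hypothetical names used only for illustration.
splash_port() {
    # Heroku injects PORT at runtime; use Splash's default 8050 otherwise.
    echo "${PORT:-8050}"
}

splash_command() {
    # Print the command instead of running it, so the result can be inspected.
    echo "python3 /app/bin/splash --port $(splash_port)"
}
```

In a real wrapper the printed command would be executed with exec instead of echoed.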

Related

Can I shell into a worker dyno on Heroku?

I can shell into a Heroku app using the CLI command:
heroku run -a app-name bash
This works beautifully, but I cannot find a way to specify which dyno I want to shell into. I have one web and one worker dyno, each with its own Docker image, and the run command always goes into the web dyno.
Is there a solution to shell into a worker dyno?
I found the answer myself. Based on Heroku's documentation:
If your app is composed of multiple Docker images, you can target the process type when creating a one-off dyno:
$ heroku run bash --type=worker
This works exactly as expected.

Bitnami-docker-keycloak on Heroku: Web process failed to bind to $PORT(Error R10)

I want to deploy a Keycloak Docker image on Heroku, and I followed these instructions:
heroku auth:token
docker login --username=_ --password=${YOUR_TOKEN} registry.heroku.com
docker pull bitnami/keycloak:latest
docker images (to get image_id)
docker tag {image_id} registry.heroku.com/{heroku_app_name}/web
docker push registry.heroku.com/{heroku_app_name}/web
heroku container:release web -a {heroku_app_name}
After that I added PostgreSQL on Heroku and configured the config vars. Everything worked fine until I got this error. It still doesn't work if I add another PORT var in Heroku.
You cannot hardcode the port (8081) on Heroku; you must use the $PORT environment variable provided by Heroku (this is the dynamic port for your web dyno).
The web process must listen for HTTP traffic on $PORT.
On Heroku you cannot run docker run -e KEYCLOAK_HTTP_PORT=$PORT bitnami/keycloak:latest, but you can provide a Dockerfile that starts the application with the configuration and variables you need (using CMD).
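A wrapper script started from that CMD could do the mapping at container start. The following is a rough sketch under stated assumptions: the function name export_keycloak_port is hypothetical, KEYCLOAK_HTTP_PORT is the variable named above, and the image's actual start command is not given here, so it is left as a placeholder.

```shell
#!/bin/sh
# Sketch of a start wrapper (hypothetical); the real start command of the
# image is not given in the question, so it stays a placeholder below.
export_keycloak_port() {
    if [ -z "$PORT" ]; then
        echo "PORT is not set; Heroku sets it for web dynos" >&2
        return 1
    fi
    # KEYCLOAK_HTTP_PORT is the variable named in the answer above.
    KEYCLOAK_HTTP_PORT="$PORT"
    export KEYCLOAK_HTTP_PORT
}

# In a real wrapper the next lines would be:
#   export_keycloak_port || exit 1
#   exec <the image's original start command>
```

Failing early when PORT is missing makes a misconfigured local run obvious instead of silently listening on the wrong port.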

How to run grails-vue profile within a docker container?

I have created a docker image for grails on dockerhub:
https://hub.docker.com/r/dhobdensa/docker-alpine-grails/
It is based on the official openjdk:alpine image
First I run the container:
docker container run -it --rm -p 8080:8080 -p 3000:3000 dhobdensa/docker-alpine-grails
Then I create a new grails app with the vue profile
grails create-app --inplace --profile vue
And then I run the app:
./gradlew bootRun -parallel
Which starts a grails REST API server, and a vue client app using vue-cli and webpack
The server says your app is running on localhost:8080. This can be accessed and returns the expected result.
The client says your app is running on localhost:3000. But when attempting to access this, the browser just shows the default ERR_EMPTY_RESPONSE page.
I have tried different browsers and clearing caches.
Any ideas why accessing port 3000 is not working, but 8080 is?
Additional info
It seems that gradle is essentially running this command:
webpack-dev-server --inline --progress --config build/webpack.dev.conf.js
And this is the file:
https://gist.github.com/dhobdensa/4e22a188cc2b26cf5b0dd4028755d39b
Perhaps this is linked to webpack dev server?
So I found my answer.
I suspected that webpack dev server was the place to be looking.
Then I found this issue on github:
Cant run webpack-dev-server inside of a docker container?
https://github.com/webpack/webpack-dev-server/issues/547
Long story short, I had to add --host 0.0.0.0 to the "dev" task in package.json
"dev": "webpack-dev-server --inline --progress --config build/webpack.dev.conf.js --host 0.0.0.0"

how to deploy specific docker container just by docker run?

https://github.com/getsentry/onpremise
mkdir -p data/{sentry,postgres} - Make our local database and sentry config directories.
This directory is bind-mounted with postgres so you don't lose state!
docker-compose run --rm web config generate-secret-key - Generate a secret key.
Add it to docker-compose.yml in base as SENTRY_SECRET_KEY.
docker-compose run --rm web upgrade - Build the database.
Use the interactive prompts to create a user account.
docker-compose up -d - Lift all services (detached/background mode).
Access your instance at localhost:9000!
I'm new to Docker.
I tried running the Sentry container locally and succeeded.
But when I tried to deploy it on a cloud container service platform, I ran into some problems.
The platform provides only one way to run Docker: docker run xxx, unlike AWS, which can use a CLI.
So how can I deploy on that platform? Thanks.
Additionally, I must use that platform because it's my company's product, lol.

My app can't create log files when it starts up inside Docker

I spent the weekend poring over the Docker docs and playing around with the toy applications and example projects. I'm now trying to write a super-simple web service of my own and run it from inside a container. In the container, I want my app (a Spring Boot app under the hood), called bootup, to have the following directory structure:
/opt/
  bootup/
    bin/
      bootup.jar ==> the app
    logs/
      bootup.log ==> log file; GETS CREATED BY THE APP AT STARTUP
    config/
      application.yml ==> app config file
      logback.groovy ==> log config file
It's very important to note that when I run my app locally on my host machine - outside of Docker - everything works perfectly fine, including the creation of log files to my host's /opt/bootup/logs directory. The app endpoints serve up the correct content, etc. All is well and dandy.
So I created the following Dockerfile:
FROM openjdk:8
RUN mkdir /opt/bootup
RUN mkdir /opt/bootup/logs
RUN mkdir /opt/bootup/config
RUN mkdir /opt/bootup/bin
ADD build/libs/bootup.jar /opt/bootup/bin
ADD application.yml /opt/bootup/config
ADD logback.groovy /opt/bootup/config
WORKDIR /opt/bootup/bin
EXPOSE 9200
ENTRYPOINT java -Dspring.config=/opt/bootup/config -jar bootup.jar
I then build my image via:
docker build -t bootup .
I then run my container:
docker run -it -p 9200:9200 -d --name bootup bootup
I run docker ps:
CONTAINER ID IMAGE COMMAND ...
3f1492790397 bootup "/bin/sh -c 'java ..."
So far, so good!
My app should then be serving a simple web page at localhost:9200, so I open my browser to http://localhost:9200 and I get nothing.
When I use docker exec -it 3f1492790397 bash to "ssh" into my container, I see everything looks fine, except the /opt/bootup/logs directory, which should have a bootup.log file in it -- created at startup -- is instead empty.
I tried using docker attach 3f1492790397 and then hitting http://localhost:9200 in my browser, to see if that would generate some standard output (my app logs both to /opt/bootup/logs/bootup.log and to the console), but that doesn't yield any output.
So I think what's happening is that my app (for some reason) doesn't have permission to create its own log file when the container starts up, and puts the app in a weird state, or even prevents it from starting up altogether.
So I ask:
Is there a way to see what user my app is starting up as?; or
Is there a way to tail standard output while the container is starting? Attaching after startup doesn't help me because I think by the time I run the docker attach command the app has already choked
Thanks in advance!
I don't know why your app isn't working, but I can answer your questions:
Is there a way to see what user my app is starting up as?; or
A: Docker containers run as root unless otherwise specified.
Is there a way to tail standard output while the container is starting? Attaching after startup doesn't help me because I think by the time I run the docker attach command the app has already choked
A: Docker containers send stdout/stderr to the Docker logs by default. There are two ways to see them. One is to run the container with the flag -it instead of -d to get an interactive session that shows the stdout from your container. The other is to use the docker logs <container_name> command on a running or stopped container.
docker attach 3f1492790397
This doesn't do what you are hoping for. What you want is docker exec (probably docker exec -it bootup bash), which will give you a shell in the scope of the container which will let you check for your log files or try and hit the app using curl from inside the container.
Why do I get no output?
Hard to say without the info from the earlier commands. Is your app listening on 0.0.0.0 or on localhost (your laptop browser will look like an external machine to the container)? Does your app require a supervisor process that isn't running? Does it require some other JAR files that are on the CLASSPATH on your laptop but not in the container? Are you running docker using Docker-Machine (in which case localhost is probably not the name of the container)?
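The checks suggested in this answer could be bundled roughly as follows. This is a sketch, not a definitive procedure: the function name inspect_container is made up, the container name bootup and the log directory are taken from the question, and the docker exec steps only work while the container is running.

```shell
#!/bin/sh
# Sketch: gather the basic facts about a container in one go.
# inspect_container is a hypothetical helper name.
inspect_container() {
    name="${1:-bootup}"
    # Everything the app wrote to stdout/stderr, including startup output.
    docker logs "$name"
    # The user that commands run as by default inside this container.
    docker exec "$name" whoami
    # Ownership and permissions of the log directory from the question.
    docker exec "$name" ls -ld /opt/bootup/logs
}
```

Calling inspect_container bootup after docker run prints the startup logs first, which is usually where a failed startup shows up.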

Resources