Docker image statistics from hub.docker.com

I have a docker image on hub.docker.com. Is there a way to find out who is using my docker image or who is pulling it? Are there any statistics that hub.docker.com can provide?

You can get the total pull count and star count from the API:
https://hub.docker.com/v2/repositories/<namespace>/<repository>/
For example:
curl -s https://hub.docker.com/v2/repositories/library/ubuntu/ | jq -r ".pull_count"

At the moment, only the pull count can be retrieved. You can then use Google Apps Script to record the number of pulls periodically and store it in a Google Sheet. You can find more about that here: https://www.gasimof.com/blog/track_docker_image_pulls
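If you'd rather not depend on Google Apps Script, a minimal shell equivalent run from cron does the same job. This is just a sketch; the repository name and output path are assumptions:

#!/bin/sh
# Append a timestamped pull count to a CSV file. Schedule via cron, e.g.:
#   0 * * * * /usr/local/bin/record-pulls.sh >> /var/log/docker-pulls.csv
REPO="library/ubuntu"   # hypothetical example repository
PULLS=$(curl -s "https://hub.docker.com/v2/repositories/${REPO}/" | jq -r '.pull_count')
echo "$(date -u +%Y-%m-%dT%H:%M:%SZ),${REPO},${PULLS}"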

Since Docker Hub does not have an out-of-the-box way to see the pull trend, I ended up implementing a Prometheus exporter for myself and added a dashboard to my Grafana.
The Grafana graph is driven by the PromQL query docker_hub_pulls{repo="$repo"}.
Here is the GitHub link to my project: predatorray/docker-hub-prometheus-exporter.
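If you don't want to run a dedicated exporter, a low-effort alternative (not part of the project above) is node_exporter's textfile collector: a cron job writes the same docker_hub_pulls metric to a .prom file. A sketch, where the repository name and the collector directory are assumptions:

#!/bin/sh
# Write docker_hub_pulls for node_exporter's textfile collector
# (node_exporter started with --collector.textfile.directory=/var/lib/node_exporter).
REPO="library/ubuntu"   # hypothetical example repository
PULLS=$(curl -s "https://hub.docker.com/v2/repositories/${REPO}/" | jq -r '.pull_count')
cat > /var/lib/node_exporter/docker_hub.prom.$$ <<EOF
# HELP docker_hub_pulls Total pulls of a Docker Hub repository.
# TYPE docker_hub_pulls gauge
docker_hub_pulls{repo="${REPO}"} ${PULLS}
EOF
# Rename atomically so node_exporter never reads a half-written file
mv /var/lib/node_exporter/docker_hub.prom.$$ /var/lib/node_exporter/docker_hub.prom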

It's an old thread; just stumbling on this.
I also used an exporter (a different one from predatorray's) that stores data in a Prometheus database. It worked fine, except that the last-updated date was a Unix timestamp and I couldn't get it to display as a regular, more readable date. It was written in Go.
After modifying a script to store Pi-hole stats in InfluxDB, I thought I'd adapt that script to get the information from hub.docker.com into my InfluxDB. After a bit of testing I got it working, and I'm running it in a Docker container now: https://github.com/pluim003/dockerhub_influx
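For reference, the core of such a script boils down to a single write in InfluxDB line protocol. A minimal sketch against the InfluxDB 1.x HTTP API, where the host, database name, and repository are assumptions:

#!/bin/sh
REPO="library/ubuntu"   # hypothetical example repository
PULLS=$(curl -s "https://hub.docker.com/v2/repositories/${REPO}/" | jq -r '.pull_count')
# Write one point in line protocol to database "dockerhub" (must already exist)
curl -s -XPOST "http://localhost:8086/write?db=dockerhub" \
  --data-binary "docker_hub_pulls,repo=${REPO} value=${PULLS}i"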

Related

Universal way to check if container image changed without pulling that works for all container registries

I'm writing a tool to sync container images from any container registry. In order to sync images, I need a way to check whether the local image:tag differs from the remote image:tag, possibly by comparing image SHA IDs (as the image SHA digest is registry-based). Due to the nature of my tool, pulling the image first and then comparing with docker inspect is not suitable.
I was able to find some posts like this or this. They either tell me to use the Docker v2 API to fetch the remote metadata (which contains the image ID) and compare it with the local image ID, or to use container-diff (which seems to have been made for a more complicated problem: comparing packages in package management systems inside images). The Docker v2 API method is not universal because each registry (docker.io, gcr.io, ECR) requires different headers, authentication, etc. Therefore container-diff seems to be the most suitable choice for me, but I have yet to figure out a way to simply output true/false if the local and remote images differ. Also, it seems that this tool pulls images before diffing them.
Is there any way to do this universally for all registries? I see there are tools that have already implemented this feature, like fluxcd for Kubernetes, which can sync a remote image to a local pod image, but I don't know their technical details.
On a high level your approach of comparing the SHA values is correct; however, you need to dive deeper into the container spec, as there is more to it (layers and blobs).
There are already tools out there that can copy images from one registry to another, and by default they don't copy the data if the image already exists in the target. Skopeo is a good starting point.
If you plan to copy images from different registries, you need to cope with each registry individually. I would also recommend you take a look at Harbor. The Harbor container registry has the capability to copy images from and to various registries built in. You can use Harbor as your solution or as a starting point for your endeavor.
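To make the digest comparison concrete: skopeo inspect reads the remote manifest digest without pulling, and docker image inspect exposes the digest recorded at pull time. A rough sketch, where the image names are examples and the local image is assumed to have been pulled (locally built images have no RepoDigests):

#!/bin/sh
REMOTE_IMAGE="docker.io/library/ubuntu:latest"   # example remote reference
LOCAL_IMAGE="ubuntu:latest"                      # the same image as named locally
# Remote manifest digest straight from the registry, no pull needed
REMOTE=$(skopeo inspect "docker://${REMOTE_IMAGE}" | jq -r '.Digest')
# Local digest recorded at pull time (after the @ in RepoDigests)
LOCAL=$(docker image inspect --format '{{index .RepoDigests 0}}' "${LOCAL_IMAGE}" | cut -d@ -f2)
[ "$REMOTE" = "$LOCAL" ] && echo "up to date" || echo "image differs"

Because skopeo speaks the registry API for Docker Hub, gcr.io, ECR, and others, the per-registry authentication handling is delegated to it.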

Monitor result of a bash command or shell script using Prometheus

My requirement is to monitor the company's helpdesk system, which is running inside the Kubernetes cluster; for example, the URL https://xyz.zendesk.com.
They provide an API set to monitor this efficiently.
We can easily check the status using curl
$ curl -s "https://status.zendesk.com/api/components/support?domain=xyz.zendesk.com" | jq '.active_incidents'
[]
The above output means success status according to zendesk documentation.
Now the main part is, the company uses Prometheus to monitor everything.
How can I have Prometheus check the success status from the output of this curl command?
I did some research already and found somewhat related threads here and about using the Pushgateway.
Are they applicable to my requirement, or am I going down the wrong route?
What you probably (!?) want is something that:
1. Provides an HTTP(s) endpoint (e.g. /metrics)
2. Produces metrics in Prometheus' exposition format
3. Pulls its data from Zendesk's API
NOTE curl only gives you #3
There are some examples of solutions that appear to meet the requirements but none is from Zendesk:
https://www.google.com/search?q=%22zendesk%22+prometheus+exporter
There are two other lists of Prometheus exporters (neither contains Zendesk):
https://prometheus.io/docs/instrumenting/exporters/
https://github.com/prometheus/prometheus/wiki/Default-port-allocations
I recommend you contact Zendesk and ask whether a Prometheus exporter already exists; it's surprising not to find one.
It is straightforward to write a Prometheus exporter. Prometheus client libraries and a Zendesk API client are available and preferred; while bash is possible, it's probably sub-optimal.
If all you want to do is GET that static endpoint, get a 200 response code, and confirm that the body is [], you may be able to use the Prometheus Blackbox exporter.
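Since you mentioned the Pushgateway: a cron-driven script that pushes the incident count is another low-effort option. A sketch, assuming a Pushgateway reachable at localhost:9091:

#!/bin/sh
# Count active incidents and push the number to a Pushgateway
COUNT=$(curl -s "https://status.zendesk.com/api/components/support?domain=xyz.zendesk.com" \
  | jq '.active_incidents | length')
cat <<EOF | curl -s --data-binary @- http://localhost:9091/metrics/job/zendesk_status
# TYPE zendesk_active_incidents gauge
zendesk_active_incidents ${COUNT}
EOF

The usual caveat applies: the Pushgateway is intended for batch jobs and pushed metrics don't expire, so an exporter or the Blackbox approach is generally cleaner.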
NOTE Logging and monitoring tools often provide a higher-level tool, something analogous to a "universal translator", that facilitates translation from third-party systems' native logging/monitoring formats into some canonical form using config rather than code; fluentd is an example from the logging space. To my knowledge there is no such tool for Prometheus, but I sense an opportunity for someone to create one.

Building platform-as-a-service

I have been assigned a problem statement which goes as follows:
I am building a platform-as-a-service from scratch with pipelined execution. Here, pipelined execution means that the output of one service can be the input of another service. The platform can offer a number of services, which can be pipelined together, and the output of a service can be the input to multiple services.
I am new to this field so how to go about this task is not very intuitive to me.
After researching a bit, I found that I can use Docker to deploy services in containers. So I installed Docker on Ubuntu, installed a few images, and ran them as services (for example, MongoDB). What I am thinking is that I need to run the services in containers and define a way of getting input into and output out of these services. But how exactly do I do this using Docker containers? As an example, I want to send a query as input to MongoDB (running as a service) and get an output, which I want to feed into another service.
Am I thinking in the right direction? If not in what direction should I be thinking of going about implementing this task?
Is there a standard way of exchanging data between services (for example, the output of one service as the input to another)?
Is there something that Docker offers that I can leverage?
NOTE: I cannot use any high level API which does this for me. I need to implement it myself.

Is it possible to run a new docker container of the same image when the old one reaches a specified memory limit

I am wondering whether it is possible, by some automated means, to run a new Docker container whenever the old container reaches a specific memory/CPU usage limit, so that the old container doesn't get killed and the new one balances the load.
You mean a sort of autoscaling. At the moment I don't know of a built-in solution ready to be used, but I can share my idea (a rough sketch follows below):
Use a metrics collector like cAdvisor (https://github.com/google/cadvisor) to grab information about your container (you can also use docker stats for that).
Store this data in a time-series database like InfluxDB or Prometheus.
Create a continuous query that triggers a "create new container" event when some metric goes outside your limit.
I know you are looking for something ready-made, but at the moment I don't see any tool that solves this problem.
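To make the loop concrete, here is a crude sketch built only on docker stats and docker run. Every name, image, and threshold here is made up, and there is no load balancing; a real deployment would use an orchestrator:

#!/bin/sh
# Naive scale-out: if the container's memory usage crosses THRESHOLD percent,
# start one more container from the same image.
CONTAINER="myapp"       # hypothetical container name
IMAGE="myorg/myapp"     # hypothetical image
THRESHOLD=80
while sleep 30; do
  # MemPerc comes back like "85.32%"; strip the percent sign
  MEM=$(docker stats --no-stream --format '{{.MemPerc}}' "$CONTAINER" | tr -d '%')
  # Compare the whole-number part of the percentage as an integer
  if [ -n "$MEM" ] && [ "${MEM%.*}" -ge "$THRESHOLD" ]; then
    docker run -d "$IMAGE"
    break
  fi
done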
It sounds like you need a container orchestrator, possibly for other use cases as well. You can drive scaling choices with metrics via almost any of them: Mesos, Kubernetes, or Swarm. Swarm is evolving a lot, with Docker investing heavily in it. Swarm mode is a new feature coming in 1.12 which will put a lot of this orchestration in the core product, and it would probably cover your use case well.

Pulling docker images

Is there a way I can manually download a Docker image?
I have a pretty slow Internet connection, and for me it is better to get a link to the image and download it elsewhere with a better Internet speed.
How can I get the direct URL of the image managed by docker pull?
It's possible to obtain that, but let me suggest two other ways!
If you can connect to a remote server with a fast connection, and that server can run Docker, you can docker pull on that server, then docker save to export the image (with all its layers and metadata) as a tarball, and transfer that tarball any way you like.
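For example (the image name is just an illustration):

# On the machine with the fast connection
docker pull ubuntu:latest
docker save ubuntu:latest | gzip > ubuntu.tar.gz
# ...transfer ubuntu.tar.gz by USB stick, rsync, etc., then on the slow machine:
gunzip -c ubuntu.tar.gz | docker load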
If you want to transfer multiple images sharing a common base, the previous method won't be great, because you will end up transferring multiple tarballs sharing a lot of data. So another possibility is to run a private registry e.g. on a "movable" computer (laptop), connect it to the fast network, pull images, push images to the private registry; then move the laptop to the "slow" network, and pull images from it.
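A sketch of that flow with the official registry image, where host names, ports, and addresses are assumptions:

# On the laptop, while on the fast network
docker run -d -p 5000:5000 --name registry registry:2
docker pull ubuntu:latest
docker tag ubuntu:latest localhost:5000/ubuntu:latest
docker push localhost:5000/ubuntu:latest
# Later, on the slow network, other hosts pull from the laptop
# (192.168.1.50 is an example address; a plain-HTTP registry must be
# whitelisted via the insecure-registries option in daemon.json)
docker pull 192.168.1.50:5000/ubuntu:latest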
If none of those solutions is acceptable for you, don't hesitate to give more details, we'll be happy to help!
You could pull down the individual layers with this:
https://github.com/samalba/docker-registry-debug
Use the curlme option.
Reassembling the layers into an image is left as an exercise for the reader.
