As per the title: I have an issue with Loki (running on Docker) storing its chunks etc. in an AWS S3 bucket.
Loki is running fine; it simply stores its logs in the filesystem rather than in the bucket, and in fact the bucket is empty.
What is wrong with my configuration?
In AWS IAM, I've created a user with programmatic access and given it the following policy...
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "LokiStorage",
"Effect": "Allow",
"Action": [
"s3:ListBucket",
"s3:PutObject",
"s3:GetObject",
"s3:DeleteObject"
],
"Resource": [
"arn:aws:s3:::__myBucket__",
"arn:aws:s3:::__myBucket__/*"
]
}
]
}
The policy seems sufficient, since I've used it to push some files from the filesystem to the bucket, as shown below.
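For reference, roughly the kind of check I did from the same machine, assuming the AWS CLI is configured with the same access key and secret (the file name is just a placeholder):
aws s3 cp ./test-upload.txt s3://__myBucket__/test/test-upload.txt
aws s3 ls s3://__myBucket__/test/
aws s3 rm s3://__myBucket__/test/test-upload.txt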
The relevant part of the Docker Compose file...
version: "3.8"
volumes:
loki_data: {}
services:
loki:
image: grafana/loki:2.1.0
networks:
- my-overlay
ports:
- 3100:3100
volumes:
- ./loki/loki-config.yml:/etc/loki/local-config.yml
- loki_data:/loki
command: -config.file=/etc/loki/local-config.yaml
... That seems fine too: Loki - as container and service - runs smoothly.
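One sanity check worth doing at this point is to confirm which configuration file actually ends up inside the container; something along these lines, assuming the running container is named loki:
docker exec -it loki ls -la /etc/loki/
docker exec -it loki cat /etc/loki/local-config.yaml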
Loki's configuration file, "loki-config.yaml", also seems fine...
---
auth_enabled: false
ingester:
chunk_idle_period: 3m
chunk_block_size: 262144
chunk_retain_period: 1m
max_transfer_retries: 0
lifecycler:
ring:
kvstore:
store: inmemory
replication_factor: 1
limits_config:
enforce_metric_name: false
reject_old_samples: true
reject_old_samples_max_age: 168h
compactor:
working_directory: /loki/boltdb-shipper-compactor
shared_store: aws
schema_config:
configs:
- from: 2020-07-01
store: boltdb-shipper
object_store: aws
schema: v11
index:
prefix: loki_index_
period: 24h
server:
http_listen_port: 3100
storage_config:
aws:
s3: s3://__myAccessKey__:__mySecretAccessKey__@eu-west-1/__myBucket__
boltdb_shipper:
active_index_directory: /loki/index
shared_store: s3
cache_location: /loki/boltdb-cache
chunk_store_config:
max_look_back_period: 0s
table_manager:
retention_deletes_enabled: false
retention_period: 0s
... But, in fact, something is wrong and/or missing somewhere.
Suggestions?
I've found the culprit: there was an error in the docker-compose:
volumes:
- ./loki/loki-config.yml:/etc/loki/local-config.yml
### Here: "yml", and not "yaml" ──────────────────┘
- loki_data:/loki
command: -config.file=/etc/loki/local-config.yaml
### Here: "yaml", and not "yml" ─────────────┘
Now the compose file is correct, and Loki's configuration file is the following:
auth_enabled: false
server:
http_listen_port: 3100
ingester:
lifecycler:
address: 127.0.0.1
ring:
kvstore:
store: inmemory
replication_factor: 1
final_sleep: 0s
chunk_idle_period: 1h
max_chunk_age: 1h
chunk_target_size: 1048576
chunk_retain_period: 30s
max_transfer_retries: 0
schema_config:
configs:
- from: 2020-10-24
store: boltdb-shipper
object_store: s3
schema: v11
index:
prefix: index_
period: 24h
storage_config:
boltdb_shipper:
active_index_directory: /loki/index
cache_location: /loki/index_cache
cache_ttl: 24h # Can be increased for faster performance over longer query periods, uses more disk space
shared_store: s3
aws:
s3forcepathstyle: true
bucketnames: __myBucket__
region: eu-west-1
access_key_id: __myAccessKey__
secret_access_key: __mySecretAccessKey__
compactor:
working_directory: /loki/compactor
shared_store: s3
compaction_interval: 5m
limits_config:
reject_old_samples: true
reject_old_samples_max_age: 168h
ingestion_rate_mb: 16
ingestion_burst_size_mb: 32
chunk_store_config:
max_look_back_period: 0s
table_manager:
retention_deletes_enabled: false
retention_period: 0s
ruler:
storage:
type: s3
s3:
s3forcepathstyle: true
bucketnames: __myBucket__
region: eu-west-1
access_key_id: __myAccessKey__
secret_access_key: __mySecretAccessKey__
rule_path: /loki/rules-temp
alertmanager_url: http://localhost:9093
ring:
kvstore:
store: inmemory
enable_api: true
The configuration seems fine, since - at last! - the S3 bucket is populated and Loki works fine.
Maybe.
Oddly enough, the bucket (as said) is populated, but I see that Loki also stores its chunks etc. in the local filesystem:
root@ip-aaa-bbb-ccc-ddd:/var/lib/docker/volumes/monit_loki_data# tree
.
└── _data
├── compactor
│ ├── index_19375
│ └── index_19376
├── index
│ └── uploader
│ └── name
├── index_cache
│ └── index_19376
│ ├── 5a22562e87b2-1674059149901907563-1674116100.gz
│ └── compactor-1674116450.gz
├── rules
└── tmprules
10 directories, 3 files
Hence the question: is Loki really working as intended? That is, is storing data both in the bucket and in the filesystem its normal behaviour?
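For reference, this is roughly how I compare what lands in the bucket with what stays on the local volume, assuming the AWS CLI is configured with the same credentials:
aws s3 ls s3://__myBucket__/ --recursive | head
# as far as I understand, with auth_enabled: false Loki writes under the single-tenant "fake/" prefix,
# while boltdb-shipper keeps its active index and cache on the local volume until they are shipped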
Related
I am trying to create a private container registry on DigitalOcean Kubernetes, and I want all data to be saved in DigitalOcean Spaces. I am using this tutorial:
https://www.digitalocean.com/community/tutorials/how-to-set-up-a-private-docker-registry-on-top-of-digitalocean-spaces-and-use-it-with-digitalocean-kubernetes
Things and the pod are running well; I am able to push and pull images. I would like to configure basic auth (htpasswd) on top of it, but when I add the htpasswd attribute to my chart values file, I get this error:
{"level":"fatal","msg":"configuring application: unable to configure authorization (htpasswd): no access controller registered with name: htpasswd","time":"2022-12-14T13:02:23.608Z"}
My chart_values.yaml file:
ingress:
enabled: true
hosts:
- cr.somedomain.com
annotations:
cert-manager.io/cluster-issuer: letsencrypt-prod
nginx.ingress.kubernetes.io/proxy-body-size: "30720m"
args:
- --set controller.extraArgs.ingress-class=nginx
tls:
- secretName: somedomain-cr-prod
hosts:
- cr.somedomain.com
storage: s3
secrets:
htpasswd: |-
username:someBcryptPassword
s3:
accessKey: "someaccesskey"
secretKey: "someaccesssecret"
s3:
region: region
regionEndpoint: region.digitaloceanspaces.com
secure: true
bucket: somebucketname
image:
repository: somerepo
tag: latest
Maybe someone can tell me where I went wrong?
I have tried different formats for the htpasswd entry, but they all produced the same error.
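For reference, a bcrypt entry in the format the registry expects can be generated with the apache2-utils htpasswd tool (username and password here are placeholders):
htpasswd -Bbn username somePassword
# prints something like: username:$2y$05$...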
I've been working with a Docker deployment and I'm seeing an irksome behavior. The full project is here (I'm using the v1.0.0-CODI tag): https://github.com/NACHC-CAD/anonlink-entity-service
Sometimes (often) I get the following error:
running:
docker-compose -p anonlink -f tools/docker-compose.yml up --remove-orphans
I get:
Pulling db_init (data61/anonlink-app:v1.15.0-22-gba57975)...
ERROR: manifest for data61/anonlink-app:v1.15.0-22-gba57975 not found: manifest unknown: manifest unknown
anonlink-app is specified in the .yml file as:
data61/anonlink-app:${TAG:-latest}
How is it that Docker is looking for a non-existent tag?
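Since docker-compose substitutes ${TAG:-latest} from the shell environment (and from a .env file in the directory the command is run from), it helps to check what the compose file actually resolves to and, if needed, pin a tag explicitly; a sketch, using v1.15.1 from the project's values.yaml as an assumed known-good tag:
# show the fully resolved compose file, including the image tags that will be pulled
docker-compose -p anonlink -f tools/docker-compose.yml config | grep image
# check whether TAG is already set in the current shell
echo "TAG=${TAG}"
# override it explicitly for this run
TAG=v1.15.1 docker-compose -p anonlink -f tools/docker-compose.yml up --remove-orphans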
The full .yml file is shown below.
version: '3.4'
services:
db:
image: postgres:11.13
environment:
- POSTGRES_PASSWORD=rX%QpV7Xgyrz
volumes:
- psql:/var/lib/postgresql/data
#ports:
#- 5432:5432
healthcheck:
test: pg_isready -q -h db -p 5432 -U postgres
interval: 5s
timeout: 30s
retries: 5
minio:
image: minio/minio:RELEASE.2021-02-14T04-01-33Z
command: server /export
env_file: .env
volumes:
- minio:/export
ports:
- 9000:9000
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
interval: 30s
timeout: 20s
retries: 3
redis:
image: redis:5.0
# The flask application server
backend:
image: data61/anonlink-app:${TAG:-latest}
env_file: .env
environment:
- FLASK_DB_MIN_CONNECTIONS=1
- FLASK_DB_MAX_CONNECTIONS=10
depends_on:
- db
- db_init
- redis
- minio
- objectstore_init
# The application server can also setup the database
db_init:
image:
data61/anonlink-app:${TAG:-latest}
env_file:
- .env
environment:
- DEBUG=true
- DATABASE_PASSWORD=rX%QpV7Xgyrz
- FLASK_APP=entityservice
entrypoint: /bin/sh -c "dockerize -wait tcp://db:5432 alembic upgrade head"
depends_on:
- db
# Set up the object store to have another more restricted user
objectstore_init:
image: minio/mc:RELEASE.2021-02-14T04-28-06Z
environment:
- OBJECT_STORE_SECURE=false
env_file:
- .env
entrypoint: |
/bin/sh /opt/init-object-store.sh
volumes:
- ./init-object-store.sh:/opt/init-object-store.sh:ro
depends_on:
- minio
# A celery worker
worker:
image: data61/anonlink-app:${TAG:-latest}
depends_on:
- redis
- db
command: celery -A entityservice.async_worker worker --loglevel=info -O fair -Q celery,compute,highmemory
env_file:
- .env
environment:
- CELERY_ACKS_LATE=true
- REDIS_USE_SENTINEL=false
- CELERYD_MAX_TASKS_PER_CHILD=2048
#- CHUNK_SIZE_AIM=300_000_000
- CELERY_DB_MIN_CONNECTIONS=1
- CELERY_DB_MAX_CONNECTIONS=3
nginx:
image: data61/anonlink-nginx:${TAG:-latest}
ports:
- 8851:8851
depends_on:
- backend
environment:
TARGET_SERVICE: backend
PUBLIC_PORT: 8851
# A celery monitor. Useful for debugging.
# celery_monitor:
# image: data61/anonlink-app:${TAG:-latest}
# depends_on:
# - redis
# - worker
# command: celery flower -A entityservice.async_worker
# ports:
# - 8888:8888
# Jaeger UI is available at http://localhost:16686
jaeger:
image: jaegertracing/all-in-one:latest
environment:
COLLECTOR_ZIPKIN_HTTP_PORT: 9411
# ports:
# - 5775:5775/udp
# - 6831:6831/udp
# - 6832:6832/udp
# - 5778:5778
# - 16686:16686
# - 14268:14268
# - 9411:9411
volumes:
psql:
minio:
This values.yaml file also exists in the project (note it uses v1.15.1, not v1.15.0-22-gba57975):
rbac:
## TODO still needs work to fully lock down scope etc
## See issue #88
create: false
anonlink:
## Set arbitrary environment variables for the API and Workers.
config: {
## e.g.: to control which task is added to which celery worker queue.
## CELERY_ROUTES: "{
## 'entityservice.tasks.comparing.create_comparison_jobs': { 'queue': 'highmemory' }, ...
## }"
}
objectstore:
## Settings for the Object Store that Anonlink Entity Service uses internally
## Connect to the object store using https
secure: false
## Settings for uploads via Object Store
## Toggle the feature providing client's with restricted upload access to the object store.
## By default we don't expose the Minio object store, which is required for clients to upload
## via the object store. See section `minio.ingress` to create an ingress for minio.
uploadEnabled: true
## Server used as the external object store URL - provided to clients so should be externally
## accessible. If not provided, the minio.ingress is used (if enabled).
#uploadServer: "s3.amazonaws.com"
## Tell clients to make secure connections to the upload object store.
uploadSecure: true
## Object store credentials used to grant temporary upload access to clients
## Will be created with an "upload only" policy for a upload bucket if using the default
## MINIO provisioning.
uploadAccessKey: "EXAMPLE_UPLOAD_KEY"
uploadSecretKey: "EXAMPLE_UPLOAD_SECRET"
## The bucket for client uploads.
uploadBucket:
name: "uploads"
## Settings for downloads via Object Store
## Toggle the feature providing client's with restricted download access to the object store.
## By default we don't expose the Minio object store, which is required for clients to download
## via the object store.
downloadEnabled: true
## Tell clients to make secure connections to the download object store.
downloadSecure: true
## Server used as the external object store URL for downloads - provided to clients so
## should be externally accessible. If not provided, the minio.ingress is used (if enabled).
#downloadServer: "s3.amazonaws.com"
## Object store credentials used to grant temporary download access to clients
## Will be created with an "get only" policy if using the default MINIO provisioning.
downloadAccessKey: "EXAMPLE_DOWNLOAD_KEY"
downloadSecretKey: "EXAMPLE_DOWNLOAD_SECRET"
api:
## Deployment component name
name: api
## Defines the serviceAccountName to use when `rbac.create=false`
serviceAccountName: default
replicaCount: 1
## api Deployment Strategy type
strategy:
type: RollingUpdate
# type: Recreate
## Annotations to be added to api pods
##
podAnnotations: {}
# iam.amazonaws.com/role: linkage
## Annotations added to the api Deployment
deploymentAnnotations: # {}
# This annotation enables jaeger injection for open tracing
"sidecar.jaegertracing.io/inject": "true"
## Settings for the nginx proxy
www:
image:
repository: data61/anonlink-nginx
tag: "v1.4.9"
# pullPolicy: Always
pullPolicy: IfNotPresent
## Nginx proxy server resource requests and limits
## Ref: http://kubernetes.io/docs/user-guide/compute-resources/
resources:
limits:
cpu: 200m
memory: 256Mi
requests:
cpu: 200m
memory: 256Mi
app:
image:
repository: data61/anonlink-app
tag: "v1.15.1"
pullPolicy: IfNotPresent
## Ref: http://kubernetes.io/docs/user-guide/compute-resources/
resources:
limits:
cpu: 1
memory: 8Gi
requests:
cpu: 500m
memory: 512Mi
dbinit:
enabled: "true"
## Database init runs migrations after install and upgrade
image:
repository: data61/anonlink-app
tag: "v1.15.1"
## Ref: http://kubernetes.io/docs/user-guide/compute-resources/
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 250m
memory: 256Mi
## Annotations added to the database init job's pod.
# podAnnotations: {}
# sidecar.istio.io/inject: "false"
## A job that creates an upload only object store user.
objectstoreinit:
enabled: true
image:
repository: minio/mc
tag: RELEASE.2020-01-13T22-49-03Z
## Annotations added to the object store init job's pod.
# podAnnotations: {}
# sidecar.istio.io/inject: "false"
ingress:
## By default, we do not want the service to be accessible outside of the cluster.
enabled: false
## Ingress annotations
annotations: {}
## Suggested annotations
## To handle large uploads we increase the proxy buffer size
#ingress.kubernetes.io/proxy-body-size: 4096m
## Redirect to ssl
#ingress.kubernetes.io/force-ssl-redirect: "true"
## Deprecated but common
## https://kubernetes.io/docs/concepts/services-networking/ingress/#deprecated-annotation
# kubernetes.io/ingress.class: ""
path: /
pathType: Prefix
## Entity Service API Ingress hostnames
## Must be provided if Ingress is enabled
hosts: []
## E.g:
#- beta.anonlink.data61.xyz
## Ingress TLS configuration
## This example setup is for nginx-ingress. We use certificate manager.
## to create the TLS secret in the namespace with the name
## below.
tls: []
## secretName is the kubernetes secret which will contain the TLS secret and certificates
## for the provided host url. It is automatically generated from the deployed cert-manager.
#- secretName: beta-anonlink-data61-tls
# hosts:
# - beta.anonlink.data61.xyz
service:
annotations: []
labels:
tier: frontend
clusterIp: ""
## Expose the service to be accessed from outside the cluster (LoadBalancer service).
## or access it from within the cluster (ClusterIP service).
## Set the service type and the port to serve it.
## Ref: http://kubernetes.io/docs/user-guide/services/
## Most likely ingress is enabled so this should be ClusterIP,
## Otherwise "LoadBalancer".
type: ClusterIP
servicePort: 80
## If using a load balancer on AWS you can optionally lock down access
## to a given IP range. Provide a list of IPs that are allowed via a
## security group.
loadBalancerSourceRanges: []
workers:
name: "matcher"
image:
repository: "data61/anonlink-app"
tag: "v1.15.1"
pullPolicy: Always
## The initial number of workers for this deployment
## Note the workers.highmemory.replicaCount are in addition
replicaCount: 1
## Enable a horizontal pod autoscaler
## Note: The cluster must have metrics-server installed.
## https://kubernetes.io/docs/tasks/debug-application-cluster/resource-metrics-pipeline/
autoscaler:
enabled: false
minReplicas: 1
maxReplicas: 20
podAnnotations: {}
deploymentAnnotations: # {}
# This annotation enables jaeger injection for open tracing
"sidecar.jaegertracing.io/inject": "true"
#strategy: ""
## Additional Entity Service Worker container arguments
##
extraArgs: {}
## Worker configuration
## These settings populate the deployment's configmap.
## Desired task size in "number of comparisons"
## Note there is some overhead creating a task and a single dedicated cpu core can do between 50M and 100M
## comparisons per second, so much lower that 100M isn't generally worth splitting across celery workers.
CHUNK_SIZE_AIM: "300_000_000"
## More than this many entities and we skip caching in redis
MAX_CACHE_SIZE: "1_000_000"
## How many seconds do we keep cache ephemeral data such as run progress
## Default is 30 days:
CACHE_EXPIRY_SECONDS: "2592000"
## Specific configuration for celery
## Note that these configurations are the same for a "normal" worker, and a "highmemory" one,
## except for the requested resources and replicaCount which can differ.
celery:
## Number of fork worker celery node will have. It is recommended to use the same concurrency
## as workers.resources.limits.cpu
CONCURRENCY: "2"
## How many messages to prefetch at a time multiplied by the number of concurrent processes. Set to 1 because
## our tasks are usually quite "long".
PREFETCH_MULTIPLIER: "1"
## Maximum number of tasks a pool worker process can execute before it’s replaced with a new one
MAX_TASKS_PER_CHILD: "2048"
## Late ack means the task messages will be acknowledged after the task has been executed, not just before.
ACKS_LATE: "true"
## Currently, enable only the monitoring of celery.
monitor:
enabled: false
## Ref: http://kubernetes.io/docs/user-guide/compute-resources/
resources:
requests:
memory: 500Mi
cpu: 500m
## It is recommended to set limits. celery does not like to share resources.
limits:
memory: 1Gi
cpu: 2
## At least one "high memory" worker is also required.
highmemory:
replicaCount: 1
## Ref: http://kubernetes.io/docs/user-guide/compute-resources/
resources:
requests:
memory: 2Gi
cpu: 1
## It is recommended to set limits. celery does not like to share resources.
limits:
memory: 2Gi
cpu: 2
postgresql:
## See available settings and defaults at:
## https://github.com/kubernetes/charts/tree/master/stable/postgresql
nameOverride: "db"
persistence:
enabled: false
size: 8Gi
metrics:
enabled: true
#serviceMonitor:
#enabled: true
#namespace:
## Ref: http://kubernetes.io/docs/user-guide/compute-resources/
resources:
#limits:
# memory: 8Gi
requests:
#memory: 1Gi
cpu: 200m
global:
postgresql:
postgresqlDatabase: postgres
postgresqlUsername: postgres
postgresqlPassword: "examplePostgresPassword"
## In this section, we are not installing Redis. The main goal is to define configuration values for
## other services that need to access Redis.
redis:
## Note the `server` options are ignored if provisioning redis
## using this chart.
## External redis server url/ip
server: ""
## Does the redis server support the sentinel protocol
useSentinel: true
sentinelName: "mymaster"
## Note if deploying redis-ha you MUST have the same password below!
password: "exampleRedisPassword"
redis-ha:
## Settings for configuration of a provisioned redis ha cluster.
## https://github.com/DandyDeveloper/charts/tree/master/charts/redis-ha#configuration
## Provisioning is controlled in the `provision` section
auth: true
redisPassword: "exampleRedisPassword"
#replicas: 3
redis:
resources:
requests:
memory: 512Mi
cpu: 100m
limits:
memory: 10Gi
sentinel:
## Ref: http://kubernetes.io/docs/user-guide/compute-resources/
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
memory: 256Mi
persistentVolume:
enabled: false
size: 10Gi
nameOverride: "memstore"
# Enable transparent hugepages
# https://github.com/helm/charts/tree/master/stable/redis-ha#host-kernel-settings
sysctlImage:
enabled: true
mountHostSys: true
command:
- /bin/sh
- -xc
- |-
sysctl -w net.core.somaxconn=10000
echo never > /host-sys/kernel/mm/transparent_hugepage/enabled
# Enable prometheus exporter sidecar
exporter:
enabled: true
minio:
## Configure the object storage
## https://github.com/helm/charts/blob/master/stable/minio/values.yaml
## Root access credentials for the object store
## Note no defaults are provided to help prevent data breaches where
## the object store is exposed to the internet
#accessKey: "exampleMinioAccessKey"
#secretKey: "exampleMinioSecretKet"
defaultBucket:
enabled: true
name: "anonlink"
## Settings for deploying standalone object store
## Can distribute the object store across multiple nodes.
mode: "standalone"
service.type: "ClusterIP"
persistence:
enabled: false
size: 50Gi
storageClass: "default"
metrics:
serviceMonitor:
enabled: false
#additionalLabels: {}
#namespace: nil
# If you'd like to expose the MinIO object store
ingress:
enabled: false
#labels: {}
#annotations: {}
#hosts: []
#tls: []
nameOverride: "minio"
## Ref: http://kubernetes.io/docs/user-guide/compute-resources/
resources:
requests:
memory: 256Mi
cpu: 100m
limits:
memory: 5Gi
provision:
# enable to deploy a standalone version of each service as part of the helm deployment
minio: true
postgresql: true
redis: true
## Tracing config used by jaeger-client-python
## https://github.com/jaegertracing/jaeger-client-python/blob/master/jaeger_client/config.py
tracingConfig: |-
logging: true
metrics: true
sampler:
type: const
param: 1
## Custom logging file used to override the default settings. Will be used by the workers and the api container.
## Example of logging configuration:
loggingCfg: |-
version: 1
disable_existing_loggers: False
formatters:
simple:
format: "%(message)s"
file:
format: "%(asctime)-15s %(name)-12s %(levelname)-8s: %(message)s"
filters:
stderr_filter:
(): entityservice.logger_setup.StdErrFilter
stdout_filter:
(): entityservice.logger_setup.StdOutFilter
handlers:
stdout:
class: logging.StreamHandler
level: DEBUG
formatter: simple
filters: [stdout_filter]
stream: ext://sys.stdout
stderr:
class: logging.StreamHandler
level: ERROR
formatter: simple
filters: [stderr_filter]
stream: ext://sys.stderr
info_file_handler:
class: logging.handlers.RotatingFileHandler
level: INFO
formatter: file
filename: info.log
maxBytes: 10485760 # 10MB
backupCount: 20
encoding: utf8
error_file_handler:
class: logging.handlers.RotatingFileHandler
level: ERROR
formatter: file
filename: errors.log
maxBytes: 10485760 # 10MB
backupCount: 20
encoding: utf8
loggers:
entityservice:
level: INFO
entityservice.database.util:
level: WARNING
entityservice.cache:
level: WARNING
entityservice.utils:
level: INFO
celery:
level: INFO
jaeger_tracing:
level: WARNING
propagate: no
werkzeug:
level: WARNING
propagate: no
root:
level: INFO
handlers: [stdout, stderr, info_file_handler, error_file_handler]
This is also interesting...
According to the web site https://docs.docker.com/language/golang/develop/ the tag v1.15.0 seems to exist.
The .env file looks like this:
SERVER=http://nginx:8851
DATABASE_PASSWORD=myPassword
# Object Store Configuration
# Provide root credentials to MINIO to set up more restricted service accounts
# MC_HOST_alias is equivalent to manually configuring a minio host
# mc config host add minio http://minio:9000 <MINIO_ACCESS_KEY> <MINIO_SECRET_KEY>
#- MC_HOST_minio=http://AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY@minio:9000
MINIO_SERVER=minio:9000
MINIO_ACCESS_KEY=AKIAIOSFODNN7EXAMPLE
MINIO_SECRET_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
MINIO_SECURE=false
# Object store account which will have upload only object store access.
#UPLOAD_OBJECT_STORE_SERVER=
UPLOAD_OBJECT_STORE_BUCKET=uploads
UPLOAD_OBJECT_STORE_SECURE=false
UPLOAD_OBJECT_STORE_ACCESS_KEY=EXAMPLE_UPLOAD_ACCESS_KEY
UPLOAD_OBJECT_STORE_SECRET_KEY=EXAMPLE_UPLOAD_SECRET_ACCESS_KEY
# Object store account which will have "read only" object store access.
#DOWNLOAD_OBJECT_STORE_SERVER=
DOWNLOAD_OBJECT_STORE_ACCESS_KEY=EXAMPLE_DOWNLOAD_ACCESS_KEY
DOWNLOAD_OBJECT_STORE_SECRET_KEY=EXAMPLE_DOWNLOAD_SECRET_ACCESS_KEY
DOWNLOAD_OBJECT_STORE_SECURE=false
# Logging, monitoring and metrics
LOG_CFG=entityservice/verbose_logging.yaml
JAEGER_AGENT_HOST=jaeger
SOLVER_MAX_CANDIDATE_PAIRS=500000000
SIMILARITY_SCORES_MAX_CANDIDATE_PAIRS=999000000
The Spring Boot application is deployed on OpenShift 4. This application needs to create a file on the NFS share.
The OpenShift container has a volume mount of type NFS configured.
The container on OpenShift runs in a pod with a random user ID:
sh-4.2$ id
uid=1031290500(1031290500) gid=0(root) groups=0(root),1031290500
The mount point is /nfs/abc
sh-4.2$ ls -la /nfs/
ls: cannot access /nfs/abc: Permission denied
total 0
drwxr-xr-x. 1 root root 29 Nov 25 09:34 .
drwxr-xr-x. 1 root root 50 Nov 25 10:09 ..
d?????????? ? ? ? ? ? abc
In the Docker image I created a user "technical" with uid= gid=48760, as shown below.
FROM quay.repository
MAINTAINER developer
LABEL description="abc image" \
name="abc" \
version="1.0"
ARG APP_HOME=/opt/app
ARG PORT=8080
ENV JAR=app.jar \
SPRING_PROFILES_ACTIVE=default \
JAVA_OPTS=""
RUN mkdir $APP_HOME
ADD $JAR $APP_HOME/
WORKDIR $APP_HOME
EXPOSE $PORT
ENTRYPOINT java $JAVA_OPTS -Dspring.profiles.active=$SPRING_PROFILES_ACTIVE -jar $JAR
My deployment config file is shown below:
spec:
volumes:
- name: bad-import-file
persistentVolumeClaim:
claimName: nfs-test-pvc
containers:
- resources:
limits:
cpu: '1'
memory: 1Gi
requests:
cpu: 500m
memory: 512Mi
terminationMessagePath: /dev/termination-log
name: abc
env:
- name: SPRING_PROFILES_ACTIVE
valueFrom:
configMapKeyRef:
name: abc-configmap
key: spring.profiles.active
- name: DB_URL
valueFrom:
configMapKeyRef:
name: abc-configmap
key: db.url
- name: DB_USERNAME
valueFrom:
configMapKeyRef:
name: abc-configmap
key: db.username
- name: BAD_IMPORT_PATH
valueFrom:
configMapKeyRef:
name: abc-configmap
key: bad.import.path
- name: DB_PASSWORD
valueFrom:
secretKeyRef:
name: abc-secret
key: db.password
ports:
- containerPort: 8080
protocol: TCP
imagePullPolicy: IfNotPresent
volumeMounts:
- name: bad-import-file
mountPath: /nfs/abc
dnsPolicy: ClusterFirst
securityContext:
runAsGroup: 44337
runAsNonRoot: true
supplementalGroups:
- 44337
The PV definition is as follows:
apiVersion: v1
kind: PersistentVolume
metadata:
name: abc-tuc-pv
spec:
capacity:
storage: 10Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
storageClassName: classic-nfs
mountOptions:
- hard
- nfsvers=3
nfs:
path: /tm03v06_vol3014
server: tm03v06cl02.jit.abc.com
readOnly: false
Now the OpenShift user has this ID:
sh-4.2$ id
uid=1031290500(1031290500) gid=44337(technical) groups=44337(technical),1031290500
RECENT UPDATE
Just to be clear about the problem, below are two commands from the same pod terminal.
sh-4.2$ cd /nfs/
sh-4.2$ ls -la (The first command I tried immediately after pod creation.)
total 8
drwxr-xr-x. 1 root root 29 Nov 29 08:20 .
drwxr-xr-x. 1 root root 50 Nov 30 08:19 ..
drwxrwx---. 14 technical technical 8192 Nov 28 19:06 abc
sh-4.2$ ls -la (a few seconds later on the same pod terminal)
ls: cannot access abc: Permission denied
total 0
drwxr-xr-x. 1 root root 29 Nov 29 08:20 .
drwxr-xr-x. 1 root root 50 Nov 30 08:19 ..
d?????????? ? ? ? ? ? abc
So the problem is that I see these question marks (??????) on the mount point.
The mount itself is working, but I cannot access the /nfs/abc directory, and I see these question marks for some reason.
UPDATE
sh-4.2$ ls -la /nfs/abc/
ls: cannot open directory /nfs/abc/: Stale file handle
sh-4.2$ ls -la /nfs/abc/ (after a few seconds on the same pod terminal)
ls: cannot access /nfs/abc/: Permission denied
Could this STALE FILE HANDLE be the reason for this issue?
TL;DR
You can use the anyuid security context to run the pod to avoid having OpenShift assign an arbitrary UID, and set the permissions on the volume to the known UID of the user.
OpenShift will override the user ID that the image itself may specify it should run as:
The user ID isn't actually entirely random, but is an assigned user ID which is unique to your project. In fact, your project is assigned a range of user IDs that applications can be run as. The set of user IDs will not overlap with other projects. You can see what range is assigned to a project by running oc describe on the project.
The purpose of assigning each project a distinct range of user IDs is so that in a multitenant environment, applications from different projects never run as the same user ID. When using persistent storage, any files created by applications will also have different ownership in the file system.
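For example, the assigned range is visible as an annotation on the project (project name and values here are illustrative):
oc describe project my-project
# look for an annotation such as:
#   openshift.io/sa.scc.uid-range=1031290500/10000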
... this is a blessing and a curse when using shared persistent volume claims, for example PVCs mounted ReadWriteMany by multiple pods that read/write data: files created by one pod won't be accessible by the other pod because of the mismatched file ownership and permissions.
One way to get around this issue is using the anyuid security context which "provides all features of the restricted SCC, but allows users to run with any UID and any GID".
When using the anyuid security context, we know the user and group IDs the pod(s) are going to run as, and we can set the permissions on the shared volume in advance. By default all pods run with the restricted security context and get an arbitrary UID from the range allocated to the namespace; when running the pod with the anyuid security context, OpenShift doesn't assign that arbitrary UID.
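For example, granting the anyuid SCC to the service account the pod runs under (the service account and namespace names are assumptions):
oc adm policy add-scc-to-user anyuid -z default -n my-project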
This is just for example, but an image that is built with a non-root user with a fixed UID and GID (e.g. 1000:1000) would run in OpenShift as that user, files would be created with the ownership of that user (e.g. 1000:1000), permissions can be set on the PVC to the known UID and GID of the user set to run the service. For example, we can create a new PVC:
cat <<EOF |kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: data
namespace: k8s
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 8Gi
storageClassName: portworx-shared-sc
EOF
... then mount it in a pod:
kubectl run -i --rm --tty ansible --image=lazybit/ansible:v4.0.0 --restart=Never -n k8s --overrides='
{
"apiVersion": "v1",
"kind": "Pod",
"spec": {
"serviceAccountName": "default",
"containers": [
{
"name": "nginx",
"imagePullPolicy": "Always",
"image": "lazybit/ansible:v4.0.0",
"command": ["ash"],
"stdin": true,
"stdinOnce": true,
"tty": true,
"env": [
{
"name": "POD_NAME",
"valueFrom": {
"fieldRef": {
"apiVersion": "v1",
"fieldPath": "metadata.name"
}
}
}
],
"volumeMounts": [
{
"mountPath": "/data",
"name": "data"
}
]
}
],
"volumes": [
{
"name": "data",
"persistentVolumeClaim": {
"claimName": "data"
}
}
]
}
}'
... and create files in the PVC as the USER set in the Dockerfile.
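For completeness, a minimal sketch of an image built with a fixed, non-root UID/GID, using the 1000:1000 example from above (the base image and names are illustrative):
FROM alpine:3.17
# create a group and user with fixed IDs so file ownership on shared volumes is predictable
RUN addgroup -g 1000 app && adduser -D -u 1000 -G app app
USER 1000:1000
WORKDIR /data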
I'm using the official/stable Jenkins Helm release to install the chart on Kubernetes.
I'm using a GCS bucket as the destination in the corresponding section of the values.yaml file:
backup:
enabled: true
# Used for label app.kubernetes.io/component
componentName: "jenkins-backup"
schedule: "0 2 * * *"
labels: {}
annotations: {}
image:
repository: "maorfr/kube-tasks"
tag: "0.2.0"
extraArgs: []
# Add existingSecret for AWS credentials
existingSecret: {}
env: []
resources:
requests:
memory: 1Gi
cpu: 1
limits:
memory: 1Gi
cpu: 1
# Destination to store the backup artifacts
# Supported cloud storage services: AWS S3, Minio S3, Azure Blob Storage, Google Cloud Storage
# Additional support can added. Visit this repository for details
# Ref: https://github.com/maorfr/skbn
destination: "gs://jenkins-backup-240392409"
However, when the backup job starts, I get the following in its logs:
gs not implemented
Edit: to address the issue raised by @Maxim in a comment below, the pod's description indicates that the quotes do not end up in the backup command:
Pod Template:
Labels: <none>
Service Account: my-service-account
Containers:
jenkins-backup:
Image: maorfr/kube-tasks:0.2.0
Port: <none>
Host Port: <none>
Command:
kube-tasks
Args:
simple-backup
-n
jenkins
-l
app.kubernetes.io/instance=my-jenkins
--container
jenkins
--path
/var/jenkins_home
--dst
gs://my-destination-backup-bucket-6266
You should change the "gs" in destination to "gcs":
destination: "gcs://jenkins-backup-240392409"
However, you can also use the ThinBackup plugin in Jenkins, which makes backups straightforward. Check this guide for full instructions and a walkthrough.
I am trying to set up a private Docker registry on a Kubernetes cluster with Helm, but I am getting an error for the PVC. The error is:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 22m default-scheduler Successfully assigned docker-reg/docker-private-registry-docker-registry-6454b85dbb-zpdjc to 192.168.1.19
Warning FailedMount 2m10s (x9 over 20m) kubelet, 192.168.1.19 Unable to mount volumes for pod "docker-private-registry-docker-registry-6454b85dbb-zpdjc_docker-reg(82c8be80-eb43-11e8-85c9-b06ebfd124ff)": timeout expired waiting for volumes to attach or mount for pod "docker-reg"/"docker-private-registry-docker-registry-6454b85dbb-zpdjc". list of unmounted volumes=[data]. list of unattached volumes=[auth data docker-private-registry-docker-registry-config default-token-xc4p7]
What might be the reason for this error? I've also tried to create a PVC first and then use the existing PVC with the Docker registry Helm chart, but it gives the same error.
Steps:
Create a htpasswd file
Edit values.yml and add contents of htpasswd file to htpasswd key.
Modify values.yml to enable persistence
Run helm install stable/docker-registry --namespace docker-reg --name docker-private-registry --values helm-docker-reg/values.yml
values.yml file:
# Default values for docker-registry.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.
replicaCount: 1
updateStrategy:
# type: RollingUpdate
# rollingUpdate:
# maxSurge: 1
# maxUnavailable: 0
podAnnotations: {}
image:
repository: registry
tag: 2.6.2
pullPolicy: IfNotPresent
# imagePullSecrets:
# - name: docker
service:
name: registry
type: ClusterIP
# clusterIP:
port: 5000
# nodePort:
annotations: {}
# foo.io/bar: "true"
ingress:
enabled: false
path: /
# Used to create an Ingress record.
hosts:
- chart-example.local
annotations:
# kubernetes.io/ingress.class: nginx
# kubernetes.io/tls-acme: "true"
tls:
# Secrets must be manually created in the namespace.
# - secretName: chart-example-tls
# hosts:
# - chart-example.local
resources: {}
# We usually recommend not to specify default resources and to leave this as a conscious
# choice for the user. This also increases chances charts run on environments with little
# resources, such as Minikube. If you do want to specify resources, uncomment the following
# lines, adjust them as necessary, and remove the curly braces after 'resources:'.
# limits:
# cpu: 100m
# memory: 128Mi
# requests:
# cpu: 100m
# memory: 128Mi
persistence:
accessMode: 'ReadWriteOnce'
enabled: true
size: 10Gi
storageClass: 'rook-ceph-block'
# set the type of filesystem to use: filesystem, s3
storage: filesystem
# Set this to name of secret for tls certs
# tlsSecretName: registry.docker.example.com
secrets:
haSharedSecret: ""
htpasswd: "dasdma:$2y$05$bnLaYEdTLawodHz2ULzx2Ob.OUI6wY6bXr9WUuasdwuGZ7TIsTK2W"
# Secrets for Azure
# azure:
# accountName: ""
# accountKey: ""
# container: ""
# Secrets for S3 access and secret keys
# s3:
# accessKey: ""
# secretKey: ""
# Secrets for Swift username and password
# swift:
# username: ""
# password: ""
# Options for s3 storage type:
# s3:
# region: us-east-1
# bucket: my-bucket
# encrypt: false
# secure: true
# Options for swift storage type:
# swift:
# authurl: http://swift.example.com/
# container: my-container
configData:
version: 0.1
log:
fields:
service: registry
storage:
cache:
blobdescriptor: inmemory
http:
addr: :5000
headers:
X-Content-Type-Options: [nosniff]
health:
storagedriver:
enabled: true
interval: 10s
threshold: 3
securityContext:
enabled: true
runAsUser: 1000
fsGroup: 1000
priorityClassName: ""
nodeSelector: {}
tolerations: []
It's working now. The issue was with the OpenEBS storage, which is documented here: https://docs.openebs.io/docs/next/tsgiscsi.html
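For anyone hitting the same "timeout expired waiting for volumes to attach or mount" symptom, it usually comes down to a claim that never binds; a few generic checks (the names in angle brackets are placeholders):
kubectl -n docker-reg get pvc
kubectl -n docker-reg describe pvc <claim-name>
kubectl get storageclass
kubectl -n docker-reg describe pod <pod-name>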