We are trying to get EKS logs into Graylog. Graylog was deployed with Helm charts (MongoDB, Elasticsearch, and Graylog), and Graylog itself works fine.
After Graylog was up, we deployed Fluent-bit to collect the EKS logs. This is the Fluent-bit configuration used to send logs to Graylog:
inputs: |
  [INPUT]
      Name              tail
      Tag               kube.*
      Path              /var/log/containers/*.log
      DB                /var/log/flb_graylog.db
      Parser            docker
      Docker_Mode       On
      Mem_Buf_Limit     50MB
      Skip_Long_Lines   On
      Refresh_Interval  10
      Key               log

  [INPUT]
      Name              systemd
      Tag               host.*
      Systemd_Filter    _SYSTEMD_UNIT=kubelet.service
      Read_From_Tail    On
## https://docs.fluentbit.io/manual/pipeline/filters
filters: |
  [FILTER]
      Name                kubernetes
      Match               kube.*
      Merge_Log           On
      Merge_Log_Key       log_processed
      Keep_Log            Off
      K8S-Logging.Parser  On
      K8S-Logging.Exclude Off
      Annotations         Off
      Labels              On

  [FILTER]
      Name                nest
      Match               *
      Operation           lift
      Nested_under        kubernetes
## https://docs.fluentbit.io/manual/pipeline/outputs
outputs: |
  [OUTPUT]
      Name            es
      Match           kube.*
      Host            elasticsearch-master
      Port            9200
      Logstash_Format On
      Retry_Limit     Off
      Replace_Dots    On

  [OUTPUT]
      Name            es
      Match           host.*
      Host            elasticsearch-master
      Port            9200
      Logstash_Format On
      Logstash_Prefix node
      Retry_Limit     Off
      Replace_Dots    On

  [OUTPUT]
      Name                   gelf
      Match                  *
      Host                   graylog.example.com
      Port                   12201
      Mode                   tcp
      Gelf_Short_Message_Key short_message

  [OUTPUT]
      Name                 syslog
      Match                *
      Host                 graylog.example.com
      Port                 541
      Mode                 udp
      Syslog_Format        rfc5424
      Syslog_Maxsize       2048
      Syslog_Severity_Key  severity
      Syslog_Facility_Key  facility
      Syslog_Sd_Key        sd
      Syslog_Message_Key   message
## https://docs.fluentbit.io/manual/pipeline/parsers
customParsers: |
  [PARSER]
      Name        docker
      Format      json
      Time_Key    time
      Time_Format %Y-%m-%dT%H:%M:%S.%L
      Time_Keep   On
      # Command      | Decoder | Field | Optional Action
      # =============|==================|=================
      Decode_Field_As escaped log

  [PARSER]
      Name        syslog
      Format      regex
      Regex       ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
      Time_Key    time
      Time_Format %b %d %H:%M:%S
In the fluent-bit logs, I am getting these errors:
[2022/08/02 06:50:18] [error] [upstream] connection #141 to graylog.example.com:12201 timed out after 10 seconds
[2022/08/02 06:50:18] [error] [upstream] connection #126 to graylog.example.com:12201 timed out after 10 seconds
[2022/08/02 06:50:18] [error] [upstream] connection #143 to graylog.example.com:12201 timed out after 10 seconds
[2022/08/02 06:50:18] [error] [upstream] connection #125 to graylog.example.com:12201 timed out after 10 seconds
[2022/08/02 06:50:18] [error] [upstream] connection #144 to graylog.example.com:12201 timed out after 10 seconds
[2022/08/02 06:50:18] [error] [upstream] connection #139 to graylog.example.com:12201 timed out after 10 seconds
[2022/08/02 06:50:18] [error] [output:gelf:gelf.2] no upstream connections available
[2022/08/02 06:50:18] [error] [output:gelf:gelf.2] no upstream connections available
[2022/08/02 06:50:18] [error] [output:gelf:gelf.2] no upstream connections available
[2022/08/02 06:50:18] [error] [output:gelf:gelf.2] no upstream connections available
[2022/08/02 06:50:18] [error] [output:gelf:gelf.2] no upstream connections available
[2022/08/02 06:50:18] [error] [output:gelf:gelf.2] no upstream connections available
[2022/08/02 06:50:18] [ warn] [engine] chunk '1-1659422985.239025920.flb' cannot be retried: task_id=108, input=tail.0 > output=gelf.2
[2022/08/02 06:50:18] [ warn] [engine] chunk '1-1659422988.238308295.flb' cannot be retried: task_id=15, input=tail.0 > output=gelf.2
[2022/08/02 06:50:18] [ warn] [engine] chunk '1-1659422988.801295849.flb' cannot be retried: task_id=116, input=tail.0 > output=gelf.2
[2022/08/02 06:50:18] [ warn] [engine] failed to flush chunk '1-1659423006.238302940.flb', retry in 11 seconds: task_id=46, input=tail.0 > output=gelf.2 (out_id=2)
[2022/08/02 06:50:18] [ warn] [engine] chunk '1-1659422989.738179384.flb' cannot be retried: task_id=105, input=tail.0 > output=gelf.2
[2022/08/02 06:50:18] [ warn] [engine] failed to flush chunk '1-1659423007.739931411.flb', retry in 10 seconds: task_id=56, input=tail.0 > output=gelf.2 (out_id=2)
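For reference, these timeouts mean the TCP connection to graylog.example.com:12201 is never established from the Fluent Bit pods, so it is worth checking first whether a Graylog GELF TCP input is actually listening on that port and whether the address is reachable from inside the cluster at all (DNS, security groups, NetworkPolicies). A minimal reachability check, assuming the busybox image can be pulled and that gelf-test is just a throwaway pod name:

# run a one-off pod and try to open a TCP connection to the GELF port
kubectl run gelf-test --rm -it --restart=Never --image=busybox -- \
  sh -c 'echo test | nc -w 5 graylog.example.com 12201 && echo "port reachable" || echo "cannot connect"'

If this also times out, the problem is network/DNS/input configuration on the Graylog side rather than anything in the Fluent Bit output section.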
I am trying to set up Xdebug 3 in a Docker container and hit a breakpoint in Visual Studio Code.
My xdebug log:
[7] Log opened at 2021-02-16 16:12:12.027862
[7] [Step Debug] INFO: Checking remote connect back address.
[7] [Step Debug] INFO: Checking header 'HTTP_X_FORWARDED_FOR'.
[7] [Step Debug] INFO: Checking header 'REMOTE_ADDR'.
[7] [Step Debug] INFO: Client host discovered through HTTP header, connecting to 172.26.0.1:9099.
[7] [Step Debug] INFO: Connected to debugging client: 172.26.0.1:9099 (from REMOTE_ADDR HTTP header). :-)
[7] [Step Debug] -> <init xmlns="urn:debugger_protocol_v1" xmlns:xdebug="https://xdebug.org/dbgp/xdebug" fileuri="file:///var/www/public/index.php" language="PHP" xdebug:language_version="7.3.27" protocol_version="1.0" appid="7" idekey="VSCODE"><engine version="3.0.2"><![CDATA[Xdebug]]></engine><author><![CDATA[Derick Rethans]]></author><url><![CDATA[https://xdebug.org]]></url><copyright><![CDATA[Copyright (c) 2002-2021 by Derick Rethans]]></copyright></init>
[7] [Step Debug] -> <response xmlns="urn:debugger_protocol_v1" xmlns:xdebug="https://xdebug.org/dbgp/xdebug" status="stopping" reason="ok"></response>
[7] [Step Debug] WARN: 2021-02-16 16:12:12.039287: There was a problem sending 179 bytes on socket 6: Broken pipe.
xdebug.ini configuration:
[xdebug]
zend_extension=/usr/local/lib/php/extensions/no-debug-non-zts-20180731/xdebug.so
xdebug.client_port = 9099
xdebug.client_host = host.docker.internal
xdebug.idekey = VSCODE
xdebug.mode = debug
xdebug.start_with_request = yes
xdebug.discover_client_host = true
xdebug.log = /var/tmp/xdebug.log
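To confirm which of these settings PHP actually picks up inside the container, a quick check like the one below can help (the container name is a placeholder, and the CLI may use a different ini than the web SAPI):

docker exec <php-container> php -v
docker exec <php-container> php -i | grep -E "^xdebug\.(mode|client_host|client_port|start_with_request|discover_client_host)"

Calling xdebug_info() from a web page shows the same diagnostics for the web SAPI.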
Visual Studio Code configuration (launch.json):
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Listen for XDebug",
      "type": "php",
      "request": "launch",
      "port": 9099,
      "log": true,
      "externalConsole": false,
      "pathMappings": {
        "/var/www/public": "${workspaceFolder}"
      },
      "ignore": [
        "**/vendor/**/*.php"
      ]
    }
  ]
}
I would be grateful for your help; I have run out of ideas about why it doesn't work.
host.docker.internal only works with Docker Desktop (macOS/Windows), not on a Linux host. On Linux, the host is reachable via the docker0 bridge address:
172.17.0.1/16 brd 172.17.255.255 scope global docker0
So the configuration should probably be:
zend_extension=/usr/local/lib/php/extensions/no-debug-non-zts-20180731/xdebug.so
xdebug.mode=debug
xdebug.client_port = 9003
xdebug.client_host=172.17.0.1
xdebug.start_with_request=yes
xdebug.extended_info=1
xdebug.remote_handler="dbgp"
xdebug.remote_connect_back=0
xdebug.idekey = "VSCODE"
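If hard-coding the bridge IP is inconvenient (it can differ between networks), an alternative on Linux with Docker Engine 20.10+ is to map host.docker.internal explicitly when starting the container and keep xdebug.client_host = host.docker.internal. A sketch of both variants (the service name php is just an example):

# plain docker run:
docker run --add-host=host.docker.internal:host-gateway ...

# docker-compose:
services:
  php:
    extra_hosts:
      - "host.docker.internal:host-gateway"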
I have a job that periodically starts several Docker containers, and for each container I also start a Filebeat container to gather the logs and save them in Elasticsearch.
Filebeat version 7.9 is used.
The Docker containers are started from a Java application using the Spotify docker-client and are terminated when the job finishes.
The Filebeat configuration is the following; it monitors only a specific Docker container:
filebeat.inputs:
- paths: ${logs_paths}
  include_lines: ['^{']
  json.message_key: log
  json.keys_under_root: true
  json.overwrite_keys: true
  json.add_error_key: true
  type: log
  scan_frequency: 10s
  ignore_older: 15m
- paths: ${logs_paths}
  exclude_lines: ['^{']
  json.message_key: log
  type: log
  json.keys_under_root: true
  json.overwrite_keys: true
  json.add_error_key: true
  scan_frequency: 10s
  ignore_older: 15m
  max_bytes: 20000000

processors:
- decode_json_fields:
    fields: ["log"]
    target: ""

output.elasticsearch:
  hosts: ${elastic_host}
  username: "something"
  password: "else"
logs_paths:
- /var/lib/docker/containers/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5-json.log
From time to time we observe that one Filebeat container crashes immediately after starting, with the following error. Although the job runs the same Docker images each time, the Filebeat error can appear for any of them:
2020-12-09T16:00:15.784Z INFO instance/beat.go:640 Home path: [/usr/share/filebeat] Config path: [/usr/share/filebeat] Data path: [/usr/share/filebeat/data] Logs path: [/usr/share/filebeat/logs]
2020-12-09T16:00:15.864Z INFO instance/beat.go:648 Beat ID: 03ef7f54-2768-4d93-b7ca-c449e94b239c
2020-12-09T16:00:15.868Z INFO [seccomp] seccomp/seccomp.go:124 Syscall filter successfully installed
2020-12-09T16:00:15.868Z INFO [beat] instance/beat.go:976 Beat info {"system_info": {"beat": {"path": {"config": "/usr/share/filebeat", "data": "/usr/share/filebeat/data", "home": "/usr/share/filebeat", "logs": "/usr/share/filebeat/logs"}, "type": "filebeat", "uuid": "03ef7f54-2768-4d93-b7ca-c449e94b239c"}}}
2020-12-09T16:00:15.869Z INFO [beat] instance/beat.go:985 Build info {"system_info": {"build": {"commit": "b2ee705fc4a59c023136c046803b56bc82a16c8d", "libbeat": "7.9.0", "time": "2020-08-11T20:11:11.000Z", "version": "7.9.0"}}}
2020-12-09T16:00:15.869Z INFO [beat] instance/beat.go:988 Go runtime info {"system_info": {"go": {"os":"linux","arch":"amd64","max_procs":4,"version":"go1.14.4"}}}
2020-12-09T16:00:15.871Z INFO [beat] instance/beat.go:992 Host info {"system_info": {"host": {"architecture":"x86_64","boot_time":"2020-10-28T10:03:29Z","containerized":true,"name":"638de114b513","ip":["someIP"],"kernel_version":"4.4.0-190-generic","mac":["someMAC"],"os":{"family":"redhat","platform":"centos","name":"CentOS Linux","version":"7 (Core)","major":7,"minor":8,"patch":2003,"codename":"Core"},"timezone":"UTC","timezone_offset_sec":0}}}
2020-12-09T16:00:15.876Z INFO [beat] instance/beat.go:1021 Process info {"system_info": {"process": {"capabilities": {"inheritable":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"permitted":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"effective":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"bounding":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"ambient":null}, "cwd": "/usr/share/filebeat", "exe": "/usr/share/filebeat/filebeat", "name": "filebeat", "pid": 1, "ppid": 0, "seccomp": {"mode":"filter"}, "start_time": "2020-12-09T16:00:14.670Z"}}}
2020-12-09T16:00:15.876Z INFO instance/beat.go:299 Setup Beat: filebeat; Version: 7.9.0
2020-12-09T16:00:15.876Z INFO [index-management] idxmgmt/std.go:184 Set output.elasticsearch.index to 'someIndex' as ILM is enabled.
2020-12-09T16:00:15.877Z INFO eslegclient/connection.go:99 elasticsearch url: someURL
2020-12-09T16:00:15.878Z INFO [publisher] pipeline/module.go:113 Beat name: 638de114b513
2020-12-09T16:00:15.885Z INFO [monitoring] log/log.go:118 Starting metrics logging every 30s
2020-12-09T16:00:15.886Z INFO instance/beat.go:450 filebeat start running.
2020-12-09T16:00:15.893Z INFO memlog/store.go:119 Loading data file of '/usr/share/filebeat/data/registry/filebeat' succeeded. Active transaction id=0
2020-12-09T16:00:15.893Z INFO memlog/store.go:124 Finished loading transaction log file for '/usr/share/filebeat/data/registry/filebeat'. Active transaction id=0
2020-12-09T16:00:15.893Z INFO [registrar] registrar/registrar.go:108 States Loaded from registrar: 0
2020-12-09T16:00:15.893Z INFO [crawler] beater/crawler.go:71 Loading Inputs: 2
2020-12-09T16:00:15.894Z INFO log/input.go:157 Configured paths: [/var/lib/docker/containers/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5-json.log]
2020-12-09T16:00:15.895Z INFO [crawler] beater/crawler.go:141 Starting input (ID: 3906827571448963007)
2020-12-09T16:00:15.895Z INFO log/harvester.go:297 Harvester started for file: /var/lib/docker/containers/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5-json.log
2020-12-09T16:00:15.902Z INFO beater/crawler.go:148 Stopping Crawler
2020-12-09T16:00:15.902Z INFO beater/crawler.go:158 Stopping 1 inputs
2020-12-09T16:00:15.902Z INFO [crawler] beater/crawler.go:163 Stopping input: 3906827571448963007
2020-12-09T16:00:15.902Z INFO input/input.go:136 input ticker stopped
2020-12-09T16:00:15.902Z INFO log/harvester.go:320 Reader was closed: /var/lib/docker/containers/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5-json.log. Closing.
2020-12-09T16:00:15.902Z INFO beater/crawler.go:178 Crawler stopped
2020-12-09T16:00:15.902Z INFO [registrar] registrar/registrar.go:131 Stopping Registrar
2020-12-09T16:00:15.902Z INFO [registrar] registrar/registrar.go:165 Ending Registrar
2020-12-09T16:00:15.903Z INFO [registrar] registrar/registrar.go:136 Registrar stopped
2020-12-09T16:00:15.912Z INFO [monitoring] log/log.go:153 Total non-zero metrics {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":80,"time":{"ms":80}},"total":{"ticks":230,"time":{"ms":232},"value":0},"user":{"ticks":150,"time":{"ms":152}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":8},"info":{"ephemeral_id":"cae44857-494c-40e7-bf6a-e06e2cf40759","uptime":{"ms":290}},"memstats":{"gc_next":16703568,"memory_alloc":8518080,"memory_total":40448184,"rss":73908224},"runtime":{"goroutines":11}},"filebeat":{"events":{"added":2,"done":2},"harvester":{"closed":1,"open_files":0,"running":0,"started":1}},"libbeat":{"config":{"module":{"running":0}},"output":{"type":"elasticsearch"},"pipeline":{"clients":0,"events":{"active":0,"filtered":2,"total":2}}},"registrar":{"states":{"current":1,"update":2},"writes":{"success":2,"total":2}},"system":{"cpu":{"cores":4},"load":{"1":1.79,"15":1.21,"5":1.54,"norm":{"1":0.4475,"15":0.3025,"5":0.385}}}}}}
2020-12-09T16:00:15.912Z INFO [monitoring] log/log.go:154 Uptime: 292.790204ms
2020-12-09T16:00:15.912Z INFO [monitoring] log/log.go:131 Stopping metrics logging.
2020-12-09T16:00:15.913Z INFO instance/beat.go:456 filebeat stopped.
2020-12-09T16:00:15.913Z ERROR instance/beat.go:951 Exiting: Failed to start crawler: starting input failed: Error while initializing input: Can only start an input when all related states are finished: {Id: native::4096794-64769, Finished: false, Fileinfo: &{40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5-json.log 0 416 {874391692 63743126415 0x608b880} {64769 4096794 1 33184 0 0 0 0 0 4096 0 {1607529615 874391692} {1607529615 874391692} {1607529615 874391692} [0 0 0]}}, Source: /var/lib/docker/containers/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5-json.log, Offset: 0, Timestamp: 2020-12-09 16:00:15.896210395 +0000 UTC m=+0.302799924, TTL: -1ns, Type: log, Meta: map[], FileStateOS: 4096794-64769}
Exiting: Failed to start crawler: starting input failed: Error while initializing input: Can only start an input when all related states are finished: {Id: native::4096794-64769, Finished: false, Fileinfo: &{40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5-json.log 0 416 {874391692 63743126415 0x608b880} {64769 4096794 1 33184 0 0 0 0 0 4096 0 {1607529615 874391692} {1607529615 874391692} {1607529615 874391692} [0 0 0]}}, Source: /var/lib/docker/containers/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5-json.log, Offset: 0, Timestamp: 2020-12-09 16:00:15.896210395 +0000 UTC m=+0.302799924, TTL: -1ns, Type: log, Meta: map[], FileStateOS: 4096794-64769}
Does anyone have an idea what might cause this?
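One detail that may matter: both inputs above point at the same ${logs_paths} file, and Filebeat expects each file to be harvested by only one input; this particular error is typically reported when a second input tries to claim a file whose state is still marked as unfinished. A minimal sketch of collapsing the two inputs into one (keeping the JSON settings; the include/exclude split would need to be re-implemented with processors if it is still required):

filebeat.inputs:
- type: log
  paths: ${logs_paths}
  json.message_key: log
  json.keys_under_root: true
  json.overwrite_keys: true
  json.add_error_key: true
  scan_frequency: 10s
  ignore_older: 15m
  max_bytes: 20000000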
I have set up my application on DigitalOcean using Docker. It was working fine, but a few days ago it stopped working: whenever I try to build and deploy the application, it doesn't show any progress.
When I run the following command
docker-compose build && docker-compose stop && docker-compose up -d
the system gets stuck at the following output:
db uses an image, skipping
elasticsearch uses an image, skipping
redis uses an image, skipping
Building app
It doesn't show any further progress.
Here are the docker-compose logs:
db_1 | LOG: received smart shutdown request
db_1 | LOG: autovacuum launcher shutting down
db_1 | LOG: shutting down
db_1 | LOG: database system is shut down
db_1 | LOG: database system was shut down at 2018-01-10 02:25:36 UTC
db_1 | LOG: MultiXact member wraparound protections are now enabled
db_1 | LOG: database system is ready to accept connections
db_1 | LOG: autovacuum launcher started
redis_1 | 11264:C 26 Mar 15:20:17.028 # Failed opening the RDB file root (in server root dir /run) for saving: Permission denied
redis_1 | 1:M 26 Mar 15:20:17.127 # Background saving error
redis_1 | 1:M 26 Mar 15:20:23.038 * 1 changes in 3600 seconds. Saving...
redis_1 | 1:M 26 Mar 15:20:23.038 * Background saving started by pid 11265
elasticsearch | [2018-03-06T01:18:25,729][WARN ][o.e.b.BootstrapChecks ] [_IRIbyW] max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
elasticsearch | [2018-03-06T01:18:28,794][INFO ][o.e.c.s.ClusterService ] [_IRIbyW] new_master {_IRIbyW}{_IRIbyWCSoaUaKOLN93Fzg}{TFK38PIgRT6Kl62mTGBORg}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
elasticsearch | [2018-03-06T01:18:28,835][INFO ][o.e.h.n.Netty4HttpServerTransport] [_IRIbyW] publish_address {172.17.0.4:9200}, bound_addresses {0.0.0.0:9200}
elasticsearch | [2018-03-06T01:18:28,838][INFO ][o.e.n.Node ] [_IRIbyW] started
elasticsearch | [2018-03-06T01:18:29,104][INFO ][o.e.g.GatewayService ] [_IRIbyW] recovered [4] indices into cluster_state
elasticsearch | [2018-03-06T01:18:29,799][INFO ][o.e.c.r.a.AllocationService] [_IRIbyW] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[product_records][2]] ...]).
elasticsearch | [2018-03-07T16:11:18,449][INFO ][o.e.n.Node ] [_IRIbyW] stopping ...
elasticsearch | [2018-03-07T16:11:18,575][INFO ][o.e.n.Node ] [_IRIbyW] stopped
elasticsearch | [2018-03-07T16:11:18,575][INFO ][o.e.n.Node ] [_IRIbyW] closing ...
elasticsearch | [2018-03-07T16:11:18,601][INFO ][o.e.n.Node ] [_IRIbyW] closed
elasticsearch | [2018-03-07T16:11:37,993][INFO ][o.e.n.Node ] [] initializing ...
WARNING: Connection pool is full, discarding connection: 'Ipaddress'
I am using Postgres, Redis, Elasticsearch, and Sidekiq images in my Rails application, but I have no clue where things are going wrong.
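Separate from the build hang: the Elasticsearch log above warns that vm.max_map_count [65530] is too low. If Elasticsearch ever refuses to start because of this bootstrap check, the usual fix is to raise the limit on the Docker host (262144 is the value recommended by the Elasticsearch documentation):

# one-off change on the host
sudo sysctl -w vm.max_map_count=262144
# persist across reboots
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf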
I followed the ECS Getting Started tutorial but the ECS Agent isn't getting the container started. When I start the image manually on the same instance it starts fine.
The image is a Spring Boot web application with a single endpoint on / that returns the string "Hello world!!". The container runs fine locally, and also runs fine on a CentOS EC2 instance I've created. The endpoint is available publicly when I run the docker image on the CentOS EC2 instance.
The ECS Instance has security groups created by the wizard and has port 80 open. I added port 22 for SSH access.
When I SSH into the ECS instance to look at the Docker logs for my container, it looks like it's hanging during entrypoint execution.
Here are the Docker logs for the hanging instance:
[ec2-user@ip-10-0-0-156 ~]$ docker logs --tail 100 107d4cf04dd8
. ____ _ __ _ _
/\\ / ___'_ __ _ _(_)_ __ __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
\\/ ___)| |_)| | | | | || (_| | ) ) ) )
' |____| .__|_| |_|_| |_\__, | / / / /
=========|_|==============|___/=/_/_/_/
:: Spring Boot :: (v1.4.2.RELEASE)
2016-11-25 17:36:22.505 INFO 1 --- [ main] ecstest.Application : Starting Application on 107d4cf04dd8 with PID 1 (/ecstest-1.0-SNAPSHOT.jar started by root in /)
2016-11-25 17:36:22.546 INFO 1 --- [ main] ecstest.Application : No active profile set, falling back to default profiles: default
2016-11-25 17:36:23.059 INFO 1 --- [ main] ationConfigEmbeddedWebApplicationContext : Refreshing org.springframework.boot.context.embedded.AnnotationConfigEmbeddedWebApplicationContext#6d21714c: startup date [Fri Nov 25 17:36:23 UTC 2016]; root of context hierarchy
2016-11-25 17:36:30.972 INFO 1 --- [ main] s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat initialized with port(s): 8080 (http)
2016-11-25 17:36:31.014 INFO 1 --- [ main] o.apache.catalina.core.StandardService : Starting service Tomcat
2016-11-25 17:36:31.016 INFO 1 --- [ main] org.apache.catalina.core.StandardEngine : Starting Servlet Engine: Apache Tomcat/8.5.6
2016-11-25 17:36:31.464 INFO 1 --- [ost-startStop-1] o.a.c.c.C.[Tomcat].[localhost].[/] : Initializing Spring embedded WebApplicationContext
2016-11-25 17:36:31.464 INFO 1 --- [ost-startStop-1] o.s.web.context.ContextLoader : Root WebApplicationContext: initialization completed in 8458 ms
At first it seemed like an application error in my container image, but when I stop the Docker process and run the same image manually, the output is as expected and I can reach my endpoint from outside the instance.
[ec2-user@ip-10-0-1-124 ~]$ docker stop -t 1 4d2401d7db93 && docker run -p 80:8080 -d ############.dkr.ecr.us-west-2.amazonaws.com/ecstest
4d2401d7db93
db8cffa89995401d9314d7d70e954f09c7fde972a5e6a423615827d8c47b9d10
[ec2-user@ip-10-0-1-124 ~]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
db8cffa89995 ############.dkr.ecr.us-west-2.amazonaws.com/ecstest "java -jar ecstest-1." 10 seconds ago Up 9 seconds 0.0.0.0:80->8080/tcp small_gates
85bd18480c99 amazon/amazon-ecs-agent:latest "/agent" 11 minutes ago Up 11 minutes ecs-agent
[ec2-user@ip-10-0-1-124 ~]$ docker logs --tail 1000 db8cffa89995
. ____ _ __ _ _
/\\ / ___'_ __ _ _(_)_ __ __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
\\/ ___)| |_)| | | | | || (_| | ) ) ) )
' |____| .__|_| |_|_| |_\__, | / / / /
=========|_|==============|___/=/_/_/_/
:: Spring Boot :: (v1.4.2.RELEASE)
2016-11-25 18:06:57.960 INFO 1 --- [ main] ecstest.Application : Starting Application on db8cffa89995 with PID 1 (/ecstest-1.0-SNAPSHOT.jar started by root in /)
2016-11-25 18:06:58.004 INFO 1 --- [ main] ecstest.Application : No active profile set, falling back to default profiles: default
2016-11-25 18:06:58.578 INFO 1 --- [ main] ationConfigEmbeddedWebApplicationContext : Refreshing org.springframework.boot.context.embedded.AnnotationConfigEmbeddedWebApplicationContext#6d21714c: startup date [Fri Nov 25 18:06:58 UTC 2016]; root of context hierarchy
2016-11-25 18:07:05.784 INFO 1 --- [ main] s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat initialized with port(s): 8080 (http)
2016-11-25 18:07:05.866 INFO 1 --- [ main] o.apache.catalina.core.StandardService : Starting service Tomcat
2016-11-25 18:07:05.876 INFO 1 --- [ main] org.apache.catalina.core.StandardEngine : Starting Servlet Engine: Apache Tomcat/8.5.6
2016-11-25 18:07:06.283 INFO 1 --- [ost-startStop-1] o.a.c.c.C.[Tomcat].[localhost].[/] : Initializing Spring embedded WebApplicationContext
2016-11-25 18:07:06.283 INFO 1 --- [ost-startStop-1] o.s.web.context.ContextLoader : Root WebApplicationContext: initialization completed in 7753 ms
2016-11-25 18:07:07.026 INFO 1 --- [ost-startStop-1] o.s.b.w.servlet.ServletRegistrationBean : Mapping servlet: 'dispatcherServlet' to [/]
2016-11-25 18:07:07.031 INFO 1 --- [ost-startStop-1] o.s.b.w.servlet.FilterRegistrationBean : Mapping filter: 'characterEncodingFilter' to: [/*]
2016-11-25 18:07:07.031 INFO 1 --- [ost-startStop-1] o.s.b.w.servlet.FilterRegistrationBean : Mapping filter: 'hiddenHttpMethodFilter' to: [/*]
2016-11-25 18:07:07.032 INFO 1 --- [ost-startStop-1] o.s.b.w.servlet.FilterRegistrationBean : Mapping filter: 'httpPutFormContentFilter' to: [/*]
2016-11-25 18:07:07.033 INFO 1 --- [ost-startStop-1] o.s.b.w.servlet.FilterRegistrationBean : Mapping filter: 'requestContextFilter' to: [/*]
2016-11-25 18:07:08.432 INFO 1 --- [ main] s.w.s.m.m.a.RequestMappingHandlerAdapter : Looking for #ControllerAdvice: org.springframework.boot.context.embedded.AnnotationConfigEmbeddedWebApplicationContext#6d21714c: startup date [Fri Nov 25 18:06:58 UTC 2016]; root of context hierarchy
2016-11-25 18:07:08.786 INFO 1 --- [ main] s.w.s.m.m.a.RequestMappingHandlerMapping : Mapped "{[],methods=[GET]}" onto public java.lang.String ecstest.Application.get()
2016-11-25 18:07:08.800 INFO 1 --- [ main] s.w.s.m.m.a.RequestMappingHandlerMapping : Mapped "{[/error],produces=[text/html]}" onto public org.springframework.web.servlet.ModelAndView org.springframework.boot.autoconfigure.web.BasicErrorController.errorHtml(javax.servlet.http.HttpServletRequest,javax.servlet.http.HttpServletResponse)
2016-11-25 18:07:08.801 INFO 1 --- [ main] s.w.s.m.m.a.RequestMappingHandlerMapping : Mapped "{[/error]}" onto public org.springframework.http.ResponseEntity<java.util.Map<java.lang.String, java.lang.Object>> org.springframework.boot.autoconfigure.web.BasicErrorController.error(javax.servlet.http.HttpServletRequest)
2016-11-25 18:07:09.036 INFO 1 --- [ main] o.s.w.s.handler.SimpleUrlHandlerMapping : Mapped URL path [/webjars/**] onto handler of type [class org.springframework.web.servlet.resource.ResourceHttpRequestHandler]
2016-11-25 18:07:09.036 INFO 1 --- [ main] o.s.w.s.handler.SimpleUrlHandlerMapping : Mapped URL path [/**] onto handler of type [class org.springframework.web.servlet.resource.ResourceHttpRequestHandler]
2016-11-25 18:07:09.204 INFO 1 --- [ main] o.s.w.s.handler.SimpleUrlHandlerMapping : Mapped URL path [/**/favicon.ico] onto handler of type [class org.springframework.web.servlet.resource.ResourceHttpRequestHandler]
2016-11-25 18:07:09.893 INFO 1 --- [ main] o.s.j.e.a.AnnotationMBeanExporter : Registering beans for JMX exposure on startup
2016-11-25 18:07:10.201 INFO 1 --- [ main] s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat started on port(s): 8080 (http)
2016-11-25 18:07:10.216 INFO 1 --- [ main] ecstest.Application : Started Application in 14.385 seconds (JVM running for 16.522)
Any ideas why the ECS Agent isn't getting my application started?
Task Definition JSON
{
  "attributes": null,
  "requiresAttributes": [
    {
      "value": null,
      "name": "com.amazonaws.ecs.capability.ecr-auth",
      "targetId": null,
      "targetType": null
    }
  ],
  "taskDefinitionArn": "arn:aws:ecs:us-west-2:############:task-definition/DcTaskDefinition:4",
  "networkMode": "bridge",
  "status": "ACTIVE",
  "revision": 4,
  "taskRoleArn": null,
  "containerDefinitions": [
    {
      "volumesFrom": [],
      "memory": 128,
      "extraHosts": null,
      "dnsServers": null,
      "disableNetworking": null,
      "dnsSearchDomains": null,
      "portMappings": [
        {
          "hostPort": 80,
          "containerPort": 8080,
          "protocol": "tcp"
        }
      ],
      "hostname": null,
      "essential": true,
      "entryPoint": null,
      "mountPoints": [],
      "name": "DcContainer",
      "ulimits": null,
      "dockerSecurityOptions": null,
      "environment": [],
      "links": null,
      "workingDirectory": null,
      "readonlyRootFilesystem": null,
      "image": "############.dkr.ecr.us-west-2.amazonaws.com/ecstest:latest",
      "command": null,
      "user": null,
      "dockerLabels": null,
      "logConfiguration": null,
      "cpu": 0,
      "privileged": null,
      "memoryReservation": null
    }
  ],
  "placementConstraints": [],
  "volumes": [],
  "family": "DcTaskDefinition"
}
The memory key in the Task Definition JSON imposes a hard memory limit on the container. When a container tries to exceed that limit, the Docker daemon is supposed to kill it.
I'm not sure whether this is what causes your container to get "stuck", but it's the only important difference I can see between how your container runs under ECS and how it runs from the command line.
So I would try setting the memory value to at least 300 MiB, or using the memoryReservation key instead, which imposes a soft memory limit.
More information on the difference between hard and soft memory limits can be found in the official ECS documentation.
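Concretely, in the container definition that would look something like the snippet below (300 MiB is just an illustrative value; size it to what the Spring Boot app actually needs):

"containerDefinitions": [
  {
    "name": "DcContainer",
    "image": "############.dkr.ecr.us-west-2.amazonaws.com/ecstest:latest",
    "memoryReservation": 300,
    "portMappings": [
      { "hostPort": 80, "containerPort": 8080, "protocol": "tcp" }
    ],
    "essential": true
  }
]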