How to solve "mod_fastcgi.c.2566 unexpected end-of-file (perhaps the fastcgi process died)" when calling a .php script that takes a long time to execute?

In my PHP application I restore DB2 databases. It works fine, but there is one huge one (2.9 GB) that finishes with a 500 - Internal Server Error.
I use exec() to run Unix shell commands from PHP (cp, db2, etc.). The same error happens whether the request comes from Firefox or from a Ruby script.
I have to copy the backup image file first, which takes a few minutes. Then I call db2 to restore the image. For this particular database the PHP process finishes with the above error, and I then find this in the error log file:
2012-08-02 10:25:18: (mod_fastcgi.c.2566) unexpected end-of-file (perhaps the fastcgi process died): pid: 0 socket: tcp:127.0.0.1:9090
2012-08-02 10:25:18: (mod_fastcgi.c.3352) response not received, request sent: 2758 on socket: tcp:127.0.0.1:9090 for /wrational/tools/rationalTest.php?mode=restore&database=RATIONAL&from_database=dbb&dbbackuptype=weekly, closing connection
I set both default_socket_timeout and max_execution_time to 5660 in php.ini and confirmed via phpinfo() that they are set, but it doesn't look like it helped.
Any idea how I can make this one work?
Update
It looks like it dies after 40 minutes.
The corresponding line in the access.log file looks like:
"GET /rational/tools/rationalTest.php?mode=restore&from_database=dbb&dbbackuptype=weekly HTTP/1.1" 500 369 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:11.0) Gecko/20100101 Firefox/11.0"

It looks like the request_terminate_timeout option in php-fpm.ini was causing the trouble. It was set to 30 minutes. I changed it to 0 and it looks good so far; I need to do more testing, though.
; The timeout for serving a single request after which the worker process will
; be killed. This option should be used when the 'max_execution_time' ini option
; does not stop script execution for some reason. A value of '0' means 'off'.
; Available units: s(econds)(default), m(inutes), h(ours), or d(ays)
; Default Value: 0
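For reference, a minimal sketch of the fix in the php-fpm pool configuration (the file path is an assumption; edit whichever pool config your setup uses). Per the units documented above, a generous finite value such as 2h would also work while still protecting workers from genuinely hung scripts:
; /etc/php-fpm.d/www.conf (path is an assumption)
; Disable the per-request kill timer so long restores are not cut off...
request_terminate_timeout = 0
; ...or cap it generously instead:
; request_terminate_timeout = 2h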

Related

nagios - nsclient - nsca - host_check message

I've configured some clients to use NSCA to send data to Nagios, since I'm not able to connect from the server to the client. I was able to achieve nearly everything I wanted, but one last thing is not exactly how I would like it.
The host check / status information is not showing the message I want. This is my nsclient.ini conf:
[/settings/scheduler/schedules]
host_check = Check_OK
;host_check = Check_OK "OK"
And on the Nagios server I see these messages:
Jun 30 06:52:02 localhost nagios: EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;client;0;No message
Jun 30 06:57:16 localhost nagios: EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;client;0;No message
Jun 30 07:02:31 localhost nagios: EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;client;0;No message
Jun 30 07:11:17 localhost nagios: EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;client;3;Invalid command line: unrecognised option 'OK'
Jun 30 07:11:17 localhost nagios: Error: External command failed -> PROCESS_HOST_CHECK_RESULT;client;3;Invalid command line: unrecognised option 'OK'
Jun 30 07:11:17 localhost nagios: External command [1656573077] PROCESS_HOST_CHECK_RESULT;client;3;Invalid command line: unrecognised option 'OK' returned error Command failed
So, when I use the "OK" argument I get that status 3 "Invalid command line: unrecognised option 'OK'", and when I don't use any message I get status 0 with "No message".
Any thoughts on what I'm doing wrong here?
nsclient version = NSCP-0.5.2.35-x64
Thanks!
It gave me a hard time to figure out, but after deciding to check the documentation I tried a couple of other things and ended up with this way of getting the message passed along:
host_check = check_ok message=OK
This way, in the Nagios UI I can see the Status Information for the host as "OK", just as I was expecting.
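For clarity, the full section then looks like this (check_ok and its message parameter come from the NSClient++ documentation mentioned above; the message text itself is just an example):
[/settings/scheduler/schedules]
host_check = check_ok message=OK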

wine fixme wait_for_withdrawn_state window

I am running a Windows executable (.exe) with Wine in a Docker container, and I am rendering the graphical interface using xvfb-run. The setup is working, but with some fixmes and some errors whose meaning I am trying to understand.
The most common fixme is:
002c:fixme:event:wait_for_withdrawn_state window 0x1004a/e00001 wait timed out
0024:fixme:event:wait_for_withdrawn_state window 0x10086/a00003 wait timed out
I found here that it means:
Your application X Windows are possibly staying in a Withdrawn (aka limbo) state - because Wine isn't drawing to the correct X (Xvfb) Display.
However, I do not understand what this withdrawn state is, or why Wine isn't drawing to the correct X display. Furthermore, the same fixme is present when running Wine with Xvfb locally, without Docker.
Thank you!
The whole log from the docker-xvfb-wine setup:
$ xvfb-run --server-num=99 wine my.exe (run via a run.sh specified by the CMD instruction in the Dockerfile)
0050:err:ole:StdMarshalImpl_MarshalInterface Failed to create ifstub, hr 0x80004002
0050:err:ole:CoMarshalInterface Failed to marshal the interface {6d5140c1-7436-11ce-8034-00aa006009fa}, hr 0x80004002
0050:err:ole:apartment_get_local_server_stream Failed: 0x80004002
0048:err:ole:StdMarshalImpl_MarshalInterface Failed to create ifstub, hr 0x80004002
0048:err:ole:CoMarshalInterface Failed to marshal the interface {6d5140c1-7436-11ce-8034-00aa006009fa}, hr 0x80004002
0048:err:ole:apartment_get_local_server_stream Failed: 0x80004002
0048:err:ole:start_rpcss Failed to open RpcSs service
002c:fixme:event:wait_for_withdrawn_state window 0x1004a/e00001 wait timed out
wine: configuration in L"/root/.wine" has been updated.
0024:fixme:event:wait_for_withdrawn_state window 0x10086/a00003 wait timed out
0024:fixme:event:wait_for_withdrawn_state window 0x1007c/a00001 wait timed out
X connection to :99 broken (explicit kill or server shutdown).
The log from the xvfb-wine setup (without docker):
$ xvfb-run --server-num=99 wine my.exe
0034:fixme:font:get_name_record_codepage encoding 20 not handled, platform 1.
0034:fixme:font:get_name_record_codepage encoding 20 not handled, platform 1.
0048:fixme:font:get_name_record_codepage encoding 20 not handled, platform 1.
0048:fixme:font:get_name_record_codepage encoding 20 not handled, platform 1.
0050:fixme:font:get_name_record_codepage encoding 20 not handled, platform 1.
0050:fixme:font:get_name_record_codepage encoding 20 not handled, platform 1.
002c:fixme:font:get_name_record_codepage encoding 20 not handled, platform 1.
002c:fixme:font:get_name_record_codepage encoding 20 not handled, platform 1.
0024:fixme:font:get_name_record_codepage encoding 20 not handled, platform 1.
0024:fixme:font:get_name_record_codepage encoding 20 not handled, platform 1.
00cc:fixme:font:get_name_record_codepage encoding 20 not handled, platform 1.
00cc:fixme:font:get_name_record_codepage encoding 20 not handled, platform 1.
0024:fixme:font:freetype_set_outline_text_metrics failed to read full_nameW for font L"Ani"!
0024:fixme:event:wait_for_withdrawn_state window 0x20044/c00003 wait timed out
0024:fixme:event:wait_for_withdrawn_state window 0x1007a/c00001 wait timed out
XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":99"
after 19 requests (19 known processed) with 0 events remaining.
You need to start a window manager, like bspwm.
Xvfb :99 &
export DISPLAY=:99
bspwm &
wine some.exe
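Putting it together, a minimal run.sh sketch for the Docker setup (the sleeps are crude assumptions to let Xvfb and bspwm come up before the application starts; my.exe is a placeholder):
#!/bin/sh
# Start a virtual framebuffer and a window manager before Wine,
# so application windows can be mapped instead of staying withdrawn.
Xvfb :99 -screen 0 1024x768x24 &
export DISPLAY=:99
sleep 2   # crude wait for Xvfb to accept connections (assumption)
bspwm &
sleep 1   # give the window manager a moment to start (assumption)
exec wine my.exe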

mod_fcgid: read data timeout in 45 seconds and Premature end of script headers: index.php

One of my website clients had issues while placing orders.
When I checked my error log I could see this:
[warn] mod_fcgid: read data timeout in 45 seconds, referer: https://myDomain/cart
[error] Premature end of script headers: index.php, referer: https://myDomain/cart
What does this error mean? What should I do to eliminate it? Are there any settings to change in the Plesk Control Panel? Will it be solved if I change 'max_execution_time' in 'PHP settings' to 3600?
I am using Plesk 12.0.18, CentOS 5.11
The error means that the website code in the index.php file fails to finish within the time limit set for the Apache FastCGI module and/or PHP.
Most likely there is an error in index.php that makes it inoperable. In this case, you should increase the PHP error reporting level in Plesk > Domains > example.com > PHP Settings and review the script itself.
It is less likely that the script is genuinely meant to take a long time to execute. In that case, you may simply increase the timeout via Plesk. To set 120 seconds instead of the default 45, do the following:
1. Set max_execution_time to 120 in Plesk > Domains > example.com > PHP settings.
2. Increase the FastCGI timeout by adding the following Apache directives in Plesk > Domains > example.com > Apache & Nginx settings > Additional Apache directives:
<IfModule mod_fcgid.c>
    FcgidIOTimeout 120
</IfModule>
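If you raise the limits well beyond the defaults (for example toward the 3600 seconds mentioned in the question), note that mod_fcgid's FcgidBusyTimeout (a standard directive, default 300 seconds) can also terminate long-running request handlers. A sketch with assumed values, not a recommendation:
<IfModule mod_fcgid.c>
    # Wait up to an hour for the script to produce output...
    FcgidIOTimeout 3600
    # ...and do not kill the worker as "busy" before then.
    FcgidBusyTimeout 3600
</IfModule>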

How to debug/fix random occurring Redis::TimeoutError?

I have a Rails app running that uses Redis quite a lot; however, I'm seeing quite a few Redis::TimeoutError exceptions here and there, with no pattern in the circumstances. They occur both in the web app and in the background jobs (which are processed using Sidekiq), not often, but from time to time.
Now I have no idea how to track down the root cause, and hence no idea how to fix it.
Here is a little background on my setup:
The Redis instance runs on a separate physical server, connected to both my web server and my background-job server over a private local 1 Gbit network. All servers run Ubuntu 12.04, and the Redis version is 2.6.10. I'm connecting from my Rails app (3.2) using an initializer like so:
require 'redis'
require 'redis/objects'
REDIS = Redis.new(:url => APP_CONFIG['REDIS_URL'])
Redis.current = REDIS
This is the output of redis-cli INFO:
# Server
redis_version:2.6.10
redis_git_sha1:00000000
redis_git_dirty:0
redis_mode:standalone
os:Linux 3.2.0-38-generic x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:4.6.3
process_id:28475
run_id:d89bbb1b81d3169c4228cf23c0988ae437d496a1
tcp_port:6379
uptime_in_seconds:14913365
uptime_in_days:172
lru_clock:1507056
# Clients
connected_clients:233
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:19
# Memory
used_memory:801637360
used_memory_human:764.50M
used_memory_rss:594706432
used_memory_peak:4295394784
used_memory_peak_human:4.00G
used_memory_lua:31744
mem_fragmentation_ratio:0.74
mem_allocator:jemalloc-3.3.0
# Persistence
loading:0
rdb_changes_since_last_save:23166
rdb_bgsave_in_progress:0
rdb_last_save_time:1378219310
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:4
rdb_current_bgsave_time_sec:-1
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
# Stats
total_connections_received:932395
total_commands_processed:3088408103
instantaneous_ops_per_sec:837
rejected_connections:0
expired_keys:31428
evicted_keys:3007
keyspace_hits:124093049
keyspace_misses:53060192
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:17651
# Replication
role:master
connected_slaves:1
slave0:192.168.0.2,6379,online
# CPU
used_cpu_sys:54000.21
used_cpu_user:73692.52
used_cpu_sys_children:36229.79
used_cpu_user_children:420655.84
# Keyspace
db0:keys=1498962,expires=1310
In my redis config I have the following set:
daemonize yes
pidfile /var/run/redis/redis-server.pid
timeout 0
loglevel notice
databases 1
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /var/lib/redis
slave-serve-stale-data yes
slave-read-only yes
slave-priority 100
maxclients 1000
maxmemory 4GB
maxmemory-policy volatile-lru
appendonly no
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
That could come from many issues:
- You use the SAVE command (it is set up in your conf), generating a lot of I/O and hammering the server, especially if you use EBS volumes on Amazon.
- You have a Redis slave (same as before: it triggers a save before mirroring).
- You use KEYS *, which is very slow on a large keyspace.
Try "slowlog" command on the redis server to see if there are some "slow query".
Write some logs when "TimeoutError" happens, to see if the "error redis command" in the "slow log".
adjust your timeout setting on the client side。
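For example, from the shell (SLOWLOG is a standard Redis command; 10 limits the output to the ten most recent entries):
redis-cli SLOWLOG GET 10
redis-cli SLOWLOG RESET    # clear the log before reproducing the timeout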
It might be a problem on the client side while the server performs normally. Each Redis client instance, not the server, also has a timeout setting, and the default may be shorter than your slowest commands. If the server does not respond within that time, a Redis::TimeoutError is raised by the client.
The first thing you can try is to set a longer timeout value and see if things get better.
redis_url = 'redis://user:password@host:port/'
redis = Redis.connect(:url => redis_url, :timeout => 0.7) # timeout in seconds
Even with a longer timeout setting there is no guarantee that timeouts will never happen, but then it would be a problem in the design of your system.
Are you rolling your own code to connect to Redis, or just letting Sidekiq handle it? I think you should design your connection code to reconnect when the connection has been lost. You can rescue Redis::BaseConnectionError and reconnect.
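A minimal sketch of that idea (the retry count and back-off are assumptions; Redis::BaseConnectionError is redis-rb's parent class for its timeout and connection errors, and REDIS is the client from the initializer above):
require 'redis'

# Re-run the block a few times when the connection drops or times out.
def with_redis_retry(retries = 3)
  yield
rescue Redis::BaseConnectionError
  raise if retries <= 0
  retries -= 1
  sleep 0.5 # brief back-off before retrying (an assumption)
  retry
end

# Usage: the block is re-executed on connection errors.
with_redis_retry { REDIS.get('some_key') }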

PhantomJS loads pages slowly

I'm new to PhantomJS, trying it on a standard CentOS server (with httpd etc. installed, but no modified settings apart from the nameservers, which are set to 8.8.8.8 and 8.8.4.4).
I'm using the default loadspeed.js file (renamed). However, page loads appear to be extremely slow. Here's an example:
$ phantomjs phantomjs.js http://www.google.com/
starting
Loading time 90928 msec
$ phantomjs phantomjs.js http://173.194.67.138/ # (one of Google's public IPs)
starting
Loading time 30204 msec
When I load any URL hosted on the server itself (such as http://something.be), the load time is 141 msec:
$ phantomjs phantomjs.js http://something.be
starting
Loading time 141 msec
Does anyone have a clue what causes my connection to be this slow? The connection itself is fine; wget takes seconds to download a file of several MB.
Also, when I run the exact same script locally on OS X against Google, this is the output:
phantomjs phantomjs.js http://google.com/
starting
Loading time 430 msec
Found it: it seems IPv6 was the culprit.
I disabled it temporarily by running the following:
echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6
echo 1 > /proc/sys/net/ipv6/conf/default/disable_ipv6
Testing confirms:
$ phantomjs phantomjs.js http://google.com
starting
Loading time 230 msec
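To make the change persist across reboots, the equivalent settings can go in /etc/sysctl.conf (these are standard Linux sysctl keys; apply them without rebooting via sysctl -p):
# /etc/sysctl.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1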
Well, in my case, the page was waiting on some GET requests whose server it could not reach, and it kept waiting for a long time. I could only figure it out when I used the remote debugger option.
phantomjs --remote-debugger-port=9000 loadspeed.js <some_url>
and inside the loadspeed.js file
page.onResourceRequested = function (req) {
    console.log('requested: ' + JSON.stringify(req, undefined, 4));
};
page.onResourceReceived = function (res) {
    console.log('received: ' + JSON.stringify(res, undefined, 4));
};
Then, loading localhost:9000 in any WebKit browser (Safari/Chrome) and watching the console logs, I could see that it was waiting on some failed requests for a long time.
TO BYPASS THIS, REDUCE THE TIMEOUT:
page.settings.resourceTimeout = 3000; // in milliseconds (3 seconds)
Things were very quick after that. Hope this helps!
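To log which requests are being cut off, rather than dropping them silently, PhantomJS also provides an onResourceTimeout callback (a standard page event; the log format here is just an example):
page.onResourceTimeout = function (request) {
    console.log('timed out: ' + request.url);
};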
