chrony with gpsd and pps - gpsd
On my PC I want to feed chrony from a GPS receiver. For this I installed gpsd and pps-tools. The GPS is connected to the serial port /dev/ttyS0, with its PPS output wired to the DCD line. Apparently, the PPS pulses are received correctly:
$ sudo ppscheck /dev/ttyS0
# Seconds nanoSecs Signals
1646915383.000347816 TIOCM_CD
1646915383.100323649
1646915384.000213974 TIOCM_CD
1646915384.100172453
...
So far so good. The data from the GPS module also seems to be received correctly, which I verify using sudo gpsmon /dev/ttyS0. The $GPRMC sentence is received successfully and the data is valid; a time can be extracted and the PPS is detected as well:
┌──────────────────────────────────────────────────────────────────────────────┐
│Time: 2022-03-10T12:31:10.000Z Lat: 46 xx.xxxxxx' N Lon: 7 xx.xxxxxx' E │
└───────────────────────────────── Cooked TPV ─────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────────────────┐
│ GPRMC │
└───────────────────────────────── Sentences ──────────────────────────────────┘
┌───────────────────────┌─────────────────────────┌────────────────────────────┐
│ SVID PRN Az El SN HU│Time: 123110 │Time: │
│ │Latitude: 46xx.xxxxx N │Latitude: │
│ │Longitude: 7xx.xxxxx E │Longitude: │
│ │Speed: 000.0 │Altitude: │
│ │Course: 000.0 │Quality: Sats: │
│ │Status: A FAA: │HDOP: │
│ │MagVar: 000.0E │Geoid: │
│ └───────── RMC ───────────└─────────── GGA ────────────┘
│ ┌─────────────────────────┌────────────────────────────┐
│ │Mode: Sats: │UTC: RMS: │
│ │DOP H= V= P= │MAJ: MIN: │
│ │TOFF: 0.132531938 │ORI: LAT: │
│ │PPS: 0.000246686 │LON: ALT: │
└──────── GSV ──────────└────── GSA + PPS ────────└─────────── GST ────────────┘
So far, so good. Now I would like to run gpsd so that chrony can get the time from it. Here the trouble begins. I test gpsd in the foreground with full debug output, sudo gpsd -D 8 -N -b -n /dev/ttyS0 /dev/pps0, and get:
gpsd:INFO: launching (Version 3.23.1, revision 3.23.1)
gpsd:IO: opening IPv4 socket
gpsd:SPIN: passivesock_af() -> 3
gpsd:IO: opening IPv6 socket
gpsd:SPIN: passivesock_af() -> 4
gpsd:INFO: listening on port gpsd
gpsd:PROG: NTP: shmat(0,0,0) succeeded, segment 0
gpsd:PROG: NTP: shmat(1,0,0) succeeded, segment 1
gpsd:PROG: NTP: shmat(2,0,0) succeeded, segment 2
gpsd:PROG: NTP: shmat(3,0,0) succeeded, segment 3
gpsd:PROG: NTP: shmat(4,0,0) succeeded, segment 4
gpsd:PROG: NTP: shmat(5,0,0) succeeded, segment 5
gpsd:PROG: NTP: shmat(6,0,0) succeeded, segment 6
gpsd:PROG: NTP: shmat(7,0,0) succeeded, segment 7
gpsd:PROG: successfully connected to the DBUS system bus
gpsd:PROG: shmget(0x47505344, 26712, 0666) for SHM export succeeded
gpsd:PROG: shmat() for SHM export succeeded, segment 8
gpsd:INFO: stashing device /dev/ttyS0 at slot 0
gpsd:PROG: no /etc/gpsd/device-hook present, skipped running ACTIVATE hook. No such file or directory
gpsd:INFO: SER: opening read-only GPS data source type 2 at '/dev/ttyS0'
gpsd:IO: SER: fusercount: path /dev/ttyS0 fullpath /dev/ttyS0 cnt 1
gpsd:IO: SER: fd 6 set speed 115200(4098)
gpsd:INFO: SER: fd 6 current speed 115200, 8N1
gpsd:IO: SER: open(/dev/ttyS0) -> 6 in gpsd_serial_open()
gpsd:PROG: Probing "Garmin USB binary" driver...
gpsd:PROG: Probe not found "Garmin USB binary" driver...
gpsd:PROG: Probing "GeoStar" driver...
gpsd:PROG: Sent GeoStar packet id 0xc1
gpsd:PROG: Probe not found "GeoStar" driver...
gpsd:PROG: Probing "Trimble TSIP" driver...
gpsd:IO: SER: fd 6 set speed 9600(13)
gpsd:INFO: SER: fd 6 current speed 9600, 8O1
gpsd:IO: SER: fd 6 set speed 115200(4098)
gpsd:INFO: SER: fd 6 current speed 115200, 8N1
gpsd:PROG: Probe not found "Trimble TSIP" driver...
gpsd:PROG: Probing "iSync" driver...
gpsd:IO: SER: fd 6 set speed 9600(13)
gpsd:INFO: SER: fd 6 current speed 9600, 8N1
gpsd:IO: SER: fd 6 set speed 115200(4098)
gpsd:INFO: SER: fd 6 current speed 115200, 8N1
gpsd:PROG: Probe not found "iSync" driver...
gpsd:PROG: no probe matched...
gpsd:INFO: gpsd_activate(2): activated GPS (fd 6)
gpsd:PROG: NTP:PPS: using SHM(0)
gpsd:PROG: NTP:PPS: using SHM(1)
gpsd:PROG: PPS:/dev/ttyS0 connect chrony socket failed: /run/chrony.ttyS0.sock, error: -2, errno: 111/Connection refused
gpsd:PROG: KPPS:/dev/ttyS0 checking /sys/devices/virtual/pps/pps1/path, /dev/ttyS0
gpsd:INFO: KPPS:/dev/ttyS0 RFC2783 path:/dev/pps1, fd is 7
gpsd:INFO: KPPS:/dev/ttyS0 pps_caps 0x1133
gpsd:INFO: KPPS:/dev/ttyS0 have PPS_CANWAIT
gpsd:INFO: KPPS:/dev/ttyS0 kernel PPS will be used
gpsd:PROG: PPS:/dev/ttyS0 thread launched
gpsd:INFO: PPS: activated /dev/ttyS0 ntpshm_link_activate(): Clock
gpsd:INFO: stashing device /dev/pps0 at slot 1
gpsd:PROG: no /etc/gpsd/device-hook present, skipped running ACTIVATE hook. No such file or directory
gpsd:ERROR: SER: stat(/dev/pps0) failed: No such file or directory(2)
gpsd:ERROR: initial GPS device /dev/pps0 open failed
gpsd:INFO: KPPS:/dev/ttyS0 kernel PPS timeout 4:unknown error
gpsd:INFO: KPPS:/dev/ttyS0 kernel PPS timeout 4:unknown error
gpsd:INFO: KPPS:/dev/ttyS0 kernel PPS timeout 4:unknown error
gpsd:INFO: running with effective group ID 18
gpsd:INFO: running with effective user ID 65534
gpsd:INFO: startup at 2022-03-10T12:35:52.000Z (1646915752)
gpsd:PROG: KPPS:/dev/ttyS0 assert 1646915753.000208807, sequence: 1, clear 0.000000000, sequence: 0 - using: assert
gpsd:PROG: KPPS:/dev/ttyS0 Assert cycle: 1646915753000208, duration: 0 # 1646915753.000208807
gpsd:RAW: PPS:/dev/ttyS0 Assert pps-detect changed to 1
gpsd:PROG: PPS:/dev/ttyS0 Assert cycle: 1646915753000208, duration: 0 # 1646915753.000208807
gpsd:PROG: PPS:/dev/ttyS0 Assert ignored missing last_fixtime
gpsd:PROG: KPPS:/dev/ttyS0 assert 1646915753.000208807, sequence: 1, clear 1646915753.100149920, sequence: 1 - using: clear
gpsd:PROG: KPPS:/dev/ttyS0 Clear cycle: 99941, duration: 99941 # 1646915753.100149920
gpsd:RAW: PPS:/dev/ttyS0 Clear pps-detect changed to 0
gpsd:PROG: PPS:/dev/ttyS0 Clear cycle: 99941, duration: 99941 # 1646915753.100149920
gpsd:PROG: PPS:/dev/ttyS0 Clear ignored missing last_fixtime
gpsd:SPIN: pselect() {3 4 6} -> { 6 } at 1646915753.130428717 (Success)
gpsd:SPIN: packet_get() fd 6 -> 9 (0)
gpsd:RAW: packet sniff on /dev/ttyS0 finds type -1
gpsd:INFO: reconnection attempt on device 1
gpsd:PROG: no /etc/gpsd/device-hook present, skipped running ACTIVATE hook. No such file or directory
gpsd:ERROR: SER: stat(/dev/pps0) failed: No such file or directory(2)
gpsd:ERROR: /dev/pps0: device activation failed, freeing device.
gpsd:SPIN: pselect() {3 4 6} -> { 6 } at 1646915753.131202895 (Success)
gpsd:SPIN: packet_get() fd 6 -> 9 (0)
gpsd:RAW: packet sniff on /dev/ttyS0 finds type -1
gpsd:SPIN: pselect() {3 4 6} -> { 6 } at 1646915753.132038783 (Success)
gpsd:SPIN: packet_get() fd 6 -> 9 (0)
gpsd:RAW: packet sniff on /dev/ttyS0 finds type -1
gpsd:IO: SER: gpsd_next_hunt_setting(6) retries 0 diff 0
gpsd:SPIN: pselect() {3 4 6} -> { 6 } at 1646915753.132776939 (Success)
gpsd:SPIN: packet_get() fd 6 -> 9 (0)
gpsd:RAW: packet sniff on /dev/ttyS0 finds type -1
gpsd:IO: SER: gpsd_next_hunt_setting(6) retries 1 diff 0
gpsd:SPIN: pselect() {3 4 6} -> { 6 } at 1646915753.133566515 (Success)
gpsd:SPIN: packet_get() fd 6 -> 9 (0)
gpsd:RAW: packet sniff on /dev/ttyS0 finds type -1
gpsd:IO: SER: gpsd_next_hunt_setting(6) retries 2 diff 0
gpsd:SPIN: pselect() {3 4 6} -> { 6 } at 1646915753.134296808 (Success)
gpsd:SPIN: packet_get() fd 6 -> 9 (0)
gpsd:RAW: packet sniff on /dev/ttyS0 finds type -1
gpsd:IO: SER: gpsd_next_hunt_setting(6) retries 3 diff 0
gpsd:SPIN: pselect() {3 4 6} -> { 6 } at 1646915753.135119824 (Success)
gpsd:SPIN: packet_get() fd 6 -> 9 (0)
gpsd:RAW: packet sniff on /dev/ttyS0 finds type -1
gpsd:IO: SER: gpsd_next_hunt_setting(6) retries 4 diff 0
gpsd:SPIN: pselect() {3 4 6} -> { 6 } at 1646915753.135905236 (Success)
gpsd:SPIN: packet_get() fd 6 -> 9 (0)
gpsd:RAW: packet sniff on /dev/ttyS0 finds type 1
gpsd:PROG: switching to match packet type 1: $GPRMC,123553,A,46xx.xxxxx,N,7xx.xxxxx,E,000.0,000.0,100322,000.0,E*70\x0d\x0a
gpsd:PROG: switch_driver(NMEA0183) called...
gpsd:PROG: selecting NMEA0183 driver...
gpsd:INFO: /dev/ttyS0 identified as type NMEA0183, 1 sec # 115200bps
gpsd:RAW: raw packet of type 1, 72:$GPRMC,123553,A,46xx.xxxxx,N,7xx.xxxxx,E,000.0,000.0,100322,000.0,E*70\x0d\x0a
gpsd:IO: <= GPS: $GPRMC,123553,A,46xx.xxxxx,N,7xx.xxxxx,E,000.0,000.0,100322,000.0,E*70
gpsd:DATA: NMEA0183: merge_ddmmyy(100322) sets year 2022
gpsd:RAW: NMEA0183: merge_ddmmyy(100322) 2 10 122
gpsd:DATA: NMEA0183: GPRMC: registers fractional time 123553.000000000
gpsd:DATA: NMEA0183: RMC: ddmmyy=100322 hhmmss=123553 lat=46.95 lon=7.44 speed=0.00 track=0.00 mode=2 var=nan status=0
gpsd:DATA: NMEA0183: GPRMC newtime is 1646915753.000000000 = 2022-03-10T12:35:53.000Z
gpsd:DATA: NMEA0183: GPRMC time 123553.000000000 last 0.000000000 latch 1 cont 0
gpsd:PROG: NMEA0183: GPRMC starts a reporting cycle. lasttag 0
gpsd:SPIN: parse_packet() = {ONLINE|TIME|LATLON|SPEED|TRACK|STATUS|MODE|PACKET|DRIVER|CLEAR|NTPTIME}
gpsd:DATA: packet type 1 from /dev/ttyS0 with {ONLINE|TIME|LATLON|SPEED|TRACK|STATUS|MODE|PACKET|DRIVER|CLEAR|NTPTIME}
gpsd:DATA: all_reports(): changed {ONLINE|TIME|LATLON|SPEED|TRACK|STATUS|MODE|PACKET|DRIVER|CLEAR|NTPTIME}
gpsd:SPIN: packet_get() fd 6 -> 0 (0)
gpsd:RAW: /dev/ttyS0 is known to be NMEA0183
gpsd:PROG: KPPS:/dev/ttyS0 assert 1646915754.000142962, sequence: 2, clear 1646915753.100149920, sequence: 1 - using: assert
gpsd:PROG: KPPS:/dev/ttyS0 Assert cycle: 999934, duration: 899993 # 1646915754.000142962
gpsd:RAW: PPS:/dev/ttyS0 Assert pps-detect changed to 1
gpsd:PROG: PPS:/dev/ttyS0 Assert cycle: 999934, duration: 899993 # 1646915754.000142962
gpsd:PROG: PPS:/dev/ttyS0 Assert ignored missing last_fixtime
gpsd:PROG: KPPS:/dev/ttyS0 assert 1646915754.000142962, sequence: 2, clear 1646915754.100056948, sequence: 2 - using: clear
gpsd:PROG: KPPS:/dev/ttyS0 Clear cycle: 999907, duration: 99913 # 1646915754.100056948
gpsd:RAW: PPS:/dev/ttyS0 Clear pps-detect changed to 0
gpsd:PROG: PPS:/dev/ttyS0 Clear cycle: 999907, duration: 99913 # 1646915754.100056948
gpsd:PROG: PPS:/dev/ttyS0 Clear ignored missing last_fixtime
gpsd:SPIN: pselect() {3 4 6} -> { 6 } at 1646915754.139262196 (Success)
gpsd:SPIN: packet_get() fd 6 -> 12 (0)
gpsd:RAW: /dev/ttyS0 is known to be NMEA0183
gpsd:RAW: packet sniff on /dev/ttyS0 finds type 1
gpsd:SPIN: pselect() {3 4 6} -> { 6 } at 1646915754.140017104 (Success)
gpsd:PROG: transmission pause. gap 1.004081 quiet_time 0.250000
gpsd:SPIN: packet_get() fd 6 -> 9 (0)
gpsd:RAW: /dev/ttyS0 is known to be NMEA0183
gpsd:RAW: packet sniff on /dev/ttyS0 finds type 1
gpsd:SPIN: pselect() {3 4 6} -> { 6 } at 1646915754.140906731 (Success)
gpsd:SPIN: packet_get() fd 6 -> 10 (0)
gpsd:RAW: /dev/ttyS0 is known to be NMEA0183
gpsd:RAW: packet sniff on /dev/ttyS0 finds type 1
gpsd:SPIN: pselect() {3 4 6} -> { 6 } at 1646915754.141784777 (Success)
gpsd:SPIN: packet_get() fd 6 -> 10 (0)
gpsd:RAW: /dev/ttyS0 is known to be NMEA0183
gpsd:RAW: packet sniff on /dev/ttyS0 finds type 1
gpsd:SPIN: pselect() {3 4 6} -> { 6 } at 1646915754.142634744 (Success)
gpsd:SPIN: packet_get() fd 6 -> 10 (0)
gpsd:RAW: /dev/ttyS0 is known to be NMEA0183
gpsd:RAW: packet sniff on /dev/ttyS0 finds type 1
gpsd:SPIN: pselect() {3 4 6} -> { 6 } at 1646915754.143412923 (Success)
gpsd:SPIN: packet_get() fd 6 -> 9 (0)
gpsd:RAW: /dev/ttyS0 is known to be NMEA0183
gpsd:RAW: packet sniff on /dev/ttyS0 finds type 1
gpsd:SPIN: pselect() {3 4 6} -> { 6 } at 1646915754.144186955 (Success)
gpsd:SPIN: packet_get() fd 6 -> 9 (0)
gpsd:RAW: /dev/ttyS0 is known to be NMEA0183
gpsd:RAW: packet sniff on /dev/ttyS0 finds type 1
gpsd:SPIN: pselect() {3 4 6} -> { 6 } at 1646915754.144838131 (Success)
gpsd:SPIN: packet_get() fd 6 -> 3 (0)
gpsd:RAW: /dev/ttyS0 is known to be NMEA0183
gpsd:RAW: packet sniff on /dev/ttyS0 finds type 1
gpsd:RAW: raw packet of type 1, 72:$GPRMC,123554,A,46xx.xxxxx,N,7xx.xxxxx,E,000.0,000.0,100322,000.0,E*7B\x0d\x0a
gpsd:IO: <= GPS: $GPRMC,123554,A,46xx.xxxxx,N,7xx.xxxxx,E,000.0,000.0,100322,000.0,E*7B
gpsd:DATA: NMEA0183: merge_ddmmyy(100322) sets year 2022
gpsd:RAW: NMEA0183: merge_ddmmyy(100322) 2 10 122
gpsd:DATA: NMEA0183: GPRMC: registers fractional time 123554.000000000
gpsd:DATA: NMEA0183: RMC: ddmmyy=100322 hhmmss=123554 lat=46.95 lon=7.44 speed=0.00 track=0.00 mode=2 var=nan status=0
gpsd:DATA: NMEA0183: GPRMC newtime is 1646915754.000000000 = 2022-03-10T12:35:54.000Z
gpsd:DATA: NMEA0183: GPRMC time 123554.000000000 last 123553.000000000 latch 1 cont 0
gpsd:PROG: NMEA0183: GPRMC starts a reporting cycle. lasttag 61
gpsd:PROG: NMEA0183: tagged RMC as a cycle ender. 61
gpsd:PROG: NMEA0183: GPRMC ends a reporting cycle.
gpsd:SPIN: parse_packet() = {ONLINE|TIME|LATLON|SPEED|TRACK|STATUS|MODE|PACKET|CLEAR|REPORT|NTPTIME}
gpsd:DATA: packet type 1 from /dev/ttyS0 with {ONLINE|TIME|LATLON|SPEED|TRACK|STATUS|MODE|PACKET|CLEAR|REPORT|NTPTIME}
gpsd:DATA: all_reports(): changed {ONLINE|TIME|LATLON|SPEED|TRACK|STATUS|MODE|PACKET|CLEAR|REPORT|NTPTIME}
gpsd:SPIN: packet_get() fd 6 -> 0 (0)
gpsd:RAW: /dev/ttyS0 is known to be NMEA0183
...
Apparently, it successfully sets up the shared memory segments. It also reads the NMEA data, extracts date and time, and the PPS pulses are detected as well. While it runs, a device /dev/pps0 is created, and with sudo ppstest /dev/pps0 I can confirm that the PPS pulses are still there. Now, in my chrony.conf I have these two lines:
refclock SHM 0 offset 0.5 delay 0.2 refid NMEA noselect
refclock SHM 1 offset 0.0 delay 0.1 refid PPS
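(As an aside, not part of my original debugging: one way to double-check that gpsd is actually writing samples into these segments is the ntpshmmon tool from gpsd-clients, which prints each sample as it lands in the NTP0/NTP1 segments; option names may differ slightly between gpsd versions.)
# print the next few samples gpsd publishes to the NTP shared-memory segments
$ sudo ntpshmmon -n 8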
So chrony should be able to read the time. However, even after a couple of minutes, there is nothing:
$ chronyc sources
MS Name/IP address Stratum Poll Reach LastRx Last sample
===============================================================================
#? NMEA 0 4 0 - +0ns[ +0ns] +/- 0ns
#? PPS 0 4 0 - +0ns[ +0ns] +/- 0ns
However, I can confirm that the shared memory segments are there:
$ ipcs -m
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x4e545030 0 root 600 96 2
0x4e545031 1 root 600 96 2
0x4e545032 2 root 666 96 1
0x4e545033 3 root 666 96 1
0x4e545034 4 root 666 96 1
...
So what is going on here? I notice that gpsd changes its own user to nobody:
$ ps aux | grep gpsd
root 12667 0.0 0.0 235424 8736 pts/0 S+ 13:41 0:00 sudo gpsd -D 8 -N -b -n /dev/ttyS0 /dev/pps0
nobody 12668 0.3 0.0 17172 4816 pts/0 S<l+ 13:41 0:01 gpsd -D 8 -N -b -n /dev/ttyS0 /dev/pps0
while chronyd runs as user chrony:
$ ps aux | grep chrony
chrony 12917 0.0 0.0 10600 2972 ? S 13:45 0:00 /usr/sbin/chronyd -F 2
I believe this could be the culprit, since the shared memory segments belong to root, but I am not sure. Why can't chrony get the time from gpsd? I also tried the other variants, i.e. the sockets, but with the same result.
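(For reference, the socket variant of the configuration looks roughly like this; the socket path is the one gpsd reports in the log above, the rest is a sketch rather than an exact configuration:)
# chronyd creates this Unix socket; gpsd connects to it when it activates /dev/ttyS0
refclock SOCK /run/chrony.ttyS0.sock refid GPS
With this variant, chronyd must already be listening on the socket when gpsd activates the device; the "connect chrony socket failed ... Connection refused" line in the log above shows that this was not the case here.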
As /dev/pps0 gives plausible output, these refclock lines in chrony.conf might be what you need:
refclock PPS /dev/pps0 lock NMEA refid GPS
refclock SHM 0 offset 0.5 delay 0.2 refid NMEA noselect
My system (a Raspberry Pi 3B+ with an Adafruit Ultimate GPS HAT version 2) has these non-comment lines in chrony.conf in addition to the default contents:
pool 0.europe.pool.ntp.org iburst
refclock PPS /dev/pps0 lock NMEA refid GPS
refclock SHM 0 offset 0.5 delay 0.2 refid NMEA noselect
As I understand it, the PPS system knows very accurately where the second boundaries are, but it doesn't know which second the boundaries belong to. The NMEA entry is accurate enough to know which second is being timed and passes this to PPS. The 'magic' is realising that the 'noselect' entry can assist the line that references it without itself being used directly.
chronyc sources then shows:
MS Name/IP address Stratum Poll Reach LastRx Last sample
===============================================================================
#* GPS 0 4 17 11 +18ns[ +127ns] +/- 113ns
#? NMEA 0 4 17 13 -120ms[ -120ms] +/- 106ms
^- 169.80-203-110.customer.> 2 6 17 59 +428us[ +575us] +/- 24ms
^- 84.2.46.19 2 6 17 59 -901us[-1185us] +/- 23ms
^- time.erickochen.nl 2 6 17 59 -141us[ -425us] +/- 12ms
^- phouchg.0x2a.io 2 6 17 59 -962us[-1245us] +/- 58ms
^- mail.rettensteiner.com 2 6 17 59 +509us[ +226us] +/- 42ms
^- ip98.mikrocom.sk 2 6 17 58 -1150us[-1433us] +/- 52ms
^- pauseq4vntp2.datamossa.io 2 6 17 58 +1777us[+1494us] +/- 177ms
^- time2.isu.net.sa 2 6 17 60 -23ms[ -23ms] +/- 413ms
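The #* marker on the GPS line means chrony has selected the PPS-disciplined refclock as its synchronisation source. As an additional cross-check (not part of the output above), chronyc tracking should report GPS as the Reference ID once that source is selected:
$ chronyc tracking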