Configure scollector to log if a process is running - monitoring

I'm trying to use Bosun to determine if certain processes are running and then eventually alert on if they are up or down. I'm probably misinterpreting the docs but I can't figure this out.
Bosun is running just fine. I have the scollector running on Ubuntu 14 LTS and using my config file correctly.
Here is what I have in my scollector.toml:
host="blah:8070"
hostname="cass01"
[[Process]]
command = "^.*(java).*(CassandraDaemon)$"
name = "Cassandra"
I would then expect to see in bosun under my host cass01 a metric title "cassandra" somewhere but it's nowhere to be seen. Other metrics are there.

Right now Command is a partial match on the process path to the binary, up to the first space delimiter. The Args parameter is a regex to differentiate between multiple instances of the process. So for a java process you would use something like:
[[Process]]
Command = "java"
Name = "Cassandra"
Args = "CassandraDaemon$"
This would match a command line like:
/usr/bin/java /usr/bin/CassandraDaemon
This assumes the /proc/<pid>/cmdline for that process ends in CassandraDaemon. If it doesn't end in that string you would need to change the Args to just "CassandraDaemon" which would match any java process that contains that string.
Also some processes change the cmdline to something other than a nul delimited string. In those cases the Command argument needs to be used to match as Args is expecting nul delimiters. Example:
cat /proc/80156/cmdline | hexdump -C
00000000 2f 75 73 72 2f 62 69 6e 2f 72 65 64 69 73 2d 73 |/usr/bin/redis-s|
00000010 65 72 76 65 72 20 2a 3a 36 33 37 39 00 00 00 00 |erver *:6379....|
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000030 00 |.|
00000031
#Example for cmdline without NUL (00) delimiters between args
[[Process]]
Command = "redis-server *:6379"
Name = "redis-core"
Once these are in place with the correct matching values you should see metrics show up under linux.proc.* where the the name tag will match the name used in the TOML file.

Related

How to validate CRC-32 calculation of Zip file

I want to validate that my ZIP file has a correct CRC-32 checksum.
I read that in a ZIP file the CRC-32 data is in bytes 14 to 17:
Offset Bytes Description[30]
0 4 Local file header signature = 0x04034b50 (read as a little-endian number)
4 2 Version needed to extract (minimum)
6 2 General purpose bit flag
8 2 Compression method
10 2 File last modification time
12 2 File last modification date
14 4 CRC-32 of uncompressed data
18 4 Compressed size
22 4 Uncompressed size
26 2 File name length (n)
28 2 Extra field length (m)
30 n File name
30+n m Extra field
I wanted to validate a CRC-32 checksum of a simple ZIP file I created:
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
-----------------------------------------------
50 4B 03 04 14 00 00 00 00 00 38 81 1C 51 4C 18 | PK........8..QL.
C7 8C 02 00 00 00 02 00 00 00 07 00 00 00 31 32 | nj............12
33 2E 64 61 74 73 73 50 4B 01 02 14 00 14 00 00 | 3.datssPK.......
00 00 00 38 81 1C 51 4C 18 C7 8C 02 00 00 00 02 | ...8..QL.nj.....
00 00 00 07 00 00 00 00 00 00 00 01 00 20 00 00 | ............. ..
00 00 00 00 00 31 32 33 2E 64 61 74 50 4B 05 06 | .....123.datPK..
00 00 00 00 01 00 01 00 35 00 00 00 27 00 00 00 | ........5...'...
00 00 | ..
The CRC-32 is: 0x4C18C78C
I went to this CRC-32 online calculator and added the following un-compressed row from the file:
50 4B 03 04 14 00 00 00 00 00 38 81 1C 51
This is the result:
Algorithm Result Check Poly Init RefIn RefOut XorOut
CRC-32 0x6A858174 0xCBF43926 0x04C11DB7 0xFFFFFFFF true true 0xFFFFFFFF
CRC-32/BZIP2 0xE3FA1205 0xFC891918 0x04C11DB7 0xFFFFFFFF false false 0xFFFFFFFF
CRC-32C 0xB578110E 0xE3069283 0x1EDC6F41 0xFFFFFFFF true true 0xFFFFFFFF
CRC-32D 0xAFE2EEA4 0x87315576 0xA833982B 0xFFFFFFFF true true 0xFFFFFFFF
CRC-32/MPEG-2 0x1C05EDFA 0x0376E6E7 0x04C11DB7 0xFFFFFFFF false false 0x00000000
CRC-32/POSIX 0xFF9B3071 0x765E7680 0x04C11DB7 0x00000000 false false 0xFFFFFFFF
CRC-32Q 0x79334F11 0x3010BF7F 0x814141AB 0x00000000 false false 0x00000000
CRC-32/JAMCRC 0x957A7E8B 0x340BC6D9 0x04C11DB7 0xFFFFFFFF true true 0x00000000
CRC-32/XFER 0xA7F36A3F 0xBD0BE338 0x000000AF 0x00000000 false false 0x00000000
But none of them equal to: 0x4C18C78C.
What am I doing wrong? The CRC-32 of the ZIP is the calculation of all the bytes (0-13) before, no?
The byte sequence you are running against the online CRC calculator are not uncompressed bytes.
50 4B 03 04 14 00 00 00 00 00 38 81 1C 51
Those bytes are the first few bytes of the zip file. The CRC32 value in a zip is calculated by running the CRC32 algorithm against the complete uncompressed payload. In your case the payload is the two byte sequence "ss".
To work that out, I converted your hex dump back into a zip file, tmp.zip. It contains a single member 123.dat
$ unzip -lv tmp.zip
Archive: tmp.zip
Length Method Size Cmpr Date Time CRC-32 Name
-------- ------ ------- ---- ---------- ----- -------- ----
2 Stored 2 0% 2020-08-28 16:09 8cc7184c 123.dat
-------- ------- --- -------
2 2 0% 1 file
When I extract that member to stdout & pipe though hexdump, we find it contains the two bytes string "ss" (hex 73 73)
$ unzip -p tmp.zip | hexdump -C
00000000 73 73 |ss|
Finally, as already mentioned in another comment, you can check that the CRC value is correct by running unzip -t
$ unzip -t tmp.zip
Archive: tmp.zip
testing: 123.dat OK
No errors detected in compressed data of tmp.zip.
I was able to create a zip file that matches the one in the question. The header shows that the compression type == 0, which means no compression, the uncompressed size == 2, the data == {73 73}. CRC32 uses reflected input and output, and the CRC is stored in little endian format, so the CRC == 0x8CC7184C.
I get a match using CRC32 on data of {73 73} using this online CRC calculator:
http://www.sunshine2k.de/coding/javascript/crc/crc_js.html

What are these weird ha:// URLs jenkins fills our logs with?

We noticed our Jenkins build logs were being filled with 10 times more content than we expected. This greatly increases the amount of logs that slaves have to send back to the master, which in turn makes all builds take longer, which in turn makes builds fail with spurious timeouts.
On investigation, we find the lines all have a huge URL prepended.
ha:////{320 bytes of base64 junk} Log message
ha:////{320 bytes of base64 junk} [blank line]
ha:////{320 bytes of base64 junk} Next log message
I tried decoding the base64, but it doesn't produce any structure which I'm familiar with.
I didn't want to post ours because someone who knows how to decode it might find private info in there, but I tried searching for some of the content we were seeing, and noticed that someone else had posted the same sort of thing to pastebin:
https://pastebin.com/LM7mht8W
Taking one of those URLs:
ha:////4HTWhKVov8LrT80csqfIVuXrtfeJTJod3fz9PpkDu0UAAAAAzh+LCAAAAAAAAP9b85aBtbiIQSOjNKU4P0+vIKc0PTOvWK8kMze1uCQxtyC1SC8ExvbLL0llgABGJgYmLwaB3MycnMzi4My85FTXgvzkDB8G3tScxILi1BRfsEwJg4BPVmJZon5OYl66vk9+Xrp1RRGDFNSy5Py84vycVD1nCI1qPENFAZCOr07/fwfoPj6QKXogU/TApnQ/mXCmX/k+EwOjFwNrWWJOaSrQXAGEIr/S3KTUorY1U2W5pzzohprGwMDU+O4jAJgnACXyAAAA
And decoding it (including the ////):
00000000 ff ff ff e0 74 d6 84 a5 68 bf c2 eb 4f cd 1c b2 |....t...h...O...|
00000010 a7 c8 56 e5 eb b5 f7 89 4c 9a 1d dd fc fd 3e 99 |..V.....L.....>.|
00000020 03 bb 45 00 00 00 00 ce 1f 8b 08 00 00 00 00 00 |..E.............|
00000030 00 ff 5b f3 96 81 b5 b8 88 41 23 a3 34 a5 38 3f |..[......A#.4.8?|
00000040 4f af 20 a7 34 3d 33 af 58 af 24 33 37 b5 b8 24 |O. .4=3.X.$37..$|
00000050 31 b7 20 b5 48 2f 04 c6 f6 cb 2f 49 65 80 00 46 |1. .H/..../Ie..F|
00000060 26 06 26 2f 06 81 dc cc 9c 9c cc e2 e0 cc bc e4 |&.&/............|
00000070 54 d7 82 fc e4 0c 1f 06 de d4 9c c4 82 e2 d4 14 |T...............|
00000080 5f b0 4c 09 83 80 4f 56 62 59 a2 7e 4e 62 5e ba |_.L...OVbY.~Nb^.|
00000090 be 4f 7e 5e ba 75 45 11 83 14 d4 b2 e4 fc bc e2 |.O~^.uE.........|
000000a0 fc 9c 54 3d 67 08 8d 6a 3c 43 45 01 90 8e af 4e |..T=g..j<CE....N|
000000b0 ff 7f 07 e8 3e 3e 90 29 7a 20 53 f4 c0 a6 74 3f |....>>.)z S...t?|
000000c0 99 70 a6 5f f9 3e 13 03 a3 17 03 6b 59 62 4e 69 |.p._.>.....kYbNi|
000000d0 2a d0 5c 01 84 22 bf d2 dc a4 d4 a2 b6 35 53 65 |*.\..".......5Se|
000000e0 b9 a7 3c e8 86 9a c6 c0 c0 d4 f8 ee 23 00 98 27 |..<.........#..'|
000000f0 00 25 f2 00 00 00 |.%....|
000000f6
Noticing that 1f 8b 08 looked like a gzip header, I tried cutting the file at that point and decompressed it. This gave:
00000000 ac ed 00 05 73 72 00 28 68 75 64 73 6f 6e 2e 70 |....sr.(hudson.p|
00000010 6c 75 67 69 6e 73 2e 74 69 6d 65 73 74 61 6d 70 |lugins.timestamp|
00000020 65 72 2e 54 69 6d 65 73 74 61 6d 70 4e 6f 74 65 |er.TimestampNote|
00000030 00 00 00 00 00 00 00 01 02 00 02 4a 00 10 6d 69 |...........J..mi|
00000040 6c 6c 69 73 53 69 6e 63 65 45 70 6f 63 68 4c 00 |llisSinceEpochL.|
00000050 0d 65 6c 61 70 73 65 64 4d 69 6c 6c 69 73 74 00 |.elapsedMillist.|
00000060 10 4c 6a 61 76 61 2f 6c 61 6e 67 2f 4c 6f 6e 67 |.Ljava/lang/Long|
00000070 3b 78 72 00 1a 68 75 64 73 6f 6e 2e 63 6f 6e 73 |;xr..hudson.cons|
00000080 6f 6c 65 2e 43 6f 6e 73 6f 6c 65 4e 6f 74 65 00 |ole.ConsoleNote.|
00000090 00 00 00 00 00 00 01 02 00 00 78 70 00 00 01 5f |..........xp..._|
000000a0 7b 67 ff dc 73 72 00 0e 6a 61 76 61 2e 6c 61 6e |{g..sr..java.lan|
000000b0 67 2e 4c 6f 6e 67 3b 8b e4 90 cc 8f 23 df 02 00 |g.Long;.....#...|
000000c0 01 4a 00 05 76 61 6c 75 65 78 72 00 10 6a 61 76 |.J..valuexr..jav|
000000d0 61 2e 6c 61 6e 67 2e 4e 75 6d 62 65 72 86 ac 95 |a.lang.Number...|
000000e0 1d 0b 94 e0 8b 02 00 00 78 70 00 00 00 00 02 81 |........xp......|
000000f0 ee f1 |..|
000000f2
So it kind of seems like the timestamper plugin is somehow implicated in this nonsense, but when I go and read their code, I don't see anything about this stuff.
Which bit of Jenkins is actually doing this, and is there a way to avoid it?
Good detective work, #Trejkaz. Disabling the timestamper plugin did NOT fix things for me (I left the plugin installed; perhaps I should have removed it altogether or restarted Jenkins one more time to be sure).
My best answer (the one I'm using in practice) gets rid of all the escape sequences in the console AND in the context of this question, removes all of the 'ha:////' URLs as well so I get pretty close to unadorned, complete ASCII text in my processed console log. It's worth mentioning that our site's automation culture is to allow Jenkins builds to expire except those marked for keeping, so my workflow is to produce a postprocessed console log artifact to "keep" and not to archive the original log. It's not to create a smaller log in the first place, which I saw as more time- and resource-consuming for no discernible benefit.
Presuming the raw Jenkins console log lives in console-log.txt, it's:
ansi2txt < console-log.txt | col -b | sed 's;ha:////[[:print:]]*AAAA[=]*;;g'
This eliminates escape sequences meant to provide terminal display sugar without requiring build and installation of tool packages not found in any repo (in Ubuntu ansi2txt comes from colorized-logs and col comes from bsdmainutils), removes the mysterious 'ha:////' URLs regardless of their source, and turns a raw console log that looks like:
Started by user ESC[8mha:////4AqgegZw7qQ8DI1+KvWPDM7IJMwAv+ifWfXHqdHJJeCwAAAAlx+
LCAAAAAAAAP9b85aBtbiIQTGjNKU4P08vOT+vOD8nVc83PyU1x6OyILUoJzMv2y+/JJUBAhiZGBgqihh
k0NSjKDWzXb3RdlLBUSYGJk8GtpzUvPSSDB8G5tKinBIGIZ+sxLJE/ZzEvHT94JKizLx0a6BxUmjGOUN
odHsLgAzWEgZu/dLi1CL9xJTczDwAj6GcLcAAAAA=ESC[0mAdmin
Checking out git ssh://git#github.com/SlipChip/PHX-Inst-App-SW.git into /var/tmp
/meta-talis/workspace/Firmware-Inst-App-SW#script to read Jenkinsfile
...
Commit message: "Add Jenkins console log as artifact console-log.txt."
> git rev-list --no-walk b70ac257fc5c87aa4a1fe55661b3523842f43412 # timeout=10
Running in Durability level: MAX_SURVIVABILITY
ESC[8mha:////4Ke8FKbo31T+wvpwDtO0m31cw6Dr9enqafGE6M9os2Y7AAAAoh+LCAAAAAAAAP9tjTEOwjAQBM8BClpKHuFItIiK1krDC0x8GCfWnbEdkooX8TX+gCESFVvtrLSa5wtWKcKBo5UdUu8otU4GP9jS5Mixv3geZcdn2TIl9igbHBs2eJyx4YwwR1SwULBGaj0nRzbDRnX6rmuvydanHMu2V1A5c4MHCFXMWcf8hSnC9jqYxPTz/BXAFEIGsfuclm8zQVqFvQAAAA==ESC[0m[Pipeline] Start of Pipeline
ESC[8mha:////4IgCbJC4forU2exyZEKrDUTKRV7HgFuwndWEBhDMO34wAAAApR+LCAAAAAAAAP9tjTEOwjAUQ3+KOrAycohUghExsUZZOEFIQkgb/d8mKe3EibgadyBQiQlLlmxL1nu+oE4RjhQdby12HpP2vA+jK4lPFLtroIm3dOGaMFGwXNpJkrGnpUrKFhaxClYC1hZ1oOTRZdiIVt1VExS65pxj2Q4CKm8GeAAThZxVzN8yR9jeRpMIf5y/AJj7DGxXvP/86jduZBmjwAAAAA==ESC[0m[Pipeline] node
...
into the considerably more palatable:
Started by user Admin
Checking out git ssh://git#github.com/SlipChip/PHX-Inst-App-SW.git into /var/tmp/meta-talis/workspace/Firmware-Inst-App-SW#script to read Jenkinsfile
...
Commit message: "Add Jenkins console log as artifact console-log.txt."
> git rev-list --no-walk b70ac257fc5c87aa4a1fe55661b3523842f43412 # timeout=10
Running in Durability level: MAX_SURVIVABILITY
[Pipeline] Start of Pipeline
[Pipeline] node
which is the same as what I see in the Jenkins web interface when browsing the console log.
I hope this answer helps you in a practical sense (i.e. rather than making an O(n) walkthrough of all of your plugins searching for the 'ha:////' culprit).

Connect to Windows SQL Server 2008 R2 from Rails app

My Rails 4.2.1 app has to connect to a Microsoft SQL 2008 R2 database. I am using the tiny_tds gem version 1.0.4. FreeTDS v1.00.15 is installed on the production server running Ubuntu 14.04.
I run queries inside an each loop and I can't get the loop to complete, the process crashes before completion.
I tried playing with tiny_tds options without success.
Here's the code I am using to get the tiny_tds client (check tds_version and timeout options):
client = TinyTds::Client.new(username: db_conf['username'], password: db_conf['password'], host: db_conf['host'], port: db_conf['port'], database: db_conf['database'], tds_version: '7.3', timeout: 15000, appname: 'ERP')
Here's the FreeTDS log after such an error happens.
packet.c:741:Sending packet 0000 12 01 00 ce 00 00 00 00-16 03 01 00
86 10 00 00 |........ ........| 0010 82 00 80 6e d9 e2 dc 97-9d 77 59
9a 5b da e3 e2 |...n.... .wY.[...| 0020 8b aa 66 ed ec 5e e2 02-e5 6c
fd db e1 ef 47 1a |..f..^.. .l....G.| 0030 9d 63 03 ed 6d 3e 28 3b-b9
64 fd 92 71 34 ff ba |.c..m>(; .d..q4..| 0040 7d 3c 8d ee 7b 34 75
e9-d5 b7 c6 83 a9 7d e6 7f |}<..{4u. .....}..| 0050 71 7e 25 11 82 b8
76 b1-c6 ba 86 b4 c3 0a 47 f0 |q~%...v. ......G.| 0060 51 96 c7 e2 5f
ca 07 b2-95 53 b9 9e bb 2c e7 cb |Q..._... .S...,..| 0070 be 0a b5 eb
b0 f3 41 1d-cd 86 fc a6 53 08 5e 56 |......A. ....S.^V| 0080 29 85 79
14 dc 2b 74 7b-b2 43 2c e8 0e 87 60 e4 |).y..+t{ .C,....| 0090 10 ef
f8 14 03 01 00 01-01 16 03 01 00 30 c7 f0 |........ .....0..| 00a0 35
f5 2c 6e 79 8d 85 b9-bd 60 b7 09 8c 7e 29 18 |5.,ny... ....~).| 00b0
4a 56 ea c3 4e 13 bf e3-c5 8d f6 68 31 31 54 ee |JV..N... ...h11T.|
00c0 bf 2f 75 8d e9 9e c0 a9-d0 d2 9e 5b c9 92 |./u..... ...[..|
tls.c:105:in tds_pull_func_login query.c:3796:tds_disconnect()
util.c:165:Changed query state from IDLE to DEAD
util.c:322:tdserror(0x80b75e0, 0xa04ca80, 20017, 0)
dblib.c:7947:dbperror(0xae62780, 20017, 0) dblib.c:8015:dbperror:
Calling dblib_err_handler with msgno = 20017; msg->msgtext =
"Unexpected EOF from the server (192.168.32.105:1433)"
dblib.c:5777:dbgetuserdata(0xae62780) dblib.c:8037:dbperror:
dblib_err_handler for msgno = 20017; msg->msgtext = "Unexpected EOF
from the server (192.168.32.105:1433)" -- returns 2 (INT_CANCEL)
util.c:352:tdserror: client library returned TDS_INT_CANCEL(2)
util.c:375:tdserror: returning TDS_INT_CANCEL(2) util.c:375:tdserror:
returning TDS_INT_CANCEL(2) tls.c:942:handshake failed
login.c:530:login packet rejected util.c:322:tdserror(0x80b75e0,
0xa04ca80, 20002, 0) dblib.c:7947:dbperror(0xae62780, 20002, 0)
dblib.c:8015:dbperror: Calling dblib_err_handler with msgno = 20002;
msg->msgtext = "Adaptive Server connection failed"
And here's the output of tsql -C:
~$ tsql -C
Compile-time settings (established with the "configure" script)
Version: freetds v1.00.15
freetds.conf directory: /usr/local/etc
MS db-lib source compatibility: no
Sybase binary compatibility: no
Thread safety: yes
iconv library: yes
TDS version: auto
iODBC: no
unixodbc: no
SSPI "trusted" logins: no
Kerberos: no
OpenSSL: yes
GnuTLS: no
MARS: no
Any idea what I should do to fix those Unexpected EOF from the server errors?
In your FreeTDS configuration (often in /etc/freetds/freetds.conf as in your configuration), change the value of text size:
text size = 4294967295
That's the maximum value, IIRC. I believe with FreeTDS 0.91 that your default is probably 64512.
Looking at the SQL Profiler, I found out the Rails application was opening way too many connections on the MSSQL server. Upon reaching its max number of open connection, the MSSQL server refused opening any new connection, resulting in the Unexpected EOF from the server error.
To solve the issue, I had to reuse my open connection when sending queries instead of opening a new connection for each query. I guess this is the correct way to use the tiny_tds connector anyway.
Translated to code:
def self.get_pmi_client
if ##pmi_client.nil? or !##pmi_client.active?
db_conf = Rails.configuration.database_configuration["pmi_#{Rails.env}"]
##pmi_client = TinyTds::Client.new(username: db_conf['username'], password: db_conf['password'], host: db_conf['host'], port: db_conf['port'], database: db_conf['database'])
raise MSSQLConnectionError, t('erp.errors.pmi_connection_error') unless ##pmi_client.active?
end
return ##pmi_client
end

How can I execute commands in redis without getting any response at all?

I try to execute commands on redis but don't care for any response and don't even want any to minimize network traffic. One answer on stackoverflow said a Lua scripts that doesn't return anything could help to achieve that, but when I try it on the redis-cli and sniff my packages I still get the same number of packages transfered between client and server whether I have a script returning nothing or one that returns Integer 1.
The example Queries are:
EVAL "" 0
EVAL "return 1" 0
In both cases wireshark shows 4 packages exchanged. One [PSH, ACK] client to server, [ACK] from the server to the client, [PSH, ACK] from the server to the client and [ACK] back from the client to the server.
In the first case the [PSH, ACK] package that I expect to be the response from redis contains the following data:
0000 02 00 00 00 45 00 00 39 bc a8 40 00 40 06 00 00 ....E..9 ..#.#...
0010 7f 00 00 01 7f 00 00 01 18 eb e6 bb 03 4d 7c 9c ........ .....M|.
0020 e2 97 bf 53 80 18 23 df fe 2d 00 00 01 01 08 0a ...S..#. .-......
0030 11 cd c0 31 11 cd c0 31 24 2d 31 0d 0a ...1...1 $-1..
In the second case this package contains:
0000 02 00 00 00 45 00 00 38 fa 9f 40 00 40 06 00 00 ....E..8 ..#.#...
0010 7f 00 00 01 7f 00 00 01 18 eb e6 bb 03 4d 7c a1 ........ .....M|.
0020 e2 97 bf 76 80 18 23 dd fe 2c 00 00 01 01 08 0a ...v..#. .,......
0030 11 ce be 46 11 ce be 46 3a 31 0d 0a ...F...F :1..
For the second case the point is clear. :1 is the integer reply for 1. But for the first case I'm not sure. $ is the indicator for bulk reply and - the indicator for error. Does this mean that $-1 is the data for the (nil) that is shown in the redis-cli?
Or am I completely wrong there? And if I am right, is there a possibility to tell redis that I don't want any response at all (except the ACK for the command)? Or would I have to fork the redis code and implement this myself?
I really appreciate any hints on how to achieve getting no response at all without digging into the redis source code.
EVAL "" 0 returns $-1\r\n
EVAL "return 1" 0 returns :1\r\n
In the first case, $-1 is a specific bulk-reply to be used to represent the nil value (as described in the protocol specification)
AFAIK, there is no possibility to tell Redis it does not have to send a reply (even for an empty answer).
As explained by Marc Gravell, you can use Lua to bundle several operations and reduce the volume of the reply data. However, you will not avoid the minimal reply packet.
For instance you could run 100 operations in one Lua script and have one single minimal packet as a reply. However, this packet cannot be avoided IMO, except by altering Redis source code.

Why "menus" unit is finalized too early?

I tested my application with FastMM and FullDebugMode turned on, since I had some shutdown problems.
After solving bunch of my own problems FastMM started to complain about calling virtual method on a freed object in TPopupList. I tried to move the menus unit as early as possible in uses so that it would be finalized last, but it didn't help. Is this real problem, a bug in vcl or false alarm from FastMM?
Here's the full report from FastMM:
FastMM has detected an attempt to call a virtual method on a freed object. An access violation will now be raised in order to abort the current operation.
Freed object class: TPopupList
Virtual method: Offset +16
Virtual method address: 4714E4
The allocation number was: 220
The object was allocated by thread 0x1CC0, and the stack trace (return addresses) at the time was:
403216 [sys\system.pas][System][System.#GetMem][2654]
404A4F [sys\system.pas][System][System.TObject.NewInstance][8807]
404E16 [sys\system.pas][System][System.#ClassCreate][9472]
404A84 [sys\system.pas][System][System.TObject.Create][8822]
7F2602 [Menus.pas][Menus][Menus.Menus][4223]
40570F [sys\system.pas][System][System.InitUnits][11397]
405777 [sys\system.pas][System][System.#StartExe][11462]
40844F [SysInit.pas][SysInit][SysInit.#InitExe][663]
7F6368 [PCCSServer.dpr][PCCSServer][PCCSServer.PCCSServer][148]
7C90DCBA [ZwSetInformationThread]
7C817077 [Unknown function at RegisterWaitForInputIdle]
The object was subsequently freed by thread 0x1CC0, and the stack trace (return addresses) at the time was:
403232 [sys\system.pas][System][System.#FreeMem][2699]
404A6D [sys\system.pas][System][System.TObject.FreeInstance][8813]
404E61 [sys\system.pas][System][System.#ClassDestroy][9513]
428D15 [common\Classes.pas][Classes][Classes.TList.Destroy][2914]
404AB3 [sys\system.pas][System][System.TObject.Free][8832]
472091 [Menus.pas][Menus][Menus.Finalization][4228]
4056A7 [sys\system.pas][System][System.FinalizeUnits][11256]
4056BF [sys\system.pas][System][System.FinalizeUnits][11261]
7C9032A8 [RtlConvertUlongToLargeInteger]
7C90327A [RtlConvertUlongToLargeInteger]
7C92AA0F [Unknown function at towlower]
The current thread ID is 0x1CC0, and the stack trace (return addresses) leading to this error is:
4714B8 [Menus.pas][Menus][Menus.TPopupList.MainWndProc][3779]
435BB2 [common\Classes.pas][Classes][Classes.StdWndProc][11583]
7E418734 [Unknown function at GetDC]
7E418816 [Unknown function at GetDC]
7E428EA0 [Unknown function at DefWindowProcW]
7E428EEC [Unknown function at DefWindowProcW]
7C90E473 [KiUserCallbackDispatcher]
7E42B1A8 [DestroyWindow]
47CE31 [Controls.pas][Controls][Controls.TWinControl.DestroyWindowHandle][6857]
493BE4 [Forms.pas][Forms][Forms.TCustomForm.DestroyWindowHandle][4564]
4906D9 [Forms.pas][Forms][Forms.TCustomForm.Destroy][2929]
Current memory dump of 256 bytes starting at pointer address 7FF9CFF0:
2C FE 82 00 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 C4 A3 2D 0C 00 00 00 00 B1 D0 F9 7F
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 C0 00 00 00 16 32 40 00 9D 5B 40 00 C8 5B 40 00
CE 82 40 00 3C 40 91 7C B0 B1 94 7C 0A 77 92 7C 84 77 92 7C 7C F0 96 7C 94 B3 94 7C 84 77 92 7C
C0 1C 00 00 32 32 40 00 12 5B 40 00 EF 69 40 00 BA 20 47 00 A7 56 40 00 BF 56 40 00 A8 32 90 7C
7A 32 90 7C 0F AA 92 7C 0A 77 92 7C 84 77 92 7C C0 1C 00 00 0E 00 00 00 00 00 00 00 C7 35 65 59
2C FE 82 00 80 80 80 80 80 80 80 80 80 80 38 CA 9A A6 80 80 80 80 80 80 00 00 00 00 51 D1 F9 7F
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 C1 00 00 00 16 32 40 00 9D 5B 40 00 C8 5B 40 00
CE 82 40 00 3C 40 91 7C B0 B1 94 7C 0A 77 92 7C 84 77 92 7C 7C F0 96 7C 94 B3 94 7C 84 77 92 7C
, þ ‚ . € € € € € € € € € € € € € € € € Ä £ - . . . . . ± Ð ù
. . . . . . . . . . . . . . . . À . . . . 2 # . [ # . È [ # .
Î ‚ # . < # ‘ | ° ± ” | . w ’ | „ w ’ | | ð – | ” ³ ” | „ w ’ |
À . . . 2 2 # . . [ # . ï i # . º G . § V # . ¿ V # . ¨ 2 |
z 2 | . ª ’ | . w ’ | „ w ’ | À . . . . . . . . . . . Ç 5 e Y
, þ ‚ . € € € € € € € € € € 8 Ê š ¦ € € € € € € . . . . Q Ñ ù
. . . . . . . . . . . . . . . . Á . . . . 2 # . [ # . È [ # .
Î ‚ # . < # ‘ | ° ± ” | . w ’ | „ w ’ | | ð – | ” ³ ” | „ w ’ |
I'm using Delphi 2007 and FastMM 4.97.
Edit1: I think the main problem here is why does Classes.StdWndProc call Menus.TPopupList? Digging the call stack inside debugger shows that System.FinalizeUnit is called three times, then it goes to SysUtils.ShowException, which tries to display MessageBox and after bunch of user32.dll calls we end up to classes.StdWndProc.
Edit2: I had problem with interfaces, fixing that made this problem go away. The object with interface was freed, but the reference was released later on. When the interface was released, occured an exception which I initially somehow ignored. Releasing the interface probably corrupted something which caused all other problems.
That situation can happens when a unit finalize after another unit it indirectly depends on.
For exemple, take the following unit:
unit Unit1;
interface
uses
Contnrs;
var
ItemHolder : TObjectList;
implementation
initialization
ItemHolder := TObjectList.Create(True);
finalization
ItemHolder.Free;
end.
That unit only directly depends on Contnrs. For that reason, delphi will ensure that this unit is finalized before Contnrs is. If the ObjectList contains TForms, Delphi won't ensure that Unit1 is finalized before unit Forms. If there are still some forms left while closing the application, TObjectList (Since it owns the object) will free the items it contains(Call TForm.Free). But since Unit1 doesn't depends on TForm, it's possible that the unit Forms is already finalized and that TForm.Destroy isn't in memory anymore.
This is why you need to be very carefull about what you do in finalization sections.
I'm not sure it's the source of your problem, but I would look that way first.
I've seen such problems with Delphi 2007 before. Sometimes the compiler gets confused and generates incorrect initialization or finalization order. Sadly, I was never able to create a reproducible test case to send to the CodeGear/Embarcadero people.
Whenever that happened, a full rebuild helped.
Make sure that FastMM4 is the FIRST line in your project file's uses clause (project|View source). If its not there, then add it.
It looks like one of your forms is getting destroyed after Menus.pas has been finalized. If your form has a menu on it, it would probably have to have Menus in its uses list in the interface section, which should make this impossible.
The only time I've seen issues like this pop up (no pun intended) is when using packages. Are you perhaps using a DPK with a plugin that adds a popup menu or menu items to your program? Package finalization can do some strange things to your program if you're not careful.
Either way, the solution is probably to dispose of the menu yourself before menus.pas finalizes. When it's time for the program to shut down, call Free on your popup menu and see if that solves the problem.
Update: this is only partial workaround
Workaround:
In main form of your application write
finalization
FreeAndNil(PopupList);
end.
this free PopupList and set to nil, so PopupList.Free in menus.pas will be OK.

Resources