Comparing generated executables for equivalence

I need to compare two executables and/or shared objects, compiled using the same compiler and flags, and verify that they have not changed. We work in a regulated environment, so it would be really useful for testing purposes to isolate exactly which parts of the executable have changed.
Using MD5 sums or other hashes doesn't work, because the headers contain information about the file itself.
Does anyone know of a program or a way to verify that two files are executionally the same, even if they were built at different times?

An interesting question. I have a similar problem on Linux. Intrusion detection systems like OSSEC or Tripwire may generate false positives if the hash of an executable suddenly changes. This may be nothing worse than the Linux "prelink" program patching the executable file for faster startup.
In order to compare two binaries (in the ELF format), one can use the "readelf" executable and then "diff" to compare the outputs. I'm sure there are more refined solutions, but without further ado, here is a poor man's comparator in Perl:
#!/usr/bin/perl -w
$exe = $ARGV[0];
if (!$exe) {
    die "Please give name of executable\n";
}
if (! -f $exe) {
    die "Executable $exe not found or not a file\n";
}
if (! (`file '$exe'` =~ /\bELF\b.*?\bexecutable\b/)) {
    die "file command says '$exe' is not an ELF executable\n";
}

# Identify sections in the ELF file
@lines    = pipeIt("readelf --wide --section-headers '$exe'");
@sections = ();
for my $line (@lines) {
    if ($line =~ /^\s*\[\s*(\d+)\s*\]\s+(\S+)/) {
        my $secnum = $1;
        my $secnam = $2;
        print "Found section $secnum named $secnam\n";
        push @sections, $secnam;
    }
}

# Dump file header
@lines = pipeIt("readelf --file-header --wide '$exe'");
print @lines;

# Dump all interesting section headers
@lines = pipeIt("readelf --all --wide '$exe'");
print @lines;

# Dump individual sections as hexdump
for my $section (@sections) {
    @lines = pipeIt("readelf --hex-dump='$section' --wide '$exe'");
    print @lines;
}

sub pipeIt {
    my ($cmd) = @_;
    my $fh;
    open($fh, "$cmd |") or die "Could not open pipe from command '$cmd': $!\n";
    my @lines = <$fh>;
    close $fh or die "Could not close pipe to command '$cmd': $!\n";
    return @lines;
}
Now you can run, for example, on machine 1:
./checkexe.pl /usr/bin/curl > curl_machine1
And on machine 2:
./checkexe.pl /usr/bin/curl > curl_machine2
After having copy-pasted, SFTP-ed or NFS-ed (you don't use FTP, do you?) the files into the same file tree, compare them:
diff --side-by-side --width=200 curl_machine1 curl_machine2 | less
In my case, differences exist in sections ".gnu.conflict", ".gnu.liblist", ".got.plt" and ".dynbss", which might be OK after a "prelink" intervention, but a difference in the code section, ".text", would be a Bad Sign.

To follow up, here is what I finally came up with:
Instead of comparing the final executables and shared objects, we compared the .o files output before linking. We assumed that the linking process was sufficiently reproducible for this to be fine.
It works in some of our cases, where we have two builds in which we've made some small change that shouldn't affect the final code (a code pretty-printer), but it doesn't help us if we don't have the build's intermediate output.
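A minimal sketch of that object-file comparison, assuming both builds leave their .o files in sibling directories (build1 and build2 are hypothetical names):
for f in build1/*.o; do
    cmp -s "$f" "build2/$(basename "$f")" || echo "differs: $f"
done
cmp exits non-zero at the first differing byte, so only the object files that actually changed are listed.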

You can compare the contents of the RO and RW initialized sections by generating a flat binary file from the ELF file:
objcopy <elf_file> -O binary <binary_file>
Use the generated binary files to compare if they are identical, using diff, for example.
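For example (first.elf and second.elf are hypothetical file names):
objcopy first.elf -O binary first.bin
objcopy second.elf -O binary second.bin
cmp first.bin second.bin && echo "sections identical"
cmp can be more convenient than diff for flat binaries, since it reports the offset of the first differing byte.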
In my opinion, this is enough to guarantee you are generating the same executable.

A few years back I had to do the same thing. We had to prove that we could rebuild the executable from source when given only a revision number, the revision control repository, the build tools, and the build configuration. Note: if any of these change, you may see a difference.
I remember there are some timestamps in the executable. The trick is to realize that the file is not just a bunch of bytes that cannot be interpreted. The file has sections; most will not change, but there will be a section for the time of build (or some such thing).
I don't remember all the details, but the commands you will need are objcopy, objdump and nm; I think objdump would be the first to try.
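For instance, a minimal sketch that diffs only the disassembled code section, sidestepping header timestamps (file names are hypothetical; the tail -n +3 strips objdump's header lines, which contain the input file name and would always differ):
objdump -d -j .text first.elf | tail -n +3 > first.asm
objdump -d -j .text second.elf | tail -n +3 > second.asm
diff first.asm second.asm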
Hope this helps.

Related

How to find the file paths of all namespaces loaded in an application using Tcl/Tk under Unix?

In the existing flow, a whole bunch of namespaces are loaded when running some script job.
However, if I want to check and trace the usage of some command in some namespace, I need to find the script path of that namespace.
Is there some way to get that? In particular, I'm talking about PrimeTime scripts.
Technically, namespaces don't have script paths. But we can do something close enough:
proc report_current_file {call code result op} {
    if {$code == 0} {
        # If the [proc] call was successful...
        set cmd [lindex $call 1]
        set qualified_cmd [uplevel 1 [list namespace which $cmd]]
        set file [file normalize [info script]]
        puts "Defined $qualified_cmd in $file"
    }
}
trace add execution proc leave report_current_file
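With the trace installed, sourcing a file that defines a procedure reports where it was defined. For example (demo.tcl and greet are hypothetical names, with greet defined at global scope):
% source /tmp/demo.tcl
Defined ::greet in /tmp/demo.tcl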
It's not perfect if you've got procedures creating procedures dynamically — the current file might be wrong — but that's fortunately not what most code does.
Another option that might work for you is to use tcl::unsupported::getbytecode, which produces a lot of information in machine-readable format (a dictionary). One of the pieces of information is the sourcefile key. Here's an example running interactively on my machine:
% parray tcl_platform
tcl_platform(byteOrder) = littleEndian
tcl_platform(engine) = Tcl
tcl_platform(machine) = x86_64
tcl_platform(os) = Darwin
tcl_platform(osVersion) = 20.2.0
tcl_platform(pathSeparator) = :
tcl_platform(platform) = unix
tcl_platform(pointerSize) = 8
tcl_platform(threaded) = 1
tcl_platform(user) = dkf
tcl_platform(wordSize) = 8
% dict get [tcl::unsupported::getbytecode proc parray] sourcefile
/opt/local/lib/tcl8.6/parray.tcl
Note that the procedure has to be already defined for this to work. And if Tcl's become confused about what file the code was in (because of dynamic programming trickery) then that key is absent.

Parameter of the dxgettext command line to "Add likely ignores to ignore po file"

The dxgettext Extract Translations GUI has a switch to Add likely ignores to ignore po file, but I don't see the corresponding parameter when calling dxgettext from the command line.
I'm building a batch file that does several tasks when preparing a new release, and I would like the translation extraction step to behave the same as when called from the UI, moving to a separate file the strings that will clearly not need to be translated.
These are the parameters that I'm using:
dxgettext -b MyProjectPath --delphi --nonascii -r --useignorepo --preserveUserComments
Thank you.
I had the same problem as you. To solve it for my open-source image organizer application, I use the following batch file to extract the strings from the sources and remove all strings to be ignored:
c:\Utils\dxgettext -b . --delphi --nonascii --no-wrap -o:msgid -o .
c:\Utils\msgremove default.po -i OvbImgOrganizerLanguageIgnore.po -o OvbImgOrganizerLanguage.pot --no-wrap
del OvbImgOrganizerLanguageDefaultBak.po
ren default.po OvbImgOrganizerLanguageDefaultBak.po
This batch is run with current directory being the source code directory.
That dialog is provided by the GUI ggdxgettext tool.
By the look of it, the dxgettext command line tool does this automatically by default:
item := dom.order.Objects[j] as TPoEntry;
ignoreitem := ignorelist.Find(item.MsgId);
if ignoreitem = nil then begin
  newitem := TPoEntry.Create;
  newitem.Assign(item);
  if not IsProbablyTranslatable(newitem, nil, nil) then
    ignorelist.Add(newitem)
  else
    FreeAndNil(newitem);
end else begin
  ignoreitem.AutoCommentList.Text := item.AutoCommentList.Text;
end;
But I am not quite sure since I haven't tried to analyze the program flow.
The sources are available on SourceForge, so you can check yourself.

How to issue Message Before Build--or seq problems

I'm trying to add helpful messages for arbitrary builds. If the build fails the user can, for example, install the package with different arguments.
My interface idea is to provide a function, build-with-message, that would be called with something like this:
build-with-message
  ''Building ${pkg.name}. Alternative invocations are: ..''
  pkg
My implementation is based on builtins.seq:
build-with-message = msg: pkg:
  seq
    (self.runCommand "issue-message" {} ''mkdir $out; echo ${msg}'')
    pkg;
When I build a package with build-with-message I never see the message. My hunch is that seq evaluates the runCommand far enough to see that a set is returned and moves on to building the package. I tried with deepSeq as well, but a deepSeq build fails on runCommand. I also tried calling out some attributes from the runCommand, e.g.
(self.runCommand "issue-message" {} ''mkdir $out; echo ${msg}'').drvPath
(self.runCommand "issue-message" {} ''mkdir $out; echo ${msg}'').out
My thought being that calling for one of these would prompt the rest of the build. Perhaps I'm not calling the right attribute, but in any case the ones I've tried don't work.
So:
Is there a way to force the runCommand to build in the above scenario?
Is there already some builtin that just lets me issue messages on top of arbitrary builds?
Here's me answering my own question again, consider this a warning.
Solution:
I've in-lined some numbered comments to help with the explanation.
build-with-message = msg: pkg:
  let runMsg /*1*/ = self.runCommand "issue-message"
    { version = toString currentTime; /*2*/ } ''
      cat <<EOF
      ${msg}
      EOF
      echo 0 > $out /*3*/
    '';
  in seq (import runMsg /*4*/) pkg; /*5*/
Explanation:
1. runMsg is the derivation that issues the message.
2. Adding a version based on the current time ensures that the build of runMsg will not already be in /nix/store; otherwise, each unique message would only be issued for the first build.
3. After the message is printed, a 0 is saved to a file as the output of the derivation.
4. The import loads runMsg--a derivation, and therefore serialized as the path $out. import expects a Nix expression, which in this case is just the number 0 (a valid Nix expression).
5. Since the runMsg output will not be available until after it has been built, seq will build it (issuing the message) and then build pkg.
Discussion:
I take note of Robert Hensing's comment to my question: this may not be something Nix was intended for. I'm not arguing against that. Moving on.
Notice that issuing a message like this will add a file to your Nix store for every message issued. I don't know whether the message build will be garbage collected while pkg is still installed, so there's the possibility of polluting the Nix store if such a pattern is overused.
I also think it's really interesting that the result of the runMsg build was to install a nix expression. I suppose this opens the door to doing useful things.

How do I get output files for a given Bazel target?

Ideally, I'd like a list of output files for a target without building. I imagine this should be possible using cquery, which runs post-analysis, but I can't figure out how.
Here's my output.cquery
def format(target):
    outputs = target.files.to_list()
    return outputs[0].path if len(outputs) > 0 else "(missing)"
You can run this as follows:
bazel cquery //a/b:bundle --output starlark \
--starlark:file=output.cquery 2>/dev/null
bazel-out/darwin-fastbuild/bin/a/b/something-bundle.zip
See the Bazel documentation for more information on cquery.
What exactly do you mean by "output files" here? Do you mean that you'd like to know the files generated if you build the target on the command line?
At what point would you like to have this information? Do you really want to invoke a bazel query command to acquire this information, or would you like it during analysis? I don't think there's a way, using bazel query, to get the exact expected absolute path of output files (or even the workspace-relative path, for example, bazel-out/foo/bar/baz.txt)
It may be a bit more involved than you want, but Requesting Output Files has some information about specifying output files in Starlark, with a brief bit about acquiring information about your dependencies' output files (see DefaultInfo).
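For context, here is a minimal sketch of a Starlark rule that declares its outputs via DefaultInfo (the rule name and file contents are hypothetical):
def _my_rule_impl(ctx):
    # Declare and produce one output file; DefaultInfo.files is what
    # downstream rules and queries observe as this target's outputs.
    out = ctx.actions.declare_file(ctx.label.name + ".txt")
    ctx.actions.write(output = out, content = "hello\n")
    return [DefaultInfo(files = depset([out]))]

my_rule = rule(implementation = _my_rule_impl)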
I made a slight improvement to Eugene's answer, since a target might have multiple outputs:
bazel cquery --output=starlark \
--starlark:expr="'\n'.join([f.path for f in target.files.to_list()])" \
//foo:bar

How to monitor a text file in realtime [closed]

For debugging purposes in a somewhat closed system, I have to output text to a file.
Does anyone know of a tool that runs on windows (console based or not) that detects changes to a file and outputs them in real-time?
I like tools that perform more than one task. Notepad++ is a great Notepad replacement and has a Document Monitor plugin (installs with the standard MSI) that works great. It's also portable, so you can keep it on a thumb drive for use anywhere.
For a command line option, PowerShell (which is really a new command line) has a great feature already mentioned.
Get-Content someFile.txt -wait
But you can also filter at the command line using a regular expression:
Get-Content web.log -wait | where { $_ -match "ERROR" }
Tail for Win32
Apache Chainsaw - I used this with log4net logs; it may require the file to be in a certain format
When using Windows PowerShell you can do the following:
Get-Content someFile.txt -wait
I use "tail -f" under cygwin.
I use BareTail for doing this on Windows. It's free and has some nice features, such as tabs for tailing multiple files and configurable highlighting.
Tail is the best answer so far.
If you don't use Windows, you probably already have tail.
If you do use Windows, you can get a whole slew of Unix command line tools from here. Unzip them and put them somewhere in your PATH.
Then just do this at the command prompt from the same folder your log file is in:
tail -n 50 -f whatever.log
This will show you the last 50 lines of the file and will update as the file updates.
You can combine grep with tail with great results - something like this:
tail -n 50 -f whatever.log | grep Error
gives you just the lines with "Error" in them.
Good luck!
FileSystemWatcher works a treat, although you do have to be a little careful about duplicate events firing, but bearing that in mind it can produce great results.
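To illustrate that duplicate-event caveat, here is a minimal sketch of one way to suppress them, by remembering when each path was last handled (the folder path and the 500 ms threshold are arbitrary assumptions):
using System;
using System.Collections.Generic;
using System.IO;

class DedupWatcher
{
    // Remember when each path was last handled so that the rapid
    // duplicate Changed events FileSystemWatcher raises are ignored.
    static readonly Dictionary<string, DateTime> lastSeen = new Dictionary<string, DateTime>();

    static void Main()
    {
        var watcher = new FileSystemWatcher(@"C:\logs", "*.txt"); // hypothetical folder
        watcher.Changed += (sender, e) =>
        {
            lock (lastSeen)
            {
                DateTime prev;
                if (lastSeen.TryGetValue(e.FullPath, out prev)
                    && (DateTime.Now - prev).TotalMilliseconds < 500)
                    return; // duplicate within 500 ms, skip it
                lastSeen[e.FullPath] = DateTime.Now;
            }
            Console.WriteLine("Changed: " + e.FullPath);
        };
        watcher.EnableRaisingEvents = true;
        Console.ReadLine(); // keep watching until Enter is pressed
    }
}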
Late answer, though it might be helpful for someone: LogExpert seems to be an interesting tail utility for Windows.
Try SMSTrace from Microsoft (now called CMTrace, and directly available in the Start menu on some versions of Windows).
It's a brilliant GUI tool that monitors updates to any text file in real time, even if it's locked for writing by another process.
Don't be fooled by the description: it's capable of monitoring any file, including .txt, .log or .csv.
Its ability to monitor locked files is extremely useful, and is one of the reasons why this utility shines.
One of the nicest features is line coloring. If it sees the word "ERROR", the line becomes red. If it sees the word "WARN", the line becomes yellow. This makes the logs a lot easier to follow.
I have used FileSystemWatcher for monitoring text files for a component I recently built. There may be better options (I never found anything in my limited research) but that seemed to do the trick nicely :)
Crap, my bad, you're actually after a tool to do it all for you.
Well, if you get unlucky and want to roll your own ;)
You can use the FileSystemWatcher in System.IO.
From MSDN:
using System;
using System.IO;
using System.Security.Permissions;

public class Watcher
{
    public static void Main()
    {
        Run();
    }

    [PermissionSet(SecurityAction.Demand, Name = "FullTrust")]
    public static void Run()
    {
        string[] args = System.Environment.GetCommandLineArgs();

        // If a directory is not specified, exit program.
        if (args.Length != 2)
        {
            // Display the proper way to call the program.
            Console.WriteLine("Usage: Watcher.exe (directory)");
            return;
        }

        // Create a new FileSystemWatcher and set its properties.
        FileSystemWatcher watcher = new FileSystemWatcher();
        watcher.Path = args[1];

        /* Watch for changes in LastAccess and LastWrite times, and
           the renaming of files or directories. */
        watcher.NotifyFilter = NotifyFilters.LastAccess | NotifyFilters.LastWrite
                             | NotifyFilters.FileName | NotifyFilters.DirectoryName;

        // Only watch text files.
        watcher.Filter = "*.txt";

        // Add event handlers.
        watcher.Changed += new FileSystemEventHandler(OnChanged);
        watcher.Created += new FileSystemEventHandler(OnChanged);
        watcher.Deleted += new FileSystemEventHandler(OnChanged);
        watcher.Renamed += new RenamedEventHandler(OnRenamed);

        // Begin watching.
        watcher.EnableRaisingEvents = true;

        // Wait for the user to quit the program.
        Console.WriteLine("Press 'q' to quit the sample.");
        while (Console.Read() != 'q') ;
    }

    // Define the event handlers.
    private static void OnChanged(object source, FileSystemEventArgs e)
    {
        // Specify what is done when a file is changed, created, or deleted.
        Console.WriteLine("File: " + e.FullPath + " " + e.ChangeType);
    }

    private static void OnRenamed(object source, RenamedEventArgs e)
    {
        // Specify what is done when a file is renamed.
        Console.WriteLine("File: {0} renamed to {1}", e.OldFullPath, e.FullPath);
    }
}
You can also follow this link: Watching Folder Activity in VB.NET.
Snake Tail is a good option:
http://snakenest.com/snaketail/
Just a shameless plug to tail onto the answer, but I have a free web-based app called Hacksaw used for viewing log4net files. I've put in an auto-refresh option so you can get near-real-time updates without having to refresh the browser all the time.
Yeah I've used both Tail for Win32 and tail on Cygwin. I've found both to be excellent, although I prefer Cygwin slightly as I'm able to tail files over the internet efficiently without crashes (Tail for Win32 has crashed on me in some instances).
So basically, I would use tail on Cygwin and redirect the output to a file on my local machine. I would then have this file open in Vim and reload (:e) it when required.
+1 for BareTail. I actually use BareTailPro, which provides real-time filtering on the tail with basic search strings or search strings using regex.
To make the list complete, here's a link to the GNU Win32 ports of many useful tools (among them tail):
GNUWin32 CoreUtils
Surprised no one has mentioned Trace32 (or Trace64). These are great (free) Microsoft utilities that give a nice GUI and highlight any errors, etc. They also have filtering and sound like exactly what you need.
Here's a utility I wrote to do just that:
It uses a FileSystemWatcher to look for changes in log files within local folders or network shares (these don't have to be mounted; just provide the UNC path) and appends the new content to the console.
On GitHub: https://github.com/danbyrne84/multitail
http://www.danielbyrne.net/projects/multitail
Hope this helps
@echo off
set LoggingFile=C:\foo.txt
set lineNr=0
:while1
for /f "usebackq delims=" %%i in (`more +%lineNr% %LoggingFile%`) do (
    echo %%i
    set /a lineNr+=1
    REM Have an appropriate stop condition here by checking %%i
)
goto :while1
A command prompt way of doing it.
FileMon is a free standalone tool that can detect all kinds of file access. You can filter out anything unwanted. It does not show you the data that has actually changed, though.
I second "tail -f" in Cygwin. I assume that Tail for Win32 will accomplish the same thing.
Tail for Win32
I made a tiny viewer of my own:
https://github.com/enexusde/Delphi/wiki/TinyLog
