Malware Real-Time Memory Analysis - memory

I'm attempting to analyze some malware. As you'd expect, the malware can get a bit complicated in what it does. I believe I've discovered the approximate time at which it uses an AES key before deleting the key from memory. Because this happens so quickly, I don't have time to core dump the process to find the decryption key.
My question is, how should I go about watching all memory reads (or, more importantly, writes) on the system? Yes, I know that's a lot. However, between knowing a small time window (a second or two) and knowing that the key will have a given format, I think it's possible to narrow the candidates down to a reasonable key space.
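For instance, here's a rough sketch of what I mean by narrowing candidates down by format (the dump file name and the entropy threshold are made up; I'm assuming AES-128-style 16-byte keys and that I can grab a raw snapshot of the suspect region in time):

    import math
    from collections import Counter

    def shannon_entropy(data: bytes) -> float:
        """Shannon entropy in bits per byte (8.0 = maximally random)."""
        counts = Counter(data)
        total = len(data)
        return -sum((c / total) * math.log2(c / total) for c in counts.values())

    def candidate_keys(snapshot: bytes, key_len: int = 16, threshold: float = 3.8):
        """Yield (offset, window) pairs whose entropy looks key-like.

        The threshold is a guess: real keys are near-random, while padding,
        ASCII text and zeroed pages score much lower.
        """
        for offset in range(len(snapshot) - key_len + 1):
            window = snapshot[offset:offset + key_len]
            if shannon_entropy(window) >= threshold:
                yield offset, window

    if __name__ == "__main__":
        # "dump.bin" is a hypothetical raw snapshot of the suspect region.
        with open("dump.bin", "rb") as f:
            data = f.read()
        for off, key in candidate_keys(data, key_len=16):
            print(f"0x{off:08x}: {key.hex()}")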
EDIT: Sadly, many concerns have already been raised about this question asking for a tool. Let me make this perfectly clear: I AM NOT ASKING FOR A TOOL RECOMMENDATION. :-) Similarly, if there are other concerns about censoring, please feel free to raise them and I can redact my question further.

Related

Interview question on stack

Recently my friend attended an interview, where he faced this question (the interviewer made it up based on my friend's answer to another question):
Say we have the option to use either
1) recursion --> uses the system stack; I think the OS takes care of everything
2) our own stack for just the data part, and get things done
to fix something. Which one do you prefer, and why?
Assume the stack size wouldn't grow beyond 100.
I would use the system stack. Why re-invent the wheel?
Function calls, while not really slow per se, do take non-zero time. Therefore an iterative solution can be slightly faster.
More often than not, simplicity is better than a slight performance gain.
Don't over-engineer a solution and lose maintainability/readability for 1 ms if you are not going to use that 1 ms.
Just remember that whatever clever little hack you put together has to be maintained (and proven to work first, for that matter), whereas many standard/system solutions are available that have already been proven (see Reinventing the wheel).
If it is really system-critical that you reduce memory allocation and improve performance, you have your work cut out for you, and you should be prepared to spend some time proving that your solution is better/faster and stable.
Interesting to see the general preference for recursion on here, and a few who assume that the recursive implementation will necessarily be clearer or more maintainable... maybe, maybe not :-).
recursion typically avoids an explicit loop
recursion can sometimes simply use local variables inside the function to avoid a container storing results as they're calculated
recursion can make it trivial to reverse the order in which sub-results are gathered
recursion means there's a limit to the depth of information being processed, whereas a loop implementation often avoids this easily, or at least has memory requirements that more accurately reflect the data-processing needs
the more widely applicable you want your software to be, the more important it is to remove arbitrary limits (e.g. UNIX software like modern vim, less, GNU grep etc. make minimal assumptions about file/line/expression length and dynamically attempt whatever they're asked; many here will remember old editors and vendor-specific utilities, e.g. one "celestial" company's grep that would never match results at the end of a too-long line, or editors that SIGSEGVed, shut down, corrupted data, or slowed to uselessness on long lines or files)
naive recursion can result in spectacularly inefficiently combined sub-results
some people find recursion easier to understand, some find it harder - definitely it suits how we think about some problems better than others
Depends on the algorithm. Small stack usage: use the system stack. Lots of stack needed: go to the heap. Stack size is limited by the OS, beyond which the OS throws a stack overflow ;-) If the algorithm uses more stack space than that, I would go with a stack data structure and push the data onto the heap.
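For instance, a minimal sketch (the tree shape here is made up) of the same depth-first walk written both ways; the explicit stack lives on the heap, so it isn't bound by the call-stack limit:

    # A tiny made-up tree: each node is (value, [children]).
    tree = ("a", [("b", [("d", []), ("e", [])]), ("c", [])])

    def visit_recursive(node, out):
        """Depth-first traversal using the call (system) stack."""
        value, children = node
        out.append(value)
        for child in children:
            visit_recursive(child, out)  # depth limited by the call stack

    def visit_iterative(root):
        """Same traversal with an explicit stack allocated on the heap."""
        out, stack = [], [root]
        while stack:
            value, children = stack.pop()
            out.append(value)
            stack.extend(reversed(children))  # reversed keeps left-to-right order
        return out

    result = []
    visit_recursive(tree, result)
    assert result == visit_iterative(tree) == ["a", "b", "d", "e", "c"]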
Hm, I think it depends on the problem...
The stack size, if I got your point, is not the only thing limiting you from using one or the other.
But wanting to use recursion... well, nothing bad about it, really, as far as the length of the stack goes, but I'd rather build my own solution.
Avoid recursion when you can. :)
Recursion may be the simplest way to solve a particular problem. An iterative solution can require more code and more opportunities for errors. The testing and maintenance cost may be greater than the performance benefit.
I would go with the first and use the system stack. That said, in the language FORTH there are two system stacks: one is the return stack and the other is the parameter stack. This offers some nice flexibility.

Background reading for parsing sloppy / quirky / "almost structured" data?

I'm maintaining a program that needs to parse out data that is present in an "almost structured" form in text. i.e. various programs that produce it use slightly different formats, it may have been printed out and OCR'd back in (yeah, I know) with errors, etc. so I need to use heuristics that guess how it was produced and apply different quirks modes, etc. It's frustrating, because I'm somewhat familiar with the theory and practice of parsing if things are well behaved, and there are nice parsing frameworks etc. out there, but the unreliability of the data has led me to write some very sloppy ad-hoc code. It's OK at the moment but I'm worried that as I expand it to process more variations and more complex data, things will get out of hand. So my question is:
Since there are a fair number of existing commercial products that do related things ("quirks modes" in web browsers, error interpretation in compilers, even natural language processing and data mining, etc.) I'm sure some smart people have put thought into this, and tried to develop a theory, so what are the best sources for background reading on parsing unprincipled data in as principled a manner as possible?
I realize this is somewhat open-ended, but my problem is that I think I need more background to even know what the right questions to ask are.
Given the choice between what you've proposed and fighting a hungry crocodile while covered in raw-beef-flavored marmalade and both hands tied behind my back, I'd choose the ...
Well, OK, on a more serious note: if you have data that doesn't abide by any "sane" structure, you have to study the data, work out how frequent each quirk is, and correlate the quirks with the context they come from (i.e. how the data was generated).
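As a minimal sketch of that idea (the producer signatures, field names, and regexes are all invented for illustration): guess which program the record came from, dispatch to a quirk-specific parser, and fall back to a lenient one:

    import re

    # Hypothetical quirk-specific parsers; each returns a dict of fields or None.
    def parse_program_a(line):
        m = re.match(r"(?P<id>\d+)\s*;\s*(?P<name>.+)", line)
        return m.groupdict() if m else None

    def parse_program_b(line):
        m = re.match(r"(?P<name>.+?)\s{2,}(?P<id>\d+)$", line)
        return m.groupdict() if m else None

    def parse_lenient(line):
        # Last resort: grab anything that looks like an id and keep the rest.
        m = re.search(r"\d+", line)
        return {"id": m.group() if m else None, "name": line.strip()}

    # Ordered from most to least specific, like browser quirks modes.
    PARSERS = [parse_program_a, parse_program_b, parse_lenient]

    def parse_record(line):
        """Try each parser in turn; the first one that recognises the line wins."""
        for parser in PARSERS:
            result = parser(line)
            if result is not None:
                return result
        return None

    print(parse_record("123 ; ACME Corp"))  # handled by parse_program_a
    print(parse_record("ACME Corp   123"))  # handled by parse_program_b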
Print-to-OCR round trips to get the data in are almost always going to lead to heartbreak. The company I work for employs a veritable army of people who manually read such documents and hand-"code" (i.e. enter by hand) the data for known problematic OCR scenarios, or for documents our customers detect the original OCR failed on.
As for leveraging "parsing frameworks": these tend to expect data that will always follow the grammar rules you've laid out. The data you've described has no such guarantees. If you go that route, be prepared for unexpected - though not always obvious - failures.
By all means, if there is any way possible to get the original data files, do so. Or, if you can demand that those providing the data deliver it in a single well-defined format, even better. (It might not be "YOUR" format, but at least it's a regular and predictable format you can convert from.)

Low level programming: How to find data in a memory of another running process?

I am trying to write a statistics tool for a game by extracting values from the game's process memory (as there is no other way). The biggest challenge is finding the addresses that store the data I am interested in. What makes it even harder is dynamic memory allocation: I need to find not only the addresses that store the data but also the pointers to those memory blocks, because the addresses change every time the game restarts.
For now I am just manually searching the game's memory using a memory editor (ArtMoney), looking for addresses whose values change as the data changes (or don't change). After an address is found, I look for a pointer that points to that memory block in a similar way.
I wonder what techniques/tools exist for such tasks? Maybe there are some articles I can read? Is mastering a disassembler the only way to go? Game trainers, for example, solve similar tasks, but they manage it in days while I have been struggling for weeks.
Thanks.
PS. It's all under windows.
Is mastering a disassembler the only way to go?
Yes; go download WinDbg from http://www.microsoft.com/whdc/devtools/debugging/default.mspx, or, if you've got some money to blow, IDA Pro is probably the best tool for doing this.
If you know how to code in C, it is easy to search for memory values. If you don't know C, this page might point you to your solution if you can code in C#. It will not be hard to port the C# they have to Java.
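For what it's worth, here is the same idea sketched via Python's ctypes on top of the Win32 calls involved (OpenProcess/ReadProcessMemory); the pid, address, and size below are placeholders you would fill in from your own search:

    import ctypes

    PROCESS_VM_READ = 0x0010
    PROCESS_QUERY_INFORMATION = 0x0400

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)

    def read_process_memory(pid: int, address: int, size: int) -> bytes:
        """Read `size` bytes at `address` from the process with id `pid`."""
        handle = kernel32.OpenProcess(
            PROCESS_VM_READ | PROCESS_QUERY_INFORMATION, False, pid)
        if not handle:
            raise ctypes.WinError(ctypes.get_last_error())
        try:
            buf = ctypes.create_string_buffer(size)
            read = ctypes.c_size_t(0)
            ok = kernel32.ReadProcessMemory(
                handle, ctypes.c_void_p(address), buf, size, ctypes.byref(read))
            if not ok:
                raise ctypes.WinError(ctypes.get_last_error())
            return buf.raw[:read.value]
        finally:
            kernel32.CloseHandle(handle)

    # Example (placeholder values): snapshot a block twice and diff it to see
    # which offsets changed - the same idea ArtMoney automates.
    # before = read_process_memory(1234, 0x00400000, 4096)
    # after  = read_process_memory(1234, 0x00400000, 4096)
    # changed = [i for i, (a, b) in enumerate(zip(before, after)) if a != b]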
You might take a look at DynInst (Dynamic Instrumentation). In particular, look at the Dynamic Probe Class Library (DPCL). These tools will let you attach to running processes via the debugger interface and insert your own instrumentation (via special probe classes) into them while they're running. You could probably use this to instrument the routines that access your data structures and trace when the values you're interested in are created or modified.
You might have an easier time doing it this way than doing everything manually. There are a bunch of papers on those pages you can look at to see how other people built similar tools, too.
I believe the Windows support is maintained, but I have not used it myself.

Dealing with illogical managers [closed]

At a place I used to work, the typical response to any problem was to blame the hardware or the users for not using the system perfectly. Prior to that job I had adopted the philosophy that it's my fault until I can prove otherwise (and so far, at least 99 times out of 100, it's been correct).
One of the last "unsolvable" problems when I was there was an abundance of database timeouts. After months of research, I still only had theories but couldn't prove any of them. One of my developers adamantly suggested replacing the network (every router, switch, access point) but couldn't provide any evidence that the network was the cause; it was, however, "obviously the cause" according to my manager (no development/IT experience), so he took over the problem. One caveat and Fog Creek plug: he couldn't account for the fact that the error reporting via FogBugz worked perfectly, against the same SQL Server as the rest of the data.
A couple of timeout-free months later, my manager boasted that he had fixed the timeouts ("Look, no timeouts!"). I had to hold back from grabbing a rock and saying "Look, no tigers!", but I did ask how he knew they would have occurred otherwise, to which I got no response. The timeouts did return (and in greater numbers) a couple of months later.
I'm pretty content with how I handled the situation but I'm curious how the SO crowd would have responded to letting a superior/colleague implement a solution you know (or are very sure) is wrong and will likely waste thousands of dollars?
Let them, but at the same time continue searching for the real cause.
A couple thousand dollars is money well spent if it keeps me from going against that kind of thinking (which is futile).
Well, if the problem is upper management, then I would do as you have done - lodge your complaint, then follow instructions. If it turns out they were right (it happens every now and then) then you look like a good employee despite your misgivings. If it turns out you were right, then they might be more willing to listen to you given that you allowed them their turn.
This is, of course, optimistic.
In the case of a colleague, take the problem up a level and consult your superiors for advice on how to approach the subject. Be fair to both your own perspective and that of your colleague, then follow the advice you're given.
Sometimes it's best to leave a manager be. If you think about his pressures and responsibilities, he had to be seen to be doing something rather than "nothing". After enough time, "investigating" looks like nothing to the outside parties who need the timeouts to stop.
By taking an action he creates an opportunity to keep researching. The trick is to find a way to put your solutions in his context: here is something we can do now, and here's what we can continue to do. For example, "We can replace the networking gear as a precautionary step, and then look at the version control logs to rule out that possibility."
This gives him something proactive so he can look productive up his chain while achieving the solution you want which will ultimately be successful.
In the long term you should look to work for someone who trusts your technical decisions implicitly, whom you can talk with candidly, and who will help you navigate the politics in a way where you both know what's going on. If your manager isn't that person, move.
Is this a big problem? It's not your job to save your company dollars, other than that you would like your company to remain solvent so you get paid.
If it's just one manager, he will be weeded out sooner or later; if your entire company culture is like this, maybe it would be time to move on.
In the mean time, see if you can see this from your manager's perspective.
I'd consider your manager's intent to be a good thing. It's the people who don't want to bother that I find more difficult. It's best to find a way to use that energy to be helpful.
One common problem for lots of people (occasionally myself) is that they flail around when trying to diagnose a problem. If you're wildly guessing at it, then, particularly with modern computers, you have only the slimmest possibility of being right. Approaching this type of problem with that type of attitude generally means you'll never fix it.
The best way of handling complex debugging is divide and conquer. In this case, first think up a test, then implement it. Did the test act as expected? Depending on where you are with your tests, you are getting closer to or farther away from the problem. The key is that ALL of the tests must result in some concrete (objective) behavior. If the results are ambiguous, the test is useless.
If you're getting a disconnect in part of the system, but some other part is not, then you have a huge amount of valuable information (it also shows it's not the network). What's the difference between the parts? Just start descending until you get somewhere ...
Getting back to your manager. Whenever I encounter that type of personality problem, I try to redirect the energy into something more useful. The desire is there; it just needs some help being shaped. If you can convince your manager to make sure the tests are concrete, then if he runs enough of them, they'll produce enough information to correctly guess the bug. Sure, a more systematic approach might be faster, but why turn down free assistance? I generally feel there is a useful role for anyone on any project; it's all about making it possible to harness their efforts.
Paul.

What is the most critical piece of code you have written and how did you approach it?

Put another way: what code have you written that cannot fail? I'm interested in hearing from those who have worked on projects dealing with heart monitors, water testing, economic fundamentals, missile trajectories, or the O2 concentration on the space shuttle.
How did you prepare for writing this sort of code: methodologically, intellectually, and emotionally?
Edit
I've marked this wiki in case the rep issue is keeping people from replying. I thought there would be a good deal more perspective on this issue than there has been.
While I am not personally involved in what is described there, this article will hopefully contribute to the spirit of your question: They Write the Right Stuff.
I wrote a driver for a blood pressure measuring device for hospital use. If it "fails", the patient will not have his blood pressure checked at the scheduled time; if his blood pressure is abnormal, no alarm (in the larger system) will be triggered. Such an event could be clinically significant.
My approach was to thoroughly read the spec/documentation in a non-work environment (to avoid the temptation to start coding right away), then read it again at work. After that, I summarized the possible states and actions on paper and "flowcharted" an algorithm, and annotated all the potential real-world "bad events" (cables getting unplugged, batteries dying, etc). Finally, I wrote and rewrote the driver three times, each with different mechanisms (e.g. FSM), and compared their results. Each iteration helped me identify weaknesses I hadn't yet discovered. The third rewrite was the "official" result. I reviewed each iteration with my co-worker.
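As a rough illustration of the kind of table-driven state machine described above (the states, events, and transitions here are invented, not the actual driver's):

    # Hypothetical states and events for a measurement driver, modelled as a
    # simple table-driven finite state machine.
    TRANSITIONS = {
        ("idle",      "start_measurement"): "inflating",
        ("inflating", "cuff_pressurised"):  "measuring",
        ("measuring", "reading_complete"):  "reporting",
        ("reporting", "report_sent"):       "idle",
        # Every state must handle the real-world "bad events" explicitly.
        ("inflating", "cable_unplugged"):   "fault",
        ("measuring", "cable_unplugged"):   "fault",
        ("measuring", "battery_low"):       "fault",
    }

    def step(state: str, event: str) -> str:
        """Return the next state; unknown (state, event) pairs go to 'fault'."""
        return TRANSITIONS.get((state, event), "fault")

    state = "idle"
    for event in ["start_measurement", "cuff_pressurised",
                  "reading_complete", "report_sent"]:
        state = step(state, event)
    print(state)  # "idle" again after a successful cycle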
Emotional preparation consisted of convincing myself that should the unthinkable happen, at least I wasn't willfully negligent -- just incompetent (the old "I'm only human" excuse). ;-)
I have written a computer interface to an MRI machine. It had no chance of hurting the end user, as it was just record management, but it could potentially have given an incorrect diagnosis or omitted important information.
Tests, lots and lots of tests.
Unit tests, mid and high level tests. Simulate all possible input combinations. Also a great deal of testing with the hardware itself. Testing must be done in a complete and methodical way. It should take a great deal more time to test than to write.
Error Reporting
All errors must be reported and be obvious. If it won't hurt the patient to do so, fail fast.
For something that is actively keeping a person alive, things are even worse. It must never stop working. If it fails, it needs to restart and keep trying. Redundant internals are also a must in case the hardware fails.
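A minimal sketch of the "restart and keep trying" idea (the task function, retry delay, and logging choices are all placeholders):

    import time
    import logging

    def supervise(task, retry_delay: float = 1.0):
        """Run `task` forever; if it raises, log the failure and restart it."""
        while True:
            try:
                task()
            except Exception:
                logging.exception("task failed; restarting")
                time.sleep(retry_delay)  # brief back-off before trying again

    # Usage (placeholder task):
    # supervise(lambda: run_control_loop())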
At the wrong company it can really be a difficult kind of situation to work in. However, if things are going well, you are well funded, and release pressure is not high, it can be a very rewarding space to work in.
Not really an answer, but:
I've got a friend who writes embedded control software for laser eye surgery machines. When he had laser eye surgery himself, he made sure to go to an ophthalmologist who used his company's system. I have great admiration for this guy. I can't think of a piece of software I've ever written whose level of quality was high enough that I'd trust my own eyesight to it.
Right now I'm working on some base code for a system that retrieves medical patient information from clinics and hospitals for a medical billing office. We're starting out with a smaller client and a long break-in period to ensure quality, but eventually this code needs to securely handle a large variety of report formats from a number of clients at different facilities.
It's not quite in the same scale as your examples, but a bad mistake could result in the wrong people being billed or the right person billed to a defunct address (screwing up credit reports) or open people up to identity theft, so it's still pretty critical. Oh yeah, and it could mean doctors don't get paid quite as quick. That's important, too, especially from a business perspective, but not in the same class as data protection and integrity.
I've heard crazy stories of the processes used to write code at NASA for the space shuttles. Every line of code has about 10-20 lines of documentation, along with tests, a full revision history, etc. Every time a bug is found, not only is the code evaluated and repaired, but the entire procedure of writing code, the entire command chain, etc. is reviewed to answer the question: "What went wrong in our process that allowed this bug to be included in the first place?"
While nothing quite so important as an MRI machine or a blood pressure monitor, I did get tapped to do a rewrite of Blackjack when I worked for an online gambling provider. Blackjack is by far the most popular online game, and millions of dollars were going to go through this software (and did).
I wrote the game engine separate from the server and the client, and used Test Driven Development to ensure that what I was assuming was coming through in the results. I also had a wrapper "server" that had console output that would allow me to play. This was actually only useful in that it mimicked the real server interface, since playing a text version of blackjack isn't very fun or easy ("You draw a 10. You now have a 10 and a 6, while the dealer has a 6 showing. [bsd] >")
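As a hedged illustration of the TDD angle (this hand-scoring function and its tests are invented for the example, not the actual engine):

    def hand_value(cards):
        """Best blackjack value of a hand; aces count as 11 unless that busts."""
        value = sum(10 if c in ("J", "Q", "K") else 11 if c == "A" else c
                    for c in cards)
        aces = cards.count("A")
        while value > 21 and aces:
            value -= 10  # demote an ace from 11 to 1
            aces -= 1
        return value

    # Tests written first, engine code written to make them pass.
    assert hand_value([10, 6]) == 16
    assert hand_value(["A", "K"]) == 21      # natural blackjack
    assert hand_value(["A", "A", 9]) == 21   # one ace demoted
    assert hand_value(["K", "Q", 5]) == 25   # bust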
The game is still being run on some sites out there, and to my knowledge, has never had any financial bugs after years of play.
My first "real" software job was writing a GUI app for planning stereotactic brain surgery. Testing, testing, testing... absolutely no formal methods, engineering-style thoughts, just younger programmers cranking it out. When they started talking about using the software to control a robotic arm with a laser, without any serious engineering methods in place, i got a bit worried, left for more officey lands.
I've created information system application for local government cultures and tourism department in Bali island which were installed in several tourism denstinations, providing extensive informations about the culture, maps, accomodations etc.
if it failed then probably tourists couldnt get the right informations they need most, cheat by brookers, or lost somewhere :)
