Linux module: being notified about task creation and destruction - linux-security-module

For Mach kernel API emulation on Linux, I need my kernel module to be called when a task has just been created or is being terminated.
In my kernel module, this could most nicely be done via Linux Security Modules, but a couple of years ago external modules were prevented from acting as an LSM when the needed symbols were unexported.
The only other way I could find was to make my module act like a rootkit: find the syscall table and hook into it.
Patching the kernel is out of the question. I need my app to be installed easily. Is there any other way?

You can use Kprobes, which let you dynamically hook into code in the kernel. You will need to find the right function, among the ones involved in creating and destroying processes, that gives you the information you need. For task creation, do_fork() in fork.c would be a good place to start (the name depends on the kernel version: newer kernels call it _do_fork() and, later still, kernel_clone(), so check your kernel's source). For task destruction, look at do_exit(). You would want to write a retprobe, which is a kind of kprobe that additionally gives you control at the end of the function's execution, just before it returns. The reason you want control at that point is that you can check the return value to see whether the process was actually created. If there was an error, the function will return a negative value or, in some cases, possibly 0.
You would do this by creating a kretprobe struct:
static struct kretprobe do_fork_probe = {
    /* both handlers already have the kretprobe handler signature, so no cast is needed */
    .entry_handler = my_do_fork_entry,
    .handler = my_do_fork_ret,
    .maxactive = 20,
    .data_size = sizeof(struct do_fork_ctx)
};
my_do_fork_entry gets executed when control enters the hooked function, and my_do_fork_ret gets executed just before it returns. You would hook it in as follows:
do_fork_probe.kp.addr =
(kprobe_opcode_t *) kallsyms_lookup_name("do_fork");
if ((ret = register_kretprobe(&do_fork_probe)) <0) {
// handle error
}
In the implementation of your hooks, it's a bit unwieldy to get the arguments and return value. You get these via the saved registers pt_regs data structure. Let's look at the return hook, where on x86 you get the return value via regs->ax.
static int my_do_fork_ret(struct kretprobe_instance *ri, struct pt_regs *regs)
{
    struct do_fork_ctx *ctx = (struct do_fork_ctx *) ri->data;
    long ret = (long) regs->ax; // This is on x86; do_fork() returns a long
    if (ret > 0) {
        // It's not an error, probably a valid process
    }
    return 0; // the handler is declared to return an int
}
In the entry handler, you can get access to the arguments via the registers. For example, on x86-64, regs->di is the first argument, regs->si is the second, and so on. You can google to get the full list. Note that you shouldn't rely on these registers for the arguments in the return handler, as they may have been overwritten by other computations in the meantime.
You will surely have to jump through many hoops to get this working, but hopefully this note sets you off in the right direction.

Related

How can I get a custom python type and avoid importing a python module every time a C function is called

I am writing some functions for a C extension module for python and need to import a module I wrote directly in python for access to a custom python type. I use PyImport_ImportModule() in the body of my C function, then PyObject_GetAttrString() on the module to get the custom python type. This executes every time the C function is called and seems like it's not very efficient and may not be best practice. I'm looking for a way to have access to the python custom type as a PyObject* or PyTypeObject* in my source code for efficiency and I may need the type in more than one C function also.
Right now the function looks something like
static PyObject* foo(PyObject* self, PyObject* args)
{
PyObject* myPythonModule = PyImport_ImportModule("my.python.module");
if (!myPythonModule)
return NULL;
PyObject* myPythonType = PyObject_GetAttrString(myPythonModule, "MyPythonType");
if (!myPythonType) {
Py_DECREF(myPythonModule);
return NULL;
}
/* more code to create and return a MyPythonType instance */
}
To avoid retrieving myPythonType every function call I tried adding a global variable to hold the object at the top of my C file
static PyObject* myPythonType;
and initialized it in the module init function similar to the old function body
PyMODINIT_FUNC
PyInit_mymodule(void)
{
/* more initializing here */
PyObject* myPythonModule = PyImport_ImportModule("my.python.module");
if (!myPythonModule) {
/* clean-up code here */
return NULL;
}
// set the static global variable here
myPythonType = PyObject_GetAttrString(myPythonModule, "MyPythonType");
Py_DECREF(myPythonModule);
if (!myPythonType) {
/* clean-up code here */
return NULL;
}
/* finish initializing module and return it */
}
which worked, however I am unsure how to Py_DECREF the global variable whenever the module is finished being used. Is there a way to do that or even a better way to solve this whole problem I am overlooking?
First, just calling import each time probably isn't as bad as you think - Python does internally keep a list of imported modules, so the second time you call it on the same module the cost is much lower. So this might be an acceptable solution.
Second, the global variable approach should work, but you're right that it doesn't get cleaned up. This is rarely a problem because modules are rarely unloaded (and most extension modules don't really support it), but it isn't great. It also won't work with isolated sub-interpreters (which isn't much of a concern now, but may become more popular in future).
The most robust way to do it needs multi-phase initialization of your module. To quickly summarise what you should do:
You should define a module state struct containing this type of information,
Your module spec should contain the size of the module state struct,
You need to initialize this struct within the Py_mod_exec slot.
You need to create an m_free function (and ideally the other GC functions) to correctly decref your state during de-initialization.
Within a global module function, self will be your module object, and so you can get the state with PyModule_GetState(self)
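The steps above might be sketched roughly as follows. This is a minimal sketch rather than a drop-in implementation: the module name mymodule, the struct mymodule_state and the helper names mymodule_exec, mymodule_traverse, mymodule_clear and mymodule_free are made up for illustration, while my.python.module and MyPythonType are taken from the question. It assumes a Python version that supports multi-phase initialization (3.5 or later).

```c
#define PY_SSIZE_T_CLEAN
#include <Python.h>

/* Per-module state (hypothetical name): one copy per module instance. */
typedef struct {
    PyObject *my_python_type;   /* strong reference to MyPythonType */
} mymodule_state;

/* Module-level function: the first argument is the module object itself. */
static PyObject *
foo(PyObject *module, PyObject *args)
{
    mymodule_state *state = PyModule_GetState(module);
    if (state == NULL || state->my_python_type == NULL) {
        PyErr_SetString(PyExc_RuntimeError, "module state is not initialized");
        return NULL;
    }
    /* Placeholder body: build an instance with no constructor arguments. */
    return PyObject_CallObject(state->my_python_type, NULL);
}

/* Py_mod_exec slot: runs once per module instance and fills in the state. */
static int
mymodule_exec(PyObject *module)
{
    mymodule_state *state = PyModule_GetState(module);
    PyObject *py_module = PyImport_ImportModule("my.python.module");
    if (py_module == NULL)
        return -1;
    state->my_python_type = PyObject_GetAttrString(py_module, "MyPythonType");
    Py_DECREF(py_module);
    return state->my_python_type == NULL ? -1 : 0;
}

/* GC support: report the references held in the state. */
static int
mymodule_traverse(PyObject *module, visitproc visit, void *arg)
{
    mymodule_state *state = PyModule_GetState(module);
    if (state != NULL)
        Py_VISIT(state->my_python_type);
    return 0;
}

/* Drop the references; this is where the global-variable version leaked. */
static int
mymodule_clear(PyObject *module)
{
    mymodule_state *state = PyModule_GetState(module);
    if (state != NULL)
        Py_CLEAR(state->my_python_type);
    return 0;
}

static void
mymodule_free(void *module)
{
    mymodule_clear((PyObject *)module);
}

static PyMethodDef mymodule_methods[] = {
    {"foo", foo, METH_VARARGS, "Create a MyPythonType instance."},
    {NULL, NULL, 0, NULL}
};

static PyModuleDef_Slot mymodule_slots[] = {
    {Py_mod_exec, mymodule_exec},
    {0, NULL}
};

static struct PyModuleDef mymodule_def = {
    PyModuleDef_HEAD_INIT,
    .m_name = "mymodule",
    .m_size = sizeof(mymodule_state),   /* tells Python to allocate the state */
    .m_methods = mymodule_methods,
    .m_slots = mymodule_slots,
    .m_traverse = mymodule_traverse,
    .m_clear = mymodule_clear,
    .m_free = mymodule_free,
};

PyMODINIT_FUNC
PyInit_mymodule(void)
{
    /* Multi-phase init: return the definition, not a finished module. */
    return PyModuleDef_Init(&mymodule_def);
}
```

With this layout nothing is stored in a C global, so each module instance owns (and releases) its own reference to the type.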

Possible to create Graal native function callable from C without isolate?

I'd like to create a library, written in Java, callable from C, with simple method signatures:
int addThree(int in) {
return in + 3;
}
I know it's possible to do this with GraalVM if you do a little dance and create an Isolate in your C program and pass it in as the first parameter in every function call. There is good sample code here.
The problem is that the system I'm writing for, Postgres, can load C libraries and call functions in them, but I would have to create a wrapper function in C that would wrap every function I wanted to expose. This really limits the value of being able to slap something together in Java and use it in Postgres directly. I'd have to do something like this:
int myPublicAddThreeFunction(int in) {
graal_isolatethread_t *thread = NULL;
if (graal_create_isolate(NULL, NULL, &thread) != 0) {
fprintf(stderr, "error on isolate creation or attach\n");
return 1;
}
return SomeClassName_addThree_big_random_string_here(thread, in);
}
Is there a way, in Java alone, to expose a simple C function? I'm thinking I could create the isolate in a static method that gets loaded once on startup, somehow set it as the current isolate, and have the Java method just use it. Haven't been able to figure it out, though.
Also, it would be real nice not to have to append a big random string to every function name.

Testing for GVfs metadata support in C

I am trying to add support for per-directory viewing settings to the Thunar file browser of the Xfce desktop. So for example if a user chooses to view the contents of a directory as a list rather than as a grid of icons, this setting is remembered for that directory and will be used whenever that directory is viewed.
Now Thunar is built on GLib, and the mechanism we have chosen to use to implement this is to store metadata using GFile attributes, using methods like g_file_set_attributes_async to store keys with names such as "metadata::thunar-view-type". The per-directory feature can be turned on or off by the user via a checkbox in a preferences dialog. My knowledge of GIO and GLib is pretty limited, but I have now managed to get this all working as desired (you can see my merge request here if you are interested).
Now as I understand it, the functionality that I am using here relies on something called "GVfs metadata", and as I understand it this might not be available on all systems. On systems where GVfs metadata is not available, I want to turn this functionality off and in particular make the checkbox in the preferences dialog insensitive (i.e. greyed out). Thus I need to write a function to detect if gvfs metadata support is available, by which I mean whether I can use functions like g_file_set_attributes_async to successfully save metadata so that it will be available in future.
Thunar is written in C, so this function needs to be written in C using the C API for GLib, GIO, etc.
The function I have come up with (from much reading of API documentation, modifying code scraps I have found, and experimentation) is as follows.
gboolean
thunar_g_vfs_metadata_is_supported (void)
{
GDBusMessage *send, *reply;
GDBusConnection *conn;
GVariant *v1, *v2;
GError *error = NULL;
const gchar **service_names;
gboolean metadata_found;
/* connect to the session bus */
conn = g_bus_get_sync (G_BUS_TYPE_SESSION, NULL, &error);
/* check that the connection was opened successfully */
if (error != NULL)
{
g_error_free (error);
return FALSE;
}
/* create the message to send to list the available services */
send = g_dbus_message_new_method_call ("org.freedesktop.DBus",
"/org/freedesktop/DBus",
"org.freedesktop.DBus",
"ListNames");
/* send the message and wait for the reply */
reply = g_dbus_connection_send_message_with_reply_sync (conn, send, G_DBUS_SEND_MESSAGE_FLAGS_NONE,
-1, NULL, NULL, &error);
/* release the connection and the sent message */
g_object_unref (send);
g_object_unref (conn);
/* check if we got a successful reply */
if (error != NULL)
{
g_error_free (error);
return FALSE;
}
/* extract the GVariant with the array of strings describing the available services */
v1 = g_dbus_message_get_body (reply); /* v1 belongs to reply and must not be freed */
if (v1 == NULL || !g_variant_is_container (v1) || g_variant_n_children (v1) < 1)
{
g_object_unref (reply);
return FALSE;
}
v2 = g_variant_get_child_value (v1, 0);
g_object_unref (reply);
/* check that the GVariant we have been given does contain an array of strings */
if (!g_variant_is_of_type (v2, G_VARIANT_TYPE_STRING_ARRAY))
{
g_variant_unref (v2);
return FALSE;
}
/* search through the list of service names to see if gvfs metadata is present */
metadata_found = FALSE;
service_names = g_variant_get_strv (v2, NULL);
for (int i=0; service_names[i] != NULL; i++)
if (g_strcmp0 (service_names[i], "org.gtk.vfs.Metadata") == 0)
metadata_found = TRUE;
g_free (service_names);
g_variant_unref (v2);
return metadata_found;
}
As you can see, this function uses DBus to query service names to see if the necessary service is available. Now, as far as I have been able to test it, this function works as I want it to. However, during a code review it has been questioned whether this can be done without relying on DBus (which might itself not be available even though GVfs metadata is).
Thus (at last!) my question: what is the best (i.e. most robust and accurate) way to test for GVfs metadata support via the C API for GLib, GIO, etc?. As I said above, by "GVfs metadata support" I mean "can I use functions like g_file_set_attributes_async to successfully save metadata so that it will be available in future?".
One method I have considered is looking at the list of running processes for the name "gvfsd-metadata", but that seems a bit kludgy to me.
Also, as mentioned above I am very much a novice with these technologies, so it is absolutely possible that I have misunderstood stuff here, so if you spot any errors in the assertions I have made above, please let me know.
Thanks!
(And yes, usual story, I'm a long time reader of SO & co, but a first time asker, so please feel free to edit or let me know if I've done something wrong/bad)
Call g_file_query_settable_attributes() and g_file_query_writable_namespaces() on the GFile, as described in the GFileInfo documentation:
However, not all attributes can be changed in the file. For instance, the actual size of a file cannot be changed via g_file_info_set_size(). You may call g_file_query_settable_attributes() and g_file_query_writable_namespaces() to discover the settable attributes of a particular file at runtime.

starting a process with exactly the same address structure as previous opening

Is it possible to start a process in windows with exactly the same address structure as the previous opening of the process?
To clarify the goal of this question I should mention that I use cheatengine (http://www.cheatengine.org/) to cheat some games! It includes several iterations to find a parameter (e.g. ammunition) and freeze it. However, each time I restart the game, since the memory structure of the game changes, I need to go through the time-consuming iterations again. So, if there were a method to bring up the game with exactly the same memory structure as before, I wouldn't need to go through the iterations.
Not to say it's impossible, but this is essentially too much work because of the dynamic memory allocation routines the process will be using, including the new operator and malloc(). Additionally, when the DLLs imported by the executable are loaded into memory they have a preferred image base, but if that address is already in use the OS will load them at a different memory location. On top of that, Address Space Layout Randomization (ASLR) can be enabled on the process, which is a security measure that randomizes the memory addresses of code sections.
The solution to your problem is much easier than what you're asking. To defeat the dynamic memory allocation described above, you can still resolve the correct address of a variable by utilizing:
Relative offsets from module bases
Multi-level pointers
Pattern Scanning
Cheat Engine has all 3 of these built into it. When you save an address to your table, it often saves it as a module + relative offset. You can pointer scan for the address and save it as a multilevel pointer, or reverse the pointer yourself and manually place it in the table. Pattern scanning is achieved by using a CE Script, which you can put right in the Cheat Table.
In this case the ammo variable may be a "static address", which means it's relative to the base address of the module. You may see it listed in Cheat Engine as "client.dll + 0xDEADC0DE" (a placeholder offset). You simply get the base address of the module at runtime and add the relative offset.
If you're looking to make an external hack in C++ you can get started like this.
In an external hack you do this by walking a ToolHelp32Snapshot:
uintptr_t GetModuleBase(const wchar_t* ModuleName, DWORD ProcessId) {
    // This structure contains lots of goodies about a module
    MODULEENTRY32 ModuleEntry = { 0 };
    // Grab a snapshot of all the modules in the specified process
    // (TH32CS_SNAPMODULE32 also lists 32-bit modules when called from a 64-bit process)
    HANDLE SnapShot = CreateToolhelp32Snapshot(TH32CS_SNAPMODULE | TH32CS_SNAPMODULE32, ProcessId);
    // On failure the API returns INVALID_HANDLE_VALUE, not NULL
    if (SnapShot == INVALID_HANDLE_VALUE)
        return 0;
    // You have to initialize the size, otherwise it will not work
    ModuleEntry.dwSize = sizeof(ModuleEntry);
    // Get the first module in the process
    if (!Module32First(SnapShot, &ModuleEntry)) {
        CloseHandle(SnapShot);
        return 0;
    }
    do {
        // Check if the module name matches the one we're looking for
        if (!wcscmp(ModuleEntry.szModule, ModuleName)) {
            // If it does, close the snapshot handle and return the base address
            CloseHandle(SnapShot);
            // Cast to uintptr_t rather than DWORD so 64-bit addresses are not truncated
            return (uintptr_t)ModuleEntry.modBaseAddr;
        }
        // Grab the next module in the snapshot
    } while (Module32Next(SnapShot, &ModuleEntry));
    // We couldn't find the specified module, so return 0
    CloseHandle(SnapShot);
    return 0;
}
To get the Process ID you would do:
bool GetPid(const wchar_t* targetProcess, DWORD* procID)
{
    HANDLE snap = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
    if (snap == INVALID_HANDLE_VALUE)
        return false;
    PROCESSENTRY32 pe;
    pe.dwSize = sizeof(pe);
    bool found = false;
    if (Process32First(snap, &pe))
    {
        do
        {
            if (!wcscmp(pe.szExeFile, targetProcess))
            {
                *procID = pe.th32ProcessID;
                found = true;
                break;
            }
        } while (Process32Next(snap, &pe));
    }
    // Close the snapshot on every path, including when nothing matched
    CloseHandle(snap);
    return found;
}
Using my example you would combine these functions and do:
DWORD procId;
if (GetPid(L"game.exe", &procId)) {
    uintptr_t modBaseAddr = GetModuleBase(L"client.dll", procId);
    uintptr_t ammoAddr = modBaseAddr + 0xDEADC0DE; // placeholder offset
}
If the address is not "static", you can find a pointer to it. The base address of the pointer must be static; you then follow the steps above, dereferencing each level of the pointer and adding the corresponding offset.
Of course I have a function for that too :)
uintptr_t FindDmaAddy(HANDLE hProcHandle, uintptr_t BaseAddress, uintptr_t Offsets[], int PointerLevel)
{
    uintptr_t pointerAddr = BaseAddress;
    uintptr_t pTemp = 0;
    for (int i = 0; i < PointerLevel; i++)
    {
        // Read the pointer stored at the current address
        // (sizeof(pTemp) assumes the target has the same bitness as this program)
        if (!ReadProcessMemory(hProcHandle, (LPCVOID)pointerAddr, &pTemp, sizeof(pTemp), NULL))
            return 0; // the read failed
        // Add this level's offset to get the next address
        pointerAddr = pTemp + Offsets[i];
    }
    return pointerAddr;
}
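The pointer-walking idea can also be tried without a second process or any Windows API. The following is a small sketch in portable C that resolves a two-level pointer inside the current process; resolve_chain, struct game, struct player and the variable names are all made up for illustration, and g_game merely plays the role of the "static" base pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for a game's data: a static pointer leads to a heap-like
 * object, which in turn points at the player holding the ammo count. */
struct player { int health; int ammo; };
struct game   { int frame; struct player *local_player; };

static struct player the_player = { 100, 30 };
static struct game   the_game   = { 0, &the_player };
static struct game  *g_game     = &the_game;   /* the "static" base pointer */

/* Follow a multi-level pointer: at each level, read the pointer stored
 * at the current address and add that level's offset to it. */
static uintptr_t resolve_chain(uintptr_t base, const size_t *offsets, size_t levels)
{
    assert(offsets != NULL || levels == 0);
    uintptr_t addr = base;
    for (size_t i = 0; i < levels; i++) {
        uintptr_t next;
        memcpy(&next, (const void *)addr, sizeof next);  /* dereference */
        addr = next + offsets[i];
    }
    return addr;
}
```

Starting from the address of g_game with the offsets offsetof(struct game, local_player) and offsetof(struct player, ammo), resolve_chain returns the address of the_player.ammo. An external tool does the same walk, except that each dereference is a ReadProcessMemory call.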
I would highly recommend watching some Youtube tutorials to see how it's done, much better explained in video format.

Caching streams in Functional Reactive Programming

I have an application which is written entirely using the FRP paradigm and I think I am having performance issues due to the way that I am creating the streams. It is written in Haxe but the problem is not language specific.
For example, I have this function which returns a stream that resolves every time a config file is updated for that specific section like the following:
function getConfigSection(section:String) : Stream<Map<String, String>> {
return configFileUpdated()
.then(filterForSectionChanged(section))
.then(readFile)
.then(parseYaml);
}
In the reactive programming library I am using called promhx each step of the chain should remember its last resolved value but I think every time I call this function I am recreating the stream and reprocessing each step. This is a problem with the way I am using it rather than the library.
Since this function is called everywhere parsing the YAML every time it is needed is killing the performance and is taking up over 50% of the CPU time according to profiling.
As a fix I have done something like the following using a Map stored as an instance variable that caches the streams:
function getConfigSection(section:String) : Stream<Map<String, String>> {
var cachedStream = this._streamCache.get(section);
if (cachedStream != null) {
return cachedStream;
}
var stream = configFileUpdated()
.filter(sectionFilter(section))
.then(readFile)
.then(parseYaml);
this._streamCache.set(section, stream);
return stream;
}
This might be a good solution to the problem but it doesn't feel right to me. I am wondering if anyone can think of a cleaner solution that maybe uses a more functional approach (closures etc.) or even an extension I can add to the stream like a cache function.
Another way I could do it is to create the streams beforehand and store them in fields that can be accessed by consumers. I don't like this approach because I don't want to make a field for every config section; I like being able to call a function with a specific section and get a stream back.
I'd love any ideas that could give me a fresh perspective!
Well, I think one answer is to just abstract away the caching like so:
class Test {
static function main() {
var sideeffects = 0;
var cached = memoize(function (x) return x + sideeffects++);
cached(1);
trace(sideeffects);//1
cached(1);
trace(sideeffects);//1
cached(3);
trace(sideeffects);//2
cached(3);
trace(sideeffects);//2
}
@:generic static function memoize<In, Out>(f:In->Out):In->Out {
var m = new Map<In, Out>();
return
function (input:In)
return switch m[input] {
case null: m[input] = f(input);
case output: output;
}
}
}
You may be able to find a more "functional" implementation for memoize down the road. But the important thing is that it is a separate thing now and you can use it at will.
You may choose to memoize(parseYaml) so that toggling two states in the file actually becomes very cheap after both have been parsed once. You can also tweak memoize to manage the cache size according to whatever strategy proves the most valuable.
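The idea of bounding the cache can be sketched as follows. This is a rough illustration in C rather than Haxe, and it is not part of promhx: the names memo, memo_init, memo_call, MEMO_CAPACITY and slow_square are all made up, keys are plain ints, and eviction is simple first-in-first-out, which is only one of many possible strategies.

```c
#include <assert.h>
#include <stddef.h>

#define MEMO_CAPACITY 4   /* maximum number of cached results */

typedef int (*int_fn)(int);

struct memo {
    int_fn fn;                     /* the expensive function being wrapped */
    int    keys[MEMO_CAPACITY];
    int    values[MEMO_CAPACITY];
    size_t used;                   /* number of valid entries */
    size_t next;                   /* slot to overwrite on the next miss */
};

static void memo_init(struct memo *m, int_fn fn)
{
    m->fn = fn;
    m->used = 0;
    m->next = 0;
}

/* Return the cached result for key, computing and storing it on a miss.
 * When the cache is full, the oldest entry is overwritten (FIFO). */
static int memo_call(struct memo *m, int key)
{
    assert(m != NULL && m->fn != NULL);
    for (size_t i = 0; i < m->used; i++)
        if (m->keys[i] == key)
            return m->values[i];

    int value = m->fn(key);
    m->keys[m->next] = key;
    m->values[m->next] = value;
    m->next = (m->next + 1) % MEMO_CAPACITY;
    if (m->used < MEMO_CAPACITY)
        m->used++;
    return value;
}

/* Example of an "expensive" function; calls counts real invocations. */
static int calls = 0;
static int slow_square(int x)
{
    calls++;
    return x * x;
}
```

Calling memo_call twice with the same key invokes slow_square only once, just as in the Haxe test above. Once more than MEMO_CAPACITY distinct keys have been seen, the oldest entry is recomputed on its next use.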
