How do the registers get saved when a process gets interrupted?

This has been bugging me all day. When a program sets itself up to call a function on receiving a certain interrupt, I know that the registers are pushed onto the stack when the program is interrupted. What I can't figure out is how the registers get back off the stack. As far as I know, the compiler doesn't know whether the function is an interrupt handler, and it can't know how many arguments the interrupt gave to the function. So how on earth are the registers restored?

It depends on the compiler, the OS and the CPU.
For low level embedded stuff, where an ISR may be called directly in response to an interrupt, the compiler will typically have some extension to the language (usually C or C++) that flags a given routine as an ISR, and registers will be saved and restored at the beginning and end of such a routine. [1]
For common desktop/server OSs though there is normally a level of abstraction between interrupts and user code - interrupts are normally handled first by some kernel code before being passed to a user routine, in which case the kernel code takes care of saving and restoring registers, and there is nothing special about the user-supplied ISR.
[1] E.g. Keil 8051 C compiler:
void Some_ISR(void) interrupt 0 // this routine will get called in response to interrupt 0
{
// compiler generates preamble to save registers
// ISR code goes here
// compiler generates code to restore registers and
// do any other special end-of-ISR stuff
}
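The desktop/server case described above can be sketched in plain C. This is a minimal illustration, assuming a typical Unix-like system where signals play the role of the "user routine" and the kernel saves and restores the interrupted context around the handler. The names `on_signal` and `demo_handler_runs` are made up for this example.

```c
#include <signal.h>

/* Flag written by the handler; sig_atomic_t is safe to set from a handler. */
static volatile sig_atomic_t got_signal = 0;

/* An ordinary C function: no special prologue or epilogue is needed here,
   because the OS (assumed) saves and restores the interrupted registers. */
static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Installs the handler, raises the signal, and reports whether it ran.
   Returns 1 if the handler ran, 0 if not, -1 on setup failure. */
int demo_handler_runs(void)
{
    void (*previous)(int);

    got_signal = 0;
    previous = signal(SIGTERM, on_signal);
    if (previous == SIG_ERR)
        return -1;
    if (raise(SIGTERM) != 0)
        return -1;
    signal(SIGTERM, previous); /* put the old disposition back */
    return got_signal;
}
```

Calling `demo_handler_runs()` from `main` returns 1 once the handler has run. The handler is compiled like any other function, which is the point of the paragraph above.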

Related

Does vkGetMemoryFdKHR return the same fd?

On Win32:
As far as I can tell, getMemoryWin32HandleKHR returns the same handle for the same memory no matter how many times it is called.
This is consistent with my understanding of the official Vulkan explanation: the handle refers to shared memory.
On Linux it doesn't seem to work properly.
In my program, getMemoryWin32HandleKHR works as expected: each distinct memory object gets a distinct handle, and the same memory object always returns the same handle.
With getMemoryFdKHR, however, different memory objects can return the same fd, and calling getMemoryFdKHR twice on the same memory object can return two different fds.
As a result, the device memory allocation fails during the subsequent import.
Why does this happen?
Thanks!
#ifdef _WIN32
texGl.handle = device.getMemoryWin32HandleKHR({ info.memory, vk::ExternalMemoryHandleTypeFlagBits::eOpaqueWin32 });
#else
VkDeviceMemory memory = VkDeviceMemory(info.memory);
int file_descriptor = -1;
VkMemoryGetFdInfoKHR get_fd_info{
VK_STRUCTURE_TYPE_MEMORY_GET_FD_INFO_KHR, nullptr, memory,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT
};
VkResult result = vkGetMemoryFdKHR(device, &get_fd_info, &file_descriptor);
assert(result == VK_SUCCESS);
texGl.handle = file_descriptor;
// texGl.handle = device.getMemoryFdKHR({ info.memory, vk::ExternalMemoryHandleTypeFlagBits::eOpaqueFd });
#endif
On Win32 the import works normally.
On Linux it fails: vkAllocateMemory returns VK_ERROR_OUT_OF_DEVICE_MEMORY.
#ifdef _WIN32
VkImportMemoryWin32HandleInfoKHR import_allocate_info{
VK_STRUCTURE_TYPE_IMPORT_MEMORY_WIN32_HANDLE_INFO_KHR, nullptr,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT, sharedHandle, nullptr };
#elif __linux__
VkImportMemoryFdInfoKHR import_allocate_info{
VK_STRUCTURE_TYPE_IMPORT_MEMORY_FD_INFO_KHR, nullptr,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT,
sharedHandle};
#endif
VkMemoryAllocateInfo allocate_info{
VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO, // sType
&import_allocate_info, // pNext
aligned_data_size_, // allocationSize
memory_index };
VkDeviceMemory device_memory=VK_NULL_HANDLE;
VkResult result = vkAllocateMemory(m_device, &allocate_info, nullptr, &device_memory);
NVVK_CHECK(result);
I think it has something to do with the fd.
In one test I fetched the fd twice and passed the second fd to vkAllocateMemory, which then worked, but I don't think that is correct.
The fd obtained this way differs from the first one, because every call returns a different fd.
That makes it impossible for me to tell them apart, and a later vkAllocateMemory with the following fd still fails.
So I can't rely on this test.
I still think it should behave like Win32: vkAllocateMemory should succeed with the fd obtained the first time.
Thanks very much!
The Vulkan specifications for the Win32 handle and POSIX file descriptor interfaces explicitly state different things about their importing behavior.
For HANDLEs:
Importing memory object payloads from Windows handles does not transfer ownership of the handle to the Vulkan implementation. For handle types defined as NT handles, the application must release handle ownership using the CloseHandle system call when the handle is no longer needed.
For FDs:
Importing memory from a file descriptor transfers ownership of the file descriptor from the application to the Vulkan implementation. The application must not perform any operations on the file descriptor after a successful import.
So HANDLE importation leaves the HANDLE in a valid state, still referencing the memory object. File descriptor importation claims ownership of the FD, leaving it in a state where you can no longer use it.
What this means is that the FD may have been released by the internal implementation. If that is the case, later calls to create a new FD may use the same FD index as a previous call.
The safest way to use both of these APIs is to have the Win32 version emulate the functionality of the FD version. Don't try to do any kinds of comparisons of handles. If you need some kind of comparison logic, then you'll have to implement it yourself. When you import a HANDLE, close it immediately afterwards.
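The FD-number reuse described above can be seen with ordinary POSIX descriptors, without Vulkan. The sketch below is only an analogy: `dup` and `close` stand in for exporting an FD and for the implementation releasing it after import, and the function names are made up for this example.

```c
#include <unistd.h>

/* Returns 1 if a descriptor number is handed out again after being closed,
   0 if not, -1 on setup failure. */
int fd_number_reused_after_close(void)
{
    int pipe_fds[2];
    int first, second, same;

    if (pipe(pipe_fds) != 0)
        return -1;
    first = dup(pipe_fds[0]);   /* "export" a descriptor */
    if (first < 0)
        return -1;
    close(first);               /* the new owner releases it */
    second = dup(pipe_fds[0]);  /* "export" again */
    same = (first == second);   /* POSIX hands out the lowest free number */
    close(second);
    close(pipe_fds[0]);
    close(pipe_fds[1]);
    return same;
}

/* Returns 1 if two live descriptors for the same object have different
   numbers, 0 if not, -1 on setup failure. */
int two_live_fds_differ(void)
{
    int pipe_fds[2];
    int a, b, differ;

    if (pipe(pipe_fds) != 0)
        return -1;
    a = dup(pipe_fds[0]);
    b = dup(pipe_fds[0]);
    differ = (a >= 0 && b >= 0 && a != b);
    close(a);
    close(b);
    close(pipe_fds[0]);
    close(pipe_fds[1]);
    return differ;
}
```

In a single-threaded process both functions return 1. The number of an FD therefore says nothing about which underlying object it refers to, which is why comparing FD values is not a reliable identity check.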

How to pass native void pointers to a Dart Isolate - without copying?

I am working on exposing an audio library (a C library) to Dart. Triggering the audio engine requires a few initialization steps (non-blocking for the UI); audio processing is then started with a perform function, which is blocking (audio processing is a heavy task). That is why I started reading about Dart isolates.
My first thought was that I only needed to call the perform function in the isolate, but that doesn't seem possible, since the perform function takes the engine state as its first argument, and this engine state is an opaque pointer ( Pointer in dart:ffi ). When I try to pass the engine state to a new isolate with the compute function, the Dart VM returns an error: it cannot pass C pointers to an isolate.
I could not find a way to pass this data to the isolate. I assume this is because the main isolate and the one I'm creating have separate memory.
So, I should probably manage the entire engine state in the isolate which means :
Create the engine state
Initialize it with some options (strings)
trigger the perform function
control audio at runtime
I couldn't find any example of how to perform these actions in the isolate while triggering them from the main thread/isolate, nor of how to manage isolate memory (keep the engine state and use it). Of course I could do
Here is a non-isolated example of what I want to do :
Pointer<Void> engineState = createEngineState();
initEngine(engineState, parametersString);
startEngine(engineState);
perform(engineState);
And at runtime, triggered by UI actions (like slider value changed, or button clicked) :
setEngineControl(engineState, valueToSet);
double controleValue = getEngineControl(engineState);
The engine state could be encapsulated in a class, I don't think it really matters here.
Whether it is a class or an opaque datatype, I can't work out how to manage and keep this state and trigger actions from the main thread that are processed in the isolate. Any ideas?
Thanks in advance.
PS: I notice, while writing, that my question/explanation may not be precise. I have to say I'm a bit lost here, since I have never used Dart isolates. Please tell me if some information is missing.
EDIT April 24th :
Creating and managing the object state inside the isolate seems to work, but the main problem isn't solved. Because the perform method blocks until it completes, the isolate cannot receive messages in the meantime.
An option I thought first was to use the performBlock method, which only performs a block of audio samples. Like this :
while(performBlock(engineState)) {
// listen messages, and do something
}
But this doesn't seem to work: the process is still blocked until the audio performance finishes. Even if this loop is called in an async method in the isolate, it blocks, and no messages are read.
I now think about the possibility to pass the Pointer<Void> managed in main isolate to another, that would then be the worker (for perform method only), and then be able to trigger some control methods from main isolate.
The isolate Dart package provides a registry sub library to manage some shared memory. But it is still impossible to pass void pointer between isolates.
[ERROR:flutter/lib/ui/ui_dart_state.cc(157)] Unhandled Exception: Invalid argument(s): Native objects (from dart:ffi) such as Pointers and Structs cannot be passed between isolates.
Has anyone already met this kind of situation ?
You can get the address a Pointer points to as a number and construct a new Pointer from that address (see Pointer.address and Pointer.fromAddress()). Since numbers can be passed freely between isolates, this can be used to pass native pointers between them.
In your case that could be done, for example, like this (I used Flutter's compute to keep the example simple, but it should presumably work with explicit Send/ReceivePorts as well):
import 'dart:ffi';
import 'package:flutter/foundation.dart'; // for compute

// createEngineState, initEngine, startEngine and perform are assumed to be
// the FFI bindings from the question.

// Callback to be used in a background isolate.
// Returns the address of the new engine.
// Named createEngine so it does not clash with the native initEngine binding.
int createEngine(String parameters) {
  Pointer<Void> engineState = createEngineState();
  initEngine(engineState, parameters);
  startEngine(engineState);
  return engineState.address;
}

// Callback to be used in a background isolate.
// Does whichever processing is needed using the given engine.
void processWithEngine(int engineStateAddress) {
  final engineState = Pointer<Void>.fromAddress(engineStateAddress);
  perform(engineState);
}

Future<void> main() async {
  // Initialize the engine in a background isolate.
  // compute returns a Future, so the address has to be awaited.
  final address = await compute(createEngine, "parameters");
  final engineState = Pointer<Void>.fromAddress(address);
  // Do some heavy computation in a background isolate using the engine.
  compute(processWithEngine, engineState.address);
}
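Seen from the native side, the address trick is just an integer round trip. The C sketch below illustrates the idea with a hypothetical `EngineState` struct and made-up helper names; the real engine state in the question is opaque, so the fields here are assumptions for illustration only.

```c
#include <stdint.h>

/* Hypothetical stand-in for the opaque engine state. */
typedef struct {
    int running;
    double control;
} EngineState;

/* What Pointer.address exposes: the pointer value as a plain integer. */
uintptr_t engine_address(EngineState *engine)
{
    return (uintptr_t)(void *)engine;
}

/* What Pointer.fromAddress does: rebuild the pointer from the integer. */
EngineState *engine_from_address(uintptr_t address)
{
    return (EngineState *)(void *)address;
}

/* A "worker" that only ever receives the integer, as an isolate would. */
double worker_set_control(uintptr_t address, double value)
{
    EngineState *engine = engine_from_address(address);
    engine->control = value;
    return engine->control;
}
```

Both sides end up touching the same native memory, so the isolates are no longer isolated with respect to the engine. Any state that the main isolate and the worker access concurrently needs to be thread-safe on the C side.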
I ended up doing the processing of callbacks inside the audio loop itself.
while(performAudio())
{
tasks.forEach((String key, List<int> value) {
double val = getCallback(key);
value.forEach((int element) {
callbackPort.send([element, val]);
});
});
}
Here 'val' is the value you want to send to the callback, and the list of ints 'value' is a list of callback indices.
Say your audio loop runs with a vector size of 512 samples: you can then fire your callbacks after every 512 processed audio samples, which means 48000 / 512 ≈ 94 times per second (assuming your sample rate is 48000). This method is not the best one, but it works. I still have to see whether it holds up under very intensive processing. It was designed for realtime audio, but it could work the same way for audio rendering.
You can see the full code here : https://framagit.org/johannphilippe/csounddart/-/blob/master/lib/csoundnative.dart
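The callback rate quoted above is simple arithmetic, worked out here in C as a small sketch. The function names are made up for this example.

```c
/* Number of times per second the loop body runs when each iteration
   processes block_size samples at the given sample rate. */
double callbacks_per_second(unsigned sample_rate, unsigned block_size)
{
    return (double)sample_rate / (double)block_size;
}

/* Time between two callback opportunities, in milliseconds. */
double callback_period_ms(unsigned sample_rate, unsigned block_size)
{
    return 1000.0 * (double)block_size / (double)sample_rate;
}
```

For 48000 Hz and 512 samples this gives 93.75 callbacks per second, or one opportunity roughly every 10.7 ms. A smaller vector size gives finer control granularity at the cost of more loop iterations.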

Potential Impact of Violating ZwOpenKey Should Only Be Called at IRQL = PASSIVE_LEVEL

When Driver Verifier is enabled on a third-party data loss prevention driver, it triggers a Driver Verifier bugcheck based on the IrqlZwPassive rule.
The crash includes the following information:
ZwOpenKey should only be called at IRQL = PASSIVE_LEVEL.
What are some of the potential impacts to a Windows system if ZwOpenKey is used outside of IRQL=PASSIVE_LEVEL?
Is this always a serious problem that we should raise with the vendor, or only in certain scenarios?
All Zw APIs in the kernel must be called only at PASSIVE_LEVEL; this is by design. Calling one at APC_LEVEL is already undefined behavior: sometimes it works, sometimes it produces a hang or a crash. Take ZwOpenKey: the registry manager may need to read key data from disk if it is not yet in memory, so it passes an IRP to the file system and waits for it to complete. To complete the IRP, however, a special APC (IopCompleteRequest) may be queued to the calling thread. If the thread is at APC_LEVEL, that APC will not run until the thread's IRQL drops to PASSIVE_LEVEL. That never happens, because the thread is waiting for the IRP to complete.
Another point: on exit from a Zw service, the system checks whether UserApcPending is set in the thread and, if so, raises the IRQL to APC_LEVEL, initiates the user APC, and lowers the IRQL back to PASSIVE_LEVEL (the system assumes the Zw routine was called at PASSIVE_LEVEL). So you can enter a Zw API at APC_LEVEL and leave it at PASSIVE_LEVEL. Then ask why the thread was at APC_LEVEL in the first place. Was the IRQL raised for no reason, or does something require APC_LEVEL at that point? If so, what happens when the situation requires staying at APC_LEVEL but the thread drops to PASSIVE_LEVEL ahead of time? That really is undefined behavior. In most cases nothing happens, but in some cases it can be a very nasty bug that is very hard to catch and investigate.

iOS: modify registers to call a function

I connect to the iPhone's debugserver and am able to send GDB Serial Protocol packets. I can set a breakpoint and wait until it is reached. When it is, I want to call objc_msgSend with known parameters, get its output, and continue execution. For now I am simulating the process in Xcode and lldb, so I cannot just use 'call objc_msgSend(object, _cmd)'.
What I do:
set breakpoint to some code
register read pc // read next operation address
register write lr 0x0000253a // set return address to continue execution (pc value)
register write pc 0x30300c88 // my objc_msgSend address
register write r0 0x16ed30 // my object address
register write r1 0x3161 // my selector address
breakpoint set -a 0x0000253a
continue
My method does get called, but then the app crashes and never reaches my 'return address' 0x0000253a. The call also overwrites r0 with the return value, so my approach is clearly incomplete. I understand that I am overwriting registers without storing and restoring their previous values, so please help. How can I store and restore the register state? What am I doing wrong, or which necessary steps am I missing?
It would also be very helpful to trace what Xcode's debugger does during 'call objc_msgSend'. I tried to use this code and fruitstrap to run dtruss and then study its output, but it contained thousands of memory reads and breakpoint sets, which was useless to me.
Note: I can only use the GDB Serial Protocol.
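Since the question is about saving and restoring register state over the GDB Serial Protocol, here is a hedged sketch of one possible approach. Read the whole register file with a `g` packet before touching anything, keep the hex reply, and send it back in a `G` packet once the injected call has returned. Whether a given stub (debugserver included) accepts `g`/`G` rather than only the per-register `p`/`P` packets is an assumption to check. The C helper below, `rsp_frame` (a made-up name), only does the packet framing: `$`, payload, `#`, and a two-digit hex checksum that is the sum of the payload bytes modulo 256.

```c
#include <stdio.h>
#include <string.h>

/* Frames a payload as "$<payload>#<checksum>".
   Returns the packet length, or -1 if the buffer is too small.
   Note: payloads containing '$', '#' or '}' would need escaping,
   which this sketch does not do. */
int rsp_frame(const char *payload, char *out, size_t out_size)
{
    size_t len = strlen(payload);
    unsigned sum = 0;
    size_t i;

    if (out_size < len + 5) /* '$' + payload + '#' + 2 hex digits + NUL */
        return -1;
    for (i = 0; i < len; ++i)
        sum += (unsigned char)payload[i];
    return snprintf(out, out_size, "$%s#%02x", payload, sum & 0xffu);
}
```

`rsp_frame("g", buf, sizeof buf)` produces `$g#67`. The saved reply to that packet can later be sent back as the payload `G` followed by the same hex string, which restores every register, including r0, lr and pc, in one step.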

trouble reading from __global memory after atom_inc in OpenCL

OpenCL doesn't have a global barrier that will stop all threads, so I'm trying to create a workaround with the following code:
void barrier(__global uint* scratch) {
uint nThreads = get_global_size(0);
atom_inc(scratch);
/* this loop never terminates */
while(scratch[0] < nThreads) {
continue;
}
}
The idea is that each thread loops until all of them increment that one piece of memory.
However, the value read from scratch[0] never changes for the threads once it's been read, and it loops forever. I know it's being incremented because it's the correct value when I read it back to the host.
Is the global memory being locally cached? What's going on here?
Found the problem: the order in which work groups are executed is implementation defined. This means that some threads might start only after others have finished.
In the code I gave, the work groups that are started first will loop forever waiting on the others to hit the 'barrier'. And the work groups that would be started later won't ever start because they're waiting for the first ones to finish.
If the implementation (I'm on a Radeon 5750, using Stream SDK 2.2) executes all work groups concurrently, then it probably wouldn't be an issue. But that's not the case for my setup.
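The deadlock can be reproduced with a toy model in plain C (not OpenCL). It assumes a device that can keep only `slots` work groups resident at once and never swaps out a group that is spinning. Both `slots` and the function names are hypothetical, and each work group is treated as a single counting unit for simplicity.

```c
/* Highest value the shared counter reaches before everything stalls.
   Each resident group increments once, then spins and never frees its slot. */
unsigned counter_at_stall(unsigned n_groups, unsigned slots)
{
    unsigned counter = 0;
    unsigned resident = 0;
    unsigned g;

    for (g = 0; g < n_groups; ++g) {
        if (resident == slots)
            break;      /* no free slot: the remaining groups never start */
        ++resident;
        ++counter;      /* models atom_inc(scratch) */
    }
    return counter;
}

/* The spin loop exits only if every group got to increment the counter. */
int barrier_completes(unsigned n_groups, unsigned slots)
{
    return counter_at_stall(n_groups, slots) >= n_groups;
}
```

With 8 groups and 4 slots the counter stops at 4 and the barrier never completes; with 4 groups and 8 slots it does. Under this model the workaround is only safe when the whole launch fits on the device at once, which matches the observation above.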
