Does calling convention affect a context switch? - calling-convention

Does it matter what my calling convention was for doing a context switch. As in, in AMD64, the first 4 parameters are passed via registers or something.
Does the context switch system need to worry about these details?

A context switch needs to make sure that all thread context is saved: the stack, the CPU registers, and some additional OS-specific stuff.
Since the context switch is saving everything, it does not need to know about the calling convention. It is saving the registers, regardless of whether they happen to hold parameters of the current function, or some other data.

Related

How to avoid an entire call stack being declared MainActor because a low-level function needs it?

I have an interesting query with regard to #MainActor and strict concurrency checking (-Xfrontend -warn-concurrency -Xfrontend -enable-actor-data-race-checks)
I have functions (Eg, Analytics) that at the lowest level require access to the device screen scale UIScreen.main.scale which is isolated to MainActor. However I would prefer not to have to declare the entire stack of functions above the one that accesses scale as requiring MainActor.
Is there a way to do this, or do I have no other options?
How would be the best way to ensure my code only ever calls UIScreen once and keeps the result available for next time without manually defining a var and checking if its nil? Ie is there a kind of computed property that will do this?
Edit: Is there an equivalent of this using MainActor (MainActor.run doesn't do the same thing; it seems to block synchronously):
DispatchQueue.main.async {
Thanks,
Chris
Non-UI code should not rely directly on UIScreen. The scale (for example), should be passed as a parameter, or to actors in their init. If the scale changes (which it can, when screens are added or removed), then the new value should be sent to the actor. Or the actor can observe something that publishes the scale when it changes.
The key point is accessing UIScreen from a random thread is not valid for a reason. The scale can in fact change at any time. Reading it from an actor is and should be an async call.
It sounds like you have some kind of Analytics actor. The simplest implementation of this would be to just pass the scale when you create it.

Rails class method is thread safe?

I have this method in a class:
class User < ApplicationRecord
...
def answers
#answers ||= HTTParty.get("http://www.example.com/api/users/#{self.id}/answers.json")
end
...
end
Since I'm using Puma as a web server I'm wondering if this code is thread safe? can someone confirm it and if possible explain why this is thread safe?
This in an instance method, not to be confused with a class method. The answers method is on an instance of User, as opposed to being on the User class itself. This method is caching the answers on the instance of a User, but as long as this User instance is being instantiated with each web request (such as a User.find()or User.find_by()), you’re fine because the instance is not living between threads. It’s common practice to look records up every web request in the controller, so you’re likely doing that.
If this method was on the User class directly (such as User.answers), then you’d need to evaluate whether it’s safe for that cached value to be maintained across threads and web requests.
To recap, the your only concern for thread safety is class methods, class variables (instance variables that use two at signs such as ##answers), and instance methods where the instance lives on past a single web request.
If you ever find yourself needing to use a class-level variable safely, you can use Thread.current, which is essentially a per-thread Hash (like {}) that you can store values in. For example Thread.current[:foo] = 1 would be an example. ActiveSupport uses this when setting Time.zone.
Alternatively you may find times where you need a single array that you need to safely share across threads, in which case you’d need to look into Mutex, which basically lets you have an array that you lock and unlock to give threads safe access to reading and writing in it. The Sidekiq gem uses a Mutex to manage workers, for example. You lock the Mutex, so that no one else can change it, then you write to it, and then unlock it. It’s important to note that if any other thread wants to write to the Mutex while it’s locked, it’ll have to wait for it to become unlocked (like, the thread just pauses while the other thread writes), so it’s important to lock as short as possible.

How to check if a pointer has been already disposed? [duplicate]

In another question, I found out that the Assigned() function is identical to Pointer <> nil. It has always been my understanding that Assigned() was detecting these dangling pointers, but now I've learned it does not. Dangling Pointers are those which may have been created at one point, but have since been free'd and haven't been assigned to nil yet.
If Assigned() can't detect dangling pointers, then what can? I'd like to check my object to make sure it's really a valid created object before I try to work with it. I don't use FreeAndNil as many recommend, because I like to be direct. I just use SomeObject.Free.
Access Violations are my worst enemy - I do all I can to prevent their appearance.
If you have an object variable in scope and it may or may not be a valid reference, FreeAndNil is what you should be using. That or fixing your code so that your object references are more tightly managed so it's never a question.
Access Violations shouldn't be thought of as an enemy. They're bugs: they mean you made a mistake that needs fixed. (Or that there's a bug in some code you're relying on, but I find most often that I'm the one who screwed up, especially when dealing with the RTL, VCL, or Win32 API.)
It is sometimes possible to detect when the address a pointer points to resides in a memory block that is on the heap's list of freed memory blocks. However, this requires comparing the pointer to potentially every block in the heap's free list which could contain thousands of blocks. So, this is potentially a computationally intensive operation and something you would not want to do frequently except perhaps in a severe diagnostic mode.
This technique only works while the memory block that the pointer used to point to continues to sit in the heap free list. As new objects are allocated from the heap, it is likely that the freed memory block will be removed from the heap free list and put back into active play as the home of a new, different object. The original dangling pointer still points to the same address, but the object living at that address has changed. If the newly allocated object is of the same (or compatible) type as the original object now freed, there is practically no way to know that the pointer originated as a reference to the previous object. In fact, in this very special and rare situation, the dangling pointer will actually work perfectly well. The only observable problem might be if someone notices that the data has changed out from under the pointer unexpectedly.
Unless you are allocating and freeing the same object types over and over again in rapid succession, chances are slim that the new object allocated from that freed memory block will be the same type as the original. When the types of the original and the new object are different, you have a chance of figuring out that the content has changed out from under the pointer. However, to do that you need a way to know the type of the original object that the pointer referred to. In many situations in native compiled applications, the type of the pointer variable itself is not retained at runtime. A pointer is a pointer as far as the CPU is concerned - the hardware knows very little of data types. In a severe diagnostic mode it's conceivable that you could build a lookup table to associate every pointer variable with the type allocated and assigned to it, but this is an enormous task.
That's why Assigned() is not an assertion that the pointer is valid. It just tests that the pointer is not nil.
Why did Borland create the Assigned() function to begin with? To further hide pointerisms from novice and occasional programmers. Function calls are easier to read and understand than pointer operations.
The bottom line is that you should not be attempting to detect dangling pointers in code. If you are going to refer to pointers after they have been freed, set the pointer to nil when you free it. But the best approach is not to refer to pointers after they have been freed.
So, how do you avoid referring to pointers after they have been freed? There are a couple of common idioms that get you a long way.
Create objects in a constructor and destroy them in the destructor. Then you simply cannot refer to the pointer before creation or after destruction.
Use a local variable pointer that is created at the beginning of the function and destroyed as the last act of the function.
One thing I would strongly recommend is to avoid writing if Assigned() tests into your code unless it is expected behaviour that the pointer may not be created. Your code will become hard to read and you will also lose track of whether the pointer being nil is to be expected or is a bug.
Of course we all do make mistakes and leave dangling pointers. Using FreeAndNil is one cheap way to ensure that dangling pointer access is detected. A more effective method is to use FastMM in full debug mode. I cannot recommend this highly enough. If you are not using this wonderful tool, you should start doing so ASAP.
If you find yourself struggling with dangling pointers and you find it hard to work out why then you probably need to refactor the code to fit into one of the two idioms above.
You can draw a parallel with array indexing errors. My advice is not to check in code for validity of index. Instead use range checking and let the tools do the work and keep the code clean. The exception to this is where the input comes from outside your program, e.g. user input.
My parting shot: only ever write if Assigned if it is normal behaviour for the pointer to be nil.
Use a memory manager, such as FastMM, that provides debugging support, in particular to fill a block of freed memory with a given byte pattern. You can then dereference the pointer to see if it points at a memory block that starts with the byte pattern, or you can let the code run normallly ad raise an AV if it tries to access a freed memory block through a dangling pointer. The AV's reported memory address will usually be either exactly as, or close to, the byte pattern.
Nothing can find a dangling (once valid but then not) pointer. It's your responsibility to either make sure it's set to nil when you free it's content, or to limit the scope of the pointer variable to only be available within the scope it's valid. (The second is the better solution whenever possible.)
The core point is that the way how objects are implemented in Delphi has some built-in design drawbacks:
there is no distinction between an object and a reference to an object. For "normal" variables, say a scalar (like int) or a record, these two use cases can be well told apart - there's either a type Integer or TSomeRec, or a type like PInteger = ^Integer or PSomeRec = ^TSomeRec, which are different types. This may sound like a neglectable technicality, but it isn't: a SomeRec: TSomeRec denotes "this scope is the original owner of that record and controls its lifecycle", while SomeRec: PSomeRec tells "this scope uses a transient reference to some data, but has no control over the record's lifecycle. So, as dumb it may sound, for objects there's virtually no one who has denotedly control over other objects' lifecycles. The result is - surprise - that the lifecycle state of objects may in certain situations be unclear.
an object reference is just a simple pointer. Basically, that's ok, but the problem is that there's sure a lot of code out there which treats object references as if they were a 32bit or 64bit integer number. So if e.g. Embarcadero wanted to change the implementation of an object reference (and make it not a simple pointer any more), they would break a lot of code.
But if Embarcadero wanted to eliminate dangling object pointers, they would have to redesign Delphi object references:
when an object is freed, all references to it must be freed, too. This is only possible by double-linking both, i.e. the object instance must carry a list with all of the references to it, that is, all memory addresses where such pointers are (on the lowest level). Upon destruction, that list is traversed, and all those pointers are set to nil
a little more comfortable solution were that the "one" holding such a reference can register a callback to get informed when a referenced object is destroyed. In code: when I have a reference FSomeObject: TSomeObject I would want to be able to write in e.g. SetSomeObject: FSomeObject.OnDestruction := Self.HandleDestructionOfSomeObject. But then FSomeObject can't be a pointer; instead, it would have to be at least an (advanced) record type
Of course I can implement all that by myself, but that is tedious, and isn't it something that should be addressed by the language itself? They also managed to implement for x in ...

How to do Parallel operations with Thread Scope in Web app using Ninject

I've run into the same question repeatedly whenever using a new DI framework... how do you run massively-parallel operation kicked off from an HttpRequest where each thread needs its own unique copy of the dependencies? In my case, I'm using Ninject.
The specific case I always run into is a CPU-intensive report, using Parallel.ForEach, that needs to use an Entity Framework DbContext; the EF context must be unique to the thread, but outside of these special reports the EF context it must be InRequestScope.
How do you achieve this with Ninject? Preferably allow disposing the EF context with each task on the Parallel.ForEach, since the data loaded with the context would just stay in the context and consume memory.
Note that this report is big enough to warrant Parallel.ForEach but small enough that it can run synchronously on a web request and not timeout the browser (<60 seconds). Maybe I'm weird, but I run into this need a lot.
The solution has several different moving parts that, IMO, aren't terribly well-documented parts of Ninject. The upside is that after implementing something like this, you should start feeling comfortable with Ninject in a hurry!
First, you need to change the scope for your objects so they use the HttpContext if it exists, and if not, use the current thread as a fallback. There is no documentation for this, but there is a DefaultScopeCallback that was added to the settings a while back. Set that property to your own scope callback which uses the same code in the Ninject.Web.Common source to get the HttpContext, but then use "?? Thread.CurrentThread" as the fallback. Do that in the CreateKernel code that should have been created automatically when you installed the NuGet package.
(I have substituted the StandardScopeCallbacks.Thread(ctx) where I used to have Thread.CurrentThread, since the former could conceivably change at some point. Currently those two are identical in what they do.)
private static IKernel CreateKernel()
{
var settings = new NinjectSettings{ DefaultScopeCallback = DefaultScopeCallback };
var kernel = new StandardKernel(settings);
// The rest of the default implementation of CreateKernel left out for brevity
}
private static Object DefaultScopeCallback(Ninject.Activation.IContext ctx)
{
var scope = ctx.Kernel.Components.GetAll<INinjectHttpApplicationPlugin>()
.Select(c => c.GetRequestScope(ctx)).FirstOrDefault(s => s != null);
return scope ?? Ninject.Infrastructure.StandardScopeCallbacks.Thread(ctx);
}
Also, don't forget that the Kernel needs to be set aside as a static object for access later. You don't want to new-up a new Kernel every time you need it; I make mine accessible via "MyConfig.ObjectFactory". While this is a code smell of the service locator anti-pattern, we're going to great lengths here to avoid the anti-pattern as much as possible.
Second, according to the commit description, the DefaultScopeCallback only affects explicit bindings with no explicit scope. So if, like me, you were depending on a bunch of implicit bindings that you hadn't added, you now need to configure them:
kernel.Bind(i => i.From(Assembly.GetExecutingAssembly(), Assembly.GetAssembly(typeof(Bll.MyConfig)))
.SelectAllClasses()
.BindToSelf());
If you don't like doing the above, there's another way of setting the default scope for all implicit bindings that is arguably more elegant. Changing default object scope with Ninject 2.2
Third, if you'd like to clear all cached objects from the scope at the end of each Parallel operation so that memory usage doesn't skyrocket due to EF caching or whatnot, here's how clear the Ninject cache scoped to the current thread:
Parallel.ForEach(myList, i =>
{
var threadDb = MyConfig.ObjectFactory.Get<MyContext>();
CreateModelsForItem(i, threadDb);
MyConfig.ObjectFactory.Components.Get<Ninject.Activation.Caching.ICache>().Clear(Thread.CurrentThread);
});
Note that I did some testing without that Clear line at the end, and it seemed like the EF Context was getting re-used even if that HttpRequest finished and I generated the report several more times. This was not what I wanted, so the Clear operation was important. Really, the behavior I want is closer to InCallScope, but trying to get InRequestScope with InCallScope as a fallback is a can of worms I'll open on another day.

Reading ARM CPU registers from LKM

I want to read the values stored in the Link Register or Frame Pointer from a linux kernel module and I am not sure the syntax to use. For context, I've compiled Android goldfish 3.4 kernel and am using insmod to load my module into the kernel.
My knowledge of this area is entirely hobbyist in nature, someone else might know something really stylish that obviates this dangerous and hackish method.
As a philosophical issue, the kernel doesn't tamper with user-mode operation as part of it's normal duties. This means you are going to have to tamper with the direct operation of the kernel and potentially cause crashes, corruption and other problematic c-words.
There are two ways to go about doing this. You can go through the syscall entry/exit mechanism: switching a single running thread from running usermode code to running kernel code in the context of that thread while slyly replacing it's stored registers before it goes back again. The second is the context switch mechanism itself, which switches in kernel mode from running in the context of one thread to another, again replacing relevant stored register material.
The operating theory behind all of this is that each user thread has both a user-mode stack and a kernel-mode stack. When a thread enters the kernel, the current value of the user-mode stack and instruction pointer are saved to the thread's kernel-mode stack, and the CPU switches to the kernel-mode stack. The remaining register values and flags are then also saved to the kernel stack.
At this stage, you can directly read and modify those values prior to the process being returned off the run queue. After this, when your thread returns from the kernel to user-mode, the register values and flags are popped from the kernel-mode stack, then the user-mode stack and instruction pointer values are restored from the modified values on the kernel-mode stack.
The schedule has an internal mechanism that selects the process to run next, calling switch_to(). As the name implies, this function essentially just switches the kernel stacks - it saves the current value of the stack pointer into the TCB for the current thread (called struct task_struct in Linux), and loads a previously-saved stack pointer from the TCB for the next thread. You can use this to calculate the user-mode process in question (possibly requiring a cross-reference of existing kernel-mode process structures)
The way to look at the state of the current userspace process from kernel-side is current_pt_regs() (cf. task_pt_regs() for a specific task). This gets you a pointer to a struct pt_regs, which is the same thing you'd find in the mcontext_t in a signal handler (on ARM at least). The kernel even provides nice accessor macros to make the whole caboodle rather civilised - reading through existing uses in the source should give a good feel of how to do it, but for the sake of completeness here's a trivial example*:
#include <asm/ptrace.h>
void func()
{
struct pt_regs *regs = current_pt_regs();
pr_info("User LR was %p\n", (void *)regs->ARM_lr);
}
You'd have to know the ABI details of the userspace binary to know which, if any, register is being used as a frame pointer, but if there is one it's typically in r11 or r7.
*Code typed directly into browser late at night, usual disclaimers apply, etc.

Resources