I found that kern/processor.h declares current_processor(), but I cannot use current_processor() even after including kern/processor.h.
Is there any method to get the current processor ID?
Edit 030420: I need a method that returns the current processor ID and that can be called from a KEXT. current_processor() and cpu_number() don't work in a KEXT.
The following function is declared in <kern/cpu_number.h>:
extern int cpu_number(void);
and returns the index of the CPU on which the code is currently executing.
Please note, however, that this is part of the unsupported KPI, so you need to link against com.apple.kpi.unsupported.
Also note that the result is meaningless unless preemption is disabled, which is normally not the case; preemption is disabled only while a spinlock is held or while running in a primary interrupt handler. With preemption enabled, the running thread can be rescheduled at any time, so by the time your code uses the CPU number obtained from the above function, the thread may already have been moved to a different CPU.
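To illustrate that caveat, here is a minimal sketch, assuming a KEXT linked against com.apple.kpi.unsupported and that ml_set_interrupts_enabled() from <machine/machine_routines.h> is available to it; the helper name read_current_cpu is made up for the example:
#include <kern/cpu_number.h>
#include <machine/machine_routines.h>

/* Disabling interrupts is used here only as one way to keep the thread
   from being rescheduled while the value is read. */
static int read_current_cpu(void)
{
    boolean_t istate = ml_set_interrupts_enabled(FALSE); /* no rescheduling now */
    int cpu = cpu_number();   /* meaningful only while we cannot be preempted */
    ml_set_interrupts_enabled(istate);                   /* restore old state  */
    return cpu;   /* may already be stale once preemption is possible again */
}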
I'm a FreeRTOS newbie, but I don't think this is well documented. For example, in xTaskCreate():
pcName A descriptive name for the task. This is mainly used to facilitate debugging, but can also be used to obtain a task handle.
The maximum length of a task’s name is set using the configMAX_TASK_NAME_LEN parameter in FreeRTOSConfig.h.
Must a task be associated with a name?
What happens if pcName is NULL?
What happens if I create multiple tasks with the same name?
Should I maintain the mapping between a task's handle and its name, or does FreeRTOS maintain this relation?
In short, the official documentation is not clear about the relation between a task's handle and its name.
No, it does not need a name.
If pcName is NULL it simply does not have a name. Nothing happens.
You will get several tasks with the same name so you cannot identify them by name. The behaviour of xTaskGetHandle is then undefined (as per the documentation).
FreeRTOS handles this.
The documentation also states this function takes a long time to complete and should be used sparingly. Personally, I don't think there is any reason to use it at all. Just use task names for debug purposes only (useful in a debugger or when using Percepio Tracealyzer).
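To make the handle/name relationship concrete, here is a minimal sketch using xTaskCreate(); the task function, stack size, and priorities are illustrative placeholders:
#include "FreeRTOS.h"
#include "task.h"

static void vWorkerTask(void *pvParameters)
{
    (void) pvParameters;
    for (;;) {
        vTaskDelay(pdMS_TO_TICKS(1000));  /* placeholder work */
    }
}

void vCreateWorkers(void)
{
    TaskHandle_t xUnnamed = NULL;
    TaskHandle_t xNamed = NULL;

    /* NULL name: the task simply has no name (as noted above). */
    xTaskCreate(vWorkerTask, NULL, configMINIMAL_STACK_SIZE, NULL,
                tskIDLE_PRIORITY + 1, &xUnnamed);

    /* Named task: keep the returned handle yourself; the name is only for
       debugging and (sparingly) xTaskGetHandle(). */
    xTaskCreate(vWorkerTask, "worker", configMINIMAL_STACK_SIZE, NULL,
                tskIDLE_PRIORITY + 1, &xNamed);
}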
I am implementing an HDF5 layer in an interpreted language with automatic memory reclamation (garbage collection).
When a proxy to an HDF5 entity (H5File, H5Group, H5Dataset, H5Dataspace, H5Datatype, etc.) is no longer referenced, it is reclaimed automatically. With an ephemeron-like facility, I can arrange to be notified and invoke the corresponding close function automatically (H5Fclose, H5Gclose, H5Dclose, etc.) in order to release the underlying resource.
By default, I have no control over the order of reclamation. However, if the order of closing matters, I can arrange to keep a strong pointer to a parent proxy (for example the H5 file) from within every other entity. If the order does not matter, I will avoid this useless complication.
So my questions:
Can I invoke H5Fclose(fid); before H5Gclose(gid); where previously gid=H5Gcreate(fid,'/foo',H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);?
Can I continue to operate on the group once I have closed the containing file? For example, is it legal to call H5Fclose(fid); before gid2=H5Gcreate(gid,'bar',H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT); in the above example? If not, are there other entities concerned, or is it just the file?
Doh, a case of blindness: the documentation says that the close is delayed until all objects in the file have been closed, so 1. the order does not matter and 2. is legal.
https://support.hdfgroup.org/HDF5/doc1.6/RM_H5F.html#File-Close
However, it may not work in all circumstances, so it is not recommended.
H5Fclose terminates access to an HDF5 file by flushing all data to storage and terminating access to the file through file_id.
If this is the last file identifier open for the file and no other access identifier is open (e.g., a dataset identifier, group identifier, or shared datatype identifier), the file will be fully closed and access will end.
Delayed close:
Note the following deviation from the above-described behavior. If H5Fclose is called for a file but one or more objects within the file remain open, those objects will remain accessible until they are individually closed. Thus, if the dataset data_sample is open when H5Fclose is called for the file containing it, data_sample will remain open and accessible (including writable) until it is explicitly closed. The file will be automatically closed once all objects in the file have been closed.
Be warned, however, that there are circumstances where it is not possible to delay closing a file. For example, an MPI-IO file close is a collective call; all of the processes that opened the file must close it collectively. The file cannot be closed at some time in the future by each process in an independent fashion. Another example is that an application using an AFS token-based file access privilege may destroy its AFS token after H5Fclose has returned successfully. This would make any future access to the file, or any object within it, illegal.
In such situations, applications must close all open objects in a file before calling H5Fclose. It is generally recommended to do so in all cases.
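For illustration, here is a minimal C sketch of that delayed-close behaviour (file and group names are arbitrary, error checking omitted):
#include "hdf5.h"

int main(void)
{
    hid_t fid = H5Fcreate("demo.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t gid = H5Gcreate(fid, "/foo", H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* Close the file id first: the file stays open internally while /foo is open. */
    H5Fclose(fid);

    /* Still legal: create a child group through the still-open group id. */
    hid_t gid2 = H5Gcreate(gid, "bar", H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    H5Gclose(gid2);
    H5Gclose(gid);   /* last open object closed, so the file really closes here */
    return 0;
}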
In vulkan.h, every instance of VkAccessFlagBits appears in a pair that contains a srcAccessMask and a dstAccessMask:
VkAccessFlags srcAccessMask;
VkAccessFlags dstAccessMask;
In every case, according to my understanding, the purpose of these masks is to help designate two sets of operations, such that results of operations in the first set will be visible to operations in the second set. For instance, write operations occurring prior to a barrier should not get hung up in caches but should instead propagate all the way to locations from which they can be read after the barrier. Or something like that.
The access flags come in both READ and WRITE forms:
/* ... */
VK_ACCESS_SHADER_READ_BIT = 0x00000020,
VK_ACCESS_SHADER_WRITE_BIT = 0x00000040,
VK_ACCESS_COLOR_ATTACHMENT_READ_BIT = 0x00000080,
VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT = 0x00000100,
/* ... */
But it seems to me that srcAccessMask should probably always be some sort of VK_ACCESS_*_WRITE_BIT combination, while dstAccessMask should always be a combination of VK_ACCESS_*_READ_BIT values. If that is true, then the READ/WRITE distinction is identical to and implicit in the src/dst distinction, and so it should be good enough to just have VK_ACCESS_SHADER_BIT etc., without READ_ or WRITE_ variants.
Why are there READ_ and WRITE_ variants, then? Is it ever useful to specify that some read operations must fully complete before some other operations have begun? Note that all operations using VkAccessFlagBits produce (I think) execution dependencies as well as memory dependencies. It seems to me that the execution dependencies should be good enough to prevent earlier reads from receiving values written by later writes.
While writing this question I encountered a statement in the Vulkan specification that provides at least part of an answer:
Memory dependencies are used to solve data hazards, e.g. to ensure that write operations are visible to subsequent read operations (read-after-write hazard), as well as write-after-write hazards. Write-after-read and read-after-read hazards only require execution dependencies to synchronize.
This is from the section 6.4. Execution And Memory Dependencies. Also, from earlier in that section:
The application must use memory dependencies to make writes visible before subsequent reads can rely on them, and before subsequent writes can overwrite them. Failure to do so causes the result of the reads to be undefined, and the order of writes to be undefined.
From this I surmise that, yes, the execution dependencies produced by the Vulkan commands that involve these access flags probably do free you from ever having to put a VK_ACCESS_*_READ_BIT into a srcAccessMask field--but that you might in fact want to have READ_ flags, WRITE_ flags, or both in some of your dstAccessMask fields, because apparently it's possible to use an explicit dependency to prevent read-after-write hazards in such a way that write-after-write hazards are NOT prevented. (And maybe vice-versa?)
Like, maybe your Vulkan will sometimes decide that a write does not actually need to be propagated all the way through a particular cache to its final specified destination for the sake of a subsequent read operation, IF Vulkan happens to know that that read operation will simply read from that same cache, saving some time? But then a second write might happen, and write to a different cache, and there'll be two caches left in a race (with the choice of winner undefined) to send their two values to the same spot. Or something? Maybe my mental model of these caches is entirely wrong.
It is fairly solidly established, at least, that memory barriers are confusing.
Let's go over all the possibilities:
read–read — well yeah, that one is pretty useless. Khronos seems to agree (#131) that it is a pointless value in src (basically equivalent to 0).
read–write — an execution dependency should be sufficient to synchronize without this. Khronos seems to agree (#131) that it is a pointless value in src (basically equivalent to 0).
write–read — that's the obvious and most common one.
write–write — similar reasoning to write–read above. Without it the order of the writes would be undefined. In most situations it is a bit pointless to write something you haven't even read in between, but now you have a way to synchronize it.
You can provide a bitmask of several of these flags in both src and dst. In that case it makes sense to have both masks so the driver can sort the dependencies out for you. (I don't expect any performance overhead from this at the API level, so it is allowed as a convenience.)
From an API design perspective, it could mean adding a separate enum type for srcAccess. But the _READ variants could just as well be forbidden in srcAccess through "Valid Usage" rules, which makes this argument weak. The READ variants were probably kept allowed in src because they are benign.
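For the common write–read case from the list above, a barrier typically looks something like this minimal C sketch (the command buffer, image, stage masks, and layouts are illustrative assumptions, not the only valid choice):
#include <vulkan/vulkan.h>

/* Make prior color-attachment writes available and visible to later
   fragment-shader reads of the same image. */
static void make_attachment_readable(VkCommandBuffer cmd, VkImage image)
{
    VkImageMemoryBarrier barrier = {
        .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,
        .srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT, /* prior writes */
        .dstAccessMask = VK_ACCESS_SHADER_READ_BIT,            /* later reads  */
        .oldLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
        .newLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL,
        .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
        .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
        .image = image,
        .subresourceRange = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 },
    };

    vkCmdPipelineBarrier(cmd,
                         VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT, /* src stages */
                         VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,         /* dst stages */
                         0, 0, NULL, 0, NULL, 1, &barrier);
}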
I have an application that writes commands to some specialized printers directly on the LPT1 port. The code looks like this:
AssignFile(t, 'LPT1');
Rewrite(t);
Write(t,#27 + '#'); // initialize
Sleep(50); // avoid buffer fill
Write(t,#27#32 + Chr(0)); // set default font
...
The problem is that when the printer is not connected to the port, the first Write instruction doesn't do anything; it just hangs and locks the entire thread.
Is there a way to define a timeout for these instructions, or can you recommend another library that could do this job? It would be great if it had a Write function similar to the one in Delphi, because the amount of code using this approach is very large, and it would be very hard to change all of it.
You can use SetCommTimeouts to configure a timeout for the printer handle. To get the handle from your TextFile variable, type-cast it to TTextRec and read the Handle field:
var
  CommTimeouts: TCommTimeouts;
begin
  Win32Check(GetCommTimeouts(TTextRec(t).Handle, CommTimeouts)); // keep defaults for the other fields
  CommTimeouts.WriteTotalTimeoutConstant := DesiredTimeout;      // timeout in milliseconds
  Win32Check(SetCommTimeouts(TTextRec(t).Handle, CommTimeouts));
end;
Calling GetCommTimeouts first, as above, preserves the default values of the fields you don't need to change.
Move your printing code to a separate thread. The built-in text-file functions don't have any timeout mechanism, but you can tell the OS to cancel any pending I/O operations whenever you decide that too much time has passed.
I'd start with CancelSynchronousIo, which cancels all I/O on a given thread. It should allow you to keep all your existing Write calls; just be prepared to handle the errors they report when they are cancelled.
That function requires Windows Vista or higher, which shouldn't be a problem nowadays, but won't work if you still need support for Windows XP. In that case, you'll need to use CreateFile to open the port for overlapped I/O. Then you can use CancelIo or CancelIoEx. You'll need to replace all your Write calls since the built-in Delphi functions don't support overlapped operations.
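Here is a minimal sketch of that mechanism in C (the same CreateThread / WaitForSingleObject / CancelSynchronousIo calls are available from Delphi's Windows unit; PrintJob, RunPrintWithTimeout and TIMEOUT_MS are made-up names for the example):
#define _WIN32_WINNT 0x0600   /* CancelSynchronousIo needs Vista or later */
#include <windows.h>

static DWORD WINAPI PrintJob(LPVOID param)
{
    (void) param;
    /* ... the blocking writes to the LPT1 handle go here ... */
    return 0;
}

int RunPrintWithTimeout(DWORD TIMEOUT_MS)
{
    HANDLE worker = CreateThread(NULL, 0, PrintJob, NULL, 0, NULL);
    if (worker == NULL)
        return -1;

    if (WaitForSingleObject(worker, TIMEOUT_MS) == WAIT_TIMEOUT) {
        /* Printer not responding: abort whatever write is currently blocked. */
        CancelSynchronousIo(worker);
        WaitForSingleObject(worker, INFINITE);  /* let the thread unwind */
    }

    CloseHandle(worker);
    return 0;
}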
I am having trouble using pthreads in my MPI program. The program runs fine without pthreads, but I then decided to execute a time-consuming operation in parallel, so I create a pthread that calls MPI_Probe, MPI_Get_count, and MPI_Recv. The program fails at MPI_Probe and no error code is returned. This is how I initialize the MPI environment:
MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided_threading_support);
The provided threading support is '3', which I assume is MPI_THREAD_SERIALIZED. Any ideas on how I can solve this problem?
The provided threading support is '3' which I assume is MPI_THREAD_SERIALIZED.
The MPI standard defines thread support levels as named constants and only requires that their values are monotonic, i.e. MPI_THREAD_SINGLE < MPI_THREAD_FUNNELED < MPI_THREAD_SERIALIZED < MPI_THREAD_MULTIPLE. The actual numeric values are implementation-specific and should never be used or compared against.
MPI communication calls by default never return error codes other than MPI_SUCCESS. The reason is that MPI invokes the communicator's error handler before the call returns, and all communicators are initially created with MPI_ERRORS_ARE_FATAL installed as their error handler. That error handler terminates the program, usually printing some debugging information such as the reason for the failure. Both MPICH (and its countless variants) and Open MPI produce quite elaborate reports on what led to the termination.
To enable user error handling on communicator comm, you should make the following call:
MPI_Comm_set_errhandler(comm, MPI_ERRORS_RETURN);
Watch out for the error codes returned - their numerical values are also implementation-specific.
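Putting both points together, a minimal sketch might look like this (MPI_Barrier stands in for whatever call you actually want to check):
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    /* Compare against the named constant, never the raw number. */
    if (provided < MPI_THREAD_MULTIPLE)
        fprintf(stderr, "warning: MPI_THREAD_MULTIPLE not available\n");

    /* Make failing calls return an error code instead of aborting. */
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    int rc = MPI_Barrier(MPI_COMM_WORLD);
    if (rc != MPI_SUCCESS) {
        char msg[MPI_MAX_ERROR_STRING];
        int len;
        MPI_Error_string(rc, msg, &len);
        fprintf(stderr, "MPI_Barrier failed: %s\n", msg);
    }

    MPI_Finalize();
    return 0;
}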
If your MPI implementation isn't willing to give you MPI_THREAD_MULTIPLE, there are three things you can do:
Get a new MPI implementation.
Protect MPI calls with a critical section.
Cut it out with the threading thing.
I would suggest #3. The whole point of MPI is parallelism -- if you find yourself creating multiple threads for a single MPI subprocess, you should consider whether those threads should have been independent subprocesses to begin with.
Particularly with MPI_THREAD_MULTIPLE. I could maybe see a use for MPI_THREAD_SERIALIZED, if your threads are sub-subprocess workers for the main subprocess thread... but MULTIPLE implies that you're tossing data around all over the place. That loses you the primary convenience offered by MPI, namely synchronization. You'll find yourself essentially reimplementing MPI on top of MPI.
Okay, now that you've read all that, the punchline: 3 is MPI_THREAD_MULTIPLE. But seriously. Reconsider your architecture.