Dispatch Thread Groups Error - ios

I am running computer vision algorithms on a video feed I'm getting in real time. I'm running these operations/algorithms using DispatchQueue asynchronously. However, I am getting the following error, which I cannot interpret:
[MTLDebugComputeCommandEncoder dispatchThreadgroups:threadsPerThreadgroup:]:949: failed assertion (threadgroupsPerGrid.width(0) * threadgroupsPerGrid.y(12) * threadgroupsPerGrid.depth(1))(0) must not be 0.'
What is this error?

This message is stating that there was an assertion failure caused by an assertion in [MTLDebugComputeCommandEncoder dispatchThreadgroups:threadsPerThreadgroup:].
As far as I understand it, it was asserted that this expression:
threadgroupsPerGrid.width * threadgroupsPerGrid.y * threadgroupsPerGrid.depth
should not be 0, but it was 0, causing this assertion failure. Additionally, they've annotated the values of these variables:
threadgroupsPerGrid.width was 0
threadgroupsPerGrid.y was 12
threadgroupsPerGrid.depth was 1
threadgroupsPerGrid.width * threadgroupsPerGrid.y * threadgroupsPerGrid.depth evaluated to 0
This invalid state is most likely the result of invalid arguments being passed to [MTLDebugComputeCommandEncoder dispatchThreadgroups:threadsPerThreadgroup:]. Going by the annotated values, the culprit is threadgroupsPerGrid.width, which was 0; check how that value is computed before the dispatch.
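The question does not show how threadgroupsPerGrid is computed, so the following is only a sketch under an assumption: that the count comes from dividing an image extent by the threadgroup size. With truncating integer division, an extent smaller than the group size gives 0, which would trip exactly this assertion. The helper names threadgroups_for and dispatch_is_valid are hypothetical and not part of Metal; the sketch is in plain C to show the arithmetic only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a Metal API): number of threadgroups needed to
   cover `extent` items when each group holds `threads_per_group` threads. */
size_t threadgroups_for(size_t extent, size_t threads_per_group)
{
    assert(threads_per_group > 0);
    /* Ceiling division: any non-zero extent yields at least one group,
       whereas plain extent / threads_per_group truncates toward 0. */
    return extent / threads_per_group + (extent % threads_per_group != 0);
}

/* Mirrors the condition in the assertion message: the product of the three
   counts is non-zero only when every count is non-zero. */
int dispatch_is_valid(size_t width, size_t height, size_t depth)
{
    return width != 0 && height != 0 && depth != 0;
}
```

If the extent itself is 0 (for example, a frame that has not arrived yet), the count is still 0, so it seems safer to test dispatch_is_valid and skip the dispatch for that frame rather than encode it.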


CloudWatch Events Metrics - DeadLetterInvocations

The documentation says:
Metric: DeadLetterInvocations
Description:
Measures the number of times a rule’s target is not invoked in
response to an event. This includes invocations that would result in
triggering the same rule again, causing an infinite loop.
Valid Dimensions: RuleName
Units: Count
Can someone give a simple explanation of what the above description means in layman's terms?
I was also confused about DeadLetterInvocations before.
The correct explanation:
DeadLetterInvocations: Invocations that are failed temporarily and are being retried by the event rule itself. Only some events, such as those that fail due to a throttling or timeout error, are retried.
InvocationsSentToDlq: Permanently failed invocations that are sent to the SQS dead-letter queue (DLQ) configured on the target.

Uploading Program to OpenMPI gives initialization error, on IntelMPI memory leak

I am a graduate student (master's) and use an in-house code for running my simulations that use MPI. Earlier, I used OpenMPI on a supercomputer we used to access and since it shut down I've been trying to switch to another supercomputer that has Intel MPI installed on it. The problem is, the same code that was working perfectly fine earlier now gives memory leaks after a set number of iterations (time steps). Since the code is relatively large and my knowledge of MPI is very basic, it is proving very difficult to debug it.
So I installed OpenMPI onto this new supercomputer I am using, but it gives the following error message upon execution and then terminates:
Invalid number of PE
Please check partitioning pattern or number of PE
NOTE: The error message is repeated once for each node I used to run the case (here, 8). The code was compiled using mpif90 with -fopenmp for thread parallelisation.
There is in fact no guarantee that running it on OpenMPI won't give the memory leak, but it is worth a shot I feel, as it was running perfectly fine earlier.
PS: On Intel MPI, this is the error I got (compiled with mpiifort with -qopenmp)
Abort(941211497) on node 16 (rank 16 in comm 0): Fatal error in PMPI_Isend: Unknown error class, error stack:
PMPI_Isend(152)...........: MPI_Isend(buf=0x2aba1cbc8060, count=4900, dtype=0x4c000829, dest=20, tag=0, MPI_COMM_WORLD, request=0x7ffec8586e5c) failed
MPID_Isend(662)...........:
MPID_isend_unsafe(282)....:
MPIDI_OFI_send_normal(305): failure occurred while allocating memory for a request object
Abort(203013993) on node 17 (rank 17 in comm 0): Fatal error in PMPI_Isend: Unknown error class, error stack:
PMPI_Isend(152)...........: MPI_Isend(buf=0x2b38c479c060, count=4900, dtype=0x4c000829, dest=21, tag=0, MPI_COMM_WORLD, request=0x7fffc20097dc) failed
MPID_Isend(662)...........:
MPID_isend_unsafe(282)....:
MPIDI_OFI_send_normal(305): failure occurred while allocating memory for a request object
[mpiexec@cx0321.obcx] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:357): write error (Bad file descriptor)
[mpiexec@cx0321.obcx] cmd_bcast_root (../../../../../src/pm/i_hydra/mpiexec/mpiexec.c:164): error sending cmd 15 to proxy
[mpiexec@cx0321.obcx] send_abort_rank_downstream (../../../../../src/pm/i_hydra/mpiexec/intel/i_mpiexec.c:557): unable to send response downstream
[mpiexec@cx0321.obcx] control_cb (../../../../../src/pm/i_hydra/mpiexec/mpiexec.c:1576): unable to send abort rank to downstreams
[mpiexec@cx0321.obcx] HYDI_dmx_poll_wait_for_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:79): callback returned error status
[mpiexec@cx0321.obcx] main (../../../../../src/pm/i_hydra/mpiexec/mpiexec.c:1962): error waiting for event
I will be happy to provide the code in case somebody is willing to take a look at it. It is written using Fortran with some of the functions written in C. My research progress has been completely halted due to this problem and nobody at my lab has enough experience with MPI to resolve this.

computeFunction must not be nil error reported

The error "computeFunction must not be nil" was reported after a Metal shading function had been called about 248 times in a row.
/Library/Caches/com.apple.xbs/Sources/Metal/Metal-56.6/Framework/MTLComputePipeline.mm:230: failed assertion `computeFunction must not be nil.'
Abort trap: 6
The first 247 calls worked correctly, but the program failed at the 248th call.
What causes this and how can it be avoided?
Thanks in advance.
Only the command buffer and encoder are transient and can be created on every call (inside the draw() function). Libraries and functions are not transient, so you should avoid creating them repeatedly; create them once and reuse them.

Testing `errno` after calling `strtol` returns "No such process"

Even though the string conversion succeeds, testing errno returns a value indicating an error:
#include <stdlib.h>
#include <sys/errno.h>
const char* numberString = "7";
char* endPtr;
errno = 0;
long number = strtol(numberString, &endPtr, 10);
NSLog(@"%ld", number);
if (errno) {
    perror("string to integer conversion failed");
}
The output is (on the Simulator, iOS 7)
$ 2014-05-22 09:27:32.954 Test[2144:60b] 7
$ string to integer conversion failed: No such process
The behavior is similar on the device.
The man page for strtol says in a comment:
RETURN VALUES
The strtol(), strtoll(), strtoimax(), and strtoq() functions return the result of the conversion,
unless the value would underflow or overflow. If no conversion could be performed, 0 is returned and
the global variable errno is set to EINVAL (the last feature is not portable across all platforms). If
an overflow or underflow occurs, errno is set to ERANGE and the function return value is clamped
according to the following table.
It's quite unclear what this exactly means for iOS. Any insights here?
Edit:
It turned out that the call to NSLog did set errno. So @Mat, in his answer and comments, was spot on in saying that "all bets are off" when testing errno AFTER calling an unrelated function (here NSLog).
If no conversion could be performed, 0 is returned
You're not in that case, 7 was returned.
If an overflow or underflow occurs, ... the return value is clamped according to the following table.
You're not in that case either.
So strtol didn't fail, and inspecting errno is meaningless. Functions that are documented to set errno on failure will do so when they fail. When no failure happened, the value of errno is "morally" undefined. Don't inspect it.
strtol is a special case though. POSIX requires the following:
These functions shall not change the setting of errno if successful.
So your example should be ok. Except that you're calling a function between the strtol call and your inspection of errno. If you're 100% sure that that function will not itself change errno, or call another function that might set it, then you'd be ok in this very specific case (a function documented not to alter errno in case of success - this is not the norm). Apparently that's not the case though. NSLog will very likely use some system calls at some point, and those (in general) have no guarantee of not altering errno.
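A minimal sketch of a safer pattern, assuming the goal is just to tell a successful conversion from a failed one: copy errno into a local variable immediately after strtol returns, before NSLog or any other call has a chance to overwrite it, and also check the end pointer. parse_long is a hypothetical helper name, not something from the question.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a base-10 long from `text`.
   Returns 0 on success and stores the value in *out; otherwise returns an
   errno-style code (EINVAL if no digits were found, ERANGE on overflow). */
int parse_long(const char *text, long *out)
{
    assert(text != NULL && out != NULL);

    char *end = NULL;
    errno = 0;                        /* clear any stale value first */
    long value = strtol(text, &end, 10);
    int saved_errno = errno;          /* capture before any other call */

    if (end == text) {
        return EINVAL;                /* no conversion was performed */
    }
    if (saved_errno != 0) {
        return saved_errno;           /* typically ERANGE: value was clamped */
    }
    *out = value;
    return 0;
}
```

With this shape, logging can happen freely after parse_long returns, because the decision no longer depends on the live value of errno.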

Assertion failed: xdrPtr && xdrPtr == *xdrLPP, file xx.cpp, line 2349

We have a system built with C++ Builder 2010 that, after running for about 20 hours, starts firing off assertion failures.
Assertion failed: xdrPtr && xdrPtr == *xdrLPP, file xx.cpp, line 2349
I have googled it like crazy but found little information. Some people attribute a range of different assertions in xx.cpp to shortcomings in the exception handling in C++ Builder, but I haven't found anything referencing this particular line in the file.
We have integrated madExcept, and it seems that somewhere along the way it catches an out-of-memory exception, but I am not sure whether that is connected. Either way, an assertion triggering doesn't seem right.
Edit:
I found an if statement whose condition calls a function that could throw an exception. I wonder if this could be the culprit, somehow messing up the flow of the exception handling?
Consider
if(foo() == 0) {
...
}
wrapped in a try catch block.
If an exception is thrown from within foo(), so that no int is returned, how will the if statement react? I'm thinking it might still try to finish executing that line and thus perform the if check on the return value of the function, which would barf since no int was returned. Is this well defined, or is it undefined behaviour?
Wouldn't
int fooStatus = foo();
if(fooStatus == 0) {
...
}
be better (or should I say safer)?
Edit 2:
I just managed to get the assertion on my dev machine (the application just standing idle) without any exception about memory popping up and the app only consuming around 100 mb. So they were probably not connected.
Will try to see if I can catch it again and see around where it barfs.
Edit 3:
Managed to catch it. First comes an assertion failure notice like explained. Then the debugger shows me this exception notification.
If I break it takes me here in the code
It actually highlights the first code line after
pConnection->Open();
But it seems I can change this to anything and that line is still highlighted. So my guess is that the error is in the code above it somehow. I have seen more reports about people getting this type of assertion failure when working with databases in RAD Studio... hmmmm.
Update:
I found a thread that recursively called its own Execute function if it wasn't able to reach the DB server. I think this is at least part of the issue. It will just keep on trying, and as more and more worker threads spawn and also keep trying, it can only end in disaster.
If madExcept is hinting that you have an out of memory condition, the assert could fail if the pointers are NULL (i.e. the allocation failed). What are the values of xdrPtr and xdrLPP when the assert occurs? Can you trace back to where they are allocated?
I would start looking for memory leaks.
