My iOS project uses Metal extensively.
Sometimes MTLDevice newTextureWithDescriptor fails to create a texture.
The texture format is valid, it's a small 512x512 RGBA texture, and everything appears to be set up correctly.
When I print MTLDevice.currentAllocatedSize >> 20, it reports 1396 MB on an iPhone XR (A12 processor).
I've relied heavily on this thread for estimating the maximum runtime RAM: https://stackoverflow.com/a/15200855/2567725
So I believe the maximum allowed RAM for the iPhone XR is 1792 MB.
I guess the texture is not created because RAM has been exhausted.
So the questions are:
Is it true that on the A12 Metal's currentAllocatedSize correlates with CPU memory, and GPU memory is "shared" with CPU memory, so that 1792 MB = GPU + CPU for the iPhone XR?
Is there any way to get a description of the Metal error? (Right now newTextureWithDescriptor just returns nil.)
Is it possible to know how Metal's allocation strategy works, e.g. how low available RAM has to be for MTLDevice newTextureWithDescriptor to return nil?
UPDATE:
In some cases, however, MTLDevice.currentAllocatedSize >> 20 is much lower, e.g. 14 MB, so I suspect Metal has some state corruption. But how can I check the reason for the error?
The debug description of the texture descriptor:
textureDescriptor.pixelFormat = 70
textureDescriptor.width = 512
textureDescriptor.height = 512
textureDescriptor.textureType = 3
textureDescriptor.usage = 23
textureDescriptor.arrayLength = 1000
textureDescriptor.sampleCount = 1
textureDescriptor.mipmapLevelCount = 1
device.currentAllocatedSize >> 20 = 14
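As a quick sanity check of what this descriptor actually requests (my reading of the raw values: pixelFormat 70 is rgba8Unorm, i.e. 4 bytes per pixel, and textureType 3 is type2DArray), the storage for 1000 array slices of 512x512 is on the order of a gigabyte by itself, regardless of how little is currently allocated. A rough back-of-the-envelope calculation in Python:

# Raw storage implied by the descriptor above, before any driver padding or alignment.
# The 4-bytes-per-pixel figure assumes pixelFormat 70 = rgba8Unorm.
width, height = 512, 512
bytes_per_pixel = 4
array_length = 1000

total_bytes = width * height * bytes_per_pixel * array_length
print(total_bytes >> 20, "MB")   # -> 1000 MB for this single texture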
I've seen a fair bit of noise about "false positives," and have even encountered them myself.
However, this takes the cake.
Easy to reproduce: using Swift 5/Xcode 10.2, create a new single-view iOS app.
Run the Leaks instrument.
You get these critters:
Malloc 64 Bytes 1 0x600001d084c0 64 Bytes Foundation +[NSString stringWithUTF8String:]
Malloc 16 Bytes 3 < multiple > 48 Bytes
Malloc 1.50 KiB 3 < multiple > 4.50 KiB
Malloc 32 Bytes 3 < multiple > 96 Bytes
Malloc 8.00 KiB 1 0x7fc56f000c00 8.00 KiB
Malloc 64 Bytes 10 < multiple > 640 Bytes
Malloc 80 Bytes 3 < multiple > 240 Bytes
Malloc 4.00 KiB 3 < multiple > 12.00 KiB
Using the simulator (XR, iOS 12.2).
That first one has a stack trace, but it's worthless.
Is there some way that I can correct for this noise? I'm writing an infrastructure component, and I need to:
A) Make damn sure it doesn't leak, and
B) Not have every Cocoapod jockey on Earth emailing me, and telling me that my component leaks.
If you use an iOS 12.1 simulator, the Leaks instrument still works (Swift 5/Xcode 10.2). For now, we can only hope it will be fixed in a future version.
Why are there different data rates in each WLAN protocol?
For example: 802.11 supports 1 and 2 Mbps,
802.11a can support 6, 9, 12, 18, 24, 36, 48 and 54 Mbps,
802.11b can support 1, 2, 5.5 and 11 Mbps,
etc.
802.11 and 802.11b:
Each data bit is converted into multiple coded bits to protect against errors caused by noise or interference. Each of these coded bits is called a chip. Different data rates use different chipping methods.
For example:
1 and 2 Mbps use the Barker code.
5.5 and 11 Mbps use Complementary Code Keying (CCK).
Both run at 11 Mchips/s.
The Barker code uses 11 chips per symbol and CCK uses 8 chips per symbol, so
the symbol rate is 11,000,000 / 11 = 1 Msps for the Barker code and 11,000,000 / 8 = 1.375 Msps for CCK.
For the Barker code:
DBPSK modulates 1 bit per symbol => 1 bit * 1 Msps = 1 Mbps
DQPSK modulates data bits in pairs => 2 bits * 1 Msps = 2 Mbps
For CCK:
4 bits * 1.375 Msps = 5.5 Mbps
8 bits * 1.375 Msps = 11 Mbps
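For reference, the same DSSS/CCK arithmetic as a small Python sketch (nothing here beyond the figures above):

# 802.11 / 802.11b data rates from chip rate, chips per symbol and bits per symbol.
chip_rate = 11_000_000                   # both Barker and CCK run at 11 Mchips/s

barker_sps = chip_rate / 11              # 11 chips per symbol -> 1 Msps
cck_sps = chip_rate / 8                  # 8 chips per symbol  -> 1.375 Msps

print(1 * barker_sps / 1e6)              # DBPSK, 1 bit/symbol  -> 1.0 Mbps
print(2 * barker_sps / 1e6)              # DQPSK, 2 bits/symbol -> 2.0 Mbps
print(4 * cck_sps / 1e6)                 # CCK,   4 bits/symbol -> 5.5 Mbps
print(8 * cck_sps / 1e6)                 # CCK,   8 bits/symbol -> 11.0 Mbps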
802.11g (802.11a):
This standard uses the OFDM modulation scheme. Look at the modulation types:
BPSK (1 bit per symbol) => maximum raw rate is 12 Mbps
QPSK (2 bits per symbol) => maximum raw rate is 24 Mbps
16-QAM (4 bits per symbol) => maximum raw rate is 48 Mbps
64-QAM (6 bits per symbol) => maximum raw rate is 72 Mbps
These modulations are combined with a code rate for error correction:
BPSK 1/2 => 6 Mbps
BPSK 3/4 => 9 Mbps
QPSK 1/2 => 12 Mbps
and so on.
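The OFDM rates work the same way; here is a Python sketch. Note that the full list of modulation/code-rate pairings below is the standard 802.11a/g set, assumed here rather than spelled out in the text above:

# 802.11a/g data rate = uncoded OFDM rate of the modulation * convolutional code rate.
raw_mbps = {"BPSK": 12, "QPSK": 24, "16-QAM": 48, "64-QAM": 72}

pairings = [("BPSK", 1, 2), ("BPSK", 3, 4), ("QPSK", 1, 2), ("QPSK", 3, 4),
            ("16-QAM", 1, 2), ("16-QAM", 3, 4), ("64-QAM", 2, 3), ("64-QAM", 3, 4)]

for modulation, num, den in pairings:
    print(f"{modulation} {num}/{den} => {raw_mbps[modulation] * num / den:g} Mbps")
# -> 6, 9, 12, 18, 24, 36, 48 and 54 Mbps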
This is my homework question. I just want to confirm my approach, and hence the answer.
A computer system uses a two-level paging scheme in which a regular memory access takes 300 nanoseconds and servicing a page fault takes 500 ns. An average instruction takes 200 ns of CPU time and one memory access. The TLB hit ratio is 80% and the page fault ratio is 20%. What is the average instruction execution time?
My approach =>
Average time to execute an instruction = CPU time + memory access time
It is given that CPU Time = 200 ns
Probability of having a page fault for an instruction = 20% = 1/5
Hence, probability of not having a page fault = 4/5
If a TLB hit occurs,
then memory access time = 0 + 300 = 300 ns (TLB access time is taken as negligible, hence the 0),
and
if a TLB miss occurs,
then memory access time = TLB access time + access time for page table 1 +
access time for page table 2 + one memory access = 0 + 300 + 300 + 300 = 900 ns
(assuming all the page tables are in main memory).
Hit ratio of the TLB = 80%
Hence, memory access time = P(no page fault) * (memory access time) + P(page fault) * (memory access time + page fault service time)
Memory access time = 4/5 * (0.80 * 300 + 0.20 * 900)
+ 1/5 * ((0.80 * 300 + 0.20 * 900) + 500 ns for servicing a page fault)
= 336 + 184
= 520 ns
Average time to execute an instruction = CPU time + memory access time
Average time to execute an instruction = 200 ns + 520 ns = 720 ns.
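A tiny Python sketch to double-check the arithmetic above (same assumptions: negligible TLB lookup time, both page tables in main memory; percentages kept as integers so the arithmetic stays exact):

# All times in nanoseconds.
cpu_time = 200
mem_access = 300
fault_service = 500

# TLB hit (80%): one memory access; TLB miss (20%): two page-table accesses plus one memory access.
emat = (80 * mem_access + 20 * 3 * mem_access) / 100                 # 420.0 ns

# 80% of instructions see no page fault; the other 20% also pay the fault service time.
mem_time = (80 * emat + 20 * (emat + fault_service)) / 100           # 520.0 ns
print(cpu_time + mem_time)                                           # 720.0 ns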
Please correct me if I am making any mistake.
I'm using IPython parallel in an optimisation algorithm that loops a large number of times. Parallelism is invoked in the loop using the map method of a LoadBalancedView (twice), a DirectView's dictionary interface, and an invocation of a %px magic. I'm running the algorithm in an IPython notebook.
I find that the memory consumed by both the kernel running the algorithm and one of the controllers increases steadily over time, limiting the number of loops I can execute (since available memory is limited).
Using heapy, I profiled memory use after a run of about 38,000 loop iterations:
Partition of a set of 98385344 objects. Total size = 18016840352 bytes.
Index Count % Size % Cumulative % Kind (class / dict of class)
0 5059553 5 9269101096 51 9269101096 51 IPython.parallel.client.client.Metadata
1 19795077 20 2915510312 16 12184611408 68 list
2 24030949 24 1641114880 9 13825726288 77 str
3 5062764 5 1424092704 8 15249818992 85 dict (no owner)
4 20238219 21 971434512 5 16221253504 90 datetime.datetime
5 401177 0 426782056 2 16648035560 92 scipy.optimize.optimize.OptimizeResult
6 3 0 402654816 2 17050690376 95 collections.defaultdict
7 4359721 4 323814160 2 17374504536 96 tuple
8 8166865 8 196004760 1 17570509296 98 numpy.float64
9 5488027 6 131712648 1 17702221944 98 int
<1582 more rows. Type e.g. '_.more' to view.>
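(For reference, a snapshot like this can be produced with guppy/heapy along these lines; a sketch, not necessarily the exact code used:)

# Print a summary of all live Python objects in the kernel process.
from guppy import hpy

h = hpy()
# ... after ~38,000 loop iterations ...
print(h.heap())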
You can see that about half the memory is used by IPython.parallel.client.client.Metadata instances. A good indicator that results from the map invocations are being cached is the 401177 OptimizeResult instances, the same as the number of optimize invocations via lbview.map; I am not caching them in my code.
Is there a way I can control this memory usage on both the kernel and the IPython parallel controller (whose memory consumption is comparable to the kernel's)?
IPython parallel clients and controllers store results and other metadata from past transactions.
The IPython.parallel.Client class provides a method for clearing this data:
Client.purge_everything()
documented here. There are also purge_results() and purge_local_results() methods that give you some control over what gets purged.
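A minimal sketch of calling it periodically from inside an optimisation loop; the client/view names, the purge interval and the placeholder objective are illustrative, not taken from the question:

from IPython.parallel import Client

def objective(x):
    return x * x                         # stand-in for the real per-task work

rc = Client()
lbview = rc.load_balanced_view()

for i in range(38000):
    results = lbview.map(objective, range(10), block=True)
    # ... update the optimisation state from `results` ...
    if i and i % 100 == 0:
        # Drop cached results and Metadata on both the client and the hub;
        # purge_results() / purge_local_results() allow finer-grained cleanup.
        rc.purge_everything()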
I found this question in one of my previous exam papers and I am not really sure if I got the right answer to it. As far as I can see, 2^15 is 32768, which is 32 MB, so the answer could be 15 bits. But I think I'm missing something here?
32768 bytes is not 32 MB.
32 MB = 32 * 1024 KB = 32 * 1024 * 1024 bytes = 2^5 * 2^10 * 2^10 = 2^25 bytes.
That is, 33,554,432 bytes = 32 MB.
So you will need at least 25 bits to address a single byte in that memory scheme.
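A quick Python check of that figure (assuming byte-addressable memory):

import math

size_bytes = 32 * 1024 * 1024            # 32 MB
print(size_bytes)                        # 33554432
print(math.log2(size_bytes))             # 25.0 -> 25 address bits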
Yes, you're off by several powers of two: 32768 bytes <> 32 MB.
1 M is 2^20 and 32 is 2^5, so you need 25 bits.
Since 1 MB = 2^20 bytes, for 32 MB we have:
32 = 2^5
1 MB = 2^20 bytes, so
32 MB = 2^5 * 2^20 = 2^25 bytes.
BUT the question asks "How many address bits..." not bytes; if you treat the memory as bit-addressable, you multiply by 8 = 2^3 (because 1 byte = 8 bits), that is
32 Mbytes = 2^5 * 2^20 * 2^3 = 2^28 bits.
Thus, 28 bits are needed in that case.