Eigen empty sparse matrix memory usage

I'm trying to understand the memory usage of my program, which uses Eigen, and there's one Eigen-related part that I don't understand.
I'm creating an empty SparseMatrix<short,RowMajor>(2,3), and it costs 12 bytes. The inner and outer indices are int.
I was expecting 8 bytes, and I don't understand why my estimate is one integer lower than the actual cost. Here is my calculation:
Cost of nonzero values = 0 bytes
Cost of inner index = 0 bytes
Cost of outer index = 2 rows * 4 bytes = 8 bytes
Total cost = 8 bytes
I guess my mistake is in the inner index cost, but I don't understand why. The whole matrix is empty, so the inner array should be empty too?
Thanks in advance.

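The inner index array is indeed empty. The difference comes from the outer index buffer, which has one more entry than the number of rows in order to store the end position of the last row. So the outer index costs (2 rows + 1) * 4 bytes = 12 bytes.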
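As a rough sanity check, the buffer sizes of a compressed row-major layout can be tallied as below. This is a Python sketch, not Eigen code: the function name csr_buffer_bytes and its parameters are made up for illustration, and it counts only the three arrays (values, inner indices, outer indices), not the matrix object itself or any allocator overhead.

```python
def csr_buffer_bytes(outer_size, nonzeros, value_bytes, index_bytes):
    """Bytes used by the three arrays of a compressed sparse layout.

    outer_size  -- number of rows for a row-major matrix (hypothetical parameter name)
    nonzeros    -- number of stored coefficients
    value_bytes -- size of one coefficient, e.g. 2 for short
    index_bytes -- size of one index, e.g. 4 for int
    """
    values = nonzeros * value_bytes
    inner = nonzeros * index_bytes
    # One start position per row, plus one extra entry marking the end of the last row.
    outer = (outer_size + 1) * index_bytes
    return values + inner + outer


# Empty 2x3 row-major matrix of short with int indices, as in the question.
empty_cost = csr_buffer_bytes(outer_size=2, nonzeros=0, value_bytes=2, index_bytes=4)
print(empty_cost)  # 12
```

The extra entry is what lets the length of row i be computed as outer[i + 1] - outer[i] for every row, including the last one.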

Related

Does ID3D11DeviceContext::DrawIndexed() have UB if I use 16 bit Indices with an Offset?

ID3D11DeviceContext::DrawIndexed() has a parameter StartIndexLocation, which adds a value to each Index when drawing.
What happens if I use 16-bit indices?
The highest value 16 bits can represent is 65535. What if my draw call has 10000 vertices and I use a StartIndexLocation of 65000? Will it invoke UB?

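StartIndexLocation is not a byte position but an index position: it is counted in indices from the start of the index buffer (cross-reference: DirectX Index Buffer -> Start of Index Buffer). It is also not added to the index values themselves; it only selects where in the index buffer reading begins.
So the maximum StartIndexLocation is not related to the stride of the index buffer.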
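The point can be illustrated with a small Python model of how indices are fetched. This is only a conceptual sketch, not Direct3D code: fetch_vertex_ids is a made-up helper, the index buffer contents are arbitrary sample data, and the model assumes the documented roles of the DrawIndexed parameters (StartIndexLocation picks the first index to read, BaseVertexLocation is the value added to each index).

```python
from array import array


def fetch_vertex_ids(index_buffer, index_count, start_index_location, base_vertex_location=0):
    """Model of the index fetch: read index_count entries starting at
    start_index_location, then offset each value by base_vertex_location."""
    ids = []
    for i in range(index_count):
        raw_index = index_buffer[start_index_location + i]  # position in the buffer
        ids.append(raw_index + base_vertex_location)         # value used to pick a vertex
    return ids


# 70000 unsigned 16-bit indices; every stored value fits easily in 16 bits.
index_buffer = array("H", [i % 3 for i in range(70000)])

# Starting at position 65000 is fine: the position is not stored in 16 bits,
# only the index values inside the buffer are.
ids = fetch_vertex_ids(index_buffer, index_count=6, start_index_location=65000)
print(ids)  # [2, 0, 1, 2, 0, 1]
```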

What is the relation between address lines and memory?

These are my assignments:
Write a program to find the number of address lines in an n Kbytes of memory. Assume that n is always to the power of 2.
Sample input: 2
Sample output: 11
I don't need specific coding help, but I don't know the relation between address lines and memory.

To put it in very simple terms: without any bus multiplexing, the number of bits required to address a memory is the number of address lines required to access that memory.
Quoting from the Wikipedia article,
a system with a 32-bit address bus can address 2^32 (4,294,967,296) memory locations.
For a simple example, suppose you have 3 address lines (A, B, C). The values that can be formed using 3 bits are:
A B C
0 0 0
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1
Total: 8 values. So using A, B and C, you can select any of those eight values, i.e., you can reach any of eight memory addresses.
So, TL;DR, the simple relationship is: with n address lines, we can represent 2^n addresses.
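The table above can be reproduced for any number of lines with a few lines of Python. This is just an illustrative sketch; address_patterns is a made-up helper name.

```python
from itertools import product


def address_patterns(lines):
    """Return every bit pattern that `lines` address lines can carry."""
    return ["".join(bits) for bits in product("01", repeat=lines)]


patterns = address_patterns(3)
print(patterns)       # ['000', '001', '010', '011', '100', '101', '110', '111']
print(len(patterns))  # 8, i.e. 2^3
```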

An address line usually refers to a physical connection between a CPU/chipset and memory. They specify which address to access in the memory. So the task is to find out how many bits are required to pass the input number as an address.
In your example, the input is 2 kilobytes = 2048 = 2^11, hence the answer 11. If your input is 64 kilobytes, the answer is 16 (65536 = 2^16).
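One possible way to turn this into a program is sketched below in Python. It assumes, as the assignment states, that n is a power of 2 and that 1 Kbyte means 1024 bytes; the function name address_lines is made up for illustration.

```python
def address_lines(n_kbytes):
    """Number of address lines needed for n_kbytes of byte-addressable memory."""
    total_bytes = n_kbytes * 1024
    # The assignment guarantees a power of 2; reject anything else explicitly.
    if total_bytes <= 0 or total_bytes & (total_bytes - 1) != 0:
        raise ValueError("memory size must be a positive power of 2")
    # For a power of 2, bit_length() - 1 is exactly log2(total_bytes).
    return total_bytes.bit_length() - 1


print(address_lines(2))   # 11
print(address_lines(64))  # 16
```

Using bit_length() rather than floating-point log2 keeps the arithmetic exact for large sizes.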

Direct Mapped Cache of Blocks Example

I have a question in my homework assignment that I have been struggling with a bit. I looked over my lecture content and notes and was able to use them to answer the questions; however, I am not 100% sure that I did everything correctly. There are two parts (parts C and D) that I was not able to figure out even after consulting my notes and online sources. I am not looking for a solution to those two parts by any means, but it would be greatly appreciated if I could get at least a nudge in the right direction on how to go about solving them.
I know this is a rather large question; however, I hope someone could check my answers and tell me whether my work and my way of looking at this problem are correct. As always, thank you for any help :)
Alright, now that we have the formalities out of the way,
--------------------------Here is the Question:--------------------------
Suppose a small direct-mapped cache of blocks with 32 blocks is constructed. Each cache block stores
eight 32-bit words. The main memory—which is byte addressable—is 16,384 bytes in size. 32-bit words are stored
word aligned in memory, i.e., at an address that is divisible by 4.
(a) How many 32-bit words can the memory store (in decimal)?
(b) How many address bits would be required to address each byte of memory?
(c) What is the range of memory addresses, in hex? That is, what are the addresses of the first and last bytes of
memory? I'll give you a hint: memory addresses are numbered starting at 0.
(d) What would be the address of the last word in memory?
(e) Using the cache mapping scheme discussed in the Chapter 5 lecture notes, how many and which address bits
would be used to form the block offset?
(f) How many and which memory address bits would be used to form the cache index?
(g) How many and which address bits would be used to form the tag field for each cache block?
(h) To which cache block (in decimal) would memory address 0x2A5C map to?
(i) What would be the block offset (in decimal) for 0x2A5C?
(j) How many other main memory words would map to the same block as 0x2A5C?
(k) When the word at 0x2A5C is moved into a cache block, what are the memory addresses (in hex) of the other
words which will also be moved into this block? Express your answer as a range, e.g., [0x0000, 0x0200].
(l) The first word of a main memory block that is mapped to a cache block will always be at an address that is
divisible by __ (in decimal)?
(m) Including the V and tag bits of each cache block, what would be the total size of the cache (in bytes)
(n) what would be the size allocated for the data bits (in bytes)?
----------------------My answers and work-----------------------------------
a) memory = 16384 bytes. 16384 bytes into bits = 131072 bits. 131072/32 = 4096 32-bit words
b) 2^14 (main memory) * 2^2 (4 bits/word) = 2^16. take log(base2)(2^16) = 16 bits
c) couldn't figure this part out (would appreciate some input, NOT A SOLUTION, on how I can go about looking at this problem)
d) could not figure this part out either :(
e)8 words in each cache line. 8 * 4(2^2 bits/word) = 32 bits in each cache line. log(base2)(2^5) = 5 bits used for block offset.
f) # of blocks = 2^5 = 32 blocks. log(base2)(2^5) = 5 bits for cache index
g) tag = 16 - 5 - 5 - 2(word alignment) = 4 bits
h) 0x2A5C
0010 10100 10111 00
tag index offset word aligned bits
maps to cache block index = 10100 = 0x14
i) maps to block offset = 10111 = 0x17
j) 4 tag bits, 5 block offset = 2^9 other main memory words
k) it is a permutation of the block offsets. so it maps the memory addresses with the same tag and cache index bits and block offsets of 0x00 0x01 0x02 0x04 0x08 0x10 0x11 0x12 0x14 0x18 0x1C 0x1E 0x1F
l)divisible by 4
m) 2(V+tag+data) = 2(1+4+2^3*2^5) = 522 bits = 65.25 bytes
n)data bits = 2^5 blocks * 2^3 words per block = 256 bits = 32 bytes

Part C:
If a memory has M bytes, and the memory is byte addressable, then the memory addresses range from 0 to M - 1.
For your question, this means that memory addresses range from 0 to 16383, or in hex 0x0 to 0x3FFF.
Part D:
Words are 4 bytes long. So given your answer to C, the last word is at:
(0x3FFF - 3) -> 0x3FFC.
You can see that this is correct because the lowest 2 bits of the address are 0, which must be true of any 4-byte-aligned address.
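The arithmetic for parts C and D can be checked with a short Python sketch. The constant names are made up for illustration; the values come from the question (16,384 bytes of byte-addressable memory, 4-byte words).

```python
MEMORY_BYTES = 16384  # size of main memory, from the question
WORD_BYTES = 4        # a 32-bit word

first_byte = 0
last_byte = MEMORY_BYTES - 1           # addresses run from 0 to M - 1
last_word = MEMORY_BYTES - WORD_BYTES  # start address of the final 4-byte word

print(hex(first_byte), hex(last_byte))  # 0x0 0x3fff
print(hex(last_word))                   # 0x3ffc

# A word-aligned address is divisible by 4, so its two lowest bits are 0.
print(last_word % WORD_BYTES == 0)      # True
```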

Cache Addressing: Length of Index, Block offset, Byte offset & Tag?

Let's say I know the following values:
W = Word length (= 32 bits)
S = Cache size in words
B = Block size in words
M = Main memory size in words
How do I calculate how many bits are needed for:
- Index
- Block offset
- Byte offset
- Tag
a) in Direct Mapped Cache
b) in Fully Associative Cache?

The address may be split up into the following parts:
[ tag | index | block or line offset | byte offset ]
Number of byte offset bits
0 for word-addressable memory, log2(bytes per word) for byte addressable memory
Number of block or line offset bits
log2(words per line)
Number of index bits
log2(CS), where CS is the number of cache sets.
For Fully Associative, CS is 1. Since log2(1) is 0, there are no index bits.
For Direct Mapped, CS is equal to CL, the number of cache lines, so the number of index bits is log2(CS) = log2(CL).
For n-way Associative, CS = CL ÷ n, giving log2(CL ÷ n) index bits.
The number of cache lines can be calculated by dividing the cache size by the block size: CL = S/B (assuming neither includes the space for tag and valid bits).
Number of tag bits
Length of the address minus the number of bits used for the offset(s) and the index. The length of the address can be calculated from the size of the main memory, since, for example, every byte needs to be addressable if the memory is byte addressable.
Source: http://babbage.cs.qc.edu/courses/cs343/cache_parameters.xhtml
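The rules above can be collected into a small Python sketch. The function and parameter names are made up for illustration, all sizes are assumed to be powers of 2, and the sample numbers (a 32-block cache of eight 32-bit words per block with 4096 words of byte-addressable memory) are borrowed from the direct-mapped question earlier on this page.

```python
def ilog2(x):
    """Exact log2 for a positive power of 2."""
    if x <= 0 or x & (x - 1) != 0:
        raise ValueError("expected a positive power of 2")
    return x.bit_length() - 1


def cache_fields(word_bits, cache_words, block_words, memory_words,
                 fully_associative=False, byte_addressable=True):
    """Return the bit widths of the tag, index, block offset and byte offset."""
    byte_offset = ilog2(word_bits // 8) if byte_addressable else 0
    block_offset = ilog2(block_words)
    lines = cache_words // block_words           # CL = S / B
    sets = 1 if fully_associative else lines     # CS = 1 or CS = CL
    index = ilog2(sets)
    address_bits = ilog2(memory_words) + byte_offset
    tag = address_bits - index - block_offset - byte_offset
    return {"tag": tag, "index": index,
            "block_offset": block_offset, "byte_offset": byte_offset}


direct = cache_fields(word_bits=32, cache_words=256, block_words=8, memory_words=4096)
fully = cache_fields(word_bits=32, cache_words=256, block_words=8, memory_words=4096,
                     fully_associative=True)
print(direct)  # {'tag': 4, 'index': 5, 'block_offset': 3, 'byte_offset': 2}
print(fully)   # {'tag': 9, 'index': 0, 'block_offset': 3, 'byte_offset': 2}
```

In the fully associative case the index bits disappear and the tag absorbs them, which is why the tag grows from 4 to 9 bits.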

iOS calculating sum of filesizes always negative

I've got a strange problem here, and I'm sure it's just something small.
I receive information about files via JSON (RestKit is doing a good job).
I write the file size of each file to a local store via Core Data.
Afterwards, in one of my view controllers, I need to sum up the file sizes of all files in the database. I fetch all files and then go through a loop (for) to sum up the sizes.
The problem is that the result is always negative!
The Core Data entity attribute filesize is of type Integer 32 (the file size is reported in bytes by the JSON).
I read the fetch result into an NSArray allPublicationsToLoad and then try to sum it up. The objects in the NSArray are of type CDPublication and have a value filesize of type NSNumber:
for(int n = 0; n < [allPublicationsToLoad count]; n = n + 1)
{
    CDPublication* thePub = [allPublicationsToLoad objectAtIndex:n];
    allPublicationsSize = allPublicationsSize + [[thePub filesize] integerValue];
    sum = [NSNumber numberWithFloat:([sum floatValue] + [[thePub filesize] floatValue])];
}
The filesize of each individual CDPublication object is positive and correct. Only the sum of all the file sizes is negative afterwards. There are around 240 objects right now, with filesize values between 4000 and 234.645.434.123.
Can somebody please give me a hint in the right direction?
Is the problem that Integer 32 or NSNumber can't hold such a huge range?
Thanks
MadMaxApp

A 32-bit integer can't hold such a huge number. Because of the way negative numbers are stored, the result is negative.
Negative numbers are stored using two's complement, which makes addition of positive and negative numbers easier. The range of values a signed integer can hold is split in two: the upper half (the bit patterns whose highest-order bit is 1) is interpreted as negative, and the lower half (highest-order bit 0) holds the normal non-negative numbers. If you add sufficiently large numbers, the result lands in the upper half and is therefore interpreted as a negative number. Here's an illustration for the 4-bit case (32 bits works exactly the same, but there would be a lot more 0s and 1s to type ;))
With 4 bits you can represent this range of signed integers:
0000 (=0)
0001 (=1)
0010 (=2)
...
0111 (=7)
1000 (=-8)
1001 (=-7)
...
1111 (=-1)
The maximum positive integer you can represent is 7 in this case. If you would add 5 and 4 for example you would get:
0101 + 0100 = 1001
1001 equals -7 when signed integers are represented like this (and not 9, as you would expect). That's the effect you are observing, but on a much larger scale (32 bits).
Your only option to get correct results in this case is to increase the number of bits used to represent your integers, so that the result doesn't land in the negative range of bit patterns. So if 32 bits is not enough (as in your case), you can use a long long (64 bits).
[myNumber longLongValue];
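The 4-bit illustration can be reproduced with a small Python sketch. Python integers do not overflow, so the wraparound is simulated with a bit mask; add_wrapped and to_signed are made-up helper names.

```python
def add_wrapped(a, b, bits):
    """Add two integers and keep only the lowest `bits` bits, as fixed-width hardware does."""
    mask = (1 << bits) - 1
    return format((a + b) & mask, f"0{bits}b")


def to_signed(bit_pattern):
    """Interpret a bit string as a two's complement signed integer."""
    value = int(bit_pattern, 2)
    if bit_pattern[0] == "1":            # highest-order bit set: negative half
        value -= 1 << len(bit_pattern)
    return value


pattern = add_wrapped(5, 4, bits=4)
print(pattern)             # 1001
print(to_signed(pattern))  # -7
print(to_signed("0111"), to_signed("1000"), to_signed("1111"))  # 7 -8 -1
```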

I think this has to do with int overflow: very large integers get reinterpreted as negatives when they overflow the size of int (32 bits). Use longLongValue instead of integerValue:
long long allPublicationsSize = 0;
for(int n = 0; n < [allPublicationsToLoad count]; n++) {
    CDPublication* thePub = [allPublicationsToLoad objectAtIndex:n];
    allPublicationsSize += [[thePub filesize] longLongValue];
}

This is an integer overflow issue associated with the use of two's complement arithmetic. For a 32-bit integer there are exactly 2^32 (4,294,967,296) possible integer values that can be expressed. When using two's complement, the most significant bit is used as a sign bit, which allows half of the numbers to represent non-negative integers (when the sign bit is 0) and the other half to represent negative numbers (when the sign bit is 1). This gives an effective range of [-2^31, 2^31-1] or [-2,147,483,648, 2,147,483,647].
To overcome this problem for your case, you should consider using a 64-bit integer. This should work well for the range of values you seem to be interested in using. Alternatively, if even 64-bit is not sufficient, you should look for big integer libraries for iOS.
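The difference between a 32-bit and a 64-bit accumulator can be demonstrated with a Python sketch that uses the fixed-width types from the standard ctypes module to mimic C integer behavior. The file sizes below are made-up sample values, not data from the question, and sum_sizes is a hypothetical helper.

```python
import ctypes


def sum_sizes(sizes, int_type):
    """Accumulate sizes in a fixed-width integer type, wrapping on overflow."""
    total = int_type(0)
    for size in sizes:
        # ctypes integer types silently truncate to their width, like C does in practice.
        total = int_type(total.value + size)
    return total.value


sizes = [2_000_000_000, 2_000_000_000]  # hypothetical file sizes in bytes

sum32 = sum_sizes(sizes, ctypes.c_int32)
sum64 = sum_sizes(sizes, ctypes.c_int64)
print(sum32)  # -294967296
print(sum64)  # 4000000000
```

Each individual size fits in 32 bits, yet the 32-bit total goes negative, which matches the symptom described in the question.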
