Virtual Memory - calculate number of pages in page table - memory

Virtual address space is 64bits
Page size is 64KB
Word size is 4bytes
How many pages are in the page table?
At first I thought:
page size = 64KB = 2^16bytes, so the offset uses 16 bits of the 64
Therefore, 48 bits left -> there are 2^48 pages in the page table
(I didn't understand where to use the info about the word size)
However, the correct answer is that there are 2^50 pages, which confuses me.
Then I thought that maybe the page offset is only 14 bits because the word size is 4 bytes = 2^2 bytes, so there really are 2^50 pages in the page table.
Am I right? Can I get a better explanation?

The offset within a page uses 14 bits of the 64, not 16, because the minimum addressable unit is a 4-byte word: a 64KB page holds 2^16 / 2^2 = 2^14 words, which removes 2 bits from the offset. The page number gets the remaining 50 bits, so there are 2^50 pages.
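The arithmetic can be sketched in C as follows. This is only an illustration that assumes the machine is word-addressable, as the expected answer implies; the function names log2_pow2 and page_number_bits are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Integer log2; the input is assumed to be an exact power of two. */
unsigned log2_pow2(uint64_t x)
{
    assert(x != 0 && (x & (x - 1)) == 0);
    unsigned bits = 0;
    while (x > 1) {
        x >>= 1;
        bits++;
    }
    return bits;
}

/*
 * Width of the virtual page number in bits, assuming each address
 * names one word of word_bytes bytes rather than one byte.
 */
unsigned page_number_bits(unsigned addr_bits, uint64_t page_bytes,
                          uint64_t word_bytes)
{
    assert(word_bytes != 0 && page_bytes % word_bytes == 0);
    uint64_t words_per_page = page_bytes / word_bytes; /* 2^16 / 2^2 */
    unsigned offset_bits = log2_pow2(words_per_page);
    return addr_bits - offset_bits;
}
```

With the numbers from the question, page_number_bits(64, 65536, 4) gives 50, while a byte-addressable machine (word_bytes = 1) would give the 48 from the first attempt.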

Related

Assuring stack pointer alignment using bitwise operators

Assume I want to reserve 8 bytes on the stack and I also want to make sure the current stack pointer is 8-byte aligned. I have seen some code that ensures the current sp is 8-byte aligned using this logic:
sp = sp & -8;
They AND it with the amount they are going to reserve on the stack (which of course is negative).
How does this logic work?
It works because negative numbers are represented in two's complement, so -8 is equivalent to ~7, of which the 3 least significant bits are 0s and the rest are 1s. ANDing this with a value clears the 3 least significant bits, which obviously results in it being 8-byte aligned. By the way, this trick only works to align things to powers of 2. If you had some strange reason to align things to a 12-byte boundary, for example, sp = sp & -12 would not work as intended.
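A small sketch of the masking trick in C, using unsigned arithmetic so the negation is well defined. The helper names align_down and and_with_negated are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to a multiple of align; align must be a power of two. */
uintptr_t align_down(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    /* 0 - align equals ~(align - 1): low bits are 0, all others are 1. */
    return addr & (0 - align);
}

/* The same mask without the power-of-two check, to show where it breaks. */
uintptr_t and_with_negated(uintptr_t addr, uintptr_t n)
{
    return addr & (0 - n);
}
```

For example, align_down(100, 8) returns 96, whereas and_with_negated(100, 12) leaves the value at 100, which is not a multiple of 12.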

JPEG2000 : Can number of tiles in X direction be zero?

According to the JPEG2000 specs, the number of tiles in the X and Y directions is calculated by the following formulas:
numXtiles = ceil((Xsiz − XTOsiz) / XTsiz)
and
numYtiles = ceil((Ysiz − YTOsiz) / YTsiz)
But the range of numXtiles and numYtiles is not mentioned.
Can we have numXtiles=0 while numYtiles=250 (or any other value)?
In short, no. You will always need at least one row and one column of tiles to place your image in the canvas.
In particular, the SIZ marker of the JPEG 2000 stream syntax does not directly define the number of tiles, but rather the size of each tile. Since the tile width and height are defined to be larger than 0 (see page 453 of "JPEG 2000 Image compression fundamentals, standards and practice", by David Taubman and Michael Marcellin), you will always have at least one tile.
That said, depending on the particular implementation that you are using, there may be a parameter numXtiles that you can set to 0 without crashing your program. In that case, the parameter is most likely being ignored or interpreted differently.
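To see why the count cannot reach zero, here is a rough C sketch of the tile-count computation. It assumes the standard's formula rounds up (a ceiling), that the tile size is at least 1, and that the image extent Xsiz is larger than the tile offset XTOsiz; the function name num_tiles is invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of tiles along one axis: ceil((siz - tosiz) / tsiz).
 * Assumes tsiz >= 1 and siz > tosiz, as stated in the lead-in.
 */
uint32_t num_tiles(uint32_t siz, uint32_t tosiz, uint32_t tsiz)
{
    assert(tsiz >= 1);
    assert(siz > tosiz);
    uint32_t extent = siz - tosiz;  /* at least 1 */
    return (extent - 1) / tsiz + 1; /* ceiling division without overflow */
}
```

Because extent is at least 1, the result is at least 1 no matter how large the tile is: num_tiles(1, 0, 4096) is 1, and num_tiles(1000, 0, 128) is 8.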

Calculating the table entry size

I got a question like this and need to calculate the page table entry size.
Microsoft Windows 98 used a 32-bit memory address space, and the default page size was 4KB. Suppose it has 256MB of physical memory.
i) What is the size of an entry in the page table?
Is this equal to the page offset?
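On 32-bit Intel (x86), a page table entry is 32 bits (4 bytes).
Note that 32 - 12 = 20 bits is the width of the virtual page number, i.e. the index into the page table, not the size of an entry, and neither equals the 12-bit page offset. With 256MB = 2^28 bytes of physical memory and 4KB = 2^12-byte pages there are 2^16 frames, so an entry needs at least 16 bits for the frame number, plus whatever flag bits (present, dirty, and so on) the system keeps.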
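The numbers for this question can be worked out with a short C sketch. The helper names ilog2, vpn_bits and frame_number_bits are hypothetical, and the sketch assumes byte addressing and power-of-two sizes.

```c
#include <assert.h>
#include <stdint.h>

/* Integer log2; the input is assumed to be an exact power of two. */
unsigned ilog2(uint64_t x)
{
    assert(x != 0 && (x & (x - 1)) == 0);
    unsigned bits = 0;
    while (x > 1) {
        x >>= 1;
        bits++;
    }
    return bits;
}

/* Bits of the virtual address used to index the page table. */
unsigned vpn_bits(unsigned addr_bits, uint64_t page_bytes)
{
    return addr_bits - ilog2(page_bytes);
}

/* Minimum bits an entry needs to name a physical frame (flags excluded). */
unsigned frame_number_bits(uint64_t phys_bytes, uint64_t page_bytes)
{
    return ilog2(phys_bytes / page_bytes);
}
```

Here vpn_bits(32, 4096) is 20 and frame_number_bits(256 MB, 4096) is 16, so the 20 describes how many entries the table has (2^20), while the entry itself must hold a 16-bit frame number plus flags.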

Cuda: Operating on images (linearized 2d arrays) with a single column of constant values

I am processing images, which are long, usually a few hundred thousand pixels in length. The height is usually in the 500-1000 pixel range. The process involves modifying the images on a column by column basis. So, for example, I have a column of constant values that needs to be subtracted from each column in the image.
Currently I split the image into smaller blocks, put them into linearized 2d arrays. Then I make a linearized 2d array from the column of constant values that is the same size as the smaller block. Then a (image array - constant array) operation is done until the full image is processed.
Should I copy the constant column to the device, and then just operate column by column? Or should I try to make as large of a "constant array" as possible, and then perform the subtraction. I am not looking for 100% optimization or even close to that, but an idea about what the right approach to take is.
How can I optimize this process? Any resources to learn more about this type of processing would be appreciated.
Constant memory is up to 64KB, so assuming your pixels are 4 bytes or less, then you should be able to handle an image height up to about 16K pixels, and still put the entire "constant column" in constant memory.
After that, you don't need to process things "column by column". Constant memory is optimized for access when every thread is requesting the same value from constant memory, which perfectly describes your case.
Therefore, your thread code can be trivially simple:
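#define MAX_COL_SIZE 1024

// One value per image row, shared by every column
__constant__ float const_column[MAX_COL_SIZE];

// One thread per image column; col_size is the image height in pixels
__global__ void img_col_kernel(float *in, float *out, int num_cols, int col_size){
    int idx = threadIdx.x + blockDim.x*blockIdx.x;
    if (idx < num_cols)
        for (int i = 0; i < col_size; i++)
            // row i, column idx of the row-major image
            out[idx+i*num_cols] = in[idx+i*num_cols] - const_column[i];
}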
(coded in browser, not tested)
Set up const_column in your host code using cudaMemcpyToSymbol prior to calling img_col_kernel. Call the kernel with a 1D grid including a total number of threads equal to or greater than your image width (num_cols). Pass the "linearized 2D" pointers to your input and output images to the kernel (in and out). The above kernel should run pretty fast, and essentially be bound by memory bandwidth for images of width 1000 or more. For small images, you may want to increase the number of threads by dividing your image vertically into say, 4 pieces, and operate with 4 times as many threads (and 4 regions of constant memory).
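Since the kernel above was not tested, it may help to compare its output against a plain CPU version. The following C function is a sketch of such a reference, using the same row-major indexing as the kernel; the name img_col_reference and the column parameter (the host-side copy of the constant column) are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * CPU reference for the column subtraction: every pixel in row r
 * has column[r] subtracted, matching the kernel's idx + i*num_cols layout.
 */
void img_col_reference(const float *in, float *out, const float *column,
                       size_t num_cols, size_t col_size)
{
    assert(in != NULL && out != NULL && column != NULL);
    for (size_t row = 0; row < col_size; row++)
        for (size_t col = 0; col < num_cols; col++)
            out[col + row * num_cols] = in[col + row * num_cols] - column[row];
}
```

Running this on a small block of the image and comparing with the data copied back from the device is a cheap way to catch indexing mistakes.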

My preallocation of a matrix gives out of memory error in MATLAB

I use zeros to initialize my matrix like this:
height = 352
width = 288
nFrames = 120
imgYuv=zeros([height,width,3,nFrames]);
However, when I set the value of nFrames larger than 120, MATLAB gives me an error message saying out of memory.
The original function is
[imgYuv, S, A]= changeYuv(fileName, width, height, idxFrame, nFrames)
my command is
[imgYuv,S,A]=changeYuv('tilt.yuv',352,288,1:120,120);
Can anyone please tell me what's going on here?
PS: one of the purposes of the function is to load a yuv video which consists more than 2000 frames. Is there any possibility to implement that?
There are three ways to avoid the error:
1. Process a limited number of frames at any given time.
2. Work with integer arrays. Most movies are in 8-bit format, while Matlab normally works with doubles. uint8 takes 1 byte per element, while double takes 8 bytes. Thus, if you create your array as B = zeros(height,width,3,nFrames,'uint8'), it only uses 1/8th of the memory. This might work for 120 frames, though for 2000 frames, you'll run into trouble again. Note that not all Matlab functions work for integer arrays; you may have to reimplement those that require double.
3. Buy more RAM.
Yes, you (or rather, your Matlab session) are running out of memory.
Get out your calculator and find the product height x width x 3 x nFrames x 8 which will tell you how much memory you have tried to get in your call to zeros. That will be a number either close to or in excess of the RAM available to Matlab on your computer.
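The calculator step can be written as a short C sketch (the function name array_bytes is made up for this example; 8 bytes per element corresponds to Matlab's default double, 1 byte to uint8):

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for a height x width x channels x frames array. */
uint64_t array_bytes(uint64_t height, uint64_t width, uint64_t channels,
                     uint64_t frames, uint64_t bytes_per_element)
{
    assert(bytes_per_element > 0);
    /* 64-bit arithmetic: the 2000-frame case does not fit in 32 bits. */
    return height * width * channels * frames * bytes_per_element;
}
```

For the values in the question, array_bytes(352, 288, 3, 120, 8) is 291962880 bytes (about 292MB), and with 2000 frames it grows to 4866048000 bytes (about 4.9GB) as double or 608256000 bytes as uint8.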
Your command is:
[imgYuv,S,A]=changeYuv('tilt.yuv',352,288,1:120,120);
That is:
352*288*3*120 = 36495360 elements
At 8 bytes per double, that is about 292MB, which MATLAB must allocate as a single contiguous block. For 2000 frames the same array would need about 4.9GB. That is a lot of memory...
Referencing the code I've seen in your withdrawn post, you're calculating the difference between adjacent frame histograms. One option to avoid massive memory allocation might be to just hold two frames in memory, instead of reading all the frames at once.
The function B = zeros([d1 d2 d3...]) creates a multi-dimensional array of size d1-by-d2-by-d3-by-..., so it holds d1*d2*d3*... elements.
Depending on width and height, a 3rd dimension of 3 and a 4th dimension of 120 (width*height*360 elements in total) may result in a very large array. Every machine has memory limits; maybe you reached them... ;)
