I am reading an OS concepts book and came across this practice question:
Consider a logical address space of 64 pages of 1024 words each, mapped
onto a physical memory of 32 frames.
a. How many bits are there in the logical address?
b. How many bits are there in the physical address?
How do I calculate this? I have the answer, but I need a method for solving such problems.
Logical Address (LA) = PageNumber + PageOffset
Similarly,
Physical Address (PA) = FrameNumber + FrameOffset
For a page to fit exactly inside a frame, its size has to equal the frame size.
Hence, frame size = page size = 1024 words, and since 2^10 = 1024,
PageOffset = FrameOffset = 10 bits.
Number of pages = 64, so the minimum number of bits required to address any one of 64 pages is log2(64) = 6 = PageNumber.
Similarly, the minimum number of bits required to address any one of 32 frames is log2(32) = 5 = FrameNumber.
Hence, using the equations above:
LA = 6 + 10 = 16 bits
PA = 5 + 10 = 15 bits
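The same method in a few lines of C, as a sanity check (a quick sketch; for powers of two, log2 simply counts the address bits):

#include <stdio.h>
#include <math.h>

int main(void) {
    int pages = 64, frames = 32, wordsPerPage = 1024;

    int offsetBits = (int)log2(wordsPerPage);  /* 2^10 = 1024 -> 10 */
    int pageBits   = (int)log2(pages);         /* 2^6  = 64   -> 6  */
    int frameBits  = (int)log2(frames);        /* 2^5  = 32   -> 5  */

    printf("logical address:  %d bits\n", pageBits  + offsetBits);  /* 16 */
    printf("physical address: %d bits\n", frameBits + offsetBits);  /* 15 */
    return 0;
}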
I have a task: to multiply a big row vector (10,000 elements) by a big column-major matrix (10,000 rows, 400 columns). I decided to go with ARM NEON since I'm curious about this technology and would like to learn more about it.
Here's a working example of vector matrix multiplication I wrote:
//float* vec_ptr - a pointer to vector
//float* mat_ptr - a pointer to matrix
//float* out_ptr - a pointer to output vector
//int matCols - matrix columns
//int vecRows - vector rows, the same as matrix rows
for (int i = 0, max_i = matCols; i < max_i; i++) {
for (int j = 0, max_j = vecRows - 3; j < max_j; j+=4, mat_ptr+=4, vec_ptr+=4) {
float32x4_t mat_val = vld1q_f32(mat_ptr); //get 4 elements from matrix
float32x4_t vec_val = vld1q_f32(vec_ptr); //get 4 elements from vector
float32x4_t out_val = vmulq_f32(mat_val, vec_val); //multiply vectors
float32_t total_sum = vaddvq_f32(out_val); //sum elements of vector together
out_ptr[i] += total_sum;
}
vec_ptr = &myVec[0]; //reset the vector pointer to element zero for the next column
}
The problem is that it's taking a very long time to compute: 30 ms on an iPhone 7+, when my goal is 1 ms or even less if possible. The current execution time is understandable, since I run the inner multiplication 400 * (10,000 / 4) = 1,000,000 times.
I also tried processing 8 elements at a time instead of 4. It seems to help, but the numbers are still very far from my goal.
I understand that I might be making some horrible mistakes, since I'm a newbie with ARM NEON. I would be happy if someone could give me some tips on how to optimize my code.
Also: is it worth doing big vector-matrix multiplication with ARM NEON? Does this technology fit well for such a purpose?
Your code is completely flawed: it iterates 16 times, assuming both matCols and vecRows are 4. What's the point of SIMD then?
The major performance problem lies in float32_t total_sum = vaddvq_f32(out_val);
You should never convert a vector to a scalar inside a loop, since it causes a pipeline hazard that costs around 15 cycles every time.
The solution (for a 4x4 matrix and a 4-element vector):
float32x4x4_t myMat;
float32x2_t myVecLow, myVecHigh;

myVecLow  = vld1_f32(&pVec[0]);  // vector elements 0 and 1
myVecHigh = vld1_f32(&pVec[2]);  // vector elements 2 and 3
myMat     = vld4q_f32(pMat);     // deinterleaving load: transposes the 4x4 block on the fly

myMat.val[0] = vmulq_lane_f32(myMat.val[0], myVecLow, 0);                // col 0 * v[0]
myMat.val[0] = vmlaq_lane_f32(myMat.val[0], myMat.val[1], myVecLow, 1);  // += col 1 * v[1]
myMat.val[0] = vmlaq_lane_f32(myMat.val[0], myMat.val[2], myVecHigh, 0); // += col 2 * v[2]
myMat.val[0] = vmlaq_lane_f32(myMat.val[0], myMat.val[3], myVecHigh, 1); // += col 3 * v[3]

vst1q_f32(pDst, myMat.val[0]);   // store the 4 results
- Compute all four rows in a single pass
- Do a matrix transpose (rotation) on the fly via vld4
- Do vector-scalar multiply-accumulate instead of vector-vector multiply plus horizontal add, which causes the pipeline hazards
You were asking if SIMD is suitable for matrix operations? A simple "yes" would be a monumental understatement. You don't even need a loop for this.
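Applied to the sizes in the question, the same advice looks roughly like this. This is an untested sketch (the function name and signature are mine), assuming vecRows is a multiple of 4: keep the four running sums in a register and do a single horizontal add per output element instead of one per iteration.

#include <arm_neon.h>

// vec_ptr: row vector, vecRows elements
// mat_ptr: column-major matrix, vecRows x matCols
// out_ptr: output vector, matCols elements
void vecMatMul(const float *vec_ptr, const float *mat_ptr,
               float *out_ptr, int vecRows, int matCols) {
    for (int i = 0; i < matCols; ++i) {
        const float *col = mat_ptr + i * vecRows;  // start of column i
        float32x4_t acc = vdupq_n_f32(0.0f);       // 4 running partial sums
        for (int j = 0; j <= vecRows - 4; j += 4) {
            float32x4_t m = vld1q_f32(col + j);
            float32x4_t v = vld1q_f32(vec_ptr + j);
            acc = vmlaq_f32(acc, m, v);            // multiply-accumulate, stays in registers
        }
        out_ptr[i] = vaddvq_f32(acc);              // one horizontal add per column
    }
}

Unrolling the inner loop with several independent accumulators (in the spirit of your 8-element attempt) hides the multiply-accumulate latency further.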
Edit: Sorry for the late edit. Without two parameters you cannot calculate it, so first the user's camera height from the ground needs to be filled in.
I have checked a number of solutions, but none of them were helpful!
I know that Working Distance = (Sensor Height + Subject Height) * Focal Length / Sensor Height
and
distance to object (mm) = focal length (mm) * real height of the object (mm) * image height (pixels)
----------------------------------------------------------------
object height (pixels) * sensor height (mm)
And I want to get distance from this:
Image Formation by Lenses and the Eye
Hello, I get the following info from the image EXIF data using ALAssetsLibrary. Here is the metadata I got:
{
DPIHeight = 72;
DPIWidth = 72;
FaceRegions = {
Regions = {
HeightAppliedTo = 2448;
RegionList = (
{
AngleInfoRoll = 270;
AngleInfoYaw = 0;
ConfidenceLevel = 376;
FaceID = 1;
Height = "0.1413399";
Timestamp = 5996166864910;
Type = Face;
Width = "0.1060049";
X = "0.3560049";
Y = "0.4746732";
}
);
WidthAppliedTo = 3264;
};
};
Orientation = 6;
"{Exif}" = {
ApertureValue = "2.526068811667587";
BrightnessValue = "1.291629806962232";
ColorSpace = 1;
DateTimeDigitized = "2014:03:25 15:43:36";
DateTimeOriginal = "2014:03:25 15:43:36";
ExposureMode = 0;
ExposureProgram = 2;
ExposureTime = "0.05";
FNumber = "2.4";
Flash = 24;
FocalLenIn35mmFilm = 33;
FocalLength = "4.12";
ISOSpeedRatings = (
160
);
LensMake = Apple;
LensModel = "iPhone 5 back camera 4.12mm f/2.4";
LensSpecification = (
"4.12",
"4.12",
"2.4",
"2.4"
);
MeteringMode = 5;
PixelXDimension = 3264;
PixelYDimension = 2448;
SceneType = 1;
SensingMethod = 2;
ShutterSpeedValue = "4.321956949076723";
SubjectArea = (
1631,
1223,
1795,
1077
);
SubsecTimeDigitized = 261;
SubsecTimeOriginal = 261;
UserComment = hoge;
WhiteBalance = 0;
};
"{GPS}" = {
Altitude = "196.008";
AltitudeRef = 0;
DateStamp = "2014:03:25";
Latitude = "28.61772";
LatitudeRef = N;
Longitude = "77.38891";
LongitudeRef = E;
TimeStamp = "10:13:37.439000";
};
"{MakerApple}" = {
1 = 0;
3 = {
epoch = 0;
flags = 1;
timescale = 1000000000;
value = 249840592070541;
};
4 = 0;
5 = 179;
6 = 139;
7 = 1;
};
"{TIFF}" = {
DateTime = "2014:03:25 15:43:36";
Make = Apple;
Model = "iPhone 5";
Software = "7.0.6";
XResolution = 72;
YResolution = 72;
};
}
I need to calculate the distance of the object from the camera using the above details, on an iPhone 4S, iPhone 5, or iPhone 5s. Is it possible?
Modified: I need to know the formula or method used by this app; any idea how it works?
http://www.youtube.com/watch?v=eCStIagorx8
All help is welcome.
#iphonemaclover: As many people have pointed out, there's insufficient information to calculate this purely using trigonometry. However, depending on how accurate you need to be, what it is you're trying to measure, what other data you can expect to recover from an iPhone, and whether you're willing to make some assumptions, it is possible to make some inroads.
a) +1 to Martin R. If you assume a flat earth and a phone height (which, as in the app you quoted, a user could update for calibration purposes), can recover pitch information, and know where a point on the ground at the base of your object is, then this is simple trig. Once there's an estimate for the distance, and assuming the thing you're measuring is close to vertical (or sits at a known angle), its height can also be calculated.
b) Your EXIF file contains face region information. If you're interested in people and are happy to assume that they're adults, then you could use an average head size to estimate distance using the method you've outlined already.
c) If you can recover a series of images and camera positions, and the camera/object positions vary, then I believe 3D information can be recovered using projective geometry.
If you're wondering how to find the distance given the phone/tablet height and its downward pitch, just divide the height by the tangent of the downward angle (0 if it is pointed at the horizon; 30 degrees if tilted 30 degrees downwards). The only problem is, we don't know the phone's height off the ground.
I know you want to use the focus of the camera, but that is improbable. You mentioned an Android app; you'll notice that app requires you to enter the phone's height off the ground. You could try using GPS altitude minus what a topographic map says, but that is incredibly inaccurate. Or maybe you could take a picture from one location, then move around the target by 45 degrees or so, but not everyone has a compass to measure that or the ability to run circles around the target. How would you move 45 degrees around an object in front of you while standing on a bridge? Even if you did try to use the camera focus, it is updated automatically, and I know of nothing that exposes it as a variable. https://developer.apple.com/library/ios/documentation/AudioVideo/Conceptual/AVFoundationPG/Articles/04_MediaCapture.html shows how to set the focus mode, but I can't find anything on setting a custom focus or reading what the camera is focused on. If you want to take a picture and do some sort of pixel-by-pixel analysis to see how it's focused, by all means go ahead. Maybe a jailbroken iPhone can access the camera focus, but I'm guessing your target audience is larger than the select group that has gone through the jailbreaking process.
I think the easiest way to find the height is to have users point their phone down at their foot. You can ask for their shoe size and brand to estimate the width of their shoe, or just make a flat-out guess. Ask them to point at one side of their shoe, perhaps by displaying a red dot in the middle of the screen so they know exactly where they're pointing, then at the other side. Once you have captured both edges of the shoe, divide the estimated shoe width by the tangent of the angle difference between the two edges (so if the device reads 170 degrees at one edge and 190 at the other, use the tan of 20 degrees). That should give you the height off the ground in inches.
s  = shoe width;
ao = angle when pointed at the left shoe edge (y-axis) (radians);
at = angle when pointed at the right shoe edge (y-axis) (radians);
a  = downward angle of the device when pointed at the target point on the ground (x-axis) (radians);
h  = phone height off the ground;
d  = distance;

h = s / tan(at - ao);
d = h / tan(a);
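A quick C sketch of that arithmetic (my reconstruction of the geometry above, so treat it as an assumption; all angles in radians):

#include <math.h>

// s:  estimated shoe width (the result comes out in the same unit)
// ao: device angle at the left shoe edge
// at: device angle at the right shoe edge
// a:  downward pitch when pointed at the target point on the ground
double distance_to_point(double s, double ao, double at, double a) {
    double h = s / tan(at - ao);  // phone height off the ground
    return h / tan(a);            // distance along the ground to the point
}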
There's no way to get the distance while the ground-to-phone height varies dynamically from person to person. The user has to set the height of the phone manually each time for the calculation.
I know this is an old question, but something had me thinking about whether it was possible to use an iPhone's focus sensor to measure distance to an object, and this came up in my searches.
I think that in #iphonemaclover's original question, the "sensor height" does not refer to the camera's height above the ground, but to the physical height of the camera sensor inside the phone. After all, if you had the phone on a rotating arm pivoting around the subject, the distance to the subject could remain constant while the phone's vertical position changed. It's the sensor size and focal length that determine what is in its frame.
With that in mind, I did some quick experiments with my phone which seemed to bear this out (though I'll admit that I did not perform any real measurements beyond eyeball estimates and a desk ruler).
I was using an iPhone 5, and with a little Googling, we can learn some key pieces of information about its camera sensor:
pixel resolution: 3264 x 2448
focal length: 4.10mm
sensor size: 4.54 x 3.42 mm
So: the first formula above for "Working Distance" can tell us the distance at which a subject of a given size will fill the sensor's frame. With the camera in landscape orientation, and a subject 2m tall (2000mm), we get:
(3.42 + 2000) * 4.10
-------------------- = 2401.76mm, or about 2.4m
3.42
So if you stand about 2.4m away from a 2m tall person, they should fill the frame in landscape orientation. For portrait, you'll need to step up to about 1.8m away.
That's all fine and wonderful for filling the frame. But we want to calculate for arbitrary distances, by seeing how much of the frame our subject takes up. That's what the second formula is for. Let's again use our 2m tall subject, and plug numbers in. For this experiment, let's say that we've examined our image, and the subject in the photo is taking up 1800 pixels, vertically. Plugging into the second distance formula, we get:
4.10 * 2000 * 2448
------------------ = 3260.82mm, or ~ 3.26m
1800 * 3.42
And if the subject took up the same 1800 vertical pixels in portrait mode, then we could guess that the photographer was about 3.28m away.
This all assumes that the formulae given above are correct. And again, I only made rudimentary attempts at verifying the 'full-frame' measurements (once with a subject about 28 cm tall, then with a subject about 1.7 m tall), but my eyeball estimates looked reasonably close to what the formula predicted.
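For what it's worth, the second formula is easy to check as a small C program (a sketch using the iPhone 5 numbers quoted above; the function name is mine):

#include <stdio.h>

// distance (mm) = focal (mm) * real height (mm) * image height (px)
//                 / (object height (px) * sensor height (mm))
double distance_mm(double focal_mm, double real_height_mm,
                   double image_height_px, double object_height_px,
                   double sensor_height_mm) {
    return focal_mm * real_height_mm * image_height_px
           / (object_height_px * sensor_height_mm);
}

int main(void) {
    // 2 m subject occupying 1800 vertical pixels, landscape orientation
    double d = distance_mm(4.10, 2000.0, 2448.0, 1800.0, 3.42);
    printf("%.2f mm\n", d);  // prints ~3260.82, i.e. about 3.26 m
    return 0;
}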
Hi guys,
I'm trying to reduce the number of bits per pixel to below 8 on grayscale images using Scilab.
Is this possible?
If so, how can I do this?
Thank you.
I think it is not possible directly: the integer types available in Scilab are one or more whole bytes (see the documentation on types).
If you are looking to lose the fine intensity detail (the low-order bits), you can shift that information out.
Pseudo implementation
for x=1:width
    for y=1:height
        // Get the pixel and make it a 1-byte unsigned integer
        pixel = uint8(picture(x,y))
        // Display its bits
        disp( dec2bin(pixel) )
        // We start with 8 bits; shifting out 4 leaves 4 bits of info
        bits_to_shift = 4
        shifted_down_pixel = pixel/(2^bits_to_shift)
        // Display the shifted-down value
        disp( dec2bin(shifted_down_pixel) )
        // Shift it back
        shifted_back_pixel = shifted_down_pixel*(2^bits_to_shift)
        disp( dec2bin(shifted_back_pixel) )
        // Replace the old pixel with the new one
        picture(x,y) = shifted_back_pixel
    end
end
Of course, you can do the above much faster with one big matrix operation, but this is to show the concept.
Working example
rgb = imread('your_image.png')
gry = rgb2gray(rgb)
gry8bit = im2uint8(gry)
function result = reduce_bits(img, bits)
    reduced = img / (2^bits);
    result = reduced * (2^bits);
endfunction
gry2bit = reduce_bits(gry8bit, 6)
imshow(gry2bit)
In this case:
float a = 0.99999f;
int b = 1000;
int c = a + b;
The result is c = 1001. I discovered that this happens because b is converted to float; then a + b doesn't have enough precision for 1000.99999 and (why?) is rounded to the higher value. If a is 0.999f, we get c = 1000, the theoretically expected behavior.
So my question is: why is the float value rounded to the higher value? Where is this behavior (or convention) described?
I tested this on the iPhone Simulator with the Apple LLVM 4.2 compiler.
In int c = a + b, the integer b is converted to a float first, then the two floating-point numbers are added, and the result is truncated to an integer.
The default floating-point rounding mode is FE_TONEAREST, which means that the result of the addition
0.99999f + 1000.0f
is the nearest number that can be represented as a float, and that is 1001.0f. This float is then truncated to the integer c = 1001.
If you change the rounding mode
#include <fenv.h>
fesetround(FE_DOWNWARD);
then the result of the addition is rounded downward (approximately 1000.99993f) and you would get c = 1000.
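Here is a minimal, self-contained version of that experiment (the volatile qualifiers are mine, to keep the compiler from folding the sums at compile time; strictly conforming code would also need #pragma STDC FENV_ACCESS ON, and optimizers may ignore the mode unless you compile with something like -frounding-math):

#include <fenv.h>
#include <stdio.h>

int main(void) {
    volatile float a = 0.99999f;
    volatile int   b = 1000;

    int c1 = a + b;               // FE_TONEAREST (default): sum rounds up to 1001.0f
    fesetround(FE_DOWNWARD);
    int c2 = a + b;               // sum rounds down to ~1000.99994f
    fesetround(FE_TONEAREST);     // restore the default

    printf("c1 = %d, c2 = %d\n", c1, c2);  // expected: c1 = 1001, c2 = 1000
    return 0;
}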
The reason is that when you add 1000, the exact sum needs about 8 significant decimal digits, but an IEEE single-precision float only holds about 7.
I am working on a CUDA program and wanted to speed up computation using constant memory, but it turned out that using constant memory makes my code ~30% slower.
I know that constant memory is good at broadcasting reads to whole warps, and I thought my program could take advantage of that.
Here is constant memory code:
__constant__ float4 constPlanes[MAX_PLANES_COUNT];
__global__ void faultsKernelConstantMem(const float3* vertices, unsigned int vertsCount, int* displacements, unsigned int planesCount) {
unsigned int blockId = __mul24(blockIdx.y, gridDim.x) + blockIdx.x;
unsigned int vertexIndex = __mul24(blockId, blockDim.x) + threadIdx.x;
if (vertexIndex >= vertsCount) {
return;
}
float3 v = vertices[vertexIndex];
int displacementSteps = displacements[vertexIndex];
//__syncthreads();
for (unsigned int planeIndex = 0; planeIndex < planesCount; ++planeIndex) {
float4 plane = constPlanes[planeIndex];
if (v.x * plane.x + v.y * plane.y + v.z * plane.z + plane.w > 0) {
++displacementSteps;
}
else {
--displacementSteps;
}
}
displacements[vertexIndex] = displacementSteps;
}
The global memory version of the code is the same, but it has one more parameter (a pointer to the array of planes) and uses that instead of the constant array.
I thought that the first global memory reads
float3 v = vertices[vertexIndex];
int displacementSteps = displacements[vertexIndex];
might cause a "desynchronization" of the threads, after which they would no longer take advantage of the broadcasting of constant memory reads, so I tried calling __syncthreads(); before reading constant memory, but it did not change anything.
What is wrong? Thanks in advance!
System:
CUDA Driver Version: 5.0
CUDA Capability: 2.0
Parameters:
number of vertices: ~2.5 millions
number of planes: 1024
Results:
constant mem version: 46 ms
global mem version: 35 ms
EDIT:
So I've tried many things to make the constant memory faster, such as:
1) Commenting out the two global memory reads to see if they have any impact; they do not. Global memory was still faster.
2) Processing more vertices per thread (from 8 to 64) to take advantage of the CM caches. This was even slower than one vertex per thread.
2b) Using shared memory to store displacements and vertices: load all of them at the beginning, process them, then save all the displacements. Again, slower than the CM example shown.
After this experience, I really do not understand how CM read broadcasting works and how it can be used correctly in my code. This code probably cannot be optimized with CM.
EDIT2:
Another day of tweaking. I've tried:
3) Processing more vertices (8 to 64) per thread with memory coalescing (every thread advances with a stride equal to the total number of threads). This gives better results than a stride of 1, but still no speedup.
4) Replacing this if statement
if (v.x * plane.x + v.y * plane.y + v.z * plane.z + plane.w > 0) {
++displacementSteps;
}
else {
--displacementSteps;
}
which produces 'unpredictable' branches, with a little bit of math that avoids branching, using this code:
float dist = v.x * plane.x + v.y * plane.y + v.z * plane.z + plane.w;
int distInt = (int)(dist * (1 << 29)); // distance is in range (0 - 2), stretch it to int range
int sign = 1 | (distInt >> (sizeof(int) * CHAR_BIT - 1)); // compute sign without using ifs
displacementSteps += sign;
Unfortunately, this is a lot slower (~30%) than using the if, so ifs are not as evil as I thought.
EDIT3:
I am closing this question with the conclusion that this problem probably cannot be improved by using constant memory; these are my results*:
*Times reported as the median of 15 independent measurements. When constant memory was not large enough to hold all the planes (4096 and 8192), the kernel was invoked multiple times.
Although a compute capability 2.0 chip has 64 KB of constant memory, each of the multiprocessors has only 8 KB of constant-memory cache. Your code has each thread requiring access to all 16 KB of the constant data (1024 planes * 16 bytes per float4), so you are losing performance through cache misses. To use constant memory effectively for the plane data, you will need to restructure your implementation.
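To illustrate one possible restructuring (a sketch of my own, not the asker's code): stage the planes through shared memory in tiles, so each block rereads a small chunk at a time instead of streaming all 16 KB through the 8 KB constant cache. This is a different trade-off from the asker's experiment 2b, which staged vertices and displacements; whether it beats the plain global memory version would have to be measured.

#define TILE 256  // 256 * sizeof(float4) = 4 KB of shared memory per block

__global__ void faultsKernelTiled(const float3* vertices, unsigned int vertsCount,
                                  int* displacements, const float4* planes,
                                  unsigned int planesCount) {
    __shared__ float4 tile[TILE];
    // 1D grid for brevity; the asker's 2D blockId arithmetic works the same way
    unsigned int vertexIndex = blockIdx.x * blockDim.x + threadIdx.x;
    bool active = (vertexIndex < vertsCount);

    // no early return: every thread must reach both __syncthreads() below
    float3 v = active ? vertices[vertexIndex] : make_float3(0.f, 0.f, 0.f);
    int steps = active ? displacements[vertexIndex] : 0;

    for (unsigned int base = 0; base < planesCount; base += TILE) {
        unsigned int count = (planesCount - base < TILE) ? (planesCount - base) : TILE;

        // the whole block cooperates in loading this tile of planes
        for (unsigned int i = threadIdx.x; i < count; i += blockDim.x)
            tile[i] = planes[base + i];
        __syncthreads();

        for (unsigned int p = 0; p < count; ++p) {
            float4 plane = tile[p];
            steps += (v.x * plane.x + v.y * plane.y + v.z * plane.z + plane.w > 0) ? 1 : -1;
        }
        __syncthreads();  // keep the tile intact until every thread has used it
    }

    if (active)
        displacements[vertexIndex] = steps;
}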