How can I access more than Conventional and Extended memory?
The XMS version 3.0 specification allows access to up to 4GB. See the Wikipedia article.
MS-DOS is a 16-bit, real-mode operating system, which limits its inherent ability to address large amounts of memory. Extended memory is reached through protected mode: the addressable limit is 16 megabytes on an 80286 and 4 GB on an 80386 or later.
See here: http://en.wikipedia.org/wiki/RAM_Limit
Nowadays, small application spaces, such as embedded controllers, typically use one of the many variants of Linux that are widely available.
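For completeness, here is a minimal sketch of how a real-mode DOS program would at least detect an XMS driver (HIMEM.SYS). It assumes a 16-bit DOS compiler such as Open Watcom or Turbo C that provides int86() in <dos.h>; per the XMS specification, INT 2Fh with AX=4300h returns AL=80h when a driver is installed:

```c
/* Detect an XMS driver from real-mode DOS.
   Assumes a 16-bit DOS compiler (Open Watcom, Turbo C) with <dos.h>. */
#include <dos.h>
#include <stdio.h>

int main(void)
{
    union REGS r;

    r.x.ax = 0x4300;            /* XMS installation check */
    int86(0x2F, &r, &r);

    if (r.h.al == 0x80)
        printf("XMS driver present; extended memory is reachable via its INT 2Fh AX=4310h entry point.\n");
    else
        printf("No XMS driver installed.\n");
    return 0;
}
```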
Related
The x86 architecture has segment registers for various segments of the address space (ss, ds, etc.). If I wanted to add a new memory segment into a process's address space, could I do it just by modifying the kernel, or would I need hardware support? I'm not looking to do anything specific; I'm just curious and trying to understand how Linux uses the segment registers.
From this link:
https://www.cs.princeton.edu/courses/archive/fall02/cs318/proj2/pc-arch.html
Modern operating systems and applications use the (unsegmented) memory
model: all the segment registers are loaded with the same segment
selector so that all memory references a program makes are to a single
linear-address space.
When writing application code, you generally create segment selectors
with assembler directives and symbols. The assembler and/or linker
then creates the actual segment selectors associated with these
directives and symbols. If you are writing system code, you may need
to create segment selectors directly.
Also, you cannot add a new segment without changing a lot of things, including hardware support.
Memory is usually managed by a dedicated piece of hardware called the Memory Management Unit (MMU).
Any x86 CPU has an MMU, but this doesn't mean that memory management has to be done in hardware; Linux can even run without one (the noMMU/µClinux configurations).
Of course without hardware support it would be really difficult (and in some cases even impossible) to implement some features.
From a purely theoretical point of view you could emulate segmentation-like behaviour in software (kernel space), with all the segments you like, but in the real world that would just be a bad idea.
As you said, x86_32 has support for memory segmentation, but since the i386 there has also been support for paging. Nowadays segmentation is considered deprecated and there is no modern OS (AFAIK) that really uses it (except maybe some hackish patches like grsecurity/PaX and their UDEREF feature).
It's also important to note that x86_64 long mode essentially drops segmentation: base and limit are ignored for most segment registers, and only the FS and GS bases remain usable.
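To see the flat model in practice, here is a minimal sketch (assuming GCC or Clang inline assembly on x86/x86-64 Linux) that just dumps the selectors a normal process runs with. The values themselves are whatever the kernel's GDT defines; the point is that user code never constructs its own selectors:

```c
/* Dump the segment selectors of the current process.
   Assumes GCC/Clang inline asm on x86 or x86-64 Linux. */
#include <stdio.h>

int main(void)
{
    unsigned short cs, ds, ss, es;

    __asm__ volatile ("mov %%cs, %0" : "=r"(cs));
    __asm__ volatile ("mov %%ds, %0" : "=r"(ds));
    __asm__ volatile ("mov %%ss, %0" : "=r"(ss));
    __asm__ volatile ("mov %%es, %0" : "=r"(es));

    /* On a flat-model system the data selectors are all the same fixed
       value (or null on x86-64, where their base/limit are ignored). */
    printf("cs=%#x ds=%#x ss=%#x es=%#x\n", cs, ds, ss, es);
    return 0;
}
```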
Does it actually work and, if so, how? I'm talking about this 4GB Patch.
On the face of it, it seems like a pretty nifty idea: on Windows, each 32-bit application normally only has access to 2GB of address space, but if you have 64-bit Windows, you can enable a little flag to allow a 32-bit application to access the full 4GB. The page gives some examples of applications that might benefit from it.
HOWEVER, most applications seem to assume that memory allocation always succeeds. Some applications do check whether allocations succeed, but even then can at best quit gracefully on failure. I've never in my (short) life come across an application that could fail a memory allocation and still keep going with no loss of functionality or impact on correctness, and I have a feeling that such applications range from extremely rare to essentially non-existent in the realm of desktop computing. With this in mind, it would seem reasonable to assume that any such application would be programmed not to exceed 2GB of memory usage under normal conditions, and that the few that do would have been built with this magic flag already enabled for the benefit of 64-bit users.
So, have I made some incorrect assumptions? If not, how does this tool help in practice? I don't see how it could, yet I see quite a few people around the internet claiming it works (for some definition of works).
Your troublesome assumptions are these:
Some applications do check whether allocations succeed, but even then can at best quit gracefully on failure. I've never in my (short) life come across an application that could fail a memory allocation and still keep going with no loss of functionality or impact on correctness, and I have a feeling that such applications range from extremely rare to essentially non-existent in the realm of desktop computing.
There do exist applications that do better than "quit gracefully" on failure. Yes, functionality will be impacted (after all, there wasn't enough memory to continue with the requested operation), but many apps will at least be able to stay running - so, for example, you may not be able to add any more text to your enormous document, but you can at least save the document in its current state (or make it smaller, etc.)
With this in mind, it would seem reasonable to assume that any such application would be programmed not to exceed 2GB of memory usage under normal conditions, and that the few that do would have been built with this magic flag already enabled for the benefit of 64-bit users.
The trouble with this assumption is that, in general, an application's memory usage is determined by what you do with it. As storage and memory sizes have grown over the years, the sizes of the files people want to operate on have grown with them, so an application that worked fine when 1GB files were unheard of may struggle now that (for example) many consumer cameras can shoot high-definition video.
Putting that another way: applications that used to fit comfortably within 2GB of memory no longer do, because people want to do more with them now.
I think the following extract from the 4 GB Patch page you linked pretty much explains how and why it works.
Why things are this way on x64 is easy to explain. On x86 applications have 2GB of virtual memory out of 4GB (the other 2GB are reserved for the system). On x64 these two other GB can now be accessed by 32bit applications. In order to achieve this, a flag has to be set in the file's internal format. This is, of course, very easy for insiders who do it every day with the CFF Explorer. This tool was written because not everybody is an insider, and most probably a lot of people don't even know that this can be achieved. Even I wouldn't have written this tool if someone didn't explicitly ask me to.
And to expand on CFF Explorer:
The CFF Explorer was designed to make PE editing as easy as possible,
but without losing sight of the portable executable's internal
structure. This application includes a series of tools which might
help not only reverse engineers but also programmers. It offers a
multi-file environment and a switchable interface.
And to quote Larry Miller, a Microsoft MCSA, from a blog post about patching games using the tool:
Under 32 bit windows an application has access to 2GB of VIRTUAL
memory space. 64 bit Windows makes 4GB available to applications.
Without the change mentioned an application will only be able to
access 2GB.
This was not an arbitrary restriction. Most 32 bit applications simply
can not cope with a larger than 2GB address space. The switch
mentioned indicates to the system that it is able to cope. If this
switch is manually set most 32 bit applications will crash in 64 bit
environment.
In some cases the switch may be useful. But don't be surprised if it
crashes.
And finally to add from MSDN - Migrating 32-bit Managed Code to 64-bit,
There is also information in the PE that tells the Windows loader if
the assembly is targeted for a specific architecture. This additional
information ensures that assemblies targeted for a particular
architecture are not loaded in a different one. The C#, Visual Basic
.NET, and C++ Whidbey compilers let you set the appropriate flags in
the PE header. For example, C# has a /platform:{anycpu,
x86, Itanium, x64} compiler option.
Note: While it is technically possible to modify the flags in the PE header of an assembly after it has been compiled, Microsoft does not recommend doing this.
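To make the mechanics concrete, here is a minimal, read-only sketch of what the 4GB Patch and CFF Explorer actually inspect: the IMAGE_FILE_LARGE_ADDRESS_AWARE bit (0x0020) in the Characteristics field of the PE/COFF file header. Plain C, no Windows headers required; the tool itself also writes the bit back, which is omitted here:

```c
/* Report whether a PE executable is marked large-address-aware. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

static uint16_t rd16(const unsigned char *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

int main(int argc, char **argv)
{
    unsigned char buf[4096];
    FILE *f;
    size_t n;
    uint32_t pe_off;

    if (argc != 2) { fprintf(stderr, "usage: %s file.exe\n", argv[0]); return 1; }
    if (!(f = fopen(argv[1], "rb"))) { perror("fopen"); return 1; }
    n = fread(buf, 1, sizeof buf, f);
    fclose(f);
    if (n < 0x40) { fprintf(stderr, "file too small\n"); return 1; }

    pe_off = rd32(buf + 0x3C);                 /* e_lfanew: offset of "PE\0\0" */
    if (pe_off > n - 24 || memcmp(buf + pe_off, "PE\0\0", 4) != 0) {
        fprintf(stderr, "not a PE file (or header beyond first 4KB)\n");
        return 1;
    }

    /* Characteristics is 18 bytes into the COFF header, which follows the signature. */
    uint16_t characteristics = rd16(buf + pe_off + 4 + 18);
    printf("%s: %slarge-address-aware\n", argv[1],
           (characteristics & 0x0020) ? "" : "NOT ");
    return 0;
}
```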
Finally, to answer your question: how does this tool help in practice?
Since you have malloc in your tags, I assume you are working with unmanaged memory. The patch does not change the size of anything - pointers remain 32 bits wide - but it does break code that assumes addresses will always stay below 2GB, for example code that stashes flags in the top bit of a pointer or performs signed comparisons and arithmetic on addresses.
But for managed code, since all of this is handled by the CLR in .NET, the flag is generally helpful and should not cause many problems unless you are dealing with any of the following:
Invoking platform APIs via p/invoke
Invoking COM objects
Making use of unsafe code
Using marshaling as a mechanism for sharing information
Using serialization as a way of persisting state
To summarize: as a programmer I would not use the tool to convert my own application; I would rather migrate it myself by changing build targets. That said, if I have an exe that can benefit from more RAM, such as a game, then this is worth a try.
The term "heterogeneous" is often used in distributed-system and middleware. What does it mean?
Homogeneous hardware in a distributed system would mean every machine having the same hardware and the same OS, and possibly even being dedicated to just running one thing.
Heterogeneous would mean:
Inconsistent Hardware. You might have one set of servers from 2011, one from 2013, and one fancy new set ramped up this year, all in the same pool for compute resources. They have different CPUs, amounts of memory, and amounts of disk available for use.
Possibly Inconsistent OS. A crowdsourced distributed computing environment - something that runs when a user's screensaver is going - might have different everything; it may be running on Linux, on Windows, or on OS X.
In both cases, you have to plan to be immune to differences in resources, or flexible based on what the actual differences are.
To be clear: the first is 1000x more common. If someone says "heterogeneous distributed computing", they mean two or more computers with different hardware being used together to solve a problem.
Does memory allocation in a programming language depend on the compiler or on the system architecture? If it depends on the compiler, then what difference does a 32-bit or 64-bit architecture make? If it depends on the architecture, then why are the sizes of variables constant across 16/32/64-bit architectures? And what is the impact of slack bytes on the system architecture?
It's not either/or. The compiler, the system architecture, and system conventions all affect how memory allocation works.
For one, the system architecture determines the size of pointers. A compiler can't use more memory than the system provides, and can't address more bytes or use larger pointers than the CPU supports. (I mean actual RAM and real addresses; of course it can use virtual-memory-like constructs to remap the addresses the programmer uses to something the CPU can handle.)
Similarly, operating system vendors usually have conventions for what a function call should look like: whether parameters are passed on the stack or in registers, and whether parameters larger than a certain size should be copied onto the stack or passed as a pointer to a stack/heap object. If you want to call system functions, you have to follow these conventions.
However, beyond that, the details are left to the compiler. E.g. many Pascal compilers pass an additional hidden pointer to the enclosing scope into nested functions defined in the same language. As long as you know the code was written in the same language, you can agree on deviations from the calling conventions within the limits the system architecture permits. E.g. if you know that your software has to run on similar CPUs with stricter requirements, you can choose to apply those strict requirements even on CPUs that don't need them (like compiling code so everything is at least 2-byte-aligned, so that your MC68040 code will also run on the older MC68000, which had that requirement).
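To make the "slack bytes" point concrete, here is a small sketch (plain C, any hosted compiler) showing the padding the compiler inserts to satisfy the target's alignment rules; the exact sizes and offsets printed will differ between a 32-bit and a 64-bit build:

```c
/* Show padding ("slack bytes") inserted to meet alignment requirements. */
#include <stdio.h>
#include <stddef.h>

struct padded {
    char  tag;    /* 1 byte; typically followed by 3 or 7 slack bytes   */
    long  value;  /* 4 or 8 bytes depending on the ABI                  */
    short count;  /* 2 bytes; usually followed by trailing slack bytes  */
};

int main(void)
{
    printf("sizeof(struct padded) = %zu\n", sizeof(struct padded));
    printf("offsetof(value)       = %zu\n", offsetof(struct padded, value));
    printf("offsetof(count)       = %zu\n", offsetof(struct padded, count));
    return 0;
}
```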
Knuth recently objected to 64-bit systems, saying that for programs which fit in 4 gigs of memory, "they effectively throw away half of the cache" because the pointers are twice as big as on a 32-bit system.
My question is: can this problem be avoided by installing a 32-bit operating system on a 64-bit machine? And are there any bandwidth-intensive benchmarks which demonstrate the advantage in this case?
Bandwidth is not really the correct term here. What Knuth was really talking about was data density, as it relates to cache footprint. Imagine that you have a 16KB L1 data cache: If you're purely storing pointers, you can store 2^14/2^2 = 2^12 = 4096 32-bit pointers, but only 2048 64-bit pointers. If the performance of your application depends on being able to keep track of over 2K different buffers, you may see a real performance benefit from a 32-bit address space. However, most real code is not this way, and real performance benefits from a caching system often come from being able to cache common integer and floating-point data structures, not huge quantities of pointers. If your working set is not pointer-heavy, the downside of 64-bit becomes negligible, and the upside becomes much more obvious if you're performing a lot of 64-bit integer arithmetic.
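A quick way to see the effect Knuth is describing: the same doubly-linked node roughly doubles in size when pointers are 64-bit, so half as many fit in a given cache. A small sketch (plain C; compile with gcc/clang -m32 and -m64 to compare; the 16KB L1 figure is just the illustrative number from the paragraph above):

```c
/* Compare how many pointer-carrying nodes fit in a hypothetical 16KB L1 cache. */
#include <stdio.h>
#include <stddef.h>

struct node {
    struct node *prev;
    struct node *next;
    int          key;
};

int main(void)
{
    size_t l1_bytes = 16 * 1024;   /* illustrative L1 data cache size */

    printf("sizeof(void *)      = %zu\n", sizeof(void *));
    printf("sizeof(struct node) = %zu\n", sizeof(struct node)); /* ~12 vs ~24 */
    printf("nodes per 16KB      = %zu\n", l1_bytes / sizeof(struct node));
    return 0;
}
```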
The answer is: yes, it can, to a certain extent, although the performance difference is unlikely to be great.
Any benchmark to test this will have to do a lot of pointer resolution, which will be difficult to separate out from the noise. Designing a benchmark that will not be optimised away is difficult. This article about flawed Java benchmarks was posted by someone in response to another question, but many of the principles described in it apply here too.
I don't think Knuth objected to 64-bit systems. He just said that using 64-bit pointers on a system that has less than 4GB of RAM is idiotic (at least if you have lots of pointers, like the ones in a doubly-linked list). I can't say that I agree with him; here are three different approaches that could be taken. Let's assume you have a 64-bit-capable CPU that can also run in 32-bit mode, like an Intel Core 2 Duo.
1 - Everything is 32-bit: the OS, the apps, all of it. So you have 32-bit pointers, but you cannot use the extra registers and instructions that are available in 64-bit mode.
2 - Everything is 64-bit: the OS, the apps, all of it. So you have 64-bit pointers and you can use the extra registers and instructions available in 64-bit mode. But since you have less than 4GB of RAM, using 64-bit pointers seems idiotic. But is it?
3 - The OS is 64-bit and, interestingly, makes sure that all code/data pointers stay in the 0x00000000 - 0xFFFFFFFF range (virtual memory!). The ABI works in a peculiar way: all code/data pointers kept in memory and files are 32 bits wide, but they are zero-extended when loaded into 64-bit registers. If there is a code location to jump to, the compiler/ABI does the necessary fix-ups and performs the actual 64-bit jump. This way pointers are 32-bit, but the apps can be 64-bit, meaning they can make use of the 64-bit registers and instructions. The process is something like thunking, I think ;-P
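A user-space cousin of option 3 - not the ABI trick described above, just a data-structure-level way to get the same density win - is to keep fully 64-bit code but store 32-bit indices into a pool instead of raw pointers, so each link costs 4 bytes rather than 8. A minimal sketch (plain C99; names and sizes are illustrative):

```c
/* Doubly-linked list using 32-bit pool indices instead of 64-bit pointers. */
#include <stdio.h>
#include <stdint.h>

#define NIL UINT32_MAX

struct node {
    uint32_t prev;   /* index into the pool, not a pointer */
    uint32_t next;
    int      key;
};

static struct node pool[1024];   /* all nodes live in one array */

int main(void)
{
    /* Build a tiny two-element list: pool[0] <-> pool[1]. */
    pool[0] = (struct node){ NIL, 1, 10 };
    pool[1] = (struct node){ 0, NIL, 20 };

    for (uint32_t i = 0; i != NIL; i = pool[i].next)
        printf("key=%d\n", pool[i].key);

    /* 12 bytes per node here, versus ~24 with two real 64-bit pointers. */
    printf("sizeof(struct node) = %zu\n", sizeof(struct node));
    return 0;
}
```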
My conclusion:
The 3rd option seems doable to me, but it is not an easy problem. In theory it can work, but I do not think it is feasible in practice. And I also think his quote, "When such pointer values appear inside a struct, they not only waste half the memory, they effectively throw away half of the cache", is exaggerated...
I've seen it suggested that the best mix (on x86 CPUs) is to use a 64-bit OS and 32-bit applications.
With a 64-bit OS you get:
ability to handle more than 4GB of address space
more, bigger registers to help in data-copying operations
With a 32-bit app you get:
smaller pointers
fewer, smaller registers to save on context switches
Cons:
All libraries must be duplicated. Tiny by disk-space standards.
All loaded libraries are duplicated in RAM. Not so tiny...
Surprisingly, there seems to be no overhead when switching modes. I guess that trapping from userspace into the kernel costs the same no matter the bitness of the userspace.
Of course, there are some applications that benefit from a big address space; but for everything else, you can get an extra ~5% performance by staying 32-bit.
And no, I don't care about this small speedup; but it doesn't "offend" me to run 32-bit Firefox on a 64-bit Kubuntu machine (as I've seen suggested on some forums).