Are memory segments defined by the OS or architecture?

The x86 architecture has segment registers for various segments of the address space (ss, ds, etc.). If I wanted to add a new memory segment to a process address space, could I do it by just modifying the kernel, or would I need hardware support? I'm not looking to do anything specific, just curious and trying to understand how Linux uses segment registers.

From this link:
https://www.cs.princeton.edu/courses/archive/fall02/cs318/proj2/pc-arch.html
Modern operating systems and applications use the (unsegmented) memory
model: all the segment registers are loaded with the same segment
selector so that all memory references a program makes are to a single
linear-address space.
When writing application code, you generally create segment selectors
with assembler directives and symbols. The assembler and/or linker
then creates the actual segment selectors associated with these
directives and symbols. If you are writing system code, you may need
to create segment selectors directly.
Also, you cannot add a new segment without changing a lot of things, including the hardware support.

Memory is usually managed by a dedicated piece of hardware called the Memory Management Unit (MMU).
Any x86 CPU has an MMU, but this doesn't mean that memory management must be done in hardware: Linux itself can run by emulating an MMU in software.
Of course without hardware support it would be really difficult (and in some cases even impossible) to implement some features.
From a purely theoretical point of view, you could emulate segmentation-like behavior in software (kernel space) with all the segments that you like, but in the real world that would just be a bad idea.
As you said, x86_32 has support for memory segmentation, but since the i386 there has also been support for paging. Nowadays segmentation is considered deprecated and there is no modern OS (AFAIK) that uses it (except maybe some hackish patches like grsecurity/PaX and their UDEREF feature).
It's also important to note that x86_64 effectively drops segmentation: in long mode the base and limit of most segment registers are ignored (only the FS and GS bases remain usable).
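To make the hardware side concrete, here is a minimal C sketch (my own illustration, not from the answer above) of the 32-bit segment descriptor format that the architecture dictates; the flat model simply points every descriptor at the same base and limit:

    /* x86 (32-bit) GDT segment descriptor layout, as fixed by the
     * architecture.  Adding a "new segment" means adding one of these
     * to the GDT/LDT; the format itself cannot be changed in software. */
    #include <stdint.h>

    struct gdt_descriptor {
        uint16_t limit_low;   /* limit bits 0..15 */
        uint16_t base_low;    /* base bits 0..15 */
        uint8_t  base_mid;    /* base bits 16..23 */
        uint8_t  access;      /* present, DPL, code/data type */
        uint8_t  limit_flags; /* limit bits 16..19 + G and D/B flags */
        uint8_t  base_high;   /* base bits 24..31 */
    };

    /* Flat-model descriptors: base 0, limit 0xFFFFF, 4KB granularity,
     * so every selector maps the whole 4GB linear address space. */
    static const struct gdt_descriptor flat_code =
        { 0xFFFF, 0x0000, 0x00, 0x9A /* ring-0 code */, 0xCF, 0x00 };
    static const struct gdt_descriptor flat_data =
        { 0xFFFF, 0x0000, 0x00, 0x92 /* ring-0 data */, 0xCF, 0x00 };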

Related

Why do we need AML - ACPI Machine Language?

As I understand it, ACPI defines a generic hardware programming model where the operating system relies on AML (ACPI Machine Language) code provided by the OEM firmware to manipulate the hardware.
In order to execute the AML code, the operating system has to incorporate an AML interpreter.
So it looks to me like firmware developers use AML to provide a control interface between the platform hardware and the operating system.
But do we really need AML?
I think that ultimately the hardware can only be configured through the native instructions of the platform, so the AML interpreter must translate the AML into native instructions; otherwise it cannot be executed on the platform.
But what's the point of using an intermediate language like AML? AML is said to be platform-independent, which means I can use it to describe my platform in a non-native way.
But the AML is part of the platform firmware in practice, and the rest of the firmware has already been built into the target platform's native instructions. So what good is it to make such a small part of the firmware platform-independent? Why not just use native instructions? There must be some way to let the OS use them as well, and then the operating system wouldn't need an AML interpreter at all. A lot of complexity could be avoided.
One of the big goals of ACPI over its predecessor APM was to give the OS more visibility and control over power state transitions.
APM was a black box. The OS knew nothing about the power management implementation. It would just call a BIOS function and the BIOS handled all of the magic. Did it work? Did the system sleep properly? Did the system freeze? Was a user application able to handle the BIOS implementation? The sad truth was that many systems had power management that was downright broken, and Microsoft wanted to provide a better power management experience for the growing laptop industry.
Now the BIOS hands the ASL/AML code over to the OS, and the OS executes it, not the BIOS. If the BIOS code does something dumb (like messing with registers it shouldn't), Windows can detect that by parsing the code and block it. AML is 100% decompilable, unlike C.
Remember that ACPI is not x86-specific. At the time it was developed, Itanium and XScale were around. Intel and Microsoft needed a language that would work on all platforms, both 32- and 64-bit.
Lastly, ASL is more than just a list of executable functions. It also contains a number of static configuration tables. The ASL code has tables to define the non-PnP hardware built onto your motherboard, and tables of supported power states. A traditional programming language like C isn't really set up for that.
If ACPI were invented today, they would probably use something like XML to provide the info to the OS.
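For a sense of what those static tables look like, here is a sketch (field layout follows the ACPI specification's common table header; the struct name itself is my own) of the header that every static table such as the FADT, MADT, or DSDT begins with:

    /* Common header shared by all ACPI system description tables.
     * The OS finds these tables in memory and reads them as plain
     * data; only the AML inside the DSDT/SSDTs needs an interpreter. */
    #include <stdint.h>

    struct acpi_table_header {
        char     signature[4];     /* e.g. "APIC" for the MADT */
        uint32_t length;           /* total table size in bytes */
        uint8_t  revision;
        uint8_t  checksum;         /* all bytes of the table sum to 0 */
        char     oem_id[6];
        char     oem_table_id[8];
        uint32_t oem_revision;
        uint32_t creator_id;
        uint32_t creator_revision;
    };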
Originally, hardware for "80x86 PC" was cloned from IBM's PC, and this created an effective de-facto standard for hardware to follow. However it didn't take long before manufacturers wanted to add features that didn't previously exist, where there was no (official or de-facto) standard to follow.
This led to a major problem for operating system software (how do you support "non-standard chaos"?). Some standards were created for some things (APM, etc.) but they didn't really cover everything needed and became out of date. ACPI was created to fix this.
Ideally, what was (and still is) needed are standards that allow an operating system to detect and use supported features of the motherboard. For example, a "standardised case temperature and fan control" device (with support for detecting how many fans, temperature sensors, etc.), a "standardised CPU speed/power consumption" interface, a "PCI slot IRQ routing for IO APICs" standard, a "hot-plug PCI controller device" standard, etc.
However, ACPI didn't provide useful standards that hardware manufacturers and operating systems can use. Instead, ACPI provided an over-engineered mess (AML) to allow an OS to cope with ACPI's failure to standardise the hardware.
Essentially, we "need" AML now because it's the only viable way for an OS to work around the "non-standard chaos" problem that ACPI failed to fix.
The problem with providing native code instead of AML is that different operating systems use CPUs in different ways (e.g. native 64-bit 80x86 code in firmware would be useless for an older "32-bit only" OS). AML provides portability between different types of CPUs and between the same CPU/s in different modes.
Also, native code is considered a major security problem (rootkits, etc.), and people tend to think an interpreted language mitigates that problem. Of course, in practice AML needs far too much access to the underlying hardware, does so in a way that an OS can't check, and there isn't even a way for an OS to determine whether the AML was maliciously modified before the OS booted. For these reasons AML is still a major security problem despite being interpreted.

Should "IMAGE_FILE_LARGE_ADDRESS_AWARE" work in Delphi6 to effectively avoid the EOutOfMemory error?

I understand from other posts here that "IMAGE_FILE_LARGE_ADDRESS_AWARE" may work to effectively expand memory availability in e.g. Delphi 2007.
I can't get this to work in Delphi 6. Is this indeed the case, or should it work? Or is there an alternative command that does the same thing?
If not, I may need to migrate to a later version of Delphi. In that case, does anyone know the most recent version of Delphi that would easily allow me to migrate my existing code (ideally, my existing code, which is fairly simple Turbo Pascal-style code, would just work as is) AND would support the "IMAGE_FILE_LARGE_ADDRESS_AWARE" 'trick' to expand memory?
Many thanks!
Remco
You can apply the IMAGE_FILE_LARGE_ADDRESS_AWARE PE flag to a Delphi 6 application, but you must beware of the following issues:
The default memory manager for Delphi 6, the Borland memory manager, does not support memory allocations at addresses above 2GB. You must replace the memory manager with one that supports large addresses, for instance FastMM.
Your code may well contain pointer truncation bugs that will need to be found and fixed.
The same goes for any third party software that you use. This includes the Borland RTL and VCL libraries. I did not encounter many problems with these libraries, but it may be that your program uses different parts of the runtime libraries that have pointer truncation bugs.
In order to stress-test your program under large address conditions you should turn on top-down memory allocation. Do not be surprised if your anti-malware software (or indeed other system-level software) has to be disabled whilst you operate in top-down memory allocation mode. This type of software is notoriously poor at coping with top-down memory allocation.
Finally, it is worth pointing out that large address aware cannot solve all out-of-memory problems. All it does is open up the top half of the 32-bit address space. Your program might require even more address space than that, in which case you'd need to either re-design your program or move to a 64-bit compiler.
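If you want to confirm that the flag actually made it into the produced executable, a small check like the following works; this is a minimal sketch in C (error handling kept to a bare minimum, and it assumes a well-formed PE file):

    /* Report whether a PE executable has IMAGE_FILE_LARGE_ADDRESS_AWARE
     * (0x0020) set in the COFF header's Characteristics field. */
    #include <stdio.h>
    #include <stdint.h>

    int is_large_address_aware(const char *path)
    {
        FILE *f = fopen(path, "rb");
        uint32_t pe_offset = 0;
        uint16_t characteristics = 0;

        if (!f)
            return -1;
        fseek(f, 0x3C, SEEK_SET);            /* e_lfanew in the DOS header */
        fread(&pe_offset, sizeof pe_offset, 1, f);
        /* Skip the 4-byte "PE\0\0" signature plus 18 bytes of the COFF
         * header to land on the Characteristics field. */
        fseek(f, pe_offset + 4 + 18, SEEK_SET);
        fread(&characteristics, sizeof characteristics, 1, f);
        fclose(f);
        return (characteristics & 0x0020) != 0;
    }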

Is the "4GB patch" of any use in real life?

And if so, how? I'm talking about this 4GB Patch.
On the face of it, it seems like a pretty nifty idea: on Windows, each 32-bit application normally only has access to 2GB of address space, but if you have 64-bit Windows, you can enable a little flag to allow a 32-bit application to access the full 4GB. The page gives some examples of applications that might benefit from it.
HOWEVER, most applications seem to assume that memory allocation always succeeds. Some applications do check whether allocations are successful, but even then can at best quit gracefully on failure. I've never in my (short) life come across an application that could fail a memory allocation and still keep going with no loss of functionality or impact on correctness, and I have a feeling that such applications range from extremely rare to essentially non-existent in the realm of desktop computers. With this in mind, it would seem reasonable to assume that any such application would be programmed not to exceed 2GB of memory usage under normal conditions, and that the few that do would have been built with this magic flag already enabled for the benefit of 64-bit users.
So, have I made some incorrect assumptions? If not, how does this tool help in practice? I don't see how it could, yet I see quite a few people around the internet claiming it works (for some definition of works).
Your troublesome assumptions are these:
Some applications do check whether allocations are successful, but even then can at best quit gracefully on failure. I've never in my (short) life come across an application that could fail a memory allocation and still keep going with no loss of functionality or impact on correctness, and I have a feeling that such applications range from extremely rare to essentially non-existent in the realm of desktop computers.
There do exist applications that do better than "quit gracefully" on failure. Yes, functionality will be impacted (after all, there wasn't enough memory to continue with the requested operation), but many apps will at least be able to stay running - so, for example, you may not be able to add any more text to your enormous document, but you can at least save the document in its current state (or make it smaller, etc.)
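As an illustration (a hypothetical sketch; grow_buffer is an invented name, not taken from any real application), an editor might handle a failed allocation like this:

    /* On allocation failure, cancel the operation that needed the
     * memory but keep the process and its current state alive, so the
     * user can still save or shrink the document. */
    #include <stdio.h>
    #include <stdlib.h>

    void *grow_buffer(size_t new_size)
    {
        void *p = malloc(new_size);
        if (p == NULL) {
            fprintf(stderr, "Out of memory: operation cancelled, "
                            "document left unchanged.\n");
            return NULL;  /* caller keeps using the old, smaller buffer */
        }
        return p;
    }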
With this in mind, it would seem reasonable to assume that any such application would be programmed not to exceed 2GB of memory usage under normal conditions, and that the few that do would have been built with this magic flag already enabled for the benefit of 64-bit users.
The trouble with this assumption is that, in general, an application's memory usage is determined by what you do with it. As storage and memory sizes have grown over the years, the sizes of files that people want to operate on have grown too, so an application that worked fine when 1GB files were unheard of may struggle now that (for example) high-definition video can be taken by many consumer cameras.
Putting that another way: applications that used to fit comfortably within 2GB of memory no longer do, because people want to do more with them now.
I think the following extract from the 4GB Patch page you linked pretty much explains how and why it works.
Why things are this way on x64 is easy to explain. On x86 applications have 2GB of virtual memory out of 4GB (the other 2GB are reserved for the system). On x64 these two other GB can now be accessed by 32bit applications. In order to achieve this, a flag has to be set in the file's internal format. This is, of course, very easy for insiders who do it every day with the CFF Explorer. This tool was written because not everybody is an insider, and most probably a lot of people don't even know that this can be achieved. Even I wouldn't have written this tool if someone didn't explicitly ask me to.
And to expand on CFF Explorer,
The CFF Explorer was designed to make PE editing as easy as possible,
but without losing sight on the portable executable's internal
structure. This application includes a series of tools which might
help not only reverse engineers but also programmers. It offers a
multi-file environment and a switchable interface.
And to quote Larry Miller, a Microsoft MCSA, in a blog post about patching games using the tool,
Under 32 bit windows an application has access to 2GB of VIRTUAL
memory space. 64 bit Windows makes 4GB available to applications.
Without the change mentioned an application will only be able to
access 2GB.
This was not an arbitrary restriction. Most 32 bit applications simply
can not cope with a larger than 2GB address space. The switch
mentioned indicates to the system that it is able to cope. If this
switch is manually set most 32 bit applications will crash in 64 bit
environment.
In some cases the switch may be useful. But don't be surprised if it
crashes.
And finally to add from MSDN - Migrating 32-bit Managed Code to 64-bit,
There is also information in the PE that tells the Windows loader if
the assembly is targeted for a specific architecture. This additional
information ensures that assemblies targeted for a particular
architecture are not loaded in a different one. The C#, Visual Basic
.NET, and C++ Whidbey compilers let you set the appropriate flags in
the PE header. For example, C# and Visual Basic .NET have a
/platform:{anycpu, x86, Itanium, x64} compiler option.
Note: While it is technically possible to modify the flags in the PE header of an assembly after it has been compiled, Microsoft does not recommend doing this.
Finally, to answer your question: how does this tool help in practice?
Since you have malloc in your tags, I believe you are working with unmanaged memory. The flag does not change the size of pointers or of any other type; what it does is allow the process to be handed addresses above 2GB, which will expose any code that truncates pointers or stores them in signed 32-bit integers.
For managed code, since all of this is handled by the CLR in .NET, the flag would be genuinely helpful and should not cause many problems unless you are dealing with any of the following:
Invoking platform APIs via p/invoke
Invoking COM objects
Making use of unsafe code
Using marshaling as a mechanism for sharing information
Using serialization as a way of persisting state
To summarize: as a programmer I would not use the tool to convert my application; rather, I would migrate it myself by changing build targets. That being said, if I have an exe that can do well with more RAM, such as a game, then this is worth a try.
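To illustrate the unmanaged pitfall mentioned above, here is a hypothetical C sketch of the kind of latent bug that only shows up once allocations can land above 2GB:

    /* Pointers stay 32 bits wide under the flag; the bug is code that
     * treats them as signed integers.  Below 2GB the cast value is
     * always positive, so checks like this "worked" for years. */
    #include <stdint.h>

    int looks_valid(void *p)
    {
        int32_t v = (int32_t)(intptr_t)p;
        return v > 0;  /* fails for addresses >= 0x80000000 */
    }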

Does memory allocation in a programming language depend on the compiler or on the system architecture?

Does memory allocation in a programming language depend on the compiler or on the system architecture? If it depends on the compiler, then what difference does a 32-bit/64-bit architecture make? If it depends on the architecture, then why are the memory sizes of variables constant across 16/32/64-bit architectures? And what is the impact of slack bytes on the system architecture?
It's not either/or. The compiler, the system architecture, and system conventions all have an effect on how memory allocation works.
For one, the system architecture determines the size of pointers. A compiler can't use more memory than the system provides, and can't address more bytes or use larger pointers than the CPU supports. (I mean actual RAM and real addresses; of course a compiler can use virtual-memory-like constructs to remap the addresses the programmer uses to something the CPU can handle.)
Similarly, operating system vendors usually have conventions for what a function call should look like: whether parameters are passed on the stack or in registers, and whether parameters larger than a certain size should be copied onto the stack or passed as a pointer to a stack/heap object. If you want to call system functions, you have to follow these conventions.
Beyond that, however, the details are left to the compiler to decide. E.g. many Pascal compilers pass an additional hidden pointer to the enclosing scope into nested functions defined in the same language. As long as you know the code was written in the same language, you can agree on differences in calling conventions within the limits the system architecture permits. E.g. if you know that your software has to run on similar CPUs with stricter requirements, you can choose to apply those strict requirements even on CPUs that don't need them (like compiling code so everything is at least 2-byte-aligned, so your MC68040 code will also run on the older MC68000, which had that alignment requirement).
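Regarding the "slack bytes" part of the question: these are the padding bytes a compiler inserts to satisfy the target architecture's alignment rules. A small C illustration (the exact sizes and offsets printed depend on the target, which is precisely the point):

    /* The compiler pads struct members so each one sits at an address
     * its type can be accessed at efficiently (or at all) on the
     * target CPU; the padding is the "slack". */
    #include <stdio.h>
    #include <stddef.h>

    struct sample {
        char tag;    /* 1 byte, then typically 3 slack bytes */
        int  value;  /* usually 4 bytes, 4-byte aligned */
        char flag;   /* 1 byte, then trailing slack so arrays align */
    };

    int main(void)
    {
        printf("sizeof(struct sample) = %zu\n", sizeof(struct sample));
        printf("offsetof(value)       = %zu\n",
               offsetof(struct sample, value));
        /* Pointer size, by contrast, comes from the architecture. */
        printf("sizeof(void *)        = %zu\n", sizeof(void *));
        return 0;
    }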

General Purpose Registers

I am new to computer architecture. Can somebody help me understand the use of a limited number of registers in the processing of several complex applications? My question is: there is a fixed number of registers (for example, the 80386 contains a total of sixteen registers) that are of interest to the applications programmer.
What happens if we want more registers (for example, to accommodate an increased stack size)? Are the addresses and data in registers written back to main memory? In a multitasking environment, are the register data and addresses of different applications moved between main memory and the registers for processing?
Do operating systems have special registers that do not interfere with the application's general-purpose registers?
And can anyone suggest a good resource for understanding these concepts for starters?
Registers are the fastest memory in a computer. The instruction set of any particular CPU is written specifically for its register architecture. You are right that data/addresses must be spilled back to memory as more register space is needed.
As far as a multitasking system goes, the scheduler generally has to save the execution context between tasks. This context consists of the current state of the registers as well as other status bits (depending on the CPU).
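As a rough illustration (the layout below is invented for a 32-bit x86-like CPU; real kernels define this precisely, often in assembly-visible headers), the saved context is just a small block of memory per task:

    /* Per-task register save area.  On a task switch the scheduler
     * stores the outgoing task's registers here and reloads the
     * incoming task's previously saved values. */
    #include <stdint.h>

    struct cpu_context {
        uint32_t eax, ebx, ecx, edx;
        uint32_t esi, edi, ebp;
        uint32_t esp;     /* stack pointer */
        uint32_t eip;     /* where to resume execution */
        uint32_t eflags;  /* status bits */
    };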
A good first step would be to learn assembly programming. It is so close to the hardware that you will learn all of this stuff thoroughly. Once you have that, pick up an operating systems book to see how it is done at a higher level. Depending on your commitment (and curiosity), you could also read some of the source code for smaller real-time operating systems, such as FreeRTOS. Reading up on 8-bit microcontroller architectures is also nice, since they are simple. For example, AVR or HC08 are pretty straightforward architectures to learn. All of the info is free; you just have to read it.
Enjoy.
