I'm writing a game in Forth (for learning purposes).
The game is played on a "10 cell board". I'm trying new stuff so I did
here 10 [char] - fill
to set up the space for the board.
Then, to play 'X' in position 3
[char] X here 3 + c!
This has been working fine, but raises the question
Is this OK?
What if the board was a million cells wide?
Thanks
The described approach has the certain environmental dependencies, so your program just have to match the environmental restriction on programs of your Forth system (i.e. that you use).
1. Size of the data space
The word UNUSED returns "the amount of space remaining in the region addressed by HERE". So, a program can check the available space.
Also, according to the subsection 4.1.3 Other system documentation of the Forth Standard:
A system shall provide the following information:
[...]
program data space available, in address units;
So, you have just to check whether your Forth system provides enough data space for you program, and how the available data space can be configured (if any).
2. Transient regions
In the general case, it is not safe for a portable program to use the data space without reserving it.
According to the section 3.3.3.6 Other transient regions of the Forth Standard, the contents of the data space regions identified by PAD, WORD, and #> may become invalid after the data space is allocated. Сonsequently, the contents of the region identified by HERE may become invalid after the contents of the regions identified by PAD, WORD, and #> are changed.
See also A.3.3.3.6 Other transient regions:
In many existing Forth systems, these areas are at HERE or just beyond it, hence the many restrictions.
Moreover, some Forth systems may use the region identified by HERE in internal purposes during translation. E.g. Gforth 0.7.9 uses this region when decoding escaped strings.
The phrase:
s\" test\-passed" cr here over type cr type cr
outputs:
test-passed
test-passed
So, you have to check the restrictions of your Forth system whether you may use the region identified by HERE without reserving the space (and in what conditions).
Related
I'm working on a dual OS system with STM32F103, I have two separate program that programmed on different FLASH locations. if both of the programs are the same, the only way to know which of them running is just by its start vector address.
But How I Can Read The Current Program Start Vector Address in STM32 ???
After reading the comments, it sounds like what you have/want is a bootloader. If your goal here is to have two different applications, one to do your main processing and real time handling and the other to just program new firmware, then you want to make a bootloader in your default boot flash space.
Bootloaders fundamentally do a few things, everything else is extra.
Check itself using some type of data integrity check like a CRC.
Checks the application
Jumps to the application.
Bootloaders will also program applications in the app space and verify they are programmed correctly before jumping as well. Colin gave some good advice about appending a CRC to the hex file before it is programmed in flash space to verify the applications.
There are a few things to look out for. The first would be the linker script and this is extremely important. A linker script will be used to map input objects to output objects and then determine based upon that script, what memory space they go into. For both of your applications, you need to create a memory map of how you want both programs to sit inside of the flash space. From this point, you can then make linker scripts for both programs so that a hex file can be generated within the parameters of what you deem acceptable flash space for the program. Each project you have will have its own linker script. An example would look something like this:
LR_IROM1 0x08000000 0x00010000 { ; load region size_region
ER_IROM1 0x08000000 0x00010000 { ; load address = execution address
*.o (RESET, +First)
*(InRoot$$Sections)
.ANY (+RO)
}
RW_IRAM1 0x20000000 0x00018000 { ; RW data
.ANY (+RW +ZI)
}
}
This will give RAM for the application to use as well as a starting point for the application.
After that, you can start the bootloader and give it information about where the application space lies for jumping and programming. Once again this is all determined by you from your memory map and both applications' linker scripts. You are going to need to add a separate entry inside of the linker for your CRC and length for a comparison of the calculated versus stored as well. Whatever tool you use to append the CRC to the hex file and have it programmed to flash space, remember to note the location and make it known to the linker script so you can reference those addresses to check integrity later.
After you check everything and it is determined that it is okay to go to the application, you can use some ARM assembly to jump to the starting application address. Before jumping, make sure to disable all peripherals and interrupts that were enabled in the bootloader. As Colin mentioned, these will share RAM, so it is important you de-initialize all used, otherwise, you'll end up with a hard fault.
At this point, the program used another hex file laid out by a linker script, so it should begin executing as planned, as long as you have the correct vector table offset, which gets into your question fully.
As far as your question on the "Flash vector address", I think what your really mean is your interrupt vector table address. An interrupt vector table is a data structure in memory that maps interrupt requests to the addresses of interrupt handlers. This is where the PC register grabs the next available instruction address upon hardware interrupt triggers, for example. You can see this by keeping track of the ARM pipeline in a few lines of assembly code. Each entry of this table is a handler's address. This offset must be aligned with your application, otherwise you will never go into the main function and the program will sit in the application space, but have nothing to do since all handlers addresses are unknown. This is what the SCB->VTOR is for. It is a vector interrupt table offset address register. In this case, there are a few things you can do. Luckily, these are hard-coded inside of STM generated files inside of the file "system_stm32(xx)xx.c" (xx is your microcontroller variant). There is a define for something called VECT_TAB_OFFSET which is the offset in the memory map of the vector table and is assigned to the SCB->VTOR register with the value that is chosen. Your interrupt vector table will always lie at the starting address of your main application, so for the bootloader it can be 0x00, but for the application, it will be the subtraction of the starting address of the application space, and the first addressable flash address of the microcontroller.
/************************* Miscellaneous Configuration ************************/
/*!< Uncomment the following line if you need to relocate your vector Table in
Internal SRAM. */
/* #define VECT_TAB_SRAM */
#define VECT_TAB_OFFSET 0x00 /*!< Vector Table base offset field.
This value must be a multiple of 0x200. */
/******************************************************************************/
Make sure you understand what is expected from the micro side using STM documentation before programming things. Vector tables in this chip can only be in multiples of 0x200. But to answer your question, this address can be determined by a few things. Your memory map, and eventually, you will have a hard-coded reference to it as a define. You can figure it out from there.
Hope this helps and good luck to you on your application.
I realized Data memory implementation in nand2tetris course. But I really don't understand some parts of my implementation:
CHIP Memory {
IN in[16], load, address[15];
OUT out[16];
PARTS:
DMux4Way(in=load, sel=address[13..14], a=RAM1, b=RAM2, c=scr, d=kbr);
Or(a=RAM1, b=RAM2, out=RAM);
RAM16K(in=in, load=RAM, address=address[0..13], out=RAMout);
Screen(in=in, load=scr, address=address[0..12], out=ScreenOut);
Keyboard(out=KeyboardOut);
Mux4Way16(a=RAMout, b=RAMout, c=ScreenOut, d=KeyboardOut, sel=address[13..14], out=out);
}
Is responsible for what load here. I understand that if load is 0 - out of Dmux4Way in any case will be 0 0 0 0. But i don't understand how it works in that case after that. Namely how it allows don't load data in Memory.
At least incomprehensible why in Screen we fed address[0..12] instead address[0..14] - full address. In my opinion we should use second because Screen memory map stay after RAM memory map and if we want to request for Screen memory map - we should use range (16 384 - 24 575) - decimal or (100000000000000 - 101111111111111) - binary. But how we can represent that range use just 13 width buss (address[0..12]) ??? It's impossible.
Therefore if we want to represent Screen memory map we should use range which was presented above. And that range has 15 width or address[0..14] BUT not address[0..12] (width 13). But why works just address[0..12] and doesn't work address[0..14](full address)
DMux4Way(in=load, sel=address[13..14], a=RAM1, b=RAM2, c=scr, d=kbr);
I'm sorry to criticize you at the beginning, but questions you ask suggest that you didn't do this exercise yourself or didn't start the whole course from the beginning.
To answer your questions:
Ad.1.
You demultiplex a single bit (load bit) to the correct memory part. Thereafter, you then feed the input data to all memory parts at the same time.
It's easier and neater than doing it the other way around, namely, to direct 16-bit input to the correct part (RAM16K, screen, or keyboard) while having a load bit that is connected and active at every register in all the parts.
To clarify. You have 2 possible destinations when writing data: RAM and Screen. The smallest demultiplexer you have is a 4-way multiplexer and that's what you're using. When you write into memory, you need to provide 2 pieces of information: the data and destination, both at the same time.
You might demultiplex the input data with DMux4Way16 and separately single load bit with DMux4Way but that would take 2 demultiplexers, and we can do better than that. That's what's done here, you direct data input to both RAM and Screen and then only use one demultiplexer : DMux4Way to select one of 2 possible destinations, only the one selected will be loaded with new data, on the other data input will be ignored. Knowing that, you need to study A-instruction format: when bit 14 and 13 of A-instruction (or data residing in A-register) have the binary value 00 or 01, the destination is RAM. When bit 14 and 13 have the binary value 10, it means the screen is the destination.
When you notice that you choose these 2 bits as sel for your demultiplexer. Selections 0 and 1 have the same meaning, so you can OR them and feed the output as load to RAM. Selection 2 means Screen will be loaded with a, new value, so load bit goes there. Selection 3 is never used so we don't care about it - output d of demultiplexer will not be connected anywhere. We make use of the demultiplexer's feature: The selected output will have value 1 and all other outputs will yield 0 as a result. It means only 1 memory destination will be loaded.
Ad.2.
Screen is separate device, it has nothing to do with RAM, ROM or Keyboard memory devices here. You, and only you, give meaning to what bits mean what to this specific device. To answer your question, when you address some register in Screen you address it in its own internal address space. In its internal address space first address will be 0, but from whole Memory it will be 16384. It's your job to make this transition. In this particular case, size of Screen memory device it is not necessary to use 14-bit address bus, 13 bits is all you need. What would 14th bit mean in this case? It wouldn't add any value. Also, you are user and not designer of Screen, you only look at and follow its interface description.
Hope it answers your questions, if not I urge you to go back and study more carefully previous hardware related chapters from course.
what is double word aligned data??. i am working on a Ti processor with the c6accel DSP engine.The fft function requires the input data array of samples to be double word aligned.what does double word aligned exactly mean and how do i generate it?
A word is the amount of data which each register in the CPU is able to hold. This is dependant on the processor being used - on 32-bit systems this will be 32 bits, and on 64-bit systems this will be 64 bits, etc... Your TI processor will probably be either 16-bit or 32-bit depending on the model (guess based on this).
The size of a word will generally correspond to the size of a pointer, although technically this is not guarenteed to be true (doesn't work on PS3/XB360) and as a result should not be relied on as a rule (source). The correct way of determining the size of a world will depend on which operating system you're using. As quoted from the previous source:
The C header file may defines WORD_BIT and/or __WORDSIZE.
The size of a double word is just the size of a word * 2. Data in memory (RAM) is generally fetched by programs 1 word at a time assuming the fetch begins on a word boundary. If this is not the case the data on either side of the word boundary will need to be fetched in two separate instructions, which leads to inefficiencies since twice the amount of fetches need to be done and reading/writing to/from RAM is relatively slow (Sidenote: this is largely mitigated by caches in modern processors, although that's another topic altogether).
For TI C6x: double word = 64-bit
Make sure your array starts at an address that is a multiple of 8 bytes.
You can use the assembly directive .align before declaring your array, or you can use the linker to link your array section at an aligned address. If you can't control the address (e.g. you have memory from malloc), make your array start after a few bytes (to the next 8 byte boundary).
I was reading about memory architecture and I got a bit confused with the paging and segmentation. I read that modern OS systems use only paging to manage memory access but looking at a disassembled codes I can see segments like "ds" and "fs". Does it means that the OS (saw that on Windows and Linux) is using both segmentation and paging or is it just mapping all the segments into the same pages (making segments irrelevant) ?
Ok, based on the book Modern Operating Systems 3rd Edition by Andrew S. Tanenbaum and the materials form Open Security Training (opensecuritytraining.info), i manage to understand the segmentation and paging and the answer for my question is:
Concepts:
1.1.Segmentation:
Segmentation is the division of the memory into pieces (sections) called segments. These segments are independents from each other, have variable sizes and may grow as needed.
1.2. Virtual Memory:
A virtual memory is an abstraction of real memory. This means that it maps a virtual address (used by programs) into a physical address (used by the hardware). If a program wants to access the memory 2000 (mov eax, 2000), the virtual address (2000) will be converted into the real physical address ( for example 1422) and then the program will access the memory 1422 thinking that he’s accessing the memory 2000.
So, if virtual memory is being used by the system, programs no longer access real memory directly, instead, they used the correspondent virtual memory.
1.3. Paging:
Paging is a virtual memory management scheme. As explained before, it maps a virtual address into a physical address. Paging divides the virtual memory into pieces called “pages” and also divides the physical memory into pieces called “frame pages”. A page can be bound to one or more frame page ( keep in mind that a page can map different frame pages, but just one at the time)
Advantages and Disadvantages
2.1. Segmentation:
The main advantage of using segmentation is to divide the memory into different linear address spaces. With this programs can have an address space for his code, other for the stack, another for dynamic variables (heap) etc. The disadvantage for segmentation is the fragmentation of the memory.
Consider this example – One portion of the memory is divided into 4 segments sequentially allocated -> seg1(in use) --- seg2(in use)--- seg3(in use)---seg4(in use). Now if we free the memory from seg2, we will have -> seg1(in use) --- seg2(FREE)--- seg3(in use)---seg4(in use). If we want to allocate some data, we can do it by using seg2, but if the size of data is bigger than the size of the seg2, we won’t be able to do so and the space will be wasted and the memory fragmented. Another problem is that some segments could have a larger size and since this segment can’t be “broken” into smaller pieces, it must be fully allocated in memory.
2.1. Paging:
The main advantages of using paging is to abstract the physical memory, hence, the programs (and programmers) don’t need to bother with memory addresses. You can have two programs that access the (virtual) memory 2000 because it will be mapped into two different physical memories. Also, it’s possible to use the hard disk to ensure that only the necessary pages are allocated into memory. The main disadvantage is that paging uses only one linear address space. Paging maps virtual memories from 0 to the max addressable value for all programs. Now consider this example - two chunks of memory, “chk1” and “chk2” are next to each other on virtual memory ( chk1 is allocated from address 1000 to 2000 and chk2 uses from 2001 to 3000), if chk1 needs to grow, the OS needs to make room in the memory for the new size of chk1. If that happens again, the OS will have to do the same thing or, if no space can be found, an exception will pop. The routines for managing this kind of operations are very bad for performance, because it this growth and shrink of the memory happens a lot of time.
Segmentation and Paging
3.1. Why combine both?
An operational systems can combine both segmentation and paging. Why? Because combining then it’s possible to have the best from both without their disadvantages. So, the memory is divided in many segments and each one of those have one or more pages. When a program tries to access a memory address, first the OS goes to the correspondent segment, there he’ll find the correspondent page, so he goes to the page and there he’ll find the frame page that the program wanted to access. Using this approach, the OS divides the memory into various segments, including segments for the kernel and user programs. These segments have variable sizes and access protection features. To solve the fragmentation and the “big segment” problems, paging is used. With paging a larger segment is broken into several pages and only the necessary pages remains in memory (and more pages could be allocated in memory if needed). So, basically, the Modern OSs have two memory abstractions, where segmentation is more used for “handling the programs” and Paging for “managing the physical memory”.
3.2. How it really works?
An OS running segmentation and Paging will have the following structures:
3.2.1. Segment Selector:
This represents an index on the Global/Local Descriptor Table. It contains 3 fields that represent the index on the descriptor table, a bit to identify if that segment is present on Global or Local descriptor table and a privilege level.
3.2.2. Segment Register:
A CPU register used to store segment selector. Usually ( on a x86 machine) there is at least the register CS (Code Segment) and DS (Data Segment).
3.2.3. Segment descriptors:
This structure contains the data regarding a segment, like his base address, size ( in pages or in bytes), the access privilege, the information of if this segment is present on the memory or not, etc. (for all the fields, search for segment descriptors on google)
3.2.4. Global/Local Descriptor Table:
The table that contains several segment descriptors. So this structure holds all the segment descriptors for the system. If the table is Global, it can hold other things like Local Descriptor Tables descriptors, Task State Segment Descriptors and Call Gate descriptors (I’ll not explain these here, please, google them). If the table is Local, it will only ( as far as I know) holds user related segment descriptors.
3.3. How it works?
So, to access a memory, first the program needs to access a segment. To do this a segment selector is loaded into a segment register and then a Global or Local Descriptor table (depending of field on the segment selector). This is the reason of the fully memory address be SEGMENT REGISTER: ADDRESS , like CS:ADDRESS -> 001B:0044BF7A. Now the OS goes to the G/LDT and (using the index field of the segment selector) finds the segment descriptor of the address trying to access. Then it checks if the segment if present, the protection and if everything is ok it goes to the address stated on the “base field” (of the descriptor) + the address offset . If paging is not enabled, the system goes directly into the real memory, but with paging on the address is treated as a virtual address and it goes to the Page directory. The base address + the offset are called Linear Address and will be interpreted as 3 fields: Directory+Page+offset. So on the Directory page, it will search for the directory entry specified on the “directory” field of the linear address, this entry points to the page table and the field “page” of the linear address is used to find the page, this entry points to a frame page and the offset is used to find the exactly address that the program want to access.
3.4. Modern Operational Systems
Modern OSes "do not use" segmentation. Its in quotes because they use 4 segments: Kernel Code Segment, Kernel Data Segment, User Code Segment and User Data Segment. What does it means is that all user's processes have the same code and data segments (so the same segment selector). The segments only change when going from user to kernel.
So, all the path explained on the section 3.3. occurs, but they use the same segments and, since the page tables are individual per process, a page fault is difficult to happen.
Hope this helps and if there is any mistakes or some more details ( I was I bit generic on some parts) please, feel free to comment and reply.
Thank you guys
Danilo PC
They only use paging for memory protection, while they use segmentation for other purposes (like storing thread-local data).
Why does a process's address space have to divide into four segments (text, data, stack and heap)? What is the advandatage? is it possible to have only one whole big segment?
There are multiple reasons for splitting programs into parts in memory.
One of them is that instruction and data memories can be architecturally distinct and discontiguous, that is, read and written from/to using different instructions and circuitry inside and outside of the CPU, forming two different address spaces (i.e. reading code from address 0 and reading data from address 0 will typically return two different values, from different memories).
Another is reliability/security. You rarely want the program's code and constant data to change. Most of the time when that happens, it happens because something is wrong (either in the program itself or in its inputs, which may be maliciously constructed). You want to prevent that from happening and know if there are any attempts. Likewise you don't want the data areas that can change to be executable. If they are and there are security bugs in the program, the program can be easily forced to do something harmful when malicious code makes it into the program data areas as data and triggers those security bugs (e.g. buffer overflows).
Yet another is storage... In many programs a number of data areas aren't initialized at all or are initialized to one common predefined value (often 0). Memory has to be reserved for these data areas when the program is loaded and is about to start, but these areas don't need to be stored on the disk, because there's no meaningful data there.
On some systems you may have everything in one place (section/segment/etc). One notable example here is MSDOS, where .COM-style programs have no structure other than that they have to be less than about 64KB in size and the first executable instruction must appear at the very beginning of file and assume that its location corresponds to IP=0x100 (where IP is the instruction pointer register). How code and data are placed and interleaved in a .COM program is unimportant and up to the programmer.
There are other architectural artifacts such as x86 segments. Again, MSDOS is a good example of an OS that deals with them. .EXE-style programs in it may have multiple segments in them that correspond directly to the x86 CPU segments, to the real-mode addressing scheme, in which memory is viewed through 64KB-long "windows" known as segments. The position of these windows/segments is relative to the value of the CPU's segment registers. By altering the segment register values you can move the "windows". In order to access more than 64KB one needs to use different segment register values and that often implies having multiple segments in the .EXE (can be not just one segment for code and one for data, but also multiple segments for either of them).
At least the text and data segments are separated to prevent malicious code that's stored inside a variable from being run.
Instructions (compiled code) are stored in the text segment, while the contents of your variables are stored in a data segment, the latter of which never gets executed, only read from and written to.
A little more info here.
Isn't this distinction just a big, hacky workaround for patching security into the von-Neumann architecture where data and instructions share the same memory?