I am attempting to move sections of memory around using a linker script for an STM32F446ZE micro controller. My original setup consisted of this:
MEMORY
{
RAM (xrw) : ORIGIN = 0x20000000, LENGTH = 128K
FLASH (rx) : ORIGIN = 0x8000000, LENGTH = 512K - 128k
DATA (rwx) : ORIGIN = 0x08060000, LENGTH = 5120
}
SECTIONS
{
.user_data :
{
. = ALIGN(4);
KEEP(*(.user_data))
. = ALIGN(4);
} >DATA
.isr_vector :
{
. = ALIGN(4);
KEEP(*(.isr_vector)) /* Startup code */
. = ALIGN(4);
} >FLASH
.text :
{
. = ALIGN(4);
*(.text) /* .text sections (code) */
*(.text*) /* .text* sections (code) */
*(.glue_7) /* glue arm to thumb code */
*(.glue_7t) /* glue thumb to arm code */
*(.eh_frame)
KEEP (*(.init))
KEEP (*(.fini))
. = ALIGN(4);
_etext = .; /* define a global symbols at end of code */
} >FLASH
.rodata :
{
. = ALIGN(4);
*(.rodata) /* .rodata sections (constants, strings, etc.) */
*(.rodata*) /* .rodata* sections (constants, strings, etc.) */
. = ALIGN(4);
} >FLASH
.data :
{
. = ALIGN(4);
_sdata = .; /* create a global symbol at data start */
*(.data) /* .data sections */
*(.data*) /* .data* sections */
. = ALIGN(4);
_edata = .; /* define a global symbol at data end */
} >RAM AT> FLASH
. = ALIGN(4);
.bss :
{
/* This is used by the startup in order to initialize the .bss secion */
_sbss = .; /* define a global symbol at bss start */
__bss_start__ = _sbss;
*(.bss)
*(.bss*)
*(COMMON)
. = ALIGN(4);
_ebss = .; /* define a global symbol at bss end */
__bss_end__ = _ebss;
} >RAM
/* User_heap_stack section, used to check that there is enough RAM left */
._user_heap_stack :
{
. = ALIGN(4);
PROVIDE ( end = . );
PROVIDE ( _end = . );
. = . + _Min_Heap_Size;
. = . + _Min_Stack_Size;
. = ALIGN(4);
} >RAM
What I want to do is move the DATA to start at 0x08000000 (where flash is currently starting) and start FLASH at 0x08040000 (after DATA). I can change that in the memory section easy enough, but my program wont start. I believe some of the code in the SECTIONS block may have to be changed, but I'm not sure how. Question is: how can I move flash (where the program code is) to a later memory address.
It is not possible as your STM32 uC starts from the address 0x8000000 when booting from flash.
Question is: how can I move flash (where the program code is) to a later memory address.
The answer: it is not possible. The vector table has to start at 0x8000000 when booting from the FLASH memory
As P__J__ mentioned you can not move whole data region to the address 0x0800 0000 because MCU expects the interrupt vector to start there. When the MCU is booted from the flash memory the address 0x0800 0000 is mapped to the address 0x0000 0000.
What you can do instead is to create another region for the length of vector table and move the other parts of your sections as you please.
MEMORY
{
RAM (xrw) : ORIGIN = 0x20000000, LENGTH = 128K
VECTORS (rx) : ORIGIN = 0x08000000, LENGTH = 0xB8
FLASH (rx) : ORIGIN = 0x080000B8, LENGTH = 512K - 128k - 0xB8
DATA (rwx) : ORIGIN = 0x08060000, LENGTH = 5120
}
.isr_vector :
{
KEEP(*(.isr_vector))
} > VECTORS
How can I change the start address on flash? This would apply if you had stmf7, sadly you do not have a feature of booting from any address you like. There is a fixed number of options you have.
Take a look at data sheet, 7.1.2 Reset: The RESET service routine vector is fixed at address 0x0000_0004 in the memory map. which means that the second 4 bytes in your flash are the address of reset handler.
However, you can change a place where you boot from by using BOOT pin, again consult a datasheet at 2.4 STM32F4xx microcontrollers implement a special mechanism to be able to boot from other memories (like the internal SRAM).
So your only option is to change type of memory is used for booting. But in your case it wont help at all.
Related
I have some questions regarding the flash memory with a dspic33ep512mu810.
I'm aware of how it should be done:
set all the register for address, latches, etc. Then do the sequence to start the write procedure or call the builtins function.
But I find that there is some small difference between what I'm experiencing and what is in the DOC.
when writing the flash in WORD mode. In the DOC it is pretty straightforward. Following is the example code in the DOC
int varWord1L = 0xXXXX;
int varWord1H = 0x00XX;
int varWord2L = 0xXXXX;
int varWord2H = 0x00XX;
int TargetWriteAddressL; // bits<15:0>
int TargetWriteAddressH; // bits<22:16>
NVMCON = 0x4001; // Set WREN and word program mode
TBLPAG = 0xFA; // write latch upper address
NVMADR = TargetWriteAddressL; // set target write address
NVMADRU = TargetWriteAddressH;
__builtin_tblwtl(0,varWord1L); // load write latches
__builtin_tblwth(0,varWord1H);
__builtin_tblwtl(0x2,varWord2L);
__builtin_tblwth(0x2,varWord2H);
__builtin_disi(5); // Disable interrupts for NVM unlock sequence
__builtin_write_NVM(); // initiate write
while(NVMCONbits.WR == 1);
But that code doesn't work depending on the address where I want to write. I found a fix to write one WORD but I can't write 2 WORD where I want. I store everything in the aux memory so the upper address(NVMADRU) is always 0x7F for me. The NVMADR is the address I can change. What I'm seeing is that if the address where I want to write modulo 4 is not 0 then I have to put my value in the 2 last latches, otherwise I have to put the value in the first latches.
If address modulo 4 is not zero, it doesn't work like the doc code(above). The value that will be at the address will be what is in the second set of latches.
I fixed it for writing only one word at a time like this:
if(Address % 4)
{
__builtin_tblwtl(0, 0xFFFF);
__builtin_tblwth(0, 0x00FF);
__builtin_tblwtl(2, ValueL);
__builtin_tblwth(2, ValueH);
}
else
{
__builtin_tblwtl(0, ValueL);
__builtin_tblwth(0, ValueH);
__builtin_tblwtl(2, 0xFFFF);
__builtin_tblwth(2, 0x00FF);
}
I want to know why I'm seeing this behavior?
2)I also want to write a full row.
That also doesn't seem to work for me and I don't know why because I'm doing what is in the DOC.
I tried a simple write row code and at the end I just read back the first 3 or 4 element that I wrote to see if it works:
NVMCON = 0x4002; //set for row programming
TBLPAG = 0x00FA; //set address for the write latches
NVMADRU = 0x007F; //upper address of the aux memory
NVMADR = 0xE7FA;
int latchoffset;
latchoffset = 0;
__builtin_tblwtl(latchoffset, 0);
__builtin_tblwth(latchoffset, 0); //current = 0, available = 1
latchoffset+=2;
__builtin_tblwtl(latchoffset, 1);
__builtin_tblwth(latchoffset, 1); //current = 0, available = 1
latchoffset+=2;
.
. all the way to 127(I know I could have done it in a loop)
.
__builtin_tblwtl(latchoffset, 127);
__builtin_tblwth(latchoffset, 127);
INTCON2bits.GIE = 0; //stop interrupt
__builtin_write_NVM();
while(NVMCONbits.WR == 1);
INTCON2bits.GIE = 1; //start interrupt
int testaddress;
testaddress = 0xE7FA;
status = NVMemReadIntH(testaddress);
status = NVMemReadIntL(testaddress);
testaddress += 2;
status = NVMemReadIntH(testaddress);
status = NVMemReadIntL(testaddress);
testaddress += 2;
status = NVMemReadIntH(testaddress);
status = NVMemReadIntL(testaddress);
testaddress += 2;
status = NVMemReadIntH(testaddress);
status = NVMemReadIntL(testaddress);
What I see is that the value that is stored in the address 0xE7FA is 125, in 0xE7FC is 126 and in 0xE7FE is 127. And the rest are all 0xFFFF.
Why is it taking only the last 3 latches and write them in the first 3 address?
Thanks in advance for your help people.
The dsPIC33 program memory space is treated as 24 bits wide, it is
more appropriate to think of each address of the program memory as a
lower and upper word, with the upper byte of the upper word being
unimplemented
(dsPIC33EPXXX datasheet)
There is a phantom byte every two program words.
Your code
if(Address % 4)
{
__builtin_tblwtl(0, 0xFFFF);
__builtin_tblwth(0, 0x00FF);
__builtin_tblwtl(2, ValueL);
__builtin_tblwth(2, ValueH);
}
else
{
__builtin_tblwtl(0, ValueL);
__builtin_tblwth(0, ValueH);
__builtin_tblwtl(2, 0xFFFF);
__builtin_tblwth(2, 0x00FF);
}
...will be fine for writing a bootloader if generating values from a valid Intel HEX file, but doesn't make it simple for storing data structures because the phantom byte is not taken into account.
If you create a uint32_t variable and look at the compiled HEX file, you'll notice that it in fact uses up the least significant words of two 24-bit program words. I.e. the 32-bit value is placed into a 64-bit range but only 48-bits out of the 64-bits are programmable, the others are phantom bytes (or zeros). Leaving three bytes per address modulo of 4 that are actually programmable.
What I tend to do if writing data is to keep everything 32-bit aligned and do the same as the compiler does.
Writing:
UINT32 value = ....;
:
__builtin_tblwtl(0, value.word.word_L); // least significant word of 32-bit value placed here
__builtin_tblwth(0, 0x00); // phantom byte + unused byte
__builtin_tblwtl(2, value.word.word_H); // most significant word of 32-bit value placed here
__builtin_tblwth(2, 0x00); // phantom byte + unused byte
Reading:
UINT32 *value
:
value->word.word_L = __builtin_tblrdl(offset);
value->word.word_H = __builtin_tblrdl(offset+2);
UINT32 structure:
typedef union _UINT32 {
uint32_t val32;
struct {
uint16_t word_L;
uint16_t word_H;
} word;
uint8_t bytes[4];
} UINT32;
I am trying to read data from channel 0,1,2 and 3 of ADC1. The issue is that when I read from channel 1 and channel 2 and run the code in debug mode it shows the correct valueenter image description heres but when I modified it for doing the same for channel 3 and 4 also,it does not show any value.Following is the code and the screen shot of memory map:
#include "stm32f4xx.h"
#include "stm32f4_discovery.h"
uint16_t ADC1ConvertedValue[4] = {0,0,0,0};//Stores converted vals [2] = {0,0}
void DMA_config()
{
RCC_AHB1PeriphClockCmd(RCC_AHB1Periph_DMA2 ,ENABLE);
DMA_InitTypeDef DMA_InitStruct;
DMA_InitStruct.DMA_Channel = DMA_Channel_0;
DMA_InitStruct.DMA_PeripheralBaseAddr = (uint32_t) 0x4001204C;//ADC1's data register
DMA_InitStruct.DMA_Memory0BaseAddr = (uint32_t)&ADC1ConvertedValue;
DMA_InitStruct.DMA_DIR = DMA_DIR_PeripheralToMemory;
DMA_InitStruct.DMA_BufferSize = 4;//2
DMA_InitStruct.DMA_PeripheralInc = DMA_PeripheralInc_Disable;
DMA_InitStruct.DMA_MemoryInc = DMA_MemoryInc_Enable;
DMA_InitStruct.DMA_PeripheralDataSize = DMA_PeripheralDataSize_HalfWord;//Reads 16 bit values _HalfWord
DMA_InitStruct.DMA_MemoryDataSize = DMA_MemoryDataSize_HalfWord;//Stores 16 bit values _Halfword
DMA_InitStruct.DMA_Mode = DMA_Mode_Circular;
DMA_InitStruct.DMA_Priority = DMA_Priority_High;
DMA_InitStruct.DMA_FIFOMode = DMA_FIFOMode_Enable;
DMA_InitStruct.DMA_FIFOThreshold = DMA_FIFOThreshold_HalfFull;//_HalfFull
DMA_InitStruct.DMA_MemoryBurst = DMA_MemoryBurst_Single;
DMA_InitStruct.DMA_PeripheralBurst = DMA_PeripheralBurst_Single;
DMA_Init(DMA2_Stream0, &DMA_InitStruct);
DMA_Cmd(DMA2_Stream0, ENABLE);
}
void ADC_config()
{
/* Configure GPIO pins ******************************************************/
ADC_InitTypeDef ADC_InitStruct;
ADC_CommonInitTypeDef ADC_CommonInitStruct;
GPIO_InitTypeDef GPIO_InitStruct;
RCC_AHB1PeriphClockCmd( RCC_AHB1Periph_GPIOA, ENABLE);
RCC_APB2PeriphClockCmd(RCC_APB2Periph_ADC1, ENABLE);//ADC1 is connected to the APB2 peripheral bus
GPIO_InitStruct.GPIO_Pin = GPIO_Pin_0 | GPIO_Pin_1 | GPIO_Pin_2 | GPIO_Pin_3;// PA0,PA1,PA3,PA3
GPIO_InitStruct.GPIO_Mode = GPIO_Mode_AN;//The pins are configured in analog mode
GPIO_InitStruct.GPIO_PuPd = GPIO_PuPd_NOPULL ;//We don't need any pull up or pull down
GPIO_Init(GPIOA, &GPIO_InitStruct);//Initialize GPIOA pins with the configuration
/* ADC Common Init **********************************************************/
ADC_CommonInitStruct.ADC_Mode = ADC_Mode_Independent;
ADC_CommonInitStruct.ADC_Prescaler = ADC_Prescaler_Div2;
ADC_CommonInitStruct.ADC_DMAAccessMode = ADC_DMAAccessMode_Disabled;
ADC_CommonInitStruct.ADC_TwoSamplingDelay = ADC_TwoSamplingDelay_5Cycles;
ADC_CommonInit(&ADC_CommonInitStruct);
/* ADC1 Init ****************************************************************/
ADC_InitStruct.ADC_Resolution = ADC_Resolution_12b;//Input voltage is converted into a 12bit int (max 4095)
ADC_InitStruct.ADC_ScanConvMode = ENABLE;//The scan is configured in multiple channels
ADC_InitStruct.ADC_ContinuousConvMode = ENABLE;//Continuous conversion: input signal is sampled more than once
ADC_InitStruct.ADC_ExternalTrigConv = DISABLE;
ADC_InitStruct.ADC_ExternalTrigConvEdge = ADC_ExternalTrigConvEdge_None;
ADC_InitStruct.ADC_DataAlign = ADC_DataAlign_Right;//Data converted will be shifted to right
ADC_InitStruct.ADC_NbrOfConversion = 4;
ADC_Init(ADC1, &ADC_InitStruct);//Initialize ADC with the configuration
/* Select the channels to be read from **************************************/
ADC_RegularChannelConfig(ADC1, ADC_Channel_0, 1, ADC_SampleTime_144Cycles);//PA0
ADC_RegularChannelConfig(ADC1, ADC_Channel_1, 2, ADC_SampleTime_144Cycles);//PA1
ADC_RegularChannelConfig(ADC1, ADC_Channel_2, 3, ADC_SampleTime_144Cycles);//PA2
ADC_RegularChannelConfig(ADC1, ADC_Channel_3, 4, ADC_SampleTime_144Cycles);//PA3
/* Enable DMA request after last transfer (Single-ADC mode) */
ADC_DMARequestAfterLastTransferCmd(ADC1, ENABLE);
/* Enable ADC1 DMA */
ADC_DMACmd(ADC1, ENABLE);
/* Enable ADC1 */
ADC_Cmd(ADC1, ENABLE);
}
int main(void)
{
DMA_config();
ADC_config();
while(1)
{
ADC_SoftwareStartConv(ADC1);
//value=ADC_Read();
}
return 0;
}
You shouldn't start the ADC conversion multiple times (in your while(1) loop).
Since you've configured the ADC with .ADC_ContinuousConvMode = ENABLE it should be enough to start it before the loop and do nothing within it.
Let us look at the STM32F103 linker script:
/* Entry Point */
ENTRY(Reset_Handler)
/* Highest address of the user mode stack */
_estack = 0x20005000; /* End of 20K RAM */
/* Generate a link error if heap and stack don't fit into RAM */
_Min_Heap_Size = 0; /* Required amount of heap */
_Min_Stack_Size = 0x100; /* Required amount of stack */
/* Specify the memory areas */
MEMORY
{
FLASH (rx) : ORIGIN = 0x08000000, LENGTH = 128K
RAM (xrw) : ORIGIN = 0x20000000, LENGTH = 20K
MEMORY_B1 (rx) : ORIGIN = 0x60000000, LENGTH = 0K
}
/* Define output sections */
SECTIONS
{
/* The startup code goes first into FLASH */
.isr_vector :
{
. = ALIGN(4);
KEEP(*(.isr_vector)) /* Startup code */
. = ALIGN(4);
} >FLASH
/* The program code and other data goes into FLASH */
.text :
{
. = ALIGN(4);
*(.text) /* .text sections (code) */
*(.text*) /* .text* sections (code) */
*(.rodata) /* .rodata sections (constants, strings, etc.) */
*(.rodata*) /* .rodata* sections (constants, strings, etc.) */
*(.glue_7) /* Glue arm to thumb code */
*(.glue_7t) /* Glue thumb to arm code */
KEEP (*(.init))
KEEP (*(.fini))
. = ALIGN(4);
_etext = .; /* Define a global symbols at end of code */
} >FLASH
.ARM.extab : { *(.ARM.extab* .gnu.linkonce.armextab.*) } >FLASH
.ARM : {
__exidx_start = .;
*(.ARM.exidx*)
__exidx_end = .;
} >FLASH
.ARM.attributes : { *(.ARM.attributes) } > FLASH
.preinit_array :
{
PROVIDE_HIDDEN (__preinit_array_start = .);
KEEP (*(.preinit_array*))
PROVIDE_HIDDEN (__preinit_array_end = .);
} >FLASH
.init_array :
{
PROVIDE_HIDDEN (__init_array_start = .);
KEEP (*(SORT(.init_array.*)))
KEEP (*(.init_array*))
PROVIDE_HIDDEN (__init_array_end = .);
} >FLASH
.fini_array :
{
PROVIDE_HIDDEN (__fini_array_start = .);
KEEP (*(.fini_array*))
KEEP (*(SORT(.fini_array.*)))
PROVIDE_HIDDEN (__fini_array_end = .);
} >FLASH
/* Used by the startup to initialize data */
_sidata = .;
/* Initialized data sections goes into RAM, load LMA copy after code */
.data : AT ( _sidata )
{
. = ALIGN(4);
_sdata = .; /* Create a global symbol at data start */
*(.data) /* .data sections */
*(.data*) /* .data* sections */
. = ALIGN(4);
_edata = .; /* Define a global symbol at data end */
} >RAM
/* Uninitialized data section */
. = ALIGN(4);
.bss :
{
/* This is used by the startup in order to initialize the .bss secion */
_sbss = .; /* Define a global symbol at BSS start */
__bss_start__ = _sbss;
*(.bss)
*(.bss*)
*(COMMON)
. = ALIGN(4);
_ebss = .; /* Define a global symbol at BSS end */
__bss_end__ = _ebss;
} >RAM
PROVIDE ( end = _ebss );
PROVIDE ( _end = _ebss );
/* User_heap_stack section, used to check that there is enough RAM left */
._user_heap_stack :
{
. = ALIGN(4);
. = . + _Min_Heap_Size;
. = . + _Min_Stack_Size;
. = ALIGN(4);
} >RAM
/* MEMORY_bank1 section, code must be located here explicitly */
/* Example: extern int foo(void) __attribute__ ((section (".mb1text"))); */
.memory_b1_text :
{
*(.mb1text) /* .mb1text sections (code) */
*(.mb1text*) /* .mb1text* sections (code) */
*(.mb1rodata) /* read-only data (constants) */
*(.mb1rodata*)
} >MEMORY_B1
/* Remove information from the standard libraries */
/DISCARD/ :
{
libc.a ( * )
libm.a ( * )
libgcc.a ( * )
}
}
I can see that ISR vectors are placed in flash, so if the program needs to call address stored in some vector, it will read from flash memory.
First question: How does hardware not make difference between reading from RAM and flash? And why do I then need special registers to write into or read from flash memory in code and can not just explicitly write or read from its addresses?
Second question is that about how much is it slower to read from flash than from RAM? And if I know what function is used the most in my code then am I able to move it to RAM section to crucially speed up it execution? I believe MEMORY_B1 in this script is made especially for this purpose.
Third question: How can we place anything in MEMORY_B1 if it has a length of 0?
And the last question: If I create an additional section in flash memory then can I create some simple analogue of virtual memory? I think answer to this question depends on the first one.
Hardware does not care, because both memories (and actually anything else) is mapped in the common address space. However this does not mean you can easily write flash memory to treat it as RAM, as this is slow and can quickly damage the memory (it has a typical write endurance of 100k cycles). Moreover, it can be erased only in full pages (which can be as big as 128 kB for some STM32 chips), which really makes it problematic to use as a substitute for RAM.
The difference in speed is negligible. Running your code from RAM on ARM Cortex-M microcontrollers will be slower than you expect, as RAM is connected to different bus (meant for data) and using it for code execution requires use of a slower "interconnect".
You cannot. If you would want to place something there, the size of memory would have to be increased (and size of "normal RAM" - decreased).
Generally you could do that, but it would be extremely slow and you would quickly damage the flash.
1/2. Is does not on an STM32F1. Only with core speeds over 100 Mhz the flash prefetch and cache miss will cost you. Even this chip has cache and prefetch.
With this core there might be a negligible small profit if you put the vector table in RAM in some special cases.
There could however be a hardware limit that applies to the access width used. But this flash is not affected by that.
3. Yes you certainly can. You'd could put a filesystem in it. But you the temperature range in which is can reliably write the flash is limited. And, since there is only one bank of flash, all activity halts until the erase/flash has been succeeded. Unless code is running from RAM.
Two extra caveats are that flash programming/erasing can take milliseconds, and you'd have to take into account the page erases of 2 kB each, but all flash controllers have to.
If you're in need of extra RAM, put some SPI FRAM on your board.
I am using the following library <flash.h> to Erase/Write/Read from memory but unfortunately the data I am trying to save doesn't seem to be written to flash memory. I am using PIC18F87j11 with MPLAB XC8 compiler. Also when I read the program memory from PIC after attempting to write to it, there is no data on address 0x1C0CA. What am I doing wrong?
char read[1];
/* set FOSC clock to 8MHZ */
OSCCON = 0b01110000;
/* turn off 4x PLL */
OSCTUNE = 0x00;
TRISDbits.TRISD6 = 0; // set as ouput
TRISDbits.TRISD7 = 0; // set as ouput
LATDbits.LATD6 = 0; // LED 1 OFF
LATDbits.LATD7 = 1; // LED 2 ON
EraseFlash(0x1C0CA, 0x1C0CA);
WriteBytesFlash(0x1C0CA, 1, 0x01);
ReadFlash(0x1C0CA, 1, read[0]);
if (read[0] == 0x01)
LATDbits.LATD6 = 1; // LED 1 ON
while (1) {
}
I don't know what WriteFlashBytes does but the page size for your device is 64 bytes and after writing you need to write an ulock sequence to EECON2 and EECON1 registers to start programming the flash memory
Is there a way for the gnu linker to combine memory blocks so the linker will use one sector name when assigning memory?
For example:
MEMORY
{
RAM1 (xrw) : ORIGIN = 0x20000480, LENGTH = 0x0BB80
RAM2 (xrw) : ORIGIN = 0x2001C000, LENGTH = 0x03C00
}
Can there be a memory block our sector that includes memory blocks RAM1 and RAM2? Something like this below:
.bss :
{
_bss_start = .;
*(.bss)
*(.bss.*)
*(COMMON)
_bss_end = .;
} >RAM >RAM1
Good question. There are multiple ways of to do this. One way would be to actually split the BSS section by selecting which file's BSS goes where.
MEMORY
{
RAM1 (xrw) : ORIGIN = 0x20000480, LENGTH = 0x0BB80
RAM2 (xrw) : ORIGIN = 0x2001C000, LENGTH = 0x03C00
}
SECTIONS
{
.bss1:
{
f1.o
. =+ 0x200;
f2.o (.bss)
} >RAM1
.bss2:
{
f3.o (.bss)
f4.o (.bss) = 0x1234
} >RAM2
}
Instead of doing this for each file (only useful if you have tiny RAM/ROM chips), I recommend to just place for example COMMON on RAM2, and .bss on RAM1.