How to implement a compile-time [dispatch] table for AVR?

I have the same prerequisites as Dave Durbin in How can I implement a dynamic dispatch table in C... except my target is AVR. Here are my constraints:
modules are to be picked in a list, much like Linux compiled-in kernel modules
the number of C (can be C++) modules is known at compile-time
modules are to be statically linked (obviously)
I want the table in program memory, not in SRAM
Typically, the table should comprise items of this type:
typedef struct jump_item {
uint16_t function_id;
void (*callback)(void);
} jump_item_t;
I have tried using custom sections as suggested in that answer, but the linker then throws an error for an unknown symbol __start_myownsection (whatever section name I use). That is to be expected, since the code in that answer targets Linux/GCC. But I think I'm close, because avr-gcc can use sections; I just haven't figured out how to stack symbols in a user-defined section, get a pointer to the beginning of the table, and determine the length of the table at run-time.
How could Art's answer be adapted to AVR?
* EDIT *
I can see at least two ways to achieve what I want using sections, either with functions "attached" to a user-defined section or tables of structures (as defined above) that all will stack up in the user-defined section. My current issues are:
unused variables are optimized away at compile-time!
unused functions are optimized away at link-time due to linker argument -gc-sections, which I need to clean unused functions.
I prefer the second option, something similar to this:
module1.c:
const jump_item_t module1_table[] __attribute__((__progmem__, section("tbl_dispatch"))) =
{
{ 0x02, func11 },
{ 0x03, func12 },
...
};
module2.c:
const jump_item_t module2_table[] __attribute__((__progmem__, section("tbl_dispatch"))) =
{
{ 0x12, func21 },
{ 0x13, func22 },
...
};
Note: indices aren't to be considered relevant.
When all modules define such variables, they're optimized away because nothing references them. They need to stack up in section tbl_dispatch though. So my question comes down to:
How can I stop the compiler from removing variables it "thinks" are unused, but only in specific C/C++ modules?
The global command line I'm using so far is as follows:
avr-gcc -g -Wall -mcall-prologues -fshort-enums -Os \
-DF_CPU=8000000 -Wl,-relax -mmcu=... \
*.cpp *.c -o main
* EDIT *
To my disappointment, PROGMEM and custom sections don't go together. I've tried to combine them, but I get jump tables scattered across program memory... when they are included at all. In fact, not even all of the tables appear in program memory.
Giving up.
Any idea welcome.

You can definitely make a module system if you write your own custom linker script, and copy what was done for constructors and destructors (ctors and dtors). The linker script below was based on avr5.x from AVR GCC, but I added the dispatch stuff to it.
If you look at the output of the build script in the shell session below, you can see that the dispatch table is set up correctly and has symbols pointing to the start and end of it. The shell session includes all the source code and build scripts that I used to compile this example.
$ ls
avr5-x-modules.ld build.sh kernel.c kernel.h module_foo.c
$ cat avr5-x-modules.ld
/* Default linker script, for normal executables */
/* Copyright (C) 2014 Free Software Foundation, Inc.
Copying and distribution of this script, with or without modification,
are permitted in any medium without royalty provided the copyright
notice and this notice are preserved. */
OUTPUT_FORMAT("elf32-avr","elf32-avr","elf32-avr")
OUTPUT_ARCH(avr:5)
MEMORY
{
text (rx) : ORIGIN = 0, LENGTH = 128K
data (rw!x) : ORIGIN = 0x800060, LENGTH = 0xffa0
eeprom (rw!x) : ORIGIN = 0x810000, LENGTH = 64K
fuse (rw!x) : ORIGIN = 0x820000, LENGTH = 1K
lock (rw!x) : ORIGIN = 0x830000, LENGTH = 1K
signature (rw!x) : ORIGIN = 0x840000, LENGTH = 1K
user_signatures (rw!x) : ORIGIN = 0x850000, LENGTH = 1K
}
SECTIONS
{
/* Read-only sections, merged into text segment: */
.hash : { *(.hash) }
.dynsym : { *(.dynsym) }
.dynstr : { *(.dynstr) }
.gnu.version : { *(.gnu.version) }
.gnu.version_d : { *(.gnu.version_d) }
.gnu.version_r : { *(.gnu.version_r) }
.rel.init : { *(.rel.init) }
.rela.init : { *(.rela.init) }
.rel.text :
{
*(.rel.text)
*(.rel.text.*)
*(.rel.gnu.linkonce.t*)
}
.rela.text :
{
*(.rela.text)
*(.rela.text.*)
*(.rela.gnu.linkonce.t*)
}
.rel.fini : { *(.rel.fini) }
.rela.fini : { *(.rela.fini) }
.rel.rodata :
{
*(.rel.rodata)
*(.rel.rodata.*)
*(.rel.gnu.linkonce.r*)
}
.rela.rodata :
{
*(.rela.rodata)
*(.rela.rodata.*)
*(.rela.gnu.linkonce.r*)
}
.rel.data :
{
*(.rel.data)
*(.rel.data.*)
*(.rel.gnu.linkonce.d*)
}
.rela.data :
{
*(.rela.data)
*(.rela.data.*)
*(.rela.gnu.linkonce.d*)
}
.rel.ctors : { *(.rel.ctors) }
.rela.ctors : { *(.rela.ctors) }
.rel.dtors : { *(.rel.dtors) }
.rela.dtors : { *(.rela.dtors) }
.rel.got : { *(.rel.got) }
.rela.got : { *(.rela.got) }
.rel.bss : { *(.rel.bss) }
.rela.bss : { *(.rela.bss) }
.rel.plt : { *(.rel.plt) }
.rela.plt : { *(.rela.plt) }
/* Internal text space or external memory. */
.text :
{
*(.vectors)
KEEP(*(.vectors))
/* For data that needs to reside in the lower 64k of progmem. */
*(.progmem.gcc*)
/* PR 13812: Placing the trampolines here gives a better chance
that they will be in range of the code that uses them. */
. = ALIGN(2);
__trampolines_start = . ;
/* The jump trampolines for the 16-bit limited relocs will reside here. */
*(.trampolines)
*(.trampolines*)
__trampolines_end = . ;
*(.progmem*)
. = ALIGN(2);
/* For future tablejump instruction arrays for 3 byte pc devices.
We don't relax jump/call instructions within these sections. */
*(.jumptables)
*(.jumptables*)
/* For code that needs to reside in the lower 128k progmem. */
*(.lowtext)
*(.lowtext*)
__ctors_start = . ;
*(.ctors)
__ctors_end = . ;
__dtors_start = . ;
*(.dtors)
__dtors_end = . ;
KEEP(SORT(*)(.ctors))
KEEP(SORT(*)(.dtors))
__dispatch_start = . ;
*(.dispatch)
__dispatch_end = . ;
KEEP(SORT(*)(.dispatch))
/* From this point on, we don't bother about wether the insns are
below or above the 16 bits boundary. */
*(.init0) /* Start here after reset. */
KEEP (*(.init0))
*(.init1)
KEEP (*(.init1))
*(.init2) /* Clear __zero_reg__, set up stack pointer. */
KEEP (*(.init2))
*(.init3)
KEEP (*(.init3))
*(.init4) /* Initialize data and BSS. */
KEEP (*(.init4))
*(.init5)
KEEP (*(.init5))
*(.init6) /* C++ constructors. */
KEEP (*(.init6))
*(.init7)
KEEP (*(.init7))
*(.init8)
KEEP (*(.init8))
*(.init9) /* Call main(). */
KEEP (*(.init9))
*(.text)
. = ALIGN(2);
*(.text.*)
. = ALIGN(2);
*(.fini9) /* _exit() starts here. */
KEEP (*(.fini9))
*(.fini8)
KEEP (*(.fini8))
*(.fini7)
KEEP (*(.fini7))
*(.fini6) /* C++ destructors. */
KEEP (*(.fini6))
*(.fini5)
KEEP (*(.fini5))
*(.fini4)
KEEP (*(.fini4))
*(.fini3)
KEEP (*(.fini3))
*(.fini2)
KEEP (*(.fini2))
*(.fini1)
KEEP (*(.fini1))
*(.fini0) /* Infinite loop after program termination. */
KEEP (*(.fini0))
_etext = . ;
} > text
.data :
{
PROVIDE (__data_start = .) ;
*(.data)
*(.data*)
*(.rodata) /* We need to include .rodata here if gcc is used */
*(.rodata*) /* with -fdata-sections. */
*(.gnu.linkonce.d*)
. = ALIGN(2);
_edata = . ;
PROVIDE (__data_end = .) ;
} > data AT> text
.bss ADDR(.data) + SIZEOF (.data) : AT (ADDR (.bss))
{
PROVIDE (__bss_start = .) ;
*(.bss)
*(.bss*)
*(COMMON)
PROVIDE (__bss_end = .) ;
} > data
__data_load_start = LOADADDR(.data);
__data_load_end = __data_load_start + SIZEOF(.data);
/* Global data not cleared after reset. */
.noinit ADDR(.bss) + SIZEOF (.bss) : AT (ADDR (.noinit))
{
PROVIDE (__noinit_start = .) ;
*(.noinit*)
PROVIDE (__noinit_end = .) ;
_end = . ;
PROVIDE (__heap_start = .) ;
} > data
.eeprom :
{
/* See .data above... */
KEEP(*(.eeprom*))
__eeprom_end = . ;
} > eeprom
.fuse :
{
KEEP(*(.fuse))
KEEP(*(.lfuse))
KEEP(*(.hfuse))
KEEP(*(.efuse))
} > fuse
.lock :
{
KEEP(*(.lock*))
} > lock
.signature :
{
KEEP(*(.signature*))
} > signature
.user_signatures :
{
KEEP(*(.user_signatures*))
} > user_signatures
/* Stabs debugging sections. */
.stab 0 : { *(.stab) }
.stabstr 0 : { *(.stabstr) }
.stab.excl 0 : { *(.stab.excl) }
.stab.exclstr 0 : { *(.stab.exclstr) }
.stab.index 0 : { *(.stab.index) }
.stab.indexstr 0 : { *(.stab.indexstr) }
.comment 0 : { *(.comment) }
.note.gnu.build-id : { *(.note.gnu.build-id) }
/* DWARF debug sections.
Symbols in the DWARF debugging sections are relative to the beginning
of the section so we begin them at 0. */
/* DWARF 1 */
.debug 0 : { *(.debug) }
.line 0 : { *(.line) }
/* GNU DWARF 1 extensions */
.debug_srcinfo 0 : { *(.debug_srcinfo) }
.debug_sfnames 0 : { *(.debug_sfnames) }
/* DWARF 1.1 and DWARF 2 */
.debug_aranges 0 : { *(.debug_aranges) }
.debug_pubnames 0 : { *(.debug_pubnames) }
/* DWARF 2 */
.debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) }
.debug_abbrev 0 : { *(.debug_abbrev) }
.debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) }
.debug_frame 0 : { *(.debug_frame) }
.debug_str 0 : { *(.debug_str) }
.debug_loc 0 : { *(.debug_loc) }
.debug_macinfo 0 : { *(.debug_macinfo) }
/* SGI/MIPS DWARF 2 extensions */
.debug_weaknames 0 : { *(.debug_weaknames) }
.debug_funcnames 0 : { *(.debug_funcnames) }
.debug_typenames 0 : { *(.debug_typenames) }
.debug_varnames 0 : { *(.debug_varnames) }
/* DWARF 3 */
.debug_pubtypes 0 : { *(.debug_pubtypes) }
.debug_ranges 0 : { *(.debug_ranges) }
/* DWARF Extension. */
.debug_macro 0 : { *(.debug_macro) }
}
$ cat build.sh
CFLAGS="-std=gnu11 -mmcu=atmega328p"
set -uex
avr-gcc $CFLAGS -c module_foo.c -o module_foo.o
avr-gcc $CFLAGS -c kernel.c -o kernel.o
avr-gcc -T avr5-x-modules.ld kernel.o module_foo.o \
-o program.elf -Wl,-Map=program.map
grep dispatch program.map
$ cat kernel.c
#include "kernel.h"
#include <avr/pgmspace.h>
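// The linker script defines these symbols at the bounds of the table.
// Declaring them as arrays makes the names evaluate to those addresses;
// a pointer declaration would instead load a pointer stored at that address.
extern const dispatch_item __dispatch_start[];
extern const dispatch_item __dispatch_end[];
int main()
{
while (1)
{
for (const dispatch_item * item = __dispatch_start; item < __dispatch_end; item++)
{
// TODO: Insert code here for reading the contents of the
// dispatch item from program space and using it. You
// probably have to use pgm_read_word from avr/pgmspace.h,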
// but with GCC 5 you could probably use the new named
// memory space feature to just access the dispatch item
// the same way you would access any other struct:
// https://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html
}
}
}
$ cat kernel.h
#pragma once
#include <stdint.h>
typedef struct dispatch_item {
uint16_t func_id;
void (*func)(void);
} dispatch_item;
#define DISPATCH_ITEM dispatch_item const __attribute__((section (".dispatch")))
$ cat module_foo.c
#include "kernel.h"
#include <avr/io.h>
// This gets called before main.
void __attribute__((constructor)) foo_init()
{
PINB = 0;
}
// There is a pointer to this in the dispatch table.
void foo()
{
PINB = 1;
}
// DISPATCH_TABLE_ENTRY(0x12, &foo);
DISPATCH_ITEM foo_dispatch = { 0x12, &foo };
DISPATCH_ITEM foo_dispatch2 = { 0x13, &foo };
$ ./build.sh
++ avr-gcc -std=gnu11 -mmcu=atmega328p -c module_foo.c -o module_foo.o
++ avr-gcc -std=gnu11 -mmcu=atmega328p -c kernel.c -o kernel.o
++ avr-gcc -T avr5-x-modules.ld kernel.o module_foo.o -o program.elf -Wl,-Map=program.map
++ grep dispatch program.map
0x00000002 __dispatch_start = .
*(.dispatch)
.dispatch 0x00000002 0x8 module_foo.o
0x00000002 foo_dispatch
0x00000006 foo_dispatch2
0x0000000a __dispatch_end = .
SORT(*)(.dispatch)
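As a rough sketch of what could replace the TODO in kernel.c: the loop below copies each entry out of flash with memcpy_P from avr/pgmspace.h and calls the handler whose id matches. The names dispatch_call, COPY_ITEM, IN_FLASH, and the demo handler and table are made up for illustration. On the target you would pass __dispatch_start and __dispatch_end as the bounds (best declared as arrays, e.g. extern const dispatch_item __dispatch_start[];). The non-AVR branch is an assumption added only so the logic can be compiled and tried on a host.

```c
#include <stdint.h>
#include <stddef.h>

#ifdef __AVR__
#include <avr/pgmspace.h>
/* On AVR the table lives in flash, so entries are copied out explicitly. */
#define COPY_ITEM(dst, src) memcpy_P((dst), (src), sizeof(*(dst)))
#define IN_FLASH PROGMEM
#else
#include <string.h>
/* Host fallback (assumption: only used to try the logic off-target). */
#define COPY_ITEM(dst, src) memcpy((dst), (src), sizeof(*(dst)))
#define IN_FLASH
#endif

typedef struct dispatch_item {
    uint16_t func_id;
    void (*func)(void);
} dispatch_item;

/* Hypothetical helper: call the first entry in [start, end) whose id matches.
   Returns 1 if a handler was called, 0 if the id is not in the table. */
static int dispatch_call(const dispatch_item *start,
                         const dispatch_item *end, uint16_t id)
{
    for (const dispatch_item *p = start; p < end; p++) {
        dispatch_item item;
        COPY_ITEM(&item, p);   /* RAM copy of one table entry */
        if (item.func_id == id) {
            item.func();
            return 1;
        }
    }
    return 0;
}

/* Demo data standing in for the entries contributed by the modules. */
static int foo_calls;
static void foo(void) { foo_calls++; }
static const dispatch_item demo_table[] IN_FLASH = {
    { 0x12, foo },
    { 0x13, foo },
};
```

With the linker-provided bounds, a call such as dispatch_call(__dispatch_start, __dispatch_end, 0x12) would run foo() from module_foo.c.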

Since all my attempts have failed so far, the only practical way I can think of is through makefile scripts and menus, much like building Linux kernel modules: you pick a series of modules to compile in, and the make script generates header/source files containing the dispatch table.
The generated source file references all of the required functions and variables, which prevents the garbage collector from stripping them at link time. I don't have the implementation details; this is just a hint at an approach I might follow, though it is not the simplest one.

A custom section will work, but do not use PROGMEM.
With avr-gcc, PROGMEM adds a section attribute, and adding another will cause problems.
Unless you work at it, the new section will go into program memory.
You do not need to replace the default linker script, but you need to add to it to get the start and size of the new section.
In the ld manual, see 3.10.9 Builtin Functions (ADDR and SIZEOF), 3.11 Implicit Linker Scripts, and 3.5.4 Source Code Reference.

Related

Is there a simple way to convert a lua table to a C++ array or vector?

I am starting to make my own package manager and am starting to develop a dependency system.
The buildfiles are written in Lua; they look something like this:
package = {
name = "pfetch",
version = "0.6.0",
source = "https://github.com/dylanaraps/pfetch/archive/0.6.0.tar.gz",
git = false
}
dependencies = {
"some_dep",
"some_dep2"
}
function install()
quantum_install("pfetch", false)
end
The only problem: I have no idea how to convert
dependencies = {
"some_dep",
"some_dep2"
}
To a global c++ array: ["some_dep", "some_dep2"]
Anything in the list that's not valid as a string should be ignored.
Any good way to do this?
Thanks in advance
Note: I am using the C api to interface with lua in C++. I don't know whether Lua's errors use longjmp or C++ exceptions.
Based on the clarification in your comment, something like this will work for you:
#include <iostream>
#include <string>
#include <vector>
#include <lua5.3/lua.hpp>
std::vector<std::string> dependencies;
static int q64795651_set_dependencies(lua_State *L) {
dependencies.clear();
lua_settop(L, 1);
for(lua_Integer i = 1; lua_geti(L, 1, i) != LUA_TNIL; ++i) {
size_t len;
const char *str = lua_tolstring(L, 2, &len);
if(str) {
dependencies.push_back(std::string{str, len});
}
lua_settop(L, 1);
}
return 0;
}
static int q64795651_print_dependencies(lua_State *) {
for(const auto &dep : dependencies) {
std::cout << dep << std::endl;
}
return 0;
}
static const luaL_Reg q64795651lib[] = {
{"set_dependencies", q64795651_set_dependencies},
{"print_dependencies", q64795651_print_dependencies},
{nullptr, nullptr}
};
extern "C"
int luaopen_q64795651(lua_State *L) {
luaL_newlib(L, q64795651lib);
return 1;
}
Demo:
$ g++ -fPIC -shared q64795651.cpp -o q64795651.so
$ lua5.3
Lua 5.3.3 Copyright (C) 1994-2016 Lua.org, PUC-Rio
> q64795651 = require('q64795651')
> dependencies = {
>> "some_dep",
>> "some_dep2"
>> }
> q64795651.set_dependencies(dependencies)
> q64795651.print_dependencies()
some_dep
some_dep2
>
One important pitfall: since you're not sure if Lua is compiled to use longjmp or exceptions for its errors, you need to make sure that you don't have any automatic variables with destructors anywhere that a Lua error could happen. (This is already the case in the code in my answer; just make sure you don't accidentally add any such places when incorporating this into your program.)

Why do builds for various projects fail with ‘Operation not permitted’ using iOS on-device compiler/toolchain?

I am an intermediately skilled Linux/Unix user trying to compile software for an iPad on a (jailbroken) iPad.
Many builds (for example, make and tex-live) fail with some Operation not permitted error. This will either look like Can't exec "blah": Operation not permitted or execvp: blah: Operation not permitted where blah is aclocal, a configure script, libtool, or just about anything. Curiously, finding the offending line in a Makefile or configure script and prefixing it with sudo -u mobile -E will solve the error for that line, only for it to reappear on a later line or in another file. Since I am already running the build scripts as mobile, I do not understand how this could possibly fix the issue, yet it does. I have confirmed that making these changes does actually allow the script to work successfully up to that point. Running the build script with sudo or sudo -u mobile -E and/or running the entire build as root does not solve the issue; either way, I still must edit build scripts to add sudo prefixes.
I would like to know why this is happening, and if possible how I could address the issue without editing build scripts. Any information about these types of errors would be interesting to me even if they do not solve my problem. I am aware that the permissions/security/entitlements system is unusual on iOS and would like to learn more about how it works.
I am using an iPad Pro 4 on jailbroken iOS 13.5 with the build tools from sbingner’s and MCApollo’s repos (repo.bingner.com and mcapollo.github.io/Public). In particular, I am using a build of LLVM 5 (manually installed from sbingner’s old debs), Clang 10, Darwin CC tools 927 and GNU Make 4.2.1. I have set CC, CXX, CFLAGS, etc. to point to clang-10 and my iOS 13.5 SDK with -isysroot and have confirmed that these settings are working. I would like to replace these with updated versions, but I cannot yet build these tools for myself due to this issue and a few others. I do have access to a Mac for cross-compilation if necessary, but I would rather use only my iPad because I like the challenge.
I can attach any logs necessary or provide more information if that would be useful; I do not know enough about this issue to know what information is useful. Thanks in advance for helping me!
For anyone who needs to address this on a jailbreak that does not include a fix, I have written (pasted below) a userland hook based on the posix_spawn implementation from the source of Apple’s xnu kernel.
Compile it with Theos, and inject it into all processes spawned by your shell by setting the environment variable DYLD_INSERT_LIBRARIES to the path of the resulting dylib. Note: some tweak injectors (namely libhooker, see here) reset DYLD_INSERT_LIBRARIES, so if you notice this behavior, be sure to inject only your library.
Because the implementation of the exec syscalls in iOS calls out to posix_spawn, this hook fixes all of the exec-related issues I’ve run into so far.
#include <stdio.h>
#include <stdbool.h>
#include <string.h>
#include <errno.h>
#include <spawn.h>
// Copied from bsd/kern/kern_exec.c
#define IS_WHITESPACE(ch) ((ch == ' ') || (ch == '\t'))
#define IS_EOL(ch) ((ch == '#') || (ch == '\n'))
// Copied from bsd/sys/imgact.h
#define IMG_SHSIZE 512
// Here, we provide an alternate implementation of posix_spawn which correctly handles #!.
// This is based on the implementation of posix_spawn in bsd/kern/kern_exec.c from Apple's xnu source.
// Thus, I am fairly confident that this posix_spawn has correct behavior relative to macOS.
%hookf(int, posix_spawn, pid_t *pid, const char *orig_path, const posix_spawn_file_actions_t *file_actions, const posix_spawnattr_t *attrp, char *const orig_argv[], char *const envp[]) {
// Call orig before checking for anything.
// This mirrors the standard implementation of posix_spawn because it first checks if we are spawning a binary.
int err = %orig;
// %orig returns EPERM when spawning a script.
// Thus, if err is anything other than EPERM, we can just return like normal.
if (err != EPERM)
return err;
// At this point, we do not need to check for exec permissions or anything like that.
// because posix_spawn would have returned that error instead of EPERM.
// Now we open the file for reading so that we can check if it's a script.
// If it turns out not to be a script, the EPERM must be from something else
// so we just return err.
FILE *file = fopen(orig_path, "r");
if (file == NULL) {
return err;
}
if (fseek(file, 0, SEEK_SET)) {
fclose(file); // Do not leak the stream on the error path.
return err;
}
// In exec_activate_image, the data buffer is filled with the first PAGE_SIZE bytes of the file.
// However, in exec_shell_imgact, only the first IMG_SHSIZE bytes are used.
// Thus, we read IMG_SHSIZE bytes out of our file.
// The buffer is filled with newlines so that if the file is shorter than IMG_SHSIZE bytes,
// the logic reads an IS_EOL.
// An initializer of {'\n'} would set only the first byte, so fill the buffer explicitly.
char vdata[IMG_SHSIZE];
memset(vdata, '\n', sizeof(vdata));
if (fread(vdata, 1, IMG_SHSIZE, file) < 2) { // If we couldn't read at least two bytes, it's not a script.
fclose(file);
return err;
}
// Now that we've filled the buffer, we don't need the file anymore.
fclose(file);
// Now we follow exec_shell_imgact.
// The point of this is to confirm we have a script
// and extract the usable part of the interpreter+arg string.
// Where they return -1, we don't have a shell script, so we return err.
// Where they return an error, we return that same error.
// We don't bother doing any SUID stuff because SUID scripts should be disabled anyway.
char *ihp;
char *line_startp, *line_endp;
// Make sure we have a shell script.
if (vdata[0] != '#' || vdata[1] != '!') {
return err;
}
// Try to find the first non-whitespace character
for (ihp = &vdata[2]; ihp < &vdata[IMG_SHSIZE]; ihp++) {
if (IS_EOL(*ihp)) {
// Did not find interpreter, "#!\n"
return ENOEXEC;
} else if (IS_WHITESPACE(*ihp)) {
// Whitespace, like "#! /bin/sh\n", keep going.
} else {
// Found start of interpreter
break;
}
}
if (ihp == &vdata[IMG_SHSIZE]) {
// All whitespace, like "#! "
return ENOEXEC;
}
line_startp = ihp;
// Try to find the end of the interpreter+args string
for (; ihp < &vdata[IMG_SHSIZE]; ihp++) {
if (IS_EOL(*ihp)) {
// Got it
break;
} else {
// Still part of interpreter or args
}
}
if (ihp == &vdata[IMG_SHSIZE]) {
// A long line, like "#! blah blah blah" without end
return ENOEXEC;
}
// Backtrack until we find the last non-whitespace
while (IS_EOL(*ihp) || IS_WHITESPACE(*ihp)) {
ihp--;
}
// The character after the last non-whitespace is our logical end of line
line_endp = ihp + 1;
/*
* Now we have pointers to the usable part of:
*
* "#! /usr/bin/int first second third \n"
* ^ line_startp ^ line_endp
*/
// Now, exec_shell_imgact copies the interpreter into another buffer and then null-terminates it.
// Then, it copies the entire interpreter+args into another buffer and null-terminates it for later processing into argv.
// This processing is done in exec_extract_strings, which goes through and null-terminates each argument.
// We will just do this all at once since that's much easier.
// Keep track of how many arguments we have.
int i_argc = 0;
ihp = line_startp;
while (true) {
// ihp is on the start of an argument.
i_argc++;
// Scan to the end of the argument.
for (; ihp < line_endp; ihp++) {
if (IS_WHITESPACE(*ihp)) {
// Found the end of the argument
break;
} else {
// Keep going
}
}
// Null terminate the argument
*ihp = '\0';
// Scan to the beginning of the next argument.
for (; ihp < line_endp; ihp++) {
if (!IS_WHITESPACE(*ihp)) {
// Found the next argument
break;
} else {
// Keep going
}
}
if (ihp == line_endp) {
// We've reached the end of the arg string
break;
}
// If we are here, ihp is the start of an argument.
}
// Now line_startp is a bunch of null-terminated arguments possibly padded by whitespace.
// i_argc is now the count of the interpreter arguments.
// Our new argv should look like i_argv[0], i_argv[1], i_argv[2], ..., orig_path, orig_argv[1], orig_argv[2], ..., NULL
// where i_argv is the arguments to be extracted from line_startp;
// To allocate our new argv, we need to know orig_argc.
int orig_argc = 0;
while (orig_argv[orig_argc] != NULL) {
orig_argc++;
}
// We need space for i_argc + 1 + (orig_argc - 1) + 1 char*'s
char *argv[i_argc + orig_argc + 1];
// Copy i_argv into argv
int i = 0;
ihp = line_startp;
for (; i < i_argc; i++) {
// ihp is on the start of an argument
argv[i] = ihp;
// Scan to the next null-terminator
for (; ihp < line_endp; ihp++) {
if (*ihp == '\0') {
// Found it
break;
} else {
// Keep going
}
}
// Go to the next character
ihp++;
// Then scan to the next argument.
// There must be another argument because we already counted i_argc.
for (; ihp < line_endp; ihp++) {
if (!IS_WHITESPACE(*ihp)) {
// Found it
break;
} else {
// Keep going
}
}
// ihp is on the start of an argument.
}
// Then, copy orig_path into argv.
// We need to make a copy of orig_path to avoid issues with const.
char orig_path_copy[strlen(orig_path)+1];
strcpy(orig_path_copy, orig_path);
argv[i] = orig_path_copy;
i++;
// Now, copy orig_argv[1...] into argv.
for (int j = 1; j < orig_argc; i++, j++) {
argv[i] = orig_argv[j];
}
// Finally, add the null.
argv[i] = NULL;
// Now, our argv is setup correctly.
// Now, we can call out to posix_spawn again.
// The interpreter is in argv[0], so we use that for the path.
return %orig(pid, argv[0], file_actions, attrp, argv, envp);
}
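The "#!" parsing in the middle of the hook is the part that is easiest to get wrong, so here is a stripped-down sketch of just that step in plain C, which can be compiled and tried on any host. split_shebang is a hypothetical helper written for this illustration; it is not a function from xnu or Theos. It follows the same rules as the hook: the line ends at the first '#' or newline after "#!", and words are separated by spaces or tabs.

```c
#include <stddef.h>

#define IS_WHITESPACE(ch) ((ch) == ' ' || (ch) == '\t')
#define IS_EOL(ch)        ((ch) == '#' || (ch) == '\n')

/*
 * Hypothetical helper: split the "#!" line held in buf[0..len) into
 * NUL-terminated words, modifying buf in place. Stores at most max_args
 * pointers in args and returns the number of words, or -1 if buf does not
 * start with a usable "#!" line.
 */
static int split_shebang(char *buf, size_t len, char **args, int max_args)
{
    if (len < 2 || buf[0] != '#' || buf[1] != '!')
        return -1;

    /* The line ends at the first '#' or '\n' after the "#!" marker. */
    size_t end = 2;
    while (end < len && !IS_EOL(buf[end]))
        end++;
    if (end == len)
        return -1;                  /* no end of line inside the buffer */

    int argc = 0;
    size_t i = 2;
    while (i < end) {
        /* Skip whitespace in front of the next word. */
        while (i < end && IS_WHITESPACE(buf[i]))
            i++;
        if (i == end)
            break;
        if (argc == max_args)
            return -1;              /* more words than the caller has room for */
        args[argc++] = &buf[i];
        /* Advance to the end of the word and terminate it. */
        while (i < end && !IS_WHITESPACE(buf[i]))
            i++;
        buf[i] = '\0';              /* i <= end < len, so this stays in bounds */
        i++;
    }
    return argc > 0 ? argc : -1;    /* "#!" followed by nothing is not usable */
}
```

The returned words correspond to i_argv in the hook: args[0] is the interpreter path, and the remaining entries are its arguments, to be followed by the script path and the original arguments.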

iOS Mach-O – Make __TEXT segment temporarily writable

I've tried a lot to finally get this working, but it still doesn't work yet.
I'm trying to change some values in the __TEXT segment, which is read-only by default, such as the cryptid (and other stuff).
It kind of worked a while ago, back on 32-bit devices. But somehow, it always fails since I switched to the 64-bit commands.
It currently crashes if I hit the following lines:
tseg->maxprot = tseg->initprot = VM_PROT_READ | VM_PROT_EXECUTE
or
crypt->cryptid = 1.
struct mach_header_64* mach = (struct mach_header_64*) _dyld_get_image_header(0);
uint64_t header_size = 0;
struct encryption_info_command_64 *crypt;
struct segment_command_64 *tseg;
struct dylib_command *protector_cmd;
// clean up some commands
void *curloc = (void *)mach + sizeof(struct mach_header);
for (int i=0;i<mach->ncmds;i++) {
struct load_command *lcmd = curloc;
if (lcmd->cmd == LC_ENCRYPTION_INFO_64) {
// save crypt cmd
crypt = curloc;
} else if (lcmd->cmd == LC_SEGMENT_64) {
struct segment_command_64 *seg = curloc;
if (seg->fileoff == 0 && seg->filesize != 0) {
header_size = seg->vmsize;
tseg = curloc;
}
}
if(i == mach->ncmds-1){
protector_cmd = curloc;
}
curloc += lcmd->cmdsize;
}
kern_return_t err;
// make __TEXT temporarily writable
err = vm_protect(mach_task_self(), (vm_address_t)mach, (vm_size_t)header_size, false, VM_PROT_ALL);
if (err != KERN_SUCCESS) exit(1);
// modify the load commands
// change protection of __TEXT segment
tseg->maxprot = tseg->initprot = VM_PROT_READ | VM_PROT_EXECUTE;
// change cryptid
crypt->cryptid = 1;
There's no point in changing the load command. The load commands were already processed when the program was loaded (which must be before this code of yours can run). They have no further effect on the protection of pages.
You're apparently already aware of the vm_protect() function. So why aren't you using that to make the text segment itself writable rather than trying to make the load commands writable?
And it's surely simpler to use getsegmentdata() to locate the segment in memory than looking at the load commands (to which you'd have to add the slide).
Beyond that, I would be surprised if iOS lets you do that. There's a general prohibition against run-time modifiable code (with very narrow exceptions).

Difference "Flex and Bison" code in windows and linux

I am currently working through sample code from the O'Reilly press book entitled "Flex and Bison". I am using the GNU C compiler for Windows with Flex and Bison binary install for Windows which is launched using gcc rather than the Linux cc command.
The problem is that the code if copied directly from the book does not compile and I have had to hack it a bit to get it to work.
Example from book
Example 1-1. Word count fb1-1.l
/* just like Unix wc */
%{
int chars = 0;
int words = 0;
int lines = 0;
%}
%%
[a-zA-Z]+ { words++; chars += strlen(yytext); }
\n { chars++; lines++; }
. { chars++; }
%%
main(int argc, char **argv)
{
yylex();
printf("%8d%8d%8d\n", lines, words, chars);
}
I compiled the code in the following way using the Windows command line:
flex file.l
gcc lex.yy.c -o a.exe
The build failed, stating that yywrap() was not found. I added that and then it built, but the program never reached the printf in the main function; it just hung waiting for more input!
Here is my solution that works but feels like a hack and that I am not in full understanding of the process.
/* just like Unix wc */
%{
#include <string.h>
int chars = 0;
int words = 0;
int lines = 0;
%}
%%
[a-zA-Z]+ { words++; chars += strlen(yytext); }
\n { chars++; lines++; }
"." { return ;}
. { chars++; }
%%
int yywrap(void)
{
return 1;
}
int main(void)
{
yylex();
printf("num lines is %8d, num words is %8d, num chars is %8d\n", lines, words, chars);
return 0;
}
I had to add a new rule to return out of yylex(), which was not in the book; add yywrap(), without really knowing why; and include string.h, which was not present. My main question is: are there significant differences between flex for Windows and Unix, and is it possible to run the original code with my gcc compiler and GNU flex without these hacks?
I do not understand what you have achieved with this:
#include <string.h>
"." { return; }
But what I know for sure is that if you run the Flex scanner without a specified input file, you have to mark the end of input; otherwise the scanner will keep waiting for more input. What I would suggest:
%{
#include <stdio.h>
#include <string.h>
int chars = 0;
int words = 0;
int lines = 0;
%}
WORD [a-zA-Z]+
%%
{WORD} {
words++;
chars += strlen(yytext);
}
\n {
lines++;
/* chars++; why this? there was no columns here - it's a new line */
}
" " {
/* count the spaces */
chars++;
}
\t {
/* count the tabs */
chars += 4 /* or 8 */;
}
. {
printf("Error (unknown symbol):\t%c\n", yytext[0]);
chars++;
}
%%
int main()
{
/* iterate until end of input and even if errors - continue */
while(yylex()){ }
printf("lines:\t%8d\nwords:\t%8d\nchars:\t%8d\n", lines, words, chars);
return 0;
}
Build with:
flex input.l
output will be lex.yy.c
Then build:
gcc -o scanner.exe lex.yy.c -lfl
Create a text file with the input, then run the following:
scanner.exe <in.txt>out.txt
The less-than sign redirects input from the file in.txt, while the greater-than sign redirects output to out.txt. Because the redirected input reaches end-of-file when in.txt is exhausted, the scanner will stop properly.
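To cross-check the scanner's output independently of flex, the three rules of the book's example can be written as a small plain C function. This is only a sketch: count_text and struct counts are names invented here, and it follows the book's counting rules (every character counts, including newlines, and a word is a maximal run of ASCII letters), not the modified rules in the scanner above.

```c
#include <stddef.h>

/* Hypothetical result type for this sketch. */
struct counts {
    int lines;
    int words;
    int chars;
};

/* Count lines, words and characters in s[0..n) the way fb1-1.l does:
   [a-zA-Z]+ is one word, '\n' adds a line, every byte adds a character. */
static struct counts count_text(const char *s, size_t n)
{
    struct counts c = { 0, 0, 0 };
    int in_word = 0;

    for (size_t i = 0; i < n; i++) {
        char ch = s[i];
        int alpha = (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');

        c.chars++;
        if (ch == '\n')
            c.lines++;
        if (alpha && !in_word)
            c.words++;          /* first letter of a new run */
        in_word = alpha;
    }
    return c;
}
```

Feeding the same text to this function and to the scanner built from the book's original rules should give matching totals, which makes it easier to tell a scanner bug from an input-handling problem.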

Why does this device_create() call not create a /dev/ entry?

I'm porting platform driver code to a PCIe variant and I don't understand why I'm not getting a /dev/ entry to show up. The platform driver code that has been modified:
static dev_t first;
static struct class * class;
ATTRIBUTE_GROUPS(my);
static int __init my_pci_init(void)
{
int ret;
/* Create a class entry in sysfs */
/* class_create() reports failure through an ERR_PTR, not NULL */
class = class_create(THIS_MODULE, "test_driver");
if (IS_ERR(class)) {
pr_err("Couldn't create 'struct class' structure.");
ret = PTR_ERR(class);
goto exit;
}
class->dev_groups = my_groups;
/* Create the /dev/ file system entry */
/* return value ignored: there's a 'struct class' to 'struct device' mapping */
/* device_create() also reports failure through an ERR_PTR */
if (IS_ERR(device_create(class, NULL, first, NULL, KBUILD_MODNAME))) {
pr_err("Couldn't create entry in '/dev/' file system.");
ret = -ENODEV;
goto exit;
} else {
pr_info("Created a /dev/ entry.");
}
if ((ret = pci_register_driver(&pci_driver)) < 0) {
pr_err("Couldn't register pci driver.");
}
exit:
if (ret < 0) {
my_pci_exit();
pr_err(" ret = %d", ret);
}
return ret;
}
module_init(my_pci_init);
If the module name is 'cz_tdm', I was hoping the above code would create an entry /dev/cz_tdm. At least it did when I was compiling this as a platform driver.
The driver enumerates just fine, an output of lspci shows that the driver was loaded and perusing the sysfs shows that all my attributes in /sys/devices/virtual/... are where I'd expect them to be.
What gives?
Whoops.
Because it's not supposed to. An overzealous deletion of code ripped out this necessary element:
/* Add the char device to the system. */
cdev_init(&cdev, &fops);
if ((ret = cdev_add(&cdev, first, DEV_MINOR_NUMBER_COUNT)) < 0) {
pr_err("Couldn't add device to system: %d", ret);
ret = -ENODEV;
goto exit;
}

Resources