Is it possible to run a device driver inside an Intel SGX enclave? Or is it impossible for an enclave to access DMA memory and perform memory-mapped I/O?
I already have a device driver that has mapped all of the necessary memory but I don't know if it will be possible to create an enclave that shares these mappings. I am essentially confused about whether enclaves can only access their own private memory or whether they can also access arbitrary physical memory that I would map to them.
The documentation seems to say that the enclave cannot access code at arbitrary locations but I want to know the rules for data and MMIO.
Enclaves are statically linked libraries, so they share a process with the application they get loaded into. Multiple enclaves can be loaded into one process.
An enclave owns one or more ranges of pages; these pages are encrypted in memory and protected from outside access. This is explained in more detail in https://software.intel.com/sites/default/files/332680-002.pdf, page 28.
Enclaves can access the memory of the process they run in, but their own memory can only be accessed from inside the enclave. DMA attempts targeting enclave memory are rejected/aborted, so it is not possible to map device memory into an enclave's protected pages. However, you can map it into the surrounding process's memory and access it from within the enclave.
Enclaves are by design isolated from the outside world; they don't have I/O capabilities apart from the Protected File System Library. So I don't think it's possible to run a driver inside SGX.
Related
How do we really limit memory access if a piece of software contains an instruction that works with raw address bits and orders the CPU to access a restricted area?
If we use a container or a virtual machine or ..., do we have to run code that checks every instruction of the original code to see whether it accesses a restricted area?
Privilege management usually requires hardware support in the CPU. In the case of software emulation, the emulator will be required to ensure the proper privilege levels are enforced.
The MMU is a component that (among other things) controls memory accesses. Certain regions of memory can be marked as readable, writable and executable. The MMU will check all memory accesses and cause some sort of fault on an illegal access. This prevents the CPU from reading/writing/executing at arbitrary memory locations.
Many CPUs have privilege separation built into the CPU itself. It will have a concept of privilege levels (e.g. rings in x86, mode bits in ARM) and checks that the instruction being run is allowed within the current privilege level. This prevents code running in an unprivileged mode from executing privileged instructions.
The operating system hosting the containers or virtual machine host software will need to ensure the proper privilege separation is implemented correctly (making use of hardware features as appropriate).
Is there a way to load an existing application into an Intel SGX enclave directly?
While hmofrad is right that SGX is not designed to run an entire existing application, there are approaches that achieve exactly this: SCONE (closed source) and Graphene (open source). So you could read up on Graphene with SGX and check whether it fits your needs.
Intel SGX is designed for securing data, not for loading an entire application. You can perform secure computations on your data inside SGX enclaves by sending temporary buffers from the user-space program (app.cpp) to your SGX enclave (Enclave.cpp). But why?
The enclave size is small and you can't load all your data inside it at the same time.
Inside enclaves, you're limited to a set of programming primitives like if-then-else, for loops, etc. Also, you can't use syscalls such as open for opening a file.
Thus, if your application is large, or contains syscalls or standard C library functions forbidden by the SGX implementation, it is impossible to import it into an enclave wholesale. But if your application performs primitive operations without needing any special syscall or function call, you can freely port it into an enclave. Even then, you can't load it into an enclave directly; you have to change your implementation to expose it as trusted enclave calls inside Enclave.cpp.
As an example, I've implemented a set of cryptographic operations (e.g. SHA-2, HMAC-SHA-2, and AES) inside an enclave. I send/receive temporary buffers of plaintext/ciphertext data to/from the enclave, performing the encryption/decryption operations inside the enclave and storing the results of the computation, like a hash digest or ciphertexts, in user space. This way, I ensure that no one can tamper with the results of the operations, because they run inside the enclave, which is secured by CPU instructions.
You can read more about this example here and check the implementation here.
As pointed out by previous answers, the default Intel SGX design does not permit running unmodified applications in general, because the latter (most probably) contain routines, i.e. syscalls, which are unsupported by the trusted libc provided by the Intel SGX SDK. Tools such as Scone, Graphene SGX, Haven, or SGX-LKL allow running unmodified applications in Intel SGX enclaves.
Most of the above-mentioned tools run mini-OSs inside the enclave to handle (via emulation) the unsupported syscalls. This leads to a large enclave size, which is very detrimental for applications that require large memory resources; the enclave memory is limited to 128 MB (or 256 MB in more recent SGX versions).
The solution you choose will depend largely on the application you are trying to run. If it is not that large, you could try porting it to Intel SGX. Porting involves separating your application into trusted and untrusted parts. Only the trusted part runs in the enclave, and it may communicate securely with the untrusted part (a helper) outside the enclave runtime. During porting you may still have trusted code which depends on unsupported routines like syscalls. You can solve this problem by implementing/extending your own trusted libc (just the syscalls you need) in the enclave, which redefines the syscalls as wrappers around ocalls that then invoke the real routines (securely) outside the enclave; there is a good example here. This approach is not for newbies, though. This way you maximize enclave memory and avoid bloating it with a full-blown library OS.
On the other hand, if you are dealing with a very complex application where porting is not feasible, then I would advise you to go for a solution such as Graphene-SGX, which is open source and well documented.
Assume there is an MCU (like a Cypress PSoC 4 chip, which I'm using). It contains flash memory (to store firmware) and RAM (probably SRAM) inside the chip. I understand that even these two components need to be memory-mapped in order for the processing unit to access them.
However, the flash memory and SRAM have to be mapped every time the MCU is powered on, right?
Then where is the configuration for memory map stored?
Is it somehow hardwired inside the MPU? Or is it stored in a separate, hidden small piece of RAM?
I once thought that the memory map info should be located at the front of the firmware, but this doesn't make sense because the firmware is stored in the flash, and the MPU would have no idea where the flash is mapped to. So, I think this is a wrong idea.
By the way, is a memory map even configurable?
Yes, it is hardwired in the MCU at boot. Some MCUs allow for remapping once up and running, but in order to boot, the flash/ROM has to be mapped to a known place. A sane design would also have the on-chip SRAM mapped and ready to use at a known location on boot.
Some use straps (pins externally hardwired high or low) to manipulate how the MCU boots; sometimes that includes a different mapping. A single strap could, for example, choose between mapping a boot-loader ROM or the user flash into the boot space of the processor. But that would be documented, along with the other mapping choices, in the chip vendor's documentation for the part.
Some MCUs allow you, in software after boot, to move RAM into the vector/exception table area, so you can manipulate it at run time and not be limited to what was in the flash at boot. Some MCUs go so far as to have an MMU-like feature, though I have a hard time calling those MCUs, as they can run at hundreds of MHz and have floating-point units, caches, etc. Technically they are an SoC with RAM and flash on chip, so they are classified as MCUs.
Your thinking is sane: the flash and SRAM mappings are in logic, and at reset you know where things will be. It is in the documentation for that product.
My question is simple: how do I share memory among processes, allowing both reads and writes of that memory? The main thing is that only specific processes (specific PIDs, for example) should have the ability to share that memory. Not all processes should be able to access it.
One option is to use standard Sys V IPC shared memory. After the call to shmget(), use shmctl() to set the permissions. Give read/write permission to only one group/user and start the processes that are allowed access as that specific user. The shared memory keys and IDs can be found using ipcs, and you need to trust the standard Unix user/group-based security to do the job.
Another option is implementing a shared memory driver, something similar to Android's ashmem. In the driver, you can validate the PID/UID of the caller trying to access the memory and allow/deny the request based on filters. You can also implement a sysfs entry to modify these filters. If the filters need to be configurable, again you need to trust the Unix user/group-based security. If you implement a driver, you will have plenty of security options.
What is the purpose of a logical address? Why should the CPU generate logical addresses when it could directly use the relocation register's base address and limit to execute a process? Why should the MMU maintain a mapping between logical and physical addresses?
Why?
Because this gives the Operating System a way to securely manage memory.
Why is secure memory management necessary?
Imagine if there was no logical addressing. All processes were given direct access to physical addresses. A multi-process OS runs several different programs simultaneously. Imagine you are editing an important letter in MS Word while listening to music on YouTube on a very recently released browser. The browser is buggy and writes bogus values to a range of physical addresses that were being used by the Word program to store the edits of your letter. All of that information is corrupt!
Highly undesirable situation.
How can the OS prevent this?
Maintain a mapping of physical addresses allocated to each process and make sure one process cannot access the memory allocated to another process!
Clearly, exposing actual physical addresses to programs is not a good idea. Since memory must be managed entirely by the OS, we need an abstraction to offer to processes: a simple interface that makes it seem as though each process is dealing with physical memory, while every allocation is actually handled by the OS.
Here comes virtual memory!
Logical addresses exist so that physical memory can be managed securely.
A logical address is the reference a program uses to access a physical memory location.
Logical addresses are generated so that a user program never directly accesses physical memory, and so that a process does not occupy memory acquired by another process, thereby corrupting that process.
A logical address gives us the guarantee that a new process will not occupy memory space already occupied by any other process.
In execution-time binding, the MMU performs the mapping from logical to physical addresses, because in this type of binding the logical address is specifically referred to as a virtual address.
The virtual address itself has no meaning; it is there to give the user the illusion of a large memory for its processes. The addresses acquire meaning only when the mapping occurs and they are translated into real addresses present in physical memory.
I would also like to mention that the base and limit registers are loaded by executing privileged instructions; privileged instructions are executed in kernel mode, and only the operating system has access to kernel mode. Therefore a user program cannot set these registers directly.
So first the CPU generates the logical address, and then the MMU takes over and performs the mapping, using translations set up by the operating system.
The binding of a process's instructions and data to memory is done at compile time, at load time, or at execution time. Logical addresses come into the picture only if the process is moved from one memory segment to another during its execution. The logical address is the address of the process before any relocation happens (say, memory address 10). Once relocation has happened (the process moved to memory address 100), the memory management unit (MMU) redirects the CPU to the correct location by keeping the difference between the relocated address and the original address (100 - 10 = 90) in the relocation register (the base register acts as the relocation register here). When the CPU then accesses data at memory address 10, the MMU adds 90 (the value in the relocation register) to the address and fetches the data from memory address 100.