ESP32 dual-core multi-task priority issue

task1: webServerTask
task2: guiTask (uses a mutex)
I assigned task1 to CPU 0 and task2 to CPU 1 using the xTaskCreatePinnedToCore function on the ESP32. I then expected task1 and task2 to run concurrently, each on its own CPU, regardless of their respective priorities. Isn't that right?
If I give task1 a higher priority than task2, task2 does not run. If I set the priorities to be equal, both tasks run. Even when the tasks are pinned to different CPUs, does their execution still depend on priority? Or is it related to the use of the semaphore in task2?
case 1: webServerTask runs, guiTask does not run (priority: task1 > task2, task2 with mutex)
xTaskCreatePinnedToCore(webServerTask, "webServer", 4096, NULL, 2, NULL, 0);
xTaskCreatePinnedToCore(guiTask, "gui", 4096 * 2, NULL, 1, NULL, 1);
case 2: both webServerTask and guiTask run (priority: task1 = task2, task2 with mutex)
xTaskCreatePinnedToCore(webServerTask, "webServer", 4096, NULL, 1, NULL, 0);
xTaskCreatePinnedToCore(guiTask, "gui", 4096 * 2, NULL, 1, NULL, 1);
case 3: both webServerTask and guiTask run (priority: task1 > task2, task2 without mutex)
xTaskCreatePinnedToCore(webServerTask, "webServer", 4096, NULL, 2, NULL, 0);
xTaskCreatePinnedToCore(guiTask, "gui", 4096 * 2, NULL, 1, NULL, 1);
Task1 and task2 are implemented as follows:
static void webServerTask(void *pvParameters)
{
    // initialize wifi and the web server
    WiFi.begin(ssid, password);
    while (WiFi.status() != WL_CONNECTED)
    {
        vTaskDelay(500 / portTICK_PERIOD_MS);
        Serial.print(".");
    }
    // Run the web server loop indefinitely
    while (1)
    {
        WiFiClient client = server.available();
        if (client)
        {
            ...
        }
        vTaskDelay(10 / portTICK_PERIOD_MS);
    }
}
static void guiTask(void *pvParameter)
{
    xGuiSemaphore = xSemaphoreCreateMutex();
    ... // display
    while (1)
    {
        vTaskDelay(pdMS_TO_TICKS(10));
        if (pdTRUE == xSemaphoreTake(xGuiSemaphore, portMAX_DELAY))
        {
            lv_task_handler();
            xSemaphoreGive(xGuiSemaphore);
        }
    }
}
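For reference, here is a compact, self-contained sketch of the setup being described (Arduino-ESP32 style). The task bodies are placeholders, and creating the mutex in setup() before either task starts is an assumption of this sketch, not something taken from the code above:

#include <Arduino.h>
#include <freertos/FreeRTOS.h>
#include <freertos/task.h>
#include <freertos/semphr.h>

SemaphoreHandle_t xGuiSemaphore;

static void webServerTask(void *pvParameters)      // pinned to core 0, priority 2
{
    for (;;)
    {
        // ... accept and serve clients ...
        vTaskDelay(pdMS_TO_TICKS(10));             // block so lower-priority tasks and the idle task get CPU time
    }
}

static void guiTask(void *pvParameters)            // pinned to core 1, priority 1
{
    for (;;)
    {
        if (xSemaphoreTake(xGuiSemaphore, portMAX_DELAY) == pdTRUE)
        {
            // ... lv_task_handler() ...
            xSemaphoreGive(xGuiSemaphore);
        }
        vTaskDelay(pdMS_TO_TICKS(10));
    }
}

void setup()
{
    Serial.begin(115200);
    xGuiSemaphore = xSemaphoreCreateMutex();       // created before the tasks are started (sketch assumption)
    xTaskCreatePinnedToCore(webServerTask, "webServer", 4096, NULL, 2, NULL, 0);
    xTaskCreatePinnedToCore(guiTask, "gui", 4096 * 2, NULL, 1, NULL, 1);
}

void loop() {}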

Related

OpenCL: buffering inputs for the same kernel

I have been trying to read a file and load it into a buffer of a kernel in OpenCL while the kernel is processing another buffer. However, it does not seem to work: for some reason, the results are wrong.
First, I tried setting the args for the same kernel every time before enqueueing a task. Then I tried enqueueing tasks for two kernels of the same function, as below, without changing the arguments:
krnl_1.setArg(0, buffer_a);
krnl_1.setArg(1, output_buffer);
krnl_2.setArg(0, buffer_b);
krnl_2.setArg(1, output_buffer);
void* ptr[2];
ptr[0] = q.enqueueMapBuffer(buffer_a, CL_TRUE, CL_MAP_READ | CL_MAP_WRITE, 0, buffer_size_in_bytes, NULL, NULL, &err);
ptr[1] = q.enqueueMapBuffer(buffer_b, CL_TRUE, CL_MAP_READ | CL_MAP_WRITE, 0, buffer_size_in_bytes, NULL, NULL, &err);
int sel = 0;
long long bytes_sent = 0;
// Fill buffer_a
bytes_sent += pread(myFd, (void*)ptr[sel], buffer_size_in_bytes, bytes_sent);
while (bytes_sent < total_size_in_bytes){
    if (sel == 0){ // If buffer_a was just filled
        q.enqueueTask(krnl_1);
        sel = 1; // Fill buffer_b
    } else { // If buffer_b was just filled
        q.enqueueTask(krnl_2);
        sel = 0; // Fill buffer_a
    }
    if (bytes_sent >= total_size_in_bytes) // If this is the last task
        q.enqueueMigrateMemObjects({output_buffer}, CL_MIGRATE_MEM_OBJECT_HOST);
    else // Fill the buffer that is not being processed
        bytes_sent += pread(myFd, (void*)ptr[sel], buffer_size_in_bytes, bytes_sent);
    q.finish();
}
If I do it serially, it works fine:
void* ptr[2];
ptr[0] = q.enqueueMapBuffer(buffer_a, CL_TRUE, CL_MAP_READ | CL_MAP_WRITE, 0, buffer_size_in_bytes, NULL, NULL, &err);
ptr[1] = q.enqueueMapBuffer(buffer_b, CL_TRUE, CL_MAP_READ | CL_MAP_WRITE, 0, buffer_size_in_bytes, NULL, NULL, &err);
int sel = 0;
long long bytes_sent = 0;
while (bytes_sent < total_size_in_bytes){
    bytes_sent += pread(myFd, (void*)ptr[sel], buffer_size_in_bytes, bytes_sent);
    if (sel == 0){
        q.enqueueTask(krnl_1);
        sel = 1;
    } else {
        q.enqueueTask(krnl_2);
        sel = 0;
    }
    if (bytes_sent >= total_size_in_bytes) // if this is the last task
        q.enqueueMigrateMemObjects({output_buffer}, CL_MIGRATE_MEM_OBJECT_HOST);
    q.finish();
}
I feel like I must have misunderstood the way OpenCL treats arguments and enqueues tasks, but I cannot find any similar examples.
First, reads and writes by a kernel executing on a device to a memory region that is mapped for writing are undefined (for more information, see the "Accessing mapped regions of a memory object" section of clEnqueueMapBuffer). Therefore, it is necessary to unmap the buffer before running the task.
Second, bytes_sent is incremented before it is checked in the while() condition in the first example. If the first call to pread() reads all of the data, the loop body is never executed.
Therefore, I expect that the code should look something like this:
krnl_1.setArg(0, buffer_a);
krnl_1.setArg(1, output_buffer);
krnl_2.setArg(0, buffer_b);
krnl_2.setArg(1, output_buffer);
int sel = 0;
long long bytes_processed = 0;
void* buf_ptr = q.enqueueMapBuffer(buffer_a, CL_TRUE, CL_MAP_WRITE, 0, buffer_size_in_bytes, NULL, NULL, &err);
long long bytes_to_process = pread(myFd, buf_ptr, buffer_size_in_bytes, bytes_processed);
err = q.enqueueUnmapMemObject(buffer_a, buf_ptr);
while (bytes_processed < total_size_in_bytes)
{
    if (sel == 0){ // If buffer_a was just filled
        q.enqueueTask(krnl_1);
        sel = 1; // Fill buffer_b
    } else { // If buffer_b was just filled
        q.enqueueTask(krnl_2);
        sel = 0; // Fill buffer_a
    }
    bytes_processed += bytes_to_process;
    if (bytes_processed < total_size_in_bytes) { // Fill the buffer that is not being processed
        auto& buffer = sel ? buffer_b : buffer_a;
        buf_ptr = q.enqueueMapBuffer(buffer, CL_TRUE, CL_MAP_WRITE, 0, buffer_size_in_bytes, NULL, NULL, &err);
        bytes_to_process = pread(myFd, buf_ptr, buffer_size_in_bytes, bytes_processed);
        err = q.enqueueUnmapMemObject(buffer, buf_ptr);
    }
    else { // If this is the last task
        buf_ptr = q.enqueueMapBuffer(output_buffer, CL_TRUE, CL_MAP_READ, 0, output_buffer_size_in_bytes, NULL, NULL, &err);
    }
}
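One detail the corrected sketch leaves open is what happens to the mapped output pointer after the loop. Presumably the host reads the results through buf_ptr and then unmaps the buffer, along these lines (an assumed continuation, not part of the original answer; it relies on the default in-order queue, where the blocking map has already waited for the last kernel):

// Assumed continuation (not from the original answer):
// ... read the results through buf_ptr ...
err = q.enqueueUnmapMemObject(output_buffer, buf_ptr);
q.finish(); // make sure the unmap has completed before the buffers are reused or released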

STM32 & FreeRTOS task scheduling

I need to run three tasks in order of priority
/* Definitions for myTask01 */
osThreadId_t myTask01Handle;
const osThreadAttr_t myTask01_attributes = {
    .name = "myTask01",
    .stack_size = 512 * 4,
    .priority = (osPriority_t) osPriorityLow,
};

/* Definitions for myTask02 */
osThreadId_t myTask02Handle;
const osThreadAttr_t myTask02_attributes = {
    .name = "myTask02",
    .stack_size = 512 * 4,
    .priority = (osPriority_t) osPriorityNormal,
};

/* Definitions for myTask03 */
osThreadId_t myTask03Handle;
const osThreadAttr_t myTask03_attributes = {
    .name = "myTask03",
    .stack_size = 512 * 4,
    .priority = (osPriority_t) osPriorityHigh,
};
The tasks simply blink the LED on PA5:
void StartTask02(void *argument) //Normal priority
{
    for(;;)
    {
        HAL_GPIO_TogglePin(GPIOA, GPIO_PIN_5);
        osDelay(2000);
    }
    osThreadTerminate(NULL);
}
Of course, in main() the tasks are created and the scheduler is initialized and started.
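For reference, that creation code is not shown above; with CMSIS-RTOS2 it typically looks something like the following sketch (the usual generated pattern, with StartTask01 and StartTask03 assumed to mirror StartTask02):

/* Sketch only: typical CMSIS-RTOS2 startup sequence, not the poster's exact main(). */
osKernelInitialize();                                                   /* initialize the RTOS kernel      */
myTask01Handle = osThreadNew(StartTask01, NULL, &myTask01_attributes);  /* low priority                    */
myTask02Handle = osThreadNew(StartTask02, NULL, &myTask02_attributes);  /* normal priority                 */
myTask03Handle = osThreadNew(StartTask03, NULL, &myTask03_attributes);  /* high priority                   */
osKernelStart();                                                        /* start scheduling; never returns */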
When I run the firmware, only the third task runs (it blinks faster).
If I put this for loop in the third task
void StartTask03(void *argument) //High priority
{
    for(int i=0;i<50;i++)
    {
        HAL_GPIO_TogglePin(GPIOA, GPIO_PIN_5);
        osDelay(100);
    }
    osThreadTerminate(NULL);
}
it blinks while the task is inside the for loop, and when the loop exits, the LED stops blinking.
How can I get the other tasks to run? Thanks

Should we use a mutex together with semaphores for correct synchronization and to prevent race conditions?

I am trying to observe the race condition that happens in the consumer-producer problem, so I made multiple producers and multiple consumers.
From what I know, I need to use a mutex together with the semaphores:
A mutex for the race condition, because multiple producers can access the buffer at the same time, and then the data might be corrupted.
And semaphores to provide signaling between the producers and the consumers.
The problem is that the synchronization happens correctly even though I am not using the mutex (I am using only the semaphores). Is my understanding correct, or is there something wrong in the code below?
#include <pthread.h>
#include <stdio.h>
#include <semaphore.h>
#include <stdlib.h>
#include <unistd.h>

int buffer;
int loops = 0;
sem_t empty;
sem_t full;
sem_t mutex; //Adding MUTEX

void put(int value) {
    buffer = value;
}

int get() {
    int b = buffer;
    return b;
}

void *producer(void *arg) {
    int i;
    for (i = 0; i < loops; i++) {
        sem_wait(&empty);
        //sem_wait(&mutex);
        put(i);
        //printf("Data Set from %s, Data=%d\n", (char*) arg, i);
        //sem_post(&mutex);
        sem_post(&full);
    }
}

void *consumer(void *arg) {
    int i;
    for (i = 0; i < loops; i++) {
        sem_wait(&full);
        //sem_wait(&mutex);
        int b = get();
        //printf("Data recieved from %s, %d\n", (char*) arg, b);
        printf("%d\n", b);
        //sem_post(&mutex);
        sem_post(&empty);
    }
}

int main(int argc, char *argv[])
{
    if(argc < 2 ){
        printf("Needs 2nd arg for loop count variable.\n");
        return 1;
    }
    loops = atoi(argv[1]);
    sem_init(&empty, 0, 1);
    sem_init(&full, 0, 0);
    sem_init(&mutex, 0, 1);
    pthread_t pThreads[3];
    pthread_t cThreads[3];
    pthread_create(&cThreads[0], 0, consumer, (void*)"Consumer1");
    pthread_create(&cThreads[1], 0, consumer, (void*)"Consumer2");
    pthread_create(&cThreads[2], 0, consumer, (void*)"Consumer3");
    //Passing the name of the thread as paramter, Ignore attr
    pthread_create(&pThreads[0], 0, producer, (void*)"Producer1");
    pthread_create(&pThreads[1], 0, producer, (void*)"Producer2");
    pthread_create(&pThreads[2], 0, producer, (void*)"Producer3");
    pthread_join(pThreads[0], NULL);
    pthread_join(pThreads[1], NULL);
    pthread_join(pThreads[2], NULL);
    pthread_join(cThreads[0], NULL);
    pthread_join(cThreads[1], NULL);
    pthread_join(cThreads[2], NULL);
    return 0;
}
I believe I have figured out the problem. Here's what is happening:
When initializing your semaphores, you set empty's initial count to 1 and full's to 0:
sem_init(&empty, 0, 1);
sem_init(&full, 0, 0);
sem_init(&mutex, 0, 1);
This means that there is only one "space" for a thread to enter the critical region at a time. In other words, what your program is doing is:
produce (empty is now 0, full has 1)
consume (full is now 0, empty has 1)
produce (empty is now 0, full has 1)
...
It's as if you had a single token (or, if you like, a mutex) that is passed back and forth between consumers and producers. That is actually what the consumer-producer problem is all about, except that in most cases we want several consumers and producers to be working at the same time (which means having more than one token). Here, because you have only one token, it behaves essentially like a mutex.
Hope it helped :)
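To see where the mutex does become necessary, consider a buffer with more than one slot, i.e. empty initialised to some N > 1. A minimal sketch (sizes and names are illustrative, not taken from the question):

/* Sketch: with N slots, several producers can get past sem_wait(&empty) at once,  */
/* so the shared indices and the ring itself must be updated under the mutex.      */
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

#define N     8
#define LOOPS 100

int ring[N];
int fill_idx = 0, use_idx = 0;
sem_t empty;   /* counts free slots, starts at N   */
sem_t full;    /* counts used slots, starts at 0   */
sem_t mutex;   /* guards ring[], fill_idx, use_idx */

void *producer(void *arg) {
    for (int i = 0; i < LOOPS; i++) {
        sem_wait(&empty);
        sem_wait(&mutex);
        ring[fill_idx] = i;
        fill_idx = (fill_idx + 1) % N;
        sem_post(&mutex);
        sem_post(&full);
    }
    return NULL;
}

void *consumer(void *arg) {
    for (int i = 0; i < LOOPS; i++) {
        sem_wait(&full);
        sem_wait(&mutex);
        int v = ring[use_idx];
        use_idx = (use_idx + 1) % N;
        sem_post(&mutex);
        sem_post(&empty);
        printf("%d\n", v);
    }
    return NULL;
}

int main(void) {
    pthread_t p[3], c[3];
    sem_init(&empty, 0, N);
    sem_init(&full, 0, 0);
    sem_init(&mutex, 0, 1);
    for (int i = 0; i < 3; i++) pthread_create(&c[i], NULL, consumer, NULL);
    for (int i = 0; i < 3; i++) pthread_create(&p[i], NULL, producer, NULL);
    for (int i = 0; i < 3; i++) { pthread_join(p[i], NULL); pthread_join(c[i], NULL); }
    return 0;
}

With a one-slot buffer the single token already serialises everything; with N slots the semaphores only count, and the mutex is what keeps the index updates from racing.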

What's the difference between pthread_join and pthread_mutex_lock?

The following code is taken from this site and shows how to use mutexes. It uses both pthread_join and pthread_mutex_lock:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void *functionC();
pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
int counter = 0;

main()
{
    int rc1, rc2;
    pthread_t thread1, thread2;

    /* Create independent threads each of which will execute functionC */
    if( (rc1=pthread_create( &thread1, NULL, &functionC, NULL)) )
    {
        printf("Thread creation failed: %d\n", rc1);
    }
    if( (rc2=pthread_create( &thread2, NULL, &functionC, NULL)) )
    {
        printf("Thread creation failed: %d\n", rc2);
    }

    /* Wait till threads are complete before main continues. Unless we */
    /* wait we run the risk of executing an exit which will terminate  */
    /* the process and all threads before the threads have completed.  */
    pthread_join( thread1, NULL);
    pthread_join( thread2, NULL);

    exit(EXIT_SUCCESS);
}

void *functionC()
{
    pthread_mutex_lock( &mutex1 );
    counter++;
    printf("Counter value: %d\n",counter);
    pthread_mutex_unlock( &mutex1 );
}
I ran the code exactly as given above, and it produced the following result:
Counter value: 1
Counter value: 2
But in a second run I removed "pthread_mutex_lock( &mutex1 );" and "pthread_mutex_unlock( &mutex1 );". I compiled and ran the code, and it again produced the same result.
Now the thing that confuses me is why the mutex lock is used in the above code when the same thing can be done without it (using pthread_join). If pthread_join prevents another thread from running until the first one has finished, then I think it would already prevent the other thread from accessing the counter value. What's the purpose of pthread_mutex_lock?
The join prevents the starting thread from running (and thus terminating the process) until thread1 and thread2 finish. It doesn't provide any synchronization between thread1 and thread2. The mutex prevents thread1 from reading the counter while thread2 is modifying it, or vice versa.
Without the mutex, the most obvious thing that could go wrong is that thread1 and thread2 run in perfect sync: they each read zero from the counter, each add one to it, and each output "Counter value: 1".
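One way to make the race visible (a sketch based on the same idea, not the tutorial code above; the iteration count is arbitrary) is to increment the counter many times in each thread. With the lock in place the final count is exactly 2 * ITERATIONS; with the two mutex calls commented out it usually comes out lower, even though both threads are still joined:

#include <pthread.h>
#include <stdio.h>

#define ITERATIONS 1000000L

static pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;

void *functionC(void *arg)
{
    for (long i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&mutex1);    /* comment these two lines out ...            */
        counter++;
        pthread_mutex_unlock(&mutex1);  /* ... and the final count becomes unreliable */
    }
    return NULL;
}

int main(void)
{
    pthread_t thread1, thread2;
    pthread_create(&thread1, NULL, functionC, NULL);
    pthread_create(&thread2, NULL, functionC, NULL);
    pthread_join(thread1, NULL);   /* join only waits for completion;        */
    pthread_join(thread2, NULL);   /* it does not order the increments above */
    printf("Counter value: %ld (expected %ld)\n", counter, 2 * ITERATIONS);
    return 0;
}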

main() thread versus one created by pthread_create()

I've been creating programs exemplifying concurrency bugs using POSIX threads.
The overall question I have is: what is the difference between the main() thread and one created by pthread_create()? My original understanding was that they are pretty much the same, but I'm getting different results from the two programs below.
To expand before showing the full code I've written, what I am wondering is: is there a difference between the following?
int main() {
    ...
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    ...
}
and
int main() {
    ...
    pthread_create(&t1, NULL, worker, NULL);
    worker();
    ...
}
To expand using a full example program: I've made two versions of the same program. Both have the same worker() function:
void *worker(void *arg) {
    printf("Entered worker function\n");
    int myid;
    int data = 999;

    pthread_mutex_lock(&gidLock);
    myid = gid;
    gid++;
    printf("myid == %d\n", myid);
    pthread_mutex_unlock(&gidLock);

    if (myid == 0) {
        printf("Sleeping since myid == 0\n");
        sleep(1);
        result = data;
        printf("Result updated\n");
    }
    return NULL;
}
gid and result are globals initialized to 0.
What's the difference between the following two main() functions?
int main_1() {
    pthread_t t1, t2;
    int tmp;

    /* initialize globals */
    gid = 0;
    result = 0;

    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t2, NULL);

    printf("Parent thread exited worker function\n");
    tmp = result;
    printf("%d\n", tmp);
    pthread_exit((void *) 0);
}
and
int main_2() {
    pthread_t t1;
    int tmp;

    /* initialize globals */
    gid = 0;
    result = 0;

    pthread_create(&t1, NULL, worker, NULL);
    worker(NULL);

    printf("Parent thread exited worker function\n");
    tmp = result;
    printf("%d\n", tmp);
    pthread_exit((void *) 0);
}
Sample output for main_1()
Entered worker function
myid == 0
Sleeping since myid == 0
Entered worker function
myid == 1
Parent thread exited worker function
0
Result updated
Sample output for main_2()
Entered worker function
myid == 0
Sleeping since myid == 0
Entered worker function
myid == 1
/* program waits here */
Result updated
Parent thread exited worker function
999
Edit: The program intentionally has a concurrency bug (an atomicity violation). Delays were added by calling sleep() to try to force the buggy interleaving to occur. The program is intended to be used to test software that automatically detects concurrency bugs.
I would think that main_1() and main_2() are essentially the same program and should result in the same interleaving when run on the same system (or largely the same interleaving; it is indeterminate, but running the same program on the same system tends to explore only a small portion of the possible schedules and rarely deviates [1]).
The "desired" output is that from main_1().
I'm not sure why the thread with myid == 1 stalls and does not return in main_2(). If I had to guess
Thanks for reading this far; if anyone needs more information, I'd be happy to oblige. Here are links to the full source code:
main_1(): https://gist.github.com/2942372
main_2(): https://gist.github.com/2942375
I've been compiling with gcc -pthread -lpthread
Thanks again.
[1] S. Park, S. Lu, Y. Zhou. "CTrigger: Exposing Atomicity Violation Bugs from Their Hiding Places"
