Regarding use of __device__ for variables - memory

I am using a global variable, say d_myVar, which will be allocated device memory using cudaMalloc in the main function. Should I put __device__ in front of it in the global declaration? I ask because if it were a local variable in host code that was passed to a kernel, we would not write __device__ in front of it. Let me know if I am wrong.

Globally-scoped __device__ variables are not allocated with cudaMalloc. Simply annotate a variable in the global scope with __device__:
#include <stdio.h>
__device__ int d_myVar;
__global__ void foo()
{
printf("d_myVar is %d\n", d_myVar);
}
int main()
{
int h_myVar = 13;
cudaMemcpyToSymbol(d_myVar, &h_myVar, sizeof(int), 0, cudaMemcpyHostToDevice);
foo<<<1,1>>>();
cudaDeviceSynchronize(); // replaces the deprecated cudaThreadSynchronize()
return 0;
}
The result:
$ nvcc -arch=sm_20 test.cu -run
d_myVar is 13

c programming how to write this in main

Can you write prototypes without the variable names?
int example(examplestruct *var1, examplestruct *var2);
void done(examplestruct *var1,FILE *f);
struct {
int* field1;
int field2;
}examplestruct;
Is it possible to write the prototypes without naming the parameters? Can anyone tell me whether this is acceptable in C?
Yes. Parameter names are optional in a prototype; only the parameter types are required.
As for the second question:
If you want the work of a function to happen inside main(), copy the body of the function into main() and replace the function's parameters with the corresponding variables from main().
This example will clear things up:
#include <stdio.h>
void print(int);
void inc_p(int);
int main(void) {
int num = 5;
print(num);
inc_p(num);
// to get rid of inc_p(), copy and paste its body inside main
// and you will get this
// a++;
// print(a);
// However, a was an argument, here you need to use
// the variable declared in main(), i.e. 'num'
num++;
print(num);
return 0;
}
void print(int a) {
printf("%d\n", a);
}
void inc_p(int a) {
a++;
print(a);
}

Pre-R16B driver_async_port_key alternative

According to erl_driver documentation for driver_async_port_key function,
Before OTP-R16, the actual port id could be used as a key with proper casting, but after the rewrite of the port subsystem, this is no longer the case. With this function, you can achieve the same distribution based on port id's as before OTP-R16.
What is this proper casting?
The ErlDrvPort type is a typedef of a pointer to a struct. To obtain an unsigned int async key type in older driver applications, you need to convert this pointer type to unsigned int. One way to achieve this is to cast it through the C99 uintptr_t type, which is guaranteed to be large enough to hold a pointer value:
#include <stdint.h>
#include "erl_driver.h"
unsigned int my_port_key(ErlDrvPort port)
{
return (unsigned int) (uintptr_t) port;
}
You can write a portable function to return an async key using driver API versioning information available in erl_driver.h. The driver_async_port_key function was introduced in driver API version 2.2, so we can call driver_async_port_key when using version 2.2 or newer, or fall back to the casting approach for older versions:
#include <stdint.h>
#include "erl_driver.h"
unsigned int my_port_key(ErlDrvPort port)
{
#if ERL_DRV_EXTENDED_MAJOR_VERSION > 2 || \
(ERL_DRV_EXTENDED_MAJOR_VERSION == 2 && ERL_DRV_EXTENDED_MINOR_VERSION >= 2)
return driver_async_port_key(port);
#else
return (unsigned int) (uintptr_t) port;
#endif
}

About the parameter of function pthread_create?

We know that pthread_create is declared like this:
int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
void *(*start_routine) (void *), void* arg);
Why is the return type of the third parameter void* rather than void?
Because pthread_create has no way of knowing what kind of data a start routine wants to return, the routine returns a void*, which can point to an object of any type. Whoever collects the result (through pthread_join) casts the void* back to the type that was actually returned before using what it points to. If the start routine were declared to return void, it could return nothing at all, so a thread that wanted to hand back an int or a struct would have no way to do so. For example:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <pthread.h>
struct test {
char str[32];
int x;
};
void *func(void *arg) { /* a parameter name is required in a C function definition; arg is unused */
struct test *eg = (struct test *)malloc(sizeof(struct test));
strcpy(eg->str,"hello world");
eg->x = 42;
pthread_exit(eg);
}
int main (void) {
pthread_t id;
struct test *resp;
pthread_create(&id, NULL, func, NULL);
pthread_join(id,(void**)&resp);
printf("%s %d\n",resp->str,resp->x);
free(resp);
return 0;
}
More details on this post: What does void* mean and how to use it?

How to create and use vapi files?

I want to make a custom vapi file. I have the basic stuff, but I am obviously missing something and I can't find anywhere how to do this properly. My main goal is to create a torrent app using libtorrent, and to create the GUI (the frontend?) with Vala and GTK.
I have a c_func_head.h:
#ifndef WHATEVER_H_INCLUDED
#define WHATEVER_H_INCLUDED
int add(int a, int b);
#endif
c_functions.c:
#include <stdio.h>
#include <stdlib.h>
#include "c_func_head.h"
int add(int a, int b){
printf("Adding numbers in c...\n");
return a+b;
}
vala_p.vapi:
[CCode (cheader_filename = "c_func_head.h")]
namespace MyFunc {
[CCode (cname = "add")]
public int add (int a, int b);
}
and finally vala_program.vala:
//extern int add(int a, int b);
using MyFunc;
void main(){
stdout.printf("Calling a c function...\n");
//stdout.printf("The sum is: %d\n", add2number(2, 2));
int sum = add(2, 2);
stdout.printf("The sum is: %d\n", sum);
}
As you can see, I tried an extern declaration too. It worked, but I want to use vapi files.
I compiled with (everything is in the same folder):
valac vala_program.vala --vapidir=vala_p.vapi -o mustrun
and the error is:
The namespace name `MyFunc' could not be found using MyFunc;
One more thing: is it possible to make bindings for libtorrent? It uses C++, and I guess I would have to use C++ too.
You can't make Vala bindings for C++ code, only for C. There is a guide to writing legacy bindings, and there is a binding for Transmission, which is C-based.
As for the specific error you have: --vapidir expects a directory, not a .vapi file. Either pass the vapi file directly with valac vala_program.vala vala_p.vapi if the library files (i.e., the headers) are in the same directory, or use valac vala_program.vala --pkg vala_p --vapidir=/path/to/directory/containing/vapi.

what is wrong with following pthread program?

I am not able to execute pthreads program in c. Please tell me what is wrong with the following program. I am neither getting any error nor expected output.
void *worker(void * arg)
{
int i;
int *id=(int *)arg;
printf("Thread %d starts\n", *id );
}
void main(int argc, char **argv)
{
int thrd_no,i,*thrd_id,rank=0;
void *exit_status;
pthread_t *threads;
thrd_no=atoi(argv[1]-1);
thrd_id= malloc(sizeof(int)*(thrd_no));
threads=malloc(sizeof(pthread_t)*(thrd_no));
for(i=0;i<thrd_no;i++)
{
rank=i+1;
thrd_id[i]=pthread_create(&threads[i], NULL, worker, &rank);
}
for(i=0;i<thrd_no;i++)
{
pthread_join(threads[i], &exit_status);
}
}
thrd_no = atoi(argv[1] - 1); likely doesn't do what you intended; the way argv is normally passed into a new process and parsed into a C array, argv[1] - 1 is probably pointing at \0 (specifically, the \0 at the end of argv[0]). (More generally, indexing backwards off the start of a string is rarely correct.) The result is that atoi() will return 0 and no threads will be created. What did you actually intend to do there?
You are passing the same address, &rank, to every thread, so id (and therefore *id) is the same for all of your workers.
It would be better to give each worker its own storage, for example by allocating on the heap the object whose address you pass to each worker routine.
You could also include <stdint.h> and use intptr_t to pass the value itself, e.g.
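void *worker(void *p)
{
intptr_t rk = (intptr_t) p; /* the rank travels inside the pointer value */
/// etc
return NULL; /* a pthread start routine must return void* */
}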
and call
intptr_t rank = i + 1;
thrd_id[i]=pthread_create(&threads[i], NULL, worker, (void*)rank);
You should also learn to use a debugger and compile with all warnings and debug information, i.e. gcc -Wall -g (improve your code until it compiles without warnings, then use gdb).
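The code segment
rank=i+1;
thrd_id[i]=pthread_create(&threads[i], NULL, worker, &rank);
will produce a race condition: every thread receives a pointer to the same rank variable, which the loop in main keeps modifying while the threads read it.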
