clang not generating debug symbols

I have a file a.c:
#include <stdio.h>

int main() {
    int a = 1;
    int b = 2;
    int c = a + b;
    return 0;
}
When compiling with clang -g a.c, I can't get debug symbols:
joey#voyager-arch /t/a4> lldb a.out
(lldb) target create "a.out"
Current executable set to '/tmp/a4/a.out' (x86_64).
(lldb) l
(lldb)
But if I use gcc, compiling with gcc -g a.c, I can successfully get the debug symbols:
joey#voyager-arch /t/a4> gdb a.out
GNU gdb (GDB) 11.1
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from a.out...
(gdb) l
1   #include <stdio.h>
2
3   int main() {
4       int a = 1;
5       int b = 2;
6       int c = a + b;
7       return 0;
8   }
(gdb)
I'm using Arch Linux with an AMD Ryzen 7 CPU.
clang: 12.0.1
lldb: 12.0.1
gcc: 11.1.0
gdb: 11.1

This may be an lldb 12 incompatibility bug; downgrading to lldb 10 solved the problem.
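
As an aside (not from the original thread): you can check whether clang actually emitted DWARF, which separates a compiler problem from a debugger problem. A minimal sketch, assuming llvm-dwarfdump from the same LLVM toolchain is installed:

$ llvm-dwarfdump --debug-info a.out | head

If this prints a DW_TAG_compile_unit referring to a.c, the debug info is in the binary and the failure is on the lldb side, which is consistent with the downgrade fixing it.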

Related

Change value of const variables in LLDB

Const variables are good. However, sometimes I want to change their values dynamically while debugging, to trace some specific behavior.
When I run po flag = NO, I receive this error:
error: <user expression 0>:1:34: cannot assign to variable 'flag' with const-qualified type 'const BOOL &' (aka 'const bool &')
Is there any workaround?
You can make the expression succeed by using const_cast, but it is likely not to do what you want. For instance:
(lldb) run
Process 32640 launched: '/tmp/foo' (x86_64)
Process 32640 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x0000000100003f66 foo`main at foo.c:7
   4    main()
   5    {
   6        const int foo = 10;
-> 7        printf("%d\n", foo);
   8        return 0;
   9    }
Target 0: (foo) stopped.
(lldb) expr foo = 20
error: <user expression 0>:1:5: cannot assign to variable 'foo' with const-qualified type 'const int &'
foo = 20
~~~ ^
note: variable 'foo' declared const here
const_cast to the rescue:
(lldb) expr *(const_cast<int*>(&foo)) = 20
(int) $1 = 20
We really did change the value in the foo slot, as you can see by:
(lldb) expr foo
(const int) $2 = 20
and:
(lldb) frame var foo
(const int) foo = 20
But the compiler is free to inline the value of const variables, which it does pretty freely even at -O0 (*). For instance, the call to printf compiled on x86_64 at -O0 to:
->  0x100003f66 <+22>: leaq   0x39(%rip), %rdi    ; "%d\n"
    0x100003f6d <+29>: movl   $0xa, %esi
    0x100003f72 <+34>: movb   $0x0, %al
    0x100003f74 <+36>: callq  0x100003f86         ; symbol stub for: printf
Note that it doesn't reference any variables, it just puts 0xa into the second argument passing register directly. So unsurprisingly:
(lldb) c
Process 33433 resuming
10
(*) At higher optimization levels there probably wouldn't even be a variable allocated on the frame for const variables. The value would just get inserted immediately as above, and the debug information would also record it for the debugger's display, but there wouldn't be anything in memory to change from the old to the new value.
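
Not part of the original answer, but if you need a debugger-side write to actually change program behavior, one workaround is to force the program to read the variable from memory instead of using a folded constant. A minimal C sketch (the volatile-qualified pointer is my addition, not from the question):

#include <stdio.h>

int main(void)
{
    const int foo = 10;
    /* Reading through a volatile pointer forces a load from foo's
       memory slot, so a value patched there by the debugger (e.g. via
       the const_cast expression above) is actually observed. */
    const volatile int *p = &foo;
    printf("%d\n", *p);
    return 0;
}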

How do I go build a Go GTK 2 app for Windows from Linux? Is there a Docker image?

I am trying to cross-compile a .go file that uses the GTK binding package, from Linux to Windows, and can't figure it out. I tried going the route of setting up MSYS on Windows, but it was god-awful.
I looked for a Docker image but there is none.
~/go/src/gui$ GOOS=windows CGO_ENABLED=1 GOARCH= CC=x86_64-w64-mingw32-gcc go build
# github.com/mattn/go-gtk/glib
In file included from /usr/lib/x86_64-linux-gnu/glib-2.0/include/glibconfig.h:9:0,
from /usr/include/glib-2.0/glib/gtypes.h:32,
from /usr/include/glib-2.0/glib/galloca.h:32,
from /usr/include/glib-2.0/glib.h:30,
from ./glib.go.h:4,
from ../github.com/mattn/go-gtk/glib/glib.go:5:
/usr/include/glib-2.0/glib/gtypes.h: In function '_GLIB_CHECKED_ADD_U64':
/usr/include/glib-2.0/glib/gmacros.h:241:53: error: size of array '_GStaticAssertCompileTimeAssertion_0' is negative
#define G_STATIC_ASSERT(expr) typedef char G_PASTE (_GStaticAssertCompileTimeAssertion_, __COUNTER__)[(expr) ? 1 : -1] G_GNUC_UNUSED
^
/usr/include/glib-2.0/glib/gmacros.h:238:47: note: in definition of macro 'G_PASTE_ARGS'
#define G_PASTE_ARGS(identifier1,identifier2) identifier1 ## identifier2
^~~~~~~~~~~
/usr/include/glib-2.0/glib/gmacros.h:241:44: note: in expansion of macro 'G_PASTE'
#define G_STATIC_ASSERT(expr) typedef char G_PASTE (_GStaticAssertCompileTimeAssertion_, __COUNTER__)[(expr) ? 1 : -1] G_GNUC_UNUSED
^~~~~~~
/usr/include/glib-2.0/glib/gtypes.h:423:3: note: in expansion of macro 'G_STATIC_ASSERT'
G_STATIC_ASSERT(sizeof (unsigned long long) == sizeof (guint64));
^~~~~~~~~~~~~~~
# github.com/mattn/go-gtk/pango
In file included from /usr/lib/x86_64-linux-gnu/glib-2.0/include/glibconfig.h:9:0,
from /usr/include/glib-2.0/glib/gtypes.h:32,
from /usr/include/glib-2.0/glib/galloca.h:32,
from /usr/include/glib-2.0/glib.h:30,
from /usr/include/pango-1.0/pango/pango-coverage.h:25,
from /usr/include/pango-1.0/pango/pango-font.h:25,
from /usr/include/pango-1.0/pango/pango-attributes.h:25,
from /usr/include/pango-1.0/pango/pango.h:25,
from ./pango.go.h:7,
from ../github.com/mattn/go-gtk/pango/pango.go:5:
/usr/include/glib-2.0/glib/gtypes.h: In function '_GLIB_CHECKED_ADD_U64':
/usr/include/glib-2.0/glib/gmacros.h:241:53: error: size of array '_GStaticAssertCompileTimeAssertion_0' is negative
#define G_STATIC_ASSERT(expr) typedef char G_PASTE (_GStaticAssertCompileTimeAssertion_, __COUNTER__)[(expr) ? 1 : -1] G_GNUC_UNUSED
^
/usr/include/glib-2.0/glib/gmacros.h:238:47: note: in definition of macro 'G_PASTE_ARGS'
#define G_PASTE_ARGS(identifier1,identifier2) identifier1 ## identifier2
^~~~~~~~~~~
/usr/include/glib-2.0/glib/gmacros.h:241:44: note: in expansion of macro 'G_PASTE'
#define G_STATIC_ASSERT(expr) typedef char G_PASTE (_GStaticAssertCompileTimeAssertion_, __COUNTER__)[(expr) ? 1 : -1] G_GNUC_UNUSED
^~~~~~~
/usr/include/glib-2.0/glib/gtypes.h:423:3: note: in expansion of macro 'G_STATIC_ASSERT'
G_STATIC_ASSERT(sizeof (unsigned long long) == sizeof (guint64));
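
No answer was recorded here, but the log itself points at the cause: every include path is under the host's /usr/include and /usr/lib/x86_64-linux-gnu, so the mingw cross-compiler is being fed glib headers generated for the Linux ABI, and glib's static size assertions fail. go-gtk's cgo directives typically resolve include paths through pkg-config, so the usual lever is pointing pkg-config at a mingw tree. A hedged sketch, assuming a mingw-w64 build of GTK 2 is installed under /usr/x86_64-w64-mingw32 (that path, and GOARCH=amd64, are my assumptions, not from the question):

# Point pkg-config at the mingw .pc files instead of the host ones.
export PKG_CONFIG_LIBDIR=/usr/x86_64-w64-mingw32/lib/pkgconfig
export PKG_CONFIG_SYSROOT_DIR=/usr/x86_64-w64-mingw32
GOOS=windows GOARCH=amd64 CGO_ENABLED=1 \
    CC=x86_64-w64-mingw32-gcc go build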

ctags: To get C function end line number

Is it possible via ctags to get the function end line number as well?
ctags -x --c-kinds=f filename.c
The above command lists the start line numbers of function definitions. I want a way to get the function end line numbers as well.
Other Approaches:
awk 'NR > first && /^}$/ { print NR; exit }' first=$FIRST_LINE filename.c
This requires the code to be consistently formatted, with the closing brace alone on its own line.
Example:
filename.c
1   #include <stdio.h>
2   #include <stdlib.h>
3   int main()
4   {
5       const char *name;
6
7       int a = 0;
8       printf("name");
9       printf("sssss: %s", name);
10
11      return 0;
12  }
13
14  void code()
15  {
16      printf("Code \n");
17  }
18
19  int code2()
20  {
21      printf("code2 \n");
22      return 1;
23  }
24
Input: filename and the function's start line number.
Example:
Input: filename.c 3
Output: 12
Input: filename.c 19
Output: 23
Is there any better/simpler way of doing this?
The C/C++ parser of Universal Ctags (https://ctags.io) has an end: field.
[jet#localhost tmp]$ cat -n foo.c
     1  int
     2  main( void )
     3  {
     4
     5  }
     6
     7  int
     8  bar (void)
     9  {
    10
    11  }
    12
    13  struct x {
    14      int y;
    15  };
    16
[jet#localhost tmp]$ ~/var/ctags/ctags --fields=+ne -o - --sort=no foo.c
main foo.c /^main( void )$/;" f line:2 typeref:typename:int end:5
bar foo.c /^bar (void)$/;" f line:8 typeref:typename:int end:11
x foo.c /^struct x {$/;" s line:13 file: end:15
y foo.c /^ int y;$/;" m line:14 struct:x typeref:typename:int file:
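If you then want just the end line for a given function, the tags output is easy to post-process; an untested sketch with awk, using main as the example:
$ ~/var/ctags/ctags --fields=+ne -o - --sort=no foo.c | \
    awk '$1 == "main" { for (i = 1; i <= NF; i++) if ($i ~ /^end:/) { sub(/end:/, "", $i); print $i } }'
5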
awk to the rescue!
It doesn't handle curly braces within comments, but it should handle nested blocks within functions; please give it a try:
$ awk -v s=3 'NR>=s && /{/ {c++}
NR>=s && /}/ && c && !--c {print NR; exit}' file
It finds the brace matching the first opening brace at or after the specified start line number s.
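Applied to the sample filename.c above, this reproduces the expected outputs:
$ awk -v s=3 'NR>=s && /{/ {c++}
    NR>=s && /}/ && c && !--c {print NR; exit}' filename.c
12
$ awk -v s=19 'NR>=s && /{/ {c++}
    NR>=s && /}/ && c && !--c {print NR; exit}' filename.c
23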

ROS Custom message with sensor_msgs/Image Publisher

I have a custom .msg file MyImage.msg
sensor_msgs/Image im
float32 age
string name
I have configured the custom .msg file as illustrated in link:CreatingMsgAndSrv.
Further, I am trying to write a simple publisher with this msg.
#include <ros/ros.h>
#include <custom_msg/MyImage.h>
#include <image_transport/image_transport.h>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/opencv.hpp>
#include <cv_bridge/cv_bridge.h>

int main( int argc, char ** argv )
{
    ros::init(argc, argv, "publish_custom");
    ros::NodeHandle nh;
    ros::Publisher pub2 = nh.advertise<custom_msg::MyImage>("custom_image", 2);

    cv::Mat image = cv::imread("Lenna.png", CV_LOAD_IMAGE_COLOR);
    sensor_msgs::ImagePtr im_msg = cv_bridge::CvImage(std_msgs::Header(), "bgr8", image).toImageMsg();

    ros::Rate rate(2);
    while( ros::ok() )
    {
        ROS_INFO_STREAM_ONCE("IN main loop");
        custom_msg::MyImage msg2;
        msg2.age = 54.3;
        msg2.im = im_msg;
        msg2.name = "Gena";
        pub2.publish(msg2);
        rate.sleep();
    }
}
This does not seem to compile with catkin_make. The error messages are:
/home/eeuser/ros_workspaces/HeloRosProject/src/custom_msg/publish.cpp: In function ‘int main(int, char**)’:
/home/eeuser/ros_workspaces/HeloRosProject/src/custom_msg/publish.cpp:40:19: error: no match for ‘operator=’ in ‘msg2.custom_msg::MyImage_<std::allocator<void> >::im = im_msg’
/home/eeuser/ros_workspaces/HeloRosProject/src/custom_msg/publish.cpp:40:19: note: candidate is:
/opt/ros/hydro/include/sensor_msgs/Image.h:56:8: note: sensor_msgs::Image_<std::allocator<void> >& sensor_msgs::Image_<std::allocator<void> >::operator=(const sensor_msgs::Image_<std::allocator<void> >&)
/opt/ros/hydro/include/sensor_msgs/Image.h:56:8: note: no known conversion for argument 1 from ‘sensor_msgs::ImagePtr {aka boost::shared_ptr<sensor_msgs::Image_<std::allocator<void> > >}’ to ‘const sensor_msgs::Image_<std::allocator<void> >&’
make[2]: *** [custom_msg/CMakeFiles/publish.dir/publish.cpp.o] Error 1
make[1]: *** [custom_msg/CMakeFiles/publish.dir/all] Error 2
make: *** [all] Error 2
Invoking "make" failed
I can understand that msg2.im = im_msg; isn't correct. Please help me fix this.
You are trying to assign a sensor_msgs::ImagePtr (a pointer) to a sensor_msgs::Image field. You simply can't. Just look at the fifth line of your error log:
no known conversion for argument 1 from ‘sensor_msgs::ImagePtr {aka boost::shared_ptr<sensor_msgs::Image_<std::allocator<void> > >}’ to ‘const sensor_msgs::Image_<std::allocator<void> >&’
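The reason is that message generation maps a sensor_msgs/Image field to a plain value member. Roughly (a simplified sketch, not the exact generated header, which is a template over an allocator):

namespace custom_msg {
struct MyImage {
    sensor_msgs::Image im;  // sensor_msgs/Image im -- a value, not a shared_ptr
    float age;              // float32 age
    std::string name;       // string name
};
}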
To solve this simple issue, just add the dereference operator (*) to that pointer:
msg2.im = *im_msg;
I assume that there are no other errors in the code.
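If you would rather not create the shared pointer at all, cv_bridge::CvImage also has a toImageMsg overload that fills an existing message in place; a sketch under that assumption (check your cv_bridge version):

custom_msg::MyImage msg2;
// Convert the cv::Mat directly into the message's im field,
// skipping the intermediate ImagePtr and its extra copy.
cv_bridge::CvImage(std_msgs::Header(), "bgr8", image).toImageMsg(msg2.im);
msg2.age = 54.3;
msg2.name = "Gena";
pub2.publish(msg2);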

Where is the global memory replay overhead coming from?

Running the code below, which writes 1 GB to global memory, under the NVIDIA Visual Profiler, I get:
- 100% storage efficiency
- 69.4% (128.6 GB/s) DRAM utilization
- 18.3% total replay overhead
- 18.3% global memory replay overhead.
The memory writes are supposed to be coalesced and there is no divergence in the kernel, so the question is: where is the global memory replay overhead coming from? I am running this on Ubuntu 13.04 with nvidia-cuda-toolkit version 5.0.35-4ubuntu1.
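As a sanity check on the counter values reported below (an aside, not from the original question): each thread stores one 4-byte word, so with 32-thread warps the expected number of warp-level store requests can be computed directly:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint64_t n_bytes  = 1073741824ULL; // 1 GB written by the kernel
    uint64_t n_stores = n_bytes / 4;   // one 4-byte store per element
    uint64_t n_reqs   = n_stores / 32; // one request per full warp
    // Prints 268435456 stores, 8388608 requests; 8388608 matches the
    // gst_request count that nvprof reports for this kernel below.
    printf("%llu stores, %llu warp-level store requests\n",
           (unsigned long long)n_stores, (unsigned long long)n_reqs);
    return 0;
}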
#include <cuda.h>
#include <unistd.h>
#include <getopt.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <stdint.h>
#include <ctype.h>
#include <sched.h>
#include <assert.h>

static void
HandleError( cudaError_t err, const char *file, int line )
{
    if (err != cudaSuccess) {
        printf( "%s in %s at line %d\n", cudaGetErrorString(err), file, line);
        exit( EXIT_FAILURE );
    }
}
#define HANDLE_ERROR(err) (HandleError(err, __FILE__, __LINE__))

// Global memory writes
__global__ void
kernel_write(uint32_t *start, uint32_t entries)
{
    uint32_t tid = threadIdx.x + blockIdx.x*blockDim.x;
    while (tid < entries) {
        start[tid] = tid;
        tid += blockDim.x*gridDim.x;
    }
}

int main(int argc, char *argv[])
{
    uint32_t *gpu_mem;               // Memory pointer
    uint32_t n_blocks  = 256;        // Blocks per grid
    uint32_t n_threads = 192;        // Threads per block
    uint32_t n_bytes   = 1073741824; // Transfer size (1 GB)
    float elapsedTime;               // Elapsed write time

    // Allocate 1 GB of memory on the device
    HANDLE_ERROR( cudaMalloc((void **)&gpu_mem, n_bytes) );

    // Create events
    cudaEvent_t start, stop;
    HANDLE_ERROR( cudaEventCreate(&start) );
    HANDLE_ERROR( cudaEventCreate(&stop) );

    // Write to global memory
    HANDLE_ERROR( cudaEventRecord(start, 0) );
    kernel_write<<<n_blocks, n_threads>>>(gpu_mem, n_bytes/4);
    HANDLE_ERROR( cudaGetLastError() );
    HANDLE_ERROR( cudaEventRecord(stop, 0) );
    HANDLE_ERROR( cudaEventSynchronize(stop) );
    HANDLE_ERROR( cudaEventElapsedTime(&elapsedTime, start, stop) );

    // Report exchange time
    printf("#Delay(ms) BW(GB/s)\n");
    printf("%10.6f %10.6f\n", elapsedTime, 1e-6*n_bytes/elapsedTime);

    // Destroy events
    HANDLE_ERROR( cudaEventDestroy(start) );
    HANDLE_ERROR( cudaEventDestroy(stop) );

    // Free memory
    HANDLE_ERROR( cudaFree(gpu_mem) );
    return 0;
}
The nvprof profiler and the API profiler are giving different results:
$ nvprof --events gst_request ./app
======== NVPROF is profiling app...
======== Command: app
#Delay(ms) BW(GB/s)
13.345920 80.454690
======== Profiling result:
Invocations Avg Min Max Event Name
Device 0
Kernel: kernel_write(unsigned int*, unsigned int)
1 8388608 8388608 8388608 gst_request
$ nvprof --events global_store_transaction ./app
======== NVPROF is profiling app...
======== Command: app
#Delay(ms) BW(GB/s)
9.469216 113.392892
======== Profiling result:
Invocations Avg Min Max Event Name
Device 0
Kernel: kernel_write(unsigned int*, unsigned int)
1 8257560 8257560 8257560 global_store_transaction
I had the impression that global_store_transaction could not be lower than gst_request. What is going on here? I can't ask for both events in the same command, so I had to run two separate commands. Could this be the problem?
Strangely, the API profiler shows different results, with perfect coalescing. Here is the output; I had to run twice to get all the counters:
$ cat config.txt
inst_issued
inst_executed
gst_request
$ COMPUTE_PROFILE=1 COMPUTE_PROFILE_CSV=1 COMPUTE_PROFILE_LOG=log.csv COMPUTE_PROFILE_CONFIG=config.txt ./app
$ cat log.csv
# CUDA_PROFILE_LOG_VERSION 2.0
# CUDA_DEVICE 0 GeForce GTX 580
# CUDA_CONTEXT 1
# CUDA_PROFILE_CSV 1
# TIMESTAMPFACTOR fffff67eaca946b8
method,gputime,cputime,occupancy,inst_issued,inst_executed,gst_request,gld_request
_Z12kernel_writePjj,7771.776,7806.000,1.000,4737053,3900426,557058,0
$ cat config2.txt
global_store_transaction
$ COMPUTE_PROFILE=1 COMPUTE_PROFILE_CSV=1 COMPUTE_PROFILE_LOG=log2.csv COMPUTE_PROFILE_CONFIG=config2.txt ./app
$ cat log2.csv
# CUDA_PROFILE_LOG_VERSION 2.0
# CUDA_DEVICE 0 GeForce GTX 580
# CUDA_CONTEXT 1
# CUDA_PROFILE_CSV 1
# TIMESTAMPFACTOR fffff67eea92d0e8
method,gputime,cputime,occupancy,global_store_transaction
_Z12kernel_writePjj,7807.584,7831.000,1.000,557058
Here gst_request and global_store_transaction are exactly the same, showing perfect coalescing. Which one is correct (nvprof or the API profiler)? Why does the NVIDIA Visual Profiler say that I have non-coalesced writes? There are still significant instruction replays, and I have no idea where they are coming from :(
Any ideas? I don't think this is a hardware malfunction, since I have two boards in the same machine and both show the same behavior.
