Go profiling: profile is empty

I'm following this tutorial on Go profiling and did as advised:
flag.Parse()
if *cpuprofile != "" {
    f, err := os.Create(*cpuprofile)
    if err != nil {
        log.Fatal(err)
    }
    pprof.StartCPUProfile(f)
    defer pprof.StopCPUProfile()
}
I then started my code with the flag -cpuprofile=myprogram.prof and the file got created. Then I started the pprof tool with
go tool pprof myprogram myprogram.prof
Well, myprogram reads a big JSON file and maps it into a big map[string]string, so there is a lot going on in my program. But when I run top10 in pprof, I get:
Entering interactive mode (type "help" for commands)
(pprof) top10
profile is empty

Most probably your code is executing too fast, even if you think it's doing a lot. This has happened to me several times.
You can play with the sampling rate via runtime.SetCPUProfileRate: set it to a value above the default of 100 (the unit is Hz). Note that the Go authors don't recommend values above 500 - see the explanation in the runtime/pprof source.
Call it just before pprof.StartCPUProfile. You will also see the warning runtime: cannot set cpu profile rate until previous profile has finished - please see this answer for the explanation.
HTH

For profiling Go programs you can use pprof as a web server. You need to add a bit of code to your program's main file to start the pprof server, which continuously serves resource-usage details for your program, so you can easily get all the relevant details. With the code below you can see the details of your program in your browser at http://localhost:6060/debug/pprof/
(Need to refresh the page to see the updated data)
You may see the code snippet below or go to the following link for the complete code:
github.com/arbaaz-khan/GoLangLearning/blob/master/ProfilingGoProgram/profile_go_prog.go
// Note: this relies on the blank import _ "net/http/pprof",
// which registers the /debug/pprof/ handlers on http.DefaultServeMux.
go func() {
    log.Printf("Starting Server! \t Go to http://localhost:6060/debug/pprof/\n")
    if err := http.ListenAndServe("localhost:6060", nil); err != nil {
        log.Printf("Failed to start the server! Error: %v", err)
    }
}()
Hope it helps!

If you use ctrl-c to stop the program, make sure you pass in profile.NoShutdownHook param in profile.Start().

Most probably you are not handling the system interrupt signal. You should handle it explicitly so that pprof.StopCPUProfile() gets a chance to write the profile data; otherwise the program exits too quickly when terminated with ctrl+c.
Here is an example solution:
var f *os.File

func main() {
    flag.Parse()
    if *cpuProfile != "" {
        var err error
        f, err = os.Create(*cpuProfile) // assign to the package-level f used by the handler below
        if err != nil {
            log.Fatal(err)
        }
        pprof.StartCPUProfile(f)
    }
    c := make(chan os.Signal, 2)
    signal.Notify(c, os.Interrupt, syscall.SIGTERM) // subscribe to system signals
    onKill := func(c chan os.Signal) {
        <-c
        pprof.StopCPUProfile()
        if f != nil {
            f.Close()
        }
        os.Exit(0)
    }
    // try to handle os interrupt (signal terminated)
    go onKill(c)
}

Did you handle the ctrl-c signal?
If you haven't, the program is killed by the OS. You must make sure the program exits normally; only then will the profile be written to the file.
You can also check the net/http/pprof module.

For me, the problem was that my code was executing too fast. What I did was change the sampling rate using runtime.SetCPUProfileRate. Note that runtime/pprof.StartCPUProfile hard-codes the sampling rate at 100 Hz, and its comments recommend at most 500 Hz:
func StartCPUProfile(w io.Writer) error {
    // The runtime routines allow a variable profiling rate,
    // but in practice operating systems cannot trigger signals
    // at more than about 500 Hz, and our processing of the
    // signal is not cheap (mostly getting the stack trace).
    // 100 Hz is a reasonable choice: it is frequent enough to
    // produce useful data, rare enough not to bog down the
    // system, and a nice round number to make it easy to
    // convert sample counts to seconds. Instead of requiring
    // each client to specify the frequency, we hard code it.
    const hz = 100

    cpu.Lock()
    defer cpu.Unlock()
    if cpu.done == nil {
        cpu.done = make(chan bool)
    }
    // Double-check.
    if cpu.profiling {
        return fmt.Errorf("cpu profiling already in use")
    }
    cpu.profiling = true
    runtime.SetCPUProfileRate(hz)
    go profileWriter(w)
    return nil
}
But setting it to 500 Hz wasn't fast enough in my case. After looking into the code of runtime.SetCPUProfileRate, it seems you can provide frequencies up to 1000000 Hz. Setting it to a large enough value solved my issue.
// SetCPUProfileRate sets the CPU profiling rate to hz samples per second.
// If hz <= 0, SetCPUProfileRate turns off profiling.
// If the profiler is on, the rate cannot be changed without first turning it off.
//
// Most clients should use the runtime/pprof package or
// the testing package's -test.cpuprofile flag instead of calling
// SetCPUProfileRate directly.
func SetCPUProfileRate(hz int) {
    // Clamp hz to something reasonable.
    if hz < 0 {
        hz = 0
    }
    if hz > 1000000 {
        hz = 1000000
    }

    lock(&cpuprof.lock)
    if hz > 0 {
        if cpuprof.on || cpuprof.log != nil {
            print("runtime: cannot set cpu profile rate until previous profile has finished.\n")
            unlock(&cpuprof.lock)
            return
        }

        cpuprof.on = true
        cpuprof.log = newProfBuf(1, 1<<17, 1<<14)
        hdr := [1]uint64{uint64(hz)}
        cpuprof.log.write(nil, nanotime(), hdr[:], nil)
        setcpuprofilerate(int32(hz))
    } else if cpuprof.on {
        setcpuprofilerate(0)
        cpuprof.on = false
        cpuprof.addExtra()
        cpuprof.log.close()
    }
    unlock(&cpuprof.lock)
}

Related

ESP32 hardware ISR sometimes not triggered when wifi is transmitting

I tried to use a hardware timer to read data from an external device periodically.
More specifically, I implemented a custom driver that uses GPIO to bit-bang the SPI protocol; whenever a hardware-timer interrupt fires, the driver is called to read the GPIO status. The timer is set to 2 kHz.
When an interrupt happens, the ISR puts sample data into a buffer. When the buffer is full, the application pauses the timer and sends the data out over MQTT. Using a signal generator and an oscilloscope, I verified the data was good. The whole process worked as expected.
The problem is that the sampling is not continuous: while data is being sent out over WiFi, the timer is paused and no data can be read into the buffer.
To solve this, I created a dedicated task responsible for transmitting data, and used ping-pong buffers to store the samples. When one buffer is full, the sending task is notified to transmit it, while the timer ISR keeps putting data into the other buffer.
At first I wanted to send the notification directly from the ISR (using xQueueSendFromISR()), which proved unreliable: only a few notifications reached the sending task. So I resorted to a flag. When one buffer is full, the flag is set to true; a dedicated task polls this flag, and whenever it finds it true, it notifies the sending task.
timer_isr()
{
    read_data_using_gpio;
    if (one buffer is full)
    {
        set the flag to true
    }
}
task_1()
{
    while (1)
    {
        if (the flag is true)
        {
            set the flag to false;
            xQueueSend;
        }
        vTaskDelay(50ms) // it takes 200ms to fill up the buffer
    }
}
task_2()
{
    while (1)
    {
        xStatus = xQueueReceive;
        if (xStatus == pdPASS) // A message from other tasks is received.
        {
            transmit data out using the mqtt protocol.
        }
    }
}
Then I got terrible data, as shown below.
(image: terrible data)
I used an oscilloscope to check the GPIO operation in the ISR.
(images: oscilloscope captures)
So it seems some ISRs are not triggered? But what happened?
Even weirder: I added another task to get data from an audio chip through I2S. Again I used ping-pong buffers and sent notifications to the same sending task.
timer_isr()
{
    read_data_using_gpio;
    if (one buffer is full)
    {
        set the flag to true
    }
}
task_1()
{
    while (1)
    {
        if (the flag is true)
        {
            set the flag to false;
            xQueueSend;
        }
        vTaskDelay(50ms)
    }
}
task_3()
{
    while (1)
    {
        i2s_read_to_buffer;
        xQueueSend;
    }
}
task_2()
{
    while (1)
    {
        xStatus = xQueueReceive;
        if (xStatus == pdPASS) // A message from other tasks is received.
        {
            if (data from task_1)
            {
                do something;
                transmit data out using the mqtt protocol
            }
            if (data from task_3)
            {
                do something;
                transmit data out using the mqtt protocol
            }
        }
    }
}
And this time the data from the former task turned out fine!
(image: data OK)
What's more, after I commented out the task_3-related code in the sending task, the data became bad again!
So what happened? Can somebody give me a hint?
task_2()
{
    while (1)
    {
        xStatus = xQueueReceive;
        if (xStatus == pdPASS) // A message from other tasks is received.
        {
            if (data from task_1)
            {
                do something;
                transmit data out using the mqtt protocol
            }
            if (data from task_3)
            {
                // do something;
                // transmit data out using the mqtt protocol
            }
        }
    }
}
I have solved this problem.
If you enable power management (idf.py menuconfig → Component config → Power management), the APB (advanced peripheral bus) automatically lowers its frequency. Since the APB is the clock source of the hardware timer, the timer interrupt becomes unstable.
Just disable power management.

IOS AudioUnit playback crackle issue (swift)

I am trying to play, on iOS, bytes coming over UDP from an Android device. I am using TPCircularBuffer to play the bytes. My code is below:
let success = initCircularBuffer(&circularBuffer, 1024)
if success {
    print("Circular buffer init was successful")
} else {
    print("Circular buffer init not successful")
}

func udpReceive() {
    receivingQueue.async {
        repeat {
            do {
                let datagram = try self.tcpClient?.receive()
                let byteData = datagram?["data"] as? Data
                let dataLength = datagram?["length"] as? Int
                self.dataLength = dataLength!
                let _ = TPCircularBufferProduceBytes(&self.circularBuffer, byteData!.bytes, UInt32(dataLength! * MemoryLayout<UInt8>.stride * 2))
            } catch {
                fatalError(error.localizedDescription)
            }
        } while true
    }
}

func consumeBuffer() -> UnsafeMutableRawPointer? {
    self.availableBytes = 0
    let tail = TPCircularBufferTail(&self.circularBuffer, &self.availableBytes)
    return tail
}
We record at a 16 kHz sample rate on the Android side and send over UDP to iOS, where we use an AudioUnit to play the bytes; the problem is the crackling and clipping sound in the audio.
Playback Callback code:
func performPlayback(
    _ ioActionFlags: UnsafeMutablePointer<AudioUnitRenderActionFlags>,
    inTimeStamp: UnsafePointer<AudioTimeStamp>,
    inBufNumber: UInt32,
    inNumberFrames: UInt32,
    ioData: UnsafeMutablePointer<AudioBufferList>
) -> OSStatus {
    var buffer = ioData[0].mBuffers
    let bufferTail = consumeBuffer()
    memcpy(buffer.mData, bufferTail, min(self.dataLength, Int(availableBytes)))
    buffer.mDataByteSize = UInt32(min(self.dataLength, Int(availableBytes)))
    TPCircularBufferConsume(&self.circularBuffer, UInt32(min(self.dataLength, Int(availableBytes))))
    return noErr
}
UDP sends us 1280 bytes per packet. We think the problem is the BUFFER SIZE not being set correctly. Can anyone guide me on how to set a proper buffer size? It would be a great help indeed. I know the work of @Gruntcakes as a VoIP engineer https://stackoverflow.com/a/57136561/12020007. I have also studied the work of @hotpaw2 and was looking at https://stackoverflow.com/a/58545845/12020007 to check if there is some threading issue. Any kind of help would be appreciated.
An Audio Unit callback should return only the requested number of frames (samples), as indicated by the inNumberFrames parameter. Your code appears to be copying some different number of samples into the AudioBufferList, which won't work, as an iOS Audio Unit will only send the requested number of frames to the audio output.
You can suggest a preferred buffer duration in your Audio Session configuration, but this is only a suggestion. iOS is free to ignore this suggestion, and use an inNumberFrames that is better matched to the device's audio hardware, system activity, and current power state.
Don't forget to pre-fill the circular buffer enough to account for the maximum expected jitter in network (UDP) transit time. Perhaps measure network packet-to-packet latency jitter and compute its statistics: min, max, std. dev., etc.
If your UDP buffers are not a power-of-2 in size, or contain samples that are not at the iOS hardware sample rate, then you will also have to account for fractional buffer and resampling jitter in your safety buffering overhead.
The playback callback Code is:
private let AudioController_RecordingCallback: AURenderCallback = {(
    inRefCon,
    ioActionFlags/*: UnsafeMutablePointer<AudioUnitRenderActionFlags>*/,
    inTimeStamp/*: UnsafePointer<AudioTimeStamp>*/,
    inBufNumber/*: UInt32*/,
    inNumberFrames/*: UInt32*/,
    ioData/*: UnsafeMutablePointer<AudioBufferList>*/)
    -> OSStatus
First we need to understand the callback: based on properties such as the sample rate and number of channels, iOS deduces the number of frames the callback will play. If we supply fewer or more frames than inNumberFrames, the sound will contain crackle and clipping.
The solution is to memcpy exactly inNumberFrames' worth of data from the TPCircularBuffer into ioData so the sound plays correctly. Make sure you copy only the number of frames the callback is requesting at that moment, as it can vary from call to call.
The problem with your code is that you are trying to play 1280 bytes whereas the callback is expecting fewer than that.
For TPCircularBuffer refer to https://github.com/michaeltyson/TPCircularBuffer

CMMotionActivityManager queryActivityStarting in background with location updates

First, a little context: my app runs in the background for getting locations, and location updates are active for what's below.
CMMotionActivityManager's queryActivityStarting call returns 0 activities when run in the background, but if my app is active it returns activities for the same time period (the activities exist - they just aren't being returned while in the background).
The startActivityUpdates function works in the background, but I just want the high level activity data from queryActivityStarting without the battery cost of using startActivityUpdates.
Am I mistaken and something else is going on? These 2 functions have very similar signatures, so I'm a little surprised they'd work differently.
Edit - asked to show my code.
Here's how I ask for activities. I tried without a delay and with various delays - the Apple docs say activities might not be available for several minutes, and I didn't know what that meant (anyone have specifics? maybe something else has to happen before activities are available?) - so I tried waiting first:
DispatchQueue.main.asyncAfter(deadline: DispatchTime.now() + .seconds(60 * 60), execute: {
    ModeStorage.sharedInstance.saveModes(tripId, beforeStart, end)
})
Here's the code that queries - the last logging line shows a count of 0:
guard CMMotionActivityManager.isActivityAvailable() else { return Log.log("Activity N/A", true) }
CMMotionActivityManager().queryActivityStarting(from: fromDate, to: toDate, to: OperationQueue.main) { [weak self] (arr, err) -> Void in
    guard var activities = arr else { return Log.log("modes n/a", true) }
    Log.log("modes save (\(fromDate)-\(toDate): \(activities.count) in \(UIApplication.shared.applicationState.rawValue))", true)
    ...
}
Another edit: leaving this open as I'd love to use queryActivityStarting, but I switched my code to use startActivityUpdates and that seems to work without a problem.

pthread: locking mutex with timeout

I'm trying to implement the following logic (a kind of pseudo-code) using pthreads:
pthread_mutex_t mutex;

threadA()
{
    lock(mutex);
    // do work
    timed_lock(mutex, current_abs_time + 1 minute);
}
threadB()
{
    // do work in more than 1 minute
    unlock(mutex);
}
I expect threadA to do the work and wait until threadB signals, but no longer than 1 minute. I have done similar things many times with Win32 but am stuck with pthreads: the timed_lock part returns immediately (not after 1 minute) with the code ETIMEDOUT.
Is there a simple way to implement the logic above?
Even the following code returns ETIMEDOUT immediately:
pthread_mutex_t m;

// Thread A
pthread_mutex_init(&m, 0);
pthread_mutex_lock(&m);

// Thread B
struct timespec now;
clock_gettime(CLOCK_MONOTONIC, &now);
struct timespec time = {now.tv_sec + 5, now.tv_nsec};
pthread_mutex_timedlock(&m, &time); // immediately returns ETIMEDOUT
Does anyone know why? I have also tried with gettimeofday function
Thanks
I implemented my logic with condition variables, following the usual rules (a wrapping mutex, a bool flag, etc.).
Thank you all for comments.
For the second piece of code: AFAIK pthread_mutex_timedlock only works with CLOCK_REALTIME.
CLOCK_REALTIME counts seconds since 01/01/1970; CLOCK_MONOTONIC typically counts since boot.
Under these premises, the timeout you set is a few seconds into 1970 and therefore in the past, so the call times out immediately.
try something like this:
class CmyClass
{
    boost::mutex mtxEventWait;
    bool WaitForEvent(long milliseconds);
    boost::condition cndSignalEvent;
};

bool CmyClass::WaitForEvent(long milliseconds)
{
    boost::mutex::scoped_lock mtxWaitLock(mtxEventWait);
    boost::posix_time::time_duration wait_duration = boost::posix_time::milliseconds(milliseconds);
    boost::system_time const timeout = boost::get_system_time() + wait_duration;
    return cndSignalEvent.timed_wait(mtxWaitLock, timeout); // wait until the event is signaled (note: pass the lock, not the mutex)
}

// so in order to wait, call the WaitForEvent method
WaitForEvent(1000); // it will time out after 1 second

// this is how an event could be signaled:
cndSignalEvent.notify_one();

Star Micronics TSP650II bluetooth printer, can't write to EASession.OutputStream

I'm trying to print a label with a Star Micronics TSP650II printer in a MonoTouch app.
The problem is that session.OutputStream.HasSpaceAvailable() always returns false. What am I missing?
the C# code I have goes something like this (cut for simplicity):
var manager = EAAccessoryManager.SharedAccessoryManager;
var starPrinter = manager.ConnectedAccessories.FirstOrDefault (p => p.Name.IndexOf ("Star") >= 0); // this does find the EAAccessory correctly
var session = new EASession (starPrinter, starPrinter.ProtocolStrings [0]); // the second parameter resolves to "jp.star-m.starpro"
session.OutputStream.Schedule (NSRunLoop.Current, "kCFRunLoopDefaultMode");
session.OutputStream.Open ();
byte[] toSend = GetInitData(); // this comes from another project where the same printer with an ethernet cable was used in a windows environment and worked; not null for sure
if (session.OutputStream.HasSpaceAvailable()) {
    int bytesWritten = session.OutputStream.Write (toSend, (uint)toSend.Length);
    if (bytesWritten < 0) {
        Debug.WriteLine ("ERROR WRITING DATA");
    } else {
        Debug.WriteLine ("Some data written, ignoring the rest, just a test");
    }
} else
    Debug.WriteLine ("NO SPACE"); // THIS ALWAYS PRINTS, the output stream is never ready to take any output
UPDATE:
I was able to work-around this problem by binding Star Micronics iOS SDK to my project, but that's less than ideal as it adds 700K to the package for something that should work without that binding.
UPDATE 2:
I've been getting requests for the binding code. I still strongly recommend you try to figure out the bluetooth connectivity and not use the binding but for those who are brave enough, here it is.
This is Kale Evans, Software Integration Engineer at Star Micronics.
Although Apple's EADemo doesn't show this, the following piece of code is important for printing to an EAAccessory. (Note: the code below is an Objective-C example.)
if ([[_session outputStream] hasSpaceAvailable] == NO)
{
    [[NSRunLoop currentRunLoop] runUntilDate:[NSDate dateWithTimeIntervalSinceNow:0.1]];
}
This gives the OS time to process all input sources.
You say "this does find the EAAccessory correctly".
Could the reason the OutputStream returns false be that the session is actually null?
Best Regards,
Star Support
