How to clear GPU memory without 'kill pid'?

I use my school's server for deep learning experiments. I stopped the Python script, but somehow the GPU memory was not released. I don't have root access to kill the processes, so how can I clear the GPU memory?
I tried 'sudo kill -9' and 'nvidia-smi', but it said 'insufficient permissions' (I am using the university's server).
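For reference, listing the processes that still hold the GPU is usually the first step, and killing a process that belongs to your own user does not require root. A rough sketch, assuming the leftover process is yours and the devices are the usual /dev/nvidia* files:
# show every process that still has a handle on the GPU devices
fuser -v /dev/nvidia*
# or ask nvidia-smi for the compute processes and the memory they hold
nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv
# a PID owned by your own user can be killed without sudo
kill -9 <pid>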

Related

docker build running out of memory, but plenty of memory seems to be available

I'm building an elixir/phoenix application using a docker container.
This has been working for some time now, but recently it stopped working, with the error always being associated with a lack of memory.
For instance, the most frequent point of failure is during the mix compile task of Elixir (the most time-consuming task in the Dockerfile), which crashes with the error:
eheap_alloc: Cannot allocate 147852528 bytes of memory (of type "old_heap").
Crash dump is being written to: erl_crash.dump...done
Sometimes it might be able to get through that step, but will again fail at a later step, like brunch build which compiles the frontend code. Sometimes it just fails at some other step with no specific error message, just saying:
Killed
While this is happening, I can easily check htop and see that I'm using 3 or 4 GB of RAM, out of 16 GB total, so there's no lack of physical RAM at all.
After some digging, I found that sudo sysctl vm.overcommit_memory=1 could help, but no luck there either.
The exact same build runs fine on my other computer, which runs Arch Linux, while this one runs Ubuntu 16.04.
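One thing worth ruling out, a guess since the build host isn't described in detail, is that the Docker daemon (or the VM it runs in) sees less memory than the 16 GB the host has; the Erlang VM running mix compile only gets what the container can see. A quick check might look like this, where the myapp tag is just a placeholder:
# total memory the Docker daemon reports, in bytes
docker info --format '{{.MemTotal}}'
# memory actually visible inside a throwaway container
docker run --rm alpine free -m
# an explicit limit can also be set for the build itself
docker build --memory=4g -t myapp .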

Docker not releasing memory when shut down, Windows 10

I have recently started using Docker for new development work; however, I am still required to switch back to working on our older on-premise offering from time to time. That is, I sometimes need to shut down Docker and spin up an installation of our on-premise server.
I find that when I do this with Docker installed, the performance of this server is terrible, essentially unusable, and I need to uninstall Docker to get it to work again.
When I have Docker running I can see it using the memory (my machine has 32 GB of RAM, and I am telling Docker to use 16), and when I shut Docker down I can see it being released, according to Task Manager anyway; I can also see in Hyper-V Manager that the VM has been shut down. However, the on-premise server install continues to act as if the memory were still in use. This is not a small performance hit: actions that should take 1 second take 20 or 30.
It would seem that Docker is not actually releasing the memory on shutdown and only does so when I uninstall it; when I do that, performance recovers completely.
Is this a known issue? Is there anything else I can try to see where the memory is going? I can find no other reports about it.
I am using Windows 10 with Docker version 17.03.1-ce-win5 (10743)

What is the best CLI tool to take memory dumps for C++ in Linux?

What is the best CLI tool to take memory dumps of C++ processes in Linux? I have a program which monitors the memory usage of different processes running on Linux. For Java-based processes, I am using jstack and jmap to take the thread and heap dumps. But are there any good CLI tools to take similar dumps for C++-based processes? And if yes, how do we use them, and once a dump is taken, how do we analyse it?
Any inputs will be appreciated.
I would recommend gcore, an open-source executable that dumps the memory of another running process. To keep the dump consistent, the target process is suspended while its memory is collected and resumed afterwards.
Usage info can be found at the following link:
gsp.com/cgi-bin/man.cgi?section=1&topic=gcore
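For example, assuming the standalone gcore that ships with the gdb package is installed and the target PID is 1234, a dump could be taken with (the myapp_dump prefix is just an example name):
# writes the dump to myapp_dump.1234; the process keeps running afterwards
gcore -o myapp_dump 1234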
Another option is via gdb: attach gdb to the process, run the 'gcore' command, and then detach.
$ gdb --pid=123
(gdb) gcore
Saved corefile core.123
(gdb) detach
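As for the analysis part of the question, the same gdb can load the core file alongside the original binary; the paths below are placeholders:
$ gdb /path/to/your/program core.123
(gdb) info threads
(gdb) thread apply all bt
(gdb) quit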

Ruby on Rails VPS RAM amount

Currently I have the simplest VPS: 1 core, 256 MB of RAM, Ubuntu 12.04 LTS. My application seems to be running fine (I'm using unicorn and nginx), but when I run the rake jobs:work command for my delayed_jobs, the unicorn process gets killed.
I was wondering if this is related to the amount of RAM?
When the unicorn process is up and running, the free -m command shows that around 230 MB of RAM are occupied.
I was wondering how much RAM I would need overall? 512 MB? 1024 MB?
Which one should I go with?
Would be very glad to receive any answers!
Thank you
Your delayed_job (DJ) worker would run another instance of your Rails application, so you need to make sure that you have at least enough RAM for that other instance, plus an allowance for the other processes you are running.
Check ps aux for the memory usage of your Rails app.
Run top and see how much physical memory is free (while your Rails app is running).
My guess is you'll have to bump up your RAM to 512 MB. You of course don't want your memory use to spill over to swap.
Of course, besides that, you also need to make sure that your application and database are optimized enough that there are no incredible spikes in memory usage.
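To make the ps aux check concrete, here is a rough way to see the resident memory of the Ruby processes; the grep pattern is only a guess at your process names, so adjust it to your setup:
# resident set size (RSS) per Ruby process, in kilobytes
ps -o pid,rss,command -C ruby
# rough total in MB for anything that looks like unicorn or delayed_job
ps aux | grep -E '[u]nicorn|[d]elayed_job' | awk '{sum += $6} END {print sum/1024 " MB"}'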
You can start with
ulimit -S -a
to find out the limits of your environment

Erlang: Node freezes totally. Now what?

Platform: Windows 7 32-bit, Erlang R15B01.
I have developed an Erlang server that simultaneously listens on 200 different TCP ports (200 gen_servers).
After a few minutes of moderate load (a few clients in parallel), the entire node just freezes completely; even the shell freezes entirely.
How can this problem be diagnosed? Is there a standard Erlang approach to this kind of problem? (Memory consumption was low, so it's not some kind of memory leak.)
Important Edit
It seems that under werl.exe there is no such problem, only under erl.exe. Probably the same issue as in http://erlang.2086793.n4.nabble.com/erl-exe-dies-but-werl-exe-does-not-on-both-Windows-XP-and-2008R2-with-R14B01-td3335030.html
If you kill your process with kill -SIGUSR1 <pid>, the Erlang VM will generate a crash dump file, erl_crash.dump, in the directory the app was started from.
Then you can analyze it using the Crash Dump Viewer.
A frozen Erlang shell can be caused by uncaught exit signals. You can try to trap exits in the shell process (assuming it is the parent process of your server), which should give you the exit reason. See the Reference Manual section on Errors.
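On a Unix host the whole sequence might look like the sketch below (the PID is an example, and on Windows, where this node runs, the SIGUSR1 route is not available):
# ask the running VM for a crash dump; this terminates the node
kill -USR1 12345
# then open erl_crash.dump with the Crash Dump Viewer from a fresh shell
$ erl
1> crashdump_viewer:start().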
