Ruby process memory structure - ruby-on-rails

I'm trying to figure out an issue with memory usage in a Ruby process. I tried to take a heap dump of the process using the ObjectSpace module to understand what's happening. What's puzzling is that the "top" command on Linux reports that the process uses 17.8 GB of virtual memory and 15 GB of resident memory, but the heap dumps are only around 2.7-2.9 GB.
According to the Ruby documentation, the ObjectSpace.dump_all method dumps the contents of the Ruby heap as JSON.
I'm not able to understand what is hogging the rest of the memory. It would be helpful if someone could explain what's happening.
Thank you.

It is likely that your application is allocating objects that are then collected by the garbage collector. You can check this with a call to GC.stat.
If you're running MRI, Ruby does not release memory back to the operating system in any meaningful way. Consequently, if you allocate 18 GB of memory and 15 GB gets garbage collected, you'll end up with your ~3 GB of heap data while the process still holds on to the rest.
The Ruby MRI GC is not a compacting garbage collector, so as long as there is any data in a heap page, that page will not be released. This leads to memory fragmentation and the values that you see in your app.
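A rough way to see this from inside the process is to snapshot the GC counters before and after creating a lot of garbage. The sketch below assumes MRI and uses GC.stat and ObjectSpace.memsize_of_all; the helper names heap_summary and allocate_garbage are made up for illustration, and memsize_of_all can be slow on a multi-gigabyte heap.

```ruby
require 'objspace'

# Hypothetical helper: collects a few GC.stat counters plus the memory that
# ObjectSpace can attribute to live Ruby objects.
def heap_summary
  stat = GC.stat
  {
    allocated_pages: stat[:heap_allocated_pages],
    live_slots: stat[:heap_live_slots],
    free_slots: stat[:heap_free_slots],
    total_allocated: stat[:total_allocated_objects],
    total_freed: stat[:total_freed_objects],
    memsize_bytes: ObjectSpace.memsize_of_all
  }
end

# Hypothetical helper: allocates `count` short-lived strings and drops every
# reference to them, so they become garbage.
def allocate_garbage(count)
  Array.new(count) { |i| "temporary string #{i}" }
  nil
end

before = heap_summary
allocate_garbage(500_000)
GC.start
after = heap_summary

# Print each counter as "before -> after".
after.each_key do |key|
  puts "#{key}: #{before[key]} -> #{after[key]}"
end
```

Comparing memsize_bytes with the resident size that top reports gives a feel for how much of the process is live Ruby data and how much is memory the process is simply still holding.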

Related

Sidekiq causing memory bloat in rails app

I have a rails app with sidekiq workers performing processes in the background and originally had around 30 threads to perform tasks. We found this was causing high memory usage and reducing the thread count for the workers reduced the memory bloat, but I don't understand why. Can anyone please explain?
From a quick Google search it sounds like you are experiencing memory fragmentation, which is pretty normal for Sidekiq. Are you using class variables at all? Does your code require classes at execution time? How many AR queries are you executing? Many AR queries create thousands, if not millions, of objects and throw them away. Is your code thread-safe? As per this post from the author of Sidekiq, memory bloat comes from a large number of memory arenas in multithreaded applications. There are some details of a solution in that article and even in the readme of the Sidekiq repo that are very helpful, but it might be worth outlining the cause to understand why memory bloat happens in Rails/Ruby.
Memory allocation in Ruby involves three layers: the interpreter, the OS memory allocator library and the kernel. Ruby organises objects in memory regions called Ruby heap pages. A Ruby heap page is divided into equal-sized slots, where one object occupies one slot. A slot is either occupied or free, and when Ruby allocates a new object it tries to occupy a free slot. If there are no free slots, it allocates a new heap page. Each slot has a byte limit, and if an object is larger than that limit, its data is stored outside the heap page and the slot holds a pointer to it.
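The slot limit can be observed with ObjectSpace.memsize_of. This is only a sketch assuming MRI; the exact numbers depend on the Ruby version and platform, so none are shown here.

```ruby
require 'objspace'

small = "a"            # short enough to be stored inside its slot
large = "a" * 100_000  # too big for a slot; the bytes live outside the heap page

small_size = ObjectSpace.memsize_of(small)
large_size = ObjectSpace.memsize_of(large)

puts "small string: #{small_size} bytes"
puts "large string: #{large_size} bytes"
```

The size reported for the large string includes the buffer that was allocated outside the heap page, which is memory that comes from the OS memory allocator rather than from a Ruby slot.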
Memory fragmentation builds up as these allocations happen and is quite frequent in highly threaded applications. When garbage collection happens, a cleared slot is marked as free and can be reused. If all slots in a heap page are free, the heap page is released back to the memory allocator and potentially the kernel. Ruby does not promise to garbage collect every object, so what happens when not all slots are freed and there are a large number of heap pages that are only partially filled? The heap pages have free slots that Ruby can allocate into, but the memory allocator still sees them as allocated memory. The memory allocator does not have to release an entire OS heap at once; it can release an individual OS page, but only once all allocations on that page have been freed.
Threading makes this worse. When every thread tries to allocate memory from the same OS heap at the same time, the threads contend for access. Only one thread can perform an allocation at a time, which reduces multithreaded memory allocation performance. The memory allocator attempts to optimize performance by creating multiple OS heaps and tries to assign each thread to its own OS heap. With many threads that means many OS heaps, each of which can end up only partially used, which is why lowering the thread count reduced the bloat.
If you have access to Ruby 2.7 or later, you can call GC.compact to combat this. It finds objects that can be moved and packs them together, reducing the number of heap pages used. Empty slots left between occupied slots by earlier GC runs can now be filled. Say, for example, you have a heap page with four slots and only slots one, two and four hold an object. The compact call checks whether object four is movable, moves it to slot three, and updates any references to the object so that they point to slot three. Slot four is left holding a T_MOVED object, and the final GC pass replaces the T_MOVED object with T_EMPTY, ready for reuse.
Personally, I would not rely solely on GC.compact. You could also try the simple MALLOC_ARENA_MAX trick, but have a read of the source documents and you should find a suitable solution.
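A minimal sketch of trying this out, assuming MRI: it creates sparse survivors to leave partially filled heap pages, then compacts if the running Ruby supports it. The method name fragment_then_compact and its parameters are hypothetical, and the page counts are only reported, since how much compaction helps varies by version and workload.

```ruby
# Hypothetical helper: allocates `count` objects, keeps one in every
# `keep_every`, then compacts the heap when GC.compact is available.
def fragment_then_compact(count = 200_000, keep_every = 10)
  # Keeping a sparse subset leaves many heap pages only partially filled.
  survivors = Array.new(count) { Object.new }.each_slice(keep_every).map(&:first)
  GC.start
  pages_before = GC.stat[:heap_allocated_pages]

  compacted =
    begin
      if GC.respond_to?(:compact)
        GC.compact
        true
      else
        false # Ruby older than 2.7, or compaction not available
      end
    rescue NotImplementedError
      false # some platforms define the method but cannot compact
    end

  pages_after = GC.stat[:heap_allocated_pages]
  {
    survivors: survivors.size,
    pages_before: pages_before,
    pages_after: pages_after,
    compacted: compacted
  }
end

result = fragment_then_compact
puts result
```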
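MALLOC_ARENA_MAX is an environment variable read by the glibc allocator, so it has to be set before the process starts (a value of 2 is a commonly suggested starting point) and it has no effect with other allocators such as jemalloc. As a sketch, a boot-time check like the following could report what the process was started with; the helper name arena_max_advice is hypothetical.

```ruby
# Hypothetical helper: describes the MALLOC_ARENA_MAX setting found in the
# given environment (defaults to the real process environment).
def arena_max_advice(env = ENV)
  value = env['MALLOC_ARENA_MAX']
  if value.nil? || value.empty?
    'MALLOC_ARENA_MAX is not set; glibc chooses the arena limit itself'
  else
    "MALLOC_ARENA_MAX=#{value}"
  end
end

puts arena_max_advice
```

Changing ENV from inside the running Ruby process would be too late, because the allocator has already been initialised by then.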

DelayedJob doesn't release memory

I'm using Puma server and DelayedJob.
It seems that the memory taken by each job isn't released and I slowly get a bloat causing me to restart my dyno (Heroku).
Any reason why the dyno won't return to the same memory usage figure before the job was performed?
Any way to force releasing it? I tried calling GC but it doesn't seem to help.
You may have one of the following problems, or actually all of them:
Number 1. This is not an actual problem, but a misconception about how Ruby releases memory to the operating system. Short answer: it doesn't. Long answer: Ruby manages an internal list of free objects. Whenever your program needs to allocate new objects, it gets them from this free list. If there are no more objects there, Ruby allocates new memory from the operating system. When objects are garbage collected, they go back to the free list, so Ruby still has the allocated memory. To illustrate, imagine that your program normally uses 100 MB. If at some point it allocates 1 GB, it will hold on to that memory until you restart it.
There are some good resources to learn more about it here and here.
What you should do is to increase your dyno size and monitor your memory usage over time. It should stabilize at some level. This will show you your normal memory usage.
Number 2. You can have an actual memory leak. It can be in your code or in some gem. Check out this repository, it contains information about well known memory leaks and other memory issues in popular gems. delayed_job is actually listed there.
Number 3. You may have unoptimized code that is using more memory than needed and you should try to investigate memory usage and try to decrease it. If you are processing large files, maybe you should do it in smaller batches etc.
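For the batching suggestion in Number 3, a minimal sketch could look like this. It reads a file lazily and handles a fixed number of lines at a time instead of loading everything at once. The method name process_in_batches, the batch size, and the demo file are all made up for illustration, and the counter stands in for whatever the job really does.

```ruby
require 'tempfile'

# Hypothetical helper: walks the file in slices of `batch_size` lines so that
# only one batch of lines needs to be in memory at a time.
def process_in_batches(path, batch_size = 1_000)
  processed = 0
  File.foreach(path).each_slice(batch_size) do |lines|
    processed += lines.size # placeholder for the real per-batch work
  end
  processed
end

# Demo with a throwaway file of 2,500 rows.
Tempfile.create('batch_demo') do |file|
  2_500.times { |i| file.puts("row #{i}") }
  file.flush
  puts process_in_batches(file.path, 1_000)
end
```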

Memory reported in Resource Monitor not showing in UMDH

I have a service which intermittently starts gobbling up server memory over time and needs to be restarted to free it. I turned on +ust with gflags, restarted the service, and started taking scheduled UMDH snapshots. When the problem reoccurred, Resource Monitor reported multiple GB under Working Set and Private Bytes, but the UMDH snapshots account for only a few MB of allocations in the process's heaps.
At the top of UMDH snapshot files, it mentions "Only allocations for which the heap manager collected a stack are dumped".
How can an allocation in a process be without a trace when +ust flags were specified?
How can I find out where/how these GBs were allocated?
UMDH is short for User Mode Dump Heap. The term Heap is a key term here: it refers to the C++ heap manager only. This means that all memory which is allocated by other means than the C++ heap manager is not tracked by UMDH.
This can be
direct calls to VirtualAlloc()
memory used by .NET, since .NET has its own heap manager
But even for C++, there is the case that allocations larger than 512 kB are not efficiently manageable by the C++ heap manager, so it just redirects it to VirtualAlloc() and does not create a heap segment of such large allocations.
How can I find out where/how these GBs were allocated?
For direct calls to VirtualAlloc(), the WinDbg command !address -summary may give an answer. For .NET, the SOS extension and the !dumpheap -stat can give an answer.

OS memory management

I'm studying about operating systems currently and I am a bit confused.
When a process is started for the first time, does the OS know the size of the heap? (I am guessing it knows the size of the data & code segments)
Heap is just a concept. There is no real, single heap. A heap is a block of memory that can be used for dynamic memory requests. A heap is created by library routines that allocate dynamic memory. There can be many heaps or no heap at all.
The OS never knows the size of the process heap.

Very large retained heap size for org.jruby.RubyRegexp$RegexpCache in JRuby Rails App

We have analysed a heap dump file for our application (running on Tomcat with jruby 1.7.8).
It shows us that the retained heap size is very large (439,459,128) for the class org.jruby.RubyRegexp$RegexpCache. This is 48% of our memory usage.
Looking at the source code for that file, it consists of three final static objects created at startup (patternCache / quotedPatternCache / preprocessedPatternCache).
This seems to be a pretty core part of JRuby. My question is, is it normal to have such a large percentage of the heap to be dedicated to this cache?
It has probably cached most of the Regexp objects throughout the Rails/gems/user-code source, so it might be quite large. Unless you run into a leak (an out-of-memory issue), it's all fine, since the actual caches are wrapped in a soft reference. That means that as long as there is enough memory (heap size) they are kept from garbage collection, but as soon as you allocate a chunk that does not fit, all (or some) of those caches may get garbage collected.

Resources