Since HTTP is stateless, every request to an app creates new objects. How does Rails clean up the unused objects, and how frequently?
Simple answer: the Ruby runtime has a garbage collector. Depending on the runtime (JRuby/JVM generational GC, IronRuby/CLR generational GC, classic Ruby/mark-sweep GC) different algorithms are used. But the basics are pretty simple:
Upon an allocation request, if there is "insufficient free memory" available (how much counts as insufficient is one of the ingredients of the GC algorithm), then a GC will commence
The GC starts by scanning roots, which are global variables and stack locations (parameters and local variables), to discover which objects are still alive; it marks each object it finds
Then, the GC process looks at links (references) inside these objects, and recurses into those objects that haven't already been marked
The GC can then start moving / copying all marked objects so that they are compacted in memory
The "free pointer", from whence new allocations occur, is reset to the end of this compacted chunk of memory
If there is still "insufficient free memory", then more is allocated from the operating system
All old objects that weren't marked during the scanning process are garbage and are implicitly discarded through the copying process and resetting of the free pointer.
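The steps above can be sketched with a toy heap. This is only an illustration of marking from roots and compacting the survivors, not how any real Ruby runtime is implemented; `ToyObject`, `mark` and `collect` are hypothetical names.

```ruby
# Toy model of mark-and-compact. ToyObject is a hypothetical stand-in
# for a heap object: it has a name, outgoing references and a mark bit.
ToyObject = Struct.new(:name, :refs, :marked)

# Mark phase: recurse into every object reachable from the given object.
def mark(object)
  return if object.marked          # already visited; also stops cycles
  object.marked = true
  object.refs.each { |child| mark(child) }
end

# Returns the compacted list of live objects and the new "free pointer".
def collect(heap, roots)
  heap.each { |object| object.marked = false }
  roots.each { |root| mark(root) }
  survivors = heap.select(&:marked)  # live objects packed together
  free_pointer = survivors.length    # new allocations start right after them
  [survivors, free_pointer]
end

a = ToyObject.new("a", [], false)
b = ToyObject.new("b", [a], false)
c = ToyObject.new("c", [], false)    # unreachable: garbage
d = ToyObject.new("d", [], false)
d.refs << d                           # unreachable self-cycle: still garbage

survivors, free_pointer = collect([a, c, b, d], [b])
puts survivors.map(&:name).inspect    # ["a", "b"]
puts free_pointer                     # 2
```

Note that the unmarked objects are never visited individually; they simply are not carried over, which is the "implicit discard" described above.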
The frequency of collections depends on the tuning of the GC, which may be affected by the operating system, the amount of physical memory, operating system memory pressure, user-controlled tweaks, underlying platform version revisions, dynamically optimized parameters, etc. Much of it comes down to deciding where the bar lies in that "insufficient free memory" test, though things get more complicated with generational collectors.
If you are interested in this, you should check out the blog series about copy-on-write garbage collection by the Phusion team and their efforts to improve on the default Ruby GC scheme in Ruby Enterprise Edition.
http://izumi.plan99.net/blog/index.php/2007/04/05/saving-memory-in-ruby-on-rails/
Other links in the series here:
http://www.rubyenterpriseedition.com/faq.html
I have a Rails app with Sidekiq workers performing processes in the background, originally with around 30 threads to perform tasks. We found this was causing high memory usage, and reducing the thread count for the workers reduced the memory bloat, but I don't understand why. Can anyone please explain?
From a quick google it sounds like you are experiencing memory fragmentation which is pretty normal for Sidekiq. Are you using class variables at all? Does your code require classes during execution time? How many AR queries are you executing? Many AR queries create thousands, if not millions, of objects and throw them away. Is your code thread-safe? As per this post from the author of Sidekiq, we can see memory bloat happens from a large number of memory arenas in multithreaded applications. There are some details of a solution in that article and even the readme of the Sidekiq repo that are very helpful, but it might be worth outlining the causation to understand why memory bloat happens in 'rails/ruby'.
Memory allocation in Ruby involves three layers: the interpreter, the OS memory allocator library and the kernel. Ruby organises objects in memory areas called Ruby heap pages. A Ruby heap page is divided into equal-sized slots, where one object occupies one slot. Each slot is either occupied or free, and when Ruby allocates a new object it tries to occupy a free slot. If there are no free slots, it allocates a new heap page. Each slot has a byte limit; if an object is larger than that limit, its data is stored outside the heap page and the slot holds a pointer to it.
Memory fragmentation is what results when these allocations and frees leave heap pages only partially filled, and it is quite common in applications with many threads. When garbage collection happens, each cleared slot in a heap page is marked as free so that it can be reused. If all slots in a heap page are free, the heap page is released back to the memory allocator and potentially the kernel. Ruby does not promise to garbage collect every object, so what happens when not all slots are freed and there are a large number of partially filled heap pages? The heap pages have free slots for Ruby to allocate into, but the memory allocator still sees them as allocated memory. The memory allocator does not have to release an entire OS heap at once; it can release an individual OS page, but only once every allocation on that page has been freed.
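A toy model may make the partially-filled-page problem concrete. It assumes a page is just an array of slots with `nil` meaning free; `SLOTS_PER_PAGE` is an arbitrary illustrative value and the helper names are hypothetical, not MRI internals.

```ruby
# Toy model: a "heap page" is an array of slots; nil means the slot is free.
# SLOTS_PER_PAGE is an arbitrary illustrative value, not MRI's real figure.
SLOTS_PER_PAGE = 4

# A page can only be handed back once every slot on it is free.
def releasable_pages(pages)
  pages.count { |page| page.all?(&:nil?) }
end

def free_slots(pages)
  pages.sum { |page| page.count(&:nil?) }
end

# Allocate 12 objects: three completely full pages.
pages = (1..12).each_slice(SLOTS_PER_PAGE).map(&:to_a)

# "Garbage collect" everything except the first object on each page.
fragmented = pages.map do |page|
  page.each_with_index.map { |object, index| index.zero? ? object : nil }
end

puts free_slots(fragmented)        # 9 of the 12 slots are free ...
puts releasable_pages(fragmented)  # ... yet 0 pages can be released
```

Three quarters of the memory is unused, but because each page still holds one live object, none of it can be returned.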
Threading makes this worse. If every thread tries to allocate memory from the same OS heap at the same time, they contend for access: only one thread can perform an allocation at a time, which reduces multithreaded memory allocation performance. The memory allocator attempts to optimize performance by creating multiple OS heaps and tries to assign each thread its own OS heap. More threads therefore mean more OS heaps (arenas), each of which can become fragmented, which is why reducing the thread count reduced the bloat.
If you have access to Ruby 2.7 you can call GC.compact to combat this. It finds objects that can be moved and condenses them, reducing the number of heap pages in use. Empty slots that GC has freed in between occupied slots can now be filled. Say, for example, you have a heap page with four slots and only slots one, two and four hold an object. The compact call checks whether object four is movable; if so, it moves the object to slot three and redirects any references to it to slot three. Slot four is then filled with a T_MOVED object, and the final GC pass replaces the T_MOVED object with T_EMPTY, ready for assignment.
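The four-slot example can be walked through with a small simulation. This is a sketch of the idea only, with `:empty` and `:moved` standing in for T_EMPTY and T_MOVED; `compact_page` and `compact_if_supported` are hypothetical helpers, and the real algorithm inside MRI differs in detail. The guarded `GC.compact` call at the end is the real API.

```ruby
# Toy two-finger compaction over one page. One index looks for free slots
# from the left, the other looks for objects from the right.
def compact_page(page)
  page = page.dup
  free_index = 0
  scan_index = page.length - 1
  loop do
    free_index += 1 while free_index < page.length && page[free_index] != :empty
    scan_index -= 1 while scan_index >= 0 && [:empty, :moved].include?(page[scan_index])
    break if free_index >= scan_index
    page[free_index] = page[scan_index]  # move the object into the free slot
    page[scan_index] = :moved            # leave a forwarding marker behind
  end
  # Final pass: forwarding markers become empty slots again.
  page.map { |slot| slot == :moved ? :empty : slot }
end

puts compact_page(["obj1", "obj2", :empty, "obj4"]).inspect
# ["obj1", "obj2", "obj4", :empty]

# The real call, guarded because GC.compact only exists from Ruby 2.7
# and is not implemented on every platform.
def compact_if_supported
  return false unless GC.respond_to?(:compact)
  GC.compact
  true
rescue NotImplementedError
  false
end

puts compact_if_supported
```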
Personally, I would not rely solely on GC.compact and you could do the simple MALLOC_ARENA_MAX trick, but have a read of the source documents and you should find a suitable solution.
I'm trying to understand the idea behind memory usage in Ruby. I'm currently going through memory issues on my Rails web app and API.
Here's a simple question:
If I load many records inside a variable like so:
users = User.where(work: 'cook')
This would probably hold in my app's memory for the time I'm using this variable, right?
But would it help to free memory by doing the following after I'm done using the variable in my code?
users = nil
Thank you for your help. I'm also open to answers that answer the question on a broader topic.
Yes, setting users to nil would indeed reduce the required memory (very slightly), but it's not necessary as the garbage collector will eventually sweep it. In production you should assume your Ruby process will always grow over time and should be periodically restarted if you're concerned about memory management. The maximum heap space reduction you'll ever see in Ruby is minimal compared to its growth over time, so I wouldn't concern yourself with setting large collections to nil to save a few bytes here and there a little earlier than the GC would have swept them anyway.
Ruby allocates objects in a heap space that consists of heap pages. Assuming you're using Ruby 2.1 or later, the heap space is divided into used (aka Eden) and empty (aka Tomb) heap pages. When instantiating objects, Ruby looks for free space in the eden pages first and only if no space is available will it take a page from the tomb. When you then overwrite the variable with nil and the objects are collected, heap pages that end up completely empty are added back to the tomb. Moving pages from the eden to the tomb will reduce heap size slightly; however, Ruby's garbage collector won't drastically reduce it because it assumes that if you've created a large collection of objects before, you'll do it again.
One book I recommend diving into is "Ruby Performance Optimization", as it goes through Ruby's garbage collector in depth.
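If you want to observe this yourself, a rough sketch along these lines may help. It uses an array of strings as a stand-in for the `User.where(...)` result (an assumption, since no database is involved here), and `live_slots` and `measure_release` are hypothetical helper names. The exact numbers vary by Ruby version and platform, and the GC is free to keep the pages around, so no particular output is promised.

```ruby
# Number of object slots currently holding live objects.
def live_slots
  GC.stat(:heap_live_slots)
end

def measure_release
  GC.start
  before = live_slots
  # Stand-in for a large ActiveRecord result set.
  records = Array.new(200_000) { |i| "record #{i}" }
  during = live_slots
  records = nil   # drop the only reference, like `users = nil`
  GC.start        # force a collection instead of waiting for one
  after = live_slots
  { before: before, during: during, after: after }
end

stats = measure_release
puts stats.inspect
```

Calling `GC.start` by hand is only for the experiment; in an application the collector would get to the unreferenced records on its own.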
I want to know technical details about garbage collection (GC) and memory management in Erlang/OTP.
But I cannot find them on erlang.org or in its documentation.
I have found some articles online which talk about GC in a very general manner, such as what garbage collection algorithm is used.
To clarify things, let's define the memory layout and then talk about how GC works.
Memory Layout
In Erlang, each thread of execution is called a process. Each process has its own memory and that memory layout consists of three parts: Process Control Block, Stack and Heap.
PCB: Process Control Block holds information like process identifier (PID), current status (running, waiting), its registered name, and other such info.
Stack: It is a downward growing memory area which holds incoming and outgoing parameters, return addresses, local variables and temporary spaces for evaluating expressions.
Heap: It is an upward growing memory area which holds process mailbox messages and compound terms. Binary terms which are larger than 64 bytes are NOT stored in process private heap. They are stored in a large Shared Heap which is accessible by all processes.
Garbage Collection
Currently Erlang uses generational garbage collection that runs inside each Erlang process's private heap independently, and reference counting garbage collection for the global shared heap.
Private Heap GC: It is generational, so it divides the heap into two segments: young and old generations. There are also two strategies for collecting: Generational (Minor) and Fullsweep (Major). The generational GC collects just the young heap, but fullsweep collects both the young and old heaps.
Shared Heap GC: It is reference counting. Each object in shared heap (Refc) has a counter of references to it held by other objects (ProcBin) which are stored inside private heap of Erlang processes. If an object's reference counter reaches zero, the object has become inaccessible and will be destroyed.
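The difference between the minor and fullsweep strategies for the private heap can be sketched with a toy model. `ToyProcessHeap` is a hypothetical class, terms are plain symbols, and survivors are promoted to the old generation immediately, which is a simplification of what the real VM does.

```ruby
# Toy generational heap for a single process.
class ToyProcessHeap
  attr_reader :young, :old

  def initialize
    @young = []
    @old = []
  end

  def allocate(term)
    @young << term
    term
  end

  # Minor collection: only the young heap is examined.
  # Live young terms are promoted, dead young terms disappear.
  def minor_collect(live)
    @old.concat(@young.select { |term| live.include?(term) })
    @young = []
  end

  # Fullsweep: both generations are examined.
  def fullsweep(live)
    @old = (@old + @young).select { |term| live.include?(term) }
    @young = []
  end
end

heap = ToyProcessHeap.new
heap.allocate(:a)
heap.allocate(:b)
heap.minor_collect([:a])   # :b is garbage, :a is promoted
heap.allocate(:c)
heap.minor_collect([:c])   # :a has died, but a minor GC never looks at old
puts heap.old.inspect      # [:a, :c]
heap.fullsweep([:c])       # the fullsweep finally reclaims :a
puts heap.old.inspect      # [:c]
```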
To get more details and performance hints, just look at my article which is the source of the answer: Erlang Garbage Collection Details and Why It Matters
A reference paper for the algorithm: One Pass Real-Time Generational Mark-Sweep Garbage Collection (1995) by Joe Armstrong and Robert Virding (at CiteSeerX)
Abstract:
Traditional mark-sweep garbage collection algorithms do not allow reclamation of data until the mark phase of the algorithm has terminated. For the class of languages in which destructive operations are not allowed we can arrange that all pointers in the heap always point backwards towards "older" data. In this paper we present a simple scheme for reclaiming data for such language classes with a single pass mark-sweep collector. We also show how the simple scheme can be modified so that the collection can be done in an incremental manner (making it suitable for real-time collection). Following this we show how the collector can be modified for generational garbage collection, and finally how the scheme can be used for a language with concurrent processes.
Erlang has a few properties that make GC actually pretty easy.
1 - Every variable is immutable, so a variable can never point to a value that was created after it.
2 - Values are copied between Erlang processes, so the memory referenced in a process is almost always completely isolated.
Both of these (especially the latter) significantly limit the amount of the heap that the GC has to scan during a collection.
Erlang uses a copying GC. During a GC, the process is stopped then the live pointers are copied from the from-space to the to-space. I forget the exact percentages, but the heap will be increased if something like only 25% of the heap can be collected during a collection, and it will be decreased if 75% of the process heap can be collected. A collection is triggered when a process's heap becomes full.
The only exception is when it comes to large values that are sent to another process. These will be copied into a shared space and are reference counted. When a reference to a shared object is collected, the count is decreased; when the count reaches 0, the object is freed. No attempts are made to handle fragmentation in the shared heap.
One interesting consequence of this is, for a shared object, the size of the shared object does not contribute to the calculated size of a process's heap, only the size of the reference does. That means, if you have a lot of large shared objects, your VM could run out of memory before a GC is triggered.
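A toy model of the reference counting, and of why the owning process is only charged for the reference, might look like this. The class names borrow the Refc and ProcBin terms used earlier in this thread, but the classes themselves and `HANDLE_SIZE` are hypothetical.

```ruby
# Toy reference-counted shared binary.
class RefcBinary
  attr_reader :bytes, :ref_count

  def initialize(bytes)
    @bytes = bytes
    @ref_count = 0
  end

  def acquire
    @ref_count += 1
    self
  end

  def release
    @ref_count -= 1
    @bytes = nil if @ref_count.zero?   # "free" the payload
  end

  def freed?
    @bytes.nil?
  end
end

# Small handle that lives on a process heap and points at the shared binary.
class ProcBin
  HANDLE_SIZE = 1   # arbitrary unit charged to the owning process heap

  def initialize(binary)
    @binary = binary.acquire
  end

  # Only the handle counts, no matter how large the shared payload is.
  def heap_size
    HANDLE_SIZE
  end

  # Called when the owning process garbage collects this handle.
  def collect
    @binary.release
    @binary = nil
  end
end

shared = RefcBinary.new("x" * 1_000_000)
in_process_a = ProcBin.new(shared)
in_process_b = ProcBin.new(shared)
puts in_process_a.heap_size   # 1
in_process_a.collect
puts shared.freed?            # false
in_process_b.collect
puts shared.freed?            # true
```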
Most of this is taken from the talk Jesper Wilhelmsson gave at EUC2012.
I don't know your background, but apart from the paper already pointed out by jj1bdx you can also give Jesper Wilhelmsson's thesis a chance.
BTW, if you want to monitor memory usage in Erlang to compare it to e.g. C++ you can check out:
Erlang Instrument Module
Erlang OS_MON Application
Hope this helps!
Whenever I use spawn(Mod, Func, Arguments) all the arguments are copied. Why are they copied if everything is immutable in Erlang? Why isn't just the pointer copied? Is it because that makes the garbage collection much more complicated?
At present, the Erlang VM maintains a separate heap per process*. This means that a process can collect its garbage independently of others, making Erlang less vulnerable to the effects of GC pauses than runtimes that keep a global heap.
In order for this to be effective, it is imperative that no process references memory allocated on the heap of another process. Presumably, the reason for copying the arguments sent to spawn/3 is so that they are moved into the newly spawned process' heap. The same holds for messages sent to a process, by the way (source: see the link above):
All data in messages between Erlang processes is copied, with the exception of refc binaries on the same Erlang node.
(*) You might enjoy reading this blog post about garbage collection in Erlang. It's actually a little more complicated than I said in the beginning as some objects (notably atoms and large binaries) are handled separately.
Robert Virding added the following in a comment below:
Having separate heaps for each process make the GC simpler and more efficient, you can reclaim much more memory in each pass than with a real-time collector. Also it scales much better in a parallel system as there are much much fewer locks and less synchronisation, which kills speed. It can also give better locality of memory and cache performance. It's one of those things which sounds worse but ends up being better.
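The copying rule for spawn/3 arguments and messages can be mimicked in Ruby to show what "no process references another process's heap" means in practice. This is only an analogy: `send_message` is a hypothetical helper, and a deep copy through `Marshal` stands in for the VM copying a term into the receiver's heap.

```ruby
# Deliver a message by deep-copying it, so sender and receiver share nothing.
def send_message(mailbox, message)
  mailbox << Marshal.load(Marshal.dump(message))
end

sender_data = { user: "joe", tags: ["a", "b"] }
mailbox = []
send_message(mailbox, sender_data)
received = mailbox.first

puts received == sender_data                      # true: same value
puts received.equal?(sender_data)                 # false: different object
puts received[:tags].equal?(sender_data[:tags])   # false: nested data copied too
```

Because nothing is shared, either side could be collected without looking at the other, which is the property the separate heaps rely on.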
We have some problems with Dart. It seems that after some period of time the garbage collector can't clear the memory in the VM, so the application hangs. Has anyone else had this issue? Are there any memory limits?
You should reuse your objects instead of creating new ones. You should use the object pool pattern:
http://en.wikipedia.org/wiki/Object_pool_pattern
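A minimal sketch of the pattern, written in Ruby to match the other examples in this thread rather than in Dart; `ObjectPool` and its methods are hypothetical names, and the sketch is not thread-safe.

```ruby
# Minimal object pool: objects are handed out, returned and reused
# instead of being allocated fresh each time.
class ObjectPool
  def initialize(size, &factory)
    @factory = factory
    @available = Array.new(size) { factory.call }
  end

  # Hand out a pooled object, or build a new one if the pool is empty.
  def checkout
    @available.pop || @factory.call
  end

  def checkin(object)
    @available.push(object)
  end

  # Borrow an object for the duration of the block and always return it.
  def with
    object = checkout
    yield object
  ensure
    checkin(object) if object
  end
end

pool = ObjectPool.new(2) { Array.new(1024, 0) }   # reusable buffers
first = pool.with { |buffer| buffer.object_id }
second = pool.with { |buffer| buffer.object_id }
puts first == second   # true: the same buffer was reused
```

The trade-off is that pooled objects must be reset by the caller before reuse, since they carry whatever state the previous user left in them.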
Be careful about canvas and its proper destruction.
Other GC performance articles:
http://blog.tojicode.com/2012/03/javascript-memory-optimization-and.html
http://qt-project.org/doc/qt-5/qtquick-performance.html
Are there any memory limits?
Yes. Dart apparently runs with maximum sizes that can be configured at launch time:
How to run a dart program with big memory?
(The following applies to all garbage-collected languages ...)
If your application starts to run out of space (i.e. the heap is slowly filling with objects that the GC can't remove) then you may get into a nasty situation where the GC runs more and more frequently, and manages to reclaim less and less memory each time. Eventually you run out of memory, but before that happens the application gets really slow.
The solution is typically to do one or both of the following:
Find what is causing the memory to run out. It is typically not that you are allocating too many objects. Rather, the typical cause is that the unwanted objects are all still reachable ... via some data structure that your application has built.
Set the "quick death" tuning option for the GC, if available. For example, Java garbage collectors can be configured to measure the time spent garbage collecting (the GC overhead). When the GC overhead exceeds a preset ratio, the Java virtual machine throws an OutOfMemoryError to "pull the plug".
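To illustrate the first point, here is a sketch (in Ruby, like the other examples in this thread) of how a data structure keeps unwanted objects reachable, and one possible fix. `BoundedCache` is a hypothetical class; bounding the structure is just one remedy, assuming old entries really are no longer needed.

```ruby
# An unbounded cache keeps every value reachable forever, so the GC can
# never reclaim them. Capping the number of entries lets old values
# become garbage.
class BoundedCache
  def initialize(max_entries)
    @max_entries = max_entries
    @entries = {}   # Ruby hashes preserve insertion order
  end

  def store(key, value)
    @entries.delete(key)          # re-inserting moves the key to the end
    @entries[key] = value
    @entries.shift while @entries.size > @max_entries   # evict the oldest
    value
  end

  def fetch(key)
    @entries[key]
  end

  def size
    @entries.size
  end
end

cache = BoundedCache.new(3)
10.times { |i| cache.store(i, "response #{i}") }
puts cache.size             # 3
puts cache.fetch(0).inspect # nil: evicted, so the GC may reclaim it
puts cache.fetch(9)         # response 9
```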