Decrease Rails boot time - ruby-on-rails

I found this blog post about reducing Rails boot time, and I set these environment variables in my .bashrc:
export RUBY_HEAP_MIN_SLOTS=800000
export RUBY_HEAP_FREE_MIN=100000
export RUBY_HEAP_SLOTS_INCREMENT=300000
export RUBY_HEAP_SLOTS_GROWTH_FACTOR=1
export RUBY_GC_MALLOC_LIMIT=79000000
And it did reduce my boot time by half.
Now I would like to know why this decreased my boot time, and what these environment variables mean.

RUBY_HEAP_MIN_SLOTS (default 10_000) - the initial number of heap slots, and the minimum number of slots at all times. One heap slot can hold one Ruby object.
RUBY_HEAP_FREE_MIN (default 4_096) - the number of free slots that should be present after the garbage collector finishes running. If fewer are left, new slots are allocated according to the RUBY_HEAP_SLOTS_INCREMENT and RUBY_HEAP_SLOTS_GROWTH_FACTOR parameters.
RUBY_HEAP_SLOTS_INCREMENT (default 10_000) - the number of new slots to allocate when all initial slots are used, i.e. the size of the second heap.
RUBY_HEAP_SLOTS_GROWTH_FACTOR (default 1.8) - multiplication factor used to determine how many new slots to allocate (the previous increment times this factor), for heaps #3 and onward.
RUBY_GC_MALLOC_LIMIT (default 8_000_000) - the number of bytes that can be allocated for C data structures (i.e. via malloc, outside the Ruby object heap) before triggering the garbage collector.
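Worked through with the settings above (my arithmetic, not from the blog): heap #1 starts with 800,000 slots; when those fill up, heap #2 adds another 300,000; and because the growth factor is 1, heaps #3 and onward also add 300,000 × 1 = 300,000 slots each. Under the defaults the heap instead starts at just 10,000 slots and grows by 10,000, then 18,000, then 32,400 slots at a time, so booting a large app triggers far more GC cycles along the way.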
The default settings for the Ruby garbage collector are not optimized for Rails, which uses a lot of memory and creates and destroys a huge number of objects. The optimal values depend on the application itself, and you can profile garbage collection under different settings: http://www.ruby-doc.org/core-2.0/GC/Profiler.html
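For instance, here is a minimal sketch of measuring GC activity with the built-in GC::Profiler (the object-churning loop is just a stand-in for booting your app):

# Enable the profiler, churn some objects, and report time spent in GC.
GC::Profiler.enable
100_000.times { Array.new(10) { 'x' * 50 } }  # stand-in workload
GC.start
puts GC::Profiler.result                       # formatted table of GC runs
printf("total GC time: %.4f s\n", GC::Profiler.total_time)
GC::Profiler.disable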
You can also monitor the GC using New Relic, gdb.rb, or gems like scrap (https://github.com/cheald/scrap/tree/master).
Here are some articles you may be interested in:
https://www.coffeepowered.net/2009/06/13/fine-tuning-your-garbage-collector/
http://technology.customink.com/blog/2012/03/16/simple-garbage-collection-tuning-for-rails/
http://snaprails.tumblr.com/post/241746095/rubys-gc-configuration

Related

How to check and release the memory occupied by DolphinDB stream engine?

I have already unsubscribed from a table in DolphinDB, but when I executed the function getStreamEngineStat().AsofJoinEngine, there was still memory occupied by the engine.
Does the function return the real-time memory information on the stream engine? How can I check the current memory status and release the memory?
getStreamEngineStat().AsofJoinEngine does return the real-time memory information on the stream engine.
Unsubscribing is not equivalent to dropping the engine. In your case, if you want to release the memory occupied by the engine, you need to first call dropStreamEngine and then set the variable returned by createAsofJoinEngine to NULL.
Specific examples are as follows:
Suppose you define the engine as follows:
ajEngine=createAsofJoinEngine(name="aj1", leftTable=trades, rightTable=quotes, outputTable=prevailingQuotes, metrics=<[price, bid, ask, abs(price-(bid+ask)/2)]>, matchingColumn=`sym, timeColumn=`time, useSystemTime=false, delayedTime=1)
After undefining the subscription, you can correctly release memory by the following code:
dropStreamEngine("aj1")  // 1. Drop the specified engine
ajEngine = NULL          // 2. Release the engine handle from memory
If you find that the asof-join engine consumes too much memory during the subscription, you can also specify the parameter garbageSize when creating the engine, so that historical data that is no longer needed gets cleaned up. The right garbageSize has to be set case by case, roughly according to how many records each key receives per hour.
The principle is: garbageSize applies per key, i.e. the cached data within each key group is cleaned up once the threshold is reached. If garbageSize is too small, frequent cleaning of historical data brings unnecessary overhead; if garbageSize is too large, the threshold may never be reached and cleaning is never triggered, leaving unneeded historical data in memory.
Therefore, it is recommended to aim for roughly one cleanup per hour and set garbageSize from that estimate, as in the sketch below.
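For example, the same engine definition with garbageSize added (a sketch; the value 20000 is an illustrative estimate for keys receiving on the order of 20,000 records per hour, not a recommendation):

ajEngine=createAsofJoinEngine(name="aj1", leftTable=trades, rightTable=quotes, outputTable=prevailingQuotes, metrics=<[price, bid, ask, abs(price-(bid+ask)/2)]>, matchingColumn=`sym, timeColumn=`time, useSystemTime=false, delayedTime=1, garbageSize=20000)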

Apache Beam: do all keys have to fit into memory on a worker?

Assume I have an unbounded dataset with extremely high cardinality, say more than 1,000,000,000 unique keys, and I want to count by key over fixed windows.
My understanding is that the combine function will essentially maintain an accumulator in memory for each key on each machine.
Question 1
Is the above assumption correct, or can workers flush keys and accumulators to disk when under memory pressure?
Question 2 (assuming above correct)
Assuming the data is not naturally partitioned (e.g. reading from Pub/Sub), would we run out of memory on each worker, since every machine may in theory see every key and have to maintain an in-memory structure for each of them?
Question 3 (assuming above correct)
If we store the data in Kafka and split it into partitions based on the key we are counting on, and assuming one Beam worker reads from one partition, then each worker only sees a consistent subset of the keyspace. In this scenario, would the memory use of the workers be any different?
Beam is meant to be highly scalable; there are Beam pipelines that run on Dataflow with many trillions of unique keys.
When running a combining operation in Beam a table of keys and aggregated values is kept in memory, but when the table becomes full it is flushed to disk (well, technically, to shuffle) so it will not run out of memory. Another worker will read this data out of shuffle, one value at a time, to compute the final aggregate over all upstream worker outputs.
As for your other two questions, if your input is naturally partitioned by key such that each worker only sees a subset of keys it is possible that more combining could happen before the shuffle, leading to less data being shuffled, but this is by no means certain and the effects would likely be small. In particular, memory considerations won't change.
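For concreteness, here is a count-per-key over fixed windows in the Python SDK (a sketch: the bounded Create stands in for an unbounded source such as Pub/Sub, and Count.PerKey is a combining transform, so partial counts are combined on each worker before the shuffle):

import apache_beam as beam
from apache_beam.transforms import window

with beam.Pipeline() as p:
    (p
     | "Read" >> beam.Create([("k1", 1), ("k2", 1), ("k1", 1)])  # stand-in for Pub/Sub
     | "Window" >> beam.WindowInto(window.FixedWindows(60))      # 60-second fixed windows
     | "Count" >> beam.combiners.Count.PerKey()                  # partial sums are combined before the shuffle
     | "Print" >> beam.Map(print))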

Maximize or force full garbage collection in Xodus database

We have an application that writes lots of data in a Xodus database. It actually writes so much data that the garbage collector cannot keep up with freeing old files.
My question therefore is: are there any recommended settings for "maximum GC", or is there a way to force Xodus to "stop" (disallow writes) and do a full garbage collection at some point during the night?
Edit (requested information)
Non-default settings:
GcFilesDeletionDelay = 0
GcMinUtilization = 75
GcRunEvery = 1 (for testing)
GcRunPeriod = 1 (for testing)
GcTransactionAcquireTimeout = 1000
GcTransactionTimeout up to 120000 (for testing)
Fiddling with these settings has not increased GC throughput in a relevant way.
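For reference, here is roughly how the settings above map onto Xodus's EnvironmentConfig (a sketch assuming the standard setter names; GcRunEvery is omitted because I am not sure of its setter, and the explicit env.gc() call at the end only hints the collector to run, it is not a guaranteed full collection):

import jetbrains.exodus.env.Environment;
import jetbrains.exodus.env.EnvironmentConfig;
import jetbrains.exodus.env.Environments;

// Apply the GC settings from the question.
EnvironmentConfig config = new EnvironmentConfig()
        .setGcFilesDeletionDelay(0)
        .setGcMinUtilization(75)
        .setGcRunPeriod(1)                    // for testing
        .setGcTransactionAcquireTimeout(1000)
        .setGcTransactionTimeout(120000);     // for testing
Environment env = Environments.newInstance("/path/to/db", config);  // placeholder path
// A nightly maintenance job could explicitly ask the GC to run:
env.gc();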
What we do:
We have a single import thread that writes exclusively to an environment store. Data is written continuously throughout the day.
There are many parallel threads that read the data using read only transactions.
The data basically is measurement data in the form (location, values...)
50% of values get updated every day
There are roughly 8 million records in the database, the database currently has 122 GB on disk with 75% free space (as printed by the GC)
The VM has 20 GB of RAM, the environment store may use up to 25%

Why does Prometheus consume so much memory?

I'm using Prometheus 2.9.2 for monitoring a large environment of nodes.
As part of testing the maximum scale of Prometheus in our environment, I simulated a large amount of metrics on our test environment.
My management server has 16GB ram and 100GB disk space.
During the scale testing, I've noticed that the Prometheus process consumes more and more memory until the process crashes.
I've noticed that the WAL directory is getting filled fast with a lot of data files while the memory usage of Prometheus rises.
The management server scrapes its nodes every 15 seconds and the storage parameters are all set to default.
I would like to know why this happens, and how/if it is possible to prevent the process from crashing.
Thank you!
The out-of-memory crash is usually the result of an excessively heavy query. This may be set in one of your rules (the query may even be running on a Grafana page instead of Prometheus itself).
If you have a very large number of metrics, it is possible the rule is querying all of them. A quick fix is to specify exactly which metrics to query, using specific labels instead of a regex matcher.
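For example (hypothetical metric and label names), replacing a broad regex matcher with exact label selectors:

# heavy: the regex matcher selects series from every job
sum(rate(http_requests_total{job=~".+"}[5m]))
# lighter: select only the series you actually need
sum(rate(http_requests_total{job="api"}[5m]))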
This article explains why Prometheus may use large amounts of memory during data ingestion. If you need to reduce memory usage for Prometheus, then the following actions can help:
Increasing scrape_interval in Prometheus configs (see the sketch below).
Reducing the number of scrape targets and/or scraped metrics per target.
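A minimal prometheus.yml sketch of the first suggestion (the job name and targets are placeholders; raising the interval from the question's 15s to 60s cuts the ingested sample rate to a quarter):

global:
  scrape_interval: 60s       # was 15s; fewer samples per second means a smaller head block
scrape_configs:
  - job_name: nodes          # placeholder job
    static_configs:
      - targets: ['node1:9100', 'node2:9100']  # placeholder targets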
P.S. Take a look also at the project I work on - VictoriaMetrics. It can use less memory than Prometheus. See this benchmark for details.
Because the combination of labels depends on your business, the combinations and hence the blocks may be unlimited; there's no way to fully solve the memory problem with the current design of Prometheus! But I suggest you compact small blocks into big ones: that will reduce the number of blocks.
Huge memory consumption has TWO reasons:
the Prometheus TSDB keeps an in-memory block named the "head"; because the head stores all series from the most recent hours, it eats a lot of memory.
each block on disk also eats memory, because each block on disk has an index reader in memory; dismayingly, all labels, postings, and symbols of a block are cached in the index reader struct, so the more blocks on disk, the more memory is occupied.
In index/index.go, you will see:
type Reader struct {
    b ByteSlice
    // Close that releases the underlying resources of the byte slice.
    c io.Closer
    // Cached hashmaps of section offsets.
    labels map[string]uint64
    // LabelName to LabelValue to offset map.
    postings map[string]map[string]uint64
    // Cache of read symbols. Strings that are returned when reading from the
    // block are always backed by true strings held in here rather than
    // strings that are backed by byte slices from the mmap'd index file. This
    // prevents memory faults when applications work with read symbols after
    // the block has been unmapped. The older format has sparse indexes so a map
    // must be used, but the new format is not so we can use a slice.
    symbolsV1        map[uint32]string
    symbolsV2        []string
    symbolsTableSize uint64
    dec     *Decoder
    version int
}
We used Prometheus version 2.19 and saw significantly better memory performance. This blog post highlights how that release tackles memory problems. I strongly recommend upgrading to improve your instance's resource consumption.

Golang. Zero Garbage propagation or efficient use of memory

From time to time I come across concepts like "zero garbage" or "efficient use of memory". As an example, in the Features section of the well-known package httprouter you can see the following:
Zero Garbage: The matching and dispatching process generates zero bytes of garbage. In fact, the only heap allocations that are made, is by building the slice of the key-value pairs for path parameters. If the request path contains no parameters, not a single heap allocation is necessary.
Also, this package shows very good benchmark results compared to the standard library's http.ServeMux:
BenchmarkHttpServeMux     5000   706222 ns/op   96 B/op   6 allocs/op
BenchmarkHttpRouter     100000    15010 ns/op    0 B/op   0 allocs/op
As far as I understand, the second one (per the table) performs no heap allocations: zero bytes and zero allocations per repetition on average.
The question: I want to gain a basic understanding of memory management. When does the garbage collector allocate/deallocate memory? What do the benchmark numbers mean (the last two columns of the table), and how do people know when the heap is allocating?
I'm absolutely new to memory management, so it's really difficult to understand what's going on "under the hood". The articles I've read:
https://golang.org/ref/mem
https://golang.org/doc/effective_go.html
http://gribblelab.org/CBootcamp/7_Memory_Stack_vs_Heap.html
http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)
The garbage collector doesn't allocate memory :-), it only deallocates it. Go's garbage collector is evolving; for the details have a look at the design document https://docs.google.com/document/d/16Y4IsnNRCN43Mx0NZc5YXZLovrHvvLhK_h0KN8woTO4/preview?sle=true and follow the discussion on the golang mailing lists.
The last two columns in the benchmark output are dead simple: how many bytes have been allocated in total and how many allocations have happened during one iteration of the benchmark code. (These allocations are done by your code, not by the garbage collector.) As any allocation is a potential creation of garbage, reducing these numbers might be a design goal.
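A minimal sketch of where those two columns come from (run with go test -bench=. -benchmem, or call b.ReportAllocs() as below; the concat function is just a deliberately allocation-heavy example):

// concat_test.go
package main

import "testing"

// concat allocates a new string on every += iteration.
func concat(parts []string) string {
    s := ""
    for _, p := range parts {
        s += p
    }
    return s
}

func BenchmarkConcat(b *testing.B) {
    parts := []string{"foo", "bar", "baz"}
    b.ReportAllocs() // emit B/op and allocs/op even without -benchmem
    for i := 0; i < b.N; i++ {
        concat(parts)
    }
}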
When are things allocated on the heap? Whenever the Go compiler decides to! The compiler tries to allocate on the stack, but sometimes it must use the heap, especially when a value escapes its local, stack-based scope. This escape analysis is currently undergoing rework, so it is not easy to tell which values will be heap- or stack-allocated, especially as this changes from compiler version to compiler version.
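You can ask the compiler to print its escape-analysis decisions for your own code (a sketch; build it with go build -gcflags='-m' escape.go):

// escape.go
package main

type point struct{ x, y int }

// p never outlives the function, so the compiler keeps it on the stack.
func onStack() int {
    p := point{1, 2}
    return p.x
}

// Returning &p makes p escape: it must be allocated on the heap.
func onHeap() *point {
    p := point{1, 2}
    return &p
}

func main() {
    _ = onStack()
    _ = onHeap()
}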
I wouldn't be too obsessed with avoiding allocations until your benchmarks show too much GC overhead.
