I have deployed my Rails application on a server.
It works fine, but it crashes on the sign_up page, and then the server stops.
I checked my mongrel.log file, and it shows the following error:
libgomp: Thread creation failed: Cannot allocate memory
How can I resolve this error?
Thanks.
Sounds like your system was out of memory. You're probably on a VM with a limited amount of memory (and no swap). You'll need to get more memory, or use less.
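One hedged way to "use less": the libgomp message points at an OpenMP-enabled native library trying to spawn worker threads (ImageMagick builds are a common example in Rails apps; which library is involved here is an assumption). Capping OpenMP's thread count with the standard OMP_NUM_THREADS environment variable reduces how much memory each request tries to allocate:

# Assumption: the threads come from an OpenMP-enabled native library
# used by the sign_up page. OMP_NUM_THREADS is the standard OpenMP
# variable that caps how many threads libgomp will create.
export OMP_NUM_THREADS=1
# Restart mongrel afterwards so it inherits the variable.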
I have a Ruby on Rails app where we validate records from huge Excel files (200k records) in the background via Sidekiq. We also use Docker, and hence a separate container for Sidekiq. When Sidekiq starts up, memory used is approximately 120 MB, but as the validation worker begins, memory reaches up to 500 MB (and that's after a lot of optimisation).
The issue is that even after the job is processed, memory usage stays at 500 MB and is never freed, which prevents any new jobs from being added.
I manually start garbage collection using GC.start after every 10k records, and also after the job is complete, but it still doesn't help.
This is most likely not related to Sidekiq, but to how Ruby allocates memory from, and releases it back to, the OS.
Most likely the memory cannot be released because of fragmentation. Besides optimizing your program (process the data chunk-wise instead of reading it all into memory; there is a sketch of this further down), you could try to tweak the allocator, or change the allocator altogether.
There has been a lot written about this specific issue with Ruby and memory. I really like this post by Nate Berkopec, which goes into all the details: https://www.speedshop.co/2017/12/04/malloc-doubles-ruby-memory.html
The simple "solution" is:
Use jemalloc or, if not possible, set MALLOC_ARENA_MAX=2.
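For the Docker setup described in the question, a minimal sketch of wiring that in might look like this (the package name and library path assume a Debian/Ubuntu base image; adjust for yours):

# Dockerfile sketch -- package name and path assume a Debian/Ubuntu base image.
RUN apt-get update && apt-get install -y libjemalloc2
# Preload jemalloc so Ruby's malloc/free calls go through it.
ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2
# Fallback if jemalloc is not an option: limit glibc malloc arenas.
# ENV MALLOC_ARENA_MAX=2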
The more complex solution would be to optimize your program further so that it does not load that much data in the first place; a chunk-wise sketch follows.
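As a sketch of the chunk-wise idea (assuming the spreadsheet can be exported to or streamed as CSV; the worker name, batch size, and validate_batch stand-in are made up for illustration):

require "sidekiq"
require "csv"

class ValidationWorker
  include Sidekiq::Worker

  def perform(path)
    batch = []
    # CSV.foreach streams one row at a time instead of loading
    # all 200k records into memory at once.
    CSV.foreach(path, headers: true) do |row|
      batch << row.to_h
      if batch.size >= 1_000
        validate_batch(batch)
        batch.clear # let the previous batch become garbage
      end
    end
    validate_batch(batch) unless batch.empty?
  end

  # Stand-in for the app's real validation logic (hypothetical).
  def validate_batch(rows)
    rows.count { |attrs| attrs.values.none?(&:nil?) } # placeholder check
  end
end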
I was able to cut memory usage in a project from 12 GB to under 3 GB by switching to jemalloc. That project dealt with a lot of imports/exports and was written quite poorly, so it was an easy win.
I'm using the Puma server and DelayedJob.
It seems that the memory taken by each job isn't released, and I slowly accumulate bloat that forces me to restart my dyno (Heroku).
Any reason why the dyno won't return to the memory usage it had before the job was performed?
Is there any way to force it to be released? I tried calling the GC, but it doesn't seem to help.
You may have one of the following problems, or in fact all of them:
Number 1. This is not an actual problem, but a misconception about how Ruby releases memory to the operating system. Short answer: it doesn't. Long answer: Ruby manages an internal list of free objects. Whenever your program needs to allocate new objects, it will take them from this free list. If there are no more objects there, Ruby will allocate new memory from the operating system. When objects are garbage collected, they go back to the free list, so Ruby still holds the allocated memory. To illustrate: imagine that your program normally uses 100 MB. If at some point it allocates 1 GB, it will hold on to that memory until you restart it.
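You can watch this from inside Ruby with GC.stat; a minimal sketch (exact numbers vary by Ruby version):

# Allocate a lot of objects, drop them, and force a collection.
big = Array.new(1_000_000) { "x" * 100 }
big = nil
GC.start

# Live slots fall back down after the GC...
p GC.stat(:heap_live_slots)
# ...but the freed slots sit on Ruby's internal free list, still
# counted against the process, instead of going back to the OS.
p GC.stat(:heap_free_slots)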
There are some good resources to learn more about this.
What you should do is increase your dyno size and monitor your memory usage over time. It should stabilize at some level; that will show you your normal memory usage.
Number 2. You may have an actual memory leak, either in your code or in some gem. Check out this repository; it contains information about well-known memory leaks and other memory issues in popular gems. delayed_job is actually listed there.
Number 3. You may have unoptimized code that uses more memory than needed; investigate your memory usage and try to reduce it. If you are processing large files, do it in smaller batches, as in the sketch below.
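For example, with ActiveRecord you can walk a table in fixed-size batches instead of materializing the whole result set (a sketch; ImportRow and process are placeholders, not names from the question):

# find_each loads records in batches (1000 by default) instead of
# instantiating every row at once.
ImportRow.find_each(batch_size: 1_000) do |row|
  process(row) # hypothetical per-row work
end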
I'm the maintainer of a legacy Delphi application. On machines running this program, an Application Error sometimes appears, with a caption referring to this Delphi app and a message like the following:
The instruction at "..." referenced memory at "...". The memory could not be "read".
Click on OK to terminate the program.
Task Manager says the process belonging to this message box is csrss.exe. What would be a systematic way to find the root cause of this error?
The problem is that this Delphi program is fairly complex and the error message appears relatively rarely, so I cannot simply step through the code and find the part that causes the error. Moreover, the app runs automatically, without user interaction, so I can't ask the user what she was doing when the message appeared. The application and system logs don't indicate any problem, and the app does not stop working while the message box is present.
I hope someone has run into such an error message before and was able to solve the problem. Thank you for your help in advance.
csrss supports Windows consoles. I expect that your application targets the console subsystem.
If you cannot get your application to fail under the debugger, then you need to add some diagnostics to it. I recommend using a tool like madExcept or EurekaLog for this. Personally I use madExcept and cannot recommend it highly enough. From what I have heard, EurekaLog is also a fine product.
Integrate one of these tools with your application and the next time it faults it will produce a detailed diagnostics report. Most significantly you will get stack traces for each thread in your process. The stack trace for the faulting thread should hopefully lead you to the root cause of your program's bug.
My one doubt is that if the fault is occurring in csrss, then including diagnostics in your process may not yield fruit. On the other hand, it's plausible that your application faulted first, which in turn led to the error message in csrss; in that case, diagnostics in your app will help. If not, you may need to find a way to make the fault occur in your own process.
In addition to David's recommendation, I would suggest using ProcDump from Sysinternals to monitor the process and have it write a dump file when an unhandled exception occurs.
You can analyze the dump file offline with WinDbg and the like. While that might seem overwhelming at first, I strongly believe there's a lot to be gained by getting yourself up to speed with WinDbg.
Introduction
ProcDump is a command-line utility whose primary purpose is monitoring an application for CPU spikes and generating crash dumps during a spike that an administrator or developer can use to determine the cause of the spike. ProcDump also includes hung window monitoring (using the same definition of a window hang that Windows and Task Manager use), unhandled exception monitoring and can generate dumps based on the values of system performance counters.
Example
Launch a process and then monitor it for exceptions:
C:\>procdump -e 1 -f "" -x c:\dumps consume.exe
While load testing my Erlang server with an increasing number (100, 200, 3000, ...) of processes, set via +P (the maximum number of concurrent processes), and with 10 processes each sending 1 message to all of the other created processes, I got this message on the Erlang console:
"Crash dump was written to: erl_crash.dump. eheap_alloc: Cannot allocate 298930300 bytes of memory (of type "old_heap"). Abnormal termination".
I'm using Windows XP. There is no problem when I create the processes (that part works). The crash happens after the processes start communicating (sending hi and receiving hello), and this is the only problem I have. (By the way, I also use +hms, which sets the default heap size of processes.)
How can I resolve this?
In case somebody finds it useful as one possible reason for such a problem (since I haven't found any specific answer anywhere):
We experienced a similar problem with a RabbitMQ server (Linux, 64-bit, persistent queue, watermarks with the default config):
eheap_alloc: Cannot allocate yyy bytes of memory (of type "heap")
eheap_alloc: Cannot allocate xxx bytes of memory (of type "old_heap")
The problem was re-queueing too many messages at once. Our "monitoring" code used a "get" message with the re-queue option without limiting the number of messages to get and re-queue (in our case, all messages in the queue, which was 4K).
So when it tried to add all these messages back to the queue at once, the server failed with the above message.
Hope it saves someone a few hours.
Have a look at that erl_crash.dump file using the Crashdump Viewer:
/usr/local/lib/erlang/lib/observer-1.0/priv/bin/cdv erl_crash.dump
(Apologies for the Unix path; you should be able to find a cdv.bat in your installation on Windows.)
Look at the process list; in my experience there's fairly often a process with a really long message queue where you didn't expect it.
You ran out of memory. Try decreasing the default heap size or limiting the number of processes you start.
More advanced solutions include profiling your application to see if you can save some memory, for example through better sharing of binaries, or less use of lists and of large messages (whose data is copied to every process they're sent to).
One of your processes tried to allocate almost 300 MB of memory, so you probably have a memory leak in your server. In a proper design, no single process should have such a big heap unless that is intended.
I have a JRuby on Rails app that uses several web services (WS) to gather data. The app processes the data and displays it to the user; the user makes changes, and the data is then sent back to the WS.
The problem is that I store everything in the cache (session-based), which uses the memory store. But from time to time, for no clear reason (to me at least), this error pops up:
ActionView::Template::Error (GC overhead limit exceeded)
I read what I could find about it, and apparently it means that the garbage collector spends too much time trying to free memory without making any real progress. My guess is that since everything is cached in memory, the GC tries to free it, can't, and throws this error.
So here are the questions.
How can I get around this?
If I switch from the memory store to Redis, and if my assumptions are correct, will this problem still appear?
Will the GC try to free Redis's memory area? (Might be a stupid question, but ... please help nonetheless :) )
Thank you.
Redis is a separate process, so your app's garbage collector won't touch it. However, it is possible that Redis is using up all the memory that would otherwise be available to the app, or that you are actually using an in-process memory cache and not Redis, which is a different type of memory store.
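If you do move the cache out of process, the Rails side is a small configuration change (a sketch; the URL is a placeholder, and :redis_cache_store is built in from Rails 5.2 onward, while older versions need a gem such as redis-rails):

# config/environments/production.rb
# In-process store -- lives inside the Ruby heap, subject to your GC:
# config.cache_store = :memory_store, { size: 64.megabytes }

# Out-of-process store -- lives in the Redis process instead:
config.cache_store = :redis_cache_store, { url: "redis://localhost:6379/0" }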