How to trigger garbage collector to reduce database size? - xodus

We use Xodus for a remote probe project to store temporary data before sending it to the centralized database. We therefore have several stores that can grow or shrink depending on the environment (traffic, network connection, etc.). Thanks to the garbage collector, we expected the database file size to decrease, but so far it has only grown.
We tried several garbage collector configurations to trigger it as frequently as possible. For example, we have:
conf.setGcFileMinAge(1);
conf.setGcFilesInterval(1);
conf.setGcMinUtilization(1);
Without any visible effect...
After the store has been emptied, we expected the .xd files to shrink or be deleted, but the database keeps growing and growing.
EDIT:
I tried to observe the GC's effect with simpler code, shown below:
Environment exodus = Environments.newInstance(dbPath);
final Transaction xtxn = exodus.beginExclusiveTransaction();
Store store = exodus.openStore("testStore", StoreConfig.WITHOUT_DUPLICATES, xtxn);
xtxn.commit();

Thread.sleep(10 * 1000); // Wait to do actions after the first background cleaning cycle

// Fill the store, then clear it
exodus.executeInExclusiveTransaction(tx -> {
    for (int i = 1; i <= 1000000; i++) {
        store.putRight(tx, LongBinding.longToEntry(i), StringBinding.stringToEntry(dbPath));
    }
});
clearStore(exodus, store);
exodus.gc();
Thread.sleep(5 * 60 * 1000); // Wait to see the GC doing its work
boolean clearStore(final Environment exodus, final Store store) {
    Transaction tx = exodus.beginExclusiveTransaction();
    try (Cursor cursor = store.openCursor(tx)) {
        boolean success = true;
        while (cursor.getNext() && success) {
            success &= cursor.deleteCurrent();
        }
        if (success) {
            tx.commit();
            return true;
        } else {
            log.warn("failed to delete entry {}", cursor.getKey());
            tx.abort();
            return false;
        }
    } catch (Exception e) {
        tx.abort();
        return false;
    }
}
If I remove the first "sleep", the garbage collector does its work: the database file size is reduced as expected and everything is OK.
But if I keep the first "sleep", the garbage collector never seems to run.
It's as if the first background cleaning cycle is fine, but not the following ones...
I keep the default configuration in this example.

There is the Environment.gc() method. The javadoc for the method is as follows:
Says environment to quicken background database garbage collector activity. Invocation of this method doesn't have immediate consequences like freeing disk space, deleting particular files, etc.
I wouldn't recommend modifying default GC settings. EnvironmentConfig.setGcMinUtilization() can be used to keep the database more compact than it would be by default, or to decrease GC load (e.g., in parallel with batch updates). Basically, higher required minimum utilization (less admissible free space) results in higher GC load.
The GC cleans the database file by file, selecting files with the least utilization first. When a file is cleaned, it is not deleted immediately; two conditions must be satisfied:
A delay configured by EnvironmentConfig.getGcFilesDeletionDelay() should pass. It's 5 seconds by default.
Any transaction (even read-only), created before the moment when the file is cleaned, should be finished (committed or aborted).
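For illustration, here is a minimal sketch (my own, not the poster's code; the path, class name, and values are made up) of how the GC-related settings on EnvironmentConfig fit together with the two conditions above:

import jetbrains.exodus.env.Environment;
import jetbrains.exodus.env.EnvironmentConfig;
import jetbrains.exodus.env.Environments;

public class GcConfigSketch {
    public static void main(String[] args) throws InterruptedException {
        EnvironmentConfig config = new EnvironmentConfig();
        // Require at least 80% utilization of .xd files (default is 50); higher required
        // utilization means less admissible free space and a higher GC load.
        config.setGcMinUtilization(80);
        // Cleaned files are physically deleted only after this delay (default 5000 ms).
        config.setGcFilesDeletionDelay(5000);

        Environment env = Environments.newInstance("/tmp/exodus-gc-demo", config);
        try {
            // ... open stores, write and delete data in transactions, making sure every
            // transaction (including read-only ones) is committed or aborted ...

            // Hint to the background cleaner to act sooner; this returns immediately and
            // does not free disk space by itself.
            env.gc();

            // Give the cleaner time: a cleaned file is deleted only after the deletion
            // delay has passed and all transactions started before the cleaning finished.
            Thread.sleep(10_000);
        } finally {
            env.close();
        }
    }
}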

Related

How to work around gradle being broken in docker environments?

Gradle does not work correctly in a Docker environment; it is destined to use too much memory and eventually be killed for it.
The memory manager gets its snapshots using the following class
https://github.com/gradle/gradle/blob/master/subprojects/process-services/src/main/java/org/gradle/process/internal/health/memory/MemInfoOsMemoryInfo.java
and in particular Gradle determines how much free memory is left by reading /proc/meminfo, which provides an inaccurate reading in a container.
Gradle only kills off Worker Daemons when a request comes in to make a new Worker Daemon with a larger min heap size than is available according to this reading.
Thus, Gradle will keep making workers until it uses up the allotted amount for the container and is killed.
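To make the mismatch concrete, here is a minimal sketch (mine, not Gradle's code; the class name is made up) that compares the host-wide figures Gradle reads from /proc/meminfo with the container's actual budget in the cgroup files. The paths assume a typical cgroup v1 Docker host; on cgroup v2 the files are /sys/fs/cgroup/memory.max and /sys/fs/cgroup/memory.current:

import java.nio.file.Files;
import java.nio.file.Paths;

public class ContainerMemoryCheck {
    public static void main(String[] args) throws Exception {
        // Host-wide view: this is what MemInfoOsMemoryInfo parses.
        Files.readAllLines(Paths.get("/proc/meminfo")).stream()
             .filter(l -> l.startsWith("MemTotal") || l.startsWith("MemAvailable"))
             .forEach(l -> System.out.println("host   " + l));

        // Container view: the cgroup limit and current usage (cgroup v1 paths).
        String limit = Files.readAllLines(Paths.get("/sys/fs/cgroup/memory/memory.limit_in_bytes")).get(0);
        String usage = Files.readAllLines(Paths.get("/sys/fs/cgroup/memory/memory.usage_in_bytes")).get(0);
        System.out.println("cgroup limit (bytes): " + limit);
        System.out.println("cgroup usage (bytes): " + usage);
    }
}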
Does anyone have a workaround for this? I don't really understand how this hasn't been a problem for more people. I suppose it only really becomes an issue if your Worker Daemons can't be reused and so new ones get created, which is the case for me as I have a large number of modules.
I have a temporary workaround wherein I give every spawned JVM a huge -Xms, so the requested min heap size always exceeds the available memory and prior Worker Daemons are always removed, but this is not satisfactory.
-- edit
To preempt some things: --max-workers does not affect the number of Worker Daemons allowed to exist, it merely affects the number allowed to be active. Even with --max-workers = 1, arbitrarily many idle Worker Daemons are allowed.
Edit - Ignore the below; it somewhat works, but I have since patched Gradle by overwriting the MemInfoOsMemoryInfo class and it works a lot better. Will provide a link to the MR against Gradle soon.
Found a reasonable workaround: we listen for OS memory updates, and every time a task is done we request more memory than is determined to be free, ensuring a daemon is stopped.
import org.gradle.process.internal.health.memory.OsMemoryStatus
import org.gradle.process.internal.health.memory.OsMemoryStatusListener
import org.gradle.process.internal.health.memory.MemoryManager

task expireWorkers {
    doFirst {
        long freeMemory = 0
        def memoryManager = services.get(MemoryManager.class)
        gradle.addListener(new TaskExecutionListener() {
            void beforeExecute(Task task) {
            }

            void afterExecute(Task task, TaskState state) {
                println "Freeing up memory"
                memoryManager.requestFreeMemory(freeMemory * 2)
            }
        })
        memoryManager.addListener(new OsMemoryStatusListener() {
            void onOsMemoryStatus(OsMemoryStatus osMemoryStatus) {
                freeMemory = osMemoryStatus.freePhysicalMemory
            }
        })
    }
}

UWP App WebView Leaks Memory, doesn't clear images

Problem:
The WebView is not releasing memory after loading images.
The memory does seem to get released if all WebView instances are destroyed for a few seconds. We removed them from the XAML tree and cleared all references in code. (We checked in the debugger that all instances were released at that point.)
This solution is problematic since the WebView must be dead for a while for the memory clearing to kick in, which is unacceptable for our use case.
How to reproduce:
Make either a UWP C# app or a UWP C++ app -> add a WebView -> load large images with randomized URLs -> memory keeps growing.
We have only one active WebView and we load a large image in it multiple times one after another.
(We randomise part of the image url to simulate different ad loads.)
The memory keeps growing as if the images never get released.
What we tried:
Clearing the cache with WebView.ClearTemporaryWebDataAsync(), but it doesn't do anything.
Manually triggering GC.
Notes:
We initialise the WebView using "WebView(WebViewExecutionMode.SeparateThread)".
(Other execution modes don't seem to help.)
We do not use a WebViewBrush.
WebView is a complex element with its own garbage collection rules. To keep rendering performant, it caches a lot of data, which causes memory to keep growing while the GC process stays slow; we can't have it both ways.
In my experience, you can set the WebView Source to "about:blank" repeatedly; that releases most of the data immediately.
private void AppBarButton_Click(object sender, RoutedEventArgs e)
{
    int count = 0;
    var timer = new DispatcherTimer() { Interval = TimeSpan.FromSeconds(1) };
    timer.Start();
    timer.Tick += (s, p) =>
    {
        TestWebView.Source = new Uri("about:blank");
        count++;
        if (count == 20)
        {
            timer.Stop();
        }
    };
}

Core Data Import - Not releasing memory

My question is about Core Data and memory not being released. I am doing a sync process that imports data from a web service which returns JSON. I load the data to import into memory, loop through it, and create NSManagedObjects. The imported data needs to create objects that have relationships to other objects; in total there are around 11,000. But to isolate the problem, right now I am only creating the items of the first and second level, leaving the relationships out; that is 9043 objects.
I started checking the amount of memory used because the app was crashing at the end of the process (with the full data set). The first memory check is done after loading the JSON into memory, so the measurement really only takes into consideration the creation and insertion of the objects into Core Data. What I use to check the memory used is this code (source):
-(void) get_free_memory {
    struct task_basic_info info;
    mach_msg_type_number_t size = sizeof(info);
    kern_return_t kerr = task_info(mach_task_self(),
                                   TASK_BASIC_INFO,
                                   (task_info_t)&info,
                                   &size);
    if( kerr == KERN_SUCCESS ) {
        NSLog(@"Memory in use (in bytes): %f", (float)(info.resident_size/1024.0)/1024.0);
    } else {
        NSLog(@"Error with task_info(): %s", mach_error_string(kerr));
    }
}
My setup:
1 Persistent Store Coordinator
1 Main ManagedObjectContext (MMC) (NSMainQueueConcurrencyType used to read (only reading) the data in the app)
1 Background ManagedObjectContext (BMC) (NSPrivateQueueConcurrencyType, undoManager is set to nil, used to import the data)
The BMC is independent of the MMC; the BMC is not a child context of the MMC, and they do not share any parent context. I don't need the BMC to notify changes to the MMC, so the BMC only needs to create/update/delete the data.
Platform:
iPad 2 and 3
iOS: I have tested with the deployment target set to 5.1 and 6.1; there is no difference.
Xcode 4.6.2
ARC
Problem:
While importing the data, the memory used doesn't stop increasing, and iOS doesn't seem able to drain it even after the process ends. As the data sample grows, this leads to memory warnings and eventually to the app being killed.
Research:
Apple documentation
Efficiently importing Data
Reducing Memory Overhead
Good recap of the points to have in mind when importing data to Core Data (Stackoverflow)
Tests done and analysis of the memory release. He seems to have the same problem as I do, and he sent an Apple bug report with no response yet from Apple. (Source)
Importing and displaying large data sets (Source)
Indicates the best way to import a large amount of data. Although he mentions:
"I can import millions of records in a stable 3MB of memory without
calling -reset."
This makes me think this might be somehow possible? (Source)
Tests:
Data Sample: creating a total of 9043 objects.
Turned off the creation of relationships, as the documentation says they are "expensive"
No fetching is being done
Code:
- (void)processItems {
    [self.context performBlock:^{
        for (int i = 0; i < [self.downloadedRecords count];) {
            @autoreleasepool
            {
                [self get_free_memory]; // prints current memory used
                for (NSUInteger j = 0; j < batchSize && i < [self.downloadedRecords count]; j++, i++)
                {
                    NSDictionary *record = [self.downloadedRecords objectAtIndex:i];
                    Item *item = [self createItem];
                    objectsCount++;
                    // fills in the item object with data from the record, no relationship creation is happening
                    [self updateItem:item WithRecord:record];
                    // creates the subitems, fills them in with data from the record, relationship creation is turned off
                    [self processSubitemsWithItem:item AndRecord:record];
                }
                // Context save is done before draining the autorelease pool, as specified in research 5)
                [self.context save:nil];
                // Faulting all the created items
                for (NSManagedObject *object in [self.context registeredObjects]) {
                    [self.context refreshObject:object mergeChanges:NO];
                }
                // Double tap the previous action by resetting the context
                [self.context reset];
            }
        }
    }];
    [self check_memory]; // performs a repeated selector to [self get_free_memory] to view the memory after the sync
}
Measurement:
It goes from 16.97 MB to 30 MB; after the sync it goes down to 28 MB. Repeating the get_free_memory call every 5 seconds keeps the memory at 28 MB.
Other tests without any luck:
recreating the persistent store as indicated in research 2) has no effect
tested letting the thread wait a bit to see if the memory recovers, example 4)
setting context to nil after the whole process
Doing the whole process without saving the context at any point (therefore losing the info). That actually resulted in less memory being retained, leaving it at 20 MB. But it still doesn't decrease and... I need the info stored :)
Maybe I am missing something, but I have really tested a lot, and after following the guidelines I would expect to see the memory decrease again. I have run the Allocations instrument to check heap growth, and this seems to be fine too. Also, no memory leaks.
I am running out of ideas to test/adjust... I would really appreciate it if anyone could help me with ideas of what else I could test, or maybe point out what I am doing wrong. Or maybe it is just like that, how it is supposed to work... which I doubt...
Thanks for any help.
EDIT
I have used Instruments to profile the memory usage with the Activity Monitor template, and the result shown in "Real Memory Usage" is the same as the one printed in the console by get_free_memory; the memory still never seems to get released.
OK, this is quite embarrassing... Zombies were enabled in the scheme: under Arguments they were turned off, but under Diagnostics "Enable Zombie Objects" was checked...
Turning this off keeps the memory stable.
Thanks to everyone who read through the question and tried to solve it!
It seems to me the key takeaway of your favorite source ("3 MB, millions of records") is the batching that is mentioned (besides disabling the undo manager, which is also recommended by Apple and very important).
I think the important thing here is that this batching has to apply to the @autoreleasepool as well.
It's insufficient to drain the autorelease pool every 1000
iterations. You need to actually save the MOC, then drain the pool.
In your code, try putting a second @autoreleasepool into the second for loop. Then adjust your batch size to fine-tune.
I have made tests with more than 500,000 records on an original iPad 1. The size of the JSON string alone was close to 40 MB. Still, it all works without crashes, and some tuning even leads to acceptable speed. In my tests, I could claim up to approx. 70 MB of memory on an original iPad.

Multiple Thread access the same code in multithread application

I am in the middle of a Windows service multithreading project and need some input to get it running successfully. Below is the code, along with a description of what I am trying to do and the problem.
// I created a new thread and call MyTimerMethod() from the Main method.
private void MyTimerMethod()
{
    timer = new System.Timers.Timer(5000);
    timer.Elapsed += new ElapsedEventHandler(OnElapsedTime);
    timer.Start();
    // make this thread run every time.
    Application.Run();
}

private void OnElapsedTime(object source, ElapsedEventArgs e)
{
    for (int i = 0; i < SomeNum; i++) // SomeNum > 0
        ThreadPool.QueueUserWorkItem(new WaitCallback(MyWorkingMethod), null);
}

private void MyWorkingMethod(object state)
{
    // each thread needs to go and check the status and print if currentStatus = true.
    // if currentStatus = true then that job is ready to print.
    // FYI: ReadStatusFromDB() is from the base class, so I cannot modify it.
    ReadStatusFromDB(); // ReadStatusFromDB() contains the jobs to be printed.
    // after doing some work, a stored procedure updates currentStatus = false.
    // do more stuff.
}
Long story short, the program runs every five seconds and checks whether there is more work to do. If there is, it takes a thread from the thread pool and queues the work. My problem arises when there is more than one thread in the queue: even when currentStatus = false, multiple threads grab the same jobs and try to print them.
let me know if you need further information.
I would suggest creating a BlockingCollection of work items, and structure your program as a producer/consumer application. When a job is submitted (either by the timer tick, or perhaps some other way), it's just added to the collection.
You then have one or more persistent threads that are waiting on the collection with a TryTake. When an item is added to the collection, one of those waiting threads will get and process it.
This structure has several advantages. First, it prevents multiple threads from working on the same item. Second, it limits the number of threads that will be processing items concurrently. Third, the threads do non-busy waits on the collection, meaning they are not consuming CPU resources. The only drawback is that you have multiple persistent threads, but if you're processing items most of the time anyway, the persistent threads aren't a problem at all.
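For illustration only, here is a minimal sketch of that producer/consumer shape written in Java (the question itself is C#; in .NET the equivalent pieces are BlockingCollection with Add and TryTake). The class and method names are made up:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class PrintJobPipeline {
    private final BlockingQueue<Runnable> jobs = new LinkedBlockingQueue<>();

    // Start a fixed number of persistent consumer threads.
    public void start(int workers) {
        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        // take() blocks without burning CPU until a job is available,
                        // and each job is removed from the queue exactly once.
                        Runnable job = jobs.take();
                        job.run();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // shut down this worker
                }
            });
            t.setDaemon(true);
            t.start();
        }
    }

    // Called from the producer side (e.g., the timer tick): just enqueue the work.
    public void submit(Runnable job) {
        jobs.add(job);
    }
}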

Silverlight 3 IncreaseQuotaTo fails if I call AvailableFreeSpace first

The following code throws an exception...
private void EnsureDiskSpace()
{
    using (IsolatedStorageFile file = IsolatedStorageFile.GetUserStoreForSite())
    {
        const long NEEDED = 1024 * 1024 * 100;
        if (file.AvailableFreeSpace < NEEDED)
        {
            if (!file.IncreaseQuotaTo(NEEDED))
            {
                throw new Exception();
            }
        }
    }
}
But this code does not (it displays the silverlight "increase quota" dialog)...
private void EnsureDiskSpace()
{
    using (IsolatedStorageFile file = IsolatedStorageFile.GetUserStoreForSite())
    {
        const long NEEDED = 1024 * 1024 * 100;
        if (file.Quota < NEEDED)
        {
            if (!file.IncreaseQuotaTo(NEEDED))
            {
                throw new Exception();
            }
        }
    }
}
The only difference in the code is that the first one checks file.AvailableFreeSpace and the second checks file.Quota.
Are you not allowed to check the available space before requesting more? It seems like I've seen a few examples on the web that test the available space first. Is this no longer supported in SL3? My application allows users to download files from a server and store them locally. I'd really like to increase the quota by 10% whenever the user runs out of space. Is this possible?
I had the same issue. The solution for me was something written in the help files: the increase of disk quota must be initiated from a user interaction, such as a button click event. I was requesting increased disk quota from an asynchronous WCF call. By moving the space increase request to a button click, the code worked.
In my case, if the WCF detected there was not enough space, the silverlight app informed the user they needed to increase space by clicking a button. When the button was clicked, and the space was increased, I called the WCF service again knowing I now had more space. Not as good a user experience, but it got me past this issue.
There is a subtle bug in your first example.
There may not be enough free space to add your new storage, triggering the request - but the amount you're asking for may be less than the existing quota. This throws the exception and doesn't show the dialog.
The correct line would be
file.IncreaseQuotaTo(file.Quota + NEEDED);
I believe that there were some changes to the behavior in Silverlight 3, but not having worked directly on these features, I'm not completely sure.
I did take a look at this MSDN page on the feature and the recommended approach is definitely the first example you have; they're suggesting:
Get the user store
Check the AvailableFreeSpace property on the store
If needed, call IncreaseQuotaTo
It isn't ideal, since you can't implement your own growth algorithm (grow by 10%, etc.), but you should be able to at least unblock your scenario using the AvailableFreeSpace property, like you say.
I believe that reading the amount of total space available (the Quota) to the user store could in theory be an issue: imagine a "rogue" control or app that simply wants to fill every last byte it can in the isolated storage space, eventually forcing the user to request more space, even when none is available.
It turns out that both code blocks work... unless you set a breakpoint. For that matter, both code blocks fail if you do set a breakpoint.