Rollback snapshot but run out of space

I have a 1 TB zpool and a 700 GB volume with one clean snapshot, such as:
zpool1
zpool1/volume1
zpool1/volume1@snap1
After writing 500 GB of data into the volume, its written property has grown to 500 GB as well.
Then I tried to roll back to the snapshot and got an "out of space" error.
Does ZFS need extra space to roll back a snapshot with a large written value? Or can anyone explain why it fails?

After searching the ZFS source code (dsl_dataset.c), I found that the last part of dsl_dataset_rollback_check() may explain this limit:
/*
 * When we do the clone swap, we will temporarily use more space
 * due to the refreservation (the head will no longer have any
 * unique space, so the entire amount of the refreservation will need
 * to be free). We will immediately destroy the clone, freeing
 * this space, but the freeing happens over many txg's.
 */
unused_refres_delta = (int64_t)MIN(ds->ds_reserved,
    dsl_dataset_phys(ds)->ds_unique_bytes);

if (unused_refres_delta > 0 &&
    unused_refres_delta >
    dsl_dir_space_available(ds->ds_dir, NULL, 0, TRUE)) {
        dsl_dataset_rele(ds, FTAG);
        return (SET_ERROR(ENOSPC));
}
So the volume's avail must be larger than (the unused part of) its refreservation for the rollback to go through. Only a thin (sparse) volume, which carries no refreservation, can reliably pass this check.
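You can compare the relevant properties before attempting the rollback (a sketch using the pool/volume names from the question; the commands and property names are standard ZFS):
zfs get refreservation,available,written zpool1/volume1
zfs rollback zpool1/volume1@snap1
In the scenario above, unused_refres_delta = MIN(700G refreservation, 500G unique) = 500G, while the 1 TB pool has only roughly 300 GB of unreserved space left, so the check returns ENOSPC.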

Rolling back to a snapshot requires a little space (for updating metadata), but this is very small.
From what you’ve described, I would expect nearly anything you write in the same pool / quota group to fail with ENOSPC at this point. If you run zpool list, I bet you’ll see that the pool is almost entirely full; or, if you are using quotas, perhaps you’ve eaten up all of whatever quota group applies. If this is not what you expected, it could be that you’re using mirroring or RAID-Z, which writes duplicate bytes (to allow corruption recovery). You can tell by looking at the physical bytes used (instead of the logical bytes written) in zfs list.
Most of the data you added after the snapshot can be deleted once the rollback has completed, but not before then, so the rollback has to keep that data around until it completes.
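A quick way to see where the space went (standard commands; zpool1 is the pool name from the question):
zpool list                      # overall capacity and allocation
zfs list -o space -r zpool1     # per-dataset USED, USEDSNAP, USEDREFRESERV, AVAIL
The USEDREFRESERV column in particular shows how much of the apparent usage is just the reservation being held for the volume.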

How to check and release the memory occupied by DolphinDB stream engine?

I have already unsubscribed from a table in DolphinDB, but when I executed getStreamEngineStat().AsofJoinEngine, there was still memory occupied by the engine.
Does the function return the real-time memory information on the stream engine? How can I check the current memory status and release the memory?
getStreamEngineStat().AsofJoinEngine does return the real-time memory information.
Undefining the subscription is not equivalent to dropping the engine. In your case, if you want to release the memory occupied by the engine, you need to first call dropStreamEngine and then set the variable returned by createAsofJoinEngine to NULL.
Specific examples are as follows:
Suppose you define the engine as follows:
ajEngine = createAsofJoinEngine(name="aj1", leftTable=trades, rightTable=quotes,
    outputTable=prevailingQuotes, metrics=<[price, bid, ask, abs(price-(bid+ask)/2)]>,
    matchingColumn=`sym, timeColumn=`time, useSystemTime=false, delayedTime=1)
After undefining the subscription, you can correctly release the memory with the following code:
dropStreamEngine("aj1")   // 1. release the specified engine
ajEngine = NULL           // 2. release the engine handle from memory
If the asof-join engine's memory usage grows too large while subscribed, you can also specify the garbageSize parameter to clean up historical data that is no longer needed. The right garbageSize must be set case by case, roughly according to how many records each key receives per hour.
The principle: garbageSize applies per key, i.e., the data within each key group is cleaned up once the threshold is reached. If garbageSize is too small, frequent cleaning of historical data brings unnecessary overhead; if it is too large, the threshold may never be reached and cleaning never triggers, leaving unneeded historical data behind.
Therefore, it is recommended to aim for roughly one cleanup per hour and set garbageSize from that estimate.
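For example, the engine definition above could pass the parameter directly (a sketch; the value 2000 is an assumed estimate of records per key per hour, not a value from the docs):
ajEngine = createAsofJoinEngine(name="aj1", leftTable=trades, rightTable=quotes,
    outputTable=prevailingQuotes, metrics=<[price, bid, ask, abs(price-(bid+ask)/2)]>,
    matchingColumn=`sym, timeColumn=`time, useSystemTime=false, delayedTime=1,
    garbageSize=2000)   // assumed: roughly the records each key receives per hour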

Time required to access the memory locations in the same cache line

Consider the big box in the following figure as a cache and the block as a single cache line inside the cache.
The CPU fetched the data (first 4 elements of the array A) from RAM into the cache block.
Now, my question is: does it take exactly the same time to perform read/write operations on all 4 memory locations (A[0], A[1], A[2] and A[3]) in the cache block, or only approximately the same?
PS: I am expecting an answer for the ideal case, where the runtime of a read/write operation on any memory location is not affected by operating-system jitter on user processes or applications.
With the line already hot in cache, access time is constant for any aligned word in the line. The hardware that handles the offset-within-line part of an address doesn't have to iterate to the right position or anything; it just MUXes those bytes to the output.
If the line was not already hot in cache, then it depends on the design of the cache. If the CPU doesn't transfer whole lines at once over a wide bus, then one or some words of the line will arrive before others. A cache that supports early restart can let the load complete as soon as the needed word arrives.
A critical-word-first bus and memory allow that word to be the first one transferred for a demand miss. Otherwise the words arrive in some fixed order, and a cache miss on the last word of the line can take a few extra cycles.
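If you want to see this yourself, here is a minimal user-space sketch (assuming x86-64 Linux with gcc or clang; a single timed load is noisy, so treat the numbers as illustrative only):
#include <stdio.h>
#include <stdint.h>
#include <x86intrin.h>              /* _mm_clflush, _mm_mfence, _mm_lfence, __rdtscp */

static uint64_t time_load(volatile int *p)
{
    unsigned int aux;
    uint64_t t0 = __rdtscp(&aux);   /* timestamp before the load */
    (void)*p;                       /* the timed load (volatile, not optimized out) */
    _mm_lfence();                   /* wait for the load before reading the TSC again */
    uint64_t t1 = __rdtscp(&aux);
    return t1 - t0;
}

int main(void)
{
    /* 64-byte alignment so A[0..3] (16 bytes) share one cache line */
    _Alignas(64) static volatile int A[16];

    for (int i = 0; i < 4; i++) {
        _mm_clflush((const void *)&A[0]);   /* evict the line from all cache levels */
        _mm_mfence();
        uint64_t miss = time_load(&A[i]);   /* demand miss on word i */
        uint64_t hit  = time_load(&A[i]);   /* line is now hot: should be constant */
        printf("A[%d]: miss ~%llu cycles, hit ~%llu cycles\n",
               i, (unsigned long long)miss, (unsigned long long)hit);
    }
    return 0;
}
On typical hardware the hit times come out the same for all four words, while the miss times may vary slightly with the word's position in the line, matching the early-restart / critical-word-first discussion above.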
Related:
Does cacheline size affect memory access latency?
if cache miss happens, the data will be moved to register directly or first moved to cache then to register?
which is optimal a bigger block cache size or a smaller one?

How to disable core dump in a docker image?

I have a service that uses a Docker image. About half a dozen people use it. However, occasionally the containers produce big core.xxxx dump files. How do I disable core dumps in my Docker images? My base image is Debian 9.
To disable core dumps, set a ulimit value in the /etc/security/limits.conf file, which defines shell-specific resource restrictions.
A hard limit can only be lowered, never raised, by an unprivileged process, while a soft limit can be changed by the user up to the hard limit. If you want to ensure that no process can create a core dump, set them both to zero. Although it may look like a boolean (0 = False, 1 = True), it actually indicates the maximum allowed size.
*       soft    core    0
*       hard    core    0
The asterisk means the rule applies to all users. The second column states whether we set a hard or soft limit, followed by the columns stating the item (core) and the value.
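Note that inside a container there is typically no PAM login session, so limits.conf may never be consulted; a more reliable way is to set the limit on the container itself. A sketch using Docker's --ulimit flag (the image name myservice is hypothetical):
docker run --ulimit core=0 myservice      # per container
dockerd --default-ulimit core=0           # daemon-wide default for all containers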

Talend- Memory issues. Working with big files

Before admins start eating me alive, I would like to say in my defense that I cannot comment on the original posts because I do not have the reputation, so I have to ask about this again.
I have issues running a job in Talend (Open Studio for Big Data!). I have an archive of 3 GB, which I do not consider too much, since my computer has 32 GB of RAM.
While trying to run my job, first I got an error related to heap memory, then it changed to a garbage-collector error, and now it doesn't even give me an error (it just does nothing and then stops).
I found these solutions:
a) Talend performance
#Kailash commented that parallelization is only available if I subscribe to one of the Talend Platform solutions. My comment/question: so is there no other option to parallelize a job with a 3 GB archive?
b) Talend 10 GB input and lookup out of memory error
#54l3d mentioned that one option is to split the lookup file into manageable chunks (maybe 500 MB), then perform the join in many stages, one per chunk. My comment/cry for help/question: how can I do that? I do not know how to split the lookup; can someone explain this to me a little more graphically?
c) How to push a big file data in talend?
just to mention that I also went through (c), but I don't have any comment about it.
The job I am performing (thanks to #iMezouar) looks like this:
1) I have an input (tMysqlInput) coming from a MySQL DB (3 GB)
2) I used tSampleRow to make the process easier (not working)
3) I used tSplitRow to transform the data from many similar columns into only one column.
4) tMysqlOutput
Thanks again for reading me and double thanks for answering.
From what I understand, your query returns a lot of data (3 GB), and that is causing an error in your job. I suggest the following:
1. Filter data on the database side: replace tSampleRow by a WHERE clause in your tMysqlInput component's query, in order to retrieve fewer rows into Talend (see the example below).
2. The MySQL JDBC driver by default retrieves all data into memory, so you need to use the stream option in tMysqlInput's advanced settings in order to stream rows instead.
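For instance, the query in tMysqlInput could be narrowed down like this (the table and column names are hypothetical, just to illustrate pushing the filter down to MySQL):
SELECT id, col_a, col_b
FROM my_table
WHERE created_at >= '2018-01-01'
This way MySQL does the filtering, and only the matching rows ever cross into the Talend JVM.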

Can Intel processors delay TLB invalidations?

This is in reference to Intel's Software Developer’s Manual (Order Number: 325384-039US, May 2011); Section 4.10.4.4, "Delayed Invalidation", describes a potential delay in the invalidation of TLB entries, which can cause unpredictable results when accessing memory whose paging-structure entry has been changed.
The manual says ...
"Required invalidations may be delayed under some circumstances. Software devel-
opers should understand that, between the modification of a paging-structure entry
and execution of the invalidation instruction recommended in Section 4.10.4.2, the
processor may use translations based on either the old value or the new value of the
paging-structure entry. The following items describe some of the potential conse-
quences of delayed invalidation:
If a paging-structure entry is modified to change the R/W flag from 0 to 1, write
accesses to linear addresses whose translation is controlled by this entry may or
may not cause a page-fault exception."
Let us suppose a simple case where a paging-structure entry is modified (the R/W flag is flipped from 0 to 1) for a linear address, and immediately after that the corresponding TLB invalidation instruction is issued. My question is: as a consequence of delayed invalidation of the TLB, is it possible that, even after the TLB invalidation, a write access to the linear address in question doesn't cause a page fault?
Or can "delayed invalidation" only cause unpredictable results as long as the invalidation instruction for the linear address whose paging-structure entry has changed has not yet been issued?
TLBs are, transparently and optimistically, not flushed by CR3 changes. TLB entries are tagged with a unique identifier for their address space and are left in the TLB until they are either touched by the wrong process (in which case the TLB entry is discarded) or the address space is restored (in which case the TLB entry was preserved across the address-space change).
This all happens transparently to software. Your program (or OS) shouldn't be able to tell the difference between this behaviour and the TLB actually being invalidated, except via:
A) Timing, i.e. optimistically keeping TLB entries is faster (which is why CPUs do it).
B) Edge cases where this behaviour is somewhat undefined. If you modify the code page you're currently executing on, or touch memory whose mapping you've just changed, the old value in the TLB might still be used (even across a CR3 change).
The solution is to do one of the following:
1) Force a TLB update via the invlpg instruction. This purges the TLB entry, triggering a TLB fill on the next touch of the page (see the sketch at the end of this answer).
2) Disable and re-enable paging via the CR0 register.
3) Mark the pages as uncacheable via the cache-disable bit in CR0 or on the individual pages (TLB entries marked uncacheable are auto-purged after use).
4) Change the mode of the code segment.
Note that this is genuinely undefined behaviour: transitioning to SMM can invalidate the TLB, or might not, leaving this open to a race condition. Don't depend on this behaviour.
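As a concrete illustration of option 1, kernel code typically issues invlpg right after modifying the paging-structure entry; a minimal sketch (GCC inline assembly, ring 0 only; the pte pointer and the PTE_RW name are hypothetical, standing in for however your kernel reaches the entry):
#include <stdint.h>

#define PTE_RW (1ULL << 1)   /* hypothetical name for the R/W bit (bit 1 of a PTE) */

/* Ring-0 only: flush the cached translation for one linear address. */
static inline void invlpg(void *linear_addr)
{
    __asm__ volatile("invlpg (%0)" : : "r"(linear_addr) : "memory");
}

/* Make a page writable, then invalidate its cached translation so the
 * CPU cannot keep using the stale R/W=0 entry (the delayed-invalidation
 * window discussed in SDM Section 4.10.4.2 closes here). */
void make_page_writable(volatile uint64_t *pte, void *linear_addr)
{
    *pte |= PTE_RW;          /* modify the paging-structure entry */
    invlpg(linear_addr);     /* required invalidation */
}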
