I'm using cadvisor's API to extract data from a docker machine for monitoring purposes. I've noticed that for all containers that I've created there's an aliases array in the data which contains a hash and the short name in a specific order (0 seems to always be the short name and 1 seems to always be the unique hash).
{
name: "/docker/4b29315fca60ce0e8e91889f9c8a4f35b6374fbbfcf6a92a108015106dd4ab77",
aliases: [
"stupefied_albattani",
"4b29315fca60ce0e8e91889f9c8a4f35b6374fbbfcf6a92a108015106dd4ab77"
]
}
Seems is the key word here. Unfortunately the documentation on cAdvisor's API is almost non-existent so I can't look there for a definitive answer. The fact that the data is an array named "aliases" seems to imply that it is possible for there to be aliases other than the hash and the short name created for a container. I also can't be certain that the order will always be 0 = short name, 1 = hash.
Is it safe to assume that aliases[0] is going to always be the short name (provided that aliases array exists), and if not then how can I extract the short name from the data with 100% confidence that I'm getting the correct field?
It is safe, unique hash value always comes at aliases[1] but it does not imply that aliases[0] is always short. As you can see in the following image.
Related
What is MS Graph "subscription id" property max length?
In examples length of id is 36 characters (e.g. "7f105c7d-2dc5-4530-97cd-4e7ae6534c07").
It will be always like this? I can't find info in documentation.
The documentation doesn't explicitly states it is an UUID... though it certainly looks line one, probably will be one, and will most likely always be one. However, imho, unless you really have problems in terms of storage, it is best to reserve a reasonable size and assume this ID is a "opaque string" that you just store, and assume is unique (so you can make some key of it, or build an index on it, if you would be referring to a database as the storage). If there are other reasons why you need to know the side, please clarify...
I have a problem where it would be very helpful if I was able to send a ReadModifyWrite request to BigTable where it only overwrites the value if the new value is bigger/smaller than the existing value. Is this somehow possible?
Note: I thought of a hacky way where I use the timestamp as my actual value, and have the max number of versions 1, so that would keep the "latest" value which is the higher timestamp. But those timestamps would have values from 1 to 10 instead of 1.5bn. Would this work?
I looked into the existing APIs but haven't found anything that would help me do this. It seems like it is available in DynamoDB, so I guess it's reasonable to ask for BigTable to have it as well https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_UpdateItem.html#API_UpdateItem_RequestSyntax
Your timestamp approach could probably be made to work, but would interact poorly with stuff like age-based garbage collection.
I also assume you mean CheckAndMutate as opposed to ReadModifyWrite? The former lets you do conditional overwrites, the latter lets you do unconditional increments/appends. If you actually want an increment that only works if the result will be larger, just make sure you only send positive increments ;)
My suggestion, assuming your client language supports it, would be to use a CheckAndMutateRow request with a value_range_filter. This will require you to use a fixed-width encoding for your values, but that's no different than re-using the timestamp.
Example: if you want to set the value to 000768, but only if that would be an increase, use a value_range_filter from 000000 to 000767, inclusive, and do your write in the true_mutation of the CheckAndMutate.
How can I know the size (in KB) of a particular key in redis?
I'm aware of info memory command, but it gives combined size of Redis instance, not for a single key.
I know this is an old question but, just for the record, Redis implemented a memory usage <key> command since version 4.0.0.
The output is the amount of bytes required to store the key in RAM.
Reference: https://redis.io/commands/memory-usage
You currently (v2.8.23 & v3.0.5) can't.
The serializedlength from DEBUG OBJECT (as suggested by #Kumar) is not indicative of the value's true size in RAM - Redis employs multiple "tricks" to save on RAM on the one hand and on the other hand you also need to account for the data structure's overhead (and perhaps some of Redis' global dictionary as well).
The good news is that there has been talk on the topic in the OSS project and it is likely that in the future memory introspection will be greatly improved.
Note: I started (and stopped for the time being) a series on the topic - here's the 1st part: https://redislabs.com/blog/redis-ram-ramifications-i
DEBUG OBJECT <key> reveals something like the serializedlength of key, which was in fact something I was looking for... For a whole database you need to aggregate all values for KEYS * which shouldn't be too dfficult with a scripting language of your choice... The bad thing is that redis.io doesn't really have a lot of information about DEBUG OBJECT.
Why not try
APPEND {your-key} ""
This will append nothing to the existing value but return the current length.
If you just want to get the length of a key (string): STRLEN
I have some questions regarding the optimal entry size setting for Redis hash sets.
In this example memory-optimization they use 100 hash entries
per key but use hash-max-zipmap-entries 256 ? Why not
hash-max-zipmap-entries 100 or 128?
On the redis website (above link) they used max hash entry size of
100, but in this post instagram, they mention 1000 entries. So
does this mean the optimal setting is a function of the product of
hash-max-zipmap-entries & hash-max-zipmap-value ?(ie in this case
Instagram has smaller hash-values than memory optimization example?)
Your comments/clarifications are much appreciated.
The key is, from here:
manipulating the compact versions of these [ziplist] structures can become slow as they grow longer
and
[as ziplists grow longer] fetching/updating individual fields of a HASH, Redis will have to decode many individual entries, and CPU caches won’t be as effective
So to your questions
This page just shows an example and I doubt the author gave much thought to the exact values. In real life, IF you wanted to take advantage of ziplists, and you knew your number of entries per hash was <100, then setting it at 100, 128 or 256 would make no difference. hash-max-zipmap-entries is only the LIMIT over which you're telling Redis to change the encoding from ziplist to hash.
There may be some truth in your "product of hash-max-zipmap-entries & hash-max-zipmap-value" idea, but I'm speculating. More importantly, first you have to define "optimal" based on what you want to do. If you want to do lots of HSET/HGETs in a large ziplist, it will be slower than if you used a hash. But if you never get/update single fields only ever do HMSET/HGETALL on a key, large ziplists wouldn't slow you down. The Instagram 1000 was THEIR optimal number based on THEIR specific data, use cases, and Redis function call frequencies.
You encouraged me to read both links and it seems that you are asking for "default value for hash table size".
I don't think that it's possible to say that one number is universal for all possibilities. The described mechanism is similar to standard hash mapping. Look at http://en.wikipedia.org/wiki/Hash_table
If you have small size of hash-table, it means that many various hash values point into the same array, where the equals method is used to find out the item.
On the other hand, large hash table means that it allocates large memory along with many empty fields. But this scales well as the algorithm uses O(1) big O notation and there is no equals searching for the item.
In general the size of the table IMHO depends on the overall count of all elements you expect to put into the table and it also depends on the diversity of the key. I mean if every hash start with "0001" not even size=100000 would help you.
I just start studying DHT implementation and theory and stuck on on part, how generates node id when node startup and connect to network. I read that ID is random hash from some hashes range but, is it unique hash? and is hash generates close no the data which this node store? Help me with this.
Self-generation of the node ID using a good hash function over a large space of values is a common technique used in DHT/P2P systems. Since the hash guarantees good random distribution, the probability of a collision is very small. Statistically, the ID will (almost always) be unique.
That hash is independent from the data stored of the node.
import random
import hashlib
def newID():
s = ""
for i in range(20):
s += chr(random.randint(0, 255))
m = hashlib.sha1()
m.update(s)
return m.digest()
As said in the previous answers, the ID of a node is generated by hashing it's IP address (generally speaking, such is the case in a DHT like Chord) or other uniquely identifiable information.
And since it uses Consistent Hashing when a node will join or leave the n-network, only 1/nkeys needs to be remapped, thus it lends itself to highly dynamic network topologies, such as peer-to-peer.
Technically, the hash generated doesn't convey any information about the data that is stored on this node. Rather the hash for a certain key (or entry in a data store, if used for such purpose) originates from hashing the keyword (or the filename or the file contents).
As a direct consequence of the Consistent Hashing, the abstract concept of distance between keys emerges. (As stated here) A node owns all the keys for which its identifying key (ID) is the closest to according to the distance metric.