Linode VPS storage capacity

I recently uploaded my site (a voter registration app) to a Linode VPS 1 GB plan with 98,304 MB of total storage. I am expecting a maximum of 50 million voters (maybe fewer) to register on this site by next year. My concern is: is this storage enough to hold that much data? Every voter is required to fill in minimal information in the form: their complete name, age, address, and profession.

Storing the requested information (name, age, address, profession) as text came to about 100 bytes for my own personal information. Using this as a very rough average, we can calculate the total size needed to store 50 million records:
50,000,000 * 100 B = 5,000,000,000 B
5,000,000,000 B / 1024 / 1024 / 1024 = 4.66 GB
Since you have 96 GB of storage and are expecting to store only the above-mentioned data, that will be more than enough, even if the 4.66 GB estimate is off by a factor of two or more.
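As a quick sanity check, here is that back-of-envelope arithmetic as a small Python sketch (the 100 B average record size is the rough assumption from above):

```python
# Back-of-envelope storage estimate; assumes the ~100 B/record
# average measured above (real records will vary).
AVG_RECORD_BYTES = 100            # name + age + address + profession
VOTERS = 50_000_000
TOTAL_STORAGE_MB = 98_304         # the plan's quoted storage

total_gib = VOTERS * AVG_RECORD_BYTES / 1024**3
storage_gib = TOTAL_STORAGE_MB / 1024

print(f"Estimated data size: {total_gib:.2f} GB")             # ~4.66 GB
print(f"Available storage:   {storage_gib:.0f} GB")           # 96 GB
print(f"Headroom factor:     {storage_gib / total_gib:.1f}x") # ~20x
```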
One great advantage of Linode is that you can adjust the size of your VPS to scale with demand. So if it turns out that you do need more storage capacity after launching the app, you can upgrade seamlessly using Linode's management dashboard. Here are the instructions on Resizing a Linode:
https://www.linode.com/docs/migrate-to-linode/disk-images/resizing-a-linode


Calculating sizes of page table parameters

I am given a system with a 64-bit virtual address space and a page size of 2 KB.
It is also given that the physical memory is 16 GB in size.
I need to calculate the following parameters:
the number of page table entries (the number of lines in the page table), how many bits are needed for the page offset, how many bits are needed for the virtual page number (VPN), and how many bits are needed for the physical page number (PPN).
So, first I concluded that the size of the virtual address space is 2^64 bytes, which means there are 2^53 entries in the page table.
From the size of a page I concluded that 11 bits are needed for the page offset.
From here I'm not so sure.
Since each virtual address is 64 bits, the VPN is 64 - 11 = 53 bits.
Since the physical memory is 2^34 bytes, a physical address is 34 bits, which means the PPN is 34 - 11 = 23 bits.
Are my calculations correct? And is my thinking correct?
Help would be appreciated.
Some of your results are correct: the PPN is 23 bits and the VPN is 53 bits.
But all the stuff concerning the page table itself is wrong.
A page table contains a set of physical page addresses. Since a PPN is 23 bits, you need 4 bytes (the power of two above 23 bits) to describe a physical page. If pages are 2 KB, you can store 2^9 physical page addresses per page.
Since the VPN is 53 bits and each table level can resolve 9 bits, the translation can be done with 6 consecutive levels of tables.
If you are not familiar with multilevel page tables, there are many good tutorials. See for instance https://en.wikipedia.org/wiki/Page_table
What is certain is that the page table does NOT have 2^53 entries! First, because a flat table of 2^53 four-byte entries would be 2^55 bytes, an insane amount of memory (~3.6 * 10^16 B). And second, because the total number of physical pages is only 2^23, so why use a table a billion times larger? (And this is why we use multilevel page tables.)
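Here is a small sketch of that arithmetic, assuming the classic multilevel design described above, where each level resolves as many VPN bits as fit in one page of 4-byte entries:

```python
import math

VA_BITS = 64                    # virtual address width
PAGE_BYTES = 2 * 1024           # 2 KB pages
PHYS_BYTES = 16 * 1024**3       # 16 GB physical memory
PTE_BYTES = 4                   # 23 PPN bits + flags, rounded up to 4 bytes

offset_bits = int(math.log2(PAGE_BYTES))              # 11
vpn_bits = VA_BITS - offset_bits                      # 53
ppn_bits = int(math.log2(PHYS_BYTES)) - offset_bits   # 34 - 11 = 23

bits_per_level = int(math.log2(PAGE_BYTES // PTE_BYTES))  # 2^9 entries -> 9
levels = math.ceil(vpn_bits / bits_per_level)             # ceil(53/9) = 6

print(offset_bits, vpn_bits, ppn_bits, levels)  # 11 53 23 6
```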

Explain Cost of Google Cloud PubSub when used with Cloud Dataflow

The documentation on Pub/Sub pricing is very minimal. Can someone explain the costs for the scenario below?
Size of the data per event = 0.5 KB
Size of data per day = 1 TB
There is only one publisher app and there are two dataflow pipeline subscriptions.
The very rough estimate I can come up with is:
1x publishing
2x subscription (1x for each subscription)
2x acknowledgment (1x for each subscription ack)
The questions are:
Is the total data volume per month 150 TB (30 * 1 TB * 5x)? That is $8,000 per month from the price calculator.
Does the 1 KB minimum size for the calculation apply even to acknowledging a message?
Dataflow handles subscribe/acknowledge in bundles within ParDos, but is each message in a bundle acknowledged separately?
One does not pay for acknowledgements in Google Cloud Pub/Sub, only for publishes, pulls, and pushes. With messages of size 0.5 KB, the amount you'd get charged would depend on the batching, because of the 1 KB minimum request size. If all requests had at least 1 KB, then the total cost for publishing and getting messages to two subscribers would be:
1 TB/day * 30 days * 3 = 92,160 GB/month
10 GB * $0 + 92,150 GB * $0.04/GB = $3,686
If some messages were not batched, then the price could go up because of the 1 KB minimum. The Google Cloud Pub/Sub client library batches published messages by default, so unless your messages are published very sporadically (too infrequently for batching to kick in), each request should reach the 1 KB minimum. With this amount of data, you are probably going to end up with batching on the subscribe side as well.
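Here is that estimate as a quick sketch, assuming every request meets the 1 KB minimum and using the $0.04/GB tier and 10 GB free quota quoted above (check the current price list before relying on these numbers):

```python
# Pub/Sub cost sketch: 1 publish + 2 subscriber deliveries = 3x the data.
# Assumes all requests are batched past the 1 KB minimum.
TB_PER_DAY = 1
DAYS = 30
COPIES = 3                 # 1 publish + 2 subscriptions
FREE_GB = 10
PRICE_PER_GB = 0.04        # $/GB at this tier

total_gb = TB_PER_DAY * 1024 * DAYS * COPIES     # 92,160 GB/month
billable_gb = max(total_gb - FREE_GB, 0)         # 92,150 GB
print(f"{total_gb:,} GB/month -> ${billable_gb * PRICE_PER_GB:,.0f}/month")
# 92,160 GB/month -> $3,686/month
```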

Dimensioning a redis set

We want to use Redis for one of our data stores. We have a hard time "guessing" what the size of that Redis store will be, and we're hoping someone can offer the right help.
This store will be built exclusively of sorted sets. Each set will have a key that is an integer between 1 and 10^10. We currently have about 8M keys, but we expect to reach 30M quickly.
Each set will have a variable number of elements, but the average is 17 elements, with a max of 135 and a min of 0. (Let me know if we need to provide other numbers, like the standard deviation.)
The elements in the sorted sets will be strings. We want them to be the shortest strings possible (5 or 6 chars?) while still avoiding collisions. The scores will be timestamps.
We currently have about 500 writes/sec but expect that to grow roughly 10 times, and we currently have 3,000 reads/sec, which we also expect to grow 10 times.
We will also use the "dump" (RDB snapshot) persistence strategy rather than AOF.
Our goal is to use a single (yet big) Redis master store (and maybe some slave stores). How much RAM should we allocate to our Redis instance?
If you use Redis 2.6, you can benefit from the ziplist memory optimization applied to zsets, because most of your zsets have a small number of items.
To calculate the memory you need, you can simply fill an instance with a small number of keys corresponding to your requirements and extrapolate. For this use case, memory consumption will grow linearly with the number of keys.
I have just tried it on my system: I get 30 MB per 100,000 keys (following your specifications), which extrapolates to 9 GB of memory required for 30M keys. You need to leave some margin and include some space for the copy-on-write (COW) memory spent at save time.
A 12 GB server would probably work if you are careful.
A 16 GB server will be just fine.
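Here is a minimal sketch of that fill-and-extrapolate measurement using the redis-py client; the key range, element count, and member length come from the question, while everything else is illustrative:

```python
import random
import string
import time

import redis  # pip install redis

r = redis.Redis()
r.flushdb()  # measure from an empty DB -- never on a shared instance!

SAMPLE_KEYS = 100_000
AVG_ELEMENTS = 17   # fixed here for simplicity; real sets vary 0-135
MEMBER_LEN = 6

baseline = r.info("memory")["used_memory"]
pipe = r.pipeline(transaction=False)
for i in range(SAMPLE_KEYS):
    key = str(random.randint(1, 10**10))
    for _ in range(AVG_ELEMENTS):
        member = "".join(random.choices(string.ascii_lowercase, k=MEMBER_LEN))
        pipe.zadd(key, {member: time.time()})
    if i % 1000 == 999:
        pipe.execute()
pipe.execute()

per_key = (r.info("memory")["used_memory"] - baseline) / SAMPLE_KEYS
print(f"{per_key:.0f} B/key -> "
      f"{per_key * 30_000_000 / 1024**3:.1f} GB for 30M keys")
```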

Displaying file size: 1000 B = 1 KB or 1024 B = 1 KB?

I am making an iOS app where the size of some files is displayed in MB. My question is whether it is correct to calculate 1000 bytes = 1 KB or 1024 bytes = 1 KB. I have seen that Finder on the Mac calculates with 1000 B, but an iOS file manager called iFile calculates with 1024 B. The Wikipedia article didn't really answer my question. I am asking specifically about file sizes, not HD capacity etc.
My question is whether it is correct to calculate 1000 bytes = 1 KB or 1024 bytes = 1 KB?
Both are correct, and both are used in different situations.
1024 is more common for file sizes, while 1000 is more common for physical disk sizes, but neither is used that way consistently. As you mentioned, some programs use 1000 for file sizes, while for memory cards 1024 is often used rather than 1000.
An example of how inconsistently the units are used is the 1.44 MB floppy disk. It's neither 1.44 * 1000 * 1000 bytes nor 1.44 * 1024 * 1024 bytes, but actually 1.44 * 1000 * 1024 bytes.
An effort was made to introduce the kibibyte unit, which is always 1024 bytes. It never really caught on, but you can see it used sometimes.
A kilobyte was, and sometimes (usually?) still is, 1024 bytes. And a megabyte is 1024 KB, a gigabyte is 1024 MB, and so on. But lately, the decimal-lovers have redefined them to powers of 1000, making a kilobyte 8000 bits instead of a nice power of two. They renamed the old units to "kibibytes" and "mebibytes", or KiB and MiB.
So, if you want to please both crowds[1], you can use KiB and powers of 1024. However, I'd suggest that, if you think it's worth the effort, you make it a setting the user can change, defaulting to binary KB.
[1] This isn't really pleasing both crowds, though. I personally hate seeing KiB. It shouldn't matter. When you need an exact measurement, measure in bytes and don't abbreviate.
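Along the lines of that suggestion, here is a minimal sketch of a formatter with the convention behind a flag (the function name and defaults are illustrative, not from any particular API):

```python
def format_size(num_bytes: int, binary: bool = True) -> str:
    """Format a byte count as KB/MB/GB using 1024 (binary) or 1000 steps."""
    base = 1024 if binary else 1000
    units = ["B", "KB", "MB", "GB", "TB"]
    size = float(num_bytes)
    for unit in units:
        if size < base or unit == units[-1]:
            return f"{size:.2f} {unit}"
        size /= base

# The "1.44 MB" floppy from above, under each convention:
print(format_size(1_474_560, binary=True))   # 1.41 MB (1024-based)
print(format_size(1_474_560, binary=False))  # 1.47 MB (1000-based)
```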
1024 B = 1 KB.
This 1000-byte stuff is metric... ;)
The basic SI units (physics, math, ...):
k = 10^3,
M = 10^6
So 1 km is 1000 m; there is no "1 km = 1024 m".
A lot of programs use the not-quite-SI units, where 1024 KB = 1 MB. A historical bug. :)
Windows uses the usual 1 KB = 1024 B, but if you buy a 1 GB disk, you buy 10^9 B.
The true unit of measurement for 1 KB is 1024 B: http://oxforddictionaries.com/definition/kilobyte?q=kilobyte
However, some manufacturers of software and hardware, in an effort to deceive consumers and make themselves look better, may calculate it as 1000 B. This is actually a pretty recent trend.
Kilo- denotes multiplication by one thousand (not 1024). Modern terminology reflects this fact:
1 kilobyte = 1000 bytes = 8000 bits
1 kibibyte = 1024 bytes = 8192 bits
Previous use of kilo- (with bytes) was based on the approximation that 2^10 (1024) is merely close to 1000.
Imagine being tasked with coming up with a word that means 1000 bytes after some "loose approximation" had already taken the most obvious term you'd want to use. This led to the corrected meanings listed above.
This terminology has been standardized. The following is a quote from page 143 of The International System of Units:
The SI prefixes refer strictly to powers of 10. They should not be
used to indicate powers of 2 (for example, one kilobit represents 1000
bits and not 1024 bits). The names and symbols for prefixes to be used
with powers of 2 are recommended as follows:
kibi Ki 2^10
mebi Mi 2^20
gibi Gi 2^30
tebi Ti 2^40
pebi Pi 2^50
exbi Ei 2^60
zebi Zi 2^70
yobi Yi 2^80
The "bi" in the prefixes above is based on the word "binary". When you append "bit" or "byte" to them, you get the units listed here (where conversions are also provided).

Determine page table size for virtual memory

Consider a virtual memory system with a 38-bit virtual byte address, 1KB pages and 512 MB of physical memory. What is the total size of the page table for each process on this machine, assuming that the valid, protection, dirty and use bits take a total of 4 bits, and that all the virtual pages are in use? (assume that disk addresses are not stored in the page table.)
Well, if the question is simply "what is the size of the page table?" irrespective of whether it will fit into physical memory, the answer can be calculated thus:
First physical memory. There are 512K pages of physical memory (512M / 1K). This requires 19 bits to represent each page. Add that to the 4 bits of accounting information and you get 23 bits.
Now virtual memory. With a 38-bit address space and a 10-bit (1 KB) page size, you need 2^28 entries in your page table.
Therefore 2^28 page table entries at 23 bits each is 6,174,015,488 bits, or 736 MB.
That's the maximum size needed for a single-level VM subsystem for each process.
Now obviously that's not going to work if you only have 512M of physical RAM so you have a couple of options.
You can reduce the number of physical pages. For example, only allow half of the memory to be subject to paging, keeping the other half resident at all time. This will save one bit per entry, not really enough to make a difference.
Increase the page size, if possible. A 1 KB page on a 38-bit address space is the reason for the very chunky page tables. For example, I think the '386, with its 32-bit address space, uses 4 KB pages. That would result in a million page table entries, far fewer than the 268 million required here.
Go multi-level. A bit more advanced but it basically means that the page tables themselves are subject to paging. You have to keep the first level of page tables resident in physical memory but the second level can go in and out as needed. This will greatly reduce the physical requirements but at the cost of speed, since two levels of page faults may occur to get at an actual process page (one for the secondary paging tables then one for the process page).
Let's look a little closer at option 3.
If we allow 32M for the primary paging table and give each entry 4 bytes (32 bits: only 23 are needed but we can round up for efficiency here), this will allow 8,388,608 pages for the secondary page table.
Since each of those secondary page table pages is 1K long (allowing us to store 256 secondary page table entries at 4 bytes each), we can address a total of 2,147,483,648 virtual pages.
This would allow 8 fully-loaded processes (i.e., each using its entire 38-bit address space) to run side by side, assuming you have a fair chunk of disk space to store the non-resident pages.
Now obviously the primary paging table (and the VM subsystem, and probably a fair chunk of the rest of the OS) has to stay resident at all times. You cannot be allowed to page out one of the primary pages since you may well need that page in order to bring it back in :-)
But that's a resident cost of only 32 MB of the 512 MB for the primary paging table, far better than the 736 MB needed (at a minimum, for one fully-loaded process) by the single-level scheme.
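A sketch of that two-level arithmetic (the 32 MB primary-table budget is this answer's chosen figure, not part of the problem statement):

```python
PAGE = 1024                       # 1 KB pages
PTE = 4                           # bytes per entry (23 bits rounded up)
PRIMARY_BUDGET = 32 * 1024**2     # 32 MB reserved for the primary table

secondary_pages = PRIMARY_BUDGET // PTE        # 8,388,608
entries_per_secondary = PAGE // PTE            # 256
mappable = secondary_pages * entries_per_secondary  # 2,147,483,648 pages

pages_per_process = 2**38 // PAGE              # 2^28 for a full 38-bit space
print(mappable // pages_per_process)           # 8 fully-loaded processes
```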
size of the page table = number of page table entries * size of a page table entry
Step 1: Find the number of entries in the page table.
number of page table entries = virtual address space / page size = 2^38 / 2^10 = 2^28
So there are 2^28 entries in the page table.
Step 2: Find the number of frames in physical memory.
number of frames in physical memory = (512 * 1024 * 1024) / (1 * 1024) = 524,288 = 2^19
So we need 19 bits, plus an additional 4 bits for the valid, protection, dirty, and use bits, for a total of 23 bits = 2.875 bytes.
size of the page table = 2^28 * 2.875 = 771,751,936 B = 736 MB
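The same calculation as a quick sketch:

```python
import math

VA_BITS = 38
PAGE = 1024                      # 1 KB
PHYS = 512 * 1024**2             # 512 MB
FLAG_BITS = 4                    # valid + protection + dirty + use

entries = 2**VA_BITS // PAGE                 # 2^28
frame_bits = int(math.log2(PHYS // PAGE))    # 19
pte_bits = frame_bits + FLAG_BITS            # 23

table_bytes = entries * pte_bits / 8         # 771,751,936 B
print(f"{table_bytes / 1024**2:.0f} MB")     # 736 MB
```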
1 KB pages = 2^10 B; 512 MB = 2^29 B => frame number = 29 - 10 = 19 bits.
A virtual address has two parts, page number + offset => page number + flag bits = 38 - 19 = 29 bits.
Those 29 bits include the 4 flag bits (above) => 25 bits for the real page number, and each entry is 10 bits long.
So, page table size: 2^25 * 10 = 320M.
Hope this is correct.
