Does anybody know the storage limits for running Google Colab? I seem to run out of space after uploading a 22 GB zip file and then trying to unzip it, which suggests that less than about 40 GB of storage is available. At least, this is my experience running the TPU instance.
Presently, the amount of local storage in Colab depends on the chosen hardware accelerator runtime type:
# Hardware accelerator none
!df -h .
Filesystem Size Used Avail Use% Mounted on
overlay 49G 22G 26G 46% /
# Hardware accelerator GPU
!df -h .
Filesystem Size Used Avail Use% Mounted on
overlay 359G 23G 318G 7% /
# Hardware accelerator TPU
!df -h .
Filesystem Size Used Avail Use% Mounted on
overlay 49G 22G 26G 46% /
Even if you don't need a GPU, switching to that runtime type will give you roughly 310 GB of extra storage space.
Yes, the Colab notebook local storage is about 40 GiB right now. One way to see the exact value (in Python 3):
import subprocess
p = subprocess.Popen('df -h', shell=True, stdout=subprocess.PIPE)
print(str(p.communicate()[0], 'utf-8'))
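If shelling out to df is not required, the same figures can be read from Python directly. This is a minimal sketch using the standard library's shutil.disk_usage; the helper name report_disk and the choice of "/" as the path are assumptions made for illustration, not part of the original answer:

```python
import shutil

GIB = 1024 ** 3  # bytes per GiB


def report_disk(path="/"):
    """Return (total, used, free) for the filesystem holding `path`, in GiB."""
    usage = shutil.disk_usage(path)
    return usage.total / GIB, usage.used / GIB, usage.free / GIB


total, used, free = report_disk("/")
print(f"total={total:.1f} GiB  used={used:.1f} GiB  free={free:.1f} GiB")
```

Checking the free figure before unzipping a large archive is a cheap way to fail early instead of running out of space halfway through.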
However: for large amounts of data, local storage is a non-optimal way to feed the TPU, which is not connected directly to the machine running the notebook. Instead, consider storing your large dataset in GCP storage, and sourcing that data from the Colab notebook. (Moreover, the amount of Colab local storage may change, and the Colab notebook itself will expire after a few hours, taking local storage with it.)
Take a look at the canonical TPU Colab notebook. At the bottom are some next steps, which include a link to Searching Shakespeare with TPUs. In that notebook is the following code fragment, which demonstrates GCP authentication to your Colab TPU. It looks like this:
import json
import os

import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.contrib)
from google.colab import auth

auth.authenticate_user()

if 'COLAB_TPU_ADDR' in os.environ:
    TF_MASTER = 'grpc://{}'.format(os.environ['COLAB_TPU_ADDR'])

    # Upload credentials to TPU.
    with tf.Session(TF_MASTER) as sess:
        with open('/content/adc.json', 'r') as f:
            auth_info = json.load(f)
        tf.contrib.cloud.configure_gcs(sess, credentials=auth_info)
    # Now credentials are set for all future sessions on this TPU.
else:
    TF_MASTER = ''
I am trying to build a custom Docker image for SageMaker Elastic Inference.
I read the documentation pages https://docs.aws.amazon.com/sagemaker/latest/dg/ei.html#ei-intro-endpoint
and
https://docs.aws.amazon.com/sagemaker/latest/dg/ei-endpoints.html#ei-endpoints-boto3
https://github.com/aws/sagemaker-pytorch-inference-toolkit#building-your-image
I do not see an example showing how to build a customized Docker container.
If you know how to build a customized Docker container for SageMaker Elastic Inference, could you show me how to do it?
The Docker container will be used in a CloudFormation template to build the SageMaker endpoint:
Type: AWS::SageMaker::Model
Properties:
  Containers:
    - ContainerDefinition
  EnableNetworkIsolation: Boolean
  ExecutionRoleArn: String
  InferenceExecutionConfig:
    InferenceExecutionConfig
  ModelName: String
  PrimaryContainer:
    ContainerDefinition
  Tags:
    - Tag
  VpcConfig:
    VpcConfig
Thanks,
There are better alternatives that may be cheaper and more performant than Amazon Elastic Inference, such as the GPU-accelerated ml.g4dn.xlarge and AWS Inferentia instances.
Can you evaluate these options for your workload and consider using them instead of Amazon Elastic Inference?
You can use the Amazon SageMaker Inference Recommender to benchmark and compare the performance of these options for your workload. Here is a sample price-performance comparison of various hardware acceleration options on Amazon SageMaker, based on US-East-1 pricing:
Instance name     On-Demand hourly rate  vCPU  Memory  Storage          Network performance
ml.g4dn.xlarge    $0.736                 4     16 GiB  125 GB NVMe SSD  Up to 25 Gigabit
ml.inf1.xlarge    $0.297                 4     8 GiB   any EBS          Up to 25 Gigabit
ml.eia2.xlarge*   $0.476                 n/a   n/a     n/a              n/a
As for a custom Docker image for EI: is there a reason you need to Bring Your Own Container (BYOC)? Can you not use one of SageMaker's containers?
You need to make sure your DL framework is EI compatible. For example, https://docs.aws.amazon.com/elastic-inference/latest/developerguide/ei-tensorflow.html states that pip install -U ei_for_tf*.whl is required. That means the TensorFlow build is different from the generic one.
If you use DL AMI or DL images by AWS, the package has been pre-installed.
I have a Couchbase instance running as a container. I'm thinking of exporting a bucket from it with cbexport.
The bucket has 3M documents and its size is approximately 2 GB, and I'm not sure whether exporting that much data will be easy on CPU and RAM. Is it too much for a system to work with?
The general scenario is that we have a cluster of servers and we want to set up virtual clusters on top of that using Docker.
For that we have created Dockerfiles for different services (Hadoop, Spark etc.).
Regarding the Hadoop HDFS service, however, the disk space available to the Docker containers equals the disk space available to the server. We want to limit the available disk space on a per-container basis so that we can dynamically spawn an additional datanode with some storage size to contribute to the HDFS filesystem.
We had the idea to use loopback files formatted with ext4 and mount these on directories which we use as volumes in docker containers. However, this implies a large performance loss.
I found another question on SO (Limit disk size and bandwidth of a Docker container), but the answers are almost 1.5 years old, which, given the speed of Docker's development, is ancient.
Which approach or storage backend would allow us to:
- limit storage on a per-container basis,
- get near bare-metal performance, and
- avoid repartitioning the server drives?
You can specify runtime constraints on memory and CPU, but not disk space.
The ability to set constraints on disk space has been requested (issue 12462, issue 3804), but isn't yet implemented, as it depends on the underlying filesystem driver.
This feature is going to be added at some point, but not right away. It's a bit more difficult to add this functionality right now because a lot of chunks of code are moving from one place to another. After this work is done, it should be much easier to implement this functionality.
Please keep in mind that quota support can't be added as a hack to devicemapper, it has to be implemented for as many storage backends as possible, so it has to be implemented in a way which makes it easy to add quota support for other storage backends.
Update August 2016: as shown below and in the issue 3804 comment, PR 24771 and PR 24807 have since been merged. docker run now allows setting storage driver options per container:
$ docker run -it --storage-opt size=120G fedora /bin/bash
The size option sets the container's rootfs size to 120G at creation time.
This option is only available for the devicemapper, btrfs, overlay2, windowsfilter and zfs graph drivers.
Documentation: docker run, section "Set storage driver options per container".
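When containers are spawned from a script, the same option can be assembled programmatically. The following is only a sketch: the helper name docker_run_with_quota is hypothetical, and it merely builds the argument list for the docker run command shown above without executing it.

```python
import shlex


def docker_run_with_quota(image, size, command=("/bin/bash",)):
    """Build the argv for `docker run` with a per-container rootfs size limit."""
    return ["docker", "run", "-it", "--storage-opt", f"size={size}", image, *command]


argv = docker_run_with_quota("fedora", "120G")
print(shlex.join(argv))  # docker run -it --storage-opt size=120G fedora /bin/bash
```

The resulting list can be passed to subprocess.run(argv) on a host whose storage driver supports the size option.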
I see from the LXD storage specs that LVM can be used as a backingstore. I've previously managed to get LVM working with LXC. This was very
pleasing, since it allows quota-style control of disk consumption.
How do I achieve this with LXD?
From what I understand, storage.lvm_vg_name must point to my volume
group. I've set this for a container by creating a profile, and
applying that profile to the container. The entire profile config
looks like this:
name: my-profile-name
config:
  raw.lxc: |
    storage.lvm_vg_name = lxc-volume-group
    lxc.start.auto = 1
    lxc.arch = amd64
    lxc.network.type = veth
    lxc.network.link = lxcbr0
    lxc.network.flags = up
    lxc.network.hwaddr = 00:16:3e:xx:xx:xx
    lxc.cgroup.cpu.shares = 1
    lxc.cgroup.memory.limit_in_bytes = 76895572
  security.privileged: "false"
devices: {}
The volume group should be available and working, according to
pvdisplay on the host box:
--- Physical volume ---
PV Name /dev/sdc5
VG Name lxc-volume-group
PV Size 21.87 GiB / not usable 3.97 MiB
Allocatable yes
PE Size 4.00 MiB
Total PE 5599
Free PE 901
Allocated PE 4698
PV UUID what-ever
However after applying the profile and starting the container, it
appears to be using file backing store:
me@my-box:~# ls /var/lib/lxd/containers/container-name/rootfs/
bin boot dev etc home lib lib64 lost+found media mnt opt
proc root run sbin srv sys tmp usr var
What am I doing wrong?
Note that we also ship a python script with LXD to do the initial VG configuration for you.
As for disk quotas, we have a new specification for it which we'll be implementing shortly and that will let you set disk quotas for any storage attached to a container which supports it.
While we still support LVM, our main focus and preference as far as storage backends go is ZFS nowadays, as it allows such changes to happen live and also works better when moving containers and snapshots across the network.
The new storage quota feature will be supported on zfs, LVM and btrfs, but will only be applied live for zfs and btrfs; LVM will require a container restart.
I'll answer my own question, in case it's of use to others.
According to an authoritative answer on the lxc-users mailing list:
"The storage.lvm_vg_name is not a per-container config setting, it's
for the whole daemon.
You set it using 'lxc config set storage.lvm_vg_name myvolgroup', and
then lxd will use the volume group as storage for every new image and
container that you create afterwards."
As a very rough summary, I used vgcreate to create a volume group, then lvcreate to create a volume within that group. This was followed by lxc config set storage.lvm_vg_name and lxc config set storage.lvm_thinpool_name appropriately.
It appears to work. However LXD feels a little too immature for my tastes at the moment, and I'm going to use plain LXC for now. I look forward to trying LXD again in a few months.
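The rough summary above can be written out as an ordered command list. This is a sketch under stated assumptions, not the exact commands used in the answer: the function name, the device /dev/sdX1, the pool name LXDPool and the 10G size are all hypothetical, and creating the volume as a thin pool (lvcreate --type thin-pool) is an assumption inferred from the storage.lvm_thinpool_name key. The code only prints the commands and does not run them.

```python
import shlex


def lxd_lvm_setup_commands(vg, pool, device, pool_size):
    """Return the commands, in order, for pointing the LXD daemon at an LVM volume group."""
    return [
        ["vgcreate", vg, device],                                              # create the volume group
        ["lvcreate", "--type", "thin-pool", "-L", pool_size, "-n", pool, vg],  # assumed: a thin pool
        ["lxc", "config", "set", "storage.lvm_vg_name", vg],                   # daemon-wide setting
        ["lxc", "config", "set", "storage.lvm_thinpool_name", pool],
    ]


# All values below are placeholders for illustration.
for cmd in lxd_lvm_setup_commands("lxc-volume-group", "LXDPool", "/dev/sdX1", "10G"):
    print(shlex.join(cmd))
```

Because both storage keys are daemon-wide, they only affect images and containers created after they are set.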
I have managed to create an NFS server on my XenServer and mounted it on my CloudStack 4.4!
However, I realise the size of my primary storage and secondary storage is only 4 GB, even though I have assigned 250 GB to my XenServer VM (local storage).
May I know why, and how can I increase the space?
Picture link
http://115.66.5.90/manage/shares/Torrents/why%204gb%20size.png?__c=2533372089363723488
Edit on 6/8/2014:
Hello Miguel, I have followed your steps but am still stuck. (Xen was given 100 GB.)
pvs
PV VG Fmt Attr PSize PFree
/dev/sda3 VG_XenStorage- lvm2 a- 91.99G 91.98G
Then I ran gdisk /dev/sda3, as this 91 GB is the free storage I have after installing Xen on my VM.
I followed all the steps that you have written below.
This is the result when I run pvs again:
[root@xenserver-bpqbdmrk ~]# pvs
PV VG Fmt Attr PSize PFree
/dev/sda2 lvm2 a- 4.00G 4.00G
However, when I ran vgdisplay -c:
[root@xenserver-bpqbdmrk ~]# vgdisplay -c
No volume groups found
fdisk -l
Disk /dev/sda: 107.3 GB, 107374182400 bytes
256 heads, 63 sectors/track, 13003 cylinders
Units = cylinders of 16128 * 512 = 8257536 bytes
Device Boot Start End Blocks Id System
/dev/sda1 * 1 13004 104857599+ ee EFI GPT
[root@xenserver-bpqbdmrk ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 4.0G 1.9G 2.0G 49% /
none 381M 16K 381M 1% /dev/shm
/opt/xensource/packages/iso/XenCenter.iso
52M 52M 0 100% /var/xen/xc-install
172.16.109.11:/export/primary/97cffd9a-acfe-0c71-91d5-b93e58f27462
4.0G 1.9G 2.0G 49% /var/run/sr-mount/97cffd9a-acfe-0c71-91d5-b93e58f27462
May I know why I do not have a volume group even though I have a storage repository of 4 GB on my NFS?
And why does my /dev/sda2 have only 4 GB too?
More information about my test cloud:
I am running a VM of 100 GB.
I wanted a combined primary and secondary storage of 91 GB.
Command (? for help): p
Disk /dev/sda: 209715200 sectors, 100.0 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 7AE0B6EE-99F4-44F4-A9F0-5140B14DCC32
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 209715166
Partitions will be aligned on 2048-sector boundaries
Total free space is 6042 sectors (3.0 MiB)
Number Start (sector) End (sector) Size Code Name
1 2048 8388641 4.0 GiB 0700
2 8390656 16777249 4.0 GiB 0700
3 16779264 209715166 92.0 GiB 8E00
Command (? for help):
When you log on to your XenServer management console you are actually logging on to a VM (the one running in Dom0). This VM is the one that controls the whole hypervisor.
Only some of the resources you provided to your XenServer are used by the management VM in Dom0. The rest is used for the other VMs you might spin-up on the XenServer.
That goes for CPU, memory and disk space.
You need to check whether the XenServer local storage volume group already contains the remaining space of your disk. To do that, type pvs in the terminal to list all LVM physical volumes. The entry you are looking for has a volume group name starting with "VG_XenStorage-".
You should see the disk partition that is attached to that physical device, the total size and the free space.
If the local storage volume group doesn't already contain the extra space, you need to add it yourself, partitioning the space first if it isn't partitioned already. Assuming your disk device is /dev/sda, type gdisk /dev/sda, then at the prompt type p to print the partition table. If there is one more partition than is mounted, you already have a partition available to use. If you have two 4 GB partitions and one larger one (taking the remaining space), the last is the one you want to use. If not, you need to create one at the end of the disk. Still in gdisk:
type n to create a new partition, then choose a number for it (the next available integer),
press Enter twice to make it start at the next available disk block and end at the last,
type 8e00 to select the "Linux LVM" partition type,
type w to write the new partition table.
At this point you've either created a new partition or you had one already available. I'm assuming /dev/sda3. Now you need to create a physical volume and add it to the volume group XenServer uses for local storage.
pvcreate /dev/sda3 to create a new physical volume
vgextend $(vgdisplay -c | cut -d : -f 1) /dev/sda3
The $(vgdisplay ...) bit is to find out the name of the volume group you will attach the physical device to.
If you run pvs again you should see that the local storage volume group now has more space available.
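To make the $(vgdisplay -c | cut -d : -f 1) step less opaque: vgdisplay -c prints one colon-separated line per volume group, and the first field is the group's name. The sketch below does the same extraction in Python. The function name and the sample line are made up for illustration, and only the first field's meaning is relied on.

```python
def first_vg_name(vgdisplay_colon_output):
    """Return the name of the first volume group in `vgdisplay -c` output, or None."""
    for line in vgdisplay_colon_output.splitlines():
        if line.strip():
            # Field 1 is the VG name; strip the leading indentation vgdisplay adds.
            return line.split(":")[0].strip()
    return None


# Hypothetical sample line, shortened; real output has more fields.
sample = "  VG_XenStorage-example:r/w:772:-1:0:0:0\n"
print(first_vg_name(sample))  # VG_XenStorage-example
```

If vgdisplay -c reports "No volume groups found", as in the question's edit, there is no name to extract and vgextend has nothing to extend; this helper returns None for empty output.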
Edit:
As mentioned before XenServer can manage local storage for VMs using a Storage Repository (SR). When this is the case, then there is no need to create a primary storage directory for holding VM's storage.
As for secondary storage, there will still be a need for it. Secondary storage is where CloudStack looks for the templates (disk images) that it uses to boot the System VMs. System VMs are the VMs CloudStack uses for managing the cloud environment (e.g. virtual routers or console proxies). The hypervisors under CloudStack (in this case a XenServer) must be able to reach the secondary storage, and one of the most common ways of achieving this is to make the secondary storage available via NFS. Whether the NFS export is available from the hypervisor itself or some other reachable machine, that doesn't really matter.
Getting back to the setup in the question, the disk of the XenServer would have to be partitioned in such a way that one partition is available for primary storage (managed by XenServer via an SR) and another one for secondary storage (with a file system, mounted locally and made available as an NFS export).