I have a pretty big model I'm trying to run (30 GB of RAM minimum), but every time I start a new instance I can adjust the CPU RAM but not the GPU's memory. Is there a way on Google's AI notebook service to increase the RAM for a GPU?
Thanks for the help.
In short: you can't. You might consider switching to Colab Pro, which offers better GPUs, for example:
With Colab Pro you get priority access to our fastest GPUs. For example, you may get access to T4 and P100 GPUs at times when non-subscribers get K80s. You also get priority access to TPUs. There are still usage limits in Colab Pro, though, and the types of GPUs and TPUs available in Colab Pro may vary over time.
In the free version of Colab there is very limited access to faster GPUs, and usage limits are much lower than they are in Colab Pro.
That being said, don't count on getting a best-in-class GPU all to yourself for ~10 USD / month. If you need a high-memory dedicated GPU, you will likely have to resort to a dedicated service; you should easily find services offering 24 GB cards for less than 1 USD / hour.
Yes, you can create a customized AI Notebook and edit its hardware after creation. If you are still unable to change these settings, check whether you are hitting the GPU quota limit.
I am new to the RAPIDS AI world and I decided to try cuML and cuDF out for the first time.
I am running Ubuntu 18.04 on WSL 2. My main OS is Windows 11. I have 64 GB of RAM and a laptop RTX 3060 with 6 GB of GPU memory.
At the time I am writing this post, I am running a TSNE fit over a cuDF dataframe composed of approximately 26 thousand values, stored in 7 columns (all the values are numerical or binary, since the categorical ones have been one-hot encoded).
While classifiers like LogisticRegression or SVM were really fast, TSNE seems to be taking a while to produce results (it's been more than an hour now, and it is still running even though the dataframe is not that big). The task manager tells me that 100% of the GPU is being used for the calculations, even though running "nvidia-smi" in Windows PowerShell reports that only 1.94 GB out of a total of 6 GB are currently in use. This seems odd to me, since I have read papers describing RAPIDS AI's TSNE algorithm as being 20x faster than the standard scikit-learn one.
I wonder if there is a way to increase the fraction of dedicated GPU memory to perform faster computations, or if this is just an issue with WSL 2 (perhaps it limits GPU memory usage to about 2 GB).
Any suggestions or thoughts?
Many thanks
The task manager tells me that 100% of the GPU is being used for the calculations
I'm not sure the Windows Task Manager can tell you the GPU throughput actually being achieved for computation.
"nvidia-smi" on the windows powershell, the command returns that only 1.94 GB out of a total of 6 GB are currently in use
Memory utilisation is a different measurement from GPU throughput. Any GPU application will only use as much memory as it requests, and there is no correlation between higher memory usage and higher throughput, unless the application specifically provides a way to achieve higher throughput by using more memory (for example, a different algorithm for the same computation may use more memory).
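If you want to look at compute utilisation and memory usage side by side, one option is to query NVML directly from Python. This is just a minimal sketch, assuming the pynvml package is installed and that device index 0 is the card WSL 2 exposes:

    # Query GPU compute utilisation and memory usage via NVML (assumes `pip install pynvml`).
    import pynvml

    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # GPU 0; adjust if you have several

    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)         # bytes used / total
    util = pynvml.nvmlDeviceGetUtilizationRates(handle)  # percentages over the last sample period

    print(f"memory: {mem.used / 1024**3:.2f} GiB / {mem.total / 1024**3:.2f} GiB")
    print(f"compute utilisation: {util.gpu}%  memory-controller utilisation: {util.memory}%")

    pynvml.nvmlShutdown()

High memory-controller utilisation with modest memory usage is perfectly normal; the two numbers answer different questions.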
TSNE seems to be taking a while to produce results (it's been more than an hour now, and it is still running even though the dataframe is not that big).
This definitely seems odd, and not the expected behavior for a small dataset. What version of cuML are you using, and what is your method argument for the fit task? Could you also open an issue at www.github.com/rapidsai/cuml/issues with a way to access your dataset so the issue can be reproduced?
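To clarify what I mean by the method argument, here is a rough sketch of a cuML TSNE call; the data below is random stand-in data, and only the cuml.manifold.TSNE interface itself comes from the library:

    # Rough cuML TSNE sketch with stand-in data (~26k rows x 7 float columns).
    import numpy as np
    import cudf
    from cuml.manifold import TSNE

    df = cudf.DataFrame(
        {f"col{i}": np.random.rand(26_000).astype("float32") for i in range(7)}
    )

    # 'barnes_hut' and 'fft' are GPU-accelerated methods; this is the argument I'm asking about.
    tsne = TSNE(n_components=2, perplexity=30, method="barnes_hut", random_state=42)
    embedding = tsne.fit_transform(df)
    print(embedding.shape)  # (26000, 2)

Knowing which method (and which cuML version) you are running makes the slowdown much easier to reproduce.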
I need to be able to produce random longs using Java SecureRandom in an application running on Google Kubernetes Engine. The rate may vary based on day and time, perhaps from as low as 1 per minute to as high as 20 per second.
To assist RNG, we would install either haveged or rng-tools in the container.
Will GKE be capable of supporting this scenario with high-quality random distribution of longs without blocking? Which of haveged or rng-tools is more capable for this scenario?
I asked a related question here but didn't get a satisfactory answer:
Allocating datastore id using PRNG
On the Google Cloud Platform (GCP), I have the following specs:
Machine type: n1-standard-8 (8 vCPUs, 30 GB memory)
CPU platform: Intel Haswell
I am using a Jupyter notebook to fit an SVM to large amounts of NLP data. This process is very slow, and according to GCP I am only utilizing around 0.12% of the CPUs.
How do I increase CPU utilization?
As DazWilkin mentioned, you're actually using 12% (12/100), which corresponds to one vCPU out of the eight on your machine. This is because (IIRC) Jupyter runs your code in a single Python process, and Python is effectively single-threaded, so the fit is stuck on one core. You could reduce the number of vCPUs to save yourself some money (the OS will still use multiple cores, of course), but you'll need to evaluate alternatives if you want to use more of them.
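Assuming the SVM is scikit-learn's (the question doesn't say), one common way to put the remaining vCPUs to work is to parallelise model selection across hyper-parameter candidates and cross-validation folds with n_jobs, rather than trying to speed up a single fit. A rough sketch with made-up data and parameters:

    # Hypothetical sketch: parallelise SVM model selection across cores with n_jobs.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import LinearSVC

    X, y = make_classification(n_samples=20_000, n_features=300, random_state=0)

    # Each individual fit still runs on one core, but the 4 C values x 5 CV folds
    # are distributed across all available cores by joblib.
    search = GridSearchCV(
        LinearSVC(dual=False),
        param_grid={"C": [0.01, 0.1, 1, 10]},
        cv=5,
        n_jobs=-1,  # use every vCPU on the n1-standard-8 instance
    )
    search.fit(X, y)
    print(search.best_params_, search.best_score_)

For text data specifically, LinearSVC (or SGDClassifier) also tends to be far faster than the kernel SVC, independent of how many cores you have.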
I recently came across an adapter that would allow me to use laptop memory in my desktop. See the item below:
http://www.amazon.co.uk/Laptop-Desktop-Adapter-Connector-Converter/dp/B009N7XX4Q/ref=sr_1_1?ie=UTF8&qid=1382361582&sr=8-1&keywords=Laptop+to+desktop+memory
Both the desktop and the laptop use DDR3.
My question is: are these adapters reliable?
I have 8 GB available and I was wondering if they could be put to use in my gaming rig.
The desktop is an i7 machine generally used for gaming and some basic development.
The adapter should be reliable, judging by how it looks. There is not much to it: it simply adapts the smaller laptop (SO-DIMM) module to the larger desktop (DIMM) slot. You can draw an analogy with A-to-B USB cables.
What you should also consider is whether both RAM modules use the same frequency, and possible heat issues: you will have to cool the laptop memory more than if it were desktop-sized, because the same current flows through a smaller package than on a desktop DIMM. Then again, the extension board will absorb and disperse some of the heat, so unless you are doing really intensive RAM operations you should be fine. You should still check the working frequency of both: if the laptop module is faster than the maximum your computer supports, you won't get that extra performance and the module will run at the system bus frequency; if it is slower, the system bus will run at the module's frequency.
To check whether it will fit, use standard features of the module as a reference for scale: measure them in the product image and scale against the same feature on your system. The contacts or the locking notches work well for this, since their dimensions are standard across all modules. Or the module length...
I'm working on a desktop application that will produce several in-memory datasets as an intermediary before being committed to a database.
Obviously I'm going to try to keep the size of these to a minimum, but are there any guidelines on thresholds I shouldn't cross for good functionality on an 'average' machine?
Thanks for any help.
There is no "average" machine. There is a wide range of still-in-use computers, including those that run DOS/Win3.1/Win9x and have less than 64MB of installed RAM.
If you don't set any minimum hardware requirements for your application, at least consider the oldest OS you're planning to support, and use the official minimum hardware requirements of that OS to gain a lower-bound assessment.
Generally, if your application is going to consume a considerable amount of RAM, you may want to let the user configure the upper bounds of the application's memory management mechanism.
That said, if you decide to dynamically manage the upper bounds based on realtime data, there are quite a few things you can do.
If you're developing a Windows application, you can use WMI to get the system's total memory and base your limitations on that value (say, use up to 5% of total memory).
In .NET, if your data structures are complex and you find it hard to assess the amount of memory you consume, you can query the Garbage Collector for the amount of allocated memory using GC.GetTotalMemory(false), or use a System.Diagnostics.Process object.
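The above is .NET-specific, but the underlying idea (cap yourself at a fraction of total RAM and keep an eye on your own process's usage) is language-neutral. Purely as an illustration, and not the WMI or GC APIs referred to above, here is a rough Python sketch using psutil:

    # Illustration only (not the .NET APIs above): cap usage at ~5% of total RAM
    # and check how much the current process has actually allocated.
    # Requires `pip install psutil`.
    import psutil

    total_bytes = psutil.virtual_memory().total
    budget_bytes = int(total_bytes * 0.05)           # allow the in-memory datasets ~5% of RAM

    used_bytes = psutil.Process().memory_info().rss  # resident memory of this process

    print(f"total RAM : {total_bytes / 1024**2:,.0f} MiB")
    print(f"budget    : {budget_bytes / 1024**2:,.0f} MiB")
    print(f"in use    : {used_bytes / 1024**2:,.0f} MiB")
    if used_bytes > budget_bytes:
        print("over budget: flush intermediate datasets to the database")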