Memory breakdown based on its speed [closed]

In a technical discussion I was asked what to look for when buying a laptop. I was then asked to sort the different types of memory (RAM, etc.) by speed. In other words, the interviewer wanted the memory hierarchy.

Technically speaking, a processor's registers are the fastest memory a computer has. They are also tiny, and their capacity generally isn't quoted when talking about a CPU.
The quickest memory that would actually be advertised is the memory attached directly to the CPU, the cache. Modern processors have three levels - L1, L2, and L3 - where the first level is the fastest but also the smallest (it is expensive to produce and to power). Caches range from several kilobytes to a few megabytes and are typically made from SRAM.
After that there is RAM. Today's computers use DDR3 for main memory. It is much larger and cheaper than cache - you'll find sticks upwards of 1 gigabyte in size - and the most common type is DRAM.
Lastly, storage such as a hard drive or flash drive is also a form of memory, but in general conversation it is grouped separately from the types above: you would ask how much "memory" a computer has (meaning RAM) and how much "storage" it has (meaning hard drive space).
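This ordering can be made visible from software. Below is a minimal sketch (plain host-side C++, so it builds with nvcc or any C++ compiler; the working-set sizes and step count are illustrative assumptions) of a pointer-chasing benchmark: average load latency climbs in steps as the working set outgrows each cache level and finally lands in RAM.

    // Pointer-chasing microbenchmark: dependent loads over growing working sets.
    // Latency per load jumps as the set outgrows L1, then L2, then L3, and
    // finally spills into DRAM.
    #include <chrono>
    #include <cstdio>
    #include <numeric>
    #include <random>
    #include <vector>

    int main() {
        std::mt19937_64 rng(42);
        // Working sets from 16 KiB (fits in L1) up to 64 MiB (DRAM-bound).
        for (size_t bytes = 16 << 10; bytes <= 64 << 20; bytes <<= 2) {
            const size_t n = bytes / sizeof(size_t);
            std::vector<size_t> next(n);
            std::iota(next.begin(), next.end(), size_t{0});
            // Sattolo's algorithm: a random single-cycle permutation, so the
            // chase visits every slot and the prefetcher cannot guess the path.
            for (size_t i = n - 1; i > 0; --i) {
                std::uniform_int_distribution<size_t> pick(0, i - 1);
                std::swap(next[i], next[pick(rng)]);
            }
            size_t idx = 0;
            const size_t steps = 10000000;
            auto t0 = std::chrono::steady_clock::now();
            for (size_t i = 0; i < steps; ++i) idx = next[idx];  // each load depends on the previous one
            auto t1 = std::chrono::steady_clock::now();
            double ns = std::chrono::duration<double, std::nano>(t1 - t0).count() / steps;
            printf("%8zu KiB: %6.2f ns per load (ignore: %zu)\n", bytes >> 10, ns, idx);
        }
        return 0;
    }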

Related

Can a VM with more VRAM than RAM do machine learning efficiently? [closed]

I'm working on a virtual machine that has been given 12 GB of RAM and has a Quadro RTX 6000 with 24 GB of VRAM. I'm trying to do machine learning on this virtual machine.
My intuition suggests that, with this limited amount of RAM, it is not using the video card as efficiently as it could, and that it would work better with more. Is this correct?
The following suggests so, but is not very clear about it:
OpenCL - what happens if GPU memory is larger than system RAM
In short, how much RAM should I typically expect to need for machine learning and computer vision with this video card?
It very much depends on the software you are using. In some cases, GPU software can use significantly more VRAM than RAM, when the model runs only on the GPU and there is no need to have a copy of it in RAM.
As an example - CFD rather than ML - the FluidX3D software uses between 3.2x and 5.4x more VRAM than RAM. In that regime, your 24 GB of VRAM would still be the limiting factor.
If your software allocates RAM and VRAM at 1:1, then you're limited by the 12 GB of RAM. In the end, you have to test your software and check the allocation ratio with tools like top/htop and nvidia-smi.
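To make the GPU-only case concrete, here is a minimal CUDA sketch (the 4 GiB size is an arbitrary assumption) of an allocation that exists only in VRAM: it counts against the 24 GB but never touches the 12 GB of system RAM.

    // A buffer that lives only in VRAM. cudaMalloc reserves device memory
    // directly; no host (RAM) copy is ever created, so the allocation shows
    // up in nvidia-smi but not in top/htop.
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void fill(float *p, size_t n) {
        size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
        if (i < n) p[i] = 1.0f;  // initialized on the GPU, never staged through RAM
    }

    int main() {
        const size_t n = 1ull << 30;  // 2^30 floats = 4 GiB
        float *d = nullptr;
        if (cudaMalloc(&d, n * sizeof(float)) != cudaSuccess) {
            fprintf(stderr, "cudaMalloc failed\n");
            return 1;
        }
        fill<<<(unsigned)((n + 255) / 256), 256>>>(d, n);
        cudaDeviceSynchronize();
        // While this runs, nvidia-smi reports ~4 GiB in use; host RSS stays tiny.
        cudaFree(d);
        return 0;
    }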

CUDA: why are we using so many kinds of memories? [closed]

I've been learning CUDA programming and have run into a question. Why does CUDA use so many kinds of memory (global, local, shared, constant, texture, caches, registers), unlike the CPU, where we deal with only a few main kinds (RAM, caches, disk, etc.)?
The main reasons for having multiple kinds of memory are explained in this article: Wikipedia: Memory Hierarchy
To summarize it in a very simplified form:
It is usually the case that the larger a memory is, the slower it is.
Memory can be read and written faster when it is "closer" to the processor.
As mentioned in the comment: On the CPU, you also have several layers of memory: The main memory, and several levels of caches. These caches are much smaller than main memory, but much faster. These caches are managed by the hardware, so as a software developer, you do not directly notice that these caches exist at all. All the data seems to be in the main memory.
On the GPU, you have to manage this memory manually (although in newer CUDA versions you can also configure the shared memory to act as a cache and let CUDA take care of the data management).
For example, reading some data from the shared memory in CUDA may be done within a few NANOseconds. Reading data from global memory may take a few MICROseconds. One of the keys to high performance in CUDA is thus data locality: You should try to keep the data that you are working on in local or shared memory, and avoid reading/writing data in global memory.
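As a minimal sketch of that advice (an illustrative 1D box blur; the kernel name and tile size are assumptions, not from the question), each block below stages its inputs into shared memory once, so the repeated reads are on-chip:

    // Each global element is loaded into fast on-chip shared memory once per
    // block; the three reads per output then hit shared memory (nanoseconds)
    // instead of global memory (up to microseconds).
    #include <cuda_runtime.h>

    #define TILE 256  // launch as: blur<<<(n + TILE - 1) / TILE, TILE>>>(in, out, n);

    __global__ void blur(const float *in, float *out, int n) {
        __shared__ float tile[TILE + 2];                // tile plus one halo cell per side
        int g = blockIdx.x * blockDim.x + threadIdx.x;  // global index
        int l = threadIdx.x + 1;                        // local index (slot 0 is the left halo)

        tile[l] = (g < n) ? in[g] : 0.0f;               // one global load per thread
        if (threadIdx.x == 0)
            tile[0] = (g > 0) ? in[g - 1] : 0.0f;       // left halo
        if (threadIdx.x == blockDim.x - 1)
            tile[TILE + 1] = (g + 1 < n) ? in[g + 1] : 0.0f;  // right halo
        __syncthreads();                                // whole tile is now on-chip

        if (g < n)                                      // all three reads hit shared memory
            out[g] = (tile[l - 1] + tile[l] + tile[l + 1]) / 3.0f;
    }

Without the staging step, each input element would be fetched from global memory up to three times per block.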
(P.S.: The "Close" votes that mark this question as "Primarily Opinion Based" are somewhat ridiculous. The question may show a lack of own research, but is a reasonable question that can clearly be answered here)

Tools for checking memory fragmentation [closed]

I have recently read some questions about memory fragmentation:
How to solve Memory Fragmentation and What is memory fragmentation?
I want to see a memory allocation map like the one the author shows in this article: http://pavlovdotnet.wordpress.com/2007/11/10/memory-fragmentation/
Could you recommend some tools to get a memory allocation map like that, so I can see whether the memory is fragmented and what the biggest available free block is?
I'm on Windows so I would prefer tools working on this system.
Here is a tool that visualizes GC memory and heap usage; the source code is provided as well. Another similar app is linked in the comments there.
If you need to be able to profile memory usage for a .NET solution, you could check out ANTS Memory Profiler, it can run alongside a project in Visual Studio and keep tabs on how processes and objects are using memory.
There is an indirect solution to the problem. I have been developing a server application for a few years. Initially we did allocation on demand, and as a result, after running for a few weeks, the server's performance degraded. As a workaround we followed this approach (sketched in code below):
Suppose you have user-defined classes X, Y, Z, ... that you need to allocate from the heap at runtime. Allocate n objects of X at startup and put them all in a free-pool list. On demand, take an object from the free pool and hand it to your application; while it is in use, keep it in a busy-pool list.
When the application wants to release it, put it back in the free-pool list. Follow the same strategy for Y, Z, etc.
Since you allocate all the needed objects at startup and never release them back to the OS memory manager until your program exits, you will not face the performance degradation caused by memory fragmentation.
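A minimal C++ sketch of that free/busy pool (the class name X and the pool size are illustrative):

    // All N objects are allocated once at startup; acquire/release just move
    // pointers between two lists, so the heap never fragments from X traffic.
    #include <cassert>
    #include <list>
    #include <vector>

    struct X { int payload = 0; };  // stand-in for application data

    class XPool {
    public:
        explicit XPool(size_t n) : storage_(n) {  // one up-front allocation
            for (auto &obj : storage_) free_.push_back(&obj);
        }
        X *acquire() {                      // free pool -> busy pool
            if (free_.empty()) return nullptr;  // pool exhausted
            X *obj = free_.front();
            free_.pop_front();
            busy_.push_back(obj);
            return obj;
        }
        void release(X *obj) {              // busy pool -> free pool
            busy_.remove(obj);              // O(n); fine for a sketch
            obj->payload = 0;               // reset for reuse
            free_.push_back(obj);
        }
    private:
        std::vector<X> storage_;  // owns the objects for the program's lifetime
        std::list<X *> free_;
        std::list<X *> busy_;
    };

    int main() {
        XPool pool(1000);          // n objects of X at startup
        X *x = pool.acquire();
        assert(x != nullptr);
        x->payload = 42;           // use the object
        pool.release(x);           // back to the free pool, not to the OS
        return 0;
    }

A production pool would use an intrusive list or a free-list index instead of std::list::remove, but the point stands: after startup there is no further heap traffic for these objects.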

How to use graphics memory as RAM? [closed]

Since graphics cards provide large amounts of RAM (0.5 GiB to 2 GiB), and API access to the GPU is not that difficult with CUDA, Stream, and the more portable OpenCL, I wondered whether it is possible to use graphics memory as RAM. Graphics RAM may have higher latency (from the CPU) than real RAM, but it is definitely faster than an HDD, so it could be ideal for caching.
Is it possible to access graphics memory directly, or at least through a thin memory-management layer within my own applications (rather than leaving it freely usable by the OS)? If so, what is the preferred way to do this?
Yes, you can use it as swap memory on Linux. Refer to the link here for more details.
With Linux, it's possible to use it as swap space, or even as a RAM disk.
Be warned: it's nice to have fast swap or a RAM disk on your home computer, but if a binary driver is loaded for X, it may freeze the whole system or create graphical glitches. Usually there is no way to tell the driver how much memory may be used, so it won't know the upper limit. However, the VESA driver can be used, because it provides the possibility to set the video RAM size.
So: direct rendering or fast swap. Your choice.
Unlike motherboard RAM and hard drives, there aren't any known video cards that have ECC memory. This may not be a big deal for graphics rendering, but you definitely don't want to put critical data in it or use this feature on servers.
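The swap route above works at the OS level. For the "thin memory-management layer within my own applications" that the question mentions, a hedged sketch using CUDA (the class and the block-cache design are illustrative, not an existing library) could treat VRAM as a block store:

    // VRAM as an application-managed block cache: host code stores/loads
    // fixed-size blocks into a device buffer with cudaMemcpy. Slower than
    // RAM, faster than disk. Error handling in the constructor is omitted.
    #include <cstdio>
    #include <cstring>
    #include <vector>
    #include <cuda_runtime.h>

    class VramCache {
    public:
        VramCache(size_t blockSize, size_t numBlocks)
            : blockSize_(blockSize), numBlocks_(numBlocks) {
            cudaMalloc(&base_, blockSize * numBlocks);
        }
        ~VramCache() { cudaFree(base_); }
        bool store(size_t slot, const void *src) {  // RAM -> VRAM
            if (slot >= numBlocks_) return false;
            return cudaMemcpy(base_ + slot * blockSize_, src, blockSize_,
                              cudaMemcpyHostToDevice) == cudaSuccess;
        }
        bool load(size_t slot, void *dst) {         // VRAM -> RAM
            if (slot >= numBlocks_) return false;
            return cudaMemcpy(dst, base_ + slot * blockSize_, blockSize_,
                              cudaMemcpyDeviceToHost) == cudaSuccess;
        }
    private:
        char *base_ = nullptr;
        size_t blockSize_, numBlocks_;
    };

    int main() {
        VramCache cache(1 << 20, 256);       // 256 x 1 MiB = 256 MiB of VRAM
        std::vector<char> page(1 << 20, 0);
        std::strcpy(page.data(), "parked in video memory");
        cache.store(7, page.data());         // evict a page to VRAM
        std::vector<char> back(1 << 20);
        cache.load(7, back.data());          // fault it back in
        printf("%s\n", back.data());
        return 0;
    }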

Is there a way that I can use 100% of my network bandwidth with only one connection? [closed]

I have a program that reads about a million rows and groups them. The client computer is not stressed at all: no more than 5% CPU usage, and the network card is used at about 10% or less.
If I run four copies of the program on the same client machine, usage grows at the same rate: with the four programs running I get about 20% CPU usage and about 40% network usage. That makes me think I could improve performance by using threads to read the information from the database. But I don't want to introduce that complexity if a configuration change could achieve the same result.
Client: Windows 7, CSDK 3.50.TC7
Server: AIX 5.3, IBM Informix Dynamic Server Version 11.50.FC3
There are a few tweaks you can try, most notably setting the fetch buffer size. The environment variable FET_BUF_SIZE can be set to a value such as 32767. This may help you get closer to saturating the client and the network.
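If you'd rather not rely on the shell environment, a hedged sketch of setting it from inside the client process before the connection is opened (the Informix connect call itself is omitted):

    // Raising the Informix fetch buffer from inside the client process.
    // setenv() is POSIX; on Windows (the asker's client) use _putenv_s().
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
    #ifdef _WIN32
        _putenv_s("FET_BUF_SIZE", "32767");  // larger fetch buffer per round trip
    #else
        setenv("FET_BUF_SIZE", "32767", 1);
    #endif
        printf("FET_BUF_SIZE=%s\n", getenv("FET_BUF_SIZE"));
        /* ... open the Informix connection here; the variable needs to be
           set before the connection is established to take effect ... */
        return 0;
    }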
Multiple threads sharing a single connection will not help. Multiple threads using multiple connections might help - they'd each be running a separate query, of course.
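If you try that, the shape is roughly the following; Connection, open_connection, fetch_rows, and close_connection are hypothetical stand-ins for the real CSDK calls, and the MOD-based sharding is just one way to split the rows:

    // Each thread opens its own connection and fetches a disjoint slice of
    // the rows. The stubs below exist only so the sketch compiles and runs.
    #include <cstdio>
    #include <thread>
    #include <vector>

    struct Connection { int id; };                              // placeholder handle

    Connection *open_connection() { return new Connection{}; }  // stub
    void close_connection(Connection *c) { delete c; }          // stub
    void fetch_rows(Connection *, int shard, int nshards) {     // stub
        // Real code would run e.g. "... WHERE MOD(pk, nshards) = shard"
        printf("shard %d of %d fetched\n", shard, nshards);
    }

    int main() {
        const int nshards = 4;  // mirrors the four-copies experiment
        std::vector<std::thread> workers;
        for (int s = 0; s < nshards; ++s)
            workers.emplace_back([s] {
                Connection *c = open_connection();  // one connection per thread
                fetch_rows(c, s, nshards);          // disjoint slice of the rows
                close_connection(c);
            });
        for (auto &w : workers) w.join();
        return 0;
    }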
If the client program is grouping the rows, we have to ask "why?". It is generally best to leave the server (DBMS) to do that. That said, if the server is compute bound and the client PC is wallowing in idle cycles, it may make sense to do the grunt work on the client instead of the server. Just make sure you minimize the data to be relayed over the network.
