I'm trying to get a better idea of how my SQL 2000 instance is using its memory. I've run DBCC MEMORYSTATUS and I'm hoping someone can help me interpret the output.
My main concern is the 'Other' section of the buffer distribution. It is currently using by far the most pages, at about 166,000. Considering that SQL Server has only about 2 GB of available RAM, the fact that most of that is being used by 'Other' worries me.
Below is the full output.
I appreciate any help you can offer.
Buffer Distribution    Buffers
Stolen                   30595
Free                       966
Procedures                 208
Inram                        0
Dirty                     8424
Kept                         0
I/O                        137
Latched                    437
Other                   166065
It's your buffer pool, a.k.a. the data cache. From MS KB 271624:
Other. These are committed pages that do not meet any of the criteria mentioned earlier. Typically, the majority of buffers that meet this criteria are hashed data and index pages in the buffer cache.
This looks good: you have roughly 1,300 MB of cached data and index pages, which means your queries are being served from RAM rather than disk.
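For reference, SQL Server buffers are 8 KB pages, so the 'Other' count works out to 166,065 x 8 KB, which is about 1,297 MB; that is where the 1,300 MB figure comes from.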
How do I figure out the right values for the memory parameters in TimesTen? How much memory do I need based on my tables and data?
A TimesTen database consists of two shared memory segments; one is small and is used exclusively by PL/SQL while the other is the main database segment which contains your data (tables, indexes etc.), temporary working space, the transaction log buffer and some space used by the system.
Attributes in the DSN definition set the size for these areas as follows:
PLSQL_MEMORY_SIZE - sets the size of the PL/SQL segment (default is 128 MB). If you do not plan to ever use PL/SQL then you can reduce this to 32 MB. If you plan to make very heavy use of PL/SQL then you may need to increase this value.
LogBufMB - sets the size of the transaction log buffer. The default is 64 MB but this is too small for most production databases. A read-mostly workload may be able to get by with a value of 256 MB but workloads involving a lot of database writes will typically need 1024 MB and in extreme cases maybe as much as 16384 MB. When setting this value you should also take into account the setting (or default) for the LogBufParallelism attribute.
PermSize - sets the size for the permanent (persistent) database storage. This needs to be large enough to hold all of your table data, indexes, system metadata etc. and usually some allowance for growth, contingency etc.
TempSize - sets the value for the temporary memory region. This region is used for database locks, materialised tables, temporary indexes, sorting etc. and is not persisted to disk.
The total size of the main database shared memory segment is given by PermSize + TempSize + LogBufMB + SystemOverhead. The value for SystemOverhead varies from release to release but if you allow 64 MB then this is generally sufficient.
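For example (purely illustrative numbers, not recommendations): with PermSize = 4096 MB, TempSize = 1024 MB and LogBufMB = 256 MB, the main database segment would need roughly 4096 + 1024 + 256 + 64 = 5440 MB of shared memory. The PL/SQL segment (PLSQL_MEMORY_SIZE, default 128 MB) is allocated on top of that.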
Documentation on database attributes can be found here: https://docs.oracle.com/database/timesten-18.1/TTREF/attribute.htm#TTREF114
You can estimate the memory needed for your tables and associated indexes using the TimesTen ttSize utility https://docs.oracle.com/database/timesten-18.1/TTREF/util.htm#TTREF369
I'm collecting data on an ARM Cortex M4 based evaluation kit in a remote location and would like to log the data to persistent memory for access later.
I would be logging roughly 300 bytes once every hour, and would want to come collect all the data with a PC after roughly 1 week of running.
I understand that I should attempt to minimize the number of writes to flash, but I don't have a great understanding of the best way to do this. I'm looking for a resource that would explain memory management techniques for this kind of situation.
I'm using the ADUCM350, which looks like it has 3 separate flash sections (128 kB, 256 kB, and a 16 kB EEPROM).
For logging applications the simplest and most effective wear leveling tactic is to treat the entire flash array as a giant ring buffer.
Define an entry size that is an integer fraction of the smallest erasable flash unit. Say a sector is 4 KB (4,096 bytes); let the entry size be 256 bytes.
This keeps all log entries sector-aligned and allows you to erase any sector without cutting a log entry in half.
At boot, walk the memory and find the first empty entry; this is the write_pointer.
When a log entry is written, simply write it at write_pointer and increment write_pointer.
If write_pointer lands on a sector boundary, erase the sector at write_pointer to make room for the next writes. Essentially, this guarantees that there is always at least one empty log entry for you to find at boot, which is what allows you to restore the write_pointer (a sketch of this scheme follows below).
If you dedicate 128 KB to the log entries and have an endurance of 20,000 write/erase cycles, the array holds 128 KB / 256 B = 512 entries per pass, so you get 512 x 20,000 = 10,240,000 entries written before wear-out, or roughly 1,168 years of continuous logging at one entry per hour...
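A minimal sketch of that scheme, using the 4 KB sector / 256-byte entry numbers from the example above. The flash_read, flash_write and flash_erase_sector calls are placeholders for whatever your part's flash driver actually provides; an entry is considered empty while it still reads as erased flash (all 0xFF).

#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE   4096u                    /* smallest erasable unit */
#define ENTRY_SIZE     256u                    /* integer fraction of a sector */
#define LOG_BASE      0x00000000u              /* start of the region dedicated to the log */
#define LOG_SIZE      (128u * 1024u)           /* 128 KB of log entries */
#define NUM_ENTRIES   (LOG_SIZE / ENTRY_SIZE)

/* Placeholder driver hooks: replace with the real flash API for your part. */
extern void flash_read(uint32_t addr, void *buf, uint32_t len);
extern void flash_write(uint32_t addr, const void *buf, uint32_t len);
extern void flash_erase_sector(uint32_t addr);

static uint32_t write_index;                   /* entry index of the next free slot */

/* An entry is "empty" if it still reads as erased flash (all 0xFF). */
static int entry_is_empty(uint32_t idx)
{
    uint8_t buf[ENTRY_SIZE];
    flash_read(LOG_BASE + idx * ENTRY_SIZE, buf, ENTRY_SIZE);
    for (uint32_t i = 0; i < ENTRY_SIZE; i++)
        if (buf[i] != 0xFF)
            return 0;
    return 1;
}

/* At boot, walk the array and find the first empty entry: that is the write pointer. */
void log_init(void)
{
    write_index = 0;
    for (uint32_t i = 0; i < NUM_ENTRIES; i++) {
        if (entry_is_empty(i)) {
            write_index = i;
            return;
        }
    }
}

/* Append one entry; when the pointer lands on a sector boundary, erase that
   sector so there is always at least one empty entry to find at the next boot. */
void log_append(const void *data, uint32_t len)
{
    uint8_t entry[ENTRY_SIZE];
    memset(entry, 0xFF, sizeof entry);
    memcpy(entry, data, len < ENTRY_SIZE ? len : ENTRY_SIZE);

    flash_write(LOG_BASE + write_index * ENTRY_SIZE, entry, ENTRY_SIZE);
    write_index = (write_index + 1) % NUM_ENTRIES;

    if ((write_index * ENTRY_SIZE) % SECTOR_SIZE == 0)
        flash_erase_sector(LOG_BASE + write_index * ENTRY_SIZE);
}

In practice you would probably also put a small header (length, sequence number, CRC) at the start of each entry so the PC-side tool can tell valid entries from erased or corrupted ones. Note that a 256-byte entry is smaller than the ~300 bytes per hour mentioned in the question, so you may want a larger entry size (e.g. 512 bytes, still an integer fraction of a sector).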
I am using Delphi 7 Enterprise under Windows 7 64 bit.
My computer has 16 GB of RAM.
I'm trying to use kbmMemTable 7.70.00 Professional Edition (http://news.components4developers.com/products_kbmMemTable.html).
My table has 150,000 records, but when I try to copy the data from the Dataset to the kbmMemTable it only copies 29,000 records and then I get this error: EOutOfMemory
I saw this message:
https://groups.yahoo.com/neo/groups/memtable/conversations/topics/5769,
but it didn't solve my problem.
An out-of-memory error can happen for various reasons:
Your application uses too much memory in general. A 32-bit application typically runs out of memory when it has allocated around 1.4 GB using the FastMM memory manager. Other memory managers may have better or worse limits.
Memory fragmentation. There may not be enough contiguous space in memory for a single large allocation that is requested. kbmMemTable will attempt to allocate roughly 200,000 x 4 bytes (about 800 KB) as its own largest single allocation, which shouldn't be a problem.
Too many small allocations leading to the above memory fragmentation. kbmMemTable will allocate from 1 to n blocks of memory per record depending on the setting of the Performance property.
If Performance is set to fast, then 1 block will be allocated (unless blob fields exist, in which case an additional allocation will be made per non-null blob field).
If Performance is balanced or small, then each string field will allocate another block of memory per record.
best regards
Kim/C4D
I'm trying to read data that is generated by another application and stored in a Microsoft Office Access .MDB file. The number of records in some particular tables can vary from a few thousand up to over 10 million, depending on the size of the model (in the other application). Opening the whole table in one query can cause an Out Of Memory exception with large files. So I split the table on some criteria and read each part in a different query. But the problem is that middle-sized files could be read significantly faster in one single query with no exceptions.
So, am I on the right track? Can I solve the OutOfMemory problem in another way? Is it OK to choose one of the mentioned strategies (1 query or N queries) based on the number of records?
By the way, I'm using Delphi XE5 and Delphi's standard ADO components. I need the whole data of the table, and no joins to other tables are needed. I'm creating the ADO components in code and they are not connected to any visual controls.
Edit:
Well, it seems that my question is not clear enough. Here are some more details, which are actually answers to questions or suggestions posed in comments:
This .mdb file is not holding a real database; it’s just structured data, so no writing new data, no transactions, no user interactions, no server, nothing. A third-party application uses Access files to export its calculation results. The total size of these files is usually about a few hundred MBs, but they can grow up to 2 GBs. Now I need to load this data into a Delphi data structure before starting my own calculations since there’s no place for waiting for I/O during these calculations.
I can't compile this project for x64; it's extremely dependent on some old DLLs that share the same memory manager with the main executable, and their authors will never release an x64 version. The company hasn't yet decided to replace them, and that won't change in the near future.
And, you know, support guys just prefer to tell us “fix this” rather than asking two thousand customers to “buy more memory”. So I have to be really stingy about memory usage.
Now my question is: does TADODataSet provide any better memory management for fetching such an amount of data? Is there any property that prevents the DataSet from fetching all the data at once?
When I call ADOTable1.Open it starts to allocate memory and waits until the entire table is fetched, just as expected. But reading all those records in a for loop takes a while, and there's no need to have all that data in memory at once; on the other hand, there's no need to keep a record in memory after reading it, since there's no seeking back through rows. That's why I split the table with some queries. Now I want to know whether TADODataSet can handle this, or whether what I'm doing is the only solution.
I did some trial and error and improved the performance of reading the data, in both memory usage and elapsed time. My test case is a table with more than 5,000,000 records. Each record has 3 string fields and 8 doubles. No index, no primary key. I used the GetProcessMemoryInfo API to measure memory usage.
Initial State
Table.Open: 33.0 s | 1,254,584 kB
Scrolling : +INF s | I don't know. But allocated memory doesn't increase in Task Manager.
Sum : - | -
DataSet.DisableControls;
Table.Open: 33.0 s | 1,254,584 kB
Scrolling : 13.7 s | 0 kB
Sum : 46.7 s | 1,254,584 kB
DataSet.CursorLocation := clUseServer;
Table.Open: 0.0 s | -136 kB
Scrolling : 19.4 s | 56 kB
Sum : 19.4 s | -80 kB
DataSet.LockType := ltReadOnly;
Table.Open: 0.0 s | -144 kB
Scrolling : 18.4 s | 0 kB
Sum : 18.5 s | -144 kB
DataSet.CacheSize := 100;
Table.Open: 0.0 s | 432 kB
Scrolling : 11.4 s | 0 kB
Sum : 11.5 s | 432 kB
I also checked Connection.CursorLocation, Connection.IsolationLevel, Connection.Mode, DataSet.CursorType and DataSet.BlockReadSize, but they made no appreciable difference.
I also tried TADOTable, TADOQuery and TADODataSet and, unlike what Jerry said here in the comments, both ADOTable and ADOQuery performed better than ADODataSet.
The value assigned to CacheSize should be decided case by case; larger values do not necessarily lead to better results.
I am new to CUDA and am currently optimizing an existing molecular dynamics application. It takes an array of double4 with coordinates and computes forces based on the neighbor list. I wrote a kernel with the following lines:
double4 mPos = d_arr_xyz[gid];   // this thread's own particle coordinates
int i = 0, id;
while (-1 != (id = d_neib_list[gid*MAX_NEIGHBORS + i])) {   // -1 terminates the neighbor list
    Calc(gid, mPos, AA, d_arr_xyz, id);
    i++;
}
Then Calc takes d_arr_xyz[id] and calculates the force. That gives 1 read of a double4 plus 65 reads of (int + double4) per particle across the calls to Calc (65 is the average number of neighbors, i.e. entries not equal to -1, in d_neib_list for each particle).
Is it possible to reduce those reads? Neighbor lists for different particles, i.e. d_arr_xyz[gid] and d_arr_xyz[id], do not correlate, so I cannot use shared memory for the block of threads to cache d_arr_xyz.
What I see is that if I could somehow load the whole list of MAX_NEIGHBORS ints into shared memory in one or a few large transactions, that would remove those 65 separate reads of int.
So the question is: is it possible to arrange things so that those 65 reads of int are translated into several large transactions? I read in the documentation that reads can be as wide as 128 bytes. What exactly should I write so that the compiler issues one large load?
Update:
Thank you for your replies. Following the answer from user talonmies below, I changed the code to swap the x and y dimensions of the neighbor matrix. Now consecutive threads load consecutive ints, which I guess results in 128-byte coalesced reads. The program runs 8% faster.
All memory transactions are issued (where possible) on a per warp basis. So the 128 byte transaction you are asking about is when all 32 threads in a warp issue a memory load instruction which can be serviced in a single "coalesced" transaction. A single thread can't issue large memory transactions, only a warp of 32 threads can, and only when the memory coalescing requirements of whichever architecture you run the code on can be satisfied.
I couldn't really follow your description of what your code is actually doing, but from first principles alone, the answer for a single thread would appear to be no.
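For what it's worth, here is a minimal sketch of the transposed ("swapped x and y") neighbor-list layout described in the update. Everything here is illustrative: the kernel name, the num_particles/max_neighbors parameters and the double type of AA are assumptions, and Calc is assumed to be the __device__ function from the question.

// Assumed to exist as in the question: force contribution from neighbor 'id'.
__device__ void Calc(int gid, double4 mPos, double AA, const double4 *d_arr_xyz, int id);

// Neighbor list stored transposed: the entry for particle gid and neighbor slot i
// lives at d_neib_list[i * num_particles + gid]. Threads in a warp have consecutive
// gid values, so for a fixed i they read 32 consecutive ints: one coalesced
// 128-byte transaction per warp instead of 32 scattered 4-byte reads.
__global__ void forces_kernel(const double4 *d_arr_xyz,
                              const int     *d_neib_list,
                              int            num_particles,
                              int            max_neighbors,
                              double         AA)
{
    int gid = blockIdx.x * blockDim.x + threadIdx.x;
    if (gid >= num_particles) return;

    double4 mPos = d_arr_xyz[gid];                       // this thread's own coordinates

    for (int i = 0; i < max_neighbors; ++i) {
        int id = d_neib_list[i * num_particles + gid];   // coalesced across the warp
        if (id == -1) break;                             // -1 terminates this particle's list
        Calc(gid, mPos, AA, d_arr_xyz, id);              // unchanged force computation
    }
}

Note that this only coalesces the neighbor-index loads; the d_arr_xyz[id] reads inside Calc remain scattered, which is consistent with the answer above.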