I'm trying to build a report that shows actual memory usage per user session when working with a particular SSAS Tabular in-memory model. The model itself is relatively big (~100 GB in memory) and the test queries are relatively heavy: no filters, lowest granularity level, a couple of SUM measures, plus exporting 30k rows to CSV.
First, I tried querying the following DMV:
select SESSION_SPID
,SESSION_CONNECTION_ID
,SESSION_USER_NAME
,SESSION_CURRENT_DATABASE
,SESSION_USED_MEMORY
,SESSION_WRITES
,SESSION_WRITE_KB
,SESSION_READS
,SESSION_READ_KB
from $system.discover_sessions
where SESSION_USER_NAME='username'
and SESSION_SPID=29445
and got the following results:
(screenshot: $system.discover_sessions result)
I was expecting SESSION_USED_MEMORY to show at least several hundred MB, but the biggest value I got was 11 KB (Microsoft's official documentation for this DMV indicates that SESSION_USED_MEMORY is in kilobytes).
I've also tried querying 2 more DMVs:
SELECT SESSION_SPID
,SESSION_COMMAND_COUNT
,COMMAND_READS
,COMMAND_READ_KB
,COMMAND_WRITES
,COMMAND_WRITE_KB
,COMMAND_TEXT FROM $system.discover_commands
where SESSION_SPID=29445
and
select CONNECTION_ID
,CONNECTION_USER_NAME
,CONNECTION_BYTES_SENT
,CONNECTION_DATA_BYTES_SENT
,CONNECTION_BYTES_RECEIVED
,CONNECTION_DATA_BYTES_RECEIVED from $system.discover_connections
where CONNECTION_USER_NAME='username'
and CONNECTION_ID=2047
But I also got quite underwhelming results: 0 used memory from $system.discover_commands, and 4.8 MB for CONNECTION_DATA_BYTES_SENT from $system.discover_connections, which still seems smaller than what the session actually consumes.
These results don't seem to correspond to a very blunt test, where users send similar queries via Power BI and we observe a ~40 GB spike in RAM allocation on the SSAS server per 4 users (so roughly 10 GB per user session).
Has anyone used these (or any other DMVs or methods) to get actual user session memory consumption? Using a SQL trace dump would be the last resort, since it would require parsing and loading the results into a DB, and my goal is a real-time report showing active user sessions.
I'm using the Apache Beam Java SDK to process events and write them to a ClickHouse database.
Luckily, there is a ready-to-use ClickhouseIO.
ClickhouseIO accumulates elements and inserts them in batches, but because of the parallel nature of the pipeline it still results in a lot of inserts per second in my case. I'm frequently getting "DB::Exception: Too many parts" or "DB::Exception: Too much simultaneous queries" from ClickHouse.
The ClickHouse documentation recommends doing no more than one insert per second.
Is there a way I can ensure this with ClickhouseIO?
Maybe some KV grouping before ClickhouseIO.Write or something?
It looks like you're interpreting these errors not quite correctly:
DB::Exception: Too many parts
It means that a single insert affects more partitions than allowed (by default this limit is 100; it is managed by the max_partitions_per_insert_block setting).
So either the number of affected partitions really is large, or the PARTITION BY key was defined too granularly.
How to fix it:
- try to group the INSERT batch in such a way that it contains data for fewer than 100 partitions
- try to reduce the size of the insert block (if it is quite large) via withMaxInsertBlockSize
- increase the max_partitions_per_insert_block limit in the SQL query (like INSERT .. SETTINGS max_partitions_per_insert_block=300 - I think ClickhouseIO should have the ability to set custom options at the query level), or on the server side by modifying the user-profile settings
DB::Exception: Too much simultaneous queries
This one is managed by the max_concurrent_queries parameter.
How to fix it:
- reduce the number of concurrent queries by Beam means (see the sketch below)
- increase the limit on the server side in the user-profile or server settings (see https://github.com/ClickHouse/ClickHouse/issues/7765)
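For the Beam side, here is a minimal sketch of the KV-grouping idea from the question: shard the rows onto a small fixed key space, buffer each shard into large batches, and only then flatten back out for ClickhouseIO.Write. The names (ClickhouseThrottle, throttleForClickhouse, NUM_SHARDS, BATCH_SIZE) are mine, and it assumes the input PCollection<Row> already has a schema set, as ClickhouseIO requires.

import org.apache.beam.sdk.transforms.FlatMapElements;
import org.apache.beam.sdk.transforms.GroupIntoBatches;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.Row;
import org.apache.beam.sdk.values.TypeDescriptor;
import org.apache.beam.sdk.values.TypeDescriptors;

public class ClickhouseThrottle {
  static final int NUM_SHARDS = 4;         // caps concurrent INSERT streams at ~4
  static final long BATCH_SIZE = 100_000L; // rows buffered per shard before release

  public static PCollection<Row> throttleForClickhouse(PCollection<Row> rows) {
    return rows
        // Fan all rows onto NUM_SHARDS keys.
        .apply("ShardRows", MapElements
            .into(TypeDescriptors.kvs(TypeDescriptors.integers(), TypeDescriptor.of(Row.class)))
            .via(row -> KV.of(Math.floorMod(row.hashCode(), NUM_SHARDS), row)))
        // Buffer rows per shard; at most NUM_SHARDS batches are in flight at once.
        // In a streaming pipeline, consider withMaxBufferingDuration(...) as well,
        // so that partial batches still flush.
        .apply("BatchPerShard", GroupIntoBatches.<Integer, Row>ofSize(BATCH_SIZE))
        // Flatten the batches back into rows; the writer now sees a few large
        // bundles instead of many small ones.
        .apply("Unbatch", FlatMapElements
            .into(TypeDescriptor.of(Row.class))
            .via((KV<Integer, Iterable<Row>> kv) -> kv.getValue()));
  }
}

Feeding the result of throttleForClickhouse(...) into ClickhouseIO.Write keeps the number of simultaneous INSERTs bounded by NUM_SHARDS regardless of how wide the rest of the pipeline runs (depending on the Beam version, you may need to re-attach the row schema to the output with setRowSchema).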
We have a Spring Boot 2.0.4 application. We use a distributed Hazelcast 3.11 cache. In our application we configured a HazelcastClient which connects to a Hazelcast server in a Docker container.
In the cache we store different "persons" in one map, and the same "persons" as lists in another (~900 persons in one list under one key; the persons in the two maps are not 100% identical: both describe the same real-life person, but those in the list have fewer properties). Both maps are of BINARY in-memory format.
When we made stress tests getting a person by random id from the cache (1st map), everything went excellently. 5000 concurrent requests didn't influence our application heap at all; 10000 affected it only slightly. In JSON format, one person's details are about 10 kB in size.
When we made stress tests getting the list of persons from the cache (2nd map), we faced problems with the heap of the application where the client is configured. With just 500 concurrent requests the heap grew to 4 GB! The list is stored in the 2nd map and was requested by the same key 500 times.
Does anybody know what is going on?
(Attachments: the DTO; the Controller; the Facade method called from the Controller, where caching takes place via the @Cacheable annotation; the HazelcastInstance configuration; the hazelcast.xml configuration for the server side; and screenshots of 500 concurrent requests (3 times in a row) and of Heap / Classes.)
UPDATED:
I made 500 concurrent requests sequentially 23 times. Below we can see the final minutes of the test.
(screenshot: Telemetries Overview)
@Nicolay, correct me if I'm wrong:
The second map contains lists of ~900 people as single entries. You mentioned each person is ~10 KB, so each entry in the second map is ~9 MB, even though you're saying the list is 800 KB in JSON format. Can you please check the size of the entries in the second map through Hazelcast, e.g. client.getMap(map_name).getEntryView(key).getCost()? This gives you the entry's memory cost in bytes (a sketch of the check follows below).
500 concurrent requests, if each entry is ~9 MB, will require ~4.5 GB of additional heap, which matches what you observed.
Looking at the numbers, everything seems fine, other than the JSON size being 800 KB.
Can you check those numbers?
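A minimal sketch of that check (the map name and key are placeholders; it assumes a reachable cluster and the default client configuration):

import com.hazelcast.client.HazelcastClient;
import com.hazelcast.core.EntryView;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;

public class EntryCostCheck {
  public static void main(String[] args) {
    HazelcastInstance client = HazelcastClient.newHazelcastClient();
    IMap<String, Object> listMap = client.getMap("personsListMap"); // placeholder map name
    EntryView<String, Object> view = listMap.getEntryView("someKey"); // placeholder key
    // getCost() reports the entry's in-memory cost in bytes on the owning member.
    long bytes = view.getCost();
    System.out.printf("entry cost: %d bytes (%.1f MB)%n", bytes, bytes / (1024.0 * 1024.0));
    client.shutdown();
  }
}

If the reported cost is in the megabytes per entry, the heap growth under 500 concurrent requests follows directly from the arithmetic above.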
Before the moderators start eating me alive, I'd like to say in my defense that I cannot comment on the original posts, because I don't have enough reputation; therefore, I have to ask about this again.
I have issues running a job in Talend (Open Studio for Big Data!). I have a 3 GB archive. I don't consider that too much, since I have a computer with 32 GB of RAM.
While trying to run my job, first I got an error related to a heap memory issue, then it changed to a garbage collector error, and now it doesn't even give me an error (it just does nothing and then stops).
I found these solutions:
a) Talend performance
@Kailash commented that parallelization is only available on the condition that I am subscribed to one of the Talend Platform solutions. My comment/question: so is there no other similar option to parallelize a job with a 3 GB archive?
b) Talend 10 GB input and lookup out of memory error
@54l3d mentioned that it's an option to split the lookup file into manageable chunks (maybe 500 MB), then perform the join in many stages, one for each chunk. My comment/cry for help/question: how can I do that? I don't know how to split the lookup; can someone explain this to me a little more graphically?
c) How to push a big file data in talend?
Just to mention that I also went through (c), but I don't have any comments about it.
The job I am performing (thanks to @iMezouar) looks like this:
1) I have an input file, MySQLInput, coming from a DB in MySQL (3 GB)
2) I used the tFirstRows to make it easier for the process (not working)
3) I used the tSplitRow to transform the data from many similar columns to only one column.
4) MySQLOutput
Thanks again for reading me and double thanks for answering.
From what I understand, your query returns a lot of data (3 GB), and that is causing an error in your job. I suggest the following:
1. Filter data on the database side: replace tSampleRow by a WHERE clause in your tMysqlInput component, in order to retrieve fewer rows into Talend.
2. The MySQL JDBC driver by default retrieves the entire result set into memory, so you need to enable the stream option in tMysqlInput's advanced settings in order to fetch rows as a stream (see the sketch below for what this corresponds to at the JDBC level).
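For reference, here is roughly what that option corresponds to at the plain JDBC level (a sketch only; Talend generates the equivalent code internally, and the connection details and query below are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class StreamingRead {
  public static void main(String[] args) throws Exception {
    Connection conn = DriverManager.getConnection(
        "jdbc:mysql://localhost:3306/mydb", "user", "password"); // placeholders
    // MySQL Connector/J only streams results row by row when the statement is
    // forward-only and read-only and the fetch size is Integer.MIN_VALUE;
    // otherwise it buffers the entire result set in memory.
    Statement stmt = conn.createStatement(
        ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
    stmt.setFetchSize(Integer.MIN_VALUE);
    ResultSet rs = stmt.executeQuery("SELECT ... FROM big_table"); // your query here
    while (rs.next()) {
      // process one row at a time; heap usage stays flat regardless of table size
    }
    rs.close();
    stmt.close();
    conn.close();
  }
}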
I'm collecting data on an ARM Cortex M4 based evaluation kit in a remote location and would like to log the data to persistent memory for access later.
I would be logging roughly 300 bytes once every hour, and would want to come collect all the data with a PC after roughly 1 week of running.
I understand that I should attempt to minimize the number of writes to flash, but I don't have a great understanding of the best way to do this. I'm looking for a resource that would explain memory management techniques for this kind of situation.
I'm using the ADuCM350, which looks like it has 3 separate flash sections (128 kB, 256 kB, and a 16 kB EEPROM).
For logging applications the simplest and most effective wear-leveling tactic is to treat the entire flash array as a giant ring buffer:
- Define the entry size to be some integer fraction of the smallest erasable flash unit. Say a sector is 4 KB (4096 bytes); let the entry size be 256 bytes. This keeps all log entries sector-aligned and lets you erase any sector without cutting a log entry in half.
- At boot, walk the memory and find the first empty entry; this is the write_pointer.
- When a log entry is written, simply write it at write_pointer and increment write_pointer. If write_pointer then lands on a sector boundary, erase the sector at write_pointer to make room for the next writes. Essentially this guarantees that there is always at least one empty log entry for you to find at boot, which is what allows you to restore write_pointer (a sketch of this bookkeeping follows below).
If you dedicate 128 KB to the log entries (512 entries per pass around the ring) and the flash has an endurance of 20,000 write/erase cycles, this gives you a total of 512 × 20,000 = 10,240,000 entries written before wear-out, or about 1,168 years of hourly logging...
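Here is a minimal sketch of that bookkeeping, modelling the flash region as a plain byte array. On the real part the array accesses become the ADuCM350 flash-controller program/erase calls; the sizes are the ones from the example above, and it assumes a log entry never begins with the erased value 0xFF (otherwise the boot scan could not tell it apart from empty space).

public class FlashLogRing {
  static final int REGION_SIZE = 128 * 1024;  // flash region dedicated to the log
  static final int SECTOR_SIZE = 4096;        // smallest erasable unit
  static final int ENTRY_SIZE  = 256;         // integer fraction of a sector
  static final byte ERASED     = (byte) 0xFF; // flash erases to all ones

  final byte[] flash = new byte[REGION_SIZE]; // stand-in for the real flash array
  int writePointer;                           // byte offset of the next entry

  FlashLogRing() {
    java.util.Arrays.fill(flash, ERASED);     // simulate a fully erased part
    // At boot: walk the region and stop at the first still-erased entry.
    while (writePointer < REGION_SIZE && flash[writePointer] != ERASED) {
      writePointer += ENTRY_SIZE;
    }
    writePointer %= REGION_SIZE; // wrap; the erase-ahead below guarantees an empty entry exists
  }

  void append(byte[] entry) {
    // Write the entry at writePointer, then advance (wrapping around).
    System.arraycopy(entry, 0, flash, writePointer, Math.min(entry.length, ENTRY_SIZE));
    writePointer = (writePointer + ENTRY_SIZE) % REGION_SIZE;
    // If the next entry starts a fresh sector, erase that sector now, so the
    // boot-time scan always finds at least one empty entry.
    if (writePointer % SECTOR_SIZE == 0) {
      java.util.Arrays.fill(flash, writePointer, writePointer + SECTOR_SIZE, ERASED);
    }
  }
}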
I have some scientific measurement data which should be permanently stored in a data store of some sort.
I am looking for a way to store measurements from 100 000 sensors with measurement data accumulating over years to around 1 000 000 measurements per sensor. Each sensor produces a reading once every minute or less frequently. Thus the data flow is not very large (around 200 measurements per second in the complete system). The sensors are not synchronized.
The data itself comes as a stream of triplets: [timestamp] [sensor #] [value], where everything can be represented as a 32-bit value.
In the simplest form this stream would be stored as-is into a single three-column table. Then the query would be:
SELECT timestamp,value
FROM Data
WHERE sensor=12345 AND timestamp BETWEEN '2013-04-15' AND '2013-05-12'
ORDER BY timestamp
Unfortunately, with row-based DBMSs this will give very poor performance, as the data mass is large and the data we want is dispersed almost evenly through it. (We are trying to pick a few hundred thousand records out of billions.) What I need performance-wise is a reasonable response time for human consumption (the data will be graphed for a user), i.e. a few seconds plus data transfer time.
Another approach would be to store the data from one sensor into one table. Then the query would become:
SELECT timestamp,value
FROM Data12345
WHERE timestamp BETWEEN '2013-04-15' AND '2013-05-12'
ORDER BY timestamp
This would give a good read performance, as the result would be a number of consecutive rows from a relatively small (usually less than a million rows) table.
However, the RDBMS would then have 100 000 tables, all of which are written to within any span of a few minutes. This does not seem to be possible with common systems. On the other hand, an RDBMS does not seem to be the right tool anyway, as there are no relations in the data.
I have been able to demonstrate that a single server can cope with the load by using the following mickeymouse system:
Each sensor has its own file in the file system.
When a piece of data arrives, its file is opened, the data is appended, and the file is closed.
Queries open the respective file, find the starting and ending points of the data, and read everything in between.
Very few lines of code (a sketch follows below). The performance depends on the system (storage type, file system, OS), but there do not seem to be any big obstacles.
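Those few lines could look roughly like this, assuming fixed 8-byte records (4-byte timestamp + 4-byte value) so that a query can binary-search its starting offset; the file naming is made up:

import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.RandomAccessFile;

public class SensorFiles {
  // Append one reading: open the sensor's file, write 8 bytes, close.
  static void append(int sensor, int timestamp, int value) throws Exception {
    try (DataOutputStream out = new DataOutputStream(
        new FileOutputStream("sensor-" + sensor + ".bin", true))) { // append mode
      out.writeInt(timestamp);
      out.writeInt(value);
    }
  }

  // Records are appended in time order, so each file is sorted by timestamp.
  // Binary-search the offset of the first record with timestamp >= target.
  static long findStart(RandomAccessFile f, int target) throws Exception {
    long lo = 0, hi = f.length() / 8;
    while (lo < hi) {
      long mid = (lo + hi) / 2;
      f.seek(mid * 8);
      if (f.readInt() < target) lo = mid + 1; else hi = mid;
    }
    return lo * 8;
  }
}

A range query then seeks to findStart(f, from) and reads records sequentially until the timestamp exceeds the end of the range.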
However, if I go down this road, I end up writing my own code for partitioning, backing up, moving older data deeper down in the storage (cloud), etc. Then it sounds like rolling my own DBMS, which sounds like reinventing the wheel (again).
Is there a standard way of storing the type of data I have? Some clever NoSQL trick?
Seems like a pretty easy problem, really. 100 billion records at 12 bytes per record -> 1.2 TB; this isn't even a large volume for modern HDDs. In LMDB I would consider using a subDB per sensor. Then your key/value is just a 32-bit timestamp / 32-bit sensor reading, and all of your data retrievals are simple range scans on the key. You can easily retrieve on the order of 50M records/sec with LMDB. (See the SkyDB guys doing just that: https://groups.google.com/forum/#!msg/skydb/CMKQSLf2WAw/zBO1X35alxcJ)
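To make that layout concrete, here is a sketch using the lmdbjava binding (the choice of binding, the path, the map size, and the database name are my assumptions, not part of the answer): one named subDB per sensor, big-endian 4-byte timestamp keys so that lexicographic key order equals time order, and a closed-range cursor scan for retrieval. The timestamps are the 2013-04-15 .. 2013-05-12 window from the query above, as epoch seconds.

import java.io.File;
import java.nio.ByteBuffer;
import org.lmdbjava.CursorIterable;
import org.lmdbjava.Dbi;
import org.lmdbjava.DbiFlags;
import org.lmdbjava.Env;
import org.lmdbjava.KeyRange;
import org.lmdbjava.Txn;

public class SensorStore {
  public static void main(String[] args) {
    Env<ByteBuffer> env = Env.create()
        .setMapSize(2L * 1024 * 1024 * 1024 * 1024) // 2 TB upper bound, not preallocated
        .setMaxDbs(100_000)                         // one named subDB per sensor
        .open(new File("/data/sensors"));           // directory must already exist
    Dbi<ByteBuffer> sensor = env.openDbi("sensor-12345", DbiFlags.MDB_CREATE);

    // Write one reading: key = 4-byte big-endian timestamp, value = 4-byte reading.
    ByteBuffer key = ByteBuffer.allocateDirect(4);
    key.putInt(1365984000).flip(); // 2013-04-15T00:00:00Z
    ByteBuffer val = ByteBuffer.allocateDirect(4);
    val.putInt(42).flip();         // the sensor reading
    try (Txn<ByteBuffer> txn = env.txnWrite()) {
      sensor.put(txn, key, val);
      txn.commit();
    }

    // Read back a time range: a simple cursor scan in key (= time) order.
    ByteBuffer from = ByteBuffer.allocateDirect(4);
    from.putInt(1365984000).flip(); // 2013-04-15
    ByteBuffer to = ByteBuffer.allocateDirect(4);
    to.putInt(1368316800).flip();   // 2013-05-12
    try (Txn<ByteBuffer> txn = env.txnRead();
         CursorIterable<ByteBuffer> range = sensor.iterate(txn, KeyRange.closed(from, to))) {
      for (CursorIterable.KeyVal<ByteBuffer> kv : range) {
        int ts = kv.key().getInt(0);
        int reading = kv.val().getInt(0);
        // graph or collect (ts, reading) here
      }
    }
    env.close();
  }
}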
Try VictoriaMetrics as a time series database; it is optimized for storing and querying big amounts of time series data.
It uses low disk IOPS and bandwidth thanks to its LSM-tree-based storage design, so it can work quite well on HDDs instead of SSDs.
It has a good compression ratio, so 100 billion typical data points would require less than 100 GB of HDD storage. See the technical details on data compression.