I use spring-session-jdbc with spring-security. At the moment I have 20 logged-in users (with a correct session id and principal_name) and about 11k rows with a session id but an empty principal_name. Is this normal behavior? My settings:
security.sessions= (Default)
@EnableJdbcHttpSession(maxInactiveIntervalInSeconds = 86400)
There isn't anything abnormal in itself about having a large number of session records in the database, especially since you've verified that cleanup of expired sessions works OK.
You have configured a fairly large maxInactiveIntervalInSeconds of 86400 seconds (1 day), so it isn't unreasonable to have that number of anonymous sessions (i.e. unauthenticated sessions, or session records with no principal_name set) initiated over the period of one day.
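If you want to see the split yourself, a quick query against the default spring-session-jdbc schema could look like the sketch below (it assumes the stock SPRING_SESSION table; adjust the names if you customized the schema):

-- Anonymous sessions: no principal associated with the session
SELECT COUNT(*) AS anonymous_sessions
FROM SPRING_SESSION
WHERE PRINCIPAL_NAME IS NULL;

-- Authenticated sessions, grouped per user
SELECT PRINCIPAL_NAME, COUNT(*) AS sessions
FROM SPRING_SESSION
WHERE PRINCIPAL_NAME IS NOT NULL
GROUP BY PRINCIPAL_NAME;

If the anonymous count grows much faster than one day's worth of unauthenticated traffic, that would be worth investigating; otherwise ~11k rows is plausible for a 24 hour expiry.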
Related
I've searched a lot of examples but still can't get this to work.
It seems pretty straightforward: I need to create a record and then travel an hour ahead in time to make sure it has expired.
Records are being set in Redis and set to expire after 1 hour; I have verified this is happening by manual testing.
#redis.expire("#{params[:key]}", 3600) # Expire inserts after 1 hour
But the test just keeps adding up the values:
it 'does not get results that are older than one hour' do
  key = 'active_books'
  post("/inventory/#{key}?value=30")

  # The record created above should have expired an hour from now
  travel_to(1.hour.from_now) do
    get("/inventory/#{key}/sum")
    parsed_body = JSON.parse(response.body)
    expect(parsed_body).to eql(0)
  end
end
So when getting the records here it should return 0, because they have expired. In real time I have tested this with the app, and at the moment it is actually set to expire in 60 seconds, which is happening. But in this test it returns the value 30 that was just created, even though when testing manually it expires after a minute. If I run the test again after a minute it also only returns 30, not 60, so the record is expiring as expected; the test is just not travelling forward in time (which is what I assumed travel_to was supposed to do).
parsed_body returns a count, a sum of all the records. But none should be found; again, the app is actually doing this, but I want this test to reflect that accurately.
By default, rollup tables like rollups60, rollups300, rollups7200 and rollups86400 have the value 0 for default_time_to_live, which means the data never expires. But as per the OpsCenter Metrics blog "Using Cassandra's built in ttl support", OpsCenter expires the columns in the rollups60 column family after 7 days, the rollups300 column family after 4 weeks, the rollups7200 column family after 1 year, and the data in the rollups86400 column family never expires.
What is the reason behind this setting, and where do we set the TTL for these tables?
Since OpsCenter data is growing, shouldn't we have TTLs defined for the rollup tables at the table level?
But in opscenterd.conf the default values are listed as below.
[cassandra_metrics]
1min_ttl = 86400
5min_ttl = 604800
2hr_ttl = 2419200
Which setting takes precedence over the other?
There are defaults, defined in the documentation, which apply if the settings are not set anywhere:
1min_ttl: Sets the time in seconds to expire 1 minute data points. The default value is 604800 (7 days).
5min_ttl: Sets the time in seconds to expire 5 minute data points. The default value is 2419200 (28 days).
2hr_ttl: Sets the time in seconds to expire 2 hour data points. The default value is 31536000 (365 days).
24hr_ttl: Sets the time to expire 24 hour data points. The default value is 0, or never.
If you don't set them, the defaults are used; if you override them in the [cassandra_metrics] section of opscenterd.conf, your values take precedence. When the agent on the node stores a rollup for a period, it will include whatever TTL it is associated with, i.e. (not exactly how OpsCenter does it, but for demonstration purposes):
INSERT INTO rollups60 (key, timestamp, value) VALUES (...) USING TTL 604800;
In your example you lowered the TTLs, which would decrease the amount of data stored. So:
1) You set a lower TTL to decrease the amount of data stored on disk. You can configure it as you mentioned in your ticket, although the compaction strategy can affect this significantly.
2) There is a default TTL setting on the tables, but there really isn't much difference between setting it per query and having it on the table. Doing an ALTER TABLE is pretty expensive if you need to change it, compared to just changing the value of the TTL on the inserts. If you are having issues with obsolete data in the tables, try switching to LeveledCompactionStrategy (note this increases I/O on compactions, but it is probably not noticeable).
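For illustration, the table-level alternative mentioned in 2) would look roughly like this in CQL (table name taken from the example above; the keyspace and exact schema OpsCenter uses may differ), as opposed to the per-insert USING TTL shown earlier:

-- One-off change; every write that does not carry its own TTL inherits this value
ALTER TABLE rollups60 WITH default_time_to_live = 604800;

A TTL given on an individual insert always overrides the table default, which is why simply changing the TTL used on the inserts is the cheaper knob to turn.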
According to:
https://docs.datastax.com/en/latest-opsc/opsc/configure/opscChangingPerformanceDataExpiration_t.html
"Edit the cluster_name.conf file."
Chris, you suggested editing opscenterd.conf.
Scenario: There are 3 kinds of utilization metrics that I have to derive for the users. In my application, user activity is tracked using login history, the number of customer calls made by the user, and the number of status changes performed by the user.
All this information is maintained in 3 different tables in my application db: UserLoginHistory, CallHistory and OrderStatusHistory. All the actions made by each user are stored in these 3 tables along with DateTime info.
Now I am trying to create a reporting db that will help me generate the overall utilization of a user. Basically the report should show me, for each user over a period:
UserName
Role
Number of Logins Made
Number of Calls Made
Number of Status updates Made
Now I am in the process of designing my fact table. How should I go about creating a fact table for this scenario? Should I create a single fact table with rows capturing all these details at the granular date level (at my DimDate table level), or 3 different fact tables and relate them?
The 2 options I described above aren't convincing and I am looking for a better design. Thanks.
As a rule of thumb, when you have a report which uses different facts/metrics (Number of Logins Made, Number of Calls Made, Number of Status Updates Made) with the same granularity (UserName, Role, Day/Hour/Minute), you put them in the same fact table, to avoid expensive joins.
For many reasons this is not always possible, but your case seems to me a bit different.
You have three tables with the user activity, where you probably store more detailed information about logins, calls and status updates. What you need for your report is a table with your metrics, with the values aggregated at the time granularity you need.
Let's say you need the report at the day level; then you need a table like this:
Day       UserID  RoleID  #Logins  #Calls  #StatusUpdate
20150101  1       1       1        5       3
20150101  2       1       4        15      8
If tomorrow the business requires the report by the hour, then you will need:
DayHour           UserID  RoleID  #Logins  #Calls  #StatusUpdate
20150101 10:00AM  1       1       1        2       1
20150101 11:00AM  1       1       0        3       2
20150101 09:00AM  2       1       2        10      4
20150101 10:00AM  2       1       2        5       4
The Day-level table is then just an aggregated (by Day) version of the second one. The DayHour attribute is a child of the Day one.
If you need minute-level detail, you go further down in granularity.
You can also start directly with a summary table at the minute level, but I would double-check the requirement with the business; usually a one-hour (or 15-minute) range is enough.
Then, if they need more detailed information, you can always drill down by querying your original tables. The good thing is that when you drill to that level you should have just a small set of rows to query (just a few hours for a specific UserName, say) and your database should be able to handle it.
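A minimal sketch of what that single, day-level fact table and its load could look like (FactUserActivityDaily and the column names are assumptions; only UserLoginHistory, CallHistory and OrderStatusHistory come from your description):

CREATE TABLE FactUserActivityDaily (
    ActivityDate   DATE NOT NULL,  -- grain: one row per user per day
    UserId         INT  NOT NULL,
    RoleId         INT  NOT NULL,
    Logins         INT  NOT NULL,
    Calls          INT  NOT NULL,
    StatusUpdates  INT  NOT NULL,
    PRIMARY KEY (ActivityDate, UserId)
);

-- Example load for the logins metric; calls and status updates would be
-- merged in the same way from CallHistory and OrderStatusHistory
INSERT INTO FactUserActivityDaily (ActivityDate, UserId, RoleId, Logins, Calls, StatusUpdates)
SELECT CAST(LoginDateTime AS DATE), UserId, RoleId, COUNT(*), 0, 0
FROM UserLoginHistory
GROUP BY CAST(LoginDateTime AS DATE), UserId, RoleId;

The report then becomes a single scan of this table joined to your user/role dimensions, with no joins between the three history tables at query time.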
I'm measuring how long users are logged into a service. Every minute, for each user, their new total online time is sent to InfluxDB. I'd like to graph, in Grafana, the cumulative online time for all users.
What kind of query would I need to do that? I initially thought I'd want sum(onlineTime) grouped by time(1m), but I realized that sums the values within each timeframe rather than summing the totals of all users, so when a user wasn't logged in the total would drop, because there were no data points for them.
I'm a bit confused about what I'm doing now. If I'm sending the wrong data, I can change that too.
So this depends on the kind of time data you send back to InfluxDB.
If the value is the total time spent up to that instant, you have to take the "last" value for each user and add those up across all users.
If the value is a small increment, you have to add up those incremental values over the period you are interested in.
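A minimal sketch of both cases in InfluxQL (the subquery form needs InfluxDB 1.2+; the measurement, field and tag names user_sessions, onlineTime and user are assumptions about your schema):

Case 1, each point is the user's running total, so take each user's latest value and sum across users:

SELECT SUM("last_total") FROM (SELECT LAST("onlineTime") AS "last_total" FROM "user_sessions" GROUP BY "user")

Case 2, each point is a small per-minute increment, so just sum everything over the window you are charting:

SELECT SUM("onlineTime") FROM "user_sessions" WHERE time > now() - 1d

In Grafana you would typically also add the dashboard's time filter ($timeFilter) to the WHERE clause.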
This question is mainly to verify my current idea. I have a series of objects which I want to be active for a specified amount of time: for instance, ads which are shown in the ad space only for the amount of time bought, objects in search results which should only pop up when active, and front-page posts which should be set to inactive after a prespecified time.
My current idea is to just give those type of objects a StartDate and EndDate and filter the search routines to only show results which fall in the range StartDate < currentDate < EndDate.
Is this the normal structure, or should there be some sort of auto-routine which periodically checks for objects which are "over time" and sets a property "inactive" to true or something? That approach seems like a hassle, since I would need a checker which runs, say, every 5 minutes and scans all DB objects. It seems like a bad idea to me.
So is the first structure the most commonly used, or are there other options? When searching on Google or SO, the queries only return results about setting users inactive.
Unless you are dealing with a seriously large number of objects, a date filter is definitely the best method.
The only reasons to implement a check like you describe are if the filter query takes too long or you have a cache that really needs to be updated as soon as anything goes inactive. Neither of those is likely to apply for a normal web application.
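A minimal sketch of that filter, assuming a table such as Ads with the StartDate/EndDate columns from the question; an index on the dates keeps the query cheap as the table grows:

-- Only return objects that are active right now
SELECT *
FROM Ads
WHERE StartDate <= CURRENT_TIMESTAMP
  AND EndDate > CURRENT_TIMESTAMP;

-- Optional, but helps once the table gets large
CREATE INDEX IX_Ads_ActiveWindow ON Ads (StartDate, EndDate);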