Number of records returned from team.accessLogs API too large for page and count constraints?

API in question: https://api.slack.com/methods/team.accessLogs
The maximum page is 100 and the maximum number of records per page (count) is 1000, so a total of 100,000 records could potentially be returned. Since there is no way to limit the starting date for the access log, the results will continue to grow as more unique user/IP/useragent combinations are used, until they reach the limit, at which point it wouldn't be possible to return all records. Is this correct?
Also, the documentation does not specify how the results are ordered.

You are correct that at most 100,000 records can typically be fetched in one pass.
But there is a way to limit the starting date: the before argument lets you set the time before which you want the records.
https://api.slack.com/methods/team.accessLogs#arg_before
The records are returned in reverse chronological order, i.e. latest record first, and by default the value of the before argument is 'now'.
After fetching the first 100,000 records, set the before argument to the "date_last" value of the last record.
(Keep in mind that before is inclusive of the value provided, so the last record would be repeated; to avoid this, reduce the "date_last" value by 1.)
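Put together, the paging scheme above might be sketched like this. `fetch_page` is a stand-in for the real HTTP call to team.accessLogs (authentication and error handling omitted); only `page`, `count`, and `before` are actual API arguments, everything else is illustrative:

```ruby
# Walk the access log backwards in 100-page windows. After each full window,
# restart the page counter with `before` set just below the oldest record seen.
def fetch_all_access_logs(fetch_page, page_limit: 100, count: 1000)
  logins = []
  before = nil # nil means "now", the API default
  loop do
    window = []
    (1..page_limit).each do |page|
      batch = fetch_page.call(page: page, count: count, before: before)
      window.concat(batch)
      break if batch.size < count # partial page: no more records in this window
    end
    break if window.empty?
    logins.concat(window)
    break if window.size < page_limit * count # fewer than a full window: done
    # `before` is inclusive, so step one second below the oldest record seen
    # to avoid fetching the last record twice.
    before = window.last["date_last"] - 1
  end
  logins
end
```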

Related

Query on Influxdb only returns data up to the current timestamp

I want to determine the maximum value per day including past and future. The data are forecast values. They also include the entire current day and the following day. But a query without a WHERE clause only returns results up to the current time.
This is the query so far:
SELECT max("Prognose_Wh") FROM "Wetterprognose" group by time(1d)
The result provides the maximum values per day in the past, including the maximum value of the current day, but only up to now(). Today's maximum value is incorrect because a higher value is reached later in the day. The following day, which is also contained in the requested data, is missing from the result.
How can I create a query in InfluxQL that returns results from the past as well as data from the future?

Performance issues when retrieving last value

I have a measurement that keeps track of sensor readings for a bunch of machines.
There are on the order of 50 different readings per machine, and there are up to 1000 machines. We have one reading every 30 seconds.
The way I store the readings is in a single measurement which has two tags, machine_id and analysis_id, and a single value.
One of the use cases I have is to retrieve the current value of each reading for a list of machines.
When this database gets to 100 million records or so, which at those rates means less than one day of data, I can no longer retrieve the last values with a query, as it takes too long.
I tried the two following alternatives:
SELECT *
FROM analysisvalue
WHERE entity_id = '1' or entity_id = '2'
GROUP BY analysis_id, entity_id
ORDER BY time DESC
LIMIT 1
and:
SELECT last(*) AS value
FROM analysisvalue
WHERE entity_id = '1' or entity_id = '2'
GROUP BY analysis_id, entity_id
Both of them take a pretty long time to complete. At 100 million records it's on the order of 1 second.
The use case of retrieving the latest values is a very frequent one. I need to be able to get the "current" state of machines almost instantly.
I can work that out on the side of the app logic, by keeping track of the latest value in a separate place, but I was wondering what I could do with InfluxDB alone.
I was facing something similar and I worked around it by creating a continuous query.
https://docs.influxdata.com/influxdb/v0.8/api/continuous_queries/

Why limited number of next page tokens?

Through a script I can collect the sequence of videos that search.list returns. The maxResults parameter was set to 50. The total number of items is large, but there are not enough next page tokens to retrieve all the desired results. Is there any way to get all the returned items, or is this a YouTube restriction?
Thank you.
No, retrieving the results of a search is limited in size.
The total number of results that you are allowed to retrieve seems to have been reduced to 500 (in the past it was limited to 1000). The API does not allow you to retrieve more from a single query. To get more, try issuing a number of queries with different parameters, like publishedAfter, publishedBefore, order, type, or videoCategoryId, or vary the query terms, and keep track of the distinct video ids returned.
See for a reference:
https://code.google.com/p/gdata-issues/issues/detail?id=4282
BTW, "totalResults" is an estimate and its value can change on the next page call.
See: YouTube API v3 totalResults field is returning 1 000 000 when it shoudn't
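The multiple-query workaround above might be sketched as follows. `search` is a stand-in for a real search.list call (API key, paging, and quota handling omitted), and the parameter sets are only illustrative:

```ruby
require "set"

# Issue the same search under several parameter combinations and merge the
# results, de-duplicating on video id, to get past the per-query result cap.
def collect_video_ids(search, param_sets)
  ids = Set.new
  param_sets.each do |params|
    search.call(params).each { |video_id| ids.add(video_id) }
  end
  ids
end

# Example: vary `order` and a published-date window to pull different
# slices of the same overall result set.
param_sets = [
  { order: "date" },
  { order: "viewCount" },
  { order: "date", publishedBefore: "2015-01-01T00:00:00Z" },
]
```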

Is there a way to tell if an activerecord query hit its limit

given the following query:
Cars.where(color: "red").limit(5)
Is there a way to tell if the limit was hit? Say there are 6 red cars; do I have to do a separate query to count the total number of red cars?
Basically, I am trying to send a message to the user letting them know that a search was limited due to reaching max number of results allowed.
The only way to combine it into one query is to do a limit(6). If the size is 6, then remove the last element and record that there are more results.
Alternatively, do a separate query Cars.where(color: red).count. Although this will do a separate SQL query, count queries are sometimes very fast for databases.
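A minimal sketch of the limit(n + 1) trick above; `fetch_limited` is a made-up helper name, and the scope is only assumed to respond to limit:

```ruby
# Ask for one more row than you will show, then use the presence of that
# extra row as the "more results" flag.
def fetch_limited(scope, limit)
  rows = scope.limit(limit + 1).to_a
  truncated = rows.size > limit
  [rows.first(limit), truncated]
end

# cars, more = fetch_limited(Cars.where(color: "red"), 5)
# show a "results were truncated" notice to the user if `more` is true
```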
The direct answer to your question is that you cannot get this information back in one single query. You would have to run two queries: one to return the limited set, and one to find the total count.
The only time this wouldn't be necessary is when you are querying the first page of results (the offset is 0) and the number of returned results is less than the limit (e.g., you set the limit to 5 but get 4 results back).
Sample code would look like this:
limit = 5
cars_scoped = Cars.where(color: "red")
cars = cars_scoped.limit(limit)
cars_count = cars.length < limit ? cars.length : cars_scoped.count

If I use the will_paginate gem does a mongo query still select all rows THEN paginate?

The code:
Channel.all.paginate(:page => 3, :per_page => 25)
Say I have a table with 400,000 records, does the above code select all 400,000 records then get the current 25 I need or does it only query for the 25 I need.
If it queries all 400,000 records is there a better optimized way to paginate large datasets using rails?
MongoMapper (which I assume you're using, given the syntax of your query) implements this using the limit and skip expressions.
Basically, it runs a query that skips over a number of Channels and then retrieves the number specified by limit (the number you are getting per page).
For example: If you were on page 3 and have 25 per page, the query that mongo mapper runs looks like this:
db.channels.find().skip((page - 1) * per_page).limit(per_page)
Which translates to:
db.channels.find().skip(2 * 25).limit(25)
To return results, Mongo has to skip over (page - 1) * per_page results, which can be costly if the page number is high. Let's say that expression evaluates to 1000; then it would have to run the query, skip over 1000 documents, and get the next 25 documents (the limit). MongoDB would essentially be doing a table scan over those documents.
To avoid that you can do range based paging which provides better use of indexes but does not allow you to easily jump to a specific page.
If the Channel model has a date field, for example, range-based paging would use $gte and limit instead of skip. You would take the date of the last document on page x and get the next page's results by querying for documents with a date $gte that of the previous page's final document. If you do that you could get duplicates, though, so it might make sense to use a different criterion.
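The range-based idea might be sketched like this, shown over an in-memory array rather than a real MongoMapper query (with MongoMapper this would translate to a date condition plus sort and limit; the field name `date` is assumed):

```ruby
# Instead of skip, remember the sort key of the last document on the current
# page and start the next page strictly below it.
def next_page(docs, per_page, after_date: nil)
  page = docs.sort_by { |d| -d[:date] }                 # newest first
  page = page.select { |d| d[:date] < after_date } if after_date
  page.first(per_page)
end

# page1 = next_page(docs, 25)
# page2 = next_page(docs, 25, after_date: page1.last[:date])
# Ties on the date would need a tiebreaker (e.g. _id) to avoid dropping
# or duplicating documents.
```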
In practice, don't worry about it unless you have a really high number of pages.
Cheers and good luck!
