How to have Firebase automatically delete values older than 30 minutes - ios

I have a Swift/iOS application that creates/holds data something like this:
buyer_name: "john#gmail.com"
seller_name: "tom#gmail.com"
price: "20.00"
product: "baseball cards"
timestamp: 1498677314931
I am taking the timestamp with
FIRServerValue.timestamp()
I would like to create some kind of timeout, where it deletes firebase data older than 30 minutes, as per the timestamp. The problem is that I would not like to do it directly from the app, as the user could technically logout after 5 minutes. I would like to do it from some other independent process.
So, I'm wondering if this can be done automatically through Firebase, or if I have to deploy some other application that does this continuously on an EC2 instance, or if there is some other method?

You can use Cloud Functions for Firebase and set up a cron job that runs, say, every five minutes and checks for timestamps older than 30 minutes.
As Will commented, that blog post is useful to learn about scheduling Cloud Functions. Here are some other useful resources to get you started:
Getting Started with Cloud Functions for Firebase - YouTube
Cloud Functions for Firebase Documentation
Cloud Functions Samples
Timing Cloud Functions for Firebase using an HTTP Trigger and Cron - YouTube

Related

Understanding and mitigating Firebase Usage and Billing Fees

I have tried my best to understand Firebase's usage and billing, but I found I only got more confused. For reference, I use the following Firebase Services in the project in question:
Realtime Database
Firestore
Authentication
Cloud Functions
Cloud Messaging
A few notes, I do update the Real-time Database frequently, I as well have a simple Firebase Functions method deployed, which is triggered (iOS Notification Trigger) whenever a change happens to the Real-time Database.
With that being said, all the services are within the Free Tier threshold:
But I still get billed, not much, but on a daily basis nonetheless.
I used to empty my buckets, and that would lower the billing a little, but after a recent Firebase Functions update, it automatically clears buckets now.
Questions:
What actions correlate to these costs? (frequency of Firebase Functions triggers, Size of deployed methods, etc.)
How can I mitigate these costs?
Do these costs scale with more users?

What should I put in Cloud Scheduler Payload in order to reset scores in Firebase Realtime Database?

I have an iOS/swift game with a leaderboard, and I'd like the scores to all reset to 0 every Monday at 12:00am.
I believe I'm all set up (enabled the Blaze plan, followed the steps in the Getting Started with Cloud Functions for Firebase and also the Cloud Scheduler Quickstart, etc), but I'm a bit unclear on how it all connects and both feel so distant from my database.
How can I get this Google Cloud Scheduler job to reset the scores for all users in my Firebase Realtime Database when it runs? I'm guessing I put some code in the Payload field?
Apologies if my question isn't precisely worded or is missing needed information. It feels like just a few months ago that I dipped my toes into programming and wrote print("Hello World") in Xcode and now here we are :D
Thanks in advance for any direction!
You should read the documentation for scheduled functions. If you're deploying with the Firebase CLI, everything will get set up for you automatically on the schedule you specify in the function. You don't even have to go to the Cloud console at all to configure Cloud Scheduler. Just write your function to do what you want, whenever it gets invoked by the scheduler.

O365 Activity Management API - Performance for huge audit log streams

This question is regarding the O365 Activity Management API
We are using the API to retrieve audit log notifications from multiple channels (AzureAD, Outlook, SharePoint, etc.) for very large tenants, meaning that we need to retrieve potentially millions of notifications over a relatively short timespan.
O365 gathers audit notifications into a series of "blobs" which then contain a number of individual notifications (JSON messages). To my understanding, which in part comes from correspondence with the API's dev. team and from reading the docs, these blobs should contain a "considerable" number notifications as to function as a sort of batch approach when doing the actual web requests.
In our approach, we request blobs URLs for an interval of an hour, and then do a request for the individual blobs.
However, we have tested with a number of different tenants and different PublisherIdentifiers, but only seem to get around 2.5 messages per blob on average, no matter the total number of notifications "waiting" to be fetched.
This becomes a major issue for the larger tenants as is puts a strain on the SIEM solution running the fetcher logic (a Python service), due to the number of needed requests, and it also gives us throttling issues with the API itself.
In effect, we simply cannot fetch the audit notifications fast enough to keep up - within the retention period. Had the blobs contained more notifications per blob, we would be fine - as the total amount of data (in MBs) is not that large.
A "funny" thing is, that if we use the visual query tool within the Admin Center of the tenant, it searches and retrieves the notifications very fast.
My questions
Has anyone had any experience with this issue, or perhaps had a better "batch performance"?
Does anyone have any ideas as to what we could try to get a better performance?
As mentioned we have been in direct contact with the dev team and the program manager in Redmond. They have been very helpful with other issues we had, but they referred us to support for this specific issue - who in turn referred us to the forums / community. We currently do not have access to premium support...
Example request for content blobs for an hour
https://manage.office.com/api/v1.0/{tenantid}/activity/feed/subscriptions/content?contentType=Audit.Exchange&PublisherIdentifier={pub.id}&startTime=2017-12-03T10:31:24&endTime=2017-12-03T11:31:24
When retrieving the individual blobs, we just use the URLs given to us by the above request.
You can avoid throttling by appending "?PublisherIdentifier={Tenant ID}" to the contentUri in the retrieve content get request.
How can I add a PublisherId to a GetBlob call to the Office365 Rest API to avoid throttling?
I have been working with Office 365 Management Activity API for the past 6 months. I too faced this kind of issue before. This issue will occur if you are trying to get all the audit log contents from your Office 365 tenant at a particular interval, it will result in throttling issue. For your information, it is not possible to avoid throttling issues (resource over usage) for large active tenants.
To overcome these issues, you can create and deploy a web application in cloud and register with Office 365 Management Activity API webhook.
Whenever the office 365 tenant wrap the activity logs into an Azure Blob, it will immediately give the blob details to your registered Web Application. You can refer this link to know about how to enable webhook for a Web Application. Once you received the blob detail from Office 365 tenant, extract the logs from the Azure Blob and save it in your own blob storage / store in SQL / NOSQL databases.
I had a similar issue. Pulling down logs would take longer than the interval of time allotted to the Python script and the script would start overlapping itself or would fall behind when trying to pull logs for a SIEM implementation.
https://github.com/IntegralDefense/o365_log_fetch
I'm a little late to this post, but by using Asyncio in Python 3.5+ as well as aiohttp, you can make concurrent calls to O365 Management API and pull down the logs much faster. I performed some testing and retrieved logs for a 13 hour window (Audit.Exchange, Audit.AzureActiveDirectory, and Audit.Sharepoint). It took around 20 minutes using 'requests' and sequentially making the API calls. After implementing Asyncio/aiohttp, the same time frame took just under 2 minutes (500,000+ individual events were pulled from the data located at several thousand content blobs/locations).
I've been running the script in 10 minute intervals and usually the script completes in < 10 seconds.
The script I pasted above also supports pagination. So if you get a content list that was truncated in the response from Microsoft, the script will keep reaching out and pulling down more content locations.
At this time, the documentation isn't up to speed, but hopefully that will be caught up soon.

Is there a way to schedule edits to firebase database?

I am trying to create automated edits to the database in firebase. Is there a way to do that on the server-side? I am new to iOS development and swift so any help would be greatly appreciated.
Also, I've tried Zapier but the service is not specific enough for my needs.
Yes - Firebase has quite a flexible set of options for server-side updates and it is simple enough to schedule a cronjob to connect to firebase and perform some scheduled update or edits.
The most generic approach is to use the REST API to perform your updates although there are specific libraries to support Node and other platforms.
It is worth being aware of the recent major upgrade to version 3 of Firebase which introduced quite a few significant changes - it can be easy to confuse the older examples floating around with the new API so be aware of the differences as you put together your first proof of concept examples.
I assume that you are looking to run on your own server although another alternative is to use a container hosting environment ( Google Apps etc ).
If you have your own server and are looking to integrate I would suggest starting with:
https://firebase.google.com/docs/server/setup#prerequisites
Then perhaps a quick look at:
https://firebase.googleblog.com/docs/web/quickstart.html
and
https://www.firebase.com/docs/rest/
If you are just getting started I would suggest a first task being to authenticate, retrieve and update a Firebase record.
You can configure server auth keys through the FB console and use these as part of you authentication process.
If you are unfamiliar with JWT then it is worth spending a little time getting up to speed on this and working through the examples at https://www.firebase.com/docs/rest/guide/user-auth.html
Further to your comment:
So the first approach that comes to mind is to run some kind of scheduled job in your Cron which would connect using the REST API, perform some kind of query on the existing data to identify those records that require an update and remove or modify them.
Giving a little more though you could extend this approach without having to run at a recurring period less than the minimal anticipated deletion time you could run the scheduler just to clean up at some longer period but filter your results to the client so that you are not including stale data. This approach is discussed a little at Firebase chat - removing old messages
Getting the right solution to your particular scenario will depend a lot on how well you structure your data which can be counter-intuitive; particularly for users who have come from an RDBMS background.
There may be an inclination to keep the data slim and unpolluted with old irrelevant data however Firebase is quite good at managing large minimally structured data and the overhead of this bloat may not be as bad a thing as you may think.
If the filtering itself isn't sufficient and you don't have a server that you can CRON a cleanup process then you can implement a firebase worker process in Node or similar and have this running on a container service such as Heroku or Google Apps. See Firebase push notifications - node worker for some ideas on how to approach this.
When asked Google advised that they didn't advise on where best to host worker services but they did mention both Google App Engine and Heroku.
Another approach if you don't want to implement and host a watcher/worker process is to simply include some code in the client that checks for and removes stale data periodically.
The firebase Queue is very cool but may be a bit of an overkill for simply expiring stale data.

Medium scale repeated tasks in Rails

I'm building a web app that tracks stats for a game. The API for that game (Steam Web API) only allows me to retrieve data from the most recently played match.
When a user requests their stats to be tracked, I need make a call to the Steam Web API every 10 minutes or so to check if they have played another match, then store it in the database if they have. The problem is, I check every 10 minutes for every user...
Is there a way to schedule this efficiently so that the server doesn't get overloaded? This application could potentially server 10,000+ users.
Please feel free to correct anything in this question if I got something obviously wrong.
There are many gems for recurring events in ruby, like:
ice cube
recurrence
However, if you are going to have lots of users, and you are worried about server load, I would suggest not using your rails app to do this.
Instead, build another service which doesn't run on your rails app to update your database with statistics.
Having an independent service like this decoupled from your main app allows you to easily put it on another server, and/or have it scale independently from your webapp.

Resources