Opening and closing streaming clients for specific durations - ruby-on-rails

I'd like to infrequently open a Twitter streaming connection with TweetStream and listen for new statuses for about an hour.
How should I go about opening the connection, keeping it open for an hour, and then closing it gracefully?
Normally for background processes I would use Resque or Sidekiq, but from my understanding those are for completing tasks as quickly as possible, not chilling and keeping a connection open.
I thought about using a global variable like $twitter_client but that wouldn't horizontally scale.
I also thought about building a second application that runs on one box to handle this functionality, but that seems excessive if it can be integrated into the main app somehow.
To clarify, I have no trouble starting a process, capturing tweets, and using them appropriately. I'm just not sure what I should be starting. A new app? A daemon of some sort?
I've never encountered a problem like this, and am completely lost. Any direction would be much appreciated!

Although not a direct fix, this is what I would look at:
Time
You're working with time, so I'd look at what time-centric processes could be used to open the connection for an hour.
Specifically, I'd look at running some sort of job on the server, which you could fire at specific times (programmatically if required), to open and close the connection. I only have experience with Resque, but as you say, it's probably not up to the job. If I find any better solutions, I'll certainly update the answer.
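As a rough sketch of the "open for an hour, then close" part: the loop below reads from a stream until a deadline passes and always closes the connection afterwards. The names listen_for, FakeStream and the injectable clock are hypothetical and only there for illustration. This is not TweetStream's API, and the fake clock is just a way to make the example deterministic.

```ruby
# Hypothetical helper: consume items from any enumerable "stream" until a
# deadline passes, then close the connection even if an error was raised.
def listen_for(duration_seconds, stream,
               clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
  deadline = clock.call + duration_seconds
  handled = 0
  begin
    stream.each do |status|
      break if clock.call >= deadline # stop once the time window is over
      yield status
      handled += 1
    end
  ensure
    # Runs on normal exit, on break, and on exceptions.
    stream.close if stream.respond_to?(:close)
  end
  handled
end

# Stand-in for a real streaming client (an assumption for this sketch).
class FakeStream
  include Enumerable
  attr_reader :closed

  def initialize(items)
    @items = items
    @closed = false
  end

  def each(&block)
    @items.each(&block)
  end

  def close
    @closed = true
  end
end

fake_time = 0
ticking_clock = -> { fake_time += 1 } # every call advances one fake "second"
stream = FakeStream.new(%w[a b c d e f])
seen = []
count = listen_for(3, stream, clock: ticking_clock) { |status| seen << status }
```

Note that the deadline is only checked when a new item arrives, so a quiet stream would keep this loop waiting past the hour. With a real client you would probably also want a timer or a read timeout. Something like this could be started by a scheduled task or a small standalone process rather than a Resque/Sidekiq worker.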
Storage
Once you've connected to TweetStream, you'll want to look at how you can capture the tweets for that time period. It seems a waste to create a data table just for the job, so I'd be inclined to use something like Redis to store the tweets that you need
This can then be used to output the tweets you need, allowing you to simulate storing / capturing them, but then delete them after the hour-window has passed
Delivery
I don't know what context you're using this feature in, so I'll just give you as generic a process idea as possible.
To display the tweets, I'd personally create some sort of record in the DB to show the time you're pinging TweetStream that day (if it changes; if it's constant, just set a constant in an initializer), and then just include some logic to try and get the tweets from Redis. If you're able to collect them, show them as you wish, else don't print anything
Hope that gives you a broader spectrum of ideas?

Related

Proper way to save/update a one-off timestamp in Rails app

New to Rails, and looking for the 'right' way to do something that seems straight-forward, but nothing I've read about sounds quite right.
I have a Rails app on Heroku, and I've added a call to an endpoint that depends on an external system. If that call is unsuccessful there'll be some follow up needed, so I save details to the error log. I've added a notification email (to a slack room for this sort of thing) to prompt me to check the logs and follow up if it happens.
In case the endpoint gets bogged down and fails repeatedly, I want to be able to throttle the slack alert so I don't spam everyone (for example, only email the slack room if 30 min have gone by since the last time it alerted).
To do this, I imagine I need:
somewhere to save a timestamp for the last email notification for the error
whenever the error occurs, compare with that timestamp and only email slack room if the 30-min window has passed. Then update the timestamp with the new value.
What's an appropriate place to save this kind of timestamp value? I've read that global variables are the devil (and wouldn't actually work in this case), but the other options (adding database field, trying the simpleconfig gem) seem excessive/incorrect for something internal that I don't even know will happen once, let alone frequently.
Is there a lightweight way to get this done?
A popular choice would be to store it in a Redis store -- especially if you already have one set up for something else, like caching. As this is itself ephemeral data, you could even use the Rails.cache API to abstract away the detail and have this code just trust that it gets stored somewhere.
Failing that, the most straightforward solution is probably to create a tiny single-row table and store it in there: it's overkill, but doesn't involve doing anything unusual, or that would look out of place in the middle of a Rails application.
As a quick and simple solution, though, a global variable isn't out of the question: it has strong limitations, like it won't be shared across multiple server processes, and it'll go away any time the process restarts... but if those add up to a risk that you'll get, say, 4-6 notifications in an error-heavy 30 minute period -- maybe that's good enough? (It'd also give you a "reset on deploy" feature for free, so you know immediately if the problem's still occurring after you think you've fixed it.)
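For what it's worth, here is a minimal sketch of the "compare with the last timestamp, then update it" logic. It assumes a store that responds to read(key) and write(key, value), which is the shape of Rails.cache. AlertThrottle and MemoryStore are hypothetical names, and MemoryStore is only a stand-in so the example runs outside Rails.

```ruby
# Hypothetical throttle: `store` only needs read(key) and write(key, value).
class AlertThrottle
  def initialize(store, window_seconds: 30 * 60, clock: -> { Time.now })
    @store = store
    @window = window_seconds
    @clock = clock
  end

  # Yields (i.e. sends the alert) only if the window has passed since the
  # last alert for this key. Returns true if the alert was sent.
  def alert(key)
    now = @clock.call
    last = @store.read(key)
    return false if last && (now - last) < @window

    @store.write(key, now)
    yield
    true
  end
end

# In-memory stand-in for a cache (assumption for the sketch).
class MemoryStore
  def initialize
    @data = {}
  end

  def read(key)
    @data[key]
  end

  def write(key, value)
    @data[key] = value
  end
end

now = Time.at(0)
throttle = AlertThrottle.new(MemoryStore.new, clock: -> { now })
sent = []
throttle.alert("endpoint_error") { sent << :first }  # sent
now = Time.at(10 * 60)
throttle.alert("endpoint_error") { sent << :second } # suppressed, 10 min later
now = Time.at(31 * 60)
throttle.alert("endpoint_error") { sent << :third }  # sent, window has passed
```

The read-then-write here is not atomic, so two processes failing at the same moment could both alert. For a Slack notification that is probably an acceptable trade-off.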

Should I POST high-rate user actions to my server on a per-action basis or send the batch of events once the session is closed?

I'm building a site where users can watch a video and click as many times as they want to "like" it. It's a bit like Periscope's "Hearts" function for those who know it.
The viewers are watching the video in a web browser for now. Every "Like" is written to a Heroku-hosted Redis instance, so the writes/reads are fairly cheap. However, there could potentially be a high rate of simultaneous input as many users watch a video at the same time.
In this scenario, I'm facing two options:
Send an event to the Redis instance every time the user "likes". Advantage: the "like" is stored right away with all relevant information. Drawback: lots of concurrent likes hitting the server.
Cache the "likes" locally and only send them to Redis once the session is over. Problem: the user can close the browser at any time (and potentially never return), so the "like" information could be lost permanently.
Any advice on which option is preferable?
Don't cache.
First, it's a really big complication as you won't know when the session is really over.
Second, Redis increment is probably as fast or faster than your cache. I bet your concern is Rails only, not Redis.
You may eventually want to make another endpoint - maybe a simple Sinatra app - to simply handle likes. I noticed autosuggest gems sometimes do this (for example) and it saves all the overhead of a rails request.
If it is a successful app, the concern could be someone writing a script to 'like' continually. You may need to put in some throttle to allow a limited number of requests over time.
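A throttle like that could look roughly like the fixed-window counter below. This is only a sketch: LikeLimiter is a hypothetical name, and the Hash stands in for Redis, where the same idea is usually an increment on a per-window key that is given an expiry.

```ruby
# Hypothetical fixed-window limiter: at most `max_per_window` likes per user
# in each window of `window_seconds`.
class LikeLimiter
  def initialize(max_per_window:, window_seconds:, clock: -> { Time.now.to_i })
    @max = max_per_window
    @window = window_seconds
    @clock = clock
    # In-memory stand-in for Redis. Old windows are never cleaned up here;
    # with Redis, key expiry would take care of that.
    @counts = Hash.new(0)
  end

  # Counts the request and reports whether it is still within the limit.
  def allow?(user_id)
    bucket = @clock.call / @window # integer index of the current window
    key = [user_id, bucket]
    @counts[key] += 1
    @counts[key] <= @max
  end
end

now = 0
limiter = LikeLimiter.new(max_per_window: 3, window_seconds: 10, clock: -> { now })
results = 5.times.map { limiter.allow?("user-1") }
now = 10 # next window starts, the counter is fresh again
after_reset = limiter.allow?("user-1")
```

The first three calls are allowed, the next two are rejected, and the user can like again once the next window starts.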

Iphone app that needs to scrape a website once every day

So I'm making an iPhone application that needs to scrape a website once every day.
What I'm going to scrape is a table of that day's upcoming games for a soccer division. That's why I need the app to scrape the same page and the same table once every day, to keep the upcoming games updated.
I was referred to import.io, but they didn't have anything like a scheduled re-crawl.
I would love to get some ideas and tips on how I should do this, since I'm stuck now.
You might take a look at https://www.kimonolabs.com/
I played around with the service a while back and was impressed with how easy it was to set up. They have a "free" option so long as the APIs you create are not private.
Oh, and I agree with Paul, screen scraping is not something the iOS client should be doing. Too fragile, and when (not if) something breaks, you will need to go through an Apple review process to fix it.
This doesn't seem like something an app should do. Your server should do it (so that the scraping is only performed once), and your clients can retrieve the results from your server. That also means you could send out push notifications for important fixtures etc. Maybe that's what you meant, anyway.
If it's on the server you can just setup a scheduler (in Java, for example) to run once every x hours (probably a smaller number than 24 assuming you don't know when the website is to be updated). Then your app can just get the latest list of fixtures from your server on startup, pull-to-refresh, etc. Presumably someone will open your app, look at the fixtures, then come out of your app - so it doesn't seem like you need to cover the case where someone is in your app all day, but if you did you could use NSTimer to run every x minutes after the initial on-startup server call.

Can I prevent an iOS user from changing the date and time?

I want to deploy managed iOS devices to employees of the company, and the app they will use will timestamp data that will be recorded locally, then forwarded. I need those timestamps to be correct, so I must prevent the user from adjusting the time on the device, recording a value, then resetting the date and time. Date and time will be configured to come from the network automatically, but the device may not have network connectivity at all times (otherwise I would just read network time every time a data value is recorded). I haven't seen an option in Apple Configurator to prevent changing the date and time, so is there some other way to do this?
You won't be able to prevent a user either changing their clock or just hitting your API directly as other commentators have posted. These are two separate issues and can be solved by having a local time that you control on the device and by generating a hashed key of what you send to the server.
Local Time on Device:
To start, make an API call when you start the app which sends back a timestamp from the server; this is your 'actual time'. Now store this on the device and run a timer which uses a phone uptime function (not mach_absolute_time() or CACurrentMediaTime() - these get weird when your phone is in standby mode) and a bit of math to increase that actual time every second. I've written an article on how I did this for one of my apps at (be sure to read the follow up as the original article used CACurrentMediaTime() but that has some bugs). You can periodically make that initial API call (i.e. if the phone goes into the background and comes back again) to make sure that everything is staying accurate but the time should always be correct so long as you don't restart the phone (which should prompt an API call when you next open the app to update the time).
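The "bit of math" is just server time plus elapsed uptime. Here it is sketched in Ruby purely to show the arithmetic; on the device this would be written in Swift or Objective-C. TrustedClock and the uptime callable are hypothetical, and the uptime source is assumed to be a counter the user cannot reset and that keeps advancing while the phone is in standby.

```ruby
# Hypothetical clock: anchors a server-provided time to an uptime counter
# and derives the current time from how far the counter has advanced.
class TrustedClock
  def initialize(server_time, uptime:)
    @uptime = uptime
    @base_server_time = server_time # the "actual time" from the API call
    @base_uptime = uptime.call      # uptime reading taken at the same moment
  end

  def now
    @base_server_time + (@uptime.call - @base_uptime)
  end
end

uptime = 1_000 # fake uptime in seconds, for a deterministic example
clock = TrustedClock.new(Time.at(1_700_000_000), uptime: -> { uptime })
uptime += 90   # 90 seconds pass, whatever the wall clock says
current = clock.now
```

Because the wall clock is never consulted after the initial sync, changing the date in Settings does not affect the result. A reboot resets uptime, which is why a fresh API call is needed afterwards.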
Securing the API:
You now have a guaranteed* accurate time on your device but you still have an issue in that somebody could send the wrong time to your API directly (i.e. not from your device). To counteract this, I would use some form of salt/hash with the data you are sending similar to OAuth. For example, take all of the parameters you are sending, join them together and hash them with a salt only you know and send that generated key as an extra parameter. On your server, you know the hash you are using and the salt so you can rebuild that key and check it with the one that was sent; if they don't match, somebody is trying to play with your timestamp.
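On the server side, the rebuild-and-compare step might look like the sketch below. Note that it uses an HMAC (a keyed hash from Ruby's OpenSSL library) rather than a plain salted hash, and the parameter names, the SECRET value and the helper names are all assumptions for illustration. The "key=value" joining is simplified and would need proper escaping if values can contain "&" or "=".

```ruby
require "openssl"

# Assumption: a secret shared by the app and the server.
SECRET = "replace-with-a-real-secret"

# Join parameters in a fixed (sorted) order so both sides hash the same string.
def canonical(params)
  params.sort_by { |key, _| key.to_s }.map { |key, value| "#{key}=#{value}" }.join("&")
end

def sign(params, secret = SECRET)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), secret, canonical(params))
end

# Compare without bailing out at the first differing byte.
def secure_equal?(a, b)
  return false unless a.bytesize == b.bytesize

  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

# Server side: rebuild the key from the received parameters and compare.
def valid_signature?(params, signature, secret = SECRET)
  secure_equal?(sign(params, secret), signature.to_s)
end

params = { device_id: "abc", value: 42, recorded_at: 1_700_000_000 }
signature = sign(params)
tampered = params.merge(recorded_at: 1_600_000_000)
genuine_ok = valid_signature?(params, signature)
tampered_ok = valid_signature?(tampered, signature)
```

The original parameters verify and the tampered timestamp does not. Keep in mind that a secret shipped inside an app can be extracted by a determined attacker, so this raises the bar rather than making forgery impossible.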
*Caveat: A skilled attacker could hijack the connection so that any calls to example.com/api/timestamp come from a different machine they have set up which returns the time they want so that the phone is given the wrong time as the starting base. There are ways to prevent this (obfuscation, pairing it with other data, encryption) but that becomes a very open-ended question very quickly so best asked elsewhere. A combination of the above plus a monitor to notice weird times might be the best thing.
There doesn't appear to be any way to accomplish what you're asking for. There doesn't seem to be a way to stop the user from being able to change the time. But beyond that, even if you could prevent them from changing the time, they could let their device battery die, then plug it in and turn it on where they don't have a net connection, and their clock will be wrong until it has a chance to set itself over a network. So even preventing them from changing the time won't guarantee accuracy.
What you could do is require a network connection to record values, so that you can verify the time on a server. If you must allow it to work without a net connection, you could at least always log the current time when the app is brought up and note if the time ever seems to go backwards. You'll know something is up if the timestamp suddenly is earlier than the previous timestamp. You could also do this check perhaps only when they try to record a value. If they record a value that has a timestamp earlier than any previous recorded value, you could reject it, or log the event so that the person can be questioned about it at a later time.
This is also one of those cases where maybe you just have to trust the user not to do this, because there doesn't seem to be a perfect solution to this.
The first thing to note is that the user will always be able to forge messages to your server in order to create incorrect records.
But there are some useful things you can use to at least notice problems. Most of the time the best way to secure this kind of system is to focus on detection, and then publicly discipline anyone who has gone out of their way to circumvent policy. Strong locks are meaningless unless there's a cop who's eventually going to show up and stop you.
Of course you should first assume that any time mistakes are accidental. But just publicly "noticing" that someone's device seems to be "misbehaving" is often enough to make bad behaviors go away.
So what can you do? The first thing is to note the timestamps of things when they show up at the server. Timestamps should always move forward in time. So if you've already seen records from a device for Monday, you should not later receive records for the previous Sunday. The same should be true for your app. You can keep track of when you are terminated in NSUserDefaults (as well as posting this information to the server). You should not generally wake up in the past. If you do, complain to your server.
Watch for UIApplicationSignificantTimeChangeNotification. I believe you'll receive it if the time is manually changed (you'll receive it in several other cases as well, most of them benign). Watch for time moving significantly backwards. Complain to your server.
Pay attention to mach_absolute_time(). This is the time since the device was booted and is not otherwise modifiable by the user without jailbreaking. It's useful for distinguishing between reboots and other events. It's in a weird time unit, but it can be converted to human time as described in QA1398. If the mach time difference is more than an hour greater than the wall clock time, something is weird (DST changes can cause 1 hour). Complain to your server.
All of these things could be benign. A human will need to investigate and make a decision.
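The first of those checks (timestamps from a device should only move forward) could be sketched server-side like this. TimestampMonitor is a hypothetical name, and the Hash would be a database table in practice. The sketch only flags the record for a human to look at, in line with the point about investigation.

```ruby
# Hypothetical server-side check: remember the newest timestamp seen per
# device and flag any record that claims an earlier time.
class TimestampMonitor
  attr_reader :flagged

  def initialize
    @latest = {}  # device_id => newest timestamp accepted so far
    @flagged = [] # records for a human to investigate
  end

  # Returns true if the record is in order, false if it was flagged.
  def record(device_id, timestamp)
    latest = @latest[device_id]
    if latest && timestamp < latest
      @flagged << { device_id: device_id, timestamp: timestamp, previous: latest }
      false
    else
      @latest[device_id] = timestamp
      true
    end
  end
end

monitor = TimestampMonitor.new
monitor.record("device-1", Time.at(200))
in_order = monitor.record("device-1", Time.at(300))
suspicious = monitor.record("device-1", Time.at(100)) # earlier than the last one
```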
None of these things will ensure that your records are correct if there is a dedicated and skilled attacker involved. As I said, a dedicated and skilled attacker could just send you fake messages. But these things, coupled with monitoring and disciplinary action, make it dangerous for insiders to even experiment with how to beat the system.
You cannot prevent the user from changing time.
Even the time of a Location is adjusted by Apple, and is not a real GPS time.
You could look at mach kernel time, which is a relative time.
Compare that to the time recorded at the last network connection.
But none of this sounds reliable.

Letting something happen at a certain time with Rails

Like with browser games. User constructs building, and a timer is set for a specific date/time to finish the construction and spawn the building.
I imagined having something like a daemon, but how would that work? To me it seems that spinning + polling is not the way to go. I looked at async_observer, but is that a good fit for something like this?
If you only need the event to be visible to the owning player, then the model can report its updated status on demand and we're done, move along, there's nothing to see here.
If, on the other hand, it needs to be visible to anyone from the time of its scheduled creation, then the problem is a little more interesting.
I'd say you need two things. A queue into which you can put timed events (a database table would do nicely) and a background process, either running continuously or restarted frequently, that pulls events scheduled to occur since the last execution (or those that are imminent, I suppose) and actions them.
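To make that concrete, here is a minimal sketch of the queue-plus-poller idea. EventQueue and ScheduledEvent are hypothetical names, and the Array stands in for the database table. The background process would call run_due every so often.

```ruby
# One row of the hypothetical events table.
ScheduledEvent = Struct.new(:run_at, :action, :done)

class EventQueue
  def initialize
    @events = [] # stand-in for a database table of timed events
  end

  def schedule(run_at, &action)
    @events << ScheduledEvent.new(run_at, action, false)
  end

  # Runs every pending event whose time has passed, oldest first.
  # Returns how many events ran.
  def run_due(now = Time.now)
    due = @events.select { |event| !event.done && event.run_at <= now }
    due.sort_by(&:run_at).each do |event|
      event.action.call
      event.done = true
    end
    due.size
  end
end

queue = EventQueue.new
log = []
queue.schedule(Time.at(100)) { log << :barracks }
queue.schedule(Time.at(500)) { log << :castle }
ran = queue.run_due(Time.at(200))       # only the barracks is due
ran_again = queue.run_due(Time.at(200)) # nothing new to do
```

Marking events as done means a poller that is restarted, or that runs late, picks up everything it missed without running anything twice.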
Looking at the list of options on the Rails wiki, it appears that there is no One True Solution yet. Let's hope that one of them fits the bill.
I just did exactly this thing for a PBBG I'm working on (Big Villain, you can see the work in progress at MadGamesLab.com). Anyway, I went with a commands table where user commands each generated exactly one entry and an events table with one or more entries per command (linking back to the command). A secondary daemon run using script/runner to get it started polls the event table periodically and runs events whose time has passed.
So far it seems to work quite well. Unless I see some problem when I throw a large number of users at it, I'm not planning to change it.
To a certain extent it depends on how much logic is on your front end, and how much is in your model. If you know how much time will elapse before something happens you can keep most of the logic on the front end.
I would use your model to determine the state of things, and on a particular request you can check to see if it is built or not. I don't see why you would need a background worker for this.
I would use AJAX to start a timer (see Periodical Executor) for updating your UI. On the model side, just keep track of the created_at column for your building and only allow it to be used if its construction time has elapsed. That way you don't have to take a trip to your db every few seconds to see if your building is done.
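On the model side that could be as small as the sketch below. A plain Struct stands in for an ActiveRecord model here, and construction_seconds, built? and seconds_remaining are hypothetical names. Only created_at comes from the description above.

```ruby
# Stand-in for a model with a created_at column and a known build time.
Building = Struct.new(:created_at, :construction_seconds) do
  def finished_at
    created_at + construction_seconds
  end

  # The building is usable once its construction time has elapsed.
  def built?(now = Time.now)
    now >= finished_at
  end

  # Handy for driving the front-end countdown timer.
  def seconds_remaining(now = Time.now)
    [finished_at - now, 0].max
  end
end

building = Building.new(Time.at(1_000), 60)
halfway_built = building.built?(Time.at(1_030))
halfway_left = building.seconds_remaining(Time.at(1_030))
finished = building.built?(Time.at(1_060))
```

Nothing has to "happen" at the finish time. The state is derived whenever someone asks, and the front end can use seconds_remaining to run its own countdown.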
