Caching( optimizing) Strategy with API live stream on Rails - ruby-on-rails

So I built a website that uses Twitch.tv API, which is a gaming live stream website. The requests are long and slow, and I would like to cache it somehow. The problem is that there are a lot of dynamic attributes, if they are still online, or how many viewers there are. Since the traffic to my website is low at the moment, expiring Cache early isn't going to help much. Also, I have a page where it lists all the live streams, and it requests to see if the stream is online. So even if no one is online it still takes a while to load. Is there anyway to retrieve api faster without caching?
here is twitch.tv api doc

Since you don't own the Twitch.tv API, unfortunately I would say there is really nothing you can do to make their calls faster.
The good news is that you can cache the calls you make to them, which will make things appear faster to your users.
The way to cache the calls is to create a key and then cache the return JSON from the API. To create the key I would just use the URL you are calling for the API. Then just give the cached value an expiration time of a few minutes and when it expires you make another API call to re-populate the cache.
Also I'd look at Varnish (https://www.varnish-cache.org/) which does a lot of HTTP caching really well. Could work really well for you and it has the concept of a grace period that tries to hide the expensive calls made when the cache expires.

Related

Caching an HTTP request made from a Rails API (google-id-token)?

ok, first time making an API!
My assumption is that if data needs to be stored on the back end such that it persists across multiple API calls, it needs to be 1) in cache or 2) in a Database. is that right?
I was looking at the code for the gem "google-id-token". it seems to do just what i need for my google login application. My front end app will send the google tokens to the API with requests.
the gem appears to cache the public (PEM) certificates from Google (for an hour by default) and then uses them to validate the Google JWT you provide.
but when i look at the code (https://github.com/google/google-id-token/blob/master/lib/google-id-token.rb) it just seems to fetch the google certificates and put them into an instance variable.
am i right in thinking that the next time someone calls the API, it will have no memory of that stored data and just fetch it again?
i guess its a 2 part question:
if i put something in an #instance_variable in my API, will that data exist when the next API call comes in?
if not, is there any way that "google-id-token" is caching its data correctly? maybe HTTP requests are somehow cached on the backend and therefore the network request doesnt actually happen over and over? can i test this?
my impulse is to write "google-id-token" functionality in a way that caches the google certs using MemCachier. but since i dont know what I'm doing i thought i would ask.? Maybe the gem works fine as is, i dont know how to test it.
Not sure about google-id-token, but Rails instance variables are not available beyond single requests and views (and definitely not from one user's session to another).
You can low-level cache anything you want with Rails.cache.fetch this is put in a block, takes a key name, and an expiration. So it looks like this:
Rails.cache.fetch("google-id-token", expires_in: 24.hours) do
#instance_variable = something
end
If the cache exists and it is not past its expiration date/time, Rails grabs it from the cache; otherwise, it would make your API request.
It's important to note that low-level caching doesn't work with mem_store (the default for development) and so you need to implement a solution with redis or memcached or something like that for development, too. Also, make sure the file tmp/cache.txt exists. You can run rails dev:cache or just touch it to create it.
More on Rails caching

How does the userQuota limitations works on YouTube Data API V3?

I'm building an alternative client for YouTube subscriptions browsing (folder based subscriptions with according feed generated), and I'm making a lot of requests to YouTube to aggregate that data.
I'm caching a lot of requests as it is not needed to refresh them once it has been fetched on any other day before the current one.
The fact is, current-day refreshes are consuming a lot, and I reach my quota pretty fast even though those requests are read-only.
I submitted that YouTube quota increase request form, but still, I'm quite afraid.
Am I missing something with the userIp & quotaUser parameters ?
Shouldn't those requests - as they are pretty much the same that a normal user would do on the regular YouTube client - be considered as "Queries per 100 seconds per user" ?
My main quota, the "Queries per day" currently seems to handle ALL the requests coming from my app, even though I added the quotaUser parameter on all my requests made by a user on the frontend.
I think I am missing something as my app should not be considered as "data consuming" as it is sending almost nothing to YouTube in terms of data, and it is just reading data that is also available on the YouTube main client, but not in the same format..
Thanks for your help.

Is there a canonical pattern for caching something related to a session on the server?

In my Rails app, once per user session, I need to have my server send a request to one of our other services to get some data about the user. I only want to make this request once per session because pinging another service every time the user makes a request will significantly slow down our response time. However, I can't store this information in a cookie client-side. This information has some security implications - if the user has the ability to lie to our server about what this piece of information is, they can gain access to data they're not authorized to see.
So what is the best way to cache or store a piece of data associated with a session on the Rails server?
I'm considering using Rails low-level caching, and I think it might even be correct:
Rails.cache.fetch(session.id, expires_in: 12.hours) do
OtherServiceAPI.get_sensitive_data(user.id)
end
I know that Rails often has one canonical way of doing things, though, so I want to be sure there's not a built-in, officially preferred way to associate a piece of data with a session. This question makes it look like there are potential pitfalls using the approach I'm considering as well, although it looks like those concerns may have been made obsolete in newer versions of Rails.
Is there a canonical pattern for what I'm trying to do? Or is the approach I'm considering idiomatic enough?

How to load external data into the view asynchronously

In a Rails 3.2 app I have a view that is pulling in information from an external API. On slow connections, this severely reduces the page load time and affects user experience.
How can I move this into an asynchronous process so that the rest of the page loads, and the external information is rendered later once it has been fetched and is available.
The external data is large and complex and I don't think is suitable to cache in the database or in a variable.
I'm aware of delayedjob and similar gems, but these seem more suited to queuing database methods rather than in the view.
What other options are available to me?
It seems like a large data set is perfectly suitable for caching on your local server.
Keep in mind, a long request is going to lock your Rails process/thread and and can't serve any other requests while waiting for your API call to finish.
That said, you can always trigger an Ajax request to occur once the rest of the page loads.

Why would you upload assets directly to S3?

I have seen quite a few code samples/plugins that promote uploading assets directly to S3. For example, if you have a user object with an avatar, the file upload field would load directly to S3.
The only way I see this being possible is if the user object is already created in the database and your S3 bucket + path is something like
user_avatars.domain.com/some/id/partition/medium.jpg
But then if you had an image tag that tried to access that URL when an avatar was not uploaded, it would yield a bad result. How would you handle checking for existence?
Also, it seems like this would not work well for most has many associations. For example, if a user had many songs/mp3s, where would you store those and how would you access them.
Also, your validations will be shot.
I am having trouble thinking of situations where direct upload to S3 (or any cloud) is a good idea and was hoping people could clarify either proper use cases, or tell me why my logic is incorrect.
Why pay for storage/bandwidth/backups/etc. when you can have somebody in the cloud handle it for you?
S3 (and other Cloud-based storage options) handle all the headaches for you. You get all the storage you need, a good distribution network (almost definitely better than you'd have on your own unless you're paying for a premium CDN), and backups.
Allowing users to upload directly to S3 takes even more of the bandwidth load off of you. I can see the tracking concerns, but S3 makes it pretty easy to handle that situation. If you look at the direct upload methods, you'll see that you can force a redirect on a successful upload.
Amazon will then pass the following to the redirect handler: bucket, key, etag
That should give you what you need to track the uploaded asset after success. Direct uploads give you the best of both worlds. You get your tracking information and it unloads your bandwidth.
Check this link for details: Amazon S3: Browser-Based Uploads using POST
If you are hosting your Rails application on Heroku, the reason could very well be that Heroku doesn't allow file-uploads larger than 4MB:
http://docs.heroku.com/s3#direct-upload
So if you would like your users to be able to upload large files, this is the only way forward.
Remember how web servers work.
Unless you're using a sort of async web setup like you could achieve with Node.JS or Erlang (just 2 examples), then every upload request your web application serves ties up an entire process or thread while the file is being uploaded.
Imagine that you're uploading a file that's several megabytes large. Most internet users don't have tremendously fast uplinks, so your web server spends a lot of time doing nothing. While it's doing all of that nothing, it can't service any other requests. Which means your users start to get long delays and/or error responses from the server. Which means they start using some other website to get the same thing done. You can always have more processes and threads running, but each of those costs additional memory which eventually means additional $.
By uploading straight to S3, in addition to the bandwidth savings that Justin Niessner mentioned and the Heroku workaround that Thomas Watson mentioned, you let Amazon worry about that problem. You can have a single-process webserver effectively handle very large uploads, since it punts that actual functionality over to Amazon.
So yeah, it's more complicated to set up, and you have to handle the callbacks to track things, but if you deal with anything other than really small files (and even in those cases), why cost yourself more money?
Edit: fixing typos

Resources