I have a 3rd party API service that I am interacting with, via my Rails app, and they have quite low daily limits.
Wondering what the best way to track API calls is?
Database
Redis
In memory
Other
I have a service object that makes all the calls. Just need a good way to track each call made and not go over the daily limit.
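One way to sketch the tracking, assuming a single counter per calendar day is enough: the service object asks a tracker before every call. `DailyCallTracker`, `DailyLimitReached` and the key format are hypothetical names made up for illustration, and the Hash store is a stand-in for whichever backend from the list above gets chosen.

```ruby
require "date"

# Hypothetical error raised when the daily budget is used up.
class DailyLimitReached < StandardError; end

# Counts calls per calendar day in a key/value store.
# The store only needs #[] and #[]=, so a plain Hash works in one process.
class DailyCallTracker
  def initialize(limit:, store: {}, clock: -> { Date.today })
    @limit = limit
    @store = store
    @clock = clock
  end

  # One key per day, so yesterday's count never leaks into today.
  def key
    "api_calls:#{@clock.call.iso8601}"
  end

  def used
    @store[key].to_i
  end

  def remaining
    [@limit - used, 0].max
  end

  # Records one call and runs the block, or raises if the budget is spent.
  def track
    raise DailyLimitReached, "daily limit of #{@limit} reached" if used >= @limit

    @store[key] = used + 1
    yield
  end
end

tracker = DailyCallTracker.new(limit: 100)
result = tracker.track { :response_from_api } # wrap the real call here
```

With several app processes the read-then-write in `track` is racy and an in-memory Hash is not shared between them. Redis's INCR command is atomic, so a Redis-backed store would probably be the simplest way to close that gap.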
I currently have an API for one of my projects and a service that is responsible for generating export files as CSVs, archive and store them somewhere in the cloud.
Since my API is written in Rails and my service in plain Ruby, I use the Her gem in the service to interact with the API. But I find my current implementation less performant, since I do a Model.all in my service, which in turn triggers a request that may contain way too many objects in the response.
I am curious about how to improve this whole task. Here's what I've thought of:
implement pagination at API level and call Model.where(page: xxx) from my service;
generate the actual CSV at API level and send the CSV back to the service (this may be done sync or async).
If I were to use the first approach, how many objects should I retrieve per page? How big should a response be?
If I were to use the second approach, this would bring quite an overhead to the request (and I guess API requests shouldn't take that long) and I also wonder whether it's really the API's job to do this.
What approach should I follow? Or, is there something better that I'm missing?
You need to pass a lot of information through a Ruby process, and that is never simple. I don't think you're missing anything here.
If you decide to generate CSVs at the API level, then what do you gain by maintaining the service? You could ditch the service altogether, because replacing it with an nginx proxy would do the same thing better (if you're just streaming the response from the API host).
If you decide to paginate, there will be a performance reduction for sure, but nobody can tell you exactly what page size to use. Bigger pages will be faster and consume more memory (reducing throughput, since you can run fewer workers); smaller pages will be slower and consume less memory but demand more workers because of IO wait times.
The exact numbers will depend on the IO response times of your API app, the cloud, and your infrastructure. I'm afraid no one can give you a simple answer you can follow without experimenting with a stress test, and once you set up a stress test you will get a number of your own anyway, better than anybody's estimate.
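To make the page size something you can vary in a stress test, the paging loop can be sketched as below. This is only an illustration: `each_record` and `fetch_page` are hypothetical names, the lambda stands in for the real paginated API request, and 500 is an arbitrary starting value, not a recommendation.

```ruby
# Lazily yields every record by requesting one page at a time.
# `fetch_page` is a hypothetical callable: (page, per_page) -> Array of records.
def each_record(fetch_page, per_page: 500)
  Enumerator.new do |yielder|
    page = 1
    loop do
      records = fetch_page.call(page, per_page)
      records.each { |record| yielder << record }
      break if records.size < per_page # a short page means we reached the end

      page += 1
    end
  end
end

# Stand-in for the API: 1,234 fake rows served in slices.
rows = (1..1234).map { |i| { id: i } }
fake_api = ->(page, per_page) { rows.slice((page - 1) * per_page, per_page) || [] }

first_ids = each_record(fake_api, per_page: 500).first(3).map { |r| r[:id] }
```

Only one page is held in memory at a time, so each row can be written to the CSV as it arrives, and `per_page` becomes the single knob to turn while measuring.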
A suggestion: write a bit more about your problem, the constraints you are working under, etc., and maybe someone can help you with a more radical solution. For some reason I get the feeling that what you're really looking for is a background processor like Sidekiq or Delayed Job, or maybe connecting your service to the DB directly through a DB view if you are anxious to decouple your apps, or an nginx proxy for API responses, or nothing at all... but I really can't tell without more information.
I think it really depends on how you want to define 'performance' and what your goal for your API is. If you want to make sure no request to your API takes longer than 20 ms to respond, then adding pagination would be a reasonable approach, especially if the CSV generation is just an edge case and the API is really built for other services. The number of items per page would then be limited by the speed at which you can deliver them. Your service would not be any more performant (less so, if anything), since it needs to call the API multiple times.
Creating an async call (maybe with a webhook as callback) would be worth adding to your API if you think it is a valid use case for services to dump the whole record set.
Having said that, I think strictly speaking it is the job of the API to be quick and responsive. So maybe try to figure out how caching can improve response times, so paging through all the records is reasonable. On the other hand it is the job of the service to be mindful of the amount of calls to the API, so maybe store old records locally and only poll for updates instead of dumping the whole set of records each time.
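The "store old records locally and only poll for updates" idea might look roughly like this. It assumes the API can filter by an `updated_at` timestamp, which the question does not confirm; `LocalMirror` and `fetch_updated_since` are hypothetical names and the lambda stands in for the real request.

```ruby
# Keeps a local copy of the records and asks only for what changed.
# `fetch_updated_since` is a hypothetical callable: (Time or nil) -> Array.
class LocalMirror
  attr_reader :records

  def initialize(fetch_updated_since)
    @fetch_updated_since = fetch_updated_since
    @records = {}
    @last_sync = nil
  end

  # Returns how many records were new or changed since the previous sync.
  def sync
    changed = @fetch_updated_since.call(@last_sync)
    changed.each { |record| @records[record[:id]] = record }
    # Use the newest timestamp we received as the cursor, not the local clock.
    newest = changed.map { |record| record[:updated_at] }.max
    @last_sync = newest if newest
    changed.size
  end
end

# Stand-in for the remote API.
remote = [
  { id: 1, name: "a", updated_at: Time.at(100) },
  { id: 2, name: "b", updated_at: Time.at(200) }
]
api = ->(since) { since ? remote.select { |r| r[:updated_at] > since } : remote.dup }

mirror = LocalMirror.new(api)
first_sync = mirror.sync # a full dump the first time
```

After the first full dump, each export only costs a request for the rows that changed, and the CSV can be generated from the local copy.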
I'm developing an app for iOS. I'd like to connect to a database to store users, passwords, some data related to them, etc., so I discovered a thing called Parse. I've integrated it into my app and it works fine, but I don't know if there's any limit (e.g. limited writes, limited daily access; I'm not paying anything).
Would you consider it better to write a web service or API myself and do the reads and writes manually?
The biggest limits with Parse are on API requests. They start out at 30/s free and then it gets more expensive from there. So if you're never going to have that many users/API requests, it can remain free for some time.
I have two Rails apps (one for web, and one for backend) accessing to the same PGSQL database. I would like to notify the other app if one app changes a table in the database.
How should I go about it?
I think this depends on:
How reliable you need it to be.
How fast you need the notifications to be delivered.
The Faye solution suggested by @techvineet provides a good, fast but unreliable option. (N.b. I don't mean it'll fail often, but it likely will occasionally, maybe 1 in 1000. If that causes you a problem, then avoid it.)
If you need something 100% reliable and speed isn't important, you could write audit events to the database and then poll that table from each app. If these are committed in the same transaction as the actual work, you should be safe, but it'll be as slow as your polling cycle.
Lastly, if you want something fast AND reliable, then you could look at using something like ActiveMQ or RabbitMQ to give you reliable messaging between the applications to notify changes. You'll need a worker process in each app to listen to changes and deal with them appropriately.
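A rough sketch of the polling side, assuming each audit row has an increasing `id`: the poller remembers the last id it handled and asks only for newer rows. `AuditPoller` and `fetch_after` are hypothetical names, and the array stands in for the audit table.

```ruby
# Remembers the last audit event it handled and asks only for newer ones.
# `fetch_after` is a hypothetical callable: (last_id) -> events ordered by id.
class AuditPoller
  attr_reader :last_seen_id

  def initialize(fetch_after, last_seen_id: 0)
    @fetch_after = fetch_after
    @last_seen_id = last_seen_id
  end

  # Handles every unseen event once and returns how many were handled.
  def poll
    events = @fetch_after.call(@last_seen_id)
    events.each do |event|
      yield event
      # Advance only after the handler succeeded, so a failure is retried.
      @last_seen_id = event[:id]
    end
    events.size
  end
end

# Stand-in for the audit table that both apps can read.
audit_events = [
  { id: 1, table: "orders", action: "update" },
  { id: 2, table: "users",  action: "insert" }
]
fetch_after = ->(id) { audit_events.select { |e| e[:id] > id }.sort_by { |e| e[:id] } }

poller = AuditPoller.new(fetch_after)
handled = []
poller.poll { |event| handled << event[:id] }
```

In a real app `last_seen_id` would have to be persisted so that a restart does not replay or skip events, and `poll` would run on a timer in each app.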
My last comment would be that this 'smells' a little. The fact that you're trying to do this makes me think the architecture of your app might need looking at in the longer term. An obvious way of doing it might be to encapsulate all the business logic into an app which exposes an API, and then calling that API from both front and back end applications.
You can try using Faye http://faye.jcoglan.com/, which is a publish and subscribe messaging server. It can be integrated with Rails https://github.com/jamesotron/faye-rails.git. Messages can be transferred from one app to another by subscribing to the messages and publishing.
Hope this will help.
For web development I'd like to mix rails and node.js since I want to get the best out of both worlds (rails for fast web development and node for concurrency). I know that some people choose to just use full ruby stack with eventmachine that is integrated into rails controller so that every request can be nonblocking by using fiber in event-loop model. I have been able to understand how that works in a big picture.
At this moment, however, I want to try doing non-blocking request processing with Rails and Node.js using the message queue concept. I heard that this can be achieved by using Redis as an intermediary. I'm still having trouble figuring out how that works. From what I can understand: we have two apps, A (Rails) and B (Node.js), plus Redis. The Rails app will handle requests from users that go through controllers in a REST manner, and from there Rails will pass them on to Redis. Redis will form queues, and the Node.js app will pick up that queue and do whatever is necessary afterwards (write to or read from the backend DB).
My questions:
1. So how would that improve concurrency and scalability? From what I know, since Rails handles requests through controllers synchronously and then writes to Redis, the requests will still be blocking, even though the Node.js end can pick up the queue asynchronously. (I have a feeling that it's not asynchronous if it's not non-blocking end to end.)
2. Would Node.js be considered a proxy or an application here if Redis is the intermediary?
3. I'm new to Redis and still learning it. If I'm using a 100% NoSQL solution for my backend database, such as MongoDB or CouchDB, can it be replaced by Redis entirely, or is Redis seen more as a message queue tool like RabbitMQ?
4. Is a message queue a different concurrency concept than threading or the event-loop model, or is it supposed to supplement them?
Those are all my questions. I'm new to the message queue concept. I'd appreciate any help, pointers in the right direction, and articles that help me learn more. Thanks.
You are mixing some things here that don't go together.
Let's first make sure we are on the same page regarding the strengths/weaknesses of the involved technologies:
Rails: Used for its web-development simplicity and perfect for serving database-backed web applications.
Not very performant when having to serve a large number of long-running requests, as you'd run out of threads on your Ruby workers, but well suited for anything that can scale horizontally with more web nodes (multiple web servers, one DB).
Node.js: Great for high-concurrency scenarios. Not as easy as Rails to write a regular web application in. But it can handle a near-insane number of long-running, low-CPU tasks efficiently.
Redis: A key-value store that supports operations on its data structures (increment/decrement values, append/prepend, push/pop on lists; all operations that make this DB work consistently with multiple clients writing at once).
Now, as you can see, there is no benefit in having Rails AND Node serve the same request while communicating through Redis. Going through the Rails stack would not provide any benefit if the request ends up being handled by the Node server.
And even if you only offload some processing to the node server, it's still the Rails webserver that handles the requests and has to wait for a response from node - killing the desired scalability. It simply makes no sense.
Where you would use a setup with Node and Rails together is when certain areas of your app have drastically different scaling requirements.
If you are for example writing a Website that displays live stats for Football games you can easily see that there are two different concerns in your app: The "normal" Site that contains signup, billing and profile stuff that screams for a quick implementation through rails. And the "live" portion of the site where users see live results and you expect to handle a lot of clients at once - all waiting for something to happen (low cpu - high concurrency).
In such a case it may be beneficial to actually separate the two parts of the site into a Ruby and a Node app, which then share data about the user through a store like Redis (but actually you just need some shared state that both can look at and write to for synchronization purposes).
So you would use, for example, Rails for the signup/login portions. Once the user is signed in, write the session cookie into Redis along with the permissions of the user (which games they are allowed to follow) and hand the user off to the Node.js app.
There the Node app can read the session information from Redis and serve the user.
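As a rough illustration of that hand-off, the shared state can be as small as one JSON value per session token. Everything here is an assumption made for the sketch: the `session:` key prefix, the field names and both helper methods are hypothetical, and a Hash stands in for Redis. The reading side is written in Ruby only to keep the example in one language; in the setup described it would live in the Node app.

```ruby
require "json"
require "securerandom"

# Rails side (sketch): after login, publish what the other app needs to know.
def publish_session(store, user_id:, allowed_games:)
  token = SecureRandom.hex(16)
  store["session:#{token}"] = JSON.generate({ user_id: user_id, allowed_games: allowed_games })
  token
end

# What the Node side would do with the token it receives from the browser.
def read_session(store, token)
  raw = store["session:#{token}"]
  raw && JSON.parse(raw, symbolize_names: true)
end

store = {} # stand-in for the shared Redis instance
token = publish_session(store, user_id: 42, allowed_games: ["final"])
session = read_session(store, token)
```

JSON is used because both runtimes can read it without sharing any code.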
Word of advice:
You don't get scalability by simply throwing Node.js into your Toolbox. You really have to find out what Node.js is good at (low-cpu high-io concurrent operations) and how you can leverage that to remedy some of the problems your currently chosen technology has.
I can answer 3 for you. Redis does not guarantee that when you perform an operation the result will actually be on disk, and transaction handling is a bit "different". It also requires the whole database to be in memory. Depending on the situation this can be an issue or not. It is, however, incredibly fast. It is not a message queue; you can easily make a queue out of it, but that is not its purpose. If you only want a queuing system, you can probably do better with something else.
I'm contemplating writing a web application with Rails. Each request made by the user will depend on an external API being called. This external API can randomly be very slow (2-3 seconds), and so obviously this would impact an individual request.
During this time when the code is waiting for the external API to return, will further user requests be blocked?
Just for further clarification as there seems to be some confusion, this is the model I'm anticipating:
Alice makes request to my web app. To fulfill this, a call to API server A is made. API server A is slow and takes 3 seconds to complete.
During this wait time when the Rails app is calling API server A, Bob makes a request which has to make a request to API server B.
Is the Ruby (1.9.3) interpreter (or something in the Rails 3.x framework) going to block Bob's request, requiring him to wait until Alice's request is done?
If you only use one single-threaded, non-evented server (or don't use evented I/O with an evented server), yes. Among other solutions, using Thin and EM-Synchrony will avoid this.
Elaborating, based on your update:
No, neither Ruby nor Rails is going to cause your app to block. You left out the part that will, though: the web server. You either need multiple processes, multiple threads, or an evented server coupled with doing your web service requests with an evented I/O library.
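A small experiment shows that the interpreter itself does not serialize waiting. It is only a sketch: `slow_api_call` is a hypothetical stand-in that sleeps instead of calling a real API, relying on the fact that MRI lets other threads run during `sleep` much as it does while a thread waits on a socket.

```ruby
# Stand-in for a slow external API call.
def slow_api_call(name, seconds)
  sleep(seconds)
  "#{name} done"
end

# Wall-clock seconds taken by the block.
def elapsed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# One request after the other, as a single-threaded server would handle them.
serial = elapsed do
  slow_api_call("Alice", 0.3)
  slow_api_call("Bob", 0.3)
end

# One thread per request: Bob's wait overlaps with Alice's.
threaded = elapsed do
  [["Alice", 0.3], ["Bob", 0.3]]
    .map { |name, seconds| Thread.new { slow_api_call(name, seconds) } }
    .each(&:join)
end

puts format("serial: %.2fs, threaded: %.2fs", serial, threaded)
```

The serial run takes roughly the sum of both waits and the threaded run roughly the longer of the two. Whether a deployed app behaves like the first or the second case is decided by the web server in front of it.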
@alexd described using multiple processes. I personally favor an evented server because I don't need to know or guess ahead of time how many concurrent requests I might have (or use something that spins up processes based on load). A single nginx process fronting a single Thin process can serve tons of parallel requests.
The answer to your question depends on the server your Rails application is running on. What are you using right now? Thin? Unicorn? Apache+Passenger?
I wholeheartedly recommend Unicorn for your situation -- it makes it very easy to run multiple server processes in parallel, and you can configure the number of parallel processes simply by changing a number in a configuration file. While one Unicorn worker is handling Alice's high-latency request, another Unicorn worker can be using your free CPU cycles to handle Bob's request.
Most likely, yes. There are ways around this, obviously, but none of them are easy.
The better question is, why do you need to hit the external API on every request? Why not implement a cache layer between your Rails app and the external API and use that for the majority of requests?
This way, with some custom logic for expiring the cache, you'll have a snappy Rails app and still be able to leverage the external API service.
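The simplest form of that expiry logic is a time-to-live per key. The sketch below is an illustration, not a recommendation of specific values: `TtlCache` is a hypothetical in-process class, and the block stands in for the external API call.

```ruby
# Tiny in-process cache whose entries expire after `ttl` seconds.
# The clock is injectable so the expiry logic is easy to test.
class TtlCache
  def initialize(ttl:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl = ttl
    @clock = clock
    @entries = {}
  end

  # Returns the cached value while it is fresh; otherwise runs the block
  # (the slow external call) and remembers its result.
  def fetch(key)
    now = @clock.call
    entry = @entries[key]
    return entry[:value] if entry && now - entry[:stored_at] < @ttl

    value = yield
    @entries[key] = { value: value, stored_at: now }
    value
  end
end

cache = TtlCache.new(ttl: 60)
api_calls = 0
2.times { cache.fetch("weather:berlin") { api_calls += 1; "sunny" } }
puts api_calls # the external API was hit once for two requests
```

Inside a Rails app, `Rails.cache.fetch(key, expires_in: ...)` offers the same pattern on top of a shared store, so a hand-rolled class like this is mostly useful for seeing the idea.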