I have a model (loaded into memory) live in production that consumes messages from a message queue and makes a prediction for each one. A separate process retrains the model every few hours (this is necessary). What is the best way to trigger the production model to reload the newly trained version into memory whenever retraining completes? Currently I just have the production model reload on an interval, or every 1,000 messages.
I figured this would be easier if, instead of a message queue, I had a webserver: I could then just expose an endpoint that triggers a reload.
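Roughly what I have in mind is something like this minimal sketch, assuming a Flask server and a joblib-serialized scikit-learn-style model at a hypothetical path model.pkl:
from threading import Lock
from flask import Flask, jsonify, request
import joblib

app = Flask(__name__)
_lock = Lock()
_model = joblib.load('model.pkl')  # hypothetical path to the serialized model

@app.route('/reload', methods=['POST'])
def reload_model():
    global _model
    # load first, swap second, so serving never sees a half-loaded model
    fresh = joblib.load('model.pkl')
    with _lock:
        _model = fresh
    return 'reloaded', 200

@app.route('/predict', methods=['POST'])
def predict():
    with _lock:
        model = _model
    return jsonify(prediction=model.predict([request.json['features']]).tolist())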
It's hard to find best practices on this topic.
I've found a similar question here: Google App Engine: Automatically re-deploy once a day to update machine learning model?
The answers suggest that the best way is to redeploy when training completes. But I will likely have more models in this pipeline, and redeploying on every retrain is not really feasible.
Related
I am using Amazon SageMaker to train a model with a lot of data.
This takes a lot of time - hours or even days. During this time, I would like to be able to query the trainer and see its current status, particularly:
How many iterations has it already done, and how many iterations does it still need to do? (The training algorithm is deep learning, so it is iteration-based.)
How much more time does it need to complete the training?
Ideally, I would like to classify a test sample using the model from the current iteration, to see its current performance.
One way to do this is to explicitly tell the trainer to print debug messages after each iteration. However, these messages are available only at the console from which I run the trainer. Since training takes so much time, I would like to be able to query the trainer's status remotely, from different computers.
Is there a way to remotely query the status of a running trainer?
All logs are available in Amazon CloudWatch. You can query CloudWatch programmatically via its API to parse the logs.
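For example, here is a minimal sketch of pulling a training job's log events with boto3; /aws/sagemaker/TrainingJobs is the log group SageMaker uses for training jobs, while the job name and filter pattern are placeholders:
import boto3

logs = boto3.client('logs')
# SageMaker writes training logs to this log group, one stream per job
response = logs.filter_log_events(
    logGroupName='/aws/sagemaker/TrainingJobs',
    logStreamNamePrefix='your-training-job-name',  # placeholder
    filterPattern='iteration',  # hypothetical pattern matching your debug prints
)
for event in response['events']:
    print(event['message'])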
Are you using built-in algorithms or a framework like MXNet or TensorFlow? For TensorFlow, you can monitor your job with TensorBoard.
Additionally, you can see high-level job status using the describe_training_job API call:
import sagemaker

# describe_training_job returns the job's status, timing, and resource details
sm_client = sagemaker.Session().sagemaker_client
response = sm_client.describe_training_job(TrainingJobName='your-job-name-here')
print(response['TrainingJobStatus'])  # e.g. 'InProgress', 'Completed', 'Failed'
Background
I'm currently working on a small Rails 5 project that needs to access and process an external API. There is a Ruby wrapper gem available for the API, so accessing the data is not a problem.
Problem description
There are two parts of the equation that I am currently missing, and hoping someone out there can help me with.
1: I need to call the API, via Rails, every 15 minutes. How can I realize this? I was looking towards Active Job for this, but my research kind of stalled after getting no useful results.
2: The external API has different domain models and a different domain-specific language than my application. How can I map the different models without changes in Active Record?
1: I need to call the API, via Rails, every 15 minutes. How can I realize this? I was looking towards Active Job for this, but my research kind of stalled after getting no useful results.
The first problem can be solved using recurring tasks. The main idea is to run a process that performs some operation every x minutes (or days, or whatever fits your problem).
There are several tools you can use. One of them is built into Unix systems: cron. You can read about it in the system manual, and you can manage it easily using the whenever gem. The main disadvantage is that you need access to the system's cron, which may be non-trivial on managed hosts (for example, Platform-as-a-Service providers such as Heroku).
You should also take a look at clockwork, which does not rely on the system's cron. Instead, it keeps a separate process running all the time that keeps an eye on the defined tasks.
In the second approach (having a separate process), you need to remember that time-consuming instructions may "lock" the process and postpone other tasks. In that case, you may want to use background processing such as sidekiq or delayed_job. The idea is to use one process for scheduling tasks at certain times and another process to work off those tasks as soon as they appear in the queue; the sketch below shows the scheduling half of this pattern.
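The gems above are Ruby-specific, but the pattern itself is language-agnostic; here is a minimal sketch in Python using the schedule library, where fetch_external_api is a hypothetical stand-in for the actual API call:
import time
import schedule

def fetch_external_api():
    # In a real app this would enqueue a background job rather than doing
    # the slow API call inline, so the scheduler itself never blocks.
    print('fetching external API...')

# run the task every 15 minutes, clockwork-style, in a long-lived process
schedule.every(15).minutes.do(fetch_external_api)

while True:
    schedule.run_pending()
    time.sleep(1)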
2: The external API has different domain models and a different domain-specific language than my application. How can I map the different models without changes in Active Record?
You need to create a client that consumes the API and maps its responses onto the models you have in your application. This way, you don't make your models' schema dependent on the API's schema. Take a look at the resource_kit gem; it is a sample solution that uses this approach.
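To make the mapping idea concrete, here is a rough sketch in Python (the external field names headline and pubDate are hypothetical examples of the API's own naming):
from dataclasses import dataclass
import requests

@dataclass
class Article:  # local domain model, independent of the API's naming
    title: str
    published_at: str

def fetch_articles(url):
    payload = requests.get(url, timeout=10).json()
    # translate the API's schema into the local model at the boundary,
    # so the rest of the application never sees the API's field names
    return [Article(title=item['headline'], published_at=item['pubDate'])
            for item in payload['items']]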
Hi hdauven,
Calling the API every 15 minutes will affect your server's performance, so do it with sidekiq. It runs the work as a background job, and with sidetiq on top it will perform the task every 15 minutes automatically.
You are just consuming an API, so why worry about the difference in domain models?
I am trying out PredictionIO for the first time. I followed the installation instructions for Linux and developed several test engines. After repeatedly getting the following error on my own datasets, I decided to follow the movie recommendation 100k tutorial (https://github.com/PredictionIO/PredictionIO-Docs/blob/cbca03b1c2bad949db951a3a798f0080c48b3674/source/tutorials/movie-recommendation.rst). The same error persists, even though Hadoop appears to be running correctly (and not in safe mode) and the engine reports that it is running and that training is complete. The error I am getting is:
predictionio.ItemRecNotFoundError: request: GET
/engines/itemrec/movie-rec/topn.json {'pio_n': 10, 'pio_uid': '28',
'pio_appkey':
'UsZmneFir39GXO9hID3wDhDQqYNje4S9Ea3jiQjrpHFzHwMEqCqwJKhtAziveC9D'}
/engines/itemrec/movie-rec/topn.json?pio_n=10&pio_uid=28&pio_appkey=UsZmneFir39GXO9hID3wDhDQqYNje4S9Ea3jiQjrpHFzHwMEqCqwJKhtAziveC9D
status: 404 body: {"message":"Cannot find recommendation for user."}
The rest of the tutorial runs as expected, just no predictions ever seem to appear. Can someone please point me in the right direction on how to solve this issue?
Thanks!
Several suggestions:
Check if there is data in PredictionIO's database. I have seen jobs fail because there were some items in the database but no users and no user-to-item actions. Look into the Mongo database appdata; there should be collections named users, items, and u2iActions. These collections are only created when you add the first user, item, and u2iAction via the API. (A quick way to check is sketched after this list.) It is unfortunate that the web interface does not make it clear whether a job completed successfully or not.
Check the logs: the PredictionIO logs, and the Hadoop logs if you use Hadoop jobs. See whether the model-training jobs completed. (By the way, did you invoke "Train prediction model now" via the web interface?)
Verify that there is data in predictionio_modeldata for your algorithm.
Even if the model is trained OK, there can still be too little data to produce recommendations for a given user. Try the "Random" algorithm, which produces the simplest recommendations and works for every user, to check whether the system as a whole works.
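For the first suggestion, here is a quick sanity-check sketch using pymongo, assuming a default local MongoDB and the collection names mentioned above:
from pymongo import MongoClient

# count the documents in each of PredictionIO's appdata collections;
# zero users or u2iActions would explain the missing recommendations
db = MongoClient('localhost', 27017)['appdata']
for name in ('users', 'items', 'u2iActions'):
    print(name, db[name].count_documents({}))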
I've currently got a Ruby on Rails app hosted on Heroku that I'm monitoring with New Relic. My app is somewhat laggy when using it, and my New Relic monitor shows me the following:
Given that the majority of the time is spent in request queuing, does this mean my app would scale better if I used extra worker dynos? Or is this something that I can fix by optimizing my code? Sorry if this is a silly question, but I'm a complete newbie, and I appreciate all the help. Thanks!
== EDIT ==
Just wanted to make sure I was crystal clear on this before having to shell out additional moolah. So New Relic also gave me the following statistics on the browser side as you can see here:
This graph shows that the majority of the time spent by the user is waiting for the web application. Can I attribute this to the fact that my app is spending the majority of its time in the request queue? In other words, is the 1.3-second response time that the end user experiences something that code optimization alone will do little to cut down? (Basically, I'm asking whether I have to spend money or not.) Thanks!
Request Queueing basically means 'waiting for a web instance to be available to process a request'.
So the easiest and fastest way to gain some speed in response time would be to increase the number of web instances to allow your app to process more requests faster.
It might be possible to optimize your code to speed up each individual request to the point where your application can process more requests per minute -- which would pull requests off the queue faster and reduce the overall request queueing problem.
In time, it would still be a good idea to do everything you can to optimize the code anyway. But to begin with, add more workers and your request queueing issue will more than likely be reduced or disappear.
edit
With your additional information, I believe the story is in general still the same -- though nice work on getting to a deep understanding before spending the money.
When you have request queuing it's because requests are waiting for web instances to become available to service their request. Adding more web instances directly impacts this by making more instances available.
It's possible that you could optimize the app so well that you significantly reduce the time to process each request. If this happened, then it would reduce request queueing as well by making requests wait a shorter period of time to be serviced.
I'd recommend adding more web instances for now to immediately address the queueing problem, then working on optimizing the code as much as you can (assuming it's your biggest priority). And regardless of how fast you get your app to respond, if your user base grows you'll need to add more web instances to keep up -- which, by the way, is a good problem to have, since it means your users are growing.
Best of luck!
I just want to throw this in, even though this particular question seems answered. I found this blog post from New Relic and the guys over at Engine Yard: Blog Post.
The tl;dr here is that Request Queuing in New Relic is not necessarily requests actually lining up in the queue and not being able to get processed. Due to how New Relic calculates this metric, it essentially reads a time stamp set in a header by nginx and subtracts it from Time.now when the New Relic method gets a hold of it. However, New Relic gets run after any of your code's before_filter hooks get called. So, if you have a bunch of computationally intensive or database intensive code being run in these before_filters, it's possible that what you're seeing is actually request latency, not queuing.
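To illustrate the mechanism, here is a rough sketch of that calculation in Python; the exact header name and timestamp units vary between web-server setups, so both are assumptions here:
import time

def queue_time_ms(x_request_start):
    # nginx stamps the request with something like 't=1700000000000000';
    # microseconds since the epoch are assumed here, but units differ by setup
    stamped_us = int(x_request_start.split('=', 1)[1])
    return time.time() * 1000.0 - stamped_us / 1000.0

# Anything that runs between the stamp and this calculation (for example,
# heavy before_filters) inflates the reported 'queue' time.
print(queue_time_ms('t=1700000000000000'))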
You can actually examine the queue to see what's in there. If you're using Passenger, this is really easy -- just type passenger status on the command line. This will show you a ton of information about each of your Passenger workers, including how many requests are sitting in the queue. If you precede the command with watch, it will execute every 2 seconds so you can see how the queue changes over time (so just execute watch passenger status).
For Unicorn servers, it's a little bit more difficult, but there's a Ruby script you can run, available here. This script examines how many requests are sitting in the Unicorn socket, waiting to be picked up by workers. Because it examines the socket itself, you shouldn't run this command any more frequently than every ~3 seconds or so. The example on GitHub uses 10.
If you see a high number of queued requests, then adding horizontal scaling (via more web workers on Heroku) is probably an appropriate measure. If, however, the queue is low, yet New Relic reports high request queuing, what you're actually seeing is request latency, and you should examine your before_filters, and either scope them to only those methods that absolutely need them, or work on optimizing the code those filters are executing.
I hope this helps anyone coming to this thread in the future!
I'm close to releasing a Rails app with the common social-networking features (messaging, wall, etc.). I want to use some kind of background processing (most likely Bj) for off-loading tasks from the request/response cycle.
This would happen when users invite friends via email to join and for email notifications.
I'm not sure whether I should just drop these invites and notifications into my database, using a model, and then process them with a worker process every x minutes, or whether I should go with Amazon SQS, storing the messages and invites there and letting my worker retrieve them from SQS for processing (sending the invites/notifications).
The Amazon approach would take load off my database, but I guess it is slower to retrieve messages from there.
What do you think?
Your title states that you have a Rails performance issue, but do you know this for certain? From the rest of your question it sounds like you're trying to anticipate a possible future performance issue. The only way to deal sensibly with performance issues is to get your application into the wild and profile it. Doing so will give you empirical data as to what the real performance issues are.
Given that Amazon SQS isn't free and that using it will almost certainly add complexity to your application, I would migrate to it if and when database load becomes a problem. Don't try to second-guess problems before they arise; you'll likely face different problems when your app goes live, some of which you probably haven't considered.
The main point is that you've already decided to use background processing, which is the correct decision, given that any sort of processing that isn't instantaneous doesn't belong within the Rails' request/response cycle, as it blocks that Rails process. You can always scale with Amazon later if you need to.
Is your app hosted on Amazon EC2 already? I probably wouldn't move an existing app over to AWS just to use SQS, but if you're already using Amazon's infrastructure, SQS is a great choice. You could certainly set up your own messaging system (such as RabbitMQ), but by going with SQS that's one less thing you have to worry about.
There are a lot of options to add background processing to Rails apps, such as delayed_job or background_job, but my personal favorite is Workling. It gives you a nice abstraction layer that allows you to plug in different background runners without having to change the actual implementation of your jobs.
I maintain a Workling fork that adds an SQS client. There are some shortcomings (read the comments or my blog post for more details), but overall it worked well for us at my last startup.
I've also used SQS for a separate Ruby (non-Rails) project and generally found it reliable and fast enough. Like James pointed out above, you can read up to 10 messages at once, so you'll definitely want to do that (my Workling SQS client does this and buffers the messages locally).
I agree with John Topley that you don't want to over-complicate your application if you don't need to. That being said, there are times when it is good to make this kind of decision early: do you anticipate high load from the beginning? Are you rolling this out to an existing user base, or is it a public site that may or may not take off?
If you know you will need to handle a large amount of traffic from the beginning then this might be a good step. If you don't want to spend the money to use SQS take a look at some of the free queue solutions out there like RabbitMQ.
I currently push a couple million messages a month through SQS and it works pretty well. Make sure you plan for it being down or slow from time to time, so you will need to work in some retry facilities and exponential backoff. One of the nice things is that you can get up to 10 messages at a time, which speeds up working through the queue: you can use one request to get 10 messages and process them one by one (see the sketch below).
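Here is a minimal sketch of that consumption pattern, using boto3 for illustration; the queue URL is a placeholder, and process() is a stand-in for your real work (sending the invites/notifications):
import time
import boto3

sqs = boto3.client('sqs')
QUEUE_URL = 'https://sqs.us-east-1.amazonaws.com/123456789012/example-queue'  # placeholder

def process(body):
    print('processing', body)  # stand-in for the real work

backoff = 1
while True:
    try:
        # fetch up to 10 messages per request, with long polling
        response = sqs.receive_message(QueueUrl=QUEUE_URL,
                                       MaxNumberOfMessages=10,
                                       WaitTimeSeconds=20)
        for message in response.get('Messages', []):
            process(message['Body'])
            # delete only after successful processing, so failures are retried
            sqs.delete_message(QueueUrl=QUEUE_URL,
                               ReceiptHandle=message['ReceiptHandle'])
        backoff = 1
    except Exception:
        time.sleep(backoff)  # back off exponentially when SQS is down or slow
        backoff = min(backoff * 2, 60)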
Amazon SQS is a fine service, except where the following things become important:
Performance
Legal
Acknowledgments and transactions
Messaging idioms
Message properties
Security, authenticity, and queue permissions
If any of these things are important you need to look at a real enterprise MQ service such as StormMQ, RabbitMQ, or even onlinemq.com.
I found this blog series interesting, as it compares Amazon SQS to StormMQ without pulling any punches:
http://blog.stormmq.com/2011/01/06/apples-and-oranges-performance/
If you have issues with moving to EC2, you can use other services like onlinemq.com.