How to edit .sched in minix to to keep track of how much CPU time each user process has had recently? - minix

I am stuck on this problem from class and really have no clue where to start or how to do it:
Book, p219 Question 45
Modify the MINIX 3 Scheduler to keep track of how much CPU time each user process has had recently. When no task or server wants to run, pick the user process that has had the smallest share of the CPU.

Related

Rails & Heroku: How many workers/dynos do I need

I have a tinder style app that allows users to rate events. After a user rates an Event, a background resque job runs that re-ranks the other events based on user's feedback.
This background job takes about 10 seconds and it runs about 20 times a minute per user.
Using a simple example. If I have 10 users using the app at any given time, and I never want a job to be waiting, what's the optimal way to do this?
I'm confused about Dynos, resque pools, and redis connections. Can someone help me understand the difference? Is there a way to calculate this?
Not sure you're asking the right question. Your real question is "how can I get better performance?" Not "how many dynos?" Just adding dynos won't necessarily give you better performance. More dynos give you more memory...so if your app is running slowly because you're running out of available memory (i.e. you're running on swap), then more dynos could be the answer. If those jobs take 10 seconds each to run, though...memory probably isn't your actual problem. If you want to monitor your memory usage, check out a visualization tool like New Relic.
There are a lot of approaches to solving your problem. But I would start with the code that your wrote. Posting some code on SO might help understand why that job takes 10 seconds (Post some code!). 10 seconds is a long time. So optimizing the queries inside that job would almost surely help.
Another piece of low hanging fruit...switch from resque to sidekiq for your background jobs. Really easy to use. You'll use less memory and should see an instant bump in performance.
Dynos: These are individual virtual/physical servers. Think of them as being the same as EC2 instances.
Redis Connections: Individual connections to the Redis Instance.
Resque Pool: A gem that allows you to run workers concurrently on the same dyno/instance.
First of all, it’s worth looking for ways in which you can improve the performance of the job itself. You might be able to get it below ten seconds by using low level model caching or optimizing your algorithm.
In terms of working out how many workers you would need, you’ll need to take the number runs per minute (20) times the number of seconds it takes to run (10) times the number of users (10). That will give you the number of seconds per minute it would take to run on one worker. 20 * 10 * 10 = 2000. Divide that by 60 and you have the number of minutes per minute, 33.3. So if you had 34 workers, and these numbers were all consistent, they should be able to keep on top of things.
That said, you shouldn’t be in a position where you need to run 36 or more dynos for just 10 concurrent users for a ranking algorithm. That’s going to get expensive very quickly.
Optimise your algorithm, try to add more caching, and give Sidekiq a try too. In my experience, Sidekiq can process a queue up to 10 times faster than Resque. It depends what your job is doing, and how you utilize each tool, but it's worth checking out. See Sidekiq vs Resque.
Re-ranking other events is a bad idea.
You should consider having total_points and average_points columns for events table and let the ranks be decided by order by queries. Like this.
class Event
has_many :feedbacks
scope :rank_by_total, -> { order(:total_points) }
scope :rank_by_average, -> { order(:average_points) }
end
class Feedback
belongs_to :event
after_create :update_points
def update_points
total = event.feedbacks.sum(:points)
avg = event.feedbacks.average(:points)
event.update(total_points: total, average_points: avg)
end
end
So, How many workers/dynos do you need?
You don't need to worry about dyno or worker for this problem. No matter how many dynos with higher processing power you use, your solution will take good amount of time when your events table becomes huge. So try changing your solution the way I have described.

New Relic's pings not improving cold start

There's a similar question about app harbor on StackOverflow, but the user didn't try to use new relic to overcome the problem.
I deployed my ASP.NET MVC project on App Harbor. It's very easy to configure and you can even set automatic deployments from Git. However, as my website is still mainly used only by me, I was getting very long cold starts (over 15 secs). To avoid it, I installed New Relic. The idea was to simultaneously to monitor the application but also to create periodic pings that, according to "a lot of people", would drastically reduce the loading time.
It's not working. I have New Relic correctly pinging my application every minute, but I still get very long cold starts. For instance, 5 min ago, I've got a cold start of 16 seconds. 1 minute after, I got the page loaded in less than a second.
I know I could have used Pingdom or StillAlive to achieve the same result:
How do I improve app performance on AppHarbor?
I wouldn't like to do it because I like New Relic and I don't want to have a lot of add-on's on app harbor as they will slow down my website. Do you have any idea what might be causing it?
I'm not familiar with AppHarbor's setup. But if it's using IIS, the pinging is just keeping the application pool from reaching the idle timeout. But there the default IIS setting for the application pool to be recycled every 29 hours no matter the number of requests. And it's normally in the best interest to let it recycle once in a while, so working around it may not be in your best interest.
Your best bet is to reduce the number of things happening on application start. Precompiling your views is a good place to start. And heck, Stack Exchange/Stack Overflow precompiles views to avoid the application start up cost.

Scaling Dynos with Heroku

I've currently got a ruby on rails app hosted on Heroku that I'm monitoring with New Relic. My app is somewhat laggy when using it, and my New Relic monitor shows me the following:
Given that majority of the time is spent in Request Queuing, does this mean my app would scale better if I used an extra worker dynos? Or is this something that I can fix by optimizing my code? Sorry if this is a silly question, but I'm a complete newbie, and appreciate all the help. Thanks!
== EDIT ==
Just wanted to make sure I was crystal clear on this before having to shell out additional moolah. So New Relic also gave me the following statistics on the browser side as you can see here:
This graph shows that majority of the time spent by the user is in waiting for the web application. Can I attribute this to the fact that my app is spending majority of its time in a requesting queue? In other words that the 1.3 second response time that the end user is experiencing is currently something that code optimization alone will do little to cut down? (Basically I'm asking if I have to spend money or not) Thanks!
Request Queueing basically means 'waiting for a web instance to be available to process a request'.
So the easiest and fastest way to gain some speed in response time would be to increase the number of web instances to allow your app to process more requests faster.
It might be posible to optimize your code to speed up each individual request to the point where your application can process more requests per minute -- which would pull requests off the queue faster and reduce the overall request queueing problem.
In time, it would still be a good idea to do everything you can to optimize the code anyway. But to begin with, add more workers and your request queueing issue will more than likely be reduced or disappear.
edit
with your additional information, in general I believe the story is still the same -- though nice work in getting to a deep understanding prior to spending the money.
When you have request queuing it's because requests are waiting for web instances to become available to service their request. Adding more web instances directly impacts this by making more instances available.
It's possible that you could optimize the app so well that you significantly reduce the time to process each request. If this happened, then it would reduce request queueing as well by making requests wait a shorter period of time to be serviced.
I'd recommend giving users more web instances for now to immediately address the queueing problem, then working on optimizing the code as much as you can (assuming it's your biggest priority). And regardless of how fast you get your app to respond, if your users grow you'll need to implement more web instances to keep up -- which by the way is a good problem since your users are growing too.
Best of luck!
I just want to throw this in, even though this particular question seems answered. I found this blog post from New Relic and the guys over at Engine Yard: Blog Post.
The tl;dr here is that Request Queuing in New Relic is not necessarily requests actually lining up in the queue and not being able to get processed. Due to how New Relic calculates this metric, it essentially reads a time stamp set in a header by nginx and subtracts it from Time.now when the New Relic method gets a hold of it. However, New Relic gets run after any of your code's before_filter hooks get called. So, if you have a bunch of computationally intensive or database intensive code being run in these before_filters, it's possible that what you're seeing is actually request latency, not queuing.
You can actually examine the queue to see what's in there. If you're using Passenger, this is really easy -- just type passenger status on the command line. This will show you a ton of information about each of your Passenger workers, including how many requests are sitting in the queue. If you run with preceded with watch, the command will execute every 2 seconds so you can see how the queue changes over time (so just execute watch passenger status).
For Unicorn servers, it's a little bit more difficult, but there's a ruby script you can run, available here. This script actually examines how many requests are sitting in the unicorn socket, waiting to be picked up by workers. Because it's examining the socket itself, you shouldn't run this command any more frequently than ~3 seconds or so. The example on GitHub uses 10.
If you see a high number of queued requests, then adding horizontal scaling (via more web workers on Heroku) is probably an appropriate measure. If, however, the queue is low, yet New Relic reports high request queuing, what you're actually seeing is request latency, and you should examine your before_filters, and either scope them to only those methods that absolutely need them, or work on optimizing the code those filters are executing.
I hope this helps anyone coming to this thread in the future!

Running Jobs when DB is free on Ruby on Rails Heroku

I have a ruby on rails app that uses Heroku. I have the need to run things like import/export tasks on our db that lock up the whole system since they are so heavy on the DB. Is there a way to tell the system to only run these tasks when the database is not being used at that second?
There is no built-in way to schedule a job like this. There are a few things you can do, though.
Schedule the jobs to run during the least busy hours of the day. That will depend on your business, customer base and so on, but hopefully there is a window that is more suitable than others.
You could write your batch job to run for a longer time, doing small units of work. Between each unit of work, sleep for a few seconds, or take a look at the current load average and decide what to do based on that. This should lower the impact of the batch jobs.
Have the website update a "lock" somewhere, either in the database or in a memcached or something. If your normal website usage updates the database, you could look at the existing updated_at. Then only do batch work when there hasn't been any activity for a while. This doesn't guarantee that a new user won't pop in at the same time your batch job runs, of course, but could be a way to find a window where the site is less used.
Have you looked into using Background Jobs / Workers on Heroku? It's also worth reading about Heroku's Delayed Job queuing system

Processes timing

How to calculate the total execution time of each application Win (process) with the group on day.
For example:
the process - notepad.exe - 10 minutes today (in total)
I would use a WMI query against Win32_Process and retrieve the CreationDate value which according to MSDN is the date (and time) a process began executing.
A good library for Delphi is available from Magenta Systems. This library includes several examples to help you get started.
This only tracks currently running processes. If your wanting to track processes which were run once but are no longer running, then you will want to hook into windows so you are notified at each application launch. One example of this would be to use the CBT hooks (computer based training, but its used for other things also) which allow you to get notified every time a window is created. If you use the window handle to then find the parent process, you can then use this to track for yourself how long the parent process is running.
The Win32 function GetProcessTimes is an alternative way of getting hold of the creation time and CPU time spent during the day. With execution time I assume you mean cpu time rather than wall clock time.
Here is a link to a blog post describing how to use GetProcessTimes in Delphi.
This is a more "direct" approach than using WMI, however WMI has other advantages such as being able to query other machines over the network.
As skamradt also suggests, this application needs to run continuously to track all starting and stopping processes.The only way I know of to get events on starting and stopping applications is to use WMI events. See this question for some more info on that as well as this Technet blog on scripting
The steps to do this would be to
enumerate all running processes covered by this question
and then call GetProcessTimes for each process as described here.

Resources