How can I reduce the waiting (TTFB) time? - developer-tools

I have a query that fetches a list of users from a table, sorted by creation time. I got the following timing diagram from the Chrome developer tools.
You can see that TTFB (time to first byte) is too high.
I am not sure whether it is because of the SQL sort. If that is the reason, how can I reduce this time?
Or is it something else entirely? I have read blog posts saying that TTFB should be low (under 1 second), but for me it is over 1 second. Is it because of my query or something else?
I am not sure how I can reduce this time.
I am using Angular. Should I sort the table in Angular instead of in SQL? (Many posts say that shouldn't be the issue.)
What I want to know is how to reduce TTFB. I am new to this; it is a task my team gave me. I have read many posts but could not understand them properly. What exactly is TTFB? Is it the time taken by the server?

TTFB is not the time to the first byte of the response body (the useful data, such as JSON or XML), but the time to the first byte of anything received from the server, which is the start of the response headers.
For example, if the server sends the headers before doing the hard work (like heavy SQL), you will get a very low TTFB, but it isn't a "true" measure of the work being done.
In your case, TTFB represents the time you spend processing data on the server.
To reduce the TTFB, you need to do the server-side work faster.
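For example, if the sort itself is what makes the server slow, an index on the column you sort by usually helps. The question doesn't say which backend is in use, so here is a minimal sketch assuming Rails/ActiveRecord and a users table with a created_at column; the names and the migration version are my assumptions, adjust them to your stack:

# db/migrate/xxxx_add_created_at_index_to_users.rb — a sketch, names assumed
class AddCreatedAtIndexToUsers < ActiveRecord::Migration[6.1]
  def change
    # Lets the database return rows already ordered by created_at
    # instead of sorting the whole table on every request.
    add_index :users, :created_at
  end
end

With the index in place, a query like User.order(created_at: :desc) can read rows in index order rather than performing a full sort.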

I ran into the same problem. My project runs on a local server, so I checked my PHP code:
$db = mysqli_connect('localhost', 'root', 'root', 'smart');
I use localhost to connect to my local database; resolving that hostname may be the cause of the problem you're describing. You can modify your HOSTS file and add the line
127.0.0.1    localhost

TTFB reflects work that happens behind the scenes, on the server; your browser knows nothing about what is going on there.
You need to look into what queries are being run and how the website connects to the server.
This article might help you understand TTFB, but otherwise you need to dig deeper into your application.

If you are using PHP, try adding <?php flush(); ?> after </head> and before </body>, or after whatever section you want to send quickly (like the header or content). It flushes the output generated so far without waiting for the rest of the PHP script to finish. Don't call it everywhere, or the speed increase won't be noticeable.
More info

I would suggest you read this article and focus more on how to optimize the overall response to the user's request (whether a page, a search result, etc.).
A good argument for this is the example they give about using gzip to compress the page. Even though TTFB is lower when you do not compress, the overall experience is worse because uncompressed content takes longer to download.
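For instance, in a Ruby/Rack application (used here purely for illustration; any stack has an equivalent), compression is a one-line middleware:

# config.ru — a minimal sketch; Rack::Deflater gzips responses for clients that accept gzip
require 'rack'
require 'rack/deflater'

use Rack::Deflater
run ->(env) { [200, { 'content-type' => 'text/html' }, ['<h1>Hello</h1>' * 1000]] }

The body is compressed before it leaves the server, so the first byte arrives a little later but the last byte arrives much sooner on slow connections.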

Related

What happens when Rails.cache.fetch expires

I'm generating JSON for 65,000 users to populate a typeahead. The query is quick; it turns out building the JSON was the bottleneck. I'm trying to cache the result, but what happens when the cache expires? Does it rebuild automatically, or does it wait until someone triggers the call, resulting in a 9-second page load once every 12 hours?
def user_json
  Rails.cache.fetch("users", expires_in: 12.hours) do
    User.all.to_json
  end
end
If you do not want to hit the database each time, you could look into a solution such as Elasticsearch or Sphinx, which are designed to perform the kind of quick searching you're describing.
I was listening to JavaScript Jabber this morning and they were saying that the average web page is now a shade under 2 MB including images and CSS. Your request roughly doubles that size. While that may be fine for North American connections, your page is likely to feel much slower in internet backwaters such as Australia.
It's also worth noting that older browsers such as IE don't handle iterating over large JavaScript collections very well. I would expect your application to crash in any version of IE before 9.
For these reasons I would avoid pushing JSON containing 65,000 rows over the wire and into the browser. If the query is quick, why not make a trip back to the server each time the user changes the input? Many small requests driven by the input would be quicker than sending all 65,000 records, and in the process you remove the entire class of problems described above. Your original problem also goes away, since you no longer have to cache any responses. A sketch of such an endpoint follows.
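A minimal sketch of that round trip on the server side, assuming a Rails app with a User model and a name column (the route, column, and limit are my own assumptions, not from the question):

# app/controllers/users_controller.rb — return a small JSON slice per keystroke
class UsersController < ApplicationController
  # GET /users/search?q=ann
  def search
    users = User.where("name LIKE ?", "#{params[:q]}%").order(:name).limit(20)
    render json: users.as_json(only: [:id, :name])
  end
end

Each keystroke then transfers a handful of rows instead of 65,000, and there is nothing to cache or expire.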

When/what to cache in Rails 3

Caching is something that I kind of ignored for a long time, as projects that I worked on were on local intranets with very little activity. I'm working on a much larger Rails 3 personal project now, and I'm trying to work out what and when I should cache things.
How do people generally determine this?
If I know a site is going to be relatively low-activity, should I just cache every single page?
If I have a page that calls several partials, is it better to do fragment caching in those partials, or page caching on those partials?
The Ruby on Rails guides did a fine job of explaining how caching in Rails 3 works, but I'm having trouble understanding the decision-making process associated with it.
Don't ever cache for the sake of it, cache because there's a need (with the exception of something like the homepage, which you know is going to be super popular.) Launch the site, and either parse your logs or use something like NewRelic to see what's slow. From there, you can work out what's worth caching.
Generally though, if something takes 500 ms to complete, you should cache it, and if it's over 1 second, you're probably doing too much in the request and should farm the work out to a background process… for example, fetching a Twitter feed or manipulating images.
EDIT: See apneadiving's answer too, he links to some great screencasts (albeit based on Rails 2, but the theory is the same.)
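To illustrate the background-process suggestion, here is a sketch assuming ActiveJob (Rails 4.2+); in a Rails 3 app you would reach for delayed_job or Resque instead, and TwitterFeed.refresh_for is a hypothetical helper, not a real API:

# app/jobs/fetch_twitter_feed_job.rb — a sketch
class FetchTwitterFeedJob < ActiveJob::Base
  queue_as :default

  # Runs outside the request cycle, so the page never waits on Twitter.
  def perform(user_id)
    user = User.find(user_id)
    TwitterFeed.refresh_for(user) # hypothetical helper that fetches and stores the feed
  end
end

# In the controller: FetchTwitterFeedJob.perform_later(current_user.id)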
You'll want to think about caching several kinds of things:
Requests that are hit a lot, and seldom change
Requests that are "expensive" to render: lots of database calls, etc. Hopefully these also seldom change.
The other side of caching that shouldn't go without mention is expiration. It's also often the harder part. You have to know when a cache is no longer good and clear it out so fresh content will be generated. Sweepers or Observers, depending on how you implement your cache, can help you with this. You could also do it purely on a time basis: give caches a max age and clear them after that no matter what.
As for fragment vs. full-page caching, think of it in terms of how often those parts are updated. If three partials of a page are never updated and one is, maybe you want to cache those three and let that one be fetched live so you have up-to-the-second accuracy. The different partials of a page can also have different caching rules: maybe a "timeline" section is cached but with a max age of 1 minute, while the "friends" partial is cached for 12 hours. A sketch of this idea follows.
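A minimal sketch of per-fragment expiration using Rails.cache.fetch (the keys, partial names, and expiry times are my own illustrative choices):

# In a controller (or anywhere render_to_string is available)
def timeline_html(user)
  Rails.cache.fetch("timeline/#{user.id}", expires_in: 1.minute) do
    render_to_string partial: "users/timeline", locals: { user: user }
  end
end

def friends_html(user)
  Rails.cache.fetch("friends/#{user.id}", expires_in: 12.hours) do
    render_to_string partial: "users/friends", locals: { user: user }
  end
end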
Hope this helps!
If the site is relatively low activity you shouldn't cache any page. You cache because of performance problems, and performance problems come about because you have too much data to query, too many users, or worse, both of those situations at the same time.
Before you even think about caching, the first thing you do is look through your application for the requests that are taking up the most time. Not the slowest requests, but the requests your application spends the most aggregate time performing. That is, if you have a request A that runs 10 times at 1500 ms (15 seconds in total) and a request B that runs 5000 times at 250 ms (over 20 minutes in total), you work on optimizing B first.
It's actually pretty easy to grep through your production.log and extract rendering times and URLs to combine them into a simple report. You can even do that in real-time if you want.
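For example, a rough Ruby script along these lines (it assumes the standard Rails "Started …" / "Completed … in 123ms" log lines; adjust the regexes to your format, and note that interleaved multi-process logs would need request ids):

# aggregate_times.rb — sum response time per path from a Rails production log (a sketch)
totals = Hash.new { |h, k| h[k] = { count: 0, ms: 0.0 } }
path = nil

File.foreach("log/production.log") do |line|
  if line =~ /Started \w+ "([^"]+)"/
    path = Regexp.last_match(1)
  elsif path && line =~ /Completed \d+ .* in ([\d.]+)ms/
    totals[path][:count] += 1
    totals[path][:ms] += Regexp.last_match(1).to_f
    path = nil
  end
end

totals.sort_by { |_, v| -v[:ms] }.first(20).each do |p, v|
  puts format("%10.0f ms  %6d reqs  %s", v[:ms], v[:count], p)
end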
Once you've identified a problematic request, you go about picking apart what it's doing to service the request. The first thing is to look for any queries that can be combined by using eager loading or by looking ahead a bit more to anticipate what you'll need. The next thing is to ensure you're not loading data that isn't used.
So many times you'll see code to list users and it's loading 50KB per person of biographical data, their Facebook and Twitter handles, literally everything about them, and all you use is their name.
Fetch as little as you need, and fetch it in the most efficient way you can. Use connection.select_rows when you don't need models.
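For instance (the model, association, and column names below are assumptions used only for illustration):

# Eager-load the association you know the view needs, instead of one query per user
users = User.includes(:profile).order(:created_at).limit(50)

# When all you need is raw values, skip model instantiation entirely
rows = User.connection.select_rows("SELECT id, name FROM users ORDER BY name LIMIT 50")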
The next step is to look at what kind of queries you're running, and how they're under-performing. Ensure your indexes are all set properly and are being used. Check that you're not doing complicated JOIN operations that could be resolved by a bit of tactical de-normalization.
Have a look at what data you are storing in your application, and try and find things that can be removed from your production database and warehoused somewhere else. Cycle your data out regularly when it's no longer relevant, preserve it in a separate database if you need to.
Then go over and have a look at how your database server is tuned. Does it have sufficiently large buffers? Is it on hardware that could be upgraded with more memory at a nominal cost? Too many people are running a completely un-tuned database server and with a few simple settings they can get ten-fold performance increases.
If, and only if, you still have a performance problem at this point then you might want to consider caching.
You know why you don't cache first? It's because once you cache something, that cached data is immediately stale. If parts of your application use this data under the assumption it's always up to date, you will have problems. If you don't expire this cache when the data does change, you will have problems. If you cache the data and never use it again, you're just clogging up your cache and you will have problems. Basically you'll have lots of problems when you use caching, so it's often a last resort.
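If you do reach the point of caching, tie expiry to the writes that make the cached data stale. A minimal sketch reusing the "users" cache key from the earlier question (the callback approach here is my own illustration, not the only way):

class User < ActiveRecord::Base
  after_save    :expire_users_cache
  after_destroy :expire_users_cache

  private

  # Clear the cached user list whenever a user record changes,
  # so the next read regenerates it instead of serving stale data.
  def expire_users_cache
    Rails.cache.delete("users")
  end
end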

Anybody using detrusion.com, a web application firewall for Ruby on Rails?

PS: I was doing some random searching and came across detrusion.com.
What is this web application firewall?
How does it work?
Is there a performance hit, and if so, how much?
Should I use detrusion.com, or is there anything better available?
Anybody??
I quickly glanced at the code and it doesn't appear to be doing all that much. Basically it maintains a whitelist and a blacklist of IPs. While it can't be that much of a performance hit, you'd probably be better off doing this kind of request analysis in a Rack middleware, that is, before the request even gets to the Rails request handling.
That being said, I don't like the fact that it re-syncs every 5 minutes DURING the processing of a request. That is, it blocks the current request while it re-syncs its ruleset and lists, which means you're at the mercy of the Detrusion.com team to keep their site/API up. When they go down, you go down.
While it's not as real-time, I'd feel more comfortable having the update process run out of band. Maybe you store the rules/lists in a flat file or a local DB (Redis would be perfect) which you load on app start. Then a frequent cron job reloads the ruleset from Detrusion and writes it locally.
Something like that. Just anything to de-couple your request handling from a Detrusion API check.
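A minimal sketch of that shape, assuming Redis and the redis gem (the middleware, key name, and out-of-band update job are my own illustration, not Detrusion's actual API):

# Rack middleware that rejects blacklisted IPs from a locally stored set;
# a separate cron or background job keeps the "blacklisted_ips" Redis set fresh.
require 'rack'
require 'redis'

class IpBlacklist
  def initialize(app, redis = Redis.new)
    @app = app
    @redis = redis
  end

  def call(env)
    ip = Rack::Request.new(env).ip
    if @redis.sismember("blacklisted_ips", ip)
      [403, { "content-type" => "text/plain" }, ["Forbidden"]]
    else
      @app.call(env)
    end
  end
end

# config.ru or config/application.rb: use IpBlacklist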

Data logged to a file; how do I rotate logs and how do I parse the data to not have 'gaps' in the data?

I've got a web application that, for performance reasons, throws any data sent to it into a logfile.
I've got two concerns with this approach:
How do I best rotate logs, in order to not lose data?
For each user session multiple requests are logged. Each request has a unique id so there is an easy way for me to tie the requests to the session. The problem is, however, that if I rotate the logs I risk ending up with one request in one log and another request in another log.
How do I arrange my parsing in a way that allows me to parse all requests from a given session? I am willing to define a session time limit, for example that requests must be at most 30 minutes apart.
If I had an hourly log rotation at 00 minutes:
What if the user made one request at 13:59 and one at 14:01? The user would end up having requests in two different logs.
Answer to part 1: If you're on *nix, use syslog/logger. Check the logger(1) and syslog.conf(5) man pages.
Answer to part 2: You're not forced to look at just one log file at a time. less ${SERVICE}* will normally open all the relevant log files together: when you get to the bottom of one file, :n will move you to the next file and :p back to the previous one.
Alternatively, use a log analyser program. Steve Kemp's post on promptly finding needles in syslog haystacks covers, together with its comments, a lot of ground.
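If you want to reassemble sessions across rotated files programmatically, here is a rough Ruby sketch; the log line format (an ISO timestamp followed by a session id) and the file glob are assumptions about your setup, not something from the question:

# group_sessions.rb — read all rotated logs, group entries by session id,
# and start a new group whenever two requests are more than 30 minutes apart.
require 'time'

SESSION_GAP = 30 * 60 # seconds

entries = Dir.glob("logs/app.log*").flat_map do |file|
  File.foreach(file).map do |line|
    # assumed line format: "2012-05-01T13:59:00Z session=abc123 GET /foo ..."
    next unless line =~ /\A(\S+) session=(\S+)/
    { time: Time.parse(Regexp.last_match(1)), session: Regexp.last_match(2) }
  end.compact
end

entries.sort_by! { |e| e[:time] }

sessions = Hash.new { |h, k| h[k] = [[]] }
entries.each do |e|
  runs = sessions[e[:session]]
  last = runs.last.last
  runs << [] if last && e[:time] - last[:time] > SESSION_GAP
  runs.last << e
end

sessions.each do |sid, runs|
  runs.each_with_index do |run, i|
    puts "session #{sid} (part #{i + 1}): #{run.size} requests"
  end
end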

Storing Data In Memory: Session vs Cache vs Static

A bit of backstory: I am working on a web application that requires quite a bit of time to prep/crunch data before giving it to the user to edit/manipulate. The data request takes roughly 15-20 seconds to complete and a couple of seconds to process. Once it is there, the user can manipulate values on the fly. Any manipulation of values requires the data to be reprocessed completely.
Update: To avoid confusion, I am only making the data call once (the 15-second hit) and then want to keep the results in memory so that I will not have to call it again until the user is 100% done working with it. So the first pull will take a while, but, using Ajax, I am going to hit the in-memory data for the constant updates and keep the response time to around 2 seconds or so (I hope).
In order to make this efficient, I am moving the initial data into memory and using Ajax calls back to the server so that I can reduce the processing time needed to handle the recalculation that occurs with this user's updates.
Here is my question: with performance in mind, what would be the best way to store this data, assuming that only one user will be working with it at any given moment?
Also, the user could potentially be working in this process for a few hours. While the user is working with the data, I will need some kind of failsafe to save the user's current data (either in a DB or in a serialized binary file) should their session be interrupted in some way. In other words, I need a solution with an appropriate hook that lets me dump out the in-memory object's data in case the user gets disconnected or distracted for too long.
So far, here are my musings:
Session State - Pros: Locked to one user. Has the Session End event, which meets my failsafe requirements. Cons: Slowest performance of my current options. The Session End event can be tricky to ensure it fires properly.
Caching - Pros: Good performance. Has access to dependencies, which could be a bonus later down the line but is not really useful in the current scope. Cons: No easy failsafe step other than a write based on time intervals. Global in scope; will have to ensure that users do not collide with each other's work.
Static - Pros: Best performance. Easiest to maintain, as I can directly leverage my current class structures. Cons: No easy failsafe step other than a write based on time intervals. Global in scope; will have to ensure that users do not collide with each other's work.
Does anyone have any suggestions or comments on which option I should choose?
Thanks!
Update: Forgot to mention, I am using VB.NET, ASP.NET, and SQL Server 2005 for this task.
I'll vote for secret option #4: use the database for this. If you're talking about a 20+ second turnaround time on the data, you are not going to gain anything by trying to do this in-memory, given the limitations of the options you presented. You might as well set this up in the database (give it a table of its own, or even a separate database if the requirements are that large).
I'd go with the caching method for storing the data across page loads. You can name the cache entry you store the data in to avoid conflicts.
For tracking user-made changes, I'd go with a more old-school approach: append to a text file each time the user makes a change and then sweep that file at intervals to save changes back to the DB. If you name the files based on the user/account or some other session-unique indicator, there's no issue with conflicts, and the app (or some other support app, which might be a better idea in general) can sweep through all such files and update the DB even if the session is over.
The first part of this can be adjusted to stagger the writes more: save changes to the Session, then write those to a file at intervals, then sweep the file at larger intervals. You can tune it for performance and choose how much potential user-change loss is acceptable.
Use the Session, but don't rely on it.
Simply, let the user "name" the dataset, and make a point of actively persisting it for the user, either automatically, or through something as simple as a "save" button.
You cannot rely on the session simply because it is (typically) tied to the user's browser instance. If they accidentally close the browser (click the X button, their PC crashes, etc.), then they lose all of their work, which would be nasty.
Once the user has that kind of control over the "persistent" state of the data, you can rely on the Session to keep it in memory and leverage that as a cache.
I think you've pretty much just answered your question with the pros/cons. But if you are looking for some peer validation, my vote is for the Session. Although the performance is slower (do you know by how much slower?), your processing is going to take a long time regardless. Do you think the user will know the difference between 15 seconds and 17 seconds? Both are "forever" in web terms, so go with the one that seems easiest to implement.
Perhaps a bit off topic, but I'd recommend putting those long processing calls in asynchronous pages (not to be confused with AJAX's asynchronous).
Take a look at this article and ping me back if it doesn't make sense.
http://msdn.microsoft.com/en-us/magazine/cc163725.aspx
I suggest creating a copy of the data in a new database table (let's call it EDIT) as you send the initial results to the user. If performance is an issue, do this in a background thread.
As the user edits the data, update the table (also in a background thread if performance becomes an issue). If you have to use threads, you must make sure that the first thread is finished before you start updating the rows.
This allows a user to walk away, come back, even restart the browser and commit whenever she feels satisfied with the result.
One possible alternative to what the others mentioned is to store the data on the client.
Assuming the dataset is not too large and the code that manipulates it can run client side, you could store the data as an XML data island or a JSON object. The data could then be manipulated, processed, and handled entirely on the client with no round trips to the server. If you need to persist it back to the server, the resulting data could be posted via AJAX or a standard postback.
If this does not work with your requirements I'd go with just storing it on the SQL server as the other comment suggested.
