I'm considering using SignalR to keep persistent (COMET) connections with my .NET server in a project where I need to update a client-side graph. I'm considering Flot for the graphing portion, but am curious how feasible it is to display a "live graph" in this manner. Is Flot a good choice for this? I would like the server to be able to push new data to the graph and have it appended to the existing data as it becomes available.
I haven't found any examples of doing this, so am wondering if there is some difficulty in doing this that I am not anticipating.
Flot and Highcharts, the two I'm most familiar with, let you redraw the data as long as the axes and grid stay the same. They are pretty efficient in that case.
To use flot to append data to a continuous graph, you will end up just redrawing the whole graph all the time. In any modern browser (heck, even IE7), as long as you keep the number of points reasonable, the performance will be totally acceptable. I have pages with 4-6 flot graphs, updating every second, each having ~3-5 datapoints per second, with up to 5 minutes of data (so ~1000 datapoints per graph, 4000 points in total on the page). This is achieved with no lag, even on a low-powered machine.
I have not seen any libraries for managing this type of thing on top of Flot, so I ended up doing my own caching.
I think the only "gotcha" you'll run into is making sure you don't let your memory usage run out of control. The first couple attempts I made at this, if you left the graph running overnight, you would come back to 4GB of memory used. Make sure you properly remove old data, and don't keep references to replaced graphs and AJAX requests.
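To illustrate the "remove old data" part, here is a minimal sketch of a time-bounded buffer. It is written in Python only to show the idea; in the browser the same logic would live in JavaScript next to the plotting call. The class name `SlidingWindow` and the five-minute window are assumptions for the example, not anything Flot provides.

```python
from __future__ import annotations

from collections import deque


class SlidingWindow:
    """Keeps only the datapoints from the last `max_age` seconds."""

    def __init__(self, max_age: float) -> None:
        self.max_age = max_age
        self.points: deque[tuple[float, float]] = deque()

    def append(self, timestamp: float, value: float) -> None:
        """Add a point, then drop everything older than the window."""
        self.points.append((timestamp, value))
        cutoff = timestamp - self.max_age
        while self.points and self.points[0][0] < cutoff:
            self.points.popleft()

    def series(self) -> list[tuple[float, float]]:
        """Return a fresh list to hand to the plotting call on each redraw."""
        return list(self.points)


window = SlidingWindow(max_age=300.0)  # assumed window: five minutes of data
for second in range(1000):
    window.append(float(second), second * 0.5)
```

Because old points are dropped on every append, the buffer stays the same size no matter how long the page runs, which is the property that keeps memory flat overnight.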
I have a simple jquery which calls a servlet via get and then Neo4j is used to return data in JSON format.
The system is workable after the FIRST query, but the very first time it is used the system is unbelievably slow. This is some kind of initialisation issue. I am using Heroku web hosting.
The code is fairly long so I am not posting now, but are there any known issues regarding the first invocation of Neo4j?
I have done limited testing so far for performance as I had a lot of JSON problems anyway and they only just got resolved.
Summary:
JQuery(LINUX)<--> get (JSON) <---> Neo4j
First Query - response is 10-20 secs
Second Query - time is 2-3 secs
More queries - 2/3 secs.
This is not a one-off; I tested this a few times and always the same pattern comes up.
This is a normal behaviour of Neo4j where store files are mapped into memory lazily for parts of the files that become hot, and becoming hot requires perhaps thousands of requests to such a part. This is a behaviour that has big stores in mind, whereas for smaller stores it merely gets in the way (why not map the whole thing if it fits in memory?).
Then on top of that is an "object" cache that further optimizes access and gets populated lazily for requested entities.
Using an SSD instead of spinning media will usually speed up the initial non-memory-mapped random access quite a bit, but in your scenario I recognize that's not viable.
There are thoughts on being more sensitive to hot parts of the store (i.e. memory map even if not as hot) at the start of a database lifecycle, or more precisely having the heat sensitivity be a function of how much is currently memory mapped versus how much can be mapped at maximum. This has been shown to make initial requests much more responsive.
I'm currently building a little admin platform with statistics and graphs using Highcharts and Highstock. Right now the graphs fetch the data from the database every time they load. But since the data will grow substantially in the future, this is very inefficient and slows the database down. My question is what the best approach would be to store or precompute the data so the graphs don't have to fetch it from the database every time they load.
If the amount of data is going to be really huge and you can sacrifice some realtime-ish manner of presenting it, the best way would be to compute the data for charts in some separate database table and show the charts from it. You can set up a background process (using whenever or delayed_job or whatever you like) to periodically update the pre-processed data table with fresh values.
Another option would be caching the chart response by any means you like (using built-in Rails caching, writing your custom cache etc) to deliver the same data to a big number of users with reduced DB hit.
However, in general, preprocessing seems to be the winner as stats tables usually contain very "sparse" data which can be pre-computed to a much smaller set to display on a chart yet having the option to apply some filtering / sorting if needed.
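As a rough illustration of the preprocessing idea, the sketch below collapses raw rows into one summary row per day, which is what a background job would write to the stats table. It is plain Python rather than Rails, and the function name `summarize_by_day` and the sample rows are made up for the example.

```python
from collections import defaultdict
from datetime import datetime


def summarize_by_day(rows):
    """Collapse raw (timestamp, value) rows into one (day, count, total) row per day."""
    buckets = defaultdict(lambda: [0, 0.0])
    for timestamp, value in rows:
        bucket = buckets[timestamp.date()]
        bucket[0] += 1       # number of raw rows that day
        bucket[1] += value   # running total for that day
    return [(day, count, total) for day, (count, total) in sorted(buckets.items())]


# Hypothetical raw data, standing in for the large source table.
raw_rows = [
    (datetime(2024, 1, 1, 9, 0), 10.0),
    (datetime(2024, 1, 1, 17, 30), 5.0),
    (datetime(2024, 1, 2, 8, 15), 2.5),
]
summary = summarize_by_day(raw_rows)
```

The chart then reads the handful of summary rows instead of scanning every raw row on each page load.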
EDIT: I forgot to mention there might be some room for optimization on the database side. For instance, if you can limit the periods the user is able to view data for (= cap the amount to be queried per user), then proper indexing and DB setup can deliver substantial performance without man-in-the-middle things like caching or precomputing.
I'm working on a web app (Rails 3 based). And I really don't like the time it takes to generate the page: depending on the displayed data it takes up to 2.5 and even 4 seconds.
So I was just wondering what the average reasonable time for generating a page is in your apps. Say you check the generation time, e.g. it's 750ms, and think "Ok, that should be fine even without caching". Or when you see 1.5sec you think "Oh my God, the user won't wait so long and will leave the site".
There's a huge amount of research data regarding the time from query to rendering and user's experience. I'd recommend reading this useit.com article. After all Google integrated page speed in its results for a reason ;)
The 3 response-time limits are the
same today as when I wrote about them
in 1993 (based on 40-year-old research
by human factors pioneers):
0.1 seconds gives the feeling of instantaneous response — that is, the
outcome feels like it was caused by
the user, not the computer. This level
of responsiveness is essential to
support the feeling of direct
manipulation (direct manipulation is
one of the key GUI techniques to
increase user engagement and control —
for more about it, see our Principles
of Interface Design seminar).
1 second keeps the user's flow of thought seamless. Users can sense a
delay, and thus know the computer is
generating the outcome, but they still
feel in control of the overall
experience and that they're moving
freely rather than waiting on the
computer. This degree of
responsiveness is needed for good
navigation.
10 seconds keeps the user's attention. From 1–10 seconds, users
definitely feel at the mercy of the
computer and wish it was faster, but
they can handle it. After 10 seconds,
they start thinking about other
things, making it harder to get their
brains back on track once the computer
finally does respond.
A 10-second delay will often make
users leave a site immediately. And
even if they stay, it's harder for
them to understand what's going on,
making it less likely that they'll
succeed in any difficult tasks.
As a rule of thumb, think that you always should aim for a balance of optimization time vs time gained. Don't spend days optimizing the hell out of one routine when your images aren't compressed correctly, or your scripts/css not combined. Yes, faster is better, but a 90% gain in generating the page by setting up a smart cache beats a 10% gain after one week tweaking the algorithm.
Also don't look too much into the first-render-time when the framework has to load everything, but use stress-testing, cached or not, to simulate various situations.
Now, some data: some of the latest sites I worked on used DotNetNuke, a huge open-source CMS, and ASP.NET MVC, where you are nearer to the metal. Average page time with average DB queries was 600-700 milliseconds for DotNetNuke. For ASP.NET MVC, it's 70-100 milliseconds... Users really like the second one :)
There's no 'right' answer to this - the faster the better. Personally I normally aim for < 200ms, although I know from experience that it can be quite difficult to achieve this in Rails on anything but simple apps. Try and figure out where your bottlenecks are and cache what you can.
Edit: There seems to be some confusion between page generation time and page render time. Obviously a quick page render is the goal, and on most sites doing things like reducing HTTP requests, gzipping CSS/JS are where you can get most of your quick wins. But if the page itself can take 4-5 seconds to generate, then you're probably right that your app is where you should start.
It depends on whether nothing is displayed for 2.5-4 seconds, or that the user already sees (a part of) the page from the start, and it finishes loading completely after 2.5-4 seconds. In that case the user doesn't experience a 2.5-4 second load. Take the http://www.nytimes.com/ website; I see most of it right away, but according to the Web Inspector it takes 1.94 seconds for it to be loaded completely.
And keep in mind that the speed will also depend on the browser, computer, internet connection. What's fast for you might be slower for others.
Measure your Apdex score and see how it is performing. That will give you a rough indication. From there, you can decide how you want to increase performance.
It also depends on what your site is: a system application for a business, or software as a service (SaaS)? If it's a system application, the users are forced to use it, so performance can be negotiated. If it is SaaS, then the lower your Apdex score, the more chance you have of losing your users' interest.
There are a few gems out there that measure performance and report on what your apdex is.
Here's a little more info: http://apdex.org/blog/?p=630
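For reference, the Apdex score itself is simple to compute: requests at or under a target time T count as satisfied, requests between T and 4T count as tolerating (worth half), and everything slower counts as frustrated. The sketch below assumes you already have a list of response times in seconds; the sample values and the 0.5 second target are made up.

```python
def apdex(response_times, threshold):
    """Apdex = (satisfied + tolerating / 2) / total samples."""
    if not response_times:
        raise ValueError("need at least one sample")
    satisfied = sum(1 for t in response_times if t <= threshold)
    tolerating = sum(1 for t in response_times if threshold < t <= 4 * threshold)
    return (satisfied + tolerating / 2) / len(response_times)


# Hypothetical page generation times in seconds, with an assumed target T = 0.5 s.
samples = [0.2, 0.4, 0.7, 1.5, 2.5, 4.0]
score = apdex(samples, threshold=0.5)  # 2 satisfied, 2 tolerating, 2 frustrated -> 0.5
```

A score of 1.0 means every request met the target, so higher is better.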
My personal rule - no page should take more than 0.05 seconds, or you are in trouble.
As long as you write proper code, you don't need to spend much time on optimization to stay under 0.05.
If you stick to giant frameworks, then you are out of luck.
I'm working on a website with reasonably heavy traffic and I'm looking into using a CSS sprite to reduce the number of image loads in its design.
Are there any advantages to using a CSS sprite besides reducing the amount of transmitted data? How much space do you really save? Is there a threshold where using sprites becomes worthwhile to a website?
UPDATE: Thank you for your responses. They are obviously all very carefully thought out and present good sources to verify your points. I feel much more capable to make an informed decision about using CSS sprites in my site design now.
The question is generally not about the amount of bandwidth it might save. It is more about lowering the number of HTTP requests needed to render a webpage.
Considering :
web browsers only do a few HTTP requests in parallel
doing an HTTP request means a round-trip to the server, which takes lots of time
we have "fast" internet connection, which means we download fast...
What takes time, when doing lots of requests to get small contents (like images, icons, and the like) is the multiple round-trips to the server : you end up spending time waiting for the request to go, and the server to respond, instead of using this time to download data.
If we can minimize the number of requests, we minimize the number of trips to the server, and use our high-speed connection better (we download a bigger file, instead of waiting for many smaller ones).
That's why CSS sprites are used.
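To put rough numbers on the round-trip argument, here is a deliberately simplified back-of-the-envelope model. It assumes requests go out in "waves" limited by the number of parallel connections, each wave costs one round trip, and transfer time is just bytes divided by bandwidth. All figures (6 parallel connections, 100 ms round trip, file sizes, bandwidth) are assumptions chosen for the example, not measurements.

```python
import math


def load_time(requests, parallel, round_trip, total_bytes, bandwidth):
    """Estimate load time as round-trip waits plus raw transfer time (seconds)."""
    waves = math.ceil(requests / parallel)  # batches of requests sent in parallel
    return waves * round_trip + total_bytes / bandwidth


# 50 separate icons: 9 waves of up to 6 requests each.
separate = load_time(requests=50, parallel=6, round_trip=0.1,
                     total_bytes=100_000, bandwidth=1_000_000)

# One sprite: a single request, assumed slightly larger in total size.
sprite = load_time(requests=1, parallel=6, round_trip=0.1,
                   total_bytes=110_000, bandwidth=1_000_000)
```

With these assumed numbers the separate icons come out at roughly 1.0 second and the sprite at roughly 0.21 seconds, even though the sprite transfers slightly more data.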
For more information, you can have a look at, for instance: CSS Sprites: Image Slicing’s Kiss of Death
Fewer HTTP requests = faster loading overall. Yahoo and co. use this technique; if you can imagine the number of users they have, it saves a lot of bandwidth. Imagine 50 separate images for icons: that's 50 separate HTTP requests, as opposed to just one CSS sprite containing all the images. That saves 49 HTTP requests, multiplied by all the users of the site.
Actually, sprites are not used to reduce the amount of transmitted data (in most cases it slightly increases the amount of data transferred), but to reduce the amount of requests done on the server.
HTTP requests in a browser are traditionally done only a few at a time, which means that a request often cannot start until a previous one has completed. It is also expensive to open a connection for a request. By limiting the number of requests made to the server, you increase the speed at which the elements load.
I think Yahoo has the best argument for CSS sprites. Besides, the whole page is worth reading:
http://developer.yahoo.com/performance/rules.html#num_http
Besides the performance enhancement of the overall page load by limiting the amount of requests, image sprites can also make dynamically swapping images (for example changing the background image of a nav item on hover) "perform" a little better since all you do is change the x,y instead of the src.
So I guess to answer what is the threshold to warrant using them, I'd say immediately because of the potential loading improvements on each individual client.
In addition to reducing HTTP requests (as already noted), CSS sprites aren't dependent on JavaScript. This gives a few other advantages:
less code to maintain
easier cross-browser testing
can be coded inline via style attributes
no DOM hacking
no image preloading (so less administrivia -- "Oh wait, I need to preload that new nav button ... crap which .js file has my preloader?")
you can use css classes to apply it to several selectors
can be applied to any selector with the :hover pseudoclass, or in any selector that can be wrapped with an anchor (not just imgs)
If you're not averse to DOM hacking, though, you can get some nifty animation effects just by pushing the X and Y values around. Which makes it easier to animate lots of different states (like keypress or onmouseclick).
There are a few interesting graphic production side effects as well:
fewer graphic production files
easier to do layout for buttons etc. directly in HTML (less need for PSD comps)
easier to make GUI changes without having to regenerate a ton of graphics
just that much tougher for image pirates to slurp your graphics
I’m using SSIS to synchronize data between two databases. I’ve used SSIS and DTS in the past, but I generally write an application for things of this nature (I’m a coder and it just comes easier to me).
In my package I use a SQL Task that returns about 15,000 rows. I’ve hooked that up to a Foreach Container, and within that I assign the resultset column values to variables, and then map those variables to parameters that are fed to another SQL Task.
The problem I’m having is with debugging, and not just more complicated debugging like breakpoints and evaluating values at runtime. I simply mean that if I run this with debugging rather than without, it takes hours to complete.
I ended up rewriting the process in Delphi, and the following is what I came up with:
Full Push of Data:
This pulls 15,000 rows, updates a destination table for each row, then pulls 11,000 rows and updates a destination table for each row.
Debugging:
Delphi App: 139s
SSIS: 4 hours, 46 minutes
Not Debugging:
Delphi App: 132s
SSIS: 384s
Update of Data:
This pulls 3,000 rows, but no updates are needed or made to the destination table. It then pulls 11,000 rows but, again, no updates are needed or made to the destination table.
Debugging:
Delphi App: 42s
SSIS: 1 hour, 10 minutes
Not Debugging:
Delphi App: 34s
SSIS: 205s
The odd thing is, I get the feeling that most of this time spent debugging is just updating UI elements in Visual Studio. If I watch the progress tab, a node is added to a tree for each iteration (thousands total), and this gets slower and slower as the process goes on. Trying to stop debugging usually doesn’t work, as Visual Studio seems caught in a loop updating the UI. If I check the profiler for SQL Server no actual work is being done. I'm not sure if the machine matters, but it should be more than up to the job (quad core, 4 gig of ram, 512 mb video card).
Is this sort of behavior normal? As I’ve said I’m a coder by trade, so I have no problem writing an app for this sort of thing (in fact it takes much less time for me to code an application than “draw” it in SSIS, but I figure that margin will shrink with more work done in SSIS), but I’m trying to figure out where something like SSIS and DTS would fit into my toolbox. So far nothing about it has really impressed me. Maybe I’m misusing or abusing SSIS in some way?
Any help would be greatly appreciated, thanks in advance!
SSIS control flow and loops are not very high performance and are not designed for processing this amount of data. This is especially true during debugging: before and after each task execution, the debugger sends notifications to the designer process, which updates the colors of the shapes, and this can be slow.
You could get much better performance using data flow. Data flow does not operate on single rows; it works with buffers of rows, which is much faster, and the debugger is only notified about the beginning and end of each buffer, so its impact is less noticeable.
SSIS is not designed to do a foreach like that. If you are doing something for each row coming in, you probably want to read the rows into a data flow and then use a lookup or merge join to determine whether each row needs an INSERT (these happen in bulk) or an UPDATE. Updates can go through a database command object that issues multiple SQL UPDATE commands, but a better-performing option is to batch them into a staging table and do a single UPDATE.
In another typical sync situation, you read all the data into a staging table, and do a SQL Server UPDATE on the existing rows (INNER JOIN) and INSERT on the new rows (LEFT JOIN, rhs IS NULL). There is also the possibility of using linked servers, but joins over that can be slow, since all (or a lot of) the data may have to come across the network.
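The staging-table pattern can be sketched end to end with an in-memory SQLite database, used here only so the example runs anywhere. The table and column names are hypothetical. On SQL Server the update would normally be written as UPDATE ... FROM ... INNER JOIN; the sketch uses a correlated subquery instead because that form is portable to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE destination (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE staging (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO destination VALUES (1, 'old one'), (2, 'two');
    INSERT INTO staging VALUES (1, 'new one'), (3, 'three');
""")

# One set-based UPDATE for rows that already exist (the INNER JOIN case).
conn.execute("""
    UPDATE destination
    SET name = (SELECT s.name FROM staging AS s WHERE s.id = destination.id)
    WHERE id IN (SELECT id FROM staging)
""")

# One set-based INSERT for rows only in staging (LEFT JOIN, right side IS NULL).
conn.execute("""
    INSERT INTO destination (id, name)
    SELECT s.id, s.name
    FROM staging AS s
    LEFT JOIN destination AS d ON d.id = s.id
    WHERE d.id IS NULL
""")
conn.commit()

rows = conn.execute("SELECT id, name FROM destination ORDER BY id").fetchall()
```

Two statements replace thousands of per-row round trips, which is where most of the speedup over a Foreach loop tends to come from.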
I have SSIS packages that regularly import 24 million rows, including handling data conversion and validation and slowly changing dimensions using the TableDifference component, and they perform relatively quickly for that large an amount of data versus a separate client program.
I have noticed this behavior too. I had an SSIS package for moves that handled somewhere in the neighborhood of 3 million entries, and it was not possible to debug because it would run for about 3-4 days.
SSIS is still the way I did it. I just don't "debug" in SSIS; I run the packages normally when working with the full datasets. If I must debug, I use very small datasets.