Expectations on loading tracks - cocoalibspotify-2.0

Are there any guidelines, benchmarks or similar on what you can expect in terms of loading times, when you're loading a number of tracks (using SPAsyncLoading waitUntilLoaded:timeout:then)?
The reason for me asking is that I've tried loading ~20 tracks in a batch, since the track metadata's going to be used throughout a flow in my client (which is split up in steps, in between which I'd rather not have any loading times aside from an initial delay, which is more acceptable to the user in this case).
I.e. how many tracks are reasonable to attempt loading at once and how long it "should" take?
Is loading 20 tracks "a lot"? Does loading a single track itself trigger a number of new request to pull in the metadata, making it a very bad idea trying to load more than a handful at once? And is there any way of finding out why loading of a track failed, aside from it just timing out?
Rather often (maybe 1 out of 10), loading fails after the default 20 s timeout (and I've tried with longer timeouts, without big differences). Sometimes all 20 tracks have failed to load, sometimes only a single track.
I daresay that my internet connection stays the same (as much as it could, anyway) in between these attempts (in between which it can be a minute or two).
I realize that there's a lot of fuzzy input here on what's normal and what's not etc. And this obviously depends on a number of factors such as your internet connection, the status of the Spotify servers and such, but maybe it's possible to give some kind of hint on what to expect and if there are any do's and don't specific to the Spotify API.

Generally, the rule is: only load tracks whose metadata you need to display to the user right now, and perhaps pre-load track that'll be displayed next.
In addition, since CocoaLibSpotify uses a queue, the more load you place on the library, the more overhead you'll have. For example, if you separately SPAsyncLoading a bunch of stuff they'll all be separately queued, but their timers will start instantly. If you enqueue enough stuff, they might not even start loading by the time their timeout fires.
However, since a bunch of queueing goes on internally, this also happens if you throw a ton of stuff into a single SPAsyncLoading call.
To keep things fast, keep it light and try to follow the guideline in the first sentence. In addition, try and keep things efficient:
SPAlbumBrowse will load that album's track metadata in one hit, reducing backend load and queueing time.
Same for SPArtistBrowse.
Same for SPPlaylist, I believe.

Related

How to do UITableView operation queues to download concurrently

I am doing a UITableview to download data
the name webservice is very fast, so I use it to populate the table initially, then I start an operation queue for the image.
Then a seperate queue for the rest of data because it loads very slow but that effects the image load time, How can I do the 2 concurrently.
Can you figure out whats slowing the performance there and help me fix it?
As I assume you know, you can specify how many concurrent requests by setting maxConcurrentOperationCount when you create your queue. Four is a typical value.
self.imageDownloadingQueue.maxConcurrentOperationCount = 4;
The thing is, you can't go much larger than that (due to both iOS restrictions and some server restrictions). Perhaps 5, but no larger than that.
Personally, I wouldn't use up the limited number of max concurrent operations returning the text values. I'd retrieve all of those up front. You do lazy loading of images because they're so large, but text entries are so small that the overhead of doing separate network requests starts to impose its own performance penalties. If you are going to do lazy loading of the descriptions, I'd download them in batches of 50 or 100 or so.
Looking at your source code, at a very minimum, you're making twice as many JSON requests as you should (you're retrieving the same JSON in getAnimalRank and getAnimalType). But you really should just alter that initial JSON request to return everything you need, the name, the rank, the type, the URL (but not the image itself). Then in a single call, you get everything you need (except the images, which we'll retrieve asynchronously, and which your server is delivery plenty fast for the UX). And if you decide to keep the individual requests for the rank/type/url, you need to take a look at your server code, because there's no legitimate reason that shouldn't come back instantaneously, and it's currently really slow. But, as I said, you really should just return all of that in the initial JSON request, and your user interface will be remarkably faster.
One final point: You're using separate queues for details and image downloads. The entire purpose in using NSOperationQueue and setting maxConcurrentOperationCount is that iOS can only execute 5 concurrent requests with a given server. By putting these in two separate queues, you're losing the benefit of maxConcurrentOperationCount. As it turns out it takes a minute for requests to time out, so you're probably not going to experience a problem, but still, it reflects a basic misunderstanding of the purpose of the queues.
Bottom line, you should have only one network queue (because the system limitation is how many network concurrent connections between your device and any given server, not how many image downloads and, separately, how many description downloads).
Have you thought about just doing this asyncronously? I wrote a class to do something very similar to what you describe using blocks. You can do this two ways:
Just load async whenever cellForRowAtIndexPath fires. This works for lots of situations, but can lead to the wrong image showing for a second until the right one is done loading.
Call the process to load the images when the dragging has stopped. This is generally the way I do things so that the correct image always shows where it should. You can use a placeholder image until the image is loaded from the web.
Look at this SO question for details:
Loading an image into UIImage asynchronously

When/what to cache in Rails 3

Caching is something that I kind of ignored for a long time, as projects that I worked on were on local intranets with very little activity. I'm working on a much larger Rails 3 personal project now, and I'm trying to work out what and when I should cache things.
How do people generally determine this?
If I know a site is going to be relatively low-activity, should I just cache every single page?
If I have a page that calls several partials, is it better to do fragment caching in those partials, or page caching on those partials?
The Ruby on Rails guides did a fine job of explaining how caching in Rails 3 works, but I'm having trouble understanding the decision-making process associated with it.
Don't ever cache for the sake of it, cache because there's a need (with the exception of something like the homepage, which you know is going to be super popular.) Launch the site, and either parse your logs or use something like NewRelic to see what's slow. From there, you can work out what's worth caching.
Generally though, if something takes 500ms to complete, you should cache, and if it's over 1 second, you're probably doing too much in the request, and you should farm whatever you're doing to a background process…for example, fetching a Twitter feed, or manipulating images.
EDIT: See apneadiving's answer too, he links to some great screencasts (albeit based on Rails 2, but the theory is the same.)
You'll want to think about caching several kinds of things:
Requests that are hit a lot, and seldom change
Requests that are "expensive" to draw, lots of database calls, etc. Also hopefully these seldom change.
The other side of caching that shouldn't go without mention, is expiration. Its also often the harder part. You have to know when a cache is no longer good, and clear it out so fresh content will be generated. Sweepers, or Observers, depending on how you implement your cache can help you with this. You could also do it just based on a time value, allow caches to have a max-age and clear them after that no matter what.
As for fragment vs full page caching, think of it in terms of how often those parts are updated. If 3 partials of a page are never updated, and one is, maybe you want to cache those 3, and allow that 1 to be fetched live for so you can have up to the second accuracy. Or if the different partials of a page should have different caching rules: maybe a "timeline" section is cached, but has a cache-age of 1 minute. While the "friends" partial is cached for 12 hours.
Hope this helps!
If the site is relatively low activity you shouldn't cache any page. You cache because of performance problems, and performance problems come about because you have too much data to query, too many users, or worse, both of those situations at the same time.
Before you even think about caching, the first thing you do is look through your application for the requests that are taking up the most time. Not the slowest requests, but the requests your application spends the most aggregate time performing. That is if you have a request A that runs 10 times at 1500ms and request B that runs 5000 times at 250ms you work on optimizing B first.
It's actually pretty easy to grep through your production.log and extract rendering times and URLs to combine them into a simple report. You can even do that in real-time if you want.
Once you've identified a problematic request, you go about picking apart what it's doing to service the request. The first thing is to look for any queries that can be combined by using eager loading or by looking ahead a bit more to anticipate what you'll need. The next thing is to ensure you're not loading data that isn't used.
So many times you'll see code to list users and it's loading 50KB per person of biographical data, their Facebook and Twitter handles, literally everything about them, and all you use is their name.
Fetch as little as you need, and fetch it in the most efficient way you can. Use connection.select_rows when you don't need models.
The next step is to look at what kind of queries you're running, and how they're under-performing. Ensure your indexes are all set properly and are being used. Check that you're not doing complicated JOIN operations that could be resolved by a bit of tactical de-normalization.
Have a look at what data you are storing in your application, and try and find things that can be removed from your production database and warehoused somewhere else. Cycle your data out regularly when it's no longer relevant, preserve it in a separate database if you need to.
Then go over and have a look at how your database server is tuned. Does it have sufficiently large buffers? Is it on hardware that could be upgraded with more memory at a nominal cost? Too many people are running a completely un-tuned database server and with a few simple settings they can get ten-fold performance increases.
If, and only if, you still have a performance problem at this point then you might want to consider caching.
You know why you don't cache first? It's because once you cache something, that cached data is immediately stale. If parts of your application use this data under the assumption it's always up to date, you will have problems. If you don't expire this cache when the data does change, you will have problems. If you cache the data and never use it again, you're just clogging up your cache and you will have problems. Basically you'll have lots of problems when you use caching, so it's often a last resort.

Handling latency while synchronizing client-side timers using Juggernaut

I need to implement a draft application for a fantasy sports website. Each users will have 1m30 to choose a player on its team and if that time has elapsed it will be selected automatically. Our planned implementation will use Juggernaut to push the turn changes to each user participating in the draft. But I'm still not sure about how to handle latency.
The main issue here is if a user got a higher latency than the others, he will receive the turn changes a little bit later and his timer won't be synchronized. Say someone receive a turn change after choosing a player himself while on his side he think he still got 2 seconds left, how can we handle that case? Is it better to try to measure each user latency and adjust the client-side timer to minimize that issue? If so, how could we implement that?
This is a tricky issue, but there are some good solutions out there. Look into what time.gov does, and how it does it; essentially, as I understand it, they use Java to perform multiple repeated requests to the server, to attempt to get an idea of the latency involved in the communication, then they generate a measure of latency that they use to skew the returned time data. You could use the same process for your application, with even more accuracy; keeping track of what the latency is and how it varies over time lets you make some statistical inferences about how reliable your latency numbers are, etc. It can be a bit complex, but it can definitely allow you to smooth out your performance. My understanding is that this is what most MMOs do as well, to manage lag.

I'm making a multiplayer game and I need to verify that players aren't speed hacking

For security reasons I have a feeling that that testing should be done server side. Nonetheless, that would be rather taxing on the server, right? Given the gear and buffs a player is wearing they will have a higher movement speed, so each time they move I would need to calculate that new constant and see if their movement is legitimate (using TCP so don't need to worry about lost, unordered packets). I realize I could instead just save the last movement speed and only recalculate it if they've changed something affecting their speed, but even then that's another check.
Another idea I had is that the server randomly picks data that the client is sending it and verifies it and gives each client a trust rating. A low enough trust rating would mean every message from the client would be inspected and all of their actions would be logged in a more verbose manner. I would then know they're hacking by inspecting the logs and could ban/suspend them as well as undo any benefits they may have spread around through hacking.
Any advice is appreciated, thank you.
Edit: I just realized there's also the case where a hacker could send tiny movements (within the capability of their regular speed) in a very high succession. Each individual movement alone would be legite, but the cumulative effect would be speed hacking. What are some ways around this?
The standard way to deal with this problem is to have the server calculate all movement. The only thing that the clients should send to the server are commands, e.g. "move left" and the server should then calculate how fast the player moves etc., then finally send the updated position back to the client.
If you leave any calculation at all on the client, the chances are that sooner or later someone will find a way to cheat.
[...] testing should be done server side. Nonetheless, that would be rather taxing on the server, right?
Nope. This is the way to do it. It's the only way to do it. All talk of checking trust or whatever is inherently flawed, one way or another.
If you're letting the player send positions:
Check where someone claims they are.
Compare that to where they were a short while ago. Allow a tiny bit of deviation to account for network lag.
If they're moving too quickly, reposition them somewhere more reasonable. Small errors may be due to long periods of lag, so clients should use interpolation to smooth out these corrections.
If they're moving far too quickly, disconnect them. And check for bugs in your code.
Don't forget to handle legitimate traversals over long distance, eg. teleports.
The way around this is that all action is done on the server. Never trust any information that comes from the client. If anybody actually plays your game, somebody will reverse-engineer the communication to the server and figure out how to take advantage of it.
You can't assign a random trust rating, because cautious cheaters will cheat only when they really need to. That gives them a considerable advantage with a low chance of being spotted cheating.
And, yes, this means you can't get by with a low-grade server, but there's really no other method of preventing client-side cheating.
If you are developing in a language that has access to Windows API function calls, I have found from my own studies in speed hacking, that you can easily identify a speed hacker by calling two functions and comparing results.
TimeGetTime
and...
GetTickCount
Both functions will return the number of seconds since the system started. However, TimeGetTime is much more accurate than GetTickCount, whereas TimeGetTime is accurate up to ~1ms vs. GetTickCount, which is accurate at around ~50ms
Even though there is a small lag between these two functions, if you turn on a speed hacking application (pick your poison), you should see a very large difference between the two result sets, sometimes even up to a couple of seconds. The difference is very noticable.
Write a simple application that returns the GetTickCount and TimeGetTime results, without the speed hacking application running, and leave it running. Compare the results and display the difference -- you should see a very small difference between the two. Then, with your application still running, turn on the speed hacking application and you will see the very large difference in the two result sets.
The trick is figuring out what threshold will constitue suspicious activity.

Storing Data In Memory: Session vs Cache vs Static

A bit of backstory: I am working on an web application that requires quite a bit of time to prep / crunch data before giving it to the user to edit / manipulate. The data request task ~ 15 / 20 secs to complete and a couple secs to process. Once there, the user can manipulate vaules on the fly. Any manipulation of values will require the data to be reprocessed completely.
Update: To avoid confusion, I am only making the data call 1 time (the 15 sec hit) and then wanting to keep the results in memory so that I will not have to call it again until the user is 100% done working with it. So, the first pull will take a while, but, using Ajax, I am going to hit the in-memory data to constantly update and keep the response time to around 2 secs or so (I hope).
In order to make this efficient, I am moving the intial data into memory and using Ajax calls back to the server so that I can reduce processing time to handle the recalculation that occurs w/ this user's updates.
Here is my question, with performance in mind, what would be the best way to storing this data, assuming that only 1 user will be working w/ this data at any given moment.
Also, the user could potentially be working in this process for a few hours. When the user is working w/ the data, I will need some kind of failsafe to save the user's current data (either in a db or in a serialized binary file) should their session be interrupted in some way. In other words, I will need a solution that has an appropriate hook to allow me to dump out the memory object's data in the case that the user gets disconnected / distracted for too long.
So far, here are my musings:
Session State - Pros: Locked to one user. Has the Session End event which will meet my failsafe requirements. Cons: Slowest perf of the my current options. The Session End event is sometimes tricky to ensure it fires properly.
Caching - Pros: Good Perf. Has access to dependencies which could be a bonus later down the line but not really useful in current scope. Cons: No easy failsafe step other than a write based on time intervals. Global in scope - will have to ensure that users do not collide w/ each other's work.
Static - Pros: Best Perf. Easies to maintain as I can directly leverage my current class structures. Cons: No easy failsafe step other than a write based on time intervals. Global in scope - will have to ensure that users do not collide w/ each other's work.
Does anyone have any suggestions / comments on what I option I should choose?
Thanks!
Update: Forgot to mention, I am using VB.Net, Asp.Net, and Sql Server 2005 to perform this task.
I'll vote for secret option #4: use the database for this. If you're talking about a 20+ second turnaround time on the data, you are not going to gain anything by trying to do this in-memory, given the limitations of the options you presented. You might as well set this up in the database (give it a table of its own, or even a separate database if the requirements are that large).
I'd go with the caching method of for storing the data across any page loads. You can name the cache you want to store the data in to avoid conflicts.
For tracking user-made changes, I'd go with a more old-school approach: append to a text file each time the user makes a change and then sweep that file at intervals to save changes back to DB. If you name the files based on the user/account or some other session-unique indicator then there's no issue with conflict and the app (or some other support app, which might be a better idea in general) can sweep through all such files and update the DB even if the session is over.
The first part of this can be adjusted to stagger the write out more: save changes to Session, then write that to file at intervals, then sweep the file at larger intervals. you can tune it to performance and choose what level of possible user-change loss will be possible.
Use the Session, but don't rely on it.
Simply, let the user "name" the dataset, and make a point of actively persisting it for the user, either automatically, or through something as simple as a "save" button.
You can not rely on the session simply because it is (typically) tied to the users browser instance. If they accidentally close the browser (click the X button, their PC crashes, etc.), then they lose all of their work. Which would be nasty.
Once the user has that kind of control over the "persistent" state of the data, you can rely on the Session to keep it in memory and leverage that as a cache.
I think you've pretty much just answered your question with the pros/cons. But if you are looking for some peer validation, my vote is for the Session. Although the performance is slower (do you know by how much slower?), your processing is going to take a long time regardless. Do you think the user will know the difference between 15 seconds and 17 seconds? Both are "forever" in web terms, so go with the one that seems easiest to implement.
perhaps a bit off topic. I'd recommend putting those long processing calls in asynchronous (not to be confused with AJAX's asynchronous) pages.
Take a look at this article and ping me back if it doesn't make sense.
http://msdn.microsoft.com/en-us/magazine/cc163725.aspx
I suggest to create a copy of the data in a new database table (let's call it EDIT) as you send the initial results to the user. If performance is an issue, do this in a background thread.
As the user edits the data, update the table (also in a background thread if performance becomes an issue). If you have to use threads, you must make sure that the first thread is finished before you start updating the rows.
This allows a user to walk away, come back, even restart the browser and commit whenever she feels satisfied with the result.
One possible alternative to what the others mentioned, is to store the data on the client.
Assuming the dataset is not too large, and the code that manipulates it can be handled client side. You could store the data as an XML data island or JSON object. This data could then be manipulated/processed and handled all client side with no round trips to the server. If you need to persist this data back to the server the end resulting data could be posted via an AJAX or standard postback.
If this does not work with your requirements I'd go with just storing it on the SQL server as the other comment suggested.

Resources