Amazon SimpleDB: Response messages don't agree with the request parameters - amazon-simpledb

I'm making a simple high scores database for an iPhone game using Amazon's SimpleDB and am running into some strange issues where SimpleDB's response messages don't seem to line up with the requests I'm sending or even the state of the data on the server.
The expected sequence of events for submitting high scores in the app is:
A PutAttributes request is created that tries to overwrite the current score with the new value, but only if it is greater than the last known value of the score.
If the expected value doesn't match the value on the server, SimpleDB's response message lets the app know what the actual value is and a new request is created using it as the new expected value.
This process continues until either the response states that everything was OK or until the score on the server comes back as higher than the score we're trying to submit (i.e. if somebody with a higher score submitted while this back and forth was going on).
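The retry loop described above can be sketched roughly as follows. This is only an illustration of the intended client logic: FakeScoreStore, ConditionalCheckFailed and submit_high_score are hypothetical names, and the in-memory store stands in for the server rather than calling the real SimpleDB API.

```python
class ConditionalCheckFailed(Exception):
    """Raised when the expected value does not match the stored value."""

    def __init__(self, actual):
        super().__init__(f"expected value mismatch; actual value is {actual}")
        self.actual = actual


class FakeScoreStore:
    """Hypothetical in-memory stand-in for the server; not the SimpleDB API."""

    def __init__(self, score):
        self.score = score

    def put_if_expected(self, new_score, expected):
        # Overwrite only when the caller's expected value matches the stored one.
        if self.score != expected:
            raise ConditionalCheckFailed(self.score)
        self.score = new_score


def submit_high_score(store, new_score, last_known, max_attempts=5):
    """Return True if new_score was written, False if the server's score is not lower."""
    expected = last_known
    for _ in range(max_attempts):
        # Stop as soon as the known server score is at least as high as ours
        # (treating a tie as "nothing to do" is an assumption of this sketch).
        if new_score <= expected:
            return False
        try:
            store.put_if_expected(new_score, expected)
            return True
        except ConditionalCheckFailed as err:
            # Retry with the value the server reported as the new expected value.
            expected = err.actual
    raise RuntimeError("gave up after repeated conditional check failures")
```

With a store holding 100, submitting 200 with a last known value of 0 should fail once, pick up 100 as the expected value, and succeed on the second attempt.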
(In case it's relevant I'm using the ASIHTTPRequest class to handle the requests and I've explicitly turned off caching by setting each request's cache policy to ASIIgnoreCachePolicy when I create them.)
However, what's actually happening is a bit strange...
The first response comes back with the expected result. For example, the app submits a score of 200 and expects the score on the server to be 0 but it's actually 100. SimpleDB responds that the conditional check failed and lets the app know the actual value on the server (100).
The app sends a request with an updated expected value, but SimpleDB returns a response identical to the first one even though the expected value was changed (e.g. the response says the actual value is 100 and the expected value we passed in was 0, even though we had just changed it to 100).
The app sends a third request with the exact same score/expected values as the second request (e.g. 100 for both) and SimpleDB reports that the condition failed again because the actual value is 200.
So it looks like the second attempt actually worked even though SimpleDB reported a failure and gave an incorrect account of the parameters I had passed in. This odd behavior is also very consistent - every time I try to update a score with an expected value that doesn't match the one on the server the exact same sequence occurs.
I've been scratching my head at this for a while now and I'm flat out of ideas so if anyone with more SimpleDB experience than me could shed some light on this I'd be mighty grateful.
Below is a sample sequence of requests and responses in case that does a better job of describing the situation than my tortured explanation above (these values taken from actual requests and responses but I've edited out the non-relevant parts of the requests).
Request 1
(The score on the server is 100 at this point)
Attribute.1.Name=Score
Attribute.1.Replace=true
Attribute.1.Value=200
Expected.1.Name=Score
Expected.1.Value=000
Consistent=true
Response 1
Conditional check failed. Attribute (Score) value is (100) but was expected (000)
Request 2
(The app updates to proper score but based on the response SimpleDB seems to ignore the changes)
Attribute.1.Name=Score
Attribute.1.Replace=true
Attribute.1.Value=200
Expected.1.Name=Score
Expected.1.Value=100
Consistent=true
Response 2
Conditional check failed. Attribute (Score) value is (100) but was expected (000)
Request 3
(This time SimpleDB gets the expected value right but also reports that the score has been updated even though all previous responses indicated otherwise)
Attribute.1.Name=Score
Attribute.1.Replace=true
Attribute.1.Value=200
Expected.1.Name=Score
Expected.1.Value=100
Consistent=true
Response 3
Conditional check failed. Attribute (Score) value is (200) but was expected (100)
Update (10/21/10)
I checked to make sure that the requestIDs that are being returned from the server are all unique and indeed they are.

Try passing ConsistentRead=true in your requests.

Related

Circumventing negative side effects of default request sizes

I have been using Reactor pretty extensively for a while now.
The biggest caveat I have had coming up multiple times is default request sizes / prefetch.
Take this simple code for example:
Mono.fromCallable(System::currentTimeMillis)
    .repeat()
    .delayElements(Duration.ofSeconds(1))
    .take(5)
    .doOnNext(n -> log.info(n.toString()))
    .blockLast();
To the eye of someone who might have worked with other reactive libraries before, this piece of code should log the current timestamp once a second, five times.
What really happens is that the same timestamp is returned five times, because delayElements doesn't send one request upstream for every elapsed duration, it sends 32 requests upstream by default, replenishing the number of requested elements as they are consumed.
This wouldn't be a problem if the environment variable for overriding the default prefetch weren't capped at a minimum of 8.
This means that if I want to write real reactive code like the above, I have to set the prefetch to one in every transformation. That sucks.
Is there a better way?

Inconsistent ETags in YouTube API

I'm looking at building a caching layer on top of the YouTube API and making use of the HTTP standard ETag functionality to do this as described here https://developers.google.com/youtube/v3/getting-started#etags
I've done some direct testing of this against the API, and in most cases it seems to be working: I can get 304 responses etc.
However I'm seeing a few places where the API is returning different ETags when the response has not changed.
In these cases the ETags seem to cycle between a set of values instead of being a single consistent value.
When I pick one of the ETags and use it to send a conditional GET I will sometimes get a 304 back (when it matches) or sometimes get a 200 with a full response (when it was one of the other values) even though the actual response data is the same.
I've found this behaviour in at least two places:
1) youtube/v3/channels part=brandingSettings
In the response here the brandingSettings has a "hints" value which is an array of size 3.
The order of the elements in this array is random and varies on each request. However, it seems to affect the ETag, meaning I get 6 (the number of permutations of 3 items) different possible ETag values for the same data.
Either the array order should be fixed or the ETag generation algorithm should account for this?
2) youtube/v3/channels part=contentDetails
The ETag for the response here seems to vary between 3 different values, despite there being no other differences in the data. In particular the other "etag" value within "items" remains constant.
Is this a bug in the YouTube API ETag implementation? Surely this behaviour will effectively break any caching layer trying to reduce data retrieval from the YouTube API?
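To illustrate the "hints" case from point 1, here is a small sketch of how an order-sensitive digest yields 6 values for one set of 3 items, while a digest over a sorted canonical form yields a single value. The hint contents and both function names are made up for the example, and this says nothing about how YouTube actually computes its ETags. A client-side digest like this could tell a cache that the data is unchanged, but it would not avoid the 200 response itself.

```python
import hashlib
import itertools
import json


def naive_digest(hints):
    """Digest of the list exactly as received, so element order matters."""
    payload = json.dumps(hints, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def content_digest(hints):
    """Digest that ignores list order by sorting a canonical form of each item."""
    canonical = sorted(json.dumps(h, sort_keys=True) for h in hints)
    return hashlib.sha256("\n".join(canonical).encode("utf-8")).hexdigest()


# Hypothetical placeholder data standing in for the 3-element "hints" array.
hints = [
    {"property": "a", "value": "1"},
    {"property": "b", "value": "2"},
    {"property": "c", "value": "3"},
]

orderings = [list(p) for p in itertools.permutations(hints)]
naive_values = {naive_digest(o) for o in orderings}
stable_values = {content_digest(o) for o in orderings}
```

Here naive_values ends up with one entry per ordering, while stable_values collapses to a single digest.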

Predict delay before next retry on a rate-limited API

I'm using an API that rate-limits me, forcing me to wait and retry my request if I hit a URL too frequently. Let's say for the sake of argument that I don't know what the specific rate limit threshold is (and don't want to hardcode one into my app even if I did know).
I can make an API call and either watch it succeed, or get back a response saying I've been rate limited, then try again shortly, and I can record the total time elapsed for all tries before I was able to successfully complete an API call.
What algorithms would be a good fit for predicting the minimum time I need to wait before retrying an API call after hitting the rate limit?
Use something like a one sided binary search.
See page 134 of The Algorithm Design Manual:
Now suppose we have an array A consisting of a run of 0's, followed by
an unbounded run of 1's, and would like to identify the exact point of
transition between them. Binary search on the array would provide the
transition point in ⌈lg n ⌉ tests, if we had a bound n on the
number of elements in the array. In the absence of such a bound, we can
test repeatedly at larger intervals (A[1], A[2], A[4], A[8], A[16], ...)
until we find a first non-zero value. Now we have a window containing
the target and can proceed with binary search. This one-sided binary
search finds the transition point p using at most 2 ⌈lg(p)⌉
comparisons, regardless of how large the array actually is. One-sided
binary search is most useful whenever we are looking for a key that
probably lies close to our current position.
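A minimal sketch of that one-sided binary search, applied to an integer wait time. It assumes the check is monotone (every delay below some threshold fails and every delay at or above it succeeds), which a real rate limiter may not strictly guarantee. The function name and the succeeds callback are hypothetical; in practice the callback would wait for the given delay and then retry the API call.

```python
def one_sided_search(succeeds, limit=1 << 30):
    """Return the smallest positive integer p for which succeeds(p) is True.

    Assumes succeeds is monotone: False up to some point, True from there on.
    """
    # Phase 1: probe 1, 2, 4, 8, ... until the first success.
    high = 1
    while not succeeds(high):
        if high >= limit:
            raise ValueError("no success found below the limit")
        high *= 2
    # Last probe known to fail (0 if the very first probe succeeded).
    low = high // 2
    # Phase 2: ordinary binary search inside the window (low, high].
    while high - low > 1:
        mid = (low + high) // 2
        if succeeds(mid):
            high = mid
        else:
            low = mid
    return high
```

For example, one_sided_search(lambda delay: delay >= 37) probes 1, 2, 4, ..., 64 and then narrows the window between 32 and 64 down to 37.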

National Weather Service (NOAA) REST API returns nil for parameters of forecast

I am using the NWS REST API as my weather service for an app I am making. I was initially reluctant to use NWS because of its bad documentation, but I couldn't resist as it is offered completely free.
Now that I am trying to use it, I am running into some difficulty. When making a request for multiple days, the minimum temperature appears nil for several days.
(EDIT: As I have been testing the API more I have found that it is not always the minimum temperatures that are nil. It can be a max temp or a precipitation, it seems completely random. If you would like to make test calls using their web interface, you can do so here: http://graphical.weather.gov/xml/sample_products/browser_interface/ndfdBrowserByDay.htm
and here: http://graphical.weather.gov/xml/sample_products/browser_interface/ndfdXML.htm)
Here is an example of a request where the minimum temperatures are empty: http://graphical.weather.gov/xml/sample_products/browser_interface/ndfdBrowserClientByDay.php?listLatLon=40.863235,-73.714780&format=24%20hourly&numDays=7
Surprisingly, on their website, the minimum temperatures are available:
http://forecast.weather.gov/MapClick.php?textField1=40.83&textField2=-73.70
You'll see under the Minimum temperatures that it is filled with about 5 (sometimes less, it is inconsistent) blank fields that say <value xsi:nil="true"/>
If anybody can help me it would be greatly appreciated, using the NWS API can be a little overwhelming at times.
Thanks,
The nil values, from what I can understand of the documentation, here and here, simply indicate that the data is unavailable.
Without making assumptions about NOAA's data architecture, it's conceivable that the information available via the API may differ from what their website displays.
Missing values are represented by an empty element and xsi:nil="true" (R2.2.1).
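As a rough sketch of how a client might treat those nil elements, the snippet below maps each value element to an integer, or to None when it carries xsi:nil="true". The SAMPLE fragment is invented for illustration and is far simpler than a real NDFD response; read_values is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes namespaced attributes as "{namespace-uri}name".
XSI_NIL = "{http://www.w3.org/2001/XMLSchema-instance}nil"

# Made-up fragment for illustration; real responses contain much more structure.
SAMPLE = """<temperature type="minimum"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <value>41</value>
  <value xsi:nil="true"/>
  <value>38</value>
</temperature>"""


def read_values(xml_text):
    """Return the value elements as ints, with None for missing (nil) entries."""
    root = ET.fromstring(xml_text)
    result = []
    for value in root.iter("value"):
        if value.get(XSI_NIL) == "true":
            result.append(None)  # data unavailable for this period
        else:
            result.append(int(value.text))
    return result
```

Keeping the None placeholders preserves the position of each value, so it can still be matched against the corresponding entry in the time layout.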
Nil values being returned seems to involve the time period. Notice the difference between the time-layout keys (see section 5.3.2) in 1 in these requests:
k-p24h-n7-1
k-p24h-n6-1
The data times are different.
<layout-key> element
The key is derived using the following convention:
“k” stands for key.
“p24h” implies a data period length of 24 hours.
“n7” means that the number of data times is 7.
“1” is a sequential number used to keep the layout keys unique.
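The convention above can be decoded with a small parser. This is a sketch that handles only keys of the form shown here (a period given in hours); LayoutKey and parse_layout_key are hypothetical names, not part of any NWS library.

```python
import re
from typing import NamedTuple


class LayoutKey(NamedTuple):
    period_hours: int  # "p24h" -> 24
    data_times: int    # "n7"   -> 7
    sequence: int      # trailing number that keeps keys unique


# Matches keys such as "k-p24h-n7-1"; other period units are not handled.
_KEY_PATTERN = re.compile(r"^k-p(\d+)h-n(\d+)-(\d+)$")


def parse_layout_key(key):
    """Split a time-layout key into its period, count and sequence number."""
    match = _KEY_PATTERN.match(key)
    if match is None:
        raise ValueError(f"unrecognised layout key: {key!r}")
    period, count, sequence = (int(group) for group in match.groups())
    return LayoutKey(period, count, sequence)
```

Comparing parse_layout_key("k-p24h-n7-1").data_times with parse_layout_key("k-p24h-n6-1").data_times makes the difference between the two requests explicit: 7 data times versus 6.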
Here, startDate is the factor. Leaving it off includes more time and might account for some requested data not yet being available.
Per documentation:
The beginning day for which you want NDFD data. If the string is empty, the start date is assumed to be the earliest available day in the database. This input is only needed if one wants to shorten the time window data is to be retrieved for (less than entire 7 days worth), e.g. if user wants data for days 2-5.
I'm not experiencing the randomness you mention. The folks on NOAA's Yahoo! Groups forum might be able to tell you more.

Ajax Security Question: Supplying Available usernames dynamically

I am designing a simple registration form in ASP.net MVC 1.0
I want to allow the username to be validated while the user is typing (as per the related questions linked to below)
This is all easy enough. But what are the security implications of such a feature?
How do I avoid abuse from people scraping this to determine the list of valid usernames?
some related questions: 1, 2
To guard against "malicious" activity on some of my internal Ajax endpoints, I add two GET variables. One is the date (usually as an epoch timestamp); the other is a SHA1 hash of that date plus a salt. If the date, when rehashed on the server, does not match the hash, I drop the request; otherwise I fulfill it.
Of course, I compute the hash before the page is rendered and pass the hash and date to the JS. Otherwise it would be meaningless.
The problem with using IP/cookie based limits is that both can be bypassed.
Using a token method with a good, cryptographically strong salt (say, something like one of Steve Gibson's "Perfect Passwords", https://www.grc.com/passwords.htm), it would take a HUGE amount of time (on the scale of decades) before the token could reliably be predicted, which therefore ensures a certain amount of security.
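A minimal sketch of the date-plus-salt token described in this answer. The SALT value and both function names are placeholders for illustration; the constant-time comparison is an addition that the answer does not mention.

```python
import hashlib
import hmac

# Hypothetical placeholder; a real salt would be long, random and server-side only.
SALT = "replace-with-a-long-random-secret"


def make_token(timestamp, salt=SALT):
    """Hash the timestamp together with the secret salt using SHA1."""
    return hashlib.sha1(f"{timestamp}{salt}".encode("utf-8")).hexdigest()


def is_valid(timestamp, token, salt=SALT):
    """Recompute the hash on the server and compare it with the supplied token."""
    expected = make_token(timestamp, salt)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), token.encode("utf-8"))
```

The server would call make_token while rendering the page and is_valid on each Ajax request. Since the same date and hash pair stays valid indefinitely in this form, rejecting timestamps older than some cutoff would be a natural extension.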
You could limit the number of requests to maybe 2 per 10 seconds or so (a real user may put in a name that is taken, modify it a bit and try again), kind of like how SO doesn't let you comment more than once every 30 seconds.
If you're really worried about it, you could take one of the methods above and count how many times they tried in a certain time period, and if it goes above a threshold, kick them to another page.
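The "2 per 10 seconds" idea could look something like the sliding-window counter below. It is a sketch only: SlidingWindowLimiter is a hypothetical class, the client identifier (IP, session, token) is left to the caller, and the state lives in memory, so it would not survive a restart or work across several servers.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within any window_seconds span."""

    def __init__(self, max_requests=2, window_seconds=10.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client id -> timestamps of allowed requests

    def allow(self, client_id, now):
        hits = self._hits[client_id]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the threshold: reject, delay or redirect
        hits.append(now)
        return True
```

In a request handler, now would typically come from time.monotonic(); passing it in explicitly keeps the class easy to test.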
Validated as in: "This username is already taken"? If you limit the number of requests per second it should help
One common way to solve this is simply by adding a delay in the request. If the request is sent to the server, wait 1 (or more) seconds to respond, then respond with the result (if the name is valid or not).
Adding a time barrier doesn't really affect users who aren't trying to scrape, and you get a limit of 60 requests per minute for free.
Building on the answer provided by UnkwnTech, which is some pretty solid advice:
You could go a step further and make the client perform some calculation to create the return hash. This could just be some simple arithmetic, like subtracting a few numbers, adding the data and multiplying by 2.
The added arithmetic means an out-of-the-box username scraping script is unlikely to work, and it forces the client to use more CPU.
