Count operations in Parse - iOS

"Count operations are limited to 160 count queries / minute period for each application. The limit applies to all requests made by all clients of the application".
Sorry, but I don't quite understand what this means. Does this limit refer to a single app on a single device? Or, if I have 160 clients, can they each make only 1 request per minute?
Thanks in advance

This limit is per app: all users of your app can make 160 count queries per minute in total, so yes, with 160 clients you would get 1 count query per client per minute.
In general, you should avoid count queries altogether and instead use counters to keep track of counts. How to do that depends on what you are doing and is somewhat out of scope of the question itself.
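As a rough illustration of the counter idea, here is a minimal sketch in Python against the Parse REST API (the iOS SDK exposes the same operation via PFObject's incrementKey:). The class name, field name, object id, and credentials below are placeholders, not anything from the question.

import json
import requests

PARSE_APP_ID = "YOUR_APP_ID"          # placeholder credentials
PARSE_REST_KEY = "YOUR_REST_API_KEY"

def increment_counter(class_name, object_id, field, amount=1):
    # Atomically increment a counter field on one object instead of
    # issuing a count query against the whole class.
    url = "https://api.parse.com/1/classes/%s/%s" % (class_name, object_id)
    headers = {
        "X-Parse-Application-Id": PARSE_APP_ID,
        "X-Parse-REST-API-Key": PARSE_REST_KEY,
        "Content-Type": "application/json",
    }
    payload = {field: {"__op": "Increment", "amount": amount}}
    response = requests.put(url, headers=headers, data=json.dumps(payload))
    response.raise_for_status()
    return response.json()

# e.g. bump a hypothetical "commentCount" on a Post whenever a comment is saved:
# increment_counter("Post", "xWMyZ4YEGZ", "commentCount")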

Related

Twilio API "IncomingPhoneNumber.list()" is slow on Master account (And I only need the total amount of numbers)

I have a Flask app that uses Twilio. I display the total number of IncomingPhoneNumbers in an account/subaccount to the user.
Here is what I'm currently doing to get the total:
from twilio.rest import Client

client = Client(account_sid, auth_token)
# Fetch the full list of incoming phone numbers
pn_list = client.incoming_phone_numbers.list()
# Get the length of the list
total = len(pn_list)
This takes upwards of 19 seconds just to get the numbers in the Master account. On top of that I have to repeat this for all subaccounts.
Is there a better way to just get the total numbers for an account/subaccount?
Thanks in advance!
Twilio developer evangelist here.
I would recommend that, rather than listing all the numbers every time you want to count them, you cache the count within your own database against the account.
You can then update the count either when you know it changes, if you are purchasing/releasing numbers via the API in your app, or periodically (say, once a day) with a background job.
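As a rough sketch of that caching idea (the cache here is just an in-memory dict standing in for a database table keyed by account SID):

from datetime import datetime
from twilio.rest import Client

number_count_cache = {}   # stand-in for a database table keyed by account SID

def refresh_number_count(account_sid, auth_token):
    # Run from a scheduled background job (say, once a day), or whenever the
    # app purchases or releases a number via the API.
    client = Client(account_sid, auth_token)
    total = len(client.incoming_phone_numbers.list())
    number_count_cache[account_sid] = {"total": total, "updated_at": datetime.utcnow()}
    return total

def get_number_count(account_sid):
    # The page view reads the cached value and never waits on the Twilio API.
    cached = number_count_cache.get(account_sid)
    return cached["total"] if cached else None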

Prepping Data For Usage Clustering

Dataset: I'm given the number of minutes individual customers use a product each day and am trying to cluster this data in order to find common usage patterns.
My question: How can I format the data so that, for example, a power user with high levels of use for a year looks the same as a different power user who has only been able to use the device for a month before I ended data collection?
So far I've turned each customer into an array where each cell is the number of minutes used that day. This array starts when the user first uses the product and ends after the user's first year of use. All entries in the cells must be double values (e.g. 200.0 minutes used) for the clustering model. I've considered setting all cells/days after the last day of data collection to either -1.0 or NULL. Is either of these a valid approach? If not, what would you suggest?
For the problem where you want both users (one who used the product heavily every day for a year, the other who used it heavily for only a month) to look the same, create a new feature whose value is:
avg_usage per time_bin
time_bin can be a month, a day, or whatever bin best fits your needs.
This way, a user who uses the product for, say, 200 minutes per day for one year will get:
200 * 30 * 12 / 12 = 6000 minutes per month
and the other user, who joined just last month with the exact same usage, will get:
200 * 30 * 1 / 1 = 6000 minutes per month.
This way, it doesn't matter when a user started using the product; the only thing that matters is the usage rate.
One more thing you might take into consideration: products may be forgotten for some time. For example, if I go away on vacation, the days I didn't use my computer don't (necessarily) say anything about my general usage of that product. So, based on your data, product, and intuition, you might consider removing gaps like that and not counting them in the calculation.
How long a user has been using your product could itself be a signal of something, but if a user only started recently and is still using it today, that is exactly the situation this average-binning technique helps with.
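A minimal sketch of this binning in Python, assuming each user's usage is a plain list of daily minutes covering only the days actually observed (no -1.0/NULL padding):

def avg_usage_per_bin(daily_minutes, days_per_bin=30):
    # Sum minutes inside each ~30-day bin, then average across the bins the
    # user actually has, so the length of observation cancels out.
    if not daily_minutes:
        return 0.0
    bins = [daily_minutes[i:i + days_per_bin]
            for i in range(0, len(daily_minutes), days_per_bin)]
    bin_totals = [sum(b) for b in bins]
    return sum(bin_totals) / len(bin_totals)

year_user = [200.0] * 360    # 200 minutes/day for a year of full 30-day bins
month_user = [200.0] * 30    # same rate, but only one month of data
print(avg_usage_per_bin(year_user))   # 6000.0
print(avg_usage_per_bin(month_user))  # 6000.0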

Max Subscribers Returned (and duplicates) | YouTube API

[Problem 1]
I am using https://developers.google.com/youtube/v3/docs/subscriptions/list for a large channel (1 million subscribers) but after 100 successful pages of results (50 subscribers per page), the API always returns 0 subscribers.
Is there a hard limit of 100 pages or 5,000 subscribers that can be returned?
[Problem 2]
Of the 5,000 subscribers returned, only 3,577 are unique. The API seems to be returning duplicates in some cases, which I know is a long-standing issue with getting channel subscribers. Hoping to learn if this will be fixed?
I ran into the second problem today, and it seems the duplicates happen because the default order of the API list is SUBSCRIPTION_ORDER_RELEVANCE.
Acceptable values are:
alphabetical – Sort alphabetically.
relevance – Sort by relevance.
unread – Sort by order of activity.
So setting order to be alphabetical solves the problem entirely.
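A rough sketch of paging through subscribers with the alphabetical order using google-api-python-client; credentials is assumed to be an authorized OAuth2 credential for the channel owner, and this does not by itself change whatever cap causes Problem 1.

from googleapiclient.discovery import build

# 'credentials' is assumed to be an authorized OAuth2 credential for the channel owner
youtube = build("youtube", "v3", credentials=credentials)

subscribers = {}
request = youtube.subscriptions().list(
    part="subscriberSnippet",
    mySubscribers=True,      # list the channel's own subscribers
    order="alphabetical",    # avoids the duplicates seen with the default relevance order
    maxResults=50,
)
while request is not None:
    response = request.execute()
    for item in response.get("items", []):
        snippet = item["subscriberSnippet"]
        subscribers[snippet["channelId"]] = snippet.get("title")
    request = youtube.subscriptions().list_next(request, response)

print(len(subscribers), "unique subscribers fetched")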

How do I consistently increase a counter cache column?

Let's say I have a counter cache that needs to be incremented on every page load, and I have 10 web instances. How do I consistently increment a counter cache column?
Consistency is easy with one web instance, but with several instances running, a race condition may occur.
Here is a quick explanation. Let's say my counter cache column is called foo_counts and its starting value is 0. If 2 web instances load at the same time, both read the count as 0. When it comes time to increase the count, they both increment it from 0 to 1.
I looked at http://guides.rubyonrails.org/active_record_querying.html#locking-records-for-update
Any ideas would be greatly appreciated.
You could use increment_counter:
increment_counter(counter_name, id)
Increment a number field by one, usually representing a count.
This does a direct UPDATE in SQL so Model.increment_counter(:c, 11) sends this SQL to the database:
update models set c = coalesce(c, 0) + 1 where id = 11
so you don't have to worry about race conditions. Let the database do its job.
Consider queueing the increments, and having workers do the actual incrementation in the background. You won't have completely up-to-the-millisecond data, but at least it will be accurate.
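To sketch the queued approach (purely an illustration in Python with Redis as the buffer; in a Rails app the same flow would run from a background worker, and the table/column names here are hypothetical):

import redis

r = redis.Redis()

def record_hit(record_id):
    # Called from every web instance; INCR is atomic, so there is no race.
    r.incr("pending_foo_counts:%s" % record_id)

def flush_hits(db_connection, record_id):
    # Background worker: atomically read-and-reset the buffered delta,
    # then apply it to the counter cache column in a single UPDATE.
    delta = int(r.getset("pending_foo_counts:%s" % record_id, 0) or 0)
    if delta:
        with db_connection.cursor() as cur:
            cur.execute(
                "UPDATE posts SET foo_counts = COALESCE(foo_counts, 0) + %s WHERE id = %s",
                (delta, record_id),
            )
        db_connection.commit()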

Iterating over all items in SimpleDB

Let's say I have an AWS SimpleDB domain with around 3 million items, and each item has an attribute "foo" with a value of some arbitrary integer (which is of course actually stored in SimpleDB as a string, but let's ignore the conversion to and from for now). I would like to increment the foo value for each item every 60 seconds until it reaches a maximum value (the max is not the same for each item; each item's max is stored as another attribute-value pair on the item), then reset foo to zero: read, increment, evaluate, store.
Given the large number of items, and the hard 60 second time limit, is this approach feasible in SimpleDB? Anyone have an approach to make this work?
You can do it, but it is not practical. You can only get between 100 and 300 PUTs per second for a single domain. You can read upwards of 1,000 items per second, so writes will be the bottleneck.
To be on the conservative side, let's say 100 store operations per second per domain. You'd need 500 domains to open up enough throughput to store all 3 million each minute. You only get 100 domains by default, so you'd have to ask for more.
Also it would be expensive. Writes with a small number of attributes are about $3 per million and reads are about $1.30 per million. That's about $13 / minute.
The only thing I can really suggest is combining the 3 million items into a smaller number of items. If there were a way to pack 50 "items" into each real item, you could do it with 10 domains at about $15.50 / hour. But I still wouldn't call that feasible, since you can get a cluster of 10 Extra Large High-CPU EC2 server instances for $6.80 / hour.
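For concreteness, the arithmetic behind those estimates (the rates are this answer's assumptions, not quotas you can rely on):

items = 3_000_000
writes_per_sec_per_domain = 100              # conservative PUT rate per domain
domains_needed = items / 60 / writes_per_sec_per_domain
print(domains_needed)                        # 500.0 domains to write everything each minute

write_cost = items / 1_000_000 * 3.00        # ~$3 per million writes
read_cost = items / 1_000_000 * 1.30         # ~$1.30 per million reads
print(write_cost + read_cost)                # ~12.9 dollars per minute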
Why not generate the value at read time from a trusted clock? I'm going to make up some names:
Touch_time - Epoch value (seconds since 1970) when the item was initialized to zero.
Max_age - Number of minutes after which the value wraps around to zero.
Current_time - Epoch value of now.
So at any time, you can derive the number of seconds into the current cycle with
(current_time - touch_time) % (max_age * 60)
and dividing that by 60 gives the per-minute value you were proposing to store in the attribute.
This assumes max_age changes relatively infrequently and that everyone trusts touch_time and current_time to within a minute; keeping clocks that closely synchronized is what NTP is for.
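A minimal sketch of that read-time computation, with names following the answer (touch_time in epoch seconds, max_age in minutes):

import time

def current_foo(touch_time, max_age):
    # Seconds elapsed since the item was initialized, wrapped at the item's period.
    seconds_into_cycle = (int(time.time()) - touch_time) % (max_age * 60)
    # The value you would otherwise have incremented once every 60 seconds.
    return seconds_into_cycle // 60

# Example: an item initialized 90 minutes ago with a 60-minute max_age reads as 30.
print(current_foo(int(time.time()) - 90 * 60, 60))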
