Is It Possible to Apply SQS Limits for IAM Users? - amazon-sqs

I'm currently working on a project which has a large number of IAM users, each of whom needs limited access to particular SQS queues.
For instance, let's say I have an IAM user named 'Bob' and an SQS queue named 'BobsQueue'. What I'd like to do is grant Bob full permission to manage his queue (BobsQueue), but I'd like to restrict his usage such that:
Bob can make only 10 SQS requests per second to BobsQueue.
Bob cannot make more than 1,000,000 SQS requests per month.
I'd essentially like to apply arbitrary usage restrictions to this SQS queue.
Any ideas?

Off the top of my head, none of the available AWS services offers resource usage limits at all, unless they are built into the service's basic mode of operation (e.g. Provisioned Throughput in Amazon DynamoDB). Amazon SQS is no exception: the Available Keys supported by all AWS services that adopt the access policy language for access control currently lack any such resource limit constraints.
While I can see your use case, I think this is more likely to appear as an accounting/billing feature, since it would make sense to allow cost control by setting (possibly fine-grained) limits on AWS resource usage. That isn't available yet either, though.
Please note that this feature is frequently requested (see e.g. How to limit AWS resource consumption?), and its absence makes possible what Christofer Hoff aptly termed an Economic Denial of Sustainability attack (see The Google attack: How I attacked myself using Google Spreadsheets and I ramped up a $1000 bandwidth bill for a somewhat ironic and actually non-malicious example).
Workaround
You might be able to approximate your specification by using Shared Queues with an IAM policy granting access to user Bob, as outlined in Example AWS IAM Policies for Amazon SQS, and then monitoring this queue with Amazon CloudWatch by Creating Amazon CloudWatch Alarms for one or more of the Amazon SQS Dimensions and Metrics you want to limit, e.g. NumberOfMessagesSent. Once the limit is reached, you could revoke the IAM grant for user Bob on this shared queue until he is in compliance again.
Obviously, it is not trivial to implement the 'per-second'/'per-month' specification based on this metric alone without some thorough bookkeeping, nor will you be able to 'pull the plug' precisely when the limit is reached; you will need to account for processing time and API delays.
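The bookkeeping part of this workaround could look roughly like the sketch below. It is only an illustration: `QuotaTracker`, `deny_policy`, the queue ARN and the sample counts are all made up, and the per-period counts are assumed to come from whatever CloudWatch datapoints you collect. It expresses "revoke" as an explicit Deny statement, which is one option; removing the original Allow would work as well.

```python
import json
from dataclasses import dataclass


@dataclass
class QuotaTracker:
    """Keeps a running monthly total built from periodic metric samples."""
    monthly_limit: int
    used: int = 0

    def record(self, requests_in_period: int) -> None:
        # Add the count reported for one monitoring period.
        self.used += requests_in_period

    def exceeded(self) -> bool:
        return self.used >= self.monthly_limit


def deny_policy(queue_arn: str) -> str:
    """Build an IAM policy document denying all SQS actions on one queue."""
    document = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Deny", "Action": "sqs:*", "Resource": queue_arn}
        ],
    }
    return json.dumps(document)


# Hypothetical ARN and sample counts, for illustration only.
QUEUE_ARN = "arn:aws:sqs:us-east-1:123456789012:BobsQueue"
tracker = QuotaTracker(monthly_limit=1_000_000)
for sample in (400_000, 350_000, 300_000):
    tracker.record(sample)

# Only produce the deny document once the quota has been used up.
policy = deny_policy(QUEUE_ARN) if tracker.exceeded() else None
```

The resulting document would then have to be attached to the user (for example as an inline user policy) and removed again at the start of the next month. Because the samples arrive with a delay, Bob can still overshoot the limit by roughly one monitoring period.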
Good luck!

Related

Is there a GCP equivalent to AWS SQS?

I'm curious to understand the implementation of GCP's Pub/Sub. Although the name points to a publish-subscribe design pattern, it seems closer to AWS's SQS (a queue) than to AWS SNS (which uses the publish-subscribe model). Here is why I think so. GCP's Pub/Sub:
Allows up to 10,000 subscriptions per project.
Allows filtering on subscriptions
It even allows ordering (beta), which should involve a FIFO queue somewhere.
It exposes a synchronous API for the request/response pattern.
It makes me wonder if subscriptions in Pub/Sub are merely SQS-style queues.
I would like your opinions on this comparison. The confusion is due to the lack of implementation details on Pub/Sub and the obvious name indicating a certain design pattern.
Regards,
The division for messaging in GCP is along slightly different lines than what you may see in AWS. GCP breaks down messaging into three categories:
Torrents: Messaging pipelines that are designed to handle large amounts of throughput on pipes that are persistent. In other words, one creates a new pipeline rarely and sends messages over it for long periods of time. The scaling pattern for torrents is a relatively small number of pipelines transmitting a lot of data. For this category, Cloud Pub/Sub is the right product.
Trickles: Messaging pipelines that are largely ephemeral or require broadcast to a very large number of end-user devices. These pipelines have a low throughput but the number of pipelines can be extremely large. Firebase Cloud Messaging is the product that fits into this category.
Queues: Messaging pipelines where one has more control over the end-to-end message delivery. These pipelines are not really high throughput nor is the number of pipelines large, but more advanced properties are supported, e.g., the ability to delay or cancel the delivery of a message. Cloud Tasks fits in this category, though Cloud Pub/Sub is also adopting features that make it more and more viable for this use case.
So Cloud Pub/Sub is the publish/subscribe aspects of SQS+SNS, where SNS is used as a means to distribute messages to different SQS queues. It also serves as the big-data ingestion mechanism a la Kinesis. Firebase Cloud Messaging covers the portions of SNS designed to reach end user devices. Cloud Tasks (and Cloud Pub/Sub, more and more) provide functionality of a single queue in SQS.
You are correct to say that GCP PubSub is close to AWS SQS. As far as I know, there is no exact SNS tool available in GCP, but I think the closest tool is GCM (Google Cloud Messaging). You are not the only one who has had this query:
AWS SNS equivalent in GCP stack

Video Streaming for Mobile App

I'm building an iOS app for a client that allows users to pay a subscription and unlock additional content within the app. Part of the additional content will be videos which need to be streamed from a server... but I'm not sure whether we should use a hosting service (like Amazon CloudFront or Wowza, perhaps?) or roll our own solution.
Have any of you had experience with either of these options? It looks like this is supported natively by nginx, which we're currently using as our reverse proxy, but I'd like to hear some thoughts about that. I would also be somewhat concerned about saturating our server's 1Gb network connection...
Whatever the solution, we must be able to verify a person's account before they can access the video content. Variable bitrate is also desirable, and the ability to support >500 concurrent users. This company is also a new startup, so subscription costs are an important factor.
It is usually best to deploy streaming-specific software or services instead of generic HTTP servers such as Nginx. For Wowza, as an example, here's a quick list of features for this type of workflow.
Performance and scalability. You can do a quick comparison on playing back concurrent streams (using load test tools) and see what kind of load can be handled by an HTTP server vs Wowza.
Monitoring. Statistics collection is also integrated with Wowza, which may prove beneficial for start-up companies that need to leverage this kind of data mining.
Security. Wowza also has several options that you can use, such as Secure Token. For example, you can configure your mobile app to query the user's IP address once you determine that they are authorized to receive the stream. You can then generate a hash token based on this IP address and the stream they are authorized for, and only allow playback with the valid token. You can also expire these tokens.
Manager UI. Not as attractive for developers/sys admins, but users can take advantage of a relatively intuitive UI.
Extensibility. Wowza has REST and Java APIs that allow you to add custom modules or integrate third-party systems. For example, you can use a custom module that monitors stream connection time and cuts off any connections that last longer than x hours.
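To make the token idea from the Security item more concrete, here is a generic HMAC-based sketch. It is not Wowza's actual Secure Token format; the message layout, the secret and the function names are assumptions chosen only to show the principle of binding a token to an IP address, a stream and an expiry time.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"replace-with-a-shared-secret"  # hypothetical shared secret


def make_token(client_ip: str, stream: str, expires_at: int,
               secret: bytes = SECRET) -> str:
    """Sign the (IP, stream, expiry) triple so it can be verified later."""
    message = f"{client_ip}|{stream}|{expires_at}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def is_valid(token: str, client_ip: str, stream: str, expires_at: int,
             now: Optional[float] = None, secret: bytes = SECRET) -> bool:
    """Accept the token only if it has not expired and the signature matches."""
    current = time.time() if now is None else now
    if current >= expires_at:
        return False
    expected = make_token(client_ip, stream, expires_at, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(token, expected)
```

The app backend would call `make_token` after checking the subscription, and the streaming side would call `is_valid` with the connecting IP before allowing playback.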

Twitter Streaming API to follow thousands of users

I'm considering using the Twitter Streaming API (public streams) to keep track of the latest tweets for many users (up to 100k). Despite having read various sources regarding the different rate limits, I still have a couple of questions:
According to the documentation: The default access level allows up to 400 track keywords, 5,000 follow userids. What are the best practices to follow more than 5k users? Creating, for example, 20 applications to get 20 different access tokens?
If I follow just one single user, does the rule of thumb "You get about 1% of all tweets" indeed apply? And how does this change if I add more users up to 5k?
Might using the REST API be a reasonable alternative somehow, e.g., by polling the latest tweets of users on a minute-by-minute basis?
What are the best practices to follow more than 5k users? Creating, for example, 20 applications to get 20 different access tokens?
You don't want to use multiple applications. This response from a mod sums up the situation well. The Twitter Streaming API documentation also specifically calls out devs who attempt to do this:
Each account may create only one standing connection to the public endpoints, and connecting to a public stream more than once with the same account credentials will cause the oldest connection to be disconnected.
Clients which make excessive connection attempts (both successful and unsuccessful) run the risk of having their IP automatically banned.
A rate limit is a rate limit--you can't get more than Twitter allows.
If I follow just one single user, does the rule of thumb "You get about 1% of all tweets" indeed apply? And how does this change if I add more users up to 5k?
The 1% rule still applies, but it is very unlikely, if not impossible, for one user to be responsible for at least 1% of all tweet volume in a given time interval. More users means more tweets, but unless all 5k are very high-volume tweeters you shouldn't have a problem.
Might using the REST API be a reasonable alternative somehow, e.g., by polling the latest tweets of users on a minute-by-minute basis?
Interesting idea, but probably not. You're rate-limited in the REST API as well. For GET statuses/user_timeline, the rate limit is 180 queries per 15 minutes. This endpoint returns the tweets of only one user per request, and the regular GET search/tweets doesn't accept a user ID as a parameter, so you can't take advantage of that either (it is also limited to 180 queries per 15 minutes).
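A quick back-of-the-envelope calculation shows why polling does not scale here. It assumes one request per user per pass, a single set of credentials, and the 180 requests per 15 minutes figure quoted above; the function name is just for illustration.

```python
import math

REQUESTS_PER_WINDOW = 180   # limit quoted above for statuses/user_timeline
WINDOW_MINUTES = 15
USERS = 100_000             # upper bound mentioned in the question


def minutes_for_full_pass(users: int,
                          per_window: int = REQUESTS_PER_WINDOW,
                          window_minutes: int = WINDOW_MINUTES) -> int:
    """Rough upper bound on the minutes needed to poll every user once."""
    windows = math.ceil(users / per_window)
    return windows * window_minutes


users_per_minute = REQUESTS_PER_WINDOW / WINDOW_MINUTES
full_pass = minutes_for_full_pass(USERS)

print(users_per_minute)  # 12.0
print(full_pass)         # 8340
```

So only about 12 users can be polled per minute, and a single pass over 100k users takes roughly 8,340 minutes, or close to six days, which is nowhere near minute-by-minute freshness.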
The Twitter Streaming and REST API overviews are excellent and merit a thorough reading. Tweepy unfortunately has spotty documentation and Twython isn't too much better, but they both leverage the Twitter APIs directly so this will give you a good understanding of how everything works. Good luck!
To get past the limits of 400 keywords and 5k followed user IDs, you need to apply for enterprise access.
Basic
400 keywords, 5,000 userids and 25 location boxes
One filter rule on one allowed connection, disconnection required to adjust rule
Enterprise
Up to 250,000 filters per stream, up to 2,048 characters each.
Thousands of rules on a single connection, no disconnection needed to add/remove rules using Rules API
https://developer.twitter.com/en/enterprise

Desire2Learn Max API Limit

Is there a max limit on the Valence API? I've made a number of calls, but I put some self-throttling in the program. It makes a call to the user page, loops through the data, and then makes another call. It probably averaged 1 call every second or so.
I'm looking at expanding some functionality and I'm worried that we may reach a limit if we aren't careful about how we go about doing everything.
So, is there a limit to how often we can call the Valence API?
The back-end LMS can be configured to rate limit on Valence Learning Framework API calls; however, by default this does not get configured as active. To be sure, you should consult with the administrators of your back-end LMS.
Update: Brightspace no longer supports the kind of rate limiting mentioned above. As Brightspace evolved, D2L found that the rate limiting was not providing the value that was originally intended, and as a result D2L deprecated the feature. D2L no longer rate limits the Brightspace APIs and instead depends on developer self-governance and on asynchronous APIs for more resource-intensive operations (the APIs around importing courses, for example). When you use the Brightspace APIs, you should be mindful that you are using the same computing resources that are made available to end users interacting with the web UI, and if you over-stress these resources (as can easily be done through any API), you can have a negative impact on these end users.
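Since the platform relies on developer self-governance, a small client-side throttle along the lines of what the question describes may be worth keeping. The sketch below is a generic helper, not part of any D2L SDK; the class name and the injectable `clock` and `sleep` parameters are assumptions made so the behaviour can be tested without real waiting.

```python
import time
from typing import Callable


class Throttle:
    """Spaces calls so that at most `max_per_second` start per second."""

    def __init__(self, max_per_second: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = 0.0

    def wait(self) -> None:
        """Block until the next call is allowed, then reserve the next slot."""
        now = self.clock()
        if now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval
```

Calling `throttle.wait()` immediately before each API request keeps the request rate at or below the configured value, e.g. `Throttle(1.0)` for the roughly one call per second mentioned in the question.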

Twitter Rate Limiting IP/OAuth Concerns

I have a series of webapps that collect all terms relating to a subject using the Public Streaming API. So far, I've been taking a very, very arduous route of creating a new account for each stream, setting up a new Twitter application on that account, copying the OAuth tokens, spinning up a new EC2 instance, and setting up the stream.
This allows me to have the streams coming from multiple different IPs, OAuth generation is easy with the generator tool when you create an app, and because they are each in different accounts I don't meet any account limits.
I'm wondering if there's anything I can do to speed up the process, specifically in terms of EC2 instances. Can I have a bunch of streams running off the same instance using different accounts?
If you run multiple consumers from a single machine, you may be temporarily banned, and repeated bans may end up getting you banned for longer periods.
At least, this happened to me a few times in the past.
What I had found at the time was:
same credentials, same ip -> block/ban
different credentials, same ip -> mostly ok, but banned from time to time
different credentials, different ip -> ok
This was a few years ago, so I am not sure the same is still true, but I'd expect Twitter to have tightened the rules rather than relaxed them.
(Also, I think you're infringing their ToS)
You should check the new Twitter API version 1.1. It was released a few days ago and many changes were made on how the rates are calculated.
One big change is that the IP is completely ignored now. So you don't have to create many instances anymore (profit!)
From the Twitter dev #episod:
Unlike in API v1, all rate limiting is per user per app -- IP address has no involvement in rate limiting consideration. Rate limits are completely partitioned between applications.
