In this use-case I need to monitor Twitters stream for tweets with certain hash-tags and then pull those tweets out and store them. I am using Twitter4J for this and Twitters Streaming API. The hash-tags to monitor change frequently so I would like to refresh the filter every 10 minutes or so. When I refresh I am simply pulling all the new hash-tags from the data layer and passing them to the filter query. My two questions:
Is there anything wrong with stopping the connection every 10 minutes and refreshing (in terms of Twitters rate limits etc)
Is there anything to prevent me losing tweets that are made during the short refresh pause?
Thanks in advance.
You should not reconnect any more often than once every ten minutes, or you may be rate limited. You can form your new connection before dropping your old connection, which should help avoid data loss. Note that you may only have one outstanding connection at a time.
Related
I have an application where I need to get complete, realtime search results from twitter (preferably polling every 500ms or less). Based on my understanding, doing this using the search API will run into rate limits very quickly. However, the streaming API doesn't seem to support getting complete anything (only a 5% sample).
More specifically, I have a search query term which typically comes up with <20 matching tweets per hour, and I would like to be informed of these new tweets within 1-2 seconds, and it is considered a failure if I am not notified within 5 seconds. Due to the relatively low frequency of posting, missing even one tweet is very undesirable.
Is there any way I can realistically do this using twitter API, or is my only choice to write a browser extension to repeatedly refresh the search page?
The answer is "yes". Although you are rate limited (the limit is closer to 1% than 5%), that is only a cutoff based on your query results. Very roughly, you can stream about 60 tweets per second max. In your case, you say you expect under 20 tweets per hour, so you should have no problem getting all those tweets.
You also require a latency less than 5 seconds. In my experience latency has always been a second or two. I think you should be fine.
I want to make a twitter APP that tweets at 12:00:00 and exactly at 12:00:00, reducing all delays to the minimum(trying to be faster than just scheduling the tweet in tweetdeck).
Any ideas for which language to use, which twitter API(streaming, REST...), suggestions of possible algorithms...?
You just cannot anticipate the duration of your Http Request nor the time that Twitter will take to process it.
Usually a request is processed in less than second. So I would personally initiate the Http Request to start at 11:59:50 if your connection is good enough.
I am personally the developer of the Tweetinvi library which is developed in C# that will let you publish a tweet in a single line of code, but I don't think any library or algorithm will be able to anticipate the time for Twitter to process your request.
What you could potentially do is have some statistics. Store the DateTime just before you invoke the line of code to publish your tweet. When your tweet is published look at the DateTime of creation in the json. Compare the 2 DateTime.
Repeat the previous operation multiple times and perform some statistics to improve your chance to publish at the exact time.
I have an App that has the locations of 10 different places.
Given your current location, the app should return the estimated arrival time for those 10 locations.
However, Apple has said that
Note: Directions requests made using the MKDirections API are server-based and require a network connection.
There are no request limits per app or developer ID, so well-written apps that operate correctly should experience no problems. However, throttling may occur in a poorly written app that creates an extremely large number of requests.
The problem is that they make no definition on what a well written app is. Is 10 requests bad? Is 20 requests an extremely large number?
Has any one done an app like this before to provide some guidance? If Apple does begin throttling the requests, then people will blame my app and not Apple. Some advice please..
Hi investigate class MKRoute this class contains all information you need.
This object contains
expectedTravelTime
Also you should consider LoadingThrottled
The data was not loaded because data throttling is in effect. This
error can occur if an app makes frequent requests for data over a
short period of time.
For prevent your request from bing throttled, reduce number of requests.
Try to use Completion Handlers to know if you request is finished and only after send another request or cancel previous. From my experience try to handle this request just as regular network request just be sure you are not spamming unnecessary requested to the Apple API. But this is not 100% guarantee that Apple won't throttle your requests.
I am not clear about what the Twitter rate limit, "350 requests per hour per access token/user", means. How they are limiting the request? In 1 request how much data i can get?
The rate limits are based on request, not the amount of data (e.g. bytes) you receive. With that in mind, you can maximize requests by using the available parameters of the particular endpoint you're calling. I'll give you a couple examples to explain what I mean:
One way is to set count, if supported, to the highest available value. On statuses/home_timeline, you can max out count at 200. If you aren't using it now, you're getting the default of 20, which means that you would (theoretically) need to do 10 more queries the get the same amount of data. More queries mean you eat up rate limit.
Using statuses/home_timeline again, notice that you can page through data using since_id and max_id, as described in Working with Timelines. Essentially, you keep track of the tweets you already requested so you can save on requests by only getting the newest tweets.
Rate limits are in 15 minute windows, so you can pace your requests to minimize the chance of running out in any give time window.
Use a combination of Streams and requests, which increases your rate limit.
There are more optimizations like this that help you save request limit, some more subtle than others. Looking at the rate limits per API, studying parameters, and thinking about how the API is used can help you minimize rate limit usage.
I am storing in a database, every 30 minutes, Twitter's trending topics of a country Y. No problem with that.
Now, I want to get as much tweets as possible matching those trending topics for research purposes.
Since I would like to study the patterns of the trends, I would like continuous tweet data of at least 3 days centered in the day the trend peak was detected, for every trending topic. In order to achieve that, I thought of doing the following:
Suppose I am in day X. I could retrieve the unique trends of day X-2, and for every trend, look for tweets matching the trend in the interval [X-3, X-1], that is 3 days. However, the problem here is Twitter rate limitations. If I have 100 trending topics in day X-2, and I make 20 GET search requests/trend, I would end up doing a total of 2,000 requests, which overpasses Twitter's 350 hourly rate limit. If make 300 req/hour, it would take more than 6 hours to get the data for only one day...
Does anybody know any other (better) way for getting tweets associated with trends?
Thanks in advance
Twitter Streaming API?
Twitter Streaming API doesn't deliver any past tweets. You only receive tweets starting from the time the server connection is established. The search API will return tweets matching the current query up to 7 days old in theory, but that is entirely up to Twitter’s current load. (Note*-At times this interval has been as short as 24 hours. In addition, you are limited by the ability to only receive up to 1,500 tweets regardless of how old they are.)
Is there any way to get more tweets from the streaming?
None that I know. But, do refer the below mentioned information if you are considering to switch among search or streaming API.
Please choose your case:
If you need real time data and your number of requests are high:
Go for Streaming API
The streaming API requires that you keep the connection active. This requires a server process with an infinite loop, to get the latest tweets.
Advantage
1)Lag in retrieving results: Tweets delivered with this method are basically real-time, with a lag of a second or two at most between the time the tweet is posted and it is received from the API
2)Not rate limited.
If you need aggregate data regardless of its time range and your number of requests are not high:
Go for Search API
The search API is the easier of the two methods to implement but it is rate limited .Each request will return up to 100 tweets, and you can use a page parameter to request up to 15 pages, giving you a theoretical maximum of 1,500 tweets for a single query.
Advantage
1)Finding tweets in the past:The search API wins by default in this area, because the streaming API doesn’t deliver any past tweets
2)Easier to implement