Collecting old tweets through Twitter API - twitter

I want to collect tweets about an event that happened 3 years ago, but I read somewhere that Twitter only lets its API users collect tweets that are no older than a week. Is this true, and if so, how can I collect tweets from 3 or more years ago?

1. Get tweets using:
   time_line_statuses = api.GetUserTimeline(screen_name=screen_name, include_rts=True)
2. Loop through time_line_statuses using a for loop.
3. Check the "created_at" property of each item to see if it is newer than your cutoff date.
4. Each item has an "id" property. The value seems to grow with time: lower ID = older.
5. Store the 'id' of the oldest status from time_line_statuses as oldest_id.
6. Call:
   time_line_statuses = api.GetUserTimeline(screen_name=screen_name, include_rts=True, max_id=oldest_id)
7. Store oldest_id as previous_oldest_id.
8. Repeat from step 2, checking that oldest_id is not equal to previous_oldest_id before continuing the loop.
You can only make 100 GET requests to Twitter per hour. You need to count your Get() calls and have the program sleep for an hour when you've hit that limit. I don't know whether their API limits how far back it can go. You may be able to save API calls if you can find the ID of the tweet at the start of your cutoff date and seed this process from there.
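The loop described above could be sketched roughly as follows. This is only a sketch under assumptions: `api` is taken to be a python-twitter `Api` instance (or anything with a compatible `GetUserTimeline` method), `created_at` is assumed to use Twitter's usual `Wed Aug 27 13:08:45 +0000 2008` format, and the function name, `max_calls`, `pause_seconds`, and `sleep` parameters are made up for illustration.

```python
import time
from datetime import datetime

# Assumed format of the "created_at" string on each status.
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def collect_timeline(api, screen_name, cutoff, max_calls=100,
                     pause_seconds=3600, sleep=time.sleep):
    """Page backwards through a user timeline until `cutoff` is reached.

    `cutoff` must be a timezone-aware datetime. `max_calls` and
    `pause_seconds` mirror the "100 requests per hour" figure quoted above;
    adjust them to whatever limit actually applies to your account.
    """
    collected = []
    oldest_id = None
    calls = 0
    while True:
        # Sleep once the call budget for the current window is used up.
        if calls and calls % max_calls == 0:
            sleep(pause_seconds)
        kwargs = {"screen_name": screen_name, "include_rts": True}
        if oldest_id is not None:
            kwargs["max_id"] = oldest_id
        statuses = api.GetUserTimeline(**kwargs)
        calls += 1
        if not statuses:
            return collected
        for status in statuses:
            # max_id is inclusive, so the previous oldest status comes back again.
            if status.id == oldest_id:
                continue
            created = datetime.strptime(status.created_at, CREATED_AT_FORMAT)
            if created < cutoff:
                return collected
            collected.append(status)
        previous_oldest_id = oldest_id
        oldest_id = min(status.id for status in statuses)
        # No progress means the end of the available timeline was reached.
        if oldest_id == previous_oldest_id:
            return collected
```

Passing `sleep` as a parameter is just a convenience so the pause can be replaced when trying the function out.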

Your only option is to pay for a service such as Gnip. Gnip provides an API that will let you search for tweets older than one week.

Related

How to get the Twitter follower count of a user on a given date, or at the time of a tweet?

I have access to Twitter API for Academic Research, and I'd like to get the follower count on a given date of a user, or at the time of a tweet.
The doc mentions that "This fields parameter enables you to select which specific user fields will deliver in each returned Tweet.", so I assumed that by adding public_metrics to user.fields, the number of followers would appear in each returned Tweet. However, in each returned Tweet I can only see user_id. https://developer.twitter.com/en/docs/twitter-api/tweets/search/api-reference.
Is it even possible to achieve what I want with Twitter API for Academic Research? Is there any other way to do it?
Thank you so much.
You cannot get the follower count on a specific date; it will always be the count at the time you make the API call.
You may need to add expansions to your API call in order to receive the values you are trying to pull out.
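As a rough illustration of the expansions point: in the v2 search endpoints, user fields are returned only when the author is expanded with `expansions=author_id`, and the user objects then arrive in a separate `includes.users` list rather than inside each Tweet. The sketch below builds such a parameter set and joins the two parts of an already-decoded JSON response. The function names and the output row layout are invented for this example, and the count is still the value at the time of the call, not at the time of the Tweet.

```python
def search_params(query):
    """Query parameters asking for each author's public metrics."""
    return {
        "query": query,
        "expansions": "author_id",          # pulls user objects into includes.users
        "user.fields": "public_metrics",    # adds followers_count and friends
        "tweet.fields": "created_at",
    }


def attach_follower_counts(response):
    """Join each Tweet in `data` with its author from `includes.users`."""
    users = {user["id"]: user
             for user in response.get("includes", {}).get("users", [])}
    rows = []
    for tweet in response.get("data", []):
        author = users.get(tweet["author_id"], {})
        rows.append({
            "tweet_id": tweet["id"],
            "author_id": tweet["author_id"],
            # Follower count as of the API call, not as of the Tweet.
            "followers_count": author.get("public_metrics", {}).get("followers_count"),
        })
    return rows
```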

Get latest record for each user with ODATA

Due to the PowerShell methods of getting mailbox statistics from Office365 taking about 2 seconds per mailbox, I am working on getting the data from Office 365 Reporting web service, which takes only a few seconds for each 2000 mailboxes.
The problem I'm running into is that the stats are updated periodically and some historical data is kept, so there are numerous records for each user. I only want to get the latest record for each user, but I haven't been able to find a way to do that. The closest I've come is to use $filter=Date ge DateTime'2016-03-10T00:00:00' where the date is concatenated to a couple of days ago. Theoretically, if I sort by Date desc I should get the latest records first, and if there is a user that has a record for 3/10 and 3/11, the 3/11 record would get pulled first, which would work for me. But regardless of how I do the sort it seems to come back with the older records first.
Ideally, I would like to be able to set criteria so that it only returns the latest record for each mailbox, but I can't seem to figure out or find how to do that. The closest I've been able to come is to just start running queries filtered on specific dates, walking the date back a day on each query.
If I can get the latest records to be returned first, I would be able to work with that because I can just discard a record if I've already received a later one.
https://reports.office365.com/ecp/reportingwebservice/reporting.svc/MailboxUsageDetail/
?DelegatedOrg=nnn.onmicrosoft.com&$select=Date,WindowsLiveID,CurrentMailboxSize
&$filter=Date ge DateTime'2016-03-08T00:00:00'&$orderby=Date desc
So the questions are:
Is there a way to specify criteria so that only the latest record for each user is returned?
Is there a way to get it to order by Date descending--what am I doing wrong with the $orderby?
Thanks!
You can use $top=1 to get the latest record by applying $orderby on the date (desc). $filter and $skip may not be required in this case.
https://reports.office365.com/ecp/reportingwebservice/reporting.svc/MailboxUsageDetail/?DelegatedOrg=nnn.onmicrosoft.com&$select=Date,WindowsLiveID,CurrentMailboxSize&$orderby=Date desc&$top=1
Your query looks fine. Here is another example, from the OData sample service, that gets the details of the employee with the most recent birth date.
http://services.odata.org/V4/Northwind/Northwind.svc/Employees?$select=EmployeeID,FirstName,LastName,BirthDate&$orderby=BirthDate%20desc&$top=1
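Since $top=1 limits the whole result set to a single row rather than one row per mailbox, the fallback mentioned in the question (discard a record if a later one has already been received) may still be needed on the client. A minimal sketch of that step, assuming the rows have already been fetched and decoded into dictionaries keyed by the selected fields, with `Date` parsed into `datetime` objects; the function name is made up for illustration:

```python
from datetime import datetime


def latest_per_user(records):
    """Keep only the most recent record for each WindowsLiveID.

    Works regardless of the order in which the service returns the rows.
    """
    latest = {}
    for record in records:
        user = record["WindowsLiveID"]
        current = latest.get(user)
        if current is None or record["Date"] > current["Date"]:
            latest[user] = record
    return latest
```

Because the comparison is done per user, the $orderby behaviour of the service no longer matters for correctness; it only affects how many rows are discarded.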

minimizing parse requests while looping through array

I'm working on a pet project using parse as a back end. I'm setting up a viewcontroller that contains a list of people you can possibly add as "friends"; these are people that
a) exist in your contacts list and
b) have already downloaded the app and signed up.
Different buttons will be displayed depending on their status as a user (invite button if they only exist in your contacts list, add to friends button if they're also using the app already).
I'm trying to keep my Parse account to 30 requests/second so that I'm not out of pocket for a pet app.
One way I've thought to figure out who is registered as a user AND who exists in my contacts list is to loop through the contacts list on my phone and query that phone number on parse. However, this would obviously go over my limit on requests/second.
Is there a way (I've looked through Parse documentation and googled it) to take an array (list of contacts on my phone) and run a PFQuery ON THAT ARRAY, checking each object and returning matches?
Unless you have a quarter million users in your app, you shouldn't be too concerned. The limit doesn't work like this: one user runs a for loop of 30 iterations with one query each and you hit 30 req/s:
How does the requests/second limit translate to concurrent users?
Generally when your user count doubles, your requests per second also double. However, different apps send different numbers of requests per second depending on how frequently they save objects or issue queries. We estimate that the average app's active user will issue 10 requests. Thus, if you had a million users on a particular day, and their traffic was evenly spread throughout the day, you could estimate your app would need about 10,000,000 total API requests, or about 120 requests per second. Every app is different, so we strongly encourage you to measure how many requests your users send.
I have run through loops of requests and I barely hit 1 req/s
Is there a way (I've looked through Parse documentation and googled
it) to take an array (list of contacts on my phone) and run a PFQuery
ON THAT ARRAY, checking each object and returning matches?
Yes there is, use:
query?.whereKey(key: String, containedIn: [AnyObject])
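For comparison, the same "contained in" constraint can be expressed against the Parse REST API as a `$in` condition inside the `where` parameter, so the whole contact list is checked in one request instead of one request per contact. The sketch below only builds the parameter and sends nothing; the field name `phoneNumber` and the function name are assumptions for illustration, so substitute whatever column your user class actually stores.

```python
import json


def contacts_query(phone_numbers, field="phoneNumber"):
    """Build the `where` parameter matching any of the given phone numbers."""
    constraint = {field: {"$in": list(phone_numbers)}}
    return {"where": json.dumps(constraint)}
```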

Twitter Search Api 1.1 searching by date

Since Twitter Search API 1.1 does not have a since parameter to specify the start date, how do I get the tweets between 2 different dates (within the 7-day limit)?
Note: I cannot use the since_id and max_id as parameters because I have only 2 dates and search query as inputs.
There is no direct way of doing it, but here are a couple of ideas. You have a from date and a to date, right? So:
1. Set result_type to recent, until to your to date, and count to 100.
2. From the result of step 1 you get 100 tweets. Check whether you've hit the from date; if not, keep going with the max_id parameter until you reach the from date.
Another idea would be:
1. Set result_type to recent and until to your from date. Get the ID of the latest tweet from there. You need all the tweets from that ID until the end of your to date.
2. Set since_id to the ID you got in step 1 and keep requesting, updating since_id after each request, until you hit the end of your to date.
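The first idea could be sketched along these lines. It is not tied to any particular client library: `search` is a hypothetical callable that takes a dict of Search API parameters and returns a list of status dicts (newest first) with `id` and `created_at` keys. Since until matches tweets created before the given date, `until_date` here is treated as an exclusive upper bound, so pass the day after your to date if that day should be included.

```python
from datetime import datetime

# Assumed format of "created_at" in Search API 1.1 results.
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def collect_between(search, query, from_date, until_date):
    """Collect tweets created on or after `from_date` and before `until_date`.

    `from_date` and `until_date` are datetime.date objects.
    """
    collected = []
    max_id = None
    while True:
        params = {
            "q": query,
            "result_type": "recent",
            "count": 100,
            "until": until_date.isoformat(),  # YYYY-MM-DD
        }
        if max_id is not None:
            params["max_id"] = max_id
        statuses = search(params)
        if not statuses:
            return collected
        for status in statuses:
            created = datetime.strptime(status["created_at"], CREATED_AT_FORMAT)
            if created.date() < from_date:
                # Results are newest first, so everything after this is older.
                return collected
            collected.append(status)
        # max_id is inclusive; step just below the oldest ID already seen.
        max_id = min(status["id"] for status in statuses) - 1
```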

How to get list of Tweets by hash tag that are more than 7 days old

I am using this URL http://search.twitter.com/search.json to grab tweets with the hash tag #SameHashTag. The feed only returns items within the past week. Twitter explains why here:
https://dev.twitter.com/docs/faq#8650
I really need to get the last 10 tweets, regardless of when they were created. What is the alternative?
Note I read about user_timeline, but that seems to be based on user instead of hashtag. I read about list_timeline, but that seems to pull tweets from a defined list of users instead of hashtag.
You can't get old tweets from Search API. If your hashtag doesn't change you could search it and save results every day and build your own history.

Resources