Timeline Reconstruction When a User Is Followed - twitter

This question is very similar to this one; however, there are no answers on that one, so I have posted this one with more clarity in hopes of receiving an answer.
According to this presentation, Twitter uses a fanout method to push Tweets into each individual user's timeline in Redis. Obviously, this fanout only takes place when a user you're following Tweets something.
Suppose a new user, who has never followed anyone before (and consequently has no Tweets in their timeline), decides to follow someone. Using just the above method, they would have to wait until the person they followed Tweeted something for anything to show up on their timeline. From observation, this is not the case: Twitter pulls in that user's latest Tweets.
Now suppose a new user follows 5 users: how does Twitter organize and push those Tweets into the user's timeline in Redis?
Suppose a user already follows 5 users and has a fair number of Tweets from them in their timeline. When they follow another 5 users, how are those users' individual Tweets pushed into the initial user's timeline in Redis in the correct order? More importantly, how does Twitter work out how many Tweets to bring in from each user (given that timelines are capped at 800 Tweets)?

Here is how I would try to implement this, if I understand your question correctly.
Store each tweet in a hash. The key of the hash could be something like: tweet:<tweetID>.
Store the IDs of a given user's tweets in a sorted set named user:<userID>:tweets. Set the score of each tweet to its Unix timestamp so that they appear in the correct order. You can then get the 800 most recent tweet IDs for the user with ZREVRANGEBYSCORE:
ZREVRANGEBYSCORE user:<userID>:tweets +inf -inf LIMIT 0 800
When a user follows a new person, copy the list of IDs returned by this command into the follower's timeline (either in the application code or using a Lua script). The timeline is once again represented by a sorted set, with Unix timestamps as scores. If you do the copy in the application code, which is perfectly acceptable with Redis, don't forget to use pipelining so that the multiple writes to the sorted set go out in a single network operation. It greatly improves performance.
To get the timeline content, use pipelining too: request the tweet IDs with ZREVRANGEBYSCORE (using the LIMIT option, and/or a timestamp as the lower bound if you don't want tweets posted before a certain date), then fetch the corresponding tweet hashes.
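For concreteness, here is a minimal sketch of this scheme in Python with redis-py. The key names follow the answer above (tweet:<tweetID>, user:<userID>:tweets), plus an assumed user:<userID>:timeline sorted set for the home timeline; the function names and the 800-entry trim are illustrative, not Twitter's actual implementation.

import time
import redis

r = redis.Redis()
TIMELINE_CAP = 800  # the cap mentioned in the question

def post_tweet(author_id, tweet_id, text):
    ts = time.time()
    # Tweet body lives in a hash, tweet ID is indexed in the author's sorted set.
    r.hset(f"tweet:{tweet_id}", mapping={"author": author_id, "text": text, "ts": ts})
    r.zadd(f"user:{author_id}:tweets", {tweet_id: ts})

def follow(follower_id, followee_id):
    # Pull the followee's most recent tweets and copy them into the follower's timeline.
    recent = r.zrevrangebyscore(f"user:{followee_id}:tweets", "+inf", "-inf",
                                start=0, num=TIMELINE_CAP, withscores=True)
    pipe = r.pipeline(transaction=False)
    for tweet_id, ts in recent:
        # Scores are timestamps, so tweets from different followees interleave correctly.
        pipe.zadd(f"user:{follower_id}:timeline", {tweet_id: ts})
    # Trim back down so only the 800 most recent entries survive.
    pipe.zremrangebyrank(f"user:{follower_id}:timeline", 0, -(TIMELINE_CAP + 1))
    pipe.execute()  # one network round trip for all of the writes

def read_timeline(user_id, count=50):
    ids = r.zrevrangebyscore(f"user:{user_id}:timeline", "+inf", "-inf",
                             start=0, num=count)
    pipe = r.pipeline(transaction=False)
    for tweet_id in ids:
        pipe.hgetall(f"tweet:{tweet_id}")
    return pipe.execute()

Because every score is a timestamp, following ten new people is just ten copies into the same sorted set: Redis keeps the merged result ordered, and the ZREMRANGEBYRANK trim enforces the 800-Tweet cap, so at most 800 Tweets ever need to be pulled from any newly followed user.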

Related

Best way to determine friendship of users (Facebook SDK)

I'm working on an iOS app which at one point displays a feed of information items to the user; these items contain information about other users. The feed items are stored on a server that I run as well. I want to add functionality that allows the user to filter the feed and display only items from his Facebook friends. It seems to me that there are three ways to achieve this:
1. The client fetches all items, then for each item runs the FB SDK query /user-id/friends to determine friendship.
2. Save all of the Facebook IDs of the user's friends on the client (refreshing them at a set interval), and after fetching all items, determine whether an item was posted by a friend by comparing against the local list of friends.
3. The server holding the feed items runs the query in the backend and filters the content it provides to the client.
Each of these has its weaknesses and advantages, but I'd like to hear which is the preferred and "best" approach overall. I'm trying to achieve something like Venmo's home feed functionality, if that makes sense.
Thanks for the help!
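For illustration only, here is a rough sketch of option 3 (server-side filtering) in Python, assuming each feed item records its poster's Facebook user ID; the Graph API version, the poster_fb_id field, and the helper names are placeholder assumptions, not part of the original setup.

import requests

GRAPH_FRIENDS_URL = "https://graph.facebook.com/v12.0/me/friends"

def friend_ids(user_access_token):
    # Note: /me/friends only returns friends who also use the app, which is
    # enough here because the feed only describes other users of the app.
    ids, url = set(), GRAPH_FRIENDS_URL
    params = {"access_token": user_access_token, "limit": 100}
    while url:
        resp = requests.get(url, params=params).json()
        ids.update(f["id"] for f in resp.get("data", []))
        url = resp.get("paging", {}).get("next")  # the "next" URL already carries the token
        params = None
    return ids

def friends_only_feed(all_items, user_access_token):
    friends = friend_ids(user_access_token)
    return [item for item in all_items if item["poster_fb_id"] in friends]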

Twitter API: how to check whether a tweet has been deleted on Twitter, given a tweet ID saved in my database

I have used the Twitter API to fetch tweets with a particular hashtag, and all of the data is stored in my database.
I am currently working in core PHP.
The API is working fine, thanks to Twitter.
Now the problem comes with deleted tweets. When a batch of the 100 latest tweets reaches my AJAX file, I store them directly in my database with a flag visible = "y". I want that, if any tweet stored in my database is later deleted from the corresponding Twitter account, the flag is automatically set to visible = "n" in my database.
I searched a lot on the internet but could not find a satisfactory answer.
Please reply, and ask me if any further information is needed.
Thanks to Stack Overflow for a good community.
The only way to do what you're suggesting is to run some process that reads the tweets from your database and checks whether they are still there.
You can call https://api.twitter.com/1.1/statuses/lookup.json and pass it a comma-delimited list of up to 100 tweet IDs, then compare the return value to your database to see whether the tweets you asked for are still available.
You will have to iterate through your database 100 tweets at a time. Depending on how many tweets you are storing, you may run into rate limit issues.
You can also make the status lookup call with the map parameter set to true. It will return null for deleted or protected tweets. This is more reliable than map set to false, because the lookup call often fails to return a tweet even though it has not been deleted (especially for retweets).
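As an illustration of that map=true check (in Python rather than the asker's core PHP, but the request shape is the same); the OAuth credentials and the mark_deleted step are placeholders:

import requests
from requests_oauthlib import OAuth1

LOOKUP_URL = "https://api.twitter.com/1.1/statuses/lookup.json"
auth = OAuth1("CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_TOKEN", "ACCESS_SECRET")

def deleted_tweet_ids(tweet_ids):
    # Check up to 100 stored tweet IDs; return those that no longer resolve.
    params = {"id": ",".join(str(i) for i in tweet_ids), "map": "true"}
    resp = requests.get(LOOKUP_URL, params=params, auth=auth).json()
    # With map=true the response looks like {"id": {"<tweet_id>": <tweet or null>, ...}}
    return [tid for tid, tweet in resp["id"].items() if tweet is None]

# Walk the database 100 IDs at a time, flagging anything that has disappeared:
# for batch in batches_of_100(stored_ids):          # placeholder iterator
#     for tid in deleted_tweet_ids(batch):
#         mark_deleted(tid)   # e.g. UPDATE tweets SET visible = 'n' WHERE tweet_id = ...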

Accessing huge volumes of data from Facebook

So I am working on a Rails application, and the person I am designing it for has what seem like extremely hefty data volume requirements. They want to gather ALL posts by a user that logs into the application, and all of the posts for each of their friends for the past year.
Before this particular level of detail was communicated to me, I built the thing using the fb_graph gem and would paginate through posts. The first problem is that this takes a very long time, even when I change the number of posts requested per page. Second, I frequently run into OAuth error #613: more than 600 requests per 600 seconds. After increasing each request to 200 posts I hit this limit less often, but it still takes an incredibly long time to get all of this data.
I am not particularly familiar with the FQL alternative, but it seems to me that we are going to have to either prioritize speed or volume of data. Is there a way that I am missing that would allow me to quickly retrieve this level of information?
Edit: I do save all posts to the database as I retrieve them. What is required is to make one pass through and grab all of the posts for the past year, for the user and friends. This process takes a long time and I am basically wondering if there is any way that it can be sped up.
One thing that I'd like to point out here:
You should implement some kind of local caching for user's posts. I mean, instead of querying FB each time for the posts, you should save the posts in your local database and only check for new posts (whenever needed).
This is faster and saves you many API requests.
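As a sketch of that caching idea (written here against the raw Graph API in Python rather than the fb_graph gem), you would record the newest stored post's timestamp per user and pass it as the since parameter on subsequent runs; the API version and the store_post_locally helper are placeholders:

import requests

def store_post_locally(fb_user_id, post):
    # Placeholder persistence hook: write the post to your own database here.
    pass

def fetch_new_posts(fb_user_id, access_token, last_seen_unix_ts):
    # Only ask for posts newer than the most recent one already in the local database.
    url = f"https://graph.facebook.com/v12.0/{fb_user_id}/posts"
    params = {"access_token": access_token, "limit": 100, "since": last_seen_unix_ts}
    while url:
        resp = requests.get(url, params=params).json()
        for post in resp.get("data", []):
            store_post_locally(fb_user_id, post)
        url = resp.get("paging", {}).get("next")  # next page URL already carries the params
        params = None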

Incremental searching only from your follower in twitter API

Is there a smart way, using the Twitter API, to quickly do an incremental search of Twitter users restricted to your own followers?
For example, there is a user Alice who is followed by one million users. One day, Alice wants to send a DM to one of her followers, but she only remembers that his or her name starts with 'Bo'. So she wants to filter her one million followers by the name prefix 'Bo'.
How can Alice get all users whose names start with 'Bo' using the Twitter API?
Method 1. Call followers/ids 200 times and filter: since followers/ids returns only 5,000 followers at a time, you would have to call it at least 200 times to get all one million followers, and only then filter the users by names starting with 'Bo'. This is extremely slow because the filtering cannot start until the 200 requests have finished, and its scalability is poor because it takes a lot of memory to hold and filter the loaded follower list for each user.
Method 2. Call users/search many times and check whether they're following me: unfortunately, only the first 1,000 matches are available, so matches beyond the first 1,000 will never show up, even if they are followers.
I want to know if there is a better way than this; the two methods above are just too clumsy.
P.S.
Lady Gaga seems to be followed by 28 million users. https://twitter.com/ladygaga
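Whatever way the follower names end up cached locally (the expensive part of Method 1), the prefix lookup itself is cheap; a minimal sketch in Python, assuming the cache is simply a sorted list of names:

import bisect

cached_names = sorted(["bob", "bonnie", "boris", "carol", "dave"])  # placeholder cache

def prefix_matches(names_sorted, prefix):
    # Binary-search the half-open range of names that start with the prefix.
    lo = bisect.bisect_left(names_sorted, prefix)
    hi = bisect.bisect_left(names_sorted, prefix + "\uffff")
    return names_sorted[lo:hi]

print(prefix_matches(cached_names, "bo"))  # ['bob', 'bonnie', 'boris']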

If I call Twitter API to get all of my followers, how many calls to the API is that?

If I want to download a list of all of my followers by calling the twitter API, how many calls is it? Is it one call or is it the number of followers I have?
Thanks!
Sriram
If you just need the IDs of your followers, you can specify:
http://api.twitter.com/1/followers/ids.json?screen_name=yourScreenName&cursor=-1
The documentation for this call is here. This call will return up to 5,000 follower IDs per call, and you'll have to keep track of the cursor value on each call. If you have less than 5,000 followers, you can omit the cursor parameter.
If, however, you need to get the full details for all your followers, you will need to make some additional API calls.
I recommend using statuses/followers to fetch the follower profiles since you can request up to 100 profiles per API call.
When using statuses/followers, you just specify which user's followers you wish to fetch. The results are returned in the order that the followers followed the specified user. This method does not require authentication; however, it does use a cursor, so you'll need to manage the cursor ID for each call. Here's an example:
http://api.twitter.com/1/statuses/followers.json?screen_name=yourScreenName&cursor=-1
Alternatively, you can use users/lookup to fetch the follower profiles by specifying a comma-separated list of user IDs. You must authenticate in order to make this request, but you can fetch any user profiles you want -- not just those that are following the specified user. An example call would be:
http://api.twitter.com/1/users/lookup.json?user_id=123123,5235235,456243,4534563
So, if you had 2,000 followers, you would use just one call to obtain all of your follower IDs via followers/ids, if that was all you needed. If you needed the full profiles, you would burn 20 calls using statuses/followers, and you would use 21 calls when alternatively using users/lookup due to the additional call to followers/ids necessary to fetch the IDs.
Note that for all Twitter API calls, I recommend using JSON since it is a much more lightweight document format than XML. You will typically transfer only about 1/3 to 1/2 as much data over the wire, and I find that (in my experience) Twitter times-out less often when serving JSON.
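As a rough sketch of the two-step flow described above, written in Python against the current 1.1 endpoints rather than the older /1/ URLs shown in the examples; the OAuth credentials are placeholders:

import requests
from requests_oauthlib import OAuth1

API = "https://api.twitter.com/1.1"
auth = OAuth1("CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_TOKEN", "ACCESS_SECRET")

def follower_ids(screen_name):
    # followers/ids returns up to 5,000 IDs per call; a cursor of 0 means no more pages.
    cursor = -1
    while cursor != 0:
        resp = requests.get(f"{API}/followers/ids.json", auth=auth,
                            params={"screen_name": screen_name, "cursor": cursor}).json()
        yield from resp["ids"]
        cursor = resp["next_cursor"]

def follower_profiles(screen_name):
    # users/lookup hydrates full profiles 100 IDs at a time.
    ids = list(follower_ids(screen_name))
    for i in range(0, len(ids), 100):
        chunk = ids[i:i + 100]
        yield from requests.post(f"{API}/users/lookup.json", auth=auth,
                                 data={"user_id": ",".join(map(str, chunk))}).json()

# With 2,000 followers: 1 followers/ids call for the IDs, plus 20 users/lookup
# calls if you also want the full profiles.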
http://dev.twitter.com/doc/get/followers/ids
Reading this, it looks like it should only be 1 call, since you're just pulling back an XML or JSON page -- unless you have more than 5,000 followers, in which case you would have to make a call for each page of the paginated values.
