Tweepy user_search api is very slow - twitter

After two days of unsuccessful attempts to use the twitter gem, I have decided to use Python's tweepy for a task. (My original attempt was with Ruby, and I posted the question here.)
My task is to collect all the actresses who have a verified account on Twitter. I have taken the list of actresses from Wikipedia.
Everything looks fine so far. I have started hitting the Twitter REST API with each name, checking whether any matching account is verified.
The only problem is that the response is very slow: each request takes about 12-15 seconds. Am I doing something wrong here, or is this how it is supposed to be?
Below is my code in its entirety:
import tweepy
consumer_key = 'xxx'
consumer_secret = 'xxx'
access_token_key = 'xx-xx'
access_token_secret = 'xxx'
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token_key, access_token_secret)
api = tweepy.API(auth)
# Read one name per line; strip the trailing newline so that
# the u.name == actress comparison below can actually match.
with open('final', 'r') as f:
    actresses = [line.strip() for line in f if line.strip()]
print(actresses)
for actress in actresses:
    print(actress)
    # One search request per name; this is the slow call.
    users = api.search_users(actress)
    for u in users:
        if u.verified and u.name == actress:
            print(u.name + " === https://twitter.com/" + u.screen_name)
Also, is there a better way to extract the verified actresses using that list?

Unfortunately, there is no faster way to do it, given that you only know the actresses' full names, and not their screen names. Each request will take a long time, as Twitter needs to return the results of users matching the query (there may be quite a few). Each one needs to be loaded and examined, which can take a while, depending on how many results were returned.

Related

Pulling all of an account's Twitter followers, currently can only get 5000 IDs

I have written code that mutes the followers of a designated Twitter account, but when pulling the follower IDs I can only get 5000. Is there a way to continue pulling more using a 'last seen' method or a cursor?
import tweepy
import time
consumer_key = '*****'
consumer_secret = '*****'
key = '*****'
secret = '*****'
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(key, secret)
api = tweepy.API(auth, wait_on_rate_limit=True)
user_name = 'realdonaldtrump'
def mute():
    muted_users = api.mutes_ids()
    # Returns a single page of follower IDs (at most 5000).
    followers = api.followers_ids(user_name)
    try:
        for x in followers:
            if x in muted_users:
                pass
            else:
                api.create_mute(x)
                time.sleep(5)
    except Exception:
        print('Error')
Yes, this function does support cursors. From the Tweepy examples, you can use them like this - you’d need to modify this for muting rather than following.
for follower in tweepy.Cursor(api.followers).items():
    follower.follow()
The issue you will hit with an account with a very large number of followers is that the rate limit here is low - 15 calls in 15 minutes - so this will take a very long time to complete. You may also hit account limits for the number of accounts you can mute within a time period.
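To make the cursor idea concrete, here is a minimal sketch of the paging loop itself, written without any network calls. It assumes Twitter's cursoring convention (start at cursor -1, stop when the returned next cursor is 0). The names `fetch_page`, `iter_follower_ids`, `ids_to_mute` and `fake_fetch_page` are hypothetical and exist only for illustration. With Tweepy 3.x, `tweepy.Cursor(api.followers_ids, screen_name=user_name).items()` should handle this paging for you.

```python
def iter_follower_ids(fetch_page, first_cursor=-1):
    """Yield IDs from every page of a cursored endpoint.

    fetch_page is a hypothetical callable: fetch_page(cursor) -> (ids, next_cursor).
    Paging starts at cursor -1 and ends when next_cursor is 0.
    """
    cursor = first_cursor
    while cursor != 0:
        ids, cursor = fetch_page(cursor)
        for user_id in ids:
            yield user_id


def ids_to_mute(follower_ids, already_muted):
    """Return follower IDs that are not muted yet, preserving order."""
    muted = set(already_muted)  # set lookup is O(1) per ID
    return [i for i in follower_ids if i not in muted]


def fake_fetch_page(cursor):
    """Stand-in for the real API: two small pages of made-up IDs."""
    pages = {-1: ([1, 2, 3], 10), 10: ([4, 5], 0)}
    return pages[cursor]


todo = ids_to_mute(iter_follower_ids(fake_fetch_page), already_muted=[2, 5])
print(todo)
```

In real use, each ID in `todo` would be passed to `api.create_mute`, with the same rate-limit caveats described above.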

Check if a certain user has tweeted

Is it possible to write a script for Twitter that checks when a certain user last tweeted?
Preferably using Python.
Yes, it is possible. Here is how, using the TwitterAPI library:
from TwitterAPI import TwitterAPI
SCREEN_NAME = 'justinbieber'
CONSUMER_KEY = 'XXXX'
CONSUMER_SECRET = 'XXXX'
ACCESS_TOKEN_KEY = 'XXXX'
ACCESS_TOKEN_SECRET = 'XXXX'
api = TwitterAPI(CONSUMER_KEY,
                 CONSUMER_SECRET,
                 ACCESS_TOKEN_KEY,
                 ACCESS_TOKEN_SECRET)
# count=1 asks for only the most recent tweet on the timeline.
r = api.request('statuses/user_timeline', {'screen_name': SCREEN_NAME, 'count': 1})
for item in r:
    print(item['created_at'])
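The `created_at` value comes back as a string, so to answer "when did they last tweet" you would probably want to parse it. The sketch below assumes the v1.1 timestamp format (for example `Wed Aug 27 13:08:45 +0000 2008`) and uses only the standard library. The helper names `parse_created_at` and `seconds_since` are made up for this example.

```python
from datetime import datetime, timezone

# Assumed format of the v1.1 'created_at' field.
TWITTER_TIME_FORMAT = '%a %b %d %H:%M:%S %z %Y'


def parse_created_at(created_at):
    """Convert a 'created_at' string into a timezone-aware datetime."""
    return datetime.strptime(created_at, TWITTER_TIME_FORMAT)


def seconds_since(created_at, now=None):
    """Seconds elapsed between the tweet and 'now' (defaults to the current UTC time)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - parse_created_at(created_at)).total_seconds()


sample = 'Wed Aug 27 13:08:45 +0000 2008'
last = parse_created_at(sample)
print(last.isoformat())
```

Passing `item['created_at']` from the loop above to `seconds_since` would give the age of the user's latest tweet.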
There is a Python library for accessing the Twitter API called Tweepy.
More info on the API can be found here: https://dev.twitter.com
An older post that may be relevant: "streaming api with tweepy only returns second last tweet and NOT the immediately last tweet".

How to make batch request using fb_graph?

When a user logs in using Facebook, I need to collect the list of all movies liked by the user and his/her friends.
user = FbGraph::User.fetch('me', :access_token => "access_token")
userFrnd = user.friends
movies = []
userFrnd.each do |uf|
  frnd = FbGraph::User.fetch(uf.raw_attributes['id'], :access_token => "access_token")
  movies << frnd.movies
end
final_movie_list = movies.flatten.uniq_by { |track| track.raw_attributes["id"] }
This is my fb_graph function, and it works fine. However, I need to turn it into a batch request: since I have 360 friends, it takes 360 requests to run the function above. Please help me optimize this and reduce the time it takes.
I have learned that a batch request may help, but I don't know how to do that in fb_graph.
Please help me.
I'm using FbGraph (github.com/nov/fb_graph), version 2.7.8. I'm not using any access token, so with one you might be able to get more information, including movies.
I'm able to make a batch request for 100 users at a time and get the following information about them by default:
id
name
first_name
last_name
link
username
gender
locale
Here's the demonstration code where ids is an array of Facebook User Ids:
r = FbGraph::User.fetch("?ids=#{ids.join(",")}")
r.raw_attributes #information about the users hashed by their id (String)
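With 360 friends and 100 IDs per request, the work drops from 360 requests to 4. The batching step is language-agnostic; the sketch below shows it in Python, for consistency with the other examples on this page. The 100-per-request figure is simply the number reported in this answer, and `batch_paths` is a hypothetical helper. In Ruby, `ids.each_slice(100)` gives the same grouping.

```python
def batch_paths(ids, batch_size=100):
    """Split ids into batches and build one '?ids=...' path per batch."""
    paths = []
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        paths.append("?ids=" + ",".join(str(i) for i in batch))
    return paths


friend_ids = list(range(1, 361))  # 360 made-up friend IDs
paths = batch_paths(friend_ids)
print(len(paths))  # 4 requests instead of 360
```

Each resulting path would be passed to one `FbGraph::User.fetch` call, as in the demonstration code above.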

Twitter v1.1: 400 Bad request

I have problems with the new Twitter API: v1.0 works without problems, but if I change the URL to v1.1, I always get a "400 Bad request" error (seen with Firebug).
Example:
https://api.twitter.com/1/statuses/user_timeline.json?screen_name=twitterapi
This works like a charm; everything works as expected.
Simply changing the URL to .../1.1/... gives me a Bad request error, without even a JSON error response or any content at all.
https://api.twitter.com/1.1/statuses/user_timeline.json?screen_name=twitterapi
Note: It can't be a rate limit, because this was the first time I had ever accessed the URL.
https://api.twitter.com/1.1/statuses/user_timeline.json?screen_name=twitterapi redirects me to https://api.twitter.com/1/statuses/user_timeline.json?screen_name=twitterapi
Looks like 1.1 is the same thing as 1.
UPD: Looks like this is a rate limit (the 1.1 link worked for me 2 hours ago). Even if you hit the API page for the first time, some of your apps (desktop or mobile) could already be using API methods.
UPD2: In 1.1, 400 Bad request means you are not authorized (https://dev.twitter.com/docs/error-codes-responses, https://dev.twitter.com/docs/auth/oauth#user-context). So you need to get a user context.
You need to authenticate and authorize using OAuth before using the v1.1 APIs.
Here is something that works with Python's tweepy; it gets statuses from a user's timeline:
import tweepy
def twitter_fetch(screen_name="BBCNews", maxnumtweets=10):
    """Fetch the latest tweets from the given user's timeline."""
    # API described at https://dev.twitter.com/docs/api/1.1/get/statuses/user_timeline
    consumer_token = ''  # substitute values from twitter website
    consumer_secret = ''
    access_token = ''
    access_secret = ''
    auth = tweepy.OAuthHandler(consumer_token, consumer_secret)
    auth.set_access_token(access_token, access_secret)
    api = tweepy.API(auth)
    # print(api.me().name)
    # api.update_status('Hello -tweepy + oauth!')
    # Stop after maxnumtweets statuses.
    for status in tweepy.Cursor(api.user_timeline, id=screen_name).items(maxnumtweets):
        print(status.text + '\n')
if __name__ == '__main__':
    twitter_fetch('BBCNews', 10)
For me, the cause was the size of the media attached to the tweet. If it was under 1.2MB it went through OK, but if it was over, I would get a 400 error every time.
Strange, considering Twitter says the tweet limit is 3MB: https://twittercommunity.com/t/getting-media-parameter-is-invalid-after-successfully-uploading-media/58354

Query Collection contents in Python client for Google Docs API

How do I query the contents of a specific collection using the Python client for Google Docs API?
This is how far I've come:
import gdata.docs.service
client = gdata.docs.service.DocsService()
client.ClientLogin('myuser', 'mypassword')
FOLDER_FEED1 = "/feeds/documents/private/full/-/folder"
FOLDER_FEED2 = "/feeds/default/private/full/folder%3A"
# Look up the folder by its exact title.
feed = client.Query(uri=FOLDER_FEED1 + "?title=MyFolder&title-exact=true")
full_id = feed.entry[0].resourceId.text
(res_type, res_id) = full_id.split(":")
# Request the folder's contents (this is the call that fails).
feed = client.Query(uri=FOLDER_FEED2 + res_id + "/contents")
for entry in feed.entry:
    print(entry.title.text)
The first call to client.Query succeeds and seems to provide a valid resource ID. The second call, however, returns:
{'status': 400, 'body': 'Invalid request URI', 'reason': 'Bad Request'}
How can I correct this to get it working?
Once you have a folder entry, it is much easier to call client.GetResources(entry.content.src) than to generate the URI yourself and use a Query.
In your case, that is client.GetResources(feed.entry[0].content.src).
