Tweepy Tweet Extraction


Please share full working code to extract tweets using Tweepy; you can leave the consumer keys blank. I also need to search tweets using multiple keywords and boolean operators, so that each tweet mentions Switzerland and at least one of several keywords:
Switzerland AND (study OR education OR employment)

Stack Overflow is not a place to ask people to write code on your behalf.
You can find a good starting point for your question in the Tweepy examples:
import tweepy
bearer_token = ""
client = tweepy.Client(bearer_token)
# Search Recent Tweets
# This endpoint/method returns Tweets from the last seven days
response = client.search_recent_tweets("Tweepy")
# The method returns a Response object, a named tuple with data, includes,
# errors, and meta fields
print(response.meta)
# In this case, the data field of the Response returned is a list of Tweet
# objects
tweets = response.data
# Each Tweet object has default ID and text fields
for tweet in tweets:
    print(tweet.id)
    print(tweet.text)
# By default, this endpoint/method returns 10 results
# You can retrieve up to 100 Tweets by specifying max_results
response = client.search_recent_tweets("Tweepy", max_results=100)
You can adapt it to your query. In Twitter's search syntax a space already means AND, so no explicit AND operator is needed: Switzerland (study OR education OR employment)
You will find more details about the search_recent_tweets in the Tweepy documentation.
And finally, if you need help building the query, you can read the Twitter documentation.
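As a minimal sketch of that adaptation, the query string can be assembled from one required term and a list of alternatives. The helper names `build_query` and `search_recent` are hypothetical and not part of Tweepy; only `tweepy.Client` and `search_recent_tweets` come from the example above.

```python
def build_query(required, alternatives):
    """Return a query matching the required term AND any one of the alternatives."""
    # A space means AND; OR must be written explicitly and grouped in parentheses.
    return f"{required} ({' OR '.join(alternatives)})"


def search_recent(bearer_token, query, max_results=100):
    """Run the query against the recent-search endpoint and return a list of Tweets."""
    import tweepy  # imported here so build_query can be used without Tweepy installed

    client = tweepy.Client(bearer_token)
    response = client.search_recent_tweets(query, max_results=max_results)
    # response.data is None when nothing matched, so fall back to an empty list
    return response.data or []


query = build_query("Switzerland", ["study", "education", "employment"])
print(query)  # Switzerland (study OR education OR employment)
```

Calling `search_recent("<your bearer token>", query)` should then return the matching Tweets from the last seven days, assuming the token is valid.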

Related

Check the number of times a word or a phrase was tweeted

I have a general question regarding twitter APIs in python - is there a way to get the total number of times a particular word, or phrase were tweeted?
Thanks in advance.

You can't get that for the entire lifetime of Twitter. However, you can use the search API to get an idea of how many times it was tweeted recently. The standard search API only reaches back about a week:
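import tweepy
# CONSUMER_KEY, CONSUMER_SECRET, ACCESS_KEY and ACCESS_SECRET are your app credentials
auth = tweepy.auth.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth)
# In Tweepy 4.x this method is named api.search_tweets
search_results = api.search(q="<your word>")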
Then count the number of tweets you get back for an approximation.
For more info, look at the Tweepy Search API. Also, look at Tweepy Cursors for getting more than the default count of tweets.
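A rough sketch of the counting step with a Cursor might look like the following. It assumes Tweepy 4.x (where the search method is `api.search_tweets`) and an already authenticated `api` object; `count_tweets` and `count_recent_mentions` are hypothetical helper names.

```python
def count_tweets(tweets):
    """Count the items of any iterable of tweets without keeping them in memory."""
    return sum(1 for _ in tweets)


def count_recent_mentions(api, phrase, limit=1000):
    """Count up to `limit` recent search results for `phrase`.

    `api` is assumed to be an authenticated tweepy.API instance (Tweepy 4.x).
    """
    import tweepy  # imported lazily so count_tweets works without Tweepy

    # Double quotes ask the search API for an exact phrase match
    cursor = tweepy.Cursor(api.search_tweets, q=f'"{phrase}"', count=100)
    return count_tweets(cursor.items(limit))
```

The result is only an approximation: it is capped by `limit`, by the rate limits of the API, and by how far back the search index reaches.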

How to get total favorite count of a particular follower?

I want to find how many tweets of user A did their specific follower B favorited. Is there any way to do this using either Python's tweepy or R's rtweet?
Many thanks!

You can do this, but it is a bit complicated.
You need to use GET favorites/list.
This will let you get up to 200 Tweets per request that User B has liked. You would then have to search through the returned Tweets to see which ones were posted by User A.
The Tweepy documentation for favorites tells you how to do this (the method is named get_favorites in Tweepy 4.x):
tweets = api.favorites("edent")
This will get you the most recent tweets I've liked.
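The filtering step could be sketched as below. The function name `likes_of_author` is hypothetical, and the sample objects are stand-ins that only mimic the `user.screen_name` attribute of Tweepy's v1.1 Status objects.

```python
from types import SimpleNamespace


def likes_of_author(liked_tweets, author_screen_name):
    """Return the liked tweets that were posted by the given author."""
    wanted = author_screen_name.lower()  # screen names are case-insensitive
    return [t for t in liked_tweets if t.user.screen_name.lower() == wanted]


# Stand-in objects shaped like Status objects, for illustration only
liked = [
    SimpleNamespace(id=1, user=SimpleNamespace(screen_name="UserA")),
    SimpleNamespace(id=2, user=SimpleNamespace(screen_name="SomeoneElse")),
    SimpleNamespace(id=3, user=SimpleNamespace(screen_name="usera")),
]
print(len(likes_of_author(liked, "UserA")))  # 2
```

With real data, `liked` would be User B's likes, for example collected with `tweepy.Cursor(api.get_favorites, screen_name="UserB", count=200).items()` in Tweepy 4.x.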

Tweepy: Find all tweets in a specific language

I'd like to extract all tweets in the Arabic language in all countries.
I modified the code in this tutorial.
This is my search query.
api.search(q="*", count=tweetsPerQry, lang=['ar'], tweet_mode='extended')
I expected to find a very large number of tweets, but I only collected about 7000.
I checked the content of some of them and noticed that they were posted in my country even though I did not specify a location or country. Can anyone explain why this happens?
To find out why I was getting so few tweets, I modified the query by replacing the lang parameter with geocode to find tweets in a city, and I fetched more than 65,000 Arabic tweets. After that, I used the lang parameter together with geocode and again found only a very limited number of tweets.
Can anyone help me understand why I'm not able to get a large number of tweets when I use the lang parameter?

The free Twitter APIs are good for small projects, but keep in mind that they don't return all tweets. Twitter has paid APIs that are much more powerful, though what you are trying to achieve should be possible. I ran the query below and it seemed to work: I was able to find a considerable number of tweets. This method also seemed to work for #ebt_dev. I think the problem was just that your request was structured like the stream listener version rather than the cursor search.
# `api` is an authenticated tweepy.API instance (the method is api.search_tweets in Tweepy 4.x)
# Search query: change the X of .items(X) to the number of tweets you are looking for
for tweet in tweepy.Cursor(api.search, q='*', tweet_mode='extended', lang='de').items(9999999):
    # Tweet text, lowercased; non-ASCII characters are dropped so the text can always be printed.
    # Skip the encode/decode step for non-Latin scripts such as Arabic, or the text will be empty.
    tweettext = tweet.full_text.lower().encode('ascii', errors='ignore').decode('ascii')
    # Tweet ID
    tweetid = tweet.id
    # Print the text of the tweet
    print('\ntweet text: ' + tweettext)
    # Print the ID of the tweet
    print('tweet id: ' + str(tweetid))

Why one tweet can have many IDs?

When searching tweets with the Twitter API, I got many tweets in the response that have different IDs but represent the same tweet. Example IDs:
898174127525199872
898164436929716224
898163389104406529
898162871690944513
898163196938248193
You can view any of these tweets by URL: twitter.com/Triangle_Global/status/<id>, replacing <id> with one of the numbers. All these URLs redirect to the same address, the page for tweet 897793867822411776. Moreover, this ID was not returned by the search query.
Why does one tweet have many IDs? Is it possible to construct a query that returns only "original" tweets, without such "duplicate" IDs?

All these tweets that you reference are retweets of 897793867822411776. You can see this by looking at the retweeted_status field.
You didn't say which API endpoint you are using. If you are using search/tweets there is no way to return only "original" tweets. What you can do is throw out any tweet that has the retweeted_status field present. If the tweet is not a retweet, it will not contain this field.
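A small sketch of that client-side filter, assuming the search results have been parsed from JSON into Python dicts. The helper name `drop_retweets` is hypothetical, and the sample data is heavily trimmed and reuses IDs from the question purely for illustration.

```python
def drop_retweets(statuses):
    """Keep only statuses that do not carry a retweeted_status field."""
    return [s for s in statuses if "retweeted_status" not in s]


# Trimmed stand-ins for search/tweets results
statuses = [
    {"id": 898174127525199872, "retweeted_status": {"id": 897793867822411776}},
    {"id": 898164436929716224, "retweeted_status": {"id": 897793867822411776}},
    {"id": 897793867822411776},
]
print([s["id"] for s in drop_retweets(statuses)])  # [897793867822411776]
```

If the results are Tweepy Status objects rather than dicts, the same check can be written as `not hasattr(status, "retweeted_status")`.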

How to track and stream tweets for keywords with AND operator using phirehose library?

I am trying to connect to Twitter's streaming API and retrieve tweets that match specific keywords. I am using the phirehose library for this. The Twitter documentation says that "commas as logical ORs, while spaces are equivalent to logical ANDs (e.g. ‘the twitter’ is the AND twitter, and ‘the,twitter’ is the OR twitter)."
But I want to search for keywords with the AND operator even if there are other words in between. That is, if we search for tweets containing Keyword1 AND Keyword2, tweets that contain only one of the keywords should not be retrieved.
Using the setTrack function of the phirehose library -
setTrack(array('the , twitter'));
retrieves tweets with either the OR twitter while
setTrack(array('the twitter'));
retrieves tweets with the phrase the twitter and does not retrieve tweets like the busy twitter for example.
Please help.

140dev by Adam Green gives a solution for this by using a column defined as `type` enum('words','phrase') NOT NULL DEFAULT 'words'
Please see - http://140dev.com/twitter-api-programming-blog/streaming-api-enhancements-part-2-keyword-collection-database-changes/ and
http://140dev.com/twitter-api-programming-blog/streaming-api-enhancements-part-3-collecting-tweets-based-on-table-of-keywords/
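The idea behind such a type column can be illustrated with a post-filter applied to each collected tweet. This is a Python sketch, not the 140dev code, and it assumes that 'words' means every word must appear somewhere in the tweet in any order while 'phrase' means the words must appear next to each other in order. The names `tokenize` and `matches` are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def matches(tweet_text, keyword, match_type="words"):
    """Check a tweet against one keyword entry.

    "words":  every word of the keyword appears somewhere in the tweet.
    "phrase": the words appear next to each other, in the given order.
    """
    tweet_tokens = tokenize(tweet_text)
    keyword_tokens = tokenize(keyword)
    if match_type == "words":
        return set(keyword_tokens) <= set(tweet_tokens)
    if match_type == "phrase":
        n = len(keyword_tokens)
        return any(tweet_tokens[i:i + n] == keyword_tokens
                   for i in range(len(tweet_tokens) - n + 1))
    raise ValueError(f"unknown match type: {match_type}")


print(matches("The busy Twitter", "the twitter", "words"))   # True
print(matches("The busy Twitter", "the twitter", "phrase"))  # False
```

Under this approach the stream can track broadly, and each incoming tweet is kept only if it passes the check for at least one keyword entry.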
