When searching for tweets with the Twitter API, I got many tweets in the response that have different IDs but represent the same tweet. Example IDs:
898174127525199872
898164436929716224
898163389104406529
898162871690944513
898163196938248193
You can view any of these tweets at the URL twitter.com/Triangle_Global/status/<id>, replacing <id> with one of the numbers. All these URLs redirect to the same address, the page for tweet 897793867822411776. Moreover, that ID was not returned by the search query.
Why does one tweet have many IDs? Is it possible to construct a query that returns only "original" tweets, without such "duplicate" IDs?
All these tweets that you reference are retweets of 897793867822411776. You can see this by looking at the retweeted_status field.
You didn't say which API endpoint you are using. If you are using search/tweets, there is no way to return only "original" tweets. What you can do is throw out any tweet that has the retweeted_status field present. If a tweet is not a retweet, it will not contain this field.
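A minimal sketch of that filtering step, assuming the search results have already been parsed into a list of dicts. The function name drop_retweets and the sample data are illustrative only; the one thing taken from the answer above is the check for the retweeted_status field.

```python
def drop_retweets(statuses):
    """Return only the tweets that do not carry a retweeted_status field."""
    return [s for s in statuses if "retweeted_status" not in s]


# Hypothetical sample shaped like parsed search results:
# the first entry is a retweet, the second is an original tweet.
statuses = [
    {"id": 898174127525199872,
     "retweeted_status": {"id": 897793867822411776}},
    {"id": 897793867822411776},
]

originals = drop_retweets(statuses)
print([s["id"] for s in originals])
```

Filtering on the client side means retweets still count toward the number of results returned per request, so fewer originals than requested may remain.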
Related
Please share full working code to extract tweets using Tweepy; you can leave the consumer keys blank. I also need to search tweets using multiple keywords and boolean operators, so that each tweet mentions Switzerland and at least one of several keywords:
Switzerland AND (study OR education OR employment)
Stack Overflow is not a place to ask people to write code on your behalf.
You can find a good starting point for your question in the Tweepy examples:
import tweepy
bearer_token = ""
client = tweepy.Client(bearer_token)
# Search Recent Tweets
# This endpoint/method returns Tweets from the last seven days
response = client.search_recent_tweets("Tweepy")
# The method returns a Response object, a named tuple with data, includes,
# errors, and meta fields
print(response.meta)
# In this case, the data field of the Response returned is a list of Tweet
# objects
tweets = response.data
# Each Tweet object has default ID and text fields
for tweet in tweets:
    print(tweet.id)
    print(tweet.text)
# By default, this endpoint/method returns 10 results
# You can retrieve up to 100 Tweets by specifying max_results
response = client.search_recent_tweets("Tweepy", max_results=100)
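You can adapt it with your query: Switzerland (study OR education OR employment). In the v2 query syntax, a space between terms acts as a logical AND, so no explicit AND keyword is needed.
You will find more details about search_recent_tweets in the Tweepy documentation.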
And finally, if you need help building the query, you can read the Twitter documentation.
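As a rough illustration of how such a query string could be assembled from a list of keywords, here is a small sketch. The helper build_query is hypothetical and not part of Tweepy; it only produces the string shown above.

```python
def build_query(required, alternatives):
    """Combine one required term with an OR group of alternative terms."""
    group = " OR ".join(alternatives)
    # A space between the term and the group acts as AND in the v2 syntax.
    return f"{required} ({group})"


query = build_query("Switzerland", ["study", "education", "employment"])
print(query)

# The resulting string could then be passed to the call from the example:
# response = client.search_recent_tweets(query, max_results=100)
```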
How can I download recent tweets regardless of keyword? I want any recent tweets from the Twitter API version 2. Is it possible? If so, how do I write the query?
For example, this downloads tweets containing the word cat:
https://api.twitter.com/2/tweets/search/recent?query=cat
but what should I use instead of ?? to get tweets for any keyword:
https://api.twitter.com/2/tweets/search/recent?query=??
In API 1.1, it was possible to use * as a query, but it does not seem to work in API 2. If I omit the query, I get the following error: The query query parameter can not be empty.
I think this would return far too many results as it would be all Tweets from all users, all over the world! The "recent" endpoint will limit results to those from the last 7 days but it would still be millions.
You can query just the tweets from a particular account (e.g. Stack Overflow) with:
https://api.twitter.com/2/tweets/search/recent?query=from:stackoverflow
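When building such URLs in code, the query value should be URL-encoded. The sketch below uses only the Python standard library; the helper name recent_search_url is made up for illustration, and the request itself (which needs a bearer token) is left out.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.twitter.com/2/tweets/search/recent"


def recent_search_url(query, max_results=None):
    """Build a recent-search URL with a properly encoded query string."""
    params = {"query": query}
    if max_results is not None:
        params["max_results"] = max_results
    return f"{BASE_URL}?{urlencode(params)}"


url = recent_search_url("from:stackoverflow")
print(url)
```

The colon in from:stackoverflow is percent-encoded as %3A, which the server decodes back to the same query.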
I have access to Twitter API for Academic Research, and I'd like to get the follower count on a given date of a user, or at the time of a tweet.
The doc mentions that "This fields parameter enables you to select which specific user fields will deliver in each returned Tweet.", so I assumed that by adding public_metrics to user.fields, the number of followers would appear in each returned Tweet. However, in each returned Tweet I can only see user_id. https://developer.twitter.com/en/docs/twitter-api/tweets/search/api-reference.
Is it even possible to achieve what I want with the Twitter API for Academic Research? Is there another way to do it?
Thank you so much.
You cannot get the follower count on a specific date; it will always be the count at the time you make the API call.
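You may need to add expansions to your API call in order to receive the values you are trying to pull out. In the v2 search endpoints, user.fields applies only to user objects, and those are returned (under includes.users) only when you request an expansion such as expansions=author_id.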
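Once the expansion is in place, the follower count has to be looked up by matching each tweet's author_id against the expanded user objects. The sketch below assumes a response already parsed into a dict with data and includes.users keys; the function name followers_by_tweet and the sample values are invented for illustration.

```python
def followers_by_tweet(payload):
    """Map each tweet ID to its author's current follower count (or None)."""
    users = {u["id"]: u for u in payload.get("includes", {}).get("users", [])}
    result = {}
    for tweet in payload.get("data", []):
        author = users.get(tweet.get("author_id"), {})
        metrics = author.get("public_metrics", {})
        result[tweet["id"]] = metrics.get("followers_count")
    return result


# Hypothetical payload shaped like a search response requested with
# expansions=author_id and user.fields=public_metrics.
payload = {
    "data": [{"id": "1", "author_id": "42", "text": "hello"}],
    "includes": {
        "users": [{"id": "42", "public_metrics": {"followers_count": 1234}}]
    },
}

counts = followers_by_tweet(payload)
print(counts)
```

The value is still the count at the time of the API call, not at the time the tweet was posted.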
I am using the URL http://search.twitter.com/search.json to grab tweets with the hashtag #SameHashTag. The feed only returns items from the past week. Twitter explains why here:
https://dev.twitter.com/docs/faq#8650
I really need to get the last 10 tweets, regardless of when they were created. What is the alternative?
Note: I have read about user_timeline, but that seems to be based on a user instead of a hashtag. I have also read about list_timeline, but that seems to pull tweets from a defined list of users instead of a hashtag.
You can't get old tweets from the Search API. If your hashtag doesn't change, you could search for it and save the results every day to build your own history.
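One way to keep such a history is to merge each day's results into a store keyed by tweet ID, so that tweets seen on earlier days are not duplicated. This is only a sketch: merge_history and the sample tweets are hypothetical, and persisting the dict (for example with json.dump) is left out.

```python
def merge_history(history, new_tweets):
    """Add unseen tweets to history (keyed by ID); return how many were added."""
    added = 0
    for tweet in new_tweets:
        key = str(tweet["id"])
        if key not in history:
            history[key] = tweet
            added += 1
    return added


history = {}
day_one = [{"id": 1, "text": "first #SameHashTag"}]
day_two = [{"id": 1, "text": "first #SameHashTag"},
           {"id": 2, "text": "second #SameHashTag"}]

added_day_one = merge_history(history, day_one)
added_day_two = merge_history(history, day_two)
print(added_day_one, added_day_two, sorted(history))
```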
I'd like to find mentions of a hashtag from my friends on Twitter.
At the moment I'm doing:
https://api.twitter.com/1/friends/ids.json?cursor=-1&screen_name=(name)
This returns a list of ids. What I'm then looking at doing is searching for a hashtag mention within that list of ids. Something like:
http://search.twitter.com/search.json?q=#hashtag IN user_id=(1,2,3,4)
One possible (and undesirable) workaround would be to do batch processing of:
http://api.twitter.com/1/users/lookup.json?user_id=(1,2,3,4 ~ 100)
And then construct a query based on the screen names.
Thanks