I am working on sentiment analysis of tweets (detecting users' views on plastic surgery treatments). When I extracted the tweets, I found that alongside the tweets expressing users' views there were also many marketing tweets. For example, when I search for "Liposuction", a lot of the results promote the treatment rather than express a user's opinion. How can I remove these marketing tweets and keep only the tweets that express users' views for sentiment analysis?
Related
I am building a machine learning model that would suggest attractions in a specific location.
I have most of the details worked out. However, I still need to collect the data of the attractions to train my model.
Is there somewhere I could find a dataset for this (I already checked Kaggle)? If not, which websites should I scrape?
If you want to scrape data, Twitter is probably the easiest place to start. You can use the Twitter API to get tweets that contain a specific keyword or hashtag: use your desired location as the keyword and collect the results with Tweepy. I would also suggest collecting tweets from specific accounts, such as influencers or travel blogs, to get data about attractions.
Getting approved for Twitter API access might take several days, and you can only retrieve tweets from about the past week. For anything older than that, you need to sign up for their premium subscription.
I'm a university student in South Korea.
I'm developing an analysis application that uses Twitter big data with my advising professor, so I'm gathering tweets that contain specific keywords (crime-related words) over a period of time. I currently use the Streaming API and the Search API. I have seen that both the Search API and the Streaming API only return tweets from about the last week.
I need to get old tweets containing crime keywords, from 2006 to 2016.
Do you have any idea?
Sadly you can't get tweets from that time range.
From the documentation:
The Search API is not complete index of all Tweets, but instead an index of recent Tweets. At the moment that index includes between 6-9 days of Tweets.
So, you can only get recent tweets from the Search API. Also be careful with the data, because the Search API is about relevance, not completeness. From the same documentation:
Before getting involved, it’s important to know that the Search API is focused on relevance and not completeness. This means that some Tweets and users may be missing from search results. If you want to match for completeness you should consider using a Streaming API instead.
If you really need older tweets, you will have to get them from other sources such as Gnip. Otherwise you will have to approach your problem differently.
If you have the names (or IDs) of all the users you want information about, you could fetch each user's timeline, which gives you up to 3200 tweets per user.
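A minimal sketch of the per-user timeline approach, assuming a Tweepy-style client whose `user_timeline` method accepts `screen_name`, `count`, and `max_id`. The function name `fetch_timeline` and the `api` argument are illustrative, not part of any library. It pages backwards through a timeline until no older tweets come back or the limit of roughly 3200 tweets is reached.

```python
def fetch_timeline(api, screen_name, limit=3200, page_size=200):
    """Collect up to `limit` of a user's most recent tweets, newest first.

    `api` is assumed to be an authenticated Tweepy-style client, i.e. an
    object with user_timeline(screen_name=..., count=..., max_id=...).
    """
    tweets = []
    max_id = None  # None means "start from the newest tweet"
    while len(tweets) < limit:
        kwargs = {"screen_name": screen_name, "count": page_size}
        if max_id is not None:
            kwargs["max_id"] = max_id
        page = api.user_timeline(**kwargs)
        if not page:
            break  # no older tweets are available
        tweets.extend(page)
        # Ask for tweets strictly older than the oldest one seen so far.
        max_id = min(t.id for t in page) - 1
    return tweets[:limit]
```

With real credentials this would be called once per user, for example `fetch_timeline(api, "foo")`. Rate limits still apply, so a long user list will take a while to process.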
I have three embedded Twitter timeline feeds that I am tracking on a website and I'd like to be able to count the number of Tweets that appear in each of the respective embedded feeds. Any suggestions would be helpful.
I have list of users
['foo','bar']
I want to check whether they have checked in somewhere or not (using the 4sq API).
So basically, all I am looking for is whether their tweets contain "\4sq.com\" or not.
I get very confused looking at their API.
Bonus points if the steps can be implemented in python.
Thanks
You have two options for checking a user's tweets:
One option is to look through the tweets in their User Timeline (accessible in Tweepy through api.user_timeline). However, you may have to search through a lot of unrelated tweets before you come across the one you're looking for. Given how many tweets some users have, you might want to look only through tweets more recent than a certain date (you can check the created_at attribute of the returned tweets).
The other option is to use the Search API (accessible in Tweepy through 'api.search'). This has the advantage of letting you specify a search query, giving you relevant tweets. However, you will need to search through the tweets until you find one by the user you're looking for. Again, you might want to limit the date range of tweets that you search through.
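As a rough sketch of the timeline option, the check itself can be written independently of how the tweets are fetched. It assumes each tweet exposes `.text`, `.created_at`, and optionally an `.entities` dict with expanded URLs, as Tweepy status objects do. The function names and the shape of `timelines` are illustrative assumptions.

```python
def mentions_marker(tweet, marker="4sq.com"):
    """True if the marker appears in the tweet text or in an expanded URL."""
    if marker in tweet.text.lower():
        return True
    # Links are often shortened in the text, so also check expanded URLs.
    entities = getattr(tweet, "entities", None) or {}
    urls = entities.get("urls", [])
    return any(marker in (u.get("expanded_url") or "").lower() for u in urls)


def users_with_checkins(timelines, since=None, marker="4sq.com"):
    """Map each screen name to whether a recent tweet mentions the marker.

    `timelines` maps screen name -> iterable of tweets. `since` is an
    optional datetime cutoff; it must be naive or timezone-aware to match
    the tweets' `created_at` values.
    """
    result = {}
    for name, tweets in timelines.items():
        result[name] = any(
            mentions_marker(t, marker)
            for t in tweets
            if since is None or t.created_at >= since
        )
    return result
```

With an authenticated client, `timelines` could be built as `{name: api.user_timeline(screen_name=name, count=200) for name in ['foo', 'bar']}`.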
I was asked to find Twitter accounts associated with the Dominican Republic (the project had to do with voting). This was a strange request: some Twitter accounts have geospatial data associated with them, but we have no idea whether it is accurate.
I wound up searching by hand for keywords that I knew were related (#dominican, #washingtonheights), then hopped along their friends and followers until I found the people I was looking for.
More generally:
How do I search for Twitter accounts associated with a given topic? How might it be possible to train a bot to identify hashtags relevant to a given topic? And then we can search for those keywords.
#Moderators: This is not really a coding question. If you can think of a better StackExchange, please migrate this!
Since you already have a given topic, I would suggest the following:
Get a couple of accounts by hand using the hashtags you already mentioned.
Retrieve X tweets for these accounts.
Do some natural language processing on these tweets to get new ideas for keywords.
Some things I have used in this or similar contexts:
tf-idf + NMF to get topics, then sort by components to retrieve the topics a user is talking about (a user can have multiple topics).
some sort of clustering (your biggest problem here will be the high sparsity of the data, so PCA could be an option)
use WordNet etc. to collect similar keywords
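The keyword-discovery idea can be sketched with the standard library alone. Note that this covers only the tf-idf scoring, not NMF, clustering, or WordNet. It treats all tweets from one account as a single document. The function names, the tokenizer, and the input shape are illustrative assumptions.

```python
import math
import re
from collections import Counter

TOKEN = re.compile(r"#?\w+")


def tokenize(text):
    """Lowercase a tweet and split it into words and hashtags."""
    return TOKEN.findall(text.lower())


def top_keywords(docs, k=5):
    """Rank the terms of each document by tf-idf.

    `docs` maps an account name to the concatenated text of its tweets.
    Returns account name -> list of the k highest-scoring terms.
    """
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(counts)
    # Document frequency: in how many accounts does each term occur?
    doc_freq = Counter()
    for c in counts.values():
        doc_freq.update(c.keys())
    ranked = {}
    for name, c in counts.items():
        total = sum(c.values()) or 1
        scores = {
            term: (freq / total) * math.log(n_docs / doc_freq[term])
            for term, freq in c.items()
        }
        # Sort by score (descending), then alphabetically for stable output.
        ordered = sorted(scores, key=lambda t: (-scores[t], t))
        ranked[name] = ordered[:k]
    return ranked
```

Terms that occur in every account score zero, so the top-ranked terms are the ones that distinguish an account from the others. These can then be fed back into the search as new candidate keywords or hashtags.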