Java - Spring Social: Getting Twitter data between two dates - twitter

I am using Spring Social to get Twitter data for a particular tag, say #spring.
What I want to achieve is to get all data between two dates/timestamps, e.g. data between the last fetch time and the current time. The Spring Social API allows me to get data between two tweet IDs, but I have not found a way to get data between two dates.
Please help.
Ankit Solanki

You can use the Spring Social Twitter advanced search.
http://docs.spring.io/spring-social-twitter/docs/1.0.5.RELEASE/api/org/springframework/social/twitter/api/impl/SearchParameters.html
SearchParameters has two parameters, since and until. These can be used to retrieve tweets for a given date range.
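One caveat: as far as I know, since and until work at day granularity, so for "last fetch time to current time" you may still need to trim the results by timestamp on the client. A minimal sketch in plain Java, assuming each fetched tweet exposes its creation time. FetchedTweet and TweetWindow are hypothetical stand-ins, not Spring Social types.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a fetched tweet; not a Spring Social type.
class FetchedTweet {
    final long id;
    final Instant createdAt;
    final String text;

    FetchedTweet(long id, Instant createdAt, String text) {
        this.id = id;
        this.createdAt = createdAt;
        this.text = text;
    }
}

class TweetWindow {
    // Keeps tweets with lastFetch < createdAt <= now, so a tweet created exactly
    // at the previous fetch time is not returned a second time.
    static List<FetchedTweet> between(List<FetchedTweet> tweets, Instant lastFetch, Instant now) {
        List<FetchedTweet> result = new ArrayList<>();
        for (FetchedTweet tweet : tweets) {
            boolean afterLastFetch = tweet.createdAt.isAfter(lastFetch);
            boolean notInFuture = !tweet.createdAt.isAfter(now);
            if (afterLastFetch && notInFuture) {
                result.add(tweet);
            }
        }
        return result;
    }
}
```

The half-open window is a design choice: storing the time of each fetch and passing it as lastFetch on the next run avoids processing the same tweet twice.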

Related

How do I obtain the data to train my ML model?

I am building a machine learning model that would suggest attractions in a specific location.
I have most of the details worked out. However, I still need to collect the data of the attractions to train my model.
Is there somewhere I could find a dataset for this (I already checked Kaggle)? If not, which websites should I scrape?
If you want to scrape data, Twitter is probably the easiest place to start. You can use the Twitter API to get tweets that contain a specific keyword or hashtag: use your desired location as the keyword and collect the tweets with tweepy. I would suggest pulling from specific accounts, such as influencers or travel blogs, to get data about attractions.
Getting approved for the Twitter API might take several days, and you can only retrieve tweets from roughly the past week. For anything older than that you need to sign up for their premium subscription.

Not able to see time zone, place or geolocation of any tweets

I am following two tutorials right now and both are up and running and I've gotten plenty of tweets/sentiment scores from them:
1) Twitter Stream Analytics on Azure https://azure.microsoft.com/en-us/documentation/articles/stream-analytics-twitter-sentiment-analysis-trends/
2) Twitter Analysis with Spark Streaming http://ampcamp.berkeley.edu/3/exercises/realtime-processing-with-spark-streaming.html
I am using the free oauth tool provided from apps.twitter.com.
Problem
I've tried getPlace and getGeoLocation in the Spark Streaming app, and every tweet I get has a null value for those two fields. I have tried filtering for tweets that have values for getPlace and getGeoLocation, and I still get null for both (I ran the app for almost 20 minutes).
I've also tried getting TimeZone in the Azure app (so I can get some sort of geography data) and even then I kept getting null values for TimeZone.
Possible Obstacles
1) Does the free twitter api filter out the place/geoLocation information so I end up buying a subscription to a better api?
2) Do I need to explicitly search for tweets that have geoLocation/Places, rather than getting all tweets and then filtering for the ones that have geoLocation/Places? If so, can I execute this search in Spark Streaming?
This is the code that I have in Spark Streaming:
val stream = TwitterUtils.createStream(ssc, None, filters)
val hashTags = stream.map(status => Tweet(status.getPlace().getName(), classifyTweet(status.getText())))
Thank you for the help!
I've personally used the free Twitter API to get locations and plot them on a map in Power BI, so you can rule out the first obstacle.
One thing to note is that the location field is only populated if the user has specifically allowed the application to access their location, which makes it quite rare. In my sample data, about 8% of tweets had a location.
I don't have an answer for the Spark side; I just wanted to help you rule out the first possibility.
Hope this helps.
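One related point about the snippet in the question: status.getPlace().getName() throws a NullPointerException whenever a tweet carries no place, which is the common case. A null-safe sketch in Java follows. StatusLike and PlaceLike are hypothetical stand-ins that mirror only the getPlace(), getText() and getName() accessors, so the example runs without twitter4j; the same pattern should translate to an Option in Scala.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-ins mirroring only the accessors used in the question.
interface PlaceLike {
    String getName();
}

interface StatusLike {
    PlaceLike getPlace(); // null when the tweet carries no place
    String getText();
}

class PlaceNames {
    // Returns the place name if there is one, without risking a NullPointerException.
    static Optional<String> placeName(StatusLike status) {
        return Optional.ofNullable(status.getPlace()).map(PlaceLike::getName);
    }

    // Collects the place names of only those statuses that actually have a place.
    static List<String> namesOfLocated(List<StatusLike> statuses) {
        List<String> names = new ArrayList<>();
        for (StatusLike status : statuses) {
            placeName(status).ifPresent(names::add);
        }
        return names;
    }

    // Illustrative helper for building sample statuses; pass null for "no place".
    static StatusLike status(String text, String placeNameOrNull) {
        PlaceLike place = placeNameOrNull == null ? null : () -> placeNameOrNull;
        return new StatusLike() {
            public PlaceLike getPlace() { return place; }
            public String getText() { return text; }
        };
    }
}
```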

Rails current visitor count

How does one implement a current visitors count for individual pages in Rails?
For example, a property website has a list of properties and a remark that says:-
"there are 6 people currently looking at this property" for each individual listing.
I'm aware of the impressionist gem, which is able to log unique impressions for each controller. Just wondering if there is a better way than querying
impressions.where("created_at >= ?", 5.minutes.ago).count
for each object in the array.
Before you get downvoted, I'll give you an idea of how to do it.
Recording visitors is in the realm of analytics, of which Google Analytics is the most popular and recognized example.
Analytics
Analytics systems work with 3 parts:
Capture
Processing
Display
The process of capturing & processing data is fundamentally the same everywhere: put a JS widget on your site that sends a request to the server with the user data attached. Processing then stores that data in your database.
Displaying The Data
The difference for many people is in how the captured data is displayed:
Google Analytics displays the data in their dashboard
eBay displays the data as x people bought in the past hour
You want to show the number of people viewing an item
The way to do this is to build the processing step into your own app.
I can't explain the exact way to do this, because it's highly dependent on your stack, but this is the general approach.
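As a rough illustration of the capture / processing / display split, here is a minimal in-memory sketch. It is written in Java only to keep it self-contained; ViewerTracker and its methods are hypothetical names, and a Rails app would more likely keep this state in a shared store or the database. The idea is to record a heartbeat per visitor and page, then count the visitors seen within a recent window.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ViewerTracker {
    private final Duration window;
    // pageId -> (visitorId -> time of that visitor's most recent heartbeat)
    private final Map<String, Map<String, Instant>> lastSeen = new ConcurrentHashMap<>();

    ViewerTracker(Duration window) {
        this.window = window;
    }

    // Capture + processing: call this whenever the page's JS widget pings the server.
    void recordView(String pageId, String visitorId, Instant at) {
        lastSeen.computeIfAbsent(pageId, key -> new ConcurrentHashMap<>()).put(visitorId, at);
    }

    // Display: distinct visitors seen on the page within the window ending at 'now'.
    int currentViewers(String pageId, Instant now) {
        Map<String, Instant> visitors = lastSeen.get(pageId);
        if (visitors == null) {
            return 0;
        }
        Instant cutoff = now.minus(window);
        // Drop stale entries so the map does not grow without bound.
        visitors.values().removeIf(seen -> seen.isBefore(cutoff));
        return visitors.size();
    }
}
```

Keying by visitor rather than counting raw impressions means a visitor who refreshes the page several times is still counted once.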

YouTube Data API V3: Fetch multiple videoCategoryId videos

I am using YouTube Data API Version 3.0 in one of my projects to fetch my channel video details from YouTube. I don't want the user to have to log in to his/her Google account, which is why I am using the Search.list method directly instead of going through OAuth 2.0.
Usually I fetch data using following URL.
https://www.googleapis.com/youtube/v3/search?key={API_KEY}&maxResults=5&part=snippet&type=video&channelId={CHANNEL_ID}
Now, I want to fetch data of 5 different categories at a time. What I can do is, hit the same URL 5 times with query string as
key={API_KEY}&maxResults=1&part=snippet&type=video&channelId={CHANNEL_ID}&videoCategoryId={CATEGORY_ID}
or,
Is there a way like
key={API_KEY}&maxResults=5&part=snippet&type=video&channelId={CHANNEL_ID}&videoCategoryId={CATEGORY_ID_1, CATEGORY_ID_2, CATEGORY_ID_3, CATEGORY_ID_4, CATEGORY_ID_5}
I want to fetch only 1 video data per category. That is why I have given 5 comma separated category IDs.
Also, Search.list method does not give videoCategoryId. To get it, I have to use
https://www.googleapis.com/youtube/v3/videos?part=snippet&id={VIDEO_ID}&key={API_KEY}
Is there any way to get videoCategoryId in Search.list method?
You need 5 queries for that. Even if you could pass 5 category IDs, the API would not know to pick one video for each of them.
Search.list doesn't return videoCategoryId right now; Videos.list has it.
There is no problem with going through OAuth 2.0: you can ask for read-only permission and it should be fine. Users can easily pick one of their already stored accounts.
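A minimal sketch of the one-request-per-category approach, in plain Java. It only builds the request URLs from the parameters used in the question; CategorySearchUrls is a hypothetical name, and the API key, channel ID and category IDs are placeholders. Each URL can then be fetched with whatever HTTP client the project already uses.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CategorySearchUrls {
    private static final String BASE = "https://www.googleapis.com/youtube/v3/search";

    // URL-encodes a single query parameter value.
    static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Builds one search URL per category, each asking for a single video.
    static List<String> build(String apiKey, String channelId, List<String> categoryIds) {
        List<String> urls = new ArrayList<>();
        for (String categoryId : categoryIds) {
            urls.add(BASE
                + "?key=" + enc(apiKey)
                + "&maxResults=1"
                + "&part=snippet"
                + "&type=video"
                + "&channelId=" + enc(channelId)
                + "&videoCategoryId=" + enc(categoryId));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Placeholder values; substitute a real key, channel ID and category IDs.
        List<String> categories = Arrays.asList("1", "2", "10", "17", "24");
        for (String url : build("API_KEY", "CHANNEL_ID", categories)) {
            System.out.println(url);
        }
    }
}
```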

Retrieving tweets from twitter using twitter4j

I am developing an application to guess locations of tornadoes by analyzing twitter data. For this, I would first need to train a neural network on some manually annotated tweets. I am trying to get tweets from last year which have the word 'tornado' in them. This is my code below :-
Query query = new Query("tornado");
query.setRpp(100);
query.setSince("2010-11-01");
query.setUntil("2011-01-13");
QueryResult queryResult = instance.search(query);
tweetList = queryResult.getTweets();
I am able to retrieve tweets from periods closer to now, such as last week, but am unable to get any results for periods such as the one listed above. Any clues or suggestions would be helpful. Thanks in advance.
I just found out the reason through a different medium and thought I'd share the answer in case other people have the same issue.
It turns out that the Twitter search API does not return tweets older than around a week, and depending on server load this can at times be as low as 24 hours. Hence, any third-party library (such as twitter4j) that wraps the Twitter search API will behave the same way.
The best way to go about this would be to use third-party search and indexing sites such as snapbird, topsy, etc.
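Given that limit, it can help to check a requested date range before calling the search API at all. A minimal sketch in plain Java, assuming a window of about 7 days; that figure comes from the "around a week" estimate above and is not a documented constant. SearchWindow and its methods are hypothetical names.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class SearchWindow {
    // Assumed size of the searchable window in days ("around a week").
    static final int ASSUMED_WINDOW_DAYS = 7;

    // True if any part of [since, until] falls inside the assumed window ending today.
    static boolean overlapsWindow(LocalDate since, LocalDate until, LocalDate today) {
        LocalDate oldest = today.minusDays(ASSUMED_WINDOW_DAYS);
        return !until.isBefore(oldest) && !since.isAfter(today) && !since.isAfter(until);
    }

    // Moves 'since' forward to the oldest date the search is assumed to cover.
    static LocalDate clampSince(LocalDate since, LocalDate today) {
        LocalDate oldest = today.minusDays(ASSUMED_WINDOW_DAYS);
        return since.isBefore(oldest) ? oldest : since;
    }

    // Formats a date as yyyy-MM-dd, the form used for since/until in the question.
    static String format(LocalDate date) {
        return date.format(DateTimeFormatter.ISO_LOCAL_DATE);
    }
}
```

With the range from the question (2010-11-01 to 2011-01-13) and any "today" more than a week after the end date, overlapsWindow returns false, which matches the empty result the search returned.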
