I need to stream live tweets from the Twitter API and then analyse them. Should I use Kafka to get the tweets, Spark Streaming directly, or both?
You can use Kafka Connect to ingest tweets, and then Kafka Streams or KSQL to analyse them. Check out this article, which describes exactly this setup.
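Kafka Streams and KSQL are JVM/SQL-side tools, but just to give a feel for the consume-and-analyse step, here is a minimal Python sketch using the kafka-python package (my substitution, not something the article uses); the broker address and the tweets topic name are assumptions:

```python
# Hedged sketch: consuming tweets from a Kafka topic for simple analysis.
# Assumes kafka-python is installed, a broker at localhost:9092, and a
# hypothetical "tweets" topic fed by Kafka Connect with tweet JSON.
import json
from collections import Counter

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "tweets",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

hashtag_counts = Counter()
for message in consumer:
    tweet = message.value
    # Count hashtags as a toy example of "analysing" the stream.
    for tag in tweet.get("entities", {}).get("hashtags", []):
        hashtag_counts[tag["text"].lower()] += 1
    print(hashtag_counts.most_common(5))
```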
Depending on your language of choice, I would use one of the libraries listed here: https://developer.twitter.com/en/docs/developer-utilities/twitter-libraries. Whichever you choose, you will be using statuses/filter in the Twitter API, so get familiar with the docs here: https://developer.twitter.com/en/docs/tweets/filter-realtime/api-reference/post-statuses-filter.html
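For example, with tweepy (one of the Python libraries on that list; this sketch assumes the tweepy 3.x API, and the credentials and track keyword are placeholders), a minimal statuses/filter connection looks like this:

```python
# Hedged sketch: connecting to statuses/filter with tweepy 3.x.
import tweepy

class PrintListener(tweepy.StreamListener):
    def on_status(self, status):
        # Each matching tweet arrives here as it is posted.
        print(status.text)

    def on_error(self, status_code):
        if status_code == 420:  # rate-limited; disconnect to back off
            return False

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")

stream = tweepy.Stream(auth=auth, listener=PrintListener())
stream.filter(track=["kafka"])  # calls statuses/filter under the hood
```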
I want to use the Neo4j graph database for my Android social media application. Which is the best approach to access data from the database?
Drivers
HTTP API
Your best bet is the HTTP API.
However, you may want to take a look at this video, which purports to show how to use the Java driver with Android. I have not tried that approach myself.
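If you do go the HTTP API route, the request shape is easy to see from Python. This is a minimal sketch of the transactional Cypher endpoint, assuming a Neo4j 3.x server on localhost with placeholder credentials and an illustrative query; on Android you would send the same JSON POST with your preferred HTTP client:

```python
# Hedged sketch of Neo4j's transactional HTTP endpoint (Neo4j 3.x paths).
# Host, credentials, labels, and the Cypher query are all placeholders.
import requests

url = "http://localhost:7474/db/data/transaction/commit"
payload = {
    "statements": [
        {
            "statement": "MATCH (u:User)-[:FRIEND]->(f) "
                         "WHERE u.name = $name RETURN f.name",
            "parameters": {"name": "alice"},
        }
    ]
}

resp = requests.post(url, json=payload, auth=("neo4j", "password"))
resp.raise_for_status()
for result in resp.json()["results"]:
    for row in result["data"]:
        print(row["row"])  # each row is a list of returned values
```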
Basically, I want to get analytics reports through the YouTube API using Python. After hours of searching for how to make it happen, I have learned that YouTube only supports reporting through its graphical interface, which is really limited.
Please advise me: is there any way to get daily/weekly/monthly reports using Python?
FYI, at the moment I am using YouTube's own service to automatically load the reports into my database, which is BigTable.
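For what it's worth, the YouTube Analytics API is queryable from Python via google-api-python-client. A minimal sketch, assuming OAuth credentials in a client_secret.json file and modelled on Google's v2 samples (the dates, metrics, and dimensions are placeholders you would adjust):

```python
# Hedged sketch: querying the YouTube Analytics API (v2) from Python.
# Assumes google-api-python-client and google-auth-oauthlib are installed.
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build

SCOPES = ["https://www.googleapis.com/auth/yt-analytics.readonly"]

# Runs a local browser flow to authorize access to your channel's data.
flow = InstalledAppFlow.from_client_secrets_file("client_secret.json", SCOPES)
credentials = flow.run_local_server()

analytics = build("youtubeAnalytics", "v2", credentials=credentials)

# Daily views and watch time for one month; change the range for
# weekly/monthly rollups.
report = analytics.reports().query(
    ids="channel==MINE",
    startDate="2019-01-01",
    endDate="2019-01-31",
    metrics="views,estimatedMinutesWatched",
    dimensions="day",
    sort="day",
).execute()

for row in report.get("rows", []):
    print(row)
```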
I want to develop some stuff with the Twitter Streaming API and twitter4j at university. I have just read about the shutdown of the share-count API (https://blog.twitter.com/2015/hard-decisions-for-a-sustainable-platform). Will this affect the Twitter Streaming API and how it works in any way? I need this service for at least 6 months.
The share-count API and the Streaming API do not cross paths; in fact, you can derive the share count from Streaming API data, as suggested in this post.
Since they are discontinuing that service, it will have no effect on the data you're able to obtain from the Streaming API, so it won't affect the progress of your project.
As for GNIP, that's overkill and should not have been suggested at all. For research, especially during the initial stages and possibly later phases, the Streaming API will provide you with an excellent amount of data. You can even request a limit increase through Twitter's Sales Department, but it's up to them to make the final decision. They can be contacted at data-sales@twitter.com.
Share count and streaming are totally separate APIs.
If you need guaranteed access, I suggest paying for Twitter's GNIP service - https://www.gnip.com/
I am working on a project, which has a few dependencies. I would like to have your insights and best practices on a few matters.
Which is the right Twitter Streaming API to get the tweets from the authenticating user?
Does it make sense to hook that streaming API to Pusher and hook my Arduino on Pusher as well?
What is the best library to hook the streaming API to a Laravel backend?
I sincerely hope that this question is within the rules of StackOverflow, as I am not sure. I would really like to gain this information.
For the authenticating user's timeline, use GET statuses/home_timeline: https://dev.twitter.com/rest/reference/get/statuses/home_timeline
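Note that this is a REST endpoint rather than a streaming one. A minimal sketch of calling it with tweepy (assuming the 3.x API; credentials are placeholders):

```python
# Hedged sketch: fetching the authenticating user's home timeline with
# tweepy (REST, not streaming).
import tweepy

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")
api = tweepy.API(auth)

# Most recent 20 tweets from the accounts the user follows.
for status in api.home_timeline(count=20):
    print(status.user.screen_name, ":", status.text)
```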
You can definitely hook an incoming feed from Twitter up to Pusher easily (see the next answer). And Pusher has an Arduino library that should help you too.
Pusher has a solid PHP library, so it will be easy to publish data from a Laravel application. The tricky part may be consuming the Twitter Streaming API from Laravel and PHP. Phirehose claims to offer access to the Streaming API from PHP.
If you could consider other technologies, then there's an in-progress Pusher + Twitter Streaming example written in Node that you could look at. There's also this older example that uses Python.
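In the Python direction, the bridge from the Streaming API to Pusher could look like the following sketch (assumptions: tweepy 3.x and the pusher-http-python library; all credentials, the channel name, and the event name are placeholders):

```python
# Hedged sketch: forwarding streamed tweets to Pusher so that browser or
# Arduino subscribers receive them in real time.
import pusher
import tweepy

pusher_client = pusher.Pusher(
    app_id="APP_ID", key="KEY", secret="SECRET", cluster="CLUSTER"
)

class PusherListener(tweepy.StreamListener):
    def on_status(self, status):
        # Publish each incoming tweet on a Pusher channel.
        pusher_client.trigger("tweets", "new-tweet", {"text": status.text})

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")
tweepy.Stream(auth=auth, listener=PusherListener()).filter(track=["arduino"])
```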
How do I store tweets about a particular website in HDFS?
Suppose there is a website, www.abcd.com, and I want to collect all users' tweets about it and store them in HDFS or Hive.
Flume and Sqoop are also said to be helpful for storing data, so can anyone please suggest how Flume and Sqoop work for storing tweets in HDFS?
Sqoop was not made for this purpose; Flume is used for these kinds of needs. You can write a custom Flume source that pulls the tweets and dumps them into HDFS. See this for an example; it shows how to use Flume to collect data from the Twitter Streaming API and forward it to HDFS.
You can find more in the official documentation.
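The production route is the Flume agent from the linked example, but purely to illustrate the data flow that such an agent automates (Twitter source in, HDFS sink out), here is a hedged Python sketch using tweepy and the HdfsCLI hdfs package; both libraries are my substitutions, and the NameNode URL, paths, batch size, and keyword are placeholders:

```python
# Hedged sketch of the pipeline a Flume TwitterSource -> HDFS sink automates:
# pull tweets matching a keyword and write them to HDFS over WebHDFS.
import json

import tweepy
from hdfs import InsecureClient

hdfs_client = InsecureClient("http://namenode:50070", user="hadoop")

class HdfsListener(tweepy.StreamListener):
    def __init__(self):
        super().__init__()
        self.batch = []
        self.count = 0

    def on_status(self, status):
        self.batch.append(json.dumps(status._json))
        if len(self.batch) >= 100:  # flush in batches to avoid tiny files
            path = "/user/hadoop/tweets/batch-%d.json" % self.count
            hdfs_client.write(
                path, data="\n".join(self.batch) + "\n", encoding="utf-8"
            )
            self.count += 1
            self.batch = []

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")
tweepy.Stream(auth=auth, listener=HdfsListener()).filter(track=["abcd.com"])
```

Once the JSON files are in HDFS you can put an external Hive table over the directory, which covers the "or Hive" part of the question.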