Google search as an RSS feed - ruby-on-rails

Is there a way to treat Google search results as an RSS feed?
For example, say I worked for Stack Overflow and wanted to monitor how the results from the following search URL change from day to day: http://www.google.com/search?hl=en&q=stackoverflow
It would be cool if I could append &output=rss to the URL and get back a feed, as with Google News. But that does not seem to be supported.
Anyone have ideas? (Note that I am programming with Ruby and Rails, if that matters.)
Thanks!
Jonathan

Google has the Google Alerts service, which notifies you whenever it finds new content matching a certain query. Besides sending to an email address (instantly, daily, weekly), it allows you to create an RSS feed out of it.

No, Google doesn't offer that feature.
If you need to parse or convert the result of a Google query, you can use an (X)HTML parser such as Nokogiri.
Beware that automatic requests to Google may violate its TOS.
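Once the result URLs have been extracted from the page (for example with Nokogiri, which is not shown here), the day-to-day monitoring the question asks about comes down to comparing two ranked lists. The following is only a minimal sketch of that comparison; the method name diff_results and the example URLs are made up for illustration.

```ruby
# Compare two ranked lists of result URLs (e.g. yesterday's and today's).
# How the URLs are obtained is outside this sketch.
def diff_results(old_urls, new_urls)
  old_rank = old_urls.each_with_index.to_h # url => position on the old day
  new_rank = new_urls.each_with_index.to_h # url => position on the new day

  {
    added:   new_urls - old_urls,
    removed: old_urls - new_urls,
    # URLs present on both days whose position changed: url => [old, new]
    moved:   (old_urls & new_urls)
               .select { |url| old_rank[url] != new_rank[url] }
               .to_h { |url| [url, [old_rank[url], new_rank[url]]] }
  }
end

yesterday = %w[http://a.example http://b.example http://c.example]
today     = %w[http://b.example http://a.example http://d.example]
p diff_results(yesterday, today)
```

Storing one list per day and running this comparison gives a change log that could itself be published as a feed.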

Related

Can I use twitter api to search tweets one week before?

I am trying to search for keywords on Twitter through tweepy.
However, it seems that I cannot search for tweets that are more than one week old. The code below is the main search code.
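    import tweepy

    # Assumes API is an already-authenticated tweepy.API instance.
    for tweet in tweepy.Cursor(API.search,
                               q="python",
                               since="2014-02-03",  # dates must be quoted strings
                               until="2014-02-04",
                               lang="en").items():
        print(tweet.text)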
I am not sure whether there is a limit, or a better way to search by time. Thanks for your help!!
:)
Unfortunately, you can't get tweets older than a week with the twitter search API.
Also note that the search results at twitter.com may return historical results while the Search API usually only serves tweets from the past week. - Twitter documentation.
You can get specific tweets older than a week by looking at individual users or using specific post ids (if you have them) but it's not reasonable to index every single tweet ever to be searchable using the API.
If you need a large time range, you can collect them yourself using the streaming API or check out a service that does (see this dev twitter thread for examples).

How should I get all the tweets of a specific hashtag?

I'm trying to develop some code to get all the tweets that were posted with certain hashtags, then parse them and finally analyse them. I believe I've already thought through and solved the last two parts, but I'm having some trouble with the first one. I've read the Twitter Search API documentation but I haven't yet worked out how to do this. Can anyone help me?
If you want to retrieve tweets sent recently, you should use the search/tweets endpoint of Twitter's REST API and put the hashtag in the q parameter.
If you want to listen for tweets containing the hashtag and receive them in real time, then Twitter's streaming API is what you should use (the statuses/filter endpoint).
Have a look at the documentation on twitter's website, there's also plenty of information on how to do this all around the web.
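As a rough illustration of the first option, here is how the request URL for search/tweets could be built in Ruby with the hashtag in the q parameter. This is a sketch under assumptions: the helper name hashtag_search_url is made up, the count and result_type values are just examples, and the OAuth authentication that the v1.1 API requires is not shown.

```ruby
require "uri"

# Build the URL for a v1.1 search/tweets request for one hashtag.
# The request must also be OAuth-authenticated, which is not shown here.
def hashtag_search_url(hashtag, count: 100)
  tag = hashtag.start_with?("#") ? hashtag : "##{hashtag}"
  # encode_www_form percent-encodes the leading "#" as %23
  query = URI.encode_www_form(q: tag, result_type: "recent", count: count)
  "https://api.twitter.com/1.1/search/tweets.json?#{query}"
end

puts hashtag_search_url("rubyonrails")
# prints https://api.twitter.com/1.1/search/tweets.json?q=%23rubyonrails&result_type=recent&count=100
```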

Get RSS feed of a Twitter hashtag

A client of mine wants to get an RSS feed for a twitter hashtag and include it in a list of other feeds from a variety of other sites.
I've Googled it all morning and I get a mix of answers. Some say it's possible, others say not anymore.
Could anyone shed some light on this?
Unfortunately, as of the API v1.1 upgrade on June 11th 2013, twitter now only gives responses in JSON. You can view the official announcement here.
That being said, I think you may need to think broader with your google searching ;)
Within 30 seconds, I found this, and it looks like exactly what you're looking for:
RSS readers cannot subscribe to JSON feeds. twitter-json-to-rss is a set of PHP scripts that you install on your public facing server that allows you to get around this problem.
Here's the library. Note that I have not tested it myself, so you'll have to check it out.
Basically, you need to send an authenticated request to the twitter 1.1 API, get your data, then convert the returned json to RSS for your client.
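The last step, converting the returned JSON to RSS, could look roughly like the Ruby sketch below. It is not the twitter-json-to-rss library mentioned above, just a hand-rolled illustration: the method name tweets_json_to_rss is made up, the field names (statuses, id_str, text, created_at, user.screen_name) are assumed to follow the v1.1 tweet object, and the authenticated request that produces the JSON is not shown.

```ruby
require "json"
require "cgi"
require "time"

# Convert the "statuses" array of a search/tweets JSON response into a
# minimal RSS 2.0 document (channel title, link, description plus items).
def tweets_json_to_rss(json, title:, link:)
  statuses = JSON.parse(json).fetch("statuses", [])
  items = statuses.map do |t|
    url = "https://twitter.com/#{t.dig('user', 'screen_name')}/status/#{t['id_str']}"
    # created_at looks like "Mon Feb 03 12:00:00 +0000 2014"
    date = Time.strptime(t["created_at"], "%a %b %d %H:%M:%S %z %Y").utc
    "<item>" \
      "<title>#{CGI.escapeHTML(t['text'])}</title>" \
      "<link>#{CGI.escapeHTML(url)}</link>" \
      "<guid>#{CGI.escapeHTML(url)}</guid>" \
      "<pubDate>#{date.strftime('%a, %d %b %Y %H:%M:%S +0000')}</pubDate>" \
      "</item>"
  end

  %(<?xml version="1.0" encoding="UTF-8"?>\n) +
    %(<rss version="2.0"><channel>) +
    "<title>#{CGI.escapeHTML(title)}</title>" +
    "<link>#{CGI.escapeHTML(link)}</link>" +
    "<description>#{CGI.escapeHTML(title)}</description>" +
    items.join +
    "</channel></rss>"
end

sample = '{"statuses":[{"id_str":"1","text":"Ruby & RSS",' \
         '"created_at":"Mon Feb 03 12:00:00 +0000 2014",' \
         '"user":{"screen_name":"example"}}]}'
puts tweets_json_to_rss(sample, title: "#ruby", link: "https://twitter.com/hashtag/ruby")
```

Serving the resulting string from a public URL with an XML content type is what lets an RSS reader subscribe to it.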
If you're using PHP, this is the fastest post to help you get up to speed with requests to the 1.1 API in PHP.
It's not possible anymore. Twitter removed RSS feeds from their data formats and will require you to use their API to achieve that.

RSS feeds in new twitter

Does anyone know where to find the RSS feeds in the new twitter? I cannot find the rss icon and the source of the page just points to "Your Twitter Favorites" even though I am on the page of the user I want to get an RSS feed from...
Simple, I know, but it's bugging me to no end!
2014 edit:
It looks like Twitter has retired RSS feeds, and now only exports data as JSON:
What output formats will API v1.1 support?
API v1.1 will support JSON only. We’ve been hinting at this for some
time now, first dropping XML support on the Streaming API and more
recently on the trends API. XML, Atom, and RSS are infrequently used
today, and we’ve chosen to throw our support behind the JSON format
shared across the platform. Consequently, we’ve decided to discontinue
support for these other formats. For historical context, when we
originally built the API all major languages did not have performant,
well vetted libraries supporting JSON - today they do.
Original 2010 answer:
Here are the various feed URLs (using the account "Twitter" for these examples):
http://twitter.com/statuses/public_timeline.rss
http://twitter.com/statuses/user_timeline/Twitter.rss
http://twitter.com/favorites/Twitter.rss
http://search.twitter.com/search.rss?q=Twitter
The new Twitter layout isn't very RSS-friendly, unfortunately.
You won't be able to find it because Twitter stopped support for RSS :(
This was something I needed, so I built a Twitter to RSS converter. It works on hashtags, searches and lists. I've now opened it up, totally free, for anyone else who needs a solution.
Get it here - Twitter RSS Feed Generator
I recommend the free website twitrss.me: put the user id (for example, ahejlsberg) into the text box next to #, then click the "Fetch RSS" button.
You can get the RSS feed url: https://twitrss.me/twitter_user_to_rss/?user=ahejlsberg.
I found that this works for particular users (I had been trying to figure out their ids which was the way rss used to work but this works fine):
[Updated]
http://api.twitter.com/1/statuses/user_timeline.rss?screen_name=johnpiper
[Updated Sept 2014]
no longer works again...
[Alternative Solution: May 2015]
I have since discovered http://www.queryfeed.net
http://www.queryfeed.net/twitter?q=from%3Ajohnpiper
See the home page for further documentation about how to structure other queries. The service does not seem to return all tweets.

Ruby Rss parser and event trigger

I'm using the RSS library to parse Atom and RSS in Ruby on Rails and store the results in a model.
I've looked at the standard RSS library, but is there a library that will auto-detect that there is a new RSS feed so I can update my database?
What is the best practice for triggering an instruction that stores the new RSS feed?
Should I use threads to handle this problem? Is it going to be slow?
Thank you for your help.
OK, here's the deal.
If you want a really fast feed parser, go for Feedzirra. It does not work on Windows. http://github.com/pauldix/feedzirra
Autodiscovery?
-There's truffle-hog if you don't want to do GET redirects. http://github.com/pauldix/truffle-hog
-There's feedbag if you want to do GET redirects to find feeds from given URLs. This is slower, though. http://github.com/damog/feedbag
Feedzirra is the best bet if you want to poll your feed for new entries. If you want a non-polling solution, I would suggest going through the PubSubHubbub spec. While parsing your feeds, check whether they are PubSubHubbub enabled by looking for the link tag. If it points to pubsubhubbub.appspot.com or any other PubSubHubbub-enabled hub, subscribe to the feed by sending a subscription request to the hub. You can then define an endpoint in your app which will receive updated-entry pings for your feed subscription from the hub. Just read the raw POST data and store it in your database. Stats are that 95% of the Blogger blogs are pubsub enabled. That is a lot of data in your hands already. :)
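To make the hub check and the subscription request a bit more concrete, here is a rough Ruby sketch. The method names hub_urls and subscription_body are made up, and the regex scan is only a dependency-free stand-in for a real XML parser. The parameter names hub.mode, hub.topic and hub.callback come from the PubSubHubbub spec; hubs implementing the older 0.3 version of the spec also expect hub.verify, which is left out here.

```ruby
require "uri"

# Return the hub URLs a feed advertises via <link rel="hub" href="...">.
# A regex scan keeps the sketch self-contained; an XML parser is more robust.
def hub_urls(feed_xml)
  feed_xml.scan(/<(?:\w+:)?link\b[^>]*>/).filter_map do |tag|
    next unless tag =~ /\brel=["']hub["']/
    tag[/\bhref=["']([^"']+)["']/, 1]
  end
end

# Form-encoded body for the subscription request that is POSTed to the hub.
# callback_url is the endpoint in your app that will receive the pings.
def subscription_body(topic_url, callback_url)
  URI.encode_www_form(
    "hub.mode"     => "subscribe",
    "hub.topic"    => topic_url,
    "hub.callback" => callback_url
  )
end

feed = '<feed><link rel="self" href="http://example.com/feed.xml"/>' \
       '<link rel="hub" href="http://pubsubhubbub.appspot.com/"/></feed>'
p hub_urls(feed)
puts subscription_body("http://example.com/feed.xml", "http://myapp.example/push")
```

The hub confirms the subscription by calling the callback URL, so the endpoint has to be publicly reachable before the request is sent.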
If you are polling for changes, check the Last-Modified or ETag response headers rather than parsing the entire feed again. This saves you from wasting resources. Feedzirra takes care of this for you.
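For anyone doing this by hand rather than through Feedzirra, the header check is usually done as a conditional GET: send back the ETag and Last-Modified values saved from the previous response and let the server answer 304 Not Modified. The sketch below uses Ruby's standard Net::HTTP; the method names and the shape of the returned hash are assumptions made for illustration, not Feedzirra's API.

```ruby
require "net/http"
require "uri"

# Build a GET request that asks the server to send the feed only if it has
# changed since the validators saved from the previous response.
def conditional_feed_request(url, etag: nil, last_modified: nil)
  request = Net::HTTP::Get.new(URI(url))
  request["If-None-Match"] = etag if etag
  request["If-Modified-Since"] = last_modified if last_modified
  request
end

# Send the request. Returns nil when the server answers 304 Not Modified,
# otherwise the body plus the new validators to store for the next poll.
def fetch_if_changed(url, etag: nil, last_modified: nil)
  uri = URI(url)
  request = conditional_feed_request(url, etag: etag, last_modified: last_modified)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request)
  end
  return nil if response.is_a?(Net::HTTPNotModified)

  { body: response.body, etag: response["ETag"], last_modified: response["Last-Modified"] }
end
```

A poller would call fetch_if_changed with the stored values and skip parsing whenever it gets nil back.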
I am not sure what you mean by "auto-detect" a new feed?
Are you looking for code that can discover when someone creates a new feed on a site? Or, do you mean discover when an existing feed has a new article?
The first is tough because your code needs to know what site to look at, so it needs some sort of auto-discovery of sites with new feeds. Searching Google for "new rss feeds" doesn't return anything that looks useful, at least not on the first page. If you, or your users, know of a new site then you can have an interface to add new sites to search. Then you grab the page at that URL, look for the RSS/Atom auto-discovery links, and go from there. Auto-discovery links can open a can of worms because the same content may be served in different formats (RDF, RSS and Atom), so you have to determine which to use, or there may be multiple feeds with alternate content listed.
If you mean you want to discover when an existing feed has new articles, then you have to keep track of the last time your code looked at the feed, and the last article that was seen, then retrieve the feed and see if any articles were not in your list of previously seen articles. Your code needs to be sensitive to the time-to-live information in a lot of feeds too. Hitting the feed every fifteen minutes when they update once a week is bad form. Most aggregation code can do those things already but you might need to configure a database and tell the code how to find it.
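The "previously seen articles" check can be sketched in a few lines of Ruby. This is only an illustration: entries are plain hashes, :id stands for whatever uniquely identifies an article in your feeds (a guid, a link, or a title plus date), and in a real application the set of seen ids would live in the database rather than in memory.

```ruby
require "set"

# Return the entries whose id has not been seen before and record them as
# seen. Set#add? returns nil for ids that are already in the set, so those
# entries are filtered out.
def new_entries(entries, seen_ids)
  entries.select { |entry| seen_ids.add?(entry[:id]) }
end

seen = Set.new(["guid-1"])
fetched = [
  { id: "guid-1", title: "Old article" },
  { id: "guid-2", title: "New article" }
]
p new_entries(fetched, seen).map { |entry| entry[:title] }
# prints ["New article"]
```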
Generally, for this sort of task I set up a crontab entry on a production Linux or Unix system and fire off the job periodically, looking in the database for feeds whose last-run-time plus the stored time-to-live value is in the past.
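The selection that such a cron job performs could be sketched as follows. The Feed struct and its field names (last_run_at, ttl_minutes) are hypothetical stand-ins for whatever columns the database actually has.

```ruby
# Hypothetical in-memory stand-in for a database row describing one feed.
Feed = Struct.new(:url, :last_run_at, :ttl_minutes)

# Select the feeds that should be fetched now: those never fetched before,
# or whose last run time plus time-to-live is in the past.
def due_feeds(feeds, now: Time.now)
  feeds.select do |feed|
    feed.last_run_at.nil? || feed.last_run_at + feed.ttl_minutes * 60 <= now
  end
end

now = Time.utc(2014, 2, 3, 12, 0, 0)
feeds = [
  Feed.new("http://a.example/rss", now - 3600, 30),   # due: TTL expired 30 minutes ago
  Feed.new("http://b.example/rss", now - 600, 10_080), # weekly TTL, fetched 10 minutes ago
  Feed.new("http://c.example/rss", nil, 60)           # never fetched
]
p due_feeds(feeds, now: now).map(&:url)
# prints ["http://a.example/rss", "http://c.example/rss"]
```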
Does that help any?
A very easy solution is to use dynamic attribute-based finders.
When you are filling your model with RSS feed data, instead of Model.create(...) use Model.find_or_create_by_column(value, :other_column => other_value).
You can use a date or the RSS message title as the unique value (whatever you want).
I think this is pretty easy. You can set up a cron task to fill your model once per hour, for example. Only new feed entries will be added.
There is no way to get an "event" when an RSS feed is updated without downloading the whole feed again.
