How to get videos from rss feed entries - ruby-on-rails

I am trying to get video URLs from a feed entry's URL.
I am using Feedjira and MetaInspector in my application to fetch and store articles along with their images. Now I also want to store an article's videos, if it has any. Can anyone tell me the best way to store videos from articles?
Thank you.

I do this in my project to save all the URLs found in RSS feeds:
Source.all.each do |source|
  feed = Feedjira::Feed.fetch_and_parse(source.url)
  feed.entries.each do |entry|
    unless Link.exists? url: entry.url
      Link.create!(title: entry.title,
                   url: entry.url)
    end
  end
end
In my snippet I only save the URL and title; for videos you just need to add entry.video.
You can see all the available entry tags by inspecting the feed.entries object or the RSS itself.
If you want to add another attribute, e.g. media:thumbnail, add the code below before calling fetch_and_parse. Call it only once, not every time you call fetch_and_parse, to avoid a memory leak:
Feedjira::Feed.add_common_feed_entry_element("media:thumbnail", :value => :url, :as => :pic)
Then you can call entry.pic to get the thumbnail URL.
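To know which elements to map, it helps to see where feeds usually put video. The sketch below does not use Feedjira; it uses Ruby's bundled REXML, and it assumes the video is published as an `<enclosure>` or `<media:content>` child of the item with either a `video/*` type or `medium="video"`. The `video_urls` helper and the sample item are hypothetical.

```ruby
require 'rexml/document'

# Hypothetical helper: returns the URLs of video attachments found directly
# under a single RSS <item>. Assumes the feed uses <enclosure> or
# <media:content>; other feeds may embed video differently.
def video_urls(item_xml)
  doc = REXML::Document.new(item_xml)
  urls = []
  doc.root.each_element do |el|
    next unless %w(enclosure media:content).include?(el.expanded_name)
    type = el.attributes['type'].to_s
    medium = el.attributes['medium'].to_s
    urls << el.attributes['url'] if type.start_with?('video/') || medium == 'video'
  end
  urls.compact.uniq
end

SAMPLE_ITEM = <<~XML
  <item xmlns:media="http://search.yahoo.com/mrss/">
    <title>Example</title>
    <enclosure url="http://example.com/a.mp4" type="video/mp4" length="1"/>
    <enclosure url="http://example.com/a.jpg" type="image/jpeg" length="1"/>
    <media:content url="http://example.com/b.webm" medium="video"/>
  </item>
XML

puts video_urls(SAMPLE_ITEM)
```

If a feed does use one of these elements, the same element name is what you would pass to add_common_feed_entry_element.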

Related

How to use grackle gem to find hashtags in rails 3.2

I am using the grackle gem to get tweets on Rails 3. I am able to get the tweets and show them in the view. Now I want to search for hashtags using this gem. Is that possible? If so, please tell me how. I have already registered with the Twitter API and got the secret token, and the search call returns the expected output when I use the Twitter API directly, but I don't know how to implement it in my working code.
my tweet.rb file
class Tweet < ActiveRecord::Base
  require 'grackle'
  require 'twitter'
  require "rubygems"
  MY_APPLICATION_NAME = "mike1011"
  attr_accessible :content, :created
  #### Connect to the Twitter API and pull down the latest tweets
  def self.get_latest
    ## the call to the Twitter API works, but how do I modify this further to use the search API?
    tweets = client.search.home_timeline? :screen_name => MY_APPLICATION_NAME # hit the API
    ## this works on the Twitter console
    ## https://api.twitter.com/1.1/search/tweets.json?q=%23haiku
  end
  private
  def self.client
    Grackle::Client.new(:auth=>{
      :type=>:oauth,
      :consumer_key=>'xxxxxxxxxxxxxxx',
      :consumer_secret=>'xxxxxxxxxxxxxxx',
      :token=>"2227985911-xxxxxxxxxxxxxxxxxx",
      :token_secret=>"xxxxxxxxxxxxxxxxxxx"
    })
  end
end
My view file, where I want the user to search for hashtags using the search button and the value entered in the text box:
<fieldset style="height:auto;width:auto">
<%= text_field_tag 'Search', 'Enter your hashtags'%>
<%= button_tag(type: 'button') do
content_tag(:strong, 'Search in twitter!',:id=>"search_button")
end
%>
</fieldset>
Yes, it is possible. If you are not going to save the tweets to a database, it is better to either create an API that you can call from anywhere or put it all inside a controller.
A benefit of doing so is that you can then use parameters.
Take a look at my code: https://github.com/eflores1975/stock-dashboard/blob/master/app/controllers/tweets_controller.rb
Basically, it loads the tweets into an array, then uses ActiveSupport::JSON.decode to convert them into a consumable object that I massage to suit my needs.
Finally, I can use it as:
http://gentle-tor-3942.herokuapp.com/tweets/get_tweets?symbol=$aapl
where symbol=$aapl is the tag that I am looking for.
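Whichever client you use, the hashtag has to be percent-encoded before it goes into the query string, which is why the working console URL in the question contains %23haiku. A minimal sketch using only Ruby's standard library; the `hashtag_search_url` helper is hypothetical, and the endpoint is the one quoted in the question:

```ruby
require 'uri'

# Hypothetical helper: builds the search URL for a hashtag taken from user
# input, adding the leading '#' if the user left it out.
def hashtag_search_url(tag)
  tag = "##{tag}" unless tag.start_with?('#')
  query = URI.encode_www_form(q: tag)   # '#' becomes %23
  "https://api.twitter.com/1.1/search/tweets.json?#{query}"
end

puts hashtag_search_url('haiku')
puts hashtag_search_url('#haiku')
```

Both calls print the same URL, so the value typed into the search box can be passed through as-is.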

Nokogiri parsing multiple XML feeds at once and sort by date

I am using Rails and Nokogiri to parse some XML feeds.
I have parsed one XML feed, and I want to parse multiple feeds and sort the items by date. They are Wordpress feeds so they have the same structure.
In my controller I have:
def index
  doc = Nokogiri::XML(open('http://somewordpressfeed'))
  @content = doc.xpath('//item').map do |i|
    {'title' => i.xpath('title').text, 'url' => i.xpath('link').text, 'date' => i.xpath('pubDate').text.to_datetime}
  end
end
In my view I have:
<ul>
  <% @content.each do |l| %>
    <li><%= l['title'] %> ( <%= time_ago_in_words(l['date']) %> )</li>
  <% end %>
</ul>
The code above works as it should. I tried to parse multiple feeds and got a 404 error:
feeds = %w(wordpressfeed1, wordpressfeed2)
docs = feeds.each { |d| Nokogiri::XML(open(d)) }
How do I parse multiple feeds and add them to a Hash like I do with one XML feed? I need to parse about fifty XML feeds at once on page load.
I'd write it all differently.
Try changing index to accept an array of URLs, then loop over them with flat_map, which concatenates the per-feed results into a single array that you return:
def index(*urls)
  urls.flat_map do |u|
    doc = Nokogiri::XML(open(u))
    doc.xpath('//item').map do |i|
      {
        'title' => i.xpath('title').text,
        'url' => i.xpath('link').text,
        'date' => i.xpath('pubDate').text.to_datetime
      }
    end
  end
end
@content = index('url1', 'url2')
It'd be more Ruby-like to use symbols instead of strings for your hash keys:
{
:title => i.xpath('title').text,
:url => i.xpath('link').text,
:date => i.xpath('pubDate').text.to_datetime
}
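The question also asks for the combined items to be sorted by date. A minimal sketch, assuming each item is a hash with a :date key holding a Time (or any comparable date object); the `newest_first` helper and the sample data are hypothetical:

```ruby
require 'time'

# Hypothetical helper: accepts either one flat array of items or one array
# per feed, and returns a single list ordered newest first.
def newest_first(per_feed_items)
  per_feed_items.flatten.sort_by { |item| item[:date] }.reverse
end

feed_a = [{ :title => 'A1', :date => Time.rfc2822('Mon, 02 Jan 2012 10:00:00 +0000') }]
feed_b = [{ :title => 'B1', :date => Time.rfc2822('Tue, 03 Jan 2012 10:00:00 +0000') },
          { :title => 'B2', :date => Time.rfc2822('Sun, 01 Jan 2012 10:00:00 +0000') }]

sorted = newest_first([feed_a, feed_b])
puts sorted.map { |item| item[:title] }.join(', ')   # B1, A1, B2
```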
Also:
feeds = %w(wordpressfeed1, wordpressfeed2)
docs = feeds.each { |d| Nokogiri::XML(open(d)) }
each is the wrong iterator. You want map instead, which will return all the parsed DOMs, assigning them to docs.
This won't fix the 404 error, which is a bad URL, and is a different problem. You're not defining your array correctly:
%w(wordpressfeed1, wordpressfeed2)
should be:
%w(wordpressfeed1 wordpressfeed2)
or:
['wordpressfeed1', 'wordpressfeed2']
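Both points are easy to check in irb. The snippet below uses only core Ruby and placeholder strings:

```ruby
# %w splits on whitespace only, so a comma becomes part of the element.
with_commas = %w(wordpressfeed1, wordpressfeed2)
without_commas = %w(wordpressfeed1 wordpressfeed2)
p with_commas      # ["wordpressfeed1,", "wordpressfeed2"]
p without_commas   # ["wordpressfeed1", "wordpressfeed2"]

# each returns the original array; map returns what the block produced.
feeds = %w(a b)
from_each = feeds.each { |f| f.upcase }
from_map = feeds.map { |f| f.upcase }
p from_each        # ["a", "b"]
p from_map         # ["A", "B"]
```

The trailing comma in "wordpressfeed1," is what ends up in the requested URL and produces the 404.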
EDIT:
I was revisiting this page and noticed:
I need to parse about fifty XML feeds at once on page load.
This is completely, absolutely, the wrong way to go about handling the situation when dealing with grabbing data from other sites, especially fifty of them.
WordPress sites typically have a news (RSS or Atom) feed. There should be a parameter in the feed stating how often it's OK to refresh it. HONOR that interval and don't hit their site more often than that, especially when you are tying your load to an HTML page load or refresh.
There are many reasons why, but it breaks down to "just don't do it", lest you get banned. If nothing else, it'd be trivial to commit a DoS attack on your site using web-page refreshes, and it'd be beating on their sites as a result, neither of which is being a good web developer on your part. You protect yourself first, and they inherit from that.
So, what do you do when you want to get fifty sites and have fast response and not beat up other sites? You cache the data in a database, and then read from that when your page is loaded or refreshed. And, in the background you have another task that fires off periodically to scan the other sites, while honoring their refresh rates.
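The caching idea can be sketched in plain Ruby. This is an in-memory illustration only: in a Rails application the cached items would live in a database table and the refresh would run from a background job or scheduled task, as described above. The `FeedCache` class, its fixed `ttl_seconds`, and the injectable clock are all assumptions made for the example.

```ruby
# Hypothetical in-memory cache: the block that actually fetches a feed is
# called only when the cached copy is older than ttl_seconds.
class FeedCache
  Entry = Struct.new(:items, :fetched_at)

  def initialize(ttl_seconds, clock = -> { Time.now })
    @ttl = ttl_seconds
    @clock = clock      # injectable so the behaviour can be tested
    @entries = {}
  end

  # Returns the cached items for url, refetching through the block if stale.
  def fetch(url)
    entry = @entries[url]
    now = @clock.call
    if entry.nil? || now - entry.fetched_at >= @ttl
      entry = Entry.new(yield(url), now)
      @entries[url] = entry
    end
    entry.items
  end
end

cache = FeedCache.new(15 * 60)
items = cache.fetch('http://example.com/feed') { |url| ["items parsed from #{url}"] }
again = cache.fetch('http://example.com/feed') { |url| raise 'not reached while fresh' }
puts items == again
```

A real implementation would take the interval from each feed rather than using one fixed value for all of them.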

How can I handle youtube's thumbnail in rails?

If a user types a YouTube URL in the body and saves the record,
I'd like to show its thumbnail in the view.
Are there any good techniques or gems for that?
Sample URL as typed in:
http://www.youtube.com/watch?v=qrO4YZeyl0I
Are you parsing the video ID code from the URL? I.e. in your example it'd be qrO4YZeyl0I.
Once you have this you can do anything you want with it. There are four thumbnails generated for each video.
http://img.youtube.com/vi/qrO4YZeyl0I/0.jpg
http://img.youtube.com/vi/qrO4YZeyl0I/1.jpg
http://img.youtube.com/vi/qrO4YZeyl0I/2.jpg
http://img.youtube.com/vi/qrO4YZeyl0I/3.jpg
To simply select the default thumbnail for the video use:
http://img.youtube.com/vi/qrO4YZeyl0I/default.jpg
See this answer for more detail - How do I get a YouTube video thumbnail from the YouTube API?
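Pulling the ID out of the URL and building the thumbnail address needs nothing beyond Ruby's standard library. A minimal sketch that handles only the watch?v= form shown in the question; the helper names are hypothetical, and other URL shapes (short links, embeds) would need extra cases:

```ruby
require 'uri'

# Hypothetical helper: returns the value of the "v" query parameter for a
# youtube.com watch URL, or nil if the URL does not have that shape.
def youtube_video_id(url)
  uri = URI.parse(url)
  return nil unless uri.host.to_s =~ /(\A|\.)youtube\.com\z/ && uri.query
  URI.decode_www_form(uri.query).to_h['v']
rescue URI::InvalidURIError
  nil
end

# Hypothetical helper: builds a thumbnail URL in the format listed above.
def youtube_thumbnail_url(url, variant = 'default')
  id = youtube_video_id(url)
  id && "http://img.youtube.com/vi/#{id}/#{variant}.jpg"
end

puts youtube_thumbnail_url('http://www.youtube.com/watch?v=qrO4YZeyl0I')
# http://img.youtube.com/vi/qrO4YZeyl0I/default.jpg
```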
I needed YouTube thumbnails recently, so here is an update. Currently the URLs look like:
http://i1.ytimg.com/vi/video_id/default.jpg
http://i1.ytimg.com/vi/video_id/mqdefault.jpg
http://i1.ytimg.com/vi/video_id/hqdefault.jpg
http://i1.ytimg.com/vi/video_id/sddefault.jpg
http://i1.ytimg.com/vi/video_id/1.jpg
http://i1.ytimg.com/vi/video_id/2.jpg
http://i1.ytimg.com/vi/video_id/3.jpg
You can parse the URL and fetch the thumbnail from YouTube before saving the model:
Gemfile:
gem 'youtube_it'
Model:
before_save :get_youtube_thumbnail
def get_youtube_thumbnail
  url = extract_url_from_body
  unless url.blank?
    client = YouTubeIt::Client.new
    response = client.video_by(url)
    self.thumbnail = response.thumbnails.first.url
  end
end
def extract_url_from_body
  URI.extract(body).first
end
View:
<%= image_tag model.thumbnail, alt: model.title %>

XML feed into Rails object

I'm doing some work with Adcourier. They send me an xml feed with some job data, i.e. job_title, job_description and so on.
I'd like to provide them with a url in my application, i.e. myapp:3000/job/inbox. When they send their feed to that URL, it takes the data and stores it in my database on a Job object that I already created.
What's the best way to structure this? I'm quite new to MVC and I'm not sure where something like this would fit.
How can I get an action to interpret the XML feed from an external source? I use Nokogiri to handle local XML documents, but never ones from a feed.
I was thinking about using http://api.rubyonrails.org/classes/ActionDispatch/Request.html#method-i-raw_post to handle the post. Does anyone have any thoughts on this?
In your jobs controller, add an inbox action that gets the correct parameter(s) from the POST request and saves them (or does whatever else you need with them).
def inbox
  # Xml::ParseStuff is a made-up placeholder; substitute your XML parser of choice
  data = Xml::ParseStuff(params[:data])
  job = Job.new(:title => data[:title], :description => data[:description])
  # save returns false when validation fails (create always returns an object)
  if job.save
    render :text => "Thanks!"
  else
    render :text => "Data was not valid :("
  end
end
Next, set up your routes.rb to send POST requests for that URL to the correct location:
resources :jobs do
  collection do
    post 'inbox'
  end
end
Note that I just made up the XML parsing part here; search around a bit to find the best solution or gem for parsing your request.
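As one possible stand-in for the made-up parsing step, here is a sketch using Ruby's bundled REXML. The element names job_title and job_description come from the question; the surrounding `<job>` root and the `parse_job_xml` helper are assumptions, since the actual Adcourier format is not shown. In a controller, the XML string would come from request.raw_post, which the question mentions.

```ruby
require 'rexml/document'

# Hypothetical helper: extracts the two fields named in the question from
# an XML string. A missing element yields nil instead of raising.
def parse_job_xml(xml)
  doc = REXML::Document.new(xml)
  {
    :title => doc.elements['//job_title']&.text,
    :description => doc.elements['//job_description']&.text
  }
end

sample = <<~XML
  <job>
    <job_title>Ruby Developer</job_title>
    <job_description>Builds Rails apps</job_description>
  </job>
XML

p parse_job_xml(sample)
```

The returned hash has the same keys the inbox action reads, so it could be passed to Job.new directly.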

How to upload an image to S3 using paperclip gem

For the life of me I can't understand how the basic paperclip example works. There's only one line included in the controller, and that's
@user = User.create( params[:user] )
I simply don't understand how that's all that is needed to upload an image to S3. I've changed the example quite a bit because I wanted to use jQuery file uploader rather than the default Rails form helper, so I'm at the point where an image is being POSTed to my controller, but I can't figure out how I'm supposed to take the image from the params and assign it as an attachment. Here's what I'm seeing in the logs:
Parameters: {"files"=>[#<ActionDispatch::Http::UploadedFile:0x132263b98 @tempfile=#<File:/var/folders/5d/6r3qnvmx0754lr5t13_y1vd80000gn/T/RackMultipart20120329-71039-1b1ewde-0>, @headers="Content-Disposition: form-data; name=\"files[]\"; filename=\"background.png\"\r\nContent-Type: image/png\r\n", @content_type="image/png", @original_filename="background.png">], "id"=>"385"}
My JS is very simple:
$('#fileupload').fileupload({
  dataType: 'json',
  url: '/my_url',
  done: function (e, data) {
    console.log('done');
  }
});
What would be helpful for me to know is how I can strip the file data from the POSTed parameters given above and pass it to paperclip. I'm sure that I'll have to assign the attachment attribute a value of File.open(...), but I don't know what the source of my file is.
I've spent a ridiculous amount of time trying to figure this out and I can't seem to get it. I've tried uploading directly to S3, but the chain of events was terribly confusing, so I want to get this simple pass-through example completed first. Thanks so much for any help you can give!
You need a few more pieces and it will help if you can show the exact code you're using.
Paperclip can post to S3 by using:
http://rubydoc.info/gems/paperclip/Paperclip/Storage/S3
When your controller creates a User model, it is sending along all the params. This is called "mass assignment" (be sure to read about attr_accessible).
When your model receives the params, it uses the Paperclip AWS processor, which uploads it.
You need the AWS gem, a valid bucket on S3, and a config file.
Try this blog post and let us know if it helps you:
http://blog.trydionel.com/2009/11/08/using-paperclip-with-amazon-s3/
UPDATE 2013-04-03: Please see Chloe's comment below; you may need an additional parameter, and the blog post may be outdated.
If you want to do it manually, approach it like this:
# To get the contents of the POST request with the photo,
# you need to read the contents of the request
upload = params[:file].is_a?(String)
file_name = upload ? params[:file] : params[:file].original_filename
extension = file_name.split('.').last
# We have to create a temp file which is going to be used by Paperclip for
# its upload
tmp_file_path = "#{Rails.root}/tmp/file.#{extension}"
file_id = 0
# If a file with that name already exists, generate a unique path
while File.exist?(tmp_file_path) do
  tmp_file_path = "#{Rails.root}/tmp/file#{file_id}.#{extension}"
  file_id += 1
end
# Write the file from the POST request to the unique location
File.open(tmp_file_path, 'wb') do |f|
  if upload
    f.write request.body.read
  else
    f.write params[:file].read
  end
end
# Now that the file is saved in a temp location, we can hand it to Paperclip
# as if it were a normal file upload
@photo = Photo.new :photo => File.open(tmp_file_path)
# Report success or failure; you could instead return JavaScript that puts the
# thumbnail in the HTML, or whatever else you want to do with it
if @photo.save
  render :text => "Success"
else
  render :text => @photo.errors.full_messages.join(", ")
end
You can rewrite your create or whatever you use as the url to which you are POSTing the form.
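As an alternative to generating the unique path by hand, Ruby's standard Tempfile class picks a unique name for you and can keep the file extension. This is a swapped-in technique rather than what the answer above does; the `write_to_tempfile` helper is hypothetical.

```ruby
require 'tempfile'

# Hypothetical helper: writes raw upload data to a uniquely named temp file
# that keeps the given extension, and returns the open file rewound to the
# start so it can be read or handed on.
def write_to_tempfile(data, extension)
  file = Tempfile.new(['upload', ".#{extension}"])
  file.binmode
  file.write(data)
  file.rewind
  file
end

first = write_to_tempfile('fake image bytes', 'png')
second = write_to_tempfile('other bytes', 'png')
puts File.extname(first.path)       # .png
puts first.path != second.path      # true
```

Keeping the extension can matter when the file type is inferred from the file name.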
This bit:
"files"=>[#<ActionDispatch::Http::UploadedFile:0x132263b98 @tempfile=#<File:/var/folders/5d/6r3qnvmx0754lr5t13_y1vd80000gn/T/RackMultipart20120329-71039-1b1ewde-0>
is the part (I think) that holds the file contents that are posted in the form.
In Rails, the User model will have a helper: has_attached_file
Passing params[:user] to the User.create method allows that helper to pick up the file contents, do any processing on them (e.g. resizing, based on attributes supplied to the helper) and then push the image(s) to your storage (e.g. S3; the S3 credentials are passed to the helper).
Hopefully that answers the "how does it do it?" question.
Regarding the jQuery part: I'm not sure what the code should be there, but why not use the Rails form with :remote => true and handle the response in jQuery?

Resources