Parsing custom feed elements using FeedZirra - ruby-on-rails

Is there a way to parse a feed's custom elements? Not the feed entries' custom elements, but the feed's own. I know there is a way to do this for entries. Like,
Feedzirra::Feed.add_common_feed_entry_element("wfw:commentRss", :as => :comment_rss)
feed = Feedzirra::Feed.parse(some_atom_xml)
feed.entries.first.comment_rss # => wfw:commentRss is now parsed!
I want to be able to achieve the same for the feed object. Something like,
Feedzirra::Feed.add_common_feed_element("geo:lat", :as => :latitudes)
feed = Feedzirra::Feed.fetch_and_parse("somerss")
feed.latitudes # => 44.022448
Is there a way? Or does this require writing a patch for FeedZirra?

It's a bit late, but more people might be looking for an answer.
Putting the following line in a file in your config/initializers seems to work:
Feedzirra::Parser::RSS.element :latitudes
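If you also need the namespaced name and the :as mapping, the parser classes appear to accept the same options as the entry-level API. A minimal sketch, assuming the SAXMachine-backed parser classes Feedzirra ships (verify the class list and options against your Feedzirra version):
# config/initializers/feedzirra.rb
[Feedzirra::Parser::RSS, Feedzirra::Parser::Atom].each do |parser|
  parser.element 'geo:lat', :as => :latitudes
end

feed = Feedzirra::Feed.fetch_and_parse("somerss")
feed.latitudes # => "44.022448", once the feed carries a geo:lat element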

According to the new Feedjira documentation at http://feedjira.com/extending.html:
# Add the generator attribute to all feed types
Feedjira::Feed.add_common_feed_element('generator')
Feedjira::Feed.fetch_and_parse("http://www.pauldix.net/atom.xml").generator # => 'TypePad'
# Add some GeoRss information
Feedjira::Feed.add_common_feed_entry_element('geo:lat', :as => :lat)
Feedjira::Feed.add_common_feed_entry_element('geo:long', :as => :long)
Feedjira::Feed.fetch_and_parse("http://www.earthpublisher.com/georss.php").entries.each do |e|
  p "lat: #{e.lat}, long: #{e.long}"
end
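As with the Feedzirra initializer above, these add_common_* calls need to run before any feed is parsed, so a file under config/initializers is a sensible home for them.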

Related

Accessing HSTORE data in rails controller

So I set up a postgres server and have it working with hstore values.
As of right now, I have a table books, structured with
name:string data:hstore
I have created a sample entry to test:
Book.create(:name => "My First Book", :data => {'author' => 'Kevin', 'pages' => 368})
I have loaded the data into a variable:
@book = Book.where("data ? :key", :key => 'pages')
(just to test; I realize this query serves no real purpose...)
I print the data as JSON and this works fine: the entry is found and displayed. However, what I am trying to do is access a single hstore value, say pages. I did some research and found
@book.data['pages']
However, when I try to run this, I get
undefined method `data' for #<Book::ActiveRecord....
Any and all help is greatly appreciated!
Active Record's where gives you an array-like collection even if there is only one matching record.
You can do
@book = Book.where("data ? :key", :key => 'pages')[0]
to get that record
and then
@book.data
will work as desired.
If you might get multiple records and just using the first found is ok you could also use:
@book = Book.where("data ? :key", :key => 'pages').first
@book.data
or just
@book = Book.where("data ? :key", :key => 'pages').first.data
After fiddling around, I found that I simply needed to call:
@book[0].data
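Putting it together, a minimal controller sketch; the pages key and the JSON rendering are assumptions carried over from the question:
class BooksController < ApplicationController
  def show
    # where returns a relation; .first unwraps the single record
    @book = Book.where("data ? :key", :key => 'pages').first
    render :json => { :name => @book.name, :pages => @book.data['pages'] }
  end
end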

How to stream large xml in Rails 3.2?

I'm migrating our app from 3.0 to 3.2.x. Earlier, streaming was done by assigning a proc to response_body, like so:
self.response_body = proc do |response, output|
  target_obj = StreamingOutputWrapper.new(output)
  lib_obj.xml_generator(target_obj)
end
As you can imagine, the StreamingOutputWrapper responds to <<.
This way is deprecated in Rails 3.2.x. The suggested way is to assign an object that responds to each.
The problem I'm facing now is in making the lib_obj.xml_generator each-aware.
The current version of it looks like this:
def xml_generator(target, conditions = [])
  builder = Builder::XmlMarkup.new(:target => target)
  builder.root do
    builder.elementA do
      Model1.find_each(:conditions => conditions) { |model1| target << model1.xml_chunk_string }
    end
  end
end
where target is a StreamingOutputWrapper object.
The question is: how do I modify the code, i.e. the xml_generator and the controller code, so that the response XML streams properly?
Important stuff: Building the xml in memory is not an option as the model records are huge. The typical size of the xml response is around 150MB.
What you are looking for is SAX parsing. SAX reads files a chunk at a time instead of loading the whole file into the DOM. This is super convenient, and fortunately there are a lot of people before you who have wanted to do the same thing. Nokogiri offers XML::SAX methods, but between the sparse documentation and the messy syntax it can get really confusing. I would suggest looking into something that sits on top of Nokogiri and makes getting your job done a lot simpler.
Here are a few options -
SAX_stream:
Mapping out objects in sax_stream is super simple:
require 'sax_stream/mapper'

class Product
  include SaxStream::Mapper

  node 'product'
  map :id,             :to => '#id'
  map :status,         :to => '#status'
  map :name_confirmed, :to => 'name/#confirmed'
  map :name,           :to => 'name'
end
and calling the parser is also simple:
require 'sax_stream/parser'
require 'sax_stream/collectors/naive_collector'

collector = SaxStream::Collectors::NaiveCollector.new
parser = SaxStream::Parser.new(collector, [Product])
parser.parse_stream(File.open('products.xml'))
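Once parse_stream returns, the naive collector simply accumulates every mapped object in memory; the accessor below is the one I recall from the sax_stream README, so verify it against your gem version:
# each element is a Product built from a <product> node
collector.mapped_objects.each { |product| p product }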
However, working with the collectors (or writing your own) can end up being slightly confusing, so I would actually go with:
Saxerator:
Saxerator gets the job done and has some really handy methods for traversing into nodes, which can be a little less complex than sax_stream. Saxerator also has a few really great, well-documented configuration options. A simple Saxerator example below:
parser = Saxerator.parser(File.new("rss.xml"))

parser.for_tag(:item).each do |item|
  # where the xml contains <item><title>...</title><author>...</author></item>
  # item will look like {'title' => '...', 'author' => '...'}
  puts "#{item['title']}: #{item['author']}"
end
# a String is returned here since the given element contains only character data
puts "First title: #{parser.for_tag(:title).first}"
If you end up having to pull the XML from an external source (or it is updated frequently and you don't want to update the copy on your server manually), check out THIS QUESTION and the accepted answer; it works great.
You could always monkey-patch the response object:
response.stream.instance_eval do
  alias :<< :write
end
builder = Builder::XmlMarkup.new(:target => response.stream)
...
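Alternatively, since Rails 3.2 only requires the response body to respond to each, you can avoid patching entirely by wrapping the generator in an Enumerator. A sketch, assuming the xml_generator from the question (Enumerator::Yielder already responds to <<):
def index
  headers['Content-Type'] = 'application/xml'
  self.response_body = Enumerator.new do |yielder|
    # yielder can stand in for the old StreamingOutputWrapper target
    lib_obj.xml_generator(yielder)
  end
end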

Rails: Request-based (& Database-based) Routing

I am trying to get rid of some scope-prefixes I am currently using in my app.
At the moment my Routes look like this (simplified example):
scope 'p' do
  get ':product_slug', as: :product
end

scope 't' do
  get ':text_slug', as: :text
end
which for example generates these paths:
/p/car
/t/hello-world
Now I want the paths to work without the prefixed letters (p & t). So I restrict the slugs to the existing database entries (which btw works great):
text_slugs = Text.all.map(&:slug)
get ':text_slug', as: :text, text_slug: Regexp.new("(#{text_slugs.join('|')})")

product_slugs = Product.all.map(&:slug)
get ':product_slug', as: :product, product_slug: Regexp.new("(#{product_slugs.join('|')})")
The problem:
This is a multi-tenant app, which means that someone's text_slug could be another one's product_slug and vice versa. That's why I have to filter the slugs by the current site (by domain).
A solution would look like this:
text_slugs = Site.find_by_domain(request.host).texts.all.map(&:slug)
get ':text_slug', as: :text, text_slug: Regexp.new("(#{text_slugs.join('|')})")
But request isn't available in routes.rb, and everything I tried didn't work.
The direct call to Rack::Request needs the correct env variable, which doesn't seem to be present in Application.routes; otherwise this could work:
req = Rack::Request.new(env)
req.host
I really tried a lot and am thankful for any hint!
You may be able to use advanced constraints for this: http://guides.rubyonrails.org/routing.html#advanced-constraints.
class SlugConstraint
  def initialize(type)
    @type = type
  end

  def matches?(request)
    # Find the user's subdomain and look for matching slugs - return true or false
  end
end
App::Application.routes.draw do
  match ':product_slug' => "products#index", :constraints => SlugConstraint.new(:product)
  match ':tag_slug' => "tags#index", :constraints => SlugConstraint.new(:tag)
end
BTW - You may run into problems with testing, but that's another issue...
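A sketch of what matches? could look like for the multi-tenant case; Site.find_by_domain, texts, and products come from the question, the rest is an assumption (and note this runs a query on every request):
def matches?(request)
  site = Site.find_by_domain(request.host)
  return false unless site
  slug = request.path.sub(%r{^/}, '')
  # pick the collection this constraint instance is responsible for
  scope = @type == :product ? site.products : site.texts
  scope.where(:slug => slug).exists?
end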

CanCan and Mongoid "or" do not play nice

I want to get all activities that a user owns or has created, so I combine the two conditions using or:
Activity.or(owner_id: 123).or(creator_id: 123).selector
# => {"$or"=>[{"owner_id"=>123}, {"creator_id"=>123}]}
Now I try to use CanCan on that.
Activity.accessible_by(current_ability)
# => {"$or"=>[{"privacy"=>"public"}, {"user_id"=>"51091cc977bb1eb27a000003"}]}
CanCan creates an or selector by default, as there is more than one rule in the ability.
It would be intuitive now to do just this:
Activity.or(owner_id: 123).or(creator_id: 123).accessible_by(current_ability)
# => {"$or"=>[{"owner_id"=>123}, {"creator_id"=>123}, {"privacy"=>"public"}, {"user_id"=>"51092b9777bb1ec385000003"}]}
But this merges both $or arrays into one, which is not what I want, so I did the following:
Activity.or(criteria).and(Activity.accessible_by(current_ability).selector).desc(:created_at)
# => {"$or"=>[{"owner_id"=>123}, {"creator_id"=>123}], "$and"=>[{"$or"=>[{"privacy"=>"public"}, {"user_id"=>"51092b9777bb1ec385000003"}]}]}
But this seems a bit unclean. Any idea on how to beautify this? Thank you.
PS: An afterthought: Should accessible_by not always return a {'$and' => [{'$or' => [...]}]} instead of only a {'$or' => [...]}?
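If the explicit selector merge feels unclean, one option is to hide it behind a small class-level helper. A sketch, where involved_in is an invented name and the chaining assumes Mongoid's and accepts a raw selector hash:
class Activity
  # Hypothetical helper: records the user owns or created,
  # restricted by the CanCan ability, newest first.
  def self.involved_in(user_id, ability)
    where('$or' => [{ :owner_id => user_id }, { :creator_id => user_id }]).
      and(accessible_by(ability).selector).
      desc(:created_at)
  end
end

Activity.involved_in(123, current_ability)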

Rails 3 - Importing Multiple Lines from text_area

What's the best approach to import multiple lines from a text_area in a form?
I've tried a quick bodge using FasterCSV but get a NoMethodError:
undefined method `pos' for {"name"=>"Carrots\r\nPeas\r\nRed Onion"}
def create
  FasterCSV.parse(params[:ingredient], {:headers => false, :quote_char => '"', :col_sep => ','}).each do |row_data|
    new_record = Ingredient.new('name' => row_data[0])
    new_record.save
  end
end
I want to apply the final thing to a model with multiple columns, hence the col_sep.
If you want to use FasterCSV.parse on single lines, you need to split them out first.
Split the multi-line data first:
params[:ingredient][:name].split(/\r?\n/).each do |line| # split on newlines, not all whitespace
  FasterCSV.parse(line, { ... options ... }).each do |row_data|
    ... etc ...
I might use parse_line instead, to communicate explicitly that I'm working on a single line.
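A sketch of that parse_line variant, still assuming the single name column from the question (parse_line returns one row as an array, or nil for an empty line):
def create
  params[:ingredient][:name].split(/\r?\n/).each do |line|
    row = FasterCSV.parse_line(line, :col_sep => ',')
    Ingredient.create('name' => row[0]) if row && row[0]
  end
end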
