I can successfully pull different feeds using the Feedzirra gem and get feed updates. However, each feed that I'd like to pull has different content (e.g., GitHub public feed, Last.fm recently played, etc.).
What is the best way to go about combining all of these feeds into one? Right now I have different models for different types of feeds and some feeds use different timestamps than the others.
You could add multiple extra fields to hold each of the unique attributes in an uber-feed object, only filling in the ones that come from each particular feed at time of processing. (It's kind of like the NoSQL model in that way, though not quite, since you have to define the fields ahead of time, but you can add any arbitrary field as a data-holder.)
This is how you add a new field to all instances of a feed...
Feedzirra::Feed.add_common_feed_entry_element('my_custom_field', :as => :my_custom_field)
You'll find a little more dialog about this here...
https://groups.google.com/forum/?fromgroups#!msg/feedzirra/_h4y8_vwDGc/N8sjym6NouEJ
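As a rough illustration of the "uber-feed" idea, here is a minimal sketch in plain Ruby that maps entries from different sources onto one shared shape and sorts them on a single normalized timestamp. Everything in it is an assumption made for illustration: the UnifiedEntry struct, the NORMALIZERS table, the merge_feeds helper, and the raw field names (:updated, :played_uts, and so on) are hypothetical and are not part of Feedzirra or of any real feed.

```ruby
require 'time'

# Hypothetical unified record: shared fields plus a free-form :extras hash
# for attributes that only some feeds provide.
UnifiedEntry = Struct.new(:source, :title, :url, :published_at, :extras,
                          keyword_init: true)

# Each source maps its own field names onto the shared ones.
# The raw field names are illustrative, not taken from any real feed.
NORMALIZERS = {
  github: lambda { |raw|
    UnifiedEntry.new(source: :github, title: raw[:title], url: raw[:link],
                     published_at: Time.parse(raw[:updated]).utc,
                     extras: { author: raw[:author] })
  },
  lastfm: lambda { |raw|
    UnifiedEntry.new(source: :lastfm, title: raw[:track], url: raw[:url],
                     published_at: Time.at(raw[:played_uts]).utc,
                     extras: { artist: raw[:artist] })
  }
}.freeze

# Takes { source => [raw entries] } and returns one timeline, newest first.
def merge_feeds(raw_by_source)
  entries = raw_by_source.flat_map do |source, raws|
    raws.map { |raw| NORMALIZERS.fetch(source).call(raw) }
  end
  entries.sort_by(&:published_at).reverse
end
```

Converting every timestamp to UTC at the point of normalization means the differing timestamp formats only have to be handled once, in the per-source lambda, and the combined list can be sorted on one field.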
You are creating an activity feed -- here are several gems that you can research on how to create activity feeds: https://www.ruby-toolbox.com/categories/Rails_Activity_Feeds
Related
I'm building an app that gets a lot of data from a web service. The app consists of different entries that have relationships to each other. As an example, say I'm building a TV show tracking app: all the data comes from the web service, but I want to mark episodes as watched, which is the only custom property on any entry so far. All of this gets saved in Core Data. I have these entries:
Show ⇒ has many seasons and episodes
Season ⇒ has many episodes and one show
Episode ⇒ has one show and one season
The main part I'm currently struggling with is how I can best update all of these entries when the web service has an updated version of the data (maybe the show got a new season or some wrong data got fixed). At this point, the only custom property on these entries which differs from the data the web service provides is the watched attribute I created on the Episode entry.
So far I have tried different approaches, like removing the old data and just adding the new data (the custom watched attribute is a problem here). I also looked into merge policies like NSMergeByPropertyObjectTrumpMergePolicy, but this doesn't play nicely with relationships and I hit a roadblock there.
Is there a better way or a best practice for solving this?
I'm working on the search feature for my app and I would like to give users some options for sorting the recipes they view. I'm trying to build a "popular" sort, but I'm not entirely sure where to start or whether I need to make any modifications to my schema. I've read this post http://sorentwo.com/2013/12/30/let-postgres-do-the-work.html and get the gist of what it's doing. My main question is: how do I track when a page is viewed and use that in the calculations? And how do I track views over time?
Also, when using things like comments as a weight, is it better to count the number of comments dynamically (query the comment table and add them up) or to keep a column in the recipe table that is incremented whenever a comment is added?
I was browsing reddit for the answer to this and came across this conversation which lists out a bunch of search gems for rails, which is cool. But what I wanted was something where I could:
Enter: OMG Happy Cats
It searches the whole database for anything that contains OMG Happy Cats and returns an array of model objects that contain that value. I can then run them through Active Model Serializers (being able to use this is very important) to return a JSON object of search results, so you can display whatever you want to the user.
So that json object, if this was a blog, would have a post object, maybe a category object and even a comment object.
Everything I have seen is very specific to one controller and one model. That is nice and all, but I am after something more like "search for what you want, we will return what you want", which maybe grows smarter over time, like the searchkick gem, which also has the ability to offer spelling suggestions.
I am building this with an API, so it would be limited to everything that belongs to a blog object (as to make it not so huge of a search), so it would search things like posts, tags, categories, comments and pages looking for your term, return a json object (as described) and boom done.
Any ideas?
You'd be best off considering the underlying technology for this.
--
Third Party
As far as I know (I'm not super experienced in this area), the best way to search an entire Rails database is to use a third party system to "index" the various elements of data you require, allowing you to search them as required.
Some examples of this include:
Sunspot / Solr
ElasticSearch
Essentially, having one of these "third party" search systems gives you the ability to index the various records you want in a separate database, which you can then search with your application.
--
Notes
There are several advantages to handling "search" with a third party stack.
Firstly, it takes the load off your main web server - which means it'll be more reliable & able to handle more traffic.
Secondly, it will ensure you're able to search all the data of your application, instead of tying into a particular model / data set.
Thirdly, because many of these third party solutions index the content you're looking for, it will free up your database connectivity for your actual application, making it more efficient & scalable.
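To make the idea of "indexing" records in a separate store more concrete, here is a toy in-memory sketch in plain Ruby. It is not how Solr or ElasticSearch work internally and it uses neither of their APIs; the TinyIndex class and its add and search methods are hypothetical names invented for this illustration. It only shows the general shape: records of any model type are registered under searchable tokens, and a query returns matches grouped by type.

```ruby
# Toy in-memory stand-in for an external search index.
class TinyIndex
  def initialize
    # token => list of [type, id] pairs for records containing that token
    @postings = Hash.new { |hash, token| hash[token] = [] }
  end

  # type and id identify the source record; text is what should be searchable.
  def add(type, id, text)
    tokenize(text).uniq.each { |token| @postings[token] << [type, id] }
  end

  # Returns { type => [ids] } for records that contain every query token.
  def search(query)
    tokens = tokenize(query)
    return {} if tokens.empty?

    hits = tokens.map { |token| @postings.fetch(token, []) }.reduce(:&)
    hits.group_by(&:first).transform_values { |pairs| pairs.map(&:last) }
  end

  private

  # Lowercases the text and splits it into alphanumeric tokens.
  def tokenize(text)
    text.downcase.scan(/[a-z0-9]+/)
  end
end
```

For example, after adding a post and a comment that both mention "OMG Happy Cats", search('OMG Happy Cats') returns a hash keyed by :post and :comment, which the application could then use to load and serialize the matching records.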
For PostgreSQL you should be able to use pg_search.
I've never used it myself but going by the documentation on GitHub, it should allow you to do:
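# Search every model that has been declared multisearchable
documents = PgSearch.multisearch('OMG Happy Cats').to_a
# Map each search document back to the record it indexes
objects = documents.map(&:searchable)
# Group the records by model name, e.g. "posts", "comments"
groups = objects.group_by { |o| o.class.name.pluralize.downcase }
# Serialize each group with ActiveModel::ArraySerializer
json = Hash[groups.map { |k, v| [k, ActiveModel::ArraySerializer.new(v).as_json] }].as_json
puts json.to_json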
I want to be able to store media RSS and iTunes podcast RSS feeds in the database. The requirement here is that I don't want to miss ANY element or its attributes in the feed. It would make sense to find the most common elements in the feed and store them in the database as separate columns. The catch is that there can be feed-specific elements that are not standard. I want to capture them too. Since I don't know what they can be, I won't have a dedicated column for them.
Currently I have 2 tables called feeds and feed_entries. For RSS 2.0 tags like enclosures and categories, I have separate tables that have associations with feeds/feed_entries. I am using Feedzirra for parsing the feeds. Feedzirra requires us to know which elements in the feed we want to parse, so we would not know if a feed contains elements beyond what Feedzirra understands.
What would be the best way to go about storing these feeds in the database without missing a single bit of information? (Dumping the whole feed into the database as-is won't work, as we want to query most of the attributes.) What parser would be the best fit? Feedzirra was chosen for performance; however, getting all the data in the feed into the database is the priority.
Update
I'm using MySQL as the database.
I modeled my database on feeds and entries also, and cross-mapped the fields for RSS, RDF and Atom, so I could capture the required data fields as a starting point. Then I added a few others for tagging and my own internal-summarizations of the feed, plus some housekeeping and maintenance fields.
If you move from Feedzirra I'd recommend temporarily storing the actual feed XML in a staging table so you can post-process it using Nokogiri at your leisure. That way your HTTP process isn't bogged down processing the text, it's just retrieving content and filing it away, and updating the records for the processing time so you know when to check again. The post process can extract the feed information you want from the stored XML to store in the database, then delete the record. That means there's one process pulling in feeds periodically as quickly as it can, and another that basically runs in the background chugging away.
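The two-process split described above can be sketched in plain Ruby as follows. This is only an in-memory outline under stated assumptions: the FeedStaging class, the StagedFeed struct, and the fetcher callable are hypothetical names, an array stands in for the staging table, and the actual HTTP client and XML parsing are left to the caller.

```ruby
# Hypothetical staging row: the raw XML plus when it was retrieved.
StagedFeed = Struct.new(:url, :xml, :fetched_at)

class FeedStaging
  attr_reader :rows, :last_checked

  def initialize
    @rows = []          # stands in for the staging table
    @last_checked = {}  # url => Time of the last fetch
  end

  # Phase 1: retrieve and file away. `fetcher` is any callable that returns
  # the response body for a URL; no parsing happens here.
  def fetch(url, fetcher, now: Time.now)
    @rows << StagedFeed.new(url, fetcher.call(url), now)
    @last_checked[url] = now
  end

  # Phase 2: hand each staged document to the block (the post-processor).
  # A row is deleted only after the block returns without raising, so a
  # failed parse leaves the raw XML in place for another attempt.
  def process
    processed = 0
    until @rows.empty?
      yield @rows.first
      @rows.shift
      processed += 1
    end
    processed
  end
end
```

In a real application the two phases would run as separate processes or jobs against a database table; keeping the raw XML until post-processing succeeds is what allows the parsing rules to change without refetching.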
Also, both Typhoeus/Hydra and HTTPClient can handle multiple HTTP requests nicely and are easy to set up.
Store the XML as a CLOB. Most databases have XML processing extensions that allow you to include XPath-type queries as part of a SELECT statement.
Otherwise, if your DBMS does not support XML querying, use your language's XPath implementation to query the CLOB. You will probably need to extract certain elements into table columns for speedy querying.
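As a sketch of the second option, the following Ruby uses REXML's XPath support to pull a few known elements into "columns" while keeping every other child element, with its attributes, in a catch-all structure. The KNOWN_COLUMNS list and the split_items helper are hypothetical names chosen for this illustration, and the choice of REXML is an assumption; Nokogiri or another XPath-capable parser could fill the same role.

```ruby
require 'rexml/document'

# Elements that get their own table column; everything else is kept as extras.
KNOWN_COLUMNS = %w[title link pubDate].freeze

# Returns one hash per <item>: known elements under :columns, all other
# child elements (text plus attributes) under :extras, keyed by element name.
def split_items(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//item').map do |item|
    row = { columns: {}, extras: {} }
    item.elements.each do |el|
      name = el.expanded_name # includes any prefix, e.g. "itunes:author"
      if KNOWN_COLUMNS.include?(name)
        row[:columns][name] = el.text.to_s.strip
      else
        attrs = {}
        el.attributes.each { |key, value| attrs[key] = value }
        (row[:extras][name] ||= []) << { text: el.text.to_s.strip,
                                         attributes: attrs }
      end
    end
    row
  end
end
```

The :extras hash could be serialized (for example as JSON) into a single text column, so non-standard elements are preserved without needing a dedicated column for each one. Note that this sketch only looks at direct children of each item; nested elements would need a recursive walk.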
I am a newbie to Rails and I have been watching RailsCasts videos.
I would like to know a little more about Feedzirra (RailsCasts episode 168), and especially about feed parsing.
For example, I need to parse feeds from the Telegraph and the Guardian.
I want to put all the sports news from both newspapers in one table, just football news in another table, cricket news in another table, etc.
How can I achieve that using Feedzirra?
How do I display only football news in one view and only cricket news in another view?
Also, I want the user to know which website he is going to visit before he actually clicks the link and finds out.
Something like this
Ryder Cup 2010: Graeme McDowell the perfect hero for Europe
5 min ago | Telegraph.co.uk
How do I display Telegraph.co.uk?
Looking forward to your help and support
Thanks
There are many questions there, but I'll take this one:
"I just know how to put all feeds in table. I dont know how to keep feeds in different tables"
Create different models to suit your data model, based on what information you need to show rather than what is provided in the feed. (Use a different table for each model if required, or Single Table Inheritance if possible.)
Write a wrapper class that will use Feedzirra (or any other parser for that matter) to read the parsed feeds and process them. Such classes are generally kept in the lib folder.
Create a rake task which can be called to run this script OR if you are familiar with delayed_job, then create a job.
Schedule your rake task through cron or your job through delayed_job, so that you can periodically update your data.
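The wrapper class from the steps above might look roughly like the sketch below. It is plain Ruby with the parser injected as a callable, so it makes no claims about Feedzirra's API; the SportsFeedImporter class, the Entry struct, the SPORTS list, and the idea of classifying by the entry's category tags are all assumptions made for illustration. It also shows one way to derive the source label (such as telegraph.co.uk) from the entry URL.

```ruby
require 'uri'

# Hypothetical stand-in for a parsed feed entry.
Entry = Struct.new(:title, :url, :categories)

class SportsFeedImporter
  SPORTS = %w[football cricket golf].freeze

  # `parser` is any callable that takes a feed URL and returns entries;
  # in a real app it would wrap Feedzirra or another parser.
  def initialize(parser)
    @parser = parser
  end

  # Returns rows ready to be saved, each tagged with a sport and a source.
  def import(feed_urls)
    feed_urls.flat_map { |url| @parser.call(url) }.map do |entry|
      { title: entry.title, url: entry.url,
        sport: sport_for(entry), source: source_for(entry.url) }
    end
  end

  private

  # Picks the first known sport among the entry's categories, else 'other'.
  def sport_for(entry)
    tags = entry.categories.map(&:downcase)
    SPORTS.find { |sport| tags.include?(sport) } || 'other'
  end

  # Uses the link's host, minus a leading "www.", as the display source.
  def source_for(url)
    URI.parse(url).host.to_s.sub(/\Awww\./, '')
  end
end
```

With a sport value stored on each row, a football view and a cricket view can filter on that value, whether the rows live in one table or in separate ones, and the source value can be printed next to the timestamp. A rake task or delayed_job job would then just call import and save the returned rows.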