Store ruby Mail (from gem) object in ActiveRecord - ruby-on-rails

I'm currently implementing a very basic IMAP client into an application I'm building in Rails. I'm using the Mail gem which supplies lots of useful ways of parsing the imap data.
I'd like to store the Mail object that it's generating in the database. Is that possible?
i.e.
email = Email.new
email.uid = id
email.mail = Mail.new(imap.fetch(id, "RFC822")[0]["attr"]["RFC822"]
email.save
It's a convenience thing where I don't want to have to download the object again unless I have to since performance on the IMAP call is slow, but I'd like to be able to have it there to look back on (and do any breaking down I needed to later).
I could then call
email.find(x).mail.body
and various other useful things without having to build out that functionality in my own email model.
Q1: How would I set up the active record model?
Q1a: Would I be better off doing something that excluded the attachments to make it an easier object to store? (is that even possible?)
Appreciate your help,

Several database schemata have been developed to store mail. I've worked on one, and there are others. Believe me, it's hard work. The result can be very useful, but since your question doesn't focus on the result I suspect it's not worthwhile in your case.
You might find it easier to use a json library to write your object graph to a file with an automatically inferred structure, as most json libraries seem to support these days. That won't let you do as much, but it's very much easier and lets you store both completely and incompletely retrieved messages. If you haven't fetched a particular body part, the json library will just write a null for that field.

It depends on what you want to do with the stored mails. If you need only specific parts of the mail to be easily accessible trough the database you won't need a complex setup like archiveopteryx, which basically maps a complete representation of emails to relational database tables. In most cases though you won't need that much detail and it will be totally perfect to use a simple data model.
A1: rails g model Email from to subject date:datetime message_id body. this are just the basic parts, should get you started.
A1a: You don't need to store the attachments if you don't want to. If you need them, you'll probably be better off not storing them in the database itself. Attachments are just like uploads so there are plenty of gems that can help you do that (https://www.ruby-toolbox.com/categories/rails_file_uploads).

Using posgres jsonb columns, you can store the email as json, in my case I disregard the attachments (which I store the reference to and retrieve as and when required).
This works pretty well with the Mail gem.

Related

Ruby on Rails. Using Google Client API to parse emails

I am new to Ruby and have a question about how to do a specific task on Rails.
I have a list of inventory and each item has a specific Stock ID that is emailed to my personal Gmail account. I want my web application to listen for emails from a specific email account. When my gmail receives an email from that specific account I want my application to parse it for a couple of fields and insert the stock ID into my database.
For example:
Let's say my database has an item with style code: A5U31 and size:10.
The email will say something like item with style code: A5U31 and size:10 has Stock ID:329193020.
I want my Rails application to search the database for an entry with that specific style code and size, and when it finds an entry to simply insert the stock ID into the row.
I am trying to using the Google-API-Client gem to this, but I am struggling, because I am still a beginner. So far I have done this quick-start guide to authenticate my gmail account with my rails app.
https://developers.google.com/gmail/api/quickstart/ruby?authuser=2
If someone could help me figure out how to write this sort of code and where to put it in my app(models, controllers, views) that would be very appreciated. Thanks!
I know it's been several months since you posted this, so you probably already have it worked out, but in case you still need a solution (or if someone else wants to do something similar), I'll share my thoughts:
At a high level, it sounds like your plan is
Identify when a new email has come in (either by polling or by using a push notification).
Grab the new email's content.
Parse the email's content in order to extract relevant data.
Use the data to query and update a database.
Based on the documentation for the Gmail API, it does look like you should be able to set up push notifications, so you won't have to poll the endpoint to get the information you need.
However, my big takeaway from this list is that none of the items on it really require Rails, since you're not exposing an external web API for requests. I suppose that you could leverage ActiveRecord to create an item model and use that to manage the database; however, since it seems like you'd only need to make some basic SQL queries (and the same ones each time), I'm not sure that bringing in ActiveRecord adds much value.
If I were trying to solve this problem myself, I would probably create a simple Ruby program that (a) uses the gem you mentioned to handle push notifications from the Gmail API, and (b) uses another gem to connect to whatever kind of database you're using (e.g. pg for Postgres) and make the necessary queries.
(All of this assumes, of course, that you aren't specifically using Rails for some other reason, e.g. adding this feature to an existing Rails application).

Rails using fire-base

I have problem during using fire-base with rails to making a real time form in rails with fire-base how it will be done .. please help me i am new in fire-base.
How to make a real time sharing form in Rails using fire-base.
sorry for bad English ;)
It would be good if you can use ids as keys for each of your form element like input ,select etc and can place value for the element to make each key-value pairs.
This would be effective in a case when you have a persistent record for that form i.e for make new records this approach is not good.
You can use the firebase-rails gem, it's description reads as follows: Ruby
wrapper for the Firebase REST API.
Changes are sent to all subscribed clients automatically, so you can
update your clients in realtime from the backend.
It's very easy to use, although the documentation is somewhat lack luster for but it's just enough for a simple app (if you don't know what you're doing).

Binding a irregular URL encoded POST request with Nancy

I'm supposed to make an old sqlite database editable trough a Sketchup Plugin in its WebDialog. Sketchups Ruby is not able to install the sqlite3 gem and since I already need to display the table in a WebDialog, I opted for a micro service with the help of Nancy. The existing Plugin already uses the dhtmlxSuite and its Grid Component is exactly what I need. They even offer a Connector that can send requests based on the actions of the user in the grid.
They also offer a connector for the server side, but that too does not work with sqlite. I already managed to connect to my database, used Dapper to bin the result of an SQL query to an object and return that data with a manually crafted JSON object. Very hacky, but it populates the data successful.
Now the client-Connector sends data manipulated by the user (delete and insert is forbidden) back with an url encoded POST request that looks quite weird:
4_gr_id=4&4_c0=0701.041&4_c1=Diagonale%20f%3Fr%202.07%20X%201.50%20m&4_c2=AR&4_c3=8.3&4_c4=380&4_c5=38.53&4_c6=0&4_c7=0&4_!nativeeditor_status=updated&ids=4
I'm not sure exactly why the hell they would use the row index inside the key for the key-value pairs, but this just makes my life that much harder. The only thing that comes to my mind is extracting the data I'm interested in with a regex, but that sounds even worse than what I did before.
Any suggestions for me how I could map that POST request to something usable?

Search Gems for Rails

I was browsing reddit for the answer to this and came across this conversation which lists out a bunch of search gems for rails, which is cool. But what I wanted was something where I could:
Enter: OMG Happy Cats
It searches the whole database looking for anything that has OMG Happy Cats and returns me a an array of model objects that contain that value, that I can then use Active model serializer (Very important to be able to use this) on to return you a json object of search results so you can display what ever you want to the user.
So that json object, if this was a blog, would have a post object, maybe a category object and even a comment object.
Everything I have seen is very specific to one controller, one model. Which is nice an all but I am more of a "search for what you want, we will return you what you want, maybe grow smarter like this gem, searchkick which also has the ability to offer spelling suggestion.
I am building this with an API, so it would be limited to everything that belongs to a blog object (as to make it not so huge of a search), so it would search things like posts, tags, categories, comments and pages looking for your term, return a json object (as described) and boom done.
Any ideas?
You'll be best considering the underlying technology for this
--
Third Party
As far as I know (I'm not super experienced in this area), the best way to search an entire Rails database is to use a third party system to "index" the various elements of data you require, allowing you to search them as required.
Some examples of this include:
Sunspot / Solr
ElasticSearch
Essentially, having one of these "third party" search systems gives you the ability to index the various records you want in a separate database, which you can then search with your application.
--
Notes
There are several advantages to handling "search" with a third party stack.
Firstly, it takes the load off your main web server - which means it'll be more reliable & able to handle more traffic.
Secondly, it will ensure you're able to search all the data of your application, instead of tying into a particular model / data set
Thirdly, because many of these third party solutions index the content you're looking for, it will free up your database connectivity for your actual application, making it more efficient & scaleable
For PostgreSQL you should be able to use pg_search.
I've never used it myself but going by the documentation on GitHub, it should allow you to do:
documents = PgSearch.multisearch('OMG Happy Cats').to_a
objects = documents.map(&:searchable)
groups = objects.group_by{|o| o.class.name.pluralize.downcase}
json = Hash[groups.map{|k,v| [k,ActiveModel::ArraySerializer.new(v).as_json]}].as_json
puts json.to_json

How should I store scraped HTML in my webapp?

I'm a newbie to web development (and development in general) and I'm building out a rails app which scrapes data from a third party website. I'm using Nokogiri to parse for specific html elements that I'm interested in and these elements are stored in a database.
However, I'd like to save the html of the whole page I'm scraping as a back-up in case I change my mind on what type of information I want and in case the website removes the site (or updates it).
What's the best practice for storing the archived html?
Should I extract it as a string and put it in a database, write it to a log or text file, or what?
Edit:
I should have clarified a bit. I am crawling on the order of 10K websites a week and anticipate only needing to access the back-ups on once-off basis if I redefine the type of data I want.
So as an example, if was crawling UN data on country population data and originally was looking at age distributions but later realized I wanted to get the gender distributions as well, I'd want to go back to all my HTML archives and pull the data out. I don't anticipate this happening much (maybe 1-3 times a month) but when it does I'll want to retrieve it across 10K-100K listings. The task should only take a few hours to do around 10K records so I guess each website fetch should take at most a second. I don't need any versioning capability. Hope this clarifies.
I'm not sure what the "best practice" for this case is (it will vary by the specifics of your project), but as a starting point I'd suggest creating a model with a string field for the URL and a text field for the HTML itself, and save the pages there. You might add a uniqueness validator for the URL, to make sure you don't store the same HTML twice.
You could then optionally add model methods to initiate a nokogiri document from the HTML text, thus using the HTML string as the "master" record (in the DB) and generating the nokogiri document on the fly when needed. But again, as #dave-newton points out, a lot of this will depend on what you're going to do with this HTML.
I would strongly suggest saving it into a table in the same DB as the data you are scraping. Why change what works? Keep it all as you normally would, or write it all to a separate database entirely just in case and keep some form or ref to link the scraped data to the backups just in case.

Resources