What's the best way to put a small Ruby app online? - ruby-on-rails

I have a small ruby application I wrote that's an anagram searcher. It's for learning ruby, but I would like to put it up online for personal use. I have some experience with Rails, and many here have recommended Sinatra. I'm fine with either, but I cannot find any information on how to use a text file instead of a database.
The application is quite simple: it validates the input against a text file containing a word list, then finds all anagrams. I assumed this would be easy, but I'm stuck on importing that text file into Rails (or Sinatra, if I go that way). In the Rails project, I have placed the text file in the lib directory.
Unfortunately, even though the path appears to be correct in Rails, I get an error:
no such file to load -- /Users/court/Sites/cvtest/lib/english.txt
(cvtest is the name of the rails project)
Here is the code. It works great by itself:
file_path = '/Users/court/Sites/anagram/dictionary/english.txt'
input_string = gets.chomp
# validate input to list
if File.foreach(file_path) {|x| break x if x.chomp == input_string}
  #break down the word
  word = input_string.split(//).sort
  # match word
  anagrams = IO.readlines(file_path).partition{
    |line| line.strip!
    (line.size == word.size && line.split(//).sort == word)
  }[0]
  #list all words except the original
  anagrams.each{ |matched_word| puts matched_word unless matched_word == input_string }
#display error if
else
  puts "This word cannot be found in the dictionary"
end

Factor the actual functionality (finding the anagrams) into a method. Call that method from your Web app.
In Rails, you'd create a controller action that calls that method instead of ActiveRecord. In Sinatra, you'd just create a route that calls the method. Here's a Sinatra example:
get '/word/:input' do
  anagrams = find_anagrams(params[:input])
  anagrams.join(", ")
end
Then, when you access http://yourapp.com/word/pool, it will print "loop, polo".
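A minimal sketch of what such a find_anagrams method could look like, reading the word list as plain text. The "no such file to load" message is the kind of error require and load raise, which suggests (an assumption, since the failing line is not shown) that the text file was being loaded as Ruby code instead of being read with File. The names WORD_LIST_PATH and load_words are hypothetical, and so is the location of english.txt.

```ruby
# Hypothetical location of the word list; adjust to wherever english.txt lives.
WORD_LIST_PATH = File.expand_path("english.txt", __dir__)

# Reads the word list as plain text, one word per line.
def load_words(path = WORD_LIST_PATH)
  File.readlines(path).map(&:chomp)
end

# Returns every word in +words+ that is an anagram of +input+, excluding
# +input+ itself. Returns an empty array when +input+ is not in the list.
def find_anagrams(input, words = load_words)
  return [] unless words.include?(input)
  key = input.chars.sort
  words.select { |w| w != input && w.chars.sort == key }
end
```

Passing the word list as an optional second argument keeps the method easy to try in irb with a small array, while the route can still call it with just the input.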

I know the question is marked as answered, but I prefer the following, as it uses query parameters rather than path based parameters, which means you can pass the parameters in using a regular GET form submission:
require 'rubygems'
require 'sinatra'
def find_anagrams word
# your anagram method here
end
get '/anagram' do
  @word = params['word']
  @anagrams = find_anagrams @word if @word
haml :anagram
end
And the following haml (you could use whatever template language you prefer). This will give you an input form, and show the list of anagrams if a word has been provided and an anagram list has been generated:
%h1
  Enter a word
%form{:action => "anagram"}
  %input{:type => "text", :name => "word"}
  %input{:type => "submit"}
- if @word
  %h1
    Anagrams of
    &= @word
  - if @anagrams
    %ul
      - @anagrams.each do |word|
        %li&= word
  - else
    %p No anagrams found

With Sinatra, you can do anything. These examples don't even require Sinatra; you could roll your own Rack interface.
require 'rubygems'
require 'sinatra'
require 'yaml'
documents = YAML::load_file("your_data.yml")
Or:
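require 'rubygems'
require 'sinatra'
require 'redcloth'
# Render every Textile file under content/ (next to this script) to HTML
content = Dir[File.join(File.dirname(__FILE__), "content/*.textile")].map { |path|
  RedCloth.new(File.read(path)).to_html
}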
Etcetera.

Related

ruby/nokogiri scraping - export to multiple CSVs, then take columns from each and combine into final CSV

Ruby n00b here. I am scraping the same page twice - but in a slightly different way each time - and exporting them to separate CSV files. I would like to then combine the first column from CSV no.1 and the second column from CSV no.2 to create CSV no.3.
The code to pull CSVs no.1 & 2 works. But adding my attempt to combine the two CSVs into the third one (commented out at the bottom) returns the following error: the two CSVs populate fine, but the third stays blank and the script appears to be stuck in an infinite loop. I know these lines shouldn't be at the bottom, but I can't see where else they would go...
alts.rb:45:in `block in <main>': undefined local variable or method `scrapedURLs1' for main:Object (NameError)
from /Users/JammyStressford/.rvm/rubies/ruby-2.0.0-p451/lib/ruby/2.0.0/csv.rb:1266:in `open'
from alts.rb:44:in `<main>'
The code itself:
require 'rubygems'
require 'nokogiri'
require 'open-uri'
require 'csv'
url = "http://www.example.com/page"
page = Nokogiri::HTML(open(url))
CSV.open("results1.csv", "wb") do |csv|
page.css('img.product-card-image').each do |scrape|
product1 = scrape['alt']
page.css('a.product-card-image-link').each do |scrape|
link1 = scrape['href']
scrapedProducts1 = "#{product1}"[0..-7]
scrapedURLs1 = "{link1}"
csv << [scrapedProducts1, scrapedURLs1]
end
end
end
CSV.open("Results2.csv", "wb") do |csv|
page.css('a.product-card-image-link').each do |scrape|
link2 = scrape['href']
page.css('img.product-card-image').each do |scrape|
product2 = scrape['alt']
scrapedProducts2 = "#{product2}"[0..-7]
scrapedURLs2 = "http://www.lyst.com#{link2}"
csv << [scrapedURLs2, scrapedProducts2]
end
end
end
## Here is where I am trying to combine the two columns into a new CSV. ##
## It doesn't work. I suspect that this part should be further up... ##
# CSV.open("productResults3.csv", "wb") do |csv|
# csv << [scrapedURLs1, scrapedProducts2]
#end
puts "upload complete!"
Thanks for reading.
Thank you for sharing your code and your question. I hope my input helps!
Your variables scrapedURLs1 = "{link}" and scrapedProducts1 = "#{scrape['alt']}"[0..-7] have a 1 on the end, but you leave it off in csv << [scrapedProducts, scrapedURLs]. That is the error you are getting.
I would recommend combining your first two steps: skip writing to a file, collect the rows into an Array of Arrays, and then write them to a file.
Do you realize that in the example code you've given, scrapedURLs1, scrapedProducts2 would pair the wrong URLs with the wrong products? Is that what you mean to do?
Within the commented-out code, scrapedURLs1 and scrapedProducts2 do not exist; they have not been declared in that scope. You need to open both files for reading and iterate with .each do |scrapedURLs1| and then another .each do |scrapedProducts2|; those variables will then exist because each passes them to the block.
Reusing the same |scrape| variable on your inner iteration isn't a good idea. Change the name to something else such as |scrape2| . It "happens" to work because you've already taken what you need in product=scrape['alt'] before the second loop. If you rename the second loop variable you can move the product=scrape['alt'] line into the inner loop and combine them. Example:
# In your code example you may get many links per product.
# If that was your intent then that may be fine.
# This code should get one link per product.
CSV.open("results1.csv", "wb") do |csv|
  page.css('img.product-card-image').each do |scrape|
    page.css('a.product-card-image-link').each do |scrape2|
      # [ product , link ]
      csv << [scrape['alt'][0..-7], scrape2['href']]
      # NOTE that scrape['alt'][0..-7] and scrape2['href'] are already strings
      # so you don't need to use "#{ }"
    end
  end
end
Side note: Ruby 2.0.0 does not need the line require "rubygems"
If you're working with CSVs, I highly recommend James Edward Gray II's faster_csv gem (on Ruby 1.9 and later it is already the standard csv library, so require 'csv' is enough). See an example of usage here: https://github.com/JEG2/faster_csv/blob/master/examples/csv_writing.rb
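The "Array of Arrays first, file second" suggestion above could be sketched roughly like this. The product names and links are stand-in data, and combine_columns and rows_to_csv are hypothetical helper names; in the real script the two arrays would be filled from the page.css(...) calls.

```ruby
require 'csv'

# Pairs the n-th entry of +first+ with the n-th entry of +second+,
# giving one [first, second] row per position.
def combine_columns(first, second)
  first.zip(second)
end

# Turns an Array of Arrays into CSV text. CSV.open(path, "wb") could be
# used the same way to write the rows straight to a file.
def rows_to_csv(rows)
  CSV.generate { |csv| rows.each { |row| csv << row } }
end

# Stand-in data for illustration only.
products = ["Red Dress", "Blue Shirt"]
links    = ["/p/1", "/p/2"]

puts rows_to_csv(combine_columns(products, links))
```

Because both columns live in memory as plain arrays, the third CSV is just another call to rows_to_csv with a different pairing, and nothing has to be read back from the first two files.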

Replacing {phrase} with phrase in rails

I'd like to search and replace any occurrence of {phrase} with phrase using Rails (an erb.html file). Multiple phrases will need to be substituted, and the phrases aren't known in advance.
Full Example:
Hi {guys}, I really like {ruby on rails}
Needs to become
Hi guys, I really like ruby on rails
This is for a user-generated content site (GMT)
It's a simple regexp; just use
your_string.gsub(/{(.*?)}/, '\\1')
Example:
"{aaa} is not {bbb} you know".gsub(/{(.*?)}/, '\\1')
will produce
aaa is not bbb you know
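The same substitution can be written with a block, which makes it easier to transform each captured phrase. The sketch below shows two hypothetical helpers: strip_braces reproduces the replacement above, and link_phrases assumes the goal is a link to /phrase, which is a guess based on the helper-style answers in this thread and not something the question text states.

```ruby
require 'erb'

# Removes the braces and keeps the phrase: "{aaa}" becomes "aaa".
def strip_braces(text)
  text.gsub(/\{(.*?)\}/) { Regexp.last_match(1) }
end

# Assumed variant: wraps each phrase in a link, escaping it for use
# in the URL path and in the HTML body.
def link_phrases(text)
  text.gsub(/\{(.*?)\}/) do
    phrase = Regexp.last_match(1)
    href = ERB::Util.url_encode(phrase)
    %(<a href="/#{href}">#{ERB::Util.html_escape(phrase)}</a>)
  end
end

puts strip_braces("Hi {guys}, I really like {ruby on rails}")
puts link_phrases("Hi {guys}")
```

Since this is user-generated content, escaping the phrase before it reaches the page matters; in a Rails view the resulting string would still need to be marked as safe HTML before it is rendered.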
You can do this using gsub
irb(main):001:0> str = " I have written this phrase statement, I want to replace occurences of all phrase with other statement"
=> " I have written this phrase statement, I want to replace occurences of all phrase with other statement"
irb(main):002:0> str.gsub("phrase",'phrase')
=> " I have written this phrase statement, I want to replace occurences of all phrase with other statement"
A better way to do this will be to use a Markdown output engine (Redcarpet being one of the most robust)
You'd have to create a custom renderer:
#lib/custom_renderer.rb
class AutoLinks < Redcarpet::Render::HTML
def auto_link(phrase) #-> will need to search through content. Can research further
link_to phrase, "/#{phrase}"
end
end
#controller
markdown = Redcarpet::Markdown.new(AutoLinks, auto_link: "ruby on rails")
Just use a helper in your erb. For example:
tag_helper.rb:
module TagHelper
def atag(phrase)
"<a href='/#{phrase}'>#{phrase}</a>"
end
end
some.html.erb:
<%= atag('guys')%>

How to stream large xml in Rails 3.2?

I'm migrating our app from 3.0 to 3.2.x. Earlier, the streaming was done by assigning a proc to response_body, like so:
self.response_body = proc do |response, output|
target_obj = StreamingOutputWrapper.new(output)
lib_obj.xml_generator(target_obj)
end
As you can imagine, the StreamingOutputWrapper responds to <<.
This way is deprecated in Rails 3.2.x. The suggested way is to assign an object that responds to each.
The problem I'm facing now is in making the lib_obj.xml_generator each-aware.
The current version of it looks like this:
def xml_generator(target, conditions = [])
builder = Builder::XmlMarkup.new(:target => target)
builder.root do
builder.elementA do
Model1.find_each(:conditions => conditions) { |model1| target << model1.xml_chunk_string }
end
end
end
where target is a StreamingOutputWrapper object.
The question is, how do I modify the code - the xml_generator, and the controller code, to make the response xml stream properly.
Important stuff: Building the xml in memory is not an option as the model records are huge. The typical size of the xml response is around 150MB.
What you are looking for is SAX parsing. SAX reads a file in chunks instead of loading the whole file into a DOM. This is super convenient, and fortunately a lot of people before you have wanted to do the same thing. Nokogiri offers XML::SAX methods, but the documentation is disastrous and can get really confusing, and syntactically it's a mess. I would suggest looking into something that sits on top of Nokogiri and makes getting your job done a lot simpler.
Here are a few options -
SAX_stream:
Mapping out objects in sax_stream is super simple:
require 'sax_stream/mapper'
class Product
include SaxStream::Mapper
node 'product'
map :id, :to => '@id'
map :status, :to => '@status'
map :name_confirmed, :to => 'name/@confirmed'
map :name, :to => 'name'
end
and calling the parser in is also simple:
require 'sax_stream/parser'
require 'sax_stream/collectors/naive_collector'
collector = SaxStream::Collectors::NaiveCollector.new
parser = SaxStream::Parser.new(collector, [Product])
parser.parse_stream(File.open('products.xml'))
However, working with the collectors (or writing your own) can end up being slightly confusing, so I would actually go with:
Saxerator:
Saxerator gets the job done and has some really handy methods for traversing into nodes that can be a little less complex than sax_stream. Saxerator also has a few really great configuration options that are well documented. A simple Saxerator example is below:
parser = Saxerator.parser(File.new("rss.xml"))
parser.for_tag(:item).each do |item|
# where the xml contains <item><title>...</title><author>...</author></item>
# item will look like {'title' => '...', 'author' => '...'}
puts "#{item['title']}: #{item['author']}"
end
# a String is returned here since the given element contains only character data
puts "First title: #{parser.for_tag(:title).first}"
If you end up having to pull the XML from an external source (or it is getting updated frequently and you don't want to update the version on your server manually), check out THIS QUESTION and the accepted answer; it works great.
You could always monkey-patch the response object:
response.stream.instance_eval do
alias :<< :write
end
builder = Builder::XmlMarkup.new(:target => response.stream)
...
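For the each-based approach the question describes, one possible shape is a small body object that yields the XML piece by piece. This is only a sketch under assumptions: XmlStream is a hypothetical class name, the root and elementA tags are written by hand instead of through Builder, and records stands for anything that responds to each and returns objects with xml_chunk_string (in the app, a thin wrapper around Model1.find_each would be needed).

```ruby
# Hypothetical response body: yields XML in chunks so the whole
# document never has to be held in memory at once.
class XmlStream
  def initialize(records)
    @records = records
  end

  # Called with a block by whatever consumes the body;
  # every yielded string is one chunk of the response.
  def each
    yield "<root><elementA>"
    @records.each { |record| yield record.xml_chunk_string }
    yield "</elementA></root>"
  end
end

# Stand-in records for illustration only.
Record = Struct.new(:xml_chunk_string)
body = XmlStream.new([Record.new("<item>1</item>"), Record.new("<item>2</item>")])
body.each { |chunk| print chunk }
puts
```

In the controller this object would take the place of the proc, along the lines of self.response_body = XmlStream.new(...). Whether the chunks actually reach the client incrementally also depends on the server and middleware in use.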

ActionView::Helpers::TextHelper excerpt helper is not fully functional

I am using module ActionView::Helpers::TextHelper to generate an excerpt from a text. If a word exists more than once, it will just excerpt the first occurrence.
<%= excerpt('Hello, i am a Ruby lover, a Rails lover and would never come back to PHP', 'lover', :radius => 5) %>
"...lover,..."
I was expecting the return string to be something like this, because there are two occurrences of the word 'lover':
"...lover,...lover ..."
How can I get it to display multiple occurrences of a keyword?
I am using rails 3.2.11.
The documentation for excerpt(text, phrase, options = {}) says:
Extracts an excerpt from text that matches the first instance of phrase. The :radius option expands the excerpt on each side of the first occurrence of phrase.
As the documentation states, it only matches the first instance of the phrase you search for, not every instance of it.
I've been using a multi_excerpt() method defined in my application_helper.rb
# Returns a summary of +text+ in the form of +phrase+ excerpts
#
# multi_excerpt('This string is is a very long long long string ', 'string', radius: 5)
# # => ...This string is i...long string ...
def multi_excerpt(text, phrase, options = {})
return unless text && phrase
radius = options.fetch(:radius, 10)
omission = options.fetch(:omission, "...")
raise if phrase.is_a? Regexp
regex = /.{,#{radius}}#{Regexp.escape(phrase)}.{,#{radius}}/i
parts = text.scan(regex)
"#{omission}#{parts.join(omission)}#{omission}"
end
Linking here my related post and PR.

convert ruby hash to URL query string ... without those square brackets

In Python, I can do this:
>>> import urlparse, urllib
>>> q = urlparse.parse_qsl("a=b&a=c&d=e")
>>> urllib.urlencode(q)
'a=b&a=c&d=e'
In Ruby[+Rails] I can't figure out how to do the same thing without "rolling my own," which seems odd. The Rails way doesn't work for me -- it adds square brackets to the names of the query parameters, which the server on the other end may or may not support:
>> q = CGI.parse("a=b&a=c&d=e")
=> {"a"=>["b", "c"], "d"=>["e"]}
>> q.to_params
=> "a[]=b&a[]=c&d[]=e"
My use case is simply that I wish to muck with the values of some of the values in the query-string portion of the URL. It seemed natural to lean on the standard library and/or Rails, and write something like this:
uri = URI.parse("http://example.com/foo?a=b&a=c&d=e")
q = CGI.parse(uri.query)
q.delete("d")
q["a"] << "d"
uri.query = q.to_params # should be to_param or to_query instead?
puts Net::HTTP.get_response(uri)
but only if the resulting URI is in fact http://example.com/foo?a=b&a=c&a=d, and not http://example.com/foo?a[]=b&a[]=c&a[]=d. Is there a correct or better way to do this?
In modern ruby this is simply:
require 'uri'
URI.encode_www_form(hash)
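A short sketch of how this applies to the use case in the question. URI.decode_www_form returns an array of [key, value] pairs, so repeated keys survive a round trip without square brackets. The function name rewrite_query is hypothetical.

```ruby
require 'uri'

# Drops every "d" parameter, appends a=d, and re-encodes the query
# string without adding [] to repeated keys.
def rewrite_query(query)
  pairs = URI.decode_www_form(query)
  pairs.reject! { |key, _value| key == "d" }
  pairs << ["a", "d"]
  URI.encode_www_form(pairs)
end

uri = URI.parse("http://example.com/foo?a=b&a=c&d=e")
uri.query = rewrite_query(uri.query)
puts uri

# A hash with array values is flattened the same way.
puts URI.encode_www_form("a" => ["b", "c"], "d" => "e")
```

Working with pairs instead of the hash returned by CGI.parse also preserves the original parameter order.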
Quick Hash to a URL Query Trick :
"http://www.example.com?" + { language: "ruby", status: "awesome" }.to_query
# => "http://www.example.com?language=ruby&status=awesome"
Want to do it in reverse? Use CGI.parse:
require 'cgi'
# Only needed for IRB, Rails already has this loaded
CGI::parse "language=ruby&status=awesome"
# => {"language"=>["ruby"], "status"=>["awesome"]}
Here's a quick function to turn your hash into query parameters:
require 'uri'
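def hash_to_query(hash)
  # URI.encode was removed in Ruby 3.0, so escape each key and value individually
  hash.map { |k, v| "#{URI.encode_www_form_component(k)}=#{URI.encode_www_form_component(v)}" }.join("&")
end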
The way rails handles query strings of that type means you have to roll your own solution, as you have. It is somewhat unfortunate if you're dealing with non-rails apps, but makes sense if you're passing information to and from rails apps.
As a simple plain Ruby solution (or RubyMotion, in my case), just use this:
class Hash
def to_param
self.to_a.map { |x| "#{x[0]}=#{x[1]}" }.join("&")
end
end
{ fruit: "Apple", vegetable: "Carrot" }.to_param # => "fruit=Apple&vegetable=Carrot"
It only handles simple hashes, though.

Resources