ruby net_dav sample puts - ruby-on-rails

I'm trying to put a file on a site with WEB_DAV. (a ruby gem)
When I follow the example, I get a nil exception
#### GEMS
require 'rubygems'
begin
gem "net_dav"
rescue LoadError
system("gem install net_dav")
Gem.clear_paths
end
require 'net/dav'
uri = URI('https://staging.web.mysite');
user = "dave"
pasw = "correcthorsebatterystaple"
dav = Net::DAV.new(uri, :curl => false)
dav.verify_server = false
dav.credentials(user, pasw)
cargo = ("testing.txt")
File.open(cargo, "rb") { |stream|
dav.put(urI.path +'/'+ cargo, stream, File.size(cargo))
}
when I run this I get
`digest_auth': can't convert nil into String (TypeError)
this relates to line 197 in my nav.rb file.
request_digest << ':' << params['nonce']
So what I'm wondering is what step did I not add?
Is there a reasonable example of the correct use of this gem? Something that does something that works would be sweet :)
SIDE QUESTION: Is this the correct gem to use to do web_DAV? It seems an old unmaintained gem, perhaps there's something used by more to accomplish the task?

Try referencing the hash with a symbol rather than a string, i.e.
request_digest << ':' << params[:nonce]
In a simple test
baz = "baz"
params = {:foo => "bar"}
baz << ':' << params['foo']
results in the same error as you're getting.

Related

Ruby - checking if file is a CSV

I have just wrote a code where I get a csv file passed in argument and treat it line by line ; so far, everything is okay. Now, I would like to secure my code by making sure that what we receive in argument is a .csv file.
I saw in the Ruby doc that it exist a == "--file" option but using it generate an error : the way I understood it, it seems this option only work for the txt files.
Is there a method specific that allowed to check if my file is a csv ? Here some of my code :
if ARGV.empty?
puts "j'ai rien reçu"
# option to check, don't work
elsif ARGV[0].shift == "--file"
# my code so far, whithout checking
else CSV.foreach(ARGV.shift) do |row|
etc, etc...
I think it is unpossible to make a real safe test without additional information.
Just some notes what you can do:
You get a filename in a variable filename.
First, check if it is a file:
File.exist?
Then you could check, if the encoding is correct:
raise "Wrong encoding" unless content.valid_encoding?
Has your csv always the same number of columns? And do you have only one liner?
This can be a possibility to make the next check:
content.each_line{|line|
return false if line.count(sep) < columns - 1
}
This check can be modified for your case, e.g. if you have always an exact number of rows.
In total you can define something like:
require 'csv'
#columns defines the expected numer of columns per line
def csv?(filename, sep: ';', columns: 3)
return false unless File.exist?(filename) #"No file"
content = File.read(filename, :encoding => 'utf-8')
return false unless content.valid_encoding? #"Wrong encoding"
content.each_line{|line|
return false if line.count(sep) < columns - 1
}
CSV.parse(content, :col_sep => sep)
end
if csv = csv?('test.csv')
csv.each do |row|
p row
end
end
You can use ruby-filemagic gem
gem install ruby-filemagic
Usage:
$ irb
irb(main):001:0> require 'filemagic'
=> true
irb(main):002:0> fm = FileMagic.new
=> #<FileMagic:0x7fd4afb0>
irb(main):003:0> fm.file('foo.zip')
=> "Zip archive data, at least v2.0 to extract"
irb(main):004:0>
https://github.com/ricardochimal/ruby-filemagic
Use File.extname() to check the origin file
File.extname("test.rb") #=> ".rb"

Find within the first 10?

I'm using Nokogiri to screen-scrape contents of a website.
I set fetch_number to specify the number of <divs> that I want to retrieve. For example, I may want the first(10) tweets from the target page.
The code looks like this:
doc.css(".tweet").first(fetch_number).each do |item|
title = item.css("a")[0]['title']
end
However, when there is less than 10 matching div tags returned, it will report
NoMethodError: undefined method 'css' for nil:NilClass
This is because, when no matching HTML is found, it will return nil.
How can I make it return all the available data within 10? I don't need the nils.
UPDATE:
task :test_fetch => :environment do
require 'nokogiri'
require 'open-uri'
url = 'http://themagicway.taobao.com/search.htm?&search=y&orderType=newOn_desc'
doc = Nokogiri::HTML(open(url) )
puts doc.css(".main-wrap .item").count
doc.css(".main-wrap .item").first(30).each do |item_info|
if item_info
href = item_info.at(".detail a")['href']
puts href
else
puts 'this is empty'
end
end
end
Return resultes(Near the end):
24
http://item.taobao.com/item.htm?id=41249522884
http://item.taobao.com/item.htm?id=40369253621
http://item.taobao.com/item.htm?id=40384876796
http://item.taobao.com/item.htm?id=40352486259
http://item.taobao.com/item.htm?id=40384968205
.....
http://item.taobao.com/item.htm?id=38843789106
http://item.taobao.com/item.htm?id=38843517455
http://item.taobao.com/item.htm?id=38854788276
http://item.taobao.com/item.htm?id=38825442050
http://item.taobao.com/item.htm?id=38630599372
http://item.taobao.com/item.htm?id=38346270714
http://item.taobao.com/item.htm?id=38357729988
http://item.taobao.com/item.htm?id=38345374874
this is empty
this is empty
this is empty
this is empty
this is empty
this is empty
count reports only 24 elements, but it retuns a 30 array.
And it actually is not an array, but Nokogiri::XML::NodeSet? I'm not sure.
title = item.css("a")[0]['title']
is a bad practice.
Instead, consider writing using at or at_css instead of search or css:
title = item.at('a')['title']
Next, if the <a> tag returned doesn't have a title parameter, Nokogiri and/or Ruby will be upset because the title variable will be nil. Instead, improve your CSS selector to only allow matches like <a title="foo">:
require 'nokogiri'
doc = Nokogiri::HTML('<body>foobar</body>')
doc.at('a').to_html # => "foo"
doc.at('a[title]').to_html # => "bar"
Notice how the first, which is not constrained to look for tags with a title parameter returns the first <a> tag. Using a[title] will only return ones with a title parameter.
That means your loop over the values will never return nil, and you won't have a problem needing to compact them out of the returned array.
As a general programming tip, if you're getting nils like that, look at the code generating the array, because odds are good it's not doing it right. You should ALWAYS know what sort of results your code will generate. Using compact to clean up the array is a knee-jerk reaction to not having written the code correctly most of the time.
Here's your updated code:
require 'nokogiri'
require 'open-uri'
url = 'http://themagicway.taobao.com/search.htm?&search=y&orderType=newOn_desc'
doc = Nokogiri::HTML(open(url) )
puts doc.css(".main-wrap .item").count
doc.css(".main-wrap .item").first(30).each do |item_info|
if item_info
href = item_info.at(".detail a")['href']
puts href
else
puts 'this is empty'
end
end
And here's what's wrong:
doc.css(".main-wrap .item").first(30)
Here's a simple example demonstrating why that doesn't work:
require 'nokogiri'
doc = Nokogiri::HTML(<<EOT)
<html>
<body>
<p>foo</p>
</body>
</html>
EOT
In Nokogiri, search',cssandxpath` are equivalent, except that the first is generic and can take either CSS or XPath, while the last two are specific to that language.
doc.search('p') # => [#<Nokogiri::XML::Element:0x3fcf360ef750 name="p" children=[#<Nokogiri::XML::Text:0x3fcf360ef4f8 "foo">]>]
doc.search('p').size # => 1
doc.search('p').map(&:to_html) # => ["<p>foo</p>"]
That shows that the NodeSet returned by doing a simple search returns only one node, and what the node looks like.
doc.search('p').first(2) # => [#<Nokogiri::XML::Element:0x3fe3a28d2848 name="p" children=[#<Nokogiri::XML::Text:0x3fe3a28c7b50 "foo">]>, nil]
doc.search('p').first(2).size # => 2
Searching using first(n) returns "n" elements. If that many aren't found Nokogiri fills them in using nil values.
This is counter what we'd assume first(n) to do, since Enumerable#first returns up-to-n and won't pad with nils. This isn't a bug, but it is unexpected behavior since Enumerable's first sets the expected behavior for methods with that name, but, this is NodeSet#first, not Enumerable#first, so it does what it does until the Nokogiri authors change it. (You can see why it happens if you look at the source for that particular method.)
Instead, slicing the NodeSet does show the expected behavior:
doc.search('p')[0..1] # => [#<Nokogiri::XML::Element:0x3fe3a28d2848 name="p" children=[#<Nokogiri::XML::Text:0x3fe3a28c7b50 "foo">]>]
doc.search('p')[0..1].size # => 1
doc.search('p')[0, 2] # => [#<Nokogiri::XML::Element:0x3fe3a28d2848 name="p" children=[#<Nokogiri::XML::Text:0x3fe3a28c7b50 "foo">]>]
doc.search('p')[0, 2].size # => 1
So, don't use NodeSet#first(n), use the slice form NodeSet#[].
Applying that, I'd write the code something like:
require 'nokogiri'
require 'open-uri'
URL = 'http://themagicway.taobao.com/search.htm?&search=y&orderType=newOn_desc'
doc = Nokogiri::HTML(open(URL))
hrefs = doc.css(".main-wrap .item .detail a[href]")[0..29].map { |anchors|
anchors['href']
}
puts hrefs.size
puts hrefs
# >> 24
# >> http://item.taobao.com/item.htm?id=41249522884
# >> http://item.taobao.com/item.htm?id=40369253621
# >> http://item.taobao.com/item.htm?id=40384876796
# >> http://item.taobao.com/item.htm?id=40352486259
# >> http://item.taobao.com/item.htm?id=40384968205
# >> http://item.taobao.com/item.htm?id=40384816312
# >> http://item.taobao.com/item.htm?id=40384600507
# >> http://item.taobao.com/item.htm?id=39973451949
# >> http://item.taobao.com/item.htm?id=39861209551
# >> http://item.taobao.com/item.htm?id=39545678869
# >> http://item.taobao.com/item.htm?id=39535371171
# >> http://item.taobao.com/item.htm?id=39509186150
# >> http://item.taobao.com/item.htm?id=38973412667
# >> http://item.taobao.com/item.htm?id=38910499863
# >> http://item.taobao.com/item.htm?id=38942960787
# >> http://item.taobao.com/item.htm?id=38910403350
# >> http://item.taobao.com/item.htm?id=38843789106
# >> http://item.taobao.com/item.htm?id=38843517455
# >> http://item.taobao.com/item.htm?id=38854788276
# >> http://item.taobao.com/item.htm?id=38825442050
# >> http://item.taobao.com/item.htm?id=38630599372
# >> http://item.taobao.com/item.htm?id=38346270714
# >> http://item.taobao.com/item.htm?id=38357729988
# >> http://item.taobao.com/item.htm?id=38345374874
Try this
doc.css(".tweet").first(fetch_number).each do |item|
title = item.css("a")[0]['title'] rescue nil
end
And let me know it works or not? It will not show error
Try compact.
[1, nil, 2, nil, 3] # => [1, 2, 3]
http://www.ruby-doc.org/core-2.1.3/Array.html#method-i-compact
(ie: first(fetch_number).compact.each do |item|)

Open filename with embedded spaces with Roo

Ruby 2.0.0, Rails 4.0.3, Windows 8.1 Update, Roo 1.13.2
I am trying to open an Excel spreadsheet with embedded spaces using Roo. So far, I am unable to do that. I don't really know if this problem is restricted to Roo. If I rename it to eliminate the spaces, I have no problem with it. I tried encoding it, but then it simply said the file doesn't exist. Can I open the file while it contains spaces?
Code sample:
exceptions = [URI::InvalidURIError, IOError]
puts "f is #{f}"
puts "f exist? #{File.exist?(f)}"
begin
xls = Roo::Spreadsheet.open(f)
rescue *exceptions => e
puts e.message
end
encoded_f = URI.encode(f).to_s
puts "encoded_f is #{encoded_f}"
puts "encoded_f exist? #{File.exist?(encoded_f)}"
begin
xls = Roo::Spreadsheet.open(encoded_f)
rescue *exceptions => e
puts e.message
end
gsub_f = f.gsub(" ", "") # Rename file without spaces
File.rename(f, gsub_f)
puts "gsub_f is #{gsub_f}"
puts "gsub_f exist? #{File.exist?(gsub_f)}"
begin
xls = Roo::Spreadsheet.open(gsub_f)
rescue *exceptions => e
puts e.message
end
Output sample:
f is Whitt Report 2014-07-28-0803.xls
f exist? true
bad URI(is not URI?): Whitt Report 2014-07-28-0803.xls
encoded_f is Whitt%20Report%202014-07-28-0803.xls
encoded_f exist? false
file Whitt%20Report%202014-07-28-0803.xls does not exist
gsub_f is WhittReport2014-07-28-0803.xls
gsub_f exist? true
No message is given in the end because the file opens successfully.
This is caused by the way in which the URI module is called in the Roo::Spreadsheet#open method.
I posted a fix to this problem which has now been merged. If you update your Roo gem you should no longer have this issue.

REXML::Document.new take a simple string as good doc?

I would like to check if the xml is valid. So, here is my code
require 'rexml/document'
begin
def valid_xml?(xml)
REXML::Document.new(xml)
rescue REXML::ParseException
return nil
end
bad_xml_2=%{aasdasdasd}
if(valid_xml?(bad_xml_2) == nil)
puts("bad xml")
raise "bad xml"
end
puts("good_xml")
rescue Exception => e
puts("exception" + e.message)
end
and it returns good_xml as result. Did I do something wrong? It will return bad_xml if the string is
bad_xml = %{
<tasks>
<pending>
<entry>Grocery Shopping</entry>
<done>
<entry>Dry Cleaning</entry>
</tasks>}
Personally, I'd recommend using Nokogiri, as it's the defacto standard for XML/HTML parsing in Ruby. Using it to parse a malformed document:
require 'nokogiri'
doc = Nokogiri::XML('<xml><foo><bar></xml>')
doc.errors # => [#<Nokogiri::XML::SyntaxError: Opening and ending tag mismatch: bar line 1 and xml>, #<Nokogiri::XML::SyntaxError: Premature end of data in tag foo line 1>, #<Nokogiri::XML::SyntaxError: Premature end of data in tag xml line 1>]
If I parse a document that is well-formed:
doc = Nokogiri::XML('<xml><foo/><bar/></xml>')
doc.errors # => []
REXML treats a simple string as a valid XML with no root node:
xml = REXML::Document.new('aasdasdasd')
# => <UNDEFINED> ... </>
It does not however treat illegal XML (with mismatching tags, for example) as a valid XML, and throws an exception.
REXML::Document.new(bad_xml)
# REXML::ParseException: #<REXML::ParseException: Missing end tag for 'done' (got "tasks")
It is missing an end-tag to <done> - so it is not valid.

Ruby hexdigest sha1 pack('H*') string encoding...

I meet an encoding problem... No errors in the console, but the output is not well encoded.
I must use Digest::SHA1.hexdigest on a string and then must pack the result.
The below example should outputs '{´p)ODýGΗ£Iô8ü:iÀ' but it outputs '{?p)OD?GΗ?I?8?:i?' in the console and '{�p)OD�G^BΗ�I�8^D�:i�' in the log file.
So, my variable called pack equals '{?p)OD?GΗ?I?8?:i?' and not '{´p)ODýGΗ£Iô8ü:iÀ'. That's a big problem... I'm doing it in a Rails task.
Any idea guys?
Thanks
# encoding: utf-8
require 'digest/sha1'
namespace :my_app do
namespace :check do
desc "Description"
task :weather => :environment do
hexdigest = Digest::SHA1.hexdigest('29d185d98c984a359e6e6f26a0474269partner=100043982026&code=34154&profile=large&filter=movie&striptags=synopsis%2Csynopsisshort&format=json&sed=20130527')
pack = [hexdigest].pack("H*")
puts pack # => {?p)OD?GΗ?I?8?:i?
puts '{´p)ODýGΗ£Iô8ü:iÀ' # => {´p)ODýGΗ£Iô8ü:iÀ
end
end
end
This is what I did (my conversion from PHP to Ruby)
# encoding: utf-8
require 'open-uri'
require 'base64'
require 'digest/sha1'
class Allocine
$_api_url = 'http://api.allocine.fr/rest/v3'
$_partner_key
$_secret_key
$_user_agent = 'Dalvik/1.6.0 (Linux; U; Android 4.2.2; Nexus 4 Build/JDQ39E)'
def initialize (partner_key, secret_key)
$_partner_key = partner_key
$_secret_key = secret_key
end
def get(id)
# build the params
params = { 'partner' => $_partner_key,
'code' => id,
'profile' => 'large',
'filter' => 'movie',
'striptags' => 'synopsis,synopsisshort',
'format' => 'json' }
# do the request
response = _do_request('movie', params)
return response
end
private
def _do_request(method, params)
# build the URL
query_url = $_api_url + '/' + method
# new algo to build the query
http_build_query = Rack::Utils.build_query(params)
sed = DateTime.now.strftime('%Y%m%d')
sig = URI::encode(Base64.encode64(Digest::SHA1.digest($_secret_key + http_build_query + '&sed=' + sed)))
return sig
end
end
Then call
allocine = Allocine.new(ALLOCINE_PARTNER_KEY, ALLOCINE_SECRET_KEY)
puts allocine.get('any ID')
get method return 'e7RwKU9E%2FUcCzpejSfQ4BPw6acA%3D' in PHP and 'cPf6I4ZP0qHQTSVgdKTbSspivzg=%0A' in Ruby...
thanks again
I think this "encoding" issue has turned up due to debugging other parts of a conversion from PHP to Ruby. The target API that will consume a digest of params looks like it will accept a signature variable constructed in Ruby as follows (edit: well this is guess, there may also be relevant differences between Ruby and PHP in URI encoding and base64 defaults):
require 'digest/sha1'
require 'base64'
require 'uri'
sig_data = 'edhefhekjfhejk8edfefefefwjw69partne...'
sig = URI.encode( Base64.encode64( Digest::SHA1.digest( sig_data ) ) )
=> "+ZabHg22Wyf7keVGNWTc4sK1ez4=%0A"
The exact construction of sig_data from the parameters that are being signed is also important. That is generated by the PHP method http_build_query, and I do not know what order or escaping that will apply to input params. If your Ruby version gets them in a different order, or escapes differently to PHP, the signature will be wrong (edit: Actually it is possible we are looking here for a signature on the exact query string sent the API - I don't know). It is possibly an issue of that sort that has led you down the rabbit hole of how the signature is constructed?
Thank you guys for your help.
Problem is solved. With the following code I obtain exactly the same string as with PHP:
http_build_query = Rack::Utils.build_query(params)
sed = DateTime.now.strftime('%Y%m%d')
sig = CGI::escape(Base64.strict_encode64(Digest::SHA1.digest($_secret_key + http_build_query + '&sed=' + sed)))
Now I've another problem for which I opened a new question here.
thanks you very much.

Resources