Converting PDFs to images with ImageMagick + RMagick - ruby-on-rails

I'm using Paperclip at the moment to convert PDF files to images.
My code looks something like this:
def convert_keynote_to_slides
  system('convert -size 640x300 ' + keynote.queued_for_write[:original].path + ' ' + KEYNOTE_PATH + '/' + File.basename( self.keynote_file_name )+"%02d.png")
  slide_basename = File.basename( self.keynote_file_name )
  files = Dir.entries(KEYNOTE_PATH).sort
  for file in files
    #puts file if file.include?(slide_basename +'-')
    self.slides.build("slide" => "#{file}") if file.include?(slide_basename)
  end
end
I'm sure this can be refactored to work better.
My questions are:
Is there a way to track the progress of ImageMagick? If not, how would I put this into a delayed job? I'm worried this won't scale very well.
Can anyone point me toward ways to make this code better or more efficient? KEYNOTE_PATH points to a directory in public where all of the images are held in a single folder, and I'm not sure whether I like that. It would probably be better to assign a random name to each file.

I hope you're doing extensive filtering of the keynote.queued_for_write[:original].path and File.basename( self.keynote_file_name ) inputs, so that you're not susceptible to shell metacharacter injection attacks.
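One way to reduce that risk, sketched here rather than taken from the original code, is to pass the command and each argument as separate strings, so that no shell ever parses them. The helper names convert_command and echo_argument are hypothetical. The demonstration spawns the current Ruby interpreter instead of convert, so it runs without ImageMagick installed.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical helper: builds the argument list for ImageMagick's convert.
# Each element reaches the program as one argument, whatever it contains.
def convert_command(input_path, output_dir, basename)
  ["convert", "-size", "640x300", input_path,
   File.join(output_dir, "#{basename}%02d.png")]
end

# Hypothetical helper for the demonstration: hands one argument to a child
# Ruby process without going through a shell and returns what the child saw.
def echo_argument(arg)
  out, _status = Open3.capture2(RbConfig.ruby, "-e", "print ARGV[0]", arg)
  out
end

hostile = "slides.pdf; echo INJECTED"
# The child receives the whole string as a single argument; the ";" is not
# treated as a command separator because no shell is involved.
puts echo_argument(hostile) == hostile
```

In the method from the question this would look like system(*convert_command(path, KEYNOTE_PATH, basename)) in place of the concatenated command string.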

Related

Mixing Ruby and bash commands -- mv returns "x and y are the same file"

So I have a Ruby script (using Ruby because we have a library of pre-existing code that I need to use). From within Ruby I am using backticks to call Linux commands, specifically in this case the "mv" command. I am trying to move one file to another location but I keep getting the error message that x and y are "the same file" even though they are very clearly NOT the same file.
Here is the code in Ruby:
#!/usr/local/rvm/rubies/ruby-2.1.1/bin/ruby
masterFiles=[]
masterFiles << "/mnt/datadrive/Data Capture/QualityControl/UH_HRA_SVY/Scans and DataOutput/Data/UH_HRA_SVY_DATA.txt"
masterFiles << "/mnt/datadrive/Data Capture/QualityControl/UH_HRA_SVY_SPAN/Scans and DataOutput/Data/UH_HRA_SVY_SPAN_DATA.txt"
tm=Time.new.strftime("%Y%m%d")
masterFiles.each do |mf|
  if File.exist?(mf)
    qmf=39.chr + mf + 39.chr
    `cat #{qmf} >> /tmp/QM`
    savename=39.chr + \
      "/mnt/datadrive/Data Capture/QualityControl/UH_HRA_SVY/Scans and DataOutput/Data/DailyFiles/" + \
      File.basename(mf).gsub(".txt","_"+tm) + ".txt" + 39.chr
    `mv #{qmf} #{savename}`
  end
end
The error that I get is this:
mv: `/mnt/datadrive/Data Capture/QualityControl/UH_HRA_SVY_SPAN/Scans
and DataOutput/Data/UH_HRA_SVY_SPAN_DATA.txt' and `/mnt/datadrive/Data
Capture/QualityControl/UH_HRA_SVY/Scans and
DataOutput/Data/DailyFiles/UH_HRA_SVY_SPAN_DATA_20140530.txt' are the
same file
If I change this line:
`mv #{qmf} #{savename}`
To this:
puts "mv #{qmf} #{savename}"
And then run the output, it works as expected.
I am pretty sure that this has to do with spaces in the path. I have tried every combination of double-quoting, triple-quoting, quadruple-quoting, and back-slashing I can think of to resolve this but no go. I have also tried using FileUtils.mv but get what is basically the same error worded differently.
Can anybody help? Thanks a lot.
P.S. I realize it's entirely possible that I am going about this in an entirely wrong-headed way, so feel free to point that out if so. However, I am trying to use the tools I already have some knowledge of (cat, mv, etc.) instead of reinventing the wheel.
You could use FileUtils.mv.
I often define small wrappers like this:
require 'fileutils'
def mv(from, to)
  FileUtils.mv(from, to)
end
Inside the mv() method I add further safeguards, e.g. checking whether the file exists, whether permissions are missing, and so forth.
If you still have problems with filenames that contain ' ' blank characters, try putting the path in a double-quoted string, like:
your_target_location = "foo/bar bla"
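As a minimal sketch of that suggestion: FileUtils.mv takes plain Ruby strings, so spaces in a path need no extra quoting, and Shellwords.escape covers the case where a path really must go through the shell. The helper name move_with_datestamp and the temporary paths are made up for illustration, and this does not diagnose the "same file" error from the question.

```ruby
require 'fileutils'
require 'shellwords'
require 'tmpdir'

# Hypothetical helper: moves source into target_dir, appending a date stamp
# to the base name, and returns the new path.
def move_with_datestamp(source, target_dir, stamp = Time.now.strftime("%Y%m%d"))
  target = File.join(target_dir, "#{File.basename(source, '.txt')}_#{stamp}.txt")
  FileUtils.mv(source, target)
  target
end

Dir.mktmpdir do |dir|
  source     = File.join(dir, "Scans and DataOutput", "DATA.txt")
  target_dir = File.join(dir, "Daily Files")
  FileUtils.mkdir_p([File.dirname(source), target_dir])
  File.write(source, "sample\n")

  moved = move_with_datestamp(source, target_dir)
  puts File.exist?(moved)

  # If a shell command is unavoidable, escape each path instead of
  # wrapping it in quote characters by hand.
  puts "cat #{Shellwords.escape(moved)} >> /tmp/QM"
end
```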

Aptana Studio 3 Snippet Around Selection

So I have recently switched from Dreamweaver to Aptana Studio 3 and I have been playing around with the whole custom snippet feature. For the life of me though I cannot figure out how to take a selection/highlighted text and wrap it with my own custom code and/or text. I have looked around the internet for three days now and cannot find anything regarding snippets. I have found some things using commands and key combinations, but I am wanting to create and use a snippet and trying to modify what I have found is not producing good fruit.
I have been able to create my own category and some basic snippets that insert straight text, but nothing that uses a selection.
I have absolutely NO experience with Ruby so forgive me if what follows is completely atrocious. I have more experience with PHP, HTML, Javascript, Java, etc. Here is what I have so far.
snippet "Selection Test" do |snip|
  snip.trigger = "my_code"
  snip.input = :selection
  selection = ENV['TM_SELECTED_TEXT'] || ''
  snip.expansion = "<test>$selection</test>\n"
  snip.category = "My Snippets"
end
I haven't done much with custom Snippets, but if it helps, there is an example in the HTML bundle of a snippet that surrounds the selected text with <p></p> tags when you do Ctrl + Shift + W. You can see the code for it in snippets.rb in the HTML bundle:
with_defaults :scope => 'text.html - source', :input => :none, :output => :insert_as_snippet do |bundle|
  command t(:wrap_selection_in_tag_pair) do |cmd|
    cmd.key_binding = "CONTROL+SHIFT+W"
    cmd.input = :selection
    cmd.invoke do |context|
      selection = ENV['TM_SELECTED_TEXT'] || ''
      if selection.length > 0
        "<${1:p}>${2:#{selection.gsub('/', '\/')}}</${1:p}>"
      else
        "<${1:p}>$0</${1:p}>"
      end
    end
  end
end
I fiddled around with putting it into the PHP bundle for a few minutes under CTRL + Shift + P and got it working in HTML files, which was not my goal... but was progress. I may play around with it some more later, but in the meantime, maybe you know enough after all of your research to get something put together. I would be interested to see your results if you get this figured out.
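To see what the invoke block above hands back to the editor, here is the same string-building logic pulled out into plain Ruby. It is only a sketch: wrap_in_tag_snippet is a hypothetical name, the tag parameter is an addition, and nothing here touches the Aptana bundle API.

```ruby
# Hypothetical helper mirroring the body of the invoke block: returns the
# snippet text with tab stops, escaping "/" in the selected text.
def wrap_in_tag_snippet(selection, tag = "p")
  if selection.empty?
    "<${1:#{tag}}>$0</${1:#{tag}}>"
  else
    "<${1:#{tag}}>${2:#{selection.gsub('/', '\/')}}</${1:#{tag}}>"
  end
end

puts wrap_in_tag_snippet("")
puts wrap_in_tag_snippet("some text", "test")
```

Note that the selection is interpolated with Ruby's #{...}, not written as $selection inside the string.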

Tracking Upload Progress of File to S3 Using Ruby aws-sdk

Firstly, I am aware that there are quite a few questions similar to this one on SO. I have read most, if not all, of them over the past week, but I still can't make this work for me.
I am developing a Ruby on Rails app that allows users to upload mp3 files to Amazon S3. The upload itself works perfectly, but a progress bar would greatly improve user experience on the website.
I am using the aws-sdk gem which is the official one from Amazon. I have looked everywhere in its documentation for callbacks during the upload process, but I couldn't find anything.
The files are uploaded one at a time directly to S3 so it doesn't need to load it into memory. No multiple file upload necessary either.
I figured that I may need to use jQuery to make this work, and I am fine with that.
I found this that looked very promising: https://github.com/blueimp/jQuery-File-Upload
And I even tried following the example here: https://github.com/ncri/s3_uploader_example
But I just could not make it work for me.
The documentation for aws-sdk also BRIEFLY describes streaming uploads with a block:
obj.write do |buffer, bytes|
  # writing fewer than the requested number of bytes to the buffer
  # will cause write to stop yielding to the block
end
But this is barely helpful. How does one "write to the buffer"? I tried a few intuitive options that would always result in timeouts. And how would I even update the browser based on the buffering?
Is there a better or simpler solution to this?
Thank you in advance.
I would appreciate any help on this subject.
The "buffer" object yielded when passing a block to #write is an instance of StringIO. You can write to the buffer using #write or #<<. Here is an example that uses the block form to upload a file.
file = File.open('/path/to/file', 'r')
obj = s3.buckets['my-bucket'].objects['object-key']
obj.write(:content_length => file.size) do |buffer, bytes|
  buffer.write(file.read(bytes))
  # you could do some interesting things here to track progress
end
file.close
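The "interesting things" could be as simple as counting the bytes handed over so far. Below is a minimal sketch of that bookkeeping, using StringIO objects in place of the file and the S3 buffer so it runs without AWS credentials. The helper name each_chunk_with_progress is hypothetical, and how the fraction reaches the browser is a separate question this does not cover.

```ruby
require 'stringio'

# Hypothetical helper: reads io in chunks of chunk_size bytes and yields
# each chunk together with the fraction of total_size read so far.
def each_chunk_with_progress(io, total_size, chunk_size)
  sent = 0
  while (chunk = io.read(chunk_size))
    sent += chunk.bytesize
    yield chunk, sent.to_f / total_size
  end
end

source = StringIO.new("x" * 2500) # stands in for the opened file
sink   = StringIO.new             # stands in for the yielded buffer

each_chunk_with_progress(source, source.size, 1000) do |chunk, fraction|
  sink.write(chunk)
  puts format("%.0f%% written", fraction * 100)
end
```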
After reading the source code of the AWS gem, I adapted (or mostly copied) the multipart upload method to yield the current progress based on how many chunks have been uploaded:
s3 = AWS::S3.new.buckets['your_bucket']
file = File.open(filepath, 'r', encoding: 'BINARY')
file_to_upload = "#{s3_dir}/#{filename}"
upload_progress = 0
opts = {
  content_type: mime_type,
  cache_control: 'max-age=31536000',
  estimated_content_length: file.size,
}
part_size = self.compute_part_size(opts)
parts_number = (file.size.to_f / part_size).ceil.to_i
obj = s3.objects[file_to_upload]
begin
  obj.multipart_upload(opts) do |upload|
    until file.eof? do
      break if (abort_upload = upload.aborted?)
      upload.add_part(file.read(part_size))
      upload_progress += 1.0/parts_number
      # Yields the Float progress and the upload object for the
      # file that's currently being uploaded
      yield(upload_progress, upload) if block_given?
    end
  end
end
The compute_part_size method is defined here and I've modified it to this:
def compute_part_size options
  max_parts = 10000
  min_size = 5242880 #5 MB
  estimated_size = options[:estimated_content_length]
  [(estimated_size.to_f / max_parts).ceil, min_size].max.to_i
end
This code was tested on Ruby 2.0.0p0
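To make the part-size arithmetic concrete, here is a standalone sketch that repeats compute_part_size and works out the number of parts for two file sizes. The helper parts_for is hypothetical and exists only for this illustration.

```ruby
# Same arithmetic as compute_part_size above, repeated so this runs alone.
def compute_part_size(options)
  max_parts = 10_000
  min_size  = 5 * 1024 * 1024 # 5 MB
  estimated_size = options[:estimated_content_length]
  [(estimated_size.to_f / max_parts).ceil, min_size].max.to_i
end

# Hypothetical helper: returns [part_size, number_of_parts] for a file size.
def parts_for(size)
  part_size = compute_part_size(estimated_content_length: size)
  [part_size, (size.to_f / part_size).ceil]
end

part_size, parts = parts_for(100 * 1024 * 1024) # a 100 MB file
puts "#{parts} parts of #{part_size} bytes"

part_size, parts = parts_for(100 * 1024**3)     # a 100 GB file
puts "#{parts} parts of #{part_size} bytes"
```

For the 100 MB file the minimum part size wins, giving 20 parts and progress steps of 5% each. For the 100 GB file the part size grows so that the count stays at 10000 parts.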

Converting PDFs to PNGs with Dragonfly

I have a Dragonfly processor which should take a given PDF and return a PNG of the first page of the document.
When I run this processor via the console, I get back the PNG as expected, however, when in the context of Rails, I'm getting it as a PDF.
My code is roughly similar to this:
def to_pdf_thumbnail(temp_object)
  tempfile = new_tempfile('png')
  args = "'#{temp_object.path}[0]' '#{tempfile.path}'"
  full_command = "convert #{args}"
  result = `#{full_command}`
  tempfile
end
def new_tempfile(ext=nil)
  tempfile = ext ? Tempfile.new(['dragonfly', ".#{ext}"]) : Tempfile.new('dragonfly')
  tempfile.binmode
  tempfile.close
  tempfile
end
Now, tempfile is definitely creating a .png file, but the convert is generating a PDF (when run from within Rails 3).
Any ideas as to what the issue might be here? Is something getting confused about the content type?
I should add that both this and a standard conversion (asset.png.url) yield a PDF with the PDF content as a small block in the middle of the (A4) image.
An approach I’m using for this is to generate the thumbnail PNG on the fly via the thumb method from Dragonfly’s ImageMagick plugin:
<%= image_tag rails_model.file.thumb('100x100#', format: 'png', frame: 0).url %>
So long as Ghostscript is installed, ImageMagick/Dragonfly will honour the format / frame (i.e. page of the PDF) settings. If file is an image rather than a PDF, it will be converted to a PNG, and the frame number ignored (unless it’s a GIF).
Try this:
def to_pdf_thumbnail(temp_object)
  tempfile = new_tempfile('png')
  # Pass each argument separately; "[0]" selects the first page of the PDF
  system("convert", "#{temp_object.path}[0]", tempfile.path)
  # Read the generated PNG back as binary data
  File.binread(tempfile.path)
end
The problem is that you are likely handing convert ONE argument, not two.
Doesn't convert rely on the extension to determine the type? Are you sure the tempfiles have the proper extensions?
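ImageMagick normally infers the output format from the file extension, but it also accepts an explicit "format:" prefix on the output name, which removes any doubt about the extension. A minimal sketch of building such a command follows. The helper name first_page_png_command is hypothetical, and whether this resolves the Rails-specific behaviour in the question is untested.

```ruby
# Hypothetical helper: argument list asking convert for page 1 of a PDF,
# with the output format forced to PNG via the "png:" prefix.
def first_page_png_command(pdf_path, output_path)
  ["convert", "#{pdf_path}[0]", "png:#{output_path}"]
end

command = first_page_png_command("/tmp/input file.pdf", "/tmp/thumb")
puts command.inspect
# To run it without a shell: system(*command)
```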

Modifying Brightness/Contrast of Image with RMagick

I'm trying to write a script that takes a PDF and increases the brightness/contrast so that my scanned handwriting is actually readable. I am able to do this with Photoshop (which is really tedious), but I can't figure out which RMagick methods to use to produce a similar result.
Any pointers? Thanks for the help.
I ended up using Fred's ImageMagick scripts to make the handwriting readable; see http://www.fmwconcepts.com/imagemagick/
I ended up not using RMagick for this part; instead I just called ImageMagick's convert terminal command from Ruby. It is a little bit convoluted, but it worked for me. Some sample code is below:
localthres_script = '~/Downloads/test/localthresh.sh' # CONSTANT LOCATION
params = '-m 3 -r 25 -b 20 -n yes'
pdf = Magick::ImageList.new("#{dir}/#{pdf_name_wo_ext}.pdf")
i=1
pdf.each do |page|
  image_name = "#{pdf_name_wo_ext}_#{i}"
  puts "==> Enhancing images..."
  %x[#{localthres_script} #{params} #{dir}/#{image_name}.png #{dir}/PDF_SCRIPT/enhanced/#{image_name}.png]
  puts "==> Moving images..."
  %x[mv #{dir}/#{image_name}.png #{dir}/PDF_SCRIPT/original/#{image_name}.png]
  i = i+1
end # each
I know this isn't the cleanest code, but it worked for me.
