Ruby tempfile anomalous behavior - ruby-on-rails

This is my pry session output:
[1] pry(SomeTask)> epub
=> #<File:/somepath/tmp/x.epub>
[2] pry(SomeTask)> epub.size
=> 134
[3] pry(SomeTask)> File.size("/somepath/tmp/x.epub")
=> 44299
[4] pry(SomeTask)> epub.class
=> Tempfile
I see that File.size yields a different result than the size method of the Tempfile instance.
How is this possible?

The devil is in the details. From the docs for Tempfile#size (emphasis mine):
size()
Returns the size of the temporary file. As a side effect, the IO buffer is flushed before determining the size.
What's happening is that you're using File.size to read the size of the file before the buffer has been flushed—i.e. before all of the bytes have been written to the file—and then you're using Tempfile#size, which flushes that buffer before it calculates the size:
tmp = Tempfile.new('foo')
tmp.write('a' * 1000)
File.size(tmp)
# => 0
tmp.size
# => 1000
But see what happens when you call tmp.size before File.size(tmp):
tmp = Tempfile.new('bar')
tmp.write('a' * 1000)
tmp.size
# => 1000
File.size(tmp)
# => 1000
You can get the behavior you want out of File.size by manually flushing the buffer:
tmp = Tempfile.new('baz')
tmp.write('a' * 1000)
tmp.flush
File.size(tmp)
# => 1000
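As a hedged aside, the same effect can be sketched without calling flush before every size check by turning on IO#sync, which makes Ruby pass each write straight to the operating system. The helper name sizes_around_flush is made up for this illustration and does not come from the answer above:

```ruby
require 'tempfile'

# Hypothetical helper: writes 1000 bytes and returns
# [File.size before an explicit flush, File.size after it].
def sizes_around_flush(sync)
  tmp = Tempfile.new('demo')
  tmp.sync = sync              # true: Ruby does not hold writes in its own buffer
  tmp.write('a' * 1000)
  before = File.size(tmp.path) # what the filesystem reports right after the write
  tmp.flush
  after = File.size(tmp.path)  # what it reports once the buffer is flushed
  [before, after]
ensure
  tmp.close! if tmp            # close and delete the temporary file
end

p sizes_around_flush(false)    # buffered: the first number may lag behind
p sizes_around_flush(true)     # => [1000, 1000]
```

With sync enabled every small write becomes a system call, so this trades throughput for an always-current File.size.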

I'm using Pry version 0.10.1 on Ruby 2.2.2 and can't duplicate that situation:
[1] (pry) main: 0> foo = Tempfile.new('foo')
#<File:/var/folders/yb/whn8dwns6rl92jswry5cz87dsgk2n1/T/foo20150819-83612-1tpkqm4>
[2] (pry) main: 0> File.size(foo.path)
=> 0
[3] (pry) main: 0> foo.size
=> 0
After initialization, the file size is 0 bytes.
[4] (pry) main: 0> foo.write('a')
=> 1
[5] (pry) main: 0> File.size(foo.path)
=> 0
After writing one character to foo, the data has been buffered and not flushed to disk as I'd expect.
[6] (pry) main: 0> foo.size
=> 1
[7] (pry) main: 0> File.size(foo.path)
=> 1
foo.size flushes the buffer then returns the size of the file, which matches what File.size says it is.
When dealing with temporary files created by Tempfile, we usually don't need to know their size. They're temporary, will disappear (eventually), and are treated like buffers. If you need a more permanent file, create and write to a normal file.

Related

Ruby FrozenString error in Sidekiq Worker

I am using Ruby on Rails and trying to use a Sidekiq worker, but at some point I'm running into an issue where the worker calls a view, the view calls a concern, and then the concern isn't able to update a variable in its function because of the FrozenString error.
For example, here's how my worker looks:
class ReportGeneratorWorker
  include Sidekiq::Worker, ReportHelper
  sidekiq_options queue: Rails.env.to_sym
  def perform
    ac_base = ApplicationController.new
    body_html = ac_base.render_to_string template: "common/report_templates/generate_pdf.html.erb", layout: false
  end
end
Again, the view inserts text that leverages a concern, but the concern doesn't allow it to update. See below for example:
[3] pry(#<#<Class:0x00007fec8ceea400>>)> html
=> "<ul>"
[4] pry(#<#<Class:0x00007fec8ceea400>>)> html.class.name
=> "String"
[5] pry(#<#<Class:0x00007fec8ceea400>>)> html << "Hello"
FrozenError: can't modify frozen String
from (pry):5:in `replacement_text'
Any idea why this is happening? If I define the variable again from the Pry console, then it actually works:
[1] pry(#<#<Class:0x0000557f07a25130>>)> html
=> "<ul>"
[2] pry(#<#<Class:0x0000557f07a25130>>)> html << "TEST"
FrozenError: can't modify frozen String
from (pry):2:in `replacement_text'
[3] pry(#<#<Class:0x0000557f07a25130>>)> html = "<ul>"
=> "<ul>"
[4] pry(#<#<Class:0x0000557f07a25130>>)> html << "TEST"
=> "<ul>TEST"
[5] pry(#<#<Class:0x0000557f07a25130>>)>
I was able to resolve this issue by replacing << with +=.
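The post does not say why html is frozen. A common cause, assumed here, is a `# frozen_string_literal: true` magic comment or an explicit `.freeze` somewhere upstream. Under that assumption, the following minimal sketch shows why `<<` fails while `+=` works; the constant and method names are illustrative only:

```ruby
# Stand-in for the frozen `html` variable from the question (an assumption).
FROZEN_HTML = '<ul>'.freeze

# `<<` mutates the receiver in place, so it raises on a frozen string.
def try_append_in_place(str, extra)
  str << extra
rescue FrozenError          # FrozenError is available since Ruby 2.5
  :frozen_error
end

# `+=` calls String#+, which builds a new unfrozen string and rebinds the
# local variable; the original frozen object is left untouched.
def append_with_reassignment(str, extra)
  str += extra
  str
end

# Alternative: take a mutable copy first, then keep using `<<`.
def append_to_mutable_copy(str, extra)
  buffer = str.dup          # dup of a frozen string is not frozen
  buffer << extra
end

p try_append_in_place(FROZEN_HTML, 'Hello')               # => :frozen_error
p append_with_reassignment(FROZEN_HTML, '<li>Hello</li>') # => "<ul><li>Hello</li>"
p append_to_mutable_copy(FROZEN_HTML, '<li>Hello</li>')   # => "<ul><li>Hello</li>"
```

Note that `+=` allocates a new string on every call, so when appending many times in a loop, a mutable copy used with `<<` is usually the cheaper choice.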

How do I programmatically find out the schedule of a delayed mailer job with Resque Mailer and Resque scheduler?

I am trying to display the next time an email is scheduled using any or all of the below arguments as inputs. I'm using resque, resque-scheduler and resque-mailer.
For example, above are the delayed jobs as displayed in the resque web interface. So I'd like to input "game_starting_reminder" and/or 226 and/or "Beat Box" and be able to then display the timestamp as such:
"Next scheduled email: 2017-10-31 at 9:30 pm".
However, when I try to retrieve this information in the console, I get the output shown below.
I've tried extending the delay_extensions and methods and using the find_delayed_selection method, but that doesn't seem to work.
For example this:
[18] pry(main)> Resque.find_delayed_selection { |job| job["class"] == QuizMailer}
TypeError: no implicit conversion of String into Integer
Or this:
[32] pry(main)> Resque.find_delayed_selection { {
[32] pry(main)* "class": "QuizMailer",
[32] pry(main)* "args": ["game_starting_reminder", [226, "Beat Box"]],
[32] pry(main)* "queue": "mailer"
[32] pry(main)* }}
=> ["{\"class\":\"QuizMailer\",\"args\":[\"game_starting_reminder\",[226,\"Beat Box\"]],\"queue\":\"mailer\"}",
"{\"class\":\"QuizMailer\",\"args\":[\"game_ending_reminder\",[226,\"Beat Box\"]],\"queue\":\"mailer\"}"]
Any other method I can use here? Or tips.
Thank you!
Figured it out. The scheduled_at method is the best candidate here for the job.
The first step is to add the DelayingExtensions module to the project. I added the file from the resque source code on GitHub to initializers and then added the include line to resque.rb:
#resque.rb
rails_root = ENV['RAILS_ROOT'] || File.dirname(__FILE__) + '/../..'
rails_env = ENV['RAILS_ENV'] || 'development'
resque_config = YAML.load_file(rails_root + '/config/resque.yml')
Resque.redis = resque_config[rails_env]
include DelayingExtensions
I modified the scheduled_at method from the GitHub source code slightly, because I couldn't get it to work as is, and renamed it to scheduled_for_time:
#delaying_extensions.rb
def scheduled_for_time(klass, *args)
  args = args[0]
  search = encode(job_to_hash(klass, args))
  redis.smembers("timestamps:#{search}").map do |key|
    key.tr('delayed:', '').to_i
  end
end
In this case, we can do the following in the console:
[2] pry(main)> klass =QuizMailer
=> QuizMailer
[4] pry(main)> args = ["game_starting_reminder", [230, "Beat Box"]]
=> ["game_starting_reminder", [230, "Beat Box"]]
[5] pry(main)> Resque.scheduled_for_time(QuizMailer, args)
=> [1515081600]
[6] pry(main)> Time.at(_.first)
=> 2018-01-04 21:30:00 +0530
Voila!
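To get from the list of epoch timestamps to the "Next scheduled email: ..." label asked for in the question, something like the following sketch could work. The helper name next_scheduled_label and the default UTC offset are assumptions for illustration; the input is the kind of array that scheduled_for_time returns above:

```ruby
# Hypothetical helper: turns an array of epoch seconds into a display label.
# utc_offset is an assumed default matching the +0530 zone in the transcript.
def next_scheduled_label(timestamps, utc_offset = '+05:30')
  return 'No email scheduled' if timestamps.empty?

  # The earliest timestamp is the next run.
  time = Time.at(timestamps.min).getlocal(utc_offset)

  # %-l is the unpadded 12-hour clock hour, %P is lowercase am/pm.
  "Next scheduled email: #{time.strftime('%Y-%m-%d at %-l:%M %P')}"
end

puts next_scheduled_label([1515081600])
# prints "Next scheduled email: 2018-01-04 at 9:30 pm"
```

In a Rails app, Time.zone.at(timestamp) would be the more usual way to apply the configured time zone; the fixed offset here just keeps the sketch free of Rails.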

The provided regular expression is using multiline anchors (^ or $), which may present a security risk

I am using Ruby 2.3.0 and Rails 4.2.5, and I am following this tutorial: https://www.sitepoint.com/youtube-rails/
It uses a regular expression to validate a YouTube URL. I copied it into my Rails application, and it works fine in the Rails console:
[1] pry(main)> YT_LINK_FORMAT = /^(?:https?:\/\/)?(?:www\.)?youtu(?:\.be|be\.com)\/(?:watch\?v=)?([\w-]{10,})/
=> /^(?:https?:\/\/)?(?:www\.)?youtu(?:\.be|be\.com)\/(?:watch\?v=)?([\w-]{10,})/
[2] pry(main)> video_url = "https://www.youtube.com/watch?v=aZngT1Eas4w"
=> "https://www.youtube.com/watch?v=aZngT1Eas4w"
[3] pry(main)> uid = video_url.match(YT_LINK_FORMAT)
=> #<MatchData "https://www.youtube.com/watch?v=aZngT1Eas4w" 1:"aZngT1Eas4w">
[4] pry(main)> uid[2]
=> nil
[5] pry(main)> uid[1]
=> "aZngT1Eas4w"
[6] pry(main)> video_url = "https://youtu.be/aZngT1Eas4w"
=> "https://youtu.be/aZngT1Eas4w"
[7] pry(main)> uid = video_url.match(YT_LINK_FORMAT)
=> #<MatchData "https://youtu.be/aZngT1Eas4w" 1:"aZngT1Eas4w">
[8] pry(main)> uid[1]
=> "aZngT1Eas4w"
But when I run my Rails application, I get this error:
"The provided regular expression is using multiline anchors (^ or $), which may present a security risk. Did you mean to use \A and \z, or forgot to add the :multiline => true option?"
I also tried this regular expression:
/^(?:https?:\/\/)?(?:www\.)?youtu(?:\.be|be\.com)\/(?:watch\?v=)?([\w-]{10,})/
=> /^(?:https?:\/\/)?(?:www\.)?youtu(?:\.be|be\.com)\/(?:watch\?v=)?([\w-]{10,})/
But the same problem is still there.
This regular expression works in Rails 5:
YT_LINK_FORMAT = /(http:\/\/|https:\/\/|)(www.)?(youtu(be\.com|\.be|be\.com))\/(video\/|embed\/|watch\?v=|v\/)?([A-Za-z0-9._%-]*)(\&\S+)?/
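The error message itself points at the fix: in Ruby, ^ and $ match at the start and end of every line, while \A and \z match only at the start and end of the whole string. Here is a minimal sketch of the difference, reusing the pattern from the question. The module and constant names are made up for illustration:

```ruby
module YoutubeLink
  # Pattern from the question: ^ matches at the start of ANY line.
  LINE_ANCHORED = /^(?:https?:\/\/)?(?:www\.)?youtu(?:\.be|be\.com)\/(?:watch\?v=)?([\w-]{10,})/

  # Same pattern, anchored to the start of the whole string with \A.
  STRING_ANCHORED = /\A(?:https?:\/\/)?(?:www\.)?youtu(?:\.be|be\.com)\/(?:watch\?v=)?([\w-]{10,})/

  # Returns the captured video id, or nil when the URL does not match.
  def self.video_id(url, pattern = STRING_ANCHORED)
    match = pattern.match(url)
    match && match[1]
  end
end

good   = "https://www.youtube.com/watch?v=aZngT1Eas4w"
sneaky = "javascript:alert(1)\nhttps://youtu.be/aZngT1Eas4w"

p YoutubeLink.video_id(good)                               # => "aZngT1Eas4w"
p YoutubeLink.video_id(sneaky)                             # => nil
p YoutubeLink.video_id(sneaky, YoutubeLink::LINE_ANCHORED) # => "aZngT1Eas4w"
```

The last line shows the risk Rails warns about: with ^, a value whose second line looks like a YouTube URL passes validation even though its first line is arbitrary text. Neither pattern has a trailing \z, so both accept extra characters after the video id.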

ObjectSpace.each_object(Foo).count

I'm trying to figure out ObjectSpace.each_object.
In console:
class Foo; end
Foo.new
ObjectSpace.each_object(Foo).count
=> 1
GC.start
ObjectSpace.each_object(Foo).count
=> 1
I've seen examples and I know that the second count should be 0.
Any ideas what is going on here?
Thanks.
It depends on your console.
IRB
The last result is saved as _, even if it hasn't been explicitly assigned.
Running GC.start won't remove the last object:
irb(main):001:0> class Foo; end
=> nil
irb(main):002:0>
irb(main):003:0* Foo.new
=> #<Foo:0x007fca7a309f98>
irb(main):004:0> p ObjectSpace.each_object(Foo).count; GC.start; p ObjectSpace.each_object(Foo).count
1
1
=> 1
irb(main):005:0> p ObjectSpace.each_object(Foo).count; GC.start; p ObjectSpace.each_object(Foo).count
1
0
=> 0
Pry
You can access the last result and the second to last result with _ and __ :
[1] pry(main)> 'a'
=> "a"
[2] pry(main)> 'b'
=> "b"
[3] pry(main)> p _, __
"b"
"a"
=> ["b", "a"]
Pry saves the last 100 results in _out_, a Pry::HistoryArray:
[1] pry(main)> class Foo; end
=> nil
[2] pry(main)> Foo.new
=> #<Foo:0x007fd093102118>
[3] pry(main)> ObjectSpace.each_object(Foo).count
=> 1
[4] pry(main)> GC.start
=> nil
[5] pry(main)> ObjectSpace.each_object(Foo).count
=> 1
[6] pry(main)> _out_[2]
=> #<Foo:0x007fd093102118>
You can use _out_.pop! to remove its last element :
[1] pry(main)> class Foo; end
=> nil
[2] pry(main)> Foo.new
=> #<Foo:0x007fa90b1ad360>
[3] pry(main)> ObjectSpace.each_object(Foo).count
=> 1
[4] pry(main)> GC.start
=> nil
[5] pry(main)> ObjectSpace.each_object(Foo).count
=> 1
[6] pry(main)> 5.times{_out_.pop!}
=> 5
[7] pry(main)> GC.start
=> nil
[8] pry(main)> ObjectSpace.each_object(Foo).count
=> 0
Inside a script
If you execute :
class Foo; end
Foo.new
p ObjectSpace.each_object(Foo).count
GC.start
p ObjectSpace.each_object(Foo).count
inside a script, you get :
1
0
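The console cases above come down to one rule: an object that is still referenced cannot be collected. That rule can be sketched in a plain script, where a local variable plays the role of IRB's _ or Pry's _out_. The class and method names are illustrative only:

```ruby
class TrackedObject; end

# Creates an object, keeps a reference to it, runs the GC, and counts the
# live instances. Returning `held` keeps the object reachable throughout.
def count_while_holding_reference
  held = TrackedObject.new
  GC.start
  count = ObjectSpace.each_object(TrackedObject).count
  [count, held]
end

count, _held = count_while_holding_reference
puts count >= 1   # prints "true": the referenced instance survives GC.start
```

The check is `>= 1` rather than `== 1` on purpose: how many unreferenced instances remain after GC.start is up to the collector, whereas the referenced one is guaranteed to stay.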
GC.start does not force the garbage collector to start.
It is slightly unclear from the documentation, but it just instructs the engine to schedule a garbage collection. That said, one cannot rely on GC.start to immediately remove objects from the heap.

Rails serialization of Range of integers is broken

I need to serialize ruby Ranges using YAML, in a rails context.
I wanted to check if ranges of integers and ranges of strings were serialized properly.
Here was my test:
# classic irb
require 'yaml' # => true
YAML::VERSION # => "0.60"
YAML.dump(1..2) # => "--- !ruby/range \nbegin: 1\nend: 2\nexcl: false\n"
YAML.dump("1".."2") # => "--- !ruby/range \nbegin: \"1\"\nend: \"2\"\nexcl: false\n"
The two outputs are distinct, as they should be, so I went ahead and coded it inside my Rails application.
However, it seems that within a Rails context, Ruby forgets how to properly serialize a range of integers!
# ./script/rails console
Rails::VERSION::STRING # => "3.0.15"
RUBY_VERSION # => "1.8.7"
YAML::VERSION # => "0.60"
YAML.dump(1..2) # => "--- !ruby/range\n begin: 1\n end: 2\n excl: false"
YAML.dump("1".."2") # => "--- !ruby/range\n begin: 1\n end: 2\n excl: false"
# The two outputs are identical, the distinction between integers and strings is lost!
Both ruby and ruby on rails seem to use the same version of the YAML library.
If I'm not mistaken, my version of Ruby doesn't support switching between multiple coder engines.
I have a few questions:
What is the cause of this difference?
Does this problem arise with newer versions of ruby / rails?
How could I fix that properly, in a compatible manner?
Thank you for your help.
A range is a Ruby internal, not a YAML base type like an integer or string. Rather than encode the range as you are, use its start and end points and reconstruct the range on the receiving end.
I use something like:
[1] (pry) main: 0> range = 0..1
=> 0..1
[2] (pry) main: 0> require 'yaml'
=> true
[3] (pry) main: 0> YAML.dump(range)
=> "--- !ruby/range\nbegin: 0\nend: 1\nexcl: false\n"
[4] (pry) main: 0> YAML.dump({'min' => range.min, 'max' => range.max})
=> "---\nmin: 0\nmax: 1\n"
And then I can recreate the range on the receiving side using something like:
Range.new(*YAML.load(YAML.dump({'min' => range.min, 'max' => range.max})).values)
=> 0..1
or this if you're not sure that 'min' and 'max' will be in the right order:
[19] (pry) main: 0> Range.new(*YAML.load(YAML.dump({'min' => range.min, 'max' => range.max})).values_at('min', 'max'))
=> 0..1
Adding some information regarding Ruby 1.9.3+ serializing ranges of characters:
[2] (pry) main: 0> range = '0'..'1'
=> "0".."1"
[3] (pry) main: 0> YAML.dump(range)
=> "--- !ruby/range\nbegin: '0'\nend: '1'\nexcl: false\n"
[5] (pry) main: 0> RUBY_VERSION
=> "1.9.3"
And again with 1.9.2+:
[2] (pry) main: 0> range = '0'..'1'
=> "0".."1"
[3] (pry) main: 0> YAML.dump(range)
=> "--- !ruby/range \nbegin: \"0\"\nend: \"1\"\nexcl: false\n"
[4] (pry) main: 0> RUBY_VERSION
=> "1.9.2"
And, the workaround maintains the range start/end types:
[6] (pry) main: 0> Range.new(*YAML.load(YAML.dump({'min' => range.min, 'max' => range.max})).values_at('min', 'max'))
=> "0".."1"
In both cases YAML::VERSION is 0.60.
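One caveat with min and max: for an exclusive range such as 1...5, max is 4, so the excluded end is lost on the round trip. A hedged variation on the workaround stores begin, end and the exclusion flag instead. The helper names dump_range and load_range are made up for this sketch, and it assumes a Ruby recent enough to provide YAML.safe_load:

```ruby
require 'yaml'

# Serializes a range as a plain hash, so only YAML base types are emitted.
def dump_range(range)
  YAML.dump({
    'begin' => range.begin,
    'end'   => range.end,
    'excl'  => range.exclude_end?
  })
end

# Rebuilds the range; safe_load is enough because no Ruby-specific tags are used.
def load_range(yaml)
  data = YAML.safe_load(yaml)
  Range.new(data.fetch('begin'), data.fetch('end'), data.fetch('excl'))
end

p load_range(dump_range(1..2))      # => 1..2
p load_range(dump_range('1'..'2'))  # => "1".."2"
p load_range(dump_range(1...5))     # => 1...5
```

Because the endpoints are dumped as ordinary scalars, the emitter quotes the string "1", and integer and string ranges stay distinct.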
