How to calculate the highest word frequency in Ruby - ruby-on-rails

I have been working on this assignment for a Coursera Intro to Rails course. We have been tasked to write a program that calculates maximum word frequency in a text file. We have been instructed to create a method which:
Calculates the maximum number of times a single word appears in the given content and store in highest_wf_count.
Identify the words that were used the maximum number of times and store that in highest_wf_words.
When I run the rspec tests that were given to us, one test is failing. I printed my output to see what the problem is but haven't been able to fix it.
Here is my code, the rspec test, and what I get:
class LineAnalyzer
  attr_accessor :highest_wf_count
  attr_accessor :highest_wf_words
  attr_accessor :content
  attr_accessor :line_number
  def initialize(content, line_number)
    @content = content
    @line_number = line_number
    @highest_wf_count = 0
    @highest_wf_words = highest_wf_words
    calculate_word_frequency
  end
  def calculate_word_frequency()
    @highest_wf_words = Hash.new(0)
    @content.split.each do |word|
      @highest_wf_words[word.downcase!] += 1
      if @highest_wf_words.has_key?(word)
        @highest_wf_words[word] += 1
      else
        @highest_wf_words[word] = 1
      end
      @highest_wf_words.sort_by{|word, count| count}
      @highest_wf_count = @highest_wf_words.max_by {|word, count| count}
    end
  end
  def highest_wf_count()
    p @highest_wf_count
  end
end
This is the rspec code:
require 'rspec'
describe LineAnalyzer do
subject(:lineAnalyzer) { LineAnalyzer.new("test", 1) }
it "has accessor for highest_wf_count" do
is_expected.to respond_to(:highest_wf_count)
end
it "has accessor for highest_wf_words" do
is_expected.to respond_to(:highest_wf_words)
end
it "has accessor for content" do
is_expected.to respond_to(:content)
end
it "has accessor for line_number" do
is_expected.to respond_to(:line_number)
end
it "has method calculate_word_frequency" do
is_expected.to respond_to(:calculate_word_frequency)
end
context "attributes and values" do
it "has attributes content and line_number" do
is_expected.to have_attributes(content: "test", line_number: 1)
end
it "content attribute should have value \"test\"" do
expect(lineAnalyzer.content).to eq("test")
end
it "line_number attribute should have value 1" do
expect(lineAnalyzer.line_number).to eq(1)
end
end
it "calls calculate_word_frequency when created" do
expect_any_instance_of(LineAnalyzer).to receive(:calculate_word_frequency)
LineAnalyzer.new("", 1)
end
context "#calculate_word_frequency" do
subject(:lineAnalyzer) { LineAnalyzer.new("This is a really really really cool cool you you you", 2) }
it "highest_wf_count value is 3" do
expect(lineAnalyzer.highest_wf_count).to eq(3)
end
it "highest_wf_words will include \"really\" and \"you\"" do
expect(lineAnalyzer.highest_wf_words).to include 'really', 'you'
end
it "content attribute will have value \"This is a really really really cool cool you you you\"" do
expect(lineAnalyzer.content).to eq("This is a really really really cool cool you you you")
end
it "line_number attribute will have value 2" do
expect(lineAnalyzer.line_number).to eq(2)
end
end
end
This is the rspec output:
13 examples, 1 failure
Failed examples:
rspec ./course01/module02/assignment-Calc-Max-Word-Freq/spec/line_analyzer_spec.rb:42 # LineAnalyzer#calculate_word_frequency highest_wf_count value is 3
My output:
#<LineAnalyzer:0x00007fc7f9018858 @content="This is a really really really cool cool you you you", @line_number=2, @highest_wf_count=[nil, 10], @highest_wf_words={"this"=>2, nil=>10, "is"=>1, "a"=>1, "really"=>3, "cool"=>2, "you"=>3}>
Based on the test string, the word counts aren't correct.
"nil" is being included in the hash.
The hash is not being sorted by the value (count) like it should.
I tried several things to fix these problems and nothing has worked. I went through the lecture material again, but can't find anything that would help and the discussion boards are not often monitored for questions from students.

According to the Ruby documentation:
downcase!(*args) public
Downcases the contents of str, returning nil if no changes were made.
Because of this behavior of downcase!, if the word is already all lowercase you end up incrementing the count stored under the nil key in this line:
@highest_wf_words[word.downcase!] += 1
The test also fails because @highest_wf_words.max_by {|word, count| count} returns a two-element array containing the word and its count, while we want only the count.
A simplified calculate_word_frequency method passing the tests would look like this:
def calculate_word_frequency()
  @highest_wf_words = Hash.new(0)
  @content.split.each do |word|
    # we don't have to check if the word existed before
    # because we set 0 as default value in @highest_wf_words hash
    # use .downcase instead of .downcase!
    @highest_wf_words[word.downcase] += 1
    # extract only the count, and then get the max
    @highest_wf_count = @highest_wf_words.map {|word, count| count}.max
  end
end
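The assignment text asks for highest_wf_words to hold the words themselves rather than the whole frequency hash. A fuller sketch along those lines, which is my own guess at the intent and not the course's reference solution (the local name frequencies is an assumption), might look like this:

```ruby
class LineAnalyzer
  attr_accessor :highest_wf_count, :highest_wf_words, :content, :line_number

  def initialize(content, line_number)
    @content = content
    @line_number = line_number
    @highest_wf_count = 0
    @highest_wf_words = []
    calculate_word_frequency
  end

  def calculate_word_frequency
    # Hash.new(0) makes missing keys start at 0, so += 1 is always safe.
    frequencies = Hash.new(0)
    @content.split.each { |word| frequencies[word.downcase] += 1 }

    # values.max is nil for an empty line, so fall back to 0.
    @highest_wf_count = frequencies.values.max || 0
    # Keep only the words whose count equals the maximum.
    @highest_wf_words = frequencies.select { |_word, count| count == @highest_wf_count }.keys
  end
end

analyzer = LineAnalyzer.new("This is a really really really cool cool you you you", 2)
p analyzer.highest_wf_count # => 3
p analyzer.highest_wf_words # => ["really", "you"]
```

Keeping the hash in a local variable means the two public attributes only ever contain what the assignment describes.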

Nil:
The nil is from downcase!
This modifies the String in place and returns nil if nothing has changed.
If you say "this is weird", then you are right (IMHO).
# just use the non destructive variant
word.downcase
Sorting:
sort_by returns a new array (for a Hash, an array of [key, value] pairs) and does not modify the receiver of the method. You need to re-assign the result or, for an Array, use sort_by!, which sorts in place; Hash has no sort_by!.
unsorted = [3, 1, 2]
sorted = unsorted.sort
p unsorted # => [3, 1, 2]
p sorted # => [1, 2, 3]
unsorted.sort!
p unsorted # => [1, 2, 3]
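For a hash like the one in the question, a small sketch (the sample counts hash is made up for illustration) shows that sort_by hands back an array of pairs and leaves the hash itself untouched:

```ruby
counts = { "cool" => 2, "really" => 3, "this" => 1 }

sorted = counts.sort_by { |_word, count| count }
p sorted      # => [["this", 1], ["cool", 2], ["really", 3]]
p counts.keys # => ["cool", "really", "this"]

# Re-assign (converting back with to_h) if a sorted hash is wanted.
counts = sorted.to_h
p counts.keys # => ["this", "cool", "really"]
```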
Faulty word count:
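Once you have corrected those two mistakes it should look better. Be aware that max_by does not return a single integer but a two-element array with the word and its count, so the result will look something like this: ["really", 6]. The count is 6 rather than 3 because the loop increments each word twice per occurrence, once on the += 1 line and once more in the if/else branch.
Simplifying things: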
If you can use ruby 2.7, then there is the handy Enumerable#tally method!
%w(foo foo bar foo baz foo).tally
=> {"foo"=>4, "bar"=>1, "baz"=>1}
Example taken from
https://medium.com/@baweaver/ruby-2-7-enumerable-tally-a706a5fb11ea
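Applied to the test string from the question, tally could replace the manual counting loop. This is only a sketch, assuming Ruby 2.7 or newer; the local variable names are my own:

```ruby
content = "This is a really really really cool cool you you you"

# Lowercase each word first so "This" and "this" share one key.
frequencies = content.split.map(&:downcase).tally

highest_wf_count = frequencies.values.max
highest_wf_words = frequencies.select { |_word, count| count == highest_wf_count }.keys

p highest_wf_count # => 3
p highest_wf_words # => ["really", "you"]
```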

Related

How subject works in RSpec in Ruby on Rails

I recently learned about RSpec tests in Rails. The page at https://relishapp.com/rspec/rspec-core/v/3-6/docs/subject/explicit-subject says "The subject is memoized within an example but not across examples" and gives the code below:
RSpec.describe Array do
# This uses a context local variable. As you can see from the
# specs, it can mutate across examples. Use with caution.
element_list = [1, 2, 3]
subject { element_list.pop }
it "is memoized across calls (i.e. the block is invoked once)" do
expect {
3.times { subject }
}.to change{ element_list }.from([1, 2, 3]).to([1, 2])
expect(subject).to eq(3)
end
it "is not memoized across examples" do
expect{ subject }.to change{ element_list }.from([1, 2]).to([1])
expect(subject).to eq(2)
end
end
Can anyone explain to me:
why 3.times { subject } executes element_list.pop only once?
the example named "is not memoized across examples": I expected element_list to still be [1, 2, 3] there, but in this example it is only [1, 2]. Why?
Thank you.
First, by way of clarification, the RSpec subject helper is nothing more than a special case of RSpec's let helper. Using subject { element_list.pop } is equivalent to let(:subject) { element_list.pop }.
Like any other let, subject is evaluated once per example. If subject has a value, that value is returned without re-evaluation. The general term for this is "memoization."
Ruby's ||= operator does the same thing. It says, "if a value exists in that variable, return the value, otherwise evaluate the expression, assign it to the variable, and return the value." You can see this concept in action in the console with this example:
>> element_list = [1, 2, 3]
>> subject ||= element_list.pop
=> 3
>> element_list
=> [1, 2]
>> subject ||= element_list.pop
=> 3
>> element_list
=> [1, 2]
>> subject
=> 3
The fact that subject is not memoized across examples means that its value is reset when a new example is executed. So for your next it block, subject will start out unassigned and its expression will be re-evaluated when it is first used in that next it block.
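A rough plain-Ruby sketch of that behaviour follows. ToyExample is a made-up stand-in for a single example and is not how RSpec is actually implemented; it only mimics "memoized within an example, reset for the next one":

```ruby
# Toy stand-in for one RSpec example: caches the block's value
# the first time it is asked for, like subject/let do.
class ToyExample
  def initialize(&definition)
    @definition = definition
    @memoized = {}
  end

  def subject
    # fetch runs the block only when :subject has no cached value yet.
    @memoized.fetch(:subject) { @memoized[:subject] = @definition.call }
  end
end

element_list = [1, 2, 3]
definition = -> { element_list.pop }

first_example = ToyExample.new(&definition)
3.times { first_example.subject }
p first_example.subject # => 3
p element_list          # => [1, 2]

# A new example starts with an empty cache, so the block runs again
# against the list that the first example already mutated.
second_example = ToyExample.new(&definition)
p second_example.subject # => 2
p element_list           # => [1]
```

The list shrinks across examples because element_list lives outside the examples; only the cached subject value is reset.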

Rails 4 model custom validation with method - works in console but fails in spec

Attempting to write a custom model validation and having some trouble. I'm using a regular expression to confirm that a decimal amount is validated to be in the following format:
First digit between 0 and 4
Format as "#.##" - i.e. a decimal number with precision 3 and scale 2. I want 2 digits behind the decimal.
nil values are okay
While the values are nominally numeric, I decided to give the column a data type of string, in order to make it easier to use a regular expression for comparison, without having to bother with the #to_s method. Since I won't be performing any math with the contents this seemed logical.
The regular expression has been tested on Rubular - I'm very confident with it. I've also defined the method in the ruby console, and it appears to be working fine there. I've followed the general instructions on the Rails Guides for Active Record Validations but I'm still getting validation issues that have me headscratching.
Here is the model validation for the column :bar -
class Foo < ActiveRecord::Base
validate :bar_format
def bar_format
unless :bar =~ /^([0-4]{1}\.{1}\d{2})$/ || :bar == nil
errors.add(:bar, "incorrect format")
end
end
end
The spec for Foo:
require 'rails_helper'
describe Foo, type: :model do
let(:foo) { build(:foo) }
it "has a valid factory" do
expect(foo).to be_valid
end
describe "bar" do
it "can be nil" do
foo = create(:foo, bar: nil)
expect(foo).to be_valid
end
it "accepts a decimal value with precision 3 and scale 2" do
foo = create(:foo, bar: "3.50")
expect(foo).to be_valid
end
it "does not accept a decimal value with precision 4 and scale 3" do
expect(create(:foo, bar: "3.501")).not_to be_valid
end
end
end
All of these specs fail for validation on bar:
ActiveRecord::RecordInvalid:
Validation failed: bar incorrect format
In the ruby console I've copied the method bar_format as follows:
irb(main):074:0> def bar_format(bar)
irb(main):075:1> unless bar =~ /^([0-4]{1}\.{1}\d{2})$/ || bar == nil
irb(main):076:2> puts "incorrect format"
irb(main):077:2> end
irb(main):078:1> end
=> :bar_format
irb(main):079:0> bar_format("3.50")
=> nil
irb(main):080:0> bar_format("0.0")
incorrect format
=> nil
irb(main):081:0> bar_format("3.5")
incorrect format
=> nil
irb(main):082:0> bar_format("3.1234")
incorrect format
=> nil
irb(main):083:0> bar_format("3.00")
=> nil
The method returns nil for a correctly formatted entry, and it returns the error message for an incorrectly formatted entry.
Suspecting this has something to do with the validation logic, but as far as I can understand, validations look at the errors hash to determine if the item is valid or not. The logical structure of my validation matches the structure in the example on the Rails Guides, for custom methods.
* EDIT *
Thanks Lazarus for suggesting that I remove the colon from the :bar so it's not a symbol in the method. After doing that, most of the tests pass. However: I'm getting a weird failure on two tests that I can't understand. The code for the tests:
it "does not accept a decimal value with precision 4 and scale 3" do
expect(create(:foo, bar: "3.501")).not_to be_valid
end
it "generates an error message for an incorrect decimal value" do
foo = create(:foo, bar: "4.506")
expect(scholarship.errors.count).to eq 1
end
After turning the symbol :bar into a variable bar the other tests pass, but for these two I get:
Failure/Error: expect(create(:foo, bar: "3.501")).not_to be_valid
ActiveRecord::RecordInvalid:
Validation failed: 3 501 incorrect format
Failure/Error: foo = create(:bar, min_gpa: "4.506")
ActiveRecord::RecordInvalid:
Validation failed: 4 506 incorrect format
Any ideas why it's turning the input "3.501" to 3 501 and "4.506" to 4 506?
You use the symbol instead of the variable when checking against the regex or for nil.
unless :bar =~ /^([0-4]{1}\.{1}\d{2})$/ || :bar == nil
errors.add(:bar, "incorrect format")
end
Remove the : from the :bar
* EDIT *
It's not the specs that are failing but the model's validations upon creation. You should use build instead of create
Don't use a symbol to refer to the attribute; call the bar method instead.
class Foo < ActiveRecord::Base
validate :bar_format
def bar_format
unless bar =~ /^([0-4]{1}\.{1}\d{2})$/ || bar == nil
errors.add(:bar, "incorrect format")
end
end
end
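To see the difference outside Rails, here is a minimal plain-Ruby sketch. The helper name bar_format_ok? is hypothetical, and swapping ^ and $ for \A and \z is my own suggestion rather than part of the answer: in Ruby, ^ and $ match at line boundaries, so a value with a trailing extra line would slip through the original pattern.

```ruby
# Hypothetical helper mirroring the validation rule: nil is allowed,
# otherwise the whole string must look like "d.dd" with a first digit 0-4.
def bar_format_ok?(bar)
  bar.nil? || bar.match?(/\A[0-4]\.\d{2}\z/)
end

p bar_format_ok?(nil)        # => true
p bar_format_ok?("3.50")     # => true
p bar_format_ok?("3.501")    # => false
p bar_format_ok?("5.00")     # => false
p bar_format_ok?("3.50\nx")  # => false

# The symbol :bar is compared as the text "bar", which never matches,
# so the original check added an error on every record.
p(:bar =~ /^([0-4]{1}\.{1}\d{2})$/) # => nil

# With ^ and $ the multi-line value is accepted (match at index 0).
p("3.50\nx" =~ /^([0-4]{1}\.{1}\d{2})$/) # => 0
```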
But if you want a regex for decimals like '1.0', '1.11', '1111.00', I advise you to use this one:
/^\d+(\.\d{1,2})?$/
If you need a regex for money, here it is:
/^(\d{1,3}\,){0,}(\d{1,3}\.\d{1,2})$/
Good luck ^^

Ruby: Parsing arrays into categories, returning symbols

I am working on a ruby challenge requesting that I create a method that inputs an array of strings and separates the strings into 3 categories returned as symbols. These symbols will return in an array.
If the string contains the word "cat", then it returns the symbol :cat.
If "dog", then it returns :dog.
If the string does not contain "dog" or "cat", it returns the symbol :none.
So far I have the following code, but I am having trouble getting it to pass.
def pets (house)
if house.include?/(?i:cat)/
:cat = house
elsif house.include?/(?i:dog)/
:dog = house
else
:none = house
end
end
input = [ "We have a dog", "Cat running around!", "All dOgS bark", "Nothing to see here", nil ]
It should return [ :dog, :cat, :dog, :none, :none ]
I'm surprised that nobody went for the case/when approach, so here it is:
def pets(house)
house.map do |item|
case item
when /dog/i
:dog
when /cat/i
:cat
else
:none
end
end
end
map isn't that complicated: you use it whenever you have an array of n elements that you want to turn into another array of n elements.
I suspect people don't use case/when because they can't remember the syntax, but it's designed for just this situation, when you're testing one item against multiple alternatives. It's much cleaner than the if/elsif/elsif syntax, IMHO.
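Why the case/when version copes with the nil entry can be seen in isolation. A minimal sketch, using strings from the question's input:

```ruby
# `when /dog/i` is evaluated as /dog/i === item.
p(/dog/i === "All dOgS bark")       # => true
p(/dog/i === "Cat running around!") # => false

# Regexp#=== returns false for nil instead of raising,
# which is why the nil entry falls through to `else`.
p(/dog/i === nil)                   # => false
```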
def pets (house)
results = []
house.each do |str|
if str.to_s.downcase.include?('dog')
results << :dog
elsif str.to_s.downcase.include?('cat')
results << :cat
else
results << :none
end
end
return results
end
This works. And here's the above code, written in pseudo-code (plain English, following a code-like thought process) so you can see how I've come to the above solution.
def pets (house)
# Define an empty array of results
#
# Now, loop over every element in the array
# that was passed in as a parameter:
#
# If that element contains 'dog',
# Then add :dog to the results array.
# If that element contains 'cat'
# Then add :cat to the results array
# Otherwise,
# Add :none to the results array
#
# Finally, return the array of results.
end
There are a few concepts you seem to be not quite solid on, and I don't think I'll be able to explain them effectively here within a reasonable length. If at all possible, try to meet an experienced programmer face to face and go through the problem; it will be far easier than trying to battle it out yourself.
Here is a solution using the Array#map method.
def pets (house)
house.map do |animal|
if animal.to_s.downcase.include?('cat')
:cat
elsif animal.to_s.downcase.include?('dog')
:dog
else
:none
end
end
end
You can just use a list of matchers, whose symbols double as the results, as follows:
ma = [ :cat, :dog ]
input = [ "We have a dog", "Cat running around!", "All dOgS bark", "Nothing to see here", nil ]
input.map {|s| ma.reduce(:none) {|result,m| s.to_s =~ /#{m}/i && m || result } }
# => [:dog, :cat, :dog, :none, :none]
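The same idea can be written with an actual hash that maps each pattern to its symbol. This is a sketch of my own rather than the code above; the local name matchers is an assumption, and unlike the reduce version it returns the first pattern that matches rather than the last.

```ruby
def pets(house)
  # Patterns are checked in insertion order; the first match wins.
  matchers = { /cat/i => :cat, /dog/i => :dog }
  house.map do |item|
    # find returns the first [pattern, symbol] pair that matches, or nil.
    _pattern, result = matchers.find { |pattern, _symbol| pattern === item }
    result || :none
  end
end

input = [ "We have a dog", "Cat running around!", "All dOgS bark", "Nothing to see here", nil ]
p pets(input) # => [:dog, :cat, :dog, :none, :none]
```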

Ruby undefined method `id' for nil:NilClass or Loop error

...
begin
  last = Folder.where(name: @folders.last, user_id: @user_id)
  prev = Folder.where(name: @folders[count], user_id: @user_id)
  for z in 0..last.count
    for x in 0..prev.count
      valid = Folder.exists?(name: last[z].name, parent_id: prev[x].id)
      case valid
      when true
        @test += valid.to_s
        @ids << Folder.find_by(id: prev[x].id).id
        #@ids = @ids[0].id
      else
      end
    end
  end
  @test += 'MSG'
rescue Exception => e
  @test = e.message
  valid = false
else
end
This is a portion of the code. Everything works fine except the line after the loops, @test += 'MSG', which never runs. The rescue block catches an exception that says undefined method `id' for nil:NilClass, yet the loop does return the ids, so it seems to be working. What is the issue, and why does the code after the two loops not run?
The loop should be 0..(last.count-1) and 0..(prev.count-1), to account for 0 index.
Or a more readable excluded end range (as suggested by Neil Slater)
0...last.count and 0...prev.count
EDIT
Let's say last has 3 items in it. Then looping through 0..3 will go through
last[0], last[1], last[2], last[3] # (4 items), which will result in an error
So instead, you should loop through 0..2 or 0...3 (three dots means exclude the last number)
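A short sketch of the off-by-one, using a plain array of strings as a stand-in for the three records (last_folders is a made-up name):

```ruby
last_folders = %w[a b c] # stand-in for the three records in `last`

inclusive = (0..last_folders.count).to_a
exclusive = (0...last_folders.count).to_a
p inclusive # => [0, 1, 2, 3]
p exclusive # => [0, 1, 2]

# Index 3 is past the end, so the lookup gives nil...
p last_folders[3] # => nil

# ...and calling a method on that nil raises NoMethodError,
# the same kind of error the rescue block reported.
begin
  last_folders[3].upcase
rescue NoMethodError => e
  puts "rescued: #{e.class}"
end
```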
Your problem is that your iterators x and z get too large and reference empty array indexes. But your code does not actually need them, as you only use x and z to index into the separate arrays.
It is quite rare in Ruby to use for loops to iterate through an Array. The core Array class has many methods that give ways to iterate through and process lists of objects, and it is usually possible to find one that does more precisely what you want, simplifying your code and improving readability.
Your code could be re-written using Array#each:
last = Folder.where(name: @folders.last, user_id: @user_id)
prev = Folder.where(name: @folders[count], user_id: @user_id)
last.each do |last_folder|
  prev.each do |prev_folder|
    valid = Folder.exists?(name: last_folder.name, parent_id: prev_folder.id)
    case valid
    when true
      @test += valid.to_s
      @ids << Folder.find_by(id: prev_folder.id).id
    else
    end
  end
end
@test += 'MSG'
... etc

Match multiple yields in any order

I want to test an iterator using rspec. It seems to me that the only possible yield matcher is yield_successive_args (according to https://www.relishapp.com/rspec/rspec-expectations/v/3-0/docs/built-in-matchers/yield-matchers). The other matchers are used only for single yielding.
But yield_successive_args fails if the yielding is in other order than specified.
Is there any method or nice workaround for testing iterator that yields in any order?
Something like the following:
expect { |b| array.each(&b) }.to yield_multiple_args_in_any_order(1, 2, 3)
Here is the matcher I came up with for this problem. It's fairly simple and should work with a good degree of efficiency.
require 'set'
RSpec::Matchers.define :yield_in_any_order do |*values|
expected_yields = Set[*values]
actual_yields = Set[]
match do |blk|
blk[->(x){ actual_yields << x }] # ***
expected_yields == actual_yields # ***
end
failure_message do |actual|
"expected to receive #{surface_descriptions_in expected_yields} "\
"but #{surface_descriptions_in actual_yields} were yielded."
end
failure_message_when_negated do |actual|
"expected not to have all of "\
"#{surface_descriptions_in expected_yields} yielded."
end
def supports_block_expectations?
true
end
end
I've highlighted the lines containing most of the important logic with # ***. It's a pretty straightforward implementation.
Usage
Just put it in a file, under spec/support/matchers/, and make sure you require it from the specs that need it. Most of the time, people just add a line like this:
Dir[File.dirname(__FILE__) + "/support/**/*.rb"].each {|f| require f}
to their spec_helper.rb but if you have a lot of support files, and they aren't all needed everywhere, this can get a bit much, so you may want to only include it where it is used.
Then, in the specs themselves, the usage is like that of any other yielding matcher:
class Iterator
def custom_iterator
(1..10).to_a.shuffle.each { |x| yield x }
end
end
describe "Matcher" do
it "works" do
iter = Iterator.new
expect { |b| iter.custom_iterator(&b) }.to yield_in_any_order(*(1..10))
end
end
This can be solved in plain Ruby using a set intersection of arrays:
array1 = [3, 2, 4]
array2 = [4, 3, 2]
expect(array1).to eq (array1 & array2)
# for an enumerator:
enumerator = array1.each
expect(enumerator.to_a).to eq (enumerator.to_a & array2)
The intersection (&) will return items that are present in both collections, keeping the order of the first argument.
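Note that the intersection check above passes whenever every element of array1 is also in array2, even if some expected values are missing. As a hedged alternative sketch in plain Ruby (the iterator and variable names are made up), collecting the yielded values and comparing sorted copies checks both directions. In a spec, RSpec's built-in contain_exactly matcher performs the same order-independent comparison on the collected array.

```ruby
# Hypothetical iterator that yields its values in an arbitrary order.
def custom_iterator
  [3, 1, 2].each { |x| yield x }
end

yielded = []
custom_iterator { |x| yielded << x }

expected = [1, 2, 3]
p yielded.sort == expected.sort # => true

# The intersection alone does not notice a value that was never yielded.
partial = [3, 2]
p partial == (partial & expected) # => true
p partial.sort == expected.sort   # => false

# In a spec the equivalent would be:
#   expect(yielded).to contain_exactly(1, 2, 3)
```

Unlike a Set-based comparison, sorting also keeps duplicates significant, so a value yielded twice is not silently accepted.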

Resources