I am working on a coding problem where I have 3 lines of text and I have to calculate the words that appear the most in those lines. The answer is: ['it','really','will'] because the text is:
This is a really really really cool experiment really
Cute little experiment
Will it work maybe it will work do you think it will it will
Everything works in the code below except the highest_count_words_across_lines method. It's supposed to return ['it','really','will'] but instead returns an array of three hashes: [{"a"=>1, "cool"=>1, "experiment"=>1, "is"=>1, "really"=>4, "this"=>1}, {"cute"=>1, "experiment"=>1, "little"=>1}, {"do"=>1, "it"=>4, "maybe"=>1, "think"=>1, "will"=>4, "work"=>2, "you"=>1}].
I've tried iterating through a hash with multiple select statements to no avail.
This is my full code so far:
class LineAnalyzer
  attr_accessor :highest_wf_count, :highest_wf_words, :content, :line_number #Implement the following read-only attributes in the LineAnalyzer class.
  def initialize(content, line)
    @content = content #* initialize the content and line_number attributes
    @line_number = line
    @highest_wf_count = 0
    calculate_word_frequency()
  end
  def calculate_word_frequency()
    @highest_wf_words = Hash.new
    words = @content.downcase.split
    words.each { |w|
      if @highest_wf_words.has_key?(w)
        @highest_wf_words[w] += 1
      else
        @highest_wf_words[w] = 1
      end
    }
    @highest_wf_words.sort_by { |word, count| count }
    @highest_wf_words.each do |key, value|
      if value > @highest_wf_count
        @highest_wf_count = value
      end
    end
  end
  def highest_wf_count= (number)
    @highest_wf_count = number
  end
end
class Solution
  attr_reader :analyzers, :highest_count_across_lines, :highest_count_words_across_lines # Implement the following read-only attributes in the Solution class.
  def initialize()
    @analyzers = []
    highest_count_across_lines = nil
    highest_count_words_across_lines = []
  end
  def analyze_file()
    File.foreach('test.txt').with_index(1) do |content, line|
      line_analyzer = LineAnalyzer.new(content, line)
      @analyzers << line_analyzer
    end
  end
  def calculate_line_with_highest_frequency()
    @highest_count_across_lines = analyzers.map(&:highest_wf_count).max
    @highest_count_words_across_lines = analyzers.select { |k,v| v = @highest_count_across_lines }
  end
  def print_highest_word_frequency_across_lines()
    "The following words have the highest frequency per line: \n #{highest_count_words_across_lines} (appears in line #{line_num} \n"
  end
end
This is the error message I get:
Failures:
1) Solution#calculate_line_with_highest_frequency calculates highest count words across lines to be will, it, really
Failure/Error: expect(words_found).to match_array ["will", "it", "really"]
expected collection contained: ["it", "really", "will"]
actual collection contained: [{"a"=>1, "cool"=>1, "experiment"=>1, "is"=>1, "really"=>4, "this"=>1}, {"cute"=>1, "experiment"=>1, "little"=>1}, {"do"=>1, "it"=>4, "maybe"=>1, "think"=>1, "will"=>4, "work"=>2, "you"=>1}]
the missing elements were: ["it", "really", "will"]
the extra elements were: [{"a"=>1, "cool"=>1, "experiment"=>1, "is"=>1, "really"=>4, "this"=>1}, {"cute"=>1, "experiment"=>1, "little"=>1}, {"do"=>1, "it"=>4, "maybe"=>1, "think"=>1, "will"=>4, "work"=>2, "you"=>1}]
# ./spec/solution_spec.rb:39:in `block (3 levels) in <top (required)>'
Finished in 0.26418 seconds (files took 0.38 seconds to load)
19 examples, 1 failure
Failed examples:
rspec ./spec/solution_spec.rb:31 # Solution#calculate_line_with_highest_frequency calculates highest count words across lines to be will, it, really
I've tried iterating through the hashes within an array but kept getting an error message. I am trying to find the keys where the values (counts) are equal to the highest count (4). So the final answer should be ["it","really","will"]. Any suggestions?
Step 1
Merge the array of hashes into a single hash.
Let's say your array of hashes is arrayOfHash:
hash = arrayOfHash.inject(:merge)
Step 2
Collect the keys that hold the maximum value in the single hash created in step 1:
result = hash.collect { |k, v| k if v == hash.values.max }.compact
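As a minimal sketch of the two steps above, here they are applied to the per-line counts from the question. The variable names are illustrative and not part of the original code. One caveat: a plain merge keeps only the last count when the same word occurs on several lines, so this version passes a block to merge that keeps the larger count instead.

```ruby
# Per-line word counts copied from the failing spec output in the question.
array_of_hash = [
  { "a" => 1, "cool" => 1, "experiment" => 1, "is" => 1, "really" => 4, "this" => 1 },
  { "cute" => 1, "experiment" => 1, "little" => 1 },
  { "do" => 1, "it" => 4, "maybe" => 1, "think" => 1, "will" => 4, "work" => 2, "you" => 1 }
]

# Step 1: merge into one hash, keeping the larger count for repeated words.
merged = array_of_hash.inject { |acc, h| acc.merge(h) { |_word, a, b| [a, b].max } }

# Step 2: keep only the words whose count equals the overall maximum.
max_count = merged.values.max
result = merged.select { |_word, count| count == max_count }.keys.sort

p result
```

Sorting the keys is optional; the spec in the question uses match_array, which ignores order.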
Related
I have an array PARTITION which stores days.
I want to group_by my posts (ActiveRecord::Relation) according to how old they are and which partition they fall into.
Example: PARTITION = [0, 40, 60, 90]
I want to group posts which are 0 to 40 days old, 40 to 60 days old, 60 to 90 days old and older than 90 days.
Please note that I will get the array data from an external source, and I don't want to use a where clause: I am using includes, and where fires a DB query, which makes includes useless.
How can I do this?
Here's a simple approach:
posts.each_with_object(Hash.new { |h, k| h[k] = [] }) do |post, hash|
days_old = (Date.today - post.created_at.to_date).to_i
case days_old
when 0..39
hash[0] << post
when 40..59
hash[40] << post
when 60..89
hash[60] << post
when 90..Float::INFINITY # or 90.. in the newest Ruby versions
hash[90] << post
end
end
This iterates through the posts, along with a hash which has a default value of an empty array.
Then, we simply check how many days ago a post was created and add it to relevant key of the hash.
This hash is then returned when all posts have been processed.
You can use whatever you want for the keys (e.g. hash["< 40"]), though I've used your partitions for illustrative purposes.
The result will be something akin to the following:
{ 0 => [post_1, post_3, etc],
  40 => [etc],
  60 => [etc],
  90 => [etc] }
Hope this helps - let me know if you've got any questions.
Edit: it's a little trickier if your PARTITION is coming from an external source, though the following would work:
# transform PARTITION into an array of ranges
ranges = PARTITION.map.with_index do |p, i|
  # last range runs from the final partition to infinity
  next p..Float::INFINITY if i + 1 == PARTITION.length
  # every other range ends one day before the next partition starts
  p..(PARTITION[i + 1] - 1)
end
# loop through the posts with a hash with arrays as the default value
posts.each_with_object(Hash.new { |h, k| h[k] = [] }) do |post, hash|
  days_old = (Date.today - post.created_at.to_date).to_i
  # loop through the new ranges
  ranges.each do |range|
    hash[range] << post if range.include?(days_old) # add the post to the hash key for the range if it's present within the range
  end
end
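The range-building step can be tried without any models. The sketch below assumes the partition values are sorted ascending and uses plain integers (ages in days) in place of posts; both the sample ages and the local variable names are made up for illustration.

```ruby
partition = [0, 40, 60, 90] # lower bounds in days, assumed sorted ascending

# Pair each bound with the next one to get half-open ranges,
# then add an open-ended range for everything past the last bound.
ranges = partition.each_cons(2).map { |low, high| low...high } +
         [partition.last..Float::INFINITY]

ages = [3, 40, 75, 200] # stand-ins for each post's age in days

# Group every age under the first range that covers it.
grouped = ages.group_by { |days_old| ranges.find { |range| range.cover?(days_old) } }

grouped.each { |range, members| puts "#{range}: #{members.inspect}" }
```

Using exclusive ranges (low...high) avoids the "minus 1" arithmetic and keeps a boundary value such as 40 in exactly one bucket.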
A final edit:
Bit silly using each_with_object when group_by will handle this perfectly. Example below:
posts.group_by do |post|
days_old = (Date.today - post.created_at.to_date).to_i
case days_old
when 0..39
0
when 40..59
40
when 60..89
60
when 90..Float::INFINITY # or 90.. in the newest Ruby versions
90
end
end
Assumptions:
This partitioning is for display purposes.
The attribute you want to group by is days
You want the result as a hash
{ 0 => [<Post1>], 40 => [<Post12>], 60 => [<Post41>], 90 => [<Post101>] }
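Add these methods to your model:
# post.rb
def self.age_partitioned
  # all respects the current scope, so this also works when called on a relation
  all.group_by(&:age_partition)
end
def age_partition
  # largest lower bound that does not exceed the age
  [90, 60, 40, 0].find { |bound| days >= bound } # replace days by the correct attribute name
end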
# Now to use it
Post.where(filters).includes(:all_what_you_want).age_partitioned
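The lookup inside age_partition relies on find taking a block and returning the first element for which the block is true. A small standalone sketch of that behaviour, with a hypothetical lambda standing in for the model method:

```ruby
bounds = [90, 60, 40, 0] # descending, so the first match is the largest fitting bound

# Hypothetical stand-in for the age_partition instance method.
age_partition = ->(days) { bounds.find { |bound| days >= bound } }

[0, 39, 40, 95].each do |days|
  puts "#{days} days old -> partition #{age_partition.call(days)}"
end
```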
As per the description given in the post, something like the following could help you group the data:
result_array_0_40 = []; result_array_40_60 = []; result_array_60_90 = []; result_array_90 = []
result_json = {}
Now we need to iterate over the values and manually group them into key/value pairs:
PARTITION.each do |x|
  # exclusive ranges, so a boundary value lands in exactly one group
  result_array_0_40.push(x) if (0...40).include?(x)
  result_array_40_60.push(x) if (40...60).include?(x)
  result_array_60_90.push(x) if (60...90).include?(x)
  result_array_90.push(x) if x >= 90
end
result_json["0..40"] = result_array_0_40
result_json["40..60"] = result_array_40_60
result_json["60..90"] = result_array_60_90
result_json["90+"] = result_array_90
Hope it helps!
Let's say I have two relation arrays of a user's daily buys and sells.
How do I iterate through both of them using .each and still let the longer array run independently once the shorter one is exhausted? Below I want to find the ratio of someone's daily buys and sells, but I can't get the ratio because it's always 1, as I'm iterating through the longer array once for each item of the shorter array.
users = User.all
ratios = Hash.new
users.each do |user|
if user.buys.count > 0 && user.sells.count > 0
ratios[user.name] = Hash.new
buy_array = []
sell_array = []
date = ""
daily_buy = user.buys.group_by(&:created_at)
daily_sell = user.sells.group_by(&:created_at)
daily_buy.each do |buy|
daily_sell.each do |sell|
if buy[0].to_date == sell[0].to_date
date = buy[0].to_date
buy_array << buy[1]
sell_array << sell[1]
end
end
end
ratios[user.name][date] = (buy_array.length.round(2)/sell_array.length)
end
end
Thanks!
You could concat both arrays and get rid of duplicated elements by doing:
(a_array + b_array).uniq.each do |num|
# code goes here
end
Uniq method API
buys_and_sells = user.buys.to_a + user.sells.to_a
totals = buys_and_sells.inject({}) do |hsh, transaction|
  hsh['buys'] ||= 0
  hsh['sells'] ||= 0
  hsh['buys'] += 1 if transaction.is_a?(Buy)
  hsh['sells'] += 1 if transaction.is_a?(Sell)
  hsh
end
totals['buys'].to_f / totals['sells'] # to_f avoids integer division
I think the above might do it. Rather than collecting each thing into separate arrays, concatenate them, then run through each item in the combined array, increasing the count under the appropriate key of the hash returned by inject.
In this case you can't loop over them with each; use a for loop instead.
This code will give you a hint:
ar = [1,2,3,4,5]
br = [1,2,3]
array_l = [ar.length, br.length].max
for i in 0...array_l # exclusive range: the last valid index is array_l - 1
  if ar[i] and br[i]
    puts ar[i].to_s + " " + br[i].to_s
  elsif ar[i]
    puts ar[i].to_s
  elsif br[i]
    puts br[i].to_s
  end
end
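For the daily ratio in the question, another option is to avoid nested loops altogether and walk the union of the dates. The sketch below assumes the buys and sells have already been reduced to per-day counts; the sample data and variable names are invented, and returning nil for a day without sells is just one possible convention.

```ruby
require "date"

# Hypothetical per-day counts, e.g. derived from grouping by created_at.to_date.
buys_per_day  = { Date.new(2020, 1, 1) => 3, Date.new(2020, 1, 2) => 1 }
sells_per_day = { Date.new(2020, 1, 1) => 2, Date.new(2020, 1, 3) => 4 }

# Visit every date that appears in either hash exactly once.
all_dates = (buys_per_day.keys | sells_per_day.keys).sort

ratios = all_dates.each_with_object({}) do |date, out|
  buys  = buys_per_day.fetch(date, 0)
  sells = sells_per_day.fetch(date, 0)
  # Float division; nil marks days where the ratio is undefined.
  out[date] = sells.zero? ? nil : (buys.to_f / sells).round(2)
end

ratios.each { |date, ratio| puts "#{date}: #{ratio.inspect}" }
```

Because each date is visited once, the longer side keeps going after the shorter one runs out, and the counts are no longer forced to be equal.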
I have a method that looks like this:
def self.average_top_level_comments_leaders
top_level_comment_count = CrucibleComment.group(:user_id).where(parent_comment_id: nil).order('count_all DESC').count
code_review_assigned_count = Reviewer.group(:user_id).order('count_all DESC').count
division_result = top_level_comment_count.inject({}) do |result, item|
id = item.first #id =12
count = item.last #value = 57
if (count && code_review_assigned_count[id])
result[id] = (count/ code_review_assigned_count[id]).round(2)
#result[12] = 57/12 = 3.3, => {12, 3.3}
end
result
end
end
This method returns a hash with the IDs as keys and the results of the division as the values.
I have successfully tested top_level_comment_count and code_review_assigned_count, but I am having trouble figuring out how I can test the 4 other things that are in the do block:
.first, .last, .round(2), result
I am trying to test .first and this is what I have so far:
describe '#average_top_level_comments_leaders' do
subject { User.average_top_level_comments_leaders}
let(:avg_top_level_comments) { double }
let(:code_review_count) { double }
let(:item) { double( {id: 12}) }
context 'when getting the comment count succeeds ' do
before do
allow(CrucibleComment).to receive(:group).with(:user_id).and_return(avg_top_level_comments)
allow(avg_top_level_comments).to receive(:where).with(parent_comment_id: nil).and_return(avg_top_level_comments)
allow(avg_top_level_comments).to receive(:order).with('count_all DESC').and_return(avg_top_level_comments)
allow(avg_top_level_comments).to receive(:count).and_return(avg_top_level_comments)
allow(avg_top_level_comments).to receive(:inject).and_return(avg_top_level_comments)
allow(item).to receive(:first).and_return(item)
allow(Reviewer).to receive(:group).with(:user_id).and_return(code_review_count)
allow(code_review_count).to receive(:order).with('count_all DESC').and_return(code_review_count)
allow(code_review_count).to receive(:count).and_return(code_review_count)
allow(code_review_count).to receive(:round).with(2).and_return(code_review_count)
end
it 'and the correct parameters are called' do
expect(CrucibleComment).to receive(:group).with(:user_id)
subject
end
it 'and comment count is calling descending correctly' do
expect(avg_top_level_comments).to receive(:order).with('count_all DESC')
subject
end
it 'item gets the first result' do
expect(item).to receive(:first)
subject
end
end
end
I cannot get the last it statement to pass. I am trying to expect(item).to receive(:first), but it says this in the error:
Failure/Error: expect(item).to receive(:first)
(Double).first(*(any args))
expected: 1 time with any arguments
received: 0 times with any arguments
Any idea why this is not passing? The other two its are passing
The item double is never used in the test, so when it reaches:
expect(item).to receive(:first)
it fails.
If you were expecting the item double to be used within the inject block here:
division_result = top_level_comment_count.inject({}) do |result, item|
merely by virtue of it having the same name, it doesn't work that way. You'd need to define a method on the avg_top_level_comments double that returns the item double when inject is called.
But, you shouldn't do that. Throw all of this out and use real model instances for the test. It will be much easier to read and maintain.
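One way to follow that advice and still keep the test small is to separate the arithmetic from the queries, so the block's logic can be checked with ordinary hashes. This is only a sketch: the name average_per_key is hypothetical, and it uses float division, which is an assumption about the intended result (integer division would truncate 57 / 12 to 4).

```ruby
# Hypothetical pure helper: both arguments are { user_id => count } hashes,
# the same shape that group(:user_id).count returns.
average_per_key = lambda do |comment_counts, review_counts|
  comment_counts.each_with_object({}) do |(id, count), result|
    assigned = review_counts[id]
    # Skip users with no assigned reviews (and guard against dividing by zero).
    next if count.nil? || assigned.nil? || assigned.zero?
    result[id] = (count.to_f / assigned).round(2)
  end
end

result = average_per_key.call({ 12 => 57, 7 => 3 }, { 12 => 12 })
p result
```

With the calculation isolated like this, a spec only needs to create a few real records, call the model method, and compare the returned hash, without stubbing first, last, or round.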
In my Rails 3.2 app a Connector has_many Incidents.
To get all incidents of a certain connector I can do this:
(In console)
c = Connector.find(1) # c.class is Connector(id: integer, name: string, ...
i = c.incidents.all # all good, lists incidents of c
But how can I get all incidents of many connectors?
c = Connector.find(1,2) # works fine, but c.class is Array
i = c.incidents.all #=> NoMethodError: undefined method `incidents' for #<Array:0x4cc15e0>
Should be easy! But I don't get it!
Here’s the complete code in my statistics_controller.rb
class StatisticsController < ApplicationController
  def index
    @connectors = Connector.scoped
    if params['connector_tokens']
      logger.debug "Following tokens are given: #{params['connector_tokens']}"
      @connectors = @connectors.find_all_by_name(params[:connector_tokens].split(','))
    end
    @start_at = params[:start_at] || 4.weeks.ago.beginning_of_week
    @end_at = params[:end_at] || Time.now
    #@time_line_data = Incident.time_line_data( @start_at, @end_at, 10) #=> That works, but doesn’t limit the result to given connectors
    @time_line_data = @connectors.incidents.time_line_data( @start_at, @end_at, 10) #=> undefined method `incidents' for #<ActiveRecord::Relation:0x3f643c8>
    respond_to do |format|
      format.html # index.html.haml
    end
  end
end
Edit with reference to first 3 answers below:
Great! With code below I get an array with all incidents of given connectors.
c = Connector.find(1,2)
i = c.map { |connector| connector.incidents.all }.flatten
But ideally I'd like to get an ActiveRecord object instead of the array, because I'd like to call where() on it, as you can see in the time_line_data method below.
I could reach my goal with the array, but I would need to change the whole strategy...
This is my time_line_data() in Incidents Model models/incidents.rb
def self.time_line_data(start_at = 8.weeks.ago, end_at = Time.now, lim = 10)
total = {}
rickshaw = []
arr = []
inc = where(created_at: start_at.to_time.beginning_of_day..end_at.to_time.end_of_day)
# create a hash, number of incidents per day, with day as key
inc.each do |i|
if total[i.created_at.to_date].to_i > 0
total[i.created_at.to_date] += 1
else
total[i.created_at.to_date] = 1
end
end
# create a hash with all days in given timeframe, number of incidents per day, date as key and 0 as value if no incident is in database for this day
(start_at.to_date..end_at.to_date).each do |date|
js_timestamp = date.to_time.to_i
if total[date].to_i > 0
arr.push([js_timestamp, total[date]])
rickshaw.push({x: js_timestamp, y: total[date]})
else
arr.push([js_timestamp, 0])
rickshaw.push({x: js_timestamp, y: 0})
end
end
{ :start_at => start_at,
:end_at => end_at,
:series => rickshaw #arr
}
end
As you only seem to be interested in the time line data you can further expand the map examples given before e.g.:
@time_line_data = @connectors.map do |connector|
  # time_line_data is a class method, so call it on the association rather than on each incident
  connector.incidents.time_line_data(@start_at, @end_at, 10)
end
This will collect the return value of the time_line_data call for the incidents of each connector in the collection.
Ref:- map
c = Connector.find(1,2)
i = c.map { |connector| connector.incidents.all }.flatten
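Whichever way the incidents are collected, the per-day counting inside time_line_data can be written more compactly with a hash that defaults to 0. The sketch below works on plain dates rather than Incident records; the sample dates and variable names are assumptions made for illustration.

```ruby
require "date"

# Stand-ins for incident.created_at.to_date values.
created_dates = [Date.new(2024, 1, 1), Date.new(2024, 1, 1), Date.new(2024, 1, 3)]

# Count incidents per day; missing days read as 0 because of the default value.
totals = created_dates.each_with_object(Hash.new(0)) { |date, counts| counts[date] += 1 }

# Build one point per day in the window, including days with no incidents.
window = Date.new(2024, 1, 1)..Date.new(2024, 1, 4)
series = window.map { |date| { x: date.to_time.to_i, y: totals[date] } }

p series.map { |point| point[:y] }
```

The default value removes the need for the if/else branches, since reading a day with no incidents simply yields 0.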
I would like to analyse data in my database to find out how many times certain words appear.
Ideally I would like a list of the top 20 words used in a particular column.
What would be the easiest way of going about this?
Create an autovivified hash and then loop through the rows populating the hash and incrementing the value each time you get the same key (word). Then sort the hash by value.
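A minimal sketch of that idea, assuming the column values have already been loaded into an array of strings (the sample rows are made up). Hash.new(0) gives every unseen word a starting count of 0, which is the simple form of autovivification needed here.

```ruby
# Hypothetical column values, e.g. the result of loading one text column.
rows = ["the quick brown fox", "The lazy dog", "the fox"]

counts = Hash.new(0) # unseen words start at 0
rows.each do |text|
  text.downcase.split.each { |word| counts[word] += 1 }
end

# Sort by descending count, breaking ties alphabetically, and keep the top 20.
top = counts.sort_by { |word, count| [-count, word] }.first(20)

top.each { |word, count| puts "#{count} #{word}" }
```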
A word counter...
I wasn't sure if you were asking how to get rails to work on this or how to count words, but I went ahead and did a column-oriented ruby wordcounter anyway.
(BTW, at first I did try the autovivified hash, what a cool trick.)
# col: a column name or number
# strings: a String, Array of Strings, Array of Array of Strings, etc.
def count(col, *strings)
  (@h ||= {})[col = col.to_s] ||= {}
  [*strings].flatten.each { |s|
    s.split.each { |s|
      @h[col][s] ||= 0
      @h[col][s] += 1
    }
  }
end
def formatOneCol a
  limit = 2
  a.sort { |e1,e2| e2[1]<=>e1[1] }.each { |results|
    printf("%9d %s\n", results[1], results[0])
    return unless (limit -= 1) > 0
  }
end
def formatAllCols
  @h.sort.each { |a|
    printf("\n%9s\n", "Col " + a[0])
    formatOneCol a[1]
  }
end
count(1,"how now")
count(1,["how", "now", "brown"])
count(1,[["how", "now"], ["brown", "cow"]])
count(2,["you see", "see you",["how", "now"], ["brown", "cow"]])
count(2,["see", ["see", ["see"]]])
count("A_Name Instead","how now alpha alpha alpha")
formatAllCols
$ ruby count.rb
Col 1
3 how
3 now
Col 2
5 see
2 you
Col A_Name Instead
3 alpha
1 how
$
DigitalRoss's answer looks too verbose to me. Also, since you tagged ruby-on-rails and said you use a DB, I'm assuming you need an ActiveRecord model, so I'm giving you a full solution.
in your model:
def self.top_strs(column_symbol, top_num)
h = Hash.new(0)
find(:all, :select => column_symbol).each do |obj|
obj.send(column_symbol).split.each do |word|
h[word] += 1
end
end
h.sort_by { |_word, count| -count }.first(top_num) # first(top_num) returns exactly top_num pairs
end
for example, model Comment, column body:
Comment.top_strs(:body, 20)