Consolidating a Ruby Array of Hashes - ruby-on-rails

I'm interning at a company right now, and I have a database that I connect to in order to get some specific data about our customers. The query is all worked out, the database is returning the data I need, but now I need to figure out how to consolidate the data into the necessary format. We are using Ruby 3.1.2 for reference.
This is the format that I receive from our database (the real data will be much larger, so I'm only using a small dataset until the logic is solid).
[{ "product_id"=>1, "customer_id"=>001 },
{ "product_id"=>2, "customer_id"=>001 },
{ "product_id"=>1, "customer_id"=>002 },
{ "product_id"=>3, "customer_id"=>002 },
{ "product_id"=>1, "customer_id"=>003 },
{ "product_id"=>2, "customer_id"=>003 },
{ "product_id"=>3, "customer_id"=>003 }]
When I get this data, I need to get a list of each distinct "customer_id" with a list of every "product_id" they have assigned to them. Example of what I need to get back below.
{001=>[1, 2], 002=>[1, 3], 003=>[1, 2, 3]}
I thought I had a solution with the line below, but it doesn't seem to work how I expected.
data.group_by(&:customer_id).transform_values { |p| p.pluck(:product_id) }

In addition to #group_by one might use #each_with_object to iteratively build the needed hash.
data.each_with_object({}) { |x, h|
  h[x["customer_id"]] ||= []
  h[x["customer_id"]] << x["product_id"]
}
# => {1=>[1, 2], 2=>[1, 3], 3=>[1, 2, 3]}
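A variation on the same idea, offered as a sketch rather than as part of the answer above: giving the hash a default block removes the need for `||= []`. The `data` array here is a shortened copy of the sample from the question, and `products_by_customer` is a name chosen for illustration.

```ruby
data = [
  { "product_id" => 1, "customer_id" => 1 },
  { "product_id" => 2, "customer_id" => 1 },
  { "product_id" => 1, "customer_id" => 2 },
  { "product_id" => 3, "customer_id" => 2 }
]

# The default block creates and stores an empty array the first time a key is read.
products_by_customer = Hash.new { |hash, key| hash[key] = [] }

data.each do |row|
  products_by_customer[row["customer_id"]] << row["product_id"]
end

products_by_customer # => {1=>[1, 2], 2=>[1, 3]}
```

One caveat with this approach: merely reading a missing key also stores an empty array for it, so it suits build-then-read code like the loop above.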

I would do this:
array = [{ "product_id"=>1, "customer_id"=>001 },
{ "product_id"=>2, "customer_id"=>001 },
{ "product_id"=>1, "customer_id"=>002 },
{ "product_id"=>3, "customer_id"=>002 },
{ "product_id"=>1, "customer_id"=>003 },
{ "product_id"=>2, "customer_id"=>003 },
{ "product_id"=>3, "customer_id"=>003 }]
array.group_by { |hash| hash['customer_id'] }
.transform_values { |values| values.map { |value| value['product_id'] } }
#=> { 1 => [1, 2], 2 => [1, 3], 3 => [1, 2, 3] }
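For context on why the one-liner in the question does not work, here is a small sketch. It assumes the rows are plain Ruby hashes with string keys, as shown in the question. `&:customer_id` calls a method named `customer_id` on each row, and a Hash has no such method. As far as I know, `pluck` on an array comes from ActiveSupport rather than core Ruby, and it looks each key up as given, so a symbol would not find a string key either.

```ruby
row = { "product_id" => 1, "customer_id" => 1 }

row.respond_to?(:customer_id) # => false
row[:customer_id]             # => nil, symbol and string keys are distinct
row["customer_id"]            # => 1

# group_by(&:customer_id) sends :customer_id to the hash and fails.
failed = begin
  [row].group_by(&:customer_id)
  false
rescue NoMethodError
  true
end

failed # => true
```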

The other answers are correct in their own ways, but in my opinion readability should always be considered, and by that measure the accepted answer looks neater. My solution below is based on the same logic. The only difference is that I have moved the string keys out of the loop, because a string literal inside the loop can allocate a new String instance on every iteration (unless frozen string literals are enabled or the interpreter optimizes the lookup).
array = [{ "product_id"=>1, "customer_id"=>001 },
{ "product_id"=>2, "customer_id"=>001 },
{ "product_id"=>1, "customer_id"=>002 },
{ "product_id"=>3, "customer_id"=>002 },
{ "product_id"=>1, "customer_id"=>003 },
{ "product_id"=>2, "customer_id"=>003 },
{ "product_id"=>3, "customer_id"=>003 }]
transformed_data = {}
customer_id_key = "customer_id"
product_id_key = "product_id"
array.each do |h|
  customer_id = h[customer_id_key]
  product_id = h[product_id_key]
  transformed_data[customer_id] ||= []
  transformed_data[customer_id] << product_id
end
transformed_data

Related

How to map ruby hashes correctly based on key provided

My data is like:
h = { themes_data: {
Marketing: [
{
id: 68,
projectno: "15",
}
],
Produktentwicklung: [
{
id: 68,
projectno: "15",
},
{
id: 4,
projectno: "3",
}
],
Marketing_summary: [
{
ges: {
result: "47.6"
},
theme: "Marketing"
}
],
Produktentwicklung_summary: [
{
ges: {
result: "87.7"
},
theme: "Produktentwicklung"
}
]
}
}
And my output should be like:
{ "marketing" => [
{
id: 68,
projectno: "15",
},
{
ges: {
result: "47.6"
},
theme: "Marketing"
}
],
"Produktentwicklung" => [
{
id: 68,
projectno: "15"
},
{
id: 4,
projectno: "3",
},
{
ges: {
result: "87.7"
},
theme: "Produktentwicklung"
}
]
}
Code:
def year_overview_theme
  branch_hash = {}
  @themes_data.each do |td|
    arr = []
    td[1].map do |dt|
      arr << [{content: dt[:projectno], size: 5, align: :right, background_color: 'D8E5FF'}]
    end
    branch_hash["#{td[0]}"] = arr
  end
  branch_hash
end
The problem is that it does not group the data under the right hash key.
For example, I want Marketing + Marketing_summary combined under one key, and similarly
Produktentwicklung + Produktentwicklung_summary under one key, but there is some problem in my logic.
Is there a way to check that, after two iterations,
it does arr << data followed by branch_hash["#{td[0]}"] = arr?
The desired hash can be constructed as follows.
h[:themes_data].each_with_object({}) { |(k,v),g|
g.update(k.to_s[/[^_]+/]=>v) { |_,o,n| o+n } }
#=> { "Marketing"=>[
# {:id=>68, :projectno=>"15"},
# {:ges=>{:result=>"47.6"}, :theme=>"Marketing"}
# ],
# "Produktentwicklung"=>[
# {:id=>68, :projectno=>"15"},
# {:id=>4, :projectno=>"3"},
# {:ges=>{:result=>"87.7"}, :theme=>"Produktentwicklung"}
# ]
# }
This uses the form of Hash#update (aka merge!) that employs a block to determine the values of keys that are present in both hashes being merged. Here that block is:
{ |_,o,n| o+n }
The first block variable, _, is the common key. I have represented it with an underscore (a valid local variable) to tell the reader that it is not used in the block calculation. That is common practice. The other two block variables, o and n, hold the value already in the hash being built and the value being merged in, respectively (see the documentation for Hash#update).
The regular expression /[^_]+/ matches one or more characters from the start of the string that are not (^) underscores. When used with the method String#[], we obtain:
"Marketing"[/[^_]+/] #=> "Marketing"
"Marketing_summary"[/[^_]+/] #=> "Marketing"
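To see the block form of Hash#update in isolation, here is a minimal sketch with made-up values (`g`, `:a`, `:b`, `:c` and the "Sales" key are illustrative only, not from the question's data):

```ruby
g = { "Marketing" => [:a] }

# "Marketing" exists on both sides, so the block decides its value.
# "Sales" exists only in the argument, so it is copied without calling the block.
g.update("Marketing" => [:b], "Sales" => [:c]) { |_key, old_value, new_value| old_value + new_value }

g # => {"Marketing"=>[:a, :b], "Sales"=>[:c]}
```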
Let me start with a note: This looks to me like something that should rather be solved in SQL (if it's coming from SQL) instead of Ruby.
With that out of the way, here's a solution that should work:
output = {}
themes_data.each do |theme, projects|
  projects.each do |project|
    key = project[:theme] || theme.to_s
    output[key] ||= [] # make sure the target is initialized
    output[key] << project
  end
end
There would probably be more elegant solutions using reduce or each_with_object but this works and it's simple enough.
keys = themes_data.keys
summary_keys = themes_data.keys.grep(/_summary/)
result = {}.tap do |hash|
  (keys - summary_keys).each do |key|
    hash[key] = themes_data[key] + themes_data["#{key}_summary".to_sym]
  end
end
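Another way to express the same grouping, sketched under the assumption that every summary key is the theme name followed by `_summary` and that Ruby 2.5 or later is available (for `String#delete_suffix`). The `themes_data` literal below copies the question's sample; `merged` is a name chosen for illustration.

```ruby
themes_data = {
  Marketing: [{ id: 68, projectno: "15" }],
  Produktentwicklung: [{ id: 68, projectno: "15" }, { id: 4, projectno: "3" }],
  Marketing_summary: [{ ges: { result: "47.6" }, theme: "Marketing" }],
  Produktentwicklung_summary: [{ ges: { result: "87.7" }, theme: "Produktentwicklung" }]
}

# Group the [key, rows] pairs by theme name, then join each group's rows into one array.
merged = themes_data
  .group_by { |key, _rows| key.to_s.delete_suffix("_summary") }
  .transform_values { |pairs| pairs.flat_map(&:last) }

merged.keys                       # => ["Marketing", "Produktentwicklung"]
merged["Marketing"].size          # => 2
merged["Produktentwicklung"].size # => 3
```

Unlike the snippet above it, this version produces string keys, which matches the output format requested in the question.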

Convert and Sort Data for HighChart Column Range Representation

I have some data coming in random order and would like to convert it into a specific order for Highcharts column ranges. Any insight on doing this effectively would help.
Also, regardless of the order of the input data, I always want to show the chart in Apple, Orange, Banana order with the correct values for each.
I have tried using maps, sets, and arrays in Ruby and have something working, but it is very brittle and not very efficient.
headers = Array.wrap(raw_data.dig('data', 'dimensions', 'axes', 'headers'))
values = Array.wrap(raw_data.dig('data', 'values', 'c')).map(&:to_f)
labels = headers.map { |header| Array.wrap(header['label']) }
data = values.each_slice(2)
These are the weights of the fruits: Low is the lowest weight and High is the highest weight. The problem is that the data is ordered by weight, so I can't just slice consecutive values of the array.
JSON DATA
{
"data": {
"dimensions": {
"axes": {
"headers": [{
"label": ["Apple", "Low"]
}, {
"label": ["Apple", "High"]
}, {
"label": ["Orange", "Low"]
}, {
"label": ["Banana", "Low"]
}, {
"label": ["Orange", "High"]
}, {
"label": ["Banana", "High"]
}]
}
},
"values": {
"c": ["173", "273", "414", "608", "610", "1050"]
}
}
EXPECTED OUTPUT
{
series: [
{'name': 'Weight', 'data': [[173, 273], [414, 610], [608, 1050]]}
],
axis_labels: ['Apple', 'Orange', 'Banana'],
}
chart
https://jsfiddle.net/Praveen2710/7sdqz6Le/8/
You need to preprocess your data to the format required by Highcharts:
var json = {...}
var series = {
    name: 'Weight',
    data: []
  },
  i,
  labels,
  header1,
  header2,
  value,
  indexOf,
  point,
  categories = [];
for (i = 0; i < json.data.values.c.length; i++) {
  labels = json.data.dimensions.axes.headers[i].label;
  header1 = labels[0].toLowerCase();
  header2 = labels[1].toLowerCase();
  value = json.data.values.c[i];
  indexOf = categories.indexOf(header1);
  if (indexOf !== -1) {
    // Category already seen: add the second bound (low or high) to its point.
    series.data[indexOf][header2] = Number(value);
  } else {
    // First time this category appears: create its point.
    categories.push(header1);
    series.data.push({
      [header2]: Number(value),
      x: series.data.length
    });
  }
}
Highcharts.chart('container', {
...,
series: [series]
});
Live demo: http://jsfiddle.net/BlackLabel/nm976qho/
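Since the question asks for a Ruby-side conversion, here is a rough sketch of the same preprocessing in Ruby. It assumes every fruit has exactly one "Low" and one "High" header and that the display order is fixed as Apple, Orange, Banana. The names `fruit_order`, `ranges` and `chart` are illustrative, and the values are converted with `to_i` to match the integers in the expected output (the question's own code uses `to_f`).

```ruby
require "json"

raw_data = JSON.parse(<<~JSON)
  {
    "data": {
      "dimensions": { "axes": { "headers": [
        { "label": ["Apple", "Low"] },
        { "label": ["Apple", "High"] },
        { "label": ["Orange", "Low"] },
        { "label": ["Banana", "Low"] },
        { "label": ["Orange", "High"] },
        { "label": ["Banana", "High"] }
      ] } },
      "values": { "c": ["173", "273", "414", "608", "610", "1050"] }
    }
  }
JSON

fruit_order = %w[Apple Orange Banana] # fixed display order from the question

labels = raw_data.dig("data", "dimensions", "axes", "headers").map { |h| h["label"] }
values = raw_data.dig("data", "values", "c").map(&:to_i)

# Build { "Apple" => { "Low" => 173, "High" => 273 }, ... } regardless of input order.
ranges = Hash.new { |hash, fruit| hash[fruit] = {} }
labels.zip(values) { |(fruit, bound), value| ranges[fruit][bound] = value }

chart = {
  series: [{ name: "Weight", data: fruit_order.map { |f| ranges[f].values_at("Low", "High") } }],
  axis_labels: fruit_order
}

chart[:series].first[:data] # => [[173, 273], [414, 610], [608, 1050]]
```

Keying by fruit name first and reading the bounds out with `values_at` is what makes the result independent of the order in which the headers arrive.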

Combining results of two tables in mongoid/mongo

What would be the best way to combine the results of two Mongoid queries?
My issue is that I would like to know the active users. A user can send a letter and a notification; these are stored in separate tables, and a user who sends either a letter or a notification is considered active. What I want to know is how many active users there were per month.
Right now, what I can think of is doing this:
Letter.collection.aggregate([
{ '$match': {}.merge(opts) },
{ '$sort': { 'created_at': 1 } },
{
'$group': {
_id: '$customer_id',
first_notif_sent: {
'$first': {
'day': { '$dayOfMonth': '$created_at' },
'month': { '$month': '$created_at' },
'year': { '$year': '$created_at' }
}
}
}
}])
Notification.collection.aggregate([
{ '$match': {}.merge(opts) },
{ '$sort': { 'created_at': 1 } },
{
'$group': {
_id: '$customer_id',
first_notif_sent: {
'$first': {
'day': { '$dayOfMonth': '$created_at' },
'month': { '$month': '$created_at' },
'year': { '$year': '$created_at' }
}
}
}
}])
What I am looking for is to get the minimum of the dates and then combine the results and get the count. Right now I can get the results and loop over each of them and create a new list. But I wanted to know if there is a way to do it in mongo directly.
EDIT
For letters
def self.get_active(tenant_id)
map = %{
function() {
emit(this.customer_id, new Date(this.created_at))
}
}
reduce = %{
function(key, values) {
return new Date(Math.min.apply(null, values))
}
}
where(tenant_id: tenant_id).map_reduce(map, reduce).out(reduce: "#{tenant_id}_letter_notification")
end
Notifications
def self.get_active(tenant_id)
map = %{
function() {
emit(this.customer_id, new Date(this.updated_at))
}
}
reduce = %{
function(key, values) {
return new Date(Math.min.apply(null, values))
}
}
where(tenant_id: tenant_id, transferred: true).map_reduce(map, reduce).out(reduce: "#{tenant_id}_outgoing_letter_standing_order_balance")
end
This is what I am thinking of going with. One of the reasons is that $lookup does not work with my version of MongoDB.
the customer created a new notification, or a new letter, and I would like to get the first created at of either.
Let's address this first as a foundation. Given examples of document schema as below:
Document schema in Letter collection:
{ _id: <ObjectId>,
customer_id: <integer>,
created_at: <date> }
And, document schema in Notification collection:
{ _id: <ObjectId>,
customer_id: <integer>,
created_at: <date> }
You can utilise aggregation pipeline $lookup to join the two collections. For example using mongo shell :
db.letter.aggregate([
{"$group":{"_id":"$customer_id", tmp1:{"$max":"$created_at"}}},
{"$lookup":{from:"notification",
localField:"_id",
foreignField:"customer_id",
as:"notifications"}},
{"$project":{customer_id:"$_id",
_id:0,
latest_letter:"$tmp1",
latest_notification: {"$max":"$notifications.created_at"}}},
{"$addFields":{"latest":
{"$cond":[{"$gt":["$latest_letter", "$latest_notification"]},
"$latest_letter",
"$latest_notification"]}}},
{"$sort":{latest:-1}}
], {cursor:{batchSize:100}})
The output of the above aggregation pipeline is a list of customers sorted by their most recent created_at value from either Letter or Notification, newest first. Example output documents:
{
"customer_id": 0,
"latest_letter": ISODate("2017-12-19T07:00:08.818Z"),
"latest_notification": ISODate("2018-01-26T13:43:56.353Z"),
"latest": ISODate("2018-01-26T13:43:56.353Z")
},
{
"customer_id": 4,
"latest_letter": ISODate("2018-01-04T18:55:26.264Z"),
"latest_notification": ISODate("2018-01-25T02:05:19.035Z"),
"latest": ISODate("2018-01-25T02:05:19.035Z")
}, ...
What I want to know is how many active users were there per month
To achieve this, you can just replace the last stage ($sort) of the above aggregation pipeline with $group. For example:
db.letter.aggregate([
{"$group":{"_id":"$customer_id", tmp1:{$max:"$created_at"}}},
{"$lookup":{from:"notification",
localField:"_id",
foreignField:"customer_id",
as:"notifications"}},
{"$project":{customer_id:"$_id",
_id:0,
latest_letter:"$tmp1",
latest_notification: {"$max":"$notifications.created_at"}}},
{"$addFields":{"latest":
{"$cond":[{"$gt":["$latest_letter", "$latest_notification"]},
"$latest_letter",
"$latest_notification"]}}},
{"$group":{_id:{month:{"$month": "$latest"},
year:{"$year": "$latest"}},
active_users: {"$sum": 1}
}
}
],{cursor:{batchSize:10}})
Where the example output would be as below:
{
"_id": {
"month": 10,
"year": 2017
},
"active_users": 9
},
{
"_id": {
"month": 1,
"year": 2018
},
"active_users": 18
},
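Because the question mentions that $lookup is not available on the asker's MongoDB version, the combination step can also be done in application code. The sketch below is only an illustration: it assumes each of the two queries has already been reduced to a hash of customer_id to first activity time, and the sample ids and dates are made up.

```ruby
# Hypothetical per-customer first-activity times, as the two queries might return them.
first_letter       = { 1 => Time.utc(2017, 10, 3), 2 => Time.utc(2018, 1, 9) }
first_notification = { 1 => Time.utc(2017, 9, 21), 3 => Time.utc(2018, 1, 15) }

# Keep the earlier of the two times when a customer appears in both collections.
first_activity = first_letter.merge(first_notification) { |_id, a, b| [a, b].min }

# Count customers per [year, month] of their first activity.
active_per_month = first_activity.values
  .group_by { |t| [t.year, t.month] }
  .transform_values(&:count)

active_per_month # => {[2017, 9]=>1, [2018, 1]=>2}
```

This trades a round trip to the database for simplicity, which may be acceptable when the per-customer result sets are small.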

Iterate through a hash. However, my value is changing every time

I'm currently working on a simple hash loop, to manipulate some json data. Here's my Json data:
{
"polls": [
{ "id": 1, "question": "Pensez-vous utiliser le service de cordonnerie/pressing au moins 2 fois par mois ?" },
{ "id": 2, "question": "Avez-vous passé une bonne semaine ?" },
{ "id": 3, "question": "Le saviez-vous ? Il existe une journée d'accompagnement familial." }
],
"answers": [
{ "id": 1, "poll_id": 1, "value": true },
{ "id": 2, "poll_id": 3, "value": false },
{ "id": 3, "poll_id": 2, "value": 3 }
]
}
I want to collect the poll_id and the value from each entry in the answers array. Here is the code I wrote:
require 'json'
file = File.read('data.json')
datas = JSON.parse(file)
result = Hash.new
datas["answers"].each do |answer|
result["polls"] = {"id" => answer["poll_id"], "value" => answer["value"]}
end
polls_json = result.to_json
However, it returns:
{
"polls": {
"id": 2,
"value": 3
}
}
Here's the output i am looking for :
{
"polls": [
{
"id": 1,
"value": true
},
{
"id": 2,
"value": 3
},
{
"id": 3,
"value": false
}
]
}
It seems that only the last value is kept by my loop. I've tried different methods but I still cannot find a solution. Any suggestions?
You should be using reduce here, i.e.
datas["answers"].reduce({ polls: [] }) do |hash, data|
hash[:polls] << { id: data["poll_id"], value: data["value"] }
hash
end
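This iterates through the answers, passing along the object supplied to reduce (in this case a hash with a :polls array) as the accumulator, to which we append a hash built from each answer. The block must return the accumulator, which is why hash is its last expression.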
I'd personally, um, reduce this a little further with the following, although it's at some cost to readability:
datas["answers"].reduce({ polls: [] }) do |hash, data|
hash.tap { |h| h[:polls] << { id: data["poll_id"], value: data["value"] } }
end
It's the cleanest method to achieve what you're looking for, using a built-for-purpose method.
Docs for reduce here: https://ruby-doc.org/core-2.1.0/Enumerable.html#method-i-reduce
(I'd also be inclined to update the variable names - data is already plural, so 'datas' is a little confusing to anyone else coming to your code.)
Edit: @max makes a great point re symbol / string keys from your data - keep that in mind if you attempt to apply this.
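To illustrate the symbol versus string key point, here is a small sketch. The `raw` string is a cut-down stand-in for the question's data.json.

```ruby
require "json"

raw = '{"answers":[{"id":1,"poll_id":1,"value":true}]}'

string_keys = JSON.parse(raw)
string_keys["answers"].first["poll_id"] # => 1
string_keys["answers"].first[:poll_id]  # => nil, the parsed keys are strings

# symbolize_names: true makes JSON.parse return symbol keys instead.
symbol_keys = JSON.parse(raw, symbolize_names: true)
symbol_keys[:answers].first[:poll_id]   # => 1

# Symbol keys in a Ruby hash still serialize to ordinary JSON strings.
output = { polls: [{ id: 1, value: true }] }.to_json
output # => "{\"polls\":[{\"id\":1,\"value\":true}]}"
```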
Try the below:
require 'json'
file = File.read('data.json')
datas = JSON.parse(file)
result = Hash.new
poll_json = []
datas["answers"].each do |answer|
  poll_json << {"id" => answer["poll_id"], "value" => answer["value"]}
end
p "json = #{poll_json}"
{
polls: datas["answers"].map do |a|
{ id: a["poll_id"], value: a["value"] }
end
}
In general use .map to iterate through arrays and hashes and return new objects. .each should only be used when you are only concerned about the side effects (like in a view when you are outputting values).
require 'json'
json = JSON.parse(File.read('data.json'))
result = {
polls: json["answers"].map do |a|
{ id: a["poll_id"], value: a["value"] }
end
}
puts result.to_json
The output is:
{"polls":[{"id":1,"value":true},{"id":3,"value":false},{"id":2,"value":3}]}

Restructure hash

I have an initial hash which has a structure given below
Initial Hash
initial_hash = {
'section1':{
'person_name1':{
'city': 'City1',
'country': 'Country1'
},
'person_name2':{
'city': 'City2',
'country': 'Country2'
},
...
},
'section2':{
'person_name12':{
'city': 'City12',
'country': 'Country12'
},
'person_name23':{
'city': 'City23',
'country': 'Country23'
},
...
}
}
Final Hash
final_hash = {
'section1':{
'country1':{
'city': 'City1',
'person_name': 'person_name1'
},
'country2':{
'city': 'City2',
'person_name': 'person_name2'
},
...
},
'section2':{
'country12':{
'city': 'City12',
'person_name': 'person_name12'
},
'country23':{
'city': 'City23',
'person_name': 'person_name23'
},
...
}
}
As you can see, final_hash has been restructured so that country and person_name have swapped places. My attempt so far is below.
My attempt:
final_hash = {}
initial_hash.each do |h|
  final_hash[h[0]] = {}
  final_hash[h[0]] = h[1].group_by{|x| x[1]['country']}.each{|_, v| v.map!{|h| h[1]}}
end
The above attempt helps me getting this structure:
final_hash = {
'section1':{
'country1': {
'city': 'City1',
'country': 'Country1'
},
'country2': {
'city': 'City2',
'country': 'Country2'
},
...
},
'section2':{
'country12': {
'city': 'City12',
'country': 'Country12'
},
'country23': {
'city': 'City23',
'country': 'Country23'
},
...
}
}
I can't work out how to put person_name in place of country. I tried chaining each onto the result of the map! block, but no luck so far. To add to the problem, my JSON data consists of 1000 records, so performance is a concern here.
Thanks in advance
Try using inject:
1) only inject for inner hash:
initial_hash.inject({}){ |h,(section,inner_hash)| h.merge section => inner_hash.inject({}) { |inner_h,(k,v)| inner_h.merge v.delete(:country) => v.merge(person_name: k) }}
2) use map & inject:
Hash[initial_hash.map { |section, inner_hash| [section, inner_hash.inject({}) { |inner_h, (k, v)| inner_h.merge v.delete(:country) => v.merge(person_name: k) }]}]
benchmark(for 1000):
user system total real
injects: 0.060000 0.010000 0.070000 ( 0.057551)
map&inject: 0.060000 0.000000 0.060000 ( 0.053678)
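Note that `v.delete(:country)` in both variants above removes the :country key from the inner hashes of initial_hash itself. If the input must stay untouched, something along these lines may work instead. This is a sketch assuming symbol keys and Ruby 2.6 or later (for `Hash#to_h` with a block); the sample data is a shortened copy of the question's structure.

```ruby
initial_hash = {
  section1: {
    person_name1: { city: "City1", country: "Country1" },
    person_name2: { city: "City2", country: "Country2" }
  }
}

final_hash = initial_hash.transform_values do |people|
  people.to_h do |person_name, details|
    # Key by country; keep the remaining fields and record the person's name.
    [details[:country], details.reject { |k, _| k == :country }.merge(person_name: person_name)]
  end
end

final_hash[:section1]["Country1"]
# => {:city=>"City1", :person_name=>:person_name1}
```

If two people in one section share a country, the later entry overwrites the earlier one, which is also true of the inject versions.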
