How do I sum the product of two values, across multiple objects in Rails?

Imagine I have a portfolio p that has 2 stocks port_stocks. What I want to do is run a calculation on each port_stock, and then sum up all the results.
[60] pry(main)> p.port_stocks
=> [#<PortStock:0x00007fd520e064e0
id: 17,
portfolio_id: 1,
stock_id: 385,
volume: 2000,
purchase_price: 5.9,
total_spend: 11800.0>,
#<PortStock:0x00007fd52045be68
id: 18,
portfolio_id: 1,
stock_id: 348,
volume: 1000,
purchase_price: 9.0,
total_spend: 9000.0>]
[61] pry(main)>
So, in essence, using the code above I would like to do this:
ps = p.port_stocks.first #(`id=17`)
first = ps.volume * ps.purchase_price # 2000 * 5.9 = 11,800
ps = p.port_stocks.second #(`id=18`)
second = ps.volume * ps.purchase_price # 1000 * 9.0 = 9,000
first + second # 11,800 + 9,000 = 20,800
I simply want to get 20,800. Ideally I would like to do this in a very Ruby way.
If I were simply summing up a single attribute such as total_spend, I know I could do p.port_stocks.map(&:total_spend).sum and that would be that.
But I'm not sure how to do something similar when I first need to do a math operation on each object and then add up the products from all the objects. This should obviously work for 2 objects or 500.

The best way of doing this using Rails is to pass a block to sum, such as the following:
p.port_stocks.sum do |port_stock|
port_stock.volume * port_stock.purchase_price
end
That uses the method dedicated to totalling figures, and tends to be very fast and efficient - particularly when compared to manipulating the data ahead of calling a straight sum without a block.
I've not been able to test or benchmark this against your models, so treat the performance point as an expectation rather than a measurement, but give it a try and it should resolve this for you.
Let me know how you get on!
Just a quick update, as you also mention the best Ruby way: sum was introduced in Ruby 2.4. On older versions of Ruby you can use reduce (also aliased to inject):
p.port_stocks.reduce(0) do |sum, port_stock|
sum + (port_stock.volume * port_stock.purchase_price)
end
This isn't as efficient as sum, but thought I'd give you the options :)

You are right to use Array#map to iterate through all the stocks, but instead of summing the total_spend values, you can calculate the product for each stock. Afterwards, you sum all the results and you're done:
p.port_stocks.map{|ps| ps.volume * ps.purchase_price}.sum
Or you could use Enumerable#reduce like SRack did. This would return the result with one step/iteration.
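To compare the approaches from the answers side by side, here is a minimal sketch. It uses a plain Struct as a stand-in for the PortStock model (an assumption, so that it runs without Rails or a database) and the figures from the question.

```ruby
# Hypothetical stand-in for the ActiveRecord model: only the two
# attributes used in the calculation are modelled.
stock_row = Struct.new(:volume, :purchase_price)

port_stocks = [
  stock_row.new(2000, 5.9),
  stock_row.new(1000, 9.0)
]

# sum with a block (Ruby 2.4+, or ActiveSupport on older Rubies)
with_sum = port_stocks.sum { |ps| ps.volume * ps.purchase_price }

# reduce (alias: inject), available on older Ruby versions too
with_reduce = port_stocks.reduce(0) do |total, ps|
  total + (ps.volume * ps.purchase_price)
end

# map to the per-stock products first, then sum them
with_map = port_stocks.map { |ps| ps.volume * ps.purchase_price }.sum

puts with_sum, with_reduce, with_map
```

All three should agree on these two rows. On a real association, the block forms load the records into memory first, which is fine for a few hundred rows.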

Related

dask equivalent of df.loc[df.index.intersection(mylabels)]

When I run df.loc[mylabels] in dask I get a warning with the link to
Warning Starting in 0.21.0, using .loc or [] with a list with one or more missing labels, is deprecated, in favor of .reindex *
This page also says:
Alternatively, if you want to select only valid keys, the following is idiomatic and efficient; it is guaranteed to preserve the dtype of the selection.
In [106]: labels = [1, 2, 3]
In [107]: s.loc[s.index.intersection(labels)]
Out[107]:
1 2
2 3
dtype: int64
Dask indexes do not have an intersection method.
So what is the recommended way to achieve the above effect in dask?
The problem with df.loc[mylabels] is that mylabels contains items not in df.index.
For now it looks like you should continue calling df.loc[labels].
It looks like things have changed upstream and probably dask.dataframe needs to follow a bit. I recommend submitting a bug report to https://github.com/dask/dask/issues/new

luajit copy table is slow

Within a larger lua-script, I have to copy several tables dt:
for i=1,dt:nrow() do
local r = {}
for j=1,dt:ncol() do
r[j] = dt[i][j]
end
rslt:append(r)
end
The tables are about 50,000 lines x 25 cols, containing mainly doubles. luajit takes about 10 times as long as "standard" lua. On all other calculations/operations I do before, luajit is faster (1.5 to 3 times).
As silly as this may sound, try pre-allocating the r table with 25 values:
local r = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
Unfortunately, standard Lua gives a script no way to pre-allocate a table (only the C API's lua_createtable does, and LuaJIT 2.1 adds a table.new extension), so in plain Lua a constructor like this is the simplest way to avoid the re-allocations caused by array assignment in the inner loop. My tests show noticeable improvement, but not close to 10x (although I don't use your methods, so your results may vary).

Set certain size and content to an array

I am developing a Rails 3.2.14 app and in this app I am creating
an array with exactly 31 zeros in it:
<% @total = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0] %>
I know there must be a better way to do this, right?
Thankful for all input!
Array.new is probably the cleanest way to do this:
Array.new(31, 0)
The first argument is the size, the second is the default value.
Some other alternatives:
[0] * 31
31.times.collect{0}
31.times.inject([]){|array, count| array << 0}
These methods are trivial if you're filling with zeros, but if you are calculating values then they can be quite powerful.
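As a rough illustration of the "calculating values" case, the sketch below uses the same constructions with a block that derives each element instead of returning a constant. The variable names are made up for the example.

```ruby
# Array.new with a block: each element is computed from its index.
days = Array.new(31) { |i| i + 1 }            # day numbers 1..31

# times.collect with a computed value per iteration.
squares = 5.times.collect { |i| i * i }

# inject building an array where each entry depends on the previous one
# (a running total of 1..5; nil.to_i is 0 for the first step).
running = (1..5).inject([]) { |totals, n| totals << (totals.last.to_i + n) }

p days.first, days.last, squares, running
```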
You can use Array.new or Array#fill.
Example:
Array.new(31, 0)
or
[].fill(0, 0..30)
Both yield the same result.
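A quick way to convince yourself that the variants are interchangeable for zeros is to compare them directly. The sketch below also shows one caveat that only matters when the fill value is a mutable object rather than a number.

```ruby
expected = Array.new(31, 0)

# Each construction from the answers, keyed by a descriptive label.
variants = {
  "[0] * 31"          => [0] * 31,
  "31.times.collect"  => 31.times.collect { 0 },
  "31.times.inject"   => 31.times.inject([]) { |array, _count| array << 0 },
  "[].fill(0, 0..30)" => [].fill(0, 0..30)
}

all_match = variants.all? { |_label, array| array == expected }
puts all_match

# Caveat: the two-argument form stores the SAME object in every slot,
# so mutating one slot is visible in all of them.
shared = Array.new(3, [])
shared[0] << :x

# The block form creates a fresh object per slot.
independent = Array.new(3) { [] }
independent[0] << :x

p shared, independent
```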

Moving Average across Variables in Stata

I have a panel data set for which I would like to calculate moving averages across years.
Each year is a variable for which there is an observation for each state, and I would like to create a new variable for the average of every three year period.
For example:
P1947=rmean(v1943 v1944 v1945), P1947=rmean(v1944 v1945 v1946)
I figured I should use a foreach loop with the egen command, but I'm not sure about how I should refer to the different variables within the loop.
I'd appreciate any guidance!
This data structure is quite unfit for purpose. Assuming an identifier id you need to reshape, e.g.
reshape long v, i(id) j(year)
tsset id year
Then a moving average is easy. Use tssmooth or just generate, e.g.
gen mave = (L.v + v + F.v)/3
or (better)
gen mave = 0.25 * L.v + 0.5 * v + 0.25 * F.v
More on why your data structure is quite unfit: Not only would calculation of a moving average need a loop (not necessarily involving egen), but you would be creating several new extra variables. Using those in any subsequent analysis would be somewhere between awkward and impossible.
EDIT I'll give a sample loop, while not moving from my stance that it is poor technique. I don't see a reason behind your naming convention whereby P1947 is a mean for 1943-1945; I assume that's just a typo. Let's suppose that we have data for 1913-2012. For means of 3 years, we lose one year at each end.
forval j = 1914/2011 {
local i = `j' - 1
local k = `j' + 1
gen P`j' = (v`i' + v`j' + v`k') / 3
}
That could be written more concisely, at the expense of a flurry of macros within macros. Using unequal weights is easy, as above. The only reason to use egen is that it doesn't give up if there are missings, which the above will do.
FURTHER EDIT
As a matter of completeness, note that it is easy to handle missings without resorting to egen.
The numerator
(v`i' + v`j' + v`k')
generalises to
(cond(missing(v`i'), 0, v`i') + cond(missing(v`j'), 0, v`j') + cond(missing(v`k'), 0, v`k'))
and the denominator
3
generalises to
!missing(v`i') + !missing(v`j') + !missing(v`k')
If all values are missing, this reduces to 0/0, or missing. Otherwise, if any value is missing, we add 0 to the numerator and 0 to the denominator, which is the same as ignoring it. Naturally the code is tolerable as above for averages of 3 years, but either for that case or for averaging over more years, we would replace the lines above by a loop, which is what egen does.
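The numerator and denominator logic above is not specific to Stata. As an illustration of the arithmetic only, here is a small sketch in Ruby (not Stata code; the method name and the sample series are made up) in which nil plays the role of a missing value.

```ruby
# Centred 3-point moving average that ignores missing (nil) values.
# Missing values add 0 to the numerator and 0 to the denominator;
# a window with no values at all yields nil (the 0/0 case).
def moving_average(values)
  values.each_cons(3).map do |window|
    present = window.compact            # drop the missing values
    next nil if present.empty?          # all three missing
    present.sum.to_f / present.size
  end
end

series = [1.0, 2.0, nil, 4.0, nil, nil, nil]
averages = moving_average(series)
p averages
```

As in the forval loop, the result has one fewer value at each end than the input, because the first and last points have no complete window.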
There is a user-written program that can do that very easily for you. It is called mvsumm and can be found through findit mvsumm:
xtset id time
mvsumm observations, stat(mean) win(t) gen(new_variable) end

matlab indexing into nameless matrix [duplicate]

For example, if I want to read the middle value from magic(5), I can do so like this:
M = magic(5);
value = M(3,3);
to get value == 13. I'd like to be able to do something like one of these:
value = magic(5)(3,3);
value = (magic(5))(3,3);
to dispense with the intermediate variable. However, MATLAB complains about Unbalanced or unexpected parenthesis or bracket on the first parenthesis before the 3.
Is it possible to read values from an array/matrix without first assigning it to a variable?
It actually is possible to do what you want, but you have to use the functional form of the indexing operator. When you perform an indexing operation using (), you are actually making a call to the subsref function. So, even though you can't do this:
value = magic(5)(3, 3);
You can do this:
value = subsref(magic(5), struct('type', '()', 'subs', {{3, 3}}));
Ugly, but possible. ;)
In general, you just have to change the indexing step to a function call so you don't have two sets of parentheses immediately following one another. Another way to do this would be to define your own anonymous function to do the subscripted indexing. For example:
subindex = @(A, r, c) A(r, c); % An anonymous function for 2-D indexing
value = subindex(magic(5), 3, 3); % Use the function to index the matrix
However, when all is said and done the temporary local variable solution is much more readable, and definitely what I would suggest.
There was a good blog post on Loren on the Art of MATLAB a couple of days ago with a couple of gems that might help. In particular, using helper functions like:
paren = @(x, varargin) x(varargin{:});
curly = @(x, varargin) x{varargin{:}};
where paren() can be used like
paren(magic(5), 3, 3)
would return
ans = 13
I would also surmise that this will be faster than gnovice's answer, but I haven't checked (Use the profiler!!!). That being said, you also have to include these function definitions somewhere. I personally have made them independent functions in my path, because they are super useful.
These functions and others are now available in the Functional Programming Constructs add-on which is available through the MATLAB Add-On Explorer or on the File Exchange.
How do you feel about using undocumented features:
>> builtin('_paren', magic(5), 3, 3) %# M(3,3)
ans =
13
or for cell arrays:
>> builtin('_brace', num2cell(magic(5)), 3, 3) %# C{3,3}
ans =
13
Just like magic :)
UPDATE:
Bad news, the above hack doesn't work anymore in R2015b! That's fine, it was undocumented functionality and we cannot rely on it as a supported feature :)
For those wondering where to find this type of thing, look in the folder fullfile(matlabroot,'bin','registry'). There's a bunch of XML files there that list all kinds of goodies. Be warned that calling some of these functions directly can easily crash your MATLAB session.
At least in MATLAB 2013a you can use getfield like:
a=rand(5);
getfield(a,{1,2}) % etc
to get the element at (1,2)
Unfortunately, syntax like magic(5)(3,3) is not supported by MATLAB. You need to use temporary intermediate variables. You can free up the memory after use, e.g.
tmp = magic(3);
myVar = tmp(3,3);
clear tmp
Note that if you compare running times with the standard way (assign the result and then access entries), they are essentially the same.
subs=@(M,i,j) M(i,j);
>> for nit=1:10;tic;subs(magic(100),1:10,1:10);tlap(nit)=toc;end;mean(tlap)
ans =
0.0103
>> for nit=1:10,tic;M=magic(100); M(1:10,1:10);tlap(nit)=toc;end;mean(tlap)
ans =
0.0101
In my opinion, the bottom line is: MATLAB does not have pointers, and you have to live with it.
It could be simpler if you make a new function:
function [ element ] = getElem( matrix, index1, index2 )
element = matrix(index1, index2);
end
and then use it:
value = getElem(magic(5), 3, 3);
Your initial notation is the most concise way to do this:
M = magic(5); %create
value = M(3,3); % extract useful data
clear M; %free memory
If you are doing this in a loop you can just reassign M every time and ignore the clear statement as well.
To complement Amro's answer, you can use feval instead of builtin. There is no difference, really, unless you try to overload the operator function:
BUILTIN(...) is the same as FEVAL(...) except that it will call the
original built-in version of the function even if an overloaded one
exists (for this to work, you must never overload
BUILTIN).
>> feval('_paren', magic(5), 3, 3) % M(3,3)
ans =
13
>> feval('_brace', num2cell(magic(5)), 3, 3) % C{3,3}
ans =
13
What's interesting is that feval seems to be just a tiny bit quicker than builtin (by ~3%), at least in MATLAB 2013b, which is weird given that feval needs to check if the function is overloaded, unlike builtin:
>> tic; for i=1:1e6, feval('_paren', magic(5), 3, 3); end; toc;
Elapsed time is 49.904117 seconds.
>> tic; for i=1:1e6, builtin('_paren', magic(5), 3, 3); end; toc;
Elapsed time is 51.485339 seconds.
