Bitwise computation and count number of set bits afterward - ruby-on-rails

I have a table:
create_table "fingerprint" do |t|
  t.bit "fp1", limit: 64
  t.bit "fp2", limit: 64
  t.bit "fp3", limit: 64
  t.bit "fp4", limit: 64
  t.bit "fp5", limit: 64
end
fp1 | fp2 | fp3 | fp4 | fp5
---------------------------
001 | 010 | 011 | 100 | 101
And an array of 5 elements:
fp = [5,4,3,2,1]
I'd like to bitwise-AND each column of every record with the corresponding element of fp, and then count the total number of set bits across the 5 columns.
For example:
(001 & 5) = 001
(010 & 4) = 000
(011 & 3) = 011
(100 & 2) = 000
(101 & 1) = 001
Total number of set bits: 4
I want to repeat this procedure for every row of my table. Please help me find an efficient way to do it (the table has about 100k rows).
Thank you in advance.

Here is an example for the type bit(3); you can easily adapt it to bit(64):
create table the_table (fp1 bit(3), fp2 bit(3), fp3 bit(3), fp4 bit(3), fp5 bit(3));
insert into the_table values
    ('001', '010', '011', '100', '101'),
    ('101', '011', '111', '110', '101');
with the_array(arr) as (
    values (array[5,4,3,2,1])
),
new_values as (
    select
        fp1 & arr[1]::bit(3) n1,
        fp2 & arr[2]::bit(3) n2,
        fp3 & arr[3]::bit(3) n3,
        fp4 & arr[4]::bit(3) n4,
        fp5 & arr[5]::bit(3) n5
    from the_table
    cross join the_array
)
select
    *,
    length(
        translate(
            concat(n1::text, n2::text, n3::text, n4::text, n5::text),
            '0',
            '')
    ) bit_set
from new_values;
 n1  | n2  | n3  | n4  | n5  | bit_set
-----+-----+-----+-----+-----+---------
 001 | 000 | 011 | 000 | 001 |       4
 101 | 000 | 011 | 010 | 001 |       6
(2 rows)
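To sanity-check the bit_set column outside the database, here is a minimal Python sketch of the same AND-then-count idea. The helper name set_bits_after_and is made up for illustration, and the rows are the two sample rows inserted above, written as integers.

```python
# Sample rows (fp1..fp5) from the example above, written as integers.
rows = [
    (0b001, 0b010, 0b011, 0b100, 0b101),
    (0b101, 0b011, 0b111, 0b110, 0b101),
]
fp = [5, 4, 3, 2, 1]


def set_bits_after_and(row, masks):
    """AND each column with its mask, then count the 1-bits over all columns."""
    return sum(bin(col & mask).count("1") for col, mask in zip(row, masks))


bit_set = [set_bits_after_and(row, fp) for row in rows]
print(bit_set)  # [4, 6]
```

The two counts match the bit_set column of the query output. For 100k rows, doing the work in SQL as shown is likely still preferable to pulling every row into application code.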


Maximum of column 1 where value of column 2 matches some condition

Let's say I have the following in a table :
row | A  | B | desired_output
-----------------------------
1   | 10 | 1 | 0
2   | 20 | 7 | 0
3   | 30 | 3 | 0
4   | 20 | 2 | 0
5   | 30 | 5 | 1
I'd like to find a formula for each of the cells in the desired_output column which looks at the max of B1:B5 but only for rows for which A = max(A1:A5)
If that's not clear, I'll try to put it another way :
for all the rows in A1:A5 that are equal to max(A1:A5) // so that's rows 3 and 5
find the one which has the max value on B // so between B3 and B5, that's B5
output 1 for this one, 0 for the other
I'd say there would be a where somewhere if such a function existed, something like = if(B=(max(B1:B5) where A = max(A1:A5)), 1, 0) but I can't find how to do it...
I can do it in two columns with a trick :
row | A  | B | C | D
--------------------
1   | 10 | 1 |   | 0
2   | 20 | 7 |   | 0
3   | 30 | 3 | 3 | 0
4   | 20 | 2 |   | 0
5   | 30 | 5 | 5 | 1
With Cn = if(An=max(A$1:A$5),Bn,"") and Dn = if(Cn = max(C$1:C$5), 1, 0)
But I still can't find how to do it in one column
For systems without MAXIFS, put this in C1 and fill down.
=--(B1=MAX(INDEX(B$1:B$5-(A$1:A$5<>MAX(A$1:A$5))*1E+99, , )))
=ARRAYFORMULA(IF(LEN(A1:A), IF(IFERROR(VLOOKUP(CONCAT(A1:A&"×", B1:B),
JOIN("×", QUERY(A1:B, "order by A desc, B desc limit 1")), 1, 0), )<>"", 1, 0), ))
or shorter:
=ARRAYFORMULA(IF(A:A<>"",N(A:A&"×"&B:B=JOIN("×",SORTN(A:B,1,,1,0,2,0))),))
=ARRAYFORMULA(IF(A:A<>"",N(A:A&B:B=JOIN(,SORTN(A:B,1,,1,0,2,0))),))
How about the following:
=--AND(A5=MAX($A$1:$A$5),B5=MAXIFS($B$1:$B$5,$A$1:$A$5,MAX($A$1:$A$5)))
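The logic these formulas encode (restrict to the rows where A is at its maximum, take the maximum of B among those rows, and flag the match) can be spelled out in a short Python sketch. This is only an illustration of the logic, not a spreadsheet replacement. The function name flag_best is hypothetical and the lists mirror columns A and B from the question.

```python
A = [10, 20, 30, 20, 30]
B = [1, 7, 3, 2, 5]


def flag_best(a_vals, b_vals):
    """Return 1 for rows where A is maximal and B is the largest among those rows."""
    max_a = max(a_vals)
    # Only rows with A == max(A) take part in the max over B.
    max_b = max(b for a, b in zip(a_vals, b_vals) if a == max_a)
    return [int(a == max_a and b == max_b) for a, b in zip(a_vals, b_vals)]


flags = flag_best(A, B)
print(flags)  # [0, 0, 0, 0, 1]
```

Checking both conditions (A is maximal and B equals the restricted maximum) avoids flagging a row whose B happens to equal that maximum while its A is not maximal.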

How to grab a value corresponding to a particular date, if the date is before/after the dates in the table?

I have a Google Sheet table with a number of inventory additions:
Date | Product | New Units | # Total Units
-----------|---------|-----------|---------------
1/11/2017 | Coke | 14 | 14
1/31/2017 | Pepsi | 6 | 6
2/12/2017 | Coke | 3 | 17
3/13/2017 | Coke | 12 | 29
3/13/2017 | Pepsi | 13 | 19
For example, on Feb 12th 2017 I received 3 new units of Coke, for a total of 17 units. For any given product and any given date, I'd like to be able to say how many units of that product I had on that date.
For example, given the following list of dates in a separate sheet, based on the data above, I'd hope to see this output:
Date | Coke | Pepsi
-----------|------|-------
1/10/2017 | 0 | 0
1/11/2017 | 14 | 0
2/10/2017 | 14 | 6
2/15/2017 | 17 | 6
3/15/2017 | 29 | 19
Is there a formula or formulas I could use to calculate values for B2:B6 and C2:C6?
Paste this in G3 (skipping the first available row avoids #REF!), then drag down, right, and up:
=ARRAYFORMULA(IF($F3<MIN($A$2:$A), 0, IFERROR(IFERROR(
QUERY($A$2:$D,
"select D where A >= date '"&TEXT($F2, "yyyy-mm-dd")&"'
and A <= date '"&TEXT($F3, "yyyy-mm-dd")&"'
and B = '"&G$1&"' ", 0),
QUERY($A$2:$D,
"select D where A >= date '"&TEXT($F1, "yyyy-mm-dd")&"'
and A <= date '"&TEXT($F3, "yyyy-mm-dd")&"'
and B = '"&G$1&"' ", 0)), 0)))
paste in G3 (skip the 1st avail row to avoid #REF!) then drag down, right and up
=ARRAYFORMULA(IF($F2<MIN($A$2:$A), 0, IFERROR(IFERROR(
QUERY(TO_TEXT({VALUE($A$2:$A), $B$2:$D}),
"select Col4 where Col1 >= '"&VALUE($F1)&"'
and Col1 <= '"&VALUE($F2)&"'
and Col2 = '"&G$1&"' ", 0),
QUERY(TO_TEXT({VALUE($A$2:$A), $B$2:$D}),
"select Col4 where Col1 >= '"&VALUE(#REF!)&"'
and Col1 <= '"&VALUE($F2)&"'
and Col2 = '"&G$1&"' ", 0)), 0)))
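The lookup itself is an "as of" search: for each product, find the latest addition on or before the query date and take its running total, or 0 if there is none. A minimal Python sketch of that idea follows. It assumes the # Total Units column is already a running total per product, as in the question, and the names history and units_on are made up for illustration.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import date

# (date, product, new units, running total) rows from the question.
additions = [
    (date(2017, 1, 11), "Coke", 14, 14),
    (date(2017, 1, 31), "Pepsi", 6, 6),
    (date(2017, 2, 12), "Coke", 3, 17),
    (date(2017, 3, 13), "Coke", 12, 29),
    (date(2017, 3, 13), "Pepsi", 13, 19),
]

# Per product: a sorted list of dates and the running total at each date.
history = defaultdict(lambda: ([], []))
for day, product, _new_units, total in sorted(additions, key=lambda r: r[0]):
    days, totals = history[product]
    days.append(day)
    totals.append(total)


def units_on(product, day):
    """Running total of the last addition on or before `day`, else 0."""
    days, totals = history.get(product, ([], []))
    i = bisect_right(days, day)
    return totals[i - 1] if i else 0


queries = [date(2017, 1, 10), date(2017, 1, 11), date(2017, 2, 10),
           date(2017, 2, 15), date(2017, 3, 15)]
table = [(units_on("Coke", d), units_on("Pepsi", d)) for d in queries]
print(table)  # [(0, 0), (14, 0), (14, 6), (17, 6), (29, 19)]
```

The printed pairs match the Coke and Pepsi columns of the desired output in the question.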

Column SUM with "min-max" cell format

I'm trying to make two SUMs on the same column.
Here's my columns:
| 1-2 | 1 |
| 2 | 2-3 |
| 1 | 5 |
|-------|-------|
| 4 | 8 | Sum 1, which takes the "min" value of each cell
| 5 | 9 | Sum 2, which takes the "max" value of each cell
Sum 1 Column 1 : 1 + 2 + 1 = 4
Sum 2 Column 1 : 2 + 2 + 1 = 5
Each cell holds either {num}, a single value, or {min}-{max}, a range giving the min and max values.
This is for work time estimates, where we would like to have this "min-max" concept. We already have a version with split columns, but it would be more convenient to keep one column with two possible values in each cell.
For the min:
=ArrayFormula(SUM(--(IFERROR(LEFT(A1:A3,FIND("-",A1:A3)-1),A1:A3))))
For the Max:
=ArrayFormula(SUM(--(IFERROR(RIGHT(A1:A3,len(A1:A3)-FIND("-",A1:A3)),A1:A3))))
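Both formulas split each cell at the "-" and fall back to the whole cell when there is no dash. The same parsing in a minimal Python sketch, assuming non-negative integers as in the question's examples (the helper names min_max and column_sums are hypothetical):

```python
def min_max(cell):
    """Parse '3' as (3, 3) and '1-2' as (1, 2)."""
    low, _, high = cell.partition("-")
    # With no dash, partition leaves `high` empty, so reuse `low`.
    return int(low), int(high or low)


def column_sums(cells):
    """Return (sum of minima, sum of maxima) for a column of cells."""
    pairs = [min_max(cell.strip()) for cell in cells]
    return sum(lo for lo, _ in pairs), sum(hi for _, hi in pairs)


print(column_sums(["1-2", "2", "1"]))  # (4, 5)
print(column_sums(["1", "2-3", "5"]))  # (8, 9)
```

The two results match Sum 1 and Sum 2 for both columns in the question.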

Joining rows with columns in SAS

I have 2 tables.
The first table lists all possible mistakes and looks like this:
mistake|description
m1 | a
m2 | b
m3 | c
The second table is my data:
n | m1 | m2 | m3
1 | 1 | 0 | 1
2 | 0 | 1 | 1
3 | 1 | 1 | 0
where n is the row number, and each m column holds 1 if the row has that mistake and 0 if it does not.
In the end I want to join them, showing the row numbers (or other info) for each mistake.
Something like:
mistake | n
m1 |1
m1 |3
m2 |2
m2 |3
m3 |1
m3 |2
It looks to me like you are just asking to transpose the data.
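/* Sample data from the question */
data have;
  input n m1 m2 m3 ;
cards;
1 1 0 1
2 0 1 1
3 1 1 0
;
/* One output row per (n, variable) pair: _NAME_ holds m1/m2/m3, COL1 holds the 0/1 flag */
proc transpose data=have out=want ;
  by n ;
  var m1 m2 m3 ;
run;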
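The transpose alone yields one row per (n, mistake) pair, including the pairs whose flag is 0. To get the listing in the question you would still keep only the rows whose flag is 1 and order them by mistake. A minimal Python sketch of that wide-to-long step, with the hypothetical helper name rows_per_mistake and data mirroring the question:

```python
mistakes = ["m1", "m2", "m3"]
data = [
    {"n": 1, "m1": 1, "m2": 0, "m3": 1},
    {"n": 2, "m1": 0, "m2": 1, "m3": 1},
    {"n": 3, "m1": 1, "m2": 1, "m3": 0},
]


def rows_per_mistake(records, names):
    """List (mistake, n) pairs for every record flagged with that mistake."""
    return [(name, rec["n"])
            for name in names
            for rec in records
            if rec[name] == 1]


pairs = rows_per_mistake(data, mistakes)
print(pairs)
# [('m1', 1), ('m1', 3), ('m2', 2), ('m2', 3), ('m3', 1), ('m3', 2)]
```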

Neo4j Divide ( / ) by Zero ( 0 )

In neo4j I am querying
MATCH (n)-[t:x{x:"1a"}]->()
WHERE n.a > 1 OR n.b > 1 AND toFloat(n.a) / (n.a+n.b) * 100 < 90
RETURN DISTINCT n, toFloat(n.a) / (n.a + n.b) * 100
ORDER BY toFloat(n.a) / (n.a + n.b) * 100 DESC
LIMIT 10
but I got / by zero error.
Since I required that n.a or n.b be greater than 1, a row where both are zero should be skipped, so I shouldn't get this error. This looks like a logic issue in Neo4j. There is no problem when I delete AND toFloat(n.a)/(n.a+n.b)*100 < 90 from the WHERE clause, but I only want results lower than 90. How can I overcome this?
Can either of n.a or n.b be negative? I was able to reproduce this with:
WITH -2 AS na, 2 AS nb
WHERE (na > 1 OR nb > 1) AND toFloat(na)/(na+nb)*100 < 90
RETURN na, nb
And I get: / by zero
Perhaps try changing your WHERE clause to:
WITH -2 AS na, 2 AS nb
WHERE (na + nb > 0) AND toFloat(na)/(na+nb)*100 < 90
RETURN na, nb
And I get: zero rows.
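The arithmetic behind this reproduction can be checked outside Neo4j. Below is a minimal Python sketch. The function names are made up for illustration, and Python's `and` always short-circuits left to right, which is an assumption you cannot carry over to Cypher's WHERE predicates. The point is only that na = -2, nb = 2 satisfies the first condition while making the denominator zero, and that the na + nb > 0 guard rules that case out.

```python
def pct(na, nb):
    """Share of na in na + nb, as a percentage."""
    return float(na) / (na + nb) * 100


def passes_unguarded(na, nb):
    # The original predicate: nothing stops na + nb from being zero.
    return (na > 1 or nb > 1) and pct(na, nb) < 90


def passes_guarded(na, nb):
    # The suggested predicate: require a positive denominator first.
    return (na + nb > 0) and pct(na, nb) < 90


try:
    passes_unguarded(-2, 2)
    outcome = "no error"
except ZeroDivisionError:
    outcome = "division by zero"

print(outcome)                # division by zero
print(passes_guarded(-2, 2))  # False
print(passes_guarded(1, 3))   # True
```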
It seems the second condition, toFloat(na) / (na + nb) * 100 < 90, is tested before the first. Look at the Filter(1) operator in this execution plan:
+--------------+---------------+------+--------+--------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Operator | EstimatedRows | Rows | DbHits | Identifiers | Other |
+--------------+---------------+------+--------+--------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection | 1 | 3 | 0 | anon[111], anon[138], n, toFloat(n.a)/(n.a + n.b)* 100 | anon[111]; anon[138] |
| Top | 1 | 3 | 0 | anon[111], anon[138] | { AUTOINT6}; |
| Distinct | 0 | 3 | 24 | anon[111], anon[138] | anon[111], anon[138] |
| Filter(0) | 0 | 3 | 6 | anon[29], n, t | t.x == { AUTOSTRING0} |
| Expand(All) | 1 | 3 | 6 | anon[29], n, t | ( n#7)-[t:x]->() |
| Filter(1) | 1 | 3 | 34 | n | (Ors(List(n#7.a > { AUTOINT1}, Multiply(Divide(ToFloatFunction( n#7.a),Add( n#7.a, n#7.b)),{ AUTOINT3}) < { AUTOINT4})) AND Ors(List( n#7.a > { AUTOINT1}, n.b > { AUTOINT2}))) |
| AllNodesScan | 4 | 4 | 5 | n | |
+--------------+---------------+------+--------+--------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
You can get around this by forcing the filter to be split into two clauses:
MATCH (n)-[t:x { x:"1a" }]->()
WHERE n.a > 1 OR n.b > 1
WITH n
WHERE toFloat(n.a) / (n.a + n.b) * 100 < 90
RETURN DISTINCT n, toFloat(n.a) / (n.a + n.b) * 100
ORDER BY toFloat(n.a) / (n.a + n.b) * 100 DESC
LIMIT 10
I found this behavior surprising, but on reflection I suppose it isn't wrong for the execution engine to rearrange the filter in this way. One might assume that evaluation stops as soon as the first declared condition fails, but Cypher is exactly that: declarative. We express the "what", not the "how", and in terms of the "what", A AND B is equivalent to B AND A.
Here is the query and a sample graph, you can check if it translates to your actual data:
http://console.neo4j.org/r/f6kxi5
