In a stored procedure, I have eight variables: value1, value2, value3, ..., value8.
Each value will be a letter between A and F.
After the values are stored in the variables, is there a way to see how many distinct values are stored in value1 through value8?
Example:
value1 = F; value2 = A; value3 = B; value4 = B;
value5 = B; value6 = D; value7 = D; value8 = F;
Then, after my count(distinct *), the result should be 4 distinct groups.
select Value, COUNT(*) from (
select value1 as Value from sometable
union all
select value2 as Value from sometable
union all
select value3 as Value from sometable
union all
select value4 as Value from sometable
union all
select value5 as Value from sometable
union all
select value6 as Value from sometable
union all
select value7 as Value from sometable
union all
select value8 as Value from sometable) as SomeTable
group by Value
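Note that the query above returns one row per value together with its count, so the number of distinct values is the number of rows it returns. To get the single number 4 directly, COUNT(DISTINCT ...) over the same unpivoted column should work. Here is a minimal sketch, run through Python's sqlite3 module only so that it is self-contained. The one-row sometable and its contents are assumptions taken from the example in the question, and the SQL dialect of your stored procedure may differ.

```python
import sqlite3

# Hypothetical one-row table standing in for the eight variables (an assumption).
conn = sqlite3.connect(":memory:")
columns = [f"value{i}" for i in range(1, 9)]
conn.execute(f"CREATE TABLE sometable ({', '.join(c + ' TEXT' for c in columns)})")
conn.execute(
    "INSERT INTO sometable VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    ("F", "A", "B", "B", "B", "D", "D", "F"),
)

# Unpivot the eight columns into one column named Value, as in the answer.
unpivot = " UNION ALL ".join(f"SELECT {c} AS Value FROM sometable" for c in columns)

# The answer's query: one row per value with its count.
per_value = conn.execute(
    f"SELECT Value, COUNT(*) FROM ({unpivot}) AS t GROUP BY Value ORDER BY Value"
).fetchall()

# Variation: count the distinct values directly.
distinct_count = conn.execute(
    f"SELECT COUNT(DISTINCT Value) FROM ({unpivot}) AS t"
).fetchone()[0]

print(per_value)
print(distinct_count)
```

With the example values this gives four groups (A, B, D, F), which matches the expected result in the question.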
When given the choice to either join or filter in Pig, which is more performance-intensive?
A join is generally more costly than a filter, because every tuple in the first relation has to be matched against the tuples of the second relation. Consider the example below:
A = LOAD 'data1' AS (a1:int,a2:int,a3:int);
DUMP A;
(1,2,3)
(4,2,1)
(8,3,4)
(4,3,3)
(7,2,5)
(8,4,3)
B = LOAD 'data2' AS (b1:int,b2:int);
DUMP B;
(2,4)
(8,9)
(1,3)
(2,7)
(2,9)
(4,6)
(4,9)
X = JOIN A BY a1, B BY b1;
DUMP X;
(1,2,3,1,3)
(4,2,1,4,6)
(4,3,3,4,6)
(4,2,1,4,9)
(4,3,3,4,9)
(8,3,4,8,9)
(8,4,3,8,9)
When we join into X, each tuple in A has to be matched against the tuples in B. For a filter, we traverse the dataset just once and evaluate the filter condition on each tuple independently.
X = FILTER A BY a3 == 3;
DUMP X;
(1,2,3)
(4,3,3)
(8,4,3)
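To make the difference in work concrete, here is a rough sketch in plain Python (not Pig) using the same data. It is only an illustration: the function names filter_a3 and hash_join are made up for this example, and Pig's actual execution plan is not modelled here. The point is that the filter looks at each tuple of A on its own, while the join first has to organise B by key and then match every tuple of A against it.

```python
from collections import defaultdict

# Same tuples as relations A and B in the example above.
A = [(1, 2, 3), (4, 2, 1), (8, 3, 4), (4, 3, 3), (7, 2, 5), (8, 4, 3)]
B = [(2, 4), (8, 9), (1, 3), (2, 7), (2, 9), (4, 6), (4, 9)]


def filter_a3(rows, value):
    """Single pass: each tuple is tested independently of all others."""
    return [row for row in rows if row[2] == value]


def hash_join(left, right):
    """Equi-join on the first field: index one side by key, then probe it."""
    by_key = defaultdict(list)
    for r in right:  # extra pass over the second relation
        by_key[r[0]].append(r)
    # Every left tuple is paired with all right tuples sharing its key.
    return [l + r for l in left for r in by_key.get(l[0], [])]


filtered = filter_a3(A, 3)
joined = hash_join(A, B)

print(filtered)
print(len(joined))
```

The filter touches 6 tuples and keeps 3 of them. The join has to read both relations and produces 7 tuples, the same set as DUMP X above (possibly in a different order).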
I have a query expression of the form:
let result =
    query { for row in context.Table do
            where (row.id = 111111)
            select (row.col1, row.col2, row.col3) }
Result returns a value of type IQueryable<Nullable<float>*Nullable<float>*Nullable<float>>. I want it to return seq<float>*seq<float>*seq<float>.
I can try altering it like so:
let result : seq<float> * seq<float> * seq<float> =
    query { for row in context.Table do
            where (row.id = 111111)
            select (row.col1, row.col2, row.col3) }
    |> Seq.cast
But I get:
Type mismatch. Expecting a IQueryable<Nullable<float> * Nullable<float> * Nullable<float>> -> seq<float> * seq<float> * seq<float>
but given a IQueryable<Nullable<float> * Nullable<float> * Nullable<float>> -> seq<'a>
The type 'seq<float> * seq<float> * seq<float>' does not match the type 'seq<'a>'
What am I doing wrong?
EDIT: Here's what I am trying to do. It sounds like my question as asked may get me a usable answer, but not the best way to do what I want. This code is ugly but works:
let col1 : seq<float> =
    query { for row in context.Table do
            where (row.id = 111111)
            select row.col1 }
    |> Seq.cast
let col2 : seq<float> =
    query { for row in context.Table do
            where (row.id = 111111)
            select row.col2 }
    |> Seq.cast
let model = MathNet.Numerics.Interpolation.CubicSpline.InterpolateAkima(col1,col2)
If I don't cast to float, InterpolateAkima won't work because it doesn't accept type Nullable. I don't want to have to do a query for each column on its own, though. My eventual goal is to have a function where I can pass in any value of row.id and get the model for col1,col2 then col1,col3 and so on.
There are two parts in this question:
Transforming a seq<a * b * c> into seq<a> * seq<b> * seq<c>: List.unzip3 and Array.unzip3 do exactly that.
Getting rid of the Nullable: it depends on what you want to happen when a value is null.
If you want to return 0 for null values:
let col1, col2, col3 =
    query { for row in context.Table do
            where (row.id = 111111)
            let col1 = if row.col1.HasValue then row.col1.Value else 0.
            let col2 = if row.col2.HasValue then row.col2.Value else 0.
            let col3 = if row.col3.HasValue then row.col3.Value else 0.
            select (col1, col2, col3) }
    |> Array.ofSeq
    |> Array.unzip3
If you want to ignore rows where there is a null:
let col1, col2, col3 =
    query { for row in context.Table do
            where (row.id = 111111 && row.col1.HasValue && row.col2.HasValue && row.col3.HasValue)
            select (row.col1.Value, row.col2.Value, row.col3.Value) }
    |> Array.ofSeq
    |> Array.unzip3
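If it helps to see the two steps without the query machinery, here is a language-neutral sketch of the same idea written in Python rather than F#. The sample rows and the helper name unzip3 are made up for illustration, with None standing in for a null Nullable<float>. It shows both options: replacing nulls with 0 and dropping rows that contain a null, each followed by the split into three columns.

```python
# Hypothetical query result: three nullable columns per row (None = null).
rows = [(1.0, 2.0, None), (3.0, None, 6.0), (5.0, 7.0, 9.0)]


def unzip3(triples):
    """Split a sequence of 3-tuples into three lists, like Array.unzip3."""
    firsts, seconds, thirds = [], [], []
    for a, b, c in triples:
        firsts.append(a)
        seconds.append(b)
        thirds.append(c)
    return firsts, seconds, thirds


# Option 1: replace nulls with 0, then split into columns.
filled = [tuple(0.0 if v is None else v for v in row) for row in rows]
col1, col2, col3 = unzip3(filled)

# Option 2: ignore rows where any column is null, then split.
complete = [row for row in rows if all(v is not None for v in row)]
kept = unzip3(complete)

print(col1, col2, col3)
print(kept)
```

Either way the query runs once and all three columns come out of the same result, which is what avoids one query per column.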
I, like Mark, am wondering what you are trying to accomplish with this, but nevertheless, here is one way you could do it:
open System
open System.Linq
// Helpers to recreate your circumstances.
type Context = { id : Int32; col1 : Nullable<Double>; col2 : Nullable<Double>; col3 : Nullable<Double> }
let context = Unchecked.defaultof<IQueryable<Context>>
let result =
    query { for row in context do
            where (row.id = 111111)
            select (row.col1, row.col2, row.col3) }
let seqTuple =
    result
    |> Seq.fold (fun (col1s, col2s, col3s) (col1, col2, col3) ->
        (if col1.HasValue then col1.Value :: col1s else col1s),
        (if col2.HasValue then col2.Value :: col2s else col2s),
        (if col3.HasValue then col3.Value :: col3s else col3s)
        ) ([], [], [])
    |> fun (col1s, col2s, col3s) ->
        List.rev col1s,
        List.rev col2s,
        List.rev col3s
Consider the following two tables, with 3 columns each:
Table 1:
a INTEGER NOT NULL,
b INTEGER NOT NULL,
c INTEGER NOT NULL
Table 2:
d INTEGER NOT NULL,
e INTEGER,
f INTEGER NOT NULL
I'm trying to write a query expression that joins the two tables on a two-part composite key: (b, c) = (e, f).
I know that if column e were not Nullable I could just write:
query {
    for r1 in c.table1 do
    join r2 in c.table2 on ((r1.b, r1.c) = (r2.e, r2.f))
    .
    .
}
But how do I do it if column e is Nullable but column b is not?
Suppose I have the following table:
MyTable
id INTEGER PRIMARY KEY
column_a TEXT
column_b TEXT
Now I want to return rows where column_a is not unique, but where column_b differs between the rows that share that column_a. So if I have the following data in the table:
id  column_a  column_b
1   A         x
2   B         x
3   A         y
4   A         x
5   B         x
6   C         z
I want the SQL statement to return this:
id  column_a  column_b
1   A         x
3   A         y
because column_a is the same in both rows but column_b differs. The rows with column_a="B" have the same value in column_b, so they should not be returned. And the row with column_a="C" has a unique column_a, so it shouldn't be returned either. How would I do that?
I've gotten halfway with the following SQL:
SELECT *
FROM MyTable
JOIN
(
    SELECT column_a, column_b
    FROM MyTable
    GROUP BY column_a
    HAVING COUNT(*) >= 2
) TmpTable
ON MyTable.column_a = TmpTable.column_a
WHERE MyTable.column_b != TmpTable.column_b
but that omits the last of the rows that I want to return, so in the above example it would only return
id  column_a  column_b
1   A         x
SELECT MIN(id),
column_a,
column_b
FROM MyTable
WHERE column_a IN (SELECT column_a
FROM MyTable
GROUP BY column_a
HAVING COUNT(DISTINCT column_b) >= 2)
GROUP BY column_a,
column_b
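As a quick check, the query above can be run against the sample data from the question. The sketch below uses Python's sqlite3 module with an in-memory database purely for convenience. SQLite is an assumption here, since the question does not say which database is in use, and the ORDER BY was added only to make the output order predictable.

```python
import sqlite3

# In-memory table holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (id INTEGER PRIMARY KEY, column_a TEXT, column_b TEXT)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [(1, "A", "x"), (2, "B", "x"), (3, "A", "y"),
     (4, "A", "x"), (5, "B", "x"), (6, "C", "z")],
)

# The subquery keeps column_a values that occur with at least two different
# column_b values; the outer GROUP BY collapses duplicate (column_a, column_b)
# pairs and MIN(id) picks the first id of each pair.
rows = conn.execute(
    """
    SELECT MIN(id), column_a, column_b
    FROM MyTable
    WHERE column_a IN (SELECT column_a
                       FROM MyTable
                       GROUP BY column_a
                       HAVING COUNT(DISTINCT column_b) >= 2)
    GROUP BY column_a, column_b
    ORDER BY MIN(id)
    """
).fetchall()

print(rows)
```

Only column_a = "A" has two different column_b values, so the result is the rows with id 1 and 3. Row 4 is folded into row 1 because it has the same (column_a, column_b) pair.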
I have a table like this in SQL:
ID  NAME  SIZE  GROUP1  GROUP2  SIZE2
1   casa  xl    1       2
2   casa  l     1       2
I'd like to obtain a table like this:
ID  NAME  SIZE  GROUP1  GROUP2  SIZE2
1   casa  xl    1       2       l
2   casa  l     1       2       xl
So the values of GROUP1 and GROUP2 identify the IDs of the rows that have the same NAME but a different value for SIZE.
How can I do this?
Join the table to itself, matching each record with the ID in GROUP1/GROUP2 that is not its own:
select
t.ID, t.NAME, t.SIZE, t.GROUP1, t.GROUP2, t2.SIZE
from
TheTable t
inner join TheTable t2 on t2.ID = case t.GROUP1 when t.ID then t.GROUP2 else t.GROUP1 end
To select from table1 and insert it into table2:
insert into table2
select
t.ID, t.NAME, t.SIZE, t.GROUP1, t.GROUP2, t2.SIZE
from
table1 t
inner join table1 t2 on t2.ID = case t.GROUP1 when t.ID then t.GROUP2 else t.GROUP1 end
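To see what the self-join returns, here is a small check against the two sample rows from the question. It uses Python's sqlite3 module with an in-memory database just to be runnable as-is. SQLite, the column types, the SIZE2 alias and the ORDER BY are assumptions added for this sketch.

```python
import sqlite3

# In-memory copy of the sample table (column types are assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TheTable (ID INTEGER PRIMARY KEY, NAME TEXT, SIZE TEXT, "
    "GROUP1 INTEGER, GROUP2 INTEGER)"
)
conn.executemany(
    "INSERT INTO TheTable VALUES (?, ?, ?, ?, ?)",
    [(1, "casa", "xl", 1, 2), (2, "casa", "l", 1, 2)],
)

# For each row, the CASE picks whichever of GROUP1/GROUP2 is not the row's
# own ID, and the join fetches the SIZE of that other row.
rows = conn.execute(
    """
    SELECT t.ID, t.NAME, t.SIZE, t.GROUP1, t.GROUP2, t2.SIZE AS SIZE2
    FROM TheTable t
    INNER JOIN TheTable t2
        ON t2.ID = CASE t.GROUP1 WHEN t.ID THEN t.GROUP2 ELSE t.GROUP1 END
    ORDER BY t.ID
    """
).fetchall()

for row in rows:
    print(row)
```

Row 1 picks up the size of row 2 ("l") and row 2 picks up the size of row 1 ("xl"), which is the table asked for in the question.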