SQL - several columns to one - psql

Is it possible to write a SQL query that turns several columns into a single one?
For example, turning the current table structure:
**product_ID | month_A | month_B | month_C**
AAAAA | 15 | 18 | 16
BBBBB | 20 | 21 | 26
CCCCC | 40 | 48 | 41
into the following, so that I can better use pivot tables in Excel:
**product_ID | sales_qt | month**
AAAAA | 15 | A
AAAAA | 18 | B
AAAAA | 16 | C
BBBBB | 20 | A
BBBBB | 21 | B
BBBBB | 26 | C
CCCCC | 40 | A
CCCCC | 48 | B
CCCCC | 41 | C
Best regards!!!

I am not intimately familiar with Pervasive. But a standard SQL method would be:
select product_ID, month_A as sales_qt, 'A' as month
from t
union all
select product_ID, month_B as sales_qt, 'B' as month
from t
union all
select product_ID, month_C as sales_qt, 'C' as month
from t;
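If the target is actually PostgreSQL (the title says psql), a LATERAL VALUES unpivot reads the table only once instead of three times. A minimal sketch of my own, assuming the same table and column names as above:
select t.product_ID, v.sales_qt, v.month
from t
cross join lateral (
    values (t.month_A, 'A'), (t.month_B, 'B'), (t.month_C, 'C')
) as v(sales_qt, month);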

Related

How to get SUM of the values of comma separated variables and the variable count in a single spreadsheet cell?

I would like to get the SUM of the values of comma-separated variables (Issued Items) in Google Sheets. Please see the table below:
+-----------------------+-----------+
| Issued Items | SUM |
+-----------------------+-----------+
| A-22, A-22, B-11 | 120 |
+-----------------------+-----------+
| C-33, 11, 22-X | 160 |
+-----------------------+-----------+
| 22-X, D-54, 22 | 110 |
+-----------------------+-----------+
Edited: the value for each item is stored in another sheet. Also, how can I get the count of issued items?
Please note that items may repeat within a single cell, and may also carry prefixes or suffixes; each occurrence needs to be counted individually.
+-------+--------+-----+
| Items | Values | QTY |
+-------+--------+-----+
| A-22  | 50     | 2   |
+-------+--------+-----+
| B-11  | 20     | 1   |
+-------+--------+-----+
| C-33  | 70     | 1   |
+-------+--------+-----+
| D-54  | 40     | 1   |
+-------+--------+-----+
| 11    | 30     | 1   |
+-------+--------+-----+
| 22    | 10     | 1   |
+-------+--------+-----+
| 22-X  | 60     | 2   |
+-------+--------+-----+
It would save a lot of time and effort for me.
Please help and thanks in advance.
Assuming the issued items are in column D and the Items/Values lookup table in columns A:B, this gives the SUM for one cell:
=SUM(ARRAYFORMULA(IFERROR(VLOOKUP(TRANSPOSE(SPLIT(D2, ", ")), A:B, 2, 0))))
And this builds the whole QTY column, counting every occurrence of each item across D2:D:
={"QTY"; ARRAYFORMULA(IFERROR(VLOOKUP(TO_TEXT(A2:A),
 QUERY(TRANSPOSE(SPLIT(TEXTJOIN(", ", 1, D2:D), ", ")),
 "select Col1,count(Col1) group by Col1 label count(Col1)''", 0), 2, 0)))}

Counting if range matches ranged criteria 1:1

I have an ongoing scoreboard with a friend for a game we play. It looks like this:
A B C D E F
+-----------------------------+-------+------+--------+--------+------------+
1 | Through the Ages Scoreboard | | | | | |
+-----------------------------+-------+------+--------+--------+------------+
2 | Game title | Kevin | M | First? | Winner | Difference |
+-----------------------------+-------+------+--------+--------+------------+
3 | thekoalaz's Game | 174 | 213 | Kevin | M | 39 |
4 | Game #0 | 242 | 126 | Kevin | Kevin | 116 |
5 | Game #1 | 105 | 146 | Kevin | M | 41 |
6 | Game #2 | 158 | 135 | Kevin | Kevin | 23 |
7 | Game #3 | 149 | 145 | M | Kevin | 4 |
8 | Game #4 | 91 | 145 | Kevin | M | 54 |
9 | Game #5 | 211 | 187 | M | Kevin | 24 |
10 | Game #6 | 160 | 158 | M | Kevin | 2 |
11 | Game #7 | 154 | 215 | Kevin | M | 61 |
12 | Game #8 | 169 | 177 | M | M | 8 |
13 | Game #9 | 135 | 129 | M | Kevin | 6 |
14 | Game #10 | 156 | 262 | Kevin | M | 106 |
15 | Game #11 | 205 | 171 | M | Kevin | 34 |
16 | Game #12 (2) | 186 | 203 | Kevin | M | 17 |
17 | | | | | | |
+-----------------------------+-------+------+--------+--------+------------+
Where there's space at the end of the board to add scores for future games.
How do I count how many times the player who goes first wins? In this case it should be 3: D4 = E4, D6 = E6, D12 = E12. Is this possible to do in a single formula? And I'd like to make adding future game scores "just work" with this as well.
Here, first is {K;K;K;K;M;K;M;M;K;M;M;K;M;K}
And winner is {M;K;M;K;K;M;K;K;M;M;K;M;K;M}
I tried =COUNTIF($E$3:$E, $D$3:$D), but this gives me 7, which I presume is the same as =COUNTIF($E$3:$E, $D$3), without the ranged criteria.
Other ranged criteria questions didn't seem to focus on this 1:1 necessity (or maybe I don't know how to word it).
Here's what I used:
=SUMPRODUCT(D3:D=E3:E, E3:E<>"")
Let's break it down.
D3:D=E3:E (also expressible as EQ(D3:D, E3:E)) - equality. I tried to figure out the concept of testing equality of ranges, but the best thing I could find was Microsoft's tutorial on array formulas. What I can say is that if you just put =D3:D=E3:E in your Google sheet, you will get just one of the results--the one that matches the row. It requires =ArrayFormula(D3:D=E3:E) to be entered as the array of equality results.
SUMPRODUCT - Sums the products of corresponding array elements across multiple arrays. For example, SUMPRODUCT({1,3}, {2,4}) = 1*2 + 3*4 = 14. If used with one array, it simply aggregates the array's values. TRUE=1 and FALSE=0, so when considering the array formula above, it will count how many times D3:D=E3:E is true. Ranges work as arrays, which may be why wrapping the equality in ArrayFormula(...) isn't necessary here.
E3:E<>"" - Another array formula, testing that the E cell is not empty (<> is the "not equals" operator). I need this because the formula should automatically cover future entries, and D3:D=E3:E evaluates to true for the empty rows at the bottom (empty=empty). Multiplying these two arrays together is effectively an AND operator--"count this row if Dn=En AND En is not empty". To convince you, here are the truth tables:
+-----+---+---+ +------+---+---+
| AND | T | F | | MULT | 1 | 0 |
+-----+---+---+ +------+---+---+
| T | T | F | | 1 | 1 | 0 |
| F | F | F | | 0 | 0 | 0 |
+-----+---+---+ +------+---+---+
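For what it's worth, an explicit ArrayFormula equivalent (my own restatement, not from the original answer) gives the same count; the multiplication coerces the two boolean arrays to 0/1 before summing, exactly as in the SUMPRODUCT version:
=ARRAYFORMULA(SUM((D3:D=E3:E)*(E3:E<>"")))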

SQL: Advanced time slice in vertica

Hey folks: I have the following table in a Vertica DB:
+-----+------+----------+
| Tid | item | time_sec |
+-----+------+----------+
| 1 | A | 1 |
| 1 | B | 2 |
| 1 | C | 4 |
| 1 | D | 5 |
| 1 | E | 6 |
| 2 | A | 5 |
| 2 | E | 5 |
+-----+------+----------+
My goal is to create new item groups that lie within a time window deltaT, meaning that the difference between the first and the last item's timestamps is smaller than or equal to deltaT. Example: if deltaT = 2 sec, we would get the new table:
+-----+------+
| Tid | item |
+-----+------+
| 11 | A |
| 11 | B |
| 12 | B |
| 12 | C |
| 13 | C |
| 13 | D |
| 13 | E |
| 14 | D |
| 14 | E |
| 15 | E |
| 21 | A |
| 21 | E |
+-----+------+
Here is the walk through of the table:
First we inspect all items with Tid 1 and create sub-groups with Tid 1n, where n is a counter.
Our first sub-group, with Tid 11, consists of items A, B, since deltaT between the last and first item is <= 2. The next group, Tid 12, has items B, C. The group after that, Tid 13, has items C, D, E, since all of them lie within a time span of 2 seconds. This goes on until the last item with Tid 1. Then we start over with the group that has Tid 2.
The new Tid numbering for the sub-groups can be continuous (1...6); I just chose this kind of numbering to show the relation to the original table.
I am looking at the Vertica functions LAG and TIME_SLICE but cannot figure out how to handle such a problem elegantly.
This is how far I got - and it does not answer your question, really. But it could constitute a few pointers:
WITH
-- your input
input(Tid,item,time_sec) AS (
SELECT 1,'A',1
UNION ALL SELECT 1,'B',2
UNION ALL SELECT 1,'C',4
UNION ALL SELECT 1,'D',5
UNION ALL SELECT 1,'E',6
UNION ALL SELECT 2,'A',5
UNION ALL SELECT 2,'E',5
)
-- end of your input, start your "real" WITH clause here
,
input_w_ts AS (
SELECT
*
, TIMESTAMPADD('SECOND',time_sec-1,TIMESTAMP '2000-01-01 00:00:00') AS ts
FROM input
)
SELECT
TS_LAST_VALUE(Tid) AS Tid
, item
, TS_LAST_VALUE(time_sec) AS time_sec
, tsr
FROM input_w_ts
TIMESERIES tsr AS '2 SECONDS' OVER (PARTITION BY item ORDER BY ts)
ORDER BY 1,4
;
Output:
Tid|item|time_sec|tsr
1|A | 1|2000-01-01 00:00:00
1|B | 2|2000-01-01 00:00:00
1|A | 1|2000-01-01 00:00:02
1|C | 4|2000-01-01 00:00:02
1|D | 5|2000-01-01 00:00:04
1|E | 6|2000-01-01 00:00:04
2|A | 5|2000-01-01 00:00:04
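It doesn't fully answer the question either, but a plain self-join comes closer to the requested output. A sketch of my own (not from the original answer), assuming every distinct timestamp anchors a candidate group containing all items of the same Tid within deltaT of it; it reuses the input CTE from above, and sub_group is a per-Tid counter rather than the concatenated 11/12/... labels:
WITH anchors AS (
    SELECT DISTINCT Tid, time_sec FROM input
)
SELECT
    a.Tid
  , DENSE_RANK() OVER (PARTITION BY a.Tid ORDER BY a.time_sec) AS sub_group
  , i.item
FROM anchors a
JOIN input i
  ON  i.Tid = a.Tid
  AND i.time_sec BETWEEN a.time_sec AND a.time_sec + 2  -- deltaT = 2
ORDER BY a.Tid, sub_group, i.time_sec;
For deltaT = 2 this yields the groups (A,B), (B,C), (C,D,E), (D,E), (E) for Tid 1 and (A,E) for Tid 2, matching the walk-through above.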

Calculate a bunch of data to display on stacked bar

I'm struggling with creating my first chart.
I have a dataset of ordinal-scaled data from a survey.
It has several questions with possible answers from 1 to 5.
So I have around 110 responses from different persons, which I want to aggregate and show in a stacked bar chart.
The data looks like:
| taste | region | brand | price |
| 1 | 3 | 4 | 2 |
| 1 | 1 | 5 | 1 |
| 1 | 3 | 4 | 3 |
| 2 | 2 | 5 | 1 |
| 1 | 1 | 4 | 5 |
| 5 | 3 | 5 | 2 |
| 1 | 5 | 5 | 2 |
| 2 | 4 | 1 | 3 |
| 1 | 3 | 5 | 4 |
| 1 | 4 | 4 | 5 |
...
To display that in a stacked bar chart, I need to sum it up.
So I know that at the end it needs to be calculated like:
| | taste | region | brand | price |
| 1 | 60 | 20 | 32 | 12 |
| 2 | 23 | 32 | 54 | 22 |
| 3 | 24 | 66 | 36 | 65 |
| 4 | 55 | 68 | 28 | 54 |
| 5 | 10 | 10 | 12 | 22 |
(this is just to demonstrate; the values are not correct)
Or maybe there is already a function for it in SPSS, but I have no idea where and how.
Any advice on how to do that?
I can't think of a single command but there are many ways to get to where you want. Here's one:
First, recreating your sample data:
data list list/ taste region brand price .
begin data
1 3 4 2
1 1 5 1
1 3 4 3
2 2 5 1
1 1 4 5
5 3 5 2
1 5 5 2
2 4 1 3
1 3 5 4
1 4 4 5
end data.
Now counting the values for each row:
vector t(5) r(5) b(5) p(5).
* the vector command is only necessary so the new variables will be ordered comfortably for the following parts.
do repeat vl= 1 to 5/t=t1 to t5/r=r1 to r5/b=b1 to b5/p=p1 to p5.
compute t=(taste=vl).
compute r=(region=vl).
compute b=(brand=vl).
compute p=(price=vl).
end repeat.
Now we can aggregate and restructure to arrive at the exact data structure you specified:
aggregate /outfile=* /break= /t1 to t5 r1 to r5 b1 to b5 p1 to p5 = sum(t1 to p5).
varstocases /make taste from t1 to t5 /make region from r1 to r5
/make brand from b1 to b5/ make price from p1 to p5/index=val(taste).
compute val = char.substr(val,2,1).
alter type val(f1).
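A side note of my own (not part of the original answer): if you only need the counts for the chart rather than a restructured dataset, a plain FREQUENCIES run prints the same numbers as separate tables:
frequencies variables = taste region brand price.
The restructure above is still the way to go if the chart builder needs the counts as actual data values.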

writing a custom template/parser/filter for use in syslog-ng

My application generates logs and sends them to syslog-ng.
I want to write a custom template/parser/filter for use in syslog-ng to correctly store the fields in tables of an SQLite database (MyDatabase).
This is the legend of my log:
unique-record-id usename date Quantity BOQ possible,item,profiles Count Vendor applicable,vendor,categories known,request,types vendor_code credit
All these 12 fields are tab-separated, and the parser must store them into 12 columns of table MyTable1 in MyDatabase.
Some of the fields, however (the 6th, 9th, and 10th), also contain "sub-fields" as comma-separated values.
The number of values within each of these sub-fields is variable and can change with each line of the log.
I need these fields to be stored in respective separate tables:
MyItem_type, MyVendor_groups, MyReqs
These "secondary" tables have 3 columns, recording the Unique-Record-ID and the Quantity against each occurrence in the log.
So the schema in MyItem_type table looks like:
Unique-Record-ID | item_profile | Quantity
Similarly the schema of MyVendor_groups looks like:
Unique-Record-ID | vendor_category | Quantity
and the schema of MyReqs looks like:
Unique-Record-ID | req_type | Quantity
Consider these sample lines from the log:
unique-record-id usename date Quantity BOQ possible,item,profiles Count Vendor applicable,vendor,categories known,request,types vendor_code credit
234.44.tfhj Sam 22-03-2016 22 prod1 cat1,cat22,cat36,cat44 66 ven1 t1,t33,t43,t49 req1,req2,req3,req4 blue 64.22
234.45.tfhj Alex 23-03-2016 100 prod2 cat10,cat36,cat42 104 ven1 t22,t45 req1,req2,req33,req5 red 66
234.44.tfhj Vikas 24-03-2016 88 prod1 cat101,cat316,cat43 22 ven2 t22,t43 req1,req23,req3,req6 red 77.12
234.47.tfhj Jane 25-03-2016 22 prod7 cat10,cat36,cat44 43 ven3 t77 req1,req24,req3,req7 green 45.89
234.48.tfhj John 26-03-2016 97 serv3 cat101,cat36,cat45 69 ven5 t1 req11,req2,req3,req8 orange 33.04
234.49.tfhj Ruby 27-03-2016 85 prod58 cat10,cat38,cat46 88 ven9 t33,t55,t99 req1,req24,req3,req9 white 46.04
234.50.tfhj Ahmed 28-03-2016 44 serv7 cat110,cat36,cat47 34 ven11 t22,t43,t77 req1,req20,req3,req10 red 43
My parser should store the above log into MyDatabase.Mytable1 as:
unique-record-id | usename | date | Quantity | BOQ | item_profile | Count | Vendor | vendor_category | req_type | vendor_code | credit
234.44.tfhj | Sam | 22-03-2016 | 22 | prod1 | cat1,cat22,cat36,cat44 | 66 | ven1 | t1,t33,t43,t49 | req1,req2,req3,req4 | blue | 64.22
234.45.tfhj | Alex | 23-03-2016 | 100 | prod2 | cat10,cat36,cat42 | 104 | ven1 | t22,t45 | req1,req2,req33,req5 | red | 66
234.44.tfhj | Vikas | 24-03-2016 | 88 | prod1 | cat101,cat316,cat43 | 22 | ven2 | t22,t43 | req1,req23,req3,req6 | red | 77.12
234.47.tfhj | Jane | 25-03-2016 | 22 | prod7 | cat10,cat36,cat44 | 43 | ven3 | t77 | req1,req24,req3,req7 | green | 45.89
234.48.tfhj | John | 26-03-2016 | 97 | serv3 | cat101,cat36,cat45 | 69 | ven5 | t1 | req11,req2,req3,req8 | orange | 33.04
234.49.tfhj | Ruby | 27-03-2016 | 85 | prod58 | cat10,cat38,cat46 | 88 | ven9 | t33,t55,t99 | req1,req24,req3,req9 | white | 46.04
234.50.tfhj | Ahmed | 28-03-2016 | 44 | serv7 | cat110,cat36,cat47 | 34 | ven11 | t22,t43,t77 | req1,req20,req3,req10 | red | 43
And also parse the "possible,item,profiles" to record into MyDatabase.MyItem_type as:
Unique-Record-ID | item_profile | Quantity
234.44.tfhj | cat1 | 22
234.44.tfhj | cat22 | 22
234.44.tfhj | cat36 | 22
234.44.tfhj | cat44 | 22
234.45.tfhj | cat10 | 100
234.45.tfhj | cat36 | 100
234.45.tfhj | cat42 | 100
234.44.tfhj | cat101 | 88
234.44.tfhj | cat316 | 88
234.44.tfhj | cat43 | 88
234.47.tfhj | cat10 | 22
234.47.tfhj | cat36 | 22
234.47.tfhj | cat44 | 22
234.48.tfhj | cat101 | 97
234.48.tfhj | cat36 | 97
234.48.tfhj | cat45 | 97
234.49.tfhj | cat10 | 85
234.49.tfhj | cat38 | 85
234.49.tfhj | cat46 | 85
234.50.tfhj | cat110 | 44
234.50.tfhj | cat36 | 44
234.50.tfhj | cat47 | 44
We also need to similarly parse "applicable,vendor,categories" and store them into MyDatabase.MyVendor_groups, and parse "known,request,types" for storage into MyDatabase.MyReqs. The first column for MyDatabase.MyItem_type, MyDatabase.MyVendor_groups and MyDatabase.MyReqs will always be the Unique-Record-ID that was witnessed in the log.
Therefore, yes: unlike the other columns, this column does not contain unique data in these three tables.
The third column will always be the Quantity that was witnessed in the log.
I know a bit of PCRE, but it is the use of nested parsers in syslog-ng that's completely confusing me.
The syslog-ng documentation suggests this is possible, but I simply failed to find a good example. If any kind hacker around here has some reference or sample to share, it would be very useful.
Thanks in advance.
I think all of these can be done using the csv-parser a few times.
First, use a csv-parser with the tab delimiter("\t") to split the initial fields into named columns. Use this parser on the entire message.
Then you'll have to parse the fields that have subfields using other instances of the csv-parser on the columns that need further parsing.
You can find some examples at https://www.balabit.com/sites/default/files/documents/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/csv-parser.html and https://www.balabit.com/sites/default/files/documents/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/reference-parsers-csv.html
(It is possible that you can get it done with a single parser, if you specify both the tab and the comma as delimiters, but that might not work for the fields with a variable number of sub-fields.)
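To make the two-pass idea concrete, here is a rough, untested sketch; the source/destination names (s_app, d_mydatabase), the column labels, and the five-column limit for the sub-fields are all my assumptions:
# first pass: split the tab-separated log line into named columns
parser p_main {
    csv-parser(
        columns("RECORD_ID", "USENAME", "DATE", "QUANTITY", "BOQ",
                "ITEM_PROFILES", "COUNT", "VENDOR", "VENDOR_CATEGORIES",
                "REQUEST_TYPES", "VENDOR_CODE", "CREDIT")
        delimiters("\t")
    );
};
# second pass over one comma-separated column; repeat the same idea for
# VENDOR_CATEGORIES and REQUEST_TYPES. Because csv-parser needs a fixed
# column list, declare as many columns as the longest list you expect.
parser p_item_profiles {
    csv-parser(
        columns("PROFILE1", "PROFILE2", "PROFILE3", "PROFILE4", "PROFILE5")
        delimiters(",")
        template("${ITEM_PROFILES}")
    );
};
log {
    source(s_app);
    parser(p_main);
    parser(p_item_profiles);
    destination(d_mydatabase);  # e.g. an sql() destination pointing at SQLite
};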
