Select a row except all the other rows based on some condition in psql

Consider this scenario:
There is a table T (name, habit) where the combination of name and habit is the primary key.
Suppose the data is as follows:
name | habit
a1 | smoking
a1 | drinking
a2 | sleeping
a3 | jogging
a2 | jogging
a4 | sleeping
Now I want to select the names whose habits are all unique, i.e. not shared with any other name. Here a2, a3 and a4 clearly have habits in common, so they should be filtered out.
So the output should be like
OUTPUT:
name
a1
My question:
How can I do this using except in psql?

You don't need EXCEPT for this:
t=# with a as (
select *,count(1) over (partition by habit)
from t
)
select distinct name
from a
where count = 1;
name
------
a1
(1 row)
schema:
t=# create table T (name text,habit text);
CREATE TABLE
Time: 14.162 ms
t=# copy t from stdin delimiter '|';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> a1 | smoking
>> a1 | drinking
>> a2 | sleeping
>> a3 | jogging
>> a2 | jogging
>> a4 | sleeping
>> \.
COPY 6
Time: 3216.573 ms
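The same idea can be sketched outside the database. The following Python snippet is only an illustration: the function name and the in-memory `rows` list are assumptions, not part of the original answer. It counts rows per habit, as the window function does, and then keeps only the names for which every habit has a count of 1.

```python
from collections import Counter

def names_with_only_unique_habits(rows):
    """Return names none of whose habits appears on any other row."""
    # Equivalent of count(1) over (partition by habit): number of rows per habit.
    habit_counts = Counter(habit for _, habit in rows)
    # A name is excluded as soon as one of its habits is shared.
    shared = {name for name, habit in rows if habit_counts[habit] > 1}
    # dict.fromkeys drops duplicate names while keeping first-seen order.
    ordered_names = dict.fromkeys(name for name, _ in rows)
    return [name for name in ordered_names if name not in shared]

rows = [
    ("a1", "smoking"), ("a1", "drinking"), ("a2", "sleeping"),
    ("a3", "jogging"), ("a2", "jogging"), ("a4", "sleeping"),
]
print(names_with_only_unique_habits(rows))  # ['a1']
```

Note that this sketch is slightly stricter than the query above: the query keeps a name if at least one of its habits has a count of 1, while the sketch requires all of them to. On the sample data both return a1.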

Related

Google sheets COUNTIF excluding hidden rows

In google sheets, I have a list of strings (1 per row) where each string is split with 1 character per column, so my sheet looks something like below:
  | A | B | C | D | E | F
--+---+---+---+---+---+---
1 | F | R | A | N | K |
2 | P | A | S | S | 1 | 2
I then have this sheet filtered, so I can select only the rows where the first character is F, for example. On another sheet in the same workbook, I have a table of how often each character appears in each column, which looks something like this:
  | A    | B       | C   | D   | E   | F
--+------+---------+-----+-----+-----+---
1 | Char | Overall | 1   | 2   | 3   |
2 | A    | 979     | 141 | 304 | 165 |
3 | B    | 281     | 173 | 69  | 15  |
I would like to have this table dynamically update, so that when I filter the first sheet my table shows the frequency only for the strings that meet the filter.
In Excel, this can be accomplished using a combination of SUMPRODUCT and SUBTOTAL, but this doesn't work in Google Sheets. I've seen this done in Sheets using helper columns, but I would like the solution to work for an arbitrary number of strings with different lengths without having to change the sheet. Can this be done in Google Sheets?
Thanks!
Hidden cells are treated as having the value 0. One way to solve this is to add a "helper" column in column A and set all of its values to 1.
| A | B | C | D | E | F | G
--+--------+------+---+---------+-----+-----+-----
1 | Helper | Char | | Overall | 1 | 2 | 3
--+--------+------+---+---------+-----+-----+-----
2 | 1 | A | | 979 | 141 | 304 | 165
3 | 1 | B | | 281 | 173 | 69 | 15
Now instead of using COUNTIF, use the COUNTIFS formula where the second condition A2:A = 1. For example:
=COUNTIFS([YOUR_CONDITION], A2:A,"=1")
The column A values of hidden rows will evaluate to 0 and therefore will not be counted.
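The counting logic behind the helper column can be sketched in Python. This is an illustration only: the function name and the `visible` list of 1/0 flags are hypothetical stand-ins for the helper column, and the rows are the two sample strings from the question.

```python
from collections import Counter

def column_frequencies(rows, visible):
    """Count characters per column position, skipping rows flagged as hidden."""
    counts = {}
    for row, flag in zip(rows, visible):
        if flag != 1:  # helper value 0 stands for a filtered-out row
            continue
        for col, ch in enumerate(row):
            # One Counter per column index, created on first use.
            counts.setdefault(col, Counter())[ch] += 1
    return counts

rows = ["FRANK", "PASS12"]   # one string per sheet row, one character per column
visible = [1, 0]             # assumed helper flags: only the first row is shown
freq = column_frequencies(rows, visible)
print(freq[0]["F"])  # 1
print(freq[0]["P"])  # 0
```

Because each row is handled character by character, strings of different lengths need no special treatment.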

Calculate time differences and sum duration

I am trying to create a small "app" using Tasker on my Android phone that is supposed to track my work hours and over/undertime. I have managed to get Tasker to send timestamps at the start and end of each workday and am writing them to a Google Sheet, so they get recorded like:
| A | B | C | D | E | F |
| 2020-01-29 | 07:24 | 16:33 | 00:09 | | -02:51 |
| 2020-01-30 | 07:00 | 12:00 | -03:00 | | |
Where the "D" column is the difference between ordinary work hours (8) and the hours actually registered.
The "F" column should summarize the "D" column and show the sum of all its values.
The data in the first three columns is being sent correctly, but I can't figure out how to set up formulas so that the values in the "D" column are calculated, and the same goes for the cell in the "F" column. I have tried changing to different formats and creating my own formats too, but I do not understand how to get it to work.
I'm getting a different result than you in D1. I wonder if you're also accounting for a lunch hour (so subtract 9 instead of 8), but these formulas worked for me:
in Column D: =(C1-B1)-(8/24)
in Cell F1: =sum(D1:D2)
Column D and Cell F1 are formatted as Time > Duration.
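As a cross-check of the arithmetic, here is a minimal Python sketch of the same calculation. The function names are made up for illustration, and the 8-hour workday is taken from the formula above; subtracting 9 hours instead would model a lunch hour.

```python
from datetime import datetime, timedelta

STANDARD_DAY = timedelta(hours=8)  # same constant as 8/24 in the sheet formula

def overtime(start, end):
    """Worked time minus the standard day, like =(C1-B1)-(8/24)."""
    fmt = "%H:%M"
    worked = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return worked - STANDARD_DAY

def format_duration(delta):
    """Render a possibly negative timedelta as [-]HH:MM."""
    total_minutes = int(delta.total_seconds() // 60)
    sign = "-" if total_minutes < 0 else ""
    hours, minutes = divmod(abs(total_minutes), 60)
    return f"{sign}{hours:02d}:{minutes:02d}"

days = [("07:24", "16:33"), ("07:00", "12:00")]
diffs = [overtime(start, end) for start, end in days]
print([format_duration(d) for d in diffs])       # ['01:09', '-03:00']
print(format_duration(sum(diffs, timedelta())))  # -01:51
```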

Find query matches for a multi-row input

Basically I have a set of comma-separated values that I want to search through using a key from outside the query area. The query data is arranged like:
| COLUMN L | COLUMN K |
|----------|----------|
| A,B,C,D | 1 |
| E,F,G,H | 2 |
| I,J,K | 3 |
etc
What I have so far is
query(A1:L12,"Select L where K matches '(?:^|,)"&A13&"(?:,.*|$)'",-1)
So if A13's value is G, 2 is returned.
This works fine and returns the correct row value where it matches A13, but what I actually want to do is for this to process a column of data and return results for each key (or A13 value), but I can't work out how to translate this into "arrayformula" format. I tried this:
query(A1:L12,"Select L where K matches '(?:^|,)"&A13:A24&"(?:,.*|$)'",-1)
But no luck.
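To make the intended behaviour concrete, here is a sketch of the per-key lookup in Python rather than in Sheets. It is not a QUERY or ARRAYFORMULA solution: the function name and the `(cells, value)` pairs are illustrative assumptions, and `re.search` is used as an unanchored search, which may differ from how `matches` treats the pattern in QUERY.

```python
import re

def find_values(table, keys):
    """For each key, list the values of rows whose CSV cell contains that key."""
    results = {}
    for key in keys:
        # Same shape as the pattern in the question: the key must be a whole
        # comma-separated item, not a substring of a longer item.
        pattern = re.compile(r"(?:^|,)" + re.escape(key) + r"(?:,.*|$)")
        results[key] = [value for cells, value in table if pattern.search(cells)]
    return results

table = [("A,B,C,D", 1), ("E,F,G,H", 2), ("I,J,K", 3)]
print(find_values(table, ["G", "A", "Z"]))  # {'G': [2], 'A': [1], 'Z': []}
```

The point of the sketch is that the pattern is rebuilt once per key, which is what the single-cell `A13` version cannot do when handed the range `A13:A24`.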

working with strings inside of a string_agg

In PSQL I am aggregating concatenated strings from a table called genus_synonym
An example of the table is as follows
id|genus_synonym|specific_epithet_synonym
---|----------|-----------
1 | Acer | rubrum
2 | Acer | nigrum
3 | Betula | lenta
4 | Carya | ovata
5 | Carya | glabra
6 | Carya | tomentosa
The code I am using is:
SELECT
string_agg(CONCAT(s."genus_synonym", ' ', s.specific_epithet_synonym), ', ') AS syno
FROM
"public"."synonyms" AS s
The result is:
Acer rubrum, Acer nigrum, Betula lenta, Carya ovata, Carya glabra, Carya tomentosa
What I am trying to figure out is if it is possible to instead produce this:
Acer rubrum, A. nigrum, Betula lenta, Carya ovata, C. glabra, C. tomentosa
Basically I am wanting to abbreviate the genus name to a single letter with a period following it, for the second and additional time a genus is repeated.
Even if this is not possible it would be good to know this and then if there was another way I could go about solving this.
Also, it doesn't look like anyone is responding to my question. Is it not clear? I haven't been able to find anything like this being asked before. Please let me know what I can do to make this question better.
Query:
t=# with a as (
-- abbreviate the genus on every row after the first one within each genus
select *,
case when row_number() over (partition by genus_synonym order by id) > 1
then substr(genus_synonym,1,1)||'.' else genus_synonym end sh
from s92
)
select string_agg(concat(sh,' ',specific_epithet_synonym),',' order by id)
from a;
string_agg
-----------------------------------------------------------------------
Acer rubrum,A. nigrum,Betula lenta,Carya ovata,C. glabra,C. tomentosa
(1 row)
Time: 0.353 ms
mockup your data:
t=# create table s92 (id int,genus_synonym text,specific_epithet_synonym text);
CREATE TABLE
Time: 7.587 ms
t=# copy s92 from stdin delimiter '|';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 1 | Acer | rubrum
>> 2 | Acer | nigrum
>> 3 | Betula | lenta
>> 4 | Carya | ovata
>> 5 | Carya | glabra
>> 6 | Carya | tomentosa
>> \.
COPY 6
Time: 6308.728 ms
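The abbreviation rule itself is easy to check outside SQL. The following Python sketch assumes the rows arrive already ordered by id; the function name is made up for illustration. It spells out a genus the first time it appears and shortens it to its initial plus a period on every later occurrence.

```python
def abbreviate_repeats(rows):
    """Join 'genus epithet' pairs, abbreviating a genus after its first use."""
    seen = set()
    parts = []
    for genus, epithet in rows:
        # First occurrence keeps the full genus; repeats become e.g. "A."
        label = f"{genus[0]}." if genus in seen else genus
        seen.add(genus)
        parts.append(f"{label} {epithet}")
    return ", ".join(parts)

rows = [
    ("Acer", "rubrum"), ("Acer", "nigrum"), ("Betula", "lenta"),
    ("Carya", "ovata"), ("Carya", "glabra"), ("Carya", "tomentosa"),
]
print(abbreviate_repeats(rows))
# Acer rubrum, A. nigrum, Betula lenta, Carya ovata, C. glabra, C. tomentosa
```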

Google Spreadsheets: How do you concat strings in an aggregation function

Say I have a table:
A, 1
B, 1
C, 2
D, 1
E, 2
How do I view the table grouped by the 2nd column, aggregating the first with a comma-separated concatenation function, i.e.:
1, "A,B,D"
2, "C,E"
In both defining a pivot table and using the QUERY syntax, it seems that the only aggregation functions available are numerical aggregations like MIN, MAX, SUM, etc. Can I define my own aggregation function?
You have to add a "Calculated Field" to the pivot table, and then select "Summarise by > Custom". This will make the column names in your formula refer to an array of values (instead of a single value). Then you can type a formula like:
= JOIN(", ", MyStringColumn)
More specifically, create a pivot table by going to "Data > Pivot table" and add the calculated field to it. Ensure "Summarize by" is set to "Custom"!
Another option: if the data is in A2:B, then, say, in D2:
=UNIQUE(B2:B)
and then in E2:
=JOIN(",",FILTER(A$2:A,B$2:B=D2))
which is filled down as required.
There are one-formula, auto-expanding solutions, although they get quite convoluted.
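For comparison, the grouping that UNIQUE, FILTER and JOIN perform together can be sketched in a few lines of Python. The function name and the `(value, key)` pair layout are assumptions made for this illustration.

```python
def group_concat(pairs, sep=","):
    """Group values by key and join each group's values with sep."""
    groups = {}
    for value, key in pairs:
        # setdefault plays the role of UNIQUE: each key gets one list.
        groups.setdefault(key, []).append(value)
    # The join per key corresponds to JOIN(",", FILTER(...)).
    return {key: sep.join(values) for key, values in groups.items()}

pairs = [("A", 1), ("B", 1), ("C", 2), ("D", 1), ("E", 2)]
print(group_concat(pairs))  # {1: 'A,B,D', 2: 'C,E'}
```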
You're right, there's no easy way with pivot tables. This though, will do the trick. Inspired by this brilliant answer here.
First, have a header row and run a sort on column A to group by category.
So far, in your example, we have
| A | B
---+-----------+-----------
1 | CATEGORY | ATTRIBUTE
2 | 1 | A
3 | 1 | B
4 | 1 | D
5 | 2 | C
6 | 2 | E
In column C, let's prep the concatenated strings. Start in cell C2 with the following formula, and fill it down.
=IF(A2<>A1, B2, C1 & "," & B2)
...looking good...
| A | B | C
---+-----------+-----------+-----------
1 | CATEGORY | ATTRIBUTE | STRINGS
2 | 1 | A | A
3 | 1 | B | A,B
4 | 1 | D | A,B,D
5 | 2 | C | C
6 | 2 | E | C,E
In column D, let's validate the rows we want to select in a later step, with the following formula, starting in cell D2 and filling out. Basically we are marking the final category rows that carry the full concatenated strings.
=A2<>A3
...almost there now
| A | B | C | D
---+-----------+-----------+----------+-----------
1 | CATEGORY | ATTRIBUTE | STRINGS | VALIDATOR
2 | 1 | A | A | FALSE
3 | 1 | B | A,B | FALSE
4 | 1 | D | A,B,D | TRUE
5 | 2 | C | C | FALSE
6 | 2 | E | C,E | TRUE
Now, let's copy columns C and D and paste special as values in the same place. Then add a filter on the whole table and filter column D for the rows labeled TRUE. Now, remove the filter, delete columns B and D and row 1.
| A | B
---+-----------+-----------
1 | 1 | A,B,D
2 | 2 | C,E
Done. Get ice cream. Watch Road House.
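The running-concatenation steps above can be sketched in Python as well. This is an illustration under the same assumption as the sheet version, namely that the rows are already sorted by category; the function name is hypothetical.

```python
def running_concat(rows):
    """rows: (category, attribute) pairs, already sorted by category."""
    result = []
    running = ""
    for i, (category, attribute) in enumerate(rows):
        # Column C: restart the string when the category changes,
        # otherwise append to the previous row's string.
        starts_group = i == 0 or rows[i - 1][0] != category
        running = attribute if starts_group else running + "," + attribute
        # Column D: the last row of each category carries the full string.
        ends_group = i == len(rows) - 1 or rows[i + 1][0] != category
        if ends_group:
            result.append((category, running))
    return result

rows = [(1, "A"), (1, "B"), (1, "D"), (2, "C"), (2, "E")]
print(running_concat(rows))  # [(1, 'A,B,D'), (2, 'C,E')]
```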
