In PostgreSQL I am aggregating concatenated strings from a table called synonyms.
An example of the table is as follows:
id|genus_synonym|specific_epithet_synonym
---|----------|-----------
1 | Acer | rubrum
2 | Acer | nigrum
3 | Betula | lenta
4 | Carya | ovata
5 | Carya | glabra
6 | Carya | tomentosa
The code I am using is like this:
SELECT
string_agg(CONCAT(s.genus_synonym, ' ', s.specific_epithet_synonym), ', ') AS syno
FROM
"public"."synonyms" AS s
The result is:
Acer rubrum, Acer nigrum, Betula lenta, Carya ovata, Carya glabra, Carya tomentosa
What I am trying to figure out is if it is possible to instead produce this:
Acer rubrum, A. nigrum, Betula lenta, Carya ovata, C. glabra, C. tomentosa
Basically, I want to abbreviate the genus name to its first letter followed by a period for the second and every later occurrence of that genus.
If this is not possible, that would be good to know, along with any other way I could go about solving this.
Also, it doesn't look like anyone is responding to my question. Is it not clear? I haven't been able to find anything like this being asked before. Please let me know what I can do to make this question better.
qry:
t=# with a as (
select *,case when row_number() over (partition by genus_synonym) > 1 and count(1) over (partition by genus_synonym) > 1 then substr(genus_synonym,1,1)||'.' else genus_synonym end sh
from s92
)
select string_agg(concat(sh,' ',specific_epithet_synonym),',')
from a;
string_agg
-----------------------------------------------------------------------
Acer rubrum,A. nigrum,Betula lenta,Carya ovata,C. glabra,C. tomentosa
(1 row)
Time: 0.353 ms
Mock-up of your data:
t=# create table s92 (id int,genus_synonym text,specific_epithet_synonym text);
CREATE TABLE
Time: 7.587 ms
t=# copy s92 from stdin delimiter '|';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 1 | Acer | rubrum
2 | Acer | nigrum
3 | Betula | lenta
4 | Carya | ovata
5 | Carya | glabra
6 | Carya | tomentosa
>> >> >> >> >> >> \.
COPY 6
Time: 6308.728 ms
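For comparison, here is a minimal Python sketch of the same abbreviation rule. It is only an illustration, not part of the SQL answer: the function name `abbreviate_repeats` is made up, and it assumes the rows arrive in id order. The window-function query above has no ORDER BY inside its partition, so which row counts as the "first" of a genus is not guaranteed there.

```python
def abbreviate_repeats(rows):
    """Join 'genus epithet' pairs, shortening a genus after its first appearance."""
    seen = set()  # genera that have already been written out in full
    parts = []
    for genus, epithet in rows:
        if genus in seen:
            label = genus[0] + "."  # repeated genus: first letter plus a period
        else:
            label = genus
            seen.add(genus)
        parts.append(f"{label} {epithet}")
    return ", ".join(parts)


# Sample rows from the question, assumed to be in id order.
rows = [
    ("Acer", "rubrum"),
    ("Acer", "nigrum"),
    ("Betula", "lenta"),
    ("Carya", "ovata"),
    ("Carya", "glabra"),
    ("Carya", "tomentosa"),
]
print(abbreviate_repeats(rows))
# Acer rubrum, A. nigrum, Betula lenta, Carya ovata, C. glabra, C. tomentosa
```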
I have an InfluxDB measurement currently set up with the following "schema":
+----+-------------+-----------+
| ts | cost(field) | type(tag) |
+----+-------------+-----------+
| 1 | 10 | 'a' |
| 1 | 20 | 'b' |
| 2 | 12 | 'a' |
| 2 | 18 | 'b' |
| 2 | 22 | 'c' |
+----+-------------+-----------+
I am trying to write a query that will group my table by timestamp and get a delta between the field values of two different tags. If I want the delta between tag 'a' and tag 'b', it should give me the following result (please note that I ignore tag 'c'):
+----+-----------+------------+
| ts | type(tag) | delta_cost |
+----+-----------+------------+
| 1 | 'a' | 10 |
| 2 | 'b' | 6 |
+----+-----------+------------+
Is it something Influx can do or am I using the wrong tool?
Just managed to answer my own question. While one of the obvious ways would be to perform a self-join, Influx no longer supports joins. We can, however, use nested selects in the following format:
SELECT MEAN(cost_a) - MEAN(cost_b) AS delta_cost
FROM
(SELECT cost AS cost_a, tag FROM tablename WHERE tag='a'),
(SELECT cost AS cost_b, tag FROM tablename WHERE tag='b')
GROUP BY time(60s)
Since I am getting my data every 60 seconds anyway, and I have a guarantee of just one point per tag per 60 seconds, I can use GROUP BY and take MEAN without any problems.
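To make the intended arithmetic concrete, here is a small Python sketch of the per-timestamp delta, run on the sample points from the question. It is not InfluxQL and does not talk to Influx; the function name `delta_by_timestamp` and the tuple layout `(ts, cost, tag)` are assumptions made for illustration.

```python
from collections import defaultdict


def delta_by_timestamp(points, minuend_tag, subtrahend_tag):
    """Return {ts: cost[minuend_tag] - cost[subtrahend_tag]} for each timestamp.

    Assumes at most one point per tag per timestamp, as in the question.
    Timestamps missing either tag are skipped; all other tags are ignored.
    """
    by_ts = defaultdict(dict)
    for ts, cost, tag in points:
        by_ts[ts][tag] = cost
    return {
        ts: costs[minuend_tag] - costs[subtrahend_tag]
        for ts, costs in sorted(by_ts.items())
        if minuend_tag in costs and subtrahend_tag in costs
    }


# Sample points from the question: (ts, cost, type tag).
points = [(1, 10, "a"), (1, 20, "b"), (2, 12, "a"), (2, 18, "b"), (2, 22, "c")]
print(delta_by_timestamp(points, "b", "a"))  # {1: 10, 2: 6}
```

Swapping the two tag arguments flips the sign, which corresponds to the order of the two MEAN terms in the query.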
I have three tables:
personal_video
+-------------------+-------+
| presonal_video_id | title |
+-------------------+-------+
| 1                 | test1 |
| 2                 | test2 |
| 3                 | test3 |
| 4                 | test4 |
+-------------------+-------+
personal_video_tags
+--------+-----------+
| tag_id | tag_title |
+--------+-----------+
| 1      | august    |
| 2      | 2016      |
| 3      | 2015      |
| 4      | 2014      |
+--------+-----------+
personal_video_tag_mapping
+--------+-------------------+
| tag_id | presonal_video_id |
+--------+-------------------+
| 1      | 1                 |
| 2      | 2                 |
| 3      | 3                 |
| 4      | 1                 |
+--------+-------------------+
Now I want to write a query that returns videos on the basis of common tags: if the user selects the tags "August" and "2014", the query should return only the videos connected to both tags.
Currently my query is:
SELECT presonal_video_id,title
FROM personal_video
WHERE presonal_video_id IN
(
SELECT personal_video_id AS PID
FROM personal_video_tag_mapping
WHERE tag_id IN ("1","2") AND privacy_level != 2
GROUP BY personal_video_id
HAVING COUNT( PID ) > 1
)
It gives me the right result, but with a large amount of data it takes a long time. Can someone tell me the correct way to write this query?
Thank you in advance.
Try this query:
SELECT t1.presonal_video_id, t1.title
FROM personal_video AS t1
JOIN personal_video_tag_mapping AS t2
ON t1.presonal_video_id = t2.presonal_video_id
JOIN personal_video_tags AS t3
ON t2.tag_id = t3.tag_id
WHERE t3.tag_title IN ('august', '2014')
GROUP BY t1.presonal_video_id, t1.title
HAVING COUNT(DISTINCT t3.tag_id) = 2
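A quick way to try the join-and-count approach is to load the sample rows into an in-memory SQLite database from Python. This is only a sketch: SQLite stands in for whatever database the question actually uses, the helper name `videos_with_all_tags` is made up, and the query counts distinct tags so that a duplicated mapping row cannot inflate the count.

```python
import sqlite3

# In-memory database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE personal_video (presonal_video_id INTEGER, title TEXT);
CREATE TABLE personal_video_tags (tag_id INTEGER, tag_title TEXT);
CREATE TABLE personal_video_tag_mapping (tag_id INTEGER, presonal_video_id INTEGER);
INSERT INTO personal_video VALUES (1,'test1'),(2,'test2'),(3,'test3'),(4,'test4');
INSERT INTO personal_video_tags VALUES (1,'august'),(2,'2016'),(3,'2015'),(4,'2014');
INSERT INTO personal_video_tag_mapping VALUES (1,1),(2,2),(3,3),(4,1);
""")


def videos_with_all_tags(conn, tag_titles):
    """Return (id, title) rows for videos carrying every one of the given tags."""
    titles = sorted(set(tag_titles))  # drop duplicate titles before counting
    placeholders = ",".join("?" for _ in titles)
    sql = f"""
        SELECT v.presonal_video_id, v.title
        FROM personal_video AS v
        JOIN personal_video_tag_mapping AS m
          ON m.presonal_video_id = v.presonal_video_id
        JOIN personal_video_tags AS t
          ON t.tag_id = m.tag_id
        WHERE t.tag_title IN ({placeholders})
        GROUP BY v.presonal_video_id, v.title
        HAVING COUNT(DISTINCT t.tag_id) = ?
    """
    return conn.execute(sql, [*titles, len(titles)]).fetchall()


print(videos_with_all_tags(conn, ["august", "2014"]))  # [(1, 'test1')]
```

On a large table, an index on the mapping table's tag_id and presonal_video_id columns is the usual thing to check first, though whether it helps depends on the actual database and data.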
I'm trying to find an efficient way to solve this problem:
I need to find all rows in a table for which another row exists with the opposite value in a given column.
For example, I have transactions with columns id and amount:
| id | amount |
|----|--------|
| 1 | 1 |
| 2 | -1 |
| 3 | 2 |
| 4 | -2 |
| 5 | 3 |
| 6 | 4 |
| 7 | 5 |
| 8 | 6 |
The query should return only the first 4 rows:
| id | amount |
|----|--------|
| 1 | 1 |
| 2 | -1 |
| 3 | 2 |
| 4 | -2 |
My current solution is terribly inefficient, as I am going through thousands of transactions:
transactions.find_each do |transaction|
unless transactions.where("amount = #{transaction.amount * -1}").count > 0
transactions = transactions.where.not(amount: transaction.amount).order("# amount DESC")
end
end
transactions
Are there any built-in Rails or PostgreSQL functions that could help with this?
Use the following query:
SELECT DISTINCT t1.*
FROM transactions t1
INNER JOIN transactions t2 ON t1.amount = t2.amount * -1;
SELECT * FROM the_table t
WHERE EXISTS (
SELECT * FROM the_table x
WHERE x.amount = -1*t.amount
-- AND x.amount > t.amount
);
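The logic behind both queries can be sketched in a few lines of Python: collect the set of amounts once, then keep every row whose negated amount is in that set. The function name `rows_with_opposite` is made up for this illustration, and the rows are the sample data from the question.

```python
def rows_with_opposite(rows):
    """Keep (id, amount) rows for which some row has the negated amount.

    Like the SQL above, a row with amount 0 matches itself, since -0 == 0.
    """
    amounts = {amount for _, amount in rows}  # one pass to collect all amounts
    return [(id_, amount) for id_, amount in rows if -amount in amounts]


# Sample rows from the question: (id, amount).
rows = [(1, 1), (2, -1), (3, 2), (4, -2), (5, 3), (6, 4), (7, 5), (8, 6)]
print(rows_with_opposite(rows))  # [(1, 1), (2, -1), (3, 2), (4, -2)]
```

The set lookup plays the role that an index on amount would play for the EXISTS subquery.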
Consider storing the absolute value in an indexed column, then querying for the positive value. Postgres has an absolute value function, but I think the beauty of ActiveRecord is that Arel abstracts away the SQL. DB-specific SQL can be a pain if you change databases later.
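There is a function called ABS that returns the absolute value of a number regardless of its sign. In my example, data is the table name.
SELECT id, amount FROM data
WHERE ABS(amount) IN (SELECT ABS(amount) FROM data GROUP BY ABS(amount) HAVING COUNT(DISTINCT amount) = 2)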
Say I have a table:
A, 1
B, 1
C, 2
D, 1
E, 2
How do I view the table grouped by the second column, aggregating the first column with a comma-separated concatenation, i.e.:
1, "A,B,D"
2, "C,E"
In both defining a pivot table and using the QUERY syntax, it seems that the only aggregation functions available are numerical aggregations like MIN, MAX, SUM, etc. Can I define my own aggregation function?
You have to add a "Calculated Field" to the pivot table, and then select "Summarise by > Custom". This will make the column names in your formula refer to an array of values (instead of a single value). Then you can type a formula like:
= JOIN(", ", MyStringColumn)
More specifically, create a pivot table by going to "Data > Pivot table", and ensure "Summarize by" is set to "Custom"!
Another option: if the data is in A2:B, then, say, in D2:
=UNIQUE(B2:B)
and then in E2:
=JOIN(",",FILTER(A$2:A,B$2:B=D2))
which is filled down as required.
There are one-formula, auto-expanding solutions, although they get quite convoluted.
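For readers who want to check the expected result outside the spreadsheet, the UNIQUE plus FILTER plus JOIN combination amounts to a group-and-join. Here is a minimal Python sketch of that idea; the function name `group_concat` is made up, and the rows are the sample table from the question.

```python
def group_concat(rows, sep=","):
    """Group values by key (keys in first-seen order) and join each group."""
    groups = {}
    for value, key in rows:
        groups.setdefault(key, []).append(value)  # like FILTER for one key
    return {key: sep.join(values) for key, values in groups.items()}  # like JOIN


# Sample rows from the question: (value, key).
rows = [("A", 1), ("B", 1), ("C", 2), ("D", 1), ("E", 2)]
print(group_concat(rows))  # {1: 'A,B,D', 2: 'C,E'}
```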
You're right, there's no easy way with pivot tables. This though, will do the trick. Inspired by this brilliant answer here.
First, have a header row and run a sort on column A to group by category.
So far, in your example, we have
| A | B
---+-----------+-----------
1 | CATEGORY | ATTRIBUTE
2 | 1 | A
3 | 1 | B
4 | 1 | D
5 | 2 | C
6 | 2 | E
In column C, let's prep the concatenated strings. Start in cell C2 with the following formula, and fill out vertically.
=IF(A2<>A1, B2, C1 & "," & B2)
...looking good...
| A | B | C
---+-----------+-----------+-----------
1 | CATEGORY | ATTRIBUTE | STRINGS
2 | 1 | A | A
3 | 1 | B | A,B
4 | 1 | D | A,B,D
5 | 2 | C | C
6 | 2 | E | C,E
In column D, let's validate the rows we want to select in a later step, with the following formula, starting in cell D2 and filling out. Basically we are marking the final category rows that carry the full concatenated strings.
=A2<>A3
...almost there now
| A | B | C | D
---+-----------+-----------+----------+-----------
1 | CATEGORY | ATTRIBUTE | STRINGS | VALIDATOR
2 | 1 | A | A | FALSE
3 | 1 | B | A,B | FALSE
4 | 1 | D | A,B,D | TRUE
5 | 2 | C | C | FALSE
6 | 2 | E | C,E | TRUE
Now, let's copy columns C and D and paste special as values in the same place. Then add a filter on the whole table and filter out column D for the rows labeled TRUE. Now, remove the filter, delete columns B and D and row 1.
| A | B
---+-----------+-----------
1 | 1 | A,B,D
2 | 2 | C,E
Done. Get ice cream. Watch Road House.
A     | B   | C     | D      | E      | F | G
name  | num | quant | item   | quant2
car   | 5   | 100   |        |
      |     |       | wheel  | 4
      |     |       | axel   | 2
      |     |       | engine | 1
truck | 2   | 20    |        |
      |     |       | wheel  | 6
      |     |       | bed    | 1
      |     |       | axel   | 2
I need a formula that computes B*C*E. Because the table is laid out like this, it needs to be something like
=b$2*c$2*e3 dragged down, then for the next set b$6*c$6*e7 dragged down, and so on, but I wasn't sure how to get a "ceiling" sort of lookup: if b5 is empty, look at each cell above until it finds one that is filled.
I am trying to use this to get the total quantity of parts per car, truck, etc., and then group by part.
I don't have a set of DB tables to do this, just a spreadsheet.
I had to add some additional information to resolve this.
I was thinking there would be a way to write a Google script that would do this and update the file, but I couldn't find one.
I first computed the quantity for each item in a group:
=b$3*e4
and dragged it down for that grouping.
Then I went to an empty area of the sheet and wrote a query:
=query(D:F, "select D,sum(F) group by D")
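The "look upward until a filled cell is found" step is a forward fill. As a rough cross-check of the totals, here is a Python sketch that carries the last seen num and quant down to the part rows, multiplies B*C*E as in the question, and sums per part. The function name `total_parts`, the tuple layout, and the use of None for empty cells are assumptions for illustration.

```python
from collections import defaultdict


def total_parts(rows):
    """Sum num * quant * quant2 per item, carrying num and quant down from
    the most recent row where they are filled in (a forward fill)."""
    totals = defaultdict(int)
    num = quant = None
    for _name, row_num, row_quant, item, quant2 in rows:
        if row_num is not None:  # a vehicle row: remember its num and quant
            num, quant = row_num, row_quant
        if item is not None:  # a part row: use the remembered values
            totals[item] += num * quant * quant2
    return dict(totals)


# Sample sheet from the question: (name, num, quant, item, quant2).
rows = [
    ("car", 5, 100, None, None),
    (None, None, None, "wheel", 4),
    (None, None, None, "axel", 2),
    (None, None, None, "engine", 1),
    ("truck", 2, 20, None, None),
    (None, None, None, "wheel", 6),
    (None, None, None, "bed", 1),
    (None, None, None, "axel", 2),
]
print(total_parts(rows))
# {'wheel': 2240, 'axel': 1080, 'engine': 500, 'bed': 40}
```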