Creating a new column/field OVER id in Denodo

I was wondering if I could get some help calculating the time difference between the dates of various statuses.
I have a view with columns named "id", "create_dt" and "status". There are various statuses like submit, approve, deliver, etc. I want to find the time it took a specific id to go from the Submit status to the Approve status. My current idea is to create a few additional date fields based on the status (I can use a case statement for that) and then take the difference between those newly created date columns.
The problem is that I am not sure how to drive the calculation per id. I can't use lag or lead because some ids go through different statuses than others (it's not consistent).
I also can't create the new date column per id (something like partition by) because I am using a case statement.
Could someone point me to the right direction?
Below is a screenshot of how my data currently looks (using the case statement) and what my desired output is.
Current Result
Expected Result
From the expected result, I could easily find the difference between the Submitted and Approved statuses for any id using a case statement, whereas with the current result I cannot.
Thank you,

I would try pivoting the data. Here is a link to a Denodo community site that shows how to do this:
https://community.denodo.com/kb/view/document/How%20to%20Pivot%20and%20Unpivot%20views?category=Combining+Data
For your specific case, I created a small Excel data source to simulate your issue in a view I named "p_sample" (using simplified dates and status names):
id | status  | create_dt
1  | submit  | 1/1/2017
1  | approve | 2/1/2017
1  | deliver | 2/2/2017
2  | submit  | 1/1/2017
2  | approve | 1/10/2017
2  | deliver | 2/1/2017
3  | submit  | 1/1/2017
....
Since Denodo doesn't appear to support the PIVOT operator, we can instead use the following VQL to pivot your status dates so they all land on the same row:
select id
, max(case when status = 'submit' then create_dt end) as submit_dt
, max(case when status = 'approve' then create_dt end) as approve_dt
, max(case when status = 'deliver' then create_dt end) as deliver_dt
, max(case when status = 'reject' then create_dt end) as reject_dt
, max(case when status = 'other' then create_dt end) as other_dt
from p_sample
group by id
Then we can use that query as an inline view to perform the date math (or, in Denodo, you could make this two views: one with the above VQL and a selection view on top of it that applies the date math):
select *, approve_dt - submit_dt as time_to_approve
from (
select id
, max(case when status = 'submit' then create_dt end) as submit_dt
, max(case when status = 'approve' then create_dt end) as approve_dt
, max(case when status = 'deliver' then create_dt end) as deliver_dt
, max(case when status = 'reject' then create_dt end) as reject_dt
, max(case when status = 'other' then create_dt end) as other_dt
from p_sample
group by id
) AS Pivot
When you run this, you get each status date for the ID, as well as the time between submission and approval.
Query Results
The only drawback is that if the status code list is very large or not well controlled, this solution will not be flexible enough; your example seems to indicate that won't be an issue.
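Conditional aggregation like the VQL above is portable SQL, so the pivot-then-subtract pattern is easy to sanity-check elsewhere. Here is a minimal sketch in Python with SQLite, using the answer's sample rows; `julianday` is SQLite's date arithmetic, an assumption standing in for Denodo's own date functions:

```python
import sqlite3

# Recreate the p_sample view from the answer (id 3 never reached 'approve').
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE p_sample (id INTEGER, status TEXT, create_dt TEXT)")
conn.executemany("INSERT INTO p_sample VALUES (?, ?, ?)", [
    (1, "submit", "2017-01-01"), (1, "approve", "2017-02-01"), (1, "deliver", "2017-02-02"),
    (2, "submit", "2017-01-01"), (2, "approve", "2017-01-10"), (2, "deliver", "2017-02-01"),
    (3, "submit", "2017-01-01"),
])

# Pivot each status onto one row per id, then subtract the pivoted dates.
rows = conn.execute("""
    SELECT id, submit_dt, approve_dt,
           julianday(approve_dt) - julianday(submit_dt) AS days_to_approve
    FROM (
        SELECT id,
               MAX(CASE WHEN status = 'submit'  THEN create_dt END) AS submit_dt,
               MAX(CASE WHEN status = 'approve' THEN create_dt END) AS approve_dt
        FROM p_sample
        GROUP BY id
    )
    ORDER BY id
""").fetchall()
print(rows)
```

An id that never reached a status simply yields NULL for that pivoted column and for the difference, which is the behavior you want for inconsistent status flows.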

Related

call data from sql in processmaker

I have a table in SQL that looks like this:
| product code | weight |
| ------------ | ------ |
| 1235896      | 0.5    |
| 3256kms      | 1.5    |
| dk-kmsw      | 3      |
and the data type for [product code] is nvarchar.
Now I want to look up the weight by entering the product code in ProcessMaker.
The code that I wrote is this:
select [weight] from table where [product code] = ##textVar047
With this code I get nothing; I have also tried the other variable prefixes, but it did not work.
How can I do this?
Any comment is appreciated.
When you use ## in the SQL of a control, it means you are referencing another control's value. If that's your scenario, I'd suggest first retrieving the full list of product codes in a Suggest control (instead of a Textbox) with this SQL:
select PRODUCT_CODE, PRODUCT_CODE FROM YOUR_TABLE_NAME
(you call product code twice because Suggest controls, like dropdowns, need two values: one for the id and one for the label)
Now that you have a way to obtain the actual code, saved in the Suggest control's id, you can make another field a dependent field with the SQL you were proposing:
select WEIGHT FROM YOUR_TABLE_NAME where PRODUCT_CODE = ##your_suggest_control_id
(## should work just fine as it just adds quotes to the variable)
You can also check the wiki page for an in-depth explanation: https://wiki.processmaker.com/3.1/Dependent_Fields
I hope this helps!
select CAST(weight AS nvarchar(max)) from table where [product code] = ##textVar047
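Outside ProcessMaker, the underlying lookup is just a parameterized query. Here is a minimal sketch in Python with SQLite (the `products` table name is invented; the point is that placeholder binding quotes the value, analogous to what the ## prefix does for a control's value):

```python
import sqlite3

# Hypothetical stand-in for the question's table; [product code] is text,
# so the lookup value must be bound/quoted as text.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE products ("product code" TEXT, weight REAL)')
conn.executemany("INSERT INTO products VALUES (?, ?)",
                 [("1235896", 0.5), ("3256kms", 1.5), ("dk-kmsw", 3.0)])

def lookup_weight(code):
    # The placeholder binds the value safely, the way ## wraps the
    # referenced control's value in quotes.
    row = conn.execute(
        'SELECT weight FROM products WHERE "product code" = ?', (code,)
    ).fetchone()
    return row[0] if row else None

print(lookup_weight("3256kms"))
```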

create view of count based on latest timestamp for a particular id

I have been provided with the following code to run a query that counts the number of connector_pks grouped by group_status based on the latest timestamp:
SELECT `group_status`, COUNT(*) 'Count of status'
FROM (
    SELECT `connector_pk`, `group_status`, `status_timestamp`
    FROM connector_status_report t1
    WHERE `status_timestamp` = (SELECT MAX(`status_timestamp`)
                                FROM connector_status_report t2
                                WHERE t2.`connector_pk` = t1.`connector_pk`)
) t3
GROUP BY `group_status`
Unfortunately this takes about 30 minutes to run so I was hoping for an optimised solution.
Example table
connector_pk | group_status  | status_timestamp
1            | Available     | 2020-02-11 19:14:45
1            | Charging      | 2020-02-11 19:18:45
2            | Available     | 2020-02-11 19:15:45
2            | Not Available | 2020-02-11 19:18:45
3            | Not Available | 2020-02-11 19:14:45
The desired output would look like this:
group_Status | Count of status
Available | 0
Charging | 1
Not Available | 2
For my original question I was pointed to the following question (and answers):
Get records with max value for each group of grouped SQL results
I would like to create a view with the output
Is it possible to also add the following to the query to include in the View:
SELECT status,
       IF(status = 'charging', 'Charging',
          IF(status = 'Not Occupied', 'Available', 'Occupied')) AS group_status
FROM connector_status_report
I managed to speed up the query using the following:
CREATE VIEW statuscount AS
SELECT group_status, COUNT(*) 'Count of status'
FROM (
    SELECT tt.*
    FROM connector_status_report tt
    INNER JOIN (
        SELECT connector_pk, MAX(status_timestamp) AS MaxDateTime
        FROM connector_status_report
        GROUP BY connector_pk
    ) groupedtt
      ON tt.connector_pk = groupedtt.connector_pk
     AND tt.status_timestamp = groupedtt.MaxDateTime
) t3
GROUP BY group_status
If anybody can help with inserting the query that creates the 'group_status' column, it would be much appreciated.
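The join-on-max pattern in the view above is quick to verify outside MySQL. Here is a minimal sketch in Python with SQLite, using the question's sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE connector_status_report (
    connector_pk INTEGER, group_status TEXT, status_timestamp TEXT)""")
conn.executemany("INSERT INTO connector_status_report VALUES (?, ?, ?)", [
    (1, "Available",     "2020-02-11 19:14:45"),
    (1, "Charging",      "2020-02-11 19:18:45"),
    (2, "Available",     "2020-02-11 19:15:45"),
    (2, "Not Available", "2020-02-11 19:18:45"),
    (3, "Not Available", "2020-02-11 19:14:45"),
])

# Join every row to its connector's latest timestamp, keep only the matches,
# then count the surviving rows per status.
counts = dict(conn.execute("""
    SELECT t.group_status, COUNT(*)
    FROM connector_status_report t
    INNER JOIN (SELECT connector_pk, MAX(status_timestamp) AS max_ts
                FROM connector_status_report
                GROUP BY connector_pk) m
      ON t.connector_pk = m.connector_pk
     AND t.status_timestamp = m.max_ts
    GROUP BY t.group_status
""").fetchall())
print(counts)
```

Note that statuses with no "latest" row (Available here) simply do not appear; producing an explicit 0 for them, as in the desired output, would need a LEFT JOIN from a list of statuses.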

How do I write a Rails finder method that will return the greatest date grouped by record?

I'm using Rails 5 with PostGres 9.5. I have a table that tracks prices ...
Table "public.crypto_prices"
Column | Type | Modifiers
--------------------+-----------------------------+------------------------------------------------------------
id | integer | not null default nextval('crypto_prices_id_seq'::regclass)
crypto_currency_id | integer |
market_cap_usd | bigint |
total_supply | bigint |
last_updated | timestamp without time zone |
created_at | timestamp without time zone | not null
updated_at | timestamp without time zone | not null
I would like to get the latest price per currency (where last_updated is greatest) for a select set of currencies. I can find all the prices related to certain currencies like so:
current_prices = CryptoPrice.where(crypto_currency_id: CryptoIndexCurrency.all.pluck(:crypto_currency_id).uniq)
Then I can sort them by currency into arrays, looping through each until I find the one with the greatest last_updated value, but how can I write a finder that will return exactly one row per currency with the greatest last_updated date?
Edit: Tried Owl Max's suggestion like so:
ids = CryptoIndexCurrency.all.pluck(:crypto_currency_id).uniq
crypto_price_ids = CryptoPrice.where(crypto_currency_id: ids).group(:crypto_currency_id).maximum(:last_updated).keys
puts "price ids: #{crypto_price_ids.length}"
@crypto_prices = CryptoPrice.where(crypto_currency_id: crypto_price_ids)
puts "ids: #{@crypto_prices.size}"
Although the first "puts" only reveals a size of "12" the second puts reveals over 38,000 results. It should only be returning 12 results, one for each currency.
We can write a finder that returns exactly one row per currency with the greatest last_updated date like this:
current_prices = CryptoPrice.where(crypto_currency_id: CryptoIndexCurrency.all.pluck(:crypto_currency_id).uniq).select("*, id as crypto_price_id, MAX(last_updated) as last_updated").group(:crypto_currency_id)
I hope this takes you closer to your goal. Thank you.
Only works with Rails 5 because of the or query method:
specific_ids = CryptoIndexCurrency.distinct.pluck(:crypto_currency_id)
res = nil
hash = CryptoPrice.where(crypto_currency_id: specific_ids)
       .group(:crypto_currency_id)
       .maximum(:last_updated)
hash.each_with_index do |(k, v), i|
  if i.zero?
    res = CryptoPrice.where(crypto_currency_id: k, last_updated: v)
  else
    res = res.or(CryptoPrice.where(crypto_currency_id: k, last_updated: v))
  end
end
Explanation:
You can use group to group all your CryptoPrice objects by each crypto_currency_id present in your table.
Then use maximum (thanks to #artgb) to take the biggest last_updated value. This outputs a Hash with crypto_currency_id keys and last_updated values.
Finally, you can use keys to only get an Array of crypto_currency_id.
CryptoPrice.group(:crypto_currency_id).maximum(:last_updated)
=> {2285=>2017-06-06 09:06:35 UTC,
    2284=>2017-05-18 15:51:05 UTC,
    2267=>2016-03-22 08:02:53 UTC}
The problem with this solution is that you get the maximum date for each row without getting the whole records.
To get the records, you can loop over the hash pairwise with crypto_currency_id and last_updated. It's hacky, but it's the only solution I found.
Using this code you can fetch the most recent updated_at value from a particular table:
CryptoPrice.order(:updated_at).pluck(:updated_at).last
I hope this helps.
This is currently not easy to do in Rails in one statement/query. If you don't mind using multiple statements/queries, then this is your solution:
cc_ids = CryptoIndexCurrency.distinct.pluck(:crypto_currency_id)
result = cc_ids.map do |cc_id|
max_last_updated = CryptoPrice.where(crypto_currency_id: cc_id).maximum(:last_updated)
CryptoPrice.find_by(crypto_currency_id: cc_id, last_updated: max_last_updated)
end
The result of the map method is what you are looking for. This produces 2 queries for every crypto_currency_id and 1 query to request the crypto_currency_ids.
If you want to do this with one query you'll need to use OVER (PARTITION BY ...). More info on this in the following links:
Fetch the row which has the Max value for a column
https://learn.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql
https://blog.codeship.com/folding-postgres-window-functions-into-rails/
But in this scenario you'll have to write some SQL.
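For illustration, here is a sketch of that window-function query in Python with SQLite (window functions need SQLite >= 3.25; the columns follow the crypto_prices table from the question, the sample values are invented). In Rails you would run such SQL via something like find_by_sql:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE crypto_prices (
    id INTEGER PRIMARY KEY, crypto_currency_id INTEGER, last_updated TEXT)""")
conn.executemany(
    "INSERT INTO crypto_prices (crypto_currency_id, last_updated) VALUES (?, ?)",
    [(2285, "2017-06-06 09:06:35"), (2285, "2017-06-01 00:00:00"),
     (2284, "2017-05-18 15:51:05"), (2284, "2017-05-10 00:00:00")],
)

# Rank each currency's rows by last_updated (newest first) and keep rank 1:
# exactly one whole row per currency, not just the max date.
latest = conn.execute("""
    SELECT crypto_currency_id, last_updated FROM (
        SELECT crypto_currency_id, last_updated,
               ROW_NUMBER() OVER (PARTITION BY crypto_currency_id
                                  ORDER BY last_updated DESC) AS rn
        FROM crypto_prices
    ) WHERE rn = 1
    ORDER BY crypto_currency_id
""").fetchall()
print(latest)
```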
EDIT 1:
If you want a nice Hash result run:
cc_ids.zip(result).to_h
EDIT 2:
If you want to halve the amount of queries you can shove the max_last_updated query in the find_by as sub-query like so:
cc_ids = CryptoIndexCurrency.distinct.pluck(:crypto_currency_id)
result = cc_ids.map do |cc_id|
CryptoPrice.find_by(<<~SQL.squish)
crypto_currency_id = #{cc_id} AND last_updated = (
SELECT MAX(last_updated)
FROM crypto_prices
WHERE crypto_currency_id = #{cc_id})
SQL
end
This produces one query for every crypto_currency_id and one query to request the crypto_currency_ids.

Query the most recent timestamp (MAX/Last) for a specific key, in Influx

Using InfluxDB (v1.1), I have a requirement to get the timestamp of the last entry for a specific key, regardless of which measurement it is stored in and regardless of its value.
The setup is simple, where I have three measurements: location, network and usage.
There is only one key: device_id.
In pseudo-code, this would be something like:
# notice the lack of a FROM clause on measurement here...
SELECT MAX(time) WHERE 'device_id' = 'x';
The question: What would be the most efficient way of querying this?
The reason why I want this is that there will be a decentralised sync process. Some devices may have been updated in the last hour, whilst others haven't been updated in months. Being able to get a distinct "last updated on" timestamp for a device (key) would allow me to more efficiently store new points to Influx.
I've also noticed there is a similar discussion on InfluxDB's GitHub repo (#5793), but the question there is not filtering by any field/key. And this is exactly what I want: getting the 'last' entry for a specific key.
Unfortunately there won't be a single query that gets you what you're looking for. You'll have to do a bit of work client-side.
The query that you'll want is
SELECT last(<field name>), time FROM <measurement> WHERE device_id = 'x'
You'll need to run this query for each measurement.
SELECT last(<field name>), time FROM location WHERE device_id = 'x'
SELECT last(<field name>), time FROM network WHERE device_id = 'x'
SELECT last(<field name>), time FROM usage WHERE device_id = 'x'
From there you'll take the one with the greatest timestamp:
> select last(value), time from location where device_id = 'x'; select last(value), time from network where device_id = 'x'; select last(value), time from usage where device_id = 'x';
name: location
time last
---- ----
1483640697584904775 3
name: network
time last
---- ----
1483640714335794796 4
name: usage
time last
---- ----
1483640783941353064 4
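Picking the winner client-side is then a one-liner. A sketch in Python, where the results dict is hypothetical stand-in data (mirroring the output above) rather than a real Influx client call:

```python
# Hypothetical per-measurement results: (timestamp_ns, last value), as if
# collected from the three queries above; not a real Influx client call.
results = {
    "location": (1483640697584904775, 3),
    "network":  (1483640714335794796, 4),
    "usage":    (1483640783941353064, 4),
}

# The device's overall "last updated" is the greatest timestamp across
# the three measurements.
measurement, (last_ts, value) = max(results.items(), key=lambda kv: kv[1][0])
print(measurement, last_ts)
```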
tl;dr;
The first() and last() selectors will NOT work consistently if the measurement has multiple fields and the fields have NULL values. The most efficient solution is to use these queries:
First:
SELECT * FROM <measurement> [WHERE <tag>=value] LIMIT 1
Last:
SELECT * FROM <measurement> [WHERE <tag>=value] ORDER BY time DESC LIMIT 1
Explanation:
If you have a single field in your measurement, then the suggested solutions will work. But if you have more than one field and values can be NULL, then the first() and last() selectors won't work consistently and may return different timestamps for each field. For example, let's say you have the following data set:
time fieldKey_1 fieldKey_2 device
------------------------------------------------------------
2019-09-16T00:00:01Z NULL A 1
2019-09-16T00:00:02Z X B 1
2019-09-16T00:00:03Z Y C 2
2019-09-16T00:00:04Z Z NULL 2
In this case querying
SELECT first(fieldKey_1) FROM <measurement> WHERE device = "1"
will return
time fieldKey_1
---------------------------------
2019-09-16T00:00:02Z X
and the same query for first(fieldKey_2) will return a different time
time fieldKey_2
---------------------------------
2019-09-16T00:00:01Z A
A similar problem will happen when querying with last.
And in case you are wondering, querying first(*) wouldn't help either, since you'll get an epoch-0 time in the results, such as:
time first_fieldKey_1 first_fieldKey_2
-------------------------------------------------------------
1970-01-01T00:00:00Z X A
So, the solution would be querying using combinations of LIMIT and ORDER BY.
For instance, for the first time value you can use:
SELECT * FROM <measurement> [WHERE <tag>=value] LIMIT 1
and for the last one you can use
SELECT * FROM <measurement> [WHERE <tag>=value] ORDER BY time DESC LIMIT 1
It is safe and fast, as it relies on indexes.
Curiously, this simpler approach was mentioned in the thread linked in the opening post, but was discarded. Maybe it was just overlooked.
There is also a thread on the InfluxData blog about this subject that suggests the same approach.
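The ORDER BY time DESC LIMIT 1 idea is plain SQL and easy to sanity-check outside Influx. A minimal sketch in Python with SQLite (schema invented for illustration):

```python
import sqlite3

# Invented measurement-like table with a tag column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usage (time INTEGER, device_id TEXT, value REAL)")
conn.executemany("INSERT INTO usage VALUES (?, ?, ?)", [
    (100, "x", 1.0), (300, "x", 4.0), (200, "x", 2.0), (400, "y", 9.0),
])

# Whole last row for the tag in one ordered scan, immune to NULLs in
# individual fields (unlike per-field last()).
last_row = conn.execute(
    "SELECT * FROM usage WHERE device_id = ? ORDER BY time DESC LIMIT 1",
    ("x",),
).fetchone()
print(last_row)
```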
I tried this and it worked for me in a single command :
SELECT last(<field name>), time FROM location, network, usage WHERE device_id = 'x'
The result I got :
name: location
time last
---- ----
1483640697584904775 3
name: network
time last
---- ----
1483640714335794796 4
name: usage
time last
---- ----
1483640783941353064 4

How to show same column in dbgrid with different criteria

I need your help to finish my Delphi homework.
I use an MS Access database and show all the data in one DBGrid using SQL. I want to show the same columns repeated side by side, with 50 records per column.
I want the select query to produce output like:
No   | Name | No     | Name
1    | A    | 51     | AA
2    | B    | 52     | BB
3~50 |      | 53~100 |
Is it possible?
I can foresee issues if you choose to return a dataset with duplicate column names. To avoid this, change your query to enforce strictly unique column names using as. For example...
select A.No as No, A.Name as Name, B.No as No2, B.Name as Name2 from TableA A
join TableB B on B.Something = A.Something
Just as a note, if you're using a TDBGrid, you can customize the column titles. Right-click on the grid control in design-time and select Columns Editor... and a Collection window will appear. When adding a column, link it to a FieldName and then assign a value to Title.Caption. This will also require that you set up all columns. When you don't define any columns here, it automatically returns all columns in the query.
On the other hand, a SQL query may contain duplicate field names in the output, depending on how you structure the query. I know this is possible in SQL Server, but I'm not sure about MS Access. In any case, I recommend always returning a dataset with unique column names and then customizing the DB Grid's column titles. After all, it is also possible to connect to an Excel spreadsheet, which can very likely have identical column names. The problem arises when you try to read from one of those columns for another use.
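If the goal is two side-by-side halves of one numbered table, a self-join with unique aliases (as suggested above) does it. A sketch in Python with SQLite, using 2 rows per column instead of 50 and invented sample data:

```python
import sqlite3

# Invented sample data; 2 rows per column stands in for the question's 50.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (no INTEGER, name TEXT)")
conn.executemany("INSERT INTO people VALUES (?, ?)",
                 [(1, "A"), (2, "B"), (3, "AA"), (4, "BB")])

PER_COL = 2
# Pair row n with row n + PER_COL; aliases no2/name2 keep column names unique.
rows = conn.execute("""
    SELECT a.no, a.name, b.no AS no2, b.name AS name2
    FROM people a
    LEFT JOIN people b ON b.no = a.no + :n
    WHERE a.no <= :n
    ORDER BY a.no
""", {"n": PER_COL}).fetchall()
print(rows)
```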
