Handcrafted OData queries on Exact Online with Invantive

We currently run a number of hand-crafted and optimized OData queries on Exact Online using Python, across several thousand divisions. However, I want to migrate them to Invantive SQL for ease of maintenance.
But some of the optimizations, such as an explicit $orderby in the OData query, are not forwarded to Exact Online by Invantive SQL; it just retrieves all data, or the top x, and then sorts locally.
Especially for determining a maximum value, that can be a lot slower.
A simple sample on a small table:
https://start.exactonline.nl/api/v1/<<division>>/financial/Journals?$select=BankAccountIBAN,BankAccountDescription&$orderby=BankAccountIBAN desc&$top=5
Is there an alternative to optimize the actual OData queries executed by Invantive SQL?

You can either use the Data Replicator or send the hand-crafted OData query through a native platform request, such as:
insert into NativePlatformScalarRequests
( url
, orig_system_group
)
select replace('https://start.exactonline.nl/api/v1/{division}/financial/Journals?$select=BankAccountIBAN,BankAccountDescription&$orderby=BankAccountIBAN desc&$top=5', '{division}', code)
, 'MYSTUFF-' || code
from systempartitions#datadictionary
limit 100 /* First 100 divisions. */

Once the requests have run, parse the JSON answers into an in-memory table:
create or replace table exact_online_download_journal_top5#inmemorystorage
as
select jte.*
from ( select npt.result
from NativePlatformScalarRequests npt
where npt.orig_system_group like 'MYSTUFF-%'
and npt.result is not null
) npt
join jsontable
( null
passing npt.result
columns BankAccountDescription varchar2 path 'd[0].BankAccountDescription'
, BankAccountIBAN varchar2 path 'd[0].BankAccountIBAN'
) jte
From here on you can use the in-memory table, for example:
select * from exact_online_download_journal_top5#inmemorystorage
But of course you can also 'insert into sqlserver'.
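A minimal sketch of that last step, assuming a SQL Server data container with alias sqlserver has been defined in the Invantive database configuration and a table journals_top5 already exists there (both names are hypothetical):

insert into journals_top5@sqlserver
( bank_account_iban
, bank_account_description
)
select jnl.BankAccountIBAN
, jnl.BankAccountDescription
from exact_online_download_journal_top5#inmemorystorage jnl
/* journals_top5@sqlserver and its columns are assumed names; replace them with your own SQL Server table. */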

Related

How to use result cache while using joins in SAP HANA?

SELECT
cntdpts."PROJECT_SID",
cntdpts."USER_SID",
"CNTDPTS",
"CNTQUERIES"
FROM (
SELECT
"PROJECT_SID",
"USER_SID",
COUNT("DATA_POINT_SID") AS "CNTDPTS"
FROM
CNTDPTS
GROUP BY
"PROJECT_SID",
"USER_SID" WITH HINT(RESULT_CACHE) ) cntdpts
INNER JOIN (
SELECT
"PROJECT_SID",
"USER_SID",
COUNT("QUERY_SID") AS "CNTQUERIES"
FROM
CNTQUERIES
GROUP BY
"PROJECT_SID",
"USER_SID" WITH HINT(RESULT_CACHE) ) cntqueries ON
cntdpts."PROJECT_SID" = cntqueries."PROJECT_SID"
AND cntdpts."USER_SID" = cntqueries."USER_SID" WITH HINT(RESULT_CACHE)
I am having trouble using cached table functions. If I run the two subqueries "cntdpts" and "cntqueries" individually, they return their result within <100ms (because they use the cache of the table functions CNTDPTS and CNTQUERIES). However, if I run the full query joining the two subqueries, it takes >5s and HANA does not seem to take advantage of the cached results of the subqueries. Is there any HINT I still need to add?
You will need to add WITH HINT(RESULT_CACHE_NON_TRANSACTIONAL) to your outermost query.
See also https://help.sap.com/viewer/9de0171a6027400bb3b9bee385222eff/2.0.05/en-US/3ad0e93de0aa408e9238fa862e4780df.html
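Applied to the query above, only the hint on the outermost query block changes; the tail of the statement would then read (a sketch):

cntdpts."PROJECT_SID" = cntqueries."PROJECT_SID"
AND cntdpts."USER_SID" = cntqueries."USER_SID" WITH HINT(RESULT_CACHE_NON_TRANSACTIONAL)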

Count occurrence of values in a serialized attribute (array) in an Active Admin dashboard (Rails, Active Admin 1.0, PostgreSQL database, postgres_ext gem)

I'd like to have a basic table summing up the number of occurrences of values inside arrays.
My app is a Daily Deal app built to learn more Ruby on Rails.
I have a model Deal, which has an attribute called deal_goal. It's a multiple select which is serialized into an array.
Here is the deal_goal taken from schema.db:
t.string "deal_goal",:array => true
So a deal A can have deal_goal = [traffic, qualification] and another deal can have deal_goal = [branding, traffic, acquisition].
What I'd like to build is a table in my dashboard which takes each type of goal (each value in the array) and counts the number of deals whose deal_goal array contains that goal.
My objective is a table with one row per goal, showing the number and share of deals with that goal.
How can I achieve this? I think I would need to group the deal_goal arrays by each distinct value and then count the number of times each goal appears in the arrays. I'm quite new to RoR and can't manage to do it.
Here is my code so far:
column do
panel "top of Goals" do
table_for Deal.limit(10) do
column ("Goal"), :deal_goal # ????
# add 2 columns:
# 'nb of deals with this goal'
# 'Share of deals with this goal'
end
end
end
Any help would be much appreciated!
I can't think of any clean way to get the results you're after through ActiveRecord but it is pretty easy in SQL.
All you're really trying to do is open up the deal_goal arrays and build a histogram based on the opened arrays. You can express that directly in SQL this way:
with expanded_deals(id, goal) as (
select id, unnest(deal_goal)
from deals
)
select goal, count(*) n
from expanded_deals
group by goal
And if you want to include all four goals even if they don't appear in any of the deal_goals then just toss in a LEFT JOIN to say so:
with
all_goals(goal) as (
values ('traffic'),
('acquisition'),
('branding'),
('qualification')
),
expanded_deals(id, goal) as (
select id, unnest(deal_goal)
from deals
)
select all_goals.goal goal,
count(expanded_deals.id) n
from all_goals
left join expanded_deals using (goal)
group by all_goals.goal
SQL Demo: http://sqlfiddle.com/#!15/3f0af/20
Throw one of those into a select_rows call and you'll get your data:
Deal.connection.select_rows(%q{ SQL goes here }).each do |row|
goal = row.first
n = row.last.to_i
#....
end
There's probably a lot going on here that you're not familiar with so I'll explain a little.
First of all, I'm using WITH and Common Table Expressions (CTE) to simplify the SELECTs. WITH is a standard SQL feature that allows you to produce SQL macros or inlined temporary tables of a sort. For the most part, you can take the CTE and drop it right in the query where its name is:
with some_cte(colname1, colname2, ...) as ( some_pile_of_complexity )
select * from some_cte
is like this:
select * from ( some_pile_of_complexity ) as some_cte(colname1, colname2, ...)
CTEs are the SQL way of refactoring an overly complex query/method into smaller and easier to understand pieces.
unnest is an array function which unpacks an array into individual rows. So if you say unnest(ARRAY[1,2]), you get two rows back: 1 and 2.
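For example (plain PostgreSQL, runnable as-is):

select unnest(array['traffic', 'branding']) as goal;
-- returns two rows: 'traffic' and 'branding'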
VALUES in PostgreSQL is used to, more or less, generate inlined constant tables. You can use VALUES anywhere you could use a normal table; it isn't just some syntax that you throw in an INSERT to tell the database what values to insert. That means that you can say things like this:
select * from (values (1), (2)) as dt
and get the rows 1 and 2 out. Throwing that VALUES into a CTE makes things nice and readable and makes it look like any old table in the final query.

executing query from ruby on rails the right way

I'm just beginning with Ruby on Rails and have a question regarding a somewhat more complex query. So far I've done simple queries while looking at the Rails guides, and that worked really well.
Right now I'm trying to get some ids from the database, and I would use those ids to get the real objects and do something with them. Getting those is a bit more complex than a simple Object.find.
Here is how my query looks:
select * from quotas q, requests r
where q.id=r.quota_id
and q.status=3
and r.text is not null
and q.id in
(
select A.id from (
select max(id) as id, name
from quotas
group by name) A
)
order by q.created_at desc
limit 1000;
This gives me 1000 ids when executed from the SQL manager, and I was thinking to obtain the list of ids first and then find the objects by id.
Is there a way to get these objects directly using this query, avoiding the separate id lookup? I googled that you can execute a query like this:
ActiveRecord::Base.connection.execute(query);
Assuming Quota has_many :requests,
Quota.includes(:requests).
where(status:3).
where('requests.text is not null').
where("quotas.id in (#{subquery_string_here})").
order('quotas.created_at desc').limit(1000)
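For reference, that relation should generate SQL roughly like the following (a sketch; with a string condition on requests, includes falls back to a single eager-load query using a LEFT OUTER JOIN, though the actual column aliases will differ):

SELECT quotas.*, requests.*
FROM quotas
LEFT OUTER JOIN requests ON requests.quota_id = quotas.id
WHERE quotas.status = 3
AND requests.text IS NOT NULL
AND quotas.id IN (/* subquery here */)
ORDER BY quotas.created_at DESC
LIMIT 1000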
I'm by no means an expert, but most basic SQL functionality is baked into ActiveRecord. You might also want to look at the #group and #pluck methods for ways to eliminate the ugly string subquery.
Calling #to_sql on a relationship object will show you the SQL command it is equivalent to, and may help with your debugging.
I would use find_by_sql for this. I wouldn't swear that this is exactly right, but as I recall you can pretty much plonk an SQL statement into a find_by_sql and the resulting columns will be returned as attributes of an array of objects of the class you call it on:
status = 3
Quota.find_by_sql(['
select *
from quotas q, requests r
where q.id=r.quota_id
and q.status= ?
and r.text is not null
and q.id in
(
select A.id from (
select max(id) as id, name
from quotas
group by name) A
)
order by q.created_at desc
limit 1000;', status])
If you come to Rails as someone used to writing raw SQL, you're probably better off using this syntax than stringing together a bunch of ActiveRecord methods - the result is the same, so it's just a matter of what you find more readable.
Btw, you shouldn't use string interpolation (i.e. #{variable} syntax) inside an SQL query. Use the '?' syntax instead (see my example) to avoid the potential for SQL injection.

Linq to Sql and T-SQL Performance Discrepancy

I have an MVC web site that presents a paged list of data records from a SQL Server database. The UI allows the user to filter the returned data on a number of different criteria, e.g. email address. Here is a snippet of the code:
Stopwatch stopwatch = new Stopwatch();
var temp = SubscriberDB
.GetSubscribers(model.Filter, model.PagingInfo);
// Inspect SQL expression here
stopwatch.Start();
model.Subscribers = temp.ToList();
stopwatch.Stop(); // 9 seconds plus compared to < 1 second in Query Analyzer
When this code is run, the StopWatch shows an execution time of around 9 seconds. If I capture the generated SQL expression (just before it is evaluated with the .ToList() method) and cut'n'paste that as a query into SQL Server Management Studio, the execution time drops to less than 1 second. For reference, here is the generated SQL expression:
SELECT [t2].[SubscriberId], [t2].[Email], [t3].[Reference] AS [DataSet], [t4].[Reference] AS [DataSource], [t2].[Created]
FROM (
SELECT [t1].[SubscriberId], [t1].[SubscriberDataSetId], [t1].[SubscriberDataSourceId], [t1].[Email], [t1].[Created], [t1].[ROW_NUMBER]
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY [t0].[Email], [t0].[SubscriberDataSetId]) AS [ROW_NUMBER], [t0].[SubscriberId], [t0].[SubscriberDataSetId], [t0].[SubscriberDataSourceId], [t0].[Email], [t0].[Created]
FROM [dbo].[inbox_Subscriber] AS [t0]
WHERE [t0].[Email] LIKE '%_EMAIL_ADDRESS_%'
) AS [t1]
WHERE [t1].[ROW_NUMBER] BETWEEN 0 + 1 AND 0 + 20
) AS [t2]
INNER JOIN [dbo].[inbox_SubscriberDataSet] AS [t3] ON [t3].[SubscriberDataSetId] = [t2].[SubscriberDataSetId]
INNER JOIN [dbo].[inbox_SubscriberDataSource] AS [t4] ON [t4].[SubscriberDataSourceId] = [t2].[SubscriberDataSourceId]
ORDER BY [t2].[ROW_NUMBER]
If I remove the email filter clause, then the controller's StopWatch returns a similar response time to the SQL Management Studio query, less than 1 second - so I am assuming that the basic SQL plumbing is working correctly and that the problem lies in the evaluation of the LINQ expression. I should also mention that this is quite a large database, with upwards of 1M rows in the subscriber table.
Can anyone throw any light on why there should be such a high (x10) performance differential, and what, if anything, can be done to address this?
Well, I'm not sure about that. A full LIKE over 1M rows can take quite some time. Is Email indexed? Can you run the query with 'Email%' instead of '%Email%' and see what happens?
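Only the shape of the predicate changes, but it decides whether an index on Email can be used at all (a sketch based on the captured query above):

-- leading wildcard: forces a scan over the ~1M rows, index or not
WHERE [t0].[Email] LIKE '%Email%'
-- prefix match: allows an index seek, if Email is indexed
WHERE [t0].[Email] LIKE 'Email%'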

Handle URL query parameters in Play! SQL statement?

I am testing an Ext JS application (client side) and a Play Framework application (server side).
I am using a grid in Ext JS with pagination.
The pagination part requires sending URL query parameters to my Play! server. This is no big deal, but how do I process these parameters in the SQL statement?
Example:
First request:
http://myDomain:9000/GetUsers?_dc=123456789&page=1&start=0&limit=25
Second request:
http://myDomain:9000/GetUsers?_dc=123456789&page=2&start=25&limit=25
My thoughts:
Normally in SQL you can limit the number of results with TOP:
SELECT TOP 25 * FROM USERS
But how do I translate the second request into a SQL query?
Thank you for taking the time to help me out!
EDIT: I am developing on SQL Server 2008, but I want this to work on SQL Server 2005 or higher and Oracle 9 or higher :-)
Since you're using the Play! framework, what you should do is have a proper model, with entities representing your SQL tables. Then receiving a range of results is built in:
// fetch up to 25 users, starting at offset 25
List<User> users = User.all().from(25).fetch(25);
You should also look at the pagination module. I haven't tested it, but it looks like exactly what you want.
You could try something like:
WITH Query_1 AS (
SELECT
Field1, Field2, etc,
ROW_NUMBER() OVER (ORDER BY Field1, Field2, etc) AS RowID
FROM Table
WHERE x=y
)
SELECT * FROM Query_1 WHERE RowID > #start
AND RowID <= #start + #limit
Of course, ROW_NUMBER didn't exist back in SQL 2000, but since you've not told us which SQL version you're working with, I'm assuming something newer.
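For illustration, plugging the second request's values (start=25, limit=25) into that sketch as literals (Table, Field1 and Field2 remain placeholders):

WITH Query_1 AS (
SELECT
Field1, Field2,
ROW_NUMBER() OVER (ORDER BY Field1, Field2) AS RowID
FROM Table
WHERE x=y
)
SELECT * FROM Query_1
WHERE RowID > 25 -- skip the first 25 rows (page 1)
AND RowID <= 25 + 25 -- return rows 26 through 50 (page 2)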
