Crystal Reports data not displaying in left outer linked tables - crystal-reports-xi

In Crystal Reports I have 2 tables linked via a LEFT OUTER JOIN - one for the budget per period and one for the actual expense balance per period.
Although the data in the expense table is correct, the data in the budget table returns no value if the value in the expense table is 0.

The main tables used are the BUDGET table (which contains budget balances per GL account per period) and the BALANA table (which contains the actual expense balance per GL account per period).
This is the SQL command:
SELECT "FISCALYEAR"."FIYNUM_0", "BALANA"."ACC_0", "BALANA"."DEBLED_0", "BALANA"."CDTLED_0", "BALANA"."DEBLED_1", "BALANA"."DEBLED_2", "BALANA"."DEBLED_3", "BALANA"."DEBLED_4", "BALANA"."DEBLED_5", "BALANA"."DEBLED_6", "BALANA"."DEBLED_7", "BALANA"."DEBLED_8", "BALANA"."DEBLED_9", "BALANA"."DEBLED_10", "BALANA"."DEBLED_11", "BALANA"."DEBLED_12", "BALANA"."CDTLED_1", "BALANA"."CDTLED_2", "BALANA"."CDTLED_3", "BALANA"."CDTLED_4", "BALANA"."CDTLED_5", "BALANA"."CDTLED_6", "BALANA"."CDTLED_7", "BALANA"."CDTLED_8", "BALANA"."CDTLED_9", "BALANA"."CDTLED_10", "BALANA"."CDTLED_11", "BALANA"."CDTLED_12", "BALANA"."LEDTYP_0", "BALANA"."CPY_0", "BALANA"."FCY_0", "BALANA"."BPR_0", "BALANA"."CURLED_0", "BALANA"."CCE1_0", "BUD"."AMT_1", "BUD"."AMT_2", "BUD"."AMT_3", "BUD"."AMT_4", "BUD"."AMT_5", "BUD"."AMT_6", "BUD"."AMT_7", "BUD"."AMT_8", "BUD"."AMT_9", "BUD"."AMT_10", "BUD"."AMT_11", "BUD"."ACC_0", "BUD"."AMT_0"
FROM ("sagex3v7live"."LIVE"."FISCALYEAR" "FISCALYEAR" LEFT OUTER JOIN "sagex3v7live"."LIVE"."BALANA" "BALANA" ON (("FISCALYEAR"."CPY_0"="BALANA"."CPY_0") AND ("FISCALYEAR"."FIYNUM_0"="BALANA"."FIY_0")) AND ("FISCALYEAR"."LEDTYP_0"="BALANA"."LEDTYP_0")) LEFT OUTER JOIN "sagex3v7live"."LIVE"."BUD" "BUD" ON ((((("FISCALYEAR"."FIYNUM_0"="BUD"."FIY_0") AND ("FISCALYEAR"."LEDTYP_0"="BUD"."LEDTYP_0")) AND ("FISCALYEAR"."CPY_0"="BUD"."CPY_0")) AND ("BALANA"."ACC_0"="BUD"."ACC_0")) AND ("BALANA"."CUR_0"="BUD"."CUR_0")) AND ("BALANA"."CCE1_0"="BUD"."CCE1_0")
WHERE "FISCALYEAR"."FIYNUM_0"=4 AND "BALANA"."LEDTYP_0"=1 AND "BALANA"."BPR_0"=N'' AND "BALANA"."FCY_0"<>N'' AND "BALANA"."CPY_0"=N'MAJ' AND "BALANA"."CURLED_0"=N'ZAR' AND ("BALANA"."CCE1_0">=N'BFN' AND "BALANA"."CCE1_0"<=N'PLZ')
ORDER BY "BALANA"."ACC_0", "BALANA"."CCE1_0"

Related

How to show all records from multiple tables regardless of match on join statement

I am having trouble figuring out the proper syntax to structure this query correctly. I am trying to show ALL records from both the SalesHistoryDetail AND from the SalesVsBudget table. I believe my query allows for some of the records on SalesVsBudget to not be pulled, whereas I want them all for that period, regardless of whether there was a corresponding sale. Here is my code:
SELECT MAX(a.DispatchCenterOrderKey) AS DispatchCenter,
a.CustomerKey,
CASE WHEN a.CustomerKey IN
(SELECT AddressKey
FROM FinancialData.dbo.DimAddress
WHERE AddressKey >= 99000 AND AddressKey <= 99599) THEN 1 ELSE 0 END AS InterCompanyFlag,
MAX(a.Customer) AS Customer,
a.SalesmanID,
MAX(a.Salesman) AS Salesman,
a.SubCategoryKey,
MAX(a.SubCategoryDesc) AS Subcategory,
SUM(a.Value) AS SalesAmt,
b.FiscalYear AS Year,
b.FiscalWeekOfYear AS Week,
MAX(c.BudgetLbs) AS BudgetLbs,
MAX(c.BudgetDollars) AS BudgetDollars
FROM dbo.SalesHistoryDetail AS a
LEFT OUTER JOIN dbo.M_DateDim AS b ON a.InvoiceDate = b.Date
FULL OUTER JOIN dbo.SalesVsBudget AS c ON a.SalesmanID = c.SalesRepKey
AND a.CustomerKey = c.CustomerKey
AND a.SubCategoryKey = c.SubCategoryKey
AND b.FiscalYear = c.Year AND b.FiscalWeekOfYear = c.WeekNo
GROUP BY a.SalesmanID, a.CustomerKey, a.SubCategoryKey, b.FiscalYear, b.FiscalWeekOfYear
There are two different data sets that I am pulling from, obviously the SalesHistoryDetail table and the SalesVsBudget table. I'm hoping to get ALL BudgetLbs and BudgetDollars values from the SalesVsBudget table regardless of whether they match in the join. I want all of the matching joining records too, but I also want EVERY record from SalesVsBudget. Essentially I want to show ALL sales records, and I want to reference the budget values from SalesVsBudget when the salesman, customer, subcategory, year and week match, but I also want to see budget entries that fall in my date range that don't have corresponding sales records in that period. Hopefully that makes sense. I feel I am very close, but my budget numbers don't reflect the whole story, and I think that is because some of my records are being excluded! Please help.
I was able to accomplish this by playing with the FULL OUTER JOIN. My problem was that there were more records in SalesVsBudget than in SalesHistory_V. Therefore I had to make SalesVsBudget the initial FROM table and bring in SalesHistory_V with the FULL OUTER JOIN, and all records lined up; a sketch of that rearrangement is below.
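A hedged sketch of the rearranged query (column names are taken from the question above; SalesHistory_V is assumed to expose the same columns as SalesHistoryDetail, and the SELECT list is trimmed for brevity):
SELECT COALESCE(a.CustomerKey, c.CustomerKey)     AS CustomerKey,
       COALESCE(b.FiscalYear, c.Year)             AS [Year],
       COALESCE(b.FiscalWeekOfYear, c.WeekNo)     AS [Week],
       SUM(a.Value)                               AS SalesAmt,
       MAX(c.BudgetLbs)                           AS BudgetLbs,
       MAX(c.BudgetDollars)                       AS BudgetDollars
FROM dbo.SalesVsBudget AS c                 -- budget side now drives the query
FULL OUTER JOIN dbo.SalesHistory_V AS a     -- sales side may have no match
    LEFT OUTER JOIN dbo.M_DateDim AS b
        ON a.InvoiceDate = b.Date
    ON  a.SalesmanID       = c.SalesRepKey
    AND a.CustomerKey      = c.CustomerKey
    AND a.SubCategoryKey   = c.SubCategoryKey
    AND b.FiscalYear       = c.Year
    AND b.FiscalWeekOfYear = c.WeekNo
GROUP BY COALESCE(a.SalesmanID, c.SalesRepKey),
         COALESCE(a.CustomerKey, c.CustomerKey),
         COALESCE(a.SubCategoryKey, c.SubCategoryKey),
         COALESCE(b.FiscalYear, c.Year),
         COALESCE(b.FiscalWeekOfYear, c.WeekNo);
The COALESCE over each key pair keeps the budget-only rows in their correct group instead of collapsing them all under NULL keys.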

Hive join query returning Cartesian product on inner join

I am doing an inner join on two tables that are created using Hive. One is a big table, "trades_bucket", and the other is a small table, "counterparty_bucket". They are created as follows:
DROP TABLE IF EXISTS trades_bucket;
CREATE EXTERNAL TABLE trades_bucket(
parentId STRING,
BookId STRING) CLUSTERED BY(parentId) SORTED BY(parentId) INTO 32 BUCKETS;
DROP TABLE IF EXISTS counterparty_bucket;
CREATE EXTERNAL TABLE counterparty_bucket(
Version STRING,AccountId STRING,childId STRING)
CLUSTERED BY(childId ) SORTED BY(childId) INTO 32 BUCKETS;
The join between the tables:
SELECT /*+ MAPJOIN(counterparty_bucket) */ BookId , t.counterpartysdsid, c.sds
FROM counterparty_bucket c join trades_bucket t
on c.childId = t.parentId
where c.childId ='10001684'
The problem is that the join is producing a Cartesian product out of the two tables. What I mean is that if the big table has 100 rows and the small table has 4 rows for a given id, I expect the join to return 100 rows, but I am getting back 400 rows. Does anyone have a clue, or has anyone witnessed a similar situation?
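For reference, an equi-join multiplies every matching row on one side by every matching row on the other, so a key that is duplicated on both sides can look like a Cartesian product. A small hedged illustration using the table names above and hypothetical row counts:
-- Hypothetical: if childId '10001684' appears in 4 rows of counterparty_bucket
-- and parentId '10001684' appears in 100 rows of trades_bucket, an equi-join
-- on those columns returns 4 * 100 = 400 rows by standard join semantics.
SELECT count(*)
FROM counterparty_bucket c
JOIN trades_bucket t
  ON c.childId = t.parentId
WHERE c.childId = '10001684';   -- would return 400 in that scenario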

Hive Bucketed Map Join

I am facing an issue executing a bucketed map join.
I am using Hive 0.10.
Table1 is partitioned by year, month and day. Each partition's data is bucketed by column c1 into 128 buckets. I have almost 100 million records per day.
Table 1
create table table1
(
....
....
)
partitioned by (year int,month int,day int)
CLUSTERED BY(c1) INTO 128 BUCKETS;
Table2 is a large lookup table bucketed on column c1. I have 80 million records loaded into 128 buckets.
Table 2
create table table2
(
c1,
c2,
...
)
CLUSTERED BY(c1) INTO 128 BUCKETS;
I have checked the data, and it is loaded into the buckets as expected.
Now I am trying to enforce a bucketed map join. That's where I am stuck.
set hive.auto.convert.join=true;
set hive.optimize.bucketmapjoin = true;
set hive.mapjoin.bucket.cache.size=1000000;
select a.c1 as c1_tb2, a.c2,
b.c1, b....
from table2 a
JOIN table1 b
ON (a.c1=b.c1);
I am still not getting a bucketed map join. Am I missing something? I even tried to execute the join on only one partition, but I still get the same result.
Or does a bucketed map join not work on partitioned tables?
Please help. Thanks.
This explanation is for Hive 0.13. AFAICT, bucketed map join doesn't take effect for auto-converted map joins. You will need to explicitly call out the map join in the syntax, like this:
set hive.optimize.bucketmapjoin = true;
explain extended select /*+ MAPJOIN(b) */ count(*)
from nation_b1 a
join nation_b2 b on (a.n_regionkey = b.n_regionkey);
Note that only explain extended shows you the flag that indicates if bucket map join is being used or not. Look for this line in the plan.
BucketMapJoin: true
Tables are bucketed in Hive to manage/process portions of the data individually. This makes the process easier to manage and more efficient in terms of performance.
Let's understand the join when the data is stored in buckets.
Let's say there are two tables, user and user_visits, and both tables' data is bucketed using user_id into 4 buckets. This means bucket 1 of user will contain rows with the same user ids as bucket 1 of user_visits. If a join is performed on these two tables on the user_id column, and it is possible to send bucket 1 of both tables to the same mapper, then a good amount of optimization can be achieved. This is exactly what is done in a bucketed map join.
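A minimal sketch of the layout described in that example; only user_id comes from the text above, the other column names are illustrative assumptions:
-- Both tables bucketed on the join key into the same number of buckets,
-- which is what allows bucket 1 of `user` to be paired with bucket 1 of
-- user_visits at a single mapper.
CREATE TABLE `user` (
  user_id   BIGINT,
  user_name STRING
)
CLUSTERED BY (user_id) INTO 4 BUCKETS;

CREATE TABLE user_visits (
  user_id   BIGINT,
  visit_ts  STRING,
  url       STRING
)
CLUSTERED BY (user_id) INTO 4 BUCKETS;

-- Before Hive 2.0 this setting is needed so that INSERT ... SELECT actually
-- honors the bucket definition when the data is loaded.
set hive.enforce.bucketing = true;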
Prerequisites for bucket map join:
Tables being joined are bucketized on the join columns,
The number of buckets in one table is the same as, or a multiple of, the number of buckets in the other table.
The buckets can be joined with each other if the tables being joined are bucketized on the join columns. If table A has 4 buckets and table B has 4 buckets, the following join
SELECT /*+ MAPJOIN(b) */ a.key, a.value FROM a JOIN b ON a.key = b.key
can be done on the mapper only. Instead of fetching B completely for each mapper of A, only the required buckets are fetched. For the query above, the mapper processing bucket 1 for A will only fetch bucket 1 of B. It is not the default behavior, and is governed by the following parameter:
set hive.optimize.bucketmapjoin = true;
If the tables being joined are sorted and bucketized on the join columns, and they have the same number of buckets, a sort-merge join can be performed. The corresponding buckets are joined with each other at the mapper. If both A and B have 4 buckets,
SELECT /*+ MAPJOIN(b) */ a.key, a.value FROM A a JOIN B b ON a.key = b.key
can be done on the mapper only. The mapper for the bucket for A will traverse the corresponding bucket for B. This is not the default behavior, and the following parameters need to be set:
set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
set hive.optimize.bucketmapjoin = true;
set hive.optimize.bucketmapjoin.sortedmerge = true;

Change Data Capture with table joins in ETL

In my ETL process I am using Change Data Capture (CDC) to discover only the rows that have changed in the source tables since the last extraction. Then I do the transformation only for these rows. The problem arises when, for example, I have 2 tables which I want to join into one dimension, and only one of them has changed. For example, I have the tables Countries and Towns as follows:
Countries:
ID Name
1 France
Towns:
ID Name Country_ID
1 Lyon 1
Now let's say a new row is added to the Towns table:
ID Name Country_ID
1 Lyon 1
2 Paris 2
The Countries table has not been changed, so CDC for these tables shows me only the row from the Towns table. The problem is that when I do the join between Countries and Towns, there is no row in the Countries change set, so the join results in an empty set.
Do you have an idea how to solve this? Of course there might be more difficult cases, involving 3 or more tables and consecutive joins.
This is a typical problem found when doing real-time Change Data Capture, or even incremental-only daily changes.
There's multiple ways to solve this.
One way would be to do your joins on the natural keys in the dimension or mapping table, to get the associated country (SELECT distinct country_name, [..other attributes..] from dim_table where country_id = X).
Another alternative would be to do the join as part of the change capture process - when a row is loaded to towns, a trigger goes off that loads the foreign key values into the associated staging tables (country, etc).
There is a lot I could go on about here, but I will be specific to what is in your question. I would suggest the following to get the results:
1st pass: everything that matches via the join.
UNION ALL
2nd pass: all towns where there isn't a country (a left outer join with a WHERE condition that requires the ID in the Countries table to be null/missing).
You would default the Country ID value in that unmatched join to something designated as an "Unmatched Value"; typically 0 or -1 is used, or a series of standard negative numbers that you can assign descriptions to later to identify why the data is bad. For your example, -1 could be "Found Town Without Country". A sketch of the two passes is below.
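A hedged sketch of those two passes, using the Countries/Towns sample columns from the question:
-- 1st pass: towns that match a country
SELECT t.ID AS Town_ID, t.Name AS Town, c.ID AS Country_ID, c.Name AS Country
FROM Towns t
INNER JOIN Countries c ON c.ID = t.Country_ID

UNION ALL

-- 2nd pass: towns without a matching country, defaulted to the
-- "Unmatched Value" key described above
SELECT t.ID, t.Name, -1 AS Country_ID, 'Found Town Without Country' AS Country
FROM Towns t
LEFT OUTER JOIN Countries c ON c.ID = t.Country_ID
WHERE c.ID IS NULL;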

How to select the max record for each of a group of candidates in Grails?

I have a table that gets populated every day with records from reporting systems.
I have a list of the serial numbers that I am interested in returning in an asset list.
How do I get Grails to return the records that match the maximum "epoch" entry for each asset? In SQL I would join the table back to itself after picking out the maximum, such as:
select a.*
from assetTable a
inner join (
    select sn, max(epoch) epoch
    from assetTable
    group by sn
) b on a.sn = b.sn and a.epoch = b.epoch
but I cannot figure out how to get this done efficiently with Grails...
From a domain class perspective it is pretty simple. Consider, for the sake of example, that I have a single domain class "AssetTable", and it has Integer epoch, String sn, ...
Literally, all I want to do is get the latest entry (all fields) for a subset of serial numbers (sn) that I have in a List.
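For reference, a hedged sketch of how the same "latest epoch per sn" shape is often written as HQL and run through a GORM domain class's executeQuery; AssetTable, epoch and sn come from the question, while serialNumbers is an assumed parameter name for the list:
select a
from AssetTable a
where a.sn in (:serialNumbers)
  and a.epoch = (
      select max(b.epoch)
      from AssetTable b
      where b.sn = a.sn
  )
Calling AssetTable.executeQuery with that HQL and [serialNumbers: snList] as the parameter map would return the latest AssetTable instance for each serial number in the list.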
