Inner join returns duplicated records - join

My table:
Report_Period  Entity    Tag  Users Count  Report_Period_M-1  Report_Period_Q-1  ...
2017-06-30     entity 1  X    471          2017-05-31         2017-03-31         ...
2020-12-31     entity 2  A    135          2020-11-30         2020-09-30         ...
2020-11-30     entity 3  X    402          2020-10-31         2020-08-31         ...
What I want:
Report_Period  Entity    Tag  Users Count  Users_Count_M-1  Users_Count_Q-1  ...
2017-06-30     entity 1  X    471          450              438              ...
2020-12-31     entity 2  A    135          122              118              ...
2020-11-30     entity 3  X    402          380              380              ...
I have tried this code, but it duplicates records! How can I avoid it?
SELECT M."Entity",M."Tag",M."Report_Period",M."Users Count",
M."Report_Period_M-1",M1."Users Count" AS "Users Count M1",
FROM "DB"."SCHEMA"."PERIOD" M, "DB"."SCHEMA"."PERIOD" M1
WHERE M."Report_Period_M-1"= M1."Report_Period"

Your join clause should also include the entity and tag columns (I suspect):
SELECT M."Entity",
M."Tag",
M."Report_Period",
M."Users Count",
M."Report_Period_M-1",
M1."Users Count" AS "Users Count M1",
FROM "DB"."SCHEMA"."PERIOD" M,
"DB"."SCHEMA"."PERIOD" M1
WHERE M."Report_Period_M-1"= M1."Report_Period"
AND M."Entity" = M1."Entity"
AND M."Tag" = M1."Tag"

Related

left outer join query in Informix

I have two queries from which I expected to get the same result.
First one:
select d.serialno, d.location, d.ticket, d.vehicle, o.name, d.parrivaltime, e.etype
from deliveries d left outer join events e
on d.serialno=e.eserialno and d.client=e.eclient and d.carrier=e.ecarrier and d.location=e.elocation
left outer join operators o
on d.client=o.client AND d.driver=o.code AND d.carrier=o.carrier AND d.location=o.location
where d.statusmessage='CURRENT' and d.scheduleddate BETWEEN '08/02/2017' AND '08/03/2017' and d.dropno!=0 and d.location in ('MIAMID') AND (e.etype=107 or e.etype=108)
order by d.location, d.dropno, e.etype;
Second one:
select d.serialno, d.location, d.ticket, d.vehicle, o.name, d.parrivaltime, e.etype
from deliveries d, outer events e, outer operators o
where d.serialno=e.eserialno and d.client=e.eclient and d.carrier=e.ecarrier and d.location=e.elocation
AND d.client=o.client and d.driver=o.code and d.carrier=o.carrier and d.location=o.location
AND d.statusmessage='CURRENT' and d.scheduleddate BETWEEN '08/02/2017' AND '08/03/2017' and d.dropno!=0 and d.location in ('MIAMID') AND (e.etype=107 or e.etype=108)
order by d.location, d.dropno, e.etype;
However, the first query returned 1044 records, while the second returned 876.
I also checked the explain.out file, shown below, but I still cannot figure out why the outputs differ.
QUERY: (OPTIMIZATION TIMESTAMP: 09-06-2017 11:27:22)
------
select d.serialno, d.location, d.ticket, d.vehicle, o.name, d.parrivaltime, e.etype from deliveries d, outer events e, outer operators o where d.serialno=e.eserialno and d.client=e.eclient and d.carrier=e.ecarrier and d.location=e.elocation AND d.client=o.client AND d.driver=o.code AND d.carrier=o.carrier AND d.location=o.location AND d.statusmessage='CURRENT' and d.scheduleddate BETWEEN '08/02/2017' AND '08/03/2017' and d.dropno!=0 and d.location in ('MIAMID') AND (e.etype=107 or e.etype=108) order by d.location, d.dropno, e.etype
Estimated Cost: 304
Estimated # of Rows Returned: 71
Temporary Files Required For: Order By
1) jlong.d: INDEX PATH
Filters: (jlong.d.statusmessage = 'CURRENT' AND jlong.d.dropno != 0 )
(1) Index Name: informix.delivcapac1idx
Index Keys: scheduleddate location client customername (Serial, fragments: ALL)
Index Self Join Keys (scheduleddate )
Lower bound: jlong.d.scheduleddate >= 08/02/2017
Upper bound: jlong.d.scheduleddate <= 08/03/2017
Lower Index Filter: jlong.d.scheduleddate = jlong.d.scheduleddate AND jlong.d.location = 'MIAMID'
2) jlong.o: INDEX PATH
(1) Index Name: informix. 126_300
Index Keys: client carrier location code (Serial, fragments: ALL)
Lower Index Filter: (((jlong.d.driver = jlong.o.code AND jlong.d.location = jlong.o.location ) AND jlong.d.carrier = jlong.o.carrier ) AND jlong.d.client = jlong.o.client )
NESTED LOOP JOIN
3) jlong.e: INDEX PATH
Filters: ((((jlong.d.carrier = jlong.e.ecarrier AND (jlong.e.etype = 107 OR jlong.e.etype = 108 ) ) AND jlong.d.location = jlong.e.elocation ) AND jlong.d.client = jlong.e.eclient ) AND jlong.e.elocation = 'MIAMID' )
(1) Index Name: informix.ix154_17
Index Keys: eserialno (Serial, fragments: ALL)
Lower Index Filter: jlong.d.serialno = jlong.e.eserialno
NESTED LOOP JOIN
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 d
t2 o
t3 e
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t1 603 71 742 00:00.00 122
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t2 603 741 603 00:00.00 1
type rows_prod est_rows time est_cost
-------------------------------------------------
nljoin 603 71 00:00.00 158
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t3 876 1597 1729 00:00.00 2
type rows_prod est_rows time est_cost
-------------------------------------------------
nljoin 1044 71 00:00.01 291
type rows_sort est_rows rows_cons time est_cost
------------------------------------------------------------
sort 1044 71 1044 00:00.01 14
QUERY: (OPTIMIZATION TIMESTAMP: 09-06-2017 11:27:29)
------
select d.serialno, d.location, d.ticket, d.vehicle, o.name, d.parrivaltime, e.etype from deliveries d left outer join events e on d.serialno=e.eserialno and d.client=e.eclient and d.carrier=e.ecarrier and d.location=e.elocation left outer join operators o on d.client=o.client AND d.driver=o.code AND d.carrier=o.carrier AND d.location=o.location where d.statusmessage='CURRENT' and d.scheduleddate BETWEEN '08/02/2017' AND '08/03/2017' and d.dropno!=0 and d.location in ('MIAMID') AND (e.etype=107 or e.etype=108) order by d.location, d.dropno, e.etype
Estimated Cost: 254
Estimated # of Rows Returned: 1
Temporary Files Required For: Order By
1) jlong.d: INDEX PATH
Filters: (jlong.d.statusmessage = 'CURRENT' AND jlong.d.dropno != 0 )
(1) Index Name: informix.delivcapac1idx
Index Keys: scheduleddate location client customername (Serial, fragments: ALL)
Index Self Join Keys (scheduleddate )
Lower bound: jlong.d.scheduleddate >= 08/02/2017
Upper bound: jlong.d.scheduleddate <= 08/03/2017
Lower Index Filter: jlong.d.scheduleddate = jlong.d.scheduleddate AND jlong.d.location = 'MIAMID'
2) jlong.e: INDEX PATH
Filters: ((jlong.e.etype = 107 OR jlong.e.etype = 108 ) AND jlong.e.elocation = 'MIAMID' )
(1) Index Name: informix.ix154_17
Index Keys: eserialno (Serial, fragments: ALL)
Lower Index Filter: jlong.d.serialno = jlong.e.eserialno
ON-Filters:(((jlong.d.serialno = jlong.e.eserialno AND jlong.d.client = jlong.e.eclient ) AND jlong.d.carrier = jlong.e.ecarrier ) AND jlong.d.location = jlong.e.elocation )
NESTED LOOP JOIN
3) jlong.o: INDEX PATH
(1) Index Name: informix. 126_300
Index Keys: client carrier location code (Serial, fragments: ALL)
Lower Index Filter: (((jlong.d.client = jlong.o.client AND jlong.d.driver = jlong.o.code ) AND jlong.d.carrier = jlong.o.carrier ) AND jlong.d.location = jlong.o.location )
ON-Filters:(((jlong.d.client = jlong.o.client AND jlong.d.driver = jlong.o.code ) AND jlong.d.carrier = jlong.o.carrier ) AND jlong.d.location = jlong.o.location )
NESTED LOOP JOIN(LEFT OUTER JOIN)
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 d
t2 e
t3 o
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t1 603 71 742 00:00.00 122
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t2 1752 1597 1729 00:00.00 2
type rows_prod est_rows time est_cost
-------------------------------------------------
nljoin 876 1 00:00.00 254
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t3 1752 9862 876 00:00.00 1
type rows_prod est_rows time est_cost
-------------------------------------------------
nljoin 876 1 00:00.01 254
type rows_sort est_rows rows_cons time est_cost
------------------------------------------------------------
sort 876 1 876 00:00.01 0
Can anybody help analyze the explain file and give the reason for the different outputs?
Thanks

aggregation and self join

Hi, I am working on this query and wondering whether there is a good way to achieve it. TableA is:
zone_no price produceDate
54 12.33 20161201
58 7.88 20161224
64 28.27 20160812
67 20.45 20160405
87 14.08 20161102
92 1.69 20160101
101 12.57 20140501
141 22.21 20150601
157 14.28 20160417
select max(price) from tableA where zone_no between 54 and 145
select max(price) from tableA where Zone_no between 92 and 141
outcome:
price(zone 54-145)  price(zone 92-141)
28.27               22.21
How can I achieve this without a CTE? Thanks.
Alternative solution:
select sum(een), sum(twee) from (
    select max(price) as een, 0 as twee from tableA where zone_no between 54 and 145
    union
    select 0, max(price) from tableA where zone_no between 92 and 141
) x
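Another option, assuming the database supports CASE expressions inside aggregates (standard SQL), is to compute both ranges in a single pass over tableA, with no derived table:

-- each CASE feeds MAX() a price only for rows inside its zone range;
-- rows outside the range produce NULL, which MAX() ignores
select max(case when zone_no between 54 and 145 then price end) as price_zone_54_145,
       max(case when zone_no between 92 and 141 then price end) as price_zone_92_141
from tableA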

Dask: ValueError: Integer column has NA values

I tried to use dask and found something that appears to be a bug in dask.dataframe.read_csv.
import os
import dask.dataframe as dd

# datadir points at the directory holding test.csv (defined elsewhere)
types = {'id': 'int16', 'Semana': 'uint8', 'Agencia_ID': 'uint16', 'Canal_ID': 'uint8',
         'Ruta_SAK': 'uint16', 'Cliente_ID': 'float32', 'Producto_ID': 'float32'}
name_map = {'Semana': 'week', 'Agencia_ID': 'agency', 'Canal_ID': 'channel',
            'Ruta_SAK': 'route', 'Cliente_ID': 'client', 'Producto_ID': 'prod'}
test = dd.read_csv(os.path.join(datadir, 'test.csv'), usecols=types.keys(), dtype=types)
test = test.rename(columns=name_map)
gives:
ValueError: Integer column has NA values in column 1
However, the same pandas read_csv operation completes fine and does not yield any NA:
import os
import pandas as pd

types = {'id': 'int16', 'Semana': 'uint8', 'Agencia_ID': 'uint16', 'Canal_ID': 'uint8',
         'Ruta_SAK': 'uint16', 'Cliente_ID': 'float32', 'Producto_ID': 'float32'}
name_map = {'Semana': 'week', 'Agencia_ID': 'agency', 'Canal_ID': 'channel',
            'Ruta_SAK': 'route', 'Cliente_ID': 'client', 'Producto_ID': 'prod'}
test = pd.read_csv(os.path.join(datadir, 'test.csv'), usecols=types.keys(), dtype=types)
test = test.rename(columns=name_map)
test.isnull().any()
id False
week False
agency False
channel False
route False
client False
prod False
dtype: bool
Should I consider this to be an established bug and raise a JIRA for it?
Full traceback:
ValueError Traceback (most recent call last)
in ()
4 'Ruta_SAK': 'route', 'Cliente_ID': 'client', 'Producto_ID': 'prod'}
5
----> 6 test = dd.read_csv(os.path.join(datadir, 'test.csv'), usecols=types.keys(), dtype=types)
7 test = test.rename(columns=name_map)
D:\PROGLANG\Anaconda2\lib\site-packages\dask\dataframe\csv.pyc in read_csv(filename, blocksize, chunkbytes, collection, lineterminator, compression, sample, enforce, storage_options, **kwargs)
195 else:
196 header = sample.split(b_lineterminator)[0] + b_lineterminator
--> 197 head = pd.read_csv(BytesIO(sample), **kwargs)
198
199 df = read_csv_from_bytes(values, header, head, kwargs,
D:\PROGLANG\Anaconda2\lib\site-packages\pandas\io\parsers.pyc in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skip_footer, doublequote, delim_whitespace, as_recarray, compact_ints, use_unsigned, low_memory, buffer_lines, memory_map, float_precision)
560 skip_blank_lines=skip_blank_lines)
561
--> 562 return _read(filepath_or_buffer, kwds)
563
564 parser_f.name = name
D:\PROGLANG\Anaconda2\lib\site-packages\pandas\io\parsers.pyc in _read(filepath_or_buffer, kwds)
323 return parser
324
--> 325 return parser.read()
326
327 _parser_defaults = {
D:\PROGLANG\Anaconda2\lib\site-packages\pandas\io\parsers.pyc in read(self, nrows)
813 raise ValueError('skip_footer not supported for iteration')
814
--> 815 ret = self._engine.read(nrows)
816
817 if self.options.get('as_recarray'):
D:\PROGLANG\Anaconda2\lib\site-packages\pandas\io\parsers.pyc in read(self, nrows)
1312 def read(self, nrows=None):
1313 try:
-> 1314 data = self._reader.read(nrows)
1315 except StopIteration:
1316 if self._first_chunk:
pandas\parser.pyx in pandas.parser.TextReader.read (pandas\parser.c:8748)()
pandas\parser.pyx in pandas.parser.TextReader._read_low_memory (pandas\parser.c:9003)()
pandas\parser.pyx in pandas.parser.TextReader._read_rows (pandas\parser.c:10022)()
pandas\parser.pyx in pandas.parser.TextReader._convert_column_data (pandas\parser.c:11397)()
pandas\parser.pyx in pandas.parser.TextReader._convert_tokens (pandas\parser.c:12093)()
pandas\parser.pyx in pandas.parser.TextReader._convert_with_dtype (pandas\parser.c:13057)()
ValueError: Integer column has NA values in column 1

Rails - exchange attribute values

I have a table called decoders_contracts with these attributes:
id | decoder_id | contract_id
1  | 3          | 31
2  | 3          | 31
3  | 1          | 31
4  | 1          | 31
...
I need to exchange the decoder_id values this way:
id | decoder_id | contract_id
1  | 1          | 31
2  | 1          | 31
3  | 3          | 31
4  | 3          | 31
...
I tried something like this, but it doesn't work:
contract_id = params[:contract_id] # 31
dc1 = params[:dc1] # 1
dc2 = params[:dc2] # 3
DecodersContract.where(contract_id: contract_id, decoder_id: dc1).update_all(decoder_id: dc2)
DecodersContract.where(contract_id: contract_id, decoder_id: dc2).update_all(decoder_id: dc1)
All the decoder_id values became 1.
Yes, you can't do it that way. After the first update_all, all the matching DecodersContract rows have the same decoder_id, so the second update_all just sets them all back to dc1.
Better would be to go through an intermediate value, preferably one that can't occur naturally:
DecodersContract.where(contract_id: contract_id, decoder_id: dc1).update_all(decoder_id: 999)
DecodersContract.where(contract_id: contract_id, decoder_id: dc2).update_all(decoder_id: dc1)
DecodersContract.where(contract_id: contract_id, decoder_id: 999).update_all(decoder_id: dc2)
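If you would rather avoid the placeholder value, the swap can also be done in a single statement with a CASE expression. A sketch in raw SQL, with the literals 1, 3 and 31 standing in for dc1, dc2 and contract_id:

-- the CASE is evaluated for every matching row before anything is written,
-- so both groups are swapped in one pass and no temporary value is needed
UPDATE decoders_contracts
SET decoder_id = CASE decoder_id WHEN 1 THEN 3 WHEN 3 THEN 1 END
WHERE contract_id = 31
  AND decoder_id IN (1, 3);

In Rails the same statement can be issued through the scoped relation with update_all and the CASE expression as a raw SQL string, since update_all accepts SQL fragments.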

Entity framework joins

I'm using Entity Framework 4.0 and I'm having some issues with the syntax of my query. I'm trying to join 2 tables and pass in a parameter to filter on at the same time. I would like to find all of the products in table 2 by finding the correlating value in table 1.
Can someone help me out with syntax please?
Thanks in advance.
sample data
table 1
ID productID categoryID
361 571 16
362 572 17
363 573 16
364 574 19
365 575 26
table 2
productID productCode
571 sku
572 sku
573 sku
574 sku
575 sku
var q = from i in context.table1
from it in context.table2
join <not sure>
where i.categoryID == it.categoryID and < parameter >
select e).Skip(value).Take(value));
foreach (var g in q)
{
Response.Write(g.productID);
}
var q = (from i in context.table1
         join it in context.table2 on i.productID equals it.productID
         where it.productCode == xyz   // xyz is the parameter; table2 has no categoryID in the sample data
         select i).Skip(value).Take(value);
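For reference, a rough SQL equivalent of that LINQ query, using the table and column names from the sample data (the 'sku' literal is just a placeholder for the parameter):

SELECT t1.*
FROM table1 t1
JOIN table2 t2 ON t2.productID = t1.productID   -- correlate the two tables on productID
WHERE t2.productCode = 'sku';                   -- the parameter (xyz in the LINQ above)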

Resources