I have a form with a TPageControl that has two tabs. In each of them there is a TVirtualStringTree and I have defined these two structures:
typedef struct tagTTreeMun
{
    AnsiString Municipio;
    int Padron;
    int Censo;
    double Relacion;
    int Codigo;
} TTreeMun, *PTreeMun;

typedef struct tagTTreePro
{
    AnsiString Proceso;
    int Padron;
    int Censo;
    double Relacion;
    int Codigo;
} TTreePro, *PTreePro;
I know they are almost the same; I explain why below. The first one is loaded from four nested queries without any problem, but the second one... there's no way!
To load the second one I need two queries:
SELECT DISTINCT Date FROM Elections ORDER BY Date DESC
The Date field contains only the year, and this query runs without any problem.
SELECT A.Codigo, B.Name, SUM (C.Padron) Padron, SUM (C.Censo) Census, A.Closed
FROM Elections A, Process B, HisElec C
WHERE A.CodPrv = (SELECT Literal FROM Installation WHERE Label = 'Province')
AND A.CodPrv = B.CodPrv AND B.Codigo = A.Process AND A.Closed = 1
AND A.CodPrv = C.CodPrv AND A.Codigo = C.Election
AND A.Date = :Date
GROUP BY 1, 2, 5
UNION
SELECT A.Codigo, B.Name, SUM (C.Padron) Padron,
(SELECT SUM (Census) FROM Tables WHERE CodPrv = (SELECT Literal FROM Installation WHERE Label = 'Province')) Census,
A.Closed
FROM Elections A, Process B, Dl01 C
WHERE A.CodPrv = (SELECT Literal FROM Installation WHERE Label = 'Province')
AND A.CodPrv = B.CodPrv AND A.Process = B.Code AND A.Closed = 0
AND A.Date = :Date
GROUP BY 1, 2, 5
ORDER BY 1 DESC, 3
It also runs without problems or errors. The problem comes when trying to pass that data to the corresponding TVirtualStringTree.
PTreePro DatPro;
PVirtualNode Node1, Node2, Node3, Node4;
LisPro->NodeDataSize = sizeof(TTreePro);
LisPro->BeginUpdate();
LisPro->Clear();
for (; !qTemp1->Eof; qTemp1->Next())
{
    Node1 = LisPro->AddChild(NULL);
    DatPro = (PTreePro)LisPro->GetNodeData(Node1);
    DatPro->Proceso = IntToStr(qTemp1->FieldByName("Date")->AsInteger);
    qTemp2->Close();
    qTemp2->ParamByName("Date")->AsInteger = qTemp1->FieldByName("Date")->AsInteger;
    qTemp2->Open();
    for (; !qTemp1->Eof; qTemp1->Next())
    {
        Node2 = LisPro->AddChild(Node1);
        DatPro = (PTreePro)LisPro->GetNodeData(Node2);
        DatPro->Proceso = qTemp1->FieldByName("Name")
        [...]
    }
}
When this code creates Node1, the lines Node1 = ... and DatPro = (PTreePro) ... execute without complaint, except that Node1 is NULL after the AddChild call. From then on it can only fail: assigning a value to Proceso immediately raises a runtime error.
I have tried loading each tree in a separate function to isolate the code; I have tried using a single structure (in the end they are identical) as well as two structures, as in the example; and I have tried changing the order of execution. No matter how many tests I run, I cannot load both trees: LisPro ALWAYS fails in the same way.
Informix 12.10
tblItems
(
    Type        SMALLINT, {Precious Metal = 1, Other = 2}
    Description VARCHAR,
    Quantity    SMALLINT,
    Name        VARCHAR,
    Weight      DECIMAL(5,1),
    Purity      SMALLINT,
    Brand       VARCHAR,
    Model       VARCHAR,
    SerialNum   VARCHAR
);
EDIT UPDATE: The sample data below is stored in tblItems.Type and tblItems.Description. Please note that the contents of the Description column are all uppercase and may also include punctuation characters.
2|1LAPTOP APPLE 15.5" MODEL MACKBOOK PRO,S/N W80461WCAGX, WITH CHARGER||||||||
1|1RING 2.3PW 14K||||||||
2|DRILL RIOBY, MODEL D5521 S/N77720||||||||
2|TRIMMER TORO, MODEL 0242 S/N 66759||||||||
2|CELL SAMSUNG NOTE3, MODEL SM-N900T S/N RV8F90YLZ9W||||||||
I need to parse the sample item descriptions into the columns below, using the rules given in the comments:
Quantity, {if description string does not start with a number, then Quantity = 1}
Name, {Always the first element if description has no quantity, second element if quantity present}
Weight, {Always before "PW" if Type = 1, Default to zero if Type = 2}
Purity, {Always before "K" if Type = 1, Default to NULL if Type = 2}
Brand, {Always the second element in description, if present}
Model, {Always after "MODEL", with or without a space}
Serial Number {Always after "S/N", with or without a space}
I would like to do this with an UPDATE statement, but if Informix has an import utility tool like SQL-Server's SSIS, then that could be a better option.
UPDATE, Expected Results:
Quantity   1             1     1      1        1
Name       LAPTOP        RING  DRILL  TRIMMER  CELL
Weight     0.0           2.3   0.0    0.0      0.0
Purity                   14
Brand      APPLE               RIOBY  TORO     SAMSUNG
Model      MACKBOOK PRO        D5521  0242     SM-N900T
SerialNum  W8046WCAGX          77720  66759    RV8F90YLZ9W
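As a rough illustration of the rules listed above, here is a minimal Python sketch (outside Informix) that applies them to one description. The function name parse_description, the dictionary keys, and the exact regular expressions are assumptions made for this example. In particular, it assumes that Brand only applies to Type = 2 rows and that a model name ends at a comma or at "S/N", which is what the expected results suggest.

```python
import re


def parse_description(item_type, description):
    """Split one tblItems.Description value according to the stated rules.

    Hypothetical helper: names and patterns are illustrative only.
    """
    # Leading digits, if any, are the quantity; otherwise it defaults to 1.
    lead = re.match(r"\s*(\d*)\s*(.*)", description, re.S)
    quantity = int(lead.group(1)) if lead.group(1) else 1
    rest = lead.group(2)
    words = rest.split()

    # Name is the first element once the quantity has been removed.
    name = words[0].strip(",") if words else None

    # Assumption: Brand (the second element) is only meaningful for Type = 2.
    brand = None
    if item_type == 2 and len(words) > 1:
        brand = words[1].strip(",")

    # Weight precedes "PW" and purity precedes "K", for Type = 1 only.
    weight, purity = 0.0, None
    if item_type == 1:
        w = re.search(r"(\d+(?:\.\d+)?)\s*PW", rest)
        k = re.search(r"(\d+)\s*K\b", rest)
        weight = float(w.group(1)) if w else 0.0
        purity = int(k.group(1)) if k else None

    # Assumption: the model runs up to the next comma, "S/N", or end of text.
    m = re.search(r"MODEL\s*(.*?)\s*(?:,|S/N|$)", rest)
    # The serial number follows "S/N", with or without a space.
    s = re.search(r"S/N\s*([A-Z0-9]+)", rest)

    return {
        "quantity": quantity,
        "name": name,
        "weight": weight,
        "purity": purity,
        "brand": brand,
        "model": m.group(1) if m else None,
        "serialnum": s.group(1) if s else None,
    }
```

Calling parse_description(2, "DRILL RIOBY, MODEL D5521 S/N77720") gives name DRILL, brand RIOBY, model D5521 and serial number 77720. Descriptions that omit the brand, or put another token in second position, would need extra handling that this sketch does not attempt.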
Assuming you are using Informix 12.10.XC8 or above, you can try using regular expressions to parse the description string (see the Informix online documentation).
For the serial number, for example, you can do:
UPDATE tblitems
SET
serialnum =
DECODE
(
regex_match(description, '(.*)(S\/N)(.*)', 3)
, 't'::BOOLEAN, regex_replace(description, '(.*)(S\/N)([[:blank:]]?)([[:alnum:]]*)(.*)', '\4', 0, 3)
, 'f'::BOOLEAN, ''
)
So in the previous example I am testing if the description contains the S/N string and if that is true I use regex_replace to return the value after it, in this case the 4th matching group in the regular expression (I am not using regex_extract to get the value because it seems to return multiple values and I get error -686).
You can extend this approach to the rest of the columns and see if regular expressions are enough to parse the description column.
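To see what the pattern above captures without touching the database, the same test-then-replace logic can be sketched in Python. This is only an approximation: Python's re module has no POSIX classes, so [[:blank:]] and [[:alnum:]] are written out as [ \t] and [A-Za-z0-9], and the helper name serial_from is made up for this example.

```python
import re

# Same five groups as the Informix pattern; group 4 is the serial number.
SN_PATTERN = re.compile(r"(.*)(S/N)([ \t]?)([A-Za-z0-9]*)(.*)")


def serial_from(description):
    """Return the text after S/N, or '' when the description has no S/N."""
    # Mirrors the regex_match test: without S/N the DECODE falls back to ''.
    if "S/N" not in description:
        return ""
    # Mirrors regex_replace: the whole string is replaced by group 4.
    return SN_PATTERN.sub(r"\4", description, count=1)
```

For instance, serial_from("DRILL RIOBY, MODEL D5521 S/N77720") returns "77720", while a description without S/N returns an empty string. The optional blank in group 3 is what lets both "S/N77720" and "S/N 66759" work.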
If you're looking for a SQL Server option and are open to a Split/Parse function that maintains the sequence, here is one approach.
Example
Select A.Type
,A.Description
,C.*
From YourTable A
Cross Apply (values ( replace(
replace(
replace(
replace(A.Description,',',' ')
,'  ',' ')
,'Model ','Model')
,'S/N ','S/N')
)
)B(CleanString)
Cross Apply (
Select Quantity = IsNull(left(max(case when RetSeq=1 then RetVal end),NullIf(patindex('%[^0-9]%',max(case when RetSeq=1 then RetVal end)) -1,0)),1)
,Name = substring(max(case when RetSeq=1 then RetVal end),patindex('%[^0-9]%',max(case when RetSeq=1 then RetVal end)),charindex(' ',max(case when RetSeq=1 then RetVal end)+' ')-1)
,Weight = IIF(A.Type=2,null,try_convert(decimal(5,1),replace(max(case when RetVal like '%PW' then RetVal end),'PW','')))
,Purity = try_convert(smallint ,replace(max(case when RetVal like '%K' then RetVal end),'K',''))
,Brand = IIF(A.Type=1,null,max(case when RetSeq=2 then RetVal end))
,Model = replace(max(case when RetVal Like 'Model[0-9,A-Z]%' then RetVal end),'Model','')
,SerialNum = replace(max(case when RetVal Like 'S/N[0-9,A-Z]%' then RetVal end),'S/N','')
From [dbo].[tvf-Str-Parse](CleanString,' ') B1
) C
The TVF if Interested
CREATE FUNCTION [dbo].[tvf-Str-Parse] (@String varchar(max),@Delimiter varchar(10))
Returns Table
As
Return (
    Select RetSeq = Row_Number() over (Order By (Select null))
          ,RetVal = LTrim(RTrim(B.i.value('(./text())[1]', 'varchar(max)')))
    From  (Select x = Cast('<x>' + replace((Select replace(@String,@Delimiter,'§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml).query('.')) as A
    Cross Apply x.nodes('x') AS B(i)
);
EDIT - If you don't want or can't use a TVF
Select A.Type
,A.Description
,C.*
From YourTable A
Cross Apply (values ( replace(
replace(
replace(
replace(A.Description,',',' ')
,'  ',' ')
,'Model ','Model')
,'S/N ','S/N')
)
)B(CleanString)
Cross Apply (
Select Quantity = IsNull(left(max(case when RetSeq=1 then RetVal end),NullIf(patindex('%[^0-9]%',max(case when RetSeq=1 then RetVal end)) -1,0)),1)
,Name = substring(max(case when RetSeq=1 then RetVal end),patindex('%[^0-9]%',max(case when RetSeq=1 then RetVal end)),charindex(' ',max(case when RetSeq=1 then RetVal end)+' ')-1)
,Weight = IIF(A.Type=2,null,try_convert(decimal(5,1),replace(max(case when RetVal like '%PW' then RetVal end),'PW','')))
,Purity = try_convert(smallint ,replace(max(case when RetVal like '%K' then RetVal end),'K',''))
,Brand = IIF(A.Type=1,null,max(case when RetSeq=2 then RetVal end))
,Model = replace(max(case when RetVal Like 'Model[0-9,A-Z]%' then RetVal end),'Model','')
,SerialNum = replace(max(case when RetVal Like 'S/N[0-9,A-Z]%' then RetVal end),'S/N','')
From (
Select RetSeq = row_number() over (Order By (Select null))
,RetVal = ltrim(rtrim(B.i.value('(./text())[1]', 'varchar(max)')))
From (Select x = Cast('<x>' + replace((Select replace(CleanString,' ','§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml).query('.')) as A
Cross Apply x.nodes('x') AS B(i)
) B1
) C
I have a table with n columns that I'll call A. In this table there are three columns that I need:
vat -> String
tax -> String
card -> String
vat or tax can be null, but not both at the same time.
For every unique pair of vat and tax there is at least one card.
I need to alter this table, adding a column card_count that holds a text label based on the number of cards each unique combination of tax and vat has.
So I've done this:
val cardCount = A.groupBy("tax", "vat").count
val sqlCard = udf((count: Int) => {
if (count > 1)
"MULTI"
else
"MONO"
})
val B = cardCount.withColumn(
"card_count",
sqlCard(cardCount.col("count"))
).drop("count")
Table B now has three columns:
vat -> String
tax -> String
card_count -> String
and every operation on this DataFrame runs smoothly.
Now, because I wanted to bring the new column into table A, I performed the following join:
val result = A.join(B,
B.col("tax")<=>A.col("tax") and
B.col("vat")<=>A.col("vat")
).drop(B.col("tax"))
.drop(B.col("vat"))
I expected to get the original table A with the card_count column added.
The problem is that the join hangs, consuming all system resources and freezing the PC.
Additional details:
Table A has ~1.5M rows and is read from a Parquet file;
Table B has ~1.3M rows.
The system has 8 threads and 30 GB of RAM.
Let me know what I'm doing wrong.
In the end, I didn't find out what the issue was, so I changed approach:
val cardCount = A.groupBy("tax", "vat").count
val cardCountSet = cardCount.filter(cardCount.col("count") > 1)
.rdd.map(r => r(0) + " " + r(1)).collect().toSet
val udfCardCount = udf((tax: String, vat:String) => {
if (cardCountSet.contains(tax + " " + vat))
"MULTI"
else
"MONO"
})
val result = A.withColumn("card_count",
udfCardCount(A.col("tax"), A.col("vat")))
If someone knows a better approach, let me know.
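One detail of the workaround above worth checking is the lookup key: concatenating tax + " " + vat can map different pairs to the same string, and it turns a null into the text "null". The following plain-Python sketch, which is not Spark code and uses a hypothetical helper name label_card_counts, shows the same count-then-label logic with tuple keys instead.

```python
from collections import Counter


def label_card_counts(rows):
    """Add a card_count label to each row (a dict with tax, vat and card).

    Rows are counted per (tax, vat) pair. None is a valid tuple member,
    so a missing tax or vat needs no special handling.
    """
    counts = Counter((row["tax"], row["vat"]) for row in rows)
    labelled = []
    for row in rows:
        label = "MULTI" if counts[(row["tax"], row["vat"])] > 1 else "MONO"
        labelled.append({**row, "card_count": label})
    return labelled
```

With string keys, the pairs ("a b", "c") and ("a", "b c") would both become "a b c" and be treated as the same combination; as tuples they stay distinct.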
Input file with two columns:
Visit ProductString
101 ;Cross Trainers;1;69.95,;Athletic Socks;10;29.99
102 ;Amplifier;1;120.90,;Headphone;2;59.99;leather wallet;1;99.99;
I am looking for a Pig script that can parse the "ProductString" value in each row and compute the cumulative revenue.
i.e., output:
69.95+29.99+120.90+59.99+99.99=380.82
I'll assume there should be a , after 59.99 and that there shouldn't be a ; after 99.99. If so, you need to tokenize and flatten on the , to extract products and then split on the ; to get item prices and qty.
Query:
data = LOAD 'db.table';
A = FOREACH data GENERATE visit, FLATTEN(TOKENIZE(product_string, ',')) AS tmp_col;
B = FOREACH A GENERATE visit, STRSPLIT(tmp_col, ';') AS prod;
C = FOREACH B GENERATE visit, prod.$1 AS item:chararray
, (int)prod.$2 AS qty:int, (double)prod.$3 AS revenue:double;
grpd = GROUP C all;
D = FOREACH grpd GENERATE SUM(C.revenue);
DUMP D;
Output:
(380.82)
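The tokenize-then-split steps can be cross-checked outside Pig. The sketch below is plain Python, not Pig; the function name total_revenue is hypothetical, and the input uses the corrected product strings assumed in the answer (a comma after 59.99, no trailing semicolon). Like the Pig script, it sums the price field only and does not multiply by quantity.

```python
def total_revenue(product_strings):
    """Sum the price field of every product in every ProductString."""
    total = 0.0
    for product_string in product_strings:
        # Split on "," to get one entry per product (TOKENIZE + FLATTEN).
        for product in product_string.split(","):
            # Split on ";": fields are '', name, quantity, price (STRSPLIT).
            fields = product.split(";")
            total += float(fields[3])
    return round(total, 2)


rows = [
    ";Cross Trainers;1;69.95,;Athletic Socks;10;29.99",
    ";Amplifier;1;120.90,;Headphone;2;59.99,;leather wallet;1;99.99",
]
result = total_revenue(rows)
```

Each product string starts with a semicolon, so the first field is empty and the price sits at index 3, matching prod.$3 in the Pig script.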
Using Pig 0.12.
I want to use a FOREACH with a nested block in which the result produced by GENERATE depends on some value. I know that I can use FLATTEN with an expression inside it, e.g.:
grunt> d = foreach j2 {
ord = order j1 by A::a1 desc;
l = limit ord 1;
generate flatten((IsEmpty(l)?'f':'g')); };
This works.
But what I want to do is:
grunt> d = foreach j2 {
ord = order j1 by A::a1 desc;
l = limit ord 1;
generate flatten((l.$0=='2') ?'f':'g'); };
2014-05-07 22:28:08,750 [main] ERROR org.apache.pig.tools.grunt.Grunt
- ERROR 1200: mismatched input '?' expecting RIGHT_PAREN
Why does it say this?
I tried:
grunt> d = foreach j2 {
ord = order j1 by A::a1 desc;
l = limit ord 1;
generate flatten(((l.$0=='2') ?'f':'g')); };
2014-05-07 22:40:44,895 [main] ERROR
org.apache.pig.tools.grunt.Grunt - ERROR 1039:
(Name: Equal Type: null Uid: null)incompatible types in Equal Operator left hand side:bag :tuple(A::a1:chararray) right hand side:chararray
And I don't know how to resolve this.
I just want to generate my results in a test case, but first need to flatten the test condition.
Help?
thanks,
Matt