I have a ClientDataSet with the following data:

IDX  EVENT  BRANCH_ID  BRANCH
1    E1     7          B7
2    E2     5          B5
3    E3     7          B7
4    E4     1          B1
5    E5     2          B2
6    E6     7          B7
7    E7     1          B1
I need to transform this data into:

IDX  EVENT  BRANCH_ID  BRANCH
1    E1     7          B7
2    E2     5          B5
4    E4     1          B1
5    E5     2          B2
The only fields of importance are BRANCH_ID and BRANCH, and BRANCH_ID must be unique.
As there is a lot of data, I do not want to have copies of it.
QUESTION:
Can you suggest a way to transform the data using a cloned version of the original data?
Cloning won't allow you to change data in the clone without having the same change reflected in the original, so if that's what you want, you might rethink the cloning idea.
Cloning does give you a separate cursor into the clone and allows you to filter and index (i.e. order) it independently of the master clientdataset. From the data you've provided it looks like you want to filter some branch data and order by branch_id. You can accomplish that by setting up a new filter and index on the clone. Here's a good article that includes examples of how to do that:
http://edn.embarcadero.com/article/29416
Taking a second look at your question, it seems all you'd need to do is set up a unique index on branch_id on the cloned dataset. The linked article above has info on how to set up an index; check the docs on the ClientDataSet.AddIndex function for more details and for info on making the index show only unique values; if I recall correctly, it may just mean you set branch_id as the primary key.
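For illustration, here's a minimal sketch of that clone-plus-index idea (assuming the original dataset is called cdsOriginal; note that TClientDataSet.AddIndex only honors ixCaseInsensitive and ixDescending, so this orders the clone by BRANCH_ID but does not by itself suppress duplicates):
var
  cdsClone: TClientDataSet;
begin
  cdsClone := TClientDataSet.Create(nil);
  cdsClone.CloneCursor(cdsOriginal, False); // separate cursor over the same data
  cdsClone.AddIndex('idxBranch', 'BRANCH_ID', []); // independent ordering on the clone
  cdsClone.IndexName := 'idxBranch';
end;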
I can't think of a slick way to do this, but you could index on BRANCH_ID, add an fkInternalCalc boolean field to your dataset, then initialize that field to True on the first row of each branch (using group state or manually) and then filter the clone on the value of the field. You'd have to update the field on data changes though.
I have a feeling that a better solution would be to have a master dataset with a row for each branch.
You don't provide many details about your use case so I'll try to give you some hints:
"A lot of data" suggests that you might have it from a SQL backend. Using a 'SELECT DISTINCT...' or 'SELECT ... GROUP BY BRANCH_ID' (or similar syntax depending on what SQL backend do you) will give the desired result with ease and speed. Please confirm and I'll give you more details.
As the others said, a simple 'clone' won't work. The simplest (and perhaps quickest) solution, assuming that the branches are usually few in number relative to the data, is to keep an index outside of your dataset. If you really want to filter your original data, then add a status field (e.g. boolean) to your data and put a flag (e.g. True) on the first occurrence.
Pseudocode (let's assume that:
your ClientDataSet is cds1,
cds1 has a status field cds1Status (boolean) - this is optional, needed only if you want to sort/filter/search cds1,
you have lIndex, which is a TStringList):
var
  cVal: string;
  nDummy: Integer;
begin
  lIndex.Clear;
  lIndex.Sorted := True;
  cds1.DisableControls;
  try
    cds1.First;
    while not cds1.Eof do // scan the dataset
    begin
      cVal := cds1Branch_ID.AsString;
      cds1.Edit; // we update the Status field either way
      if lIndex.Find(cVal, nDummy) then // nDummy is required by Find but unused
        cds1Status.AsBoolean := False // already in the index: not the 1st occurrence
      else
      begin // not found - well, let's add it...
        lIndex.Add(cVal); // update the index
        cds1Status.AsBoolean := True; // mark the first occurrence
      end;
      cds1.Post; // save the change to the Status field
      cds1.Next;
    end; // scan
  finally
    cds1.EnableControls; // housekeeping
  end;
end;
//WARNING! - NOT tested. I wrote it from my head but I think that you got the idea...
...Depending on what you're trying to accomplish (which would be the best thing to share with us) and what selectivity you have on BRANCH_ID, perhaps the Status engine isn't needed at all. If you have a very low selectivity on that field (selectivity = no. of unique values / no. of records), perhaps it's much faster to have a new dataset and copy only the unique values there, rather than putting each record of the original cds through Edit + Post. (Changing dataset states is a costly operation, especially if your cds is linked to a remote data storage, i.e. a server.)
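A rough sketch of that copy approach (untested; cdsBranches and the persistent fields cds1Branch_ID/cds1Branch are assumed names following the convention above):
var
  lSeen: TStringList;
  nDummy: Integer;
begin
  lSeen := TStringList.Create;
  try
    lSeen.Sorted := True;
    cds1.DisableControls;
    try
      cds1.First;
      while not cds1.Eof do
      begin
        if not lSeen.Find(cds1Branch_ID.AsString, nDummy) then
        begin
          lSeen.Add(cds1Branch_ID.AsString);
          // copy only the first occurrence of each BRANCH_ID
          cdsBranches.AppendRecord([cds1Branch_ID.Value, cds1Branch.Value]);
        end;
        cds1.Next; // no Edit/Post on the original: much cheaper
      end;
    finally
      cds1.EnableControls;
    end;
  finally
    lSeen.Free;
  end;
end;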
hth,
PS: My solution is intended to be mostly simple. Also, you can test with lIndex.Sorted := False and use lIndex.IndexOf instead of Find; in some (rare) cases that's better, depending on your data. If you want to complicate things and speed is really a concern, you can implement a full-blown B-tree index to do your searches (libraries are available). You can also use the index engine of the CDS, index BRANCH_ID and do many Locate calls on a clone, but because your selectivity is clearly < 1, scanning the cds's entire index should theoretically be slower than a scan on a unique index, especially if your custom-made index is tailored to your data type, structure, distribution etc.
just my2c
If I have a variable in SPSS with name (My_Variable), label (My Variable), values (1: Yes, 2: No) etc., but without data (the column in Data View is empty), I want to add data using syntax. For example, I want to add a participant in the 1st row who answered "Yes", so I want 1 to be added. How can I do it?
I found similar questions, but the solutions refer to creating A NEW SPSS window and adding the values there. I don't want this! I want to add data to an existing variable, without creating a new SPSS file.
Apparently there is no way to directly add cases to an SPSS dataset through syntax.
But the following seems to me pretty close - you don't create new files but you create a new dataset and add it to your original.
Let's first create a small dataset to demonstrate on:
Data list list/ID (a5) var1 var2 var3 (3f2).
begin data
"first" 1 17 7
"secnd" 5 5 12
"third" 34 11 91
end data.
dataset name originalDataset.
So this is your original data. Now imagine that you want to add a new case to the data, with the ID value "hello" and the number 42 in all the columns. This is what you do:
* creating the new case in a separate dataset.
Data list list/ID (a5) var1 var2 var3 (3f2).
begin data
"hello" 42 42 42
end data.
dataset name addition.
* going back to original dataset and adding the new case.
dataset activate originalDataset.
add files /file=* /file=addition.
exe.
dataset close addition.
You don't have to create data in the first data set. Just create the variables and define them however you want.
DATASET CLOSE ALL.
INPUT PROGRAM.
NUMERIC My_Variable (F1).
VARIABLE LABELS My_Variable "I want this!".
VALUE LABELS My_Variable 1 "Yes" 2 "No".
END FILE.
END INPUT PROGRAM.
DATASET NAME Empty.
DATA LIST FREE /My_Variable.
BEGIN DATA.
1 2
END DATA.
APPLY DICTIONARY /FROM Empty
/SOURCE VARIABLES=My_Variable
/TARGET VARIABLES=My_Variable
/VARINFO VALLABELS=REPLACE VARLABEL.
DATASET CLOSE Empty.
FREQUENCIES VARIABLES ALL.
I used DATASET but you could have saved the empty file to disk.
See the APPLY DICTIONARY command for more details about how it works.
Using Python, you can add data with the cases.append() method:
begin program.
import spss
spss.StartDataStep()
dataset = spss.Dataset()
dataset.cases.append([1])
spss.EndDataStep()
end program.
Say you have 3 variables; you can assign a value to each through the list passed to the method:
begin program.
import spss
spss.StartDataStep()
dataset = spss.Dataset()
dataset.cases.append([1,2,3])
spss.EndDataStep()
end program.
This would add a case with value 1 in the first variable, value 2 in the second variable, and 3 in the third variable.
Note: the method will only work within an open datastep.
Check out the ADD FILES command. You can also add cases with Python code.
I have a table called "Scores" which has 4 columns, "first", "second", "third", and "average", for keeping records of users' scores.
When a user creates the record initially, he can leave the "average" column blank; he can edit all 3 scores later.
After editing, the user can see the computed average (or sum, or any calculation result) on his show page, since I have
def show
  @ave = (@score.first + @score.second + @score.third)/3
end
However, @ave is not in the database; how can I save @ave into the "average" column of my database?
Ideally it would be best if the computation took place before the update, so all 4 values are written to the database together. It probably has something to do with Active Record callbacks, but I don't know how to do that.
Second approach: I think I need a "trigger" in the database so that it computes and updates the "average" column as soon as the other 3 columns are updated. If this is how you would do it, please let me know, along with the advantages compared to solution number 1.
Last approach: since the user already sees the average on his show page, I don't have to write the computed average into the "average" column immediately. I think I can leave this to delayed_job or a background job. If this is how you would do it, please let me know how.
Thank you in advance! (Ruby 2.3, Rails 5.0.1, PostgreSQL 9.5)
Unless you really do need the average stored in the database for some reason, I would add an attribute to the Score model:
def average
  (first + second + third)/3.0
end
If one or more might not be present, I would:
def average
  actual_scores = [first, second, third].compact
  return nil if actual_scores.empty?
  actual_scores.sum / actual_scores.size.to_f
end
If you do need the average saved, then I would add a before_validation callback:
before_validation do
  self.average = (first + second + third)/3.0
end
Ideas 1 and 2 are perfectly valid approaches. Idea 3 is overkill and I would strongly recommend against that approach.
In idea 1, all you need to do (in any language) is simply look at each individual value put in (not including average) and generate the average value to be included in your insert statement. It's really as simple as that.
Idea 2 requires making a trigger as follows:
CREATE OR REPLACE FUNCTION update_average()
RETURNS trigger AS
$BODY$
BEGIN
  NEW.average := (NEW.first + NEW.second + NEW.third)/3;
  RETURN NEW;
END;
$BODY$
LANGUAGE plpgsql;
Then assign it to run on update or insert of your table:
CREATE TRIGGER update_average_trigger
BEFORE INSERT OR UPDATE
ON scores
FOR EACH ROW
EXECUTE PROCEDURE update_average();
In ‘Student Needs’!, Columns I through O contain information on when each student attends an intervention class. Intervention classes take place during the second half of classes (science or social studies) or during the second half of co-taught classes (math or ELA). Science and social studies interventions are done on either Monday/Wednesday or Tuesday/Friday (Thursdays have a special schedule that we do not need to consider). Math and ELA interventions occur on all four days.
In ‘Student Master’!, each student’s schedule is listed for both MW and TF. In Columns E, G, K, and M, I would like to populate any of the interventions that are listed in the ‘Student Needs’! sheet. For instance, Lindsey Lukowski has Social Skills on MW2 (Mondays and Wednesdays 2nd hour). So in cell ‘Student Master’! G31 should return “Social Skills”.
William Watters is getting Read Naturally and Reading Comp during his 5th Hour Co-Taught ELA. So ‘Student Master’! K51 and K52 should both return Read Naturally & Reading Comp (in the same cell).
Here is the workbook:
https://docs.google.com/spreadsheets/d/1aW7ExATzMn9Rf8IFLI4v-CQiqsXnxyDm8PxqMW999bY/edit?usp=sharing
Here is a complex formula that seems to do what you want. I have tested it in a copy of your sheet.
Just change E$2 for different columns.
=IFERROR(INDIRECT("'Student Needs'!"&CHAR(72+IFERROR(MATCH("HR "&E$2,ARRAYFORMULA(REGEXEXTRACT(INDIRECT("'Student Needs'!I"&REGEXEXTRACT($A3,"[0-9]+")+5&":N"&REGEXEXTRACT($A3,"[0-9]+")+5),"[A-Z ]+[0-9]")),0),MATCH($C3,ARRAYFORMULA(REGEXEXTRACT(INDIRECT("'Student Needs'!I"&REGEXEXTRACT($A3,"[0-9]+")+5&":N"&REGEXEXTRACT($A3,"[0-9]+")+5),"[A-Z]+")),0)))&5))
Also I am not sure where "6- Science" should go? Is this also HR 6?
In order for it to work with the actual data (looking up each student's row instead of deriving it from the ID), use:
=IFERROR(INDIRECT("'Student Needs'!"&CHAR(72+IFERROR(MATCH("HR "&E$2,ARRAYFORMULA(REGEXEXTRACT(INDIRECT("'Student Needs'!I"&ROW(VLOOKUP($A3,'Student Needs'!$A$1:$B,2,false))&":N"&ROW(VLOOKUP($A3,'Student Needs'!$A$1:$B,2,false))),"[A-Z ]+[0-9]")),0),MATCH($C3,ARRAYFORMULA(REGEXEXTRACT(INDIRECT("'Student Needs'!I"&ROW(VLOOKUP($A3,'Student Needs'!$A$1:$B,2,false))&":N"&ROW(VLOOKUP($A3,'Student Needs'!$A$1:$B,2,false))),"[A-Z]+")),0)))&5))
INFORMIX-SQL 7.3 Perform Screens:
According to the documentation, in an "after editadd editupdate of table" control block, the instructions are executed before the row is added or updated in the table, whereas in an "after add update of table" control block, the instructions are executed after the row has been added or updated. Supposedly, this would mean that any instructions which alter values of field-tags linked to table.columns would not be committed to the table, while field-tags linked to displayonly fields would change?
However, when using "after add update of table", I placed instructions which alter values of field-tags linked to table.columns, and their displayed and committed values also changed! I would have thought that an "after add update of table" would only alter displayonly fields.
TABLES
customer
transaction
branch
interest
dates
ATTRIBUTES
[...]
q = transaction.trx_type, INCLUDE=("E","C","V","P","T"), ...;
tb = transaction.trx_int_table,
LOOKUP f1 = ta_days1_f,
t1 = ta_days1_t,
i1 = ta_int1,
[...]
JOINING *interest.int_table, ...;
[...]
INSTRUCTIONS
customer MASTER OF transaction
transaction MASTER OF customer
delimiters ". ";
AFTER QUERY DISPLAY ADD UPDATE OF transaction
if z = "E" then let q = "E"
if z = "C" then let q = "C"
if z = "1" then let q = "E"
[...]
END
Is 'z' a column in the transaction table?
Is the trouble that the value in 'z' is causing a change in the value of 'q' (aka transaction.trx_type), and the modified value is being stored in the database?
Is the value in 'z' part of the transaction table?
Have you verified that the value in the DB is indeed changed - using the Query Language option or a simple (default) form?
It might look as if it is, because the instruction is also used AFTER DISPLAY; so, when the values are retrieved from the DB, the value displayed in 'q' would be the mapped value corresponding to the value stored in 'z'. You would have to inspect the raw data to get around that mapping.
If this is not the problem, please:
Amend the question to show where 'z' comes from.
Also describe exactly what you do and see.
Confirm that the data in the database, as opposed to on the screen, is amended.
Please can you see whether this table plus form behaves the same for you as it does for me?
Table Transaction
CREATE TABLE TRANSACTION
(
trx_id SERIAL NOT NULL,
trx_type CHAR(1) NOT NULL,
trx_last_type CHAR(1) NOT NULL,
trx_int_table INTEGER NOT NULL
);
Form
DATABASE stores
SCREEN SIZE 24 BY 80
{
trx_id [f000]
trx_type [q]
trx_last_type [z]
trx_int_table [f001 ]
}
END
TABLES
transaction
ATTRIBUTES
f000 = transaction.trx_id;
q = transaction.trx_type, UPSHIFT, AUTONEXT,
INCLUDE=("E","C","V","P","T");
z = transaction.trx_last_type, UPSHIFT, AUTONEXT,
INCLUDE=("E","C","V","P","T","1");
f001 = transaction.trx_int_table;
INSTRUCTIONS
AFTER ADD UPDATE DISPLAY QUERY OF transaction
IF z = "E" THEN LET q = "E"
IF z = "C" THEN LET q = "C"
IF z = "1" THEN LET q = "E"
END
Experiments
[The parenthesized number is automatically generated by IDS/Perform.]
Add a row with data (1), V, E, 23.
Observe that the display is: 1, E, E, 23.
Exit the form.
Observe that the data in the table is: 1, V, E, 23.
Reenter the form and query the data.
Update the data to: (1), T, T, 37.
Observe that the display is: 1, T, T, 37.
Exit the form.
Observe that the data in the table is: 1, T, T, 37.
Reenter the form and query the data.
Update the data to: (1), P, 1, 49
Observe that the display is: 1, E, 1, 49.
Exit the form.
Observe that the data in the table is: 1, P, 1, 49.
Reenter the form and query the data.
Observe that the display is: 1, E, 1, 49.
Choose 'Update', and observe that the display changes to: 1, P, 1, 49.
I did the 'Observe that the data in the table is' steps using:
sqlcmd -d stores -e 'select * from transaction'
This generated lines like these (reflecting different runs):
1|V|E|23
1|P|1|49
That is my SQLCMD program, not Microsoft's upstart of the same name. You can do more or less the same thing with DB-Access, except it is noisier (13 extraneous lines of output) and you would be best off writing the SELECT statement in a file and providing that as an argument:
$ echo "select * from transaction" > check.sql
$ dbaccess stores check
Database selected.
trx_id trx_type trx_last_type trx_int_table
1 P 1 49
1 row(s) retrieved.
Database closed.
$
Conclusions
This is what I observed on Solaris 10 (SPARC) using ISQL 7.50.FC1; it matches what the manual describes, and is also what I suggested in the original part of the answer might be the trouble - what you see on the form is not what is in the database (because of the INSTRUCTIONS section).
Do you see something different? If so, then there could be a bug in ISQL that has been fixed since. Technically, ISQL 7.30 is out of support, I believe. Can you upgrade to a more recent version than that? (I'm not sure whether 7.32 is still supported, but you should really upgrade to 7.50; the current release is 7.50.FC4.)
Transcribing commentary before deleting it:
Up to a point, it is good that you replicate my results. The bad news is that in the bigger form we have different behaviour. I hope that ISQL validates all limits - things like number of columns etc. However, there is a chance that they are not properly validated, given the bug, or maybe there is a separate problem that only shows with the larger form. So, you need to ensure you have a supported version of the product and that the problem reproduces in it. Ideally, you will have a smaller version of the table (or, at least, of the form) that shows the problem, and maybe a still smaller (but not quite as small as my example) version that shows the absence of the problem.
With the test case (table schema and Perform screen that shows the problem) in hand, you can then go to IBM Tech Support with "Look - this works correctly when the form is small; and look, it works incorrectly when the form is large". The bug should then be trackable. You will need to include instructions on how to reproduce the bug similar to those I gave you. And there is no problem with running two forms - one simple and one more complex and displaying the bug - in parallel to show how the data is stored vs displayed. You could describe the steps in terms of 'Form A' and 'Form B', with Form A being Absolutely OK and Form B being Believed to be Buggy. So, add a record with certain values in Form B; show what is displayed in Form B after; show what is stored in the database in Form A after too; show that they are not different when they should be.
Please bear in mind that those who will be fixing the issue have less experience with the product than either you or me - so keep it as simple as possible. Remove as many attributes as you can; leave comments to identify data types etc.
Delphi does not seem to like multi-field indexes.
How do I physically sort a table so that I wind up with a table that has the rows in the desired order?
Example:
mytable.dbf
Field  Field-Name  Field-Type  Size
0      Payer       Character   35
1      Payee       Character   35
2      PayDate     Date
3      Amount      Currency
I need to produce a table sorted alphabetically by "Payee"+"Payer"
When I tried using an index of "Payee+Payer", I got an error:
"Field Index out of range"
The index field names need to be separated by semicolons, not plus symbols. Try that and it should work.
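For instance (a quick sketch, assuming the dataset component is called mytable):
mytable.IndexFieldNames := 'Payee;Payer'; // semicolons, not '+'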
OK, let's try to put things in order.
First, it isn't advisable to physically sort a table; in fact, most RDBMSs don't even provide this feature. Usually, in order to avoid forcing a full table scan (sometimes called a natural scan), one creates indexes on the table fields on which one expects the table to be sorted or searched.
As you see, the first step in sorting a table is usually index creation. This is a separate step, done once, usually at, let's say, "design time". After this, the DB engine will take care of automatically updating the indexes.
Index creation is done by you (the developer), usually not with Delphi (or any other development tool) but with the admin tool of your RDBMS (the same tool you used when you created your table).
If your 'DB engine' is, in fact, a Delphi in-memory dataset (TClientDataSet), then go to the IndexDefs property, open it, add a new index and set its properties accordingly. The interesting property in our discussion is Fields: set it to Payee;Payer. Also set Name to e.g. "idxPayee". If you use another TDataSet descendant, consult the docs of your DB engine or ask another question here on SO.com, providing the details.
Now, to use the index (IOW, to sort the table, as you say): in your program (either at design time or at run time) set your Table's IndexName to "idxPayee" (or whatever valid name you gave), or set IndexFieldNames to Payee;Payer.
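As a run-time sketch of the same setup (assuming a TClientDataSet named cds1; AddIndex is the run-time counterpart of a design-time IndexDefs entry):
cds1.AddIndex('idxPayee', 'Payee;Payer', []); // create the index at run time
cds1.IndexName := 'idxPayee'; // activate it
// ...or, without a named index at all:
cds1.IndexFieldNames := 'Payee;Payer';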
Note once again that the above is an example based on TClientDataSet. What you should retain from it (even if you don't use a TClientDataSet) is that you must have an already created index in order to use it.
Also, to answer your question: yes, there are some 'table' types (TDataSet descendants in Delphi terminology) which support sorting, either via a Sort method (or the like) or via a SortFields property.
But nowadays, when one works with a SQL backend, the preferred solution is usually to create the indexes using the corresponding admin tool and then issue (from Delphi) a SELECT * FROM myTable ORDER BY Field1.
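For example (a sketch; qry stands for whichever query component your data-access layer provides, e.g. TQuery or TSQLQuery):
qry.SQL.Text := 'SELECT * FROM mytable ORDER BY Payee, Payer';
qry.Open; // the backend returns the rows already sorted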
HTH
If you're still using BDE you can use the BDE API to physically sort the DBF table:
uses
  DbiProcs, DbiTypes, DBIErrs;

procedure SortTable(Table: TTable; const FieldNums: array of Word; CaseInsensitive: Boolean = False; Descending: Boolean = False);
var
  DBHandle: hDBIDb;
  RecordCount: Integer;
  Order: SORTOrder;
begin
  if Length(FieldNums) = 0 then
    Exit;
  Table.Open;
  RecordCount := Table.RecordCount;
  if RecordCount = 0 then
    Exit;
  DBHandle := Table.DBHandle;
  Table.Close;
  if Descending then
    Order := sortDESCEND
  else
    Order := sortASCEND;
  Check(DbiSortTable(DBHandle, PAnsiChar(Table.TableName), nil, nil, nil, nil, nil,
    Length(FieldNums), @FieldNums[0], @CaseInsensitive, @Order, nil, False, nil,
    @RecordCount));
end;
for example, in your case:
SortTable(Table1, [2, 1]); // sort by Payee, Payer
I cannot check, but try IndexFieldNames := 'Payee;Payer' (the separator is a semicolon).
Of course, indexes on these 2 fields should exist.
You can create an index on your table using the TTable.AddIndex method in one call. That will sort your data when you read it, that is, if you use the new index by setting the TTable.IndexName property to it. Here's an example:
xTable.AddIndex('NewIndex', 'Field1;Field2', [ixCaseInsensitive]);
xTable.IndexName := 'NewIndex';
// Read the table from top to bottom
xTable.First;
while not xTable.EOF do begin
  ..
  xTable.Next;
end;