IBM Cognos 10 - Simple way to globally rename a table column? - sdk

My client has decided they want to rename a very commonly used data item name.
So, for example, the database has a column called 'Cost' and they see 'Cost' on a heap of reports.
The client now wants to see 'Net Cost' everywhere.
So we need to find every occurrence of 'Cost' and change it to 'Net Cost'.
I can do this in Framework Manager easily enough, and I can even run Tools > Report Dependency to find all the reports that use the 'Cost' column. But if there's 4,000 of them, that's a lot of work to update them all.
One idea is to deploy the entire content store to a Cognos Deployment zip file, extract that & do a global search & replace on the XML. But that's going to be messy & dangerous.
Option 2 is to use MotioPI to do a search & replace. I don't think the client will spring for buying this product just for this task.
Are there other options?
Has anyone written anything in the Cognos SDK which will do a rename?
Has someone investigated the Content Store database to the degree that they could do a rename on all the report specs in SQL?
Are there other options I've overlooked?
Any ideas would be greatly welcomed.

Your first option is the way to go. This essentially boils down to an XML find-and-replace scenario. You'll need to isolate just the instances of the word "Cost" which are relevant to you. This may involve the start and end tags.
To change the data source reference across reports, you'll need to find and replace on the three part name [Presentation Layer].[Namespace].[Cost]. If there are filters on the item, they may just reference the one part name from the Query. Likewise, any derived queries would reference the two part name. Handle these by looking through the XML report spec and figuring out how to isolate the text.
I'm assuming your column names are set to inherit the business name from the model and not hard coded (Source Type should be Data Item Label, NOT Text). If not, you'll need to handle these as well. Looking at the XML, you would see <staticValue>Cost</staticValue> for these.
It's not really dangerous as long as you have a backup. Just take multiple passes, each with as granular a find and replace as possible.
Motio will only look at the values inside the tags, so you won't be able to isolate 'Cost'; it can't be used for the replace itself. However, it would come in handy for mass validation of the reports after the find and replace. A one-seat license for the year could be justified by the amount of development time it could save here.

Have you tried using DRU? (http://www-01.ibm.com/support/docview.wss?uid=swg24021248)
I have used this tool before to do what you are describing.
Good luck.

You can at least search for text in the Content Store (v10.2.1) using something like:
set define off -- SQL*Plus: stop '&' being treated as a substitution variable
select distinct T4.name as folder_name, T2.name as report_name
from cmobjprops7 T1
inner join cmobjnames T2 on T1.cmid = T2.cmid
inner join cmobjects T3 on T1.cmid = T3.cmid
inner join cmobjnames T4 on T3.pcmid = T4.cmid
inner join ( -- only want the latest version (this still shows reports that have been deleted)
select T4.name as folder_name, T2.name as report_name, max(T3.modified) as latest_version_date
from cmobjnames T2
inner join cmobjects T3 on T2.cmid = T3.cmid
inner join cmobjnames T4 on T3.pcmid = T4.cmid
where T2.name like '%myReport%' -- search only reports with 'myReport' in the name
and T4.name in ('Project Zeus','Project Jupiter') -- search only these folders
group by T4.name, T2.name
) TL on TL.folder_name = T4.name and TL.report_name = T2.name and TL.latest_version_date = T3.modified
where T1.spec like '%[namespace].[column_name]%' -- text you want to find
and T4.name in ('Project Zeus','Project Jupiter')
order by 1 desc, 2;
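If you go the SQL route, you can size the job first with counts in the same style. A minimal sketch against the same cmobjprops7.spec column; the names inside the LIKE patterns are placeholders for your model's actual paths:
select count(*) from cmobjprops7
where spec like '%[Presentation Layer].[Namespace].[Cost]%'; -- fully qualified references
select count(*) from cmobjprops7
where spec like '%<staticValue>Cost</staticValue>%'; -- hard-coded labels (see the staticValue note above)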

Related

Use an "IF statement" on the "ORDER BY" in a report in INFORMIX 4GL

I have a report which is called many times, but I want the order to be different each time.
How can I change the ORDER BY depending on a variable?
For example
report print_label()
If is_reprint
then
order by rpt.item_code, rpt.description
else
order by rpt.description, rpt.item_code
end if
I tried passing in a variable when calling the report:
let scratch = "rpt.item_code, rpt.description"
start print_label(scratch)
And in the report I did:
order by scratch
But it didn't work. Any other suggestions?
Thank you!
The technique I've used for that type of problem is along the lines of:
REPORT report_name(x)
DEFINE x RECORD
param1,param2, ..., paramN ...,
sort_method ...,
data ...
END RECORD
ORDER [EXTERNAL] BY x.param1, x.param2, ..., x.paramN
BEFORE GROUP OF x.param1
CASE
WHEN x.sort_method ...
PRINT ...
WHEN x.sort_method ...
PRINT ...
END CASE
BEFORE GROUP OF x.param2
# similar technique as above
...
BEFORE GROUP OF x.paramN
# similar technique as above
ON EVERY ROW
PRINT ...
AFTER GROUP OF x.paramN
# similar technique as above
...
AFTER GROUP OF x.param2
# similar technique as above
AFTER GROUP OF x.param1
# similar technique as above
... and then in the 4gl that calls the REPORT, populate x.param1, x.param2, ..., x.paramN with the desired parameters used for sorting e.g.
CASE x.sort_method
WHEN "product,branch"
LET x.param1 = x.data.product_code
LET x.param2 = x.data.branch_code
WHEN "branch,product"
LET x.param1 = x.data.branch_code
LET x.param2 = x.data.product_code
END CASE
OUTPUT TO REPORT report_name(x.*)
So as per my example, that's a technique I've seen and used for things like stock reports. The warehouse/branch/store manager wants to see things ordered by warehouse/branch/store and then by product/sku/item, whilst a product manager wants to see things ordered by product/sku/item and then by warehouse/branch/store. More analytical reports with more potential parameters can be done using the same technique. I think the record I have seen is 6 sort parameters; in that case, it's much better to have one report covering all 6! = 720 potential orderings than to write a separate report for each possible combination.
So this is probably similar to Jonathan's option 1, although I don't have the same reservations about the complexity. I don't recall ever catching any of my junior developers getting it badly wrong at code review. In fact, if the report is generic enough, you'll find that you don't need to touch it too often.
Short answer
The ORDER BY clause in an I4GL REPORT function has crucial effects on how the code implementing the report is generated. It is simply not feasible to rewire the generated code like that at run-time.
Therefore, you can't achieve your desired result directly.
Notes
Note that you should probably be using ORDER EXTERNAL BY rather than ORDER BY. The difference is that with EXTERNAL, the report can assume the data is presented in the correct order; without it, the report has to save up all the data (in a temporary table in the database) and then select the data from the table in the required sorted order, making it a two-pass report.
If you're brave and have the I4GL c-code compiler, you should take a look at the code generated for a report, but be aware it is some of the scariest code you're likely to encounter in a long time. It uses all sorts of tricks that you wouldn't dream of using yourself.
Workaround solutions — in outline
OK; so you can't do it directly. What are your options? In my view, you have two:
Use two parameters specifically for choosing the ordering, and then use an ORDER BY (without EXTERNAL) clause that always lists them in a fixed order. However, when it comes time to use the report, choose which sequence you want the arguments in.
Write two reports that differ only in the report name and the ORDER EXTERNAL BY clause. Arrange to call the correct report depending on which order you want.
Of these, option 2 is by far the simpler — except for the maintenance issue. Most likely, you'd arrange to generate the code from a single copy. That is, you'd save REPORT print_label_code_desc in one file, and then arrange to edit that into REPORT print_label_desc_code (use sed, for example) — and the edit would reverse the order of the names in the ORDER BY clause. This isn't all that hard to do in a makefile, though it requires some care.
Option 1 in practice
What does option 1 look like in practice?
DECLARE c CURSOR FOR
SELECT * FROM SomeTable
START REPORT print_label -- optional specification of destination, etc.
FOREACH c INTO rpt.*
IF do_item_desc THEN
OUTPUT TO REPORT print_label(rpt.item_code, rpt.description, rpt.*)
ELSE
OUTPUT TO REPORT print_label(rpt.description, rpt.item_code, rpt.*)
END IF
END FOREACH
FINISH REPORT print_label
The report function itself might look like:
REPORT print_label(col1, col2, rpt)
DEFINE col1 CHAR(40)
DEFINE col2 CHAR(40)
DEFINE rpt RECORD LIKE SomeTable.*
ORDER BY col1, col2
FORMAT
FIRST PAGE HEADER
…
BEFORE GROUP OF col1
…
BEFORE GROUP OF col2
…
ON EVERY ROW
…
AFTER GROUP OF col1
…
AFTER GROUP OF col2
…
ON LAST ROW
…
END REPORT
Apologies for any mistakes in the outline code; it is a while since I last wrote any I4GL code.
The key point is that the ordered-by values are passed specially to the report and used solely for controlling its organization. You may need to print different details in the BGO (shorthand for BEFORE GROUP OF; AGO for AFTER GROUP OF) sections for the two columns. That will typically be handled by (gasp) global variables; this is I4GL and they are the normal way of doing business. Actually, they should be module variables rather than global variables if the report driver code (the code which calls START REPORT, OUTPUT TO REPORT and FINISH REPORT) is in the same file as the report itself. You need this because, in general, the reporting at the group levels (in the BGO and AGO blocks) will need different titles or labels depending on whether you're sorting code before description or vice versa. Note that the meaning of the group aggregates changes depending on the order in the ORDER BY clause.
Note that not every report necessarily lends itself to such reordering. Simply running the BGO and AGO blocks in a different order is not sufficient to make the report output look sensible. In that case, you will fall back on option 2, or option 2A, which is to write two separate reports that don't pretend to be just a reordering of the ORDER BY clause, because the formatting of the data needs to be different depending on the ORDER BY clause.
As you can see, this requires some care — quite a bit more care than the alternative (option 2). If you use dynamic SQL to create the SELECT statement, you can arrange to put the right ORDER BY clause into the string that is then prepared so that the cursor will fetch the data in the correct order — allowing you to use ORDER EXTERNAL BY after all.
Summary
If you're a newcomer to I4GL, go with option 2. If your team is not reasonably experienced in I4GL, go with option 2. I don't like it very much, but it is the way that can be handled easily and is readily understood by yourself, your current colleagues, and those still to come.
If you're reasonably comfortable with I4GL and your team is reasonably experienced with I4GL — and the report layout really lends itself to being reorganized dynamically — then consider option 1. It is trickier, but I've done worse things in I4GL in times past.
You can use CASE expressions within the ORDER BY clause, one per sort position, along these lines:
order by
case when <condition> then rpt.item_code else rpt.description end,
case when <condition> then rpt.description else rpt.item_code end
Each CASE must return a single value, so you need one CASE per sort column, and the branches should have compatible types.
You can use PREPARE:
let query_txt = "select ... "
if is_reprint then
let query_txt = query_txt clipped, " order by rpt.item_code, rpt.description"
else
let query_txt = query_txt clipped, " order by rpt.description, rpt.item_code"
end if
prepare statement1 from query_txt
declare cursor_name cursor for statement1
And now start the report, use FOREACH, etc.
P.S. You must define query_txt as a CHAR long enough for the whole text.

Find changes quickly in larger SQL database?

There is a Java Swing application which uses an Informix database. I have user rights for the Swing application (i.e. no source code), and read-only access to a mirror of the database.
Sometimes I need to find a database column, which is backing a GUI element (TextBox, TableField, Label...). What would be best approach to find out which database column and table is holding the data shown e.g. in a TextBox?
My general approach is to capture the state of the database. Commit a change using the GUI and then capture the state of the database again. Then I need to examine the difference. I've already tried:
Use the nrows field of systables: didn't work, because the number in nrows does not seem to be a real-time representation of the row count.
Create a script with SELECT COUNT(*) ... for all tables: didn't work because there are too many tables (> 5000). I also tried to optimize by removing empty tables, but there are still too many left.
Is there a simple solution that I'm missing?
Please look at the Change Data Capture API and check whether it suits your needs.
There probably isn't a simple solution.
You probably need to build yourself a map of the database, or a data dictionary for it. It sounds as though you can eliminate many of the tables from consideration since they're empty, at least for a preliminary pass. If you're dealing with information in a text box, the chances are it is some sort of character data; you can analyze which (non-empty) tables contain longer character strings, and they'd be the primary targets of your searches. If the schema is badly designed, with lots of VARCHAR(255) columns even though the columns normally only hold short strings, life is more difficult. Over time, you can begin to classify tables and columns so that you end up knowing where to look for parts of the application.
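For that preliminary pass, something along these lines might help. This is only a sketch against the Informix system catalog; the coltype codes below cover the plain character types, and your schema may need more:
-- list character columns in user tables (tabid > 99 skips the catalog tables)
select t.tabname, c.colname, c.collength
from systables t, syscolumns c
where c.tabid = t.tabid
and t.tabid > 99
and t.tabtype = 'T' -- ordinary tables only
and mod(c.coltype, 256) in (0, 13, 15, 16) -- CHAR, VARCHAR, NCHAR, NVARCHAR
order by t.tabname, c.colname;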
One problem to beware of: the tabid in informix.systables isn't necessarily as stable as you'd like. Your data dictionary needs to record its own dd_tabid for the table it describes, and can store the last known tabid from informix.systables, but it needs to be ready to find a new tabid value on occasion. You should probably only mark data in your dictionary for logical deletion.
To some extent, this assumes you can create a database in which to record this information. If you can't create an Informix database, you may have to use something else (MySQL, or SQLite, perhaps) to store the data dictionary. Alternatively, go to your DBA team and ask them for the information. Unless you're trying something self-evidently untoward, they're likely to help (but politics can get in the way — I've no idea how collegial your teams are).

Performance of generated T-SQL from Entity Framework

I recently used Entity Framework for a project, despite my DBA's strong disapproval. So one day he came to my office complaining about generated T-SQL that reaches his database.
For instance, when I want to select a product based on the id, I write something like this:
context.Products.FirstOrDefault(p=>p.Id==id);
Which translates to
SELECT ... FROM (SELECT TOP 1 ... FROM PRODUCTS WHERE ID=@id)
So he is shouting, "Why on earth would you write a SELECT * FROM (SELECT TOP 1)"
So I changed my code to
context.Products.Where(p=>p.Id==id).ToList().FirstOrDefault()
and this produces a much cleaner T-SQL:
SELECT ... FROM PRODUCTS WHERE ID=@id
The inner query and the TOP 1 disappeared. Enough rambling; my question is this: does the first query really put an overhead on SQL Server? Is it harder to parse than the second method? The Id column has a clustered index on it. I want a good answer so I can rub it in his face (or mine).
Thanks,
Themos
Have you tried running the queries manually and comparing the execution plans?
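For example, in Management Studio (a sketch only; the table shape and @id parameter are lifted from the question), with statistics or the actual execution plan turned on:
set statistics io on;
set statistics time on;
declare @id int = 42; -- sample value
-- the EF-generated shape
select * from (select top 1 * from Products where Id = @id) as t;
-- the hand-written shape
select * from Products where Id = @id;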
The biggest problem here isn't that the SQL isn't perfectly formed to your DBA's standards (and I'm fairly certain that the query engine will optimize out the extra select). The second query materializes every row matching the filter with ToList() and then picks the first one in application memory; dropping the TOP 1 means the database returns all matching rows, and selecting the first is definitely a task that should be performed by the DB and not the application layer.
In short, he's being a pedant; leave it the way it was.

Do I have to use UNION instead of JOIN?

An article about optimizing your SQL queries suggests using UNION instead of OR because:
Utilize UNION instead of OR
Indexes lose their speed advantage when used in OR situations, in MySQL at least. Hence, this will not be useful even though indexes are being applied:
SELECT * FROM TABLE WHERE COLUMN_A = 'value' OR COLUMN_B = 'value'
On the other hand, using UNION like this will utilize indexes:
SELECT * FROM TABLE WHERE COLUMN_A = 'value'
UNION
SELECT * FROM TABLE WHERE COLUMN_B = 'value'
How true is this suggestion? Should I turn my OR queries into UNIONs?
I do not recommend this as a general rule. I don't have MySQL installed here so I can't check the generated execution plan, but certainly in Oracle the plans are much different, and the 'OR' plan is more efficient than the 'UNION' plan. (Basically, the 'UNION' has to perform two SELECTs and then merge the results; the 'OR' only has to do a single SELECT.) Even a 'UNION ALL' plan has a higher cost than the 'OR' plan. Stick with the 'OR'.
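If you want to verify this yourself on Oracle, a quick sketch (my_table and its columns stand in for your own):
explain plan for
select * from my_table where column_a = 'value' or column_b = 'value';
select * from table(dbms_xplan.display);
explain plan for
select * from my_table where column_a = 'value'
union
select * from my_table where column_b = 'value';
select * from table(dbms_xplan.display);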
I very strongly recommend writing the clearest, simplest code you possibly can. To me, that means you should use the 'OR' rather than the 'UNION'. I've found over the years that attempting to "pre-optimize" something, or in other words to guess where I'll encounter performance problems and then to try and code around those problems, is a waste of time. Code tends to spend a lot of time in the darndest places, and you'll only find them once you've got your code running. A long time ago I learned three rules that I've found to be useful in this context:
Make it run.
Make it run right.
Make it run right fast.
Optimization is literally the last thing you should be doing.
Share and enjoy.
Followup: I hadn't noticed that the 'OR' was looking at different columns (my bad), but my advice regarding "keep it simple" still holds.
It helps to think of indexes like names in a phone book. A phone book, you could say, has an index naturally ordered by name, meaning that if you want to find the entry for John Smith, it takes little to no time: you simply open the phone book to the S section and begin looking for Smith.
Now what if I told you to look for entries with the name John Smith or the phone number 863-2253? Not as quick, eh? To answer precisely, you'd need one phone book to look up John Smith and another sorted by phone number to find a name by his or her number.
Perhaps a more sophisticated engine could see the need for this separation and do it automatically, but apparently MySQL does not. So while it might seem a hassle to have to do it this way, I assure you the difference in tables with high record counts is noticeable.
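For what it's worth, check what your MySQL version actually does before rewriting anything: newer versions can combine two single-column indexes for an OR using an index_merge union plan. A sketch, with my_table standing in for your table:
create index idx_a on my_table (column_a);
create index idx_b on my_table (column_b);
explain select * from my_table
where column_a = 'value' or column_b = 'value';
-- if the optimizer can combine the indexes, the plan shows type = index_merge
-- with Extra = Using union(idx_a,idx_b)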

Indices not working on sqlite table

I am using indices on columns on which I am making a search. The indices are created like this:
CREATE INDEX index1 on <TABLE>(<col1> COLLATE NOCASE ASC)
CREATE INDEX index2 on <TABLE>(<col2> COLLATE NOCASE ASC)
CREATE INDEX index3 on <TABLE>(<col3> COLLATE NOCASE ASC)
Now, the select query to search for records is like this:
select <col1> from <TABLE> where <col1> like '%monit%' AND <col2> like '%84%' GROUP BY <col1> limit 0,501;
When I run EXPLAIN QUERY PLAN on my sqlite database like this:
EXPLAIN QUERY PLAN select <col1> from <TABLE> where <col1> like '%monit%' AND <col2> like '%84%' GROUP BY <col1> limit 0,501;
It returns the output as:
0|0|0|SCAN TABLE USING INDEX (~250000 rows)
and when I drop the index, the output this EXPLAIN QUERY PLAN produces is:
0|0|0|SCAN TABLE (~250000 rows)
0|0|0|USE TEMP B-TREE FOR GROUP BY
Isn't the number of rows scanned (~250000) supposed to be smaller when an index is used to search the table?
I guess the problem here is with the LIKE keyword, because I have read somewhere that LIKE nullifies the use of indices... Here is the link
EDIT: For indices to work on a query that uses LIKE, the right-hand side of the LIKE must be a string literal that does not begin with a wildcard character. So, in the above query, I tried the search parameters in LIKE without '%' at the beginning:
EXPLAIN QUERY PLAN select <col1> from <TABLE> where <col1> like 'monit%' AND <col2> like '84%' GROUP BY <col1> limit 0,501;
and the output I got was this:
0|0|0|SEARCH TABLE partnumber USING INDEX model_index_partnumber (model>? AND model<?) (~15625 rows)
So, you see, the number of rows being searched (rather than scanned) is ~15625 in this case.
But the problem now is that I cannot do away with the % wildcard at the beginning. Can anyone please suggest an alternative way to achieve the same result?
EDIT:
I have tried using FTS3 from the terminal, but when I typed this query:
CREATE VIRTUAL TABLE <tbl> USING FTS3 (<col_list>);
it throws this error:
Error: no such module: FTS3
Can someone please help me enable FTS3 from the terminal as well as Xcode (I need the steps for both)?
I am using sqlcipher and have already performed this from the terminal:
CFLAGS="-DSQLITE_ENABLE_FTS3=1" ./configure
EDIT:
Please see my related question: sqlite table taking time to fetch the records in LIKE query.
EDIT:
Hey All, I got some success. I modified my select query to look like this:
select distinct description collate nocase as description from partnumber where rowid BETWEEN 1 AND (select max(rowid) from partnumber) AND description like '%a%' order by description;
And bingo, the search time was like never before. But the problem now is that when I run EXPLAIN QUERY PLAN like this, it shows a temp B-tree being used for DISTINCT, which I don't want:
explain query plan select distinct description collate nocase as description from partnumber where rowid BETWEEN 1 AND (select max(rowid) from partnumber) AND description like '%a%' order by description;
Output:
0|0|0|SEARCH TABLE partnumber USING INTEGER PRIMARY KEY (rowid>? AND rowid<?) (~15625 rows)
0|0|0|EXECUTE SCALAR SUBQUERY 1
1|0|0|SEARCH TABLE partnumber USING INTEGER PRIMARY KEY (~1 rows)
0|0|0|USE TEMP B-TREE FOR DISTINCT
A couple other options ...
Full Text Indexes:
http://sqlite.org/fts3.html
The most common (and effective) way to describe full-text searches is
"what Google, Yahoo and Altavista do with documents placed on the
World Wide Web".
SELECT count(*) FROM enrondata1 WHERE content MATCH 'linux'; /* 0.03 seconds */
SELECT count(*) FROM enrondata2 WHERE content LIKE '%linux%'; /* 22.5 seconds */
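In outline, the FTS route might look like this (a sketch reusing the partnumber table and description column from your question; note that MATCH does token-prefix matching, so it finds words starting with 'monit' but still not an arbitrary infix like '%monit%'):
CREATE VIRTUAL TABLE partnumber_fts USING fts3(description);
-- copy the searchable text across, keeping the original rowids
INSERT INTO partnumber_fts (rowid, description)
SELECT rowid, description FROM partnumber;
-- matches 'monitor', 'monitoring', ... anywhere in the text
SELECT rowid FROM partnumber_fts WHERE description MATCH 'monit*';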
Word Breaking:
If you're looking for words (or words that start with something), you can break text blobs into words yourself and store your own indexed word tables. But even then, you'll only be able to do word LIKE 'monit%' to get hits like "monitor".
If possible, use the full text search; it will be much less code. But if that's not an option for some reason, you can fall back on your own word-breaking tables. That limits you to "word begins with" searches to avoid scans, but it's better than "whole text block begins with".
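A word-breaking setup might be sketched like this (the names are invented; your application would split each description into words at insert time):
CREATE TABLE word_index (
word TEXT COLLATE NOCASE, -- one row per word per source record
rec_id INTEGER -- rowid of the record in partnumber
);
CREATE INDEX idx_word ON word_index (word);
-- "word begins with" search; the index can serve the prefix
SELECT DISTINCT rec_id FROM word_index WHERE word LIKE 'monit%';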
Be aware that the sqlite that comes with iOS does not have full-text search enabled. You can work around that. There are instructions on that and its use at:
http://longweekendmobile.com/2010/06/16/sqlite-full-text-search-for-iphone-ipadyour-own-sqlite-for-iphone-and-ipad/
The full docs on creating and querying full text tables are here: http://sqlite.org/fts3.html
To get FTS3 to also work from terminal, see:
Compiling the command line interface # http://www.sqlite.org/howtocompile.html
sqlite3 using fts3 create table in my mac terminal and how to use it in iphone xcode project?
This is quite simple. You are telling SQLITE to examine every record in the table. It is faster to do this without using an index, because using an index would involve additional IO. An index is used when you want to examine a subset of the records in a table, where the extra IO of using the index is paid back by not having to examine every record in the table.
When you say LIKE '%something', that means all records with anything at all at the beginning of the field, followed by 'something'. The only way to do this is to examine every single record. Note that indexes should still be used if you only use LIKE 'something%', because in this case SQLITE can use the index to find the subset of records beginning with 'something'. In the old days, when databases were not so clever, we used to write it like this to force the use of an index: SELECT * WHERE col1 >= 'something' AND col1 < 'somethinh'. Note the intentional misspelling of 'something' in the second condition.
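Applied to your query, the range form looks like this (a sketch, with mytable and col1 standing in for your names; SQLite applies the same rewrite itself when the pattern is 'monit%' and the index collation matches):
-- range scan on the index: everything starting with 'monit' ('t' bumped to 'u')
select col1 from mytable
where col1 >= 'monit' and col1 < 'moniu'
group by col1 limit 0,501;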
If you can, it is best to avoid using % at the beginning of a LIKE condition. In some cases you may be able to change your schema so that data is stored in two columns rather than one; then you use a LIKE 'something%' search on the second of the two columns. Of course, this depends on your data being structured right.
But even if splitting into two columns is not possible, it may be possible to divide and conquer the data in another way. For instance you could split the search fields into words, and index every word in a single column in another search table. That way "look for something or other" becomes a list of records where "something" is an exact match on a record in the search table. No LIKE required. You would then get a record ID to retrieve the original record. This is one of the things that SOLR does internally so if you must stick with SQLITE and cannot leverage SOLR or LUCENE in any way, then you can always read up on how they build inverted indices and do the same thing yourself in your SQLITE db.
Remember that LIKE "%something%" must examine every record, but if you can select a subset of the data first, and then apply the LIKE search, this will run a lot faster. Filling the cache will have the same effect which is what your experiments with DISTINCT were doing. Maybe all you need to do is to enlarge the cache to get acceptable search times. The first search will still be slow, but people are often quite forgiving of problems which go away when you retry it.
When you use arbitrary wildcards like that, you are getting very close to a full text search engine requirement like SOLR. These work by indexing the data 100% in RAM. With SQLITE you might be able to do something similar by creating a second in-memory database, reading all data from the disk tables into the in-memory db, and then using the in-memory db for searching with wildcards. You would still have full-table scans with queries such as LIKE '%monit%'; however, that scan takes place in RAM, where it is not as time-consuming. You don't need to import all your data into RAM, only the parts where you need '%something%' searches, because SQLITE can do cross-database joins. SQLITE makes it easy to create an in-memory database, and the ATTACH DATABASE and DETACH DATABASE commands make it easy to connect a second database to your app. There is some example code for iOS in this question: Can iPhone sqlite apps attach to other databases?
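In outline (a sketch; only the partnumber table and description column come from your question, the rest is assumed):
ATTACH DATABASE ':memory:' AS mem;
-- copy just the columns you need to search into RAM
CREATE TABLE mem.partnumber AS
SELECT rowid AS rec_id, description FROM main.partnumber;
-- still a full scan, but an in-memory one
SELECT rec_id FROM mem.partnumber WHERE description LIKE '%monit%';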
Not sure why you don't like EXPLAIN using B-Trees since the b-tree is probably the fastest possible search structure available when your data has to be read from a filesystem.
I have a MySQL book that suggests applying REVERSE() to the text (and, if your application permits, storing it in a column), then searching the reversed text using LIKE REVERSE('%something').
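The point of the trick is that reversing both the data and the pattern turns the leading wildcard into a trailing one, which an index can serve. A MySQL sketch (column and index names invented; note this helps 'ends with' searches, so an infix '%monit%' still needs one of the approaches above):
ALTER TABLE mytable ADD COLUMN col1_rev VARCHAR(255);
UPDATE mytable SET col1_rev = REVERSE(col1);
CREATE INDEX idx_col1_rev ON mytable (col1_rev);
-- REVERSE('%monit') = 'tinom%', so the wildcard is now at the end;
-- you can also write the reversed pattern literally: LIKE 'tinom%'
SELECT * FROM mytable WHERE col1_rev LIKE REVERSE('%monit');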
