Scriptella: XML to DB: Insert Into from XPATH - scriptella

I have an XML file that looks like this:
<XML>
<Table name='test'>
<Row>
<Field name='key'>1000</Field>
<Field name='text'>Test</Field>
</Row>
</Table>
</XML>
I'd like to parse this XML and use it within an INSERT statement:
<query connection-id="in">
/XML/Table/Row
<script connection-id="out">
INSERT INTO X t (
t.entitykey,
t.text
)
VALUES
(
????????
);
</script>
</query>
How do I access a specific Field-Tag from within the insert statement using XPATH?
We prefer to have one XSD that takes all table layouts into account rather than maintaining a separate XSD for each table, hence the Field[@name] design.
Thanks
Matthias

The XPath driver exposes a variable called node, which provides a context for evaluating XPath expressions against the currently returned node. You can use the following expressions to get the value of a particular field:
<script connection-id="out">
INSERT INTO X t (t.entitykey, t.text)
VALUES ( ?{node.getString("./Field[@name = 'key']")}, ?{node.getString("./Field[@name = 'text']")} );
</script>
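To sanity-check the XPath predicates outside Scriptella, the same selection can be reproduced with Python's standard library. This is only a sketch: it uses ElementTree rather than Scriptella's XPath driver, and the helper name row_values is made up for illustration.

```python
import xml.etree.ElementTree as ET

XML_DOC = """
<XML>
  <Table name='test'>
    <Row>
      <Field name='key'>1000</Field>
      <Field name='text'>Test</Field>
    </Row>
  </Table>
</XML>
"""

def row_values(xml_text, field_names):
    """Return one tuple per Row, holding the text of each named Field."""
    root = ET.fromstring(xml_text)  # root is the <XML> element
    rows = []
    for row in root.findall("./Table/Row"):
        # Same predicate shape as in the answer: ./Field[@name='...']
        rows.append(tuple(
            row.findtext(f"./Field[@name='{name}']") for name in field_names
        ))
    return rows

# Each tuple could then be bound to the parameters of an INSERT statement.
print(row_values(XML_DOC, ["key", "text"]))
```

Because the field is picked by its name attribute rather than by position, the same lookup works for any table layout that follows the Field[@name] design.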

Related

copy data using scriptella based on a test on special column

I have two PostgreSQL databases and want to copy data from one to the other based on a condition. I am using Scriptella, and I only want to copy the rows in which a particular column is not empty. However, the rows where that column is empty are always copied as well.
Here is the file:
<!DOCTYPE etl SYSTEM "http://scriptella.javaforge.com/dtd/etl.dtd">
<etl>
<description>
test script
</description>
<connection id="in" driver="postgresql" url="jdbc:postgresql://localhost:5432/testMonoprix" user="postgres" password="maher" >
</connection>
<connection id="out" driver="postgresql" url="jdbc:postgresql://localhost:5432/testMonoprix2" user="postgres" password="maher">
</connection>
<query connection-id="in" >
SELECT * FROM public.param_type;
<script connection-id="out" if=" parent_param_type_id != null ">
INSERT INTO public.param_type VALUES (?1, ?2,?3,?4,?5,?6) ;
</script>
</query>
</etl>
How should the file look in order to copy only the non-empty rows?
Thanks

xpath expression to select specific xml nodes that are available in a file

I am trying to find a solution to an unusual problem:
how do I write an XPath expression that selects only the XML nodes whose group names are listed in a separate text file?
For instance,
<xsl:for-each select="SUBSCRIBER_PROFILE_LIST/SUBSCRIBER_PROFILE_INFO[GROUP_NAME eq (group name list in a text file as input)]">
For example,
<xsl:for-each select="SUBSCRIBER_PROFILE_LIST/SUBSCRIBER_PROFILE_INFO[GROUP_NAME eq collection('select_nodes.txt')]">
select_nodes.txt contains the list of strings that are allowed to be selected,
for example:
ABC
IJK
<SUBSCRIBER>
<MSISDN>123456</MSISDN>
<SUBSCRIBER_PROFILE_LIST>
<SUBSCRIBER_PROFILE_INFO>
<PROFILE_MSISDN>12345</PROFILE_MSISDN>
<GROUP_NAME>ABC</GROUP_NAME>
<GROUP_ID>18</GROUP_ID>
</SUBSCRIBER_PROFILE_INFO>
<SUBSCRIBER_PROFILE_INFO>
<PROFILE_MSISDN>456778</PROFILE_MSISDN>
<GROUP_NAME>DEF</GROUP_NAME>
<GROUP_ID>100</GROUP_ID>
</SUBSCRIBER_PROFILE_INFO>
<SUBSCRIBER_PROFILE_INFO>
<PROFILE_MSISDN>78876</PROFILE_MSISDN>
<GROUP_NAME>IJK</GROUP_NAME>
<GROUP_ID>3</GROUP_ID>
</SUBSCRIBER_PROFILE_INFO>
</SUBSCRIBER_PROFILE_LIST>
</SUBSCRIBER>
XSLT 2.0 has limited functionality for parsing arbitrary text files. I would suggest one of the following:
Make select_nodes.txt an XML file and load it using the doc() function:
<xsl:variable name="group_names" as="xs:string *"
select="doc('select_nodes.xml')/groups/group"/>
with select_nodes.xml looking like this:
<?xml version="1.0" encoding="UTF-8"?>
<groups>
<group>ABC</group>
<group>IJK</group>
</groups>
Alternatively, pass the group names as a stylesheet parameter. (How you do this depends on which XSLT engine you're using and whether it's through the command line or an API.) If it's through an API, then you may be able to pass the values in directly as xs:string-typed objects. Otherwise you'll have to parse the parameter:
<xsl:param name="group_names_param"/>
<!-- Assuming the input string is a whitespace-separated list of names -->
<xsl:variable name="group_names" as="xs:string *"
select="tokenize($group_names_param, '\s+')"/>
In either case your for-each expression would then look like this:
<xsl:for-each select="
SUBSCRIBER_PROFILE_LIST/SUBSCRIBER_PROFILE_INFO[GROUP_NAME = $group_names]">
<!-- Do something -->
</xsl:for-each>
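The predicate GROUP_NAME = $group_names is a general comparison, so it is true as soon as the group name equals any item in the sequence. A rough Python equivalent of that membership test, using the standard library's ElementTree instead of an XSLT engine, is sketched below; select_profiles is an illustrative name, and the whitespace-separated string stands in for the stylesheet parameter.

```python
import xml.etree.ElementTree as ET

SUBSCRIBER_XML = """
<SUBSCRIBER>
  <MSISDN>123456</MSISDN>
  <SUBSCRIBER_PROFILE_LIST>
    <SUBSCRIBER_PROFILE_INFO>
      <PROFILE_MSISDN>12345</PROFILE_MSISDN>
      <GROUP_NAME>ABC</GROUP_NAME>
      <GROUP_ID>18</GROUP_ID>
    </SUBSCRIBER_PROFILE_INFO>
    <SUBSCRIBER_PROFILE_INFO>
      <PROFILE_MSISDN>456778</PROFILE_MSISDN>
      <GROUP_NAME>DEF</GROUP_NAME>
      <GROUP_ID>100</GROUP_ID>
    </SUBSCRIBER_PROFILE_INFO>
    <SUBSCRIBER_PROFILE_INFO>
      <PROFILE_MSISDN>78876</PROFILE_MSISDN>
      <GROUP_NAME>IJK</GROUP_NAME>
      <GROUP_ID>3</GROUP_ID>
    </SUBSCRIBER_PROFILE_INFO>
  </SUBSCRIBER_PROFILE_LIST>
</SUBSCRIBER>
"""

def select_profiles(xml_text, group_names):
    """Keep only the profiles whose GROUP_NAME is in group_names."""
    allowed = set(group_names)
    root = ET.fromstring(xml_text)  # root is <SUBSCRIBER>
    path = "./SUBSCRIBER_PROFILE_LIST/SUBSCRIBER_PROFILE_INFO"
    return [p for p in root.findall(path) if p.findtext("GROUP_NAME") in allowed]

# str.split() plays the role of tokenize($group_names_param, '\s+')
group_names = "ABC IJK".split()
selected = select_profiles(SUBSCRIBER_XML, group_names)
print([p.findtext("PROFILE_MSISDN") for p in selected])
```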

Handling empty resultset from a query

I have two databases with the same structure and I want to compare records between them. The records in the second database are copied from the first, but the copying process sometimes fails, so one table in the first database has more records than the same table in the second database. I want to find out which records from the first database do not exist in the second. I have tried something like this:
<etl>
<connection id="db1" driver="auto"
url="jdbc:mysql://localhost:3306/db" user="user"
password="xxx"
classpath="C:/mysql-connector-java-5.1.20.jar" />
<connection id="db2" driver="auto"
url="jdbc:mysql://localhost:3307/db" user="user"
password="xxx"
classpath="C:/mysql-connector-java-5.1.20.jar" />
<connection id="text" driver="text" />
<query connection-id="db1">
SELECT * FROM table;
<query connection-id="db2">
SELECT * FROM table WHERE id = '$id';
<script connection-id="text">
sometext, $rownum
</script>
</query>
</query>
</etl>
The problem is, when the result of the query against db2 is empty the script is not executed.
How to solve this problem?
Regards,
Jacek
You can use count to check the actual number of records. In this case the result set always contains exactly one row, even when nothing matches, so the nested script is always reached. Example:
<query connection-id="db1">
SELECT * FROM table;
<query connection-id="db2">
SELECT count(id) as CNT FROM table WHERE id = ?id;
<!-- The script is executed ONLY IF number of results is zero -->
<script connection-id="text" if="CNT==0">
No matching record for id $id
</script>
</query>
</query>
Probably he doesn't need the if condition, because his problem is the script wasn't executed ;)
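The reason the count trick works is that an aggregate query returns one row even when no record matches. The sketch below shows the same pattern in Python with two in-memory SQLite databases standing in for db1 and db2; the table name t and the function missing_ids are assumptions made for the example, not part of the original setup.

```python
import sqlite3

def missing_ids(src, dst):
    """Return ids present in src.t but absent from dst.t."""
    missing = []
    for (row_id,) in src.execute("SELECT id FROM t ORDER BY id"):
        # count(...) always yields exactly one row, even with no match,
        # so there is always a result to inspect.
        (cnt,) = dst.execute(
            "SELECT count(id) FROM t WHERE id = ?", (row_id,)
        ).fetchone()
        if cnt == 0:
            missing.append(row_id)
    return missing

# Two throwaway databases: the second one lacks the record with id 2.
db1 = sqlite3.connect(":memory:")
db2 = sqlite3.connect(":memory:")
for db, ids in ((db1, [1, 2, 3]), (db2, [1, 3])):
    db.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
    db.executemany("INSERT INTO t (id) VALUES (?)", [(i,) for i in ids])

print(missing_ids(db1, db2))
```

Here the check cnt == 0 corresponds to the if="CNT==0" condition on the script element.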

"count" field is always "1" with custom JavaScript YQL Open Data Table

YQL returns the number of records retrieved in its XML output:
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
yahoo:count="2" yahoo:created="2012-08-24T14:02:32Z" yahoo:lang="en-US">
<diagnostics>
But I've been experimenting with my own custom Open Data Tables, at least ones which employ an execute block containing JavaScript to create the response, and no matter how I create the response the count field is always set to 1 when I make a query using the table.
I've also dug around in the documentation and can't seem to find anything addressing this.
Is this by design? Is it a bug? Have I missed something obvious?
This is commonly caused by only returning one result, which shouldn't really come as a surprise. The most usual cause of this, from my own experience, is forgetting to specify a suitable itemPath for the <select>.
Take the following examples, and see how the response.object structure and itemPath combine to give the query results.
Without an itemPath
<select itemPath="" produces="XML">
<execute>
<![CDATA[
response.object = <letters>
<letter>A</letter>
<letter>B</letter>
<letter>C</letter>
</letters>
]]>
</execute>
</select>
Produces a query result similar to:
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
yahoo:count="1" yahoo:created="…" yahoo:lang="en-US">
<results>
<letters>
<letter>A</letter>
<letter>B</letter>
<letter>C</letter>
</letters>
</results>
</query>
With itemPath="letters"
<select itemPath="letters" produces="XML">
…
</select>
Produces a query result identical to the previous result.
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
yahoo:count="1" yahoo:created="…" yahoo:lang="en-US">
<results>
<letters>
<letter>A</letter>
<letter>B</letter>
<letter>C</letter>
</letters>
</results>
</query>
With itemPath="letters.letter"
<select itemPath="letters.letter" produces="XML">
…
</select>
Note that the path now specifies a collection of letter items. This produces a query result similar to:
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
yahoo:count="3" yahoo:created="…" yahoo:lang="en-US">
<results>
<letter>A</letter>
<letter>B</letter>
<letter>C</letter>
</results>
</query>
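The three cases can be summarised with a small toy model that counts how many nodes a dotted path selects. This is not YQL's actual implementation, only an illustration of why the count changes; count_items and its handling of the empty path are assumptions chosen to match the results shown above.

```python
import xml.etree.ElementTree as ET

RESPONSE = (
    "<letters>"
    "<letter>A</letter><letter>B</letter><letter>C</letter>"
    "</letters>"
)

def count_items(response_xml, item_path):
    """Count the nodes that a dotted item path selects in the response."""
    root = ET.fromstring(response_xml)
    if not item_path:
        return 1  # no path: the whole response is a single item
    steps = item_path.split(".")
    if steps[0] != root.tag:
        return 0  # the path does not start at the response root
    nodes = [root]
    for step in steps[1:]:
        # Descend one level per path step, collecting matching children.
        nodes = [child for node in nodes for child in node.findall(step)]
    return len(nodes)

for path in ("", "letters", "letters.letter"):
    print(repr(path), count_items(RESPONSE, path))
```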

Uncaught exception 'DOMException' with message 'Not Found Error'

Basically I'm writing a templating system for my CMS and I want to have a modular structure which involves people putting in tags like:
<module name="news" /> or <include name="anotherTemplateFile" /> which I then want to find in my PHP and replace with dynamic HTML.
Someone on here pointed me towards DOMDocument, but I've already come across a problem.
I'm trying to find all <include /> tags in my template and replace them with some simple html. Here is my template code:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>CMS</title>
<include name="head" />
</head>
<body>
<include name="header" />
<include name="content" />
<include name="footer" />
</body>
</html>
And here is my PHP:
$template = new DOMDocument();
$template->load("template/template.tpl");
foreach( $template->getElementsByTagName("include") as $include ) {
$element = '<input type="text" value="'.print_r($include, true).'" />';
$output = $template->createTextNode($element);
$template->replaceChild($output, $include);
}
echo $template->saveHTML();
Now, I get the fatal error Uncaught exception 'DOMException' with message 'Not Found Error'.
I've looked this up and it seems that, because my <include /> tags aren't necessarily DIRECT children of $template, it's not replacing them.
How can I replace them independently of descent?
Thank you
Tom
EDIT
Basically I had a brainwave of sorts. If I do something like this for my PHP I see its trying to do what I want it to do:
$template = new DOMDocument();
$template->load("template/template.tpl");
foreach( $template->getElementsByTagName("include") as $include ) {
$element = '<input type="text" value="'.print_r($include, true).'" />';
$output = $template->createTextNode($element);
// this line is different:
$include->parentNode->replaceChild($output, $include);
}
echo $template->saveHTML();
However it only seems to change 1 occurrence in the <body> of my HTML... when I have 3. :/
This is a problem with your DOMDocument->load, try
$template->loadHTMLFile("template/template.tpl");
But you may need to give it a .html extension.
This is looking for an HTML or an XML file. Also, whenever you are using DOMDocument with HTML, it is a good idea to call libxml_use_internal_errors(true); before the load call.
OKAY THIS WORKS:
foreach( $template->getElementsByTagName("include") as $include ) {
if ($include->hasAttributes()) {
$includes[] = $include;
}
//var_dump($includes);
}
foreach ($includes as $include) {
$include_name = $include->getAttribute("name");
$input = $template->createElement('input');
$type = $template->createAttribute('type');
$typeval = $template->createTextNode('text');
$type->appendChild($typeval);
$input->appendChild($type);
$name = $template->createAttribute('name');
$nameval = $template->createTextNode('the_name');
$name->appendChild($nameval);
$input->appendChild($name);
$value = $template->createAttribute('value');
$valueval = $template->createTextNode($include_name);
$value->appendChild($valueval);
$input->appendChild($value);
if ($include->getAttribute("name") == "head") {
$template->getElementsByTagName('head')->item(0)->replaceChild($input,$include);
}
else {
$template->getElementsByTagName("body")->item(0)->replaceChild($input,$include);
}
}
//$template->load($nht);
echo $template->saveHTML();
DOM NodeLists are ‘live’: when you remove an <include> element from the document (by replacing it), it disappears from the list. Conversely if you add a new <include> into the document, it will appear in your list.
You might expect this for a NodeList that comes from an element's childNodes, but the same is true of NodeLists that are returned by getElementsByTagName. It's part of the W3C DOM standard and occurs in web browsers' DOMs as well as PHP's DOMDocument.
So what you have here is a destructive iteration. Remove the first <include> (item 0 in the list) and the second <include>, previously item 1, becomes the new item 0. Now when you move on to the next item in the list, item 1 is what used to be item 2, causing you to only look at half the items.
PHP's foreach loop looks like it might protect you from that, but actually under the covers it's doing exactly the same as a traditional indexed for loop.
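The skipping effect can be reproduced without any DOM at all. The sketch below uses a plain Python list as a stand-in for the live NodeList (it is a simulation, not PHP's DOMDocument), with the four include names from the template above; both function names are made up for the example.

```python
def replace_with_index_loop(live_list):
    """Walk a shrinking list by index, as an indexed loop over a live list does."""
    replaced = []
    i = 0
    while i < len(live_list):
        # "Replacing" a node removes it from the live list,
        # so every later item shifts down by one position.
        replaced.append(live_list.pop(i))
        i += 1
    return replaced

def replace_from_snapshot(live_list):
    """Take a static copy first, then modify the live list safely."""
    replaced = []
    for item in list(live_list):  # the snapshot does not shrink
        live_list.remove(item)
        replaced.append(item)
    return replaced

includes = ["head", "header", "content", "footer"]
print(replace_with_index_loop(includes), "left over:", includes)

includes = ["head", "header", "content", "footer"]
print(replace_from_snapshot(includes), "left over:", includes)
```

The first loop handles only head and content, which matches the symptom of a single replacement inside <body>. In PHP, the snapshot step corresponds to copying the nodes into an array before replacing them, for example with iterator_to_array() on the node list or by collecting them in a first loop.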
I'd try to avoid creating a new templating language for PHP; there are already so many, not to mention PHP itself. Creating one out of DOMDocument is also going to be especially slow.
eta: In general regex replace would be faster, assuming a simple match pattern that doesn't introduce loads of backtracking. However if you are wedded to an XML syntax, regex isn't very good at parsing that. But what are you attempting to do, that can't already be done with PHP?
<?php function write_header() { ?>
<p>This is the header bit!</p>
<?php } ?>
<body>
...
<?php write_header(); ?>
...
</body>

Resources