reading xml with Linq - xml-parsing

I cannot figure out how to get all the ItemDetails nodes in the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<AssessmentMetadata xmlns="http://tempuri.org/AssessmentMetadata.xsd">
<ItemDetails>
<ItemName>I1200</ItemName>
<ISC_Inactive_Codes>NS,NSD,NO,NOD,ND,NT,SP,SS,SSD,SO,SOD,SD,ST,XX</ISC_Inactive_Codes>
<ISC_StateOptional_Codes>NQ,NP</ISC_StateOptional_Codes>
</ItemDetails>
<ItemDetails>
<ItemName>I1300</ItemName>
<ISC_Inactive_Codes>NS,NSD,NO,NOD,ND,NT,SP,SS,SSD,SO,SOD,SD,ST,XX</ISC_Inactive_Codes>
<ISC_StateOptional_Codes>NQ,NP</ISC_StateOptional_Codes>
</ItemDetails>
<ItemDetails>
<ItemName>I1400</ItemName>
<ISC_Active_Codes>NC</ISC_Active_Codes>
<ISC_Inactive_Codes>NS,NSD,NO,NOD,ND,NT,SP,SS,SSD,SO,SOD,SD,ST,XX</ISC_Inactive_Codes>
<ISC_StateOptional_Codes>NQ,NP</ISC_StateOptional_Codes>
</ItemDetails>
</AssessmentMetadata>
I have tried a number of things. I think it might be a namespace issue, so this is my latest attempt:
var xdoc = XDocument.Load(asmtMetadata.Filepath);
var assessmentMetadata = xdoc.XPathSelectElement("/AssessmentMetadata");

You need to get the default namespace and use it when querying:
var ns = xdoc.Root.GetDefaultNamespace();
var query = xdoc.Root.Elements(ns + "ItemDetails");
You need to prefix every element name with the namespace. For example, the following query retrieves all ItemName values:
var itemNames = xdoc.Root.Elements(ns + "ItemDetails")
.Elements(ns + "ItemName")
.Select(n => n.Value);

Related

Parse XML Feed via Google Apps Script ("Cannot read property 'getChildren' of undefined")

I need to parse a Google Alert RSS Feed with Google Apps Script.
Google Alerts RSS-Feed
I found a script which should do the job, but I can't get it working with Google's RSS feed.
The feed looks like this:
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:idx="urn:atom-extension:indexing">
<id>tag:google.com,2005:reader/user/06807031914929345698/state/com.google/alerts/10604166159629661594</id>
<title>Google Alert – garbe industrial real estate</title>
<link href="https://www.google.com/alerts/feeds/06807031914929345698/10604166159629661594" rel="self"/>
<updated>2022-03-17T19:34:28Z</updated>
<entry>
<id>tag:google.com,2013:googlealerts/feed:10523743457612307958</id>
<title type="html"><b>Garbe Industrial</b> plant Multi-User-Immobilie in Ludwigsfelde - <b>Property</b> Magazine</title>
<link href="https://www.google.com/url?rct=j&sa=t&url=https://www.property-magazine.de/garbe-industrial-plant-multi-user-immobilie-in-ludwigsfelde-117551.html&ct=ga&cd=CAIyGWRmNjU0ZGNkMzJiZTRkOWY6ZGU6ZGU6REU&usg=AFQjCNENveXYlfrPc7pZTltgXY8lEAPe4A"/>
<published>2022-03-17T19:34:28Z</published>
<updated>2022-03-17T19:34:28Z</updated>
<content type="html">Die <b>Garbe Industrial Real Estate</b> GmbH startet ihr drittes Neubauprojekt in der Metropolregion Berlin/Brandenburg. Der Projektentwickler hat sich ...</content>
<author>
...
</feed>
I want to extract entry -> id, title, link, updated, content.
I used this script:
function ImportFeed(url, n) {
var res = UrlFetchApp.fetch(url).getContentText();
var xml = XmlService.parse(res);
//var item = xml.getRootElement().getChild("channel").getChildren("item")[n - 1].getChildren();
var item = xml.getRootElement().getChildren("entry")[n - 1].getChildren();
var values = item.reduce(function(obj, e) {
obj[e.getName()] = e.getValue();
return obj;
}, {});
return [[values.id, values.title, values.link, values.updated, values.content]];
}
I modified this part, but all I got was "TypeError: Cannot read property 'getChildren' of undefined":
//var item = xml.getRootElement().getChild("channel").getChildren("item")[n - 1].getChildren();
var item = xml.getRootElement().getChildren("entry")[n - 1].getChildren();
Any idea is welcome!
In your situation, how about the following modified script?
Modified script:
function SAMPLE(url, n = 1) {
var res = UrlFetchApp.fetch(url).getContentText();
var root = XmlService.parse(res.replace(/&/g, "&amp;")).getRootElement();
var ns = root.getNamespace();
var entries = root.getChildren("entry", ns);
if (!entries || entries.length == 0) return "No values";
var header = ["id", "title", "link", "updated", "content"];
var values = header.map(f => f == "link" ? entries[n - 1].getChild(f, ns).getAttribute("href").getValue().trim() : entries[n - 1].getChild(f, ns).getValue().trim());
return [values];
}
The feed declares a default namespace, so you need to pass that namespace when you call getChild and getChildren. I think this is the cause of your issue.
From your script, I guessed that you might be using it as a custom function. In that case, please rename the function from ImportFeed to something else, because IMPORTFEED is a built-in function of Google Sheets. In this sample, SAMPLE is used.
If you want to change the columns, please modify header.
In this sample, the default value of n is 1, so the 1st entry is retrieved.
With this script you can, for example, put =SAMPLE("URL", 1) into a cell as a custom function, and the result values are returned.
Note:
If the modified script above does not directly solve your issue, can you provide a sample value of res? With that, I can modify the script further.
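The ampersand replacement in the script above is meant to make bare `&` characters in link URLs parseable as XML. If the fetched text already contains properly escaped entities, replacing every `&` would escape them a second time. The following is a minimal sketch, in plain JavaScript, of a more selective replacement. The function name `escapeBareAmpersands` and the sample string are assumptions for illustration and are not part of the original answer. Note that named entities other than the five predefined XML ones (for example `&nbsp;`) are left untouched and may still be rejected by an XML parser.

```javascript
// Escape only ampersands that do not already start an entity reference
// (named like &amp;, decimal like &#38;, or hexadecimal like &#x26;).
function escapeBareAmpersands(text) {
  return text.replace(
    /&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)/g,
    "&amp;"
  );
}

// Hypothetical sample input, not taken from a real feed.
var sample = '<link href="https://example.com/?a=1&b=2&amp;c=3"/>';
console.log(escapeBareAmpersands(sample));
```

In the Apps Script above, this helper could be used in place of the `replace` call before passing the text to `XmlService.parse`, assuming the feed mixes escaped and unescaped ampersands.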
As additional information, if you want to write all the values to a sheet by running the script from the script editor, you can also use the following script.
function myFunction() {
var url = "###"; // Please set URL.
var res = UrlFetchApp.fetch(url).getContentText();
var root = XmlService.parse(res.replace(/&/g, "&")).getRootElement();
var ns = root.getNamespace();
var entries = root.getChildren("entry", ns);
if (!entries || entries.length == 0) return "No values";
var header = ["id", "title", "link", "updated", "content"];
var values = entries.map(e => header.map(f => f == "link" ? e.getChild(f, ns).getAttribute("href").getValue().trim() : e.getChild(f, ns).getValue().trim()));
var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Sheet1"); // Please set the sheet name.
sheet.getRange(sheet.getLastRow() + 1, 1, values.length, values[0].length).setValues(values);
}
References:
XML Service
map()

How to find the SearchImpressionShare for a particular keyword?

One can easily find the average position for a keyword using the getAveragePosition() method, but no equivalent method is available for SearchImpressionShare.
EDIT
I tried to get the SearchImpressionShare by querying the report, but that gives me inconsistent results.
function main() {
var keywordId = 297285633818;
var last14dayStatsQuery = "SELECT Id, SearchTopImpressionShare FROM KEYWORDS_PERFORMANCE_REPORT WHERE Id = "+keywordId+" DURING LAST_14_DAYS"
var last14dayReport = AdWordsApp.report(last14dayStatsQuery);
var last14dayRows = last14dayReport.rows();
var last14dayRow = last14dayRows.next();
Logger.log('Keyword: ' + last14dayRow['Id'] + ' SearchTopIS: ' + last14dayRow['SearchTopImpressionShare']);
}
For example, below are the two outputs I received after running the same code twice.
Output 1:
10/16/2019 10:47:29 AM Keyword: 297285633818 SearchTopIS: 0.0
Output 2:
10/16/2019 10:47:45 AM Keyword: 297285633818 SearchTopIS: 0.17
The keywords performance report provides this data: https://developers.google.com/adwords/api/docs/appendix/reports/keywords-performance-report#searchimpressionshare
Sample use:
function main () {
var query = "SELECT SearchImpressionShare, Criteria FROM KEYWORDS_PERFORMANCE_REPORT WHERE Clicks > 15 DURING YESTERDAY"
var report = AdWordsApp.report(query)
var rows = report.rows()
while (rows.hasNext()) {
var row = rows.next()
Logger.log('Keyword %s, Impression Share %s', row['Criteria'], row['SearchImpressionShare'])
}
}
update:
Please note that if the same keyword exists in several ad groups, the report also contains several rows, one per ad group. To read every row for the keyword, iterate over the whole result:
function main() {
var keywordId = 350608245287;
var last14dayStatsQuery = "SELECT Id, SearchTopImpressionShare FROM KEYWORDS_PERFORMANCE_REPORT WHERE Id = "+keywordId+" DURING LAST_14_DAYS"
var last14dayReport = AdWordsApp.report(last14dayStatsQuery);
var last14dayRows = last14dayReport.rows();
while (last14dayRows.hasNext()) {
var last14dayRow = last14dayRows.next();
Logger.log('Keyword: ' + last14dayRow['Id'] + ' SearchTopIS: ' + last14dayRow['SearchTopImpressionShare']);
}
}
You might find it useful to add ad group parameters to your query such as AdGroupName, AdGroupId.
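As a rough illustration of the point about ad groups, the sketch below indexes report rows by ad group instead of reading only the first row. It is plain JavaScript with a mock iterator so that it runs outside Google Ads scripts. The helper names `indexByAdGroup` and `makeMockRows`, the ad group IDs, and the pairing of values with ad groups are all made up for illustration. It assumes that AdGroupId has been added to the SELECT clause and that the iterator exposes hasNext() and next(), like the one returned by report.rows().

```javascript
// Build a map from AdGroupId to SearchTopImpressionShare by reading every row.
// `rows` is assumed to expose hasNext() and next().
function indexByAdGroup(rows) {
  var byAdGroup = {};
  while (rows.hasNext()) {
    var row = rows.next();
    byAdGroup[row.AdGroupId] = row.SearchTopImpressionShare;
  }
  return byAdGroup;
}

// Hypothetical stand-in for a report row iterator.
function makeMockRows(items) {
  var i = 0;
  return {
    hasNext: function () { return i < items.length; },
    next: function () { return items[i++]; }
  };
}

// Made-up rows: the same keyword Id appearing in two ad groups.
var mockRows = makeMockRows([
  { Id: "1", AdGroupId: "10", SearchTopImpressionShare: "0.0" },
  { Id: "1", AdGroupId: "20", SearchTopImpressionShare: "0.17" }
]);
console.log(indexByAdGroup(mockRows));
```

Reading only the first row of such a result returns whichever ad group's row happens to come first, which would be consistent with the differing outputs shown in the question.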

Convert Xml String with node prefixes to XElement

This is my xml string
string fromHeader= "<a:From><a:Address>http://ex1.example.org/</a:Address></a:From>";
I want to load it into an XElement, but doing XElement.Parse(fromHeader) gives me an error due to the 'a' prefixes. I tried the following:
XNamespace xNSa = "http://www.w3.org/2005/08/addressing";
string dummyRoot = "<root xmlns:a=\"{0}\">{1}</root>";
var fromXmlStr = string.Format(dummyRoot, xNSa, fromHeader);
XElement xFrom = XElement.Parse(fromXmlStr).Elements().First();
which works, but seriously, do I need 4 lines of code to do this? What is the quickest / shortest way of getting my XElement?
I found out that the above 4 lines are equivalent to
XNamespace xNSa = "http://www.w3.org/2005/08/addressing";
XElement xFrom = new XElement(xNSa + "From", new XElement(xNSa + "Address", "http://ex1.example.org/"));
OR ALTERNATIVELY move the NS into the 'From' element before parsing.
var fromStr = "<a:From xmlns:a=\"http://www.w3.org/2005/08/addressing\"><a:Address>http://ex1.example.org/</a:Address></a:From>";
XElement xFrom = XElement.Parse(fromStr);

TypeError: Cannot find function getSheets in object - looking for what's wrong

While running this code, an error occurs on this line:
var invoiceSheet = newSSFile.getSheets()[0];
"TypeError: Cannot find function getSheets in object Copy of Invoice Przykładowy. (line 69, file "Code")"
With this code I want:
Create a new spreadsheet and move it to proper folder [works]
Get a value from another spreadsheet and paste it in this new one [error]
I looked for an answer for an hour without any result. Any idea what might cause this error?
function invoice() {
//Create copy SS + name
var ssTemp = SpreadsheetApp.openById("1Cr2W_4lNrHYRdXK-KDQ7UFJJ3Iagh-tGct8Ee5Y");
var newSS = ssTemp.copy("Copy of " + ssTemp.getName());
// Move to folder
var DestinyFolder = DriveApp.getFolderById("0B-y1OC8ChG2XRjRRbEJ");
var newSSFile = DriveApp.getFileById(newSS.getId());
DestinyFolder.addFile(newSSFile);
DriveApp.getRootFolder().removeFile(newSSFile);
// Modify details
// Invoice No
var klienciSS = SpreadsheetApp.openById("18B151VlJaVtDdQ9CcLrL3iwRAtWw2ZzZydproj");
var klienciSheet = klienciSS.getSheets()[0];
var klienciRange= klienciSheet.getRange('AB6');
var klienciValue = klienciRange.getValue();
var invoiceSheet = newSSFile.getSheets()[0];
var inboiceRange = invoiceSheet.getRange('F4');
newCellInvoice.setValue(klienciValue);
The problem is caused by the following line
var newSSFile = DriveApp.getFileById(newSS.getId());
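As a result, newSSFile holds an instance of Class File instead of an instance of Class Spreadsheet, and Class File has no getSheets method.
You still need the File object to move the copy into the folder, but the sheets must be read from the Spreadsheet object. Replace the line that throws the error with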
var invoiceSheet = newSS.getSheets()[0];

YQL two requests for paging/limit

I'm playing around with an XML API where the search doesn't support paging/limit. The recommended way is to just request all the IDs and then, in a second request, get the data and handle paging on your own.
First request:
http://example.com?search=foobar&columns=ID
<results>
<item><id>1</id></item>
<item><id>2</id></item>
<item><id>3</id></item>
<item><id>4</id></item>
<item><id>5</id></item>
<item><id>6</id></item>
<item><id>7</id></item>
<item><id>8</id></item>
<item><id>9</id></item>
<item><id>10</id></item>
</results>
Second request:
http://example.com?search=1,2,3,4,5&columns=ID,title,description
<results>
<item><id>1</id><title>foobar</title><description /></item>
<item><id>2</id><title>foobar</title><description /></item>
<item><id>3</id><title>foobar</title><description /></item>
<item><id>4</id><title>foobar</title><description /></item>
<item><id>5</id><title>foobar</title><description /></item>
</results>
Is it possible with YQL to combine this into a single request with a search result count and paging support?
I can't find a straightforward way in the documentation, but you could do this:
1) Create a YQL table A which queries http://example.com?search=foobar&columns=ID
2) Create a YQL table B which queries http://example.com?search=1,2,3,4,5&columns=ID,title,description
3) Now, create a YQL table C which does a y.query on join of A and B like so:
select * from B where search in (select ids from A where search="foobar")
Of course, the query syntax will change based on the table names and the keys defined in them. For more information, see the YQL documentation on joins.
Hope this is clear. If you find something better for this case, do let me know :)
Create a YQL table with paging:
<paging model="offset" matrix="true">
<start id="internalIndex" default="0" />
<pagesize id="internalPerPage" max="250" />
</paging>
Use JavaScript to handle the two fetches and the paging:
var internalIndex = parseInt( request.matrixParams['internalIndex'] );
var internalPerPage = parseInt( request.matrixParams['internalPerPage'] );
var interimURL = 'http://example.com?columns=ID';
interimURL += '&search=' + request.queryParams['search'];
var interimQueryParameter = {url:interimURL};
var interimQuery = y.query("SELECT * FROM xml WHERE url=#url", interimQueryParameter);
var rows = interimQuery.results.*;
// get subset
var xml = rows;//this.copy(); // clone XML
var from = internalIndex;
var to = ((from + internalPerPage) < xml.length()) ? (from + internalPerPage) : xml.length();
var sliced = [];
for (; from < to; from++) {
sliced.push(xml[from].#["ID"]);
}
var finalURL = 'http://example.com?';
finalURL += '&search=' + sliced.join(",");
finalURL += "&columns=ID,title,description"
var finalQueryParameter = {url:finalURL};
var finalQuery = y.query("SELECT * FROM xml WHERE url=#url", finalQueryParameter);
var finalResults = finalQuery.results.response;
finalResults.node += <internalCurserPositon>{internalIndex}</internalCurserPositon>
finalResults.node += <internalCount>{rows.length()}</internalCount>
response.object = finalResults;
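The paging step in the script above can be sketched independently of YQL and E4X. The following plain JavaScript is only an illustration under stated assumptions: the helper names `pageOfIds` and `buildDetailUrl` are hypothetical, the base URL mirrors the placeholder example.com URLs from the question, and the ID list stands in for the result of the first request.

```javascript
// Return the ids for one page, clamped to the bounds of the list.
function pageOfIds(ids, start, pageSize) {
  var from = Math.max(0, start);
  var to = Math.min(ids.length, from + pageSize);
  return ids.slice(from, to);
}

// Build the URL of the second request for one page of ids.
// The column list mirrors the example URL in the question.
function buildDetailUrl(baseUrl, ids) {
  return baseUrl + "?search=" + ids.join(",") + "&columns=ID,title,description";
}

// Placeholder ids standing in for the result of the first request.
var allIds = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
var secondPage = pageOfIds(allIds, 5, 5);
console.log(buildDetailUrl("http://example.com", secondPage));
console.log("total count: " + allIds.length);
```

The total count comes from the full ID list rather than from the page, which is the same idea as adding the row count to the response in the script above.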
