pdfMake - Wide Table Page Break - pdfmake

I am using pdfMake to generate table reports. Some of the reports are very wide and don't fit on a standard page width, even in landscape mode. Currently, pdfMake cuts off the table content when it overflows past the page margin.
I would like to page break the table when it is too wide, much like when the rows overflow to the next page.
Is this possible using pdfMake?
Could the pageBreakBefore callback function help with this?
Thank you

Yes, this is possible with pdfMake, even though it is not currently a built-in feature.
To achieve this, break the overflowing columns out into additional tables. Put all the tables in an array, then add them to the docDefinition as follows.
Any attributes the tables have in common can be defined in the Template constructor.
for (var i = 0; i < tables.length; i++) {
    docDefinition.content[i] = new Template();
    docDefinition.content[i].table.body = tables[i];
    docDefinition.content[i].table.widths = widths[i];
    if (i > 0) {
        docDefinition.content[i].pageBreak = 'before';
    }
}

function Template() {
    this.table = {
        dontBreakRows: true
    };
    // zebra stripes layout
    this.layout = {
        fillColor: function (row, node, col) {
            return (row % 2 === 0) ? '#CCCCCC' : null;
        }
    };
}
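The idea of cutting one wide table body into several narrower ones can be sketched without pdfMake itself, since a docDefinition is a plain object. This is only a minimal sketch: splitBody and buildContent are hypothetical helper names (not part of pdfMake), and splits is assumed to be a list of column indexes at which a new table should start.

```javascript
// Hypothetical helper: cut a 2-D table body (first row = header) into
// several bodies at the given column indexes, e.g. splits = [2, 4].
function splitBody(body, splits) {
  const bounds = [0, ...splits, body[0].length];
  const tables = [];
  for (let t = 0; t < bounds.length - 1; t++) {
    tables.push(body.map((row, r) => [
      // leading row-number column so the pages can be matched up later
      r === 0 ? '#' : String(r),
      ...row.slice(bounds[t], bounds[t + 1])
    ]));
  }
  return tables;
}

// Hypothetical helper: wrap each body in a table node and start every
// table after the first one on a new page.
function buildContent(tables) {
  return tables.map((body, i) => {
    const node = { table: { dontBreakRows: true, body: body } };
    if (i > 0) node.pageBreak = 'before';
    return node;
  });
}

const body = [
  ['A', 'B', 'C', 'D', 'E'],
  ['a1', 'b1', 'c1', 'd1', 'e1'],
  ['a2', 'b2', 'c2', 'd2', 'e2']
];
const content = buildContent(splitBody(body, [2, 4]));
console.log(JSON.stringify(content[1].table.body));
// [["#","C","D"],["1","c1","d1"],["2","c2","d2"]]
```

The resulting array can be assigned to docDefinition.content; shared styling such as the zebra-stripe layout would still need to be added to each node.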
How do we determine if a column will overflow? We have two options:
If you are using Bootstrap DataTables, you can read the "width" attribute from the HTML.
pdfmake calculates the actual widths itself, but you may have to dig around in pdfmake.js to get at them.
Then loop through the columns, adding up widths until you exceed your limit (my limit was tuned for an 8pt font). Do this for the THs, save the resulting column splits, and reuse them for the TDs.
If the final page is only barely overflowing, we don't want it to hold just one column; each page should have roughly the same width. So we calculate the number of pages needed, then derive the desired break point from that. To make the pages easier to link together, you can add a row number column at the beginning of each table.
var colSplits = [];
var tables = new Array();
function parseTHs(colSplits, tables) {
var colSum = 0;
var pageSize = 1120-7*rows.toString().length;
var paddingDiff = 11.9;
var columns = 0;
var prevSum;
var i = 0;
var width = $(".dataTables_scrollHeadInner > table").width();
var pages = Math.ceil(width/pageSize);
console.log("pages: "+pages);
var desiredBreakPoint = width/pages;
console.log("spread: "+desiredBreakPoint);
var limit = pageSize;
var row = ['#'];
var percent = '%';
widths.push(percent);
$(".dataTables_scrollHeadInner > table > thead > tr:first > th").each(function() {
prevSum = colSum;
colSum += $(this).outerWidth()-paddingDiff;
//if adding the column takes us farther from the desiredBreakPoint than before, kick it to the next table
if (colSum > limit || Math.abs(desiredBreakPoint-colSum) > Math.abs(desiredBreakPoint-prevSum)) {
tables[i] = [row];
row = ['#'];
widths.push(percent);
colSplits.push(columns);
i++;
desiredBreakPoint += width/pages;
limit = prevSum+pageSize;
}
row.push({text: $(this).text(), style:'header'});
widths.push(percent);
columns++;
});
//add the final set of columns
tables[i] = [row];
}
function parseTDs(colSplits, tables) {
var currentRow = 0;
$("#"+tableId+" > tbody > tr").each(function() {
var i = 0;
var row = [currentRow+1];
var currentColumn = 0;
var split = colSplits[i];
$(this).find("td").each(function() {
if (currentColumn === split) {
tables[i].push(row);
row = [currentRow+1];
i++;
split = colSplits[i];
}
row.push({text: $(this).text()});
currentColumn++;
});
//add the final set of columns
tables[i].push(row);
currentRow++;
});
}
parseTHs(colSplits, tables);
parseTDs(colSplits, tables);
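The balancing logic is easier to follow when it is separated from the DOM. The following is a minimal sketch under stated assumptions, not the code above: splitColumns is a hypothetical name, the column widths are assumed to be already measured (in whatever unit the page width uses), and padding corrections are left out.

```javascript
// Hypothetical helper: returns the column indexes at which a new table
// should start so that every page gets roughly the same total width.
function splitColumns(colWidths, pageWidth) {
  const total = colWidths.reduce((sum, w) => sum + w, 0);
  const pages = Math.max(1, Math.ceil(total / pageWidth));
  const step = total / pages; // ideal width of each page's table
  const splits = [];
  let target = step;          // cumulative width where the next break is ideal
  let pageStart = 0;          // cumulative width at the start of the current page
  let sum = 0;

  colWidths.forEach((w, index) => {
    const prev = sum;
    sum += w;
    const overflows = sum - pageStart > pageWidth;
    const fartherFromTarget = Math.abs(target - sum) > Math.abs(target - prev);
    // never break before the first column of a page (prev > pageStart)
    if (prev > pageStart && (overflows || fartherFromTarget)) {
      splits.push(index);
      pageStart = prev;
      target += step;
    }
  });
  return splits;
}

console.log(splitColumns([100, 100, 100, 100, 100, 100], 250)); // [2, 4]
console.log(splitColumns([100, 100, 100, 100, 100], 450));      // [3]
```

In the second call a purely greedy fill would put four columns on the first page and a single column on the second; breaking near the ideal width gives three and two instead.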
Note: If you want the columns to fill all the available page, there's a good implementation for that at this link.
I just added '%' for the widths and added that code to pdfmake.js.
Hope this helps!

Just add the dontBreakRows property to your table object, like this:
table: {
    dontBreakRows: true,
    widths: [30, 75, 48, 48, 48, 48, 48, 115],
    body: []
}

Also, you can use a larger page size and switch the page orientation to landscape:
pageSize: "A2",
pageOrientation: "landscape",
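If you go the larger-page route, picking the page size can be automated. This is a rough sketch under assumptions: pickPageSize is a hypothetical helper, the widths are the long edges of the ISO A-series sizes in PDF points (1/72 inch), the 40pt margin is only an example value for your own pageMargins setting, and the summed column widths ignore cell padding and borders.

```javascript
// Long edge of ISO A-series pages in PDF points (1 pt = 1/72 inch);
// in landscape orientation this is the page width.
const LANDSCAPE_WIDTHS = [
  ['A4', 841.89],
  ['A3', 1190.55],
  ['A2', 1683.78],
  ['A1', 2383.94],
  ['A0', 3370.39]
];

// Hypothetical helper: smallest landscape page whose printable width
// fits the table, or null when even A0 is too narrow.
function pickPageSize(tableWidth, margin) {
  for (const [name, width] of LANDSCAPE_WIDTHS) {
    if (width - 2 * margin >= tableWidth) return name;
  }
  return null;
}

const columnWidths = [30, 75, 48, 48, 48, 48, 48, 115];
const tableWidth = columnWidths.reduce((sum, w) => sum + w, 0); // 460
const margin = 40; // assumed left/right margin, adjust to your pageMargins

const docDefinition = {
  pageSize: pickPageSize(tableWidth, margin),
  pageOrientation: 'landscape',
  pageMargins: [margin, margin, margin, margin],
  content: [{ table: { dontBreakRows: true, widths: columnWidths, body: [] } }]
};
console.log(docDefinition.pageSize); // A4
```

When the helper returns null, no single page is wide enough and the table has to be split into several tables instead.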


Why does GSheet's "find" not find my blank cells even though ISBLANK() finds them?

I have the following problem: When I type "^\s*$" into GSheet's "find" it does not find my blank cells even though ISBLANK() finds them. I need to find and replace the blank cells with "NA". Help would be greatly appreciated!
This is an excerpt of my table: https://docs.google.com/spreadsheets/d/12EajCPW68UXc8kgeqfsEoxgTBuLbccdAnnj6tZOSntM/edit#gid=0
try it like this:
=REGEXMATCH(C5, "^$")
^$
You can try this Apps Script code. This should get you started.
function replaceBlank() {
    var ss = SpreadsheetApp.getActiveSpreadsheet();
    var sheet = ss.getActiveSheet();
    // Columns B and C, starting at row 2 (row 1 is the header)
    var range = sheet.getRange(2, 2, sheet.getLastRow(), 2);
    var rangeval = range.getValues();
    for (var i = 0; i < sheet.getLastRow() - 1; i++) {
        Logger.log(rangeval[i]);
        // Two empty cells join to the string ","
        if (rangeval[i].join() == ",") {
            rangeval[i].splice(0, 2, "NA", "NA");
            sheet.getRange(i + 2, 2, 1, 2).setValues([rangeval[i]]);
        }
    }
}
The code iterates over columns B and C row by row; when both cells in a row are blank, it overwrites them with NA, as set in rangeval[i].splice(0,2,"NA","NA");
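If single blank cells should be replaced as well (not only rows where both cells are blank), the replacement step can be written as a pure function over the 2-D array that getValues() returns. This is a minimal sketch, assuming empty cells come back as empty strings; fillBlanks is a hypothetical name.

```javascript
// Hypothetical helper: replace empty or whitespace-only string cells in a
// 2-D array of values with a placeholder, leaving all other cells untouched.
function fillBlanks(values, placeholder) {
  return values.map(row => row.map(cell =>
    (typeof cell === 'string' && cell.trim() === '') ? placeholder : cell
  ));
}

const sample = [
  ['a', ''],
  ['  ', 3],
  [0, 'b']
];
console.log(JSON.stringify(fillBlanks(sample, 'NA')));
// [["a","NA"],["NA",3],[0,"b"]]
```

In Apps Script the result could be written back with a single range.setValues(fillBlanks(range.getValues(), "NA")) call, which avoids one setValues call per row.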

Extract visual text from Google Classic Site page using Apps Script in Google Sheets

I have about 5,000 Classic Google Sites pages that I need to have a Google Apps script under Google Sheets examine one by one, extract the data, and enter that data into the Google Sheet row by row.
I wrote an Apps Script that works through a sheet called "Pages", which contains the exact URL of each page row by row, and runs the extraction on each one.
It fetches the HTML content of each page, and I then use regex to extract the data I want, which is the value to the right of each of the following...
Job name
Domain owner
Urgency/Impact
ISOC instructions
It would then write that data under the proper columns in the Google Sheet.
This worked except for one big problem: the HTML is not consistent. Also, IDs and tags were not used, so doing this reliably through SitesApp.getPageByUrl is not possible.
Here is the code I came up with for that attempt.
function startCollection () {
var masterList = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Pages");
var startRow = 1;
var lastRow = masterList.getLastRow();
for(var i = startRow; i <= lastRow; i++) {
var target = masterList.getRange("A"+i).getValue();
sniff(target)
};
}
function sniff (target) {
var pageURL = target;
var pageContent = SitesApp.getPageByUrl(pageURL).getHtmlContent();
Logger.log("Scraping: ", target);
// Extract the job name
var JobNameRegExp = new RegExp(/(Job name:<\/b><\/td><td style='text-align:left;width:738px'>)(.*?)(\<\/td>)/m);
var JobNameValue = JobNameRegExp.exec(pageContent);
var JobMatch = JobNameValue ? JobNameValue[2] : null;
if (JobMatch == null){
JobMatch = "NOT FOUND: " + pageURL;
}
// Extract domain owner
var DomainRegExp = new RegExp(/(Domain owner:<\/b><\/td><td style='text-align:left;width:738px'><span style='font-family:arial,sans,sans-serif;font-size:13px'>)(.*?)(<\/span>)/m);
var DomainValue = DomainRegExp.exec(pageContent);
Logger.log("DUMP1:",SitesApp.getPageByUrl(pageURL).getHtmlContent());
var DomainMatch = DomainValue ? DomainValue[2] : null;
if (DomainMatch == null){
DomainMatch = "N/A";
}
// Extract Urgency & Impact
var UrgRegExp = new RegExp(/(Urgency\/Impact:<\/b><\/td><td style='text-align:left;width:738px'>)(.*?)(<\/td>)/m);
var UrgValue = UrgRegExp.exec(pageContent);
var UrgMatch = UrgValue ? UrgValue[2] : null;
if (UrgMatch == null){
UrgMatch = "N/A";
}
// Extract ISOC Instructions
var ISOCRegExp = new RegExp(/(ISOC instructions:<\/b><\/td><td style='text-align:left;width:738px'>)(.*?)(<\/td>)/m);
var ISOCValue = ISOCRegExp.exec(pageContent);
var ISOCMatch = ISOCValue ? ISOCValue[2] : null;
if (ISOCMatch == null){
ISOCMatch = "N/A";
}
// Add record to sheet
var row_data = {
Job_Name:JobMatch,
Domain_Owner:DomainMatch,
Urgency_Impact:UrgMatch,
ISOC_Instructions:ISOCMatch,
};
insertRowInTracker(row_data)
}
function insertRowInTracker(rowData) {
var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Jobs");
var rowValues = [];
var columnHeaders = sheet.getDataRange().offset(0, 0, 1).getValues()[0];
Logger.log("Writing to the sheet: ", sheet.getName());
Logger.log("Writing Row Data: ", rowData);
columnHeaders.forEach((header) => {
rowValues.push(rowData[header]);
});
sheet.appendRow(rowValues);
}
So for my next idea, I have thought about using UrlFetchApp.fetch. The one problem is that these pages on the Classic Google Site sit on a domain that is not shared with the public. Using SitesApp.getPageByUrl makes the script ask for authorization and then works, but UrlFetchApp.fetch does not, meaning that when it tries to request the page directly, it just gets the Google login page.
I might be able to work around this and turn them public, but I am still working on that.
I am running out of ideas fast on this one and hoping there is another way I have not thought of or seen. What I would really like to do is not even mess with the HTML content. I would like to use apps script under the Google Sheet to just look at the actual data presented on the page and then match a text and capture the value to the right of it.
For example have it go down the list of URLS on sheet called "Pages" and do the following for each page:
Find the following values:
Find the text "Job name:", capture the text to the right of it.
Find the text "Domain owner:", capture the text to the right of it.
Find the text "Urgency/Impact:", capture the text to the right of it.
Find the text "ISOC instructions:", capture the text to the right of it.
Write those values to a new row in sheet called "Jobs" as seen below.
Then move on to the next URL in the sheet called "Pages" and repeat until all rows in the sheet "Pages" have been completed.
I have created an exact copy of one of the pages for testing, and it is public.
https://sites.google.com/site/2020dump/test
The raw HTML of the table which contains all the data I am after.
<tr>
<td style="width:190px"><b>Domain owner:</b></td>
<td style="text-align:left;width:738px">IT.FinanceHRCore </td>
</tr>
<tr>
<td style="width:190px"> <b>Urgency/Impact:</b></td>
<td style="text-align:left;width:738px">Medium (3 - Urgency, 3 - Impact) </td>
</tr>
<tr>
<td style="width:190px"><b>ISOC instructions:</b></td>
<td style="text-align:left;width:738px">None </td>
</tr>
<tr>
<td style="width:190px"></td>
<td style="text-align:left;width:738px"> </td>
</tr>
</tbody>
</table>
Any examples of how I can accomplish this? I am not sure how from an apps script perspective to go about not looking at HTML and only looking at the actual data displayed on the page. For example looking for the text "Job name:" and then grabbing the text to the right of it.
The goal at the end of the day is to transfer the data from each page into one big Google Sheet so we can kill off the Google Classic Site.
I have been scraping data with apps script using regular expressions for a while, but I will say that the formatting of this page does make it difficult.
A lot of the pages that I scrape have tables in them so I made a helper script that will go through and clean them up and turn them into arrays. Copy and paste the script below into a new google script:
function scrapetables(html,startingtable,extractlinksTF) {
var totaltables = /<table.*?>/g
var total = html.match(totaltables)
var tableregex = /<table[\s\S]*?<\/table>/g;
var tables = html.match(tableregex);
var arrays = []
var i = startingtable || 0;
while (tables[i]) {
var thistable = []
var rows = tables[i].match(/<tr[\s\S]*?<\/tr>/g);
if(rows) {
var j = 0;
while (rows[j]) {
var thisrow = tablerow(rows[j])
if(thisrow.length > 2) {
thistable.push(tablerow(rows[j]))
} else {thistable.push(thisrow)}
j++
}
arrays.push(thistable);
}
i++
}
return arrays;
}
function removespaces(string) {
// drop line breaks and tabs, and turn non-breaking spaces into regular spaces
var newstring = string.trim().replace(/[\r\n\t]/g,'').replace(/&nbsp;|\u00a0/g,' ');
return newstring
}
function tablerow(row,extractlinksTF) {
var cells = row.match(/<t[dh][\s\S]*?<\/t[dh]>/g);
var i = 0;
var thisrow = [];
while (cells[i]) {
thisrow.push(removehtmlmarkup(cells[i],extractlinksTF))
i++
}
return thisrow
}
function removehtmlmarkup(string,extractlinksTF) {
var string2 = removespaces(string.replace(/<\/?[A-Za-z].*?>/g,''))
var obj = {string: string2}
//check for link
if(/<a href=.*?<\/a>/.test(string)) {
obj['link'] = /<a href="(.*?)"/.exec(string)[1]
}
if(extractlinksTF) {
return obj;
} else {return string2}
}
Running this got close, but at the moment it doesn't handle nested tables well. So I cleaned up the input by sending only the table that we want, isolating it with a regular expression:
var tablehtml = /(<table[\s\S]{200,1000}Job Name[\s\S]*?<\/table>)/im.exec(html)[1]
Your parent function will then look like this:
function sniff(pageURL) {
var html= SitesApp.getPageByUrl(pageURL).getHtmlContent();
var tablehtml = /(<table[\s\S]{200,1000}Job Name[\s\S]*?<\/table>)/im.exec(html)[1]
var table = scrapetables(tablehtml);
var row_data =
{
Job_Name: na(table[0][3][1]), //indicates the 1st table in the html, row 4, cell 2
Domain_Owner: na(table[0][4][1]), // indicates 1st table in the html, row 5, cell 2 etc...
Urgency_Impact: na(table[0][5][1]),
ISOC_Instructions: na(table[0][6][1])
}
insertRowInTracker(row_data)
}
function na(string) {
if(string) {
return string
} else { return 'N/A'}
}
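Because the pages are not consistent, fixed positions such as table[0][3][1] may point at the wrong row on some of them. One alternative is to look rows up by their label text. This is a minimal sketch, assuming each label sits in the first cell of its row and the value in the second; rowsFromHtml and valueFor are hypothetical names, and the regex-based parsing is only as robust as the markup allows.

```javascript
// Hypothetical helper: turn table HTML into an array of rows, each row an
// array of plain-text cell strings.
function rowsFromHtml(html) {
  const rows = html.match(/<tr\b[\s\S]*?<\/tr>/gi) || [];
  return rows.map(row => {
    const cells = row.match(/<t[dh]\b[\s\S]*?<\/t[dh]>/gi) || [];
    return cells.map(cell =>
      cell.replace(/<[^>]+>/g, '')            // strip tags
          .replace(/&nbsp;|\u00a0/g, ' ')     // non-breaking spaces
          .trim());
  });
}

// Hypothetical helper: value in the cell to the right of the given label,
// or 'N/A' when the label is missing or the value is empty.
function valueFor(rows, label) {
  const wanted = label.toLowerCase();
  for (const row of rows) {
    if (row.length >= 2 && row[0].toLowerCase().startsWith(wanted)) {
      return row[1] || 'N/A';
    }
  }
  return 'N/A';
}

const html =
  '<table><tbody>' +
  '<tr><td style="width:190px"><b>Domain owner:</b></td>' +
  '<td style="text-align:left;width:738px">IT.FinanceHRCore&nbsp;</td></tr>' +
  '<tr><td style="width:190px"> <b>Urgency/Impact:</b></td>' +
  '<td>Medium (3 - Urgency, 3 - Impact) </td></tr>' +
  '<tr><td><b>ISOC instructions:</b></td><td> </td></tr>' +
  '</tbody></table>';

const rows = rowsFromHtml(html);
console.log(valueFor(rows, 'Domain owner:'));      // IT.FinanceHRCore
console.log(valueFor(rows, 'ISOC instructions:')); // N/A
```

With this approach the row order on a page no longer matters, only that the label text is present.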

Get count of characters for translation in Kentico Cloud

Is there a way to tell the count of characters of all text fields in some of our content items? We need to estimate a translation price for our content items.
You can use Delivery API to retrieve your items and run a quick javascript to count the characters for you. First, get all your items (or a subset, depending on what you need) with the call excluding all the modular content (linked items) like this:
https://deliver.kenticocloud.com/<projectid>/items?depth=0
Then you can use browser console to run this piece of code:
var response = JSON.parse(document.getElementsByTagName("BODY")[0].textContent);
var noOfChars = 0;
for (var x = 0; x < response.items.length; x++) {
var p = response.items[x].elements;
for (var key in p) {
if (p[key].type=='rich_text' || p[key].type=='text') {
noOfChars += strip(p[key].value).length;
}
}
}
noOfChars;
function strip(html)
{
var tmp = document.createElement("DIV");
tmp.innerHTML = html;
return tmp.textContent || tmp.innerText || "";
}
And hit Enter. The console then prints the total number of characters.
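The same count can be done outside the browser, for example on a saved copy of the response. This is a rough sketch, assuming the response shape used above (items, each with elements that have a type and a value); countCharacters and stripTags are hypothetical names. Stripping tags with a regex is cruder than the DOM approach: HTML entities such as &amp; are counted as written, so treat the result as an estimate.

```javascript
// Hypothetical helper: remove anything that looks like an HTML tag.
function stripTags(html) {
  return String(html).replace(/<[^>]*>/g, '');
}

// Hypothetical helper: total number of characters in all text and
// rich_text elements of the given content items.
function countCharacters(items) {
  let total = 0;
  for (const item of items) {
    for (const element of Object.values(item.elements)) {
      if (element.type === 'text' || element.type === 'rich_text') {
        total += stripTags(element.value).length;
      }
    }
  }
  return total;
}

const response = {
  items: [{
    elements: {
      title: { type: 'text', value: 'Hello' },
      body: { type: 'rich_text', value: '<p>Hi <b>you</b></p>' },
      published: { type: 'date_time', value: '2020-01-01T00:00:00Z' }
    }
  }]
};
console.log(countCharacters(response.items)); // 11
```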

Zapier: BigCommerce to Google Sheet, New Row for Each Item

I have successfully linked my BigCommerce account to my Google Sheets (Drive) account so every time I receive a new order in my store the order is automatically exported into a Google Sheet. Unfortunately, an entire order is listed on one row with multiple items added into one cell. What I need is to have each product on its own row; for example, if someone orders three different products Zapier would create three new rows. This functionality exists when directly exporting orders from BigCommerce, but the "Zap" does not use the BigCommerce export function when pulling order information from my store to the Google Sheet.
I know this is a shot in the dark, but I am hoping someone might have a solution that I can implement. Thank you for your help!
I have created a script that perhaps could be used or modified, at least until you find if the process can be done within Zapier.
You can try the script in the following spreadsheet: https://docs.google.com/spreadsheets/d/1ggNYlLEeN3UYtZC_KlOGwpyII9CzOLKMnIOKIDrPJPM/edit?usp=sharing
The script assumes that orders arrive in the tab named Zapier. As things are set up, you run the script through the Custom Menu.
If there are two or more new orders, click the menu item once for each order.
The completed rows appear in the sheet FullList.
(If you want to try again, you will have to manually delete the rows in FullList once they are showing.)
function processForNewOrders() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var sourceSheet = ss.getSheetByName('Zapier');
var destinationSheet = ss.getSheetByName('FullList');
var sourceValues = sourceSheet.getDataRange().getValues();
var destinationValues = destinationSheet.getDataRange().getValues();
var index = [];
destinationValues.forEach( function (x) {
index.push(x[0]);
})
var newOrders = [];
for (var y = sourceValues.length -1 ; y > 0 ; y --){
if(sourceValues[y][0].toString().indexOf('po_number') != -1 ) continue;
var i = index.indexOf(sourceValues[y][0]);
if(i != -1) break; // This readies only the fresh orders for processing
newOrders.push(sourceValues[y]);
}
Logger.log(newOrders)
for (var j = 0 ; j < newOrders.length ; j++){
var output = [];
var orderLine = newOrders[j];
Logger.log('orderLine = ' + orderLine);
var circuit = 0;
var items = 1
while (circuit < items){
var row = [];
for (var z = 0 ; z < orderLine.length; z++){
var cell = orderLine[z];
// Logger.log(cell);
var lines = cell.toString().split(',');
if(lines.length > 1) items = lines.length;
row.push(lines[circuit] ? lines[circuit] : lines[0]);
// Logger.log('row =' + row);
}
circuit ++;
Logger.log('circuit circuit circuit =' + circuit)
output.push(row);
}
}
Logger.log(output);
if(output != undefined)
destinationSheet.getRange(index.length+1,1,output.length,output[0].length).setValues(output);
}
function onOpen() {
var ui = SpreadsheetApp.getUi();
// Or DocumentApp or FormApp.
ui.createMenu('Custom Menu')
.addItem('Process new order', 'processForNewOrders')
.addToUi();
}
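The core step, turning one order row with comma-separated cells into one row per item, can be isolated as a pure function. This is a minimal sketch, assuming multi-item cells are comma-separated and single-value cells (order number, customer) should repeat on every row; expandOrderRow is a hypothetical name.

```javascript
// Hypothetical helper: expand one order row into one row per item.
// Cells holding a single value are repeated; cells holding a
// comma-separated list contribute their n-th entry to the n-th row.
function expandOrderRow(orderRow) {
  const parts = orderRow.map(cell => String(cell).split(','));
  const itemCount = Math.max(...parts.map(p => p.length));
  const rows = [];
  for (let n = 0; n < itemCount; n++) {
    rows.push(parts.map(p => (p.length > 1 ? (p[n] || '').trim() : p[0])));
  }
  return rows;
}

const order = ['1001', 'Alice', 'Hat, Scarf, Gloves', '1,2,1'];
console.log(JSON.stringify(expandOrderRow(order)));
// [["1001","Alice","Hat","1"],["1001","Alice","Scarf","2"],["1001","Alice","Gloves","1"]]
```

Note that a cell which legitimately contains a comma, such as an address, would also be split, so in practice the expansion may need to be limited to the item columns.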

Sequence number for each different value in a column

I have an Input column with a sequence of two different letters. As a result, I want to get something like what is shown in the picture. I will use these formulas with ARRAYFORMULA so they cover an unlimited number of rows. To get BLOCK №, I tried =COUNTIFS($B$2:B2,"N"), but it only works if I copy the formula manually down the column. If I do:
=ARRAYFORMULA(COUNTIFS(($B$2):(B2:B),"N"))
It doesn't work.
How can I replicate the behavior of this function without needing to copy it manually?
I'd recommend writing a script to fill the Block Nos.
I'll assume the topmost letter begins at cell input!A4 and you want the Block Nos from cell input!C5 and below. Go to the menu bar of the spreadsheet and select Script Editor. Then write the following scripts:
//the main function
function writeBlocks() {
var sheet = SpreadsheetApp.getActiveSpreadsheet()
.getSheetByName('input');
var numRows = sheet.getLastRow();
var startRow = 4;
var inputCol = 1;
var outputCol = 3;
var block = 0;
//clear old Block Nos
sheet.getRange(startRow, outputCol, numRows - 3, 1)
.clearContent();
//recalculate LastRow in case there are fewer new inputs than old outputs
numRows = sheet.getLastRow();
//get input data
var input = sheet.getRange(startRow, inputCol, numRows - 3, 1)
.getValues();
//write output data
for (var i = 0; i < input.length; i++) {
//each entry of input is a one-cell row, so compare its first element
block += input[i][0] == "N" ? 1 : 0;
sheet.getRange(startRow + i, outputCol)
.setValue(block);
}
}
//create new menu
function onOpen() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var menuEntries = [];
menuEntries.push({name: "Calculate blocks", functionName: "writeBlocks"});
ss.addMenu("Custom functions", menuEntries);
}
Save it all, refresh the spreadsheet, and there should be a new option on the menu bar. When you select that option, it will clear the old Block Nos and generate new ones based on the current inputs. Hope this helps.
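The counting itself is a running total, which can be computed in memory and written back in one go. This is a minimal sketch, assuming the input is the 2-D array returned by getValues() for a single column; blockNumbers is a hypothetical name.

```javascript
// Hypothetical helper: for a one-column 2-D array of letters, return a
// one-column 2-D array with the running count of "N" values.
function blockNumbers(values) {
  let block = 0;
  return values.map(row => {
    if (row[0] === 'N') block++;
    return [block];
  });
}

console.log(JSON.stringify(blockNumbers([['N'], ['Y'], ['Y'], ['N'], ['Y']])));
// [[1],[1],[1],[2],[2]]
```

Since the result has the same shape as the input range, it could be passed to a single setValues() call on the output column instead of calling setValue() once per row.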
