Python-docx: Cross-reference to header - hyperlink

I am using the python-docx API to automatically generate some documents. In MS Word, you can insert cross-references, which add hyperlinks to headers/numbered items/figures/tables. Is there a way to do this using python-docx?
For example:
"Refer to Table 1"
And when Table 1 is Ctrl+clicked, it should take me to the page where Table 1 is located.
Thanks.
I've been looking over the forums but can't seem to find a solution. I've seen hyperlinks to bookmarks but I would like to do it with headers/figures/tables.

Related

Google Docs API and Dreaded Table Row Inserts

I've been playing around with Google Docs API and am stuck on being able to add a row to an existing table in a doc and fill that row (3 columns) with data.
Below is Pastebin file of Google Get which returns a huge JSON of pretty much everything in the doc (formatting, content etc.)
(Stack Overflow has an issue with me including the Pastebin file, so be ready for a huge file underneath here which probably won't fit)
This is a sample doc. If you check it out in a tool like https://jsoneditoronline.org/ (which I just used) to see the document structure, you'll note that it has 3 tables in total.
I've written some code that puts the start indexes of all the tables in the document into an array but I can't for the life of me figure out a clear explanation of how I can:
a) Insert a row (at the bottom of the first table for example)
b) Insert data into the first, second and 3rd column of that new row
I have read the guides but it is all very confusing - because after I insert a row the document changes and the startIndexes and all that adjust - is that correct?
If anyone has any input on the code that would insert a new row AND populate the columns in that row in a one easy to use solution I would really appreciate any help (hopefully without having to query the whole JSON again after inserting the row).
Thank you
P.S. Tried to insert pastebin link but it wouldn't let me... tried to paste JSON directly and it was too big so... I'll have to leave the question with the most info I can for now - I will ask Google direct and include the JSON.
Just updating that I've solved this by using the FPDF PHP library instead. I just copy the Google Docs text into this Google converter (converts to HTML), then pass all the HTML to the FPDF library.
So... question is no longer relevant.
For interested parties:
use DocumentService.BatchUpdateDocumentRequest()
request should be InsertTableRequest
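For anyone landing here, a rough sketch of what a batchUpdate request body for this could look like, written in Python as plain dictionaries (no API call is made). It assumes the `insertTableRow` and `insertText` request types of the Docs API; for adding a row to an existing table, `insertTableRow` is the one I would expect to need. The function name and all index values are hypothetical placeholders: the caller has to supply the table's start index, the index of its last row, and the cell indexes as they will be after the row has been inserted. The text inserts are ordered from the highest index down, so that one insert does not shift the target of the next.

```python
import json


def build_append_row_requests(table_start_index, last_row_index, cell_texts_by_index):
    """Build a batchUpdate body: one insertTableRow, then one insertText per cell.

    cell_texts_by_index maps a document index (valid AFTER the row insert)
    to the text that should go into the cell at that index.
    """
    requests = [
        {
            "insertTableRow": {
                "tableCellLocation": {
                    "tableStartLocation": {"index": table_start_index},
                    "rowIndex": last_row_index,
                    "columnIndex": 0,
                },
                # Put the new row below the referenced (last) row.
                "insertBelow": True,
            }
        }
    ]
    # Highest index first: text inserted later in the document
    # cannot move the positions of cells that come before it.
    for index in sorted(cell_texts_by_index, reverse=True):
        requests.append(
            {
                "insertText": {
                    "location": {"index": index},
                    "text": cell_texts_by_index[index],
                }
            }
        )
    return {"requests": requests}


# Placeholder indexes only; real values come from the documents.get response.
body = build_append_row_requests(5, 2, {40: "first", 42: "second", 44: "third"})
print(json.dumps(body, indent=2))
```

The resulting dictionary is what would be passed as the body of a single batchUpdate call, so the row and its three cells are filled in one round trip.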
for more information see:

Google Sheets import multiple HTML table images

Summary
I'm looking to import a data table from a website that does not appear to have an API. The table is broken down to various images and text. The goal is to have all of the content available in a table to then reference for other sheets.
Issue
When I pull in the data, I get some of the text, none of the other images, and a reference to another table. I looked up some options, but none of them yielded anything but blank cells.
I also tried to use the =IMAGE() formula with a direct link to the image URLs, but a portion of each URL is specific to the unit's release date, and as such is too dynamic to account for.
Excel Formula
=IMPORTHTML("https://gamepress.gg/pokemonmasters/database/sync-pair-list","table",3)
Unfortunately without an API it is going to be difficult to achieve what you aim here. These are the main reasons why:
PROBLEMS AND WORKAROUNDS
This table has nested tables that therefore need to be accessed separately. If you take a look at: =IMPORTHTML("https://gamepress.gg/pokemonmasters/database/sync-pair-list","table",4)
you will see that table 4 of this HTML page holds the stats of one character from the main table. If you try 5 or 6 you will realise that the nested tables are not even numerically ordered and that you cannot reach them through the main table (i.e. mainTable[0].nestedTable). A laborious approach is to go through them one by one, finding each character's stat table and placing it next to that character. For this I recommend extracting only the name field of the main table, so that you can align each stat table with its character. You can do this using: =INDEX(IMPORTHTML("https://gamepress.gg/pokemonmasters/database/sync-pair-list","table",3),0,1). You can find out more about INDEX here
IMPORTHTML cannot access images or links, so it will be very difficult to get the images in the last columns. A way to solve this is, as you mentioned, to use IMAGE with the image URL, like this: =IMAGE("https://gamepress.gg/pokemonmasters/sites/pokemonmasters/files/styles/30x30/public/2019-07/Electric.png?itok=fkRfkrFX"). You can find more info about inserting images here
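If working outside of Sheets is an option, the image URLs can be collected with a small script instead. This is a different technique from IMPORTHTML, not something Sheets offers. The sketch below uses Python's standard html.parser on a made-up HTML snippet; the class name, the sample rows and the example paths are all hypothetical, and it does not handle the nested tables of the real page.

```python
from html.parser import HTMLParser


class RowImageCollector(HTMLParser):
    """Collect the first cell's text and every <img> src for each <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []          # one {"name": str, "images": [str]} per row
        self._row = None        # row currently being parsed, if any
        self._cell_index = -1   # index of the current cell within the row
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = {"name": "", "images": []}
            self._cell_index = -1
        elif tag in ("td", "th") and self._row is not None:
            self._cell_index += 1
            self._in_cell = True
        elif tag == "img" and self._row is not None:
            src = dict(attrs).get("src")
            if src:
                self._row["images"].append(src)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Only the first cell of a row is treated as the name.
        if self._row is not None and self._in_cell and self._cell_index == 0:
            self._row["name"] += data.strip()


# Made-up sample markup standing in for the real page.
SAMPLE_HTML = """
<table>
  <tr><td>Unit A</td><td><img src="/files/Electric.png?itok=abc"></td></tr>
  <tr><td>Unit B</td><td><img src="/files/Fire.png?itok=def"></td></tr>
</table>
"""

collector = RowImageCollector()
collector.feed(SAMPLE_HTML)
for row in collector.rows:
    print(row["name"], row["images"])
```

Each collected src could then be pasted into an =IMAGE() formula next to the matching name.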
CONCLUSION
To sum up, there is no easy way to solve this problem. The closest you can get is by:
Importing the name column.
Figuring out which tables belong to which character and placing them next to their name.
Getting the image URL of each weakness and type and adding it to each character.
I am sorry this site does not have an API to make things smooth, good luck with your project and let me know if you need anything else or if you did not understand anything.
Here you can find more information about IMPORTHTML

Embed just a range of editable google spreadsheet

I have a google spreadsheet, and I give each of my users their own small range they can edit (just their own row, actually). Now I want to embed this sheet using iframes. How do I embed just a range of this editable spreadsheet? This line shows the desired range as I want it, but because of the "pubhtml?", it isn't editable:
src="https://docs.google.com/spreadsheets/d/1mjKXUsDs9EfqV9WztdfmNLm-sZwhphTieqEoBEHWce4/pubhtml?gid=0&single=true&widget=true&headers=false&range=a1%3Ah5&widget=false&chrome=false&rm=minimal"
When I change just the "pubhtml?" to "edit?", it becomes editable by those users as it should, but it shows me the entire sheet, including headers etc.:
src="https://docs.google.com/spreadsheets/d/1mjKXUsDs9EfqV9WztdfmNLm-sZwhphTieqEoBEHWce4/edit?gid=0&single=true&widget=true&headers=false&range=a1%3Ah5&widget=false&chrome=false&rm=minimal"
I can find a lot of (confusing) info on some of the parameters for embedding, but didn't find an answer to this problem. Or is it impossible?
(Note: I did find two similar questions, but they didn't answer my problem - or maybe I didn't understand the answers? :-)
Thanks,
Stef
Embedding Google Sheets does seem rather less intuitive than one might wish. But some research (remember, Google is your friend, unless perhaps if your name rhymes with Rump) and some trial and error have delivered a solution to your question.
The OP's plan is to provide each user with access to their own row for data entry. There are several ways that one might imagine that this could/would be done. The most obvious (to me) is simply to give the user access to "their row" on the master sheet - for instance, "user A" gets access to, say, row 53, "user B" gets access to row 17, and so on. This is quite easy to do (as we will see) BUT it is worthwhile/important that the user should also see the column headers. If, say, the column header is in row 1, then "user A" needs access to row 1 and row 53. Problem! We can give access to contiguous rows, but not to two discrete ranges. So this approach simply isn't possible - or at least I couldn't find a way to make it happen.
The approach that I took was to start with the master sheet. Then add one extra sheet for each user. For example, we add a sheet "User A", "User B" and so on. Each "User sheet" has only two rows of data. Row 1 contains column headers, Row 2 contains the user data - this gives us two contiguous rows that we can make accessible to the user. The cells in the master sheet change from containing hand-entered/hard-coded data to simple formulae that link to the appropriate column on the appropriate user sheet.
I don't believe that it is wise to give each user access to their own row (however it is that this might be done). In my opinion, the various security implications don't justify the risk. My strong recommendation would be that each user should have their own sheet (that is, a separate doc for each user). The user then gets access only to a limited number of rows in that sheet, and the master sheet (which is a separate file) contains formulae that pick up the data in the user sheet (also a separate file). With this approach, if a user manages to "screw up" (whatever/whenever/however - but you just know it's gonna happen) then only their sheet (and the link to the master) is affected. This compares to an approach that would put the entire master spreadsheet at risk.
For the sake of completeness, I propose to address the various options - as eye-wateringly tedious as it may sound/be.
For the sake of reference, I created a Google sheet (so_46059687) as a stand-in for the OP's master sheet.
Non-editable embed - (Version1)
In this example, one can see but not edit the master sheet. The document includes two "User" sheets but these are not visible (by choice).
I found two documents from the Google forums very helpful "Embed Sheet and remove Titles and Scrolling bars" and "How to I edit the height/width of a google sheets embed code when embedding it on my website?"
Codepen example
<iframe width="100%" height="250" src="https://docs.google.com/spreadsheets/d/e/2PACX-1vR-1keK8Wmyr4V6o6cjskLCetvsmbLeMsJuZViPpqkPck2-P2kCb4E4Ta_YMjbawz4lfgU_LVPFuqya/pubhtml?gid=0&single=true&widget=true&headers=false"></iframe>
Non-editable embed - (Version2)
This is a variation on the theme of version 1, except that one of the user sheets has been made visible. The second user sheet is not visible (by choice).
Codepen example
<iframe width="100%" height="250" src="https://docs.google.com/spreadsheets/d/e/2PACX-1vR-1keK8Wmyr4V6o6cjskLCetvsmbLeMsJuZViPpqkPck2-P2kCb4E4Ta_YMjbawz4lfgU_LVPFuqya/pubhtml?widget=true&headers=false"></iframe>
Editable embed - (Option#1)
This scenario shows a solution to the OP's question. The user can edit the data for their row. There is a downside - the user has complete access to their sheet. They can use rows and columns other than the ones linked to the master sheet, they can add a sheet but worse, they can delete their sheet.
These two invaluable articles explained how to create an editable embed: Google Docs: Embedding editable Google Docs and How to: Embed an editable Google Docs sheet.
Codepen example
<iframe width="100%" height="400" src="https://docs.google.com/spreadsheets/d/1XqT5umvq2vzK7CEivVJXTJdKlBW07bP9nnMMWs2px_Y/edit?usp=sharing"></iframe>
There are some things to note:
The user sheet called "Fred" has been "Share[d]". In this case, I chose "Anyone with the Link" so that it works in the Codepen, but in practice I would set the permissions so that only nominated people could edit the sheet.
The mastersheet is padlocked which means it is visible, but not editable. Another option would have been to hide the master sheet altogether but I chose otherwise in order to demonstrate the options open to the owner.
Note that the other user sheet (called "Joe") is not visible (because it hasn't been shared).
Note that whatever is edited on the user sheet (called "Fred") is immediately updated in master (called "Coy_Summary")
As noted, the user can delete their sheet or add an extra sheet. However this article (and code) "Google Spreadsheet Script To Insert And Delete Sheets with Protection" is apparently quite effective. I haven't tried it - I'll leave that to others.
Editable embed - (Option#2)
This is an example of how my recommendation might work. Each user (in this case, "Brian") has their own file (sheet), and the master sheet picks up the user data with the "IMPORTRANGE" function. Another aspect is that the user is limited to just two rows (though nothing limits access to extra columns).
Refer Google sheet
The document "How to embed specific cells range when embedding a Google spreadsheet" is essential reading. It explains in detail how to limit the range that is embedded. Brilliant! "How to Share only Specific Sheet/Single Tab in Google Spreadsheet?" was also valuable.
Codepen example
<iframe width="250" height="75" src="https://docs.google.com/spreadsheets/d/e/2PACX-1vSzsA_yrb2uBCXywikOAbWrLnnEPYazevavza7PmtX9C6-xNw4p31gtCRBiCyxYkxVK7aMAWY1xZJ2o/pubhtml/sheet?headers=false&gid=936292221&range=A1:C2"></iframe>
BTW, if you look at the master sheet in Codepen Option#1, you can see the actual "importrange" formulae.
This is the formula for "Item #" in Column B of the master sheet:
=importrange("https://docs.google.com/spreadsheets/d/1-HtjEawH7p45qKY5c0syIOnTF15endnG4L8wIIPEaAs/edit?usp=sharing", "Brian!b2:b2")
The closest I got was
<iframe width="100%" height="700" seamless frameborder="0" scrolling="yes"
src="https://yoursheetid?rm=minimal#gid=0/edit?">
</iframe>
I used rm=minimal with the edit parameter.
You may have to replace ? with &, but it works. The only thing I can't get rid of is the row and column headers.
I'm using Google sites to embed items
Hope this helps

Google sheet embed URL documentation

Does anyone know if there is any official documentation for Google spreadsheet embed URL parameters?
That is, given an embed URL from Google Sheets like this:
https://docs.google.com/a/aicr.org/spreadsheet/pub?key=0AhExuVBhVYT1dGxxejBmUHAzYUhGb25veTRkdW1YekE&single=true&gid=1&output=html&gridlines=false
What do the arguments do, and
What other arguments are available, that aren't included by default?
After much digging and searching, I have found:
Some parameters don't seem to do anything (&single=true, &embedded=true)
Some parameters are declared confidently in google search results, but don't work (&gridlines=false)
Some parameters don't seem to appear in any searches I have done (&output=csv)
... and no search I have done has produced anything even remotely approaching either of:
an official, google-maintained document for embed URLs
a code view of the code that is used to parse the embed URLs
By trial and error I have found:
&key=[ID]
google sheet ID
&single=[true|false]
true: ??? (present when I have published only a single sheet)
false: ???
&gid=[#]
sheet ID ??? (present when I have published only a single sheet)
perhaps this can be used to specify a sheet and range when your entire google sheets doc has been 'published to the web' (instead of just one sheet from your doc)
&range=[CellAddress1:CellAddress2]
specify a range of cells to include, eg "B1:C20"
if 'widget=' is false or not present, suppresses display of the usual google header & footer info
if the range specified is larger than the published sheet, displays only the sheet while still suppressing the header and footer.
&embedded=[true|false]
true: ???
false: ???
this item is included in the embed code offered from within google sheets (set to "true"), but doesn't seem to have any effect.
&widget=[true|false]
true: display entire shared item. Overrides "range=". Does NOT include the google disclaimer footer.
false: include google disclaimer footer in output (unless 'range=' is also present)
&output=[html|txt|csv]
html (default): output as an html table within code that also includes Google tracking code
txt: output the content of the specified range or sheet as tab separated text
csv: output as csv
&gridlines=[???]
this apparently used to work but doesn't work for me.
To suppress gridlines in embedded sheets I set borders on all cells, then color the borders to match the sheet's background color (eg solid white borders on a white-background sheet).
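When experimenting like this, it helps to split an embed URL into its arguments first, so each one can be toggled on its own. A minimal sketch in Python using only urllib.parse, applied to the embed URL from the question; the function name is just an illustration.

```python
from urllib.parse import parse_qs, urlsplit

# The embed URL quoted in the question.
EMBED_URL = (
    "https://docs.google.com/a/aicr.org/spreadsheet/pub"
    "?key=0AhExuVBhVYT1dGxxejBmUHAzYUhGb25veTRkdW1YekE"
    "&single=true&gid=1&output=html&gridlines=false"
)


def embed_params(url):
    """Return the query arguments of a URL as a dict of name -> list of values."""
    return parse_qs(urlsplit(url).query, keep_blank_values=True)


params = embed_params(EMBED_URL)
for name, values in params.items():
    print(name, "=", values)
```

Values come back as lists because an argument can appear more than once in a URL, which is worth checking for when an embed behaves unexpectedly.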
Here are some of the parameters I found for Google Docs (thanks goes to Joel http://obstruction.tumblr.com/post/60784440737/google-docs-url-parameters-rm-minimal-rm-full):
Google Docs URL parameters:
rm=minimal
rm=full
rm=embedded
rm=demo
rm=(render mode)
ui=2 (select the interface version)
chrome=false (full screen mode)
frameborder=(size of border)
q=(Whatever) Search Query
gid=24 (Which sheet you want to display)
widget=false
single=true
range=A2:AA26
Output=html
format=(export spreadsheet)
format=xlsx
format=csv
widget=false
width=(width)
height=(height)
viewer?
start=
channel=
ibd=
client=
I've been looking for the same thing! One more URL parameter I have found useful is
&rm=[minimal|?]
minimal: hides the top menu and cell inspector, but still shows row numbers, column letters, and the Add More Rows feature at the bottom.
This resource describes some of the parameters, though I can't vouch for its accuracy.
http://www.goopal.org/google-sites-business/google-spreadsheets/spreadsheet-output/publish-spreadsheet#TOC-Other-Export-Parameters
The most helpful list of parameters I found comes from Steegle.com.
You can use the htmlembed URL to display just a range from a Google Sheet - here's how to structure the URL
https://docs.google.com/spreadsheets/d/SpreadsheedID/htmlembed?single=true&gid=SheetID&range=D15:E15&widget=false&chrome=false&headers=false
SpreadsheedID should be the long letters, numbers and characters you get in the normal URL
htmlembed is for sheets you have not published: use pubhtml instead if you have chosen to publish the sheet (if you want the public to see it, it's the best way)
single we have never been sure what it does, but we think it helps with only showing a single sheet instead of multiple sheets
SheetID is the sheet number you get in the normal URL after the ?gid= (this is not the sheet name you have specified but the automatic number that Google Sheets provides)
range lets you specify the range of cells you want to display
widget lets you choose whether to display the sheet tabs at the bottom
chrome lets you choose whether to display the spreadsheet title (& sheetname) at the top
headers lets you choose whether to display the spreadsheet title at the top
Source: https://www.steegle.com/google-sites/how-to/insert-websites-apps-scripts-and-gadgets/embed-google-sheet-range
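The URL structure above can be put together programmatically. A small sketch in Python, assuming the parameter names listed in this answer; the function name and the placeholder IDs are hypothetical, and whether Google honours each parameter is not something this code can check.

```python
from urllib.parse import urlencode


def build_range_embed_url(spreadsheet_id, sheet_gid, cell_range, published=False):
    """Build an embed URL that shows only cell_range of one sheet.

    published=False uses htmlembed, published=True uses pubhtml,
    following the description in the answer above.
    """
    endpoint = "pubhtml" if published else "htmlembed"
    query = urlencode(
        {
            "single": "true",
            "gid": sheet_gid,
            "range": cell_range,
            "widget": "false",
            "chrome": "false",
            "headers": "false",
        },
        safe=":",  # keep the colon in ranges such as D15:E15 readable
    )
    return f"https://docs.google.com/spreadsheets/d/{spreadsheet_id}/{endpoint}?{query}"


# SPREADSHEET_ID and the gid 0 are placeholders, not real values.
url = build_range_embed_url("SPREADSHEET_ID", 0, "D15:E15")
print(url)
```

The returned string is what would go into the src attribute of the iframe.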

Problems creating hyperlinks using Apache POI 3.8-beta4 in a SXSSF workbook

It appears that hyperlink cells are not created correctly when using the POI SXSSF implementation. I have taken an exact copy of the example code from the HOW-TO guide for creating hyperlinks and changed the workbook to be SXSSF instead of XSSF, and the hyperlinks no longer function.
Has anyone else seen this problem or discovered a workaround?
Thanks,
Mark.
SXSSF is quite new, and currently aimed at only certain tasks. If you can, I'd advise you to look at how XSSF does it, and submit a patch!
In the meantime, you can probably get away with using the HYPERLINK function instead. Set your cell to be a formula cell, and set the formula to be something like HYPERLINK("http://stackoverflow.com/","Stack Overflow") and it'll show as a link in Excel
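As a language-neutral illustration of the formula text itself (not POI code), here is a small Python sketch that builds the HYPERLINK string. Excel formula string literals use double quotes, and a double quote inside a literal is written twice. The function name and the leading_equals flag are made up for this example; whether a leading "=" is wanted depends on how the formula is handed to the spreadsheet library.

```python
def hyperlink_formula(url, label, leading_equals=False):
    """Return an Excel HYPERLINK formula for url, shown with the text label."""

    def quote(text):
        # Wrap in double quotes and double any embedded double quote.
        return '"' + text.replace('"', '""') + '"'

    formula = f"HYPERLINK({quote(url)},{quote(label)})"
    return "=" + formula if leading_equals else formula


plain = hyperlink_formula("http://stackoverflow.com/", "Stack Overflow")
escaped = hyperlink_formula("http://stackoverflow.com/", 'Say "hi"')
print(plain)
print(escaped)
```

The resulting string is what would be set as the cell's formula.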
Update: Support was added to SXSSF to support hyperlinks in r1145629
I know this is an old post, but it came up repeatedly while I was doing searches on the same subject.
I'm using POI 3.9X and it does work with hyperlinks, however there is a big downside if you are using really large amounts of rows with a hyperlink.
There is a limit of 65K hyperlinks per sheet in Excel.
If you decide to break your workbook into sheets after the 65K mark, the total number of hyperlink objects stays in memory (say, if using 1 per row), which can cause a huge memory spike if iterating quickly and can cause Out of Memory errors if there is not enough heap. By huge, I mean gigabytes for 200,000 rows.
The formula method DOES work, and I switched to it as it does not have the limitation of creating a hyperlink object that stays in memory when using SXSSF. This assumes you are dealing with a URL and not a relation.
For those that see a "0" based on the previous example, make sure to include the "=" before the Hyperlink Excel function
