I am testing the new Google Spreadsheets as there is a new feature I really need: the 200 sheets limit has been lifted (more info here: https://support.google.com/drive/answer/3541068).
However, I can't publish a spreadsheet to CSV as you could in the old version. When I go to 'File > Publish to the web', there are no longer options to publish 'all sheets' or specific sheets, and you can't specify cell ranges to publish as CSV.
This limitation is not mentioned in the published 'Unsupported Features' documentation found at: https://support.google.com/drive/answer/3543688
Is there some other way this gets enabled or has it in fact been left out of the new version?
My use case: we retrieve BigQuery results into the spreadsheets and publish the sheets as CSV automatically using the "publish automatically on update" feature. That produces a CSV URL, which we place into charting tools that read it to generate the visuals.
Does anyone know how to do this?
The new Google spreadsheets use a different URL (just copy your <KEY>):
New sheet : https://docs.google.com/spreadsheets/d/<KEY>/pubhtml
CSV file : https://docs.google.com/spreadsheets/d/<KEY>/export?gid=<GUID>&format=csv
The <GUID> value is the gid of the tab you want to export; it appears in the spreadsheet's URL when that tab is open.
/!\ You have to share your document using the "Anyone with the link" setting.
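For what it's worth, here is a rough Python sketch of building and fetching that export URL. The function names (`csv_export_url`, `fetch_csv_rows`) are made up for illustration, the UTF-8 decoding is an assumption, and the fetch should only succeed for documents shared with "Anyone with the link":

```python
import csv
import io
import urllib.request


def csv_export_url(key, gid=0):
    """Build the per-tab CSV export URL in the format shown above."""
    return f"https://docs.google.com/spreadsheets/d/{key}/export?gid={gid}&format=csv"


def fetch_csv_rows(key, gid=0):
    """Download one tab and return it as a list of rows (lists of strings)."""
    with urllib.request.urlopen(csv_export_url(key, gid)) as resp:
        # Assumption: the export is served as UTF-8 text.
        text = resp.read().decode("utf-8")
    return list(csv.reader(io.StringIO(text)))
```

Calling `fetch_csv_rows("<KEY>", 0)` would then give you the first tab as a list of rows.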
Here is the solution, just write it like this:
https://docs.google.com/spreadsheets/d/<KEY>/export?format=csv&id=<KEY>
I know it's weird to write the KEY twice, but it works perfectly. A teammate from work discovered this by opening the Excel file in Google Docs and choosing File -> Download as -> Comma-separated values. The downloads section of the browser then shows a link to the CSV file, like this:
https://docs.google.com/spreadsheets/d/<KEY>/export?format=csv&id=<KEY>&gid=<SOME NUMBER>
It doesn't work in that format, though. What my teammate did was remove "&gid=<SOME NUMBER>", and then it worked! Hope it helps everyone.
If you enable "Anyone with the link" sharing for the spreadsheet, here is a simple way to export a range of cells or columns (or whatever you feel like) as HTML, CSV, XML, or JSON via a query:
https://docs.google.com/spreadsheet/tq?key=YOUR-KEY&gid=1&tq=select%20A,%20B&tqx=reqId:1;out:html;%20responseHandler:webQuery
For the tq variable, read the query language reference.
For the tqx variable, read the request format reference.
The downside is that your doc is still available in full via the public link, but if you want to export/import data to, say, Excel, this is a perfect way.
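If you build these query URLs in code, the tq value has to be percent-encoded. A minimal Python sketch, assuming the same endpoint and parameter names as the URL above (`tq_url` is a hypothetical helper, not part of any Google library):

```python
from urllib.parse import quote, urlencode


def tq_url(key, gid, query, out="csv"):
    """Build a tq query URL; every parameter value is percent-encoded."""
    params = {"key": key, "gid": gid, "tq": query, "tqx": f"out:{out}"}
    return "https://docs.google.com/spreadsheet/tq?" + urlencode(params, quote_via=quote)
```

For example, `tq_url("YOUR-KEY", 1, "select A, B")` encodes the spaces and the comma in the query for you.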
It's not going to help everyone, but I've made a PHP script to read the HTML into an array.
I've added converting back to a CSV at the end. Hopefully this will help some people who have access to PHP.
$html_link = "https://docs.google.com/spreadsheets/d/XXXXXXXXXX/pubhtml";
$local_html = "sheets.html";
$file_contents = file_get_contents($html_link);
file_put_contents($local_html,$file_contents);
$dom = new DOMDocument();
$html = @$dom->loadHTMLFile($local_html); //Added an @ to hide warnings - you might remove this when testing
$dom->preserveWhiteSpace = false;
$tables = $dom->getElementsByTagName('table');
$rows = $tables->item(0)->getElementsByTagName('tr');
$cols = $rows->item(0)->getElementsByTagName('td'); //You'll need to edit the (0) to reflect the row that your headers are in.
$row_headers = array();
foreach ($cols as $i => $node) {
if($i > 0 ) $row_headers[] = $node->textContent;
}
foreach ($rows as $i => $row){
if($i == 0 ) continue;
$cols = $row->getElementsByTagName('td');
$row = array();
foreach ($cols as $j => $node) {
$row[$row_headers[$j]] = $node->textContent;
}
$table[] = $row;
}
//Convert to csv
$csv = "";
foreach($table as $row_index => $row_details){
$comma = false;
foreach($row_details as $value){
$value_quotes = str_replace('"', '""', $value);
// Quote the field if it contains a comma, a double quote, or a line break
$csv .= ($comma ? "," : "") . ( preg_match('/[",\r\n]/', $value) ? '"'.$value_quotes.'"' : $value_quotes );
$comma = true;
}
$csv .= "\r\n";
}
//Save to a file and/or output
file_put_contents("result.csv",$csv);
print $csv;
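If you don't have PHP at hand, the same idea (read the table cells out of the published HTML, then write them back out as CSV) can be sketched with Python's standard library. This is only an illustration: `TableRows` and `html_table_to_csv` are names I made up, and it collects the rows of every table on the page rather than just the first one.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html):
    """Convert all table rows found in the HTML into CSV text."""
    parser = TableRows()
    parser.feed(html)
    out = io.StringIO()
    # The csv module handles quoting of commas, quotes and line breaks.
    csv.writer(out, lineterminator="\r\n").writerows(parser.rows)
    return out.getvalue()
```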
Here is another temporary, non-PHP workaround:
Go to an existing NEW google sheet
Go to "File -> New -> Spreadsheet"
In the new spreadsheet, "File -> Publish to the web..." now has the option to publish a CSV version
I believe this actually creates an old-style Google sheet, but for my purposes (importing Google sheet data from clients or myself into R for statistical analysis) it works until they hopefully update this feature.
I posted this in a Google Groups forum also, please find it here:
https://productforums.google.com/forum/#!topic/docs/An-nZtjaupU
The correct URL for downloading a Google spreadsheet as CSV is:
https://docs.google.com/spreadsheets/export?id=<ID>&exportFormat=csv
The current answers no longer work. The following has worked for me:
Go to File -> "Publish to the web", select 'start publishing' and pick the format. I chose text (which is TSV).
Now just copy the URL shown there, which will be similar to https://docs.google.com/spreadsheet/pub?key=YOUR_KEY&single=true&gid=0&output=txt
That new feature appears to have disappeared. I don't see any option to publish a csv/tsv version. I can download tsv/csv with the export, but that's not available to other people with merely the link (it redirects them to a google docs sign-in form).
I found a fix! So I discovered that old spreadsheets before this change were still allowing only publishing certain sheets. So I made a copy of an old spreadsheet, cleared the data out, copy and pasted my current info into it and now I'm happily publishing just a single sheet of my large spreadsheet. Yay
I was able to run a query against the sheet; see this table:
https://docs.google.com/spreadsheets/d/1LhGp12rwqosRHl-_N_N8eTjTwfFsHHIBHUFMMyhLaaY/gviz/tq?tq=select+A,B,I,J,K+where+B%3E=4.5&pli=1
The spreadsheet fetches earthquake data, but I only want earthquakes of magnitude 4.5 and above, so the query selects those rows and the columns I need. There is just one problem:
I cannot parse the result. I tried to decode it as JSON but was not able to parse it.
I would like to show this as HTML or CSV. How can I parse it, for example to plot it on a Google Map?
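One likely reason the JSON decoding fails: as far as I know, the default gviz/tq response is not bare JSON but JSON wrapped in a `google.visualization.Query.setResponse(...)` call. Here is a minimal Python sketch, assuming that wrapper and the usual `table`/`rows`/`c`/`v` layout, that strips the wrapper before decoding. Both helper names are hypothetical:

```python
import json


def parse_gviz_response(text):
    """Extract the JSON object from a gviz response wrapped in a function call."""
    start = text.index("(") + 1  # first "(" opens setResponse(
    end = text.rindex(")")       # last ")" closes it
    return json.loads(text[start:end])


def rows_as_lists(payload):
    """Flatten payload["table"]["rows"] into plain lists of cell values."""
    return [
        [cell["v"] if cell else None for cell in row["c"]]  # empty cells are null
        for row in payload["table"]["rows"]
    ]
```

If plain CSV is enough, adding `tqx=out:csv` to the query URL may be simpler than parsing the wrapper at all.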
Website Link
https://redacted
xml options I have tried so far
<span aria-labelledby="amount">722</span>
//*[@id="amount"]/h3/span[2]
/html/body/div[3]/main/div/span/div/div/div[2]/div/div/div[2]/div/div[2]/div[3]/div/div/div/div[2]/div[1]/h3/span[2]
None of them work.
I'm trying to use =IMPORTXML to pull the value "722" from this page (that was the value on 5/5/22, anyway).
Unfortunately, it seems that your expected value cannot be directly retrieved using the XPath. Because the value is put to the HTML using Javascript and IMPORTXML cannot analyze the result of Javascript. But, fortunately, it seems that your expected value is included in the HTML as the JSON data. So, in this answer, I would like to retrieve the value from the JSON data.
Pattern 1:
In this pattern, IMPORTXML and REGEXEXTRACT are used.
=ARRAYFORMULA(REGEXEXTRACT(IMPORTXML(A1,"//script[@data-component-name='GetOfferWrapper']"),"defaultEstimatedValue"":(.+?)}"))
The URL https://www.gazelle.com/iphone/iphone-13-pro-max/other/iphone-13-pro-max-1tb-other/498082-gpid is put in the cell "A1".
When this formula is used, the following result is obtained.
Pattern 2:
In this pattern, a custom function created by Google Apps Script is used. When the value is retrieved from JSON data, Google Apps Script is useful. When you use this script, please copy and paste the following script to the script editor of Spreadsheet and save the script. And, please put a custom function of =SAMPLE("https://www.gazelle.com/iphone/iphone-13-pro-max/other/iphone-13-pro-max-1tb-other/498082-gpid") to a cell.
function SAMPLE(url) {
const res = UrlFetchApp.fetch(url).getContentText();
const data = res.match(/<script.+data-component-name="GetOfferWrapper".+?>([\w\s\S]+?)<\/script>/);
if (!data || data.length == 0) return "No data";
const obj = JSON.parse(data[1]);
return obj.initState.defaultEstimatedValue;
}
The URL https://www.gazelle.com/iphone/iphone-13-pro-max/other/iphone-13-pro-max-1tb-other/498082-gpid is put in the cell "A1".
When this formula is used, the value of 722 is retrieved.
Note:
The formula and custom function can be used for the current HTML. So, when the specification of HTML is changed, those might not be able to be used. Please be careful about this.
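If you want to try the same extraction outside of Sheets, here is a rough Python sketch of the idea in Pattern 2: find the script tag by its data-component-name attribute, parse its body as JSON, and read initState.defaultEstimatedValue. The function name is hypothetical, and it takes the HTML as a string, so fetching the page is left to you. The same caveat applies: it only works while the page keeps this structure.

```python
import json
import re


def extract_estimated_value(html):
    """Return initState.defaultEstimatedValue from the embedded JSON, or None."""
    match = re.search(
        r'<script[^>]*data-component-name="GetOfferWrapper"[^>]*>(.*?)</script>',
        html,
        re.DOTALL,  # the JSON body may span several lines
    )
    if match is None:
        return None
    data = json.loads(match.group(1))
    return data["initState"]["defaultEstimatedValue"]
```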
References:
IMPORTXML
REGEXEXTRACT
Custom Functions in Google Sheets
fetch(url)
JSON.parse()
You will need to find another site with the data you are attempting to scrape. The #N/A error is the result of Google Sheets not supporting the import of JavaScript-generated elements. You can always check for compatibility by disabling JS in the site settings; only what's left can usually be scraped. In this case it's nothing:
I tried IMPORTHTML("https://nepsealpha.com/investment-calandar/dividend","table",) and then IMPORTXML("https://nepsealpha.com/investment-calandar/dividend",xpath). I got the XPath from the "SelectorGadget" extension for Google Chrome, but still couldn't import the data. It shows either "empty content" or a "formula parse error".
You can retrieve nearly all of the information this way:
=importxml(url,"//div/@data-page")
and then parse the JSON.
By script : =getData("https://nepsealpha.com/investment-calandar/dividend")
function getData(url) {
var from='data-page="'
var to='"></div></body>'
var jsonString = UrlFetchApp.fetch(url).getContentText().split(from)[1].split(to)[0].replace(/&quot;/g,'"')
var json = JSON.parse(jsonString).props.today_prices_summary.top_volume
var headers = Object.keys(json[0]);
return ([headers, ...json.map(obj => headers.map(header => obj[header]))]);
}
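For anyone doing this outside Apps Script, here is a small Python sketch of the same steps: pull the data-page attribute out of the page source, undo the HTML escaping (the quotes inside the attribute arrive as &quot;), parse the JSON, and turn a list of records into rows. The helper names are made up, and the props.today_prices_summary.top_volume path is simply taken from the script above:

```python
import html
import json
import re


def extract_data_page(page_source):
    """Decode the JSON stored in the first data-page="..." attribute."""
    match = re.search(r'data-page="([^"]*)"', page_source)
    if match is None:
        raise ValueError("no data-page attribute found")
    # html.unescape turns &quot; back into " before the JSON is parsed.
    return json.loads(html.unescape(match.group(1)))


def records_to_table(records):
    """Turn a list of dicts into a header row followed by value rows."""
    headers = list(records[0])
    return [headers] + [[rec.get(h) for h in headers] for rec in records]
```

With the parsed data in hand, `records_to_table(data["props"]["today_prices_summary"]["top_volume"])` gives the same header-plus-rows shape the custom function returns.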
edit
To update periodically, add this script:
function update(){
var chk = SpreadsheetApp.getActiveSpreadsheet().getSheets()[0].getRange('A1')
chk.setValue(!chk.getValue())
}
Put a trigger of your choice on the update function and change the formula as follows:
=getData("https://nepsealpha.com/investment-calandar/dividend",$A$1)
I know that's not the answer you want to see.
It's impossible to get any content from this website using IMPORTXML or other tools included in Google Sheets.
It's generated using Javascript. Once Javascript is disabled no content is displayed:
It's done on purpose. Financial companies pay for live stock data and they don't want to share it with us for free.
So the site is protected against tools like importxml.
I'm working on a project that allows the customer to export MySQL data to .xls format. I'm using the PhpSpreadsheet library.
That part is done, but my data contains lots of dates, and some of them are 0000-00-00, which means the field is not used.
I want to replace all of these '0000-00-00' values with '-'.
I used Excel's find and replace and saved it as a macro (.bas).
What I have tried:
Loading the .bas file with IOFactory and a reader in PHP, but it says the file format is not accepted.
Using the SUBSTITUTE function inside the PHP loop that fetches the SQL data values:
$activeSheet->setCellValue('L'.$i, '=substitute('L'.$i ,"0000-00-00", "-')');
$i starts at 1 and increases by 1 on each loop iteration.
This method fails because I can't include $i inside substitute() due to the conflict between "" and '' quotes. I tried swapping them around, but it seems that 0000-00-00 and - must be wrapped in "", otherwise the formula is not recognised by the library, and then $i is not picked up.
Is there any way to solve either of these problems, or can they not be solved in the first place?
I can't find any explanation of macros in PhpSpreadsheet from the community or on Google.
When setting the value of the cell
if ($datefromselect == '0000-00-00') {
$activeSheet->setCellValueByColumnAndRow($colnum, $rownum, '-');
} else {
$activeSheet->setCellValueByColumnAndRow($colnum, $rownum, $datefromselect);
}
or get it done in the select as in
SELECT lastname,
if(date_closed = '0000-00-00', '-', date_closed)
FROM `lca_clients`
I can't find any reference to an API that enables Rest API clients to export an existing Google Sheet to a csv file.
https://developers.google.com/sheets/
I believe there should be a way to export them.
The following URL gives you the CSV of a Google spreadsheet per sheet. The sheet must be accessible by the public, by anyone with the link (unlisted).
The parameters you need to provide are:
sheet ID (that is simply the ID in the URL of a Google Spreadsheet https://docs.google.com/spreadsheets/d/{{ID}}/edit)
sheet name (that is simply the name of the sheet as given by the user)
https://docs.google.com/spreadsheets/d/{{ID}}/gviz/tq?tqx=out:csv&sheet={{sheet_name}}
With that URL you can run a GET-request to fetch the CSV.
Or paste it in your browser address bar.
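One thing to watch when building this URL in code: sheet names containing spaces or other special characters need to be percent-encoded. A minimal Python sketch (`gviz_csv_url` is a hypothetical helper name):

```python
from urllib.parse import quote


def gviz_csv_url(spreadsheet_id, sheet_name):
    """Build the gviz CSV URL for one named sheet, escaping the name."""
    return (
        f"https://docs.google.com/spreadsheets/d/{spreadsheet_id}"
        f"/gviz/tq?tqx=out:csv&sheet={quote(sheet_name, safe='')}"
    )
```

The resulting URL can then be fetched with any HTTP client, as long as the sheet is shared as described above.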
You can use the Drive API to do this today -- see https://developers.google.com/drive/v3/web/manage-downloads#downloading_google_documents, however that will limit you to the first sheet of the document. The Sheets API doesn't expose exporting as CSV today, but may offer it in the future.
Nobody's mentioned gspread yet, so here's how I did it:
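import gspread
import pandas as pd
#assumption: gc is an authorized gspread client (e.g. gc = gspread.service_account()) and sheet_id is your spreadsheet ID
#open sheet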
sheet = gc.open_by_key(sheet_id)
#select worksheet
worksheet = sheet.get_worksheet(0)
#download values into a dataframe
df = pd.DataFrame(worksheet.get_all_records())
#save dataframe as a csv, using the spreadsheet name
filename = sheet.title + '.csv'
df.to_csv(filename, index=False)
First, make the document accessible to anyone; you then get a URL. From this URL, extract the long ID made up of uppercase and lowercase letters and numbers. Then use this script.
#!/bin/bash
long_id="id_assigned_to_your_document"
g_id="number_assigned_to_card_in_google_sheet"
wget --output-document=temp.csv "https://docs.google.com/spreadsheets/d/$long_id/export?gid=$g_id&format=csv&id=$long_id"
If you use only one sheet (tab) in the document, its number is: g_id="0"
The problem you will probably run into is strange spaces in the downloaded file. I use this second script to process it:
#!/bin/bash
#Delete all lines beginning with a # from a file
#http://stackoverflow.com/questions/8206280/delete-all-lines-beginning-with-a-from-a-file
sed '/^#/ d' temp.csv |
# remove spaces
# http://stackoverflow.com/questions/9953448/how-to-remove-all-white-spaces-from-a-given-text-file
tr -d "[:blank:]" |
# regexp "1,2" into 1.2
# http://www.funtoo.org/Sed_by_Example,_Part_2
sed 's/\"\([-]\?[0-9]*\),\([0-9]*\)\"/\1.\2/g' > out.csv
Update
As Sam mentioned, the API is a better solution. There is now great documentation at:
https://developers.google.com/sheets/quickstart/php
It includes an example that generates output with a CSV structure.
If you don't have easy access to or familiarity with PHP, here's a very barebones Google Apps Script Web App that once deployed and the caller permission accepted, should allow clients with an appropriately scoped access token or api key to export an existing Google Sheet to a csv file. It takes a Google Sheets spreadsheet id and sheet name (and optional download filename) as query parameters, and returns the corresponding theoretically RFC 4180 compliant CSV file.
Further instructions on deploying an Apps Script project as a web app are here: https://developers.google.com/apps-script/guides/web#deploying_a_script_as_a_web_app.
You can deploy it and test it out easily in the browser just by visiting the "Current web app URL" (as provided when you publish as web app from the script editor), and accepting the consent screen, or even just visit the one that I deployed (configured to execute as the accessing user, and unverified/scary consent) at the example URL.
The tricky part (as usual) is getting the OAuth token or API key set up, but if you're already calling the Google Sheets V4 API, you've probably already got that dialed in. I used CURL to make sure that it behaved as a REST api, but the technique I used to get an OAuth token there is both a distraction and frankly a little scary to include here since it's really easy to mess up. If you don't already have a way to get one, that's probably a good topic for a separate SO question in any case.
One related (and big!) caveat: I'm not 100% sure how the consent and verification interact with a pure Rest client (i.e. how that works if you DON'T visit this in the browser first...), and/or whether this script would need to be in the same GCP project as the other code that uses the Sheets API. If there's interest, and/or it doesn't work right out of the box, please let me know and I'll happily dig deeper and follow up.
// Example URL, assuming:
// "Current web app URL": https://script.google.com/a/tillerhq.com/macros/s/AKfycbyZlWAW6bpCpnFoPjbdjznDomFRbTNluG4siCBMgOy2qU2AGoA/exec
// spreadsheetId: 1xNDWJXOekpBBV2hPseQwCRR8Qs4LcLOcSLDadVqDA0E
// sheet name: Sheet1
// (optional) filename: mycsv.csv
//
// https://script.google.com/a/tillerhq.com/macros/s/AKfycbyZlWAW6bpCpnFoPjbdjznDomFRbTNluG4siCBMgOy2qU2AGoA/exec?spreadsheetid=1xNDWJXOekpBBV2hPseQwCRR8Qs4LcLOcSLDadVqDA0E&sheetname=Sheet1&filename=mycsv.csv
//
var REQUIRED_PARAMS = [
'spreadsheetid', // example: "1xNDWJXOekpBBV2hPseQwCRR8Qs4LcLOcSLDadVqDA0E"
'sheetname' // Case-sensitive; example: "Sheet1"
];
// Returns an RFC 4180 compliant CSV for the specified sheet in the specified spreadsheet
function doGet(e) {
REQUIRED_PARAMS.forEach(function(requiredParam) {
if (!e.parameters[requiredParam]) throw new Error('Missing required parameter ' + requiredParam);
});
var spreadsheet = SpreadsheetApp.openById(e.parameters.spreadsheetid);
var sheet = spreadsheet.getSheetByName(e.parameters.sheetname);
if (!sheet) throw new Error("Could not find sheet " + e.parameters.sheetname + " in spreadsheet " + e.parameters.spreadsheetid);
var filename = e.parameters.filename || (spreadsheet.getName() + "_" + e.parameters.sheetname + ".csv");
var numRows = sheet.getLastRow();
var numColumns = sheet.getLastColumn();
var values = sheet.getSheetValues(1, 1, numRows, numColumns);
function quote(s) {
s = s.toString();
if ((s.indexOf("\r") == -1)
&& (s.indexOf("\n") == -1)
&& (s.indexOf(",") == -1)
&& (s.indexOf("\"") == -1)) return s;
// Fields containing line breaks (CRLF)*, double quotes, and commas should be enclosed in double-quotes;
// anything other than that we already returned, so if we get here -- escape it and quote it.
// *That's what the text of the RFC says, but the ABNF (...and Excel) treat EITHER CR or LF as requiring quotes.
// Replace any double quote with a double double quote, and wrap the whole thing in quotes
return "\"" + s.replace(/"/g, '""') + "\"";
};
var csv = values.map(function(row) {
return row.map(quote).join();
}).join("\r\n") + "\r\n";
return ContentService
.createTextOutput(csv)
.setMimeType(ContentService.MimeType.CSV)
.downloadAsFile(filename);
}
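For comparison, if the client side happens to be Python, the quoting rules that the `quote` function above implements by hand (quote fields containing commas, double quotes, or line breaks, and double any embedded quotes) are what the standard csv module does with minimal quoting. A small sketch, with `to_csv` as a made-up helper name:

```python
import csv
import io


def to_csv(rows):
    """Serialize rows with CRLF line endings and minimal quoting."""
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_MINIMAL, lineterminator="\r\n")
    writer.writerows(rows)
    return out.getvalue()
```

This could be handy for checking the web app's output against a known-good serializer.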
I was wondering if it's possible to download, say, only sheet 1 of a Google spreadsheet as Excel? I have seen a few SO posts that show how to export the WHOLE spreadsheet as Excel, but I need to export just one sheet. Is it possible at all, and if yes, how?
You can download a specific sheet using the 'GID'.
Each sheet has a GID, which you can find in the URL of the spreadsheet. You can then use this link to download a specific sheet:
https://docs.google.com/spreadsheets/d/<KEY>/export?format=xlsx&gid=<GID>
ex:
https://docs.google.com/spreadsheets/d/1D5vzPaOJOx402RAEF41235qQTOs28_M51ee5glzPzj0/export?format=xlsx&gid=1990092150
KEY is the unique ID of the spreadsheet.
source: https://www.quora.com/How-do-I-download-just-one-sheet-from-google-spreadsheet/answer/Ranjith-Kumar-339?srid=2YCg
From what I've found, the other two answers on this post are exactly correct, all you need to do is replace this:
/edit#gid=
with:
/export?format=xlsx&gid=
This works just fine although I did find that I had to keep looking up this string and copying it. Instead, I made a quick Javascript snippet that does all the work for you:
Just run the code snippet below and drag the link it creates into your bookmarks bar. I know this is a little hacky but for some reason, stackoverflow doesn't want me injecting javascript into the links I provide.
I've tested this on the latest versions of Chrome, Safari, and Firefox. They all work although you might have to get a little creative about how you make your bookmarks.
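If you'd rather do the rewrite in a script than in the browser, the same string replacement can be sketched in Python. `to_export_url` is a hypothetical helper, and it assumes the URL has the /edit#gid=N shape shown above:

```python
import re


def to_export_url(edit_url, fmt="xlsx"):
    """Rewrite .../edit#gid=N into .../export?format=<fmt>&gid=N."""
    match = re.match(
        r"(https://docs\.google\.com/spreadsheets/d/[^/]+)/edit.*?gid=(\d+)",
        edit_url,
    )
    if match is None:
        raise ValueError("not a spreadsheet edit URL with a gid")
    base, gid = match.groups()
    return f"{base}/export?format={fmt}&gid={gid}"
```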
Every Google spreadsheet URL looks like this:
https://docs.google.com/spreadsheets/d/1D5vzPaOJOx402RAEF41235qQTOs28_M51ee5glzPzj0/edit#gid=1078561300
In every spreadsheet URL we can see: /edit#gid=
this is generally the default mode.
/edit#gid=
just replace it with:
/export?format=xlsx&gid=
It will download that single sheet from the workbook.
I am able to download all sheets of a spreadsheet.
Just remove anything after
/edit?
and replace with
/export?format=xlsx
for Excel
or
/export?format=pdf
for PDF
Please use the any_value() function on the column, because the column has more than one value for each id in the GROUP BY.
For example:
select any_value(phone_no) from user_details group by user_id
Here one user_id has more than one phone number, so the query cannot tell which one to choose.
You can do this by clicking on the down arrow near the sheet name to bring up the options, and then selecting "Copy to -> New spread sheet", then click the "Open spread sheet" in the pop up that comes up after.
You can use my code:
function emailAsExcel() {
var config = {
to: "name@gmail.com",
subject: "your text",
body: "your text"
};
var ui = SpreadsheetApp.getUi();
if (!config || !config.to || !config.subject || !config.body) {
throw new Error('Configure "to", "subject" and "body" in an object as the first parameter');
};
var spreadsheet = SpreadsheetApp.getActiveSpreadsheet();
var spreadsheetId = spreadsheet.getId();
var file = Drive.Files.get(spreadsheetId);
var url = 'https://docs.google.com/spreadsheets/d/' + spreadsheetId + '/export?format=xlsx&gid=numberSheetID'; // replace numberSheetID with the gid of the sheet to email
var token = ScriptApp.getOAuthToken();
var response = UrlFetchApp.fetch(url, {
headers: {
'Authorization': 'Bearer ' + token
}
});
var fileName = (config.fileName || spreadsheet.getName()) + '.xlsx';
var blobs = [response.getBlob().setName(fileName)];
if (config.zip) {
blobs = [Utilities.zip(blobs).setName(fileName + '.zip')];
}
GmailApp.sendEmail(
config.to,
config.subject,
config.body,
{
attachments: blobs
}
);
}