I'm running file list searches with the Google Drive API and have found a mismatch between the API results and searches performed directly in the Google Drive web page on shared files. It seems to be a character issue.
If I search shared files for "företagsrevision" with "fullText contains 'företagsrevision'", the API finds nothing. But if I run the same search in Google Drive Web I get hits.
If I use other text that does not contain åäö there is no issue. What should I convert these characters to?
The characters should be URL-escaped (as they appear in a URL). Other than that, they should be found correctly. If not, it might be a bug. To help, what encoding are you using for the characters?
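As a minimal sketch of the URL escaping described above, assuming the text is UTF-8 and the query is sent as the q parameter of the Drive API v3 files endpoint (the helper name build_search_url is hypothetical, not part of any Google library):

```python
from urllib.parse import quote

# Drive API v3 "Files: list" endpoint.
FILES_ENDPOINT = "https://www.googleapis.com/drive/v3/files"


def build_search_url(term: str) -> str:
    """Build a Files: list URL whose q parameter searches the full text for term."""
    query = f"fullText contains '{term}'"
    # quote() percent-encodes the UTF-8 bytes of each character,
    # so "ö" becomes "%C3%B6" and a space becomes "%20".
    return f"{FILES_ENDPOINT}?q={quote(query, safe='')}"


print(build_search_url("företagsrevision"))
```

If the escaped URL still returns no hits, that would point towards the possible bug mentioned above rather than an encoding problem on the client side.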
Related
On a daily basis I receive an email attachment, which is an Excel spreadsheet. I push that spreadsheet to the Google Drive folder ./attachments using Microsoft Power Automate. The main purpose of pushing the spreadsheet to Google Drive is to load it into Power BI for analytics.
In Power BI I use the "Web" connector to import the file, and it works fine. A sample link is below.
https://docs.google.com/spreadsheets/u/2/d/1eBJR6wrcFrdjv4Lbf_Wq3MQOeUwBbgLw/export?format=xlsx
The above link exports the file, and hence I can load the data into Power BI.
The problem is that I get a new file in Drive every day, and the unique ID of the file is never the same. In the above example, the unique ID 1eBJR6wrcFrdjv4Lbf_Wq3MQOeUwBbgLw will be different for the second file, even though I rename the file to the same name with Microsoft Power Automate when pushing it to Google Drive, e.g. "PowerBi load file.xlsx". Is it possible to get a stable link for all the files with the same name?
I have also shared the whole folder ./attachments and tried to get the link of the file through it, but that doesn't work, e.g.:
https://drive.google.com/drive/folders/1h1VuPtXfWflgIQw7ecMTwweoLblADscq/PowerBi Analytics file.xlsx/export?format=xlsx
Any help, suggestions will be really appreciated.
Thanks everyone.
I believe your goals are as follows.
You want to retrieve the file IDs from a filename.
You want to retrieve the file IDs from a shared folder.
Answer for question 1:
In order to retrieve the file IDs from a filename, I think that the "Files: list" method of the Drive API can be used.
The endpoint is as follows.
GET https://www.googleapis.com/drive/v3/files?q=name%3D%27{filename}%27
The search query is name='{filename}'.
In this case, the API key cannot be used directly, because the file list is retrieved from the whole Google Drive that contains the file. An access token is required instead.
For this reason, I thought that your goal 2 might be more suitable.
Answer for question 2:
In order to retrieve the file IDs from a shared folder, I think that the "Files: list" method of the Drive API can also be used. In this case, the file list is retrieved only from the shared folder, so the API key can be used.
The endpoint is as follows.
GET https://www.googleapis.com/drive/v3/files?q=%271h1VuPtXfWflgIQw7ecMTwweoLblADscq%27%20in%20parents&key=[YOUR_API_KEY]
The search query is '1h1VuPtXfWflgIQw7ecMTwweoLblADscq' in parents.
In this case, the file list can be retrieved using the API key because the folder is publicly shared and the file list is retrieved directly from that publicly shared folder.
But note that this approach requires an API key. Please be careful about this.
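The two endpoints above can be assembled programmatically. The following is only a sketch: the function names are hypothetical, the API key is a placeholder, and no request is actually sent. It assumes single quotes inside a file name are escaped with a backslash, as the Drive query syntax expects.

```python
from urllib.parse import quote, urlencode

# Drive API v3 "Files: list" endpoint, as used in both answers above.
FILES_ENDPOINT = "https://www.googleapis.com/drive/v3/files"


def escape_query_value(value: str) -> str:
    """Escape single quotes so the value can sit inside a quoted query string."""
    return value.replace("'", "\\'")


def list_by_name_url(filename: str) -> str:
    """URL for goal 1. The request itself still needs an access token."""
    query = "name='" + escape_query_value(filename) + "'"
    # quote_via=quote encodes spaces as %20 instead of "+".
    return FILES_ENDPOINT + "?" + urlencode({"q": query}, quote_via=quote)


def list_in_folder_url(folder_id: str, api_key: str) -> str:
    """URL for goal 2. Works with an API key when the folder is publicly shared."""
    query = "'" + folder_id + "' in parents"
    return FILES_ENDPOINT + "?" + urlencode({"q": query, "key": api_key}, quote_via=quote)


print(list_by_name_url("PowerBi load file.xlsx"))
print(list_in_folder_url("1h1VuPtXfWflgIQw7ecMTwweoLblADscq", "[YOUR_API_KEY]"))
```

The file IDs in the JSON response could then be substituted into the export link from the question.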
Other pattern:
If you want to achieve your goal without the API key or the access token, I would like to propose using Web Apps created with Google Apps Script as a wrapper API. With such a Web App, you can achieve both of the goals above without using the API key or the access token.
The official document of Web Apps is here.
The unofficial document of Web Apps including several sample situations is here.
References:
Files: list
Search for files and folders
Can someone confirm it for me?
I'm helping someone with the importHTML problem on Google spreadsheet. I'm not familiar with importHTML but I thought it should work.
=importhtml("http://www.stockq.org/","table",1)
I don't care which table I'm importing so long as it imports something. It gives the error message "Error: Could not fetch url: http://www.stockq.org/". But the website is accessible in my browser. That's really bizarre.
My Google Spreadsheet can't cope with the Chinese characters, but the numbers I can recognise on the web page are happily imported, at least for the middle table of the three, with:
=importhtml("http://www.stockq.org/","table",A12)
This is much the same as what I think was mentioned by @DigitalSeraphim way back in September. To quote from an answer that was deleted (as not an answer?):
So, I have been building a page to help me keep up with mod updates for my minecraft server, using importxml heavily. I have found that I get the same error for some sites that load absolutely fine in the browser. Looking into it further, I found that the sites are reporting a 404 error, but actually returning the data requested. According to https://drupal.stackexchange.com/questions/110651/how-to-show-a-node-but-return-http-404-response, this is used to remove pages from search engines, as I had assumed. I don't think there is any way around this without some hackery... namely, setting up a "proxy" server that would "fix" the status.
However, it appears that the example you gave is now working, so maybe give it another try.
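The situation described in the quote, a 404 status that still carries the requested data, can be illustrated outside of Sheets. This is only a sketch in Python, not something IMPORTHTML or IMPORTXML exposes, and fetch_with_status is a hypothetical helper name:

```python
import urllib.error
import urllib.request


def fetch_with_status(url: str) -> tuple[int, bytes]:
    """Return (status, body), keeping the body even when the status is an error."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # HTTPError is itself a response object, so the body sent
        # alongside the 404 can still be read from it.
        return err.code, err.read()
```

A "proxy" of the kind mentioned in the quote would do roughly this on the server side: fetch the page, ignore the error status, and pass the body on with a 200 status.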
TL;DR
Use IMPORTXML with XPaths.
I encountered a similar problem, where I tried switching between http and https. The workaround worked occasionally, but the result was not consistent (either way failed a lot).
Later I noticed there is another function named IMPORTXML (XML, not HTML here). With it you can query the content of the same URL and apply an XPath instead.
Therefore I would suggest switching to IMPORTXML. For example, the following formula
=IMPORTXML("http://www.stockq.org/index/IBOV.php", "//table[@class='indexpagetable']")
will give you all the tables that have class indexpagetable from the page of the given URL.
Note that XPath support is slightly different in the spreadsheet; you can refer to the documentation for specifics.
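To see what an attribute predicate such as [@class='indexpagetable'] selects, here is a small sketch using Python's standard library on a made-up, well-formed snippet. The markup is a hypothetical stand-in for the real page, and ElementTree supports only a subset of XPath, so this mirrors the idea rather than the spreadsheet's exact behaviour:

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a page with several tables.
SAMPLE = """
<html><body>
  <table class="indexpagetable"><tr><td>IBOV</td></tr></table>
  <table class="other"><tr><td>skip</td></tr></table>
  <table class="indexpagetable"><tr><td>SPX</td></tr></table>
</body></html>
"""

root = ET.fromstring(SAMPLE.strip())
# ".//table[...]" matches table elements at any depth whose class
# attribute equals the given value.
matches = root.findall(".//table[@class='indexpagetable']")
cells = [td.text for table in matches for td in table.iter("td")]
print(cells)
```

Only the two tables with the matching class are selected; the table with class "other" is ignored.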
I have a problem with my reporting. Every day I create a Google Docs tracker in which all the stakeholders in my department update their work progress, so I have plenty of spreadsheets to monitor, which is a hassle. What I'm trying to do is create one big tracker that gives me access to the data entered in the normal spreadsheets. All I need is for the URLs of the spreadsheets in my Google Drive to be retrieved into this big tracker; with those I'll be able to pull all the needed data from the normal trackers.
PS: I'm not good with google scripts.
You can use the Drive Service to get a list of files with MIME type "application/vnd.google-apps.spreadsheet" using getFilesByType. This returns a FileIterator, which you can use to get each Spreadsheet file individually. From there, just use getUrl() to find the URLs. The FileIterator link has examples of how to loop through all the matching files.
Google stopped crawling my webpage because my robots.txt file was inadvertently moved. It said I should try making sure it is there by going to the address: http://www.site.com//robots.txt. It had two slashes just like that. But it still works. It also works with three. What's up with that? Even if I can sort of see why it could be ignored—I'm not specifying any directory between the two—why would it be preferential to display a url like this, as the google webmasters' page does?
Most (all?) servers seem to allow several slashes directly after the hostname (not in other positions, though), see for example:
http://www.google.com//////////robots.txt
https://stackoverflow.com/////robots.txt
http://en.wikipedia.org////////////////////////robots.txt
(Related question: How to avoid multiple slashes after domain name in url using htaccess?)
However, when Google Webmaster Tools displays the URL with two slashes, you probably have set your domain in the GWT preferences with a trailing slash (http://example.com/ instead of http://example.com). See this question for Google Analytics (I guess it should be similar for GWT).
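A double slash of this kind typically appears when a base URL that ends in a slash is concatenated with a path that starts with one. A small Python sketch of that guess (the values are illustrative, and how GWT actually builds the URL is an assumption):

```python
from urllib.parse import urljoin

base = "http://example.com/"  # site URL stored with a trailing slash
path = "/robots.txt"

naive = base + path            # plain concatenation keeps both slashes
joined = urljoin(base, path)   # urljoin resolves the path against the base

print(naive)
print(joined)
```

The first line printed contains the double slash, while the second is the normalised form with a single slash.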
I am currently working on document search in Google. I don't want to do any HTML parsing; instead I am looking for an API that searches documents on the internet through Google.
Basically, my requirement is to search for .pdf and .doc files through Google search. I have done some googling and found that doing this causes Google to introduce a CAPTCHA, and that there is a limit of 100 queries per day.
Is there any free API from Google, or if not a paid one, to which I can pass a search query and get the results?
Please note, I don't want HTML parsing at all.
Moreover, is there any way to overcome the CAPTCHA issue?