ImportHTML find ALL tables and lists - google-sheets

In Google Sheets there is a function called IMPORTHTML which returns tables and lists from HTML pages. However, I have no idea how to find out the indexes of the tables or lists I am looking at. Is there a way to quickly find ALL of the tables and lists and their respective indexes? I have tried looking at the source of some web pages in Chrome, and it is unreadable because of how they load the page, yet somehow the Google function knows how to get the 17th table, for example. This is a generic question that doesn't apply to any one website but rather to ANY website from which I might want to extract a table whose number I don't know. Right now I am just brute-forcing through a bunch of indices, which doesn't seem right.

This won't find the exact table you want, but it will give you a numbered list of the tables.
In the browser's developer console (for example, Chrome DevTools), enter the following to find the table numbers:
var i = 1; [].forEach.call(document.getElementsByTagName("table"), function(x) { console.log(i++, x); });
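A slightly more general sketch of the same idea, covering lists as well as tables. The helper name `indexElements` is hypothetical, and treating `ul` and `ol` as the elements IMPORTHTML counts as "list" is an assumption, not something the question or answer confirms.

```javascript
// Hypothetical helper: number every element matching `selector`, starting at 1,
// in document order. `doc` is anything with querySelectorAll (e.g. `document`).
function indexElements(doc, selector) {
  return Array.from(doc.querySelectorAll(selector)).map(function (el, i) {
    return { index: i + 1, element: el };
  });
}

// In the browser console:
//   indexElements(document, "table").forEach(function (t) { console.log(t.index, t.element); });
//   indexElements(document, "ul, ol").forEach(function (l) { console.log(l.index, l.element); }); // assumed list tags
```

The console shows the page after its scripts have run, so on pages that build their tables dynamically the numbers may not line up with what IMPORTHTML sees.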

Related

Google data studio - Use multiple datasheet with same data keys/headers

I've been stuck on this for some days. I've tried a lot of search terms, but all of them seem to bring me the same answers, and I really need this:
I need to join the data of two different companies that belong to the same owner. Both have the same data sources (Excel data sheets from FB Ads).
So they all share the same keys/headers, like this:
COMPANY(1)'S ADS DATA
COMPANY(2)'S ADS DATA
So I need to put them together without having to join both of them in Excel every time, and also give him some nice data manipulation power.
The results should be something like this
So far I have been trying to join the data from the two companies, but I couldn't figure out how to do this properly. I've run some tests and tried reading a couple of articles and Google Data Studio's help files. The data merging function seems to mess everything up.
As a result of this merge, GDS gives me these fields:
Shouldn't I see only one field each, labeled cnt and cmp? I've noticed that GDS creates not one but two data fields. If I try adding all the data I need as keys, the left sheet turns to all "0s". What am I doing wrong here?
I have read your description. It seems that you are looking to append both tables rather than merge them.
Do note that the data blending in GDS is a left outer join.
Hence, instead of doing the blending in GDS, I'd suggest appending both datasets in Google Sheets in a separate tab before importing into GDS for visualisation (assuming you don't mind copy-pasting the data into the Google Sheet).
Here is the formula to append both datasets in Google Sheets:
={QUERY(A!A1:D1000,"SELECT A,B,C,D WHERE A <> ''",1);QUERY(B!A2:D,"SELECT A,B,C,D WHERE A <> ''",0)}
I've created some dummy data in this Google Sheet and appended it using the formula provided; you may take a look to understand further.
If you are unclear on the difference between merge and append, you may take a look at the Google Sheets documentation as well.
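For readers who prefer a script to the formula, the append can be sketched as a plain function over two 2D arrays, which is the shape in which an Apps Script custom function receives ranges. The name `appendDatasets` is hypothetical, and the sketch assumes both inputs carry the same header row as their first row.

```javascript
// Hypothetical helper: stack the data rows of `second` under those of `first`,
// keeping a single header row. Assumes both are 2D arrays with identical headers
// in row 0. Rows whose first cell is blank are dropped, like WHERE A <> ''.
function appendDatasets(first, second) {
  var isNotBlank = function (row) { return row[0] !== "" && row[0] != null; };
  var header = first[0];
  var body = first.slice(1).concat(second.slice(1)).filter(isNotBlank);
  return [header].concat(body);
}
```

Used as a custom function, this might be called as =appendDatasets(A!A1:D1000, B!A1:D), with both ranges including their header row.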
On a side note, I've screencast the process of answering this question and posted it on my YouTube channel. You may take a look if needed. (Thanks for the question and the inspiration you provided for the video.)

Inverse LOOKUP in google sheets to return column name

Good day,
I am currently sorting a storage unit, where various parts from samples are stored in multiple locations. The idea is to sort it. So I am creating a spreadsheet for each part. The columns will be the location and the rows are a list of the sample numbers of which the parts can be found in this location.
input:
A spreadsheet like this will exist for every part.
The idea is to have a final table, sorted by sample number, which has the parts in the columns. I want the cells to return where this part of this sample is stored:
desired output:
I tried various LOOKUP formulas but they do not return the column name.
Because this has to be accessible by multiple people, it has to be in Google Sheets.
This is an example file: https://docs.google.com/spreadsheets/d/1pUmTs0mLoZAdPc83pLXC75MCUF2P1SHDtEfYPEMohr4/edit?usp=sharing
I am super thankful for any help!
With the help of this website:
https://infoinspired.com/google-docs/spreadsheet/search-across-columns-and-return-the-header/
I found a solution. The idea is to use match functions for each column. For the example posted the code looks like this:
=ifs(isna(match(A2,'Part 1'!$A$1:$A$7,0))=FALSE,'Part 1'!$A$1,isna(match(A2,'Part 1'!$B$1:$B$7,0))=FALSE,'Part 1'!$B$1,isna(match(A2,'Part 1'!$C$1:$C$7,0))=FALSE,'Part 1'!$C$1,isna(match(A2,'Part 1'!$D$1:$D$7,0))=FALSE,'Part 1'!$D$1)
I know it is not pretty, but it works, as the number of columns is limited. The website also suggests a dynamic solution with a query function, but that only works when the cell left of the cell of interest has entries.
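If the number of columns grows, the same lookup could be written once as an Apps Script custom function instead of one match per column. This is only a sketch: the name `findHeader` is hypothetical, and it assumes the range is passed with the location headers in its first row.

```javascript
// Hypothetical custom function: return the header (first-row value) of the
// column in `table` that contains `sample`, or "" if it is not found.
// `table` is a 2D array with headers in row 0, as Apps Script passes a range.
function findHeader(sample, table) {
  var headers = table[0];
  for (var r = 1; r < table.length; r++) {
    for (var c = 0; c < table[r].length; c++) {
      if (table[r][c] !== "" && table[r][c] === sample) {
        return headers[c];
      }
    }
  }
  return "";
}
```

For the example posted, the call would be along the lines of =findHeader(A2,'Part 1'!$A$1:$D$7).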
Thanks to everyone :-)

Google Sheets import multiple HTML table images

Summary
I'm looking to import a data table from a website that does not appear to have an API. The table is broken down to various images and text. The goal is to have all of the content available in a table to then reference for other sheets.
Issue
When I pull in the data, I get some of the text, none of the other images, and a reference to another table. I looked up some options, but none of them yielded anything but blank cells.
I also tried to use the =IMAGE() formula with a direct link to the images URLs, but there is a portion of the URL that is specific to the unit's release date, and as such, too dynamic to account for.
Google Sheets Formula
=IMPORTHTML("https://gamepress.gg/pokemonmasters/database/sync-pair-list","table",3)
Unfortunately without an API it is going to be difficult to achieve what you aim here. These are the main reasons why:
PROBLEMS AND WORKAROUNDS
This table has nested tables, which therefore need to be accessed separately. If you take a look at: =IMPORTHTML("https://gamepress.gg/pokemonmasters/database/sync-pair-list","table",4)
you will see that table 4 of this HTML page holds the stats of one character from the main table. If you go on to 5 or 6, you will realise that the nested tables are not even numerically ordered and that you cannot reach them through the main table (i.e. mainTable[0].nestedTable). A laborious approach is to go one by one, finding each character's stat table and placing it next to that character. For this I recommend extracting only the name field of the main table, so that you can align each stat table with its character. You can do this with: =INDEX(IMPORTHTML("https://gamepress.gg/pokemonmasters/database/sync-pair-list","table",3),0,1). You can find out more about INDEX here
IMPORTHTML can access neither images nor links, so it will be very difficult to get the images in the last columns. A way to solve this is to use, as you mentioned, IMAGE with the image's URL, like this: =IMAGE("https://gamepress.gg/pokemonmasters/sites/pokemonmasters/files/styles/30x30/public/2019-07/Electric.png?itok=fkRfkrFX"). You can find more info about inserting images here
CONCLUSION
To sum up, there is no easy way to solve this problem. The closest you can get is by:
Importing the name column.
Figuring out which tables belong to which character and placing them next to their names.
Getting the image URL of each weakness and type and adding it to each character.
I am sorry this site does not have an API to make things smooth. Good luck with your project, and let me know if you need anything else or if anything was unclear.
Here you can find more information about IMPORTHTML

(Google sheets) Query and return multiple tables from url inject

I use the JSON data from a Google spreadsheet for 2 mobile applications (iOS and Android). The same information can be output as HTML or XML; in this case I am using HTML so that the information shown (from the spreadsheet) can be understood by everyone. The only logical way to do this without authentication (OAuth) is through public URL injects. Information about what I’m talking about can be found here. In order to understand what I’m asking, you have to actually click the links and see for yourself. I do not know what to “call” some of the things I’m asking about, as Google’s documentation is poor, through no fault of my own.
In my app I have a search feature that queries the spreadsheet (USING A URL REQUEST) along the lines of this,
https://docs.google.com/spreadsheets/d/1yyHaR2wihF8gLf40k1jrPfzTZ9uKWJKRmFSB519X8Bc/gviz/tq?tqx=out:html&tq=select+A,B,C,D,E+where+(B+contains"Cat")&gid=0
I select my columns (A, B, C, D, and E) and ask Google to return only the rows where column B contains the word Cat. Again, I’m stressing the point that this is done via a URL address (inject being the proper term). I CANNOT use almost any of the functions/formulas that would normally work within a spreadsheet, like ArrayFormula or ImportRange. In fact I only have access to 10 language clauses (read the link from before). I have a rather good knowledge of spreadsheets and databases, and although the URL method of getting information from them is similar, they are in NO way the same thing.
Now, I would like to point out this part within the URL
tq?tqx=out:html&tq=select+A,B,C,D,E+where+(B+contains"Cat")&gid=0
Type of output, HTML in this case
tqx=out:html
The start of query
&tq=
Select columns A-E
select+A,B,C,D,E
For returning specific information about Cat
where+(B+contains"Cat")
This is probably the most important part of my question. This is used for specifying what table (Tab) is being queried.
&gid=0
If the gid is changed from gid=0 to gid=181437435, the data returned is from the spreadsheet’s second table. Instead of having to make 2 requests to search both tables, is there a way to do both in one request (like combining the 2)? <-- THIS IS WHAT I’M ASKING.
There is an AND clause that I have tried all over the URL:
select+A,B,C,D,E+where+(B+contains%20"Cat")&gid=181437435+AND+select+A,B,C,D,E+where+(B+contains%20"Cat")&gid=0
I have even flipped the gid around and put in other places but it seems to only go by the last one (gid) in the url, and no matter what is done only 1 table is returned. Grouping is allowed by the way. If that doesn’t clear my question up then let me know where you’re lost. Also I would have posted more URLs for easy access but I am kind of on this 2 URL maximum program.
If I understand your requirement, indeed it is, with syntax like this for example:
=ArrayFormula(QUERY({Sheet1!A1:C4;Sheet2!B1:D4},"select * order by Col1 desc"))
The ; stacks one array above the other (, for side by side).
My confusion is with "URL Query Language", as what is called Google Query Language here (there is even the tag, though IMO almost all those Qs belong on Web Applications, including this one, by my understanding!) is not confined to use with URLs.
In the example above the sheet references might be replaced with data import functions.
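The formula above works inside a spreadsheet. If the data still has to come through the URL and only one gid per request is honored, as the question reports, one fallback is to issue one request per tab and combine the results in the app. The sketch below only builds the URLs; the helper name `buildQueryUrls` is hypothetical, and the URL shape is copied from the question rather than from any documentation.

```javascript
// Hypothetical helper: build one gviz query URL per tab (gid), using the URL
// layout shown in the question. The query text is percent-encoded so that
// spaces and quotes survive the trip.
function buildQueryUrls(spreadsheetId, query, gids) {
  var base = "https://docs.google.com/spreadsheets/d/" + spreadsheetId + "/gviz/tq";
  return gids.map(function (gid) {
    return base + "?tqx=out:html&tq=" + encodeURIComponent(query) + "&gid=" + gid;
  });
}

// Example call with the two tabs from the question:
//   buildQueryUrls("<spreadsheet id>", 'select A,B,C,D,E where (B contains "Cat")', [0, 181437435]);
```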

Using multiple filter functions in google sheets elastically

In a Google spreadsheet, I have a summary sheet where I am importing information from multiple sheets. One of my filter functions looks like the following:
=Filter(Sheet2!A14:A27, (Sheet2!K14:K27="Y") + (Sheet2!K14:K27="R"))
I have multiple Filter functions like this one. The problem I am facing is that I have to assign a static number of rows for the result of each function, but the result is very dynamic (it could be 1 row or even 15 rows).
I have been searching exhaustively but couldn't find a good way to do it elastically, so that the results of all Filter functions are just appended (with perhaps an empty row/header row between each of the results).
One solution that someone gave on one of the forums was to assign a static number of rows to each and hide the empty rows using a script, which did not seem a very clean solution (but I may have to fall back on that).
Also, I thought of using scripts, but if I understand correctly, scripts can only be 'triggered' from menus, onOpen, onEdit, etc., which may also not be very intuitive (one has to reload the spreadsheet to see any change in the case of onOpen(), etc.).
Using custom functions would again cause the same problem, because custom functions run on a specific cell (and we don't know which cell, since we are trying to make this dynamic).
Happy to hear any thoughts!
Here is how to stack multiple columns.
{A:A;B:B;C:C}
Here is how to stack multiple filters.
{Filter1;Filter2;Filter3}
Here is how to stack multiple filters with headers.
{"Header1";Filter1;"Header2";Filter2;"Header3";Filter3}
You should always pass the cells/ranges you work on into the custom function instead of reading them within the function. Also try not to write directly but instead return a result. That way the spreadsheet will automatically update correctly and you won't need any permissions.
Here is an example
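// values1: range to return rows from; values2: single-column range of flags.
function myFilter(values1, values2) {
  // Keep each row of values1 whose flag in the same row of values2 is "Y" or "R".
  return values1.filter(function(v, i) {
    return values2[i][0] === "Y" || values2[i][0] === "R";
  });
}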
and then do
={myFilter(Sheet1!A14:A27,Sheet1!K14:K27);A1;myFilter(Sheet2!A14:A27,Sheet2!K14:K27)}
This still gets a little long, though. But you could also save intermediary results in different cells and then join the results together, or write a filter function that can take an arbitrary number of ranges as arguments.
Can you give some more examples of what those filter functions look like? Maybe there is a better way to modularise/shorten it.
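The last suggestion, a filter function taking an arbitrary number of ranges, might look roughly like this. It is a sketch only: the name `myFilterAll` is hypothetical, and it assumes the arguments come in (values, flags) pairs of equal height with the same "Y"/"R" condition as above.

```javascript
// Hypothetical variant of myFilter: accepts any number of (values, flags) range
// pairs and returns all rows whose flag is "Y" or "R", stacked in argument order.
function myFilterAll() {
  var result = [];
  for (var a = 0; a + 1 < arguments.length; a += 2) {
    var values = arguments[a];
    var flags = arguments[a + 1];
    values.forEach(function (row, i) {
      if (flags[i][0] === "Y" || flags[i][0] === "R") {
        result.push(row);
      }
    });
  }
  return result;
}
```

In the sheet this would be called as =myFilterAll(Sheet1!A14:A27,Sheet1!K14:K27,Sheet2!A14:A27,Sheet2!K14:K27), and the result grows or shrinks with the number of matching rows.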

Resources