Google Sheet: IMPORTXML from Yahoo Finance [duplicate] - google-sheets

This question already has answers here:
Scraping data to Google Sheets from a website that uses JavaScript
(2 answers)
Closed last month.
I'm trying to import the current stock price from Yahoo Finance. I used a formula from a website and it partially works. I only know how to make it look for one specific class string. That worked fine for another data point I need, but the class on the price-change element switches from
"Fw(500) Pstart(10px) Fz(24px) C($dataRed)"
to
"Fw(500) Pstart(10px) Fz(24px) C($dataGreen)"
depending on whether the price is up or down for the day.
How do I modify the formula below to use an "or" operator, so that it pulls the value whether the stock is up or down for the day? Thanks!
Formula I'm using:
=IMPORTXML("https://finance.yahoo.com/quote/IBM","//span[@class='Fw(500) Pstart(10px) Fz(24px) C($dataRed)']")

I noticed the other answers did not work for me (they may have worked in the past), so I decided to post this solution. Just put the ticker in cell A1 and one or both of the below formulas somewhere else.
Price:
=IFNA(VALUE(IMPORTXML("https://finance.yahoo.com/quote/" & A1, "//*[@class=""D(ib) Mend(20px)""]/span[1]")))
Change:
=IFNA(VALUE(REGEXEXTRACT(IMPORTXML("https://finance.yahoo.com/quote/" & A1,"//*[@class=""D(ib) Mend(20px)""]/span[2]"), "^.*?\s")))
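The pattern "^.*?\s" in the Change formula matches lazily from the start of the text up to and including the first whitespace character, so only the leading number survives before VALUE converts it. A minimal Python sketch of the same extraction, assuming the scraped text looks like "-1.23 (-0.95%)" (a made-up sample, not live Yahoo output; the function name is hypothetical):

```python
import re

def leading_number(text: str) -> float:
    # Same pattern as the REGEXEXTRACT call: lazy match up to the first whitespace.
    match = re.search(r"^.*?\s", text)
    if match is None:
        raise ValueError("no whitespace found in text")
    # float() ignores the trailing space, much like VALUE() in Sheets.
    return float(match.group(0))

sample_change = "-1.23 (-0.95%)"  # hypothetical sample text
print(leading_number(sample_change))
```

If the text contains no whitespace at all, the sketch raises an error, which is the case IFNA hides in the Sheets formula.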

I'm currently using GOOGLEFINANCE but find that it does not update often enough, even with updates set to every minute, so I'm testing whether the formula below will at least update when I press F5 in the sheet.
This brings in the price and other information (as of 2022/09/27):
=IMPORTXML("https://finance.yahoo.com/quote/SAVA/", "//*[@id=""quote-header-info""]/div[3]/div[1]/div[1]")

If you just want the price: =IFNA(VALUE(IMPORTXML("https://finance.yahoo.com/quote/" & $A1, "//*[@class=""D(ib) Mend(20px)""]/span[1]")))

You could use a more generic XPath that doesn't depend on such a specific path, for example:
This one pulls in both the price and the change:
=ARRAY_CONSTRAIN(transpose(IMPORTXML("https://finance.yahoo.com/quote/IBM","//*[@class='Mt(6px)']//span")),1,2)
If you just want the price:
=trim(IMPORTXML("https://finance.yahoo.com/quote/IBM","//*[@class='Mt(6px)']//span"))
If you just want the change:
=IMPORTXML("https://finance.yahoo.com/quote/IBM","//*[@class='Mt(6px)']//span[2]")

Sadly Yahoo Finance changes the XML/HTML structure of its website quite often. The one that works for now is:
=IMPORTXML("https://finance.yahoo.com/quote/IBM/", "//*[@id=""quote-header-info""]/div[3]/div[1]/div/span[1]")
You can always inspect the HTML structure with the browser's developer tools to find and copy the XPath.
P.S.1. There seems to be a bug, though: the function can't retrieve data from URLs where the name contains a period ".".
P.S.2. The IMPORTHTML() function also can't fetch the latest price from Yahoo Finance, because the information is in neither a table nor a list. You can try the scripts from this page and this page to list all the tables and lists.

Related

Google Sheets: Parsing users who are Program Managers

I have a spreadsheet that is always updating with 50+ rows. I am trying to retrieve users who are Program Managers (PGM) by parsing text but I am having a hard time since the data is not consistent since it's filled out by 20+ users.
I googled "google sheet parse text" but it only gives me functions such as =SEARCH, =LEN, =LEFT, which I cannot use because my data is not consistent. Are there any other options, or am I out of luck and must parse my info manually? Thanks in advance.
Google Sheet Link Example
in C2 use:
=ARRAYFORMULA(IFERROR(REGEXEXTRACT(B2:B,"PGM:.*")))
You may try:
=byrow(B2:B,lambda(z,if(z="",,ifna(regexextract(z,"(PGM:.*)(?:\n|$)")))))
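Both formulas rely on the pattern "PGM:.*", which captures from the "PGM:" label to the end of that line and leaves the cell blank when the label is missing. A minimal Python sketch of the same idea, assuming multi-line cells like the samples below (the sample data and function name are hypothetical, since the linked sheet is not reproduced here):

```python
import re

def extract_pgm(cell: str) -> str:
    # "." does not match a newline, so the match stops at the end of the PGM line.
    match = re.search(r"PGM:.*", cell)
    # Mirror IFERROR/IFNA: return an empty string when nothing matches.
    return match.group(0) if match else ""

# Hypothetical cell contents filled in by different users.
cells = ["Owner: Ann\nPGM: Bob Lee\nQA: Cy", "", "no manager listed"]
print([extract_pgm(c) for c in cells])
```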
row 6 outcome is varying a bit

Googlesheet formulas for Crypto Coins

I am trying to create a Google Sheet showing various crypto prices for a few set times (but let's just use BTC-USD for the moment).
The sheet would show
BTCUSD Current Price, Previous Close, Close 5 days ago and Close 31 days ago
I have tried the following but am running into the problems described below, which appear to be specific to crypto.
There are various ways one can get the current price:
=GOOGLEFINANCE("BTCUSD") will work - so we are ok for current price
=GOOGLEFINANCE("BTCUSD","change") will not work, however it will work for an equity
=GOOGLEFINANCE("AAPL","change") will work
Similarly
=index(IMPORTHTML(CONCATENATE("https://finance.yahoo.com/quote/","AAPL"),"table",1),1,2) will return from table 1 row 1, column 2 from the yahoo finance page for Apple (an equity)
However
=index(IMPORTHTML(CONCATENATE("https://finance.yahoo.com/quote/","BTC-USD"),"table",1),1,2)
does not work even though the page and table layout appear to be the same
I also notice that
=GOOGLEFINANCE("BTCUSD", "price", DATE(2022,1,1), DATE(2022,8,15), "DAILY") will return the price of bitcoin for the date range,
However
=GOOGLEFINANCE("BTCUSD", "price", DATE(a1), DATE(a2), "DAILY")
will not work even if cell a1 and a2 have a copy and paste of the 2022,1,1 and 2022,8,15 in them.
I suspect the second problem arises because the dates in the formula are not in quotes, whereas referencing them from a cell may inadvertently turn them into a quoted string.
This last problem makes it difficult to approach the task from a different angle, i.e. by referencing cells that change as the day changes and the sheet refreshes: we cannot reference a cell that would always be 5 days ago or 31 days ago.
Answer to your first question
The first formula, =index(IMPORTHTML(CONCATENATE("https://finance.yahoo.com/quote/","AAPL"),"table",1),1,2), worked for a moment and then stopped working. I then tested =index(IMPORTHTML(CONCATENATE("https://finance.yahoo.com/quote/","BTC-USD"),"table",1),1,2) and it did not work. I even tried =IMPORTHTML("https://finance.yahoo.com/quote/BTC-USD","table") to see whether it was importing the table, but it returns the same error: "Resource at url not found".
I did some research and it seems that Yahoo made some changes to their website and this affected some of their web-pages. It's suggested to use another website that is scrape-able by IMPORT functions. This is just an example of what is mentioned about Yahoo Finance and IMPORT functions, there are other communities that are also mentioning issues with doing web scraping to Yahoo Finance.
Answer to your second question
Issue with this formula =GOOGLEFINANCE("BTCUSD", "price", DATE(a1), DATE(a2), "DAILY"), according to documentation:
Inputs to DATE must be numbers - if a string or a reference to a cell containing a string is provided, the #VALUE! error will be returned.
The correct form is DATE(2022,1,1). If you want to refer to cells, you have to split 2022,1,1 across three different cells and reference them as DATE(A2,B2,C2).
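The point about DATE needing three separate numbers, and the question's goal of looking 5 or 31 days back, can be illustrated with a short Python sketch. It is only an analogy under stated assumptions: the helper names are hypothetical and the reference date is fixed to the question's end date so the result is reproducible.

```python
from datetime import date, timedelta

def parse_ymd(text: str) -> date:
    # Split "2022,1,1" into three numbers, like putting them in three cells.
    year, month, day = (int(part) for part in text.split(","))
    return date(year, month, day)

def days_ago(reference: date, n: int) -> date:
    # Date arithmetic: step back n days from the reference date.
    return reference - timedelta(days=n)

end = parse_ymd("2022,8,15")
print(days_ago(end, 5))   # 2022-08-10
print(days_ago(end, 31))  # 2022-07-15
```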

set times with milliseconds in google sheet [duplicate]

This question already has answers here:
How to SUM duration in Google Sheets?
(5 answers)
Closed 8 months ago.
Some time ago I ported an old timesheet to Google Sheets so I could share it online and let others modify it, but I didn't keep it, so I don't remember how I managed to do it.
The goal is to compare two timing sheets. Today I wanted to edit the sheet so I can add new data and write the timings on a separate sheet page, which simplifies the comparison: I would just import the data from that page instead of rewriting it every time. My issue is that I'm not able to replicate the format on my new pages. What I would like is to have this example working:
Cities               Time
Helsinki             2:04.820
Travemünde           4:03.290
Hambourg             0:30.900
Hanovre              2:28.610
Francfort            4:53.470
Mannheim             1:35.170
Strasbourg           2:13.650
Berne                2:25.190
Genève               2:22.620
Lyon                 2:24.000
Marseille            3:34.550
Marseille (ferry)
Palerme              2:28.670
Catania              4:07.670
Total                =SUM(above)
so that I can replicate the format on the other pages as I don't understand why it worked before but not now.
mm:ss.000 is the format I would like to have, but at the moment my format is [h]:mm:ss.000. It seems the hour part is necessary, so I don't really mind if we need to keep the hour.
This is entirely possible in Google Sheets. Enter the data in the format hh:mm:ss.ms, and use SUM() (with a range, obviously) to sum the column. Then select the whole column and apply a custom number format (Format > Number > Custom date and time). Using the dropdown to pick the parts and typing the separators, you can get Minute(1):Second(1).Millisecond(3), which seems to be what you want.
For sheets to recognise the cell entry as a time it needs the hh: part. But you can certainly hide that in the display.
Demo Spreadsheet
If the spreadsheet locale is set to something which uses , for the decimal point, you need to use that instead of . Google could definitely make that a lot clearer. If you have the time you might even want to open a bug report with them, as the examples in their docs don't work when the locale requires a ,.
As a bonus, you can bulk-convert using a formula like =replace(B2; find("."; B2); 1; ",") * 1 (where B2 is the cell to be converted). Drag down, copy and paste the values, and then format if need be.
use:
=ARRAYFORMULA(TEXT(SUM(IFERROR(TIMEVALUE("0:"&B1:B15))); "[m]:ss.000"))
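The formula above prefixes each entry with "0:" so it parses as a time, skips entries that fail to parse, sums the rest, and formats the total as [m]:ss.000. A minimal Python sketch of that arithmetic, using integer milliseconds to avoid rounding issues (the function names are hypothetical, and only the first three times from the question are used):

```python
def to_millis(entry: str) -> int:
    # Parse "m:ss.mmm" into a whole number of milliseconds.
    minutes, seconds = entry.split(":")
    whole, _, frac = seconds.partition(".")
    millis = int(frac.ljust(3, "0")[:3]) if frac else 0
    return (int(minutes) * 60 + int(whole)) * 1000 + millis

def format_millis(total: int) -> str:
    # Format like the custom number format [m]:ss.000 (minutes may exceed 59).
    minutes, rest = divmod(total, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{minutes}:{seconds:02d}.{millis:03d}"

# First three rows of the question's table; the blank entry is skipped.
times = ["2:04.820", "4:03.290", "", "0:30.900"]
total = sum(to_millis(t) for t in times if t)
print(format_millis(total))  # 6:39.010
```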

Assistance with IMPORTXML in Google Sheets

I'm putting together a spreadsheet so I can keep track of items my store has for preorder and making it so it updates when stock levels change. I am trying to get it so my sheet tells me whether the webpage button says pre-order or add to basket. Sometimes this works and then other times it doesn't.
Here is the formula I have:
=IMPORTXML("https://www.smythstoys.com/uk/en-gb/video-games-and-tablets/gaming-merchandise/harry-potter-lumos-logo-light/p/209816","//html/body/div[7]/section/div/div/div[2]/div[1]/div[5]/div/div/div/div[2]/form/button")
This particular one seems to return an N/A. Outside of trying to teach myself, I don't know Importxml well or the details for HTML as to why this might or might not work.
I was also wondering if it were possible to only retrieve certain bits of text but not all of it. So, I have this:
=IMPORTXML("https://www.smythstoys.com/uk/en-gb/video-games-and-tablets/gaming-merchandise/harry-potter-lumos-logo-light/p/209816","//html/body/div[7]/section/div/div/div[2]/div[1]/div[5]/div/div/div/div[2]/form/div[1]/span[14]/table/tbody/tr/td[2]")
which returns "Out of Stock. Expected Stock August 2022". Is it possible to only retrieve the "August 2022" part of that text?
Thank you so much to anyone who can help.
To return the button
=index(importxml(url,"//button[@id='addToCartButton']"),1,1)
To get only the availability date
=regexextract(IMPORTXML(_______________),".*: (.*)")
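The pattern ".*: (.*)" keeps whatever follows the last colon-plus-space in the text. A minimal Python sketch of that behavior follows. Note the assumption: the sample string includes a colon after "Expected Stock", which the regex needs, while the text quoted in the question shows none, so the real page text may differ. The function name is hypothetical.

```python
import re

def availability_date(text: str):
    # Greedy ".*: " consumes up to the last ": ", the group keeps the rest.
    match = re.search(r".*: (.*)", text)
    return match.group(1) if match else None

sample = "Out of Stock. Expected Stock: August 2022"  # assumed page text
print(availability_date(sample))
```

Without a colon in the text the pattern does not match at all, which in Sheets would surface as an error from REGEXEXTRACT.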

Find sector by share symbol in google sheets

Is there a way to print out the GICS sector name for a specific share/ETF symbol in google sheets using the GOOGLEFINANCE commands or any other way?
Many thanks
I used this site to find several scraping methods to get data from finviz.
https://decodingmarkets.com/scrape-stock-data-from-finviz/
Extending their logic, I was able to get the company name, and the combined sector/subsector codes
(I originally used the website's scraping techniques to get Dividend data that GoogleFinance formula lacks...)
This formula gets the company name using US ticker symbol in cell C3:
=SUBSTITUTE(INDEX(IMPORTHTML("http://finviz.com/quote.ashx?t="&C3,"table",6),2,1),"*","")
Through trial and error, I found that table 6 has name and sectors. I then referenced the 2nd row and 1st column to get the name.
I found that row 3, column 1 has the sector, subsector and country combined as one value. They use a pipe | delimiter for each break.
Using the split function, I was able to split segment.
=SPLIT(SUBSTITUTE(INDEX(IMPORTHTML("http://finviz.com/quote.ashx?t="&C3,"table",6),3,1),"*",""),"|",true,true)
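The SUBSTITUTE and SPLIT steps can be sketched in Python as follows. The sample value is hypothetical (the actual finviz cell content is not shown in this answer), and trimming whitespace around each piece is an addition of this sketch rather than something SPLIT does.

```python
def split_segments(raw: str) -> list:
    # Remove asterisks, as SUBSTITUTE(..., "*", "") does.
    cleaned = raw.replace("*", "")
    # Split on the pipe delimiter and drop empty pieces, like SPLIT(..., "|", true, true).
    return [part.strip() for part in cleaned.split("|") if part.strip()]

sample = "Technology | Consumer Electronics | USA"  # hypothetical cell value
print(split_segments(sample))
```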
It's not available from Sheets.
Check out the official docs:
https://support.google.com/docs/answer/3093281?hl=en
It has a lot of options but unfortunately, not that one.
If you think it would be useful, then make sure you file a feature request at
https://developers.google.com/issue-tracker
As for any other way
@GSee said it best here: https://stackoverflow.com/a/16525782/10445017
