Using IMPORTXML to just retrieve the closing date and nothing else - google-sheets

I need to scrape just the closing date from a website into Google Sheets.
I am currently using =IMPORTXML(A1,"//*[@id]"), but it scrapes all the data on the page.
I need just the closing date right at the bottom of this page. Is this possible?
https://justicejobs.tal.net/vx/lang-en-GB/mobile-0/appcentre-1/brand-15/xf-5ebef95e1d21/candidate/so/pm/1/pl/3/opp/54025-202202-Prison-Officer-HMP-Leicester/en-GB

Try
=IMPORTXML(A1, "//p/span/span/strong/span")
or
=REGEXEXTRACT(IMPORTXML(A1, "//p/span/span/strong/span"),"Closing date (.*)\.")

Try:
=QUERY(FLATTEN(IMPORTXML(A1, "//*[@id]")),
"where lower(Col1) starts with 'closing date'")
or, for just the date:
=REGEXEXTRACT(QUERY(FLATTEN(IMPORTXML(A1, "//*[@id]")),
"where lower(Col1) starts with 'closing date'"), "(\d+.*).")

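As a rough check of what the regex patterns in the answers above capture, they can be run against a literal string. The sample text below is a hypothetical stand-in for the page's wording, not copied from the site:
=REGEXEXTRACT("Closing date 22 March 2022.", "Closing date (.*)\.")
=REGEXEXTRACT("Closing date 22 March 2022.", "(\d+.*).")
Both should return 22 March 2022. Note that the final . in the second pattern is unescaped, so it matches any character: if the scraped text does not end with a full stop, the last character of the date is dropped instead, and "(\d+.*)" on its own may be the safer pattern in that case.
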
Related

How to webscrape from Marketwatch Financials for Google Sheets

I want to scrape data from MarketWatch. I have a formula to pull from Finviz:
=value(regexextract(query(importhtml("http://finviz.com/quote.ashx?t="&$C7,"table",9),"select Col2 where Col1 = 'Income' ",0),"[-\d.]+"))
Note: Cell C7 contains SBSW.
How do I scrape the 2021 Sales/Revenue for the ticker SBSW? Here's the link:
https://www.marketwatch.com/investing/stock/SBSW/financials
The result should show 172.19
I tested this formula, and it works for me:
=IMPORTXML("https://www.marketwatch.com/investing/stock/SBSW/financials", "//*[@id='maincontent']/div[6]/div/div[2]/div/div/table/tbody/tr[1]/td[6]/div/span")
You can get the xpath_query with the browser's developer tools.
Edit: removing the B at the end
First option
If the letter is always "B":
=SUBSTITUTE(IMPORTXML("https://www.marketwatch.com/investing/stock/SBSW/financials", "//*[@id='maincontent']/div[6]/div/div[2]/div/div/table/tbody/tr[1]/td[6]/div/span"),"B","")
Second option
If the letter at the end varies:
=REGEXEXTRACT(IMPORTXML("https://www.marketwatch.com/investing/stock/SBSW/financials", "//*[@id='maincontent']/div[6]/div/div[2]/div/div/table/tbody/tr[1]/td[6]/div/span"),"[0-9]+.+[0-9]")
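If a number rather than text is needed, a minimal sketch is to wrap the extraction in VALUE. The literal "172.19B" below stands in for the scraped cell and is an assumption based on the expected result in the question:
=VALUE(REGEXEXTRACT("172.19B", "[0-9.]+"))
REGEXEXTRACT and SUBSTITUTE both return text, and functions such as SUM ignore text values, so VALUE is what turns the result into a real number. This assumes a locale that uses . as the decimal separator.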
Reference:
IMPORTXML
SUBSTITUTE
REGEXEXTRACT

Textjoin or Concatenate for this case?

Hope you are having a good day/evening.
Because I regularly use the IMPORTRANGE function to import multiple sheets, I want a quicker way to replace the date (highlighted in red in the screenshot) with the date referenced in column A. The data is in my Google Sheet, under the tab named "TextJoin".
try:
=INDEX({""; "={"&TEXTJOIN("; ", 1,
"IMPORTRANGE(""13DWtP4L7swqBgK6BGLeA-o_FfyD-D8-Ru30cOPf0I10"", """&
FILTER(TO_TEXT(A2:A)&"!A2:C"")", A2:A<>""))&"}"})
but you may need to wrap it in QUERY to remove empty rows, perhaps like this:
=INDEX({""; "=QUERY({"&TEXTJOIN("; ", 1,
"IMPORTRANGE(""13DWtP4L7swqBgK6BGLeA-o_FfyD-D8-Ru30cOPf0I10"", """&
FILTER(TO_TEXT(A2:A)&"!A2:C"")", A2:A<>""))&"}, ""where Col1 is not null"", )"})
Try
=importrange("_____","'" & text(A2,"M/d/yy") &"'!A2:C")
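To check the range string before passing it to IMPORTRANGE, the same concatenation can be previewed in a spare cell. This sketch assumes, as in the formula above, that A2 holds a date and that the source tabs are named in the M/d/yy pattern:
="'" & TEXT(A2, "M/d/yy") & "'!A2:C"
The single quotes wrap the tab name, which is the usual way to reference a sheet whose name contains characters such as slashes or spaces. If the preview does not match a tab name exactly, IMPORTRANGE is likely to return an error.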

Google Sheets - How to Extract dates list in new column from start date and end date

I need to list every date (one per day) between two dates (start and end) in a new column in Google Sheets.
I know that it can be done with Excel and Power Query, but I did not find anything for Google Sheets.
I would be glad if someone could help with that.
Thanks.
Depending on your locale settings, try:
=arrayformula({A1:C1,"DAY";split(query(flatten(array_constrain(if(split(rept(10,C2:C-B2:B+1),0)=1,A2:A&char(9999)&B2:B&char(9999)&C2:C&char(9999)&B2:B+transpose(row(A:A))-1&char(9999),),counta(B2:B),max(C2:C-B2:B)+1)),"where Col1 is not null",0),char(9999))})
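For a single start/end pair, a shorter sketch is possible with SEQUENCE. It assumes the start date is in B2 and the end date in C2, matching the column layout the formula above appears to use:
=ARRAYFORMULA(TO_DATE(SEQUENCE(C2-B2+1, 1, B2)))
SEQUENCE(rows, columns, start) produces C2-B2+1 consecutive serial numbers beginning at the start date, and TO_DATE displays them as dates. It handles one row only; expanding many rows at once still needs something like the FLATTEN approach above.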

How to get current weekly closing price of stock in google sheet?

How can I fetch the current weekly closing price of a stock in Google Sheets?
I have tried using the formula GOOGLEFINANCE("GOOG", "price", TODAY(), TODAY(), "WEEKLY").
But it's showing no results.
=GOOGLEFINANCE("NASDAQ:GOOGL", "close",TODAY()-60,TODAY(),"WEEKLY")
You can change TODAY()-60 to a fixed start date such as DATE(2019,1,1), for example:
=GOOGLEFINANCE("NASDAQ:GOOGL", "close",DATE(2019,1,1),TODAY(),"WEEKLY")
or however you'd like to set it. It might update the current week in real time if you use "price" instead of "close"; I'm not sure, since it's the weekend and nothing is ticking or updating right now.
Since it appears you just want a single result, the most recent week's closing price, try this:
=INDEX(GOOGLEFINANCE("NASDAQ:GOOGL","price",TODAY()-14,TODAY(),"WEEKLY"),3,2)
I improved the integer in the one above, but it could still theoretically break.
Try this instead: it requests a 30-day range but only selects the result from within the last week:
=INDEX(QUERY(GOOGLEFINANCE("NASDAQ:GOOGL","price",TODAY()-30,TODAY(),"WEEKLY"),"select Col2 where Col1 < date'"&TEXT(TODAY(),"yyyy-mm-dd")&"' and Col1 > date'"&TEXT(TODAY()-7,"yyyy-mm-dd")&"' limit 1"),2)
Hope that (finally) helps!

Importxml from Etsy site in Google Sheets

I want to extract the sale info from this link with importxml
link
My formula:
substitute(substitute(to_text(index(IMPORTXML(LINK;"//div['before-sticky-nav']//div['trust-signals col-group content no-banner']//div['show-lg show-xl show-tv shop-info col-lg-7 pl-lg-3']//p['trust-signal-row text-gray-lighter']/span[3]");1));" Sales";"");" Sale";"")
The formula had been working for a month, but yesterday it stopped working. I've tried many variations of the XPath, still with no luck.
Any ideas?
=VALUE(REGEXREPLACE(REGEXEXTRACT(QUERY({ARRAY_CONSTRAIN(IMPORTDATA(
"https://www.etsy.com/shop/1000Lightyear"), 4000, 1)},
"WHERE Col1 CONTAINS 'mr-xs-2 pr-xs-2 br-xs-1"">'"),
"\>([0-9 A-Za-z]+)\<"),
" Sales| Sale", ""))
