Get minute-bar historical data from Google Finance?

I can get daily data easily using this link:
https://www.google.com/finance/getprices?q=LHA&x=ETR&i=60&p=1d&f=d,c,h,l,o,v
But when I try to change "1d" to "1y" I still get 1 day's data.
I am trying to get 2 years' worth.
Is there a way to do this? Yahoo or Bing Finance would be fine too.

You need to use '1Y', not '1y', in your query to get a time period stretching back one year. However, you will also need to change the granularity of your query, as minute data is only available for the previous 5 days.
This query will provide you with minute data for the previous 5 days.
https://www.google.com/finance/getprices?q=LHA&x=ETR&i=60&p=5d&f=d,c,h,l,o,v
The following query will provide you with the last two years of prices at the close.
https://www.google.com/finance/getprices?q=LHA&x=ETR&i=86400&p=2Y&f=d,c,h,l,o,v
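For reference, here is a minimal Python sketch of how these responses were typically parsed while the endpoint was still live. Treat it as illustration only, since the endpoint no longer serves intraday data (see the 2020 update below); the column order assumes the f=d,c,h,l,o,v query from the question, and the TIMEZONE_OFFSET header is ignored for simplicity.

import urllib.request
from datetime import datetime, timedelta, timezone

URL = ("https://www.google.com/finance/getprices"
       "?q=LHA&x=ETR&i=60&p=5d&f=d,c,h,l,o,v")

def parse_getprices(text, interval=60):
    """Yield (timestamp, close, high, low, open, volume) rows."""
    base = None  # set by the first 'a'-prefixed row
    for line in text.splitlines():
        if line.startswith("a"):
            # 'a<unix_ts>' marks a row carrying an absolute Unix timestamp
            fields = line.split(",")
            base = datetime.fromtimestamp(int(fields[0][1:]), tz=timezone.utc)
            ts = base
        elif line[:1].isdigit():
            # bare integer = offset from base, in units of the bar interval
            fields = line.split(",")
            ts = base + timedelta(seconds=int(fields[0]) * interval)
        else:
            continue  # KEY=VALUE header lines such as COLUMNS=... or DATA=
        close, high, low, open_, volume = fields[1:6]
        yield ts, float(close), float(high), float(low), float(open_), int(volume)

raw = urllib.request.urlopen(URL).read().decode()
for row in parse_getprices(raw):
    print(row)

Rows beginning with 'a' carry an absolute Unix timestamp; the rows that follow carry integer offsets that are multiples of the i= interval, which keeps the payload compact.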

The Google "getprices" API is no longer fetching intraday data of any interval (1 minute, 5 minutes, 60 minutes, and so on). It now fetches data at a 1-day interval only, irrespective of the interval set.

When I tried getting intraday data, the furthest back I could get was 15 days, i.e. 15d.
Using 2w or 1y did not work for me.
references:
http://www.mathworks.com/matlabcentral/fileexchange/32745-get-intraday-stock-price/content/getHistoricalIntraDayStockPrice.m
http://www.codeproject.com/Articles/221952/Simple-Csharp-DLL-to-download-data-from-Google-Fin

Update [2020]
Google and Yahoo have deprecated their APIs, so only direct website access works. However, the maximum resolution you can get is 1 day (end-of-day).
For historical high-resolution data, Tickdata.com is a good source of tick-level data.
If 1-minute bars are sufficient, FirstRateData.com has 1-minute data for most stocks and FX going back 20 years.

What is the time unit for "datadelay" attribute on GoogleFinance?

I'm using the GoogleFinance() functions in a Google spreadsheet to keep track of my stocks. With the "datadelay" attribute I can check how long ago the data was last updated. But it only returns a raw number, like "54000" for one ticker and "15" for another. What time unit is that supposed to be? Minutes? Seconds? Milliseconds?
When I checked the Google Finance documentation, I saw a page explaining that there might be a delay of up to 20 minutes. It also mentions that market data is retrieved from different exchanges, and these exchanges may have different data delays. That can explain the differences in the "datadelay" column.
As for the unit of this column, my assumption is that it must be something shorter than seconds, since 54000 seconds = 900 minutes, which is far above the maximum delay described on the help page. But I am not sure what the value would be when you query on non-trading days.
The page shows delays for each exchange.
The GoogleFinance() function updates every minute (if set to do so), but keep in mind that results may be delayed by up to 20 minutes, so the answer is between 1 and 20 minutes.
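One quick sanity check you can run in the sheet itself (the ticker is just an arbitrary example, and the milliseconds interpretation is an assumption, not something the documentation states): divide the raw value by 60000 and see whether the result lands in the documented 1-20 minute range.

=GOOGLEFINANCE("NASDAQ:AAPL", "datadelay") / 60000

A raw value of 54000 would come out as 0.9 minutes, which fits the documented range, so milliseconds is a plausible unit; a value like 15 may then simply already be expressed in minutes by that particular exchange.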

Change the delay from 20 minutes to 1 minute

Quotes are not sourced from all markets and may be delayed up to 20
minutes. Information is provided 'as is' and solely for informational
purposes, not for trading purposes or advice.
This disclaimer appears when I use the GOOGLEFINANCE() function in my spreadsheet. It is unfortunate that the data can be delayed by up to 20 minutes.
What is the best way to get real-time stock prices? Suppose my budget is around $50 per month.
Be aware that I trade only US equities, i.e. no bonds, no cryptocurrencies, and so on.
UPDATE
Here is a sample version of my portfolio spreadsheet: https://docs.google.com/spreadsheets/d/1hIfCuupmc_OZ6514DXFe_NrDCX1Ix6tcvySP_VolppI/edit#gid=42667785. It is important for me to get prices in real time, not delayed by up to 20 minutes.
Is there a way to fix that?
The GOOGLEFINANCE formula is not consistent with the delays. Different stocks can be delayed by different times. You can get an estimate of the delay by using GOOGLEFINANCE("TICKER","DATADELAY").
This is at least somewhat helpful, but not ideal, because you'll have a price on your sheet without knowing exactly when it is from, just an estimate of how old it might be. And forget about pre-market or after-hours: once the market closes, all bets are off, and you'll have no idea when the price is from (i.e. an after-hours quote or the regular session close).
If you want accurate real-time quotes, you're going to need an add-on. You said your budget is $50. That doesn't leave you a lot of options. For $9 per month you can use the Market Data Add-on and get real-time stock prices along with historical intraday prices. There is also a free tier that gives you 100 free daily prices.
Market Data's STOCKDATA formula is a drop-in replacement for GOOGLEFINANCE, so it follows the same syntax. It will accomplish what you need. For example, STOCKDATA("SPY","ALL") will produce an output like this:
Date              | Bid    | Bid Size | Mid    | Ask    | Ask Size | Last   | Volume
5/19/2022 9:09:48 | 388.36 | 1400     | 388.38 | 388.41 | 1400     | 388.37 | 2715229
Note that the date and time of the quote are included in the output, so you know exactly when the quote was fetched. There is no doubt as to whether the quote is coming from the previous day or from the pre-market session (which is the case in this example). If you compare against the current time using NOW(), you'll find the Market Data quotes are delayed by about 1-2 seconds, which is due to network latency from your Google Sheet to the servers.
It's important to notice the word "may" in the first sentence:
...and may be delayed up to 20 minutes...
Usually it's way under 20 minutes (around 1 to 1.5 minutes), but there can be times when some delay occurs.
And to answer your question: no, it's not possible to force it under 1 minute.
If you want to go full pro mode with Google Sheets, try: =CRYPTOFINANCE()
The documentation links from player0 indicate that ONLY crypto exchanges are supported. Data is NOT available from stock exchanges (NASDAQ, NYSE, etc).

InfluxDB integral gives high value after missing data

I am storing Amps, Volts and Watts in InfluxDB, in a measurement/table called "Power". The update frequency is approximately every second. I can use the integral() function to get power usage (in amp-hours or watt-hours) on an hourly basis. This works very nicely, so I can get a graph of the power used each hour over a 24-hour period. My SQL is below.
The issue is that if there is a gap in the data, I get a huge spike in the result when the data returns. E.g. if data was missing from 3pm to 5:45pm, the 5pm result shows a huge spike. The reason, as far as I can see, is that there is close to a 3-hour gap, so it just calculates the area under the graph and lumps it all into the 5pm value. Can I avoid that?
SELECT INTEGRAL(Watts) FROM Power WHERE time > now() - 24h GROUP BY time(1h)
I had a similar issue with Influx. It turns out that integral() doesn't support fill(), as noted by Yuri Lachin in the comments.
Since you're grouping by hours anyway, the average power (in watts) over the hour is numerically equal to the energy consumption for the hour (in watt-hours), so you can use mean() here and you should get the correct result. (For example, a constant draw of 500 W over an hour averages 500 W and consumes 500 Wh.)
The query I'm using is:
SELECT mean("load_power") AS "load"
FROM "power_readings"
WHERE $timeFilter
GROUP BY time(1h) fill(0)
For daily numbers I can go back to using integral(), because I rarely have gaps in the data that span multiple days, so no fill is needed.
Since you can use the fill() function in this query, you can decide which of the various options makes the most sense for you (see https://docs.influxdata.com/influxdb/v1.7/query_language/data_exploration/#group-by-time-intervals-and-fill)
You need to use fill() in the GROUP BY section of the query (see the Influx docs for fill() usage).
In your case, fill(none) or fill(0) should do the job.
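Putting the two answers together for the schema in the original question, a gap-tolerant hourly query would look something like this (measurement and field names are taken from the question's query; fill(0) is just one of the fill options):

SELECT MEAN("Watts") AS "Watt_Hours" FROM "Power" WHERE time > now() - 24h GROUP BY time(1h) fill(0)

Because each group spans exactly one hour, the mean power in watts is numerically the energy in watt-hours for that hour, and missing hours plot as 0 instead of being lumped into the next returned point.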

Prepping Data For Usage Clustering

Dataset: I'm given the number of minutes individual customers use a product each day and am trying to cluster this data in order to find common usage patterns.
My question: how can I format the data so that, for example, a power user with high levels of use for a year looks the same as another power user who was only able to use the device for a month before I ended data collection?
So far I've turned each customer into an array where each cell is the number of minutes used that day. The array starts when the user first uses the product and ends after the user's first year of use. All entries must be double values (e.g. 200.0 minutes used) for the clustering model. I've considered setting all cells/days after the last day of data collection to either -1.0 or NULL. Is either of these a valid approach? If not, what would you suggest?
For the problem where you want both users to look the same (one who used the product heavily every day for a year, and one who used it heavily for only a month), create a new feature whose value is:
avg_usage per time_bin
The time_bin can be a month, a day, or whatever bin best fits your needs.
This way, a user who uses the product for, say, 200 minutes per day for one year will get:
200 * 30 * 12 / 12 = 6000 minutes per month
and the other user, who joined just last month with exactly the same usage, will also get:
200 * 30 * 1 / 1 = 6000 minutes per month.
It no longer matters when a user started using the product; the only thing that matters is the usage rate.
An important thing you might take into consideration is that products may be forgotten for some time. For example, my computer while I'm away on vacation: the days I didn't use it don't (necessarily) say anything about my general usage of the product. So, based on your data, your product, and your intuition, you might consider removing gaps like that one from the calculation.
The amount of time a user has used your product could be a signal of something, but if they indeed started only recently and are still using it today, that may be something you need to take into account, and this average-binning technique can help with it.
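As a concrete sketch of this rate-based normalization (the function name and the 30-day bin size are illustrative choices, not from the answer above). Note that no -1.0 or NULL padding is needed, because only the observed days enter the mean:

import numpy as np

def usage_rate(daily_minutes, bin_days=30):
    # daily_minutes holds only the observed days, however many there are
    observed = np.asarray(daily_minutes, dtype=float)
    if observed.size == 0:
        return 0.0
    return observed.mean() * bin_days  # mean minutes/day scaled to one bin

year_user = [200.0] * 365   # heavy user observed for a full year
month_user = [200.0] * 30   # same intensity, observed for only a month
print(usage_rate(year_user))   # 6000.0 minutes per 30-day bin
print(usage_rate(month_user))  # 6000.0 minutes per 30-day bin

Both users land on the same 6000-minutes-per-month figure from the arithmetic above, so history length no longer separates them in the clustering space.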

zabbix trigger based on one week old data

I am very new to Zabbix. I have tried my hand at triggers; what I was able to work out was that a trigger can be set on some constant threshold. What I need is a trigger that compares with the data from exactly one week earlier at the same time, and fires an alert if the change is above some percentage threshold.
I tried steps like keeping the current data and the one-week-old data in an external database and then querying that data with the Zabbix ODBC drivers, but I got stuck when I was unable to compare the two items.
If I am being confusing in stating my issue, let me know and I will be clearer about my problem.
You can use the last() function for this.
For example, if we sample our data every 5 minutes and we want to compare the latest value with the value from 10 minutes ago, we can use:
(item.last(#1)/item.last(#3)) > 1.2 - this will trigger an alert if the latest value is greater by more than 20% than the value from 10 minutes ago (#1 is the most recent sample, #3 is the sample two intervals, i.e. 10 minutes, back).
From the documentation it is not very clear to me whether you can use seconds or whether they will be ignored (for example, item.last(60) to get the value from 1 minute ago), but you can read more about the last() function here:
https://www.zabbix.com/documentation/2.4/manual/appendix/triggers/functions
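For the week-over-week comparison the question actually asks about, last() also takes a time_shift parameter (documented on the same functions page), so a trigger expression along these lines should work; the host name and item key here are hypothetical placeholders:

{myhost:power.usage.last(#1)} / {myhost:power.usage.last(#1,7d)} > 1.2

This fires when the latest value is more than 20% above the value recorded exactly seven days earlier, with no external database required.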
