How to group dates by year in pandas timeseries? - time-series

I have the following data for AAPL:
                 High        Low       Open      Close       Volume  Adj Close
Date
1987-12-31   1.535714   1.495536   1.517857   1.500000   29400000.0   1.200883
1988-01-04   1.598214   1.508929   1.526786   1.598214   82600000.0   1.279513
1988-01-05   1.651786   1.580357   1.642857   1.593750   77280000.0   1.275938
1988-01-06   1.607143   1.562500   1.607143   1.562500   67200000.0   1.250920
1988-01-07   1.598214   1.517857   1.553571   1.589286   53200000.0   1.272364
...               ...        ...        ...        ...          ...        ...
2007-12-24  28.475714  27.827143  27.861429  28.400000  120050700.0  24.785059
2007-12-26  28.708570  28.117144  28.430000  28.421429  175933100.0  24.803761
2007-12-27  28.994286  28.257143  28.421429  28.367144  198881900.0  24.756376
2007-12-28  28.794285  28.125713  28.655714  28.547142  174911800.0  24.913471
2007-12-31  28.642857  28.250000  28.500000  28.297142  134833300.0  24.695290
5044 rows × 6 columns
What I want to do is add another column that groups all the data by month and year, so that I can run an operation on just certain months or certain year(s). The Date is already in datetime format.

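You can extract the year and month into two columns, "year" and "month". This assumes Date is a regular column; if it is the index, as the printout above suggests, use df.index.year and df.index.month instead: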
df['year'] = df['Date'].dt.year
df['month'] = df['Date'].dt.month
If you want a single column "year_and_month" you can use:
df['year_and_month'] = df['Date'].dt.year.astype('str') + '-' + df['Date'].dt.month.astype('str')
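If Date is the index rather than a column, the same idea can be sketched directly on the index. The following is a minimal example on made-up data (the values and the frame `df` are illustrative, not the AAPL data from the question); it builds a zero-padded year-month label with strftime and then aggregates by year and by year-month.

```python
import pandas as pd

# Small made-up frame with a DatetimeIndex named "Date" (illustrative values only).
df = pd.DataFrame(
    {"Close": [1.50, 1.60, 1.59, 28.40, 28.30]},
    index=pd.to_datetime(
        ["1987-12-31", "1988-01-04", "1988-01-05", "2007-12-24", "2007-12-31"]
    ),
)
df.index.name = "Date"

# Year and month taken straight from the index.
df["year"] = df.index.year
df["month"] = df.index.month

# Zero-padded label such as "1988-01", which sorts correctly as a string.
df["year_and_month"] = df.index.strftime("%Y-%m")

# Aggregate per year and per year-month.
mean_by_year = df.groupby("year")["Close"].mean()
mean_by_month = df.groupby("year_and_month")["Close"].mean()
print(mean_by_year)
print(mean_by_month)
```

Grouping on the helper columns keeps the original rows intact, so the same frame can still be filtered with something like df[df["year"] == 1988].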

Related

How to get VWAP using DolphinDB TimeSeriesEngine or ReactiveStateEngine

I am getting live tick data consisting of Time, Symbol Name, Last Traded Price, Cumulative Volume (Daily).
How can I compute VWAP in DolphinDB using 1) a custom function, 2) TimeSeriesEngine, and 3) ReactiveStateEngine? The relevant code is below.
This is stream table for getting ticks from python
t_colNames=`ts`symbol`price`vol`upd_tick
t_colTypes=`TIMESTAMP`SYMBOL`DOUBLE`DOUBLE`TIMESTAMP
This is stream table to store 1 min OHLC data
ohlc_colNames=`ts`symbol`open`high`low`close`volume`tp`last_tick`upd_1m
ohlc_colTypes=`TIMESTAMP`SYMBOL`DOUBLE`DOUBLE`DOUBLE`DOUBLE`DOUBLE`DOUBLE`TIMESTAMP`TIMESTAMP
This is 1 min OHLC TimeSeriesEngine
OHLC_sm1 = createTimeSeriesEngine(name="OHLC_sm1", windowSize=60000, step=60000, metrics=<[first(price) as open, max(price) as high, min(price) as low, last(price) as close, sum(vol) as volume, (max(price)+min(price)+last(price))/3 as tp, last(upd_tick) as last_tick, now() as upd_1m]>, dummyTable=tmp, outputTable=sm1 , timeColumn=`ts, useSystemTime=true, keyColumn=`symbol, updateTime=60000, useWindowStartTime=false);
This is the function to convert cumulative volume to volume
def calcVolume(mutable dictVolume, mutable tsAggrOHLC, msg){
t = select ts,symbol,price,vol,upd_tick from msg context by symbol limit -1
update t set prevVolume = dictVolume[symbol]
dictVolume[t.symbol] = t.vol
tsAggrOHLC.append!(t.update!("vol", <vol-prevVolume>))
}
dictVol = dict(STRING, DOUBLE)
subscribeTable(tableName="t", actionName="OHLC_sm1", offset=0, handler=calcVolume{dictVol,OHLC_sm1}, msgAsTable=true, hash=1)
I recommend using ReactiveStateEngine to convert cumulative volume to volume and then connecting two engines in series. Here is an example:
tradesData = your_tick_data
//define Trade Table
x=tradesData.schema().colDefs
share streamTable(100:0, x.name, x.typeString) as Trade
//define OHLC outputTable
share streamTable(100:0, `datetime`symbol`open`high`low`close`volume`updatetime,[TIMESTAMP,SYMBOL,DOUBLE,DOUBLE,DOUBLE,DOUBLE,LONG,TIMESTAMP]) as OHLC
//1 min OHLC TimeSeriesEngine
tsAggrOHLC = createTimeSeriesAggregator(name="aggr_ohlc", windowSize=60000, step=60000, metrics=<[first(Price),max(Price),min(Price),last(Price),wavg(Price,Volume),now()]>, dummyTable=Trade, outputTable=OHLC, timeColumn=`Datetime, keyColumn=`Symbol)
//ReactiveStateEngine:convert cumulative volume to volume
rsAggrOHLC = createReactiveStateEngine(name="calc_vol", metrics=<[Datetime, Price, deltas(Volume) as Volume]>, dummyTable=Trade, outputTable=tsAggrOHLC, keyColumn=`Symbol)
//subscribe table and insert data into engines
subscribeTable(tableName="Trade", actionName="minuteOHLC2", offset=0, handler=append!{rsAggrOHLC}, msgAsTable=true)
replay(inputTables=tradesData, outputTables=Trade, dateColumn=`Datetime)
You can use user-defined functions in any of the engine's metrics.
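To make the arithmetic behind the two chained engines concrete, here is a minimal sketch in plain Python (not DolphinDB code, and not its API). It assumes ticks arrive in time order as (timestamp in milliseconds, symbol, price, cumulative volume); the function name `minute_vwap` and the sample ticks are hypothetical. Per symbol, the traded volume is the difference between consecutive cumulative volumes, and the VWAP of each one-minute window is sum(price * volume) / sum(volume).

```python
from collections import defaultdict

def minute_vwap(ticks):
    """ticks: iterable of (ts_ms, symbol, price, cumulative_volume) in time order.

    Returns {(symbol, minute_start_ms): vwap}.
    """
    last_cum = {}              # last cumulative volume seen per symbol
    pv = defaultdict(float)    # sum of price * traded volume per window
    vol = defaultdict(float)   # sum of traded volume per window
    for ts_ms, symbol, price, cum_vol in ticks:
        prev = last_cum.get(symbol)
        last_cum[symbol] = cum_vol
        if prev is None:
            # Assumption of this sketch: the first tick only seeds the running total.
            continue
        traded = cum_vol - prev
        if traded <= 0:
            # No new volume (or a reset of the daily counter): nothing to weight.
            continue
        window = (symbol, ts_ms - ts_ms % 60_000)
        pv[window] += price * traded
        vol[window] += traded
    return {key: pv[key] / vol[key] for key in vol}

ticks = [
    (0,      "A", 100.0, 1000.0),
    (10_000, "A", 101.0, 1100.0),  # 100 traded at 101
    (20_000, "A", 103.0, 1400.0),  # 300 traded at 103
    (65_000, "A", 102.0, 1450.0),  # 50 traded at 102, in the next minute
]
result = minute_vwap(ticks)
print(result)
```

How the first tick of the day and counter resets should be treated depends on the data feed, so those two branches are the parts most likely to need adjusting.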

Date with Gaps - Wavelet Analysis in R Using Biwavelet Package

I am performing Wavelet Analysis using biwavelet package in R. The date variable does not have continuous dates but with gaps. When I try to create the graph, I get the following error.
Error in check.datum(d) : The step size must be constant (see approx function to interpolate)
An MWE is given below:
library(foreign)
library(biwavelet)
library(xts)
library(labelled)
library(zoo)
date =c("2020-02-13", "2020-02-14", "2020-02-17", "2020-02-18", "2020-02-19", "2020-02-20", "2020-02-21", "2020-02-24", "2020-02-25", "2020-02-26", "2020-02-27", "2020-02-28", "2020-03-02", "2020-03-03", "2020-03-04", "2020-03-05", "2020-03-06", "2020-03-09", "2020-03-10", "2020-03-11", "2020-03-12", "2020-03-13")
rdate = as.Date(date)
date <- as.Date(date, format = "%Y-%m-%d")
date
class(date)
var = c(-0.077423148, -0.083293147, -0.089214072, -0.095185943, -0.101208754, -0.107282504, -0.113407195, -0.119582824, -0.125809386, -0.125806898, -0.132149309, -0.138584509, -0.145112529, -0.151733354, -0.158446968, -0.165253401, -0.172152638, -0.179144681, -0.186229542, -0.193407193, -0.200677648, -0.208040923)
data = data.frame(date, var)
View(data)
X <- as.xts(data[,-1], order.by = date)
ABC <- data.frame(date, var)
wt.t1=plot(wt(ABC), form = "%b-%d")
How can I resolve this issue?
You can interpolate missing days by following the instructions in the error message:
alldates <- seq(min(date), max(date), by = 1)
interpdata <- approx(date, var, xout = alldates)
ABC <- data.frame(date = alldates, var = interpdata$y)
wt.t1 <- plot(wt(ABC), form = "%b-%d")
However, I think the reason you are missing some days is that they are Saturday or Sunday; I only see weekdays in the dataset.
For many datasets (e.g. stock market trading, etc.) it doesn't make sense to interpolate "what would the price have been on Saturday?", because trades never occur on Saturday or Sunday. In that case, I'd suggest replacing the "date" variable with a simple increment, e.g.
date <- 1:length(date)
ABC <- data.frame(date, var)
wt.t1=plot(wt(ABC), form = "%b-%d")
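For readers who want to see what the interpolation option does to the series, here is a minimal sketch in Python of linear interpolation onto a daily grid. It is not the implementation of R's approx; the function name `interpolate_daily` and the sample values are made up, and the input dates are assumed to be sorted and strictly increasing.

```python
from datetime import date, timedelta

def interpolate_daily(dates, values):
    """Linearly interpolate (dates, values) onto every calendar day in the range."""
    out_dates, out_values = [], []
    pairs = list(zip(dates, values))
    for (d0, v0), (d1, v1) in zip(pairs, pairs[1:]):
        gap = (d1 - d0).days
        # Emit d0 and every missing day before d1 on the straight line from v0 to v1.
        for k in range(gap):
            out_dates.append(d0 + timedelta(days=k))
            out_values.append(v0 + (v1 - v0) * k / gap)
    out_dates.append(dates[-1])
    out_values.append(values[-1])
    return out_dates, out_values

# Thursday, Friday, then Monday: the weekend is filled in.
dates = [date(2020, 2, 13), date(2020, 2, 14), date(2020, 2, 17)]
values = [1.0, 2.0, 5.0]
all_dates, all_values = interpolate_daily(dates, values)
print(all_dates)
print(all_values)
```

After this step every consecutive pair of dates is exactly one day apart, which is the constant step size the error message asks for.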

abas-ERP (FO- Language): Getting Weekday of abas date

Is there an FO function that returns the weekday of an abas date in short form? For example:
Today: 07.04.2016 -> Thursday (th) ?
For example, given:
.type GD xddate ? _F|defined(U|xddate)
.type int xidate ? _F|defined(U|xidate)
..
!START
.formula U|xddate = "."
.formula U|xidate = U|xddate//7
.println 'F|tostring(U|xidate)'
The variable U|xidate will contain "4", which is the fourth day of the week, Thursday.
A more thorough approach is to get the name of the weekday from the built-in dictionary.
Look in HOMEDIR/msg.cc.dic to see which number Monday has (in my case 420).
Then use these FO lines:
.type text xtweekday
.type GD xddate
.formula U|xddate = "09.04.2016"
.atext -language E xtweekday 'F|eval(420 + U|xddate//7)'
'xtweekday' then contains Saturday.
For "today", just write:
.atext -language E xtweekday 'F|eval(420 + G|date//7)'
You can also use the more powerful .translate command, but it is not really necessary in this case.
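The "day number modulo 7, then look up the name" idea can be illustrated outside abas as well. The following Python sketch is only an analogy: it uses Python's own day numbering (date.toordinal()), not the internal abas date value, and the name table `WEEKDAYS` plus the mapping of Sunday to 0 are properties of this sketch rather than of FO.

```python
from datetime import date

# Index 0 is Sunday because Python's ordinal modulo 7 yields Monday -> 1 ... Saturday -> 6.
WEEKDAYS = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]

def weekday_number(d):
    """Day count since 0001-01-01 (a Monday with ordinal 1), reduced modulo 7."""
    return d.toordinal() % 7

def weekday_name(d):
    """Look the number up in a name table, like the dictionary offset in the FO example."""
    return WEEKDAYS[weekday_number(d)]

print(weekday_number(date(2016, 4, 7)), weekday_name(date(2016, 4, 7)))
print(weekday_number(date(2016, 4, 9)), weekday_name(date(2016, 4, 9)))
```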

Lua seconds format questions

I have this function:
function SecondsFormat(X)
if X <= 0 then return "" end
local t ={}
local ndays = string.format("%02.f",math.floor(X / 86400))
if tonumber(ndays) > 0 then table.insert(t,ndays.."d ") end
local nHours = string.format("%02.f",math.floor((X/3600) -(ndays*24)))
if tonumber(nHours) > 0 then table.insert(t,nHours.."h ") end
local nMins = string.format("%02.f",math.floor((X/60) - (ndays * 1440) - (nHours*60)))
if tonumber(nMins) > 0 then table.insert(t,nMins.."m ") end
local nSecs = string.format("%02.f", math.fmod(X, 60));
if tonumber(nSecs) > 0 then table.insert(t,nSecs.."s") end
return table.concat(t)
end
I would like to add weeks and months to it, but I can't get my head around the month part (and so can't move on to the week part), because the number of days in a month isn't always the same. Can anyone offer some help?
The second question is, is using a table to store the results the most efficient way of dealing with this given the function will be called every 3 seconds for up to 100 items (in a grid)?
Edit:
function ADownload.ETA(Size,Done,Tranrate) --all in bytes
if Size == nil then return "--" end
if Done == nil then return "--" end
if Tranrate == nil then return "--" end
local RemS = (Size - Done) / Tranrate
local RemS = tonumber(RemS)
if RemS <= 0 or RemS == nil or RemS > 63072000 then return "--" end
local date = os.date("%c",RemS)
if date == nil then return "--" end
local month, day, year, hour, minute, second = date:match("(%d+)/(%d+)/(%d+) (%d+):(%d+):(%d+)")
month = month - 1
day = day - 1
year = year - 70
if tonumber(year) > 0 then
return string.format("%dy %dm %dd %dh %dm %ds", year, month, day, hour, minute, second)
elseif tonumber(month) > 0 then
return string.format("%dm %dd %dh %dm %ds",month, day, hour, minute, second)
elseif tonumber(day) > 0 then
return string.format("%dd %dh %dm %ds",day, hour, minute, second)
elseif tonumber(hour) > 0 then
return string.format("%dh %dm %ds",hour, minute, second)
elseif tonumber(minute) > 0 then
return string.format("%dm %ds",minute, second)
else
return string.format("%ds",second)
end
end
I merged the function into the main function as I figured it would probably be quicker but I now have two questions:
1: I had to add
if date == nil then return "--" end
because it occasionally errors in date:match with a complaint about "nil". The documentation for os.date mentions nothing about returning nil, since it returns a string or a table, so although the extra line of code fixes the issue, I'm wondering why that behaviour occurs; I thought I had caught all the non-events in the previous returns.
2: Sometimes I see functions written like myfunction(...), and I assume that does away with the named arguments. If so, is there a single line of code that could replace the first 3 "if" statements?
You can use the os.date function to get date values in a usable format. The '*t' formatting parameter makes the returned date a table instead of a string.
local t = os.date('*t')
print(t.year, t.month, t.day, t.hour, t.min, t.sec)
print(t.wday, t.yday)
os.date uses the current time by default; you can pass it an explicit time if you want (see the os.date docs for more info on this):
local t = os.date('*t', x)
As for table performance, I wouldn't worry about that. Not only is your function not called all that often, but table handling is much cheaper than other things you might be doing (calling os.date, lots of string formatting, etc.).
Why not let Lua's os library do the hard work for you?
There is probably an easier (read: better) way to figure out the difference to 01/01/70, but here is a quick idea of how you could use it:
function SecondsFormat(X)
if X <= 0 then return "" end
local date = os.date("%c", X) -- will give something like "01/03/70 03:40:00"
local inPattern = "(%d+)/(%d+)/(%d+) (%d+):(%d+):(%d+)"
local outPattern = "%dy %dm %dd %dh %dm %ds"
local month, day, year, hour, minute, second = date:match(inPattern)
month = month - 1
day = day - 1
year = year - 70
return string.format(outPattern, year, month, day, hour, minute, second)
end
I think that this should also be a lot quicker than constructing the table and calling string.format multiple times - but you'd have to profile that.
EDIT: I ran a quick test with two functions that concatenate "abc", "def" and "ghi" using both methods. Inserting those strings into a table and concatenating took 14 seconds (for several million runs, of course), while using a single string.format() took 6 seconds. This does not take into account that your code calls string.format() anyway (multiple times), nor the difference between you figuring out the values by division and me by pattern matching. Pattern matching is certainly slower, but I doubt that it outweighs the gains from not having a table, and it's certainly convenient to be able to leverage os.date(). The fastest way would probably be figuring out the month and day manually and then using a single string.format(). But again, you'd have to profile that.
EDIT2: missingno has a good point with using the "*t" option with os.date to give you the values separately in the first place. Again, this depends on whether you want to have a table for convenience vs. a string for storage or whatever reasons. Combining "*t" and a single string.format:
function SecondsFormat(X)
if X <= 0 then return "" end
local date = os.date("*t", X) -- will give you a table
local outPattern = "%dy %dm %dd %dh %dm %ds"
date.month = date.month - 1
date.day = date.day - 1
date.year = date.year - 1970 -- with "*t", year is the full four-digit year, not "70"
return string.format(outPattern, date.year, date.month, date.day, date.hour, date.min, date.sec)
end
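As a point of comparison for the original question about adding weeks, here is a minimal sketch in Python of the pure-arithmetic route, which does not go through a calendar date at all and therefore does not depend on time zone or locale. The function name `seconds_format` and the choice of units are assumptions of this sketch; months are deliberately left out because, as the question notes, they have no fixed length.

```python
def seconds_format(total_seconds):
    """Format a duration, e.g. 93784 -> '1d 2h 3m 4s', omitting parts that are zero."""
    if total_seconds <= 0:
        return ""
    remaining = int(total_seconds)
    parts = []
    # Largest unit first; each divmod peels off whole units and keeps the rest.
    for label, size in (("w", 604800), ("d", 86400), ("h", 3600), ("m", 60), ("s", 1)):
        count, remaining = divmod(remaining, size)
        if count:
            parts.append(f"{count}{label}")
    return " ".join(parts)

print(seconds_format(93784))
print(seconds_format(694861))
```

If months are needed for display purposes, one option is to fix a convention such as 30 days per month and add it to the unit table, accepting that the result is then only approximate.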

display a calendar year with custom hyperlinks in asp.net mvc

I'm looking to create an MVC web page that displays the 12 months of the year in a calendar format. Inside each day in the month I will bold only the dates that have any activity (data from database). The dates with activity would also be hyperlinked to a route like /Activity/2008/12/25
I'm about to try the ASP.NET AJAX Control Toolkit calendar control, but was wondering if anyone else had any advice.
Rendering a calendar is not extremely complicated. Using DateTimeFormatInfo from System.Globalization together with DateTime, all the necessary information can be retrieved:
DateTimeFormatInfo.CurrentInfo.FirstDayOfWeek
DateTimeFormatInfo.CurrentInfo.GetMonthName(month)
DateTimeFormatInfo.CurrentInfo.GetAbbreviatedDayName((DayOfWeek)dayNumber)
A month in the calendar can be rendered in a table:
_ _ _ 1 2 3 4
5 6 7 8 9 ...
To determine the number of empty cells at the beginning, something like this can be used:
DateTime date = new DateTime(year, month, 1);
int emptyCells = ((int)date.DayOfWeek + 7
- (int)DateTimeFormatInfo.CurrentInfo.FirstDayOfWeek) % 7;
As there are at most 31 days in a month and at most 6 empty cells at the beginning, a month can be rendered in at most Ceil(37 / 7) = 6 rows. So there are at most 42 cells to render in a month, some of which will be empty.
A new row is started in the table every 7 cells (the number of days in a week).
int days = DateTime.DaysInMonth(year, month);
for (int i = 0; i != 42; i++)
{
    if (i % 7 == 0) {
        // Close the previous row before opening the next one.
        if (i > 0) writer.WriteLine("</tr>");
        writer.WriteLine("<tr>");
    }
    if (i < emptyCells || i >= emptyCells + days) {
        writer.WriteLine("<td class=\"cal-empty\"> </td>");
    } else {
        writer.WriteLine("<td class=\"cal-day\">" + date.Day + "</td>");
        date = date.AddDays(1);
    }
}
// Close the last row.
writer.WriteLine("</tr>");
Finally, for dates that have activity, wrap the day number in the non-empty cell in a link to the desired route.
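The cell layout and the activity links can be sketched compactly in Python with the standard calendar module, which is not the ASP.NET approach above but follows the same logic of empty leading and trailing cells. The function name `month_rows`, the CSS class names, and the `active_days` set are assumptions for illustration; the /Activity/year/month/day route comes from the question.

```python
import calendar

def month_rows(year, month, active_days, first_weekday=calendar.SUNDAY):
    """Return one '<tr>...</tr>' string per week of the given month."""
    cal = calendar.Calendar(firstweekday=first_weekday)
    rows = []
    # monthdayscalendar yields full weeks; 0 marks a cell outside the month.
    for week in cal.monthdayscalendar(year, month):
        cells = []
        for day in week:
            if day == 0:
                cells.append('<td class="cal-empty"></td>')
            elif day in active_days:
                # Bold, hyperlinked day for dates that have activity.
                link = f'<a href="/Activity/{year}/{month}/{day}">{day}</a>'
                cells.append(f'<td class="cal-day"><b>{link}</b></td>')
            else:
                cells.append(f'<td class="cal-day">{day}</td>')
        rows.append("<tr>" + "".join(cells) + "</tr>")
    return rows

rows = month_rows(2008, 12, active_days={25})
print("\n".join(rows))
```

Unlike the fixed 42-cell loop, this produces only as many rows as the month needs, so a year view built from it would have months of varying height unless the rows are padded to six.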
