Cut time series when drawdown crosses a given number - time-series

I want to scale down a time series when its drawdown crosses a threshold. For example, when the drawdown reaches 4%, I want to cut the original time series by 50%, and when the drawdown recovers to 2%, the time series should revert from 50% to 100%.
So far, I have the code to generate a random time series and the code to calculate the drawdown. I would appreciate it if anyone knew how to adjust the time series based on the drawdown. Thank you so much for your insights!
import numpy as np
import math
import pandas as pd
np.random.seed(1234)
sigma = 0.07
N = 3
IR = 1
daily_sigma = sigma / math.sqrt(252)
daily_mu = daily_sigma * IR / math.sqrt(252)
days = pd.bdate_range(pd.Timestamp(2022, 1, 1), pd.Timestamp(2022, 12, 31))
T = len(days)
def do_drawdown(cum_pnl_max, cum_pnl):
    return np.maximum(cum_pnl_max.add(-cum_pnl), 0)
def get_daily_gross(target_ir):
    return pd.DataFrame(np.random.normal(loc=daily_mu * target_ir, scale=daily_sigma, size=(N, T)),
                        columns=pd.Series(days, name='date'), index=pd.Series(np.arange(N), name='obs'))
daily_net = get_daily_gross(1)
drawdown = do_drawdown(daily_net.cumsum(axis=1).cummax(axis=1), daily_net.cumsum(axis=1))
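One possible way to approach the adjustment is a small state machine with hysteresis: exposure drops to 50% once the drawdown reaches the upper threshold and only returns to 100% once the drawdown falls back to the lower one. The sketch below is not from the original post. The function name `drawdown_scale` and its parameters are hypothetical, and it assumes the drawdown is measured on the original (unscaled) cumulative PnL, in the same units as the 4% and 2% thresholds.

```python
import numpy as np
import pandas as pd

def drawdown_scale(drawdown, cut_level=0.04, restore_level=0.02, reduced=0.5):
    """Return the exposure per period: `reduced` while in the cut state, else 1.0."""
    scale = np.ones(len(drawdown))
    cut = False
    for t, dd in enumerate(drawdown):
        if not cut and dd >= cut_level:
            cut = True        # drawdown reached the upper threshold: cut exposure
        elif cut and dd <= restore_level:
            cut = False       # drawdown recovered to the lower threshold: restore
        scale[t] = reduced if cut else 1.0
    return pd.Series(scale, index=drawdown.index)

# Small hand-made drawdown path to show the hysteresis
dd = pd.Series([0.00, 0.03, 0.05, 0.03, 0.01, 0.00])
exposure = drawdown_scale(dd)
# Lag by one period so today's exposure only uses yesterday's drawdown
lagged = exposure.shift(1, fill_value=1.0)
print(exposure.tolist())
print(lagged.tolist())
```

With the frames from the question, something like `scale = drawdown.apply(drawdown_scale, axis=1)` followed by `daily_net * scale.shift(1, axis=1, fill_value=1.0)` would apply this per observation. Whether the one-day lag is wanted, and whether the drawdown should be recomputed on the scaled series instead, depends on the intended strategy.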

Related

How to select interval of time or values range for timeseries data

I am working with a time series that has two columns, unit_time and traffic_load. unit_time holds values from 0 to 7140, and traffic_load holds float values. I want to select the rows where unit_time is between 360 and 1000. When I try the following code, I get an empty DataFrame. How can I select this interval, and what is the right way to select a range of values in a specific column?
Actual data_frame
Here is the code that I am working with
df = data_frame[(data_frame['unit_time'] >= 360) & (data_frame['unit_time'] <= 1000)]
df
This is the output I get
I expect to get the rows with unit_time from 360 to 1000.
I think you may be mixing up df and data_frame. The following example works for me:
import pandas as pd
import numpy as np
times = np.arange(7140)
loads = 100*np.random.rand(7140)
d = {'unit_time': times, 'traffic_load': loads}
data_frame = pd.DataFrame(data=d)
df = data_frame[(data_frame['unit_time'] >= 360) & (data_frame['unit_time'] <= 1000)]
df
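As a hedged alternative to the two chained comparisons, `Series.between` expresses the same inclusive range test in one call. The sketch below uses made-up data (the `traffic_load` values are arbitrary) and is only meant to show the shape of the filter.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data: unit_time 0..7139, arbitrary loads
data_frame = pd.DataFrame({
    "unit_time": np.arange(7140),
    "traffic_load": np.linspace(0.0, 100.0, 7140),
})

# between() is inclusive on both ends by default
mask = data_frame["unit_time"].between(360, 1000)
df = data_frame[mask]
print(len(df), df["unit_time"].min(), df["unit_time"].max())
```

If a filter like this returns an empty frame on real data, it may be worth checking `data_frame.dtypes`: a `unit_time` column that was read in as strings would not compare against integers as expected.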

Time series simulation (Monte Carlo) code

I'm trying to run a Monte Carlo simulation with time series, and I can't work out what I'm doing wrong because of my limited knowledge of Stata.
With my ARMA model I'm able to create a time series with 300 observations.
My idea is to do this process 1000 times and in every process I want to save the mean and variance of the 300 observations in a matrix.
This is my ARMA model:
This is my code:
clear all
set more off
set matsize 1000
matrix simulaciones =J(1000,2,0) *To save every simulation of every time series generated
matrix serie = J(300,3,0) *To save each time series
set obs 300 *For the 300 observations in every time series
gen t = _n
tsset t
g y1=0
forvalue j = 1(1)1000{
    * Creating a time series
    forvalues i = 1(1)300 {
        gen e = rnormal(0,1)
        replace y1=0 if t==1
        replace y1 = 0.7*L1.y1 + e - 0.6*L1.e if t == 2
        replace y1 = 0.7*L1.y1 - 0.1*L2.y1 + e - 0.6*L1.e + 0.08*L2.e if t > 2
        matrix serie[`i',3] = y1
        drop e y1
    }
    svmat serie
    matrix simulaciones[`j',1] = mean(y1)
    matrix simulaciones[`j',2] = var(y1)
}
I have no idea how to proceed, and any idea or recommendation is more than welcome.
Thanks a lot for your help and time.
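To make the intended loop structure concrete, here is a minimal sketch of the same experiment in Python with NumPy rather than Stata. It is not a fix for the Stata code. The recursion and its coefficients are read off the `replace` lines in the posted code (the ARMA model itself is not shown in the post), and the function name `simulate_arma22` is hypothetical.

```python
import numpy as np

def simulate_arma22(n_obs, rng):
    """One series: y_t = 0.7 y_{t-1} - 0.1 y_{t-2} + e_t - 0.6 e_{t-1} + 0.08 e_{t-2}."""
    e = rng.standard_normal(n_obs)        # draw all shocks once per series
    y = np.zeros(n_obs)                   # first observation stays at 0
    for t in range(1, n_obs):
        y[t] = 0.7 * y[t - 1] + e[t] - 0.6 * e[t - 1]
        if t >= 2:                        # second-order terms need two lags
            y[t] += -0.1 * y[t - 2] + 0.08 * e[t - 2]
    return y

rng = np.random.default_rng(0)
n_sims, n_obs = 1000, 300
simulaciones = np.empty((n_sims, 2))      # column 0: mean, column 1: variance
for j in range(n_sims):
    y = simulate_arma22(n_obs, rng)
    simulaciones[j] = (y.mean(), y.var(ddof=1))
print(simulaciones[:3])
```

The point of the sketch is the ordering: the shocks are drawn once per replication, the series is built recursively in time order, and only then are the two summary statistics stored in row `j`.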

How to export all the information from 3d numpy array to a csv file

Kaggle Dataset and code link
I'm trying to solve the above Kaggle problem and want to export the preprocessed data as a CSV so that I can build a model in Weka. But when I save it as CSV I lose a dimension; I want to retain all the information in that CSV.
Please help me with the relevant code or any resource.
Thanks
print (scaled_x)
|x |y |z |label
|1.485231 |-0.661030 |-1.194153 |0
|0.888257 |-1.370361 |-0.829636 |0
|0.691523 |-0.594794 |-0.936247 |0
import numpy as np
from scipy import stats
Fs=20
frame_size = Fs*4 #80
hop_size = Fs*2 #40
def get_frames(df, frame_size, hop_size):
    N_FEATURES = 3
    frames = []
    labels = []
    for i in range(0,len(df )- frame_size, hop_size):
        x = df['x'].values[i: i+frame_size]
        y = df['y'].values[i: i+frame_size]
        z = df['z'].values[i: i+frame_size]
        # most common label in the frame (this indexing matches older SciPy; newer versions may return a scalar)
        label = stats.mode(df['label'][i: i+frame_size])[0][0]
        frames.append([x,y,z])
        labels.append(label)
    frames = np.asarray(frames).reshape(-1, frame_size, N_FEATURES)
    labels = np.asarray(labels)
    return frames, labels
x,y = get_frames(scaled_x, frame_size, hop_size)
x.shape, y.shape
((78728, 80, 3), (78728,))
According to the link you posted, the data is time series accelerometer/gyro data sampled at 20 Hz, with a label for each sample. They want to aggregate the time series into frames (with the corresponding label being the most common label during a given frame).
So frame_size is the number of samples in a frame, and hop_size is the amount the sliding window moves forward each iteration. In other words, the frames overlap by 50% since hop_size = frame_size / 2.
Thus at the end you get a 3D array of 78728 frames of length 80, with 3 values (x, y, z) each.
EDIT: To answer your new question about how to export as CSV, you'll need to "flatten" the 3D frame array to a 2D array since that's what a CSV represents. There are multiple different ways to do this but I think the easiest may just be to concatenate the final two dimensions, so that each row is a frame, consisting of 240 values (80 samples of 3 co-ordinates each). Then concatenate the labels as the final column.
x_2d = np.reshape(x, (x.shape[0], -1))  # (78728, 240): one row per frame
full = np.concatenate([x_2d, y[:, None]], axis=1)  # labels become the last column
import pandas as pd
df = pd.DataFrame(full)
df.to_csv("frames.csv")
If you also want proper column names:
columns = []
for i in range(1, x.shape[1] + 1):
    columns.extend([f"{i}_X", f"{i}_Y", f"{i}_Z"])
columns.append("label")
df = pd.DataFrame(full, columns=columns)
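To check that no information is lost by flattening, a quick round trip on a small synthetic array may help. The sketch below uses a made-up (4, 80, 3) array in place of the real frames and an in-memory buffer in place of a file; both are assumptions for illustration only.

```python
import io
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 80, 3))     # 4 frames, 80 samples, 3 axes
y = np.array([0, 1, 1, 0])              # one label per frame

# Flatten the last two dimensions: each row holds 80 * 3 = 240 values
x_2d = x.reshape(x.shape[0], -1)
columns = [f"{i}_{axis}" for i in range(1, x.shape[1] + 1) for axis in "XYZ"]
df = pd.DataFrame(x_2d, columns=columns)
df["label"] = y

# Write to CSV and read it back (in memory here instead of a file on disk)
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
restored = pd.read_csv(buffer)

# Undo the flattening to recover the original 3D shape
x_back = restored.drop(columns="label").to_numpy().reshape(-1, 80, 3)
print(df.shape, np.allclose(x_back, x))
```

Passing `index=False` keeps the row index out of the file, which is usually what a tool that expects only feature columns and a label would want.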

How to split the image into chunks without breaking character - python

I am trying to read text from an image.
I get better results if I break the image into small chunks, but when I split the image it cuts through my characters.
The code I am using:
from __future__ import division
import math
import os
from PIL import Image
def long_slice(image_path, out_name, outdir, slice_size):
    """slice an image into parts slice_size tall"""
    img = Image.open(image_path)
    width, height = img.size
    upper = 0
    left = 0
    slices = int(math.ceil(height/slice_size))
    count = 1
    for slice in range(slices):
        #if we are at the end, set the lower bound to be the bottom of the image
        if count == slices:
            lower = height
        else:
            lower = int(count * slice_size)
        #set the bounding box! The important bit
        bbox = (left, upper, width, lower)
        working_slice = img.crop(bbox)
        upper += slice_size
        #save the slice
        working_slice.save(os.path.join(outdir, "slice_" + out_name + "_" + str(count)+".png"))
        count +=1
if __name__ == '__main__':
    #slice_size is the max height of the slices in pixels
    long_slice("/python_project/screenshot.png","longcat", os.getcwd(), 100)
Sample image: the image I want to process
Expected result (what I am trying to do):
I want to split every line into a separate image without cutting any characters
Line 1:
Line 2:
Current result: characters in the image are cropped
I don't want to cut the image at fixed pixel offsets, since each document will have different spacing and line width
Thanks
Jk
Here is a solution that finds the brightest rows in the image (i.e., the rows without text) and then splits the image on those rows. So far I have just marked the sections, and am leaving the actual cropping up to you.
The algorithm is as follows:
1. Find the sum of the luminance (I am just using the red channel) of every pixel in each row
2. Find the rows with sums that are at least 0.999 (which is the threshold I am using) as bright as the brightest row
3. Mark those rows
Here is the code that will return a list of these rows:
def find_lightest_rows(img, threshold):
    # Sum the red channel across each row (assumes pixels are RGB/RGBA tuples)
    line_luminances = [0] * img.height
    for y in range(img.height):
        for x in range(img.width):
            line_luminances[y] += img.getpixel((x, y))[0]
    # Pair each sum with its row index and sort brightest first
    line_luminances = list(enumerate(line_luminances))
    line_luminances.sort(key=lambda x: -x[1])
    lightest_row_luminance = line_luminances[0][1]
    lightest_rows = []
    for row, lum in line_luminances:
        if lum > lightest_row_luminance * threshold:
            lightest_rows.append(row)
    return lightest_rows
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 ... ]
After colouring these rows red, we have this image:
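For the cropping step that the answer leaves open, one possible approach is to group the rows that are not in the lightest set into contiguous bands, each band being one line of text. The helper below is a hypothetical sketch, not part of the original answer. The name `text_bands` is made up, and it works on plain row indices so it can be tried without an image.

```python
def text_bands(lightest_rows, height):
    """Group rows that are NOT in lightest_rows into (top, bottom) bands."""
    blank = set(lightest_rows)
    bands = []
    start = None
    for y in range(height):
        if y not in blank and start is None:
            start = y                     # first row of a text line
        elif y in blank and start is not None:
            bands.append((start, y))      # bottom is exclusive
            start = None
    if start is not None:                 # text runs to the bottom edge
        bands.append((start, height))
    return bands

# Rows 0-2, 6-7 and 11 are blank; rows 3-5 and 8-10 contain text
bands = text_bands([0, 1, 2, 6, 7, 11], height=12)
print(bands)
```

Each band could then be cut out with Pillow's `img.crop((0, top, img.width, bottom))`, since `Image.crop` takes a (left, upper, right, lower) box. Adding a pixel or two of padding around each band is a design choice that may help with characters that have faint edges.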

combine time series plot by using R

I want to combine three plots on one graph. The data is the built-in R dataset "nottem". Can someone help me write code to plot the seasonal means model, the harmonic (cosine) model, and the time series itself together, using different colors? I have already written the model code; I just don't know how to combine them for comparison.
Code:
library(TSA)
nottem
month.=season(nottem)
model=lm(nottem~month.-1)
summary(nottem)
har.=harmonic(nottem,1)
model1=lm(nottem~har.)
summary(model1)
plot(nottem,type="l",ylab="Average monthly temperature at Nottingham castle")
points(y=nottem,x=time(nottem), pch=as.vector(season(nottem)))
Just put your time series inside a matrix:
x = cbind(serie1 = ts(cumsum(rnorm(100)), freq = 12, start = c(2013, 2)),
serie2 = ts(cumsum(rnorm(100)), freq = 12, start = c(2013, 2)))
plot(x)
Or configure the plot region:
par(mfrow = c(2, 1)) # 2 rows, 1 column
serie1 = ts(cumsum(rnorm(100)), freq = 12, start = c(2013, 2))
serie2 = ts(cumsum(rnorm(100)), freq = 12, start = c(2013, 2))
require(zoo)
plot(serie1)
lines(rollapply(serie1, width = 10, FUN = mean), col = 'red')
plot(serie2)
lines(rollapply(serie2, width = 10, FUN = mean), col = 'blue')
Hope it helps.
PS: the zoo package is not strictly needed in this example; you could use the filter function instead.
You can extract the seasonal mean with:
s.mean = tapply(serie, cycle(serie), mean)
# January, assuming serie is monthly data
print(s.mean[1])
This graph is pretty hard to read, because your three sets of values are so similar. Still, if you simply want to graph all of these on the same plot, you can do it pretty easily by using the coefficients generated by your models.
Step 1: Plot the raw data. This comes from your original code.
plot(nottem,type="l",ylab="Average monthly temperature at Nottingham castle")
Step 2: Set up x-values for the mean and cosine plots.
x <- seq(1920, (1940 - 1/12), by=1/12)
Step 3: Plot the seasonal means by repeating the coefficients from the first model.
lines(x=x, y=rep(model$coefficients, 20), col="blue")
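Step 4: Calculate the y-values for the harmonic fit using the coefficients from the second model, and then plot. Note that harmonic(nottem, 1) supplies both a cosine and a sine column, so the model has a coefficient for each.
y <- model1$coefficients[1] + model1$coefficients[2] * cos(2 * pi * x) + model1$coefficients[3] * sin(2 * pi * x)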
lines(x=x, y=y, col="red")
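The two fitted curves can also be reproduced outside R as a sanity check. The sketch below is a NumPy illustration on synthetic, noise-free monthly data (not the real nottem values). It assumes a first-harmonic model with an intercept, a cosine term and a sine term, which is what a design matrix with both cos(2πt) and sin(2πt) columns gives.

```python
import numpy as np

# 20 years of monthly time points, like x in the steps above
t = 1920 + np.arange(240) / 12
# Synthetic temperatures with a known intercept and harmonic coefficients
temp = 49.0 - 9.0 * np.cos(2 * np.pi * t) - 3.0 * np.sin(2 * np.pi * t)

# Harmonic model: least squares on [1, cos(2*pi*t), sin(2*pi*t)]
X = np.column_stack([np.ones_like(t), np.cos(2 * np.pi * t), np.sin(2 * np.pi * t)])
coef, *_ = np.linalg.lstsq(X, temp, rcond=None)
harmonic_fit = X @ coef

# Seasonal means model: average each calendar month, then repeat per year
seasonal_means = temp.reshape(20, 12).mean(axis=0)
seasonal_fit = np.tile(seasonal_means, 20)
print(coef)
```

Both `harmonic_fit` and `seasonal_fit` have one value per month, so they can be drawn over the raw series with any plotting library.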
ggplot variant: If you decide to switch to the popular 'ggplot2' package for your plot, you could do it like so (melt comes from the 'reshape2' package):
library(ggplot2)
library(reshape2)
x <- seq(1920, (1940 - 1/12), by=1/12)
y.seas.mean <- rep(model$coefficients, 20)
y.har.cos <- model1$coefficients[1] + model1$coefficients[2] * cos(2 * pi * x) + model1$coefficients[3] * sin(2 * pi * x)
plot_Data <- melt(data.frame(x=x, temp=as.numeric(nottem), seas.mean=y.seas.mean, har.cos=y.har.cos), id="x")
ggplot(plot_Data, aes(x=x, y=value, col=variable)) + geom_line()

Resources