How can I display two values for a series: one as an absolute value and the other as a percentage:
|||||||||||||||| 200 — (30%)
|||||||| 100 — (15%)
||||||||||||||||||||||||| 300 — (45%)
I'm using the m_chart plugin for WordPress.
Thanks!
I am trying to find a method that allows me to automatically detect and correct abnormal periods of data points in a time series (a sequence of outliers). I've already tried ThymeBoost but it only corrects point outliers and not outlier periods.
Here is an example of a time series containing a period of outliers:
01/02/2018 288.000000
01/03/2018 332.000000
01/04/2018 277.000000
01/05/2018 233.000000
01/06/2018 204.000000
01/07/2018 216.000000
01/08/2018 175.000000
01/09/2018 218.000000
01/10/2018 413.000000
01/11/2018 416.000000
01/12/2018 151.000000
01/01/2019 224.000000
01/02/2019 563.000000
01/03/2019 413.000000
01/04/2019 238.000000
01/05/2019 343.000000
01/06/2019 176.000000
01/07/2019 533.103060
01/08/2019 230.000000
01/09/2019 364.000000
01/10/2019 324.000000
01/11/2019 437.000000
01/12/2019 738.000000
01/01/2020 619.000000
01/02/2020 728.000000
01/03/2020 506.000000
01/04/2020 500.000000
01/05/2020 886.000000
01/06/2020 892.000000
01/07/2020 268.000000
01/08/2020 32.000000
01/09/2020 45.000000
01/10/2020 51.000000
01/11/2020 373.000000
01/12/2020 61.000000
01/01/2021 73.000000
01/02/2021 779.000000
01/03/2021 584.718872
01/04/2021 614.000000
01/05/2021 489.000000
01/06/2021 534.000000
01/07/2021 455.000000
I have also tried seasonal decomposition, but it doesn't work since the series doesn't seem to have any seasonality.
Within my current research I'm trying to find out how big the impact of ad-hoc sentiment on daily stock returns is.
The calculations worked quite well and the results are plausible.
The calculations so far, using the quantmod package and Yahoo Finance data, look like this:
getSymbols(c("^CDAXX", Symbols), env = myenviron, src = "yahoo",
           from = as.Date("2007-01-02"), to = as.Date("2016-12-30"))
Returns <- eapply(myenviron, function(s) ROC(Ad(s), type = "discrete"))
ReturnsDF <- as.data.table(do.call(merge.xts, Returns))
# adjust column names
colnames(ReturnsDF) <- gsub(".Adjusted", "", colnames(ReturnsDF))
However, to make it more robust against the noisy influence of penny-stock data, I wonder how it's possible to exclude stocks that at any point in the time period go below a certain value x, let's say €1.
I guess the best approach would be to exclude them before calculating the returns and merging the xts results, or even better, before downloading them with the getSymbols command.
Does anybody have an idea how this could best be done? Thanks in advance.
Try this:
build a price frame of the Adj. closing prices of your symbols
(I use the PF function of the quantmod add-on package qmao, which has lots of other useful functions for this type of analysis: install.packages("qmao", repos = "http://R-Forge.R-project.org").)
check by column if any price is below your minimum trigger price
select only columns which have no closings below the trigger price
To stay more flexible, I would suggest using a sub-period: let's say, no price below 5 during the last 21 trading days. The toy example below illustrates my point.
I use AAPL, FB and MSFT as the symbol universe.
> symbols <- c('AAPL','MSFT','FB')
> getSymbols(symbols, from='2018-02-01')
[1] "AAPL" "MSFT" "FB"
> prices <- PF(symbols, silent = TRUE)
> prices
AAPL MSFT FB
2018-02-01 167.0987 93.81929 193.09
2018-02-02 159.8483 91.35088 190.28
2018-02-05 155.8546 87.58855 181.26
2018-02-06 162.3680 90.90299 185.31
2018-02-07 158.8922 89.19102 180.18
2018-02-08 154.5200 84.61253 171.58
2018-02-09 156.4100 87.76771 176.11
2018-02-12 162.7100 88.71327 176.41
2018-02-13 164.3400 89.41000 173.15
2018-02-14 167.3700 90.81000 179.52
2018-02-15 172.9900 92.66000 179.96
2018-02-16 172.4300 92.00000 177.36
2018-02-20 171.8500 92.72000 176.01
2018-02-21 171.0700 91.49000 177.91
2018-02-22 172.5000 91.73000 178.99
2018-02-23 175.5000 94.06000 183.29
2018-02-26 178.9700 95.42000 184.93
2018-02-27 178.3900 94.20000 181.46
2018-02-28 178.1200 93.77000 178.32
2018-03-01 175.0000 92.85000 175.94
2018-03-02 176.2100 93.05000 176.62
Let's assume you would like any instrument which traded below 175.40 during the last 6 trading days to be excluded from your analysis :-) .
As you can see, that will exclude AAPL and MSFT.
apply and the base function any, applied(!) to a 6-day subset of prices, give us exactly what we want. Showing the last 3 days of prices, excluding the instruments which did not meet our condition:
> tail(prices[,apply(tail(prices),2, function(x) any(x < 175.4)) == FALSE],3)
FB
2018-02-28 178.32
2018-03-01 175.94
2018-03-02 176.62
Adapted from the README.md file in https://github.com/tevye/HighchartsXAxisSpecificationProblem:
The GitHub repository contains three example Highstock HTML files: one working, one broken by making the data timestamps irregular, and one showing a failed attempt at a fix.
Background
The working version is a slightly modified version of the example Emerson posted when asking for help with a JavaScript console error 15 (Sorting Scatter Highstock Chart with Multiple Series). Ignoring the console error, we want to use the Highstock navigator on a scatter plot with irregular data timestamps. The working version included in the repository has a large 2D array 'points' with regular time intervals. The x-axis declaration has a 'data' definition mapping values from 'points'.
data: points.map(function(point) {
return [point[2]];
}),
The only change in the broken version is a set of arbitrary deletions from the 'points' array, made to force the timestamps to be sufficiently irregular to break the date inference provided by Highcharts. (If you delete just a few lines from the working copy's 'points' array, it still works; delete a few more and it still works, which is cool.)
In the repository's screen grab 'broken_badTickArray.html.png', you can see the dates run from Dec 31, 1969 to Jan 1, 1970 and the tick array data is indecipherable.
The attempted fix (the uploaded version is only 'representative' of several tries)
Starting with the broken version, several attempts were made to overcome the erroneous date range. The screen grab 'works_goodTickArray.html.png' shows that Highcharts boiled the large number of timestamps down to a small set of midnight day boundaries. In the attempted fix version, the following code generates an explicit set of timestamps that are then given as the value for the x-axis data.
var xtd = [];
var apr222017 = 1492819200000; // 2017-04-22T00:00:00Z in ms
var may162017 = 1494892800000; // 2017-05-16T00:00:00Z in ms
var x = 0;
do {
    xtd[x] = apr222017 + (x * 86400000); // one day = 86,400,000 ms
    x++;
} while ((apr222017 + (x * 86400000)) < may162017);
When that didn't work, the next attempt was to set 'floor' and 'ceiling'...
// ...
data: xtd,
floor: apr222017,
ceiling: may162017,
// ...
Having no luck with that, the following was added...
min: apr222017,
max: may162017,
which didn't help. Nor did removing the floor and ceiling definitions and going only with min/max.
Adding the following also failed:
tickPositioner: function() {
return xtd;
},
It fails when the number of data points is less than 1001.
What seems to be happening here is that "turbo" mode kicks in by default at 1000 entries, and the data is interpreted correctly once in that mode.
The fix: set turboThreshold to 1 for the series on axis 0:
{
    xAxis: 0,
    turboThreshold: 1,
    //min: 0,
    //max: 100,
    data: points.map(function(point) {
        return [point[0]]; //, point[1]];
    }),
    showInNavigator: true,
    enableMouseTracking: false,
    color: '#FF0000',
    showInLegend: false
}
],
My dataset includes TWO main variables X and Y.
Variable X represents distinct codes (e.g. 001X01, 001X02, etc.) for multiple computer items with different brands.
Variable Y represents the tax charged for each code of variable X (e.g. 15 = 15% for 001X01) at a store.
I've created categories for these computer items using dummy variables (e.g. an HD dummy variable for hard drives, which takes the value 1 when variable X represents an HD, etc.). I have a list of over 40 variables (two of them representing X and Y; the rest are dummy variables for the different categories I've created for computer items).
I would like to display the averages of all these categories using a loop in Stata, but I'm not sure how to do this.
For example the code:
mean Y if HD == 1
Mean estimation Number of obs = 5
--------------------------------------------------------------
| Mean Std. Err. [95% Conf. Interval]
-------------+------------------------------------------------
Tax | 7.1 2.537716 1.154172 15.24583
gives me the mean Tax for the category representing Hard Drives. How can I use a loop in Stata to automatically display all the mean Taxes charged for each category? I would do it by hand without a problem, but I want to repeat this process for multiple years, so I would like to use a loop for each year in order to come up with this output.
My goal is to create a separate Excel file with each of the computer categories I've created (38 total) and the average tax for each category by year.
Why bother with the loop and creating the indicator variables? If I understand correctly, your initial dataset allows the use of a simple collapse:
clear all
set more off
input ///
code tax str10 categ
1 0.15 "hd"
2 0.25 "pend"
3 0.23 "mouse"
4 0.29 "pend"
5 0.16 "pend"
6 0.50 "hd"
7 0.54 "monitor"
8 0.22 "monitor"
9 0.21 "mouse"
10 0.76 "mouse"
end
list
collapse (mean) tax, by(categ)
list
To take this to Excel you can try export excel or putexcel.
Run help collapse and help export for details.
Edit
Because you insist, below is an example that gives the same result using loops. I assume the same data input as before. Some testing with this example database, using expand 1000000, shows that speed is virtually the same. But almost surely you (including your future self) and your readers will prefer collapse. It is much clearer, cleaner, and more concise. It is even prettier.
levelsof categ, local(parts)
gen mtax = .
quietly {
foreach part of local parts {
summarize tax if categ == "`part'", meanonly
replace mtax = r(mean) if categ == "`part'"
}
}
bysort categ: keep if _n == 1
keep categ mtax
Stata has features that make it quite different from other languages. Once you get the hang of it, you will find that many things done with loops elsewhere can be made loop-less in Stata. In many cases, the loop-less style will be preferred.
See the corresponding help files using help <command>, and if you are not familiar with saved results (e.g. r(mean)), type help return.
A supplement to Roberto's excellent answer: after collapse, you will need a loop to export the results to Excel.
levelsof categ, local(levels)
foreach x of local levels {
    // one file per category; export excel needs "using" and a filename
    export excel using "`x'.xlsx" if categ == "`x'", replace
}
I prefer to use numerical codes for variables such as your category variable, and then assign them value labels. Here's a version of Roberto's code which does this and which, for closer correspondence to your problem, adds a "year" variable:
input code tax categ year
1 0.15 1 1999
2 0.25 2 2000
3 0.23 3 2013
4 0.29 1 2010
5 0.16 2 2000
6 0.50 1 2011
7 0.54 4 2000
8 0.22 4 2003
9 0.21 3 2004
10 0.76 3 2005
end
#delim ;
label define catl
1 hd
2 pend
3 mouse
4 monitor
;
#delim cr
label values categ catl
collapse (mean) tax, by(categ year)
levelsof categ, local(levels)
foreach x of local levels {
    export excel using "`:label (categ) `x''.xlsx" if categ == `x', replace
}
The #delim ; command makes it possible to list each code on a separate line. The "label" function in the export statement is an extended macro function that inserts a value label into the file name.
Has anybody used trec_eval? I need a "trec_eval for dummies".
I'm trying to evaluate a few search engines to compare parameters like recall-precision, ranking quality, etc. for my thesis work. I cannot find out how to use trec_eval to send queries to the search engine and get a result file which can be used with trec_eval.
Basically, for trec_eval you need a (human-generated) ground truth. That has to be in a special format:
query-number 0 document-id relevance
Given a collection like 101Categories (Wikipedia entry), that would be something like
Q1046 0 PNGImages/dolphin/image_0041.png 0
Q1046 0 PNGImages/airplanes/image_0671.png 128
Q1046 0 PNGImages/crab/image_0048.png 0
The query-number therefore identifies a query (e.g. a picture from a certain category, used to find similar ones). The results from your search engine then have to be transformed to look like
query-number Q0 document-id rank score Exp
or in reality
Q1046 0 PNGImages/airplanes/image_0671.png 1 1 srfiletop10
Q1046 0 PNGImages/airplanes/image_0489.png 2 0.974935 srfiletop10
Q1046 0 PNGImages/airplanes/image_0686.png 3 0.974023 srfiletop10
as described here. You might have to adjust the path names for the "document-id". Then you can calculate the standard metrics with trec_eval groundtruth.qrel results.
trec_eval --help should give you some ideas for choosing the parameters and measurements needed for your thesis.
trec_eval does not send any queries; you have to prepare them yourself. trec_eval only does the analysis, given a ground truth and your results.
Some basic information can be found here and here.