I am a newcomer to fMRI/neuroscience, so my apologies if this question is too basic.
I understand that the repetition time (TR) is the time between successive RF pulses used to measure the fMRI signal. There are also trial onset times, at which we show the subject a stimulus.
Can someone explain how trial onset times and the repetition time (TR) are related? For example, for the dataset here, the trial onsets start from about 6 s and the repetition time is 2 s. Does that mean we record the first fMRI image/volume at 0 s or at 2 s?
I have a discrete time series covering 49 quarters between January 2007 and March 2019, which I am trying to analyse. Before undertaking various forms of analysis I wanted to check for the existence of seasonality, and I have tried two methods for this in R. In the first I used the WO function (Webel and Ollech) from the seastests package, which informed me that the data did not display seasonality.
library(seastests)
summary(wo(tt))

Test used: WO
Test statistic: 0
P-value: 0.8174965 0.5785041 0.2495668
The WO - test does not identify seasonality
However, I wanted to double-check and used the decompose function, from which I got the output below, which would appear to suggest a seasonal component. Can anyone advise whether:
I am reading the decomposed data correctly?
AND
Why is there such disagreement between the decompose and seastests results?
The decompose function is a simple function that basically estimates the (moving) period average. The volatility of your time series increases strongly in the last years, so the averages may pick up on some random increases. Also, the seasonal component that you obtain from decompose() will basically always look seasonal, even for pure white noise:
set.seed(1234)
x <- ts(rnorm(80), frequency = 4)   # pure white noise, stored as a quarterly series
seastests::wo(x)                    # seasonality test on the noise
plot(decompose(x))                  # the "seasonal" panel still shows a regular pattern
Therefore, seasonality tests are preferable for assessing whether a time series really is seasonal.
Still, if you have information that the data generating process has changed, you may want to use the test on the last few years of observations.
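For instance, a minimal sketch of that idea, assuming your series tt is a quarterly ts starting in 2007 Q1 and picking 2015 Q1 as a purely illustrative cut-off:

# Webel-Ollech test restricted to the most recent observations;
# the 2015 Q1 cut-off is arbitrary and only for illustration.
tt_recent <- window(tt, start = c(2015, 1))
summary(seastests::wo(tt_recent))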
I am new to R and am trying to learn time series analysis on the wmurders dataset of the fpp2 package. To start with, when I try a classical decomposition I keep getting this error. There are 55 observations, one for each year. Shouldn't the frequency be 1? Would someone please tell me how to go about this?
Thanks a million
Annual data does not have seasonality, so you can't do a seasonal decomposition.
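A minimal sketch of why the error appears (assuming the fpp2 package is installed; the exact wording of the error message may vary by R version):

library(fpp2)           # provides the wmurders series
frequency(wmurders)     # 1: one observation per year, so there is no within-year cycle
# decompose(wmurders)   # errors, because decompose() needs a frequency greater than 1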
I am considering implementing a complete-linkage clustering algorithm from scratch for study purposes. I've seen that there is a big difference compared to single linkage:
Unlike single linkage, the complete linkage method can be strongly affected by draw cases (where there are 2 groups/clusters with the same distance value in the distance matrix).
I'd like to see an example of a distance matrix where this occurs and understand why it happens.
Consider the 1-dimensional data set
1 2 3 4 5 6 7 8 9 10
Depending on how you do the first merges, you can get pretty good or pretty bad results. For example, first merge 2-3, 5-6 and 8-9. Then 2-3-4 and 7-8-9. Compare this to the "obvious" result that most humans would produce.
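A minimal sketch in R (using base hclust rather than a from-scratch implementation) that exposes the ties: all neighbouring points are exactly 1 apart, so the distance matrix is full of draws and the merge order depends on how they are broken.

x  <- 1:10
d  <- dist(x)                        # many tied entries: every neighbouring pair is at distance 1
hc <- hclust(d, method = "complete")
hc$merge                             # the merge sequence under R's particular tie-breaking
plot(hc)                             # a from-scratch version that breaks draws differently
                                     # can produce a different dendrogram from the same d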
Let's start off with "I know ML cannot predict stock markets better than monkeys."
But I just want to go through with it.
My question is a theoretical one.
Say I have date, open, high, low, close as columns. So I guess I have 4 features: open, high, low, close.
'my_close' is going to be my label (answer), and I will use the 'close' value 7 days from the current row. Basically, I shift the 'close' column up 7 rows and make it a new column called 'my_close'.
LSTMs work on sequences. So say the sequence I set is 20 days.
Hence my shape will be (1000 days of data, 20 days as a sequence, 3 features).
The problem that is bothering me is: should these 20 days or rows of data have the exact same label, or can they have individual labels?
Or have I misunderstood the whole theory?
Thanks guys.
In your case, you want to predict the current day's stock price using the previous 7 days' stock values. The way you are building your inputs and outputs requires some modification before feeding them into the model.
You are making a mistake in understanding timesteps (your sequences).
Timesteps (sequence length), in layman's terms, is the number of inputs we consider while predicting the output. In your case, it will be 7 (not 20), as we will be using the previous 7 days' data to predict the current day's output.
Your input should be the previous 7 days of info:
[F11, F12, F13], [F21, F22, F23], ..., [F71, F72, F73]
In Fij, F represents a feature value, i is the timestep and j is the feature number.
The output will be the stock price of the 8th day.
Here your model will analyze the previous 7 days' inputs and predict the output.
So, to answer your question: you will have one common label for each 7-day input window.
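A minimal sketch of that construction in plain R (prices is a hypothetical data frame with open, high, low and close columns, ordered by date):

n_steps <- 7
feats   <- c("open", "high", "low")              # 3 features per timestep
n_win   <- nrow(prices) - n_steps                # number of usable windows
X <- array(NA_real_, dim = c(n_win, n_steps, length(feats)))
y <- numeric(n_win)
for (i in seq_len(n_win)) {
  X[i, , ] <- as.matrix(prices[i:(i + n_steps - 1), feats])   # 7 days of inputs
  y[i]     <- prices$close[i + n_steps]                       # single label: day 8's close
}
dim(X)   # (samples, 7 timesteps, 3 features) -- the shape an LSTM layer expects

Each slice of X gets exactly one entry of y, which is the common label described above.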
I strongly recommend you study LSTMs a bit more.
I am currently using the 20NewsGroup-18828 dataset in Weka. I have selected a subset of documents with 100 per category (2000 documents in total), which I divided into a 70% (training) / 30% (testing) split. When I tried classification with Naive Bayes, SVM and k-NN, the accuracy was very low. Here is the list of operations I am performing on the dataset:
StringToWordVector (indexing and term weighting with TF-IDF, Smart stopword list, Snowball stemmer)
Dimensionality reduction with feature selection (InformationGain)
Dimensionality reduction with feature transformation (Random Projection)
When I use the original dataset with 20,000 docs it performs well, but it has duplications, i.e. some documents are classified in multiple categories.
Has anyone used this dataset, or can someone tell me what I am doing wrong?
Regarding differences between datasets
The main differences between 20newsgroup (o, the original dataset) and 20newsgroup-18828 (m, the modified one) are:
o contains duplicates, m does not
o poses a trivial problem, as it includes the newsgroup identification header; m includes only the From and Subject headers (so it is still an easy version of the problem, but harder than o). For example:
FILE 51126 regarding atheism
In original form:
Path: cantaloupe.srv.cs.cmu.edu!crabapple.srv.cs.cmu.edu!fs7.ece.cmu.edu!europa.eng.gtefsd.com!howland.reston.ans.net!noc.near.net!news.centerline.com!uunet!olivea!sgigate!sgiblab!adagio.panasonic.com!nntp-server.caltech.edu!keith
From: keith#cco.caltech.edu (Keith Allan Schneider)
Newsgroups: alt.atheism
Subject: Re: >>>>>>Pompous ass
Message-ID: <1pi9btINNqa5#gap.caltech.edu>
Date: 2 Apr 93 20:57:33 GMT
References: <1ou4koINNe67#gap.caltech.edu> <1p72bkINNjt7#gap.caltech.edu> <93089.050046MVS104#psuvm.psu.edu> <1pa6ntINNs5d#gap.caltech.edu> <1993Mar30.210423.1302#bmerh85.bnr.ca> <1pcnqjINNpon#gap.caltech.edu>
Organization: California Institute of Technology, Pasadena
Lines: 9
NNTP-Posting-Host: punisher.caltech.edu
kmr4#po.CWRU.edu (Keith M. Ryan) writes:
>>Then why do people keep asking the same questions over and over?
>Because you rarely ever answer them.
Nope, I've answered each question posed, and most were answered
multiple times.
keith
In modified form (the -18828 version):
From: keith#cco.caltech.edu (Keith Allan Schneider)
Subject: Re: >>>>>>Pompous ass
kmr4#po.CWRU.edu (Keith M. Ryan) writes:
>>Then why do people keep asking the same questions over and over?
>Because you rarely ever answer them.
Nope, I've answered each question posed, and most were answered
multiple times.
keith
As you can see, the original data is so simple that you can actually find the name of the label inside the file... this is why you will always get good scores on such data, even if your whole processing concept is very, very wrong.
So the question is not "what is wrong with 20newsgroup-18828" but rather "what is wrong with the original dataset".
General ideas
First, why would you assume that anything is wrong? You are applying rather arbitrary data representation processing (two different dimensionality reduction steps) to a very small dataset (70 training vectors per class). There is nothing wrong with this data; it is simple NLP data, and, like most NLP tasks, it requires large amounts of data, while "naive" (not NLP-based) dimensionality reduction techniques come with no guarantee of actually helping.
Second, even if you do something wrong, in 90% of cases (an arbitrarily high number) the error lies between what the user thinks he does and what he actually does. So describing what you do won't lead to any help; you have to show exactly what you do (by giving a reproducible example).