Generation Mean Analysis with R

I collected data on crosses that I made (six generations: P1, P2, F1, BC1P1, BC1P2, and F2) for Generation Mean Analysis. Which R script or package can I use to analyse the data and obtain each generation's mean and the genetic parameters?

There is currently no R package dedicated to generation mean analysis.
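That said, the generation means and the additive-dominance parameters of Mather and Jinks ([m], [d], [h]) can be estimated with base R alone. Below is a minimal sketch using weighted least squares; the data frame dat and its column names (generation, trait) are assumptions, so adapt them to your own data.

# Toy data so the sketch runs end to end; replace with your own measurements
set.seed(1)
dat <- data.frame(
  generation = rep(c("P1", "P2", "F1", "BC1P1", "BC1P2", "F2"), each = 30),
  trait = rnorm(180, mean = rep(c(20, 10, 17, 18, 14, 16), each = 30), sd = 2))

# Per-generation means, variances and sample sizes
means <- aggregate(trait ~ generation, data = dat, FUN = mean)
vars  <- aggregate(trait ~ generation, data = dat, FUN = var)
ns    <- aggregate(trait ~ generation, data = dat, FUN = length)

gen <- c("P1", "P2", "F1", "BC1P1", "BC1P2", "F2")
y   <- means$trait[match(gen, means$generation)]
se2 <- vars$trait[match(gen, vars$generation)] / ns$trait[match(gen, ns$generation)]

# Expected generation means under the additive-dominance model:
#             m    [d]   [h]
X <- matrix(c(1,    1,   0,     # P1    = m + [d]
              1,   -1,   0,     # P2    = m - [d]
              1,    0,   1,     # F1    = m + [h]
              1,  0.5, 0.5,     # BC1P1 = m + 0.5[d] + 0.5[h]
              1, -0.5, 0.5,     # BC1P2 = m - 0.5[d] + 0.5[h]
              1,    0, 0.5),    # F2    = m + 0.5[h]
            ncol = 3, byrow = TRUE, dimnames = list(gen, c("m", "d", "h")))

# Weighted least squares, weighting each mean by the inverse of its squared standard error
fit <- lm(y ~ X - 1, weights = 1 / se2)
summary(fit)   # estimates of m, [d] and [h] with standard errors

The same design matrix can be extended with epistatic terms ([i], [j], [l]) for the six-parameter model if the three-parameter fit turns out to be inadequate.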

Related

What statistical method to use for multivariate abundance data with random effects?

I am working with multivariate data with random effects.
My hypothesis is this: D has an effect on A1 and A2, where A1 and A2 are binary data, and D is a continuous variable.
I also have a random effect, R, that is a factor variable.
So my model would be something like this: A1 and A2 ~ D, random = ~ 1 | R
I tried the function manyglm in the mvabund package, but it cannot handle random effects. Alternatively I could use lme4, but it cannot handle multivariate responses.
I could convert my multivariate response into a four-level factor variable, but I haven't found a method that accepts a factor (rather than binary) response. I could also convert the continuous D into a factor variable.
Do you have any advice about what to use in that situation?
First, I know this should be a comment and not a complete answer but I can't comment yet and thought you might still appreciate the pointer.
You should be able to analyze your data with the MCMCglmm R package (see here for an intro), as it can handle mixed models with multivariate response data.
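As a minimal sketch of what such a model could look like (the simulated data frame, the prior and the model terms below are illustrative assumptions, not a tested analysis of your data):

library(MCMCglmm)

# Toy data in the shape described: two binary responses A1, A2, a continuous D
# and a grouping factor R; replace with your own data frame
set.seed(1)
n <- 500
dat <- data.frame(D = rnorm(n), R = factor(sample(letters[1:10], n, replace = TRUE)))
dat$A1 <- rbinom(n, 1, plogis(0.8 * dat$D))
dat$A2 <- rbinom(n, 1, plogis(-0.5 * dat$D))

# For categorical (binary) responses the residual variance is not identifiable, so fix it
prior <- list(R = list(V = diag(2), fix = 1),
              G = list(G1 = list(V = diag(2), nu = 2)))

m <- MCMCglmm(cbind(A1, A2) ~ trait - 1 + trait:D,  # separate intercept and D slope per response
              random = ~ us(trait):R,               # correlated random effects of R on both responses
              rcov   = ~ us(trait):units,
              family = c("categorical", "categorical"),
              prior  = prior, data = dat)           # default nitt/burnin/thin; increase for real use

summary(m)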

Would a time-aware recommender system work for my data set?

I have implicit feedback data.
Customer Data: <CustomerID> <Product Bought> <Date of Purchase>
C1 P1 01-11-2008
C1 P2 01-01-2009
C1 P3 01-01-2020
C2 P1 01-01-2021
I am building a recommender system. I have built it using a co-occurrence matrix, with GraphLab, and used the Jaccard similarity measure.
My objective now is to recommend products that a customer is likely to buy in the next 6 months. For C2, I should recommend product P2 and not P3. How should I handle this problem? I have read about CARS (context-aware recommender systems), especially Fast FM, but it did not serve me well, and plain item-item similarity is not doing well for me either. Please help me with how to handle this.
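For what it's worth, one simple way to make an item-item approach time-aware is to weight the co-occurrence of two products by the time gap between their purchases, so that products typically bought shortly after one another score highest. A rough R sketch follows; the 180-day half-life is an illustrative assumption and the toy data frame reproduces the example above with the dates read as dd-mm-yyyy.

# Gap-weighted co-occurrence: "customers who bought i tended to buy j within ~6 months"
purchases <- data.frame(
  CustomerID = c("C1", "C1", "C1", "C2"),
  Product    = c("P1", "P2", "P3", "P1"),
  Date       = as.Date(c("01-11-2008", "01-01-2009", "01-01-2020", "01-01-2021"),
                       format = "%d-%m-%Y"))

half_life <- 180  # days: a 6-month gap halves the weight
items <- sort(unique(purchases$Product))
score <- matrix(0, length(items), length(items), dimnames = list(items, items))

# Accumulate gap-weighted "bought j after i" evidence per customer
for (cust in split(purchases, purchases$CustomerID)) {
  cust <- cust[order(cust$Date), ]
  n <- nrow(cust)
  if (n < 2) next
  for (a in 1:(n - 1)) for (b in (a + 1):n) {
    gap <- as.numeric(cust$Date[b] - cust$Date[a])
    score[cust$Product[a], cust$Product[b]] <-
      score[cust$Product[a], cust$Product[b]] + 0.5 ^ (gap / half_life)
  }
}

# Recommend for a customer by summing scores from everything they already own
recommend <- function(customer) {
  owned <- unique(purchases$Product[purchases$CustomerID == customer])
  s <- colSums(score[owned, , drop = FALSE])
  sort(s[!(names(s) %in% owned)], decreasing = TRUE)
}
recommend("C2")   # P2 scores far above P3, matching the behaviour asked for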

Optimal Range for Universal Scalability Law (USL)

I'm writing a report and need a test for the scalability of a mind-map database software design idea. I want to use the USL equation to get a quantifiable metric for scalability, but I have no idea what range is considered good for the USL. Any help would be appreciated :)
USL equation:
C(N) = N / (1 + α(N − 1) + βN(N − 1))
The three terms in the denominator are associated respectively with the three Cs: the level of concurrency, a contention penalty (with strength α), and a coherency penalty (with strength β). The parameter values are defined in the range 0 ≤ α, β < 1. The independent variable N can represent either the number of concurrent users (software scalability) or the number of processors/nodes (hardware scalability).
Do you mean the number of measurements by "what range"? If so, you cannot know the required number of measured data points beforehand. You have to keep adding data points until the predicted maximum concurrency no longer changes when more points are included.
The estimated parameters, and predictions thereof, are not reliable if you use the MS Excel spreadsheet method explained in the book "Guerrilla Capacity Planning". Check out the paper "Mythbuster for the Guerrillas" to understand why, and how to get reliable results. It might also be worth reading the paper "Better Prediction Using the Super-serial Scalability Law Explained by the Least Square Error Principle and the Machine Repairman Model".
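As an alternative to the spreadsheet approach, the USL parameters can also be fitted directly by non-linear least squares, for example in R with nls. A minimal sketch follows; the toy measurements and starting values are assumptions, so replace them with your own data, with C normalised so that C(1) = 1.

# Fit C(N) = N / (1 + alpha*(N - 1) + beta*N*(N - 1)) by non-linear least squares
meas <- data.frame(N = c(1, 2, 4, 8, 16, 32, 64),
                   C = c(1, 1.95, 3.72, 6.80, 11.0, 14.3, 13.6))

fit <- nls(C ~ N / (1 + alpha * (N - 1) + beta * N * (N - 1)),
           data = meas, start = list(alpha = 0.05, beta = 0.001))
summary(fit)

# Load at which the fitted curve peaks: N* = sqrt((1 - alpha) / beta)
p <- coef(fit)
sqrt((1 - p["alpha"]) / p["beta"])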

Design a model to predict the value of a sum of functions

I am working on a data mining project and have to design the following model.
I am given 4 features x1, x2, x3 and x4, and three functions defined on these
features, such that each function depends on some subset of the available features.
e.g.
F1(x1, x2) =x1^2+2x2^2
F2(x2, x3) =2x2^2+3x3^3
F3(x3, x4) =3x3^3+4x4^4
This means F1 is a function that depends on features x1 and x2, F2 depends on x2 and x3, and so on.
I have a training data set in which the values of x1, x2, x3, x4 are known, together with the total sum F1 + F2 + F3 (I know the total sum, but not the value of each individual function).
Using these training data I have to build a model that can correctly predict the total sum of all the functions, i.e. F1 + F2 + F3.
I am new to data mining and machine learning, so I apologize in advance if this question is too trivial or ill-posed. I have tried to model it in several ways but have not reached a clear approach. I would appreciate any help.
Your problem is non-linear regression. You have features
x1, x2, x3, x4 and a target S, where S = F1 + F2 + F3.
You would like to predict S from the x's, but S is a non-linear function of them.
Since S is non-linear, you need a non-linear regression algorithm for this problem. Ordinary non-linear regression may solve it, or you may choose other approaches: you could, for example, try regression trees or MARS (Multivariate Adaptive Regression Splines). Both are well-known algorithms with commercial and open-source implementations.
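As a rough illustration of both suggestions, here is an R sketch on synthetic data generated from the functions in the question; the rpart and earth packages, the sample size and the train/test split are assumptions for the example only.

# S = F1 + F2 + F3 = x1^2 + 4*x2^2 + 6*x3^3 + 4*x4^4; only S is used for training
set.seed(1)
n   <- 2000
dat <- data.frame(x1 = runif(n, -2, 2), x2 = runif(n, -2, 2),
                  x3 = runif(n, -2, 2), x4 = runif(n, -2, 2))
dat$S <- dat$x1^2 + 4 * dat$x2^2 + 6 * dat$x3^3 + 4 * dat$x4^4

train <- dat[1:1500, ]
test  <- dat[1501:2000, ]

library(rpart)   # regression tree
library(earth)   # MARS

tree_fit <- rpart(S ~ x1 + x2 + x3 + x4, data = train)
mars_fit <- earth(S ~ x1 + x2 + x3 + x4, data = train, degree = 2)

# Compare out-of-sample RMSE
rmse <- function(pred, obs) sqrt(mean((pred - obs)^2))
rmse(predict(tree_fit, test), test$S)
rmse(predict(mars_fit, test), test$S)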

SVD output interpretation in mahout

I am trying to run an SVD job in Mahout. I have a document-term matrix (say A) of size 372053 x 21338 (M = 372053 documents, N = 21338 unique words), so A is M x N. I ran the SVD using Mahout and got the cleaned eigenvectors (I gave the expected rank as R = 200), so I now have an eigenvector matrix of size R x N.
Stating the SVD equation
A = U * S * V' (V' being transpose of V)
I need to convert the matrix A to the new space, to get the compressed vectors of the documents (I am trying to implement LSI)
What is the output I get from Mahout SVD (in terms of the equation above)? I read on the mailing list that we can get the eigenvalues from the NamedVectors in the generated eigenvector matrix.
Please guide me on how to proceed from here to generate the document matrix in the new space (of size M x R).
Any help is highly appreciated :)
A good starting point for LSI with Stochastic SVD on Mahout can be found here.
Helpfully, the paper also describes the folding-in process and is explicit about the output format in terms of the SVD equation.
The work is integrated into the latest version (0.8) and can be used via the SSVDCli job or through the Mahout CLI with mahout ssvd <options>
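For the algebra itself, here is a small R illustration (not Mahout code): with A = U * S * V', the documents in the reduced space are the rows of U_k * S_k (equivalently A * V_k), and a new document q is folded in as q * V_k * S_k^(-1); the R x N eigenvector matrix you describe corresponds to V'. The toy matrices below are assumptions for illustration only.

# Toy document-term matrix; Mahout's SSVD produces the same factors at scale
set.seed(42)
M <- 100; N <- 50; R <- 10
A <- matrix(rpois(M * N, lambda = 0.3), nrow = M)

s <- svd(A, nu = R, nv = R)
U <- s$u                 # M x R left singular vectors
S <- diag(s$d[1:R])      # R x R singular values
V <- s$v                 # N x R right singular vectors (V' is R x N)

# Documents in the reduced concept space: M x R
docs_reduced <- U %*% S  # equivalently A %*% V

# Folding in a new document q (1 x N term vector) without recomputing the SVD
q <- matrix(rpois(N, lambda = 0.3), nrow = 1)
q_reduced <- q %*% V %*% solve(S)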
