How to properly triangulate GSM cell towers to get a location?

First of all, I am trying to do all this disaster in C# (.NET 4), so if you can come up with some code to help me that would be appreciated, but really anything would help at this point.
I have a situation where I have a device that can only get GSM cell information (incidentally, via the AT+KCELL command), so I have a collection of values about cell towers (each has LAC, MCC, MNC, Cell ID and signal strength, and the first cell also has a timing advance). I think, therefore, I am in a good place to be able to come up with some sort of longitude and latitude coordinate (albeit inaccurate, but, well, meh). This is where I am reaching out for help, because now my little brain is confused...
I can see various services that provide cell ID resolution (Google, OpenCellID, etc.); they take LAC, MCC, etc. as arguments and return a coordinate. I figure that what they return is the coordinate of the tower I pass in. So in my case I could send off all the LACs etc. that I have and get back a collection of longitudes and latitudes. Brilliant, but that is not where my device is. Now I think I need to do some kind of triangulation, and this is where my lack of knowledge is hurting me.
So am I right so far? Assuming I am, how do I perform this calculation? Is there something out there that will tell me what to do with all these numbers or, even better, some open-source library I can reference and feed all this stuff into to get something sensible out?
I'm assuming that I would need to use the timing advance to work out an approximate distance from a cell tower (maybe using the signal strength somehow), but what do I have to do? As you can tell, I am way out of my depth here!
For example, this is something I might get back from the aforementioned AT command:
5,74,33,32f210,157e,8101,50,0,79,3,32f210,157e,80f7,37,64,5,32f210,157e,810b,37,55,32,32f210,157e,9d3,27,41,33,32f210,157e,edf8,15
Breaking it up and parsing it, I get the following (I hope I parsed this right; there is a chance there is a bug in my parsing routine, of course, but it looks reasonable):
Number of cells: 5
Cell 1
LAC: 5502
MNC: 1
MCC: 232
Cell ID: 33025
Signal: 80
ARFCN: 74
BSIC: 33
Timing advance: 0
Longitude: 14.2565389
Latitude: 48.2248439
Cell 2
LAC: 5502
MNC: 1
MCC: 232
Cell ID: 33015
Signal: 55
ARFCN: 79
BSIC: 3
Longitude: 14.2637736
Latitude: 48.2331576
Cell 3
LAC: 5502
MNC: 1
MCC: 232
Cell ID: 33035
Signal: 55
ARFCN: 64
BSIC: 5
Longitude: 14.2488966
Latitude: 48.232513
Cell 4
LAC: 5502
MNC: 1
MCC: 232
Cell ID: 2515
Signal: 39
ARFCN: 55
BSIC: 32
Longitude: 14.2488163
Latitude: 48.2277972
Cell 5
LAC: 5502
MNC: 1
MCC: 232
Cell ID: 60920
Signal: 21
ARFCN: 41
BSIC: 33
Longitude: 14.2647612
Latitude: 48.2299558
So with all that information, how do I find, as accurately as possible, where I actually am?

I can help you with a bit of the theory.
Triangulation here is basically finding the intersection point of 3 circles (strictly speaking, this is trilateration).
Each mobile tower is the center of a circle. The radius of each circle is estimated from the signal strength of that tower: the stronger the signal, the smaller the circle.
The place where the 3 circles overlap is where the user is.
In practice you can do a very basic approximation, a signal-strength-weighted centroid of the tower positions, as follows:
3 Towers at
tx1,ty1
tx2,ty2
tx3,ty3
With signal strengths s1, s2, s3
We calculate the weight of each signal: essentially a number from 0 to 1 for each tower, where the weights sum to 1.
Weighted signal w1, w2, w3 where:
w1 = s1/(s1+s2+s3)
w2 = s2/(s1+s2+s3)
w3 = s3/(s1+s2+s3)
User will be at
x: (w1 * tx1 + w2 * tx2+ w3 * tx3)
y: (w1 * ty1 + w2 * ty2+ w3 * ty3)
Here is a working example using the values from your question (the same weighting extends naturally from 3 towers to all 5):
s1 = 80
s2 = 55
s3 = 55
s4 = 39
s5 = 21
w1 = 80 / ( 80 + 55 + 55 + 39 + 21 )
w2 = 55 / ( 80 + 55 + 55 + 39 + 21 )
w3 = 55 / ( 80 + 55 + 55 + 39 + 21 )
w4 = 39 / ( 80 + 55 + 55 + 39 + 21 )
w5 = 21 / ( 80 + 55 + 55 + 39 + 21 )
w1 = 0.32
w2 = 0.22
w3 = 0.22
w4 = 0.156
w5 = 0.084
Cell 1: Longitude 14.2565389, Latitude 48.2248439
Cell 2: Longitude 14.2637736, Latitude 48.2331576
Cell 3: Longitude 14.2488966, Latitude 48.232513
Cell 4: Longitude 14.2488163, Latitude 48.2277972
Cell 5: Longitude 14.2647612, Latitude 48.2299558
Location Longitude =
14.2565389 * 0.32 +
14.2637736 * 0.22 +
14.2488966 * 0.22 +
14.2488163 * 0.156 +
14.2647612 * 0.084
Location Latitude =
48.2248439 * 0.32 +
48.2331576 * 0.22 +
48.232513 * 0.22 +
48.2277972 * 0.156 +
48.2299558 * 0.084
Result Longitude: 14.2559352
Result Latitude: 48.2292502
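Since you mentioned C#, here is a minimal sketch of that weighted-centroid calculation. The Tower type is just illustrative (adapt it to however you store your parsed cells), and it assumes the towers are close enough together that latitude/longitude can be treated as planar coordinates, as the worked example above does:

using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative container; adapt to however you store your parsed cells.
class Tower
{
    public double Longitude { get; set; }
    public double Latitude { get; set; }
    public double Signal { get; set; } // relative strength; bigger = stronger
}

static class CentroidLocator
{
    // Signal-strength-weighted centroid of the tower positions.
    // Only a rough estimate; assumes the towers are close enough
    // together that lat/long can be treated as planar coordinates.
    public static void Estimate(IList<Tower> towers, out double longitude, out double latitude)
    {
        double totalSignal = towers.Sum(t => t.Signal);
        longitude = towers.Sum(t => t.Longitude * t.Signal / totalSignal);
        latitude = towers.Sum(t => t.Latitude * t.Signal / totalSignal);
    }
}

Fed with the five cells above, it reproduces the result:

var towers = new List<Tower>
{
    new Tower { Longitude = 14.2565389, Latitude = 48.2248439, Signal = 80 },
    new Tower { Longitude = 14.2637736, Latitude = 48.2331576, Signal = 55 },
    new Tower { Longitude = 14.2488966, Latitude = 48.232513, Signal = 55 },
    new Tower { Longitude = 14.2488163, Latitude = 48.2277972, Signal = 39 },
    new Tower { Longitude = 14.2647612, Latitude = 48.2299558, Signal = 21 }
};
double lon, lat;
CentroidLocator.Estimate(towers, out lon, out lat);
Console.WriteLine("{0}, {1}", lon, lat); // ~14.2559352, 48.2292502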

This is not really an answer, but it's a starting point and I might add more to it:
The cell IDs are published, it seems:
http://openbmap.org/
I found that link via this Wikipedia page, which has links to other cell ID data sources: http://en.wikipedia.org/wiki/Cell_ID
At the bottom of that page there is a link to the cell ID data:
http://openbmap.org/latest/cellular/raw/input_raw.zip
I also found this YouTube video where a guy plays around with some apps that seem to have cell tower locations:
http://www.youtube.com/watch?v=CYvVN5dJD7A
Possibly between the cell IDs and signal strengths you can make a guess,
but AFAIK, for general triangulation you need to know the exact location of at least three towers and your distance from each of them (you could get a rough distance from signal strength, but it may just be too inaccurate); see the sketch below.
It seems Wikipedia says it's done this way: use a combination of which cell you are in, the closest tower and signal strengths to get your location:
http://en.wikipedia.org/wiki/Mobile_phone_tracking
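To sketch what that "general triangulation" (really trilateration) would look like, here is a minimal C# example under stated assumptions: you have three tower positions and a rough distance to each (for instance, each GSM timing-advance step corresponds to roughly 550 m), and the coordinates have been converted to a planar system such as metres, since lat/long degrees are not equal-scale. With noisy distances the three circles rarely meet in a single point, so the circle equations are linearised and solved directly:

using System;

static class Trilateration
{
    // Finds the point at distances d1, d2, d3 from three known points
    // by subtracting circle 1's equation from circles 2 and 3, which
    // cancels the squared terms and leaves a 2x2 linear system.
    public static bool Solve(
        double x1, double y1, double d1,
        double x2, double y2, double d2,
        double x3, double y3, double d3,
        out double x, out double y)
    {
        // a1*x + b1*y = c1 and a2*x + b2*y = c2
        double a1 = 2 * (x2 - x1), b1 = 2 * (y2 - y1);
        double c1 = d1 * d1 - d2 * d2 + x2 * x2 - x1 * x1 + y2 * y2 - y1 * y1;
        double a2 = 2 * (x3 - x1), b2 = 2 * (y3 - y1);
        double c2 = d1 * d1 - d3 * d3 + x3 * x3 - x1 * x1 + y3 * y3 - y1 * y1;

        double det = a1 * b2 - a2 * b1;
        if (Math.Abs(det) < 1e-12) { x = y = 0; return false; } // towers are collinear

        x = (c1 * b2 - c2 * b1) / det;
        y = (a1 * c2 - a2 * c1) / det;
        return true;
    }
}

For example, towers at (0, 0), (1000, 0) and (0, 1000) metres with distances 500, sqrt(650000) and sqrt(450000) put you at (300, 400). With more than three towers, or very noisy distances, a least-squares fit over all pairs would be the natural extension.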

Related

What is the average mobile rating on Google PageSpeed Insights?

We use Google PageSpeed Insights as a marketing tool to compare the download speed of the websites we build with those of our competitors. But so many mobile sites are rated in the 30s that I wondered whether that is the average mobile rating. Does anyone know? Thx
Short Answer
The average mobile rating is 31.
Long Answer
An article I found after writing the answer below actually answers the question:
This article from tunetheweb has done the hard work for us and gathered the data from HTTP Archive. (Give the article a read; it has a wealth of interesting information!)
The table below, taken from that article, covers your question (the answer is 31 for the Performance metric at the 50th percentile):
Percentile Performance Accessibility Best Practices SEO PWA
10 8 56 64 69 14
25 16 69 64 80 25
50 31 80 71 86 29
75 55 88 79 92 36
90 80 95 86 99 54
95 93 97 93 100 54
99 99 100 93 100 64
I have left the below in place, as the information may be useful to somebody, but the above answers the question much better. At least my guess of 35 wasn't a million miles away from the actual answer. hehe.
My original Answer
You would imagine that a score of 50 would be the average, right? Nope!
Lighthouse uses a log-normal curve to dictate scores.
The two key control points on that curve are the 25th percentile of real-world data, which maps to a score of 50 (so a score of 50 effectively means you are in the top 25%), and the 8th percentile, which maps to a score of 90.
The numbers used to determine these points are derived from HTTP Archive data.
You can explore the curve used for Time To Interactive scoring here as an example.
Now, I am sure someone who is a lot better at maths than me could use that data to calculate the average score for a site, but I would estimate it to be around 35 for a mobile site, which is pretty close to what you have observed.
One thing I can do is explain how the scoring works based on those control points, so you can see the various cutoff points for each metric.
The code below is taken from the maths module at https://github.com/paulirish/lh-scorecalc/tree/190bed715a3589601f314b3c8a50fb0fb147c121
I have also included the median and falloff values currently used in this calculation in the scoring variable.
To play with it, use the VALUE_AT_QUANTILE function to get the value you need to achieve a certain percentile. For example, to see the value for the 90th percentile for Time To Interactive, you would use VALUE_AT_QUANTILE(7300, 2900, 0.9) (take the median (7300) and falloff (2900) from TTI in the scoring variable, then enter the desired percentile as a decimal (90 -> 0.9)).
Similarly, the QUANTILE_AT_VALUE function does the reverse (shows the percentile that a particular value would fall at). E.g. if you wanted to see what percentile a First CPU Idle time of 3200 gets, you would use QUANTILE_AT_VALUE(6500, 2900, 3200).
Anyway, I have gone off on a bit of a tangent, but hopefully the above and below will give someone cleverer than me the info needed to work it out (I have also included the weightings for each item in the weights variable).
const scoring = {
  FCP: {median: 4000, falloff: 2000, name: 'First Contentful Paint'},
  FMP: {median: 4000, falloff: 2000, name: 'First Meaningful Paint'},
  SI: {median: 5800, falloff: 2900, name: 'Speed Index'},
  TTI: {median: 7300, falloff: 2900, name: 'Time to Interactive'},
  FCI: {median: 6500, falloff: 2900, name: 'First CPU Idle'},
  TBT: {median: 600, falloff: 200, name: 'Total Blocking Time'}, // mostly uncalibrated
  LCP: {median: 4000, falloff: 2000, name: 'Largest Contentful Paint'},
  CLS: {median: 0.25, falloff: 0.054, name: 'Cumulative Layout Shift', units: 'unitless'},
};

const weights = {
  FCP: 0.15,
  SI: 0.15,
  LCP: 0.25,
  TTI: 0.15,
  TBT: 0.25,
  CLS: 0.05
};

function internalErf_(x) {
  // erf(-x) = -erf(x);
  var sign = x < 0 ? -1 : 1;
  x = Math.abs(x);
  var a1 = 0.254829592;
  var a2 = -0.284496736;
  var a3 = 1.421413741;
  var a4 = -1.453152027;
  var a5 = 1.061405429;
  var p = 0.3275911;
  var t = 1 / (1 + p * x);
  var y = t * (a1 + t * (a2 + t * (a3 + t * (a4 + t * a5))));
  return sign * (1 - y * Math.exp(-x * x));
}

function internalErfInv_(x) {
  // erfinv(-x) = -erfinv(x);
  var sign = x < 0 ? -1 : 1;
  var a = 0.147;
  var log1x = Math.log(1 - x * x);
  var p1 = 2 / (Math.PI * a) + log1x / 2;
  var sqrtP1Log = Math.sqrt(p1 * p1 - (log1x / a));
  return sign * Math.sqrt(sqrtP1Log - p1);
}

function VALUE_AT_QUANTILE(median, falloff, quantile) {
  var location = Math.log(median);
  var logRatio = Math.log(falloff / median);
  var shape = Math.sqrt(1 - 3 * logRatio - Math.sqrt((logRatio - 3) * (logRatio - 3) - 8)) / 2;
  return Math.exp(location + shape * Math.SQRT2 * internalErfInv_(1 - 2 * quantile));
}

function QUANTILE_AT_VALUE(median, falloff, value) {
  var location = Math.log(median);
  var logRatio = Math.log(falloff / median);
  var shape = Math.sqrt(1 - 3 * logRatio - Math.sqrt((logRatio - 3) * (logRatio - 3) - 8)) / 2;
  var standardizedX = (Math.log(value) - location) / (Math.SQRT2 * shape);
  return (1 - internalErf_(standardizedX)) / 2;
}

console.log("Time To Interactive (TTI) 90th Percentile Time:", VALUE_AT_QUANTILE(7300, 2900, 0.9).toFixed(0));
console.log("First CPU Idle time of 3200 score / percentile:", (QUANTILE_AT_VALUE(6500, 2900, 3200).toFixed(3)) * 100);

Testing accuracy higher than training accuracy

I am building a tuned random forest model for multiclass classification.
I'm getting the following results:
Training accuracy (AUC): 0.9921996
Testing accuracy (AUC): 0.992237664
I saw a question related to this on this website, and the common answer seems to be that the dataset must be small and your model got lucky.
But in my case I have about 300k training data points and 100k testing data points.
Also, my classes are well balanced:
> summary(train$Bucket)
0 1 TO 30 121 TO 150 151 TO 180 181 TO 365 31 TO 60 366 TO 540 541 TO 730 61 TO 90
166034 32922 4168 4070 15268 23092 8794 6927 22559
730 + 91 TO 120
20311 11222
> summary(test$Bucket)
0 1 TO 30 121 TO 150 151 TO 180 181 TO 365 31 TO 60 366 TO 540 541 TO 730 61 TO 90
55344 10974 1389 1356 5090 7698 2932 2309 7520
730 + 91 TO 120
6770 3741
Is it possible for a model to fit this well on a large test set? Please answer if there is anything I can do to cross-verify that my model is indeed fitting really well.
My complete code:
split = sample.split(Book2$Bucket, SplitRatio = 0.75)
train = subset(Book2, split == T)
test = subset(Book2, split == F)

traintask <- makeClassifTask(data = train, target = "Bucket")
rf <- makeLearner("classif.randomForest")
params <- makeParamSet(makeIntegerParam("mtry", lower = 2, upper = 10),
                       makeIntegerParam("nodesize", lower = 10, upper = 50))
# set validation strategy
rdesc <- makeResampleDesc("CV", iters = 5L)
# set optimization technique
ctrl <- makeTuneControlRandom(maxit = 5L)
# start tuning
tune <- tuneParams(learner = rf, task = traintask, resampling = rdesc,
                   measures = list(acc), par.set = params, control = ctrl, show.info = T)
rf.tree <- setHyperPars(rf, par.vals = tune$x)
tune$y
r <- train(rf.tree, traintask)
getLearnerModel(r)

testtask <- makeClassifTask(data = test, target = "Bucket")
rfpred <- predict(r, testtask)
performance(rfpred, measures = list(mmce, acc))
The difference is of order 1e-4; nothing is wrong, it is regular statistical error (the variance of the result). Nothing to worry about. A difference of that size literally means about 0.0001 * 100,000 = 10 samples out of 100k.
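As a rough sanity check (my own back-of-the-envelope numbers, not from the original answer): the standard error of an accuracy estimate around p on n samples is about sqrt(p * (1 - p) / n). With p = 0.992 and n = 100,000 that is sqrt(0.992 * 0.008 / 100000) ≈ 0.00028, so the observed train/test gap of about 0.00004 is comfortably within one standard error.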

Recursive daily forecast

I am doing a recursive one-step-ahead daily forecast with different time series models for 2010. For example:
library(xts)       # for xts()
library(forecast)  # for forecast()

set.seed(1096)
Datum = seq(as.Date("2008/1/1"), as.Date("2010/12/31"), "days")
r = rnorm(1096)
y = xts(r, order.by = as.Date(Datum))
List.y = vector(mode = "list", length = 365L)
for (i in 1:365) {
  window.y <- window(y[, 1], end = as.Date("2009-12-30") + i)
  fit.y <- arima(window.y, order = c(5, 0, 0))
  List.y[[i]] <- forecast(fit.y, h = 1)
}
The list looks like this:
List.y
[[1]]
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
732 -0.0506346 -1.333437 1.232168 -2.012511 1.911242
[[2]]
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
733 0.03905936 -1.242889 1.321008 -1.921511 1.99963
....
[[365]]
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
1096 0.09242849 -1.1794 1.364257 -1.852665 2.037522
And now I want to extract only the forecast value for each period, [[1]] to [[365]], so I can work with the forecast data. However, I am not sure how to do this.
I tried
sa=sapply(List.y[1:365], `[`, 4)
but then I only get this:
$mean
Time Series:
Start = 732
End = 732
Frequency = 1
[1] -0.0506346
$mean
Time Series:
Start = 733
End = 733
Frequency = 1
[1] 0.03905936
...
$mean
Time Series:
Start = 1096
End = 1096
Frequency = 1
[1] 0.09242849
but I want all 365 [1] values in a numeric vector or something, so I can work with the data.
Just use sa2 = as.numeric(sa). sa2 will be a numeric vector of the forecast means (each element of sa is a one-point time series, so as.numeric collapses the list into a plain numeric vector).

finding the rth term of a sequence

The question is to give a possible formula for the rth term.
I'm able to solve two of the questions, but the rest seem to work a different way, or are just weird. As I'm studying A-levels, I think there's a common rule, or maybe an easy way, to solve sequence-related problems. I never understood sequences well enough; they're just that hard for me.
6 18 54 162
I'm able to solve this with 2*3^r.
4 7 12 19
By r^2 + 3.
But for
4 12 24 40 60
I've tried so many ways and I can't find the answer. I think there's a common rule for solving all of these; it isn't worth many marks, so it should be solvable in an easy way, but I'm not seeing how. Please help.
Here's a formula in R for the sequence:
g <- function(n) 6*n + 2*n^2 + 4
g(0:4)
[1] 4 12 24 40 60
Here is one way to solve this relation. First, recognize that it is quadratic, since the differences (8, 12, 16, 20) form an arithmetic (linear) sequence.
Then note that g(x + 1) = g(x) + 8 + 4x. Represent g(x) = a*x^2 + b*x + c.
Then:
g(x+1) = a(x+1)^2 + b(x+1) + c = g(x) + 8 + 4x = a*x^2 + b*x + c + 8 + 4x
a*x^2 + 2ax + a + b*x + b + c = a*x^2 + b*x + c + 8 + 4x
Thus
2ax + a + b = 8 + 4x
As this holds for all x, it must be that 2ax = 4x, so a = 2. Thus
4x + 2 + b = 8 + 4x
So b = 6. With these known, c is determined by g(0) = c = 4.
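If you prefer to index from 1, as in your 2*3^r and r^2 + 3 examples, substitute n = r - 1 into g: 2(r-1)^2 + 6(r-1) + 4 = 2r^2 + 2r = 2r(r+1), which gives 4, 12, 24, 40, 60 for r = 1, ..., 5.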

Extracting sampled Time Points

I have a MATLAB curve which I would like to plot, and from which I would like to extract concentration values at 17 different time samples.
The following are the time points, in minutes, at which I want the concentration values:
t = 0, 0.25, 0.50, 1, 1.5, 2, 3, 4, 9, 14, 19, 24, 29, 34, 39, 44, 49
The following is the function I have written to plot the graph:
function c_t = output_function_constrainedK2(t, a1, a2, a3, b1, b2, b3, td, tmax, k1, k2, k3)
    K_1 = (k1*k2)/(k2+k3);
    K_2 = (k1*k3)/(k2+k3);
    DV_free = k1/(k2+k3);

    c_t = zeros(size(t));

    ind = (t > td) & (t < tmax);
    c_t(ind) = conv(((t(ind) - td) ./ (tmax - td) * (a1 + a2 + a3)), (K_1*exp(-(k2+k3)*t(ind)+K_2)), 'same');

    ind = (t >= tmax);
    c_t(ind) = conv((a1 * exp(-b1 * (t(ind) - tmax)) + a2 * exp(-b2 * (t(ind) - tmax))) + a3 * exp(-b3 * (t(ind) - tmax)), (K_1*exp(-(k2+k3)*t(ind)+K_2)), 'same');

    plot(t, c_t);
    axis([0 50 0 1400]);
    xlabel('Time [mins]');
    ylabel('Concentration [MBq]');
    title('Model: Constrained K2');
end
If possible, kindly suggest how I could alter the above function so that I can come up with concentration values at the 17 time points stated above.
The following are the input values that I used to produce the curve:
output_function_constrainedK2(0:0.1:50, 2501, 18500, 65000, 0.5, 0.7, 0.3, 3, 8, 0.014, 0.051, 0.07)
This will give you concentration values at the time points you wanted. You will have to put it inside the output_function_constrainedK2 function (before the final end) so that you can access the variables t and c_t:
T = [0 0.25 0.50 1 1.5 2 3 4 9 14 19 24 29 34 39 44 49];
concentration = interp1(t, c_t, T)
