Does anybody know what algorithm is used here?
I want to implement this function to group detection windows.
Thank you.
If you look at the OpenCV source code for the partition function, you will see the following comments:
// This function splits the input sequence or set into one or more equivalence classes and
// returns the vector of labels - 0-based class indexes for each element.
// predicate(a,b) returns true if the two sequence elements certainly belong to the same class.
//
// The algorithm is described in "Introduction to Algorithms"
// by Cormen, Leiserson and Rivest, the chapter "Data structures for disjoint sets"
template<typename _Tp, class _EqPredicate> int partition( const vector<_Tp>& _vec, vector<int>& labels, _EqPredicate predicate=_EqPredicate())
{
// ... etc.
}
This gives you both the source code and the reference for the algorithm: the disjoint-set (union-find) data structure, covered in Chapter 21 of that book.
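For the detection-window grouping use case, here is a minimal sketch of how cv::partition might be called with a rectangle-overlap predicate. The predicate and the 0.5 threshold are illustrative assumptions, not OpenCV's own grouping logic:

#include <opencv2/core.hpp>
#include <algorithm>
#include <vector>

// Illustrative predicate: two detections belong to the same class if they overlap enough.
static bool similarRects(const cv::Rect& a, const cv::Rect& b)
{
    const double inter = (a & b).area();
    return inter > 0.5 * std::min(a.area(), b.area());  // threshold is an assumption
}

int main()
{
    std::vector<cv::Rect> detections = { {10, 10, 50, 50}, {12, 11, 50, 50}, {200, 200, 40, 40} };
    std::vector<int> labels;

    // cv::partition runs union-find over the pairs accepted by the predicate
    // and writes a 0-based class index for each input element.
    const int nClasses = cv::partition(detections, labels, similarRects);

    // Detections sharing a label can then be merged (e.g. averaged) per class.
    (void)nClasses;
    return 0;
}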
I was trying to implement vector algebra with generic algorithms and ended up playing with iterators. I have found two examples of non-obvious and unexpected behaviour:
if I have a pointer p to a struct (instance) with field fi, I can access the field as simply as p.fi (rather than p.*.fi)
if I have a "member" function fun(this: *Self) (where Self = @This()) and an instance s of the struct, I can call the function as simply as s.fun() (rather than (&s).fun())
My questions are:
is it documented (or in any way mentioned) somewhere? I've looked through both the language reference and the guide from ziglearn.org and didn't find anything
what is it that we observe in these examples? syntactic sugar for two particular cases, or are there more general rules from which such behaviour can be deduced?
are there more examples of weird pointer behaviour?
For 1 and 2, you are correct. In Zig the dot works for both struct values and struct pointers transparently. Similarly, namespaced functions also do the right thing when invoked.
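As a small sketch of both behaviours together (my own example, not from the original answer):

const std = @import("std");

const Vec = struct {
    x: i32,

    fn double(self: *Vec) void {
        self.x *= 2;
    }
};

pub fn main() void {
    var v = Vec{ .x = 21 };
    const p = &v;

    // Field access looks the same through the value and through the pointer:
    std.debug.print("{} {}\n", .{ v.x, p.x }); // no p.*.x needed

    // Calling a method that takes *Vec on a value takes its address implicitly:
    v.double(); // equivalent to (&v).double()
    std.debug.print("{}\n", .{v.x});
}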
The only other similar behavior that I can think of is the [] syntax used on arrays. You can use it directly on an array value and on an array pointer interchangeably. This is somewhat equivalent to how the dot operates on structs.
const std = #import("std");
pub fn main() !void {
const arr = [_]u8{1,2,3};
const foo = &arr;
std.debug.print("{}", .{arr[2]});
std.debug.print("{}", .{foo[2]});
}
AFAIK these are the only three instances of this behavior. In all other cases if something asks for a pointer you have to explicitly provide it. Even when you pass an array to a function that accepts a slice, you will have to take the array's pointer explicitly.
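A quick illustration of that last point (my own sketch, not from the answer):

fn sum(s: []const u8) usize {
    var total: usize = 0;
    for (s) |b| total += b;
    return total;
}

pub fn main() void {
    const arr = [_]u8{ 1, 2, 3 };
    // sum(arr);       // compile error: an array is not a slice
    _ = sum(&arr);     // OK: a pointer to the array coerces to []const u8
    _ = sum(arr[0..]); // also OK: slicing the array explicitly
}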
The authoritative source of information is the language reference, but checking it quickly, it doesn't seem to have a dedicated paragraph. Maybe there's some example that I missed, though.
https://ziglang.org/documentation/0.8.0/
I first learned this syntax by going through the ziglings course, which is linked to on ziglang.org.
In exercise 43 (https://github.com/ratfactor/ziglings/blob/main/exercises/043_pointers5.zig):
// Note that you don't need to dereference the "pv" pointer to access
// the struct's fields:
//
// YES: pv.x
// NO: pv.*.x
//
// We can write functions that take pointer arguments:
//
// fn foo(v: *Vertex) void {
// v.x += 2;
// v.y += 3;
// v.z += 7;
// }
//
// And pass references to them:
//
// foo(&v1);
The ziglings course goes quite in-depth on a few language topics, so it's definitely worth checking out if you're interested.
With regards to other syntax: as the previous answer mentioned, you don't need to dereference array pointers. I'm not sure about anything else (I thought function pointers worked the same, but I just ran some tests and they do not).
I found out that dart language has a built-in sort method in List class and I would like to know what is the algorithm they used in this method and what is the Big O notation of it?
If we take a look inside the SDK we can find the following implementation of the sort method on List:
void sort([int compare(E a, E b)]) {
Sort.sort(this, compare ?? _compareAny);
}
https://github.com/dart-lang/sdk/blob/b86c6e0ce93e635e3434935e31fac402bb094705/sdk/lib/collection/list.dart#L340-L342
Which just forwards the sorting to the following internal helper class:
https://github.com/dart-lang/sdk/blob/a75ffc89566a1353fb1a0f0c30eb805cc2e8d34c/sdk/lib/internal/sort.dart
Which has the following comment about the sorting algorithm:
/**
* Dual-Pivot Quicksort algorithm.
*
* This class implements the dual-pivot quicksort algorithm as presented in
* Vladimir Yaroslavskiy's paper.
*
* Some improvements have been copied from Android's implementation.
*/
This sorting algorithm is actually the same as the one used in Java (at least in Java 7):
http://www.docjar.com/html/api/java/util/DualPivotQuicksort.java.html
Here we can see that the complexity is typically O(n log(n)):
* This class implements the Dual-Pivot Quicksort algorithm by
* Vladimir Yaroslavskiy, Jon Bentley, and Josh Bloch. The algorithm
* offers O(n log(n)) performance on many data sets that cause other
* quicksorts to degrade to quadratic performance, and is typically
* faster than traditional (one-pivot) Quicksort implementations.
For more details you can read the paper by the designer of the Dual-Pivot Quicksort algorithm:
https://web.archive.org/web/20151002230717/http://iaroslavski.narod.ru/quicksort/DualPivotQuicksort.pdf
But please also note that Dart has the following constant:
// When a list has less then [:_INSERTION_SORT_THRESHOLD:] elements it will
// be sorted by an insertion sort.
static const int _INSERTION_SORT_THRESHOLD = 32;
...
static void _doSort<E>(
List<E> a, int left, int right, int compare(E a, E b)) {
if ((right - left) <= _INSERTION_SORT_THRESHOLD) {
_insertionSort(a, left, right, compare);
} else {
_dualPivotQuicksort(a, left, right, compare);
}
}
So for small lists, it makes more sense to use the traditional insertion sort algorithm, which has a worst-case complexity of O(n^2). But since the inputs are very small, it is probably faster than the Dual-Pivot Quicksort algorithm.
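For reference, a minimal sketch of what the insertion-sort branch does (my own illustration, not the SDK's _insertionSort):

// Sorts a[left..right] inclusive with the given comparator.
void insertionSort<E>(
    List<E> a, int left, int right, int Function(E, E) compare) {
  for (var i = left + 1; i <= right; i++) {
    final key = a[i];
    var j = i - 1;
    while (j >= left && compare(a[j], key) > 0) {
      a[j + 1] = a[j];
      j--;
    }
    a[j + 1] = key;
  }
}

void main() {
  final xs = [3, 1, 2];
  insertionSort(xs, 0, xs.length - 1, (int a, int b) => a.compareTo(b));
  print(xs); // [1, 2, 3]
}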
At https://dartpad.dartlang.org/, try the following code. I can't answer your question about what the implementation is under the covers, but you can bet it's O(n log n).
I'm using an answer because I can't supply code in a comment easily.
void main() {
Map<String, int> map = {'a': 1, 'b': 2};
List<String> list = ['banana', 'apple', 'age', 'bob'];
list.sort((String a, String b) => a.compareTo(b));
print(list);
}
BTW, list.sort(); would give the same results since that custom comparator is the same as the default.
I am surprised to notice that it is somehow difficult to obtain a correct fit of an interaction function from gam().
To be more specific, I want to estimate an additive function:
y = m_1(x) + m_2(z) + m_{12}(x,z) + u,
where m_1(x) = x^2, m_2(z) = z^2, m_{12}(x,z) = xz. The following code generates this model:
test1 <- function(x,z,sx=1,sz=1) {
#--m1(x) function
m.x<-x^2
m.x<-m.x-mean(m.x)
#--m2(z) function
m.z<-z^2
m.z<-m.z-mean(m.z)
#--m12(x,z) function
m.xz<-x*z
m.xz<-m.xz-mean(m.xz)
m<-m.x+m.z+m.xz
return(list(m=m,m.x=m.x,m.z=m.z,m.xz=m.xz))
}
n <- 1000
a=0
b=2
x <- runif(n,a,b)/20
z <- runif(n,a,b)
u <- rnorm(n,0,0.5)
model<-test1(x,z)
y <- model$m + u
So I use gam() (from the mgcv package) to fit the model as
library(mgcv)
b3 <- gam(y ~ ti(x) + ti(z) + ti(x,z))
vis.gam(b3);title("tensor anova")
#---extracting basis matrix
B.f3<-model.matrix.gam(b3)
#---extracting series estimator
b3.hat<-b3$coefficients
Question: when I plot the function estimated by gam() above against the true function, I end up with the following.
par(mfrow=c(1,3))
#---m1(x)
B.x<-B.f3[,c(2:5)]
b.x.hat<-b3.hat[c(2:5)]
plot(x,B.x%*%b.x.hat)
points(x,model$m.x,col='red')
legend('topleft',c('Estimate','True'),lty=c(1,1),col=c('black','red'))
#---m2(z)
B.z<-B.f3[,c(6:9)]
b.z.hat<-b3.hat[c(6:9)]
plot(z,B.z%*%b.z.hat)
points(z,model$m.z,col='red')
legend('topleft',c('Estimate','True'),lty=c(1,1),col=c('black','red'))
#---m12(x,z)
B.xz<-B.f3[,-c(1:9)]
b.xz.hat<-b3.hat[-c(1:9)]
plot(x,B.xz%*%b.xz.hat)
points(x,model$m.xz,col='red')
legend('topleft',c('Estimate','True'),lty=c(1,1),col=c('black','red'))
However, the function estimate of m_1(x) is largely different from x^2, and the interaction function estimate m_{12}(x,z) is also largely different from xz defined in test1 above. The results are the same if I use predict(b3).
I really can't figure it out. Can anybody help me out by explaining why the results end up with this? Greatly appreciate it!
First, the problem is not due to the package, of course. It is closely related to the identification conditions of the smooth functions. One common practice is to impose the assumptions that E(m_j(.)) = 0 for every individual function j = 1,...,d, and E(m_{ij}(x_i,x_j)|x_i) = E(m_{ij}(x_i,x_j)|x_j) = 0 for i not equal to j. Those conditions require one to employ centered basis functions in the series estimator, which the package already does. However, in my case above, the function m_{12}(x,z) = xz defined in test1 does not satisfy these identification assumptions, since the integral of xz with respect to either x or z is not zero when x and z range from zero to two.
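In display form, the conditions above are
E[m_j(x_j)] = 0 for j = 1, ..., d, and E[m_{12}(x,z) | x] = E[m_{12}(x,z) | z] = 0.
A quick check for the DGP in the question (where x ~ U(0, 0.1) and z ~ U(0, 2) are independent): E[xz | x] = x E[z] = x, which is nonzero for x > 0, so m_{12}(x,z) = xz violates the second condition, and gam()'s ti(x,z) term ends up estimating a re-centred interaction rather than xz itself, which is why the plots differ.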
Furthermore, a series estimator allows the individual and interaction functions to be identified if one imposes m_j(0) = 0 or m_{ij}(0,x_j) = m_{ij}(x_i,0) = 0. This can readily be achieved by centering the basis functions around zero. I have tried both cases, and they work well whenever the DGP satisfies the identification conditions.
What is the best approach to compute a linear regression using CEP? We have tried two different options.
We want the algorithm to work in real time.
Basic code for both approaches:
create context IntervalSpanning3Seconds start @now end after 30 sec;
create schema measure (
temperature float,
water float,
_hours float,
persons float,
production float
);
#Name("gattering_measures")
insert into measure
select
cast(getNumber(m,"measurement.bsk_mymeasurement.temperature.value"),
float) as temperature,
cast(getNumber(m, "measurement.bsk_mymeasurement.water.value"), float) as water,
cast(getNumber(m, "measurement.bsk_mymeasurement._hours.value"), float) as _hours,
cast(getNumber(m, "measurement.bsk_mymeasurement.persons.value"), float) as persons,
cast(getNumber(m, "measurement.bsk_mymeasurement.production.value"),float) as production
from MeasurementCreated m
where m.measurement.type = "bsk_mymeasurement";
1. Using the function stat:linest
#Name("get_data")
context IntervalSpanning3Seconds
select * from measure.stat:linest(water,production,_hours,persons,temperature)
output snapshot when terminated;
EDIT: The problem here is that it seems like "get_data" is getting executed for each measurement and not for the entire collection of measurements.
2. Get the data and pass it to a JavaScript function.
create expression String exeReg(data) [
var f = f(data)
function f(d){
.....
// return the linear regression as a string
}
return f
];
#Name("get_data")
insert into CreateEvent
select
"bsk_outcome_linear_regression" as type,
exeReg(m) as text,
....
from measure m;
EDIT: Here, I would like to know what the type of the variable passed to the exeReg() function is and how I should iterate over it. An example would be nice.
I'd appreciate any help.
Using JavaScript would mean that the script computes a new result (recomputes) for each collection it receives. Instead of recomputing, the #linest data window is a good choice. Or you can add a custom aggregation function or a custom data window to the engine if there is certain code you want to use. Below is how the script can receive multiple events, for the case when a script is desired.
create expression String exeReg(data) [
.... script here...
];
select exeReg(window(*)) ....
from measure#time(10 seconds);
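For the #linest route, a sketch of what the statement might look like. Assumptions to verify against the Esper reference for your version: that #linest can be chained after #time, that it exposes properties named slope and YIntercept, and how many variables it accepts:

@Name("linest_over_window")
select slope, YIntercept
from measure#time(10 seconds)#linest(water, production)
output snapshot every 10 seconds;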
Having had success with my last CLRS question, here's another:
In Introduction to Algorithms, Second Edition, pp. 501-502, a linked-list representation of disjoint sets is described, wherein for each list member the following three fields are maintained:
set member
pointer to next object
pointer back to first object (the set representative).
Although linked lists could be implemented by using only a single "Link" object type, the textbook shows an auxiliary "Linked List" object that contains a pointer to the "head" link and the "tail" link. Having a pointer to the "tail" facilitates the Union(x, y) operation, so that one need not traverse all of the links in a larger set x in order to start appending the links of the smaller set y to it.
However, to obtain a reference to the tail link, it would seem that each link object needs to maintain a fourth field: a reference to the Linked List auxiliary object itself. In that case, why not drop the Linked List object entirely and use that fourth field to point directly to the tail?
Would you consider this an omission in the text?
I just opened the text and the textbook description seems fine to me.
From what I understand, the data structure is something like:
struct LinkedListObject;  // forward declaration

struct Set {
    LinkedListObject *head;
    LinkedListObject *tail;
};

struct LinkedListObject {
    Value set_member;      // Value: whatever type the set elements have
    Set *representative;
    LinkedListObject *next;
};
The textbook does not talk of any "auxiliary" linked list structure in the book I have (second edition). Can you post the relevant paragraph?
Doing a Union would be something like:
// No error checks.
Set *Union(Set *x, Set *y) {
    // Splice y's list onto the end of x's list.
    x->tail->next = y->head;
    x->tail = y->tail;
    // Point every member of y at the new representative.
    LinkedListObject *tmp = y->head;
    while (tmp) {
        tmp->representative = x;
        tmp = tmp->next;
    }
    return x;
}
why not drop the Linked List object entirely and use that fourth field to point directly to the tail?
An insight can be taken from path compression. There, all the elements are supposed to point to the head of the list (the representative); when one does not, the find-set operation repairs it (by changing p[x] and returning that). You are proposing something similar for the tail, so a per-node tail pointer could only be used if such a repair operation were implemented as well.
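For reference, a minimal sketch of the repair step in the forest (parent-array) representation that path compression refers to; this is my own illustration, not code from the book:

/* parent[x] == x means x is the representative of its set. */
int find_set(int parent[], int x) {
    if (parent[x] != x)
        parent[x] = find_set(parent, parent[x]);  /* lazily repair stale pointers */
    return parent[x];
}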