I'm having a problem with this operation: it is not giving the right result. The result printed on the terminal is 60216, but it should be 563376.
int a = 8536;
int b = 563376;
int d = 8536;
unsigned long long int k = (a*b);
cout << k/d << endl;
You need long long everywhere. The multiplication a * b is carried out in int arithmetic and overflows before the result is ever converted to unsigned long long:
long long int a = 8536;
long long int b = 563376;
long long int d = 8536;
unsigned long long int k = (a * b);
std::cout << k / d << std::endl;
Note that it has nothing to do with the division. This
int a = 8536;
int b = 563376;
unsigned long long int k = (a * b);
std::cout << k << std::endl;
gives the wrong answer too
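Alternatively, you can keep the variables as int and fix only the multiplication by casting one operand, so the product is computed in a wider type before it can overflow. A minimal sketch with the same values:

#include <iostream>

int main()
{
    int a = 8536;
    int b = 563376;
    int d = 8536;
    // Casting one operand widens the whole multiplication, so no int overflow occurs.
    unsigned long long k = static_cast<unsigned long long>(a) * b;
    std::cout << k / d << std::endl; // prints 563376
}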
I have written a C++ program to find the nth Catalan number using combinations, but I am always getting 0 as output. Please point out the mistakes in this code:
#include <iostream>
using namespace std;

int fact(unsigned int x)
{
    unsigned long long f = 1;
    for (int i = 1; i <= x; i++)
    {
        f = f * i;
    }
    return f;
}

int comb(int y, int z)
{
    unsigned long long int c;
    c = fact(y) / (fact(z) * fact(y - z));
    return c;
}

int catalan(int b)
{
    unsigned long long int a;
    a = (1 / (b + 1)) * (comb((2 * b), b));
    return a;
}

int main()
{
    int n;
    cout << "enter value of n for nth catalan number=";
    cin >> n;
    cout << n << " Catalan number=" << catalan(n) << endl;
    return 0;
}
(1 / (b + 1)) is integer division, so it is always going to be zero. Instead, divide last:
a = comb(2 * b, b) / (b + 1);
Also, you do your calculations using unsigned long long; why not use that as the return type instead of int? As written, fact and comb truncate their unsigned long long results when returning int.
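Putting both fixes together, a corrected version could look like the sketch below. Note that the factorial-based formula overflows unsigned long long once n goes much past 10, because (2n)! grows very quickly.

#include <iostream>
using namespace std;

unsigned long long fact(unsigned int x)
{
    unsigned long long f = 1;
    for (unsigned int i = 1; i <= x; i++)
        f *= i;
    return f;
}

unsigned long long comb(int y, int z)
{
    return fact(y) / (fact(z) * fact(y - z));
}

unsigned long long catalan(int b)
{
    // Divide last, so the integer division no longer truncates to zero.
    return comb(2 * b, b) / (b + 1);
}

int main()
{
    int n;
    cout << "enter value of n for nth catalan number=";
    cin >> n;
    cout << n << " Catalan number=" << catalan(n) << endl;
    return 0;
}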
I need to dynamically allocate some arrays inside the kernel function. Here is the code:
__global__
void kernel3(int N)
{
    int index = blockIdx.x * blockDim.x + threadIdx.x;
    if (index < N)
    {
        float *cost = new float[100];
        for (int i = 0; i < 100; i++)
            cost[i] = 1;
    }
}
and
int main()
{
    cudaDeviceSynchronize();
    cudaThreadSynchronize();
    size_t mem_tot_01 = 0;
    size_t mem_free_01 = 0;
    cudaMemGetInfo(&mem_free_01, &mem_tot_01);
    cout << "Free memory " << mem_free_01 << endl;
    cout << "Total memory " << mem_tot_01 << endl;
    system("pause");
    int blocksize = 256;
    int aaa = 16000;
    int numBlocks = (aaa + blocksize - 1) / blocksize;
    kernel3<<<numBlocks, blocksize>>>(aaa);
    cudaDeviceSynchronize();
    cudaError_t err1 = cudaGetLastError();
    if (err1 != cudaSuccess)
    {
        printf("Error: %s\n", cudaGetErrorString(err1));
        system("pause");
    }
    cudaMemGetInfo(&mem_free_01, &mem_tot_01);
    cout << "Free memory " << mem_free_01 << endl;
    cout << "Total memory " << mem_tot_01 << endl;
    system("pause");
}
In the first round of cudaMemGetInfo:
Free memory: 3600826368
Total memory: 4294967297
and I got an error:
Error: unspecified launch failure
I tried changing "int aaa" to some smaller value; then there is no error, but the reported memory does not match what I allocated. What's wrong with it? The memory should be enough: 16000 x 100 x 32 = 5.12e7 < 3600826368.
The device memory allocated by the new operator in kernel code comes from a runtime heap of fixed size.
If your code requires a large amount of runtime heap memory, you will probably need to increase the size of the heap before running any kernels. NVIDIA provides the cudaDeviceSetLimit API for this purpose, used with the cudaLimitMallocHeapSize flag to set the size of the heap.
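For example, a minimal sketch (the 128 MB figure is purely illustrative; size it to your kernel's actual needs):

#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    // Raise the device-side malloc/new heap before any kernel runs;
    // the default heap is only 8 MB.
    size_t heapSize = 128 * 1024 * 1024; // illustrative value
    cudaDeviceSetLimit(cudaLimitMallocHeapSize, heapSize);

    // Verify what was actually set.
    size_t actual = 0;
    cudaDeviceGetLimit(&actual, cudaLimitMallocHeapSize);
    printf("Device malloc heap size: %zu bytes\n", actual);
    return 0;
}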
In libc++ I have found that the basic_string destructor does not get called: once a string goes out of scope, the memory is freed by a direct call to operator delete rather than by calling the destructor first and having the destructor call operator delete. Why so? Can someone explain this? See the sample program:
#include <string>
#include <new>
#include <cstdlib>

void *operator new(size_t len) throw(std::bad_alloc)
{
    void *mem = malloc(len);
    if ((mem == 0) && (len != 0))
        throw std::bad_alloc();
    return mem;
}

void operator delete(void *ptr) throw()
{
    if (ptr != 0)
        free(ptr);
}

int main(int argc, const char *argv[])
{
    std::string mystr("testing very very very big string for string class");
    std::string mystr2(mystr1.begin(),mystr.end());
}
Put breakpoints on new and delete and then check the call stack.
operator new gets called from the basic_string class, while operator delete gets called at the end of main; ideally the basic_string destructor should be called first, and operator delete should then be invoked via the allocator's deallocate call. The same holds for the creation of the second string.
I'm seeing the same thing in the debugger that you are. I don't know for sure, but I suspect that stuff is getting inlined. The destructor for basic_string is very small: a single test (for the small string optimization), and then a call to the allocator's deallocate function (through allocator_traits). std::allocator's deallocate function is also quite small, just a wrapper around operator delete.
You could test this by writing your own allocator. (Later: see below.)
More stuff that I generated while investigating this question; read on if you're interested.
[Note: there's a bug in your code - in the second line you wrote: mystr1.begin(),mystr.end()) - where is mystr1 declared?]
Assuming that's a typo, I tried some slightly different code:
#include <string>
#include <new>
#include <iostream>
#include <cstdlib>

int news = 0;
int dels = 0;

void *operator new(size_t len) throw(std::bad_alloc)
{
    void *mem = malloc(len);
    if ((mem == 0) && (len != 0))
        throw std::bad_alloc();
    ++news;
    return mem;
}

void operator delete(void *ptr) throw()
{
    ++dels;
    if (ptr != 0)
        free(ptr);
}

int main(int argc, const char *argv[])
{
    {
        std::string mystr("testing very very very big string for string class");
        std::string mystr2(mystr.begin(), mystr.end());
        std::cout << "News = " << news << "; Dels = " << dels << std::endl;
    }
    std::cout << "News = " << news << "; Dels = " << dels << std::endl;
}
If you run this code, it prints (at least for me):
News = 2; Dels = 0
News = 2; Dels = 2
which is exactly what it should be.
If I toss the code into Compiler Explorer, then I see both the calls to basic_string::~basic_string(), exactly as I expect. (Well, I see three of them, but one of them is in an exception handling block, which ends with a call to _Unwind_Resume.)
Later - this code:
#include <string>
#include <new>
#include <iostream>

int news = 0;
int dels = 0;

template <class T>
class MyAllocator
{
public:
    typedef T value_type;

    MyAllocator() noexcept {}

    template <class U>
    MyAllocator(MyAllocator<U>) noexcept {}

    T *allocate(std::size_t n)
    {
        ++news;
        return static_cast<T *>(::operator new(n * sizeof(T)));
    }

    void deallocate(T *p, std::size_t)
    {
        ++dels;
        return ::operator delete(static_cast<void *>(p));
    }

    friend bool operator==(MyAllocator, MyAllocator) { return true; }
    friend bool operator!=(MyAllocator, MyAllocator) { return false; }
};

int main(int argc, const char *argv[])
{
    {
        typedef std::basic_string<char, std::char_traits<char>, MyAllocator<char>> S;
        S mystr("testing very very very big string for string class");
        S mystr2(mystr.begin(), mystr.end());
        std::cout << "Allocator News = " << news << "; Allocator Dels = " << dels << std::endl;
    }
    std::cout << "Allocator News = " << news << "; Allocator Dels = " << dels << std::endl;
}
prints:
Allocator News = 2; Allocator Dels = 0
Allocator News = 2; Allocator Dels = 2
which confirms that the allocator is getting called.
I have a question about reading the MNIST dataset. I get the idea of how the dataset is constructed; however, I have no clue how the following code reads through it. Some of you may think the results of the couts are obvious (I wrote the values as comments), but for me it doesn't make sense: the code uses the exact same function four times with the same input, yet it gets a different output every time. How is that possible? Please let me know if there is any ambiguity in my question.
Thank you.
Code start:
typedef unsigned char BYTE;

int main()
{
    ...
    FILE *fp = fopen("MNIST/train-images.idx3-ubyte", "rb");
    // declare function;
    int magicNumber = readFlippedInteger(fp);
    int numImages = readFlippedInteger(fp);
    int numRows = readFlippedInteger(fp);
    int numCols = readFlippedInteger(fp);
    cout << magicNumber << endl; // 2051
    cout << numImages << endl;   // 60000
    cout << numRows << endl;     // 28
    cout << numCols << endl;     // 28
    ...
}

int readFlippedInteger(FILE *fp)
{
    int ret = 0;
    BYTE *temp;
    temp = (BYTE *)(&ret);
    fread(&temp[3], sizeof(BYTE), 1, fp);
    fread(&temp[2], sizeof(BYTE), 1, fp);
    fread(&temp[1], sizeof(BYTE), 1, fp);
    fread(&temp[0], sizeof(BYTE), 1, fp);
    return ret;
}
Please don't mix C and C++ unless it is absolutely necessary. The underlying confusion is that each call to fread "moves" the file pointer through the file for you. As @RetiredNinja noted, you are advancing the file pointer 4 bytes at a time. That's how it "knows" how to read the next value even though you didn't tell it to explicitly. You can read all about file pointers here.
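To see this in action, you can watch the position move with ftell; a minimal sketch (using the file name from the question):

#include <cstdio>

int main()
{
    FILE *fp = fopen("MNIST/train-images.idx3-ubyte", "rb");
    if (!fp) return 1;
    unsigned char byte;
    printf("position before any read: %ld\n", ftell(fp));      // 0
    fread(&byte, 1, 1, fp);
    printf("position after reading 1 byte: %ld\n", ftell(fp)); // 1
    fread(&byte, 1, 1, fp);
    printf("position after one more byte: %ld\n", ftell(fp));  // 2
    fclose(fp);
    return 0;
}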
An implementation using slightly more idiomatic C++ could be
#include <fstream>
#include <iostream>
#include <algorithm>

int readFlippedInteger(std::istream &in) {
    char temp[sizeof(int)];
    in.read(temp, sizeof(int));
    // reverse the four bytes to flip the file's big-endian order to host order
    std::reverse(temp, temp + sizeof(int));
    return *reinterpret_cast<int *>(temp);
}

int main() {
    std::ifstream fin("MNIST/train-images.idx3-ubyte", std::ios::binary);
    if (!fin) {
        std::cerr << "Could not open file\n";
        return -1;
    }
    int magicNumber = readFlippedInteger(fin);
    int numImages = readFlippedInteger(fin);
    int numRows = readFlippedInteger(fin);
    int numCols = readFlippedInteger(fin);
    std::cout << magicNumber << std::endl // 2051
              << numImages << std::endl   // 60000
              << numRows << std::endl     // 28
              << numCols << std::endl;    // 28
}
An implementation that uses a user-defined stream manipulator is left as an exercise for the reader.
I am trying to train my own detector based on HOG features, and I trained a detector with the CvSVM utility of OpenCV. Now, to use this detector with HOGDescriptor.SetSVM(myDetector), I need to get the trained detector in row-vector (primal) form to feed it. For this I am using the following code; my implementation is given below:
vector<float> primal;

void LinearSVM::getSupportVector(std::vector<float> &support_vector) {
    CvSVM svm;
    svm.load("Classifier.xml");
    cin.get();
    int sv_count = svm.get_support_vector_count();
    const CvSVMDecisionFunc *df = decision_func;
    const double *alphas = df[0].alpha;
    double rho = df[0].rho;
    int var_count = svm.get_var_count();
    support_vector.resize(var_count, 0);
    for (unsigned int r = 0; r < (unsigned)sv_count; r++) {
        float myalpha = alphas[r];
        const float *v = svm.get_support_vector(r);
        for (int j = 0; j < var_count; j++, v++) {
            support_vector[j] += (-myalpha) * (*v);
        }
    }
    support_vector.push_back(rho);
}

int main()
{
    LinearSVM s;
    s.getSupportVector(primal);
    return 0;
}
When I use the built-in CvSVM, it shows me the SV count as 3, because I have only 3 support vectors in my saved file; but since decision_func is protected, I cannot access it. That's why I tried to use that wrapper, but still to no avail. Perhaps you guys can help me out here... Thanks a lot!
Answer with a test harness. I put it in a new answer, as it would add a lot of clutter to the original answer, possibly making it a bit confusing.
//headers added so the harness compiles standalone (OpenCV 2.x)
#include <opencv2/opencv.hpp>
#include <vector>
#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <cfloat>
#include <iostream>

using namespace std;
using namespace cv;

//dummy features
std::vector<float> dummyDerReaderForOneDer(const vector<float> &pattern)
{
    int i = std::rand() % pattern.size();
    int j = std::rand() % pattern.size();
    vector<float> patternPulNoise(pattern);
    std::random_shuffle(patternPulNoise.begin() + std::min(i, j), patternPulNoise.begin() + std::max(i, j));
    return patternPulNoise;
}

//extend CvSVM to get access to weights
class mySVM : public CvSVM
{
public:
    vector<float> getWeightVector(const int descriptorSize);
};

//get the weights
vector<float> mySVM::getWeightVector(const int descriptorSize)
{
    vector<float> svmWeightsVec(descriptorSize + 1);
    int numSupportVectors = get_support_vector_count();

    //this is protected, but we can access it due to inheritance rules
    const CvSVMDecisionFunc *dec = CvSVM::decision_func;

    const float *supportVector;
    float *svmWeight = &svmWeightsVec[0];

    for (int i = 0; i < numSupportVectors; ++i)
    {
        float alpha = *(dec[0].alpha + i);
        supportVector = get_support_vector(i);
        for (int j = 0; j < descriptorSize; j++)
        {
            *(svmWeight + j) += alpha * *(supportVector + j);
        }
    }
    *(svmWeight + descriptorSize) = -dec[0].rho;

    return svmWeightsVec;
}

// main harness entry point for detector test
int main(int argc, const char *argv[])
{
    //dummy variables for example
    int posFiles = 10;
    int negFiles = 10;
    int dims = 1000;
    int randomFactor = 4;

    //setup some dummy data
    vector<float> dummyPosPattern;
    dummyPosPattern.assign(int(dims / randomFactor), 1.f);
    dummyPosPattern.resize(dims);
    random_shuffle(dummyPosPattern.begin(), dummyPosPattern.end());

    vector<float> dummyNegPattern;
    dummyNegPattern.assign(int(dims / randomFactor), 1.f);
    dummyNegPattern.resize(dims);
    random_shuffle(dummyNegPattern.begin(), dummyNegPattern.end());

    // the labels and labels mat
    float posLabel = 1.f;
    float negLabel = 2.f;
    cv::Mat cSvmLabels;

    //the data mat
    cv::Mat cSvmTrainingData;

    //dummy linear svm params
    SVMParams cSvmParams;
    cSvmParams.svm_type = cv::SVM::C_SVC;
    cSvmParams.C = 0.0100;
    cSvmParams.kernel_type = cv::SVM::LINEAR;
    cSvmParams.term_crit = cv::TermCriteria(CV_TERMCRIT_ITER + CV_TERMCRIT_EPS, 1000000, FLT_EPSILON);

    cout << "creating training data. please wait" << endl;
    int i;
    for (i = 0; i < posFiles; i++)
    {
        //your feature for one box from file
        vector<float> d = dummyDerReaderForOneDer(dummyPosPattern);

        //push back a new mat made from the vector's data, with the copy-data flag on
        //this shows the format of the mat for a single example, (1 (row) x dims (cols)), as the training mat has each **row** as an example;
        //the push_back works like vector add: it adds each example to the bottom of the matrix
        cSvmTrainingData.push_back(cv::Mat(1, dims, CV_32FC1, d.data(), true));

        //push back a pos label to the labels mat
        cSvmLabels.push_back(posLabel);
    }

    //do same with neg files;
    for (i = 0; i < negFiles; i++)
    {
        float a = rand();
        vector<float> d = dummyDerReaderForOneDer(dummyNegPattern);
        cSvmTrainingData.push_back(cv::Mat(1, dims, CV_32FC1, d.data(), true));
        cSvmLabels.push_back(negLabel);
    }

    //have a look
    cv::Mat viz;
    cSvmTrainingData.convertTo(viz, CV_8UC3);
    viz = viz * 255;
    cv::imshow("svmData", viz);
    cv::waitKey(10);
    cout << "press any key to continue" << endl;
    getchar();
    viz.release();

    //create the svm;
    cout << "training, please wait" << endl;
    mySVM svm;
    svm.train(cSvmTrainingData, cSvmLabels, cv::Mat(), cv::Mat(), cSvmParams);

    cout << "get weights" << endl;
    vector<float> svmWeights = svm.getWeightVector(dims);
    for (i = 0; i < dims + 1; i++)
    {
        cout << svmWeights[i] << ", ";
        if (i == dims)
        {
            cout << endl << "bias: " << svmWeights[i] << endl;
        }
    }

    cout << "press any key to continue" << endl;
    getchar();

    cout << "testing, please wait" << endl;
    //test the svm with a large amount of new unseen fake examples, one at a time
    int totExamples = 10;
    int k;
    for (i = 0; i < totExamples; i++)
    {
        cout << endl << endl;
        vector<float> dPos = dummyDerReaderForOneDer(dummyPosPattern);
        cv::Mat dMatPos(1, dims, CV_32FC1, dPos.data(), true);

        float predScoreFromDual = svm.predict(dMatPos, true);
        float predScoreBFromPrimal = svmWeights[dims];

        for (k = 0; k <= dims - 4; k += 4)
            predScoreBFromPrimal += dPos[k] * svmWeights[k] + dPos[k + 1] * svmWeights[k + 1] +
                                    dPos[k + 2] * svmWeights[k + 2] + dPos[k + 3] * svmWeights[k + 3];
        for (; k < dims; k++)
            predScoreBFromPrimal += dPos[k] * svmWeights[k];

        cout << "Dual Score:\t" << predScoreFromDual << "\tPrimal Score:\t" << predScoreBFromPrimal << endl;
    }

    cout << "press any key to continue" << endl;
    getchar();
    return 0;
}
Hello again :) Please extend the CvSVM class rather than encapsulating it, as you need access to a protected member.
//header
class mySVM : public CvSVM
{
public:
    vector<float> getWeightVector(const int descriptorSize);
};

//cpp
vector<float> mySVM::getWeightVector(const int descriptorSize)
{
    vector<float> svmWeightsVec(descriptorSize + 1);
    int numSupportVectors = get_support_vector_count();

    //this is protected, but we can access it due to inheritance rules
    const CvSVMDecisionFunc *dec = CvSVM::decision_func;

    const float *supportVector;
    float *svmWeight = &svmWeightsVec[0];

    for (int i = 0; i < numSupportVectors; ++i)
    {
        float alpha = *(dec[0].alpha + i);
        supportVector = get_support_vector(i);
        for (int j = 0; j < descriptorSize; j++)
        {
            *(svmWeight + j) += alpha * *(supportVector + j);
        }
    }
    *(svmWeight + descriptorSize) = -dec[0].rho;

    return svmWeightsVec;
}
something like that.
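For completeness, here is a hedged sketch of how the resulting weight vector could then be fed to the HOG detector, which is what the original question was after. This assumes OpenCV 2.x, that "Classifier.xml" is the model from the question, and that the HOGDescriptor window parameters match those used during training:

#include <opencv2/opencv.hpp>
#include <vector>

int main()
{
    mySVM svm;                       // the subclass defined above
    svm.load("Classifier.xml");      // the model saved during training
    int dims = svm.get_var_count();  // length of one HOG descriptor
    std::vector<float> detector = svm.getWeightVector(dims);

    cv::HOGDescriptor hog;           // window/block/cell sizes must match training
    hog.setSVMDetector(detector);    // the detector is now ready for detectMultiScale
    return 0;
}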
credits:
Obtaining weights in CvSVM, the SVM implementation of OpenCV