I have a probably pretty simple question, but I am still not sure about it.
I only want to smooth a histogram, and I am not sure which of the following two methods is correct. Would I do it like this:
vector<double> mask(3);
mask[0] = 0.25; mask[1] = 0.5; mask[2] = 0.25;
vector<double> tmpVect(histogram->size());
for (unsigned int i = 0; i < histogram->size(); i++)
tmpVect[i] = (*histogram)[i];
for (int bin = 1; bin < histogram->size()-1; bin++) {
double smoothedValue = 0;
for (int i = 0; i < mask.size(); i++) {
smoothedValue += tmpVect[bin-1+i]*mask[i];
}
(*histogram)[bin] = smoothedValue;
}
Or would you usually do it like this?:
vector<double> mask(3);
mask[0] = 0.25; mask[1] = 0.5; mask[2] = 0.25;
for (int bin = 1; bin < histogram->size()-1; bin++) {
double smoothedValue = 0;
for (int i = 0; i < mask.size(); i++) {
smoothedValue += (*histogram)[bin-1+i]*mask[i];
}
(*histogram)[bin] = smoothedValue;
}
My question is: is it reasonable to copy the histogram into an extra vector first, so that when I smooth bin i I can use the original value of bin i-1? Or would I simply do smoothedValue += (*histogram)[bin-1+i]*mask[i];, so that I use the already smoothed i-1 value instead of the original one?
Regards & Thanks for a reply.
Your intuition is right: you need a temporary vector. Otherwise, you will end up using partly old values, and partly new values, and the result will not be correct. Try it yourself on paper with a simple example.
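If you would rather not work it out on paper, here is a minimal sketch that runs both variants on the same input. The function names are made up for illustration, and the 0.25/0.5/0.25 mask from the question is hard-coded.

```cpp
#include <cstddef>
#include <vector>

// In-place variant: when bin i is computed, bin i-1 has already been
// overwritten, so the result mixes smoothed and original values.
std::vector<double> smoothInPlace(std::vector<double> h) {
    for (std::size_t bin = 1; bin + 1 < h.size(); ++bin)
        h[bin] = 0.25 * h[bin - 1] + 0.5 * h[bin] + 0.25 * h[bin + 1];
    return h;
}

// Copy variant: every output bin is computed from the untouched input.
std::vector<double> smoothFromCopy(const std::vector<double>& h) {
    std::vector<double> out(h);  // boundary bins keep their original values
    for (std::size_t bin = 1; bin + 1 < h.size(); ++bin)
        out[bin] = 0.25 * h[bin - 1] + 0.5 * h[bin] + 0.25 * h[bin + 1];
    return out;
}
```

For the input {0, 4, 0, 0, 0}, the copy variant gives {0, 2, 1, 0, 0}, while the in-place variant gives {0, 2, 0.5, 0.125, 0}: the peak leaks to the right only, because each bin picks up the already smoothed value of its left neighbour.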
There are two ways you can write this algorithm:
1. Copy the data to a temporary vector first; then read from that one, and write to histogram. This is what you did in your first code fragment.
2. Read from histogram and write to a temporary vector; then copy from the temporary vector back to histogram.
To prevent needless copying of data, you can use vector::swap. This is an extremely fast operation that swaps the contents of two vectors. Using strategy 2 above, this would result in:
vector<double> mask(3);
mask[0] = 0.25; mask[1] = 0.5; mask[2] = 0.25;
vector<double> newHistogram(histogram->size());
for (int bin = 1; bin < histogram->size()-1; bin++) {
double smoothedValue = 0;
for (int i = 0; i < mask.size(); i++) {
smoothedValue += (*histogram)[bin-1+i]*mask[i];
}
newHistogram[bin] = smoothedValue;
}
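// The loop above skips the first and last bin; carry them over unchanged,
// otherwise they would be zero after the swap.
if (!histogram->empty()) {
newHistogram.front() = histogram->front();
newHistogram.back() = histogram->back();
}
histogram->swap(newHistogram);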
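The same idea can be wrapped in a reusable function. What follows is only a sketch under a few assumptions of mine: the name smoothHistogram is hypothetical, the mask must have odd length, and bins too close to either end for the full mask to fit are simply left unchanged (other boundary policies are possible).

```cpp
#include <cstddef>
#include <vector>

// Smooths `histogram` with an odd-length `mask` whose weights should sum to 1.
// Reads only original values, writes to a scratch vector, then swaps.
void smoothHistogram(std::vector<double>& histogram, const std::vector<double>& mask) {
    if (mask.size() % 2 == 0) return;  // assumption: only odd-length masks are supported
    const std::size_t radius = mask.size() / 2;

    std::vector<double> smoothed(histogram.size());
    for (std::size_t bin = 0; bin < histogram.size(); ++bin) {
        // Near the edges the mask does not fit; keep the original value.
        if (bin < radius || bin + radius >= histogram.size()) {
            smoothed[bin] = histogram[bin];
            continue;
        }
        double value = 0.0;
        for (std::size_t i = 0; i < mask.size(); ++i)
            value += histogram[bin - radius + i] * mask[i];
        smoothed[bin] = value;
    }
    histogram.swap(smoothed);  // exchanges the buffers without copying elements
}
```

Taking the histogram by reference instead of by pointer is a design choice of this sketch; with the pointer from the question you would call it as smoothHistogram(*histogram, mask).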
Hi there!
I am studying Mr. Redmon's darknet code from https://github.com/pjreddie/darknet
I found that the weights of a connected layer are initialized as follows:
// file: src/connected_layer.c
// function: make_connected_layer
float scale = sqrt(2./inputs);
for(i = 0; i < outputs*inputs; ++i){
l.weights[i] = scale*rand_uniform(-1, 1);
}
and the weights of a convolutional layer are initialized as follows:
// file: src/convolutional_layer.c
// function: make_convolutional_layer
float scale = sqrt(2./(size*size*c/l.groups));
for(i = 0; i < l.nweights; ++i) {
l.weights[i] = scale*rand_normal();
}
Could you tell me what the principle behind this code is, please? Links to resources such as related papers are also welcome.
Thank you a lot!
My program is a DirectX program that draws a container cube with smaller cubes inside it. The smaller cubes fall over time; I hope you understand what I mean.
The program isn't complete yet. At this stage it should draw only the container, but it draws nothing; only the background color is visible. I have included only what I think is needed.
These are the routines that initialize the program:
bool Game::init(HINSTANCE hinst,HWND _hw){
Directx11 ::init(hinst , _hw);
return LoadContent();}
Directx11::init()
bool Directx11::init(HINSTANCE hinst,HWND hw){
_hinst=hinst;_hwnd=hw;
RECT rc;
GetClientRect(_hwnd,&rc);
height= rc.bottom - rc.top;
width = rc.right - rc.left;
UINT flags=0;
#ifdef _DEBUG
flags |=D3D11_CREATE_DEVICE_DEBUG;
#endif
HR(D3D11CreateDevice(0,_driverType,0,flags,0,0,D3D11_SDK_VERSION,&d3dDevice,&_featureLevel,&d3dDeviceContext));
if (d3dDevice == 0 || d3dDeviceContext == 0)
return 0;
DXGI_SWAP_CHAIN_DESC sdesc;
ZeroMemory(&sdesc,sizeof(DXGI_SWAP_CHAIN_DESC));
sdesc.Windowed=true;
sdesc.BufferCount=1;
sdesc.BufferDesc.Format=DXGI_FORMAT_R8G8B8A8_UNORM;
sdesc.BufferDesc.Height=height;
sdesc.BufferDesc.Width=width;
sdesc.BufferDesc.Scaling=DXGI_MODE_SCALING_UNSPECIFIED;
sdesc.BufferDesc.ScanlineOrdering=DXGI_MODE_SCANLINE_ORDER_UNSPECIFIED;
sdesc.OutputWindow=_hwnd;
sdesc.BufferDesc.RefreshRate.Denominator=1;
sdesc.BufferDesc.RefreshRate.Numerator=60;
sdesc.Flags=0;
sdesc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT;
if (m4xMsaaEnable)
{
sdesc.SampleDesc.Count=4;
sdesc.SampleDesc.Quality=m4xMsaaQuality-1;
}
else
{
sdesc.SampleDesc.Count=1;
sdesc.SampleDesc.Quality=0;
}
IDXGIDevice *Device=0;
HR(d3dDevice->QueryInterface(__uuidof(IDXGIDevice),reinterpret_cast <void**> (&Device)));
IDXGIAdapter*Ad=0;
HR(Device->GetParent(__uuidof(IDXGIAdapter),reinterpret_cast <void**> (&Ad)));
IDXGIFactory* fac=0;
HR(Ad->GetParent(__uuidof(IDXGIFactory),reinterpret_cast <void**> (&fac)));
fac->CreateSwapChain(d3dDevice,&sdesc,&swapchain);
ReleaseCOM(Device);
ReleaseCOM(Ad);
ReleaseCOM(fac);
ID3D11Texture2D *back = 0;
HR(swapchain->GetBuffer(0,__uuidof(ID3D11Texture2D),reinterpret_cast <void**> (&back)));
HR(d3dDevice->CreateRenderTargetView(back,0,&RenderTarget));
D3D11_TEXTURE2D_DESC Tdesc;
ZeroMemory(&Tdesc,sizeof(D3D11_TEXTURE2D_DESC));
Tdesc.BindFlags = D3D11_BIND_DEPTH_STENCIL;
Tdesc.ArraySize = 1;
Tdesc.Format= DXGI_FORMAT_D24_UNORM_S8_UINT;
Tdesc.Height= height;
Tdesc.Width = width;
Tdesc.Usage = D3D11_USAGE_DEFAULT;
Tdesc.MipLevels=1;
if (m4xMsaaEnable)
{
Tdesc.SampleDesc.Count=4;
Tdesc.SampleDesc.Quality=m4xMsaaQuality-1;
}
else
{
Tdesc.SampleDesc.Count=1;
Tdesc.SampleDesc.Quality=0;
}
HR(d3dDevice->CreateTexture2D(&Tdesc,0,&depthview));
HR(d3dDevice->CreateDepthStencilView(depthview,0,&depth));
d3dDeviceContext->OMSetRenderTargets(1,&RenderTarget,depth);
D3D11_VIEWPORT vp;
vp.TopLeftX=0.0f;
vp.TopLeftY=0.0f;
vp.Width = static_cast <float> (width);
vp.Height= static_cast <float> (height);
vp.MinDepth = 0.0f;
vp.MaxDepth = 1.0f;
d3dDeviceContext -> RSSetViewports(1,&vp);
return true;
}
SetBuild() prepares the matrices inside the container for the smaller cubes; I haven't programmed it to draw the smaller cubes yet.
And this is the function that draws the scene:
void Game::Render(){
d3dDeviceContext->ClearRenderTargetView(RenderTarget,reinterpret_cast <const float*> (&Colors::LightSteelBlue));
d3dDeviceContext->ClearDepthStencilView(depth,D3D11_CLEAR_DEPTH | D3D11_CLEAR_STENCIL,1.0f,0);
d3dDeviceContext-> IASetInputLayout(_layout);
d3dDeviceContext-> IASetPrimitiveTopology(D3D10_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
d3dDeviceContext->IASetIndexBuffer(indices,DXGI_FORMAT_R32_UINT,0);
UINT strides=sizeof(Vertex),off=0;
d3dDeviceContext->IASetVertexBuffers(0,1,&vertices,&strides,&off);
D3DX11_TECHNIQUE_DESC des;
Tech->GetDesc(&des);
Floor * Lookup; /*is a variable to Lookup inside the matrices structure (Floor Contains XMMATRX Piese[9])*/
std::vector<XMFLOAT4X4> filled; // saves the matrices of the smaller cubes
XMMATRIX V=XMLoadFloat4x4(&View),P = XMLoadFloat4x4(&Proj);
XMMATRIX vp = V * P;XMMATRIX wvp;
for (UINT i = 0; i < des.Passes; i++)
{
d3dDeviceContext->RSSetState(BuildRast);
wvp = XMLoadFloat4x4(&(B.Memory[0].Pieces[0])) * vp; // Loading The Matrix at translation(0,0,0)
HR(ShadeMat->SetMatrix(reinterpret_cast<float*> ( &wvp)));
HR(Tech->GetPassByIndex(i)->Apply(0,d3dDeviceContext));
d3dDeviceContext->DrawIndexed(build_ind_count,build_ind_index,build_vers_index);
d3dDeviceContext->RSSetState(PieseRast);
UINT r1=B.GetSize(),r2=filled.size();
for (UINT j = 0; j < r1; j++)
{
Lookup = &B.Memory[j];
for (UINT r = 0; r < Lookup->filledindeces.size(); r++)
{
filled.push_back(Lookup->Pieces[Lookup->filledindeces[r]]);
}
}
for (UINT j = 0; j < r2; j++)
{
ShadeMat->SetMatrix( reinterpret_cast<const float*> (&filled[i]));
Tech->GetPassByIndex(i)->Apply(0,d3dDeviceContext);
d3dDeviceContext->DrawIndexed(piese_ind_count,piese_ind_index,piese_vers_index);
}
}
HR(swapchain->Present(0,0));}
Thanks in advance.
One bug in your program appears to be that you're using i, the index of the current pass, as an index into the filled vector, when you should apparently be using j.
Another apparent bug is that in the loop where you are supposed to be iterating over the elements of filled, you're not iterating over all of them. The value r2 is set to the size of filled before you append anything to it during that pass. During the first pass this means that nothing will be drawn by this loop. If your technique only has one pass then this means that the second DrawIndexed call in your code will never be executed.
It also appears that you should add matrices to filled only once, regardless of the number of passes the technique has. You should consider whether your code is actually meant to work with techniques that have multiple passes.
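Put together, the loop structure I have in mind looks roughly like the sketch below. It is deliberately independent of Direct3D so that it can be compiled anywhere: Matrix, Floor, collectFilled and drawPieces are hypothetical stand-ins for your types, and the draw callback stands in for the SetMatrix / Apply / DrawIndexed sequence.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-ins for the question's types (XMFLOAT4X4 and Floor).
struct Matrix { float m[16]; };
struct Floor {
    Matrix pieces[9];
    std::vector<std::size_t> filledIndices;
};

// Gathers the matrices of all filled cells; done once per frame.
std::vector<Matrix> collectFilled(const std::vector<Floor>& floors) {
    std::vector<Matrix> filled;
    for (const Floor& floor : floors)
        for (std::size_t index : floor.filledIndices)
            filled.push_back(floor.pieces[index]);
    return filled;
}

// Issues one draw per filled matrix per pass and returns the number of draws.
std::size_t drawPieces(const std::vector<Floor>& floors, std::size_t passCount,
                       const std::function<void(std::size_t, const Matrix&)>& draw) {
    const std::vector<Matrix> filled = collectFilled(floors);  // before the pass loop
    std::size_t draws = 0;
    for (std::size_t pass = 0; pass < passCount; ++pass) {
        // filled.size() is read after the vector has been populated.
        for (std::size_t j = 0; j < filled.size(); ++j) {
            draw(pass, filled[j]);  // indexed with j, not with the pass index
            ++draws;
        }
    }
    return draws;
}
```

With two floors holding three filled cells in total and a two-pass technique, this issues six draws, whereas the original loop would issue none during the first pass.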
I have a text, and I want to train a model on features extracted from it using the Java API. Looking at the examples, the main class for building the training set is svm_problem. It appears that svm_node represents a feature (the index identifies the feature and the value is the weight of the feature).
What I have done is to have a map (just to simplify the problem) that keeps an association between each feature and an index. For each of my <feature, weight> examples I create a new node:
svm_node currentNode = new svm_node();
int index = feature.getIndexInMap();
double value = feature.getWeight();
currentNode.index = index;
currentNode.value = value;
Is my intuition correct? What does svm_problem.y refer to? Does it refer to the index of the label? Is svm_problem.l just the length of the two vectors?
Your intuition is right: each svm_node holds a single feature as an (index, value) pair, and an array of svm_node represents one pattern (one training example). The variable svm_problem.y is an array that contains the label of each pattern, and svm_problem.l is the size of the training set.
Also, beware that svm_parameter.nr_weight is the number of per-label weights supplied in weight_label and weight (useful if you have an unbalanced training set); if you are not going to use them, you must set that value to zero.
Let me show you a simple example in C++:
#include "svm.h"
#include <iostream>
using namespace std;
int main()
{
svm_parameter params;
params.svm_type = C_SVC;
params.kernel_type = RBF;
params.C = 1;
params.gamma = 1;
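params.nr_weight = 0;
params.weight_label = nullptr;
params.weight = nullptr;
params.p = 0.0001;        // only used by EPSILON_SVR
params.eps = 0.001;       // stopping tolerance of the solver
params.cache_size = 100;  // kernel cache size in MB
params.shrinking = 1;
params.probability = 0;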
svm_problem problem;
problem.l = 4;
problem.y = new double[4]{1,-1,-1,1};
problem.x = new svm_node*[4];
{
problem.x[0] = new svm_node[3];
problem.x[0][0].index = 1;
problem.x[0][0].value = 0;
problem.x[0][1].index = 2;
problem.x[0][1].value = 0;
problem.x[0][2].index = -1;
}
{
problem.x[1] = new svm_node[3];
problem.x[1][0].index = 1;
problem.x[1][0].value = 1;
problem.x[1][1].index = 2;
problem.x[1][1].value = 0;
problem.x[1][2].index = -1;
}
{
problem.x[2] = new svm_node[3];
problem.x[2][0].index = 1;
problem.x[2][0].value = 0;
problem.x[2][1].index = 2;
problem.x[2][1].value = 1;
problem.x[2][2].index = -1;
}
{
problem.x[3] = new svm_node[3];
problem.x[3][0].index = 1;
problem.x[3][0].value = 1;
problem.x[3][1].index = 2;
problem.x[3][1].value = 1;
problem.x[3][2].index = -1;
}
for(int i=0; i<4; i++)
{
cout << problem.y[i] << endl;
}
svm_model * model = svm_train(&problem, &params);
svm_save_model("mymodel.svm", model);
for(int i=0; i<4; i++)
{
double d = svm_predict(model, problem.x[i]);
cout << "Prediction " << d << endl;
}
/* We should free the memory at this point.
But this example is large enough already */
}
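Filling every node by hand gets tedious once you have real features. Here is a small sketch of a helper that turns a sparse feature map into a node array. To keep it compilable without svm.h it uses a stand-in struct called Node with the same two fields as libsvm's svm_node; both Node and toNodes are names I made up, and with the real library you would use svm_node directly.

```cpp
#include <map>
#include <vector>

// Stand-in with the same two fields as libsvm's svm_node (assumption:
// swap in the real type when compiling against svm.h).
struct Node {
    int index;
    double value;
};

// Converts a sparse feature map (feature index -> weight) into a node array.
// std::map iterates in ascending key order, which is the order libsvm expects.
std::vector<Node> toNodes(const std::map<int, double>& features) {
    std::vector<Node> nodes;
    nodes.reserve(features.size() + 1);
    for (const auto& entry : features) {
        if (entry.second != 0.0)  // zero-valued features can be omitted
            nodes.push_back({entry.first, entry.second});
    }
    nodes.push_back({-1, 0.0});  // index -1 marks the end of the pattern in the C/C++ API
    return nodes;
}
```

Because the vector owns its memory, there is nothing to free by hand; the vectors just have to outlive both the svm_problem that points into them and the model trained from it. As far as I know the Java API uses plain arrays without the -1 terminator, so there you would drop the last push_back.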
I have been developing a graphics editor for SVGs that I want to put online so that my users can access it through their web browsers. It is based on SVG-edit and is written in JavaScript.
The application currently lacks so-called "boolean operations", that is, the ability for a user to select two or more shapes and join them together.
I have found a C++ library called lib2geom that is supposed to handle these operations; I believe this is what Inkscape uses as well.
So is there a possibility to link this library with my application, given that it is not written in JavaScript?
<div id="svgcontainer"></div>
<script>
function path2poly()
{
var subj_polygons = [[{X:10,Y:10},{X:110,Y:10},{X:110,Y:110},{X:10,Y:110}],
[{X:20,Y:20},{X:20,Y:100},{X:100,Y:100},{X:100,Y:20}]];
var clip_polygons = [[{X:50,Y:50},{X:150,Y:50},{X:150,Y:150},{X:50,Y:150}],
[{X:60,Y:60},{X:60,Y:140},{X:140,Y:140},{X:140,Y:60}]];
var scale = 100;
subj_polygons = scaleup(subj_polygons, scale);
clip_polygons = scaleup(clip_polygons, scale);
var cpr = new ClipperLib.Clipper();
cpr.AddPolygons(subj_polygons, ClipperLib.PolyType.ptSubject);
cpr.AddPolygons(clip_polygons, ClipperLib.PolyType.ptClip);
var subject_fillType = ClipperLib.PolyFillType.pftNonZero;
var clip_fillType = ClipperLib.PolyFillType.pftNonZero;
var clipTypes = [ClipperLib.ClipType.ctUnion];
var clipTypesTexts = "Union";
var solution_polygons, svg, cont = document.getElementById('svgcontainer');
var i;
for(i = 0; i < clipTypes.length; i++) {
solution_polygons = new ClipperLib.Polygons();
cpr.Execute(clipTypes[i], solution_polygons, subject_fillType, clip_fillType);
//console.log(JSON.stringify(solution_polygons));
alert(polys2path(solution_polygons, scale));
}
}
// helper function to scale up polygon coordinates
function scaleup(poly, scale) {
var i, j;
if (!scale) scale = 1;
for(i = 0; i < poly.length; i++) {
for(j = 0; j < poly[i].length; j++) {
poly[i][j].X *= scale;
poly[i][j].Y *= scale;
}
}
return poly;
}
// converts polygons to SVG path string
function polys2path (poly, scale) {
var path = "", i, j;
if (!scale) scale = 1;
for(i = 0; i < poly.length; i++) {
for(j = 0; j < poly[i].length; j++) {
if (!j) path += "M";
else path += "L";
path += (poly[i][j].X / scale) + ", " + (poly[i][j].Y / scale);
}
path += "Z";
}
return path;
}
</script>
You can try using Emscripten to cross-compile it to JavaScript.
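To give an idea of what the C++ side of that could look like, here is a minimal sketch. It does not use lib2geom at all: polygon_area is a hypothetical function I wrote just to show how a C++ routine is exported, and whether lib2geom itself builds cleanly with Emscripten is something you would have to try.

```cpp
#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#define EXPORTED EMSCRIPTEN_KEEPALIVE  // keeps the symbol from being stripped
#else
#define EXPORTED                       // plain native build, nothing special needed
#endif

extern "C" {

// Hypothetical exported function: signed area of a polygon (shoelace formula).
// `xy` holds n points as x0, y0, x1, y1, ...; C linkage gives it an unmangled name.
EXPORTED double polygon_area(const double* xy, int n) {
    double twiceArea = 0.0;
    for (int i = 0; i < n; ++i) {
        const int j = (i + 1) % n;  // next vertex, wrapping around at the end
        twiceArea += xy[2 * i] * xy[2 * j + 1] - xy[2 * j] * xy[2 * i + 1];
    }
    return 0.5 * twiceArea;
}

}  // extern "C"
```

You would compile this with emcc and call it from JavaScript through Emscripten's ccall or cwrap helpers. Note that array arguments have to be copied into the module's heap first, so a thin JavaScript wrapper around each exported function is usually needed.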
First, let me clarify what I mean by "grouped models." I'm not actually sure what the standard terminology for this is. In order to reduce the number of rendering calls, I am grouping multiple models into a single model, and rendering the whole thing with a single call to glDrawElements (using VBOs). In my code, I call this a ModelGroup. I use it for various things, but especially for large groups of geometrically simple objects (like buildings in a city, or particles).
The problem has recently surfaced where my ModelGroups are rendering very slowly. I have isolated the slowdown to the actual call to glDrawElements by putting a timer around it. For instance, my particles used to render ~10k particles (without instancing) at around 2ms or so. I can't recall the exact number, but let's just say the rendering was definitely not the bottleneck as it currently is. As of now, a single call to glDrawElements with 10k particles takes right about 256ms. This performance is only marginally better than rendering the objects each with separate calls to glDrawElements. So, there is clearly a massive burden on the GPU for some reason.
What has changed in my engine:
I recently updated Xcode and changed from using EAGLView to using GLKViewController. I changed nothing else in my code between these two very different states of performance. I will say that, in order to migrate to GLKViewController, I recreated my project entirely and added all of my source files to it. Then I rewrote my game loop to be updated by the GLKViewController's update function. This was a very minor change, however.
Just to be completely clear on what my ModelGroup class does, I will post the function that compiles the added models into the display model which is rendered.
-(bool) compileDisplayModelWithNormals:(bool)updateNormals withTexcoords:(bool)updateTexcoords withColor:(bool)updateColor withElements:(bool)updateElements
{
modelCompiled = YES;
bool initd = displayModel->positions;
// set properties
if( !initd )
{
displayModel->primType = GL_UNSIGNED_SHORT;
displayModel->elementType = GL_TRIANGLES;
displayModel->positionType = GL_FLOAT;
displayModel->texcoordType = GL_FLOAT;
displayModel->colorType = GL_FLOAT;
displayModel->normalType = GL_FLOAT;
displayModel->positionSize = 3;
displayModel->normalSize = 3;
displayModel->texcoordSize = 2;
displayModel->colorSize = 4;
// initialize to zero
displayModel->numVertices = 0;
displayModel->numElements = 0;
displayModel->positionArraySize = 0;
displayModel->texcoordArraySize = 0;
displayModel->normalArraySize = 0;
displayModel->elementArraySize = 0;
displayModel->colorArraySize = 0;
// sum the sizes
for( NSObject<RenderedItem> *ri in items )
{
GLModel *model = ri.modelAsset.model.displayModel;
displayModel->numVertices += model->numVertices;
displayModel->numElements += model->numElements;
displayModel->positionArraySize += model->positionArraySize;
displayModel->texcoordArraySize += model->texcoordArraySize;
displayModel->normalArraySize += model->normalArraySize;
displayModel->elementArraySize += model->elementArraySize;
displayModel->colorArraySize += model->colorArraySize;
}
displayModel->positions = (GLfloat *)malloc( displayModel->positionArraySize );
displayModel->texcoords = (GLfloat *)malloc( displayModel->texcoordArraySize );
displayModel->normals = (GLfloat *)malloc( displayModel->normalArraySize );
displayModel->elements = (GLushort *)malloc( displayModel->elementArraySize );
displayModel->colors = (GLfloat *)malloc( displayModel->colorArraySize );
}
// update the data
int vertexOffset = 0;
int elementOffset = 0;
for( int j = 0; j < [items count]; j++ )
{
NSObject<RenderedItem> *ri = (GameItem *)[items objectAtIndex:j];
GLModel *model = ri.modelAsset.model.displayModel;
if( !ri.geometryUpdate )
{
vertexOffset += model->numVertices;
continue;
}
// reset the update flag
ri.geometryUpdate = NO;
// get GameItem transform data
rpVec3 pos = [ri getPosition];
rpMat3 rot = [ri orientation];
int NoV = model->numVertices;
int NoE = model->numElements;
for( int i = 0; i < NoV; i++ )
{
// positions
rpVec3 r = rpVec3( model->positions, model->positionSize * i );
// scale
rpVec3 s = ri.scale;
r.swizzleLocal( s );
// rotate
r = rot * r;
// translate
r.addLocal( pos );
int start = model->positionSize * (vertexOffset + i);
for( int k = 0; k < model->positionSize; k++ )
displayModel->positions[start + k] = r[k];
if( updateTexcoords )
{
// texcoords
start = model->texcoordSize * (vertexOffset + i);
if( model->texcoords )
for( int k = 0; k < model->texcoordSize; k++ )
displayModel->texcoords[start + k] = model->texcoords[model->texcoordSize * i + k];
}
if( updateNormals )
{
// normals (need to be rotated)
if( model->normals )
{
for( int k = 0; k < model->normalSize; k++ )
{
rpVec3 vn = rpVec3( model->normals, model->normalSize * i );
rpVec3 vnRot = rot * vn;
start = model->normalSize * (vertexOffset + i);
displayModel->normals[start + k] = vnRot[k];
}
}
}
if( updateColor )
{
if( model->colors )
{
start = model->colorSize * (vertexOffset + i);
displayModel->colors[start] = ri.color.r;
displayModel->colors[start + 1] = ri.color.g;
displayModel->colors[start + 2] = ri.color.b;
displayModel->colors[start + 3] = ri.color.a;
}
}
}
if( updateElements )
{
for( int i = 0; i < NoE; i++ )
{
// elements
displayModel->elements[elementOffset + i] = model->elements[i] + vertexOffset;
}
}
vertexOffset += NoV;
elementOffset += NoE;
}
return YES;
}
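Stripped of the engine specifics, the core of that grouping step can be sketched in a few lines of C++. This is a simplified illustration rather than my actual code: Mesh and appendMesh are hypothetical names, only positions and elements are handled, and the per-item transform is reduced to a translation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified mesh: positions only, with 16-bit triangle indices
// (matching the GL_UNSIGNED_SHORT element type used above).
struct Mesh {
    std::vector<float> positions;         // x, y, z per vertex
    std::vector<std::uint16_t> elements;  // indices into the vertex list
};

// Appends `src` to `dst`, translating its vertices and rebasing its indices
// so they point at the copied vertices. Returns false, leaving `dst`
// untouched, if the combined vertex count no longer fits 16-bit indices.
bool appendMesh(Mesh& dst, const Mesh& src, float tx, float ty, float tz) {
    const std::size_t base = dst.positions.size() / 3;
    const std::size_t count = src.positions.size() / 3;
    if (base + count > 65536) return false;

    for (std::size_t v = 0; v < count; ++v) {
        dst.positions.push_back(src.positions[3 * v] + tx);
        dst.positions.push_back(src.positions[3 * v + 1] + ty);
        dst.positions.push_back(src.positions[3 * v + 2] + tz);
    }
    for (std::uint16_t element : src.elements)
        dst.elements.push_back(static_cast<std::uint16_t>(element + base));
    return true;
}
```

The overflow check matters for large groups: with unsigned short indices a single group can address at most 65536 vertices.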
Just to be complete, here is how I render the particles. Inside the particle field draw function:
glBindVertexArray( modelGroup.displayModel->modelID );
glBindTexture( GL_TEXTURE_2D, textureID );
// set shader program
if( changeShader ) glUseProgram( shader.programID );
[modelViewStack push];
mtxMultiply( modelViewProjectionMatrix.m, [projectionStack top].m, [modelViewStack top].m );
glUniformMatrix4fv( shader.modelViewProjectionMatrixID, 1, GL_FALSE, modelViewProjectionMatrix.m );
[DebugTimer check:@"particle main start"];
glDrawElements( GL_TRIANGLES, modelGroup.displayModel->numElements, GL_UNSIGNED_SHORT, 0 );
[DebugTimer check:@"particle main end"];
[modelViewStack pop];
The two statements that sandwich the glDrawElements statement are the timer I used to measure time between events.
Also, I just wanted to add that I have run on both the device and the iPad simulator 6.1 with the same result. The simulator is slower at performing multiple draw calls, but both are equally slow at calling glDrawElements for a ModelGroup. As far as hardware acceleration is concerned, I have checked to make sure that this performance hit isn't coming as some side effect of a lack of acceleration. I rendered a model read in from a file which contained 1024 cubes (similar to a ModelGroup for a city) which rendered with no problem (no 20ms delay as with 1000 cubes in a ModelGroup).
I believe that I have solved the mystery, in a manner of speaking. It is, after all, still a mystery to me why this solves the problem.
I had been using my own custom enum values for these functions
glEnableVertexAttribArray
glVertexAttribPointer
instead of using the values that Apple newly specified (as of iOS 5.0, I think) and that came as part of the GLKViewController class:
GLKVertexAttribPosition
GLKVertexAttribNormal
GLKVertexAttribTexCoord0
Having made this change yielded the kind of performance I expected when calling glDrawElements. My model groups will now render on the order of 0.1 ms as they should, rather than ~20 ms. As I said, I really do not understand exactly why this fixes anything, but it's a solution all the same.