Unable to understand YOLOv4 architecture - machine-learning

I was going through the YOLOv4 paper, where the authors mention a backbone (CSPDarknet53), a neck (SPP followed by PANet), and then a head (YOLOv3). So is the architecture something like this:
CSP Darknet-53-->SPP-->PANet-->YOLOv3(106 layers of YOLOv3).
Does this mean YOLOv4 incorporates entire YOLOv3?

First, what is YOLOv3 composed of?
YOLOv3 is composed of two parts:
Backbone or Feature Extractor --> Darknet53
Head or Detection Blocks --> 53 layers
The head is used for (1) bounding box localization and (2) identifying the class of the object inside the box.
YOLOv4 uses the same head as YOLOv3, so it reuses YOLOv3's detection blocks rather than the entire YOLOv3 network.
To summarize, YOLOv4 has three main parts:
Backbone --> CSPDarknet53
Neck (Connects the backbone with the head) --> SPP, PAN
Head --> YOLOv3's Head
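As a rough illustration of the summary above, the composition can be written down as plain data. This is only a sketch: the `Detector` class and its field names are hypothetical, and the stage names are labels taken from this answer, not real network layers.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Detector:
    """Hypothetical container naming the stages of a detector (labels only)."""
    backbone: str
    neck: Tuple[str, ...]
    head: str

    def pipeline(self) -> str:
        # Join the stages in the order data flows through them.
        return " --> ".join((self.backbone, *self.neck, self.head))


# Stage names follow the answer above; no neck is listed for YOLOv3 there.
yolov3 = Detector(backbone="Darknet53", neck=(), head="YOLOv3 head")
yolov4 = Detector(backbone="CSPDarknet53", neck=("SPP", "PAN"), head="YOLOv3 head")

print(yolov4.pipeline())
print(yolov4.head == yolov3.head)          # the head is shared
print(yolov4.backbone == yolov3.backbone)  # the backbone is not
```

Only the head label is common to both objects, which is the sense in which YOLOv4 "contains" YOLOv3.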
References:
Section 1.A. in https://ieeexplore.ieee.org/document/9214094
Page 5 of http://arxiv.org/abs/2004.10934

Related

What is the difference between data association and feature matching in SLAM/VO?

I have read a little about this and have seen, for instance, the terms used interchangeably, or feature matching described as part of data association. "An Overview to Visual Odometry and Visual SLAM: Applications to Mobile Robotics" by Yousif et al. says that "…feature matching is the process of individually extracting features and matching them over multiple frames" and also that "DA is defined as the process of associating a measurement (or feature) to its corresponding previously extracted feature", but it separates the two from each other. Other sources I read weren't as clear, but they mostly seem to indicate that feature matching is part of DA. I'm a little confused.
Data association methods determine how we find the transformation between two images. They are as follows:
Feature points
Image patches around the features (semi-dense / semi-direct methods)
Pixel to pixel (direct / optical-flow-based methods)
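For the first option, feature points, a common way to associate a feature in one frame with a feature in another is nearest-neighbour descriptor matching with a ratio test. The sketch below is one illustrative choice, not something prescribed by the answer; `match_features`, the 0.8 ratio, and the toy 2-D descriptors are all assumptions.

```python
import math


def match_features(desc_a, desc_b, ratio=0.8):
    """Associate each descriptor in desc_a with its nearest one in desc_b.

    A match is kept only if the best distance is clearly smaller than the
    second-best distance (ratio test), which rejects ambiguous associations.
    """
    matches = []
    for i, da in enumerate(desc_a):
        # Distances from this descriptor to every candidate, closest first.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the ratio test needs at least two candidates
        (best, j_best), (second, _) = dists[0], dists[1]
        if best < ratio * second:
            matches.append((i, j_best))
    return matches


frame_a = [(0.0, 0.0), (10.0, 10.0)]
frame_b = [(10.1, 10.0), (0.0, 0.1), (50.0, 50.0)]
print(match_features(frame_a, frame_b))  # [(0, 1), (1, 0)]
```

Each returned pair is one association between a measurement in the first frame and a previously extracted feature in the second.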

Generating road mesh from a graph

Background:
I have a unidirectional planar graph. Each node in the graph contains its location and the nodes it is connected to (up to 4 nodes, each in a separate variable).
Each connection between nodes is an edge (a road segment), and each node is a junction or a dead end.
The road should follow a 2D polar grid layout and will be edited at runtime.
This will be used as a road-building tool for a city-building game.
I'm using UE4 C++ and I'm pretty new to procedural generation.
The issue:
I'm looking for some guidance on how to generate the topology.
1. What algorithms, methods, techniques, or math should I use or know about?
2. If I should use the extrude method, how do I include the junctions?
3. Where should I have overlapping verts (other than places where I need to cut for UVs)?
4. How do I incorporate sidewalks into the road segments and the junctions?
Research:
The best way I have found is basically the extrude method, which seems too primitive and will be problematic with intersections, since it requires looking up vert locations, which seems extremely inefficient.
More details about the graph:
https://gamedev.stackexchange.com/questions/179214/generate-road-mesh-from-a-graph
(I'm posting here because game dev seems to be pretty dead sadly)

Kaggle: TrackML Particle Tracking Challenge

I'm new to ML and Kaggle. I was going through the solution of a Kaggle Challenge.
Challenge: https://www.kaggle.com/c/trackml-particle-identification
Solution: https://www.kaggle.com/outrunner/trackml-2-solution-example
While going through the code, I noticed that the author has used only train_1 file (not train_2, 3, …).
I assume there is some strategy behind using only the train_1 file. Can someone please explain why? Also, what are blacklist_training.zip, train_sample.zip, and detectors.zip used for?
I'm one of the organisers of the challenge. The train_1, 2, 3, … files are all equivalent. Outrunner probably saw that there was no improvement from using more data.
train_sample.zip is a small dataset equivalent to train_1 2 3... provided for convenience.
blacklist_training.zip is a list of particles to be ignored due to a small bug in the simulator (not very important).
detectors.zip is the list of the geometrical surfaces where the x y z measurements are made.
David

Applying MACHINE learning in biological text data

I am trying to solve the following question - Given a text file containing a bunch of biological information, find out the one gene which is {up/down}regulated. Now, for this I have many such (60K) files and have annotated some (1000) of them as to which gene is {up/down}regulated.
Conditions -
Many sentences in the file have some gene name mention and some of them also have neighboring text that can help one decide if this is indeed the gene being modulated.
Some files also have NO gene modulated. But these still have gene mentions.
Given this, I wanted to ask (having absolutely no background in ML), what sequence learning algorithm/tool do I use that can take in my annotated (training) data (after probably converting the text to vectors somehow!) and can build a good model on which I can then test more files?
Example data -
Title: Assessment of Thermotolerance in preshocked hsp70(-/-) and (+/+) cells
Organism: Mus musculus
Experiment type: Expression profiling by array
Summary: From preliminary experiments, HSP70 deficient MEF cells display moderate thermotolerance to a severe heatshock of 45.5 degrees after a mild preshock at 43 degrees, even in the absence of hsp70 protein. We would like to determine which genes in these cells are being activated to account for this thermotolerance. AQP has also been reported to be important.
Keywords: thermal stress, heat shock response, knockout, cell culture, hsp70
Overall design: Two cell lines are analyzed - hsp70 knockout and hsp70 rescue cells. 6 microarrays from the (-/-)knockout cells are analyzed (3 Pretreated vs 3 unheated controls). For the (+/+) rescue cells, 4 microarrays are used (2 pretreated and 2 unheated controls). Cells were plated at 3k/well in a 96 well plate, covered with a gas permeable sealer and heat shocked at 43degrees for 30 minutes at the 20 hr time point. The RNA was harvested at 3hrs after heat treatment
Here my main gene is hsp70 and it is down-regulated (deducible from hsp(-/-) or HSP70 deficient). Many other gene names are also there like AQP.
There could be another file with no gene modified at all. In fact, more files have no actual gene modulation than have one, and all of them contain gene name mentions.
Any idea would be great!!
If you have no background in ML, I suggest buying a product like this one, this one or this one. These products were in development for decades, with team budgets in the millions.
What you are trying to do is not that simple. For example, a lot of papers contain negative statements: they first cite the original statement from another paper and then negate it. In your example, how are you going to handle this:
AQP has also been reported to be important by Doe et al. However, this study suggests that this might not be the case.
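To make the problem concrete, here is a minimal sketch of a naive sentence-level matcher that falls into exactly this trap. The function name `naive_gene_hits` and the cue-word list are made up for illustration; this is not a recommended method.

```python
import re

# Hypothetical cue words that a naive system might treat as evidence.
CUES = ("important", "up-regulated", "down-regulated", "deficient")


def naive_gene_hits(text, genes):
    """Return genes that share a sentence with a cue word (no negation handling)."""
    hits = []
    # Crude sentence split: break after '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        lowered = sentence.lower()
        for gene in genes:
            mentioned = re.search(rf"\b{re.escape(gene)}\b", sentence)
            if mentioned and any(cue in lowered for cue in CUES):
                hits.append(gene)
    return hits


text = ("AQP has also been reported to be important by Doe et al. "
        "However, this study suggests that this might not be the case.")
print(naive_gene_hits(text, ["AQP", "hsp70"]))  # ['AQP']
```

AQP is flagged even though the following sentence retracts the claim, because the negation lives in a different sentence from the gene mention.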
Also, if you are looking into a large corpus of biomedical research papers, or for that matter any corpus of research papers, you will find many papers that suggest something, for example that a gene is up-regulated or not, and then one paper published in Cell claiming that all the previous research was mistaken.
To make matters worse, gene/protein names are not that stable. Apart from a few famous ones like P53, there are plenty of run-of-the-mill ones that are initially thought to be one gene but later turn out to be two different things. When this happens, there are two ways the community handles it: either both genes get new names (usually with some designator at the end) or, if the split is uneven, the larger class retains the original name and the second one gets a new name. To compound the problem, after the split happens not all researchers get the memo instantly, so there is still a stream of publications using the old name.
These are just two simple problems; there are hundreds of these.
If you are doing this for personal enrichment, here are some suggestions:
Build a language model on biomedical papers. Existing language models are usually built from news-wire sources or from social media data. All three kinds of corpora claim to be written in English, but in reality they are three different languages, each with its own grammar and vocabulary.
Look into things like embeddings and word2vec.
Look into Kaggle competitions, this is somewhat popular topic there.
Subscribe to the KDD and BIBM publications, or find them in a nearby library. There are hundreds of papers on this subject.

.VTX File Format?

I've recently taken the plunge into DirectX and have been messing around a little with Anim8or, and have discovered several file types that models can be exported to that are text based. I've particularly taken to VTX files. I've learned how to parse some basics out of it, but I'm obviously missing a few things.
It starts with a .Faceset, which is immediately (on the same line) followed by the number of meshes in the file.
For each mesh, there is one .Vertex section and one .Index section in that order and the first pair of .Vertex/.Index sections are the first mesh, the second set are the second mesh and so on as you'd expect.
In a .Vertex section of the file, there's 8 numbers per line and an undefined number of lines (unless you want to trust the comments Anim8or has put just before the section, but that doesn't seem to be part of the specs of the file, just Anim8or being kind). The first 3 numbers correspond to X, Y, and Z coordinates for a particular point that'll later be used as a vertex, the other 5 I have no idea. A majority of the time, the last 2 numbers are both 0, but I've noticed that's not ALWAYS true, just usually true.
Next comes the matching .Index section. This section has 4 numbers. The first 3 are reference numbers to the Vertexes previously stated and the 3 points mark a triangle in the model. 0 meaning the first mentioned Vertex, 1 meaning the next one, and so on, like a zero-based array. The 4th number appears to always be -1, I can't figure out what importance it has and I can't promise it's ALWAYS -1. In case you can't tell, I'm not too certain about anything in this file type.
There's also other information in the file that I'm choosing to ignore right now because I'm new and don't want to overcomplicate things too much. Such as after every .Index section is:
.Brdf
// Ambient color
0.431 0.431 0.431
// Diffuse color
0.431 0.431 0.431
// Specular color and exponent
1 1 1 2
// Kspecular = 0.5
// end of .Brdf
It appears to me this is about the surface of the mesh just described. But it's not needed for placement of meshes so I moved past it for now.
Moving on to the real problem... I can load a VTX file when there's only one mesh in the VTX file (meaning the .FaceSet is 1). I can almost successfully load a VTX file that has multiple meshes, each mesh is successfully structured, but not properly placed in relation to the other meshes. I downloaded an AT-AT model from an Anim8or thread in a forum and it's made up of 344 meshes, when I load the file just using the specs I've mentioned so far, it looks like the AT-AT is exploded out as if it were a diagram of how to make it (when loaded in Anim8or, all pieces are close and resemble a fully assembled AT-AT). All the pieces are oriented correctly and have the same up direction, but there's plenty of extra space between the pieces.
Does somebody know how to properly read a VTX file? Or know of a website that'll explain what those other numbers mean?
Edit:
The file extension .VTX is used for a lot of different things and has a lot of different structures depending on what the expected use is. Valve, Visio, Anim8or, and several others use VTX, I'm only interested in the VTX file that Anim8or exports and the structure that it uses.
I have been working on a 3D Modeling program myself and wanted a simple format to be able to bring objects in to the editor to be able to test the speed of my drawing routines with large sets of vertices and faces. I was looking for an easy one where I could get models quickly and found the .vtx format. I googled it and found your question. When I was unable to find the format on the internet, I played around and compared .OBJ exports with .vtx ones. (Maybe it was created just for Anim8or?) Here is what I found:
1) Yes, the vertices have eight numbers on each line. The first three are, as you guessed, the x, y, and z coordinates. The next three are the vertex normals, nx, ny, and nz. You may notice that each vertex appears multiple times with different normals for each face that contains it. The last two numbers are texture coordinates.
2) As for the faces, I reached the same conclusions as you did. The first three numbers are indices into the vertex list above. The last number does appear to always be -1. I am going to assume that it has something to do with the facing of the face. (e.g. facing in or out.) Since most models are created with the faces all facing appropriately, it stands to reason that this would be the same number for all of them.
3) One additional note: When comparing the .obj with the .vtx, I did notice that the positions of the vertices changed. This was also true when comparing with the .an8 file. This should not be a "HUGE" problem as long as they are all offset by the same amount in each vertex and every file. At least then it could be compensated for.
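Putting the observations above together, a reader for this layout might look like the sketch below. It is based only on what the question and this answer describe, not on any specification, so several things are assumptions: that section markers such as .Vertex and .Index start their own line, that `//` begins a comment, and that anything after a marker on the same line can be ignored. The name `parse_vtx` is made up.

```python
def parse_vtx(text):
    """Parse Anim8or-style .vtx text into a list of meshes (assumed layout)."""
    meshes = []
    section = None
    for raw in text.splitlines():
        line = raw.split("//", 1)[0].strip()  # drop comments
        if not line:
            continue
        # A section marker is a dot followed by a letter, e.g. ".Vertex".
        if line[0] == "." and line[1:2].isalpha():
            section = line.split()[0].lower()
            if section == ".vertex":
                meshes.append({"vertices": [], "triangles": []})
            continue
        fields = line.split()
        if section == ".vertex" and len(fields) == 8:
            x, y, z, nx, ny, nz, u, v = map(float, fields)
            # position, vertex normal, texture coordinates
            meshes[-1]["vertices"].append(((x, y, z), (nx, ny, nz), (u, v)))
        elif section == ".index" and len(fields) == 4 and meshes:
            a, b, c, _unknown = map(int, fields)  # 4th value observed as -1
            meshes[-1]["triangles"].append((a, b, c))
        # Other sections (for example .Brdf) are skipped in this sketch.
    return meshes


sample = """.Faceset 1
.Vertex
0 0 0 0 0 1 0 0
1 0 0 0 0 1 1 0
0 1 0 0 0 1 0 1
.Index
0 1 2 -1
.Brdf
// Ambient color
0.431 0.431 0.431
1 1 1 2
"""
meshes = parse_vtx(sample)
print(len(meshes), meshes[0]["triangles"])
```

Each mesh keeps its own vertex list, so the indices in a .Index section refer only to the .Vertex section immediately before it.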
Have you considered using the .obj file format? It is text-based and is not extremely difficult to parse or understand. There is quite a bit of information about it online.
I am going to add that, after a few hours inspection, the vtx export in Anim8or seems to be broken. I experienced the same problem as you did that the pieces were not located properly. My assumption would be that anim8or exports these objects using the local coordinates for each mesh and not accounting for transformations that have been applied. I do also note that it will not IMPORT the vtx file...
Based on some googling, it seems you're at the wrong end of the pipeline. As I understand it: A VTX file is a Valve Proprietary File Format that is the result of a set of steps.
The final output of Studiomdl for each Half-Life model is a group of files in the gamedirectory/models folder ready to be used by the Game Engine:
an .MDL file which defines the structure of the model along with animation, bounding box, hit box, material, mesh and LOD information,
a .VVD file which stores position independent flat data for the bone weights, normals, vertices, tangents and texture coordinates used by the MDL,
currently three separate types of VTX file: .sw.vtx (Software), .dx80.vtx (DirectX 8.0) and .dx90.vtx (DirectX 9.0) which store hardware optimized material, skinning and triangle strip/fan information for each LOD of each mesh in the MDL,
often a .PHY file containing a rigid or jointed (ragdoll) collision model, and
sometimes a .ANI file for To do: something to do with model animations
Valve
Now, the Valve Source SDK may have some utilities in it to read VTX files (it seems to have the ability to make them, anyway). Some people may have made third-party tools or have code to read them, but these are likely not to work on all files, simply because they are third-party readers of a proprietary format. I also found this post, which might help if you haven't seen it before.
