Wrong assignment of category_id to images in COCO - Roboflow

I am annotating images for a test dataset.
When I export the images and JSON to COCO, the assignment of category_id to the images is wrong.
When I search the JSON file for category_id: 4, I only find 12 annotations, but I annotated 42 images with this category.
I can't find the problem. Looking through the images back in Roboflow, I do find my 42 images. Is something going wrong when converting to COCO, or is there another reason for this?
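One quick way to double-check what actually got exported is to count the annotations per category directly in the JSON. A minimal sketch, assuming the export is named _annotations.coco.json (adjust the file name to your export):

    import json
    from collections import Counter

    # Hypothetical file name; use the JSON file from your COCO export.
    with open("_annotations.coco.json") as f:
        coco = json.load(f)

    # Count annotations per category_id and print them next to the category names.
    counts = Counter(ann["category_id"] for ann in coco["annotations"])
    id_to_name = {cat["id"]: cat["name"] for cat in coco["categories"]}
    for cat_id in sorted(id_to_name):
        print(cat_id, id_to_name[cat_id], counts.get(cat_id, 0))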
Best,
Jean

Related

Any ideas on why my coreml model created with turicreate isn't working?

Pretty much brand new to ML here. I'm trying to create a hand-detection CoreML model using turicreate.
The dataset I'm using is from https://github.com/aurooj/Hand-Segmentation-in-the-Wild , which provides images of hands from an egocentric perspective, along with masks for the images. I'm following the steps in turicreate's "Data Preparation" (https://github.com/apple/turicreate/blob/master/userguide/object_detection/data-preparation.md) step-by-step to create the SFrame. Checking the contents of the variables throughout this process, there doesn't appear to be anything wrong.
Following data preparation, I follow the steps in the "Introductory Example" section of https://github.com/apple/turicreate/tree/master/userguide/object_detection
I get the first hint of an error when turicreate is performing the iterations to create the model: there doesn't appear to be any loss at all, which doesn't seem right.
After the model is created, I try to test it with a test_data portion of the SFrame. The results of these predictions are just empty arrays though, which is obviously not right.
After exporting the model as a CoreML .mlmodel and trying it out in an app, it is unable to recognize anything (not surprisingly).
Me being completely new to model creation, I can't figure out what might be wrong. The dataset seems quite accurate to me. The only changes I made to the dataset were that some of the masks didn't have explicit file extensions (they are PNGs), so I added the .png extension. I also renamed the images to follow turicreate's tutorial format (e.g. vid4frame025.image.png and vid4frame025.mask.0.png). Again, the SFrame creation process using this data seems correct at each step. I was able to follow the process with turicreate's tutorial dataset (bikes and cars) successfully. Any ideas on what might be going wrong?
I found the problem, and it basically stemmed from my unfamiliarity with Python.
In one part of the Data Preparation section, after creating bounding boxes from the mask images, each annotation is assigned a 'label' indicating the type of object the annotation is meant to represent. My data had a different name format than the tutorial's data, so rather than each annotation having 'label': 'bike', my annotations had 'label': 'vid4frame25', 'label': 'vid4frame26', etc.
Correcting this such that each annotation has 'label': 'hand' seems to have corrected this (or at least it's creating a legitimate-seeming model so far).
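For reference, the fix amounts to rewriting the 'label' field of every bounding-box dict in the annotations column. A minimal sketch, assuming the SFrame from the data-preparation guide with an 'annotations' column (the apply call is shown only as a comment):

    def relabel_annotations(annotations, new_label='hand'):
        """Give every bounding-box dict in one row the same class label."""
        return [dict(ann, label=new_label) for ann in annotations]

    # 'data' is the SFrame built during data preparation; each row of
    # 'annotations' is a list of {'coordinates': {...}, 'label': ...} dicts.
    # data['annotations'] = data['annotations'].apply(relabel_annotations)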

NiftyNet data organization

I want to use NiftyNet to implement deep learning for medical image processing. However, there is one thing I haven't figured out regarding the data input: how does it join multi-modality images? I saw the BRATS2017 demo; it seems to use 4 different modalities, and in the configuration file they just include the directories of the images and claim it will "concatenate" them. But I want to know more: since those images are 3D, how are they concatenated? [slice1-30]:[slice1-30].. or [slice1, slice1, slice1 ...]:[slice2, slice2, slice2...]?
And can we control the data organization part? If so, which file should I modify?
Any suggestion would be greatly appreciated!
In this case, the 3D images are concatenated in an additional dimension. You control the order they're concatenated in by specifying the order of files to load in the *.ini files.
However, as long as you're consistent, it shouldn't matter what order the modalities go in.
The images are concatenated in the channel dimension. For 2D images, the dimensions are NSSC: batch size, 2 spatial dimensions, then channel. For 3D images, the dimensions are NSSSC: batch size, 3 spatial dimensions, then channel.
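To make the shapes concrete, here is a small NumPy sketch (not NiftyNet code, just the shape arithmetic) for four hypothetical 3D modalities of size 240x240x155, roughly the BRATS volume size:

    import numpy as np

    # Four modality volumes (e.g. T1, T1c, T2, FLAIR), each a full 3D image.
    t1, t1c, t2, flair = (np.zeros((240, 240, 155), dtype=np.float32) for _ in range(4))

    # The modalities are stacked along a new channel axis, not interleaved slice by slice.
    volume = np.stack([t1, t1c, t2, flair], axis=-1)   # shape (240, 240, 155, 4)

    # With a batch dimension in front this matches the NSSSC layout described above.
    batch = volume[np.newaxis, ...]                    # shape (1, 240, 240, 155, 4)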

Feeding images to a neural network

I have a very basic question, so kindly bear with me.
The task I'm trying to do is classify 12 labels. There are 12 folders, each of which has about 300-400 images, which I plan to feed to a network. I'm not exactly sure how to go about reading the images in these 12 folders; I know I have to convert them into arrays. What I currently have in mind is to create 12 assignment variables (one for each label) and read each image as an array. Does this make sense, or is there a better way to do this?
Thanks in advance
Read all the images in a folder and assign the same class label to each of them; do the same for each folder and add the images to a global list. At the end you get one big collection in which each item holds the image data (an array) and the corresponding label; this way you get about 3600 (12 * 300) items you can use for training. A sample item looks like [image array, class label].
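A minimal sketch of that approach in Python, assuming the 12 folders live under a directory called data/ and the folder names serve as the class labels (Pillow and NumPy are used here; the image size is arbitrary):

    import os
    import numpy as np
    from PIL import Image

    def load_dataset(root_dir, image_size=(128, 128)):
        """Read every image in each class folder and return (images, labels)."""
        images, labels = [], []
        class_names = sorted(os.listdir(root_dir))          # 12 folder names -> 12 classes
        for class_index, class_name in enumerate(class_names):
            class_dir = os.path.join(root_dir, class_name)
            for file_name in os.listdir(class_dir):
                img = Image.open(os.path.join(class_dir, file_name))
                img = img.convert('RGB').resize(image_size)
                images.append(np.asarray(img, dtype=np.float32) / 255.0)
                labels.append(class_index)
        return np.stack(images), np.array(labels)

    # X, y = load_dataset('data/')   # X: (N, 128, 128, 3), y: (N,)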

Why is there a discrepancy in the imagenet dataset labels?

Are the labels used for training and the ones used for validation the same? I thought they should be; however, there seems to be a discrepancy in the labels that are available online. When I downloaded the ImageNet 2012 labels for its validation data from the official website, I got labels that start with kit_fox as the first label, which matches the exact validation images of the 2012 dataset I downloaded from the official website. This is an example of those labels: https://gist.github.com/aaronpolhamus/964a4411c0906315deb9f4a3723aac57
However, for almost all the pretrained models, including those trained by Google, the ImageNet labels used for training actually start with tench, tinca tinca instead. See here: https://gist.github.com/yrevar/942d3a0ac09ec9e5eb3a
Why is there such a huge discrepancy? Where did the 'tinca tinca' kind of labels come from?
If we use the first label mapping, which corresponds to the actual validation images, we face another problem: 2 classes ("crane" and "maillot") are duplicated, i.e. they have the same name but refer to different things - for "crane", the mechanical crane versus the crane bird - resulting in 100 images in two of the classes instead of the supposed 50. If we do not use the first mapping, where is a reliable source of validation images that corresponds to the second label mapping?
I had the same problem with my fine-tuning. You can solve it by changing the class names (tench, tinca tinca, etc.) to the corresponding synset numbers. You can find the mapping here.
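A minimal sketch of that remapping, assuming the mapping is available as a text file (hypothetically synset_words.txt) with one 'n01440764 tench, Tinca tinca' entry per line:

    def load_synset_mapping(path='synset_words.txt'):
        """Return an ordered list of (synset_id, human_readable_name) pairs."""
        mapping = []
        with open(path) as f:
            for line in f:
                synset_id, name = line.strip().split(' ', 1)
                mapping.append((synset_id, name))
        return mapping

    # pairs = load_synset_mapping()
    # pairs[0]  -> ('n01440764', 'tench, Tinca tinca')
    # Keying classes on the synset id rather than the name also sidesteps the
    # duplicate-name problem with 'crane' and 'maillot'.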

Image Averaging and Saving output

I'm planning to process quite a large number of images and would like to average every 5 consecutive images. My images are saved in the .dm4 file format.
Essentially, I want to produce a single averaged image output for each 5 images that I can save. So for instance, if I had 400 images, I would like to get 80 averaged images that would represent the 400 images.
I'm aware that there's the Running Z Projector plugin but it does a running average and doesn't give me the reduced number of images I'm looking for. Is this something that has already been done before?
Thanks for the help!
It looks like the Image>Stacks>Tools>Grouped Z_Projector does exactly what you want.
I found it by opening the command finder ('L') and filtering on "project".
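If you would rather script it than use the plugin, the grouped averaging itself is straightforward. A minimal NumPy sketch (reading .dm4 files would need an extra library such as hyperspy, shown only as a hypothetical comment):

    import numpy as np

    def average_groups(stack, group_size=5):
        """Average every group_size consecutive frames of an image stack.

        stack has shape (n_frames, height, width); 400 frames with
        group_size=5 give 80 averaged frames.
        """
        n_frames, height, width = stack.shape
        n_groups = n_frames // group_size
        grouped = stack[:n_groups * group_size].reshape(n_groups, group_size, height, width)
        return grouped.mean(axis=1)

    # Hypothetical loading step for .dm4 files, e.g. with hyperspy:
    #   import hyperspy.api as hs
    #   stack = np.stack([hs.load(p).data for p in sorted(paths)])
    # averaged = average_groups(stack)   # shape (80, height, width) for 400 inputs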
