OpenCV Haar Cascade Creation - opencv

I want to try create my own .xml file for my graduation project with this reference.
But I have a problem which stage 6 doesn't work.It gives error such as:
Traceback (most recent call last):
File "./tools/mergevec.py", line 170, in <module>
merge_vec_files(vec_directory, output_filename)
File "./tools/mergevec.py", line 120, in merge_vec_files
val = struct.unpack('<iihh', content[:12])
TypeError: a bytes-like object is required, not 'str'
I have found a solution which says find 0 size vector files and delete them.
But, I don't know which vector files are 0 size and how I can detect them.
Can you help about this please?

I was able to solve my problem when i changed it:
for f in files:
with open(f, 'rb') as vecfile:
content = ''.join(str(line) for line in vecfile.readlines())
data = content[12:]
outputfile.write(data)
except Exception as e:
exception_response(e)
for it:
for f in files:
with open(f, 'rb') as vecfile:
content = b''.join((line) for line in vecfile.readlines())
outputfile.write(bytearray(content[12:]))
except Exception as e:
exception_response(e)
and like before i changed it:
content = ''.join(str(line) for line in vecfile.readlines())
for it:
content = b''.join((line) for line in vecfile.readlines())
because it was waiting for some str, and now it is able to receive the binary archives that we're in need.
:)

Try following this guide. It's more recent.

Related

How to address a Snakemake error at the stage of DAG computation?

I'm running in to an error with my Snakemake variant identification pipeline, when the original DAG of jobs is built. I believe this is a memory issue; when I test with a short list of input files, the DAG is constructed without issue, however, when I try with 300+ input paired-fastq, I receive the following error:
Building DAG of jobs...
Traceback (most recent call last):
File "/home//.conda/envs/snakemake/lib/python3.6/site-packages/snakemake/__init__.py", line 633, in snakemake
keepincomplete=keep_incomplete,
File "/home//.conda/envs/snakemake/lib/python3.6/site-packages/snakemake/workflow.py", line 568, in execute
dag.check_incomplete()
File "/home//.conda/envs/snakemake/lib/python3.6/site-packages/snakemake/dag.py", line 281, in check_incomplete
incomplete = self.incomplete_files
File "/home//.conda/envs/snakemake/lib/python3.6/site-packages/snakemake/dag.py", line 402, in incomplete_files
filterfalse(self.needrun, self.jobs),
File "/home/k/.conda/envs/snakemake/lib/python3.6/site-packages/snakemake/dag.py", line 399, in <genexpr>
job.output
File "/home//.conda/envs/snakemake/lib/python3.6/site-packages/snakemake/persistence.py", line 205, in incomplete
return any(map(lambda f: f.exists and marked_incomplete(f), job.output))
File "/home//.conda/envs/snakemake/lib/python3.6/site-packages/snakemake/persistence.py", line 205, in <lambda>
return any(map(lambda f: f.exists and marked_incomplete(f), job.output))
File "/home//.conda/envs/snakemake/lib/python3.6/site-packages/snakemake/persistence.py", line 203, in marked_incomplete
return self._read_record(self._metadata_path, f).get("incomplete", False)
File "/home//.conda/envs/snakemake/lib/python3.6/site-packages/snakemake/persistence.py", line 322, in _read_record_cached
return self._read_record_uncached(subject, id)
File "/home//.conda/envs/snakemake/lib/python3.6/site-packages/snakemake/persistence.py", line 328, in _read_record_uncached
return json.load(f)
File "/home//.conda/envs/snakemake/lib/python3.6/json/__init__.py", line 299, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "/home//.conda/envs/snakemake/lib/python3.6/json/__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "/home//.conda/envs/snakemake/lib/python3.6/json/decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/home//.conda/envs/snakemake/lib/python3.6/json/decoder.py", line 357, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
I'm not sure how to resolve this - if this is a known bug or if there is a way to define my pipeline to build a less complex DAG? I am including the first section of my Snakemake file as well. I use the rule all to define all desired output files.
################################
#### Mtb bwa/GATK Snakemake ####
################################
import numpy as np
from collections import defaultdict
import pandas as pd
samples_df = pd.read_table('config/tgen_samples2a.tsv',sep = ',').set_index("sample", drop=False)
sample_names = list(samples_df['sample'])
batch_names = list(samples_df['batch'])
#print(sample_names)
# fastq1 input function definition
def fq1_from_sample(wildcards):
return samples_df.loc[wildcards.sample, "fastq_1"]
# fastq2 input function definition
def fq2_from_sample(wildcards):
return samples_df.loc[wildcards.sample, "fastq_2"]
# Define config file. Stores sample names and other things.
configfile: "config/config.yaml"
# Define a rule for running the complete pipeline.
rule all:
wildcard_constraints:
batch="IS-.+"
input:
trim = expand(['results/{batch}/{samp}/trim/{samp}_trim_1.fq.gz'], zip, samp=sample_names,batch=batch_names),
kraken=expand('results/{batch}/{samp}/kraken/{samp}_trim_kr_1.fq.gz', zip, samp=sample_names,batch=batch_names),
bams=expand('results/{batch}/{samp}/bams/{samp}_{mapper}_{ref}_sorted.bam', zip, samp=sample_names,batch=batch_names, ref = config['ref']*len(sample_names), mapper = config['mapper']*len(sample_names)), # When using zip, need to use vectors of equal lengths for all wildcards.
per_samp_run_stats = expand('results/{batch}/{samp}/stats/{samp}_{mapper}_{ref}_combined_stats.csv', zip, samp=sample_names,batch=batch_names, ref = config['ref']*len(sample_names), mapper = config['mapper']*len(sample_names)),
amr_stats=expand('results/{batch}/{samp}/stats/{samp}_{mapper}_{ref}_amr.csv', samp=sample_names,batch=batch_names, ref=config['ref'], mapper=config['mapper']),
cov_stats=expand('results/{batch}/{samp}/stats/{samp}_{mapper}_{ref}_cov_stats.txt', samp=sample_names,batch=batch_names, ref=config['ref'], mapper=config['mapper']),
all_sample_stats=expand('results/{batch}/stats/combined_per_run_sample_stats.csv',batch = batch_names),
vcfs=expand('results/{batch}/{samp}/vars/{samp}_{mapper}_{ref}_{caller}_qfilt.vcf.gz', samp=sample_names,batch=batch_names, ref=config['ref'], mapper=config['mapper'], caller = config['caller']),
ann_vcfs=expand('results/{batch}/{samp}/vars/{samp}_{mapper}_{ref}_gatk_ann.vcf.gz', samp=sample_names,batch=batch_names, ref=config['ref'], mapper=config['mapper'], caller = config['caller']),
fastas=expand('results/{batch}/{samp}/fasta/{samp}_{mapper}_{ref}_{caller}_{filter}.fa', samp=sample_names,batch=batch_names, ref=config['ref'], mapper=config['mapper'], caller = config['caller'], filter=config['filter']),
profiles=expand('results/{batch}/{samp}/stats/{samp}_{mapper}_{ref}_lineage.csv', samp=sample_names,batch=batch_names, ref=config['ref'], mapper=config['mapper'])
# Trim reads for quality.
rule trim_reads:
input:
p1=fq1_from_sample,
p2=fq2_from_sample
output:
trim1='results/{batch}/{sample}/trim/{sample}_trim_1.fq.gz',
trim2='results/{batch}/{sample}/trim/{sample}_trim_2.fq.gz'
log:
'results/{batch}/{sample}/trim/{sample}_trim_reads.log'
shell:
'{config[scripts_dir]}trim_reads.sh {input.p1} {input.p2} {output.trim1} {output.trim2} &>> {log}'
# Filter reads taxonomically with Kraken.
rule taxonomic_filter:
input:
trim1='results/{batch}/{samp}/trim/{samp}_trim_1.fq.gz',
trim2='results/{batch}/{samp}/trim/{samp}_trim_2.fq.gz'
output:
kr1='results/{batch}/{samp}/kraken/{samp}_trim_kr_1.fq.gz',
kr2='results/{batch}/{samp}/kraken/{samp}_trim_kr_2.fq.gz',
kraken_report='results/{batch}/{samp}/kraken/{samp}_kraken.report',
kraken_stats = 'results/{batch}/{samp}/kraken/{samp}_kraken_stats.csv'
log:
'results/{batch}/{samp}/kraken/{samp}_kraken.log'
threads: 8
shell:
'{config[scripts_dir]}run_kraken.sh {input.trim1} {input.trim2} {output.kr1} {output.kr2} {output.kraken_report} &>> {log}'
Thank you in advance for help using Snakemake!
All the best,
I kind of doubt memory is an issue. 300+ is not much, especially if each of them is processed independently of the others.
Try to start from the subset of samples that you say worked and gradually increase it until you see the problem appearing. Perhaps you have some funny value in your sample sheet or in your config? json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0) hints at something like that in my impression.
The answer was from #TroyComi, above: after deleting the .snakemake directory, the issue was resolved. Thank you!

Gensim: error loading pretrained vectors No such file or directory: 'word2vec.kv.vectors.npy'

I am trying to load a Pretrained word2vec embeddings that is in gensim keyedvector 'word2vec.kv'
pretrained = KeyedVectors.load(args.pretrained mmap = 'r')
where arg.pretrained is "/ptembs/word2vec.kv"
and iam getting this error
File "main.py", line 60, in main
pretrained = KeyedVectors.load(args.pretrained, mmap = 'r')
File "C:\Users\ASUS\anaconda3\lib\site-packages\gensim\models\keyedvectors.py", line 1553, in load
model = super(WordEmbeddingsKeyedVectors, cls).load(fname_or_handle, **kwargs)
File "C:\Users\ASUS\anaconda3\lib\site-packages\gensim\models\keyedvectors.py", line 228, in load
return super(BaseKeyedVectors, cls).load(fname_or_handle, **kwargs)
File "C:\Users\ASUS\anaconda3\lib\site-packages\gensim\utils.py", line 436, in load obj._load_specials(fname, mmap, compress, subname)
File "C:\Users\ASUS\anaconda3\lib\site-packages\gensim\utils.py", line 478, in _load_specials
val = np.load(subname(fname, attrib), mmap_mode=mmap)
File "C:\Users\ASUS\anaconda3\lib\site-packages\numpy\lib\npyio.py", line 417, in load
fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: 'ptembs/word2vec.kv.vectors.npy'
i dont understand why it need word2vec.kv.vectors.npy file ? and i dont have it.
Any idea how to solve this problem?
gensim version 3.8.3
tried it on 4.1.2 also same error.
Where did you get the file 'word2vec.kv'?
If loading that file triggers an error mentioning a 2nd file by name, then that 2nd file should've been created alongside 'word2vec.kv' when it was 1st saved using a .save() operation.
That other file needs to be kept alongside 'word2vec.kv' in order for 'word2vec.kv' to be .load()ed again in the future.

Drake Visualizer : Unknown file extension in readPolyData when using .dae file

I am trying to add a custom mesh (a torus) .dae file for collision and visual to my .sdf model.
When I run my program, drake visualizer gives the following error
File "/opt/drake/lib/python2.7/site-packages/director/lcmUtils.py", line 119, in handleMessage
callback(msg)
File "/opt/drake/lib/python2.7/site-packages/director/drakevisualizer.py", line 352, in onViewerLoadRobot
self.addLinksFromLCM(msg)
File "/opt/drake/lib/python2.7/site-packages/director/drakevisualizer.py", line 376, in addLinksFromLCM
self.addLink(Link(link), link.robot_num, link.name)
File "/opt/drake/lib/python2.7/site-packages/director/drakevisualizer.py", line 299, in __init__
self.geometry.extend(Geometry.createGeometry(link.name + ' geometry data', g))
File "/opt/drake/lib/python2.7/site-packages/director/drakevisualizer.py", line 272, in createGeometry
polyDataList, visInfo = Geometry.createPolyDataFromFiles(geom)
File "/opt/drake/lib/python2.7/site-packages/director/drakevisualizer.py", line 231, in createPolyDataFromFiles
polyDataList = [ioUtils.readPolyData(filename)]
File "/opt/drake/lib/python2.7/site-packages/director/ioUtils.py", line 25, in readPolyData
raise Exception('Unknown file extension in readPolyData: %s' % filename)
Exception: Unknown file extension in readPolyData: /my_path/model.dae
Since prius.sdf also uses prius.dae, I assume this is possible. What am I doing wrong?
tl;dr drake_visualizer doesn't load dae files. If you put a similarly named .obj file in the same folder it will load that (and you can leave your sdf file still referencing the dae file).
Long answer:
drake_visualizer has a very specific, arbitrary protocol for loading files. Given an arbitrary file name (e.g., my_geometry.dae) it will
Strip off the extension.
Try the following files (in order), loading the first one it finds:
my_geometry.vtm
my_geometry.vtp
my_geometry.obj
original extension.
It can load: vtm, vtp, ply, obj, and stl files.
The worst thing is if you have both a vtp and an obj file in the same folder with the same name and you specify the obj, it'll still favor the vtp file.

keyerror when trying to load data

I was trying to parse my dataset with the images and annotations with the code below. I am using anaconda with Spyder IDE in a windows machine:
DATA_DIR = 'input'
# Directory to save logs and trained model
ROOT_DIR = 'working'
train_dicom_dir = os.path.join(DATA_DIR, 'stage_1_train_images')
test_dicom_dir = os.path.join(DATA_DIR, 'stage_1_test_images')
print(train_dicom_dir)
print(test_dicom_dir)
def get_dicom_fps(dicom_dir):
dicom_fps = glob.glob(dicom_dir+'/'+'*.jpg')
return list(set(dicom_fps))
def parse_dataset(dicom_dir, anns):
image_fps = get_dicom_fps(dicom_dir)
image_annotations = {fp: [] for fp in image_fps}
for index, row in anns.iterrows():
fp = os.path.join(dicom_dir, row['patientId']+'.jpg')
image_annotations[fp].append(row)
return image_fps, image_annotations
image_fps, image_annotations = parse_dataset(train_dicom_dir, anns=anns)
On running the code, i get the following error:
Traceback (most recent call last):
File "<ipython-input-21-fe2564bb2360>", line 1, in <module>
image_fps, image_annotations = parse_dataset(train_dicom_dir, anns=anns)
File "<ipython-input-12-a5d26437db38>", line 6, in parse_dataset
image_annotations[fp].append(row)
KeyError: 'input\\stage_1_train_images\\1.2.276.0.7230010.3.1.4.8323329.10001.1517874346.163716.jpg'
My current working directory is'C:\Users\rajaramans2\codes\input'
It seems like you are trying to process images in .dcm (DICOM format), not .jpg, so your code should work if you replace “.jpg” for “.dcm”, I hope this helps.

\[Errno -101\] NetCDF: HDF error when opening netcdf file

I have this error when opening my netcdf file.
The code was working before.
How do I fix this ?
Traceback (most recent call last):
File "", line 1, in
...
File "file.py", line 71, in gather_vgt
return xr.open_dataset(filename)
File "/.../lib/python3.6/site-packages/xarray/backends/api.py", line
286, in open_dataset
autoclose=autoclose)
File "/.../lib/python3.6/site-packages/xarray/backends/netCDF4_.py",
line 275, in open
ds = opener()
File "/.../lib/python3.6/site-packages/xarray/backends/netCDF4_.py",
line 199, in _open_netcdf4_group
ds = nc4.Dataset(filename, mode=mode, **kwargs)
File "netCDF4/_netCDF4.pyx", line 2015, in
netCDF4._netCDF4.Dataset.init
File "netCDF4/_netCDF4.pyx", line 1636, in
netCDF4._netCDF4._ensure_nc_success
OSError: [Errno -101] NetCDF: HDF error: b'file.nc'
When I try to open the same netcdf file with h5py I get this error :
OSError: Unable to open file (file locking disabled on this file
system (use HDF5_USE_FILE_LOCKING environment variable to override),
errno = 38, error message = '...')
You must be in this situation :
your HDF5 library has been updated (1.10.1) (netcdf uses HDF5 under the hood)
your file system does not support the file locking that the HDF5 library uses.
In order to read your hdf5 or netcdf files, you need set this environment variable :
HDF5_USE_FILE_LOCKING=FALSE
For references, this was introduced in HDF5 version 1.10.1,
Added a mechanism for disabling the SWMR file locking scheme.
The file locking calls used in HDF5 1.10.0 (including patch1)
will fail when the underlying file system does not support file
locking or where locks have been disabled. To disable all file
locking operations, an environment variable named
HDF5_USE_FILE_LOCKING can be set to the five-character string
'FALSE'. This does not fundamentally change HDF5 library
operation (aside from initial file open/create, SWMR is lock-free),
but users will have to be more careful about opening files
to avoid problematic access patterns (i.e.: multiple writers) >that the file locking was designed to prevent.
Additionally, the error message that is emitted when file lock
operations set errno to ENOSYS (typical when file locking has been
disabled) has been updated to describe the problem and potential
resolution better.
(DER, 2016/10/26, HDFFV-9918)
In my case, the solution as suggested by #Florian did not work. I found another solution, which suggested that the order in which h5py and netCDF4 are imported matters (see here).
And, indeed, the following works for me:
from netCDF4 import Dataset
import h5py
Switching the order results in the error as described by OP.

Resources