Moses tuning fails to run extractor

When I attempt to tune my Moses system (following the Moses baseline), it reaches the end of my tuning dataset (75k lines) and then exits with:
Executing: /home/alexm/Desktop/dissertation/models/cy-en/transsystem/mert-work/extractor.sh > extract.out 2> extract.err
Exit code: 127
ERROR: Failed to run '/home/alexm/Desktop/dissertation/models/cy-en/transsystem/mert-work/extractor.sh'. at ../../../mosesdecoder/scripts/training/mert-moses.pl line 1775.
It also exits without writing out the current tuning weights, losing hours of progress.

It seems like you're missing a file, or permission to access it. Exit code 127 typically means "command not found".
Are you missing the file /home/alexm/Desktop/dissertation/models/cy-en/transsystem/mert-work/extractor.sh, or is it perhaps not executable?
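A quick way to check, assuming the path from the error message (a diagnostic sketch, not a confirmed fix): verify that the script exists, is executable, and names a valid interpreter on its first line, since any of these failing can produce exit code 127:

ls -l /home/alexm/Desktop/dissertation/models/cy-en/transsystem/mert-work/extractor.sh    # does it exist, with the execute bit set?
head -n 1 /home/alexm/Desktop/dissertation/models/cy-en/transsystem/mert-work/extractor.sh    # does the shebang point at a real interpreter?
chmod +x /home/alexm/Desktop/dissertation/models/cy-en/transsystem/mert-work/extractor.sh    # add the execute bit if it was missing

If the script itself is fine, running it by hand in a shell will show which command inside it (for example, the mert extractor binary) is the one that cannot be found.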

Related

Cannot import HOL in Isabelle batch mode from Docker

I'm trying to use HOL in Isabelle in batch mode from Docker, but it can't seem to find HOL.
If I have this My.thy file
theory My
imports HOL
begin
end
and then run this to process the file in batch mode
docker run --rm -it -v $PWD/My.thy:/home/isabelle/My.thy makarius/isabelle:Isabelle2022_ARM process -T My
I get
*** No such file: "/home/isabelle/HOL.thy"
*** The error(s) above occurred for theory "Draft.HOL" (line 2 of "~/My.thy")
*** (required by "Draft.My")
Exception- TOPLEVEL_ERROR raised
However, I can import Main. In more detail, if I change My.thy to be
theory My
imports Main
begin
end
then running the same Docker command as above to run the batch process results in
Loading theory "Draft.My"
### theory "Draft.My"
### 0.039s elapsed time, 0.078s cpu time, 0.000s GC time
val it = (): unit
How can I import HOL in Isabelle's batch mode in Docker?
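For background (a general fact about Isabelle, not tested against this exact image): HOL is the name of a session, not of a theory, so imports HOL makes the loader look for a literal HOL.thy file next to My.thy, which is exactly what the error reports. Main is the entry-point theory of the HOL session, which is why the second variant loads:

theory My
  imports Main  (* Main, not HOL: HOL is a session name, and there is no HOL.thy to find *)
begin
end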

parallel: Error: Command line too long (68914 >= 65524) at input 0

Given a file with long lines, parallel fails to pass these lines as an argument to any command:
$> cat johny_long_lines.txt | parallel echo {}
parallel: Error: Command line too long (68906 >= 65524) at input 0: 2236439425|\x308286873082856fa003020102020c221ff03...
This gets more confusing when I see that the line is 68900 characters long:
$> cat johny_long_lines.txt | head -n 1 | wc -m
68900
while the maximum line length allowed by parallel is far longer than my input:
$> parallel --max-line-length-allowed
131049
Also, in case you think it's an execve limit, this might interest you:
$> getconf ARG_MAX
2097152
Any idea what I'm doing wrong here?
UPDATE
I figured out that the problem occurs with versions 20161222 and 20220522, but not with 20210822 (the version shipped with Ubuntu 22.04 LTS). Further inspection reveals that this line causes the problem:
# Usable len = maxlen - 3000 for wrapping, div 2 for hexing
int(($Global::minimal_command_line_length - 3000)/2);
Which I can confirm using --show-limits ((131063 - 3000) / 2 = 64031):
$> parallel --show-limits
[...]
Maximal size of command: 131063
Maximal usable size of command: 64031
This annoying feature does not exist in version 20210822, and my file goes through as expected.
Can this be disabled?
I got the message
parallel --show-limits
Maximal size of command: 131049
Maximal used size of command: 83906
and had some trouble finding the source of the 83906,
but then found it in
~/.parallel/tmp/sshlogin/$(hostname)/linelen
and had no clue how it was set to this small value.
The version of parallel was quite old, though:
GNU parallel 20180922
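Since parallel caches that measurement, one thing to try (a sketch based on the cache file found above) is to delete it so the limit is measured afresh, and then re-check the limits:

$> rm ~/.parallel/tmp/sshlogin/$(hostname)/linelen
$> parallel --show-limits

Independently of the cached value, passing the long lines on stdin with --pipe instead of as command-line arguments sidesteps the argument-length limit entirely (wc -c here stands in for your real command):

$> cat johny_long_lines.txt | parallel --pipe -N1 wc -c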

snakemake: MissingOutputException within docker

I am trying to run a pipeline inside a Docker container using Snakemake. I am having a problem using the sortmerna tool to produce the {sample}_merged_sorted_mRNA and {sample}_merged_sorted outputs from the control_merged.fq and treated_merged.fq input files.
Here is my Snakefile:
SAMPLES = ["control","treated"]

for smp in SAMPLES:
    print("Sample " + smp + " will be processed")

rule final:
    input:
        expand('/output/{sample}_merged.fq', sample=SAMPLES),
        expand('/output/{sample}_merged_sorted', sample=SAMPLES),
        expand('/output/{sample}_merged_sorted_mRNA', sample=SAMPLES),

rule sortmerna:
    input: '/output/{sample}_merged.fq',
    output: merged_file='/output/{sample}_merged_sorted_mRNA', merged_sorted='/output/{sample}_merged_sorted',
    message: """---SORTING---"""
    shell:
        '''
        sortmerna --ref /usr/share/sortmerna/rRNA_databases/silva-bac-23s-id98.fasta,/usr/share/sortmerna/rRNA_databases/index/silva-bac-23s-id98: --reads {input} --paired_in -a 16 --log --fastx --aligned {output.merged_file} --other {output.merged_sorted} -v
        '''
When running this I get:
Waiting at most 5 seconds for missing files.
MissingOutputException in line 57 of /input/Snakefile:
Missing files after 5 seconds:
/output/control_merged_sorted_mRNA
/output/control_merged_sorted
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /input/.snakemake/log/2018-11-05T091643.911334.snakemake.log
I tried increasing the latency with --latency-wait, but I get the same result. The funny thing is that two output files, control_merged_sorted_mRNA.fq and control_merged_sorted.fq, are produced, but the program still fails and exits. The version of Snakemake is 5.3.0. Any help?
Snakemake fails because the outputs declared by the rule sortmerna are not produced. This is not a latency problem; it is a problem with your output names.
Your rule sortmerna expects as output:
/output/control_merged_sorted_mRNA
and
/output/control_merged_sorted
but the program you are using (I know nothing about sortmerna) is apparently producing
/output/control_merged_sorted_mRNA.fq
and
/output/control_merged_sorted.fq
Check whether the values you pass to the --aligned and --other options should be the real names of the files produced, or just basenames to which the program appends a .fq suffix. If you are in the latter case, I suggest you use the following (note that rule final must then also request the .fq names):
rule final:
    input:
        expand('/output/{sample}_merged.fq', sample=SAMPLES),
        expand('/output/{sample}_merged_sorted.fq', sample=SAMPLES),
        expand('/output/{sample}_merged_sorted_mRNA.fq', sample=SAMPLES),

rule sortmerna:
    input:
        '/output/{sample}_merged.fq',
    output:
        merged_file='/output/{sample}_merged_sorted_mRNA.fq',
        merged_sorted='/output/{sample}_merged_sorted.fq'
    params:
        merged_file_basename='/output/{sample}_merged_sorted_mRNA',
        merged_sorted_basename='/output/{sample}_merged_sorted'
    message: """---SORTING---"""
    shell:
        """
        sortmerna --ref /usr/share/sortmerna/rRNA_databases/silva-bac-23s-id98.fasta,/usr/share/sortmerna/rRNA_databases/index/silva-bac-23s-id98: --reads {input} --paired_in -a 16 --log --fastx --aligned {params.merged_file_basename} --other {params.merged_sorted_basename} -v
        """

opencv_createsamples.exe crashes every time

Hi, I am trying to create samples using opencv_createsamples.exe, but it crashes every time. I've tried different builds, OpenCV 2 and OpenCV 3, and I don't know how to get past this. My command looks like this:
opencv_createsamples.exe -img C:/Users/dpach/Pictures/Interfejsy/img/1.jpg -maxxangle 15 -maxyangle 15 -maxzangle 1 -w 80 -h 40 -vec test.vec -bgtresh 0 -bgcolor 0 -show
The window that shows the samples opens, but after that I am informed that the program is not responding. Any ideas?
// EDIT
I've tried starting it from a pseudo-Unix bash, and then I get a Segmentation fault.
// EDIT2
It crashes right after printing Create training samples from single image applying distortions...
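One detail worth double-checking (an assumption about the cause, not a confirmed fix): opencv_createsamples spells the background-threshold option -bgthresh, while the command above passes -bgtresh, so the corrected invocation would be:

opencv_createsamples.exe -img C:/Users/dpach/Pictures/Interfejsy/img/1.jpg -maxxangle 15 -maxyangle 15 -maxzangle 1 -w 80 -h 40 -vec test.vec -bgthresh 0 -bgcolor 0 -show

Note also that the documentation gives the -maxxangle/-maxyangle/-maxzangle values in radians, so 15 is an unusually large rotation.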

OpenCL ptxas error

I'm launching a program of mine that uses OpenCV/OpenGL and used to work fine with no errors. Now when it starts I get this:
OpenCL program build log: -D LOCAL_SIZE_X=8 -D LOCAL_SIZE_Y=8 -D SPLIT_STAGE=1 -D N_STAGES=20 -D MAX_FACES=10000 -D LBP
ptxas application ptx input, line 637; error : Instruction '{atom,red}.shared' requires .target sm_12 or higher
ptxas application ptx input, line 884; error : Instruction '{atom,red}.shared' requires .target sm_12 or higher
ptxas fatal : Ptx assembly aborted due to errors
(Mac OS X 10.11)
and then my program continues running normally. I have no idea what might be causing this, whether it is relevant to my code, or where to look. The very same code used to be fine. Is it something related to the OpenGL wrapper libraries I use? How serious is this? Could somebody please explain this error to me?
EDIT
I managed to identify the code which causes this error:
face_cascade.detectMultiScale(frame_gray, faces, 1.1, 2, 0, cv::Size(80, 80));
this is essentially a call to cv::CascadeClassifier::detectMultiScale with a cv::Mat, a std::vector<cv::Rect>, and a cv::Size as arguments.
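A plausible reading of the log (not verified on this machine): detectMultiScale tries to run its LBP cascade kernel through OpenCV's OpenCL path, the kernel is compiled via ptxas for a GPU target (sm_12) that lacks the required shared-memory atomics, and OpenCV falls back to the CPU implementation, which is why the program keeps running. One way to test this theory is to turn OpenCL off before the call and see whether the messages disappear. A minimal sketch, assuming OpenCV 3.x, where cv::ocl::setUseOpenCL lives in opencv2/core/ocl.hpp:

#include <opencv2/core/ocl.hpp>

int main()
{
    // Disable OpenCV's OpenCL code paths; detectMultiScale then uses the CPU
    // implementation and the OpenCL kernel (and its ptxas errors) is never built.
    cv::ocl::setUseOpenCL(false);
    // ... load the cascade, grab frames, call detectMultiScale as before ...
    return 0;
}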
