Sphinx replaces numeric anchors with '#id1', '#id2', '#id3', ...

I named my sections using only numbers in an rst file, but after building, the anchors are replaced with '#id1', '#id2', '#id3', ...
I need the anchors to be the numbers themselves, but I don't know how to do that.
Here is my rst file.
Actual Code
*************
Screen(HTS)
*************
Quote
=============
1101
------------------
.. raw:: html
   :file: _static/1101.html

1151
------------------
.. raw:: html
   :file: _static/1151.html

1152
------------------
.. raw:: html
   :file: _static/1152.html

1103
------------------
.. raw:: html
   :file: _static/1103.html

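First of all, title underlines and overlines must be at least as long as the title text, and an overline must be the same length as its underline.
Unfortunately, you want anchors that start with numbers, and Sphinx does not support this: the IDs that Docutils generates from section titles must start with a letter, so a purely numeric title falls back to an auto-generated ID such as id1. You could instead use an explicit label that starts with a letter. See the Sphinx documentation on cross-referencing arbitrary locations with :ref:. Here's an example.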
***********
Screen(HTS)
***********

Quote
=====

.. _section-1101:

1101
----


How to split paired-end fastq files?

I have Illumina paired-end reads contained within one .fastq file, with forward reads marked '/1' and reverse reads marked '/2'.
I am using grep to pull out the individual reads and place them into 2 respective files (one for forward reads and one for reverse).
grep -A 3 "/1$" sample21_pe.unmapped.fq > sample21_1_rfa.fq
grep -A 3 "/2$" sample21_pe.unmapped.fq > sample21_2_rfa.fq
However, when I try to use the files (fastqc, assembly, etc.), they do not work. When running
fastqc I get the following error:
Failed to process file sample21_1_rfa.fq
uk.ac.babraham.FastQC.Sequence.SequenceFormatException: ID line didn't start with '#'
at uk.ac.babraham.FastQC.Sequence.FastQFile.readNext(FastQFile.java:134)
at uk.ac.babraham.FastQC.Sequence.FastQFile.next(FastQFile.java:105)
at uk.ac.babraham.FastQC.Analysis.AnalysisRunner.run(AnalysisRunner.java:76)
at java.lang.Thread.run(Thread.java:662)
But if you look at the files, the identifier does indeed start with a '#'. Any advice on why these files aren't working? I had originally converted .bam files into the .fastq files with
samtools bam2fq
Here are samples of each individual file:
merged .fastq
#HISEQ:534:CB14TANXX:4:1101:1091:2161/1
GAGAAGCTCGTCCGGCTGGAGAATGTTGCGCTTGCGGTCCGGAGAGGACAGAAATTCGTTGATGTTAACGGTGCGCTCGCCGCGGACGCTCTTGATGGTGACGTCGGCGTTGAGCGTGACGCACG
+
B/</<//B<BFF<FFFFFF/BFFFFFFB<BFFF<B/7FFF7B/B/FF/F/<<F/FFBFFFBBFFFBFB/FF<BBB<B/B//BBFFFFFFF/B/FF/B77B//B7B7F/7F###############
#HISEQ:534:CB14TANXX:4:1101:1091:2161/2
TGACGCCTGCCGTCAGGTAGGTTCTCCGCAGATCCGAAATCTCGCGACGCTCGGCGGCAACATCTGCCAGTCGTCCGTGGCGGGCGACGGTCTCGCGGCGTGCGTCACGCTCAACGCCGACGTAC
+
/B<B//F/F//B<///<FB/</F<<FFFFF<FFBF/FF<//FB/F//F7FBFFFF/B</7<F//<BB7/7BB7/B<F7BF<BFFFB7B#####################################
#HISEQ:534:CB14TANXX:4:1101:1637:2053/1
NGTTTACCATACAACAATCTTGCGACCTATTCAAATCATCTATATGCCTTATCAAGTTTTCATAGCTTTCAAGATTCTCAATTTCCTCACGTCTCGCTTTGCTCAACCTACAAAAACTCTCTTCT
+
#<<BBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFFFFFFFFFFFFFFFFFFB/BFBBFBB<<<<FFFFFFBB<FBFFBFF
#HISEQ:534:CB14TANXX:4:1101:1637:2053/2
TCGGTCGTTGGGAAAAGACCTGTGGTAAACATCCTACGCAAAAGCCATTGCGGTTACTCGTTCGTATGATTCTTGCATCAACTAATCAAGGCGATTGGGTTCTCGACCCATTTTGTGGAAGTTCG
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFBFFFFFB<FF<<BBFB
#HISEQ:534:CB14TANXX:4:1101:1792:2218/1
TCTATCGGCTGACCGATAAGCTGTCGCCTGCCGACCGTCCTGCCATGGGACGGCGCATCGCACAGCTCACCCTGGACTAACTCTCCAACACCATGATGCTGACACGCTCGGCAAAAACACCCGAT
+
<<B/<B</FF/<B/<//F<//FF<<<FF//</7/F<</FFF####################################################################################
#HISEQ:534:CB14TANXX:4:1101:1792:2218/2
TGCCGGAGGGCGTCGATGGTGGCATCGAGCTTTTTTGCCGAGCGTGTCAGCATGATGGTGTTGTAGAGATAGTCCATGGTGAGCTGTGCGATGCGCCGTTCCATGGCAGGACGGTCGGCAGGCGC
+
BBBBBFFFFFFFFBFFFBBFFFFFFFFFFFBBFFFF/FF<F7FF//F/FBB/FFBFFF/F7BFF<F/FFFFFFFFB/7BB<7BFFFFFFFFFFFFF<B///B/7B/7/B//77BB//7B/B7/B#
#HISEQ:534:CB14TANXX:4:1101:1903:2238/1
TATTCCAGCGACCGTTATAATCAAACTCAACTACATAGTCATTGCGGATTGCTTCAAGAAATTTTTTCCAGACTATTTCATCAATATTTATTTTGGGAACTGGTGCAACAGCAATTCTTTTTAAA
+
BBBBBFFFFFFFFFFFFFBFF/FFBFFBFFFFFFFF/FFFFFF<<FFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFBF/B/<B<B/FBF7/<FFFFFFF/BB/7///7FF<BFFF//B/FFF###
#HISEQ:534:CB14TANXX:4:1101:1903:2238/2
TAAGGTTGGAGAAGCAACAATTTACCGTGATATTGATTTGCTCCGAACATATTTTCATGCGCCACTCGAGTTTGACAGGGAGAAAGGCGGGTATTATTATTTTAATGAAAAATGGGATTTTGCCC
+
B<BBBFFFFFFF<FFFFFFFFFFFFFFFFFF/BFFFFFFF<<FF<F<FFF/FF/FFFFBFB</<//<B/////<<FFFFB/<F<BFF/7/</7/7FB/B/BFF<//7BFF###############
#HISEQ:534:CB14TANXX:4:1101:2107:2125/1
TGTAGTATTTATTACATCATATAGAATGTAGATATAAAAGATGAAAAAGCTATAATTTCTTTGATAATATAAGGAGGGAATAACACTATGAGGATTGATAGAGCAGGAATCGAGGATTTGCCGAT
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFF/FFFFFFFFFFFFFFFFFFFFFFFFFFFFBBBFFFFFFFBB<FBB7BFF#
#HISEQ:534:CB14TANXX:4:1101:2107:2125/2
TACCACTATCGGCAAATCCTCGATTCCTGCTCTATCAATCCTCATAGTGTTATTCCCTCCTTATATTATCAAAGAAATTATAGCTTTTTCATCTTTTATATCTACATTCTATATGATGTAATAAA
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFFFFFFFFFBFBFFFFFFFBBFFFFFFFBF7F/B/BBF7/</FF/77F/77BB#
#HISEQ:534:CB14TANXX:4:1101:2023:2224/1
TCACCAGCTCGGCACGCTTGTCCTTGACCTCCTGCTCGATCTGACCGTCCATCTTGGCTGCCACGGTGTTCTCCTCGGCGGAGTAGGCAAAGCAGCCCAGACGGTCGAACTGTATCTCCTTGACA
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFFFFFFFFB<<B7BBFBFFF<FFBBFFFBF/7B/<B<
#HISEQ:534:CB14TANXX:4:1101:2023:2224/2
TCGAGGATCTGTGCAACTTTGTCAAGGAGATACAGTTCGACCGTCTGGGCTGCTTTGCCTACTCCGCCGAGGAGAACACCGTGGCAGCCAAGATGGACGGTCAGATCGAGCAGGAGGTCAAGGAC
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFBFBFFFFFFFFFFFFFFFFFFFFFFFFFBBFFFFFFFFFFFFF<7BF/<<BB###
#HISEQ:534:CB14TANXX:4:1101:2038:2235/1
TTTATGCGAATGTAGAGTGGCTTCTCCACTGCCTCGGTGAAGCCCACGCGCGAGATGAGCGAATTAAGCTGCTTTGCAGTGAATTGCATTGCATATACACCTGCGTCGGCTTGAATACTTGTGCT
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFF//BFFFFFFFFFFFFF<B<BB###
#HISEQ:534:CB14TANXX:4:1101:2038:2235/2
AATCCGCTCGTGAAAGCTCCCGATAACGCCACAGTGAACACCGTGGAGTTCTCTGATACCGAAGATTTCGCACGCAGCACAAGTATTCAAGCCGACGCAGGTGTATATGCAATGCAATTCACTGC
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBBFFFFFFFFFFFFFFFFFFFFFFF
#HISEQ:534:CB14TANXX:4:1101:2271:2041/1
NACACTTGTCGATGATCTTGCCAAGCTGCTTCTTGCCCACCAGGAAGCCGATCTCCAGATCAAACTCGTGGCCGGGAACACTCCGGTCCACAAAGCCCAGGTCCTGGGGAATGGGCTCATCGTAG
+
#<</BB/F/BB/F<FFFFFFFFF/<BFFFFFFFF<<FFBFFFFFFBFBFBBB<<FFFFBFFF/<B/FFFFFFFFFFFFFFFFF<FB<<BFF77BFFF/<BFFFB<</BB</7BFFFB########
#HISEQ:534:CB14TANXX:4:1101:2271:2041/2
GACTCATCTACAATGAGCCCATTCCCCAGGACCTGGGCTTTGTGGACCGGAGTGTTCCCGGCCACGAGTTTGATCTGGAGATCGGCTTCCTGGTGGGCAAGAAGCAGCTTGGCAAGATCATCGCC
+
<<BBBFFF<F/BFFFBFBF<BFF<<F/FFFBFFFF<<FFFFBFFFFFFBFFF/<B<F/<</<FFF//FFFFF/<<F/B/B/7/FF<<FF/7B/BBB/7///7////<B/B/BB/B/B/B/7BB##
Example of forward reads after being pulled out and placed into their own .fastq file:
#HISEQ:534:CB14TANXX:4:1101:1091:2161/1
GAGAAGCTCGTCCGGCTGGAGAATGTTGCGCTTGCGGTCCGGAGAGGACAGAAATTCGTTGATGTTAACGGTGCGCTCGCCGCGGACGCTCTTGATGGTGACGTCGGCGTTGAGCGTGACGCACG
+
B/</<//B<BFF<FFFFFF/BFFFFFFB<BFFF<B/7FFF7B/B/FF/F/<<F/FFBFFFBBFFFBFB/FF<BBB<B/B//BBFFFFFFF/B/FF/B77B//B7B7F/7F###############
--
#HISEQ:534:CB14TANXX:4:1101:1637:2053/1
NGTTTACCATACAACAATCTTGCGACCTATTCAAATCATCTATATGCCTTATCAAGTTTTCATAGCTTTCAAGATTCTCAATTTCCTCACGTCTCGCTTTGCTCAACCTACAAAAACTCTCTTCT
+
#<<BBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFFFFFFFFFFFFFFFFFFB/BFBBFBB<<<<FFFFFFBB<FBFFBFF
--
#HISEQ:534:CB14TANXX:4:1101:1792:2218/1
TCTATCGGCTGACCGATAAGCTGTCGCCTGCCGACCGTCCTGCCATGGGACGGCGCATCGCACAGCTCACCCTGGACTAACTCTCCAACACCATGATGCTGACACGCTCGGCAAAAACACCCGAT
+
<<B/<B</FF/<B/<//F<//FF<<<FF//</7/F<</FFF####################################################################################
--
#HISEQ:534:CB14TANXX:4:1101:1903:2238/1
TATTCCAGCGACCGTTATAATCAAACTCAACTACATAGTCATTGCGGATTGCTTCAAGAAATTTTTTCCAGACTATTTCATCAATATTTATTTTGGGAACTGGTGCAACAGCAATTCTTTTTAAA
+
BBBBBFFFFFFFFFFFFFBFF/FFBFFBFFFFFFFF/FFFFFF<<FFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFBF/B/<B<B/FBF7/<FFFFFFF/BB/7///7FF<BFFF//B/FFF###
--
#HISEQ:534:CB14TANXX:4:1101:2107:2125/1
TGTAGTATTTATTACATCATATAGAATGTAGATATAAAAGATGAAAAAGCTATAATTTCTTTGATAATATAAGGAGGGAATAACACTATGAGGATTGATAGAGCAGGAATCGAGGATTTGCCGAT
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFF/FFFFFFFFFFFFFFFFFFFFFFFFFFFFBBBFFFFFFFBB<FBB7BFF#
--
#HISEQ:534:CB14TANXX:4:1101:2023:2224/1
TCACCAGCTCGGCACGCTTGTCCTTGACCTCCTGCTCGATCTGACCGTCCATCTTGGCTGCCACGGTGTTCTCCTCGGCGGAGTAGGCAAAGCAGCCCAGACGGTCGAACTGTATCTCCTTGACA
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFFFFFFFFB<<B7BBFBFFF<FFBBFFFBF/7B/<B<
--
#HISEQ:534:CB14TANXX:4:1101:2038:2235/1
TTTATGCGAATGTAGAGTGGCTTCTCCACTGCCTCGGTGAAGCCCACGCGCGAGATGAGCGAATTAAGCTGCTTTGCAGTGAATTGCATTGCATATACACCTGCGTCGGCTTGAATACTTGTGCT
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFF//BFFFFFFFFFFFFF<B<BB###
--
#HISEQ:534:CB14TANXX:4:1101:2271:2041/1
NACACTTGTCGATGATCTTGCCAAGCTGCTTCTTGCCCACCAGGAAGCCGATCTCCAGATCAAACTCGTGGCCGGGAACACTCCGGTCCACAAAGCCCAGGTCCTGGGGAATGGGCTCATCGTAG
+
#<</BB/F/BB/F<FFFFFFFFF/<BFFFFFFFF<<FFBFFFFFFBFBFBBB<<FFFFBFFF/<B/FFFFFFFFFFFFFFFFF<FB<<BFF77BFFF/<BFFFB<</BB</7BFFFB########
--
#HISEQ:534:CB14TANXX:4:1101:2678:2145/1
CTGTACATAGTACGTATTTGACGCCTGCGTCGATGTAGCGTTTGAGGAAGGGAAGCAGCGGTTCTGCAGAGTCCTCTTTCCATCCGTTGATGCTAATCATTCCGTTGCGTACATCCGCTCCGAGA
+
BBBBBFFFFFFF<FFF<FFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<BFFF7BFFFFFFFF<BBFFFFFFFFBBFBBB<FFBFFFFFFFFFFFFB<BFFFFFFBFB/BFFF####
--
#HISEQ:534:CB14TANXX:4:1101:2972:2114/1
CTCTGTGCCGATCCCTTTGCCTTTGCGTTTTGAGGAAAGGAAACCACCTTCTGGGTCGGTGAGGATAGTTCCGGTGAAGGTGTTGTCCACCGCCAGGCATAGGGAATAGCTGTCAGCCTTTGCTC
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFB/FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFBFFFF<FFFFFFFFFF<BFFFFF
--
#HISEQ:534:CB14TANXX:4:1101:2940:2222/1
CTAATTTTTTCATTATATTACTAATTTTGTAATTGGTAAAATATTATAATATCCTTGTACATTAAGACCCCAATAATCAGAAGAAGTAAAATTAATTCCTGCAACAGTTCTTAAATATCCATTAG
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FBFFFFFFFFFFFFFFF/FBFBFFBFFFFF/<F<FFFFFFFFFF<FFFFFFBFFFFFFFFF</FBFBBF<F/7//FFBFBBFFF/<7BF#
--
#HISEQ:534:CB14TANXX:4:1101:3037:2180/1
CGTCAGTTCCGCAACGATAAAGAGTTCCGCATTGCAGTCACCTGTACGCTGGTAGCCACCGGAACCGATGTCAAGCCGTTGGAGGTGGTGATGTTCATGCGCGACGTAGCTTCCGAGCCGTTATA
+
B/BBBBBFFFFFFF<FFBFFFFFF<FFFFBFFFFFFF<BBFFFFFFFFFFFFFFFFFBFF/FFFFBFFBFFFFBFF/7F/BFB/BBFFFFFFFFBFF<BBF<7BBFFFFFFBBFFF/B#######
--
#HISEQ:534:CB14TANXX:4:1101:3334:2171/1
ACCGATGTACATACCCGGACGGGTACGCACATGCTCCATATCGCTCAAGTGGCGGATATTGTCATCTGTATATTCTACAGGTTGCTCCTGAGGGGTATTTGCCAGTTCTTCGGCAGCACCCTTTT
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFBFFFFFFFFFFFBFFFFFFFFFF</<BFFFFFFFFBBFFFFFFBF</BB///BF<FFFFF<</<B
--
#HISEQ:534:CB14TANXX:4:1101:3452:2185/1
CGCAGACGGATTTGCTTGAAGTCCGTCTCATCGTATTCCGACAACTCATCGAGGAACACACGCTTGTATTGACTGATACCCTTGATTTTCTCCGGGTCGTCAAGACCACTGAAATCAATCTTGCC
+
BBBBBFFFF<FFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFBFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFBFFFFFBFFBFFFFFFFFFB/77B/FBBFFF/<FFF/77BBFFFBFFBBB
--
Any advice would be appreciated. Thanks!
In general, this operation is called deinterlace fastq or deinterleave fastq. The question already has the answer here:
deinterleave fastq file
https://www.biostars.org/p/141256/
I am copying it here, with minor reformatting for clarity:
paste - - - - - - - - < interleaved.fq \
| tee >(cut -f 1-4 | tr "\t" "\n" > read1.fq) \
| cut -f 5-8 | tr "\t" "\n" > read2.fq
This command converts the interlaced fastq file into 8-column tsv file, cuts columns 1-4 (read 1 lines), changes from tsv to fastq format (by replacing tabs with newlines) and redirects the output to read1.fq. In the same STDOUT stream (for speed), using tee, it cuts columns 5-8 (read 2 lines), etc, and redirects the output to read2.fq.
You can also use these command line tools:
iamdelf/deinterlace: Deinterlaces paired-end FASTQ files into first and second strand files.
https://github.com/iamdelf/deinterlace
deinterleave FASTQ files
https://gist.github.com/nathanhaigh/3521724
Or online tools with Galaxy web UI, for example this tool: "FASTQ splitter on joined paired end reads", installed on several public Galaxy instances, such as https://usegalaxy.org/ .
Avoid using a regex for simple fastq file parsing if you can use line numbers, both for speed (pattern matching is slower than simple counting) and for robustness.
It is highly unlikely, but a pattern like ^#.*/1$ (or whatever readers change it to when reusing this code later) can also match a base quality line. A good general rule is to simply rely on the fastq spec, which says 4 lines per record.
Note that #, /, 1, and 2 characters are allowed in Illumina Phred scores: https://support.illumina.com/help/BaseSpace_OLH_009008/Content/Source/Informatics/BS/QualityScoreEncoding_swBS.htm .
A one-liner that pulls out such (admittedly, very rare) reads is left as an exercise to the reader.
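The line-counting approach can also be sketched in Python. This is only an illustration of the idea, not a replacement for the tools listed above: the function name deinterleave and the tiny in-memory records are made up for the example, and it assumes strictly 4 lines per record with read 1 always immediately followed by read 2.

```python
from itertools import islice


def deinterleave(lines):
    """Split interleaved FASTQ lines into (read1_lines, read2_lines).

    Records are assigned purely by position (8 lines per pair),
    so the content of header or quality lines is never inspected.
    """
    read1, read2 = [], []
    it = iter(lines)
    while True:
        pair = list(islice(it, 8))  # 4 lines of read 1 + 4 lines of read 2
        if not pair:
            break
        if len(pair) != 8:
            raise ValueError("input is not a whole number of read pairs")
        read1.extend(pair[:4])
        read2.extend(pair[4:])
    return read1, read2


# Hypothetical records; note the quality line "FF/1", which a
# pattern such as "/1$" would wrongly treat as a forward-read header.
interleaved = [
    "#r1/1", "ACGT", "+", "FFFF",
    "#r1/2", "TTGA", "+", "FF/1",
    "#r2/1", "GGCA", "+", "#<<B",
    "#r2/2", "CCAT", "+", "BBBB",
]
fwd, rev = deinterleave(interleaved)
print("\n".join(fwd))
print("\n".join(rev))
```

For a real file, the same function can be fed an open file handle (one line per iteration); the returned lines then keep their trailing newlines and can be written out unchanged.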
The fastq format uses 4 lines per read.
Your snippet has 5, as there are -- lines. That could confuse software expecting a 4-line format.
You can add --no-group-separator to the grep call to avoid adding that separator.
I usually follow these steps to convert bam to fastq.gz:
samtools bam2fq myBamfile.bam > myBamfile.fastq
grep '^#.*/1$' -A 3 --no-group-separator myBamfile.fastq > sample_1.fastq
grep '^#.*/2$' -A 3 --no-group-separator myBamfile.fastq > sample_2.fastq
gzip sample_1.fastq
gzip sample_2.fastq
Once you have the two files, you should order them to be sure that the reads are really paired.
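One way to check the pairing before (or after) sorting is to compare the read names record by record. The sketch below is a minimal Python illustration; the helper names read_names and first_unpaired are hypothetical, and it assumes 4 lines per record and headers that end in /1 or /2 as in the question.

```python
def read_names(fastq_lines):
    """Return the read name of every record, without the /1 or /2 suffix."""
    names = []
    for index, line in enumerate(fastq_lines):
        if index % 4 == 0:  # header line of each 4-line record
            name = line.strip()
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            names.append(name)
    return names


def first_unpaired(forward_lines, reverse_lines):
    """Return the index of the first record whose mate does not match, or None."""
    forward = read_names(forward_lines)
    reverse = read_names(reverse_lines)
    for position, (a, b) in enumerate(zip(forward, reverse)):
        if a != b:
            return position
    if len(forward) != len(reverse):
        return min(len(forward), len(reverse))  # one file has extra records
    return None


# Hypothetical records: the reverse file lists the mates in the wrong order.
forward = ["#r1/1", "ACGT", "+", "FFFF", "#r2/1", "GGCA", "+", "BBBB"]
reverse = ["#r2/2", "CCAT", "+", "BBBB", "#r1/2", "TTGA", "+", "FFFF"]
print(first_unpaired(forward, reverse))
```

A result of None means every record in the first file has its mate at the same position in the second file.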
We can split FASTQ files using Seqkit.
seqkit split2 -p 2 sample21_pe.unmapped.fq
https://bioinf.shenwei.me/seqkit/usage/#split2
Example 4 there is relevant to this question.
I'm not sure whether it recognizes the read IDs; it splits the records by writing them alternately to the first and second output files.

Sphinx starts chapter numbering from 0 in LaTeX output

I'm generating a PDF document with Sphinx using LaTeX.
======================
RX-loader instructions
======================
Program
=======
However this generates
0.1 RX-loader instructions
0.1.1 Program
How do I make the numbering start with 1?
Edit:
My index.rst looks like this
#######################
RX-loader documentation
#######################
.. toctree::
   :maxdepth: 2

   rx-loader
My problem seems to be that the numbering takes "RX-loader documentation" as chapter 0 and always numbers the chapters as its sections.
The following might be a dirty trick but it results in the desired pdf behaviour on Sphinx 1.6.7 (In html the first line must be removed):
Filler text
======================
RX-loader instructions
======================
Program 1
---------
Program 2
---------

Cannot build pandoc with LaTeX \hline using RStudio RMarkdown

Note: This summary can be mixed with other summary calls where a line after every row is not wanted, so a solution that puts lines between every row of every table will not work. I need it just for the tables I'm creating.
Using latest RStudio, I have an object type for which summary.type produces Pandoc output of a table. I would like to, in pdf / LaTeX output, have a horizontal line between each row of the table. All my attempts fail with the error :
! Misplaced \noalign.
\hline ->\noalign
{\ifnum 0=`}\fi \hrule \@height \arrayrulewidth \futurelet...
l.207 \hline
pandoc: Error producing PDF from TeX source
Given the following in a .md file:
------------------------------------------------------------------------------------------------
                 term      estimate    std.error   statistic    p.value        N        adj.rsq
------------- ----------- ----------- ----------- ----------- ----------- ----------- -----------
    **1**         NA         26.8        0.766        35           0         1466        0.004
    **2**         NA         0.012       0.012       0.939       0.348         .           .
-------------------------------------------------------------------------------------------------
Another option is to add labels after you have created the table
\hline
And executing the following command (created by RStudio):
/Applications/RStudio.app/Contents/MacOS/pandoc/pandoc +RTS -K512m -RTS type.md --to latex --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash-implicit_figures --output type.pdf --table-of-contents --toc-depth 3 --template ~/Library/R/3.2/library/rmarkdown/rmd/latex/default-1.14.tex --highlight-style tango --latex-engine /Library/TeX/texbin/pdflatex --variable graphics=yes --variable 'geometry:margin=1in'
Remove that last line (the "\hline") and everything compiles correctly.
When I create html instead of pdf, the \hline is ignored (as it should be) and the file is created successfully.
What am I doing wrong? What is the minimum LaTeX I can embed in my Pandoc output in order to have a line with each row of a table?
Attempted solutions:
Use \hrulefill: Problem: Will only fill 90% of a single column. Putting it in every column doesn't fill the line
Use "---" to tell Pandoc I need a horizontal line: Problem: Ends the table
Change LaTeX template so every row of every table has a line separating it: Problem: I only want to do this for the rows created by summary.myType, not for all tables anywhere in the document
Attempted solutions and results:
------------------------------------------------------------------------------------------------
                 term      estimate    std.error   statistic    p.value        N        adj.rsq
------------- ----------- ----------- ----------- ----------- ----------- ----------- -----------
    **1**         NA         26.8        0.766        35           0         1466        0.004
 \hrulefill   \hrulefill  \hrulefill  \hrulefill  \hrulefill  \hrulefill  \hrulefill  \hrulefill
    **2**         NA         0.012       0.012       0.939       0.348         .           .
---
    **3**         NA        -0.718       0.291      -2.469       0.014         .           .
-------------------------------------------------------------------------------------------------
Another option is to add labels after you have created the table
The answer is there's no answer. This is a snip of the tex that Pandoc's markdown -> tex converter is creating:
\begin{longtable}[c]{@{}llll@{}}
\toprule
\begin{minipage}[b]{0.16\columnwidth}\raggedright\strut
\strut\end{minipage} &
Every single cell is its own "page", with its own margins, as such nothing that produces a line can extend beyond that page, and through those margins.
Short of creating a template that forces LaTeX to put a line at the end of each row of every table in the file (as was suggested here), I do not believe there is any way to do what I want other than coding up the LaTeX for the table myself.

How can I extract some data out of the middle of a noisy file using Perl 6?

I would like to do this using idiomatic Perl 6.
I found a wonderful contiguous chunk of data buried in a noisy output file.
I would like to simply print out the header line starting with Cluster Unique and all of the lines following it, up to, but not including, the first occurrence of an empty line. Here's what the file looks like:
</path/to/projects/projectname/ParameterSweep/1000.1.7.dir> was used as the working directory.
....
Cluster Unique Sequences Reads RPM
1 31 3539 3539
2 25 2797 2797
3 17 1679 1679
4 21 1636 1636
5 14 1568 1568
6 13 1548 1548
7 7 1439 1439
Input file: "../../filename.count.fa"
...
Here's what I want parsed out:
Cluster Unique Sequences Reads RPM
1 31 3539 3539
2 25 2797 2797
3 17 1679 1679
4 21 1636 1636
5 14 1568 1568
6 13 1548 1548
7 7 1439 1439
One-liner version
.say if /Cluster \s+ Unique/ ff^ /^\s*$/ for lines;
In English
Print every line from the input file, starting with the one containing the phrase Cluster Unique and ending just before the next empty line.
Same code with comments
.say                  # print the default variable $_
if                    # do the previous action (.say) "if" the following term is true
/Cluster \s+ Unique/  # Match $_ if it contains "Cluster", whitespace, then "Unique"
ff^                   # Flip-flop operator: false until the preceding term becomes true,
                      # false again once the term after it becomes true
/^\s*$/               # Match $_ if it is empty or contains only whitespace
for                   # Create a loop placing each element of the following list into $_
lines                 # Create a list of all of the lines in the file
;                     # End of statement
Expanded version
for lines() {
    .say if (
        $_ ~~ /Cluster \s+ Unique/ ff^ $_ ~~ /^\s*$/
    )
}
lines() is like <> in perl5. Each line from each file listed on the command line is read in one at a time. Since this is in a for loop, each line is placed in the default variable $_.
say is like print except that it also appends a newline. When written with a starting ., it acts directly on the default variable $_.
$_ is the default variable, which in this case contains one line from the file.
~~ is the match operator that is comparing $_ with a regular expression.
// creates a regular expression from whatever is between the two forward slashes.
\s+ matches one or more whitespace characters.
ff is the flip-flop operator. It is false as long as the expression to its left is false. It becomes true when the expression to its left evaluates as true, and it becomes false again when the expression to its right becomes true; it then stays false unless the left expression matches again. In this case, if we used ^ff^ instead of ff^, then the header would not be included in the output.
When ^ comes before (or after) ff, it modifies ff so that it is also false on the iteration in which the expression to its left (or right) becomes true.
/^\s*$/ matches an empty (or whitespace-only) line
^ matches the beginning of a string
\s* matches zero or more spaces
$ matches the end of a string
By the way, the flip-flop operator in Perl 5 is .. when it is in a scalar context (it's the range operator in list context). But its features are not quite as rich as in Perl 6, of course.
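For readers who do not know Perl, the state that ff^ keeps between lines can be spelled out explicitly. The following Python sketch is only an approximation written for illustration (the function name extract_block and the sample lines are made up here); unlike ff, it does not test the stop pattern on the same line that matched the start pattern, which makes no difference for this input.

```python
import re

START = re.compile(r"Cluster\s+Unique")  # left-hand side of the flip-flop
STOP = re.compile(r"^\s*$")              # right-hand side: empty line


def extract_block(lines):
    """Keep lines from the START match up to, but excluding, the STOP match."""
    inside = False  # the flip-flop state
    kept = []
    for line in lines:
        if not inside:
            if START.search(line):
                inside = True      # flips to true; header line is kept
                kept.append(line)
        elif STOP.search(line):
            inside = False         # flops back; the empty line is not kept
        else:
            kept.append(line)
    return kept


sample = [
    "</path/to/dir> was used as the working directory.",
    "....",
    "Cluster Unique Sequences Reads RPM",
    "1 31 3539 3539",
    "2 25 2797 2797",
    "",
    "Input file: \"../../filename.count.fa\"",
]
print("\n".join(extract_block(sample)))
```

Because the state returns to false after the empty line, a second header later in the input would start a new block, just as with the flip-flop operator.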
I would like to do this using idiomatic Perl 6.
In Perl, the idiomatic way to locate a chunk in a file is to read the file in paragraph mode, then stop reading the file when you find the chunk you are interested in. If you are reading a 10GB file, and the chunk is found at the top of the file, it's inefficient to continue reading the rest of the file--much less perform an if test on every line in the file.
In Perl 6, you can read a paragraph at a time like this:
my $fname = 'data.txt';
my $infile = open(
    $fname,
    nl => "\n\n",   #Set what perl considers the end of a line.
);  #Removed die() per Brad Gilbert's comment.
for $infile.lines() -> $para {
    if $para ~~ /^ 'Cluster Unique'/ {
        say $para.chomp;
        last;   #Quit reading the file.
    }
}
$infile.close;
# ^ Match start of string.
# 'Cluster Unique' By default, whitespace is insignificant in a perl6 regex. Quotes are one way to make whitespace significant.
However, in perl6 rakudo/moarVM the open() function does not read the nl argument correctly, so you currently can't set paragraph mode.
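The idea of reading paragraph by paragraph and stopping early does not depend on Perl. As a language-neutral illustration, here is a minimal Python sketch; the helper names paragraphs and find_paragraph are invented for the example, and it assumes paragraphs are separated by empty lines and that the wanted chunk starts its own paragraph.

```python
def paragraphs(lines):
    """Lazily yield blank-line-separated paragraphs as lists of lines."""
    current = []
    for line in lines:
        if line.strip():
            current.append(line.rstrip("\n"))
        elif current:
            yield current
            current = []
    if current:
        yield current


def find_paragraph(lines, prefix):
    """Return the first paragraph whose first line starts with prefix, else None.

    Iteration stops as soon as the paragraph is found, so the rest
    of the input is never read.
    """
    for para in paragraphs(lines):
        if para[0].startswith(prefix):
            return para
    return None


data = [
    "noise",
    "",
    "Cluster Unique Sequences Reads RPM",
    "1 31 3539 3539",
    "2 25 2797 2797",
    "",
    "Input file: x",
    "more noise",
]
block = find_paragraph(iter(data), "Cluster Unique")
print("\n".join(block))
```

Passing an open file handle instead of iter(data) gives the behaviour described above: on a very large file, reading stops right after the empty line that ends the chunk.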
Also, there are certain idioms that are considered by some to be bad practice, like:
Postfix if statements, e.g. say 'hello' if $y == 0.
Relying on the implicit $_ variable in your code, e.g. .say
So, depending on what side of the fence you live on, that would be considered a bad practice in Perl.

Sweave to LaTeX "undefined control sequence" error

I'm trying to include some R results in a TeX document using RStudio. I have managed to get RStudio to generate what looks to me like a fine tex file, but it fails to compile to pdf.
I get errors saying "! Undefined control sequence.", which seem to be caused by the first lines of str(data) calls and the lines showing significance levels:
"! Undefined control sequence.
<argument> '
data.frame': 1980 obs. of 5 variables:
l.39 'data.frame': 1980 obs. of 5 variables:
The control sequence at the end of the top line
of your error message was never \def'ed. If you have
misspelled it (e.g., `\hobx'), type `I' and the correct
spelling (e.g., `I\hbox'). Otherwise just continue,
and I'll forget about whatever was undefined."
"! Undefined control sequence. <argument>
Signif. codes: 0 '
***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
l.95 ...**' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The control sequence at the end of the top line
of your error message was never \def'ed. If you have
misspelled it (e.g., `\hobx'), type `I' and the correct
spelling (e.g., `I\hbox'). Otherwise just continue,
and I'll forget about whatever was undefined."
Files with just the summary(data) for instance work fine
Looking around other mailing lists etc., I've seen that this could be because tex cannot find the Sweave package, so I have copied it to various locations (the same folder as the Rnw and tex files, and a directory without spaces in the path) and tried to rerun the file. Nothing seems to work.
Similarly, this doesn't work, but using summary(cars) instead of str(cars) does. This suggests to me that it's something to do with the ' character.
\documentclass[a4paper]{article}
\usepackage{Sweave}
\title{Sweave Example 1}
\author{Friedrich Leisch}
\begin{document}
\maketitle
In this example we embed parts of the examples from the
\texttt{kruskal.test} help page into a \LaTeX{} document:
<<>>=
data(cars)
str(cars)
@
\end{document}
(adapted from the sweave manual)
Any ideas on what I'm doing wrong?
Any suggestions would be much appreciated.
Add the [noae] package option to your \usepackage{Sweave} statement, i.e. \usepackage[noae]{Sweave}.
