I'm using an .odt file as a kind of template, with LibreOffice as the tool for creating it. It usually works fine except for one thing.
Let's assume our .odt file has a paragraph of text:
There is my text.
The XML may or may not look like this (it seems random); it is messy and not well suited to parsing or to use as a template:
<text:p text:style-name="P7">There is</text:p><text:p text:style-name="P7"> my text<text:p text:style-name="P7">.</text:p></text:p>
Sometimes (again, seemingly at random) it looks like this, which is the expected result and makes sense:
<text:p text:style-name="P7">There is my text.</text:p>
Is there any way to get rid of the superfluous XML tags? Or, at least, can the user view the raw document in LibreOffice/OpenOffice to remove the redundancy manually?
The key is to provide an easy tool for the user to detect and fix artefacts like this.
Have you tried Ctrl-M (Clear Direct Formatting)? If all formatting is defined in styles and style formatting is not manually overridden, it should not disturb the formatting but should remove redundant tags.
A more tedious process for the user would be to cut the text, paste it back with Paste Special as unformatted text, and apply the style again.
Finally, a macro would definitely do the trick.
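As an alternative to a macro inside LibreOffice, the cleanup can also be sketched as an external script. The following is a minimal sketch, not a tested tool: it assumes you have already extracted content.xml from the .odt file (an .odt is a zip archive) and parsed it, and it only handles the nested case from the question, where a child element repeats its parent's tag and style name. The function name unwrap_redundant is made up for this example, and it does not merge adjacent sibling paragraphs.

```python
import xml.etree.ElementTree as ET

# Namespace of the text: prefix in ODF documents.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
STYLE_ATTR = f"{{{TEXT_NS}}}style-name"


def _append_text(parent, index, text):
    """Attach text so it ends up just before position `index` inside parent."""
    if not text:
        return
    if index == 0:
        parent.text = (parent.text or "") + text
    else:
        prev = parent[index - 1]
        prev.tail = (prev.tail or "") + text


def unwrap_redundant(parent):
    """Inline every child that repeats the parent's tag and style name.

    Hypothetical helper: works in place on an ElementTree element.
    """
    i = 0
    while i < len(parent):
        child = parent[i]
        unwrap_redundant(child)  # clean the subtree first
        same = (child.tag == parent.tag
                and child.get(STYLE_ATTR) == parent.get(STYLE_ATTR))
        if not same:
            i += 1
            continue
        grandchildren = list(child)
        parent.remove(child)
        # Child's leading text joins whatever precedes it in the parent.
        _append_text(parent, i, child.text)
        for offset, grandchild in enumerate(grandchildren):
            parent.insert(i + offset, grandchild)
        i += len(grandchildren)
        # Text that followed the child is kept after the inlined content.
        _append_text(parent, i, child.tail)
```

Applied to the nested element from the question, this leaves a single paragraph element whose text is " my text." with no children. Run it on a copy of the document first, since other structures in a real content.xml may need different handling.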
I've recently taken on a document conversion project. That is, a client gives me a .DOC file, and I need to convert the contents to one long HTML file: no styling, no CSS, just clean HTML with paragraph tags, header tags, etc.
I found an application that does a pretty good job of automating the first part of it. The problem is that I need to do some advanced find and replace based on strings using variables.
For instance, I have footnotes that were converted properly. They're currently displayed as superscript numbers with the
I'd like to change how the footnote is displayed. Instead of a superscript number 6 for the 6th footnote, I'd like it to show (Note 6).
To do that on the entire document (hundreds of footnotes), I'm wondering if I can do something like:
FIND:
<sup><a name="FN[0-9]" href="FNR[0-9]">[0-9]</a></sup>
REPLACE:
<a name="FN%1" href="FNR%2">(Note %3)</a>
The problem is, I can't find a Find and Replace tool that lets me maintain the variables in the replace area. All I get is the superscript 6 appearing as (Note %3), as well as every other footnote doing the same thing.
Anyone have any ideas on how I can accomplish my task efficiently?
In Perl it would look roughly like this on the command line (I have NOT tested this):
perl -i -p -e's{<sup><a name="(FN\d)" href="(FNR\d)">(\d)</a></sup>}{<a name="$1" href="$2">(Note $3)</a>}' filenames....
-i says "Edit this file in place", -p means "print each line after we do whatever is in the -e switch".
That's assuming you're only looking for a single digit where you have [0-9]. If you want to match FN427, then you change (FN\d) to (FN\d+), for example.
This also assumes that the HTML that you are parsing looks EXACTLY LIKE THAT. If you get some HTML that is <a href=... name=... (with the attributes in the opposite order from yours) then it will break. In that case, you'll want to use an HTML parser.
I hope that gives you enough to start with.
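If Perl isn't available, the same substitution can be sketched in Python with re.sub, where \1, \2 and \3 in the replacement refer to the captured groups. This is a minimal sketch under the same assumption as the Perl one-liner (the attributes appear in exactly this order); it uses \d+ so multi-digit footnotes such as FN427 match, and the function name rewrite_footnotes is made up for the example.

```python
import re

# Three capture groups: the name value, the href value, the visible number.
FOOTNOTE = re.compile(
    r'<sup><a name="(FN\d+)" href="(FNR\d+)">(\d+)</a></sup>'
)


def rewrite_footnotes(html):
    """Replace superscript footnote links with '(Note N)' links."""
    return FOOTNOTE.sub(r'<a name="\1" href="\2">(Note \3)</a>', html)
```

To process a file, read it into a string, pass it through rewrite_footnotes, and write the result back out.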
I have a database which I want to export as an iOS compatible PLIST.
The workaround I have come up with is to create a calculated field which adds the tagged padding and header, and then to create a report using these fields. I then export the preview of the report as a PDF, open the PDF in Acrobat Reader, select all text, and copy and paste it into Xcode, which recognises the PLIST format, and all works as expected.
Is there a better way of doing this? (This seems a really convoluted way of doing things, high chance of error, etc.) The Export as XML option looks promising but I can't seem to join the dots.
There are two ways that I can think of to do what you're trying to do. The most elegant way is probably the XML export with XSLT that you suggest. If you don't already know XSLT, though, you might try the following. Since you've already created the calculated XML line, it sounds like this would be a simple change to your database:
Create a single new global field, say outputXML
Create a script, say plistCreator
In the plistCreator script:
Set outputXML to ""
Go to the first record you want to export
Loop through every record putting your calculated XML line into outputXML (set field outputXML to outputXML & ¶ & calculatedXMLLine)
Go to next record, exit after last
Export Field Contents (note that this is a different command than Export) for outputXML
The cleanest solution is to use the export XML with an XSLT for transforming the output. You'll need to know a little XSLT to do this, or at least be able to customize the examples from FileMaker.
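A different route from both answers, offered only as a sketch: if the records can be exported as CSV with a header row (an assumption; the question does not say which export formats are in use), Python's standard plistlib module can write the XML plist directly. The function name csv_to_plist is made up for this example, and every value ends up as a plist string, so numeric or date fields would need converting first.

```python
import csv
import io
import plistlib


def csv_to_plist(csv_text):
    """Convert CSV text (first row = field names) to an XML plist, as bytes.

    Produces an array of dictionaries, one dictionary per record.
    """
    rows = [dict(row) for row in csv.DictReader(io.StringIO(csv_text))]
    return plistlib.dumps(rows, fmt=plistlib.FMT_XML)
```

The returned bytes can be written to a .plist file in binary mode and added to the Xcode project.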
I'd like to add a code appendix to my LyX document. There are a few options I already considered, but they all have their problems.
I know a bit about listings, but one problem with those is that, if I copy & paste my code into them, I lose all enters/newlines. Since the code is too large to correct by hand, I was wondering if there is an alternative.
In LyX there is the possibility of inserting child documents, but that seems to be only for .tex files. Would have been ideal if I could just insert my .java file as a child document.
I could print the code to PDF, but it will include margins that mess up the final document, since the PDF is placed on the left margin of the final document and then there is the margin of the PDF. Also, this PDF always contains the entire code and white areas where not the entire page has been filled.
Does anyone have a good alternative?
The listings package found here
http://www.ctan.org/tex-archive/macros/latex/contrib/listings/
allows you to include external source code files (look into the reference for \lstinputlisting).
EDIT: Here are some samples of how to use it:
http://en.wikibooks.org/wiki/LaTeX/Packages/Listings
If you need to copy and paste code into a LyX listing box, use Edit -> Paste Special -> Selection or Ctrl+Alt+V.
For what it's worth, at least the 2.0 versions of LyX have the ability to include listings as child documents. Insert, File, Child Document, and choose from the dropdown box "Program Listing". This uses the listings package and lets you keep your source in its own file.
If listings doesn't support your language, you can always use something like highlight or source-highlight to generate a LaTeX snippet of syntax-highlighted code that you can add as a child document of type "Input".
Yes, if you copy&paste code into the LyX listings box, you lose all newlines, but you can preprocess your code (insert an extra newline below each line):
$ sed -e 's/$/\n/' foo.java > bar.java
Then you can copy&paste the new file bar.java and everything will be ok.
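The sed command above relies on \n in the replacement being turned into a newline, which GNU sed does but other sed implementations may not. A minimal Python sketch of the same preprocessing step follows; the function name double_newlines is made up for the example.

```python
def double_newlines(source):
    """Return source with an extra blank line after every line."""
    return "".join(line + "\n\n" for line in source.splitlines())


def convert_file(src_path, dst_path):
    """Read src_path, double its newlines, and write the result to dst_path."""
    with open(src_path, encoding="utf-8") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(double_newlines(text))
```

Calling convert_file("foo.java", "bar.java") would play the role of the sed command, assuming the source file is UTF-8 encoded.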
When I build the LaTeX file generated by Sphinx, the TOC entries and section headers are blue. Is there an easy way to disable coloring these items? If not, is there an easy way to make them black instead? My goal is to print the document on a non-color printer, and the TOC and headings do not look as dark as the rest of the text when I do so.
I would like to make one change that applies to the whole document if possible.
Note: I am using the howto document class.
Update
Thanks to ddbeck's input, I took a closer look at sphinx.sty which defines the colors that I needed to change. I set (created) the latex_elements dictionary in conf.py as follows:
# Redefine the colors from sphinx.sty as black (rgb 0,0,0).
mypreamble = '''
\\pagenumbering{arabic}
\\definecolor{TitleColor}{rgb}{0,0,0}
\\definecolor{InnerLinkColor}{rgb}{0,0,0}
'''
latex_elements = {
    'papersize': 'letterpaper',
    'pointsize': '11pt',
    'preamble': mypreamble,
}
This worked out exactly as I wanted it. Thanks ddbeck!
You can add LaTeX by using the latex_elements['preamble'] configuration option. If you change the value of that key, you can override Sphinx's normal LaTeX. The docs on this option aren't particularly illuminating, however. You may find this thread from sphinx-dev a bit more helpful; it has more detail on how that might be used, as well as some good links for learning about LaTeX (if that's something you need to get black and white output). Finally, it might help to take a look at the default .cls and .sty files.
I need a way to add text comments in "Word style" to a LaTeX document. I don't mean commenting the source code of the document. What I want is a way to add corrections, suggestions, etc. to the document so that they don't interrupt the text flow but still make it easy for everyone to see which part of the sentence they relate to. They should also "disappear" when compiling the document for printing.
At first, I thought about writing a new command that would just forward its input to \marginpar{} and would be given an empty definition when compiling for printing. The problem is that you have no guarantee where the comments will appear, and you will not be able to distinguish them from the other marginpars.
Any ideas?
todonotes is another package that makes nice looking callouts. You can see a number of examples in the documentation.
Since LaTeX is a text format, if you want to show someone the differences in a way that they can use them (and cherry pick from them) use the standard diff tool (e.g., diff -u orig.tex new.tex > docdiffs). This is the best way to annotate something like LaTeX documents, and can be easily used by anyone involved in the production of a document from LaTeX sources. You can then use standard LaTeX comments in your patch to explain the changes, and they can be very easily integrated. If the document lives in a version control system of some sort, just use the VCS to generate a patch file that can be reviewed.
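If the diff tool isn't installed on a collaborator's machine, the same kind of unified diff can be sketched with Python's standard difflib module. This is only an illustration of the idea; the function name tex_diff and the default file labels are made up for the example.

```python
import difflib


def tex_diff(orig_text, new_text, orig_name="orig.tex", new_name="new.tex"):
    """Return a unified diff of two LaTeX sources as one string."""
    return "".join(
        difflib.unified_diff(
            orig_text.splitlines(keepends=True),
            new_text.splitlines(keepends=True),
            fromfile=orig_name,
            tofile=new_name,
        )
    )
```

Removed lines are prefixed with "-" and added lines with "+", so the result can be reviewed like the output of diff -u.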
I have used changes.sty, which gives basic change colouring:
\added{new text}
\deleted{old text}
\replaced{new text}{old text}
All of these take an optional parameter with the initials of the author who made the change. Each author gets a different colour, and the initials are displayed in superscript after the changed text.
\replaced[MI]{new text}{old text}
You can hide the change marks by giving the option final to the changes package.
This is very basic, and comments are not supported, but it might help.
My little home-rolled "fixme" tool uses \marginpar where possible and goes inline in places (like captions) where that is hard to arrange. This works out because I don't often use margin paragraphs for other things. This does mean you can't finalize the layout until everything is fixed, but I don't feel much pain from that...
Other than that I heartily agree with Michael about using standard tools and version control.
See also:
Tips for collaboratively editing a LaTeX document (which addresses your main question...)
https://stackoverflow.com/questions/193298/best-practices-in-latex
and a self-plug:
How do I get Emacs to fill sentences, but not paragraphs?
You could also try the trackchanges package.
You can use the changebar package to highlight areas of text that have been affected.
If you don't want to do the markup manually (which can be tedious and interrupt the flow of editing) the neat latexdiff utility will take a diff of your document and produce a version of it with markup added to visually display the changes between the two versions in the typeset output.
This would be my preferred solution, although I haven't tested it out on large, multi-file documents.
The best package I know is Easy Review, which brings commenting functionality to the LaTeX environment. For example, you can use simple commands such as \add{NEW TEXT}, \remove{OLD TEXT}, \replace{OLD TEXT}{NEW TEXT}, \comment{TEXT}{COMMENT}, \highlight{TEXT}, and \alert{TEXT}.
Some examples can be found here.
The todonotes package looks great, but if that proves too cumbersome to use, a simple solution is just to use footnotes (e.g. in red to separate them from regular footnotes).
The trackchanges.sty package works exactly the same way as changes.sty. See Svante's reply.
It has easy to remember commands and you can change how edits will appear after compiling the document. You can also hide the edits for printing.