Converting openoffice.org text document to spreadsheet - openoffice.org

Any ideas for how to convert an openoffice.org text document
into spreadsheet format, programmatically if possible?

For the record: This is what I finally did: With synaptic I installed
the excellent little program odt2txt.
With that I extracted the text content, edited it manually a little in emacs,
gave it the extension .csv, and then it imported nicely into openoffice.org Calc.
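The manual editing step above could probably be scripted. Here is a minimal sketch, assuming the text has already been extracted (for example with odt2txt) and that columns are separated by tabs or by runs of two or more spaces. The function names and the sample text are made up for illustration.

```python
import csv
import io
import re

# Assumption: columns in the extracted text are separated by a tab
# or by two or more spaces; adjust the pattern for other layouts.
COLUMN_GAP = re.compile(r"\t+| {2,}")


def text_to_rows(text):
    """Split each non-blank line of plain text into a list of cell strings."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            rows.append(COLUMN_GAP.split(line))
    return rows


def rows_to_csv(rows):
    """Serialize rows as CSV text, quoting cells that contain commas."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical sample standing in for the output of odt2txt.
    sample = "Name    City    Amount\nAda Lovelace    London    1,200\n"
    print(rows_to_csv(text_to_rows(sample)), end="")
```

Writing through the csv module rather than joining with commas means cells that contain commas get quoted, so Calc should still import them as single cells.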

Related

iOS search and replace PDF string

Is it possible to search and replace a known string from a PDF with Objective-C/Quartz 2D?
I have some nicely formatted PDFs with tabular data, created with LaTeX (and generated with pdflatex). Every PDF will have a placeholder string, something like XXXXXX, that I would like to change programmatically.
These strings will only be replaced by other numbers.
I'm aware that the PDF could be an editable form, but I don't want that because I prefer to leave all the fonts and formatting as they're typeset by LaTeX.
It is not possible to search and replace text in PDF files using Quartz 2D. Quartz 2D offers a read-only, low-level interface for reading PDF files. Searching can be implemented on top of it, although with much effort, but modifying the files and replacing text is not possible.

Where can I get a dump of raw text on the web?

I am looking to do some text analysis in a program I am writing. I am looking for alternate sources of text in its raw form similar to what is provided in the Wikipedia dumps (download.wikimedia.com).
I'd rather not have to go through the trouble of crawling websites, parsing the HTML, extracting the text, etc.
What sort of text are you looking for?
There are many free e-books (fiction and non-fiction) in .txt format available at Project Gutenberg.
They also have large DVD images full of books available for download.
NLTK provides a simple Python API to access many text corpora, including Gutenberg, Reuters, Shakespeare, and others.
>>> from nltk.corpus import brown
>>> brown.words()
['The', 'Fulton', 'County', 'Grand', 'Jury', 'said', ...]
Project Gutenberg has a huge number of e-books in various formats (including plain text).
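If you go the plain-text route, you will probably want to drop the licence text that surrounds each book before analysing it. Here is a rough sketch, assuming the file has "*** START OF ..." and "*** END OF ..." marker lines around the book text, as many Project Gutenberg .txt files do. The marker patterns and function names are assumptions for illustration, so check them against the files you actually download.

```python
import re

# Assumption: the file carries "*** START OF ..." and "*** END OF ..." marker
# lines around the book text, as many Project Gutenberg .txt files do.
START_MARKER = re.compile(r"^\*\*\* ?START OF .*$", re.MULTILINE)
END_MARKER = re.compile(r"^\*\*\* ?END OF .*$", re.MULTILINE)


def strip_gutenberg_boilerplate(text):
    """Return the text between the markers, or all of it if they are absent."""
    start = START_MARKER.search(text)
    begin = start.end() if start else 0
    end = END_MARKER.search(text, begin)
    finish = end.start() if end else len(text)
    return text[begin:finish].strip()


def word_counts(text):
    """Count lowercase words using a simple letters-and-apostrophes tokenizer."""
    counts = {}
    for word in re.findall(r"[a-z']+", text.lower()):
        counts[word] = counts.get(word, 0) + 1
    return counts
```

If no markers are found the text is returned unchanged, so the same function can be run over text from other sources.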

Ruby on Rails: Is there a way to convert Word to HTML?

Maybe a way to batch convert as well?
You could use the Google Docs API to upload and convert .doc files.
http://code.google.com/apis/documents/overview.html
Some samples and code: http://code.google.com/apis/documents/code.html
Ruby example and demo:
http://code.google.com/p/gdata-samples/source/browse/#svn/trunk/doclist/DocListManager
http://doclistmanager.googlecodesamples.com/
The short answer is no, but the long answer is sorta.
MS Word itself will save a file out as HTML, but the result is a total friggin' mess. To an extent this is simply because the customers who convert Word files directly to HTML are not concerned about sloppy output, so Word hasn't worked hard on producing clean output. On the other hand, it's intrinsically difficult, because Word is oriented toward creating fixed-size, non-dynamic documents, like a paper-based book. So it's easy to convert to other static formats (say a PDF), but how do you convert to HTML? Do you just make the text flow across? Do you set a width that will hopefully make the layout stay the same? What if there are fonts or layout elements in the Word doc that are not available in the HTML renderer?
The easiest thing to do is to handle it project by project. You can create a DTD to convert an RTF file, for instance, but this involves making programmer-level decisions about how each element will be converted.

PasteSpecial using OLE, PowerPoint, Delphi

How do you use PasteSpecial in Delphi to paste into PowerPoint over OLE? I have RTF data I want to paste into PowerPoint, and I need to use PasteSpecial. However, I cannot find documentation on how to fill out the parameters it needs.
PasteSpecial is just going to favor one format over the other. So you can prioritize the formats, or eliminate formats, to influence the pasting. For example, if you have RTF and TEXT on the clipboard, and PP always pastes TEXT by default, even if RTF is listed first, then you could just eliminate TEXT and provide ONLY RTF. Then it has to paste as RTF.
MSDN has documentation for the 2003 and 2007 versions. In both cases, the first parameter should be ppPasteRTF if you want to choose the clipboard contents with RTF format. You can use EmptyParam for the remaining five parameters.

Convert EPS to PDF on the fly with pdflatex

I'm trying to include an EPS figure in a document that will be compiled using pdflatex. Of course, the picture can be converted to PDF using epstopdf (which comes with the MiKTeX distribution). Is there any way to do this on the fly, that is, make pdflatex do the conversion?
I'm looking for such a solution because I want to set up an easy-to-learn environment for students. Ideally, the converted picture is placed in the directory that also contains the original .eps, and the .pdf is used if available.
The relevant answer in the TeX FAQ points to epstopdf.sty, included with Heiko Oberdiek's packages.
I would recommend using latex-mk, which is a nice way to get a very simple Makefile for building LaTeX documents. Of course, you can have EPS files converted to PDF, or fig to EPS, etc., during the build process.
Currently my Makefile looks like this:
NAME=report
TEXSRCS=report.tex
BIBTEXSRCS=biblio.bib
USE_PDFLATEX=true
VIEWPDF=open # cause i'm on osx, gv for most unix
XFIGDIRS=img
## For osx users :
include /opt/local/share/latex-mk/latex.gmk
## For unix users :
#include /usr/share/latex-mk/latex.gmk
When I invoke make, the first thing it does is convert some .fig files into .pdf files. I'm pretty sure it would do the same with EPS files.
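The build-time rule (convert only when the .pdf is missing or out of date, and keep it next to the .eps) can also be sketched as a small script for setups without make. This is only an illustration: the function names are made up, and it assumes epstopdf is on the PATH and accepts the --outfile option.

```python
import os
import subprocess


def needs_conversion(eps_path, pdf_path):
    """True when the PDF is missing or older than its EPS source."""
    if not os.path.exists(pdf_path):
        return True
    return os.path.getmtime(pdf_path) < os.path.getmtime(eps_path)


def conversion_commands(directory):
    """Build one epstopdf command per stale .eps file found in directory."""
    commands = []
    for name in sorted(os.listdir(directory)):
        stem, ext = os.path.splitext(name)
        if ext.lower() != ".eps":
            continue
        eps_path = os.path.join(directory, name)
        pdf_path = os.path.join(directory, stem + ".pdf")
        if needs_conversion(eps_path, pdf_path):
            # Assumption: epstopdf is on PATH and accepts --outfile=<file>.
            commands.append(["epstopdf", eps_path, "--outfile=" + pdf_path])
    return commands


def convert_all(directory):
    """Run every pending conversion; raises CalledProcessError on failure."""
    for command in conversion_commands(directory):
        subprocess.run(command, check=True)
```

Calling convert_all("img") before pdflatex would leave each converted figure beside its original, and an existing up-to-date .pdf is reused rather than regenerated.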
If you want to include an EPS figure in LaTeX, first make sure the figure is actually in EPS format. For example, if your figure is a .jpeg, you need to convert it to .eps.
Then include it in the LaTeX document with the usual figure code. To have it converted to PDF, you need one small instruction: \usepackage{epstopdf}
I was also facing this problem and found this post very helpful: "How to Convert .eps to PDF in Latex ?"
Now I am able to include an EPS figure in LaTeX and have it converted to PDF. I think you will also find help and all the details in that post. Let me know if you face any further problems.
