How to upgrade an MSXML Document from version 1 to version 6? - domdocument

My application uses MSXML version 1 (MSXML.DOMDocument) to store user documents in XML format.
I want to upgrade to MSXML6 (Msxml2.DOMDocument.6.0). The problem is that old documents are not always readable with the new version.
The cause is that the old MSXML parser does not correctly encode non-Latin characters as UTF-8, and the new parser refuses to load these documents.
My question - how can I read / convert my customers' existing files to be readable in MSXML6?

It really is a good idea to fix those old XML files so that they use the correct encoding. In fact, a W3C-conformant XML parser is expected to reject files like these.
As far as I know, MSXML does not provide any functionality to fix the encoding of old XML files.
To fix the encoding, you can do it manually with Notepad++ (choose the actual encoding, then convert to UTF-8), or convert programmatically if you are sure of the original encoding, e.g. ANSI in your case. There should be plenty of sample code on the internet.
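For the programmatic route, here is a minimal Python sketch, assuming the old files are really Windows-1252 ("ANSI" on a Western European system). The function name `convert_xml_to_utf8` and the `cp1252` default are illustrative assumptions, not anything MSXML provides; pass whatever code page your customers' machines actually used.

```python
import re

def convert_xml_to_utf8(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Re-encode XML bytes from a known legacy encoding to UTF-8.

    source_encoding is an assumption: it must be the encoding the bytes
    were actually written in, whatever the file's own declaration claims.
    """
    text = raw.decode(source_encoding)
    # Drop any existing XML declaration, since its encoding label may be wrong.
    text = re.sub(r"^\s*<\?xml\s[^>]*\?>\s*", "", text, count=1)
    # Write a fresh declaration that matches the bytes we are about to emit.
    return ('<?xml version="1.0" encoding="UTF-8"?>\n' + text).encode("utf-8")

# A document saved with a raw 0xE9 byte for "é", as a legacy writer might do.
legacy = b'<?xml version="1.0"?><doc>caf\xe9</doc>'
fixed = convert_xml_to_utf8(legacy)
print(fixed.decode("utf-8"))
```

Decoding raises `UnicodeDecodeError` if the bytes are not valid in the chosen code page, which is a useful signal that the guess about the original encoding was wrong.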

Related

How to use long strings in delphi xe4?

I want to decode a large base64 string in my Delphi project.
When I paste it into my project I get the long string error.
To solve it I use this syntax:
'samecode'+
'samecode'+
'samecode';
But doing this manually takes far too long.
Is there a quick way to solve it?
You have a few options:
Compile the text to a string resource and link that to your executable. Load the resource at runtime.
Place the text in a file that you deploy alongside your executable and load it at runtime.
Write a script to read the text and format it to a manner suitable for inclusion in your source code.
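The third option could look something like the Python sketch below. It assumes the input is a single line of text with no line breaks and that the compiler's limit applies to each literal rather than to the whole expression (which the workaround in the question suggests). The function name, the constant name and the chunk width are all made up for illustration.

```python
def to_delphi_literal(text: str, name: str = "LongText", width: int = 80) -> str:
    """Format text as a Delphi string constant built from short literals.

    Assumes text contains no line breaks or control characters.
    """
    # Split first, then escape, so a doubled quote is never cut in half.
    chunks = [text[i:i + width] for i in range(0, len(text), width)] or [""]
    # Delphi escapes a single quote inside a literal by doubling it.
    lines = ["'" + chunk.replace("'", "''") + "'" for chunk in chunks]
    return "const\n  " + name + " =\n    " + " +\n    ".join(lines) + ";"

print(to_delphi_literal("abcdefgh", name="Sample", width=3))
```

The output can be pasted into a unit as a constant declaration.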
Since your text is actually a base64 encoded file, I doubt that you want to do any of this. What you really ought to be doing is decoding the base64 text to a binary file and linking that as a resource.
Given that the base64 encoded file is in fact a virus (MSIL/Bladabindi.AJ), I cannot imagine anybody wanting to help you. I'm disappointed that I've done as much as I have. You should be ashamed of yourself.

DataTables, PDF and special characters

I am using DataTables and the TableTools PDF export function. The PDF export does not handle certain special characters and turns them into rubbish (or ISO equivalents, I guess). The characters are ●, ○, and ٭.
Is there any way to define the character set for the PDF so I can preserve those special characters? (I'm guessing that character set is the problem) Or any other workaround?
No, there isn't a way to configure the character set for the PDF. DataTables, or specifically its TableTools add-on, uses a fairly limited Flash-based PDF exporter.
You can, however, edit the ActionScript used to make the TableTools Flash add-on.
Download TableTools and look in the archive's \media\as3 directory for .as files.
If you don't have Adobe's software for Flash authoring, you might try the open source Adobe Flex.
A late answer (to myself), but others could benefit: I ended up using mPDF instead. It supports UTF-8, languages with special characters and embedded stylesheets.

No Unicode-version HTML Tidy for Delphi?

I downloaded the latest version (TidyPas_Delphi2010.zip) from the official homepage (http://sourceforge.net/projects/curlpas/files/).
But to my surprise, the unit is full of AnsiString instead of string (UnicodeString).
Does anybody use this? No Unicode version?
Thanks
TidyPas is just a wrapper around the HTML Tidy library API. It does not provide a UnicodeString facade over that API, it exposes the API as-is.
As far as I can tell from the docs, HTML Tidy itself only supports a limited range of character sets, but these do include the UTF-8 encoding of Unicode, which, with a bit of care, I think should work with the AnsiString and AnsiChar types used by the API.
Any further inquiries about Unicode support in HTML Tidy other than with UTF8 would probably be best directed at the HTML Tidy community itself. It doesn't seem to have been updated for a while though (since 2008).
Yes, it does work in Delphi 2010 - I updated the code ;-) And yes, you need to convert the input from Unicode to UTF8 to handle it. You can find the (working) code I use at http://www.csinnovations.com/framework_delphi.htm

XML data into a string in BlackBerry

How can I read a local XML file in a BlackBerry application and get its whole contents into a string?
To create a string from a local file see this blackberry forum entry: Open txt file from mediacard
Assuming you want to use the data within the XML, I would recommend using an XML parser rather than string manipulation. The following links should get you going with XML parsers and explain some of the trade-offs:
Blackberry How To - Use the XML Parser
Parsing XML in J2ME
Add XML parsing to your J2ME applications
If, however, you have any say about the format used, JSON might be a good alternative. JSON is easy for machines to parse (thus using fewer resources) and it's human readable.
I have found that using a SAXParser and subclassing DefaultHandler works well. It lets you go through the document element by element.
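To show the shape of that handler-subclass approach, here is a small sketch in Python rather than BlackBerry Java, using `xml.sax.ContentHandler` as the analogue of `DefaultHandler`. The `<library>`/`<book>`/`<title>` document and the `TitleCollector` class are invented for the example.

```python
import xml.sax

class TitleCollector(xml.sax.ContentHandler):
    """Collects the text of every <title> element as the parser streams by."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # None means we are not inside a <title>

    def startElement(self, name, attrs):
        if name == "title":
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several pieces, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title" and self._buffer is not None:
            self.titles.append("".join(self._buffer))
            self._buffer = None

xml_text = ("<library><book><title>Dune</title></book>"
            "<book><title>Emma</title></book></library>")
handler = TitleCollector()
xml.sax.parseString(xml_text.encode("utf-8"), handler)
print(handler.titles)
```

The parser calls the three callbacks in document order, so the handler only ever holds the current element's text in memory, which is the main attraction on a constrained device.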

Creating Microsoft Word (.docx) documents in Ruby

Is there an easy way to create Word documents (.docx) in a Ruby application? Actually, in my case it's a Rails application served from a Linux server.
A gem similar to Prawn but for DOCX instead of PDF would be great!
As has been noted, there don't appear to be any libraries to manipulate Open XML documents in Ruby, but OpenXML Developer has complete documentation on the format of Open XML documents.
If what you want is to send a copy of a standard document (like a form letter) customized for each user, it should be fairly simple given that a DOCX is a ZIP file that contains various parts in a directory hierarchy. Have a DOCX "template" that contains all the parts and tree structure that you want to send to all users (with no real content), then simply create new (or modify existing) pieces that contain the user-specific content you want and inject it into the ZIP (DOCX file) before sending it to the user.
For example: You could have document-template.xml that contains Dear [USER-PLACEHOLDER]:. When a user requests the document, you replace [USER-PLACEHOLDER] with the user's name, then add the resulting document.xml to the your-template.docx ZIP file (which would contain all the images and other parts you want in the Word document) and send that resulting document to the user.
Note that if you rename a .docx file to .zip it is trivial to explore the structure and format of the parts inside. You can remove or replace images or other parts very easily with any ZIP manipulation tools or programmatically with code.
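The template idea above can be sketched in a few lines. This example is in Python rather than Ruby only because Python ships a ZIP library in its standard library; the function name `fill_docx_template` is made up, and the two-file archive built at the bottom is a stand-in for a real template, not a valid Word document.

```python
import io
import zipfile
from xml.sax.saxutils import escape

def fill_docx_template(template: bytes, replacements: dict) -> bytes:
    """Copy a DOCX (ZIP) archive, substituting placeholders in word/document.xml."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(template)) as src, \
         zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/document.xml":
                xml = data.decode("utf-8")
                for placeholder, value in replacements.items():
                    # Escape &, < and > so the inserted text keeps the XML well-formed.
                    xml = xml.replace(placeholder, escape(value))
                data = xml.encode("utf-8")
            # Every other part (images, styles, ...) is copied unchanged.
            dst.writestr(item.filename, data)
    return out.getvalue()

# Stand-in template: a real one would be a .docx saved from Word.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("word/document.xml", "<w:t>Dear [USER-PLACEHOLDER]:</w:t>")
    z.writestr("word/media/logo.txt", "stand-in for an image part")

result = fill_docx_template(buf.getvalue(), {"[USER-PLACEHOLDER]": "Ann & Bob"})
with zipfile.ZipFile(io.BytesIO(result)) as z:
    print(z.read("word/document.xml").decode("utf-8"))
```

One caveat with a real template: Word sometimes splits typed text across several runs in document.xml, in which case a plain string replace will not find the placeholder, so check the XML after saving the template.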
Generating a brand new Word document with completely custom content from raw XML would be very difficult without access to an API to make the job easier. If you really need to do that, you might consider installing Mono, then use VB.NET, C# or IronRuby to create your Open XML documents using the Open XML Format SDK 1.0. Since you would just be using the Microsoft.Office.DocumentFormat.OpenXml.Packaging Namespace to manipulate Open XML documents, it should work okay in Mono, which seems to support everything the SDK requires.
Maybe this gem is interesting for you.
https://github.com/trade-informatics/caracal/
It's like Prawn, but for docx.
You can use Apache POI. It is written in Java, but integrates with Ruby as an extension.
This is an old question but there's a new answer. If you'd like to turn an HTML doc into a Word (docx) doc, just use the 'htmltoword' gem:
https://github.com/karnov/htmltoword
I'm not sure why there was answer creep and everyone started posting templating solutions, but this answers the OP's question. Just like Prawn, except Word instead of PDF.
UPDATE:
There's also pandoc and an API wrapper for pandoc called docverter. Both have slightly complicated installs since pandoc is a Haskell library.
I know that if you serve an HTML document as a Word document with the .doc extension, it will open in Word just fine. Just don't do anything fancy.
Edit: Here is an example using classic ASP. http://www.aspdev.org/asp/asp-export-word/
Using a technique very similar to that suggested by Grant Wagner I have created a Ruby HTML-to-Word gem that should allow you to easily output Word docx files from your Ruby app. You can check it out at http://github.com/nickfrandsen/htmltoword - Simply pass it an HTML string and it will create a corresponding Word docx file.
def show
  respond_to do |format|
    format.docx do
      # Build the .docx from the HTML string passed in the request
      file = Htmltoword::Document.create params[:docx_html_source], "file_name.docx"
      # Send the generated file to the browser as a download
      send_file file.path, :disposition => "attachment"
    end
  end
end
Hope you find it useful. If you have any problems with it feel free to open a github issue.
Disclosure: I'm the leader of the docxtemplater project.
I know you're looking for a Ruby solution, but because all the other solutions only tell you how to do it in general terms, without giving you a library that does exactly what you want, here's a solution based on JS or Node.js (it works in both):
DocxTemplater Library
Demo of the library
You can also use it from the command line:
npm install docxtemplater -g
docxtemplater <configFile>
----config.docxFile: The input file in docx format
----config.outputFile: The outputfile of the document
Doccy (doccyapp.com) has an API that does just that. It supports docx, odt and pages, and converts to PDF as well if you like.
Further to Grant's answer, you can also send Word a "Flat OPC" file, which is essentially the docx unzipped and concatenated to create a single XML file. This way, you can replace [USER-PLACEHOLDER] in one file and be done with it (i.e. no zipping or unzipping).
If anyone is still looking at this, this post explains how to use an XML data source. This works nicely for me.
http://seroter.wordpress.com/2009/12/23/populating-word-2007-templates-through-open-xml/
Check out this github repo: https://github.com/jawspeak/ruby-docx-templater
It allows you to create a document from a word template.
If you're running on Windows, of course, it's a matter of WIN32OLE and some pain with the Word COM objects.
Chances are that you're serving from a *nix environment, though. Word 2007 uses the "Microsoft Office Open XML" format (*.docx), which older versions of Word can open using the appropriate compatibility pack from Microsoft.
Some of the more recent Office apps (2002/XP and 2003 at least) had their own XML formats, which may also be usable.
I'm not aware of any Ruby tools to make the process easier, sadly.
If it can be made acceptable, I think I'd be inclined to go down the renamed-html file route. I just saved a document as HTML from WordXP, renamed it to a .doc and opened it without problem.
I encountered the same problem. Unfortunately, I could not manipulate the XML because my clients have to fill in the templates themselves, and that is not always possible (for example, Office for Mac does not allow it).
As a solution to this problem, I made a simple gem that lets you use an RTF document as a template with embedded Ruby: https://github.com/eicca/rtf-templater
I tested it and it works OK for filling in reports and documents. However, the formatting displays badly with complex loops and conditions.
