Hi, I have an XML file that starts with
<?xml version="1.0" encoding="windows-1251"?>
Can jsoup auto-detect the encoding of an XML file?
No, it cannot. This has been an open issue on jsoup's GitHub tracker for about two years, and as far as I know it is still unresolved.
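One possible workaround is to read the declared encoding yourself before handing the file to a parser. The following is a minimal sketch, not part of jsoup: `XmlEncodingSniffer` and `declaredEncoding` are hypothetical names, and it assumes an ASCII-compatible encoding with no byte order mark (it does not handle UTF-16).

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of jsoup: reads the encoding named in the XML declaration.
class XmlEncodingSniffer {
    // Matches encoding="..." or encoding='...' inside a leading <?xml ... ?> declaration.
    private static final Pattern DECLARED = Pattern.compile(
            "^\\s*<\\?xml[^>]*?encoding\\s*=\\s*[\"']([A-Za-z0-9._:-]+)[\"']");

    /** Returns the declared encoding name, or null if no encoding is declared. */
    static String declaredEncoding(byte[] data) {
        // The declaration itself is plain ASCII in encodings such as windows-1251,
        // so the first bytes can be decoded as ISO-8859-1 purely for inspection.
        int len = Math.min(data.length, 256);
        String head = new String(data, 0, len, StandardCharsets.ISO_8859_1);
        Matcher m = DECLARED.matcher(head);
        return m.find() ? m.group(1) : null;
    }
}
```

The returned name could then be passed as the `charsetName` argument of `Jsoup.parse(InputStream, String, String, Parser)` together with `Parser.xmlParser()`, falling back to a default such as UTF-8 when the result is null.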
Related
I am working with Symfony 1.1 in an existing project. How can I read PDF files and extract text from them?
This isn't really a Symfony 1.1 question; it's a PHP one. There are several libraries for handling PDFs in PHP. Here are some suggestions:
https://github.com/smalot/pdfparser
http://pastebin.com/dvwySU1a
http://www.pdflib.com/
If you just need to parse the PDF in any way and then process the text in PHP, you can also consider using a Java library such as the following.
http://pdfbox.apache.org/ (Is there a PDF parser for PHP?)
I want to use PHP to output the plist that an itms-services link points to.
Like this:
itms-services://?action=download-manifest&url=xxxxxx.php?id=xxx
(The PHP file would output the plist string, built from app data in the database.)
Is this possible?
If it is, please show me example code for the PHP file.
A plist file is just a specially formed XML file. This is something you could generate in any language. Examples for PHP XML output can be found in other questions such as this one: How to output XML string from PHP
I'm receiving a file from a server, but instead of being an xml file it is a wsdl file with the same exact text that would be in an xml file. Since the content is the exact same can I parse it as if it were an xml file? Or do I need to convert it somehow?
A WSDL file is in fact an XML document that describes and locates web services. It is not the content itself. Technically you can parse it like any other XML, but the data you are after should arrive from the server as a separate XML file.
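To illustrate that a WSDL document goes through an ordinary XML parser, here is a minimal sketch using the JDK's built-in DOM parser. The class name `WsdlAsXml` and the method `operationNames` are made up for this example, and it assumes a WSDL 1.1 document (namespace `http://schemas.xmlsoap.org/wsdl/`).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical example class: treats a WSDL 1.1 document as plain XML.
class WsdlAsXml {
    static final String WSDL_NS = "http://schemas.xmlsoap.org/wsdl/";

    /** Returns the name attribute of every wsdl:operation element, in document order. */
    static List<String> operationNames(String wsdl) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to match elements by namespace
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(wsdl.getBytes(StandardCharsets.UTF_8)));
            NodeList ops = doc.getElementsByTagNameNS(WSDL_NS, "operation");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < ops.getLength(); i++) {
                names.add(((Element) ops.item(i)).getAttribute("name"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse WSDL as XML", e);
        }
    }
}
```

In a full WSDL, operations usually appear under both `portType` and `binding`, so the same name may be listed more than once. What this yields is the service description only, not the data the service returns.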
I know that Apache Tika is a text extractor. It can extract text from doc, pdf, ppt and lots of other file formats. Now I need this functionality on iOS, so I'd like to know: is there an alternative to Apache Tika for iOS?
If there is no such library for iOS, please suggest tools that can extract text from specific file formats.
Thank you in advance.
libopc for extracting text from docx, xlsx, pptx.
Antiword for older MS formats.
You can also extract strings from a PDF using CoreGraphics, or using PDFiPhone.
If you're also looking to extract text from an HTML document, have a look at NSXMLParser.
How can I read a local XML file in a BlackBerry application and get its whole contents into a string?
To create a string from a local file, see this BlackBerry forum entry: Open txt file from mediacard
Assuming you want to use the data within the XML, I would recommend using an XML parser rather than string manipulation. The following links should get you going with XML parsers and explain some of the trade-offs:
Blackberry How To - Use the XML Parser
Parsing XML in J2ME
Add XML parsing to your J2ME applications
If, however, you have any say in the format used, JSON might be a good alternative. JSON is easy for machines to parse (thus using fewer resources) and it's human readable.
I have found that using a SAXParser and subclassing DefaultHandler works well. It lets you go through the document element by element.
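A minimal sketch of the SAXParser plus DefaultHandler approach is shown below. It is written against standard Java SE for brevity, so on a BlackBerry/J2ME device it would likely need adapting (for example, no generics). The class name `ElementCollector` and the `name=text` output format are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records "elementName=text" for every element with non-blank text.
class ElementCollector extends DefaultHandler {
    final List<String> entries = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        text.setLength(0); // start collecting text for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            entries.add(qName + "=" + value);
        }
        text.setLength(0);
    }

    /** Parses the XML string and returns the collected entries in document order. */
    static List<String> collect(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            ElementCollector handler = new ElementCollector();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.entries;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

Because SAX streams events instead of building a tree, memory use stays low, which is usually the reason to prefer it on constrained devices.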