Is there a way to convert the text stored in a text view's text storage to HTML? - ios

For example, I have a mini RTF editor that consists of a text view, and I change the sizes of the text in its text storage. Is there a way I can get these values as HTML? Or would I have to parse it manually?

There is no one-to-one conversion from the RTF spec to the HTML spec. You will either need to parse and convert the content yourself, or use a third-party RTF-to-HTML converter.
Since your ultimate goal is to convert the RTF content to PDF, you might like to consider an RTF-to-PDF converter instead.
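If you go the manual route, the conversion amounts to walking the styled runs of the text and emitting markup for each one. Here is a minimal sketch in Python (chosen only for brevity). It assumes you have already pulled the runs out of the text storage as (text, font size) pairs; that pair format and the function name `runs_to_html` are hypothetical, not part of any iOS API.

```python
import html

def runs_to_html(runs):
    """Turn (text, font_size_in_points) pairs into a string of HTML spans."""
    parts = []
    for text, size in runs:
        # Escape the text so characters such as '<' and '&' stay literal.
        escaped = html.escape(text)
        parts.append(f'<span style="font-size:{size:g}pt">{escaped}</span>')
    return "".join(parts)

# Hypothetical runs, as they might be read out of an editor's text storage.
print(runs_to_html([("Hi ", 12), ("a<b", 24)]))
```

Only the font size is handled here; other attributes (bold, colour, font family) would each need their own mapping to CSS.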

Related

How to use Tika to extract the content of a PPT file?

Fellow programmers! I am using Tika to extract a PPT file that contains only plain text. However, Tika reports the content type as JPG. How can I get this case detected as plain text instead?
I changed some of the Tika source code so that I get the content I want. With that change, extracting the PPT file gives the right result.

objective-c, PDF, How to solve the "failed to parse embedded CMap." issue in PDF searching?

I am trying to search for text in PDFs. My project works fine on most PDFs, but it fails to find text in some of them, and Xcode shows this message in the console:
"failed to parse embedded CMap." How can I solve this issue so that I can search text in all PDFs? Any suggestion would be great. Thanks in advance.
In general, it is impossible to search for text in all PDFs. This is for two main reasons:
PDFs may use character codes that do not correspond to Unicode. In that case a CMap is used to associate the PDF character codes with Unicode values, but it is not required to be present in the PDF document.
Even if a CMap is included, the characters of the text are not guaranteed to appear in reading order in the PDF document. A PDF places the glyph for each character code by geometry, not by text flow.
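To make the two points concrete, here is a toy model in Python. It is not a PDF parser: the glyph tuples, the code values, and the `to_unicode` table are all made-up illustration data. It shows that text recovery needs both a code-to-Unicode table and a sort by position.

```python
def extract_text(glyphs, to_unicode):
    """Rebuild text from (x, y, code) glyph records.

    Glyphs are sorted top-to-bottom, then left-to-right, because the order
    in which they are stored need not match reading order. Codes missing
    from the table become U+FFFD, which is why such text cannot be searched.
    """
    ordered = sorted(glyphs, key=lambda g: (-g[1], g[0]))
    return "".join(to_unicode.get(code, "\ufffd") for _, _, code in ordered)

# Hypothetical data: two glyphs stored in reverse reading order.
glyphs = [(20.0, 700.0, 0x02), (10.0, 700.0, 0x01)]
to_unicode = {0x01: "H", 0x02: "i"}

print(extract_text(glyphs, to_unicode))  # table present
print(extract_text(glyphs, {}))          # table missing
```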

How to insert an NSAttributedString using a custom keyboard extension?

I want to write text in some custom fonts using a keyboard extension, as these apps (1,2,3,4) are doing. I know how to insert a normal string into the document proxy:
[self.textDocumentProxy insertText:mystring];
I have tried to insert an NSAttributedString using the above approach, but I can't see any way to insert an NSAttributedString into the document proxy.
Can someone suggest the best way to solve this? Any suggestion will be appreciated.
It is not possible to insert attributed strings (or otherwise rich content) using the text document proxy.
The keyboards you have linked are not using custom fonts. They use (or abuse) Unicode symbols such as Enclosed Alphanumerics and Enclosed Alphanumeric Supplement.
In other instances, symbols from other alphabets that look similar to Latin letters are used to create "funky" text.
Lastly, some keyboard extensions, like the image keyboards, use the pasteboard to copy the rich content, and the user is responsible for pasting it wherever they see fit.
The apps you are referring to don't use NSAttributedString or custom fonts. They simply replace letters with similar-looking Unicode characters. You can see these characters in any OS X app inside Edit -> Special Characters menu.
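The substitution trick is simple enough to sketch. The Python snippet below (Python is used only for brevity) maps ASCII letters to the circled letters of the Enclosed Alphanumerics block, U+24B6 to U+24CF for A-Z and U+24D0 to U+24E9 for a-z. The function name `to_circled` is made up for illustration. The result is an ordinary string, so it could be passed to insertText: like any other text.

```python
def to_circled(text):
    """Replace ASCII letters with circled letters; leave everything else alone."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(0x24B6 + ord(ch) - ord("A")))  # circled capitals
        elif "a" <= ch <= "z":
            out.append(chr(0x24D0 + ord(ch) - ord("a")))  # circled small letters
        else:
            out.append(ch)
    return "".join(out)

print(to_circled("Hello 123"))
```

Whether the characters render well depends on the fonts available in the receiving app, since this relies on Unicode coverage rather than on a custom font.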

RTF file to TXT/CSV file in objective-c?

I have RTF files containing that sort of content:
long_text_description_1 number1a number1b number1c
long_text_description_2 number2a number2b number2c
long_text_description_3 number3c
long_text_description_4 number4a number4b number4c
…
I need to extract the plain raw text without the colours, fonts, and other formatting.
The only thing I need to keep is the most basic row/column information; ideally I would like a CSV file.
The file I get contains all the formatting:
{\cs18\lang1033\langfe1033\f0\b\i0\ul0\strike0\scaps0\fs15\afs15\charscalex100\expndtw0\cf1\dn0 number1a}
What is the best way to remove all RTF information while keeping only the row information?
Working out many regular expressions by myself sounds dangerous without a complete understanding of the RTF format.
What I could find on the Internet mostly focused on Windows languages and libraries that are unavailable on iOS.
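All RTF tags are in the form \xxx.
Try a regular expression like "\\\S+" (a literal backslash followed by one or more non-whitespace characters; each backslash must be doubled again inside a string literal) and replace every match with an empty string.
For your example, you'll end up with { number1a}. This removes every backslash together with the run of non-whitespace characters that follows it.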
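The regex approach can be tried out quickly in Python before porting it; the pattern itself is ordinary regex syntax. The function name `strip_rtf_control_words` is made up for this sketch, and the sample line is the one from the question.

```python
import re

# A literal backslash followed by one or more non-whitespace characters.
CONTROL_RUN = re.compile(r"\\\S+")

def strip_rtf_control_words(fragment):
    """Remove every backslash-prefixed run from an RTF fragment."""
    return CONTROL_RUN.sub("", fragment)

sample = (r"{\cs18\lang1033\langfe1033\f0\b\i0\ul0\strike0\scaps0"
          r"\fs15\afs15\charscalex100\expndtw0\cf1\dn0 number1a}")

stripped = strip_rtf_control_words(sample)
print(stripped)               # braces and the leading space remain
print(stripped.strip("{} "))  # trim them to get the bare cell value
```

Note that the pattern is greedy up to the next whitespace, so a closing brace or text glued directly to a control word (for example `\par}`) is removed along with it. Real RTF files need more careful handling than this.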

What Character encoding is this?

When I back up my BlackBerry using BlackBerry Desktop Manager, it saves the backup as an .ipd file.
It's in hex; I'm not sure if it's any particular type. I used software called ABC Amber Text Converter to convert this .ipd file into plain text. Some of it comes out as readable text, like all the messages saved in the backup file, but some of the text in the file looks like this:
qÖ²u_+;¢õ¿B[[¤†D`Ø,>p
|Cñ:ÌQ†nÁä¼sÒ®sKDv©{(]
)++³É«.gsn>
z
'‚51o4Kq
8Ütâ¯cí¿þ2´Õ|5kl$S,H
dbiIjz
*!~k$|
&*OÝ>0ðî­wã
+zno%q
2k;
YnÁÅŸ5|Xñ7Ú<}y2
A
V܉lO5‰<œtÅRI-I
Does anybody have any idea what the hell this is, or if there is any way I can decode it?
Thanks
It's just binary data. You may have been able to extract some text from the file where strings of text were stored, but the rest will be just bytes of data.
You'll need a specific program that understands these backup files. A quick google reveals a few choices, such as MagicBerry.
One of the Blackberry developers has helpfully blogged a bit of information about the binary format, so you could try using that to write your own program to parse it:
http://us.blackberry.com/devjournals/resources/journals/jan_2006/ipd_file_format.jsp
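The reason some text was readable is that the stored strings sit inside the binary data as plain bytes. A minimal Python sketch of that kind of extraction is shown below. It simply collects runs of printable ASCII and knows nothing about the .ipd structure; the function name `printable_runs` and the sample bytes are made up for illustration.

```python
import re

def printable_runs(data, min_len=4):
    """Return runs of at least min_len printable ASCII bytes found in data."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

# Hypothetical bytes: readable strings surrounded by non-text data.
blob = b"\x00\x01Hello world\xff\xfeab\x00message"
print(printable_runs(blob))
```

Everything that falls outside such runs is record structure or other non-text data, which is why it takes a program that understands the file format to interpret it.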
