iOS: Read in XLS

I'm trying to figure out how to read in the contents of an XLS document. I'm able to get the bytes just fine, but I don't have any clue where to go from there. Trying [[NSString alloc] initWithBytes:data.bytes length:data.length encoding:NSUTF8StringEncoding] and [NSString stringWithUTF8String:data.bytes] gets me nowhere (both return nil). What are you supposed to do to read in the contents of an XLS file?

Trying to combine two answers.
"There is no innate ability to read Excel data into a Foundation container, like an NSArray or NSDictionary. You could, however, convert the file (with Excel) to a comma-separated-value (CSV) file and then parse each line's cells on the iPhone using the NSString instance method -componentsSeparatedByString:."
"A comma-separated values (CSV) file stores tabular data (numbers and text) in plain-text form. Plain text means that the file is a sequence of characters, with no data that has to be interpreted instead, as binary numbers. A CSV file consists of any number of records, separated by line breaks of some kind; each record consists of fields, separated by some other character or string, most commonly a literal TAB or comma. Usually, all records have an identical sequence of fields"
--
How to read cell data from an Excel document with objective-c
objective-c loading data from excel
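To make the combined suggestion concrete, here is a minimal sketch of the CSV route (it assumes a UTF-8 file already exported from Excel; quoted fields containing embedded commas or line breaks are not handled):

NSError *error = nil;
NSString *csv = [NSString stringWithContentsOfFile:@"export.csv"
                                          encoding:NSUTF8StringEncoding
                                             error:&error];
NSMutableArray *records = [NSMutableArray array];
for (NSString *line in [csv componentsSeparatedByString:@"\n"]) {
    if (line.length == 0) continue; // skip the trailing blank line
    // one NSArray of cell strings per record
    [records addObject:[line componentsSeparatedByString:@","]];
}

Each entry of records is then an NSArray of cell strings that you can drop into whatever Foundation container you need.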

Even though saving your Excel file to CSV is the easier answer, sometimes that's not really what you're looking for, so I created QZXLSReader. It's a drag-and-drop solution, so it's a lot easier to use. I don't think it's as feature-complete, but it worked for me.
It's basically a library that can open XLS files and parse them into Obj-C classes. Once you have the classes, it's very easy to send them to Core Data or a dictionary or what have you.
I hope it helps!

Related

NSKeyedArchiver sometimes makes a broken file

My iOS app saves NSCoding objects in the Documents directory.
NSKeyedArchiver archives them. It usually works fine, but sometimes it produces broken files.
The broken files fall into the following two patterns.
Lack of data
I can convert them to ASCII strings (see "How do I convert an NSData object with hex data to ASCII in Swift?") and recover meaningful data.
They have the bplist prefix, but they don't have the trailers.
Total loss
I cannot convert them to ASCII strings.
They look as if all the bytes have been shifted.
This is one of the headers from the broken files, compared with the correct header.
broken (the byte sequence seems to be different for every file):
Nè\à¡<99>K<80>^_È<97>▸T§:Æñã9µú▸Ñ1^LË^VYGfM^A%KÍ<95
expected:
bplist00Ô^A^B^C^D^E^H01T$topX$objectsX$versionY$
Has anyone experienced the same case?
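One cheap way to tell the two patterns apart at load time is to check for the standard binary-plist magic bytes before unarchiving (a minimal sketch; the path is an assumption, and it only catches the "total loss" case, since the truncated files still carry the prefix):

// Hypothetical path to one of the saved archives.
NSString *archivePath = @"/path/to/archive";
NSData *data = [NSData dataWithContentsOfFile:archivePath];
// "bplist00" is the standard magic prefix of a binary plist.
NSData *magic = [@"bplist00" dataUsingEncoding:NSASCIIStringEncoding];
BOOL hasHeader = data.length >= magic.length &&
    [[data subdataWithRange:NSMakeRange(0, magic.length)] isEqualToData:magic];
if (!hasHeader) {
    NSLog(@"Archive at %@ is corrupt (missing bplist00 header)", archivePath);
}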

How to parse a JSON file line by line in objective c

I am working with very large JSON files, so I do not want to read the entire file and then iterate and parse each data entry.
Instead, I would like to iterate on the JSON file itself (for example: line-by-line/one object at a time).
I thought about holding the next line's location as part of the current line's data, so the JSON becomes a kind of linked list, but I did not manage to extract a specific line from the JSON file.
Am I missing an easier way to achieve that? Is it even possible to extract and parse a specific line from a JSON file?
Thanks a lot!
JSON is not a line-oriented format, so the idea of parsing "line by line" doesn't really make sense.
That said, there is at least one event-driven JSON parser for iOS that I know of: https://github.com/stig/json-framework. The built-in parser, NSJSONSerialization, only works on entire files.
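For contrast, the whole-file route with the built-in parser looks like this (a minimal sketch; the path and the top-level-array shape are assumptions):

NSData *data = [NSData dataWithContentsOfFile:@"/path/to/large.json"];
NSError *error = nil;
// NSJSONSerialization materializes the entire document in memory
// before you can touch a single element.
NSArray *items = [NSJSONSerialization JSONObjectWithData:data
                                                 options:0
                                                   error:&error];
for (NSDictionary *item in items) {
    // process one entry at a time -- but only after the full parse
}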

How do I go about converting .docx into an array or dictionary?

I am currently trying to build an iOS game that matches words with their definitions, for myself and my classmates.
I'm having a hard time figuring out how to convert a list of words with their definitions in a .docx file into something (JSON, XML, ...) that I can then read into an Array or Dictionary.
Most of the words in the .docx have the following format:
" Word (): Definition. "
This would be easier with Excel. Excel already has a built-in XML export, which should make your life a lot easier than getting all the words out of the .docx and then converting them to either JSON or XML yourself.
http://www.excel-easy.com/examples/xml.html
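If you end up with a plain-text export instead, the " Word (): Definition. " shape is simple enough to split by hand (a rough sketch; the file name and the "):" separator are assumptions about your data):

NSString *contents = [NSString stringWithContentsOfFile:@"words.txt"
                                               encoding:NSUTF8StringEncoding
                                                  error:NULL];
NSMutableDictionary *glossary = [NSMutableDictionary dictionary];
for (NSString *line in [contents componentsSeparatedByString:@"\n"]) {
    // Split "Word (): Definition." at the first "):" occurrence
    NSRange sep = [line rangeOfString:@"):"];
    if (sep.location == NSNotFound) continue;
    NSString *word = [[line substringToIndex:sep.location]
        stringByTrimmingCharactersInSet:
            [NSCharacterSet characterSetWithCharactersInString:@" ("]];
    NSString *definition = [[line substringFromIndex:NSMaxRange(sep)]
        stringByTrimmingCharactersInSet:
            [NSCharacterSet whitespaceCharacterSet]];
    glossary[word] = definition;
}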

Convert a Text file into ARFF Format

I know how to convert a set of text or web-page files into an ARFF file using TextDirectoryLoader.
I want to know how to convert a single text file into an ARFF file.
Any help will be highly appreciated.
Please be more specific. Anyway:
If the text in the file corresponds to a single document (that is, a single instance), then all you need is to replace all newlines with the escape code \n so that the full text sits on a single line, then manually format it as an ARFF file with a single text attribute and a single instance (see the sketch after this answer).
If the text corresponds to several instances (e.g. documents), then I suggest writing a script to break it into several files and applying TextDirectoryLoader. If there is any specific formatting (e.g. instances are enclosed in XML tags), you can either do the same (by taking advantage of the XML format) or write a custom Loader class in WEKA to recognize your format and build an Instances object.
If you post an example, it would be easier to give a more precise suggestion.
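For the single-instance case, the hand-written ARFF file might look like this (the relation and attribute names are made up):

@relation single_document
@attribute text string
@data
'First line of the document.\nSecond line.\nThird line.'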

retrieve txt content of as many file types as possible

I maintain a client-server DMS written in Delphi/SQL Server.
I would like to allow the users to search for a string inside all the documents stored in the DB (files are stored as blobs, zipped to save space).
My idea is to index them on "checkin": as I store a new file, I extract all the text information in it and put it in a new DB field. So my files table will be something like:
ID_FILE integer
ZIPPED_FILE blob
TEXT_CONTENT text field (nvarchar in SQL Server)
I would like to support "indexing" of at least the most common text-like files, such as: pdf, txt, rtf, doc, docx, maybe adding xls, xlsx, ppt, pptx.
For MS Office files I can use ActiveX, since I already do that in my application, and for txt files I can simply read the file, but what about pdf and odt?
Could you suggest the best technique, or even a 3rd-party component (paid is fine too), that parses with "no fear" all file types?
Thanks
Searching documents this way would lead to something very slow and inconvenient to use; I'd advise you to produce two additional tables instead of the TEXT_CONTENT field.
When you parse the text, you should extract the valuable words and try to standardise them, so that you
- get rid of lower/upper-case problems
- get rid of characters that might be used interchangeably,
i.e. in Turkish we have the ç character, which might be entered as c
- get rid of words that are too common in the language you are dealing with,
i.e. in "Thing I am looking for", only "Thing" and "Looking" might be of interest
- get rid of whatever other problems you face.
Each word that already has an entry in the string_search table should reuse the ID already given there.
The tables may look like this:
original_file_table
  zip_id number
  zip_file blob
string_search
  str_id number
  standardized_word text (or any string type with an appropriate secondary index)
file_string_reference
  zip_id number
  str_id number
I hope that gives you an idea of what I have in mind.
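A lookup against that schema might then look like this (a sketch; it assumes the search term has been standardised the same way before querying):

SELECT f.zip_id, f.zip_file
FROM original_file_table f
JOIN file_string_reference r ON r.zip_id = f.zip_id
JOIN string_search s ON s.str_id = r.str_id
WHERE s.standardized_word = 'thing'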
Your major problem is zipping your files before putting them as blobs in your database, which makes them unsearchable by the database itself. I would suggest the following.
Don't zip files you put in the database. Disk space is cheap.
You can write a query like this as long as you save the files in a text field.
Select * from MyFileTable Where MyFileData like '%Thing I am looking for%'
This is slow, but it will work, because the text in most of those file types is plain text rather than binary (though some of the newer file types are now binary).
The other alternative is to use an indexing engine such as Apache Lucene or Apache Solr, which will, as you put it,
parse with "no fear" all file types.
