On malformed files - file-type

Suppose you start looking at an XML file, which you parse to confirm that it is in fact an XML file. Life is good.
Then someone removes a > somewhere in the file, which effectively makes it malformed XML from the parser's standpoint. As far as the parser is concerned, the file is no longer well-formed XML.
Is there a way to confirm that the file is in fact still an XML file, albeit a malformed one?
The question extends beyond XML (obviously). How can one arrive at the conclusion that a file is "probably of a certain type", as opposed to "I can't parse it, and therefore it is certainly not of that type"?
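One way to make the "probably XML" idea concrete is to fall back to a scoring heuristic when strict parsing fails. The following is only a sketch in Python: the function name classify_xml, the regular expression, and the score threshold are all assumptions chosen for illustration, not an established algorithm.

```python
import re
import xml.etree.ElementTree as ET

# Rough pattern for something that looks like an opening or closing tag.
TAG_RE = re.compile(r"</?[A-Za-z_][\w:.-]*(?:\s[^<>]*)?/?>")

def classify_xml(text: str) -> str:
    """Return 'xml', 'probably-xml', or 'unknown' (hypothetical labels)."""
    try:
        ET.fromstring(text)
        return "xml"  # the strict parser accepts it
    except ET.ParseError:
        pass  # malformed: fall back to weaker evidence

    stripped = text.lstrip()
    score = 0
    if stripped.startswith("<?xml"):
        score += 2  # an XML declaration is strong evidence
    if stripped.startswith("<"):
        score += 1
    tags = TAG_RE.findall(text)
    if len(tags) >= 2:
        score += 1
    if any(tag.startswith("</") for tag in tags):
        score += 1  # closing tags are typical of markup

    # The threshold of 3 is an arbitrary assumption.
    return "probably-xml" if score >= 3 else "unknown"
```

A file with one missing > still collects most of the evidence, so it lands in the "probably-xml" bucket instead of being rejected outright. The same pattern (strict check first, weighted hints second) carries over to other formats.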

Related

Swift - how do I convert a regular XML file to a .plist XML file

I searched the web for this one and mostly found Obj-C answers or outdated answers, so I'll simply post the question and hope for the best :)
I'm working with a team of Android developers while I work on iOS. They use XML files for a lot of data. Since we want the data between us to stay consistent, we want to be able to share XML files across our projects.
I know .plist is basically a type of XML, but I want to be able to take the regular shared XML file, turn it into a .plist, and then use it in my iOS project.
Is there any known method of doing this at all? If so, is it possible to do it automatically somehow (a script or something of that sort, maybe even locally inside the application)?
Thanks for the help
A .plist file is a very specific kind of XML file with only a small set of elements allowed. An arbitrary XML file will not be convertible to a .plist file as-is.
Try converting your file to JSON first. If you can convert it to JSON, you can then convert it to a .plist automatically using the plutil command with the xml1 or binary1 format.
plutil -convert xml1 -o output.plist input.json
If you can't convert it to JSON, you can keep your XML file as is and parse it in your app using NSXMLParser (named XMLParser in Swift 3 and later). XML is harder to parse than .plist or JSON files, but aside from that the difference should be minimal.
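As a rough illustration of the conversion step, here is a Python sketch that maps a simple XML document onto nested dictionaries and writes it out with the standard plistlib module, which stands in here for the JSON-plus-plutil route. The mapping is an assumption: child elements become dictionary keys, repeated tags become arrays, leaf text becomes strings, and attributes are ignored. The function names are hypothetical.

```python
import plistlib
import xml.etree.ElementTree as ET

def element_to_value(elem):
    """Map an element to a dict of its children, or to its text if it has none."""
    children = list(elem)
    if not children:
        return (elem.text or "").strip()
    result = {}
    for child in children:
        value = element_to_value(child)
        if child.tag in result:
            # A repeated tag is collected into a list (assumed convention).
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result

def xml_to_plist_bytes(xml_text):
    """Return an XML-format property list for a simple XML document."""
    root = ET.fromstring(xml_text)
    data = {root.tag: element_to_value(root)}
    return plistlib.dumps(data, fmt=plistlib.FMT_XML)
```

Real shared files with attributes, mixed content, or typed values would need a mapping agreed on with the Android team; this only shows that the structural part can be scripted.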

How do I add files to my app and find their path to access them

I have some txt files that store some important data for my app. Due to their nature I want them to be external text files. Currently I plan on reading them using a stream reader that reads the txt line by line. However, I don't know where to put my txt files so that I can access them in my stream reader, which requires their path. I've seen examples that use NSBundle.mainBundle().pathForResource. However, I'm not really sure what a bundle is or how to place my files there in the first place.
You can use NSBundle. Here's another answer that shows how to create and use bundles: https://stackoverflow.com/a/23884501/1228075

Parsing a WSDL file like XML?

I'm receiving a file from a server, but instead of being an XML file it is a WSDL file with the exact same text that would be in an XML file. Since the content is exactly the same, can I parse it as if it were an XML file? Or do I need to convert it somehow?
WSDL is in fact an XML format for describing and locating web services. It is not the content itself. Technically you can parse it as XML, but you should expect an XML file from the server.
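To show that a WSDL document goes through an ordinary XML parser unchanged, here is a small Python sketch. The sample document and the function name operation_names are made up for illustration; the namespace URI is the WSDL 1.1 one.

```python
import xml.etree.ElementTree as ET

# WSDL 1.1 namespace; the "wsdl" prefix is just a local alias.
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# A made-up, heavily trimmed WSDL document used only for this example.
SAMPLE_WSDL = """<?xml version="1.0"?>
<definitions name="Demo" xmlns="http://schemas.xmlsoap.org/wsdl/">
  <portType name="DemoPort">
    <operation name="GetQuote"/>
    <operation name="ListQuotes"/>
  </portType>
</definitions>"""

def operation_names(wsdl_text):
    """List the operation names declared under any portType."""
    root = ET.fromstring(wsdl_text)
    operations = root.findall("wsdl:portType/wsdl:operation", WSDL_NS)
    return [op.get("name") for op in operations]
```

The file extension plays no role here; the parser only cares that the text is well-formed XML.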

How to ensure that a file is correctly written to file system?

I have an application that reads a file from a ZIP archive and saves it to a file on the file system. After writing it, I immediately start reading the file with a SAX2 reader. With bigger files (300+ MB) it sometimes happens that SAX2 stops parsing because of an unclosed tag. But when I check the file (or even try to read it again later) it works, so the file itself is OK.
FZipKit.ExtractToStream(LFileName, LStream);
LStream.SaveToFile(OutputFilename);
SAX2.processUrl(OutputFilename);
My assumption is that the file was not yet fully written to the file system when I started the parsing process.
Is there a way to ensure that the file has been written, or that the stream has been flushed to the file system?
thx
First of all, I'm going to assume that the XML parser operates correctly. If it is incapable of reading files, well, the solution is obvious.
That leads us to look at how the file is created. When you call SaveToFile, the file is opened, written, and closed, and the buffers are flushed. On a plain vanilla system, your XML parser will see the entire content of the file. The only conclusion is that something is interfering. The most likely suspect is your virus scanner. Many scanners, even the most respected ones, cannot properly handle a file being closed and then immediately re-opened.
The bottom line is that your code is fine and the problem almost certainly lies with your local environment.
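For comparison, this is what the write, flush, close, then parse sequence looks like when spelled out explicitly. It is a Python sketch rather than Delphi, the function names are hypothetical, and the explicit os.fsync call is an extra precaution added for illustration; it would not necessarily help if another process such as a scanner is interfering.

```python
import os
import xml.etree.ElementTree as ET

def save_and_sync(data: bytes, path: str) -> None:
    """Write data and ask the OS to push it to disk before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # empty Python's own buffer
        os.fsync(f.fileno())   # ask the OS to flush its cache to disk
    # The file is closed here, before anyone tries to read it.

def count_elements(path: str) -> int:
    """Stream-parse the file and count completed elements."""
    return sum(1 for _event, _elem in ET.iterparse(path))
```

If parsing still fails intermittently after a sequence like this, that points away from the writing code and toward the environment.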

(rails) how to validate whether an uploaded .txt file is not, say, an image file?

I have an upload field for text files, and I plan to save the uploaded file somewhere and then store its location in a database. However, I want to make sure the uploaded file is a .txt file and not, say, an image file. I imagine this happens in the validation step. How does one validate such a thing? Also, how do you get the filename of the uploaded file? I could always just check whether it ends in '.txt', but for future reference it would be helpful to know how to validate without relying on the filename.
Trying to validate the contents of a file based on the filename extension is opening the door for major hackerdom. It's trivial to change the extension and upload the file.
If you are on a Mac/Linux/Unix-based system the OS "file" command is the standard because it looks inside the file for key bytes that flag file types. http://en.wikipedia.org/wiki/File_(Unix) I'm not sure what's available for Windows, but this might help: Determine file type in Ruby
One way of doing it, the simple way really, would be to pass the file through an image loader, preferably one that handles multiple common formats, and see if it throws an error.
The other way is to manually check the file header for common image format headers. For example, .bmp files start with BM. Other formats have their own specific markings you can use.
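The header check can be sketched in a few lines. This is Python rather than Ruby, and the names sniff_image and looks_like_text are hypothetical. The signatures listed are the well-known leading bytes of BMP, PNG, JPEG, and GIF files; the text test (no NUL bytes, valid UTF-8) is an assumption about what counts as "a .txt file".

```python
# Leading bytes of a few common image formats.
IMAGE_SIGNATURES = {
    b"BM": "bmp",
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}

def sniff_image(head: bytes):
    """Return a format name if the data starts with a known signature."""
    for signature, name in IMAGE_SIGNATURES.items():
        if head.startswith(signature):
            return name
    return None

def looks_like_text(data: bytes) -> bool:
    """Heuristic: plain text has no NUL bytes and decodes as UTF-8."""
    if b"\x00" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

Short signatures such as BM can also appear at the start of ordinary text, so a signature hit is better treated as a hint and combined with the text check than used as the sole verdict.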

Resources