Decoding only part of a JSON using Codable protocol - ios

I have a JSON which contains an array of dictionaries and I decode it, using Swift's JSONDecoder class.
Is it possible to make the decoder decode only some of the dictionaries rather than all of them, for example based on some criteria? I guess this might be useful if the array contains many dictionaries and you want only a single one of them.
If you know how to do this, I would appreciate your help.

Technically one can write an init(from:) method that manually gets the container for the decoder and then get the "nested" container (e.g., nestedUnkeyedContainer), and manually decode the items within that collection, only adding the ones you want. See Encoding and Decoding Custom Types for an introduction to writing init(from:) methods.
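A minimal sketch of that manual approach, assuming the array sits under a top-level "items" key and that each element has "id" and "name" fields (Item, FilteredResponse, the key names and the even-id criterion are all hypothetical, chosen only for illustration):

```swift
import Foundation

// Hypothetical element type; the field names are assumptions.
struct Item: Decodable {
    let id: Int
    let name: String
}

// Hypothetical wrapper that keeps only the items with an even id.
struct FilteredResponse: Decodable {
    let items: [Item]

    private enum CodingKeys: String, CodingKey {
        case items
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        var nested = try container.nestedUnkeyedContainer(forKey: .items)

        var kept: [Item] = []
        while !nested.isAtEnd {
            // Each element still has to be decoded before it can be tested.
            let item = try nested.decode(Item.self)
            if item.id % 2 == 0 {
                kept.append(item)
            }
        }
        items = kept
    }
}

let json = Data("""
{"items": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 4, "name": "c"}]}
""".utf8)

let response = try JSONDecoder().decode(FilteredResponse.self, from: json)
print(response.items.map { $0.name })  // prints ["b", "c"]
```

Note that every element is still decoded before the criterion is applied, so this mainly saves the memory of the discarded items, not the parsing work.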
But I would discourage you from doing that. It's going to be much simpler and more logical to parse the whole JSON and then filter the resulting collection to distill it down to the ones you want.
Unless you have a lot of records (e.g. millions?) where the parsing overhead becomes observable, I would suggest decoding the entire JSON and then filtering your array. This requires far less code and is the more logical approach.
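For comparison, decoding everything and filtering afterwards might look like this. It is only a sketch: the Record type, its fields and the filter criteria are assumptions, and the JSON is assumed to be a top-level array.

```swift
import Foundation

// Hypothetical record type; the field names are assumptions.
struct Record: Decodable {
    let id: Int
    let category: String
}

let data = Data("""
[{"id": 1, "category": "x"}, {"id": 2, "category": "y"}, {"id": 3, "category": "x"}]
""".utf8)

// Decode the whole array, then keep only what is needed.
let all = try JSONDecoder().decode([Record].self, from: data)
let wanted = all.filter { $0.category == "x" }

// If only a single element is wanted, first(where:) stops at the first match.
let firstMatch = all.first(where: { $0.category == "y" })

print(wanted.map { $0.id })  // prints [1, 3]
```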
And if you had that many records, before I contemplated the init(from:) kludge, I would reconsider using JSON at all. I'd use CoreData or SQLite or something like that which is better suited for dynamic filtering of data as it is being extracted.

Related

Does the coder we select significantly affect performance?

I'm having trouble understanding the purpose of "coders". My understanding is that we choose coders in order to "teach" dataflow how a particular object should be encoded in byte format and how equality and hash code should be evaluated.
By default, and perhaps by mistake, I tend to put the words "implements Serializable" on almost all my custom classes. This has the advantage that Dataflow tends not to complain. However, because some of these classes are huge objects, I'm wondering if the performance suffers, and whether I should instead implement a custom coder in which I specify exactly which one or two fields can be used to determine equality, hash code, etc. Does this make sense? Put another way, does creating a custom coder (which may only use one or two small primitive fields) instead of the default serializable coder improve performance for very large classes?
Java serialization is very slow compared to other forms of encoding, and can definitely cause performance problems. However, only serializing part of your object means that the rest of the object will be dropped when it is sent between processes.
Much better than using Serializable, and pretty much just as easy, is to use AvroCoder by annotating your classes with
@DefaultCoder(AvroCoder.class)
This will automatically deduce an Avro schema from your class. Note that this does not work for generic types, so you'll likely want to use a custom coder in that case.

Persistent Storage of an Array containing Tuples in Swift IOS

I have this array in a Swift iOS app. I want it to be both editable and permanently stored in the app. NSFileManager doesn't like the fact that the array contains tuples. Is there any way round that, or does anyone have any suggestions as to how else I could store it?
var topicBook = [("Name",["Species"],"Subject","Rotation","Unit","extra",[(0,0)],"content"),("Name",["Species"],"Subject","Rotation","Unit","extra",[(0,0)],"contentTwo"),("Name",["Species"],"Subject","Rotation","Unit","extra",[(0,0)],"contentThree")]
From Apple in The Swift Programming Language:
Tuples are useful for temporary groups of related values. They are not suited to the creation of complex data structures. If your data structure is likely to persist beyond a temporary scope, model it as a class or structure, rather than as a tuple. For more information, see Classes and Structures.
And your tuple is pretty complex, so whether or not you need to persist the data I'd recommend using a struct or class anyway. Otherwise it will inevitably become hard to read and work with and will hurt the productivity of you and anyone you're working with that has to use it.
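As a rough sketch of what that could look like in current Swift, the tuple can be modelled as Codable structs and written to disk as JSON. The type and property names (Topic, Pair, pairs and so on) are guesses based on the placeholder values in the question, and the temporary directory is used only to keep the example self-contained; an app would more likely write to its Documents directory.

```swift
import Foundation

// Hypothetical replacement for the inner (Int, Int) tuple.
struct Pair: Codable, Equatable {
    var first: Int
    var second: Int
}

// Hypothetical replacement for the outer tuple; names are assumptions.
struct Topic: Codable, Equatable {
    var name: String
    var species: [String]
    var subject: String
    var rotation: String
    var unit: String
    var extra: String
    var pairs: [Pair]
    var content: String
}

var topicBook = [
    Topic(name: "Name", species: ["Species"], subject: "Subject", rotation: "Rotation",
          unit: "Unit", extra: "extra", pairs: [Pair(first: 0, second: 0)], content: "content"),
]

// Structs with var properties stay editable.
topicBook[0].content = "edited content"

// Assumed file location for this sketch.
let fileURL = FileManager.default.temporaryDirectory.appendingPathComponent("topicBook.json")

// Save the array as JSON, then load it back.
try JSONEncoder().encode(topicBook).write(to: fileURL, options: .atomic)
let restored = try JSONDecoder().decode([Topic].self, from: Data(contentsOf: fileURL))
```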

haskell parsing data structure with extra information

I have a problem extracting extra information from my parsing.
I have my own data structure to parse, and that works fine. I wrote the parser for my data structure as Parse MyDataStructure, which parses all the information about MyDataStructure.
The problem is that in the string I'm parsing, mixed with MyDataStructure, there is also some information about what I should do with MyDataStructure. This is of course not part of MyDataStructure, i.e. I cannot store this information inside MyDataStructure.
I don't know how to store this information, since in Haskell I cannot change some global variable to store it, and the return value of my parser is already MyDataStructure.
Is there a way I can somehow store this new information without changing MyDataStructure, i.e. without adding a field for the extra information (the extra information is not part of MyDataStructure, so I would really like to avoid doing that)?
I hope I have been clear enough.
As @9000 says, you could use a tuple. If you find yourself needing to pass it through a number of functions, using the State monad might make things easier.

Delphi TStringList wrapper to implement on-the-fly compression

I have an application for storing many strings in a TStringList. The strings will be largely similar to one another and it occurs to me that one could compress them on the fly - i.e. store a given string in terms of a mixture of unique text fragments plus references to previously stored fragments. StringLists such as lists of fully-qualified path and filenames should be able to be compressed greatly.
Does anyone know of a TStringList descendant that implements this, i.e. provides read and write access to the uncompressed strings but stores them internally compressed, so that a TStringList.SaveToFile produces a compressed file?
While you could implement this by uncompressing the entire stringlist before each access and re-compressing it afterwards, it would be unnecessarily slow. I'm after something that is efficient for incremental operations and random "seeks" and reads.
TIA
Ross
I don't think there's any freely available implementation around for this (not that I know of anyway, although I've written at least 3 similar constructs in commercial code), so you'd have to roll your own.
The remark Marcelo made about adding items in order is very relevant, as I suppose you'll probably want to compress the data at addition time - having quick access to entries already similar to the one being added, gives a much better performance than having to look up a 'best fit entry' (needed for similarity-compression) over the entire set.
Another thing you might want to read up about, are 'ropes' - a conceptually different type than strings, which I already suggested to Marco Cantu a while back. At the cost of a next-pointer per 'twine' (for lack of a better word) you can concatenate parts of a string without keeping any duplicate data around. The main problem is how to retrieve the parts that can be combined into a new 'rope', representing your original string. Once that problem is solved, you can reconstruct the data as a string at any time, while still having compact storage.
If you don't want to go the 'rope' route, you could also try something called 'prefix reduction', which is a simple form of compression: just start out each string with an index of a previous string and the number of characters that should be treated as a prefix for the new string. Be aware that you should not recurse this too far back, or access speed will suffer greatly. In one simple implementation, I did a mod 16 on the index to establish the entry at which prefix reduction started, which gave me on average about 40% memory savings (this number is completely data-dependent, of course).
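The idea is language-independent, so here is a small sketch of it in Swift rather than Delphi. It is a simplification of what is described above: each entry always refers to the immediately preceding entry instead of storing an arbitrary index, and every entry whose index is a multiple of blockSize is stored in full, so a read never walks back more than blockSize - 1 entries. PrefixCompressedList and its members are hypothetical names.

```swift
// Hypothetical container illustrating prefix reduction.
struct PrefixCompressedList {
    // Each entry stores how many leading characters it shares with the
    // previous entry, plus the remaining suffix.
    private struct Entry {
        let sharedPrefixLength: Int
        let suffix: String
    }

    private var entries: [Entry] = []
    private var last = ""
    let blockSize: Int

    init(blockSize: Int = 16) {
        precondition(blockSize > 0, "blockSize must be positive")
        self.blockSize = blockSize
    }

    var count: Int { entries.count }

    mutating func append(_ string: String) {
        var shared = 0
        // Entries at a block boundary are stored in full (shared stays 0).
        if entries.count % blockSize != 0 {
            shared = zip(last, string).prefix(while: { $0.0 == $0.1 }).count
        }
        entries.append(Entry(sharedPrefixLength: shared,
                             suffix: String(string.dropFirst(shared))))
        last = string
    }

    subscript(index: Int) -> String {
        // Rebuild from the full entry that starts this block.
        let blockStart = index - index % blockSize
        var current = ""
        for i in blockStart...index {
            let entry = entries[i]
            current = String(current.prefix(entry.sharedPrefixLength)) + entry.suffix
        }
        return current
    }
}

var paths = PrefixCompressedList()
paths.append("/usr/local/bin/swift")
paths.append("/usr/local/bin/swiftc")
paths.append("/usr/local/lib/libfoo.dylib")
print(paths[2])  // prints "/usr/local/lib/libfoo.dylib"
```

A smaller blockSize makes random reads cheaper at the cost of storing more strings in full; how much is actually saved depends entirely on the data and on the order in which strings are added.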
You could try to wrap a Delphi or COM API around Judy arrays. The JudySL type would do the trick, and has a fairly simple interface.
EDIT: I assume you are storing unique strings and want to (or are happy to) store them in lexicographical order. If these constraints aren't acceptable, then Judy arrays are not for you. Mind you, any compression system will suffer if you don't sort your strings.
I suppose you expect general flexibility from the list (including the delete operation). In that case I don't know of any out-of-the-box solution, but I'd suggest one of two approaches:
Split each string into words, keep a separate growing dictionary of those words, and store each string internally as a list of indexes into that dictionary.
Implement something based on the zlib streams available in Delphi, but operating on blocks that contain, for example, 10-100 strings. You still have to decompress and recompress a complete block, but the price you pay is lower.
I don't think you really want to compress TStrings items in memory, because it is terribly inefficient. I suggest you look at the TStream implementations in the ZLib unit. Just wrap a regular stream in a TDecompressionStream on load and a TCompressionStream on save (you can even emit a gzip header there).
Hint: you will want to override LoadFromStream/SaveToStream instead of LoadFromFile/SaveToFile.

What datatype/structure to store file list info?

I have an application that searches files on the computer (configurable path, type, etc.). Currently it adds information to a database as soon as a matching file is found. Instead, I want to hold the information in memory for further manipulation before inserting it into the database. The list may contain a lot of items, so I consider performance an important factor. I may need to iterate through the items, so a structure that can be coded easily is another key issue. Also, how can I achieve PHP-style associative arrays for this job?
If you're using Delphi 2009, you can use a TDictionary. It takes two generic parameters. The first should be a string, for the filename, and the second would be whatever data type you're associating with. It also has three built-in enumerators, one for key-value pairs, one for keys only and one for values only, which makes iterating easy.
Another solution would be to use just a standard TStringList.
As long as it's sorted and has some duplicate setting other than dupAccept, you can use IndexOf or IndexOfName to find items in the list quickly.
It also has the Objects property, which allows you to store object information attached to the name. Starting with D2009, TStringList has the OwnsObjects property, which allows you to delegate object cleanup to the TStringList. Prior to D2009 you have to handle that yourself.
Much of this will depend on how you are going to use the list and at what scale. If you are going to use it as a stack or queue, then a TList would work fine. If you need to search through the list for a specific item, then you will need something that allows faster retrieval. TDictionary (2009) or TStringList (pre-2009) would be the most likely choice.
Dynamic arrays are also a possibility, but if you use them you will want to minimize the use of SetLength, as each time it is called it will re-allocate memory. TList manages this for you, which is why I suggested using a TList. If you know in advance how many items you will deal with, then use a dynamic array and set its length at the outset.
If you have more items than will fit in memory, then your choices also change. At that point I would use either a database table or a TFileStream to store the records to be processed, then seek to the beginning of the table/stream for processing.
Try using the AVL-Tree by http://sourceforge.net/projects/alcinoe/ as your associative Array. It has an iterate-method for fast iteration. You may need to derive from his baseclass and implement your own comparator, but it's easy to use.
Examples are included.
