I thought I'd ask after hours of inconclusive research and tests:
Introduction
I'm trying to send very large arrays of Doubles from an app to a server. Naturally, I want to compress them as much as possible.
Specifically, these arrays contain CMDeviceMotion components (acceleration x, y, z, gyroscope, etc.), but this question should apply to any large array of numbers (over 100K or a million values).
What I've tried and found by researching options
Say I have a large array of Double (There are many others) :
var CMX = CM.map({$0.userAcceleration.x})
here, CMX is of type [Double] and CM is [CMDeviceMotion]
I've tried making POST requests to my server by sending CMX in different ways, then calculating the total size after I receive it on the server :
First, as a single comma separated string :
{"AX":"-0.0441827848553658,-0.103976868093014,-0.117475733160973,-0.206566318869591,-0.266509801149368,-0.282151937484741,-0.260240525007248,-0.266505032777786,-0.315020948648453,-0.305839896202087,0.0255246963351965,0.0783950537443161,0.0749507397413254,0.0760494321584702,-0.0101579604670405,0.106710642576218,0.131824940443039,0.0630970001220703,0.21177926659584,0.27022996544838,0.222621202468872,0.234281644225121,0.288497060537338,0.176655143499374,0.193904414772987,0.169417425990105,0.150193274021149,0.00871349219232798,-0.0270088445395231,-0.0 ....
Size 153 Kb.
It makes sense that this is larger than sending as binary data, since a single number here is 64 bits (8 bytes), and becomes 17 bytes long (one byte per character) +1 = 18 (added a character for the comma).
With this reasoning, sending the array as binary data should be smaller.
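The arithmetic above can be checked with a rough estimate. This is only a sketch: the helper name and the default of 18 text bytes per value are assumptions taken from the reasoning above, and the base64 figure uses the standard rule of 4 output characters per 3 input bytes.

```swift
import Foundation

/// Hypothetical helper: rough payload sizes (in bytes) for `count` Double values.
func estimatedSizes(count: Int, textBytesPerValue: Int = 18) -> (text: Int, binary: Int, base64: Int) {
    let text = count * textBytesPerValue            // digits plus one comma per value
    let binary = count * MemoryLayout<Double>.size  // 8 raw bytes per Double
    let base64 = 4 * ((binary + 2) / 3)             // 4 characters per 3 bytes, padded
    return (text, binary, base64)
}

let sizes = estimatedSizes(count: 9_000)
print(sizes.text, sizes.binary, sizes.base64) // 162000 72000 96000
```

By this estimate, raw doubles stay roughly 40% smaller than decimal text even after base64 encoding, as long as nothing else (such as an archiver's own bookkeeping) is wrapped around them.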
Base 64 encoding
Here, I convert the array to a Data object using NSKeyedArchiver and base 64 encode the data before sending it :
["AX":NSKeyedArchiver.archivedData(withRootObject:CM.map({$0.userAcceleration.x})).base64EncodedString()]
This made the file size 206 Kb
Sending the data as a JSON array
By just sending :
["AX": CM.map({$0.userAcceleration.x})]
It turned out that this array of numbers was practically converted to a comma separated string, the size ended up being the same as in trial 1 (160Kb)
Sending as Data without base 64 encoding
Doing this:
["AX":NSKeyedArchiver.archivedData(withRootObject:CM.map({$0.userAcceleration.x}))]
made the application crash at runtime, so I can't send a Data object as a value in a JSON object.
Question
How can I send these arrays in a more condensed way in a JSON object?
Note that I already have downsampling in mind, and using 32 bit floats to reduce the size.
A simple way would be to do this:
let data: Data = CMX.withUnsafeBufferPointer { pointer in
return Data(buffer: pointer)
}
This gives you a binary buffer with all your Doubles/Floats packed together.
But because a JSON value has to be text, you will have to convert this data to a base64 string:
let base64String = data.base64EncodedString()
Pass this base64String as the AX parameter of your POST request.
EDIT:
To convert it back you may use code like this:
extension Array {
init?(data: Data) {
// This check should be more complex, but here we just check if total byte count divides to one element size in bytes
guard data.count % MemoryLayout<Element>.size == 0 else { return nil }
let elementCount = data.count / MemoryLayout<Element>.size
let buffer = UnsafeMutableBufferPointer<Element>.allocate(capacity: elementCount)
data.copyBytes(to: buffer)
self = buffer.map({$0})
buffer.deallocate()
}
// Wrapped here code above
var data: Data {
return self.withUnsafeBufferPointer { pointer in
return Data(buffer: pointer)
}
}
}
let converted: [Double]? = Array(data: CMX.data) // converted now should be equal to CMX
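The question also mentions 32-bit floats. A minimal sketch of that idea follows, assuming the loss of precision is acceptable. The function names are made up for illustration. Unlike Data(buffer:), which copies values in the host's byte order, this version writes little-endian bytes explicitly so the server does not have to guess.

```swift
import Foundation

/// Hypothetical helper: narrows each Double to Float and stores it as 4 little-endian bytes.
func packFloat32LE(_ values: [Double]) -> Data {
    var data = Data(capacity: values.count * 4)
    for value in values {
        let bits = Float(value).bitPattern.littleEndian
        withUnsafeBytes(of: bits) { data.append(contentsOf: $0) }
    }
    return data
}

/// Hypothetical inverse: returns nil if the byte count is not a multiple of 4.
func unpackFloat32LE(_ data: Data) -> [Double]? {
    guard data.count % 4 == 0 else { return nil }
    var result: [Double] = []
    result.reserveCapacity(data.count / 4)
    var index = data.startIndex
    while index < data.endIndex {
        var bits: UInt32 = 0
        for offset in 0..<4 {
            bits |= UInt32(data[index + offset]) << UInt32(8 * offset)
        }
        result.append(Double(Float(bitPattern: bits)))
        index += 4
    }
    return result
}

let samples: [Double] = [0.5, -0.25, 1.0]
let payload = ["AX": packFloat32LE(samples).base64EncodedString()]
print(payload["AX"]?.count ?? 0) // 16 characters for 3 values
```

That is 4 bytes per value before base64 and about 5.3 characters per value after it, compared with roughly 18 characters per value as decimal text.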
Related
I'm trying to compress data to improve the space complexity, but I'm not sure if I'm incorrectly compressing data or incorrectly measuring the size.
I tried the following in the Playground.
import Foundation
import Compression
// Example data
struct MyData: Encodable {
let property = "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum."
}
// I tried using MemoryLayout to measure the size of the uncompressed data
let size = MemoryLayout<MyData>.size
print("myData type size", size) // 16
let myData = MyData()
let myDataSize = MemoryLayout.size(ofValue: myData)
print("myData instance size", myDataSize) // 16
func run() {
// 1. This shows the size of the encoded data
guard let encoded = try? JSONEncoder().encode(myData) else { return }
print("myData encoded size", encoded) // 589 bytes
/// 2. This shows the size after using a first compression method
guard let compressed = try? (encoded as NSData).compressed(using: .lzfse) else { return }
let firstCompression = Data(compressed)
print("firstCompression", firstCompression) // 491 bytes
/// 3. Second compression method (just wanted to try a different compression method)
let secondCompression = compress(encoded)
print("secondCompression", secondCompression) // 491 bytes
/// 4. Wanted to compare the size difference between compressed and uncompressed for a bigger data so here is the array of uncompressed data.
var myDataArray = [MyData]()
for _ in 0 ... 100 {
myDataArray.append(MyData())
}
guard let encodedArray = try? JSONEncoder().encode(myDataArray) else { return }
print("myData encodedArray size", encodedArray) // 59591 bytes
print("memory layout", MemoryLayout.size(ofValue: encodedArray)) // 16
/// 5. Compressed array
var compressedArray = [Data]()
for _ in 0 ... 100 {
guard let compressed = try? (encoded as NSData).compressed(using: .lzfse) else { return }
let data = Data(compressed)
compressedArray.append(data)
}
guard let encodedCompressedArray = try? JSONEncoder().encode(compressedArray) else { return }
print("myData compressed array size", encodedCompressedArray) // 66661 bytes
print("memory layout", MemoryLayout.size(ofValue: encodedCompressedArray)) // 16
/// 6. Compression using lzma
var differentCompressionArray = [Data]()
for _ in 0 ... 100 {
guard let compressed = try? (encoded as NSData).compressed(using: .lzma) else { return }
let data = Data(compressed)
differentCompressionArray.append(data)
}
guard let encodedCompressedArray2 = try? JSONEncoder().encode(differentCompressionArray) else { return }
print("myData compressed array size", encodedCompressedArray2) // 60702 bytes
print("memory layout", MemoryLayout.size(ofValue: encodedCompressedArray2)) // 16
}
run()
// The implementation for the second compression method
func compress(_ sourceData: Data) -> Data {
let pageSize = 128
var compressedData = Data()
do {
let outputFilter = try OutputFilter(.compress, using: .lzfse) { (data: Data?) -> Void in
if let data = data {
compressedData.append(data)
}
}
var index = 0
let bufferSize = sourceData.count
while true {
let rangeLength = min(pageSize, bufferSize - index)
let subdata = sourceData.subdata(in: index ..< index + rangeLength)
index += rangeLength
try outputFilter.write(subdata)
if (rangeLength == 0) {
break
}
}
}catch {
fatalError("Error occurred during encoding: \(error.localizedDescription).")
}
return compressedData
}
MemoryLayout doesn't seem to be helpful in measuring the size of encoded arrays, whether or not they're compressed. I'm not sure how to measure a struct or an array of structs without encoding them with JSONEncoder, which already compresses the data.
The before/after compression for the single instance of MyData (#1, #2, and #3) seems to show that the data is being properly compressed going from 589 bytes to 491 bytes. However, the comparison between an array of uncompressed data and an array of compressed data (#4, #5) seems to show that the size increased from 59591 to 66661 after the compression.
Finally, I tried using a different compression algorithm lzma (#6). It reduced the size to 60702 which is lower than the previous compression, but it still wasn't smaller than the uncompressed data.
To get a bit of confusion out of the way first: MemoryLayout gives you information about the size and structure of the layout of a type at compile time, but can't be used to determine the amount of storage an Array value needs at runtime because the size of the Array structure itself does not depend on how much data it contains.
Highly simplified, the layout of an Array value looks like this:
┌─────────────────────┐
│        Array        │
├──────────┬──────────┤    ┌──────────────────┐
│  length  │ buffer  ─┼───▶│     storage      │
└──────────┴──────────┘    └──────────────────┘
  1 word /   1 word /
  8 bytes    8 bytes
└──────────┬──────────┘
           └─▶ MemoryLayout<Array<UInt8>>.size
An Array value stores its length, or count (mixed in with some flags, but we don't need to worry about that) and a pointer to the actual space where the items it contains are stored. Those items aren't stored as part of the Array value itself, but separately in allocated memory which the Array points to. Whether the Array "contains" 10 values or 100000 values, the size of the Array structure remains the same: 1 word (or 8 bytes on a 64-bit system) for the length, and 1 word for the pointer to the actual underlying storage. (The size of the storage buffer, however, is exactly determined by the number of elements it is able to contain, at runtime.)
In practice, Array is significantly more complicated than this for bridging and other reasons, but this is the basic gist; this is why you only ever see MemoryLayout.size(ofValue:) return the same number every time. [And incidentally, the size of String is the same as Array for similar reasons, which is why MemoryLayout<MyData>.size also reports 16.]
In order to know how many bytes an Array or a Data effectively take up, it's sufficient to ask them for their .count: Array<UInt8> and Data are both collections of UInt8 values (bytes), and their .count will reflect the amount of data effectively stored in their underlying storage.
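A small sketch of the difference between the two measurements. The variable names are illustrative. The exact inline size differs by type and toolchain (on current 64-bit platforms [UInt8] is typically a single pointer, while Data and String typically report 16), so the example only relies on that size being constant.

```swift
import Foundation

let small = Data(count: 10)
let large = Data(count: 100_000)

// The inline size of the Data value is fixed, whatever it holds.
print(MemoryLayout.size(ofValue: small) == MemoryLayout.size(ofValue: large)) // true

// count reports the number of bytes actually stored.
print(small.count, large.count) // 10 100000

// For arrays of larger elements, multiply count by the element stride.
let doubles = [Double](repeating: 0, count: 1_000)
print(doubles.count * MemoryLayout<Double>.stride) // 8000
```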
As for the size increase between step (4) and (5), note that
Step 4 takes 101 copies of your MyData (the range 0 ... 100 is inclusive) and joins them together before converting them to JSON, while
Step 5 takes 101 copies of individually compressed MyData instances, joins those together, and then re-converts them to JSON
Step 5 has a few issues compared to step 4:
Compression benefits heavily from repetition in data: a bit of data compressed and repeated 100 times won't be nearly as compact as a bit of data repeated 100 times, then compressed, because each round of compression can't benefit from knowing that there's another copy of the data that came before it. As a simple example:
Let's say we wanted to use a form of run-length encoding to compress the string Hello: there isn't a lot we can do, except maybe turn it into Hel{2}o (where {2} indicates a repetition of the last character 2 times)
If we compress Hello and join it 3 times, we might get Hel{2}oHel{2}oHel{2}o,
But if we first joined Hello 3 times and then compressed, we could get {Hel{2}o}{3}, which is much more compact
Compression also typically needs to insert some information about how the data was compressed in order to be able to recognize and decompress the data later. By compressing MyData 101 times and joining all of those instances, you're repeating that metadata 101 times
Even after compressing your MyData instances, re-representing them as JSON decreases how compressed they are because it can't represent the binary data exactly. Instead, it has to convert each Data blob into a Base64-encoded string, which causes it to grow again
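The Base64 growth in the last point can be seen in isolation. In this sketch the blob is a stand-in, not real compressed output; its 491-byte length is borrowed from the compressed size reported in the question.

```swift
import Foundation

// Stand-in for one compressed blob (not real compressed data).
let blob = Data(repeating: 0xAB, count: 491)

// JSONEncoder's default dataEncodingStrategy is .base64, so each Data becomes a quoted string.
let base64Length = blob.base64EncodedString().count
print(blob.count, base64Length) // 491 656

if let json = try? JSONEncoder().encode([blob]) {
    // Brackets and quotes are added on top of the base64 text.
    print(json.count > blob.count) // true
}
```

At about 656 characters per 491-byte blob, the Base64 step alone accounts for most of the 66661 bytes measured in step 5.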
Between these issues, it's not terribly surprising that your data is growing. What you actually want is a modification to step 4, which is compressing the joined data:
guard let encodedArray = try? JSONEncoder().encode(myDataArray) else { fatalError() }
guard let compressedEncodedArray = try? (encodedArray as NSData).compressed(using: .lzma) else { fatalError() }
print(compressedEncodedArray.count) // => 520
This is significantly better than
guard let encodedCompressedArray = try? JSONEncoder().encode(compressedArray) else { fatalError() }
print(encodedCompressedArray.count) // => 66661
As an aside: it seems unlikely that you're actually using JSONEncoder in practice to join data in this way, and this was just for measurement here — but if you actually are, consider other mechanisms for doing this. Converting binary data to JSON in this way is very inefficient storage-wise, and with a bit more information about what you might actually need in practice, we might be able to recommend a more effective way to do this.
If what you're actually doing in practice is encoding an Encodable object tree and then compressing that the one time, that's totally fine.
I want to keep just the data chunk of the .wav file and exclude all other chunks, i.e. the RIFF headers.
let voiceData = try? Data(contentsOf: soundUrl).advanced(by: 44)
I did try this, but for some reason there is still some baggage left before the actual audio. Could anyone please help me with this issue? Is there an efficient way to read the .wav file and only include the data section?
First, are you certain this is actually a WAV file? WAV does typically have 44 bytes of header. Why do you believe there is "some baggage"? How are you determining that?
You can of course parse the RIFF format directly. The easiest (sloppiest) approach is to scan down until you find the bytes "data" (0x64 61 74 61). The next 4 bytes will be the length (in little-endian format, which you can skip if you're just going to read to the end), followed by the actual data you want.
Finding the data bytes is done with range(of:)
let dataBytes = Data([0x64, 0x61, 0x74, 0x61])
if let dataRange = riff.range(of: dataBytes) {
let start = dataRange.endIndex + 4 // Skip over length bytes
let samples = riff[start...] // read the rest of the bytes
// use samples
}
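A less sloppy variant walks the chunk list instead of scanning for the bytes. This is a sketch under the usual RIFF layout assumptions (a 12-byte "RIFF"/size/"WAVE" header, then chunks made of a 4-byte id, a 4-byte little-endian size, and a payload padded to an even length). The function name is made up. Files that carry extra chunks such as LIST before "data" have headers longer than 44 bytes, which is one possible source of the "baggage".

```swift
import Foundation

/// Hypothetical helper: returns the payload of the first "data" chunk in a RIFF/WAVE file,
/// or nil if the bytes do not look like a WAVE file or the chunk is missing.
func wavDataChunk(in riff: Data) -> Data? {
    let bytes = [UInt8](riff) // copy so that indices start at 0
    guard bytes.count >= 12,
          Array(bytes[0..<4]) == Array("RIFF".utf8),
          Array(bytes[8..<12]) == Array("WAVE".utf8) else { return nil }

    var offset = 12 // the first chunk header follows the 12-byte RIFF header
    while offset + 8 <= bytes.count {
        let id = Array(bytes[offset..<offset + 4])
        // The chunk size is a 32-bit little-endian integer.
        var size = 0
        for i in 0..<4 {
            size |= Int(bytes[offset + 4 + i]) << (8 * i)
        }
        let start = offset + 8
        if id == Array("data".utf8) {
            let end = min(start + size, bytes.count) // tolerate a truncated file
            return Data(bytes[start..<end])
        }
        offset = start + size + (size & 1) // chunks are padded to an even length
    }
    return nil
}
```

Because it honors the declared chunk length, this also drops any chunks that follow the audio, which reading "to the end" would include.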
I'm sending a video via OutputStream.write(_:maxLength:), but the write method doesn't send all the data bytes, only a fixed amount every time.
The total data count is videoData.count = 7357450 but the bytes written(returned by outputStream.write) is only 131768.
This is the method for writing to output stream.
extension OutputStream {
func write(data: Data) -> Int {
return data.withUnsafeBytes { write($0, maxLength: data.count) }
}
}
Is there something wrong with the code?
Is there a way to increase the .write capacity?
Note: This is not related to this question: Writing Data to an NSOutputStream in Swift 3. This question asks how to write while my question is about the limits of writing data.
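For context, write(_:maxLength:) returns the number of bytes it actually wrote, which can be fewer than requested, so the usual pattern is to keep writing from where the last call stopped. The sketch below assumes a stream that blocks until it can accept more bytes. For a stream scheduled on a run loop you would instead wait for the hasSpaceAvailable event before each write. The writeAll name is hypothetical.

```swift
import Foundation

extension OutputStream {
    /// Hypothetical helper: repeats write(_:maxLength:) until all of `data` is sent.
    /// Returns the number of bytes written, or -1 if the stream reported an error
    /// or stopped accepting bytes.
    func writeAll(_ data: Data) -> Int {
        return data.withUnsafeBytes { (raw: UnsafeRawBufferPointer) -> Int in
            guard let base = raw.bindMemory(to: UInt8.self).baseAddress else { return 0 }
            var total = 0
            while total < raw.count {
                let written = self.write(base + total, maxLength: raw.count - total)
                if written <= 0 { return -1 }
                total += written
            }
            return total
        }
    }
}
```

The per-call amount is decided by the stream and the underlying socket buffers, so looping is generally simpler than trying to raise it.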
There are a lot of similar questions on here, but I couldn't find an answer that helped me solve this problem.
What I want to do is upload an image with Alamofire along with parameters,
and the image itself should be part of the parameters.
For example the parameters should be like this:
["user_id": 1, "picture": 'and here will be the image that i want to upload']
In the response I will get the image as a link like this:
"/uploads/members/'user_id'/images/member_'user_id'/28121284ase2.jpg"
or something similar to that.
Also, the image should be JPG or PNG and no larger than 5 MB.
I'm new to uploading images with an API, so I don't know what to do exactly.
I tried every solution I could find here and on Google, but no luck so far. Any help would be appreciated.
I'm going to assume that you have an instance of UIImage to start with. You'll have to encode it to base64 and send it to your backend as you would any other String parameter. This should help:
extension UIImage {
func toBase64() -> String? {
guard let imageRectoData = jpeg(.low) else {
return nil
}
return imageRectoData.base64EncodedString()
}
enum JPEGQuality: CGFloat {
case lowest = 0
case low = 0.25
case medium = 0.5
case high = 0.75
case highest = 1
}
/// Returns the data for the specified image in JPEG format.
/// If the image object’s underlying image data has been purged, calling this function forces that data to be reloaded into memory.
/// - returns: A data object containing the JPEG data, or nil if there was a problem generating the data. This function may return nil if the image has no data or if the underlying CGImageRef contains data in an unsupported bitmap format.
func jpeg(_ jpegQuality: JPEGQuality) -> Data? {
return jpegData(compressionQuality: jpegQuality.rawValue)
}
}
You can do that by using a multipart/form-data request.
Your question is related to this one: Upload image with multipart form-data iOS in Swift
I am an intermediate student in iOS development, and I am trying to write a method that uploads an image to a server. I understand the server-side scripting in PHP.
But when I follow a tutorial to upload an image in Xcode, I don't really grasp NSData, NSObject, NSMutableData, and NSString. It seems the tutorial doesn't really explain the fundamental aspects of NSData, NSMutableData, NSString...
If I want to build a beautiful display, I know I should learn about auto layout, collection views, etc.
So, what topics in iOS development should I learn to really understand these things step by step? It seems that I never learned specifically about them, and I don't know where to start.
The code to upload an image is like this:
func createBodyWithParameters(parameters: [String: String]?, filePathKey: String?, imageDataKey: NSData, boundary: String) -> NSData {
let body = NSMutableData();
if parameters != nil {
for (key, value) in parameters! {
body.appendString("--\(boundary)\r\n")
body.appendString("Content-Disposition: form-data; name=\"\(key)\"\r\n\r\n")
body.appendString("\(value)\r\n")
}
}
let filename = "user-profile.jpg"
let mimetype = "image/jpg"
body.appendString("--\(boundary)\r\n")
body.appendString("Content-Disposition: form-data; name=\"\(filePathKey!)\"; filename=\"\(filename)\"\r\n")
body.appendString("Content-Type: \(mimetype)\r\n\r\n")
body.appendData(imageDataKey)
body.appendString("\r\n")
body.appendString("--\(boundary)--\r\n")
return body
}
func generateBoundaryString() -> String {
return "Boundary-\(NSUUID().UUIDString)"
}
}
extension NSMutableData {
func appendString(string: String) {
let data = string.dataUsingEncoding(NSUTF8StringEncoding, allowLossyConversion: true)
appendData(data!)
}
}
Learning about the objects you have described may mean many things. If you are after their capabilities, then the documentation should be enough. But if you are more after what is under the hood and why we need these objects, then I can only suggest looking into an older language like C.
These objects NSData, NSMutableData, NSString are all data containers, buffers. But NSObject is just a base class from which all other objects inherit.
So a bit about NSData and NSMutableData:
In C when creating a raw buffer you use malloc which reserves a chunk in memory which you may use as you please. Once done you need to call free to release that memory or you will have a memory leak.
void *voidPointer = malloc(100); // Reserved 100 bytes of whatever data
int *intPointer = (int *)malloc(sizeof(int)*100); // Reserved enough bytes to fit 100 integers whatever their size may be
intPointer[13] = 1; // You may use these as normal array
free(voidPointer); // Free this memory
free(intPointer); // Free this memory
So NSData is basically a wrapper for that and does all of it for you. You may even access the raw pointer by calling bytes on NSData object.
Then the mutable version NSMutableData is just a subclass which has some additional functionality. You may actually append data. Under the hood, appending data is not so simple. You need to allocate a new memory chunk, copy the old data to it, copy the new data and release the previous memory chunk.
void *currentData = malloc(100); // Assume we have some data
void *dataToAppend = malloc(100); // Another chunk of data we want to append
void *combinedData = malloc(200); // We need a larger buffer now
memcpy(combinedData, currentData, 100); // Copy first 100 bytes to new data
memcpy((char *)combinedData + 100, dataToAppend, 100); // Copy next 100 bytes; cast because arithmetic on void * is not standard C
free(currentData); // Free old data
free(dataToAppend); // Free old data
// ... use combinedData here ...
free(combinedData); // Remember to free combined data once done
These are all really simple methods, but they may already be a pain to write and it is easy to produce bugs doing so. So NSData or NSMutableData and even Data in Swift are all just data containers that make your developer life easier. And in Objective-C, conversion from data to C buffers is as easy as it gets:
NSData *myData = [NSData dataWithBytes:myDataPointer length:myDataLength];
const void *myRawPointer = [myData bytes]; // bytes returns a read-only pointer
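For comparison, here is roughly the same sequence with Swift's Data. This is a sketch with arbitrary fill values; the allocation, copying and freeing from the C version all happen inside the type.

```swift
import Foundation

var current = Data(repeating: 0x01, count: 100) // like malloc(100) plus filling the bytes
let toAppend = Data(repeating: 0x02, count: 100)

// Data grows its own storage and copies the bytes; there is nothing to free by hand.
current.append(toAppend)
print(current.count) // 200

// Raw access is still available when an API needs a pointer.
let firstByte = current.withUnsafeBytes { (raw: UnsafeRawBufferPointer) in raw[0] }
print(firstByte, current[100]) // 1 2
```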
The NSString is not really that different. In C we again have character pointer which is used as string so we write something like:
char *myText = "Some text";
These are a bit special, a convenience really. We could as well do:
char *myText = (char *)malloc(sizeof(char)*100);
And then fill the data character by character:
myText[0] = 'S';
myText[1] = 'o';
myText[2] = 'm';
...
myText[8] = 't';
myText[9] = '\0'; // We even need to set a null terminator at the end
and then we would need to free the memory again... But never mind the C strings; NSString is again a wrapper that is responsible for allocating the memory, assigning the data and doing whatever you want with it. It has many methods you can use simply to make your life easier.
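The same relationship between text and bytes can be seen from Swift. A short sketch, using only standard library and Foundation calls:

```swift
import Foundation

let text = "Some text"

// UTF-8 bytes without a terminator, as used when building request bodies.
let utf8Bytes = Data(text.utf8)
print(utf8Bytes.count) // 9

// C-style representation: the same bytes plus a trailing NUL.
let cString = Array(text.utf8CString)
print(cString.count, cString.last == 0) // 10 true

// And back again from raw bytes.
print(String(decoding: utf8Bytes, as: UTF8.self)) // Some text
```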
As to the code you posted, it is a combination of the two. In your case your API accepts images as multipart form data requests, which you may understand as a raw image file with a few pieces of text added around it just to explain what the data contains. It is one commonly used way, but not the only one. You might as well just post the raw image data, or you might even post a JSON containing the data as a base64-encoded string. Also, these pieces of text are usually represented as UTF-8 encoded data.
In the end it is a set of standards that are generally used so our computers may communicate with each other. Your image is most likely defined by a standard such as PNG or JPEG that says how to represent it as a string of bytes, your strings are defined by the UTF-8 standard, and your whole request body is defined by HTTP-related standards (multipart/form-data itself is specified in RFC 7578). And the objects you use and want to learn about are just helpers for achieving your result. Understanding them is like understanding a screwdriver: in most cases you won't need to know how they work, but you do need to know they exist and when to use them.
The code itself you posted is relatively bad but should do its job. For a beginner it might be a bit confusing even. Probably a more logical pseudocode for this solution would be something like:
let imageData: Data // My image data
let headerString: String // Text I need to put before the image data
let footerString: String // Text I need to put after the image data
var dataToSend: Data = Data() // Generate data object
dataToSend.append(Data(headerString.utf8)) // Append header as UTF-8 bytes
dataToSend.append(imageData) // Append raw data
dataToSend.append(Data(footerString.utf8)) // Append footer as UTF-8 bytes
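Filled in with actual header and footer text, the same idea might look like the sketch below. The function name, field names and sample bytes are placeholders, and it handles only a single file part. Note that the registered MIME type for JPEG is image/jpeg.

```swift
import Foundation

/// Hypothetical helper: builds a multipart/form-data body with text fields and one file part.
func makeMultipartBody(fields: [String: String], fileField: String, filename: String,
                       mimeType: String, fileData: Data, boundary: String) -> Data {
    var body = Data()
    // Sorted only to make the output deterministic; the order is not significant.
    for (name, value) in fields.sorted(by: { $0.key < $1.key }) {
        body.append(Data("--\(boundary)\r\n".utf8))
        body.append(Data("Content-Disposition: form-data; name=\"\(name)\"\r\n\r\n".utf8))
        body.append(Data("\(value)\r\n".utf8))
    }
    // Header text, raw file bytes, then footer text with the closing boundary.
    body.append(Data("--\(boundary)\r\n".utf8))
    body.append(Data("Content-Disposition: form-data; name=\"\(fileField)\"; filename=\"\(filename)\"\r\n".utf8))
    body.append(Data("Content-Type: \(mimeType)\r\n\r\n".utf8))
    body.append(fileData)
    body.append(Data("\r\n--\(boundary)--\r\n".utf8))
    return body
}

let boundary = "Boundary-\(UUID().uuidString)"
let body = makeMultipartBody(fields: ["user_id": "1"], fileField: "picture",
                             filename: "photo.jpg", mimeType: "image/jpeg",
                             fileData: Data([0xFF, 0xD8, 0xFF]), boundary: boundary)
print(body.count)
```

The request's Content-Type header must then be "multipart/form-data; boundary=" followed by the same boundary string, so the server knows where each part ends.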
I hope this clears up a few things.