I want to integrate Flite TTS into my iOS app. The Flite download is close to 68 MB, and the source files added to the project take up a lot of space. I want to know the extra overhead of using the Flite TTS library in my iOS app.
Flite overhead is about 3 MB for a single voice and can go up to 15 MB for multiple voices. The quality of the default 3 MB voice is also not that great.
I want my app to include a bunch of 30-second MP4 clips. I want to ship these clips with the app and not have users download them from the cloud.
Each of my clips is around 5 MB, and I expect to have a lot of them.
Is there a way to compress them to reduce the app download size? (The 5 MB size is after all the codecs, etc.) I need an iOS solution for this.
MP4 is already heavily compressed, so there isn't a way to compress it further. That's why zipping MP4s barely changes their size.
You have two options:
1) Include whichever ones the user needs first and download the rest, hopefully before they're needed.
2) If you absolutely have to have them all in the app you could reduce the resolution and/or encode at a lower bitrate.
If you go with option 2, you could still download higher quality ones from the cloud in the background and use those if available, but default to the lower quality ones if not.
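For option 2, a minimal Swift sketch of re-encoding a clip with AVFoundation before bundling it might look like this (the function name and preset choice are assumptions; pick whichever preset or bitrate matches your quality target):

    import AVFoundation

    // Sketch: re-encode a source clip with a lower-quality preset to shrink it.
    func reencodeClip(at sourceURL: URL, to outputURL: URL,
                      completion: @escaping (Bool) -> Void) {
        let asset = AVAsset(url: sourceURL)
        guard let export = AVAssetExportSession(asset: asset,
                                                presetName: AVAssetExportPresetMediumQuality) else {
            completion(false)
            return
        }
        export.outputURL = outputURL
        export.outputFileType = .mp4
        export.exportAsynchronously {
            // .completed means the smaller file is ready at outputURL.
            completion(export.status == .completed)
        }
    }

You could keep the smaller output in the bundle and swap in the downloaded high-quality file when it is available, as described above.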
We are using the Web Audio API to play and manipulate audio in a web app.
When trying to decode large MP3 files (around 5 MB), memory usage spikes in Safari on iPad, and if we load another file of similar size it simply crashes.
It seems like the Web Audio API is not really usable on the iPad unless we use small files.
Note that the same code works well in desktop Chrome; desktop Safari does complain about high memory usage.
Does anybody know how to get around this issue, or what the memory limit is for playing audio files with Web Audio on an iPad?
Thanks!
Decoded audio weighs a lot more in RAM than on disk: a single sample uses 4 bytes (32-bit float). That translates to about 230 MB of RAM for 10 minutes of stereo audio at a 48,000 Hz sample rate, and one hour of audio at the same sample rate in stereo will take ~1.38 GB of RAM!
So, if you decode a lot of files, you can consume large amounts of RAM. My suggestion is to "undecode" files that you don't need (just "forget" unneeded audio buffers so the garbage collector can free the memory).
You can also use mono audio files instead of stereo, which should roughly halve memory usage.
Note that decoded audio is always resampled to the device's sample rate, so using files with lower sample rates won't help with memory usage.
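As a quick sanity check of those figures, here is a back-of-the-envelope calculation (a sketch only; the constants are just the numbers mentioned above):

    // decoded size = sampleRate * seconds * channels * bytesPerSample
    let sampleRate = 48_000.0       // buffers are resampled to the device rate
    let channels = 2.0              // stereo
    let bytesPerSample = 4.0        // 32-bit float samples

    let tenMinutes = sampleRate * 600.0 * channels * bytesPerSample
    let oneHour = sampleRate * 3_600.0 * channels * bytesPerSample
    print(tenMinutes / 1_000_000)       // ~230 MB
    print(oneHour / 1_000_000_000)      // ~1.38 GB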
I am using the VLC plugin (VLC web plugin 2.1.3.0) in Firefox to display live streams from my server in the browser. I need to display 16 channels on one web page, but when I play more than 10 channels at the same time, the processor hits 100% and the video starts breaking up. Checking the plugin's memory usage in the task manager, I see that around 45 MB of memory is dedicated to each video (so for 10 channels: 10 × 45 = 450 MB).
Do you know of any way to reduce the VLC plugin's resource consumption so that 16 channels can be displayed at the same time?
Best regards,
There is no real way to do that. You could probably save a few megabytes by disabling audio decoding if some of your 16 streams carry audio tracks you don't need. Beyond that, 45 MB per stream is quite reasonable for VLC playback, and you won't get much below that unless you reduce the video dimensions.
Additionally, your problem is probably not the use of half a gigabyte of memory (Chrome and Firefox easily use that much by themselves if you open a few tabs), but that VLC exceeds your CPU capacity. Make sure not to use windowless playback, since it is less efficient than the normal windowed mode.
VLC 2.2 will improve the performance of the web plugins on Windows by adding the hardware acceleration already known from the standalone application.
I am developing a game for iOS. The memory I am using is around 80 MB according to the profile tool (no leaks). That just seems like a lot of memory to me. How much memory usage is safe, and are there any special programming issues associated with using a lot of memory?
It's all about which devices you plan to target with your game. iPads are fine with games that use 100 MB of RAM, but an iPhone 3G has only 128 MB of RAM in total, and your app gets far less than that. If you use too much memory, your app will be forced to close, which could cause major problems if the user is on an old device.
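Whatever budget you aim for, it helps to respond to memory warnings and drop anything you can recreate later. A minimal sketch (the cache property is hypothetical):

    import UIKit

    class GameViewController: UIViewController {
        // Hypothetical cache of assets that can be rebuilt from disk when needed.
        var assetCache: [String: Data] = [:]

        // UIKit calls this when the system is running low on memory.
        // Releasing re-creatable data here lowers the risk of the app being killed.
        override func didReceiveMemoryWarning() {
            super.didReceiveMemoryWarning()
            assetCache.removeAll()
        }
    }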
I am currently trying out the IVONA SDK for iOS; the voice is amazing and very natural.
But the voice I am using (German female) has a voice file with a size of 230 MB.
If I want to use 4 voices, my app ends up approximately 1 GB in size.
And it is also no use for offline. Is this voice just for the test phase, or is it also for production?
I think it's horrible to bundle even a few voices with a small TTS application when it makes the app size so huge...
Can someone give me an answer to that?
Perhaps the best solution would be to include no voices and let the user download whichever voice they prefer. You could also unlock each voice as a separate in-app purchase if you want to monetize each voice.
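A minimal sketch of downloading a voice on demand with URLSession (the URL, file extension, and function name are placeholders, not a real IVONA endpoint):

    import Foundation

    func downloadVoice(named name: String,
                       completion: @escaping (URL?) -> Void) {
        // Placeholder URL; point this at wherever you actually host the voice data.
        let remote = URL(string: "https://example.com/voices/\(name).vox")!
        URLSession.shared.downloadTask(with: remote) { tempURL, _, error in
            guard let tempURL = tempURL, error == nil else {
                completion(nil)
                return
            }
            // Move the temporary download somewhere permanent before it is cleaned up.
            let dest = FileManager.default
                .urls(for: .documentDirectory, in: .userDomainMask)[0]
                .appendingPathComponent("\(name).vox")
            try? FileManager.default.removeItem(at: dest)
            do {
                try FileManager.default.moveItem(at: tempURL, to: dest)
                completion(dest)
            } catch {
                completion(nil)
            }
        }.resume()
    }

If each voice is sold as an IAP, the same download step would simply run after the purchase completes.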
Voices for testing are indeed the same as for production, but IVONA offers different sizes for each voice:
You could opt to use the IVONA voices meant for automotive/navigation systems. These voices are limited, so they are only about 70 MB in size, and they are sampled at 16 kHz instead of 22 kHz. If you have a navigation app, these are for you; otherwise, just give them a try and ask your contact at IVONA about this.
In my project we use 5 of these voices; each "vox" file is between 65 and 74 MB.
But even these smaller voices grow the bundle considerably (though not as much as your 230 MB), so we decided to download them on demand (as in-app purchases hosted by Apple). Consider that users normally need only one language, so it would be a waste of space to bundle more than one voice with the app.
Another option is to prepare a set of samples and bundle them instead of the IVONA voice. Of course, this only works if you have a limited set of texts without dynamic parts. You could also write a small sound-queueing engine to splice sounds together, e.g. for numbers.
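For the splicing idea, one simple approach is to queue the pre-recorded samples with AVQueuePlayer. A sketch, assuming the samples are bundled as "num_<digit>.m4a" files (keep a strong reference to the returned player while it plays):

    import AVFoundation

    // Plays digit samples back to back, e.g. [1, 0, 5] -> "one", "zero", "five".
    func playSplicedNumber(_ digits: [Int]) -> AVQueuePlayer {
        let items = digits.compactMap { digit -> AVPlayerItem? in
            guard let url = Bundle.main.url(forResource: "num_\(digit)",
                                            withExtension: "m4a") else { return nil }
            return AVPlayerItem(url: url)
        }
        let player = AVQueuePlayer(items: items)
        player.play()
        return player
    }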