AVMutableMetadataItem's time & duration INVALID after reading - ios

I have a question.
Recently I needed to add custom tags to recorded video (local video on the device, not streamed video). The task is to add event-specific tags to the video, whose positions can then be reached by pressing forward/backward buttons, as in any player.
It does not matter whether the movie file is in mov or mp4 format.
I searched the forum and found several samples showing how to add metadata using AVAssetExportSession, and they worked.
However, when I tried to add metadata using AVAssetWriter, I wasn't able to append attributes to the video.
What I do not understand is that after adding an attribute, the returned time and duration properties are always invalid.
For instance, let's say I have a video with a duration of 2 seconds.
I have tried different key spaces. I am not able to write keys from the ID3 space.
Is ID3 used for streamed video? As far as I understand, ID3 is the metadata format of .mp3 files, which would explain why I was not able to write it into an MPEG-4 file.
I also used QuickTimeUserData and ISOUserData, but the results are the same.
Here is an example:
AVMutableMetadataItem *item2 = [AVMutableMetadataItem new];
item2.keySpace = AVMetadataKeySpaceiTunes;
item2.key = AVMetadataiTunesMetadataKeyUserComment;
item2.value = @"One two three";
item2.duration = CMTimeMakeWithSeconds(1, 1);
item2.time = CMTimeMakeWithSeconds(0, 1);
After reading I got the following:
AVMutableMetadataItem: 0xa4301f0, keySpace=itsk, key=\U00a9cmt, commonKey=(null), locale= (null), value=One two three, time={INVALID}, duration={INVALID}, extras={\n dataType = 1;\n}
I would like to use time & duration properties for metadata instead of writing custom data and processing it after that.
Ideally I would append an array of items with (time, duration) = (t1, d1), ..., (tn, dn).
Does anyone know how to accomplish that?

I ended up with a solution that adds chapters to the video file instead of using metadata.
I looked at the available libraries and chose mpv4lib.
The library is not currently built for iOS, so I ported the source project into a static library for the iOS platform.
The library allows adding custom "atoms" to an mp4 file, and one of them is a QuickTime text track containing chapters.
I did something similar to that post.
The library is located here.
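For anyone taking the chapter route: each chapter needs a start time and a duration, which is essentially the (t1, d1) ... (tn, dn) list from the question. Below is a minimal sketch, in Python purely for illustration, of deriving those pairs from a list of event times. The function name chapters_from_events and the rule that each chapter lasts until the next one begins are my assumptions, not something the library prescribes.

```python
def chapters_from_events(events, total_duration):
    """Turn (start_seconds, title) events into chapter dicts.

    Assumption: a chapter lasts until the next event starts, and the
    last chapter runs to the end of the video.
    """
    ordered = sorted(events, key=lambda e: e[0])
    chapters = []
    for i, (start, title) in enumerate(ordered):
        if i + 1 < len(ordered):
            end = ordered[i + 1][0]
        else:
            end = total_duration
        chapters.append({"title": title, "time": start, "duration": end - start})
    return chapters


# Two hypothetical tags on the 2-second video from the question.
events = [(1.0, "Second tag"), (0.0, "One two three")]
chapters = chapters_from_events(events, total_duration=2.0)
for chapter in chapters:
    print(chapter)
```

The resulting time/duration pairs are what you would then hand to whatever writes the chapter track.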

Related

Google cloud speech very inaccurate and misses words on clean audio

I am using Google cloud speech through Python and finding many transcriptions are inaccurate and missing several words. This is a simple script I'm using to return a transcript of an audio file, in this case 'out307.wav':
import io

from google.cloud import speech

client = speech.SpeechClient()

# Read the whole audio file into memory
with io.open('out307.wav', 'rb') as audio_file:
    content = audio_file.read()

audio = speech.types.RecognitionAudio(content=content)
config = speech.types.RecognitionConfig(
    enable_word_time_offsets=True,
    language_code='en-US',
    audio_channel_count=1)

response = client.recognize(config, audio)
for result in response.results:
    # The first alternative is the most likely transcription
    alternative = result.alternatives[0]
    print(u'Transcript: {}'.format(alternative.transcript))
This returns the following transcript:
to do this the tensions and suspicions except
This is very far off what the actual audio says (I've uploaded it at https://vocaroo.com/i/s1zdZ0SOH1Ki). The audio is a .wav and very clear with no background noise. This is worse than average, as in some cases it will get the transcription fully correct on a 10 second audio file, or it may miss just a couple of words. Is there anything I can do to improve results?
This is weird. I tried your audio file with your code and got the same result, but if I change the language_code to "en-UK" I am able to get the full response.
I work for Google Cloud and have created a public issue for you here, where you can track updates.
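Separately from the language code, it may be worth ruling out a mismatch between the file and the config, since the question sets audio_channel_count=1. The sketch below uses only the standard-library wave module to print what the WAV header actually says. The helper name describe_wav is made up for illustration, and it assumes the file is plain PCM WAV. The demo runs on a generated silent file so it works without the original recording.

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Return channel count, sample rate, sample width and duration of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "sample_width_bytes": wav.getsampwidth(),
            "duration_s": wav.getnframes() / float(rate),
        }


# Demo input: one second of 16-bit mono silence at 16 kHz.
# In practice, pass the real file (e.g. 'out307.wav') instead.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)  # 2 bytes = 16-bit samples
    out.setframerate(16000)
    out.writeframes(b"\x00\x00" * 16000)

info = describe_wav(demo_path)
print(info)
```

If the reported channel count differs from what the config declares, that is one thing to fix before comparing transcripts.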

Adding metadata to generated audio file

I'm generating an audio file programmatically, and I'd like to add metadata to it, such as the title and artist. I don't particularly care what format the file is written in, as long as AVPlayer will read it and send it to the playing device. (The whole goal is to send this generated audio and its track name to a Bluetooth device. I'm happy to explore easier ways to achieve this on iPhone that don't require writing the file or adding metadata directly to the file.)
So far I've discovered that AVAssetWriter will often just throw away metadata that it doesn't understand, without generating errors, so I'm stumbling a bit trying to find what combinations of file formats and keys are acceptable. So far I have not found a file format that I can auto-generate that AVAssetWriter will add any metadata to. For example:
let writer = try AVAssetWriter(outputURL: output, fileType: .aiff)
let title = AVMutableMetadataItem()
title.identifier = .commonIdentifierTitle
title.dataType = kCMMetadataBaseDataType_UTF8 as String
title.value = "The Title" as NSString
writer.metadata = [title]
// setup the input and write the file.
I haven't found any combination of identifiers or fileTypes (that I can actually generate) that will include this metadata in the file.
My current approach is to create the file as an AIFF, and then use AVAssetExportSession to rewrite it as an m4a. Using that I've been able to add enough metadata that iTunes will show the title. However, Finder's "File Info" is not able to read the title (which it does for iTunes m4a files). My assumption is that if it doesn't even show up in File Info, it's not going to be sent over Bluetooth (I'll be testing that soon, but I don't have the piece of hardware I need handy).
Studying iTunes m4a files, I've found some tags that I cannot recreate with AVMetadataItem. For example, Sort Name (sonm). I don't know how to write tags that aren't one of the known identifiers (and I've tested all 263 AVMetadataIdentifiers).
With that background, my core questions:
What metadata tags are read by AVPlayer and sent to Bluetooth devices (i.e. AVRCP)?
Is it possible to write metadata directly with AVAssetWriter to a file format that supports Linear PCM (or some other easy-to-generate format)?
Given a known tag/value that does not match any of the AVMetadataIdentifiers, is it possible to write it in AVAssetExportSession?
I'll explore third-party id3 frameworks later, but I'd like to achieve it with AVFoundation (or other built-in framework) if possible.
I've been able to use AVAssetWriter to store metadata values in a .m4a file using the iTunes key space:
let songID = AVMutableMetadataItem()
songID.value = "songID" as NSString
songID.identifier = .iTunesMetadataSongID
let songName = AVMutableMetadataItem()
songName.value = "songName" as NSString
songName.identifier = .iTunesMetadataSongName
You can write compressed .m4a files directly using AVAssetWriter by specifying the correct settings when you set up the input object, so there’s no need to use an intermediate AIFF file.

Read HLS Playlist information to dynamically change the preferredBitRate of an Item

I'm working on a video app, and we are changing from regular mp4 files to HLS. One of the many reasons for the change is that it gives us much more control over the bandwidth usage of videos (we load lots of other stuff in our player, so we need to optimize the experience as best we can).
AVFoundation introduced in iOS 10 the ability to control the bandwidth using:
AVPlayerItem *playerItem = [AVPlayerItem playerItemWithAsset:self.urlAsset];
playerItem.preferredForwardBufferDuration = 30.0;
playerItem.preferredPeakBitRate = 200000.0; // Remember this line
There's also a setting introduced in iOS 11 for capping the resolution of the item, preferredMaximumResolution. We're using it, but we still need a solution for iOS 10 devices.
Having control over preferredPeakBitRate is nice, but we have a problem: not all the HLS sources are generated by us. Let's say we want to set a maximum resolution of 480p when the user is not connected to a Wi-Fi network. Today I have no way to achieve that, because I won't always know how much bandwidth the 480p source needs for the selected HLS playlist.
One thing I was thinking about is reading the information inside the m3u8 file, to at least know which quality sources my player can show and how much bandwidth each one needs.
One way to do this would be to download the m3u8 playlist as plain text and use a regex to process it. I'm trying to avoid that; I think this should be far less difficult.
I cannot read this information from the tracks, because a) I can't find it there, and b) the tracks are replaced dynamically when the quality changes (there is one track for every quality level).
I've searched Google and Stack Overflow and can't find this information. Can anyone help me?
Here's an example of what I want to do. I have this playlist:
#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=314000,RESOLUTION=228x128,CODECS="mp4a.40.2"
test-hls-1-16a709300abeb08713a5cada91ab864e_hls_duplex_192k.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=478000,RESOLUTION=400x224,CODECS="avc1.42001e,mp4a.40.2"
test-hls-1-16a709300abeb08713a5cada91ab864e_hls_duplex_400k.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=691000,RESOLUTION=480x270,CODECS="avc1.42001e,mp4a.40.2"
test-hls-1-16a709300abeb08713a5cada91ab864e_hls_duplex_600k.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1120000,RESOLUTION=640x360,CODECS="avc1.4d001f,mp4a.40.2"
test-hls-1-16a709300abeb08713a5cada91ab864e_hls_duplex_1000k.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1661000,RESOLUTION=960x540,CODECS="avc1.4d001f,mp4a.40.2"
test-hls-1-16a709300abeb08713a5cada91ab864e_hls_duplex_1500k.m3u8
And I just want to have that information available in an array inside my code, something like this:
NSArray<ZZMetadata *> *metadataArray = self.urlAsset.bandwidthMetadata;
NSLog(@"Metadata info: %@", metadataArray);
And print something like this:
<__NSArrayM 0x123456789> (
    <ZZMetadata 0x234567890> {
        trackId: 1
        neededBandwidth: 314000
        resolution: 228x128
        codecs: ...
        ...
    }
    <ZZMetadata 0x345678901> {
        trackId: 2
        neededBandwidth: 478000
        resolution: 400x224
    }
    ...
)
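For reference, the plain-text route mentioned above is fairly small. Here is a rough sketch of it, written in Python only to show the parsing logic (it would need porting to Objective-C for the app). The function names parse_master_playlist and peak_bitrate_for_height are hypothetical, and the sketch assumes a master playlist where each #EXT-X-STREAM-INF tag is immediately followed by its URI line. Note that quoted values such as CODECS contain commas, so a naive split on "," is not enough.

```python
import re

# KEY=VALUE pairs; VALUE is either a quoted string (which may contain
# commas, as CODECS does) or an unquoted run up to the next comma.
_ATTR_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def parse_master_playlist(text):
    """Return one dict per #EXT-X-STREAM-INF entry, with its URI added."""
    variants = []
    pending = None  # attributes of the most recent stream tag
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#EXT-X-STREAM-INF:"):
            attrs = line.split(":", 1)[1]
            pending = {k: v.strip('"') for k, v in _ATTR_RE.findall(attrs)}
        elif line and not line.startswith("#") and pending is not None:
            pending["URI"] = line
            variants.append(pending)
            pending = None
    return variants


def peak_bitrate_for_height(variants, max_height):
    """Highest BANDWIDTH among variants no taller than max_height, else None."""
    candidates = []
    for v in variants:
        if "RESOLUTION" not in v or "BANDWIDTH" not in v:
            continue
        height = int(v["RESOLUTION"].split("x")[1])
        if height <= max_height:
            candidates.append(int(v["BANDWIDTH"]))
    return max(candidates) if candidates else None


PLAYLIST = """#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=314000,RESOLUTION=228x128,CODECS="mp4a.40.2"
test-hls-1-16a709300abeb08713a5cada91ab864e_hls_duplex_192k.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=478000,RESOLUTION=400x224,CODECS="avc1.42001e,mp4a.40.2"
test-hls-1-16a709300abeb08713a5cada91ab864e_hls_duplex_400k.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=691000,RESOLUTION=480x270,CODECS="avc1.42001e,mp4a.40.2"
test-hls-1-16a709300abeb08713a5cada91ab864e_hls_duplex_600k.m3u8
"""

variants = parse_master_playlist(PLAYLIST)
for v in variants:
    print(v["BANDWIDTH"], v["RESOLUTION"], v["CODECS"])
print(peak_bitrate_for_height(variants, 270))
```

The value returned by peak_bitrate_for_height is the kind of number one could then assign to preferredPeakBitRate to cap playback at a given height.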

Flex/Flash Builder/ActionScript/AIR/Mobile iOS: How to take video using the camera and/or browse for & view/access video stored in the "Camera Roll"

My understanding currently is that:
CameraUI
I can use the CameraUI to access the built in camera for MediaType.VIDEO and that delegates to the built-in video camera app and lets me record a video. My app does that now.
When I stop recording and click the "Use" button, I am returned to my app and theoretically I have a valid MediaPromise.
iOS does -not- provide a valid/usable url/filename for the recorded video (or for photos), so I would have to use a Loader to bring in and access the recorded video. Also, iOS does not actually create a file anywhere on the device, most importantly not in the Camera Roll, where one would expect it given the normal behavior of the system's native camera/video app.
The documentation says that the Loader can load various image types and SWFs but says nothing about video data. I conclude from that that I cannot actually use the CameraUI to generate a valid MediaPromise that I can then pass to a Loader class or similar to read in the data created by the system camera and then manipulate it (upload, save to applicationStorageDirectory, and/or display in one of the two video player components available in the API).
CameraRoll
I can have video entities in the iOS Camera Roll but the AS3/Air3.5 CameraRoll class won't let me view/access/reference them in any way.
Normal File I/O
All my attempts to use the Air3.5 File classes to browse to the storage location of the iOS Camera Roll have been rebuffed.
------- Questions -------
Am I correct in believing that there is a way to take video but no way to use the video that's been captured (that is, no way to use the resulting MediaPromise successfully)?
I believe you can take video and access it using Android, but there's nothing in the documentation that says you cannot on iOS.
Am I correct in believing that iOS sandboxes apps so that they cannot browse to video/photo storage using standard File I/O, but only through the apparently non-workable means I've tried (CameraUI & CameraRoll)?
Am I wrong to think that these should be rather obvious NEEDS that one can achieve using the Xcode/Objective-C++ route, but that the AIR Mobile Framework does not allow, either because of Apple blocking functionality or because Adobe has failed to meet reasonable expectations?
One ironic item to note. If I use the iOS system camera app to record a video, a thumbnail of that video then appears in the Gallery/Camera Roll, and of course I can share it or view it, or whatever. If I use AIR's CameraRoll.browseForImage(), provided I haven't used the camera to take another image, when it shows me the folder where the pictures are stored, the folder icon uses the thumbnail of the last object added (in this case, the video I took), but if I then enter the folder, the video cannot be found. It's teasing us. It knows it's there, but it is apparently forbidden fruit.
I can't answer all your questions, so this entry may not be acceptable, but I found this page while searching for a solution to some of the problems you described and thought that someone else might find this answer (partially) useful.
To save the movie you just took, you need to open and read the data from the promise.
iOS won't save the file anywhere, so MediaPromise.file is always null.
This is my solution to the problem:
// Imports needed at the top of the enclosing class file
import flash.events.Event;
import flash.events.IEventDispatcher;
import flash.events.MediaEvent;
import flash.filesystem.File;
import flash.filesystem.FileMode;
import flash.filesystem.FileStream;
import flash.media.CameraUI;
import flash.media.MediaType;
import flash.utils.ByteArray;
import flash.utils.IDataInput;

private var camera:CameraUI;
private var dataInput:IDataInput;

public function recordVideo():void
{
    // Start the camera and ask for a video
    camera = new CameraUI();
    camera.addEventListener(MediaEvent.COMPLETE, onCameraComplete);
    camera.launch(MediaType.VIDEO);
}

private function onCameraComplete(event:MediaEvent):void
{
    // event.data is a MediaPromise and MediaPromise.open() returns an IDataInput
    // Let's cast it to a dispatcher and check when it's complete
    dataInput = event.data.open();
    var dispatcher:IEventDispatcher = IEventDispatcher(dataInput);
    dispatcher.addEventListener(Event.COMPLETE, onDataInputComplete);
}

private function onDataInputComplete(event:Event):void
{
    // We can do whatever we want with the data, so we'll store it in a File.
    // The file needs a real path before it can be opened; "video.mov" is a placeholder name.
    var file:File = File.applicationStorageDirectory.resolvePath("video.mov");
    var bytes:ByteArray = new ByteArray();
    var stream:FileStream = new FileStream();
    // Reading the data from the opened MediaPromise
    dataInput.readBytes(bytes);
    stream.open(file, FileMode.WRITE);
    stream.writeBytes(bytes, 0, bytes.bytesAvailable);
    stream.close();
}
Also, I'm still looking for a way to put the movie in the CameraRoll

Quicktime metadata APIs and iTunes

I'm trying to set some metadata in a .mov file with the quicktime metadata APIs and have it show up in iTunes. I've got it working for most of the properties, but I can't get the description field to populate. Here is the code I'm using (shortened to only show what I think is the relevant portion).
const char* cString = ([@"HELLO WORLD" cStringUsingEncoding:NSMacOSRomanStringEncoding]);
QTMovie* qtMovie = [[QTMovie alloc] initWithFile:filename error:&error];
Movie movie = [qtMovie quickTimeMovie];
QTMetaDataRef metaDataRef = NULL;
OSStatus err = noErr;
err = QTCopyMovieMetaData(movie, &metaDataRef);
QTMetaDataItem outItem;
QTMetaDataAddItem(metaDataRef,
                  kQTMetaDataStorageFormatiTunes,
                  kQTMetaDataKeyFormatCommon,
                  (const UInt8 *)&key,
                  sizeof(key),
                  (const UInt8 *)cString,
                  strlen(cString),
                  kQTMetaDataTypeUTF8,
                  &outItem);
I found the following link, which states that for the information and description properties I should be using kQTMetaDataStorageFormatQuickTime, but that doesn't seem to make any difference. Has anyone else had any success getting the description column to populate when importing metadata into iTunes videos?
http://lists.apple.com/archives/quicktime-api/2006/May/msg00115.html
I ended up using AtomicParsley (http://atomicparsley.sourceforge.net/) without any issues. It also has the benefit of supporting mp4 and m4v files and not just mov files, which is something I needed. With it the descriptions showed up fine, and it was much easier to use than the QTMetaData API.
Edit: Argh.. Just found out that this app doesn't work with mov files. This will work with mp4 and m4v files, but I guess the original question still stands because I would like to support mov files as well.
Figured it out finally with the help of this post and some deep debugging into the contents of my tagged media.
Retrieving the key name on AVMetadataItem for an AVAsset in iOS
I set the data format to kQTMetaDataStorageFormatiTunes and the key format to kQTMetaDataKeyFormatiTunesShortForm. And then the tags I use are the encoded id3 tags like in the post above. The common keys (kQTMetaDataCommonKeyArtist, kQTMetaDataCommonKeyComment) will generally not work if your goal is to view the data in iTunes. It seems a couple of them still do work, but in general they don't map over properly to their id3 counterparts.
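To make the "short form" keys a bit more concrete: they are four-character codes such as the ©cmt key that shows up as \U00a9cmt in AVMetadataItem dumps, or sonm for Sort Name. Below is a small sketch, in Python just for illustration, of packing such a code into the 32-bit big-endian value that APIs taking an OSType-style key expect. The helper names are made up, and the sketch assumes the © character is encoded as the single MacRoman byte 0xA9.

```python
import struct


def fourcc_to_int(code):
    """Pack a four-character code into a big-endian 32-bit integer."""
    raw = code.encode("mac_roman")  # the copyright sign becomes byte 0xA9
    if len(raw) != 4:
        raise ValueError("expected exactly four bytes, got %d" % len(raw))
    return struct.unpack(">I", raw)[0]


def int_to_fourcc(value):
    """Inverse of fourcc_to_int."""
    return struct.pack(">I", value).decode("mac_roman")


comment_key = fourcc_to_int("\u00a9cmt")
print(hex(comment_key))           # 0xa9636d74
print(int_to_fourcc(0x736F6E6D))  # sonm
```

Dumping the keys of a file that iTunes itself tagged and comparing them against values like these is a quick way to see which code a given column actually uses.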

Resources