Trying to append CVPixelBuffers to AVAssetWriterInputPixelBufferAdaptor at the intended framerate - iOS

I'm trying to append CVPixelBuffers to an AVAssetWriterInputPixelBufferAdaptor at the intended framerate, but it seems to be too fast, and my math is off. This isn't capturing from the camera, but capturing changing images. The resulting video plays back much too fast compared to the elapsed time over which it was captured.
I have a function that appends the CVPixelBuffer every 1/24 of a second. So I'm trying to add an offset of 1/24 of a second to the last time.
I've tried:
let sampleTimeOffset = CMTimeMake(value: 100, timescale: 2400)
and:
let sampleTimeOffset = CMTimeMake(value: 24, timescale: 600)
and:
let sampleTimeOffset = CMTimeMakeWithSeconds(0.0416666666, preferredTimescale: 1000000000)
I'm adding onto the currentSampleTime and appending like so:
self.currentSampleTime = CMTimeAdd(currentSampleTime, sampleTimeOffset)
let success = self.assetWriterPixelBufferInput?.append(cv, withPresentationTime: currentSampleTime)
One other solution I thought of is to get the difference between the last time and the current time and add that onto currentSampleTime for accuracy, but I'm unsure how to do it.

I found a way to accurately capture the time delay by comparing the last capture time in milliseconds to the current time in milliseconds.
First, I have a general current milliseconds time function:
func currentTimeInMilliSeconds() -> Int {
    let currentDate = Date()
    let since1970 = currentDate.timeIntervalSince1970
    return Int(since1970 * 1000)
}
When I create a writer, (when I start recording video) I set a variable in my class to the current time in milliseconds:
currentCaptureMillisecondsTime = currentTimeInMilliSeconds()
Then, because the function that is supposed to run every 1/24 of a second isn't always called exactly on time, I get the difference in milliseconds between now and either when I started writing or the previous call, convert those milliseconds to seconds, and pass that to CMTimeMakeWithSeconds.
let lastTimeMilliseconds = self.currentCaptureMillisecondsTime
let nowTimeMilliseconds = currentTimeInMilliSeconds()
let millisecondsDifference = nowTimeMilliseconds - lastTimeMilliseconds
// set new current time
self.currentCaptureMillisecondsTime = nowTimeMilliseconds
let millisecondsToSeconds: Float64 = Double(millisecondsDifference) * 0.001
let sampleTimeOffset = CMTimeMakeWithSeconds(millisecondsToSeconds, preferredTimescale: 1000000000)
I can now append my frame with the accurate delay that actually occurred.
self.currentSampleTime = CMTimeAdd(currentSampleTime, sampleTimeOffset)
let success = self.assetWriterPixelBufferInput?.append(cv, withPresentationTime: currentSampleTime)
When I finish writing the video and save it to my camera roll, its duration exactly matches the time I spent recording.
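Putting it together, here is a minimal sketch of the whole appending path. This is my own consolidation of the answer above (the appendFrame name and the isReadyForMoreMediaData guard are additions, not part of the original code), assuming it lives in the same class as the properties used earlier:
func appendFrame(_ cv: CVPixelBuffer) {
    // Measure the real wall-clock delay since the previous frame.
    let nowTimeMilliseconds = currentTimeInMilliSeconds()
    let millisecondsDifference = nowTimeMilliseconds - currentCaptureMillisecondsTime
    currentCaptureMillisecondsTime = nowTimeMilliseconds

    // Convert that delay to a CMTime and advance the presentation time by it.
    let sampleTimeOffset = CMTimeMakeWithSeconds(Double(millisecondsDifference) * 0.001, preferredTimescale: 1000000000)
    currentSampleTime = CMTimeAdd(currentSampleTime, sampleTimeOffset)

    // Only append when the writer input can accept more data.
    if assetWriterPixelBufferInput?.assetWriterInput.isReadyForMoreMediaData == true {
        assetWriterPixelBufferInput?.append(cv, withPresentationTime: currentSampleTime)
    }
}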

Related

generateCGImagesAsynchronously produces duplicate images according to actualTime

I am able to get the frames from a video using AVAssetImageGenerator, and I do so using generateCGImagesAsynchronously. Whenever the result succeeds, I print the requestedTime in seconds as well as the actualTime for that image. In the end, however, two generated images are the same and have the same actualTime values even though my step value and frameForTimes entries are evenly spaced.
Here is a snippet of my printed requested and actual times in seconds for each image:
Requested: 0.9666666666666667
Actual: 0.9666666666666667
Requested: 1.0
Actual: 1.0
Requested: 1.0333333333333334
Actual: 1.0333333333333334
Requested: 1.0666666666666667
Actual: 1.0666666666666667
Requested: 1.1
Actual: 1.1
Requested: 1.1333333333333333
Actual: 1.1
Requested: 1.1666666666666667
Actual: 1.135
It seems to be going fine until the frame corresponding to 1.1 seconds in the video is generated, which results in two of the same images and the actualTime to be delayed for the rest of the process.
I've already tried adjusting the way in which I compute the frames that should be generated, but it seems to be correct. I am using frames per second and multiplying that by the video duration to figure out how many frames I need to have in total, and I'm dividing the total duration by the sample counts to make sure cgImages are generated evenly.
let videoDuration = asset.duration
print("video duration: \(videoDuration.seconds)")
let videoTrack = asset.tracks(withMediaType: AVMediaType.video)[0]
let fps = videoTrack.nominalFrameRate
var frameForTimes = [NSValue]()
let sampleCounts = Int(videoDuration.seconds * Double(fps))
let totalTimeLength = Int(videoDuration.seconds * Double(videoDuration.timescale))
let step = totalTimeLength / sampleCounts
for i in 0 ..< sampleCounts {
    let cmTime = CMTimeMake(value: Int64(i * step), timescale: Int32(videoDuration.timescale))
    frameForTimes.append(NSValue(time: cmTime))
}
and the way in which I create images:
imageGenerator.generateCGImagesAsynchronously(forTimes: frameForTimes) { (requestedTime, cgImage, actualTime, result, error) in
    if let cgImage = cgImage {
        print("Requested: \(requestedTime.seconds), Actual: \(actualTime.seconds)")
        let image = UIImage(cgImage: cgImage)
        // scale the image here if you want
        frames.append(image)
    }
}
I also set the tolerances to zero before calling generateCGImagesAsynchronously:
imageGenerator.requestedTimeToleranceBefore = CMTime.zero
imageGenerator.requestedTimeToleranceAfter = CMTime.zero
I expected the actual times to be consistent with the requested times, and for each image produced to be different. Looking through the images, there is always a duplicate regardless of the video being tested, and it normally occurs towards the middle to end.
Edit:
I found a few other posts that mention the same problem, but I've had no success with any of their suggestions.
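Editorial note, not part of the original question: if evenly spaced requested times keep snapping to the same source frame even with zero tolerance, one alternative is to decode every frame with AVAssetReader, which yields exactly one sample buffer per source frame, so duplicates cannot occur. A rough sketch, reusing asset and videoTrack from the code above and assuming it runs inside a throwing function:
import AVFoundation

let reader = try AVAssetReader(asset: asset)
let outputSettings: [String: Any] = [kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA]
let trackOutput = AVAssetReaderTrackOutput(track: videoTrack, outputSettings: outputSettings)
reader.add(trackOutput)
reader.startReading()
while let sampleBuffer = trackOutput.copyNextSampleBuffer() {
    // Every decoded frame arrives exactly once, with its own presentation time.
    let presentationTime = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
    print("Decoded frame at \(presentationTime.seconds)")
    // CMSampleBufferGetImageBuffer(sampleBuffer) gives the CVPixelBuffer
    // to convert into a CGImage/UIImage if needed.
}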

Getting Duration of AVPlayer in minutes

I am trying to get the duration of an AVPlayer asset in Hours, Minutes, Seconds. I am able to get the time but it seems to be in seconds and milliseconds.
This is how I get the time:
let duration: CMTime = player.currentItem!.asset.duration
let seconds: Float64 = CMTimeGetSeconds(duration)
I am then applying that to a slider using
slider.maximumValue = Float(seconds)
The outcome of this obviously gives me the duration in seconds; however, I want to be able to use the duration to set the maximumValue of my slider for video clips which may be under a minute.
For example: my code above returns 30.865 for a 30-second clip. I need it to return 0.30.
This ended up working for me:
let duration: CMTime = player.currentItem!.asset.duration
let timeInMinutes = Float(duration.value)
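What the question actually asks for, hours/minutes/seconds, is a formatting problem rather than a CMTime one. One possible approach (my suggestion, not part of the original answer) is to keep the slider in seconds and only format the displayed label, for example with DateComponentsFormatter; the formattedDuration helper below is hypothetical:
import AVFoundation
import Foundation

// Hypothetical helper: formats a CMTime as "HH:MM:SS", or "MM:SS" for clips under an hour.
func formattedDuration(_ duration: CMTime) -> String {
    let seconds = CMTimeGetSeconds(duration)
    guard seconds.isFinite else { return "--:--" }
    let formatter = DateComponentsFormatter()
    formatter.allowedUnits = seconds >= 3600 ? [.hour, .minute, .second] : [.minute, .second]
    formatter.unitsStyle = .positional
    formatter.zeroFormattingBehavior = .pad
    return formatter.string(from: seconds) ?? "--:--"
}

// A 30.865-second clip formats as "00:30"; slider.maximumValue can stay Float(seconds).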

AVAsset video adjust duration

Given a list of CMSampleBuffers that have been read in from an asset, I want to adjust the duration of the asset so that it's half the length (twice the speed) of the original.
Currently my function for generating new time stamps looks like:
func adjustTimeStampsForBuffers(buffers: [CMSampleBuffer]) -> [CMTime] {
    let frameCount = buffers.count
    // self.duration is CMTimeGetSeconds(asset.duration)
    let increment = Float(self.duration / 2) / Float(frameCount)
    return Array(0.stride(to: frameCount, by: 1)).enumerate().map {
        let seconds: Float64 = Float64(increment) * Float64($0.index)
        return CMTimeMakeWithSeconds(seconds, self.asset.duration.timescale)
    }
}
However, this doesn't seem to work: the output assets are in fact twice the length, not half. Can anybody point out where I'm going wrong?
Edit:
Thanks to @sschale, here's my final answer:
func adjustTimeStampsForBuffers(buffers: [CMSampleBuffer]) -> [CMTime] {
    return buffers.map {
        let time = CMSampleBufferGetPresentationTimeStamp($0)
        return CMTimeMake(time.value, time.timescale * 2)
    }
}
Instead of calculating new values, each existing timestamp is adjusted.
Based on my reading of the docs, it looks like self.asset.duration.timescale may be the key here, since changing it influences the whole file (if I'm understanding correctly that this timescale applies to the whole file; otherwise you may need to adjust it in each of the buffers).
See the CMTime documentation for more info as well.
Relevant section:
A CMTime is represented as a rational number, with a numerator (an int64_t value) and a denominator (an int32_t timescale). Conceptually, the timescale specifies the fraction of a second each unit in the numerator occupies. Thus if the timescale is 4, each unit represents a quarter of a second; if the timescale is 10, each unit represents a tenth of a second, and so on. In addition to a simple time value, a CMTime can represent non-numeric values: +infinity, -infinity, and indefinite. Using a flag, CMTime indicates whether the time has been rounded at some point.
CMTimes contain an epoch number, which is usually set to 0, but can be used to distinguish unrelated timelines: for example, it could be incremented each time through a presentation loop, to differentiate time N in loop 0 from time N in loop 1.
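Side note (my addition, not from this thread): Core Media also offers CMTimeMultiplyByRatio, which expresses the same idea of scaling each presentation timestamp without touching the timescale directly. A sketch, using the newer Swift labels:
import CoreMedia

// Hypothetical variant of the accepted approach: scale every timestamp to half its offset.
func halvedTimeStamps(for buffers: [CMSampleBuffer]) -> [CMTime] {
    return buffers.map { buffer in
        let time = CMSampleBufferGetPresentationTimeStamp(buffer)
        return CMTimeMultiplyByRatio(time, multiplier: 1, divisor: 2)
    }
}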

How to get the accurate time position of a live stream in AVPlayer

I'm using AVPlayer to play a live stream. This stream supports one hour of catch-up, which means the user can seek back up to one hour and play from there. But I have one question: how do I know the accurate position that the player is currently playing? I need to display the current position on the player view. For example, if the user is playing content from half an hour ago, display -30:00; if the user is playing the latest content, the player will show 00:00 or live. Thanks.
Swift solution:
override func getLiveDuration() -> Float {
    var result: Float = 0.0
    if let items = player.currentItem?.seekableTimeRanges, !items.isEmpty {
        let range = items[items.count - 1]
        let timeRange = range.timeRangeValue
        let startSeconds = CMTimeGetSeconds(timeRange.start)
        let durationSeconds = CMTimeGetSeconds(timeRange.duration)
        result = Float(startSeconds + durationSeconds)
    }
    return result
}
To get the live position and seek to it, you can use the seekableTimeRanges of AVPlayerItem:
CMTimeRange seekableRange = [player.currentItem.seekableTimeRanges.lastObject CMTimeRangeValue];
CGFloat seekableStart = CMTimeGetSeconds(seekableRange.start);
CGFloat seekableDuration = CMTimeGetSeconds(seekableRange.duration);
CGFloat livePosition = seekableStart + seekableDuration;
[player seekToTime:CMTimeMake(livePosition, 1)];
Also, when you seek back in time, you can get the current playing position by calling the currentTime method:
CGFloat current = CMTimeGetSeconds([self.player.currentItem currentTime]);
CGFloat diff = livePosition - current;
I know this question is old, but I had the same requirement, and I believe the existing solutions don't properly address the intent of the question.
What I did for this same requirement was to gather the current point in time, the starting time, and the length of the total duration of the stream.
I'll explain something before going further: the current point in time can surpass (starting time + total duration). This is due to the way HLS is structured as TS segments. TS segments are small chunks of playable video; you could have, in your seekable range, 5 TS segments of 10 seconds each. This doesn't mean 50 seconds is the full length of the live stream; there is around one full segment more (so about 60 seconds of playtime total), but it isn't categorized as seekable since you shouldn't seek to that segment. If you did, you would notice rebuffering in most cases (because the source may still be creating the next TS segment when you have already reached the end of playback).
What I did was check whether the current stream time is beyond the seekable range; if so, we are live on the stream. If it isn't, you can easily calculate how far behind live you are by subtracting the starting time and total duration from the current time.
guard let timeRange = player.currentItem?.seekableTimeRanges.last?.timeRangeValue else { return }
let start = timeRange.start.seconds
let totalDuration = timeRange.duration.seconds
let currentTime = player.currentTime().seconds
let secondsBehindLive = currentTime - totalDuration - start
The code above will give you a negative number of seconds behind "live" (or, more specifically, behind the start of the latest TS segment), or a positive number or zero when it's playing the latest TS segment.
To be honest, I don't really know when seekableTimeRanges will have more than one value; it has always been just one for the streams I have tested with. But if your streams have more than one value, you may have to decide whether to add up all the ranges' durations, which time range to use as the start value, and so on. At least for my use case, this was enough.
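To turn that into the display the original question asks for (-30:00 while behind, 00:00 when caught up), here is a small formatting sketch built on secondsBehindLive from the snippet above; the two-second "live" tolerance is an arbitrary assumption:
import Foundation

// Hypothetical helper: formats the offset from live for the player UI.
func livePositionLabel(secondsBehindLive: Double) -> String {
    // Treat anything within a couple of seconds of the seekable edge as live.
    if secondsBehindLive >= -2 {
        return "00:00"
    }
    let behind = Int(-secondsBehindLive)
    let minutes = behind / 60
    let seconds = behind % 60
    return String(format: "-%02d:%02d", minutes, seconds)
}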

Trying to understand CMTime

I have seen some examples of CMTime, but I still don't get it. I'm using an AVCaptureSession with AVCaptureVideoDataOutput and I want to set the max and min frame rate of the output. My problem is that I just don't understand the CMTime struct.
Apparently CMTimeMake(value, timeScale) should give me value frames every 1/timeScale seconds for a total of value/timeScale seconds, or am I getting that wrong?
Why isn't this documented anywhere in order to explain what this does?
If it does truly work like that, how would I get it to have an indefinite number of frames?
If it's really simple, I'm sorry, but nothing has clicked just yet.
A CMTime struct represents a length of time that is stored as a rational number (see the CMTime Reference). A CMTime has a value and a timescale field, and represents the time value/timescale seconds.
CMTimeMake is a function that returns a CMTime structure, for example:
CMTime t1 = CMTimeMake(1, 10); // 1/10 second = 0.1 second
CMTime t2 = CMTimeMake(2, 1); // 2 seconds
CMTime t3 = CMTimeMake(3, 4); // 3/4 second = 0.75 second
CMTime t4 = CMTimeMake(6, 8); // 6/8 second = 0.75 second
The last two time values t3 and t4 represent the same time value, therefore
CMTimeCompare(t3, t4) == 0
If you set the videoMinFrameDuration of an AVCaptureConnection, it does not make a difference whether you set
connection.videoMinFrameDuration = CMTimeMake(1, 20); // or
connection.videoMinFrameDuration = CMTimeMake(2, 40);
In both cases the minimum time interval between frames is set to 1/20 = 0.05 seconds.
My experience differs.
For let testTime = CMTime(seconds: 3.833, preferredTimescale: 100)
If you set a breakpoint and look in the debugger side window it says:
"383 100ths of a second"
Testing by seeking to a fixed offset in a video in AVPlayer has confirmed this.
So put the actual number of seconds in the seconds field, and the precision in the preferredTimescale field; a preferredTimescale of 100 means precision to hundredths of a second.
Doing
let testTime = CMTime(seconds: 3.833, preferredTimescale: 1000)
Still seeks to the same place in the video, but it displays in the debugger side window as "3833 1000ths of a second"
Doing
let testTime = CMTime(seconds: 3.833, preferredTimescale: 1)
Does not seek to the same place in the video, because it's been truncated, and it displays in the debugger side window as "3 seconds". Notice that the .833 part has been lost due to the preferredTimescale.
CMTime(value: value, timescale: scale)
means value/scale seconds; the timescale is the number of units that make up one second.
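Coming back to the original question about capping the min/max frame rate of an AVCaptureVideoDataOutput: on current iOS versions the frame duration is configured on the AVCaptureDevice rather than on the connection. A sketch follows; the limitFrameRate helper is my own, and it assumes the device's active format actually supports the requested rate:
import AVFoundation

// Hypothetical helper: lock the capture device to a fixed frame rate.
func limitFrameRate(of device: AVCaptureDevice, to fps: Int32) throws {
    try device.lockForConfiguration()
    // A frame duration of 1/fps seconds means at most (and at least) fps frames per second.
    device.activeVideoMinFrameDuration = CMTimeMake(value: 1, timescale: fps)
    device.activeVideoMaxFrameDuration = CMTimeMake(value: 1, timescale: fps)
    device.unlockForConfiguration()
}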
