AVAudioPlayerNode lastRenderTime - ios

I use multiple AVAudioPlayerNode in AVAudioEngine to mix audio files for playback.
Once all the setup is done (engine prepared, started, audio file segments scheduled), I'm calling play() method on each player node to start playback.
Because it takes time to loop through all player nodes, I take a snapshot of the first node's lastRenderTime value and use it to compute a start time for each node's play(at:) method, to keep playback in sync between nodes:
let delay = 0.0
let startSampleTime = time.sampleTime // time is the snapshot value
let sampleRate = player.outputFormat(forBus: 0).sampleRate
let startTime = AVAudioTime(
sampleTime: startSampleTime + AVAudioFramePosition(delay * sampleRate),
atRate: sampleRate)
player.play(at: startTime)
The problem is with the current playback time.
I use this computation to get the value, where seekTime is a value I keep track of in case we seek the player. It's 0.0 at start :
private var _currentTime: TimeInterval {
    guard player.engine != nil,
          let lastRenderTime = player.lastRenderTime,
          lastRenderTime.isSampleTimeValid,
          lastRenderTime.isHostTimeValid else {
        return seekTime
    }
    let sampleRate = player.outputFormat(forBus: 0).sampleRate
    let sampleTime = player.playerTime(forNodeTime: lastRenderTime)?.sampleTime ?? 0
    if sampleTime > 0 && sampleRate != 0 {
        return seekTime + (Double(sampleTime) / sampleRate)
    }
    return seekTime
}
While this produces a relatively correct value, I can hear a delay between the moment I call play and the first sound I hear. I assume this is because lastRenderTime starts advancing as soon as I call play(at:), and there must be some kind of processing/buffering time offset.
The noticeable delay is around 100ms, which is very big, and I need a precise current time value to do visual rendering in parallel.
It probably doesn't matter, but every audio file is AAC audio, and I schedule segments of them in player nodes, I don't use buffers directly.
Segments length may vary. I also call prepare(withFrameCount:) on each player node once I have scheduled audio data.
So my question is: is the delay I observe a buffering issue (should I schedule shorter segments, for example)? And is there a way to compute this value precisely so I can adjust my current playback time computation?
When I install a tap block on one AVAudioPlayerNode, the block is called with a buffer of length 4410, and the sample rate is 44100 Hz, this means 0.1s of audio data. Should I rely on this to compute the latency ?
I'm wondering if I can trust the length of the buffer I get in the tap block. Alternatively, I'm trying to compute the total latency for my audio graph. Can someone provide insights on how to determine this value precisely ?

From a post on Apple's developer forums by theanalogkid:
On the system, latency is measured by:
Audio Device I/O Buffer Frame Size + Output Safety Offset + Output Stream Latency + Output Device Latency
If you're trying to calculate total roundtrip latency you can add:
Input Latency + Input Safety Offset to the above.
The timestamp you see at the render proc accounts for the buffer frame size and the safety offset, but the stream and device latencies are not accounted for.
iOS gives you access to the most important of the above information via AVAudioSession and as mentioned you can also use the "preferred" session settings - setPreferredIOBufferDuration and preferredIOBufferDuration for further control.
/* The current hardware input latency in seconds. */
@property(readonly) NSTimeInterval inputLatency NS_AVAILABLE_IOS(6_0);
/* The current hardware output latency in seconds. */
@property(readonly) NSTimeInterval outputLatency NS_AVAILABLE_IOS(6_0);
/* The current hardware IO buffer duration in seconds. */
@property(readonly) NSTimeInterval IOBufferDuration NS_AVAILABLE_IOS(6_0);
Audio Units also have the kAudioUnitProperty_Latency property you can query.
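As a rough illustration of the formula above, the sketch below subtracts the latency that the render timestamp does not cover from a player-node time. The function names and parameters are hypothetical. On iOS the latency input would presumably come from AVAudioSession.sharedInstance().outputLatency, and whether subtracting that value alone is enough for a given audio route is an assumption to verify by measurement. The tap buffer length (4410 frames in the question) is the amount of audio delivered per callback, which is not necessarily the same thing as the output latency.

```swift
import Foundation

/// Duration in seconds of a buffer holding `frameCount` frames at `sampleRate` Hz.
func bufferDuration(frameCount: Int, sampleRate: Double) -> TimeInterval {
    guard sampleRate > 0 else { return 0 }
    return Double(frameCount) / sampleRate
}

/// Estimates the position that is actually audible by removing the output
/// latency from the time derived from lastRenderTime (never below zero).
func audibleTime(renderedTime: TimeInterval, outputLatency: TimeInterval) -> TimeInterval {
    return max(0, renderedTime - outputLatency)
}

// 4410 frames at 44.1 kHz, as reported by the tap in the question.
let tapDuration = bufferDuration(frameCount: 4410, sampleRate: 44_100)

// Hypothetical values: 1.0 s rendered so far, 12 ms of reported output latency.
let audible = audibleTime(renderedTime: 1.0, outputLatency: 0.012)
```

The result of audibleTime could replace the raw value returned by the current-time computation when driving visuals.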

Related

How do I control AVAssetWriter to write at the correct FPS

Let me see if I understood it correctly.
At the present most advanced hardware, iOS allows me to record at the following fps: 30, 60, 120 and 240.
But these fps behave differently. If I shoot at 30 or 60 fps, I expect the videos files created from shooting at these fps to play at 30 and 60 fps respectively.
But if I shoot at 120 or 240 fps, I expect the video files created from shooting at these fps to play at 30 fps, or I will not see the slow motion.
A few questions:
am I right?
is there a way to shoot at 120 or 240 fps and play at 120 and 240 fps respectively? I mean play at the fps the videos were shot at, without slo-mo?
How do I control that framerate when I write the file?
I am creating the AVAssetWriter input like this...
NSDictionary *videoCompressionSettings = @{AVVideoCodecKey : AVVideoCodecH264,
    AVVideoWidthKey : @(videoWidth),
    AVVideoHeightKey : @(videoHeight),
    AVVideoCompressionPropertiesKey : @{ AVVideoAverageBitRateKey : @(bitsPerSecond),
                                         AVVideoMaxKeyFrameIntervalKey : @(1)}
};
_assetWriterVideoInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo outputSettings:videoCompressionSettings];
and there is no apparent way to control that.
NOTE: I have tried different numbers where that 1 is. I have tried 1.0/fps, I have tried fps and I have removed the key. No difference.
This is how I set up AVAssetWriter:
AVAssetWriter *newAssetWriter = [[AVAssetWriter alloc] initWithURL:_movieURL fileType:AVFileTypeQuickTimeMovie
error:&error];
_assetWriter = newAssetWriter;
_assetWriter.shouldOptimizeForNetworkUse = NO;
CGFloat videoWidth = size.width;
CGFloat videoHeight = size.height;
NSUInteger numPixels = videoWidth * videoHeight;
NSUInteger bitsPerSecond;
// Assume that lower-than-SD resolutions are intended for streaming, and use a lower bitrate
// if ( numPixels < (640 * 480) )
// bitsPerPixel = 4.05; // This bitrate matches the quality produced by AVCaptureSessionPresetMedium or Low.
// else
CGFloat bitsPerPixel = 11.4; // This bitrate matches the quality produced by AVCaptureSessionPresetHigh. (A floating-point type keeps the fractional part; NSUInteger would truncate it to 11.)
bitsPerSecond = numPixels * bitsPerPixel;
NSDictionary *videoCompressionSettings = @{AVVideoCodecKey : AVVideoCodecH264,
    AVVideoWidthKey : @(videoWidth),
    AVVideoHeightKey : @(videoHeight),
    AVVideoCompressionPropertiesKey : @{ AVVideoAverageBitRateKey : @(bitsPerSecond)}
};
if (![_assetWriter canApplyOutputSettings:videoCompressionSettings forMediaType:AVMediaTypeVideo]) {
    NSLog(@"Couldn't add asset writer video input.");
    return;
}
_assetWriterVideoInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo
outputSettings:videoCompressionSettings
sourceFormatHint:formatDescription];
_assetWriterVideoInput.expectsMediaDataInRealTime = YES;
NSDictionary *adaptorDict = @{
    (id)kCVPixelBufferPixelFormatTypeKey : @(kCVPixelFormatType_32BGRA),
    (id)kCVPixelBufferWidthKey : @(videoWidth),
    (id)kCVPixelBufferHeightKey : @(videoHeight)
};
_pixelBufferAdaptor = [[AVAssetWriterInputPixelBufferAdaptor alloc]
initWithAssetWriterInput:_assetWriterVideoInput
sourcePixelBufferAttributes:adaptorDict];
// Add asset writer input to asset writer
if (![_assetWriter canAddInput:_assetWriterVideoInput]) {
return;
}
[_assetWriter addInput:_assetWriterVideoInput];
captureOutput method is very simple. I get the image from the filter and write it to file using:
if (videoJustStartWriting)
[_assetWriter startSessionAtSourceTime:presentationTime];
CVPixelBufferRef renderedOutputPixelBuffer = NULL;
OSStatus err = CVPixelBufferPoolCreatePixelBuffer(nil,
_pixelBufferAdaptor.pixelBufferPool,
&renderedOutputPixelBuffer);
if (err) return; // NSLog(@"Cannot obtain a pixel buffer from the buffer pool");
//_ciContext is a metal context
[_ciContext render:finalImage
toCVPixelBuffer:renderedOutputPixelBuffer
bounds:[finalImage extent]
colorSpace:_sDeviceRgbColorSpace];
[self writeVideoPixelBuffer:renderedOutputPixelBuffer
withInitialTime:presentationTime];
- (void)writeVideoPixelBuffer:(CVPixelBufferRef)pixelBuffer withInitialTime:(CMTime)presentationTime
{
    if ( _assetWriter.status == AVAssetWriterStatusUnknown ) {
        // If the asset writer status is unknown, implies writing hasn't started yet, hence start writing with start time as the buffer's presentation timestamp
        if ([_assetWriter startWriting]) {
            [_assetWriter startSessionAtSourceTime:presentationTime];
        }
    }
    if ( _assetWriter.status == AVAssetWriterStatusWriting ) {
        // If the asset writer status is writing, append sample buffer to its corresponding asset writer input
        if (_assetWriterVideoInput.readyForMoreMediaData) {
            if (![_pixelBufferAdaptor appendPixelBuffer:pixelBuffer withPresentationTime:presentationTime]) {
                // The format string needs a %@ specifier for the failure reason to be printed
                NSLog(@"error: %@", [_assetWriter.error localizedFailureReason]);
            }
        }
    }
    if ( _assetWriter.status == AVAssetWriterStatusFailed ) {
        NSLog(@"failed");
    }
}
I put the whole thing to shoot at 240 fps. These are presentation times of frames being appended.
time ======= 113594.311510508
time ======= 113594.324011508
time ======= 113594.328178716
time ======= 113594.340679424
time ======= 113594.344846383
if you do some calculation between them you will see that the framerate is about 240 fps. So the frames are being stored with the correct time.
But when I watch the video, the movement is not in slow motion and QuickTime says the video is 30 fps.
Note: this app grabs frames from the camera, the frames go into CIFilters, and the result of those filters is converted back to a sample buffer that is stored to file and displayed on the screen.
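A quick way to check the logged timestamps numerically is to look at the gaps between consecutive values as well as the average rate. The sketch below is a minimal illustration; the helper names are hypothetical and the input is just the five values logged above. For those five samples the gaps are not uniform: the short ones are close to 1/240 s, but every other gap is about three times longer, so the average over this short run comes out nearer 120 fps. It may be worth checking over a longer run whether frames are being dropped before they are appended.

```swift
/// Gaps between consecutive timestamps, in seconds.
func frameIntervals(_ timestamps: [Double]) -> [Double] {
    guard timestamps.count > 1 else { return [] }
    return zip(timestamps.dropFirst(), timestamps).map { $0 - $1 }
}

/// Average frames per second over the whole span, or nil if it cannot be computed.
func averageFrameRate(_ timestamps: [Double]) -> Double? {
    guard timestamps.count > 1,
          let first = timestamps.first,
          let last = timestamps.last,
          last > first else { return nil }
    return Double(timestamps.count - 1) / (last - first)
}

// Presentation times copied from the log in the question.
let logged = [113594.311510508,
              113594.324011508,
              113594.328178716,
              113594.340679424,
              113594.344846383]

let gaps = frameIntervals(logged)
let fps = averageFrameRate(logged)
```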
I'm reaching here, but I think this is where you're going wrong. Think of your video capture as a pipeline.
(1) Capture buffer -> (2) Do Something With buffer -> (3) Write buffer as frames in video.
Sounds like you've successfully completed (1) and (2), you're getting the buffer fast enough and you're processing them so you can vend them as frames.
The problem is almost certainly in (3) writing the video frames.
https://developer.apple.com/reference/avfoundation/avmutablevideocomposition
Check out the frameDuration setting in your AVMutableVideoComposition; you'll need something like CMTimeMake(1, 60) // 60 FPS or CMTimeMake(1, 240) // 240 FPS to get what you're after (telling the video to WRITE this many frames and encode at this rate).
Using AVAssetWriter, it's exactly the same principle but you set the frame rate as a property in the AVAssetWriterInput outputSettings adding in the AVVideoExpectedSourceFrameRateKey.
NSDictionary *videoCompressionSettings = @{AVVideoCodecKey : AVVideoCodecH264,
    AVVideoWidthKey : @(videoWidth),
    AVVideoHeightKey : @(videoHeight),
    // AVVideoExpectedSourceFrameRateKey is a compression property, so it goes in the nested dictionary
    AVVideoCompressionPropertiesKey : @{ AVVideoAverageBitRateKey : @(bitsPerSecond),
                                         AVVideoExpectedSourceFrameRateKey : @(60),
                                         AVVideoMaxKeyFrameIntervalKey : @(1)}
};
To expand a little more - you can't strictly control or sync your camera capture exactly to the output / playback rate, the timing just doesn't work that way and isn't that exact, and of course the processing pipeline adds overhead. When you capture frames they are time stamped, which you've seen, but in the writing / compression phase, it's using only the frames it needs to produce the output specified for the composition.
It goes both ways, you could capture only 30 FPS and write out at 240 FPS, the video would display fine, you'd just have a lot of frames "missing" and being filled in by the algorithm. You can even vend only 1 frame per second and play back at 30FPS, the two are separate from each other (how fast I capture Vs how many frames and what I present per second)
As to how to play it back at different speed, you just need to tweak the playback speed - slow it down as needed.
If you've correctly set the time base (frameDuration), it will always play back "normal": you're telling it "playback is X frames per second". Of course, your eye may notice a difference (almost certainly between low FPS and high FPS), and the screen may not refresh that fast (above 60 FPS), but regardless the video will be at a "normal" 1x speed for its timebase. When slowing the video, if my timebase is 120 and I slow it to 0.5x, I now effectively see 60 FPS and one second of playback takes two seconds.
You control the playback speed by setting the rate property on AVPlayer https://developer.apple.com/reference/avfoundation/avplayer
The iOS screen refresh is locked at 60fps, so the only way to "see" the extra frames is, as you say, to slow down the playback rate, a.k.a slow motion.
So
yes, you are right
the screen refresh rate (and perhaps limitations of the human visual system, assuming you're human?) means that you cannot perceive 120 & 240fps frame rates. You can play them at normal speed by downsampling to the screen refresh rate. Surely this is what AVPlayer already does, although I'm not sure if that's the answer you're looking for.
you control the framerate of the file when you write it with the CMSampleBuffer presentation timestamps. If your frames are coming from the camera, you're probably passing the timestamps straight through, in which case check that you really are getting the framerate you asked for (a log statement in your capture callback should be enough to verify this). If you're procedurally creating frames, then you choose the presentation timestamps so that they're spaced 1.0/desiredFrameRate seconds apart!
Is 3. not working for you?
p.s. you can discard & ignore AVVideoMaxKeyFrameIntervalKey - it's a quality setting and has nothing to do with playback framerate.
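To illustrate the point about presentation timestamps, here is a minimal sketch that generates evenly spaced CMTime values for procedurally created frames. The helper name is hypothetical; in the question's code each value would be the time passed to appendPixelBuffer:withPresentationTime:. Using the frame rate as the timescale keeps the spacing exact, and choosing a lower rate for the same frames is one way (an assumption, not something the answers above tested) to bake slow motion into the file.

```swift
import CoreMedia

/// Presentation timestamps for `frameCount` frames spaced 1/frameRate seconds apart.
func presentationTimes(frameCount: Int, frameRate: Int32) -> [CMTime] {
    precondition(frameRate > 0, "frameRate must be positive")
    // Frame i is shown at i / frameRate seconds, expressed exactly as a rational time.
    return (0..<frameCount).map { CMTime(value: CMTimeValue($0), timescale: frameRate) }
}

// 241 frames at 240 fps span exactly one second of media time.
let realTime = presentationTimes(frameCount: 241, frameRate: 240)

// The same 241 frames stamped 1/30 s apart span eight seconds (8x slow motion).
let slowMotion = presentationTimes(frameCount: 241, frameRate: 30)
```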

iOS: Synchronizing frames from camera and motion data

I'm trying to capture frames from camera and associated motion data.
For synchronization I'm using timestamps. Video and motion is written to a file and then processed. In that process I can calculate motion-frames offset for every video.
It turns out that motion data and video data for the same timestamp are offset from each other by a varying amount, from 0.2 s up to 0.3 s.
This offset is constant for one video but varies from video to video.
If it was same offset every time I would be able to subtract some calibrated value but it's not.
Is there a good way to synchronize timestamps?
Maybe I'm not recording them correctly?
Is there a better way to bring them to the same frame of reference?
CoreMotion returns timestamps relative to system uptime so I add offset to get unix time:
uptimeOffset = [[NSDate date] timeIntervalSince1970] -
[NSProcessInfo processInfo].systemUptime;
CMDeviceMotionHandler blk =
^(CMDeviceMotion * _Nullable motion, NSError * _Nullable error){
if(!error){
motionTimestamp = motion.timestamp + uptimeOffset;
...
}
};
[motionManager startDeviceMotionUpdatesUsingReferenceFrame:CMAttitudeReferenceFrameXTrueNorthZVertical
toQueue:[NSOperationQueue currentQueue]
withHandler:blk];
To get frames timestamps with high precision I'm using AVCaptureVideoDataOutputSampleBufferDelegate. It is offset to unix time also:
-(void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
fromConnection:(AVCaptureConnection *)connection
{
CMTime frameTime = CMSampleBufferGetOutputPresentationTimeStamp(sampleBuffer);
if(firstFrame)
{
firstFrameTime = CMTimeMake(frameTime.value, frameTime.timescale);
startOfRecording = [[NSDate date] timeIntervalSince1970];
}
CMTime presentationTime = CMTimeSubtract(frameTime, firstFrameTime);
float seconds = CMTimeGetSeconds(presentationTime);
frameTimestamp = seconds + startOfRecording;
...
}
It is actually pretty simple to correlate these timestamps - although it's not clearly documented, both camera frame and motion data timestamps are based on the mach_absolute_time() timebase.
This is a monotonic timer that is reset at boot, but importantly also stops counting when the device is asleep. So there's no easy way to convert it to a standard "wall clock" time.
Thankfully you don't need to as the timestamps are directly comparable - motion.timestamp is in seconds, you can log out mach_absolute_time() in the callback to see it is the same timebase. My quick test shows the motion timestamp is typically about 2ms before mach_absolute_time in the handler, which seems about right for how long it might take for the data to get reported to the app.
Note mach_absolute_time() is in tick units that need conversion to nanoseconds; on iOS 10 and later you can just use the equivalent clock_gettime_nsec_np(CLOCK_UPTIME_RAW); which does the same thing.
[_motionManager
    startDeviceMotionUpdatesUsingReferenceFrame:CMAttitudeReferenceFrameXArbitraryZVertical
    toQueue:[NSOperationQueue currentQueue]
    withHandler:^(CMDeviceMotion * _Nullable motion, NSError * _Nullable error) {
        // motion.timestamp is in seconds; convert to nanoseconds
        uint64_t motionTimestampNs = (uint64_t)(motion.timestamp * 1e9);
        // Get conversion factors from ticks to nanoseconds
        struct mach_timebase_info timebase;
        mach_timebase_info(&timebase);
        // mach_absolute_time in nanoseconds
        uint64_t ticks = mach_absolute_time();
        uint64_t machTimeNs = (ticks * timebase.numer) / timebase.denom;
        int64_t difference = machTimeNs - motionTimestampNs;
        NSLog(@"Motion timestamp: %llu, machTime: %llu, difference %lli", motionTimestampNs, machTimeNs, difference);
    }];
For the camera, the timebase is also the same:
// In practice gives the same value as the CMSampleBufferGetOutputPresentationTimeStamp
// but this is the media's "source" timestamp which feels more correct
CMTime frameTime = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
uint64_t frameTimestampNs = (uint64_t)(CMTimeGetSeconds(frameTime) * 1e9);
The delay between the timestamp and the handler being called is a bit larger here, usually in the 10s of milliseconds.
We now need to consider what a timestamp on a camera frame actually means - there are two issues here; finite exposure time, and rolling shutter.
Rolling shutter means that not all scanlines of the image are actually captured at the same time - the top row is captured first and the bottom row last. This rolling readout of the data is spread over the entire frame time, so in 30 FPS camera mode the final scanline's exposure start/end time is almost exactly 1/30 second after the respective start/end time of the first scanline.
My tests indicate the presentation timestamp in the AVFoundation frames is the start of the readout of the frame - ie the end of the exposure of the first scanline. So the end of the exposure of the final scanline is frameDuration seconds after this, and the start of the exposure of the first scanline was exposureTime seconds before this. So a timestamp right in the centre of the frame exposure (the midpoint of the exposure of the middle scanline of the image) can be calculated as:
const double frameDuration = 1.0/30; // rolling shutter effect, depends on camera mode
const double exposure = CMTimeGetSeconds(avCaptureDevice.exposureDuration); // exposureDuration is a CMTime
CMTime frameTime = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
double midFrameTime = CMTimeGetSeconds(frameTime) - exposure * 0.5 + frameDuration * 0.5;
In indoor settings, the exposure usually ends up the full frame time anyway, so the midFrameTime from above ends up identical to the frameTime. The difference is noticeable (under extremely fast motion) with short exposures that you typically get from brightly lit outdoor scenes.
Why the original approach had different offsets
I think the main cause of your offset is that you assume the timestamp of the first frame is the time that the handler runs - ie it doesn't account for any delay between capturing the data and it being delivered to your app. Especially if you're using the main queue for these handlers I can imagine the callback for that first frame being delayed by the 0.2-0.3s you mention.
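Once both streams are kept in the same timebase, pairing a frame with motion data reduces to a nearest-timestamp lookup. The sketch below is one hypothetical way to do it, assuming the motion timestamps have been collected in ascending order as plain seconds and that the frame time is the mid-exposure value computed earlier; the function name is made up for illustration.

```swift
/// Index of the element in `sorted` (ascending) closest to `target`, or nil if empty.
func nearestIndex(in sorted: [Double], to target: Double) -> Int? {
    guard !sorted.isEmpty else { return nil }
    var low = 0
    var high = sorted.count - 1
    // Binary search for the first element that is >= target
    // (ends on the last index if every element is smaller).
    while low < high {
        let mid = (low + high) / 2
        if sorted[mid] < target {
            low = mid + 1
        } else {
            high = mid
        }
    }
    // The closest sample is either that element or the one just before it.
    if low > 0 && abs(sorted[low - 1] - target) <= abs(sorted[low] - target) {
        return low - 1
    }
    return low
}

// Hypothetical motion samples at 100 Hz and a frame whose mid-exposure time is 0.017 s.
let motionTimes = [0.00, 0.01, 0.02, 0.03]
let match = nearestIndex(in: motionTimes, to: 0.017)
```

Interpolating between the two neighbouring motion samples would be a natural refinement if the nearest sample is not precise enough.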
The best solution I was able to find to this problem was to run a feature tracker over the recorded video, pick one of the strong features, plot the speed of its movement along, say, the X axis, and then correlate this plot with the accelerometer Y data.
When two similar plots are offset from each other along the abscissa, a technique called cross-correlation can be used to find the offset.
There's an obvious drawback of this approach - it's slow as it requires some video processing.
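A brute-force version of the cross-correlation step might look like the sketch below. It assumes both signals (feature speed and accelerometer data) have already been resampled to the same rate and had their means removed; the function name and the maxLag parameter are hypothetical, and the score is left unnormalised for brevity.

```swift
/// Returns the lag (in samples, within -maxLag...maxLag) that maximises
/// the sum of a[i] * b[i + lag], i.e. how far `b` is delayed relative to `a`.
func bestLag(_ a: [Double], _ b: [Double], maxLag: Int) -> Int {
    precondition(maxLag >= 0, "maxLag must not be negative")
    var best = 0
    var bestScore = -Double.infinity
    for lag in -maxLag...maxLag {
        var score = 0.0
        for i in 0..<a.count {
            let j = i + lag
            // Only overlapping samples contribute to the score.
            if j >= 0 && j < b.count {
                score += a[i] * b[j]
            }
        }
        if score > bestScore {
            bestScore = score
            best = lag
        }
    }
    return best
}

// Toy data: the same small pulse, delayed by 3 samples in the second signal.
var video = [Double](repeating: 0, count: 32)
var accel = [Double](repeating: 0, count: 32)
video[10] = 1; video[11] = 2; video[12] = 1
accel[13] = 1; accel[14] = 2; accel[15] = 1

let lag = bestLag(video, accel, maxLag: 5)
```

Dividing the resulting lag by the common sample rate gives the offset in seconds.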

How to get the accurate time position of a live streaming in avplayer

I'm using AVPlayer to play a live stream. The stream supports one hour of catch-up, which means the user can seek back up to one hour and play from there. How do I know the exact position the player is playing at? I need to display the current position in the player view. For example, if the user is playing content from half an hour ago, display -30:00; if the user is playing the latest content, show 00:00 or live. Thanks
Swift solution :
override func getLiveDuration() -> Float {
var result : Float = 0.0;
if let items = player.currentItem?.seekableTimeRanges {
if(!items.isEmpty) {
let range = items[items.count - 1]
let timeRange = range.timeRangeValue
let startSeconds = CMTimeGetSeconds(timeRange.start)
let durationSeconds = CMTimeGetSeconds(timeRange.duration)
result = Float(startSeconds + durationSeconds)
}
}
return result;
}
To get the live position and seek to it, you can use the seekableTimeRanges of AVPlayerItem:
CMTimeRange seekableRange = [player.currentItem.seekableTimeRanges.lastObject CMTimeRangeValue];
CGFloat seekableStart = CMTimeGetSeconds(seekableRange.start);
CGFloat seekableDuration = CMTimeGetSeconds(seekableRange.duration);
CGFloat livePosition = seekableStart + seekableDuration;
[player seekToTime:CMTimeMake(livePosition, 1)];
Also when you seek some time back, you can get current playing position by calling currentTime method
CGFloat current = CMTimeGetSeconds([self.player.currentItem currentTime]);
CGFloat diff = livePosition - current;
I know this question is old, but I had the same requirement and I believe the solutions aren't addressing properly the intent of the question.
What I did for this same requirement was to gather the current point in time, the starting time, and the length of the total duration of the stream.
Let me explain something before going further: the current point in time can surpass (starting time + total duration). This is due to the way HLS is structured as TS segments. TS segments are small chunks of playable video; your seekable range could contain 5 TS segments of 10 seconds each. This doesn't mean that 50 seconds is the full length of the live stream. There is around one more full segment (so 60 seconds of playtime in total), but it isn't categorized as seekable since you shouldn't seek to that segment. If you did, you would notice rebuffering in most instances (because the source may still be creating the next TS segment when you have already reached the end of playback).
What I did was check whether the current stream time is beyond the seekable range; if so, this means we are live on the stream. If it isn't, you can easily calculate how far behind live you are by subtracting the starting time and total duration from the current time.
// seekableTimeRanges holds NSValue objects, so unwrap the last one as a CMTimeRange
guard let timeRange = player.currentItem?.seekableTimeRanges.last?.timeRangeValue else { return }
let start = timeRange.start.seconds
let totalDuration = timeRange.duration.seconds
let currentTime = player.currentTime().seconds
let secondsBehindLive = currentTime - totalDuration - start
The code above will give you a negative number with the number of seconds behind "live", or more specifically behind the start of the latest TS segment. It gives a positive number or zero when it's playing the latest TS segment.
Tbh I don't really know when seekableTimeRanges will have more than one value. It has always been just one for the streams I have tested with, but if you find more than one value in your streams, you may have to figure out whether you want to add up all the ranges' durations, which time range to use as the start value, etc. At least for my use case, this was enough.
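For the display the question asks about, the offset computed above can be turned into a label such as "-30:00" or "Live". The sketch below is only an illustration: the function name, the one-second tolerance, and the "Live" wording are assumptions rather than part of any of the answers.

```swift
import Foundation

/// Formats an offset from the live edge, e.g. -1800 becomes "-30:00".
/// Offsets within `tolerance` seconds of live (or ahead of it) become "Live".
func liveOffsetLabel(secondsBehindLive: Double, tolerance: Double = 1.0) -> String {
    // Non-finite values (e.g. NaN from an indefinite time) are treated as live.
    guard secondsBehindLive.isFinite, secondsBehindLive < -tolerance else {
        return "Live"
    }
    let total = Int((-secondsBehindLive).rounded())
    let hours = total / 3600
    let minutes = (total % 3600) / 60
    let seconds = total % 60
    if hours > 0 {
        return String(format: "-%d:%02d:%02d", hours, minutes, seconds)
    }
    return String(format: "-%02d:%02d", minutes, seconds)
}

let halfHourBack = liveOffsetLabel(secondsBehindLive: -1800)
let atLiveEdge = liveOffsetLabel(secondsBehindLive: 0.4)
```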

How to synchronize multiple audio files on iOS?

I like to synchronize (one-shot) audio effects with the beat of (looping) background music on iOS.
How do I approach that task?
Edit: So, to give more details, say the background music loops over 4 bars. I want to be able to start the playback of another audio file (of an audio effect) on the next 8th (or 16th or 4th...) note.
Simple. Put them on the same timeline (e.g. audio playback callback), and use the input tempo to determine the metric intervals in samples (or ticks, if using MIDI events).
Update
#include <assert.h>
#include <math.h>
#include <stdint.h>

double SamplesPerBeat(const double audioSampleRate,
                      const double beatsPerMinute) {
    assert(8000.0 <= audioSampleRate && 192000.0 >= audioSampleRate);
    assert(20.0 <= beatsPerMinute && 500.0 >= beatsPerMinute);
    // samples/second * 60 seconds/minute / (beats/minute) = samples/beat
    return audioSampleRate * 60.0 / beatsPerMinute;
}

uint32_t StartPosition(const double audioSampleRate,
                       const double beatsPerMinute,
                       const uint32_t beatNumber) {
    // Pass the tempo (not the beat number) to get the length of one beat
    const double samplesPerBeat = SamplesPerBeat(audioSampleRate, beatsPerMinute);
    return (uint32_t)floor(samplesPerBeat * beatNumber);
}
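For the case described in the question's edit (start an effect on the next 8th or 16th note), the same arithmetic can be extended to find the next subdivision boundary after the current playback position. This is a minimal sketch with hypothetical names, assuming the loop starts at sample 0 and the tempo is constant.

```swift
/// Number of samples in one beat (quarter note) at the given tempo.
func samplesPerBeat(sampleRate: Double, beatsPerMinute: Double) -> Double {
    return sampleRate * 60.0 / beatsPerMinute
}

/// First sample position at or after `position` that lands on a subdivision boundary.
/// `subdivisionsPerBeat` is 1 for quarter notes, 2 for 8th notes, 4 for 16th notes.
func nextSubdivisionStart(position: Int64,
                          sampleRate: Double,
                          beatsPerMinute: Double,
                          subdivisionsPerBeat: Double) -> Int64 {
    let interval = samplesPerBeat(sampleRate: sampleRate,
                                  beatsPerMinute: beatsPerMinute) / subdivisionsPerBeat
    // Round the current position up to the next whole number of intervals.
    let index = (Double(position) / interval).rounded(.up)
    return Int64((index * interval).rounded())
}

// At 44.1 kHz and 120 BPM an 8th note is 11025 samples long.
let nextEighth = nextSubdivisionStart(position: 12_000,
                                      sampleRate: 44_100,
                                      beatsPerMinute: 120,
                                      subdivisionsPerBeat: 2)
```

The returned sample position is where the one-shot effect would be scheduled on the shared timeline.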

IOS AVPlayer get fps

I'm trying to figure out how to retrieve a video's frame rate via AVPlayer. AVPlayerItem has a rate variable but it only returns a value between 0 and 2 (usually 1 when playing). Does anybody have an idea how to get the video frame rate?
Cheers
Use AVAssetTrack's nominalFrameRate property.
The method below gets the frame rate. Here queuePlayer is an AVPlayer:
- (float)getFrameRateFromAVPlayer
{
    float fps = 0.00;
    // Take the asset from the player's current item
    AVAsset *videoAsset = self.queuePlayer.currentItem.asset;
    if (videoAsset) {
        AVAssetTrack *videoATrack = [[videoAsset tracksWithMediaType:AVMediaTypeVideo] lastObject];
        if (videoATrack) {
            fps = videoATrack.nominalFrameRate;
        }
    }
    return fps;
}
Swift 4 version of the answer:
let asset = avplayer.currentItem?.asset // currentItem is optional
let tracks = asset?.tracks(withMediaType: .video)
let fps = tracks?.first?.nominalFrameRate
Remember to handle nil checking.
There seems to be a discrepancy in this nominalFrameRate returned for the same media played on different versions of iOS. I have a video I encoded with ffmpeg at 1 frame per second (125 frames) with keyframes every 25 frames and when loading in an app on iOS 7.x the (nominal) frame rate is 1.0, while on iOS 8.x the (nominal) frame rate is 0.99. This seems like a very small difference, however in my case I need to navigate precisely to a given frame in the movie and this difference screws up such navigation (the movie is an encoding of a sequence of presentation slides). Given that I already know the frame rate of the videos my app needs to play (e.g. 1 fps) I can simply rely on this value instead of determining the frame rate dynamically (via nominalFrameRate value), however I wonder WHY there is such discrepancy between iOS versions as far as this nominalFrameRate goes. Any ideas?
The rate value on AVPlayer is the speed relative to real time to which it's playing, eg 0.5 is slow motion, 2 is double speed.
As Paresh Navadiya points out, a track also has a nominalFrameRate variable; however, this seems to sometimes give strange results. The best solution I've found so far is to use the following:
CMTime frameDuration = [myAsset tracksWithMediaType:AVMediaTypeVideo][0].minFrameDuration;
float fps = frameDuration.timescale/(float)frameDuration.value;
The above gives slightly unexpected results for variable frame rate but variable frame rate has slightly odd behavior anyway. Other than that it matches ffmpeg -i in my tests.
EDIT ----
I've found that the above sometimes gives a time of kCMTimeZero. The workaround I've used for this is to create an AVAssetReader with a track output, get the PTS of the first and second frames, then subtract one from the other.
I don't know anything in AVPlayer that can help you to calculate the frame rate.
AVPlayer's rate property is the playback rate; it has nothing to do with the frame rate.
The easier option is to obtain an AVAssetTrack and read its nominalFrameRate property. Just create an AVAsset and you'll get an array of tracks.
Or use AVAssetReader to read the video frame by frame, get its presentation time and count how many frames are in the same second, then average for a few seconds or the whole video.
This is not going to work anymore; the API has changed and this post is old. :(
The Swift 4 answer is also cool, and this answer is similar.
You get the video track from the AVPlayerItem, and you check the FPS there. :)
private var numberOfRenderingFailures = 0
func isVideoRendering() -> Bool {
    guard let currentItem = player.currentItem else { return false }
    // Check if we are playing video tracks
    let isRendering = currentItem.tracks.contains { ($0.assetTrack?.mediaType == .video) && ($0.currentVideoFrameRate > 5) }
    if isRendering {
        numberOfRenderingFailures = 0
        return true
    }
    numberOfRenderingFailures += 1
    if numberOfRenderingFailures < 5 {
        return true
    }
    return false
}
