AVAudioEngine reconcile/sync input/output timestamps on macOS/iOS

AVAudioEngine reconcile/sync input/output timestamps on macOS/iOS - ios

I'm attempting to sync recorded audio (from an AVAudioEngine inputNode) to an audio file that was playing during the recording process. The result should be like multitrack recording where each subsequent new track is synced with the previous tracks that were playing at the time of recording.
Because sampleTime differs between the AVAudioEngine's output and input nodes, I use hostTime to determine the offset of the original audio and the input buffers.
On iOS, I would assume that I'd have to use AVAudioSession's various latency properties (inputLatency, outputLatency, ioBufferDuration) to reconcile the tracks as well as the host time offset, but I haven't figured out the magic combination to make them work. The same goes for the various AVAudioEngine and Node properties like latency and presentationLatency.
On macOS, AVAudioSession doesn't exist (outside of Catalyst), meaning I don't have access to those numbers. Meanwhile, the latency/presentationLatency properties on the AVAudioNodes report 0.0 in most circumstances. On macOS, I do have access to AudioObjectGetPropertyData and can ask the system about kAudioDevicePropertyLatency, kAudioDevicePropertyBufferSize,kAudioDevicePropertySafetyOffset, etc, but am again at a bit of a loss as to what the formula is to reconcile all of these.
I have a sample project at https://github.com/jnpdx/AudioEngineLoopbackLatencyTest that runs a simple loopback test (on macOS, iOS, or Mac Catalyst) and shows the result. On my Mac, the offset between tracks is ~720 samples. On others' Macs, I've seen as much as 1500 samples offset.
On my iPhone, I can get it close to sample-perfect by using AVAudioSession's outputLatency + inputLatency. However, the same formula leaves things misaligned on my iPad.
What's the magic formula for syncing the input and output timestamps on each platform? I know it may be different on each, which is fine, and I know I won't get 100% accuracy, but I would like to get as close as possible before going through my own calibration process
Here's a sample of my current code (full sync logic can be found at https://github.com/jnpdx/AudioEngineLoopbackLatencyTest/blob/main/AudioEngineLoopbackLatencyTest/AudioManager.swift):
//Schedule playback of original audio during initial playback
let delay = 0.33 * state.secondsToTicks
let audioTime = AVAudioTime(hostTime: mach_absolute_time() + UInt64(delay))
state.audioBuffersScheduledAtHost = audioTime.hostTime
...
//in the inputNode's inputTap, store the first timestamp
audioEngine.inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (pcmBuffer, timestamp) in
if self.state.inputNodeTapBeganAtHost == 0 {
self.state.inputNodeTapBeganAtHost = timestamp.hostTime
}
}
...
//after playback, attempt to reconcile/sync the timestamps recorded above
let timestampToSyncTo = state.audioBuffersScheduledAtHost
let inputNodeHostTimeDiff = Int64(state.inputNodeTapBeganAtHost) - Int64(timestampToSyncTo)
let inputNodeDiffInSamples = Double(inputNodeHostTimeDiff) / state.secondsToTicks * inputFileBuffer.format.sampleRate //secondsToTicks is calculated using mach_timebase_info
//play the original metronome audio at sample position 0 and try to sync everything else up to it
let originalAudioTime = AVAudioTime(sampleTime: 0, atRate: renderingEngine.mainMixerNode.outputFormat(forBus: 0).sampleRate)
originalAudioPlayerNode.scheduleBuffer(metronomeFileBuffer, at: originalAudioTime, options: []) {
print("Played original audio")
}
//play the tap of the input node at its determined sync time -- this _does not_ appear to line up in the result file
let inputAudioTime = AVAudioTime(sampleTime: AVAudioFramePosition(inputNodeDiffInSamples), atRate: renderingEngine.mainMixerNode.outputFormat(forBus: 0).sampleRate)
recordedInputNodePlayer.scheduleBuffer(inputFileBuffer, at: inputAudioTime, options: []) {
print("Input buffer played")
}
When running the sample app, here's the result I get:

This answer is applicable to native macOS only
General Latency Determination
Output
In the general case the output latency for a stream on a device is determined by the sum of the following properties:
kAudioDevicePropertySafetyOffset
kAudioStreamPropertyLatency
kAudioDevicePropertyLatency
kAudioDevicePropertyBufferFrameSize
The device safety offset, stream, and device latency values should be retrieved for kAudioObjectPropertyScopeOutput.
On my Mac for the audio device MacBook Pro Speakers at 44.1 kHz this equates to 71 + 424 + 11 + 512 = 1018 frames.
Input
Similarly, the input latency is determined by the sum of the following properties:
kAudioDevicePropertySafetyOffset
kAudioStreamPropertyLatency
kAudioDevicePropertyLatency
kAudioDevicePropertyBufferFrameSize
The device safety offset, stream, and device latency values should be retrieved for kAudioObjectPropertyScopeInput.
On my Mac for the audio device MacBook Pro Microphone at 44.1 kHz this equates to 114 + 2404 + 40 + 512 = 3070 frames.
AVAudioEngine
How the information above relates to AVAudioEngine is not immediately clear. Internally AVAudioEngine creates a private aggregate device and Core Audio essentially handles latency compensation for aggregate devices automatically.
During experimentation for this answer I've found that some (most?) audio devices don't report latency correctly. At least that is how it seems, which makes accurate latency determination nigh impossible.
I was able to get fairly accurate synchronization using my Mac's built-in audio using the following adjustments:
// Some non-zero value to get AVAudioEngine running
let startDelay = 0.1
// The original audio file start time
let originalStartingFrame: AVAudioFramePosition = AVAudioFramePosition(playerNode.outputFormat(forBus: 0).sampleRate * startDelay)
// The output tap's first sample is delivered to the device after the buffer is filled once
// A number of zero samples equal to the buffer size is produced initially
let outputStartingFrame: AVAudioFramePosition = Int64(state.outputBufferSizeFrames)
// The first output sample makes it way back into the input tap after accounting for all the latencies
let inputStartingFrame: AVAudioFramePosition = outputStartingFrame - Int64(state.outputLatency + state.outputStreamLatency + state.outputSafetyOffset + state.inputSafetyOffset + state.inputLatency + state.inputStreamLatency)
On my Mac the values reported by the AVAudioEngine aggregate device were:
// Output:
// kAudioDevicePropertySafetyOffset: 144
// kAudioDevicePropertyLatency: 11
// kAudioStreamPropertyLatency: 424
// kAudioDevicePropertyBufferFrameSize: 512
// Input:
// kAudioDevicePropertySafetyOffset: 154
// kAudioDevicePropertyLatency: 0
// kAudioStreamPropertyLatency: 2404
// kAudioDevicePropertyBufferFrameSize: 512
which equated to the following offsets:
originalStartingFrame = 4410
outputStartingFrame = 512
inputStartingFrame = -2625

I may not be able to answer your question, but I believe there is a property not mentioned in your question that does report additional latency information.
I've only worked at the HAL/AUHAL layers (never AVAudioEngine), but in discussions about computing the overall latencies, some audio device/stream properties come up: kAudioDevicePropertyLatency and kAudioStreamPropertyLatency.
Poking around a bit, I see those properties mentioned in the documentation for AVAudioIONode's presentationLatency property (https://developer.apple.com/documentation/avfoundation/avaudioionode/1385631-presentationlatency). I expect that the hardware latency reported by the driver will be there. (I suspect that the standard latency property reports latency for an input sample to appear in the output of a "normal" node, and IO case is special)
It's not in the context of AVAudioEngine, but here's one message from the CoreAudio mailing list that talks a bit about using the low level properties that may provide some additional background: https://lists.apple.com/archives/coreaudio-api/2017/Jul/msg00035.html

Related

AVAudioPlayerNode - Mix between schedule buffers and segments

I'm writing an application where I should play parts of audio files. Each audio file contains audio data for a separate track.
These parts are sections with a begin time and a end time, and I'm trying to play those parts in the order I choose.
So for example, imagine I have 4 sections :
A - B - C - D
and I activate B and D, I want to play, B, then D, then B again, then D, etc..
To make smooth 'jumps" in playback I think it's important to fade in/out start/end sections buffers.
So, I have a basic AVAudioEngine setup, with AVAudioPlayerNode, and a mixer.
For each audio section, I cache some information :
a buffer for the first samples in the section (which I fade in manually)
a tuple for the AVAudioFramePosition, and AVAudioFrameCount of a middle segment
a buffer for the end samples in the audio section (which I fade out manually)
now, when I schedule a section for playing, I say the AVAudioPlayerNode :
schedule the start buffer (scheduleBuffer(_:completionHandler:) no option)
schedule the middle segment (scheduleSegment(_:startingFrame:frameCount:at:completionHandler:))
finally schedule the end buffer (scheduleBuffer(_:completionHandler:) no option)
all at "time" nil.
The problem here is I can hear clic, and crappy sounds at audio sections boundaries and I can't see where I'm doing wrong.
My first idea was the fades I do manually (basically multiplying sample values by a volume factor), but same result without doing that.
I thought I didn't schedule in time, but scheduling sections in advance, A - B - C for example beforehand has the same result.
I then tried different frame position computations, with audio format settings, same result.
So I'm out of ideas here, and perhaps I didn't get the schedule mechanism right.
Can anyone confirm I can mix scheduling buffers and segments in AVAudioPlayerNode ? or should I schedule only buffers or segments ?
I can confirm that scheduling only segments works, playback is perfectly fine.
A little context on how I cache information for audio sections..
In the code below, file is of type AVAudioFile loaded on disk from a URL, begin and end are TimeInterval values, and represent the start/end of my audio section.
let format = file.processingFormat
let startBufferFrameCount: AVAudioFrameCount = 4096
let endBufferFrameCount: AVAudioFrameCount = 4096
let audioSectionStartFrame = framePosition(at: begin, format: format)
let audioSectionEndFrame = framePosition(at: end, format: format)
let segmentStartFrame = audioSectionStartFrame + AVAudioFramePosition(startBufferFrameCount)
let segmentEndFrame = audioSectionEndFrame - AVAudioFramePosition(endBufferFrameCount)
startBuffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: startBufferFrameCount)
endBuffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: endBufferFrameCount)
file.framePosition = audioSectionStartFrame
try file.read(into: startBuffer)
file.framePosition = segmentEndFrame
try file.read(into: endBuffer)
middleSegment = (segmentStartFrame, AVAudioFrameCount(segmentEndFrame - segmentStartFrame))
frameCount = AVAudioFrameCount(audioSectionEndFrame - audioSectionStartFrame)
Also, the framePosition(at:format:) multiplies the TimeInterval value by the sample rate of the AVAudioFormat passed in.
I cache this information for every audio section, but I hear clicks at section boundaries, no matter if I schedule them in advance or not.
I also tried not mixing buffer and segments when scheduling, but I doesn't change anything, so I start thinking I'm doing wrong frame computations.

Simple Babymonitor with Bass.DLL

I am trying to program a simple Babymonitor for Windows (personal use).
The babymonitor should just detect the dB level of the microphone and triggers at a certain volume.
After some research, I found the Bass.dll library and came across it's function BASS_ChannelGetLevel, which is great but seems to have limitations and doesn't fit my needs (Peak equals to a DWORD value).
In the examples I found a livespec example which is "almost" what I need. The example uses BASS_ChannelGetData, but I don't quite know how to handle the returned array...
I want to keep it as simple as possible: Detect the volume from the microphone as dB or any other value (e.g. value 0-MAXINT).
How can this be done with the Bass.dll library?

The BASS_ChannelGetLevel returns the value that is capped to 0dB (return value is 32768 in this case). If you adjust your source level (lower microphone level in sound card settings) then it will work just fine.
Another way, if you want to get uncapped value is to use the BASS_ChannelGetLevelEx function instead: it returns floating point levels, where 1 is maximum (0dB) value that corresponds to BASS_ChannelGetLevel's 32767, but it can exceed 1 to detect sound levels above 0dB which is what you may need.
I also suggest you to monitor sound level for a while: trigger only if certain level exists for 2-3 seconds at least (this way you will exclude false alarms).

Here is how you obtain the db level given an input stream handle (streamHandle):
var peak = (double)Bass.BASS_ChannelGetLevel(streamHandle);
var decibels = 20 * Math.Log10(peak / Int32.MaxValue);
Alternatively, you can use the following to get the RMS (average) peak. To get the RMS value, you have to pass in a sample length into BASS_ChannelGetLevel. I'm using 20 milliseconds here but you can play with the value to see which works best for your needs.
var decibels = 0m;
var channelCount = 2; //Assuming two channels
var sampleLengthMS = 20f;
var rmsLevels = new float[channelCount];
var rmsObtained = Bass.BASS_ChannelGetLevel(streamHandle, rmsLevels, sampleLengthMS / 1000f, BASSLevel.BASS_LEVEL_RMS);
if (rmsObtained)
decibels = 20*Math.Log10(rmsLevels[0]); //using first channel (index 0) but you can get both if needed.
else
Console.WriteLine(Bass.BASS_ErrorGetCode());
Hope this helps.

Xcode iOS: how to find position of one audio file (snippet) inside another?

I have audio files, with different durations. They have common content and unique content. E.g. two files, 70 seconds each, last 10 seconds of the first file is the same as first two seconds of the second file. How can I find the exact position of common content (e.g. 60.0 of the first file)?
Sounds a little bit messy, hope the following image can help https://drive.google.com/file/d/0BzBE2Kfw8uQoUWNTN1RXOEtLVEk/view?usp=sharing
So, I'm looking for the red mark - common content starts at 60.0 sec of the first file.
The problem is that I have files with different durations. Sometimes it's 70 seconds long, sometimes one file is 70 seconds, the other is 80 seconds long, etc. Most likely they have 60.0 seconds of unique content, but I'm not sure (it could be 59.9 of unique content, etc.).
Thus, I assume I need to get a short snippet of the second file from first 10 seconds and find it in the first file:
For example, output: 2.5 sec of the second file = 62.5 from the first file - works for me, as well.
THE MAIN GOAL IS TO PLAY FILE AFTER FILE GAPLESS. If I get the values, I'll be able to do this. Sometimes the values can be: 2.5 = 63.7, that's why I need the exact match.
Can anybody help with the code or at least some information of how to compare two snippets of audio content? Thanks in advance!

Wow, that is quite a problem to solve. And I must confess that i've not done anything exactly like this or have any code based suggestions.
All I will say is that if I were looking to try and solve this problem, then I would try and save the audio file as some kind of uncompressed and fixed size (as in a known number of bytes per second) format.
Then you could take a section of one file and byte match it with another, then you would know how many bytes inwards that snippet occurred. Then, knowing the bytes per ms (sort of frame size), you could work out the exact time position.
It's a bit hair brained, but i've used that technique with images before but at least audio is linear!
Here is an approximate example of how I would go about doing the comparison of a sample within a sound file.
- (int)positionOf:(NSData*)sample inData:(NSData*)soundfile {
// the block size has to be big enough to find something genuinely unique but small enough to ensure it is still fast.
int blockSize = 128;
int position = 0;
int returnPosition = INT32_MAX;
// check to see if the block size exceeds the sample or data file size
if (soundfile.length < blockSize || sample.length < blockSize) {
return returnPosition;
}
// create a byte array of the sample, ready to use to compare with the shifting buffer
char* sampleByteArray = malloc(sample.length);
memcpy(sampleByteArray, sample.bytes, sample.length);
// now loop through the sound file, shifting the window along.
while (position < (soundfile.length - blockSize)) {
char* window = malloc(blockSize);
memcpy(window, soundfile.bytes + position, blockSize);
// check to see if this is a match
if(!memcmp(sampleByteArray, window, blockSize)) {
// these are the same, now to check if the whole sample is the same
if ((position + sample.length) > soundfile.length) {
// the sample won't fit in the remaining soundfile, so it can't be this!
free(window);
break;
}
if(!memcmp(sampleByteArray, soundfile.bytes + position, sample.length)) {
// this is an entire match, position marks the start in bytes of the sample.
free(window);
returnPosition = position;
break;
}
}
free(window);
position++;
}
free(sampleByteArray);
return returnPosition;
}
It compiles, didn't have time to setup the scenario to check your exact case, but i'm quite confident this may help.

Calculate PTS before frame encoding in FFmpeg

How to calculate correct PTS value for frame before encoding in FFmpeg C API?
For encoding I'm using function avcodec_encode_video2 and then writing it by av_interleaved_write_frame.
I found some formulas, but none of them work.
In doxygen example they are using
frame->pts = 0;
for (;;) {
// encode & write frame
// ...
frame->pts += av_rescale_q(1, video_st->codec->time_base, video_st->time_base);
}
This blog says that formula must be like this:
(1 / FPS) * sample rate * frame number
Someone uses only frame number to set pts:
frame->pts = videoCodecCtx->frame_number;
Or an alternative way:
int64_t now = av_gettime();
frame->pts = av_rescale_q(now, (AVRational){1, 1000000}, videoCodecCtx->time_base);
And the last one:
// 40 * 90 means 40 ms and 90 because of the 90kHz by the standard for PTS-values.
frame->pts = encodedFrames * 40 * 90;
Which one is correct? I think answer for this question will be helpful for not only for me.

It's better to think about PTS more abstractly before trying code.
What you're doing is meshing 3 "time sets" together. The first is time we're used to, based on 1000 ms per second, 60 seconds per minute, and so on. The second is the codec time for the particular codec you are using. Each codec has a certain way it wants to represent time, usually in a 1/number format meaning that for every second there is "number" amount of ticks. The third format works similar to the second except that it is the time base for the container that you are used.
Some people prefer to start with actual time, others frame count, neither is "wrong".
Starting with a frame count you need to first convert it based on your frame rate. Note all conversions I speak of use av_rescale_q(...). The purpose of this conversion is to turn a counter into time, so you rescale with your frame rate (video steam time base usually). Then you have to convert that into the time_base of your video codec before encoding.
Similarly, with a real time, your first conversion needs to be from current_time - start_time scaled to your video codec time.
Anyone using only frame counter is probably using a codec with a time_base equal to their frame rate. Most codecs do not work like this and their hack is not portable. Example:
frame->pts = videoCodecCtx->frame_number; // BAD
Additionally, anyone using hardcoded numbers in their av_rescale_q is leveraging the fact that they know what their time_base is and this should be avoided. The code isn't portable to other video formats. Instead use video_st->time_base, video_st->codec->time_base, and output_ctx->time_base to figure things out.
I hope understanding it from a higher level will help you see which of those are "correct" and which are "bad practice". There is no single answer, but maybe now you can decide which approach is best for you.

Time is measured not in seconds or milliseconds or any standard unit. Instead, it is measured by the avCodecContext's timebase.
So if you set the codecContext->time_base to 1/1, it means using second for measurement.
cctx->time_base = (AVRational){1, 1};
Assuming you want to encode at a steady fps of 30. Then, the time when a frame is encoded is framenumber * (1.0/fps)
But once again, the PTS is also not measured in seconds or any standard unit. It's measured by avStream's time_base.
In the question, the author mentioned 90k as the standard resolution for pts. But you will see that this is not always true. The exact resolution is saved in avstream. you can read it back by:
if ((err = avformat_write_header(ofctx, NULL)) < 0) {
std::cout << "Failed to write header" << err << std::endl;
return -1;
}
av_dump_format(ofctx, 0, "test.webm", 1);
std::cout << stream->time_base.den << " " << stream->time_base.num << std::endl;
The value of stream->time_stamp is only populated after calling avformat_write_header
Therefore, the right formula for calculating PTS is:
//The following assumes that codecContext->time_base = (AVRational){1, 1};
videoFrame->pts = frameduration * (frameCounter++) * stream->time_base.den / (stream->time_base.num * fps);
So really there are 3 components in the formula,
fps
codecContext->time_base
stream->time_base
so pts = fps*codecContext->time_base/stream->time_base
I have detailed my discovery here

There's also the option with setting it like frame->pts = av_frame_get_best_effort_timestamp(frame) but I'm not sure this is the correct approach either.

Detect SSD using Delphi [duplicate]

I'm getting ready to release a tool that is only effective with regular hard drives, not SSD (solid state drive). In fact, it shouldn't be used with SSD's because it will result in a lot of read/writes with no real effectiveness.
Anyone knows of a way of detecting if a given drive is solid-state?

Finally a reliable solution! Two of them, actually!
Check /sys/block/sdX/queue/rotational, where sdX is the drive name. If it's 0, you're dealing with an SSD, and 1 means plain old HDD.
I can't put my finger on the Linux version where it was introduced, but it's present in Ubuntu's Linux 3.2 and in vanilla Linux 3.6 and not present in vanilla 2.6.38. Oracle also backported it to their Unbreakable Enterprise kernel 5.5, which is based on 2.6.32.
There's also an ioctl to check if the drive is rotational since Linux 3.3, introduced by this commit. Using sysfs is usually more convenient, though.

You can actually fairly easily determine the rotational latency -- I did this once as part of a university project. It is described in this report. You'll want to skip to page 7 where you see some nice graphs of the latency. It goes from about 9.3 ms to 1.1 ms -- a drop of 8.2 ms. That corresponds directly to 60 s / 8.2 ms = 7317 RPM.
It was done with simple C code -- here's the part that measures the between positions aand b in a scratch file. We did this with larger and larger b values until we have been wandered all the way around a cylinder:
/* Measure the difference in access time between a and b. The result
* is measured in nanoseconds. */
int measure_latency(off_t a, off_t b) {
cycles_t ta, tb;
overflow_disk_buffer();
lseek(work_file, a, SEEK_SET);
read(work_file, buf, KiB/2);
ta = get_cycles();
lseek(work_file, b, SEEK_SET);
read(work_file, buf, KiB/2);
tb = get_cycles();
int diff = (tb - ta)/cycles_per_ns;
fprintf(stderr, "%i KiB to %i KiB: %i nsec\n", a / KiB, b / KiB, diff);
return diff;
}

This command lsblk -d -o name,rota lists your drives and has a 1 at ROTA if it's a rotational disk and a 0 if it's an SSD.
Example output :
NAME ROTA
sda 1
sdb 0

Detecting SSDs is not as impossible as dseifert makes out. There is already some progress in linux's libata (http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg03625.html), though it doesn't seem user-ready yet.
And I definitely understand why this needs to be done. It's basically the difference between a linked list and an array. Defragmentation and such is usually counter-productive on a SSD.

You could get lucky by running
smartctl -i sda
from Smartmontools. Almost all SSDs has SSD in the Model field. No guarantee though.

My two cents to answering this old but very important question... If a disk is accessed via SCSI, then you will (potentially) be able to use SCSI INQUIRY command to request its rotational rate. VPD (Vital Product Data) page for that is called Block Device Characteristics and has a number 0xB1. Bytes 4 and 5 of this page contain a number with meaning:
0000h "Medium rotation rate is not reported"
0001h "Non-rotating medium (e.g., solid state)"
0002h - 0400h "Reserved"
0401h - FFFEh "Nominal medium rotation rate in rotations per minute (i.e.,
rpm) (e.g., 7 200 rpm = 1C20h, 10 000 rpm = 2710h, and 15 000 rpm = 3A98h)"
FFFFh "Reserved"
So, SSD must have 0001h in this field. The T10.org document about this page can be found here.
However, the implementation status of this standard is not clear to me.

I wrote the following javascript code. I needed to determine if machine was ussing SSD drive and if it was boot drive. The solution uses MSFT_PhysicalDisk WMI interface.
function main()
{
var retval= false;
// MediaType - 0 Unknown, 3 HDD, 4 SSD
// SpindleSpeed - -1 has rotational speed, 0 has no rotational speed (SSD)
// DeviceID - 0 boot device
var objWMIService = GetObject("winmgmts:\\\\.\\root\\Microsoft\\Windows\\Storage");
var colItems = objWMIService.ExecQuery("select * from MSFT_PhysicalDisk");
var enumItems = new Enumerator(colItems);
for (; !enumItems.atEnd(); enumItems.moveNext())
{
var objItem = enumItems.item();
if (objItem.MediaType == 4 && objItem.SpindleSpeed == 0)
{
if (objItem.DeviceID ==0)
{
retval=true;
}
}
}
if (retval)
{
WScript.Echo("You have SSD Drive and it is your boot drive.");
}
else
{
WScript.Echo("You do not have SSD Drive");
}
return retval;
}
main();

SSD devices emulate a hard disk device interface, so they can just be used like hard disks. This also means that there is no general way to detect what they are.
You probably could use some characteristics of the drive (latency, speed, size), though this won't be accurate for all drives. Another possibility may be to look at the S.M.A.R.T. data and see whether you can determine the type of disk through this (by model name, certain values), however unless you keep a database of all drives out there, this is not gonna be 100% accurate either.

write text file
read text file
repeat 10000 times...
10000/elapsed
for an ssd will be much higher, python3:
def ssd_test():
doc = 'ssd_test.txt'
start = time.time()
for i in range(10000):
with open(doc, 'w+') as f:
f.write('ssd test')
f.close()
with open(doc, 'r') as f:
ret = f.read()
f.close()
stop = time.time()
elapsed = stop - start
ios = int(10000/elapsed)
hd = 'HDD'
if ios > 6000: # ssd>8000; hdd <4000
hd = 'SSD'
print('detecting hard drive type by read/write speed')
print('ios', ios, 'hard drive type', hd)
return hd

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart