(Mis)Understanding USB Audio

I’ve read a lot of threads (in various places), and had numerous discussions (online and in person), that demonstrate an enormous level of misunderstanding about USB audio - especially UAC2 (USB Audio Class 2), which is the USB audio specification that almost all modern DACs use.

Lots of magical attributes are assigned to USB Audio, most of which are either due to faulty assumptions, lack of understanding, or pure flights of fantasy couched in technical terms intended to support either subjective or objective perspectives.

NOTE: Before you go reaching for your pitchforks … what this post (and possible subsequent posts) does NOT say is that there cannot be differences in audible, or measured, results from different USB systems. A USB system being the aggregate of USB source, chipset, cabling and USB receiver.

But it will address some of the common misconceptions and why, while differences may exist, they cannot be for the reasons usually claimed.

Everything here can be validated against the USB 2.0 specification, including the UAC2 components thereof.

USB Audio/UAC2 Does NOT Guarantee Bit-Perfect Delivery

There are USB protocols that do (e.g. bulk-transfer mode for USB storage); USB Audio/UAC2 just isn’t one of them.

UAC2 sends data using the Isochronous transfer mode. This is intended for high-bandwidth, time-sensitive, data transmission where dropped/corrupted packets are non-critical.

Isochronous transfers are unidirectional and use no handshake packets. There is no built-in way for the sender to know whether data was even received, and no way for the receiver to request that bad data be resent.

Isochronous data packets do carry a CRC, which gives the receiver the ability to detect that incoming data is bad. But since there is still no error-correction mechanism, nor any retry/resend mechanism, detection is all it gets - which significantly limits what can be done when bad data is received.
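As a minimal sketch of “detect but can’t correct” (the framing here is illustrative, not a real USB packet layout; the CRC-16/USB parameters are from the common CRC catalogue): a CRC check can flag a corrupted payload, but with no retry mechanism the only decision left is what to do with the bad frame.

```python
def crc16_usb(data: bytes) -> int:
    """CRC-16/USB: poly 0x8005 (reflected as 0xA001), init 0xFFFF, final XOR 0xFFFF."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF

def receive_frame(payload: bytes, received_crc: int):
    # Detection only: a mismatch tells us the frame is bad, but with no
    # retry mechanism all we can do is discard it (i.e. mute -> dropout).
    return payload if crc16_usb(payload) == received_crc else None

samples = bytes(range(16))
good = receive_frame(samples, crc16_usb(samples))                 # frame accepted
bad = receive_frame(samples[:-1] + b"\xff", crc16_usb(samples))   # corrupted in flight -> None
```

Note there is nothing the receiver can do with `bad` except drop it; contrast that with bulk transfers, where the same CRC mismatch triggers a hardware retry.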

A DAC receiving bad data via UAC2 only has a few options:

  • Mute its output until good data arrives (i.e. you get dropouts).
  • Limit the output level for data in that frame.
  • Play the data as-is and hope for the best.

Better DACs tend to take the first approach.

If nothing else, this approach indicates that you have a problem – which might be a faulty cable, a broken device, or simply that the source is overloaded and unable to maintain the necessary transfer rate.

It also guarantees you don’t suddenly wind up with full-level output out of nowhere (which is a pitfall of the last approach).

Beyond that, USB audio is a send-only, non-correctable transmission mode that doesn’t guarantee packet receipt at all, much less that the received packets are correct!

(The only thing that gets sent back to the sender is an async control that tells it to send the next packet faster or slower, and the timing constraints/variations there are very limited.)
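That async feedback is just a number: the device reports its preferred data rate as a fixed-point “samples per (micro)frame” value. A sketch, assuming the 16.16 encoding UAC2 uses for high-speed feedback endpoints (full-speed uses a different 10.14 format; the exact byte-level encoding is omitted here):

```python
def feedback_16_16(sample_rate_hz: float) -> int:
    """High-speed USB feedback value: average samples per 125 us microframe
    (8000 microframes/s), encoded as unsigned 16.16 fixed point."""
    samples_per_microframe = sample_rate_hz / 8000.0
    return round(samples_per_microframe * (1 << 16))

# 48 kHz -> exactly 6 samples per microframe -> 0x00060000
nominal = feedback_16_16(48_000)

# A DAC whose buffer is filling too fast reports a slightly smaller value,
# nudging the host to send fewer samples. The adjustment range is tiny -
# this is flow control, not a sample clock.
slow_down = feedback_16_16(47_994)
```

The point being: this value paces delivery into the DAC’s buffer; it carries no per-sample timing whatsoever.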

USB Audio/UAC Data Corruption Rates

Fortunately, data corruption/loss with to-spec cables (including observing the length limits in the USB 2.0 specification) in a properly functioning system is rare.

How rare?

I’ve run tests that simply stream audio constantly, validating the CRCs and logging any errors, and gone for days without seeing a single bad packet/frame.

Now, repeat that with an out-of-spec cable (a large number of “audiophile” or “boutique” cables) and things can change markedly.

For those without suitable analyzers and test equipment, a simple and often-revealing test for weird-and-wonderful boutique/out-of-spec USB cables is to use them with a storage device and benchmark a big transfer. Because USB storage transfers ARE error-checked and retried, when transmission errors occur the transfer rate DROPS.

The practical upshot is that if your boutique cable cannot maintain the same transfer rate for storage applications as a <$10 Amazon Basics cable, it is not transferring data reliably, and that absolutely WILL affect USB 2.0 audio performance (the exact result depends on what the DAC does with corrupt data).
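A rough-and-ready way to run that comparison yourself (a sketch, not a calibrated benchmark; point the hypothetical `path` at a file on the USB drive under test, and repeat with each cable):

```python
import os
import time

def benchmark_write(path: str, total_mb: int = 256, chunk_mb: int = 4) -> float:
    """Write total_mb of data to `path` and return throughput in MB/s.
    Retries caused by a marginal cable show up as a lower rate."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(total_mb // chunk_mb):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())  # make sure data actually hit the device
    return total_mb / (time.perf_counter() - start)
```

If the boutique cable’s MB/s figure is consistently below the cheap in-spec cable’s on the same drive, the link is retrying, i.e. it is corrupting data in flight.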

(More to come).


Interesting timing. Just sharing a simple example that happened to me yesterday. A few months ago I bought this set of cables. The goal was always to recharge my USB-C stuff.

But as time goes by you tend to relax and overlook a few requirements. I started using them – mainly the 3 and 6 ft ones – to connect my FiiO K3 and listen to some music. Note my K3 is not plugged in at all times.

I grabbed the longest one (10 ft) and laid back on the couch for a chill-out session with classical music. I started hearing some crackling noise. I initially thought it was the streaming service (or the album – GPM). Since it was low volume, I ignored it.

After waking up from the nap I decided to replace the cable with the shorter ones. Bang. Mystery solved. The culprit was the 10 ft cable.

Never again.


USB 2.0/UAC2 DACs & Re-Clocking USB

Since USB 2.0 audio/UAC2 does not include a sample-clock, there is no clock to “re-clock” that relates to the audio signal/data.

All USB DACs receive data into a buffer and then use a local* clock (i.e. within the DAC itself) to clock sample data out of that buffer at the rate negotiated via UAC2.

UAC2 does include time-sync data, but this still isn’t a sample-clock and is used simply to allow synchronization across multiple UAC2 devices.

Thus, all USB DACs are already “re-clocking” their USB inputs (and you’re only as good as your final clock).
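A toy model of that buffer-and-local-clock arrangement (this is a sketch of the concept, not any real DAC’s firmware): packets arrive with whatever USB-bus timing jitter they arrive with, but samples leave strictly on ticks of the DAC’s own clock.

```python
from collections import deque

class ToyUsbDac:
    """Toy model: packet arrival timing is irregular, but output timing
    comes solely from the DAC's local sample clock."""
    def __init__(self):
        self.fifo = deque()

    def usb_packet_in(self, samples):
        # Arrival jitter lands here and goes no further ...
        self.fifo.extend(samples)

    def local_clock_tick(self):
        # ... because samples are clocked out only on local-clock ticks.
        # An empty buffer means mute/dropout (None), not garbage output.
        return self.fifo.popleft() if self.fifo else None

dac = ToyUsbDac()
dac.usb_packet_in([0.1, 0.2, 0.3])
out = [dac.local_clock_tick() for _ in range(4)]  # 4th tick underruns
```

Whatever the timing of `usb_packet_in` calls, `out` is the same; which is why the final local clock is the only one that matters for jitter.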

A “USB Re-clocker” only re-clocks the USB clock, which is not related to the audio or sample clock**. There may (or may not) be other artifacts (good or bad, measurable/audible or not) resulting from the use of such devices, but they’re not because re-clocking the USB clock has any correlated effect on the jitter of the sample clock in the DAC (unless it is incompetently engineered or there are other issues in the overall USB system being employed).

NOTE: This situation is completely different for S/PDIF or AES connections, as those protocols use a bi-phase clock as part of the signaling scheme, and that bi-phase clocking IS the sample-clock.

*Some DACs can use an external word or sample-clock, but this is still unrelated to the timing/clocking of low-level USB transfers.
**It is technically possible to build one where the USB clock does drive the sample clock, but you’d have to be a muppet to do it: it is much more involved than simply using an appropriate local clock, introduces lots of other issues, and is unlikely to be as accurate anyway.


Amazing information @Torq, Thank you for this!


I would love to hear your thoughts on ethernet cables, switches and/or fiber that are “audiophile” rated.


This is another fact I learned the hard way – i.e., by experience.

Some context. With the same tiny FiiO K3 from my previous post I can route an S/PDIF signal into my Scarlett interface. The signal comes out of the box and one can listen to music. However, there’s a small audio crackle every 1 second or so. The fix is a simple switch at the software level: setting the interface’s clock source to S/PDIF.

It is clear both clocks (signal and USB interface) get out of sync quickly and the pops happen. Switching to the S/PDIF clock source, problem solved.
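Back-of-envelope arithmetic shows why two free-running clocks pop at roughly regular intervals (the numbers below are illustrative, not measurements of any Focusrite hardware): at 48 kHz, a mismatch of only ~20 ppm between the two clocks slips a full sample about once a second, and each slip is a click.

```python
def seconds_per_sample_slip(sample_rate_hz: float, mismatch_ppm: float) -> float:
    """Time for two free-running clocks to drift apart by one whole sample."""
    slip_per_second = sample_rate_hz * mismatch_ppm / 1e6  # samples/s of drift
    return 1.0 / slip_per_second

t = seconds_per_sample_slip(48_000, 20.8)  # ~1 second between glitches
```

Slaving the interface to the incoming S/PDIF clock removes the mismatch entirely, which is exactly why the software switch fixes it.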

But the above is for a Focusrite interface. I wonder if there are some manufacturers that actually do not respect the signal clock from S/PDIF and try to re-clock the signal? Just curious though.

Or maybe I should invert my own question. Why would the user have to manually switch this clock source in the interface? I can’t think of an application for this. Nahh. Nothing that important.

*Insert nuclear Facepalm here*

IEEE 802 specifies what Layer 2 of the OSI model does, and part of that is detection and possible correction of errors.

I would classify any switch you could hide in your living/listening room without anyone noticing its presence as audiophile grade.
Get some CAT 6 cable and done.

If you venture into fiber (what is the point at home, except between NAS and workstation?), just pay attention that you buy matching fiber and transceivers (just buy multimode stuff).


Ethernet can guarantee perfect data delivery via EDC and ECC/re-transmit, so getting bit-perfect data delivery is a non-issue.

Some protocols running on top of ethernet don’t (e.g. UDP vs TCP), but that’s normally only an issue if they are running on a datalink (layer 2) or network (layer 3) implementation that doesn’t provide for innate data integrity.

In the event that a cable or other component is incapable of delivering the data successfully at the necessary rate the interface can decide what it does with it … but the typical response is that you get drop-outs and, in many cases, the player just stops and reports an error.

Same is true for switches.

What you won’t get, is corrupt sample data*.

Either enough correct data shows up in a timely fashion to yield uninterrupted playback or it doesn’t.

Ethernet “re-clockers” are a bigger farce than USB re-clockers. For one thing, most protocols/formats used for audio transmission over ethernet do not carry a sample-clock (just as with USB). There are protocols that can run on Ethernet that do (or can), but those are usually only found in professional systems (Dante, Ravenna/AES67).

But, and this is important …

Re-clocking ethernet makes no sense for audio on the simple basis that it DOES NOT guarantee in-order packet delivery! It doesn’t really matter how good the clock on the physical layer is if your samples arrive at the interface with the first 1,000 samples ordered after the second 1,000 samples!

The interfaces DO guarantee proper data reconstruction … but any re-try is likely to result in out of order packets, and thereby samples out of order, which then have to be put back in order by the NIC (or in its driver/software stack) … and that process is independent of any clocks anyway.
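A sketch of that reassembly step (illustrative, not any particular stack’s implementation; TCP, and RTP within a stream, both work on this principle via sequence numbers): the receiver sorts by sequence number, and no physical-layer clock, however pristine, has any bearing on it.

```python
def reassemble(packets):
    """Reorder packets by sequence number, as the receiving network stack
    does. Each packet is (seq, samples); arrival order carries no meaning."""
    ordered = sorted(packets, key=lambda p: p[0])
    return [s for _, samples in ordered for s in samples]

# A retry caused the second packet to arrive before the first:
arrived = [(2, [30, 40]), (1, [10, 20]), (3, [50, 60])]
stream = reassemble(arrived)
```

The samples come out in order regardless of how jittery or reordered the arrivals were; that correction happens in logic, not in clocking.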

*You can certainly put ethernet devices into states that permit transmit and receive of bad data, but that doesn’t happen by accident … most commonly via special diagnostic tools and/or security/hacking issues.


I remember a section of my real-time data processing class at Uni talking about time critical stuff over IP.
And in the case of audio or video over IP, there is RFC 3550 (RTP), which provides sequence numbers and timestamps to deal with the reordering and jitter the data link can introduce.


Yep, outside putting interfaces into special modes (normally inaccessible in user-mode applications) it’s fundamentally a non-issue.


Thank you. My thoughts exactly. I got banned from several forums because I call a certain product snake oil :slight_smile:


I got myself into hot water on a few forums for looking at PCBs and component prices and concluding that “muh $600 amp” has a BOM of $60.

Am aware of that. It is just, for me as a hobby tinkerer, unbelievable what people are willing to fork out for what is essentially a weekend soldering project.


Another thing … true of both USB and Ethernet “re-clockers” … is that they CAN apply more accurate clocking to the physical signaling layer. This is definitely measurable (usually presented as an “eye” pattern).

Depending on the clocks present in the overall system, however, having a better clock in a transmitter doesn’t necessarily mean a better eye pattern. Depending on the exact phase and beat drift characteristics of BOTH clocks you can wind up making things WORSE for the receiver.

Ultimately, though, the receiver can either properly reconstruct the data or it can’t. The how and why of that is largely immaterial, as the result is still either bit-perfect data or drop-outs.


Could also lead to much head-scratching and questions as to why the link speed is 10 Mbit when you would expect 1 Gbit.

Buffer underflows are also fun to troubleshoot :expressionless:


In designing, building, shipping and supporting consumer electronics, it is typical that the MSRP will need to be 3 to 8 times the raw BoM in order for the product to be profitable.

There are exceptions, as with all things, but 3x-8x is typical.

You figure distributors want a 25-45% cut. The actual retailer wants something similar. And you’re already at 50% of MSRP. Direct-to-consumer vendors (e.g. Schiit) fare better here because they cut out the middle-men.
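The arithmetic behind the 3x-8x rule, with illustrative percentages (a sketch of a simple two-tier channel model, not any vendor’s actual figures):

```python
def vendor_share_of_msrp(distributor_cut: float, retailer_cut: float) -> float:
    """Fraction of MSRP left for the manufacturer after the channel takes
    its cuts (both expressed as fractions of MSRP in this simple model)."""
    return 1.0 - distributor_cut - retailer_cut

# 25% to the distributor and 25% to the retailer already halves MSRP:
share = vendor_share_of_msrp(0.25, 0.25)   # 0.5

# A $60 BoM at a 5x multiplier -> $300 MSRP, of which the vendor sees $150
# to cover the BoM plus engineering, support, warranty, and margin.
vendor_revenue = 300 * share
```

Which is why a $60 BoM in a $600 box is not, by itself, evidence of a scam; it’s roughly what the channel economics require.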


There are a good number of products that really DO do what they say they will to the low-level signals. And the results are typically measurable at that level.

Where they tend to completely fall apart is when you measure the output of the DAC itself with such products in the chain and find … absolutely bugger-all difference.

If they’re only claiming the former, that’s not really snake-oil (and you need to be careful what you say, as you can be sued). If they’re DOING the former, and also claiming audible/measurable differences at the output of the DAC that simply are not there (which is what most of them do), then that’s much better ground to call “snake oil” on.

Mostly it’s not worth the debate …


This, above. Every product claims the same thing. I’ll take my risk. :slightly_smiling_face: At the end of the day you are right, it is not worth it. What I learned from my lessons is that it’s all pointless … We will have our opinions, and “experts” will come from every corner of the world to defend what they believe is different.


Lots of DACs re-clock their S/PDIF inputs. And there are many S/PDIF re-clockers. These work as advertised … if they’re built properly … in improving jitter performance that’s measurable at the output of the DAC.

Most S/PDIF re-clocking schemes are flip-flop based and don’t impose latency (well, one half clock cycle) as you would get with a more typical buffer-and-clockout scheme.
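A toy model of why flip-flop retiming works (a sketch of the concept, not a circuit simulation): each incoming edge is re-emitted on the next edge of a clean local clock, so the output’s edge timing is quantized to the clean clock, at the cost of up to one clock period of latency.

```python
import math

def retime_edges(jittery_edge_times, clock_period):
    """Toy flip-flop retiming: each incoming edge is re-emitted on the next
    clean clock edge, so output timing inherits the clean clock's jitter,
    not the incoming signal's."""
    return [math.ceil(t / clock_period) * clock_period
            for t in jittery_edge_times]

# Edges nominally 1.0 apart, arriving with jitter, snapped to a 0.5 grid:
clean = retime_edges([1.02, 1.97, 3.05], clock_period=0.5)
```

Every output edge lands exactly on a clean-clock edge; the incoming jitter (as long as it is smaller than the retiming clock period) simply disappears.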

It becomes important with multiple devices and maintaining clock synchronization. For a single device, you typically use the internal clock. When you have several interfaces, all being fed from other devices, that are then all clocked from a common master word-clock, they need to be in-sync … which requires using the S/PDIF clock (if there is no pure-clock input on the interface itself).


As a computer scientist: thank you Torq. So many times I have had to battle the uneducated (in CS and engineering matters) on this front. Your explanation is clear and easy to understand. What I don’t understand is why so many people decide not to educate themselves and look into the actual tech instead of wasting money on things that make absolutely no difference. It’s their choice to make, of course; it’s just something I struggle to come to terms with.


So how much can USB Audio cable affect USB playback? Is it even worth it or noticeable enough to buy “audiophile grade” usb cable?