Following on from yesterday’s Stable Audio 3 install / workflow post, I can confirm that the Small Music version of SA3 works, and works wonderfully. It’s super-quick, the quality is good, and it allows commercial use. Nice. Fun, too, as it’s so amazingly fast.
Here I’ve swapped out the previous standard VAE Decode Audio node for a VAE Decode Audio (Tiled) node, which is apparently better, according to Reddit wisdom. That node does take longer to process if you go beyond the 47 second default output, though. Small Music can output up to 120 seconds, but at that length the VAE Decode Audio (Tiled) node will noticeably pause.
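For a rough feel of why a tiled decode slows down on longer clips, here is a minimal Python sketch of how a clip might be split into overlapping tiles. This is only an illustration of the general idea, not ComfyUI’s actual code. The `plan_tiles` function is hypothetical, and the 10 second tile and 1 second overlap are made-up numbers, not the node’s real defaults.

```python
def plan_tiles(total_seconds, tile_seconds=10.0, overlap_seconds=1.0):
    """Return (start, end) spans, in seconds, that cover the clip with overlap.

    tile_seconds and overlap_seconds are illustrative assumptions only,
    not the real settings of the VAE Decode Audio (Tiled) node.
    """
    if overlap_seconds >= tile_seconds:
        raise ValueError("overlap must be smaller than the tile length")
    step = tile_seconds - overlap_seconds  # how far each new tile advances
    spans = []
    start = 0.0
    while True:
        end = min(start + tile_seconds, total_seconds)
        spans.append((start, end))
        if end >= total_seconds:  # the last tile reaches the end of the clip
            break
        start += step
    return spans


# Compare the default length with the maximum length mentioned above.
for length in (47, 120):
    print(length, "seconds ->", len(plan_tiles(length)), "tiles")
```

Each tile is decoded separately, so under these assumptions a longer clip simply means more tiles to get through, which would fit with the pause at 120 seconds.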
There are no vocals; it’s instrumentals only.
Note also that there are now Base versions for the Small models, and that LoRAs can also be trained on them.
How to ‘outpaint’ and ‘inpaint’ audio with SA3 in ComfyUI? That’s still to be discovered.
Finally, there’s an official prompting guide with a music focus. It obviously helps if you’re conversant with audio terminology, or can find an AI that can describe an audio clip in musicologist terms.

