{"id":24641,"date":"2026-05-20T19:19:05","date_gmt":"2026-05-20T19:19:05","guid":{"rendered":"https:\/\/jurn.link\/dazposer\/?p=24641"},"modified":"2026-05-23T09:27:53","modified_gmt":"2026-05-23T09:27:53","slug":"release-stable-audio-3-with-comfyui-checkpoints-and-encoder","status":"publish","type":"post","link":"https:\/\/jurn.link\/dazposer\/index.php\/2026\/05\/20\/release-stable-audio-3-with-comfyui-checkpoints-and-encoder\/","title":{"rendered":"Release: Stable Audio 3, with ComfyUI checkpoints and encoder"},"content":{"rendered":"<p>Stable Audio 3 has been released. Free, offline and local. Feed it a text prompt and it generates field-recordings, movie-style sound-effects, mixed-effect soundscapes&#8230; and now also music. It&#8217;s of obvious use for those creating animations, visual novels, motion comics, YouTube illustrated audiobooks, Ken Burns style documentaries etc. <\/p>\n<p>Very fast, small, and with outputs free for commercial use. Full 44.1 kHz stereo quality. Even the very small models for version 3 (2.2 GB + 1.1 GB text encoder) can produce up to two minutes of audio, and can apparently generate it in a few seconds even on a CPU.<\/p>\n<p>Very importantly for creatives, the new version offers <strong>iterative editing<\/strong> of outputs. You can regenerate just the bits you don&#8217;t like (audio &#8216;inpainting&#8217;) and keep the rest. And the speed should make that a relatively easy matter. You can also auto-extend your audio in the same style (audio &#8216;outpainting&#8217;).<\/p>\n<p>Trained on legally clean sources, so the anti-AI mob can&#8217;t wail about &#8216;stealing&#8217;.<\/p>\n<p>The official release is gated behind a Hugging Face sign-in, presumably to guard against vexatious lawsuits re: misuse by miscreants. But the ComfyUI models and encoder are here and freely available: <a href=\"https:\/\/huggingface.co\/Comfy-Org\/stable-audio-3\">..\/Comfy-Org\/stable-audio-3<\/a>. 
I&#8217;m still downloading, but I guess it might then need an update of ComfyUI, re: getting the text encoder recognised? <em>Update: it needs ComfyUI 0.22 or higher.<\/em> <\/p>\n<p>What&#8217;s currently missing is i) an example prompt structure to generate a complex but coherent sequential soundscape (can JSON be used, with timings?); ii) a working example of how the iterative editing is done; and iii) an optimised ComfyUI &#8216;studio&#8217; workflow (e.g. for optimised stereo separation and movement, and for running the Small SFX and Small Music models alongside each other for a basic multitrack in the workflow).<\/p>\n<p>For the previous Stable Audio, note that one could multitrack simply via the prompt, e.g. <em>&#8220;A balanced mix between a good field recording of a man walking through dry leaves in winter, and a recording of small birds calling plaintively in the surrounding Canadian boreal forest.&#8221;<\/em> The word <em>mixdown<\/em> would also work. I assume this will work on the new version 3, once I get it installed and working.<\/p>\n<p>I&#8217;m less interested in the music than in the SFX and soundscapes, but if the music interests you then note the new <a href=\"https:\/\/stableaudio.com\/user-guide\/prompt-structure\">Stable Audio Prompt Guide for Music<\/a> at their site. <\/p>\n","protected":false},"excerpt":{"rendered":"<p>Stable Audio 3 has been released. Free, offline and local. Feed it a text prompt and it generates field-recordings, movie-style sound-effects, mixed-effect soundscapes&#8230; and now also music. It&#8217;s of obvious use for those creating animations, visual novels, motion comics, YouTube illustrated audiobooks, Ken Burns style documentaries etc. 
Very fast, small, and with outputs free for [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[13,9,4,7],"tags":[],"class_list":["post-24641","post","type-post","status-publish","format-standard","hentry","category-companion-software","category-freebies","category-spotted-in-the-news","category-the-animation-industry"],"_links":{"self":[{"href":"https:\/\/jurn.link\/dazposer\/index.php\/wp-json\/wp\/v2\/posts\/24641","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jurn.link\/dazposer\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jurn.link\/dazposer\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jurn.link\/dazposer\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jurn.link\/dazposer\/index.php\/wp-json\/wp\/v2\/comments?post=24641"}],"version-history":[{"count":5,"href":"https:\/\/jurn.link\/dazposer\/index.php\/wp-json\/wp\/v2\/posts\/24641\/revisions"}],"predecessor-version":[{"id":24659,"href":"https:\/\/jurn.link\/dazposer\/index.php\/wp-json\/wp\/v2\/posts\/24641\/revisions\/24659"}],"wp:attachment":[{"href":"https:\/\/jurn.link\/dazposer\/index.php\/wp-json\/wp\/v2\/media?parent=24641"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jurn.link\/dazposer\/index.php\/wp-json\/wp\/v2\/categories?post=24641"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jurn.link\/dazposer\/index.php\/wp-json\/wp\/v2\/tags?post=24641"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}