<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Arch on despatches</title><link>https://icle.es/tags/arch/</link><description>Recent content in Arch on despatches</description><generator>Hugo</generator><language>en</language><lastBuildDate>Thu, 17 Jul 2025 15:36:08 +0100</lastBuildDate><atom:link href="https://icle.es/tags/arch/index.xml" rel="self" type="application/rss+xml"/><item><title>Setup Whisper</title><link>https://icle.es/2025/07/15/setup-whisper/</link><pubDate>Tue, 15 Jul 2025 08:24:28 +0100</pubDate><guid>https://icle.es/2025/07/15/setup-whisper/</guid><description>&lt;p>I wanted to generate chapter markers from a devlog audio recording using
OpenAI&amp;rsquo;s Whisper, and figured I&amp;rsquo;d run it locally. Whisper is Python-based, and
I&amp;rsquo;m on Arch. What could go wrong?&lt;/p>
&lt;p>Turns out… not much, but it still took a few hops.&lt;/p>
&lt;h2 id="choosing-the-right-setup">Choosing the Right Setup&lt;/h2>
&lt;p>I already had Python installed, but rather than littering system Python or
managing a bunch of ad hoc virtualenvs, I decided to do it properly — with
Poetry.&lt;/p></description><content:encoded><![CDATA[<p>I wanted to generate chapter markers from a devlog audio recording using
OpenAI&rsquo;s Whisper, and figured I&rsquo;d run it locally. Whisper is Python-based, and
I&rsquo;m on Arch. What could go wrong?</p>
<p>Turns out… not much, but it still took a few hops.</p>
<h2 id="choosing-the-right-setup">Choosing the Right Setup</h2>
<p>I already had Python installed, but rather than littering system Python or
managing a bunch of ad hoc virtualenvs, I decided to do it properly — with
Poetry.</p>
```bash
sudo pacman -S python-poetry
poetry new whisper-transcriber
cd whisper-transcriber
```
<p>So far so good.</p>
<h2 id="pytorch--cuda-the-pypy-pitfall">PyTorch + CUDA: the PyPy Pitfall</h2>
<p>My first attempt to install <code>torch</code>, <code>torchvision</code>, and <code>torchaudio</code> failed in a
confusing way, with pip finding no versions at all. The clue was in the command: I&rsquo;d
accidentally run it with <code>pip-pypy3</code>. PyTorch doesn&rsquo;t publish wheels for PyPy,
only for CPython.</p>
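<p>A quick way to see which interpreter sits behind a given command is to ask Python itself. The sketch below is a minimal check using only the standard library; the function name <code>is_cpython</code> is made up for illustration.</p>

```python
import platform
import sys


def is_cpython() -> bool:
    """Return True when the running interpreter is CPython rather than PyPy or another implementation."""
    return platform.python_implementation() == "CPython"


if __name__ == "__main__":
    # Show which interpreter is actually running this script.
    print(f"{platform.python_implementation()} {platform.python_version()} at {sys.executable}")
    if not is_cpython():
        print("Not CPython: PyTorch wheels are unlikely to be found for this interpreter.")
```

<p>Running it with the same interpreter that backs the pip command in question should make a PyPy mix-up obvious before any install is attempted.</p>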
<h2 id="sorting-out-python-versions">Sorting Out Python Versions</h2>
<p>My system Python was 3.13. PyTorch had just released 3.13 wheels for <code>torch</code>,
but not yet for <code>torchaudio</code> — version mismatch. I used <code>pyenv</code> to install 3.12
instead:</p>
```bash
pyenv install 3.12.3
```
<p>Updated <code>pyproject.toml</code>:</p>
```toml
python = ">=3.12,<3.14"
```
<p>And re-pointed Poetry:</p>
```bash
poetry env use "$(pyenv prefix 3.12.3)/bin/python"
```
<p>Poetry ignored me the first time because 3.13 was still hardcoded. After
recreating the environment and verifying the version, I was ready.</p>
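<p>Verifying the version can be as simple as running a short script inside the environment. This is a sketch, not the exact check I ran: the bounds <code>LOWER</code> and <code>UPPER</code> and the function <code>in_range</code> are illustrative names, and the range assumes the goal is 3.12 and nothing newer.</p>

```python
import sys

# Bounds assumed for illustration: 3.12 inclusive, 3.13 exclusive.
LOWER = (3, 12)
UPPER = (3, 13)


def in_range(version, lower=LOWER, upper=UPPER) -> bool:
    """Return True when the (major, minor) part of `version` satisfies lower <= v < upper."""
    major_minor = tuple(version[:2])
    return lower <= major_minor < upper


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    status = "ok" if in_range(sys.version_info) else "unexpected"
    print(f"Python {found}: {status}")
```

<p>Saved under a name such as <code>check_version.py</code> (hypothetical) and run with <code>poetry run python check_version.py</code>, it reports the interpreter Poetry actually picked rather than the one you asked for.</p>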
<h2 id="pep-668-the-externally-managed-false-alarm">PEP 668: the &ldquo;Externally Managed&rdquo; False Alarm</h2>
<p>Even inside the Poetry shell, Arch&rsquo;s externally managed Python made pip stop with an
<code>externally-managed-environment</code> error, the PEP 668 check that points you at
<code>--break-system-packages</code>. This check is meant to protect system Python,
but it was firing inside a fully isolated Poetry environment. Safe to ignore in that case. I
added the flag:</p>
```bash
poetry run pip install torch torchvision torchaudio \
  --index-url https://download.pytorch.org/whl/cu121 \
  --break-system-packages
```
<p>Worked perfectly.</p>
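<p>To confirm that the wheels landed in the Poetry environment and that the CUDA build can see the GPU, something like the following should do. It is a minimal sketch: <code>in_virtualenv</code> and <code>describe_torch</code> are names invented here, and the script only reports what it finds.</p>

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a virtual environment."""
    return sys.prefix != sys.base_prefix


def describe_torch() -> str:
    """Summarise the installed PyTorch build and whether it can see a CUDA device."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    if not torch.cuda.is_available():
        return f"torch {torch.__version__} installed, but CUDA is not available"
    return (
        f"torch {torch.__version__}, CUDA {torch.version.cuda}, "
        f"device: {torch.cuda.get_device_name(0)}"
    )


if __name__ == "__main__":
    print(f"interpreter: {sys.executable}")
    print(f"inside a virtual environment: {in_virtualenv()}")
    print(describe_torch())
```

<p>If the second line prints <code>False</code> under <code>poetry run python</code>, the install probably went to system Python after all, which is exactly what the PEP 668 check exists to prevent.</p>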
<h2 id="the-result">The Result</h2>
```bash
poetry run whisper output000.mp3 --model base --output_format json
```
<p>That transcribed 2.5 hours of audio into timestamped segments, ready for chapter
generation. All local, GPU-accelerated, isolated from system Python, and
repeatable.</p>
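<p>Turning those segments into chapter candidates takes only a few lines. The sketch below assumes Whisper&rsquo;s JSON output has a top-level <code>segments</code> list whose entries carry <code>start</code> and <code>text</code>. The fixed five-minute window and the function names are illustrative assumptions, not a description of how chapters ought to be chosen.</p>

```python
import json
import sys


def format_timestamp(seconds: float) -> str:
    """Format a number of seconds as H:MM:SS."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def chapter_candidates(segments, window: float = 300.0):
    """Pick the first segment starting in each `window`-second slice as a candidate marker.

    The fixed window is an assumption for illustration only.
    """
    candidates = []
    next_boundary = 0.0
    for segment in segments:
        start = float(segment["start"])
        if start >= next_boundary:
            candidates.append((format_timestamp(start), segment["text"].strip()))
            # Advance to the first window boundary after this segment's start.
            next_boundary = (start // window + 1) * window
    return candidates


if __name__ == "__main__" and len(sys.argv) > 1:
    # Expects the path to a Whisper JSON transcript as the first argument.
    with open(sys.argv[1], encoding="utf-8") as handle:
        transcript = json.load(handle)
    for stamp, text in chapter_candidates(transcript["segments"]):
        print(f"{stamp} {text}")
```

<p>Assuming Whisper wrote <code>output000.json</code> to the current directory, <code>poetry run python chapters.py output000.json</code> (script name hypothetical) prints one timestamped line per window, which still needs a human pass to become real chapter titles.</p>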
<hr>
<h2 id="in-summary">In Summary</h2>
<p>If you&rsquo;re on Arch and want Whisper with CUDA:</p>
<ol>
<li>Use <code>poetry</code> + <code>pyenv</code></li>
<li>Set Python to 3.12 (not 3.13)</li>
<li>Install torch with <code>--break-system-packages</code> and the <code>cu121</code> index</li>
<li>Whisper just works</li>
</ol>
<p>A bit of fiddling up front, but now it&rsquo;s a solid local tool — one less cloud
dependency to think about.</p>
]]></content:encoded></item></channel></rss>