<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Infrastructure on hereticles</title><link>https://icle.es/tags/infrastructure/</link><description>Recent content in Infrastructure on hereticles</description><generator>Hugo</generator><language>en</language><lastBuildDate>Wed, 27 May 2026 16:32:19 +0100</lastBuildDate><atom:link href="https://icle.es/tags/infrastructure/index.xml" rel="self" type="application/rss+xml"/><item><title>Sovereignty Over Convenience</title><link>https://icle.es/2026/05/27/sovereignty-over-convenience/</link><pubDate>Wed, 27 May 2026 09:54:20 +0100</pubDate><guid>https://icle.es/2026/05/27/sovereignty-over-convenience/</guid><description>&lt;p&gt;I&amp;rsquo;ve recently been on a journey reclaiming sovereignty over all of my data and
infrastructure. I still remember the era before the cloud when you had to do
everything yourself.&lt;/p&gt;
&lt;p&gt;Cloud changed all of that, and it was really nice - and convenient. In the back
of my mind, though, there was a tiny little scratch.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Security is inversely proportional to convenience&lt;/p&gt;
&lt;p&gt;&amp;ndash; Professor Evi Nemeth&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;GitHub was so convenient compared to running your own infrastructure. The
scratch stayed small. Then Microsoft bought GitHub.&lt;/p&gt;</description><content:encoded><![CDATA[<p>I&rsquo;ve recently been on a journey reclaiming sovereignty over all of my data and
infrastructure. I still remember the era before the cloud when you had to do
everything yourself.</p>
<p>Cloud changed all of that, and it was really nice - and convenient. In the back
of my mind, though, there was a tiny little scratch.</p>
<blockquote>
<p>Security is inversely proportional to convenience</p>
<p>&ndash; Professor Evi Nemeth</p>
</blockquote>
<p>GitHub was so convenient compared to running your own infrastructure. The
scratch stayed small. Then Microsoft bought GitHub.</p>
<p>Over the last few years, things changed. Trust in big tech has eroded
heavily.</p>
<p>LLMs have also made things a lot worse - in two ways.</p>
<p>Firstly, every organisation seems to be trusting LLMs to write and deploy code. I&rsquo;ve
had LLMs write code for me - and I would not trust the result anywhere near
security-critical code. GitHub is using non-deterministic, hallucination-prone models to
make security decisions.</p>
<p>Secondly, LLMs are vacuuming up all the data they can get their hands on - no
permission sought.</p>
<p>However, my much bigger issue is that I have no choice in the matter. My consent
was not sought - my concerns were not voiced. My only choice is to carry on in
their boat or get out.</p>
<p>Ultimately, though, way down deep, at the very core of my concerns, there were
two basic questions.</p>
<ul>
<li>What is all my stuff on the cloud being used for?</li>
<li>How safe is it?</li>
</ul>
<p>As each day passed, and my trust in big tech eroded, the pain of setting up
the infrastructure started to pale in comparison to the pain of my data being
used without my informed consent.</p>
<p>I had to act. But first - an inventory!</p>
<p>My cloud data mainly lived across:</p>
<ul>
<li>GitHub
<ul>
<li>multiple repos, public &amp; private</li>
<li>blog site (on github pages with custom domain)</li>
</ul>
</li>
<li>Google
<ul>
<li>Emails</li>
<li>Documents</li>
<li>Photos</li>
</ul>
</li>
<li>Facebook
<ul>
<li>Photos</li>
<li>Posts</li>
<li>who knows what else</li>
</ul>
</li>
</ul>
<p>As a starting point, I wanted to tackle my public infra - mainly GitHub.</p>
<p>I ultimately wanted to feel as safe as I felt before my trust in the cloud was
eroded, with minimal effort. That meant:</p>
<ul>
<li>Truly private storage for my private data</li>
<li>Strong GDPR support.</li>
<li>Hosting the public aspects on trustworthy platforms</li>
<li>Resilience. I&rsquo;d need to build that manually, starting with offsite backups.</li>
<li>High availability. I&rsquo;d need to get as close to that as I reasonably could - so
I also needed monitoring</li>
</ul>
<p>It would be no small task, and the biggest pain point would be ongoing
maintenance and supporting it if something goes wrong.</p>
<p>I would need to document how every bit ties together, make it easy enough to
redeploy services, reinstall servers and to monitor them, all the while also
keeping it safe, secure and minimising any attack surface.</p>
<p>Right&hellip; There is a cost to getting all of these &ldquo;for free&rdquo; on the cloud.</p>
<h2 id="tooling--services">Tooling &amp; Services</h2>
<h3 id="cloud-iac">Cloud IaC</h3>
<p>Documenting all of this would be tricky - unless I used infrastructure as code.
I&rsquo;d used terraform quite a bit and was comfortable with it - well, opentofu now.
However, I knew that I would want to separate things into small units and that I
would want to make reusable components - for static site hosting, for example.
These features were easier with pulumi - and I could skip the cloud features.</p>
<h3 id="bare-metal-iac">Bare Metal IaC</h3>
<p>That would, however, not cover any server configuration. I would have servers
both at home and remotely. I could just configure them once and hope for the
best - but they will inevitably need a full upgrade, which might need config
tweaking again. I might need to reinstall, and then I&rsquo;d need to figure out all
the configuration I had done.</p>
<p>I have been pretty disciplined in the past about noting config changes so I
could reproduce them later, but that&rsquo;s neither an easy nor a pleasant task.</p>
<p>I wanted something automated - I considered something like
<a href="https://etckeeper.branchable.com/">etckeeper</a>, but that would store every
config change, including from package updates. I wanted to track <em>only</em> my
changes.</p>
<p>I wondered if <a href="https://docs.kernel.org/filesystems/overlayfs.html">overlayfs</a>
could be a solution, with my version overlaid over the package-installed
<code>/etc</code> - but it wasn&rsquo;t really designed for such a use case. I&rsquo;d also need to
somehow redirect the package installs to a different place than <code>/etc</code>.</p>
<p>Another option was <code>stow</code>. I was already using it for my dotfiles. I could have
<code>root</code> <code>stow</code> files into <code>/etc</code>. This option felt the most straightforward until
I had to re-install my home server (omv -&gt; proxmox).</p>
<p>I wanted to be able to automate any reinstalls further - not just <code>/etc</code>, but
also package installs. The most straightforward tool fit for this was <code>ansible</code>.
I did consider <code>chef</code>, <code>puppet</code>, etc. but they offered a lot of features I
didn&rsquo;t need. The only feature that <code>ansible</code> didn&rsquo;t give me that I wanted was
state management like <code>pulumi</code>, but from what I could see, the alternatives did
not provide that.</p>
<p>In the end, I just decided to manually clean up after ansible when it leaves any
files, config or packages behind. Worst case, I also had the nuclear option of
wiping and reinstalling everything to get rid of any cruft since I configure
everything through ansible. I&rsquo;ve already done that once.</p>
<h3 id="cloud-services">Cloud Services</h3>
<p>There are specific cloud services I would still need. I need offsite backup - in
case something happens to my server. I also need to host my blog and my public
sources repos.</p>
<p>In Europe, I found two strong candidates. <a href="https://www.hetzner.com/">Hetzner</a>
immediately felt more enterprise-level, which the pricing confirmed.
<a href="https://www.scaleway.com/en/">Scaleway</a> felt friendlier and the pricing was
more accessible.</p>
<p>Ultimately, I picked Scaleway because it was cheaper at lower usage
levels, giving me a bit of time to ramp up. The interface was also easier to
understand and navigate.</p>
<p>Scaleway has Object Storage for offsite backup and for static site hosting.</p>
<p>I considered hosting a <a href="https://forgejo.org/">forgejo</a> instance but that meant a
VPS, a database server, patches, potential issues around bots / LLM training, a
higher level of complexity, and cost. It would also add friction to user
interaction - they&rsquo;d have to register on my instance, which would have only my
code.</p>
<p>I instead opted for <a href="https://codeberg.org/">codeberg</a>. If it&rsquo;s good enough for
zig, it&rsquo;ll be good enough for me.</p>
<p>forgejo might become interesting again after it integrates
<a href="https://forgefed.org/">ForgeFed</a>, which would make cross-instance collaboration
easier.</p>
<h2 id="backup">Backup</h2>
<p>For backup, <a href="https://restic.net/">restic</a> was a better fit than borg. It has
better native S3/Object-Storage backend support. More importantly, borg needs
its own software installed on the server, which means more maintenance. With restic, I can just use sftp
to the server for all operations: backup, view or restore.</p>
<h2 id="monitoring">Monitoring</h2>
<p>The last time I did monitoring and alerting in production, I was using munin /
monit / nagios. Everyone has moved on. I&rsquo;d evaluated datadog, grafana, New Relic
etc. in a previous job, but of course, I was not going to opt for a cloud option.</p>
<p>While there are a few options out there, <a href="https://prometheus.io/">prometheus</a>
and <a href="https://grafana.com/">grafana</a> came up as fairly standard, and I wanted
more experience in them, so they were picked for the stack.</p>
<h2 id="migrating">Migrating</h2>
<p>The first step was to back everything up.</p>
<p>I set up restic on my desktop, backing up to my server (<code>atlas</code>), which backed
everything up to Scaleway&rsquo;s object storage.</p>
<p>That took many, many hours to complete.</p>
<p>The next step was to tidy up my home infrastructure.</p>
<h3 id="atlas">atlas</h3>
<p>The server I have was already running <a href="https://www.openmediavault.org/">omv</a> and
<a href="https://watch.plex.tv/me">plex</a>. While 15+ years old, it is a dual cpu box with
32G of RAM. It had four 2T magnetic drives on <code>mdadm</code> using <code>raid-6</code>.</p>
<p>I had three more 2T drives on my desktop that I wanted to move over to the
server because the extra space would be useful.</p>
<p>I would also need additional services on the box.</p>
<ul>
<li><code>prometheus</code> and <code>grafana</code> and any other related services for monitoring and
alerting.</li>
<li>codeberg actions runner in a contained environment to mitigate security risks</li>
<li>forgejo instance for my private source repos</li>
</ul>
<p>I could still run debian with docker and deploy services on there. However, I
wanted more isolation for the codeberg actions runner - just in case it managed
to escape the container.</p>
<p><a href="https://www.proxmox.com/en/products/proxmox-virtual-environment/overview">Proxmox Virtual Environment</a>
was a good fit. I&rsquo;d used it years earlier, but wiped it after it&rsquo;d fallen behind
on updates and it felt like a huge task to upgrade it. This time, I&rsquo;d be using
ansible, so I could even wipe and reinstall if I had to.</p>
<p>Proxmox would also bring zfs with raidz2, which provides safer array expansion
and scrubbing to catch bitrot early.</p>
<p>There is a single SSD on atlas, and installing proxmox on there was the easy
part. I then had to find temporary storage for around 4T of data.</p>
<p>I spread the data out over several drives on my desktop. I had some space on my M.2
SSD and on my games drives, which had a lot of space but no resilience. I put all of
the data that I could recreate if I needed to on there - e.g. my blu ray rips.</p>
<p>Luckily, all the more valuable data fit on my desktop&rsquo;s raid5.</p>
<p>Worst case, all of it was also backed up using restic to a scaleway storage
bucket.</p>
<h4 id="base-config">Base Config</h4>
<p><code>atlas</code> remains, for the most part, the core install of proxmox.</p>
<p>Apart from basics like neovim, it also serves the zfs storage:</p>
<ul>
<li>through sftp for restic</li>
<li>nfsv4 for the other linux boxes</li>
<li>samba (when I need shares on my dual boot or VMs)</li>
<li>ssh for git</li>
<li>data directories for all the services running in docker.</li>
</ul>
<p>It also hosts the LXC containers for:</p>
<ul>
<li>Codeberg Actions Runner (<code>ussain</code>)
<ul>
<li>on a separate network so that it can&rsquo;t access my LAN</li>
</ul>
</li>
<li>docker services (<code>loom</code>)
<ul>
<li>prometheus
<ul>
<li>blackbox-exporter (web site monitoring)</li>
<li>prometheus-pve-exporter (for proxmox stats)</li>
</ul>
</li>
<li>grafana</li>
<li><a href="https://jellyfin.org/">jellyfin</a> (privacy respecting alternative to plex)</li>
</ul>
</li>
</ul>
<p>Since all of these are deployed via ansible with the data stored in atlas&rsquo; zfs,
it&rsquo;s easy enough to change the configuration and deploy them to another lxc.
The only other manual step would be to delete the previous instance, since ansible
doesn&rsquo;t clean up after itself.</p>
<h3 id="scaleway">Scaleway</h3>
<p>Scaleway will host my blog site. I&rsquo;d been meaning to rename it anyway, so this
was a good opportunity to do that. My website uses <a href="https://gohugo.io/">hugo</a>
which outputs static html, so I can host it on an S3-like object store. The problem
with that is that it gives you a long domain name and doesn&rsquo;t support custom domains
directly.</p>
<p>To be able to route it via a custom domain, the simplest solution is
<a href="https://www.scaleway.com/en/edge-services/">Edge Services</a> on Scaleway. On AWS,
I&rsquo;d used CloudFront - but it was annoying, and you needed to set up the http -&gt; https
redirect as well.</p>
<p>Edge Services also has an additional cost component. You need to pay €0.99/month
minimum. That&rsquo;ll give you one pipeline (i.e. one domain) and it&rsquo;s €4/month for each
additional pipeline.</p>
<p>I wanted to host two domains - so that&rsquo;d set me back €4.99/month. Doesn&rsquo;t break
the bank, but it&rsquo;s also much more expensive than AWS. A small price to pay for
privacy and sovereignty.</p>
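<p>As a quick sanity check of that arithmetic, here is a minimal sketch. It
assumes only the prices quoted above (which may have changed since), and the
function name is just something I made up for illustration.</p>
<pre><code class="language-python"># Illustrative only: prices are the ones quoted above and may have changed.
BASE_PLAN_EUR = 0.99          # monthly minimum, includes one pipeline (one domain)
EXTRA_PIPELINE_EUR = 4.00     # each additional pipeline


def edge_services_monthly_cost(domains):
    """Return the monthly cost in euros for hosting `domains` custom domains."""
    if domains == 0:
        return 0.0
    extra_pipelines = domains - 1
    return round(BASE_PLAN_EUR + extra_pipelines * EXTRA_PIPELINE_EUR, 2)


for n in (1, 2, 3):
    print(n, "domain(s):", edge_services_monthly_cost(n), "EUR/month")
</code></pre>
<p>Every extra domain adds a full pipeline, so the cost grows linearly with the
number of sites.</p>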
<p>I found the edge services quite fiddly though. I could delete a pipeline from the
console, but pulumi didn&rsquo;t detect the changes correctly. I also had some issues
which meant that it burned through generating 50 ssl certificates, after which I had to
wait 7 days before trying again.</p>
<p>At this point, I decided to deploy a VPS instead.</p>
<p>I knew that I would eventually need a VPS for <a href="https://remark42.com/">remark42</a>
to support commenting on my blog. A VPS is around €7, roughly €2 more than the €4.99 for the Edge
Services pipelines.</p>
<p>I used pulumi to provision the VPS (<code>hera</code>), including ipv4 and ipv6, and used
ansible to configure it.</p>
<p>I also put together a <code>pulumi</code> stack for <code>ansible</code> outputs. It picks up the
bucket s3 urls and writes a config file for <code>ansible</code>. <code>ansible</code> then uses this
to configure <code>caddy</code> to route the relevant domains to their s3 buckets. This
automation helps to keep the cloud state synced with the server configuration
without manual intervention.</p>
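<p>The hand-off can be sketched roughly like this. It is not my actual stack:
the domain names, bucket urls and the <code>static_sites</code> variable name are
all made up, and it assumes the outputs have already been read into a plain
dictionary.</p>
<pre><code class="language-python">import json


def render_ansible_vars(bucket_urls):
    """Turn a mapping of domain to bucket URL into a JSON vars document.

    Ansible can load a JSON file as extra vars (-e @file.json), so the
    output can be written to disk and consumed by a caddy template.
    """
    sites = [
        {"domain": domain, "upstream": url}
        for domain, url in sorted(bucket_urls.items())
    ]
    return json.dumps({"static_sites": sites}, indent=2)


# Hypothetical outputs; in practice these would come from the pulumi stack.
example_outputs = {
    "blog.example.org": "https://blog-bucket.s3-website.example.net",
    "www.example.org": "https://www-bucket.s3-website.example.net",
}

print(render_ansible_vars(example_outputs))
</code></pre>
<p>Sorting the domains keeps the generated file stable between runs, so it only
changes when the cloud state actually changes.</p>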
<p><code>hera</code> also runs a node exporter for prometheus. However, it&rsquo;s not
easy for prometheus to connect to it safely. The safe option I could come up with was
to use wireguard between the docker lxc (<code>loom</code>) and <code>hera</code>.</p>
<p>I also added a firewall rule on <code>loom</code> to prevent any new connections originating
from <code>hera</code>. I want prometheus to be able to access <code>hera</code>, but not the other
way around. That adds an extra layer of protection if <code>hera</code> is ever
compromised.</p>
<p>Again, all of this is configured through <code>ansible</code>, so if <code>hera</code> is ever
compromised, I could just wipe it and reconfigure it with one command.</p>
<h3 id="deploying-the-websites">Deploying the websites</h3>
<p>Now that the websites were configured, I wanted to give the visitors more than
an error page.</p>
<p>This was also the appropriate time to move my repo from GitHub to codeberg. I
created a new repo on codeberg and pushed the repo up. I just edited
<code>.git/config</code> instead of removing the origin and adding it back in.</p>
<p>The site was previously using both GitHub Actions and GitHub Pages. It would now
be using forgejo actions and rclone to push to Scaleway. But wait - it needed
auth.</p>
<p>So far, codeberg has been the one bit of infrastructure I couldn&rsquo;t code. I had
to actually go on to the website and click through the UI manually. It has an
API, but pulumi does not support it.</p>
<p>I did however put together a little script which takes a <code>CODEBERG_TOKEN</code> env
var, picks up the secret key from pulumi outputs and sets it as a secret on the
specified repo.</p>
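<p>A rough sketch of what such a script might look like is below. It is not
the script I actually use. It assumes the Forgejo-style endpoint
<code>PUT /repos/{owner}/{repo}/actions/secrets/{name}</code> with a
<code>data</code> field, which is worth checking against Codeberg&rsquo;s API
documentation, and the owner, repo and secret names are placeholders. It only
builds the request, and takes the secret value as an argument rather than
reading it from pulumi.</p>
<pre><code class="language-python">import json
import os
import urllib.request

API_BASE = "https://codeberg.org/api/v1"


def build_secret_request(owner, repo, name, value, token):
    """Build (but do not send) the request that sets an Actions secret."""
    url = f"{API_BASE}/repos/{owner}/{repo}/actions/secrets/{name}"
    body = json.dumps({"data": value}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values; the real secret would be read from pulumi outputs.
    request = build_secret_request(
        "some-owner", "some-repo", "S3_SECRET_KEY",
        "value-from-pulumi", os.environ.get("CODEBERG_TOKEN", "dummy-token"),
    )
    print(request.get_method(), request.full_url)
    # Sending is left out on purpose: urllib.request.urlopen(request)
</code></pre>
<p>Keeping the request building separate from the sending makes it easy to
check the url and payload before anything touches the real repo.</p>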
<p>I have a function in pulumi for static sites which also creates and exports this
token, which makes it pretty straightforward to add more static sites. This
reusable functionality was the reason I wanted pulumi from the start. I&rsquo;d done
reusable components in terraform as well, but they were a lot clunkier.</p>
<p>I considered collecting web server logs, as I remembered doing back in the day -
with apache and analysing them with <a href="https://awstats.sourceforge.io/">awstats</a>. These
days, it would probably be <a href="https://goaccess.io/">GoAccess</a>. However, the privacy
concerns around storing IP addresses needed handling properly, so I put that on
the backburner.</p>
<h2 id="monitoring--alerting">Monitoring &amp; Alerting</h2>
<p>Monitoring and Alerting was honestly the biggest reason why I kept putting off
bringing everything in-house. Not the work involved in building or running it,
but the feeling of being constantly on call - which I was for 13 years.</p>
<p>That was for multiple high profile, high traffic websites. This is my personal
blog - it&rsquo;s not a problem if the site is offline for a few hours - nobody is
losing money. Still, the Pavlovian response was one of stress.</p>
<p>I was able to work through it largely by reminding myself that there are no
SLAs.</p>
<p>I got prometheus to scrape all the data it can from the server, the lxc containers and my
desktop. I pulled in some dashboards from grafana to visualise them. It&rsquo;s nice
to see historical usage for my desktop and the server, including averages,
growth rate etc.</p>
<p>However, alerting was the main reason for all of this.
<a href="https://github.com/prometheus/blackbox_exporter">blackbox-exporter</a> monitors my
blog and another site I&rsquo;ve got set up. This is valuable: it&rsquo;s nice to see green
across the board, and it also checks the expiry of the ssl certificate. I&rsquo;d been
burned in the past by certificate expiry and it&rsquo;s nice to not have to worry.</p>
<p>When I set up the new site, I also noticed that I&rsquo;d forgotten to set up a
redirect from a previous domain. That domain had a lot of SEO juice, which was
cut off for a while because I hadn&rsquo;t tested it.</p>
<p>This time, I made sure to add all redirects to the <code>blackbox_exporter</code> tests.</p>
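<p>A redirect check can be expressed as a blackbox module along these lines.
This is a sketch rather than my real config: the module name and domain are
invented, and the keys (<code>follow_redirects</code>,
<code>valid_status_codes</code>, <code>fail_if_header_not_matches</code>) are my
understanding of the http prober options, so check them against the version
you run. The output is printed as JSON, which should also load as YAML.</p>
<pre><code class="language-python">import json
import re


def redirect_module(target_url, status=301):
    """Build one blackbox_exporter http module that expects a redirect."""
    return {
        "prober": "http",
        "http": {
            "follow_redirects": False,          # check the redirect itself
            "valid_status_codes": [status],
            "fail_if_header_not_matches": [
                {"header": "Location", "regexp": "^" + re.escape(target_url)},
            ],
        },
    }


# Hypothetical domains, used only to show the shape of the output.
modules = {"redirect_old_domain": redirect_module("https://new.example.org/")}
print(json.dumps({"modules": modules}, indent=2))
</code></pre>
<p>Not following the redirect is the important part: the probe then fails if
the old domain stops redirecting, even when the new site itself is fine.</p>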
<p>I configured the alerts to go to a channel in a discord server, the path of least
resistance. A better option would probably be <a href="https://fluxer.app/">fluxer.app</a>
but it doesn&rsquo;t yet have a mobile app.</p>
<p>It triggered an alert the other day. My heart skipped a beat before I remembered
that it was just my blog. It resolved itself while I was investigating it. I could
find nothing wrong on the server, which was unlikely to be the culprit anyway.</p>
<p>It was probably the Object Storage bucket, so I added the website endpoints to
the monitoring as well. In the event of a future site failure, I will be able to
see at a glance where the failure is.</p>
<p>For a moment, I regretted switching it all to my infrastructure - because it
would not have gone down for a few minutes on GitHub. I then realised that I
simply don&rsquo;t know how often my site went down when it was on GitHub or for how
long - it was never monitored.</p>
<h2 id="journey-so-far">Journey So Far</h2>
<p>I am thrilled to have my blog live on my own infra instead of on GitHub&rsquo;s.
Having the whole setup documented in pulumi and ansible makes it much easier to
maintain and manage. It also makes it much easier to take a look at how I set
something up.</p>
<p>I also appreciate that I can redeploy all or parts of the services with ease if
required, and that upgrades should be relatively pain-free.</p>
<p>The infrastructure that is deployed feels less opaque than when I had done
similar things 15+ years ago, thanks to it all being managed via IaC.</p>
<h2 id="next-steps">Next Steps</h2>
<p>I still have to pull everything down from Google and Facebook. I already have the
file storage ready - I just need to pull everything down, then delete it from
the cloud.</p>
<p>Email will be a bit more work. I am using <a href="https://mailbox.org/en/">mailbox.org</a>
for my new domain, which feels good enough. I have half a dozen email addresses.
I can rationalise them down to three, but I also want to wipe out all the junk
email in the process.</p>
<p>Google Photos will be trickier still, partly because I&rsquo;ve not looked for a
self-hosted solution for that yet. There is also the problem of how to safely
expose my internal servers to the internet so that photos can be
uploaded / downloaded from the phone when I&rsquo;m out and about. Headscale might be
one way to solve this.</p>
<p>I also want to put together a dashboard on grafana that will show me a
comprehensive high level overview of my whole estate on one monitor - that&rsquo;s an
endeavour for another day.</p>
<h2 id="conclusion">Conclusion</h2>
<p>So far in my journey, I have migrated key repos from GitHub, both private and
public, to safer places. I feel substantially safer already. Most of the repos
were code, but one was my zettelkasten / second brain knowledge archive. This
used to be a private repo on GitHub. It is now stored on <code>atlas</code> across from me
in my room, encrypted and backed up to Scaleway. That was the biggest win of
this whole process.</p>
<p>The rest of the data will come down here in time, and I have no doubt that I&rsquo;ll
feel safer by the end of it.</p>
<p>Am I actually safer? That is much harder to measure. I know that I have limited
the attack surface as much as possible. Plus, my scale is so small that I am
unlikely to be targeted.</p>
<p>On the other hand, GitHub and other cloud platforms have a lot of people
worrying about and considering security and safety on a daily basis. They
regularly patch the servers and track security vulnerabilities.</p>
<p>Am I actually safer? I don&rsquo;t know - but I feel safer.</p>
]]></content:encoded></item></channel></rss>