Robert's Tech Lab

My journey through self hosting and software engineering

#docker #fediverse #bestpractices #homelab

I'd like to think I've stood up my fair share of projects, both personally and professionally, over the years. From running tiny open source projects like WriteFreely (which powers this blog) to standing up massive microservices in Kubernetes for large companies, I've done a lot of Docker.

Over the last weekend I've been trying to stand up a Fediverse instance. Those of you who know me know that I'm fairly passionate about the fediverse: it democratizes social media a bit by letting us host our own versions of Insta/FB/etc and letting our self-hosted instances talk to each other.

However, what started as a very fun idea to stand up my own has instead turned into over a week of frustration and questioning, and it inspired this post today.

So rather than talking about hosting, or Kubernetes, or Docker, today I want to write up some essential dos and, maybe more importantly, do-not-dos for Docker images. None of this is me saying "you did this wrong"; it's me trying to show the multitude of ways people use Docker, and how to make sure people find your Docker image easy to use instead of frustrating.

I won't point fingers or use any exact examples, but I will come up with examples of both the negative and positive cases of each.

Don't: Write shell scripts for building/running containers

To be clear, I don't mean scripts inside the container. I mean install scripts for actually running your containers, scripts that run docker build or docker run for you.

I've seen this trend in a lot of open source projects, and it's a bit of a pet peeve because I know maintainers are trying to make images easy to use for their users. After all, we all know plenty of people will demand “Just give me the exe”. (If you don't know, I warn you there's plenty of profanity, but here's the link). However, there's a line that can be crossed, where making it so easy that containers will “just start” hides a lot of the crucial info that is required for actually running a container.

I've spent hours reading through install.sh scripts trying to figure out how maintainers actually build their docker containers, when really all I want to do is figure out what steps I need to build the container, or even how to simply run the container.

The reason this is difficult is simply this: what if I'm not running this container on this machine? What if your user is on Windows? Mac? FreeBSD? What if the machine I'm going to run the container on is in the cloud, across the room, or, in my case, running Kubernetes? I will not be using docker run or docker compose, so while you've made it marginally easier for people using those tools, many other people are left with an even harder-to-stand-up version of your software.

Docker was built to be able to run your code on any machine already, with a standard set of rules and guidelines. Adding custom scripts into the mix means that the process has deviated from the standard Docker guidelines, and new users will need to learn your specific way of standing up your containers.

Do: Write clear documentation on building containers

It may seem backwards, but clear and concise documentation of your environment variables, volumes, ports, and other Docker-related items is much more helpful than a script.

You can run any container by knowing the environment variables, volumes, and ports – and providing that documentation really is all you should need to do to get your users up and running.

As for the “Give me the exe” folks, well, if you want to self host, learning docker is one of the best ways to get started.

Environment Variables

Below are the environment variables needed to run the application, their descriptions, and example values.

Name         Example              Description
DB_HOST      127.0.0.1            The host (or IP address) where the database is located.
DB_USERNAME  admin                The username that will log into the database (located at DB_HOST).
DB_PASSWORD  (Your DB password)   The password for DB_USERNAME.

It's tempting to build a nice CLI that lets your users type those values in and sets everything up for them, but by doing that you're only helping the fraction of your users who run the software on that one computer. Help all of your users by providing clear documentation instead.
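With documentation like the table above, any user can translate it into whatever tooling they actually use. As a rough sketch of what that looks like with plain docker run (the image name, volume path, and port below are placeholders, not from any real project):

docker run -d \
  -e DB_HOST=127.0.0.1 \
  -e DB_USERNAME=admin \
  -e DB_PASSWORD='changeme' \
  -v /srv/myapp/data:/data \
  -p 8080:8080 \
  myapp:latest

The same table translates just as directly into a compose file, a Kubernetes manifest, or a Nomad job – which is exactly the point.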

Don't: Use .env files in Docker

Following on from environment variables: another worrying trend I've seen with small projects is the use of .env files in production.

For those who don't know, .env is a growing standard for setting up environment variables for development purposes. I use them myself in Python and JS projects. It allows you to set your environment up in a simple file rather than trying to figure out how to set your system's environment variables while debugging your code.

DB_HOST=127.0.0.1
DB_USER=admin
DB_PASSWORD=foobar!!

The format of a .env file

This is great for development on your machine. It is a maintainability and security nightmare in production. Let's break down a few reasons why you should never use .env files in production.

An additional file is an additional dependency

A .env file is one more file that needs to be maintained and backed up in case a server fails. The ability to rebuild a container's environment should not depend on a file that might not have been backed up.

Secrets/Credentials

Security is the most damning issue for .env files. I gave the database example above to highlight the point: with .env files, your secrets are stored in plain text somewhere on a computer. Docker, Kubernetes, OpenStack – every platform provides some sort of secret store where you can safely place your secrets and inject them into your running container. Personally, I really enjoy using 1Password's secret management tools, which handle this for me.

Forcing the use of a .env file in production completely negates all of this security work by forcing passwords to be stored in plaintext.

Conflicts with Docker principles

This one I have to throw in: containers aim to be as stateless as possible. Introducing state where there doesn't need to be any (especially when environment variables already exist) goes against what containerization is attempting to do.

All major languages/frameworks allow some way of setting configuration values from either a local file (.env, appsettings.json, etc.) or from environment variables, and merging them together. I urge developers to learn these settings; it takes a few minutes and will save many headaches later.

Do: Use environment variables

Simple as that: use environment variables to configure your application. Most .env libraries already fall back to the real environment, so let them do their job. If they find a .env file, they'll use it (great for local debugging); if they don't, they'll use the environment variables set by the system.
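As a rough sketch of what that looks like in Python with python-dotenv (the variable names match the table above; the defaults are just examples):

import os
from dotenv import load_dotenv  # pip install python-dotenv

# Loads a .env file if one exists. By default python-dotenv does NOT override
# variables that are already set in the real environment, so production values win.
load_dotenv()

DB_HOST = os.environ.get("DB_HOST", "127.0.0.1")  # example default
DB_PASSWORD = os.environ["DB_PASSWORD"]           # required; inject it as a secret in production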

Don't: Share/re-use volumes

I've only seen this on a few projects, but it's a critical one. Never share volumes between containers. It's more hassle than it's worth, and it usually highlights an underlying problem with your project.

I see this mostly with something like an app server that serves HTTP, with workers running in other containers. That by itself is a great pattern: it lets me scale workers separately while keeping my HTTP server up. Say, though, that the worker needs to access something from the HTTP server. Well, the first approach might be to just share the same volume. The HTTP server uploads it to /volume/foo.txt, and then the worker can mount the same volume and read it, right?

Well, yes, if these containers are always running on the same machine. But will they be? The example I laid out actually kind of contradicts that thought: the point was that we don't want to overwhelm our API server, so wouldn't the natural next step be to move our worker to another machine? If we require a shared volume, that immediately becomes more complex.

Kubernetes offers a few volume access modes. RWO (ReadWriteOnce) lets pods share a volume, but only while they run on the same node; the volume can only be mounted read-write by a single node. Maybe the worker lives on another machine and needs to write back to the API container? Well, Kubernetes offers RWX (ReadWriteMany), which does work – however it usually requires quite a bit of setup, and how do providers accomplish it? Most RWX providers run an NFS server inside the cluster, which means local storage is no longer local and writes happen over a network. So to achieve this simple cross-container volume, we've taken on quite a bit of overhead for a solution that doesn't work as well as we wanted in the first place.

Do: Use a cache or database

A much cleaner approach, and one that was meant to handle this from the start, is to use a cache. Redis is extremely easy to set up and will solve 99% of the issues you may have here. For text, JSON, or anything else you may need to pass through, sending it through Redis will probably save you a lot of headaches.

For small files, like an avatar or thumbnail, Redis can hold binary values too, and if you need more durability most SQL and NoSQL databases allow binary data or attachments. I know many will scoff at the idea, but if the goal is simply to facilitate communication between containers, this can be a fine solution: say, passing a file to a worker that will process it and eventually upload it to S3.
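A rough sketch of that hand-off in Python with redis-py (the hostname "redis" and the key name are made up for the example):

import redis  # pip install redis

r = redis.Redis(host="redis", port=6379)

# API container: stash the uploaded bytes where the worker can find them,
# with an expiry so abandoned uploads don't pile up forever.
uploaded_bytes = b"...file contents..."
r.set("upload:1234", uploaded_bytes, ex=3600)

# Worker container: pick the bytes back up, process them, then push the result to S3.
data = r.get("upload:1234")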

Don't: Write files around the filesystem

A large source of code smell is files being written all around the filesystem. If your code needs some files under /var, others under /data, and maybe a few more somewhere else entirely, it becomes very difficult to track them all. Confusion, though, is just one problem.

Kubernetes and containerd may not even let you modify files that aren't in a volume. The filesystem of a container running under containerd (k3s, for example, uses containerd as its default runtime, which is good because it's more lightweight) can be completely immutable. Attempting to write to a path that is not explicitly mounted as a writable volume will cause your application to throw "read-only filesystem" errors.

There are exceptions to this. /var/log, for example, is typically mapped to temporary storage so logs can be written there, as is standard. /tmp is the same: anything written to /tmp is assumed to be temporary and disposable.

If you need to write to a random file though, say /var/foo/bar, containerd will throw an error, and that directory will need to be explicitly mapped by your users before they can write to it.

Do: Choose one workspace to work out of

There is nothing wrong with needing persistent storage, but try to keep it to the minimum number of volumes. Say you keep everything under /data; that allows users to create one, and only one, volume that needs to be persisted.

However conversely:

Don't: Store temporary data in persistent storage

The exact opposite of the above: if you have temporary data, store it in temporary storage. Most persistent storage options assume storage can always grow but never shrink. That means if you store logs under /data/logs, and /data is a persistent volume, and those logs grow to gigabytes in size, your users end up paying for gigabytes of old logs.

Do: Use standard locations for temporary storage

Put logs under /var/log and temporary storage under /tmp and everything will work perfectly.

Don't: Log to a file

Or don't store logs at all! On the subject of logs, in a containerized world – don't worry about them.

If I have a container acting up and I want to see the logs, I vastly prefer using the docker logs or kubectl logs command over needing to exec into a container to tail a logfile.

Do: Print logs to the console

Kubernetes, Docker, and all of the orchestration frameworks will then be able to consume your logs and rotate them accordingly. Every logging framework can write to the console; there is no need to write your containerized logs to a file at all.
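In Python, for example, that's a single line of configuration – a minimal sketch (the logger name is just an example):

import logging
import sys

# Send everything to stdout so `docker logs` / `kubectl logs` can pick it up.
logging.basicConfig(stream=sys.stdout, level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(name)s: %(message)s")

logging.getLogger("myapp").info("container started")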

My final one for today is the worst offender I have. This one is important, folks.

NEVER: put source code in volumes.

This one is so bad that I'll call out a direct example: Nextcloud. Nextcloud is a PHP application, and their AIO container is one of the most frustrating I've worked with, mostly because the source code is actually stored in the persistent volume. There are many, many reasons why this is bad practice and goes directly against the very idea of containerization, but I'll call out a couple.

Containers should never need to be “updated”

Containers should be all encompassing for their environment. There should never be a case when a container starts that some packages may be out of date, or some code may be the wrong version – that's the point of containerization. When a container starts, that is everything it needs to be able to run.

No version control

Since persistent volumes are outside of a git repo, there is no way to update the versions without writing your own update functionality. There are simply too many ways that code in the volume can become inconsistent with what is in the container.

Do: Make use of temporary data and configuration

Code should remain under its own directory, with persistent data under a completely separate directory. If you're building your own container, I recommend /app for your workspace, and then mounting /data for any persistent storage needs. To go above and beyond, make an environment variable like MOUNT_POINT that defaults to /data but lets your users mount anywhere on the filesystem; that gives them the most flexibility, and then you don't need to worry at all about where in the system your mount points are.
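A tiny sketch of that pattern in Python (MOUNT_POINT comes from the paragraph above; everything else here is made up for illustration):

import os

# Persistent workspace: defaults to /data, but users can point it anywhere they mount.
DATA_DIR = os.environ.get("MOUNT_POINT", "/data")
os.makedirs(DATA_DIR, exist_ok=True)

# Everything persistent lives under DATA_DIR; code stays in /app, scratch files in /tmp.
uploads_dir = os.path.join(DATA_DIR, "uploads")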

If there are dynamic packages or plugins, there should be a way to define them in environment variables, and they should be run ephemerally. On container start, download the plugin to /tmp and use the code from there. On container restart, the same thing will happen. This also ensures plugins stay up to date, as the container will always download the latest version.

Hello everyone, I hope you all had a great holiday break and are starting the new year off okay. I'm back, and as ever I spent the break working on my homelab. (Okay, and a fair amount of time in Satisfactory as well.)

Today we're getting into some of the more fun stuff, working with GPUs in Kubernetes.

To recap so far, I've talked about setting up hypervisors at home and setting up a basic Kubernetes cluster. In case you don't know, Kubernetes is simply an orchestration layer for scheduling your pods across different VMs.

This orchestration makes running numerous pods, both on premise and at scale, a breeze, with only minimal setup needed on the machine itself. (If you're curious about how I run Kubernetes at home, I recommend checking out my post on k3s, a lightweight Kubernetes.)

Why choose Kubernetes for GPU workloads?

It's a fair question: why go through the extra orchestration layer at all? "Hey Rob, I have VMs, I have GPUs, why can't I just run my workloads on the VMs directly?" You absolutely can, and I did for quite a while. GPUs are massive workhorses that you can pass directly into your VM and transcode video or train models right there.

As mentioned though, the problem is scheduling those workloads. By not using an orchestration layer, you are left with the task of scheduling them yourself. Both on-premise and in the cloud that is a daunting task, and adding graphics cards to the list only makes it more complex.

On-premise or homelabbing, you have a finite number of graphics cards. I won't say how many I personally have, but it's more than 2 and I can count them on one hand. Having a non-infinite amount of resources makes scheduling those workloads crucial. If I have a transcode or training job I need to run, I don't want to spend time figuring out which nodes have what workload and which one is almost done so I can run the next; I need a way for those workloads to queue on their own.

The cloud has a very different problem: cost. While there is an "infinite" amount of compute power available, GPU workloads are expensive, starting-at-$10/hour expensive. If you are running in the cloud, your goals are also directly tied to scheduling. If you're paying that much per hour, you want to run a job as quickly as you can, use as much of the power you're paying for as possible, and then stop the node as quickly as you can before you're paying for reserved cores.

Both of these are prime use cases for Kubernetes, but what if we could have the best of both worlds? Now, this is stepping out of homelabbing, but as a company, is there another option?

The Hybrid-Model

Admittedly this is stepping out of homelabbing, but let's build a scenario. You're a company that needs heavy GPU workloads. You see the prices that AWS, GCP, and Azure are charging and do the obvious jaw drop. Let's analyze your workload.

Most GPU workloads are going to have a baseline of things running, and most workloads are fairly predictable. You may be training models regularly, you may have video transcodes regularly, and those have metrics involved where you can see roughly on average how many you are doing at any given time. For simplicity's sake, let's say on average you are running 10 GPU workloads at any given time, but the problem is that you can sometimes spike up. Maybe there's an event and everyone uploads videos?

For hard iron, it's cheaper to simply buy some GPUs, but it can't scale. For cloud, you can scale easily, but the cost of keeping those 10 GPUs running is pretty large.

Kubernetes again solves this problem easily. Out of the box you can attach on-premise (or in a datacenter, anywhere really) nodes to your cloud-based cluster. If you run AWS/Azure Kubernetes, you can attach on-premise nodes to that cluster, and even better you can give those nodes priority over the cloud nodes.

So to use our example, for that average 10-GPU workload, let's say we buy 2 servers with 12 GPUs running on them. We attach them to our cloud-based Kubernetes cluster and give priority to the on-premise nodes. With that alone, our base 10-GPU workload is now completely runnable on-premise, and we're only paying the datacenter cost to run the servers we bought (negligible compared to running them in the cloud) and the negligible cost of orchestrating Kubernetes in the cloud.

Then, we could attach an autoscaler to Kubernetes in our cloud provider so when demand does go up, we can quickly and easily add a few more GPUs into the cluster, knowing that when demand goes down those workloads will be stopped and we have minimized our cost!

Kubernetes really does bring immediate value when running GPUs. So okay, I've talked about the why enough; it's time we get into some of the nuts and bolts.

How does Kubernetes schedule pods?

Before we can actually add a GPU to the cluster, let's pause and talk about how the Kubernetes scheduler works, at least at a high level.

The scheduler in Kubernetes is in charge of deciding where your pod is going to run. There are many factors in how it decides, and I'm going to stay very high level for this explanation. Things can include:

  • Are there specific nodes (VMs) that this pod has to run on?
  • Are there nodes that this pod is not allowed to run on?
  • Are there nodes that have sufficient resources to run this pod?
  • Are there any specific hardware requirements for running this pod?
  • From our example above – is there any affinity for running these pods on one node vs another, any preference? (Like on-prem vs in cloud?)

There's of course more, and there's plenty of nuance there. What I want to convey is that the Scheduler is flexible. You can set up rules that say:

  • I need 4 cores to run.
  • I need 4 cores and 3GB of RAM to run
  • I need 4 cores and 3GB of RAM (but I might go up to 16GB of RAM)
  • I need 4 cores and 3GB of RAM, and also I need those cores to be ARM
  • I need 16 cores and a GPU.

Any one of these is doable and valid when scheduling pods. So, we know it's flexible. Let's actually get started.
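As a rough sketch, here is how a couple of those rules look in a pod spec. The resources block and the kubernetes.io/arch label are standard Kubernetes; the names and numbers are placeholders:

# "I need 4 cores and 3GB of RAM (but I might go up to 16GB), and those cores must be ARM"
spec:
  nodeSelector:
    kubernetes.io/arch: arm64   # only schedule onto ARM nodes
  containers:
  - name: example
    image: busybox:latest
    resources:
      requests:
        cpu: 4
        memory: 3Gi
      limits:
        memory: 16Gi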

Preparing the cluster (on-premise / homelab only)

Most cloud clusters will come prepared for GPU workloads, but in case yours doesn't, or you're like me and like the hard-iron approach, let's talk about how to pass NVidia GPUs into our containers.

Installing the NVidia Container Toolkit

GPU support is not enabled out of the box when you install k3s, Docker, or any of the container runtimes. It's enabled by installing the NVidia Container Toolkit.

There are multiple how-tos on that site; I'm running a Debian-based setup, so I followed the Debian steps: install the drivers, add the repository, run the install command. That was pretty simple honestly, just installing the toolkit. Installing the toolkit is mandatory even if you only want to run a container with docker run --gpus.
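Before involving Kubernetes at all, a quick sanity check that the toolkit and drivers are working is to run nvidia-smi from inside a container (the CUDA image tag below is just an example; any recent CUDA base image should do):

docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi

If that prints your GPUs, the host side is ready.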

Now comes the more complex part: making GPUs available to Kubernetes. Again, I'm using k3s for this, as described in my previous blog post; many cloud providers will have it set up for you already.

Installing the NVidia Device Plugin

The next step is to install the NVidia device plugin. The device plugin exposes which nodes have GPUs and their health, and lets GPU workloads be scheduled onto them. It will automatically apply some labels to your nodes and expose metadata.

This is available as a Helm chart which can be easily applied to your cluster. Really, this is a set-it-and-forget-it operation; in the now two years of running my cluster, I have not needed to even think about it. (I should probably check to see if there's an update...)

helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
helm repo update

helm upgrade -i nvdp nvdp/nvidia-device-plugin --namespace nvidia-device-plugin --create-namespace --version 0.14.3 --set runtimeClassName=nvidia

Adding the NVidia Runtime Class

The runtime class is one of the lowest-level options Kubernetes gives you for running your containers. Think of it as a way to tell Kubernetes which container runtime you want a pod to run on. Well, we're going to add another option, nvidia, and mark our pods as needing to run on the NVidia Container Toolkit described above.

apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia
handler: nvidia

Simply create that as a YAML file (I just named mine runtime.yaml), and then apply it with kubectl apply -f runtime.yaml. Congrats, you have now added the nvidia runtime. Pods can now be run on the NVidia Container Toolkit!

Let's finally start running some workloads!

Running a GPU pod

Here's a very basic Kubernetes job.

I'm choosing a job because most workloads will be a one-time task we want to run and then close out. There are cases for making it a deployment, where an API will hold onto a GPU indefinitely, but many of our harder-to-schedule workloads need a GPU, run for several hours, and then complete.

apiVersion: batch/v1
kind: Job
metadata:
  name: myheavyworkload
  labels:
    job: myheavyworkload
spec:
  template:
    metadata:
      name: heavyworkload
      labels:
        job: myheavyworkload
    spec:
      containers:
      - name: heavyworkload
        image: busybox:latest
        imagePullPolicy: Always
        env:
        - name: ENVIRONMENT_VARIABLE
          value: foobar
        resources:
          requests:
            cpu: 2
      restartPolicy: Never
  backoffLimit: 4

Let's look at this quickly. This is a fairly simple job; for the example it uses the busybox:latest Docker image. It has one environment variable we set, and we set some labels on it. At the bottom, we have some resources, where we request 2 CPUs. The pod is set to never restart, and if the job fails it will attempt up to 4 more times before giving up.

Hopefully this is pretty standard to people, even if you don't know kubernetes well I'm hoping that reading through that configuration you can see how it's laid out.

So how do we add a GPU to it? It must be very complex right? Well nope, I'd say the most complex things are behind us! Let's see what it takes to add a GPU to this workload now.

apiVersion: batch/v1
kind: Job
metadata:
  name: myheavyworkload
  labels:
    job: myheavyworkload
spec:
  template:
    metadata:
      name: heavyworkload
      labels:
        job: myheavyworkload
    spec:
      runtimeClassName: nvidia
      containers:
      - name: heavyworkload
        image: busybox:latest
        imagePullPolicy: Always
        env:
        - name: ENVIRONMENT_VARIABLE
          value: foobar
        resources:
          requests:
            cpu: 2
          limits:
            nvidia.com/gpu: 1
      restartPolicy: Never
  backoffLimit: 4

What changed? Well, we set runtimeClassName: nvidia, which as described above means we explicitly want to use the NVidia Container Toolkit. Then we also set a limit of one nvidia.com/gpu.

That's it! That's all it takes to say you want this container to use a GPU. If you request that pod, Kubernetes will find a node with a GPU available and start it as soon as it can. If one is not available, it will remain Pending until one frees up, either from another job finishing or from an autoscaler adding more nodes based on whatever rules you set.

I hope you found this post interesting. GPU workloads add another level of complexity, but the freedom of abstracting GPUs away from specific nodes can give you amazing new opportunities. Scheduling pods, re-runnable jobs, dynamically adding more GPUs, starting jobs programmatically: all of these become incredibly easy with Kubernetes!

Thank you for making it this far; if you have any questions feel free to reach out as always. I'll be adding more social handles, but for now you can find me on LinkedIn. Take care everyone!

Welcome back! It's been a while. I've been busy, I'm sure you've been busy, but it's time for yet another insight into self hosting your own lab of computers. This one is for those who said, "Sure, I could run it in the cloud, but what if I want something that will also heat my house?"

Last time we talked about Proxmox, the solution I chose for my hypervisors. Proxmox allows you to create Virtual Machines for running workloads and supports clustering, or combining multiple hosts into one cluster to manage at a high level.

This is a great way to get started. For many years this is how I hosted most of my services: a Proxmox VM, and inside that VM I would install Docker and use a docker-compose.yml file to run services. This VM would get more cores because it ran a heavier workload, that one would get fewer.

Note: Yes, I also did it in LXCs. Do not do this. Don't run Docker in LXCs. Only pain will you find down that path.

This approach worked for a while, but it was painful. Multiple layers of Docker and VMs across multiple machines was tedious. If I had containers that needed to talk to other containers outside of their VM, I would lose the niceties of Docker networking and have to step out and grab IPs. The worst part was managing ports: every service needed its own port open on the host, with open communication between hosts. Eventually it was just plain annoying to deal with.

So, the big decision loomed: do I move to Kubernetes?

The answer? Yes. If you, Dear Reader, have reached a point where managing containers at a compose level is now tedious, then it may be time to seriously consider Kubernetes at home.

K3S – A Lightweight Kubernetes

K3s (https://k3s.io/) was the obvious choice for home use. The alternative was full-blown Kubernetes, but Kubernetes as a whole was way more than I needed, and it makes certain assumptions, such as that you are running in a cloud. K3s is a lightweight distribution of Kubernetes and stays relatively up to date; it simply doesn't ship with everything upstream Kubernetes does.

Get it, K3s, because it's lighter than full k8s, you see because kubernetes is 8 letters between the... nevermind let's keep going

Minikube was recommended everywhere, but Minikube isn't built to be production ready. Minikube is for tests and demos, a cluster made on demand. K3s is production worthy and is meant to come online and stay online.

Now, K3s has a major caveat that you should know before diving in, and that is its backing store. K3s continues on with the idea that there is a master node, and all of the other nodes are agents. The master node contains an SQLite database by default, which manages the internal state of the cluster. What does this mean?

The master node should be backed up. Now, I want Kubernetes to remain as light and carefree as possible – I'm willing to (and have) rebuilt my entire cluster from the ground up – but if we can avoid that, it's preferable. I'm prefacing this whole post by saying this: keep your master node light and small. Since we're going to build on top of Proxmox, just give your master node something like 4 cores and a smaller disk, and keep it out of the way.

The Plan

My plan was simple: get Kubernetes running. I planned on:

  • The master node would be a smaller VM that I could easily back up regularly.
  • Each physical computer would run Proxmox, and in there a VM would run Debian, onto which I would install k3s.

Why keep the VMs and the Proxmox layer at all? Well, mostly because I still have a few other VMs I run, so it makes sense for the k3s node to be one of a few VMs running on the host. Let Proxmox do what it does best. A secondary, lesser benefit is that if a k3s node does fail, I don't have to go reboot a machine manually; I can simply log into Proxmox and handle it there.

Each primary node would have these basic requirements:

  • CPU – As much as I could spare. Proxmox will let you share CPU between VMs too, but there are performance tradeoffs. I took however many cores my machine had and distributed them between the VMs.
  • RAM – Same approach: I distributed about 90% of the system's total RAM across the VMs (leaving some for Proxmox itself).
  • Disk – This is going to be an interesting one, and we'll get into why in an upcoming post, but yes, we will need a fair amount of disk space, more than you think. I allocated a minimum of 500GB to each node.

Pre-install Notes

Test it Out

Do not expect your first cluster to be the cluster you end up with. My current cluster is probably the 4th or 5th, with many teardowns and rebuilds until I had something I liked. If you are following this as more of a guide, then please take my advice: plan on your first cluster being torn down. It reduces the stress of getting it right the first time, knowing you can tear it down at any moment.

KEEP NOTES.

You are going to quickly have an abundance of knowledge you will lose. Before you start on anything here, determine a place to keep all of this knowledge. You are going to be inundated with scripts, yaml files, little hacks, and little fixes. You are going to be overwhelmed with them and you will lose them. The only reason I can write this blog post is because I have my old notes open to explain why I did each step.

Personally, I recommend a git repo with a nice healthy README for how to install that you keep very meticulous notes in.

Installation

I built out each VM and set up Debian. This is where, if you really wanted to automate, you could utilize templates (or yes, even Ansible – you know who you are if you're reading this), but my cluster has 6 nodes, and adding a new one means I physically bought a new machine, so it's going to be rare.

After Debian was up, there are a few dependencies you should install. This is one of those "just trust me" moments; I went through a lot of pain figuring out which dependencies would be needed.

Note: I followed best practices, installed sudo already, and created my own user account. You can of course do all of this from root, but please don't.

Dependencies

First, make double sure your system is up to date.

sudo apt update
sudo apt upgrade

These are mostly for storage, which is coming later, but trust me when I say it'll make your life easier to have them all installed and ready to go now on your cluster nodes.

sudo apt install -y curl cifs-utils nfs-common open-iscsi cryptsetup

Deciding on your control plane

For my cluster, I'm going to be using Istio. I was introduced to Istio years ago by one of my colleagues, and I owe him a lot of thanks for doing a small lunch-and-learn on this amazing framework. I'm not going to dive heavily into it today, it deserves its own post – but for now know that Istio is going to handle my networking layer, mostly things like ingress gateways, URL pathing, and port management.

You do not have to do the same. Out of the box k3s uses Traefik; it is just my personal preference to use Istio.

Create Master Node

The time is here: you can finally install k3s on your master node! There are some options you can set here, and I would recommend reading up on them fully first.

As mentioned, I'm using Istio, so in my command you'll see that I'm disabling Traefik.

curl -sfL https://get.k3s.io | sh -s - --disable=traefik

This command will take a second, but it's quicker than you'd think. When it's done you can check your service by running sudo service k3s status

● k3s.service - Lightweight Kubernetes
     Loaded: loaded (/etc/systemd/system/k3s.service; enabled; preset: enabled)
     Active: active (running) since Mon 2024-11-11 21:15:02 PST; 2 days ago
       Docs: https://k3s.io
    Process: 464 ExecStartPre=/bin/sh -xc ! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service (code=exited, status=0/SUCCESS)
    Process: 469 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
    Process: 515 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
   Main PID: 516 (k3s-server)
      Tasks: 405
     Memory: 2.4G
        CPU: 18h 58min 50.271s
     CGroup: /system.slice/k3s.service

You did it! Your first node!

Grabbing your Kubeconfig

This took me a while, but your kubeconfig is actually pretty easy to grab from k3s. You will need to be root, but you can see it by running:

sudo cat /etc/rancher/k3s/k3s.yaml

You can copy that to your primary machine now, and as long as your machine can reach your node, you should be able to run standard kubectl commands like kubectl get nodes.
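One gotcha: the k3s.yaml you just copied points at https://127.0.0.1:6443, so from another machine you'll need to swap in the master node's IP. Roughly (the file path and placeholder IP are up to you):

# after pasting the contents of k3s.yaml into ~/.kube/k3s-home.yaml on your workstation
sed -i 's/127.0.0.1/<master-node-ip>/' ~/.kube/k3s-home.yaml
export KUBECONFIG=~/.kube/k3s-home.yaml
kubectl get nodes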

Joining another node

Okay we've had fun adding one node, but what about a second?

Joining a new node is simpler than you'd think.

On your master node, run this command:

sudo cat /var/lib/rancher/k3s/server/node-token

The token returned will be your join token. It will not change, so I recommend storing it somewhere secure.

On a new machine (VM), go through the same steps as above: install Debian, update, install dependencies. Everything up to the k3s install script.

Grab the IP address of the master node, and then try to ping it from the new machine. Make sure it can communicate with your master node. This seems redundant, but it'll just make debugging easier. Once you are sure you can ping your master node, you can install k3s.

curl -sfL https://get.k3s.io | K3S_URL=https://<<IP of your Master Node>>:6443 K3S_TOKEN=<<YOUR_JOIN_TOKEN>> sh -

If everything works correctly you should now be able to run sudo service k3s-agent status. Note that this is an agent, so it's k3s-agent, not k3s.

● k3s-agent.service - Lightweight Kubernetes
     Loaded: loaded (/etc/systemd/system/k3s-agent.service; enabled; preset: enabled)
     Active: active (running) since Sat 2024-10-19 19:46:16 PDT; 3 weeks 4 days ago
       Docs: https://k3s.io
   Main PID: 768 (k3s-agent)

If we go back to our main machine where we have kubectl we should also be able to see our second node.

kubectl get nodes

NAME         STATUS   ROLES                  AGE   VERSION
k3s-master   Ready    control-plane,master   0d    v1.30.6+k3s1
k3s-node-1   Ready    <none>                 0d    v1.30.6+k3s1

Wrapping Up

There you go! With this you have a 2-node Kubernetes cluster running at home! This by itself is an achievement. There is a lot of work still to go migrating anything existing over and getting new items up and running, but this is an accomplishment in and of itself.

This entire project took me about 2 months, from getting the first nodes up to having my services finally up and running, and then another 2 months of stabilizing those services. This will not be an immediate win, this will probably not be an “over the weekend” project. This will be a long running project you will poke at over time, with a lot of setbacks as you learn new concepts and debug extremely vague errors.

Take the win here: you got Kubernetes running at home. By taking these steps you are starting down a path into DevOps and infrastructure that most people will never touch. Grab a coffee, take a break, and I'll see you next time for the next steps!

A few posts ago I talked about how I store the majority of my data in my cluster on #unraid, a software-based RAID operating system that also has a great plugin store and basic hypervisor support. (Hypervisor: a system that controls, monitors, and maintains virtual machines.)

What is the limitation of Unraid, though? Unraid is a great system to get you started, but it only runs on one machine. You can grow your Unraid server – it's actually very easy to migrate your installation to bigger hardware if you need to – but what if that's still not enough power? What if you need to host so many applications that one machine starts to feel the strain?

Then it might be time to start looking into hypervisors that don't run on your data cluster, but instead connect to it. This was my main problem: I was running a single hypervisor, but was running out of processing capacity on it. Worse, since it ran on my data storage, when the hypervisor was under heavy load my data access also slowed, compounding the issue. When my virtual machines could barely perform any tasks, I knew I had put it off for too long; it was time to actually plan an expansion.

Choosing a Hypervisor

I searched around for a lot of hypervisors. I wasn't opposed to proprietary, but I knew going in this was something that I would be using for years, so an annual or god forbid monthly subscription was out.

On top of not wanting to pay, I wanted this to be an opportunity to start really learning some new things. A lot of hypervisors come with amazing UIs that let you not really know what's happening underneath. This is honestly a great path for those who “just want things to work”, but I really did tell myself that this was the time I was going to learn how these things work, and what feels like an overwhelming UI wasn't going to stop me.

Finally, while Unraid runs on Linux, I knew that I would probably finally be forced to confront Linux head on. I would be looking at many types of hypervisors, but I didn't want my uneasiness with Linux to hold me back (and I am very glad I set that expectation, now that everything I run is Linux).

So my options were:

vmware vSphere

vSphere had really confusing licensing terms that were very clearly made for business. Their licensing was on a per-core basis, which was very alarming to me. VMware's free tier was available for personal workstations, which I could probably use on a few computers, but I really wanted a single interface. Plus, I've done the whole "I'll just use the license technically and install 4 different personal-use licenses" thing, where you're in a grey area because hey, it is for personal use, right? Take it from me, it always backfires. Either they figure it out and revoke your license, or the license changes and then you're frantically trying to move your entire system over. So, vSphere was a non-starter.

Microsoft Hyper-V

Hyper-V is Microsoft's hypervisor, and it's built into a surprising number of Windows versions already, so I had already played with it a few times. While it's on most any version of Windows 10/11 now (enabled by turning on the Hyper-V feature), I was specifically looking at Windows Server 2016 Datacenter – mostly because I already had a license for it – which checked the box of no ongoing payments.

Hyper-V is very manual. It does have the standard Windows interface for adding and managing your VMs, but there was no way to manage them from a remote computer without using Remote Desktop. This wasn't a dealbreaker, but it was annoying to me. (There probably is some way to do this in a completely Windows ecosystem that I'm not aware of, but I really do like how web-based interfaces have taken off.)

The real dealbreaker was that the license has a clause that I could only use 16 cores on my system. I'm sure this was written back when 16 seemed like a ton of cores and would cover most business needs, but I was looking at buying an AMD Threadripper, a beast of a CPU with a whopping 48 cores – so right there my license for Hyper-V was useless, and my search continued.

Proxmox

So finally I landed on Proxmox. Proxmox is an open source hypervisor built on top of Debian Linux. Since it's open source, the community version is the default, with no noticeable differences except that buying an enterprise license gets you more tested, stable releases and some support paths.

Proxmox works as a standalone hypervisor using only one node, but what really got my attention was that it supports clusters out of the box. So as my collection of servers grows, all I need to do is put them all on Proxmox and they'll work together. It also offers a decent UI to manage all of your Proxmox nodes from a single pane, which, if you remember, was a nice-to-have for me. Seeing the health of my entire system without needing to open a dozen dashboards can really save time.

Screenshot of the proxmox dashboard, showing the list of nodes and VMs, and a basic dashboard

If you're dipping your toes into a hypervisor, I always – always – recommend taking it for a test drive first. Do not simply dive in headfirst and start spinning up VMs; you must try it out and see how it works, and more importantly what you do when it inevitably fails. So for today, let's just spin up a simple virtual machine to play with.

Setting up Storage

Before you can begin, we need to set the stage with storage in Proxmox. Storage consists of a few different types that you can read about more fully here. You can access your storage settings by clicking on "Datacenter" and selecting "Storage", or, if you prefer a terminal, by editing the file /etc/pve/storage.cfg.

Essentially, when setting up "storage" in Proxmox you are defining different locations on your disk, on other disks, or on network shares. For each of these storages, you say what types of items will be stored there: VM/container disks, images (like ISOs), templates for VMs, and a few others.

Local Storage

Proxmox comes out of the box with two pre-configured storages: local and local-lvm. They are similar except for one key concept – local-lvm is thinly provisioned, meaning disks written to local-lvm only take up the space that has actually been written to them. local is not thinly provisioned, meaning if you reserve 20GB of drive space, it takes up 20GB. On local-lvm, its size would be zero until you start writing to it.

So, when I'm creating new drives for my virtual machines like we will below, I like to use thin provisioning. There is a very small tradeoff in write speed, but it's negligible. The real risk is that you can over-provision your drive space, which can be a problem. For this guide, just take my advice and do not create disks with more total size than you can actually back. There will be future guides on how to... undo the damage if you do over-provision your drive and run out of space.

Network Storage

Now, if you're like me, you already have some big data array with all of your data somewhere else, like on Unraid. In Proxmox, hooking into this data can be achieved multiple ways. Out of the box, it's simplest to use an NFS or SMB share from your data store and add it as a storage type in Proxmox. I like to create a separate area within my datastore and set it up as a storage option in Proxmox. I'm not going to go into all of the details on this yet, but check out the Proxmox wiki on setting up NFS or SMB/CIFS shares.

This is how Proxmox will know where to find the ISO files to install your operating systems: you'll need to place the ISO files into a storage that you've configured to hold ISO images. Once you've set one of your storages for that, if you look at the disk/share there will be a template directory created with an iso folder inside. Copy an ISO file into that directory and Proxmox will be able to use it to install an operating system onto your new VM.
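For reference, entries in /etc/pve/storage.cfg end up looking roughly like this – the storage name, server IP, and export path below are made-up examples, and the UI writes this file for you anyway:

nfs: unraid-isos
        server 192.168.1.10
        export /mnt/user/isos
        path /mnt/pve/unraid-isos
        content iso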

Creating a Virtual Machine

Today let's focus on the simplest task in Proxmox: setting up a simple VM. As a classic, I'm going to install Windows XP for pure nostalgia, but you can install any OS you like. (The caveat is that Windows 11 requires a bit more setup because of its TPM requirement, but you can still make it work.) So let's get started installing a virtual machine.

Start by clicking "Create VM".

Screenshot of the top right of the UI, highlighting the button "Create VM"

This will bring up the Create VM wizard.

Screenshot of the first page of the wizard, showing the options to select the node, the ID, and the name

The first screen is fairly straightforward. Node will be autofilled (I blocked mine out for security), ID should be autogenerated, and Name will be up to you to decide. I recommend coming up with a naming convention, deciding on upper case vs lower case, etc.

Screenshot of the second page, asking where to find the ISO file to install

The second screen is where we set up what OS options we want to run. For this one I'm selecting that I want to use an ISO file. I select the storage (set up in the previous step here) where I keep my ISOs. I then select the ISO file from the list. Under Guest options I select the values that match the type of operating system I am installing.

Screenshot of third page, System.  Selecting the default firmware and BIOS

Third is “System”, where we can select different types of BIOS and virtual machine types. For now, we're going to leave all of this default – but you may find yourself back in here if you decide to get into things like hardware/GPU passthrough (another topic for another day).

Screenshot of fourth page, Disks.  Setting up and selecting which disks to use

Fourth is "Disks". Here you will create a virtual disk to use with your new VM. Everything is pretty straightforward: the bus/device type you want to mimic (defaults to IDE, but you can choose SATA or SCSI – remember, this has nothing to do with your physical connection), then the storage (remember from above, this is the storage you set up and where you'll want the disk to live), then the size. Disks can be expanded later, but the process is a bit arduous. If you're concerned about the space, it's best just to make sure it has enough here and now.

Screenshot of the CPU tab, asking how many sockets and cores

Next up is "CPU", where you simply say how many sockets and cores the virtual CPU will have. Personally I've had bad luck setting anything but 1 socket. Cores can be whatever your machine can handle; for this small test VM I'll simply set 2 cores.

Screenshot of Memory tab, asking how much RAM to use

Memory is pretty straightforward: how much RAM do you want to allocate to your machine? From personal experience, you can go over your machine's physical RAM, but the paging will cause massive slowdowns. Just allocate whatever you safely can.

Screenshot of Network tab

On the last configuration page, you will need to set up what networking you want. I won't go into the depths of Linux networking, but essentially Proxmox has already set up a bridge for you, usually called vmbr0 or something similar. This will be connected to your main network. If you use VLANs, you will supply a VLAN tag there, but most of you will simply leave it empty. Finally, you can select the model of the virtual network card, which is helpful if your OS has drivers pre-installed for that model.

Click Next and you'll be greeted with a confirmation page for all of your settings. Once you're happy with them, click “Finish” and your VM will be provisioned!

Screenshot of the VM overview, looking at empty graphs of our new VM

You'll be able to see your new VM on the left side, under your node now. Everything will be empty. When ready click “Start” in the top right, and then you can view your VM by clicking “Console”. Note you may have to be quick if you need to boot to the “CD”.

That's it! That's a virtual machine running on your own hardware using Proxmox. I'll dive in deeper next time to show how I run many virtual machines, but this is, at its core, the base of how I run most of my services. Try out a few, get a feel for how it works.

Coming up next...

Next time we'll touch on more features – backing up, restoring, and more advanced setups. Way later we'll get into very advanced topics allowing your VMs to access hardware like graphics cards which opens up a lot of new doors and exciting items. (AI/LLMs... other interesting project ideas? This here is creating that foundation!)

Proxmox also supports LXCs, or Linux Containers, another branch of the container space that I'll need to take some time to fully dive into here.

Proxmox has now been the foundational layer of my entire homelab for 5 years and has served me well. There is a learning curve, but it's an achievable one. If you're starting to feel like you need to host more services than one machine can handle, then it might be time to start looking at solutions like Proxmox to handle the workloads you're asking of it.

Have fun out there, and happy tinkering!

Hello again folks, it's time for an update on my LED transit map! It's been a couple of weeks, and I've made a lot of progress on my LED map of Sound Transit's Link light rail: both the 1 and 2 Lines are now operational on my map!

If you didn't read the first portion, most of it has changed, but you can still read it here. The goal was simple: I wanted to show a live service map of Sound Transit's light rail on LED strips, in a form I could hang on a wall. How did that go?

TL;DR...

Image of the finished transit map

Please ignore the very messy wire-filled room, I still need to mount it.

It's working!

OneBusAway

So, some things have changed since my last update. I was really happy to show the GTFS feed in my last update, only to find out soon after that Sound Transit is one of the few providers that doesn't offer GTFS – at least not directly. They use OneBusAway, another relatively open standard for transit data.

OneBusAway does offer a full RESTful API, and in my case there's a simple Python package for it. Their docs are nicely formatted here, and I referenced them quite a bit in my code. My overall strategy had to change, and I double-checked the approach with their contact to make sure I wasn't calling their API too much. Essentially:

On startup:

  • Call their Routes API, which gives me all of the routes
  • Check against my config to see which routes I care about
  • For each route I care about, grab all of the trips for those routes

Then I start a loop, where right now every 8 seconds I call their status API.

  • For each route I know about, call their Vehicles For Route API
    • Vehicles are tied to trips, which will tell me the next stop.
    • If I cannot find the trip (newly added, maybe a stopgap train for example), call the Trips API to load the trip into my globally known Trips object.
    • Go to my config and grab the best LED position for that vehicle.

vehicles_by_route = get_latest_feed()
vehicles_set_this_iteration = {}
for route_short_name in led_config:
    vehicles = vehicles_by_route.get(route_short_name)

    for vehicle_item in vehicles:
        vehicle = vehicle_item.get('vehicle')
        route: Route = vehicle_item.get('route')
        trip = vehicle_item.get('trip')
        next_stop_id = vehicle.next_stop

        stop = get_route_stop_config(route_short_name, trip.direction, next_stop_id)
        # Error handling is important folks, and makes it easier to debug
        if stop is None:
            print('WARN Stop {} was not found in config, direction {}'.format(next_stop_id, trip.direction))
            continue

A less naive way to update lights

So in my first version, I had a very naive approach to setting lights. On every iteration:

  • Clear the strip
  • Set the stations
  • Set the vehicles (overriding stations if there is a vehicle stopped there)

This caused a lot of slowdown, and the LED strips actually got overwhelmed by all of the changes, with signals getting confused and weird lighting happening. This made me rethink my approach, and if you read the code above you're probably wondering why I needed to save the "vehicles set this iteration".

Rather than setting the light right there in that iteration, I now (each iteration) save which vehicles are supposed to change this iteration. Now I have a discrete list of only the items that need to change.

I then also added to my set_single_led method a tracker that keeps the last known color for each LED. It's a simple dictionary of the LED code and the last color, and since I force all LED changes to go through that method (encapsulation, remember, is a very good thing even in small side projects like this), I always know what colors are currently on the strip.

So my loop now compares the last color that was set to what the next color should be, and it updates the LED only if it actually needs to. So instead of (length of strip) x 2 updates per iteration, it's only the LEDs that actually changed: vehicles that moved plus any stations that are now empty. Much, much better, and the colors are now correct.
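Here is a rough sketch of that diffing idea; my real method takes more parameters, but the concept is just a dictionary guarding the write:

last_colors = {}  # LED index -> last (r, g, b) written to the strip

def set_single_led(strip, index, color):
    """Only push an update to the strip if this LED's color actually changed."""
    if last_colors.get(index) == color:
        return  # already showing this color, skip the redundant write
    strip[index] = color
    last_colors[index] = color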

Speaking of correct colors

LEDs are very fickle when it comes to colors. I had to decide what brightness to set my strip to on instantiation. By default it's 1, full brightness, which honestly makes it hard to look at. I kept tuning it down and frankly landed at 0.1, 10% brightness. It's still very bright, even in a bright room. Here's the rub though: at that "dark" a setting, standard RGB color codes looked very wrong. Green would come out much stronger than any other color; my station "yellow" was more of a lime green. I had to play with the colors quite a bit, and to make it look right, the station yellow you see on the strip is actually #7F1200, closer to a scarlet or maroon.

So if you're working on light strips in the future, know that just because you have a hex color you like doesn't mean it's going to look right on the strip. Plan to play with your colors.

More accurate vehicle locations

In my first version I used an algorithm that found the percentage of the distance a train had traveled between stations, then picked the appropriate light to turn on. This was a good first approach, but it ended up making it feel like trains rushed through some portions, and it wasn't very accurate through curves in the line.

After some trial and error, I decided on a different approach: use GPS instead. The OneBusAway API does return lat/lon coordinates for each vehicle. What worked best was to do the mapping entirely manually, which I dreaded. I very much wanted to write an algorithm to work out the locations for me, but I realized that this portion of the project was less programming and more artistic. To have it look the way I wanted, I would need to map out every single LED and tie it to exactly where I wanted trains to be.

Distances like Tukwila to Rainier Beach are about 6 miles and 10 minutes, while Symphony to Westlake is only a couple of blocks, but still 2 minutes. I had to manually go station by station and decide how many lights I wanted each segment to be.

My approach was pretty simple: find the timetables and gather how many minutes there are between each station. I still had lights to spare, so as a buffer I added one to each segment. With that I had about 130 of my total 160 lights used, just from stations and the minutes between stops. I then went through again and added lights where I felt it was necessary. Stadium to International District gets an additional one because the line is going to split off there toward Bellevue. This section has a slowdown, so add an additional light there. That section is at-grade, so it goes slower than other sections. Overall, I came out with the full light strip being used.

I then used a simple GeoJSON tool to map out each LED bounding box between each station. I simply drew boxes around the line; if a segment has 7 LEDs between stations, I drew seven boxes. I then copied the GeoJSON into my config. Here's a sample:

            {
                "code": "40_532",
                "name": "Pioneer Square",
                "lat": 47.603199,
                "lon": -122.331581,
                "led": "1:229",
                "intermediaries": {
                  "type": "FeatureCollection",
                  "features": [
                    {
                      "type": "Feature",
                      "led": "1:228",
                      "geometry": {
                        "coordinates": [
                          [
                            -122.33415368888537,
                            47.60406523564518
                          ],
                          [
                            -122.33252440110098,
                            47.60230488328597
                          ],
                          [
                            -122.33028658414389,
                            47.603264481202274
                          ],
                          [
                            -122.33193550190164,
                            47.60505788942638
                          ],
                          [
                            -122.33414387389868,
                            47.604058617890246
                          ]
                        ],
                        "type": "LineString"
                      }
                    },
                    {
                      "type": "Feature",
                      "led": "1:227",
                      "geometry": {
                        "coordinates": [
                          [
                            -122.33581241836166,
                            47.60585199878702
                          ],
                          [
                            -122.334153685617,
                            47.604058617788695
                          ],
                          [
                            -122.33193549863327,
                            47.60504465406461
                          ],
                          [
                            -122.33355497143121,
                            47.60675859235138
                          ],
                          [
                            -122.33584186332143,
                            47.60585199878702
                          ]
                        ],
                        "type": "LineString"
                      }
                    },
                    {
                      "type": "Feature",
                      "led": "1:226",
                      "geometry": {
                        "coordinates": [
                          [
                            -122.3370883666264,
                            47.607327686853296
                          ],
                          [
                            -122.33490943958938,
                            47.60823425484057
                          ],
                          [
                            -122.33355497143121,
                            47.60676520976472
                          ],
                          [
                            -122.33582223334808,
                            47.60585861631506
                          ],
                          [
                            -122.33707855163973,
                            47.60728798278629
                          ]
                        ],
                        "type": "LineString"
                      }
                    }
                  ]
                }
            },

Yes, it was a lot. I did not enjoy doing it manually, but it gave the best results on the strip. Overall the file is currently at 10,934 lines.

Each stop has its stop code, which is what the API says a vehicle is en route to. I then determine “is the vehicle at the stop”, and if not, “given its location, which bounding box is it in?”. I grab my LED code from the config, and move on.
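
In code, that “which bounding box” question is just a point-in-polygon test against each intermediary's coordinates. A rough sketch of the idea using shapely (my actual helper may differ – note that GeoJSON coordinate order is lon, lat):

from shapely.geometry import Point, Polygon

def led_for_position(stop_config, lat, lon):
    """Return the LED code of whichever intermediary box contains the vehicle, or None."""
    point = Point(lon, lat)                            # GeoJSON order is (lon, lat)
    for feature in stop_config["intermediaries"]["features"]:
        ring = feature["geometry"]["coordinates"]      # the hand-drawn box around the track
        if Polygon(ring).contains(point):
            return feature["led"]
    return None                                        # not between stops; treat as at the station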

But wait, there's more!

You may have asked, but Rob, what's with the red stations? There are in fact a few red stations on there. I didn't want to build this for today, I wanted to build this for the next several years. Those red stations are our future transit lines due to open over the next few years, and I wanted to capture them in my map.

To the bottom of the 1 line is the Federal Way Extension due open in 2026.

At the top right of the 2 line is the Downtown Redmond Extension due open in 2025.

The bridge is not lit up (I'm working on my very tiny soldering skills right now) and will probably (just me guessing) connect the system at the end of next year, maybe very early 2026.

Finally, the little dot on the 1 line is the future 130th Street Station, due open in 2026.

For a transit nerd like me it's a very fun time!

That's my project! It's been a project well outside my normal scope, and I've had both a very fun and a very frustrating time building it. Developers, don't be afraid to branch out into electrical. Just be safe about it, and remember: never skip the fuses!

I'll have one more small update showing off the mounted version once I've squashed a couple of minor bugs, and I want to make a timelapse for you all to see it running – it's definitely a fun piece!

#transit #trains #python #sideprojects #raspberrypi #led

Seattle is celebrating its newest extension to our light rail system this upcoming Friday. In just 3 days, 4 new stations will open, connecting the northern suburb of Lynnwood to Link's 1 line, which runs down to Seattle and further south to SeaTac.

Now, many of you know: I like trains. That's right, I'm 100% a true train nerd – ever since I was a kid, I just thought they were neat. Growing up in Iowa, though, we didn't really have any. So, moving to Washington and watching our light rail get built out has been a ton of fun for me.

I like trains meme

So, I was sitting at home and realized I had a spare string of programmable LED lights. I was probably watching a video of the new expansion, and it popped into my head, I want to build a real-time transit map of our light rail system.

General Approach

To build a realtime transit map, I knew I'd be dealing with my stack of Raspberry Pis and LED strips. I could run pretty much any language I wanted on the Pi, but I decided on Python. I'm still relatively new to Python and this would shore up some of my learning. I've only written scripts on my computer with it, so having an actual “deployed” application would be a step up for me.

For now, I will be doing a simple 10-light strip as a proof of concept. If I'm happy with it, I'll extend it out to the entire line. I also want to build it in a way that not only works for our current extension, but works for the upcoming extensions as well. (The 2 line should connect to the 1 line hopefully late next year, and I don't want to have to rebuild the whole thing.)

Finally, I'm not planning on labeling the strip. I think it would look better on the wall as more of an art project than a map in this case, so I want the LEDs to do the talking for me. So to do that I want to have LEDs that are constantly on for each station, with a different color LED indicating the trains on the line.

Gathering Data

So first step in this problem is the data itself. Where do we get data on transit networks? Well, funny enough, this isn't my first transit project. (Shocking, right?).

Transit orgs publish their data to Google and other map providers using a standard called GTFS – the General Transit Feed Specification. This feed essentially has 2 main parts.

GTFS Static File

The first part is the Static File. This file contains data that is (mostly) static – Routes, Stops, Schedules, and Trips. If you think about your standard transit agency, these are mostly all standard and unchanging. (Unchanging in this case means you could pull the feed daily without much worry).

The file is delivered to you as a .zip containing several .txt files, which are honestly more like .csv files. These txt files are broken into their respective parts – routes, stops, trips – and describe every detail about the system. The 1 line, our light rail line, has a route in the routes file, there are many trips on that route, and those trips contain multiple stops. So as you learn how the data works, you start to see how it all interconnects. (If you're curious, the full spec is here.)

So after retrieving the static file and building that functionality, I built out several concrete classes for each item: Route, Trip, etc. My original approach was to load each item into memory – on each start of the application, pull down the static file, load the routes into memory, and go. Examining the data more, however, I realized the static file is small (30-40MB) but holds a lot of data. Those trip stops above? That's a list of stops for every bus/train/tram/streetcar in the system, for every trip, for every route. Turns out even my main development machine took 5 minutes to load it all into memory. That wasn't going to work on a tiny Raspberry Pi.

So, I went online to see if anyone had invented this wheel already – pre-loading GTFS data into some easier-to-query format – and wouldn't you know it, someone had. I found GTFSDB, which is exactly what I needed. GTFSDB, cleverly enough, loads the static file of the GTFS feed directly into a DB. In this case I chose SQLite because I didn't really want to host an entire database server for this; SQLite sitting on the Pi would be fine.

The tool worked extremely well: I simply pointed it at a database file, gave it the URL to the feed's zip, and within a few minutes I had a fully formed SQLite database of the entire feed.
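
Once it's loaded, getting data back out is plain SQL. Something like this (the table and column names follow the GTFS files – stops.txt becomes a stops table, and so on – but double-check them against what gtfsdb actually generated for you):

import sqlite3

conn = sqlite3.connect("gtfs.db")
row = conn.execute(
    "SELECT stop_name, stop_lat, stop_lon FROM stops WHERE stop_code = ?",
    ("538",),
).fetchone()
print(row)  # something like ('3rd Ave & Columbia St', 47.60..., -122.33...)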

GTFS Realtime File

The realtime file is the “right now” file for the transit feed. This is where we get data on vehicles and their current locations. This file is much smaller, usually only a couple of KB, and it's in protobuf format. For this there is a Python module called gtfs_realtime_pb2 which reads the protobuf format, and then, using another package, protobuf3_to_dict (the 3 is important for Python 3), it can all be read into a dictionary. Finally, we can use pandas to iterate through and pull out our vehicles.

    import requests
    import pandas as pd
    from google.transit import gtfs_realtime_pb2   # provided by the gtfs-realtime-bindings package
    from protobuf_to_dict import protobuf_to_dict  # provided by the protobuf3-to-dict package
    from flatten_json import flatten                # or your own helper that flattens nested dicts

    # Download the realtime feed and parse the protobuf payload
    feed = gtfs_realtime_pb2.FeedMessage()
    response = requests.get(realtime_url, allow_redirects=True)
    feed.ParseFromString(response.content)

    # Convert the protobuf message to a plain dict, then flatten each entity
    # so pandas gives us one row per vehicle
    feed_dict = protobuf_to_dict(feed)

    df = pd.DataFrame(flatten(record, '.')
        for record in feed_dict['entity'])

    vehicles_by_route = {}
    for index, row in df.iterrows():
        vehicle = Vehicle(row)
At this point I have a SQLite database with all of the static data, and with this function I can get the current realtime info from the transit agency – it's finally time to start writing my own code!

Light Strip configuration

The configuration took probably the most thought – how do I lay out my light strips in a way that is dynamic, but not so much that I can't be precise about it? My first approach was to do just approximations. If there are 23 stations and 100 lights in a strip, just take round(100/23) for the stations and then move the trains between them. This would certainly work, and it would even work when Sound Transit opens the Federal Way extension soon and our station count becomes 24, and when 130th Street opens and it becomes 25. However, it does not support the 2 line.

The 2 line will merge into the 1 line at International District and head all the way up to Northgate, which means at some point on the strip a 2 line train heading south from ID has to take a right and seamlessly merge onto a second horizontal light strip. No amount of math or approximation will make that work; I need to define a config file.

So I landed on this rough schema:

{
    "E Line": [{
        "direction": 0,
        "stops": [
            {
                "code": 538,
                "led": "1:0"
            },
            {
                "code": 558,
                "led": "1:2",
                "loading": [
                    {
                        "led": "1:1",
                        "percentage": 1
                    }
                ]
            },
            {
                "code": 575,
                "led": "1:4",
                "loading": [
                    {
                        "led": "1:3",
                        "percentage": 1
                    }
                ]
            },

This schema has a few basic components. First it defines what line we're paying attention to. For testing, that's the King County Metro RapidRide E Line – good service, every 5-10 minutes or so. Inside that is an array: GTFS splits trips into direction: 0 and direction: 1 to designate which direction on the line a trip is traveling (more on how I'm handling that later). Then there are the stops, which map the stops in the feed to the LEDs we care about.

The code is a short code from the GTFS feed. 538 is a stop at 3rd Ave & Columbia St in Seattle.
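
To give a feel for how this config gets used later, looking up a stop's LED is just a walk through that JSON. A simplified sketch of what a helper like get_stop_config_by_stop_code ends up doing (the file name here is illustrative):

import json

with open("led_config.json") as f:   # the schema shown above
    led_config = json.load(f)

def get_stop_config_by_stop_code(route_short_name, direction_id, stop_code):
    """Return the stop entry (and its LED) for this route/direction/stop code, or None."""
    for direction in led_config.get(route_short_name, []):
        if direction["direction"] != direction_id:
            continue
        for stop in direction["stops"]:
            if stop["code"] == stop_code:
                return stop
    return None

print(get_stop_config_by_stop_code("E Line", 0, 538))  # -> {'code': 538, 'led': '1:0'}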

Inter-stop algorithm

“loading” is where things get really interesting. So, if I can't use an approximation between stops, then I need to define each LED in some way between each stop. My original plan was to use bounding boxes on a map, from this lat/lon to this lat/lon make a box, if the vehicle's lat/lon is in that box, light up this pixel. I may still do that, but I realized that it is much simpler if I just do percentages.

We have two lat/lon coordinates: the stop before, and the stop the vehicle is currently moving towards. For now, assume a straight line between them. Then we have a lat/lon for the vehicle itself. Given that point, calculate roughly what percentage of the way it has traveled between the two stops. With that percentage, pick the matching entry in the “loading” block and return its LED.

In this case, they are all at percentage: 1, 100%. I only have 10 LEDs and I'm using them sparingly, but for the next one I could have say, 5 LEDs between each station, and then each light would be 0.2, 0.4, 0.6, 0.8, and 1. Then each one should light up as it moves.
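
One way to turn that percentage into a light – this is my reading of the idea; the exact comparison inside the find_largest_object call in the loop below may differ – is to treat each loading entry as covering travel up to its percentage value:

def led_for_progress(loading, percentage):
    """Pick the loading entry whose threshold the vehicle hasn't passed yet."""
    for entry in sorted(loading, key=lambda e: e["percentage"]):
        if percentage <= entry["percentage"]:
            return entry
    return None

# Five evenly spaced LEDs between two stations:
loading = [{"led": f"1:{i}", "percentage": p}
           for i, p in enumerate((0.2, 0.4, 0.6, 0.8, 1.0), start=3)]
print(led_for_progress(loading, 0.55))  # -> the 0.6 entry, LED "1:5"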

This should hold me over for the proof of concept. There are a couple more approaches I want to look into, but this will be good enough for now.

The Primary Loop

The entire meat of the program exists in a large loop, looping every 10 seconds to retrieve the latest info. In there, we grab the GTFS realtime feed, parse it, and grab each vehicle. If the vehicle is on a trip that belongs to one of the lines in our config file, we process it. The main code is:

while True:

    vehicles_by_route = get_latest_feed()

    for route_short_name in led_config:
        route_config = led_config.get(route_short_name)
        vehicles = vehicles_by_route.get(route_short_name)
        route_stops = get_all_route_stops(route_short_name)

        clear_lights()
        for route_stop in route_stops:
            set_single_led(route_stop.get('led'), LightStatus.STATION)

        for vehicle_item in vehicles:
            vehicle: Vehicle = vehicle_item.get('vehicle')
            route: Route = vehicle_item.get('route')
            stop: Stop = route.stops.get(vehicle.stop_id)
            stop_bounding_area = BoundingArea.FromPoint(stop.latitude, stop.longitude, stop_radius)
            vehicle_is_at_stop = stop_bounding_area.contains(vehicle.latitude, vehicle.longitude)
            stop_config = get_stop_config_by_stop_code(route.short_name, vehicle.direction_id, stop.code)
            label = 'is at' if vehicle_is_at_stop else 'is heading to'
            if stop_config is not None:
                print('Vehicle {} {} stop {}'.format(vehicle.label, label, stop.name))
                if vehicle_is_at_stop:
                    set_single_led(stop_config.get('led'), LightStatus.OCCUPIED)
                else:
                    # Calculate the distance from the last stop to this one
                    prev_stop_config = get_prev_stop_config_by_current_stop_code(route.short_name, vehicle.direction_id, stop.code)
                    if prev_stop_config is None:
                        continue
                    prev_stop = get_stop_by_code(prev_stop_config.get('code'))
                    prev_bounding_area = BoundingArea.FromPoint(prev_stop.latitude, prev_stop.longitude, stop_radius)
                    percentage = stop_bounding_area.calculate_percentage(prev_bounding_area, (vehicle.latitude, vehicle.longitude))
                    # We know we're not at the stop, now just figure out which light to light up
                    led = find_largest_object(stop_config.get('loading'), percentage)
                    if led is not None:
                        set_single_led(led.get('led'), LightStatus.OCCUPIED)
    time.sleep(loop_sleep)

Let's step through that piece by piece.

For each route in our config, get the vehicles on that route, its stops, and the config for that route. Then, clear our light strip. This could be cleaned up later for sure – I could load those up front instead of on every loop – but we're in proof of concept land.

    vehicles_by_route = get_latest_feed()

    for route_short_name in led_config:
        route_config = led_config.get(route_short_name)
        vehicles = vehicles_by_route.get(route_short_name)
        route_stops = get_all_route_stops(route_short_name)

        clear_lights()

For each stop in all of the route's stops, set the LEDs for the station colors. In my case it's a gold color.

        for route_stop in route_stops:
            set_single_led(route_stop.get('led'), LightStatus.STATION)

For each vehicle in the feed, get the route the vehicle is on and the stop it's at. Then get the rough area around that stop (a radius of about half a city block); that will be our guide for whether a vehicle is actually at a stop. (The feed doesn't tell you a vehicle has made it to a stop, it only tells you what the next stop is, so we have to do some lifting here.)

Load the config for the stop and this route from the original JSON; this tells us whether we care about this stop at all (maybe it's too far off our light strip, and we don't care).

        for vehicle_item in vehicles:
            vehicle: Vehicle = vehicle_item.get('vehicle')
            route: Route = vehicle_item.get('route')
            stop: Stop = route.stops.get(vehicle.stop_id)
            stop_bounding_area = BoundingArea.FromPoint(stop.latitude, stop.longitude, stop_radius)
            vehicle_is_at_stop = stop_bounding_area.contains(vehicle.latitude, vehicle.longitude)
            stop_config = get_stop_config_by_stop_code(route.short_name, vehicle.direction_id, stop.code)

If the vehicle is at the stop (in our bounding area), set the LED associated with that stop from the station color to the occupied color.

Otherwise, find the previous stop (which I do by looking in my config rather than the trip – maybe the last stop wasn't on our strip). If we don't have it in our config, continue on; it's before our strip.

If we do have it, get the previous stop from the config, grab its bounding area, and then run the percentage-based algorithm between the two areas.

We then find the largest config we can for the inter-station light's travel percentage, and return that LED.

Finally, if that found LED exists, we set that LED to the occupied state.
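
For completeness, here's roughly the shape of the BoundingArea helper the loop leans on. This is a simplified sketch – a circle around a point plus a straight-line progress estimate – not necessarily my exact implementation:

import math

class BoundingArea:
    """A circle of radius_m meters around a lat/lon point."""

    def __init__(self, latitude, longitude, radius_m):
        self.latitude, self.longitude, self.radius_m = latitude, longitude, radius_m

    @classmethod
    def FromPoint(cls, latitude, longitude, radius_m):
        return cls(latitude, longitude, radius_m)

    @staticmethod
    def _meters_between(lat1, lon1, lat2, lon2):
        # Equirectangular approximation; plenty accurate at city-block distances
        x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
        y = math.radians(lat2 - lat1)
        return math.hypot(x, y) * 6371000

    def contains(self, latitude, longitude):
        return self._meters_between(self.latitude, self.longitude, latitude, longitude) <= self.radius_m

    def calculate_percentage(self, prev_area, point):
        """Rough fraction of the previous-stop -> this-stop leg already covered."""
        lat, lon = point
        total = self._meters_between(prev_area.latitude, prev_area.longitude, self.latitude, self.longitude)
        done = self._meters_between(prev_area.latitude, prev_area.longitude, lat, lon)
        return min(done / total, 1.0) if total else 1.0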

LED configuration

Last part of this post (long, I know) is that we need to actually set the LEDs. This ended up being a bit trying for me. I started with a very old Raspberry Pi 1B. It turns out it's so old that even just installing Python dependencies took hours, so unfortunately that one wouldn't do. I then moved over to one of my Raspberry Pi Zeros, but there was an issue with its wifi, SSH kept dropping, and I just got frustrated using it. After 5 hours of trying to figure out why SSH keeps dropping, you realize it might be worth doing something else.

So I switched to my Libre Computer Renegade – and usually I really love these boards; even in this case I do. They're rock solid, they're fast, and they're cheap. Unfortunately, after setting everything up I learned that even though the underlying GPIO pins are exactly the same as the Pi's, there was a hard-coded check in the LED Python library blocking anything non-Pi. I didn't feel like building my own library for this project, and I had already sunk a weekend into this, so I headed over to my electronics shop.

Now, a quick plug: there's a small shop in Bellevue, WA on Northup Way called Vetco Electronics. This place is a nerd's dream. If you're in the Seattle area and interested in electronics, retro tech, side projects, or anything in between, go check them out.

I went over there and picked up a Raspberry Pi 5. I had decided at this point that I didn't care about the cost, I just wanted something that would work.

And it didn't. At first. Turns out the 5 changed so much of the underlying architecture around the pins that it rendered most libraries obsolete. I was about ready to give up on running the whole project on a Pi when I finally found this little Python library, neopixel_spi. Turns out the original neopixel project still needs to be overhauled for the Pi 5, but the SPI version (SPI being the protocol that actually talks to the chips, I believe) works perfectly. So finally, I could set some lights.

Setting the LED states is surprisingly easy – once you can light up one light at all, everything else falls into place. So much so that the actual code for setting a light is simply:

# The two imports needed: board for the SPI bus, neopixel_spi for the strip itself
import board
import neopixel_spi

# Define the strip on the SPI interface; I'm using 10 LEDs in my strip
pixels = neopixel_spi.NeoPixel_SPI(board.SPI(), 10)

# Set the color of LED 2 to a hex color value
pixels[2] = 0x00ff00
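
Everything else in the loop builds on that one operation. Clearing the strip and painting the station LEDs is just more of the same (the gold value here is a placeholder, not my exact station color):

pixels.fill(0x000000)                # clear_lights() boils down to this
for station_index in (0, 2, 4):      # station positions pulled from the config
    pixels[station_index] = 0xFFB300 # placeholder "station gold"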

Proof of concept

With this simple 10-LED strip I finally have a working PoC. This is real data of King County Metro's E Line, from Columbia to Bell, sped up about 10x. If you watch closely you'll see a bug I'm tracking down too, but we'll get there.

In the next update, I hope to have a more full size demo, with a full size lightstrip. I'm also waiting on an API key from Sound Transit to access their data, instead of KC metro's. (That's why I had to use the E line for now).

I also have some really fun LED strips on order (at Vetco of course), and we're working on finding a neat way to display it. Baby steps!

This one turned out to be a long one. I'll have more updates soon, and as the project matures I'll fill out more info. If you want a sneak peek at the code, it's not ready yet and there is no documentation, but it can be found here.

Have a good day, and don't forget if you're in Seattle, on Friday our light rail extends north!

#homelab #selfhosting #unraid #storagespaces

I debated what I wanted my first main topic to be, and after some internal back-and-forth I decided to start like I do with most of my projects: from the bottom up. So today I'm going to talk about how I manage my (apparently) large amounts of data.

Storing data has always been the root cause of all of my selfhosting and homelabbing. Starting back in the days when an mp3 would take an hour to download, I quickly learned that I did not want to download things multiple times – if you could even find them a second time. So, I learned to download frequently and often.

As we started using data more, we had piles of new things cropping up. Photos were going digital, even some short video files started to appear. I remember being very annoyed that we ran out of disk space, and were forced to choose which photos to keep and which ones to delete, and so I started looking into my first few ways to store larger amounts of data.

Today I'm at well over 100TB of data at home – mostly my own personal media, personal projects, and backups of backups – and I proudly host cloud drives for my family. Getting here, though, was a long process.

Getting started

As I said in the first post, I started with a rock solid Pentium 3 with 256MB of RAM way back in the day. At that time, sharing data between computers was as simple for me as setting up a share on Windows and sharing files stored on the main drive of the “server”. I ran Windows 2000 Server and ran shares for the entire house from that PC. (I actually remember installing games to the shares and running them – Age of Empires, for one – from other PCs via Windows shares.)

Unfortunately data scales, and so the more I stored the larger drives I needed. A massive 120 GB drive becomes a 320, and then 500. Each time I would carefully install the new drive by hooking up the giant ribbon cable, copying everything through Windows Explorer to the new drive, and praying that everything would complete well.

Eventually I outgrew what a single drive could handle, and I got my first external drive, an IOGear 1TB “Drive”, which was 2 500GB drives in a RAID 0 configuration. I'll explain RAID more below, but if you aren't aware this is essentially 2 drives working together to appear as if it's a single 1TB drive. At the time I wasn't aware what RAID was, or that I was actually using it, or how risky it was at all (especially since it was in an external drive that I took with me everywhere) – but I would learn over time.

This system of a server computer hosting a few drives worked well for years. Into college I continued to grow my data, to the point that I finally bought a 3TB internal Seagate drive – I was amazed at how much storage it could fit. Unfortunately, this is also when I learned my first lesson on the importance of data redundancy, as I woke up one morning and heard the stomach-churning noise: click click click whirrrr – click click click whirrrr.

That was the end of all of that data – about 7 years of memories and content, gone overnight. I was devastated, and the thought of paying well over two thousand dollars for professional recovery made a college student living on ramen queasy. So, I had to start all over.

Trial and Error, “pro-sumer” level storage.

I went through a few different solutions for a few years before I solidified my decision on what I would use. I'll go over a few of them now, and why I eventually ended up choosing Unraid. Foreshadowing...

Hardware RAIDs

The first option was probably the easiest solution to start with. I needed space, and I needed to store my data in a way that minimized data loss if a disk failed. That means combining multiple drives to act as one large drive, which is called an array.

I mentioned RAID 0 above. RAID is a hardware-level option for combining drives in different configurations so that the operating system above only sees the logical drive the controller presents. If you want to combine 3 drives into one super mega drive, that's RAID. You usually configure it in your motherboard's BIOS, and then maybe install a driver in Windows to show the new drive. To Windows, it looks like any other drive.

An example of a RAID configuration in BIOS

RAID has a few configurations. I'll talk about the core ones, but there's a complete list on the Wikipedia page.

  • RAID 0 – All data is striped. This means you are maximizing your usable storage. For each N drives, your data is split into N chunks and distributed across all of those drives. Like slicing a loaf of bread, the first slice goes to drive 1, the second slice to drive 2, and on and on until you're out of drives and it starts over at 1.

    • 0 maximizes storage, but your exposure to failure is extremely high. If any drive fails, you lose the entire array. There is no redundancy; the data is simply lost.
    • 0, however, is great for speed. Since you are pulling from N drives, you get roughly (slowest drive speed x N) throughput, capped only by what your RAID controller (motherboard) can handle. RAID 0 is a popular choice in gaming and professional server applications. (It's actually what I still use in my main gaming PC – I have 5 2TB SSDs in a RAID 0, and games load fast.)
  • RAID 1 – All data is mirrored. Storage size is not the priority with RAID 1, but rather redundancy. In our bread example, for each slice of bread it sees, it clones the slice and puts a copy on every drive. So for N drives there are now N loaves of bread. (Okay, the analogy is falling apart – I guess they have Star Trek replicator technology.)

    • 1 has maximum redundancy. If a drive fails you simply replace it, the data is copied from the surviving mirror to the new drive, and the full array is restored.
    • 1 has no real speed benefit, as it is still limited by an individual drive's speed.
    • 1 is the simplest approach to redundancy – you carry a complete second copy of every drive – which means that when building out your system, you must double your drive costs to carry that copy.
    • We are still limited by the size of a single drive, so no additional storage space is gained.
  • RAID 0+1 – Here we're starting to get a bit more clever, but not completely. Data is both striped and mirrored. This is where you want the space of a RAID 0, but also the redundancy of a RAID 1. Data is first sliced across the multiple drives, and then cloned onto the mirror. You get the benefits of 0, with the additional storage and speed, and the safety of 1 because you have the entire mirror – but unfortunately you also get the downsides of both.

    • 0+1 gives us the speed boost of 0, and the mirroring of 1 together
    • 0+1 mitigates some of the failure risk of 0: if one drive fails, you can recover. However, once a drive fails you are down to a single unmirrored copy, and your entire array depends on that remaining set holding together – probably under its most intense use to date – while the failed drive is replaced. If that remaining copy fails, the entire array is lost.
  • RAID 5 – Finally we arrive at something that might work for a real world use case. 5 introduces the concept of striping with parity. Parity is going to be a big word as we continue. The concept is that if you take a RAID 0 array, for each bit on the drive you can do a mathematical equation on it, and then at the end, record the result of that equation. If one drive fails, you simply reverse that equation to find out what the value of the failed drive's bit was and store it on the new drive. I'll explain this a bit more, but essentially your number of drives needed for a 1-drive failure scenario is no longer 2N (where N is the number of drives in your array) for 0+1, but now it's N+1. For N number of drives, with parity you only need one additional drive. (The exact implementation is a bit different in that it slivers the parity across multiple drives, but for now this explanation will work) So to break it down:

    • 5 gives us RAID-0 read speeds, with varying write speeds (due to the calculation of the parity).
    • 5 gives us full array redundancy, with only needing 1 extra drive
    • With 5, any one drive can fail, and the entire array can be rebuilt from the parity – however...
    • If any one drive fails, the entire array must go through a rebuild cycle to regenerate the failed drive, to bring the system back up to parity.

Again I'll dive into how that parity is calculated below, but that's the gist of it.

Okay, that was a lot, thanks for learning (a few of the standard) RAID types!

Hardware RAIDs are common, and I used one as my primary storage for a while, however they have a couple major flaws.

RAIDs are hardware, meaning that they work using your motherboard's RAID controller, or some other controller that you may install. This means that if that controller fails, then you are at the mercy of another controller working in the same way, or finding another identical controller. Portability to a new computer is near non-existent because of this.

RAIDs require that all of the drives be the exact same model. Not just the same size, the same model. Remember, the hardware is in control of slicing that data up, which means the interfaces and the way it stores that data must be exactly the same. This is a pretty severe limitation of RAID: you pretty much have to know exactly how you want to build your array before creating it.

But what if they don't make that drive anymore? What if they changed the drive and didn't tell anyone? What if you simply want to add more storage to your array? Well, then it's time to take it to the next level. To software RAIDs.

Windows Storage Spaces

Storage Spaces was my first foray into the world of software RAIDs. Software RAIDs are similar to hardware RAIDs in that they still combine disks, usually with the same basic algorithms, but since they're software based they have additional flexibility – chiefly that, being in software, you don't need to use the same model or even the same capacity of drive. You can add a 4TB drive to an array of 3TB drives and it will work fine. This was a huge deciding factor for me, because I wanted my array to grow with me.

Windows Storage Spaces is Microsoft's built-in approach to handling multiple disks and spreading data across them. I first heard about it through a friend at work, who recommended Storage Spaces. I decided to try it out, just for fun, and created a virtual machine with Windows Server 2012 on it. I attached 4 virtual “disks” to the virtual machine so I could play with the array. The drives weren't big, only 100MB each, but I was able to create a simple array through Windows' dialogs. The size was fair: it wasn't the full 400MB, but it was clearly keeping a parity copy, so it was about 320MB of total space.

I copied some data into the newly formed Storage Spaces drive, and then I proceeded to mess around. I shut down the VM, detached a drive, and watched what would happen. Storage Spaces saw the failed drive, and offered to remove it, or I could even still start it in a “degraded” state. I detached another drive and the array went offline. I could hot swap drives, I yanked drives while the VM's “power” was still on, everything was stable. I added drives of different sizes. When I was happy with my testing, I installed Server 2012 on a spare computer with a bunch of SATA ports, set up my storage spaces, and started the copy.

Screenshot of the GUI for configuring Storage Spaces, from the blog I'm pretty sure I read way back in the day to get started

Which took forever. I assumed it was just that my network was slow or something, but I was getting maybe 100Kbps. I had learned the biggest downfall of the software RAID: that it's software. Being in software means that parity calculation must also be done in software; there is no specialized hardware calculating the parity for you. Read speeds were fine off the drives, but writing to them became an arduous task. I stuck with Storage Spaces for a while, but it was clear that as long as I used it, I would have to trade write speed for flexibility.

Until...

Unraid

I had been hearing about Unraid on Reddit's /r/DataHoarder for a while. DataHoarder, I realized, was a totally real term that did apply to me: the need to retain and collect all data while the Delete key on the keyboard grows dusty and sad. Essentially we're squirrels, but for data.

Unraid is an entire operating system, meaning it isn't something you can enable on your existing computer; you will need a separate machine from your primary to run it.

Its primary feature is, of course, the data array. Unraid offers a software RAID that's similar to Storage Spaces in that it can use arbitrarily sized disks to create an array out of any disks you have lying around. It creates a “storage pool” using these disks, mimicking a RAID 0 environment, but with an important caveat: it doesn't stripe the data. Instead of slicing your data like a loaf of bread, it chooses one of your disks to put the entire “loaf” (or file) on. So when you look at an individual disk, you will see your whole file sitting there. The pool/array part is that your directories (folders for you Windows folks) are split, so one file may be on one disk while another file is on a completely separate disk. The operating system then uses some proprietary magic to create “shares” that combine all of these individual disks to look like one large cohesive drive.

Image of unraid's main screen, showing multiple drives

So that's storage – what about parity? Storage Spaces reserves a small portion of each attached drive to store parity bits from all the other drives. Unraid handles this differently, so let's talk about parity.

Unraid's Parity system

Parity for Unraid is the same basic function, but instead of reserving space on each drive, in Unraid you specify a separate drive – similar to the RAID 5 mentioned above – to be your parity drive. The caveat is that the parity drive must be at least as large as the largest drive in the array. Why that is will make sense in a second.

To calculate parity, on each write Unraid does basically the same thing as RAID 5 and Storage Spaces, in that it runs a calculation across the entire array to determine what the parity bit should be. If you have 4 drives plus 1 parity, that means it sums up the first 4 bits and calculates what the parity bit should be.

For you nerds following along, this calculation is an XOR across the drives. For everyone else, think of it like adding up the 0's and 1's and deciding whether the total is even or odd – an even total is a zero, an odd total is a one. So if one drive failed, all you would need to do is add up those numbers again, and using the parity you could tell what the missing value was. If you had a missing drive and the surviving values added up even, but the parity said the total was odd, then you know that missing bit must be a one. Some examples:

Drive 1   Drive 2   Drive 3   Drive 4   Parity
   0         0         0         0        0
   1         0         0         0        1
   1         1         0         0        0
   1         1         1         1        0

So when writing a file it will check the other bits at the same location on each drive, and then update the parity drive with the new value. This is why the parity drive must be the largest drive, it must have the capacity to store the entire array's parity bits. For the largest actual storage drive in the pool, there must be a corresponding parity bit to match up in case that largest drive fails.
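
If you'd rather see that as code than a table, the whole trick is a handful of lines of Python (a toy example, one bit position across four data drives):

from functools import reduce
from operator import xor

drives = [1, 1, 0, 0]            # the same bit position read from each data drive
parity = reduce(xor, drives)     # XOR them all together -> 0 (an even number of 1s)

# Say drive 2 dies. Rebuild its bit from the survivors plus the parity drive.
survivors = [drives[0], drives[2], drives[3]]
rebuilt = reduce(xor, survivors + [parity])
assert rebuilt == drives[1]      # the missing bit comes right back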

Double Parity

Unraid allows you to set up no parity (please don't do this), single parity like above, or double parity. This is actually supported across the other systems as well: RAID 6 is the same as RAID 5 but with an extra parity drive, and Storage Spaces also supports double parity.

Why do double parity? Well, let's think about how parity works. If a drive fails, your array has no redundancy left – at that moment, you have suffered the maximum number of drive failures it can handle. At the same time, you need to replace the failed disk, and at that point the rebuild process starts. The array rebuild consists of:

  • Reading every bit in sequential order, start to finish, on every drive
  • Calculating what bit should be on the new drive
  • Writing the new bit to that new drive

During a parity rebuild this operation will run all of your disks at 100%, likely for many hours (maybe even days, now that we're running 20+TB drives). Your machine will have 100% activity while also producing the most heat it ever will. In essence, these are prime conditions for another drive to fail. If you have another drive teetering on the edge of failure, this would be the time for it to go – and remember, it's at this moment that you are at your most vulnerable, with no extra redundancy.

This is why I recommend just biting the bullet and putting the extra drive into your cart. Yes, it's more money, but it'll save the extra stress and anxiety if the worst should happen.

Cache Drive

Unraid supports using a cache drive along with the main array. This is extraordinarily useful because of both the write-speed limitations of spinning hard disk drives and the overhead of calculating parity. (Remember that calculating parity means spinning up all of your drives at once, running those calculations, then writing to the drive the file is stored on along with updating the parity.) Using a cache drive like an SSD or NVMe drive means that you can write that new file at blazing fast speeds, and Unraid will save it to your full data array later. The tradeoff is that for this short time your file is unprotected by the array, but it allows you to move on to other things.

The process of moving files to the array is (cleverly enough) called the Mover in Unraid. For my own setup, I schedule the Mover at about midnight every night. Unraid takes any files on the cache drive and saves them to the array. In the case that my cache drive dies, I've lost at most one day's files.

Full Homelab Suite

Unraid is much more than just a data storage system. It's a full operating system with virtualization built in, so you can run virtual machines and Docker containers right from Unraid itself. This can be an amazing way to get started with homelabbing, by running your first applications right on Unraid. There's no need to run separate servers or anything – you of course can, but it's extremely easy to get your first applications running with only Unraid. Need Windows? Spin up a Windows VM. Want to run Plex? Use the Plex Docker image and directly hook in your media.

Comparisons

My ultimate decision to go with Unraid was a personal one. There are many other storage solutions out there that I could have chosen, more than I can document here, but this is how I viewed it.

Pros:

  • The array can be expanded, and I can use newer drives to let the system grow with me. Where I started with 3TB drives, I just installed my first 22TB drive into the array, all without needing to do a massive copy of all of my data to a new array.
  • Write speeds are better than Storage Spaces because of how parity is calculated.
  • If you add a cache drive, write speeds are much faster than Storage Spaces.
  • Unraid is software, so you aren't at the mercy of a hardware controller failing like with a standard RAID. If your motherboard dies, you plug the drives into a new computer and start Unraid. The configuration itself is stored on the array.
  • Unraid has an amazing feature suite of additional things you can do with it. It deserves its own post, but if you're getting started you can:
    • Run VMs
    • Run Docker images
    • Download a shocking amount of plugins

Cons:

  • Not a true RAID array, so no performance gains
  • By keeping files separate rather than sliced, you are limited to the speed of your drive
  • A full OS, so this will truly only work as network attached storage, not on a primary PC
  • Proprietary. While relatively inexpensive (currently $119 for a forever license), it is not open source.
  • Needs to run regular jobs to maintain parity:
    • Parity check (on your schedule preference, I run monthly)
    • Mover (and while the Mover hasn't run, your data is at risk)

Summing up

Overall, I chose Unraid for its extreme flexibility, even with its slight performance hits. For me, this is my primary network storage, where I store large files that may only be accessed once in a while. If I'm storing things there, I can live with a write operation taking a bit longer than lightning speed. With the cache drive on top I can get the maximum my network allows, then set it and forget it, with the Mover picking up my writes later.

There are many different options out there. One I looked into but didn't have the space to write about here is TrueNAS – I know a lot of people like it. I've toyed with ZFS a bit on Proxmox too; ultimately it's going to come down to your use case. There are dozens of comparisons online if you're curious about the options, but I hope you see the benefits of Unraid, and why I personally chose it.

It is a great system to start with if you're curious about getting into homelabbing/self hosting, as it provides everything out of the box.

This ended up being much longer than I anticipated, but I wanted to give a full idea of how I manage storage. If you read this far, thank you! This was a lot to write so we'll see going forward how my other posts turn out. See you next time!

Hello! My name is Robert Clabough, and welcome to my newly-minted blog.

I've never been one to write, or to put myself out there online either, but in an effort to push myself out of my comfort zone, here I am.

Today, I want to share a bit about myself and what you can expect from this blog. (tldr – a lot of very nerdy things)

I'm a software engineer in the Seattle/Bellevue area, and I've been living here about 8 years now. I've been told I have a more unusual background than most when it comes to working in tech, and from what my peers have told me, others may be interested in hearing my experiences as well. I don't want to go too deep into my work history – I'm sure you don't want to read a resume today – but I will give some quick background on me, my work, and what I do for fun today.

I started off interning for the State of Iowa many years ago, and learned some great skills there. However, my career really started with Robert Half consulting, and I contracted for about 7 solid years at the beginning of my career. I'm going to do a whole post at some point about contracting (consulting, as they like to call it), but ultimately I had a very positive experience. It taught me how to be quick and agile – how to be dropped into a project that's halfway over and start producing work within not weeks, not days, but hours. I appreciate the time I put in there and the people who helped me succeed. (There are 2 people at Robert Half who I know helped my career more than they realize.)

Robert Half was a very unique journey, bouncing between 4-person startups out of a garage and large enterprises like Expedia. One month I was running games at a Google conference in London, and the next I was the lead engineer for an education startup. I learned how different companies use code in very different ways, from social media to fintech to pure behind-the-scenes infrastructure. I'm proud of the vastly different experiences I've had, and I believe they've allowed me to see tech through a different lens than many engineers.

This blog won't only be about work, however, in fact even more it may be about my silly side projects.

I have a few hobbies, the topmost is what is known as “Homelabbing” or “self-hosting” – running services in your home. Hosting your own Netflix, hosting your own email, recipes, chat, video, video game streaming, you name it, I've probably tried it. Essentially if it has a monthly subscription fee – I probably host it myself.

This is something I've always been a fan of. Even as a teenager I was hosting Windows XP Media Center Edition. It was a bit different back then, but does anyone else remember those precious few years when Microsoft gave you a legitimate 10-foot interface, you could buy a remote from Logitech, and watch all of your .wmv files right on your TV? I went to Best Buy, bought an ATI card with S-Video, hooked it right up to my 24” CRT TV, and was it cool. Blurry, 120x240 episodes of South Park that I could play whenever I wanted.

Screenshot of Windows XP Media Center Edition, showing the tiles for TV, Movies, and Pictures

Since then, my presence in the homelab space has grown. I'm not running a Pentium 3 with the S-Video GPU anymore, unfortunately, but I think it's still sitting somewhere in my garage. Now I'm told I run things more extravagantly than the average homelabber.

My current setup is a full kubernetes cluster, running about 150 pods at any given time. I run on my own hardware, with about 5 primary compute nodes, a data server, several switches, and a very patient wife. After all, as the saying goes

Why pay monthly compute costs to AWS when I could instead spend 10 times that amount on hardware?

Lately I've been toying with AI: running compute pods with GPUs attached to them, running LLMs, image generation, voice simulation, and more – all things I look forward to writing blog posts about. I've managed to build my own little selfhosted ChatGPT, I have a Firefox extension that calls my own GPT to do things like summarize pages, I'm working on a shell extension for Gnome that can interact directly with it, and – a very fun one – every morning I have EDI, the ship's computer from Mass Effect, read out my agenda for the day.

My servers, in a 26U rack

That's really what I hope this blog will be about, showing what I'm tinkering with, what I'm trying to get running. I think it's a ton of fun, seeing some new technology big tech is running with and excited about, and trying to get it to run on my bundle of spare parts I have in the back closet.

My next few blogs are already thought out, but maybe a new project will grab my eye in the meantime. I don't expect everyone to find interest here, but if you're interested in how I run Kubernetes at home, how I monitor the services, how I provide alternatives to iCloud/OneDrive for my family, and more, then I hope you will find something here. I hope to provide insight into my thinking process and how I troubleshoot issues that come up, and as we go I'll make sure to share tricks that I've found helpful.

If you're on the Fediverse, then my blog is subscribe-able! You can subscribe at @robert@blog.clabough.tech from any service that you prefer. If you're not on the Fediverse, here's a quick video explaining what it is and I will probably have a blog post diving more in depth as well, as this is now my 3rd fediverse service that I'm hosting.

Thank you for stopping by! I'm excited to share my experiences and path through not only homelabbing and hobbies, but also my career and software engineering as a whole.