My UPS is dumb. No USB, no network card, no NUT support, just a battery in a box that buys my gear a few minutes when the mains drops. The usual advice is "buy a smart UPS." I had a $2 microcontroller in a drawer instead. This is how a board that costs less than a coffee taught a server to put itself to sleep when the power goes out and wake itself back up when it returns, with no signal from the UPS at all.


The problem, plainly.

A smart UPS tells the machine "I'm on battery, you have N minutes." A dumb UPS tells you nothing. So the server has no idea an outage is happening; it just runs on battery until the battery dies and everything drops hard. For a box running a TrueNAS VM and a handful of containers, a hard drop is the thing you most want to avoid.

Two things plugged into the UPS: a Lenovo M910 SFF running Proxmox, and the router. I wanted exactly two behaviours:

  • When the power goes out, the server gracefully suspends before the battery is in danger.
  • When the power comes back, the server wakes itself: no button, no keyboard, no me.

The catch: with a dumb UPS there is no outage signal to react to. You have to infer the power state. And that inference is the whole trick.

The trick: infer power from a board that isn't protected.

Everything I cared about was on the UPS, which meant it all stayed up during an outage. So none of it could tell me the power was gone. I needed a witness on the other side of the line: something plugged into a normal wall socket, not the UPS, whose only job is to be alive when the mains is alive.

That witness is a Wemos D1 mini, an ESP8266 board flashed with ESPHome and powered from a USB charger on a non-UPS outlet. The logic is almost insultingly simple:

The board's presence on the network is the power signal. That's the entire design in one line.

Mains is up → the board is powered → it answers pings.

Mains drops → the board loses power → it goes silent.

The server pings the board every 30 seconds; if it disappears for three minutes straight, that's an outage, and the server suspends. No false alarm from a single dropped packet, no special hardware, no UPS cooperation required.

Waking the server is the half nobody mentions.

Suspending on an outage is the easy half. Waking on return is where the elegance lives, and it falls out of the same board for free.

When the power returns, the D1 boots. So I gave it one job on boot: fire a Wake-on-LAN magic packet at the server. Power comes back → board boots → packet flies → server resumes from suspend. The same device that detects the outage is the one that ends it. One board, both halves, no extra logic.

Here's the ESPHome boot hook. It waits for WiFi, then sends the magic packet five times over fifteen seconds, because extra packets are harmless and the server's NIC takes a moment to be ready after power returns:

esphome:
  name: d1-mini
  on_boot:
    priority: -100
    then:
      - wait_until:
          condition:
            wifi.connected:
      - repeat:
          count: 5
          then:
            - script.execute:
                id: send_wol
                mac: "30:9C:23:75:43:68"   # server NIC
                host: "192.168.1.255"       # broadcast, not unicast
                port: 9
            - delay: 3s

script:
  - id: send_wol
    parameters:
      mac: string
      host: string
      port: int
    then:
      - lambda: |-
          WiFiUDP udp;
          uint8_t packet[102];
          for (int i = 0; i < 6; i++) packet[i] = 0xFF;
          for (int i = 1; i <= 16; i++) {
            for (int j = 0; j < 6; j++) {
              char mac_byte[3];
              mac_byte[0] = mac[j * 3];
              mac_byte[1] = mac[j * 3 + 1];
              mac_byte[2] = '\0';
              packet[i * 6 + j] = strtol(mac_byte, nullptr, 16);
            }
          }
          udp.beginPacket(host.c_str(), port);
          udp.write(packet, sizeof(packet));
          udp.endPacket();
          ESP_LOGD("wol", "Sent WoL to %s:%d", host.c_str(), port);

One detail that bit me and is worth stating: send the packet to the broadcast address (192.168.1.255), not the server's IP. A suspended machine can't answer ARP requests, and its entry ages out of the sender's ARP cache (and out of the switch's MAC table), so a unicast packet aimed at its IP can get dropped. Broadcast reaches the NIC regardless of what the network remembers.
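The lambda above builds the standard magic-packet layout: six 0xFF bytes followed by the MAC address repeated sixteen times, 102 bytes in total. As a rough way to check that layout from any Linux box, here is a small shell sketch. The function name magic_packet_hex is made up for this example; it only prints the packet as hex and sends nothing.

```shell
#!/bin/sh
# Sketch only: print a Wake-on-LAN magic packet as a hex string.
# magic_packet_hex is a hypothetical helper, not part of any tool.
magic_packet_hex() {
  # Strip separators and lowercase the MAC: 30:9C:23:75:43:68 -> 309c23754368
  mac_hex=$(printf '%s' "$1" | tr -d ':-' | tr 'A-F' 'a-f')
  out="ffffffffffff"            # 6 bytes of 0xFF
  i=0
  while [ "$i" -lt 16 ]; do     # then the MAC, 16 times
    out="$out$mac_hex"
    i=$((i + 1))
  done
  printf '%s\n' "$out"
}

pkt=$(magic_packet_hex "30:9C:23:75:43:68")
echo "hex chars: ${#pkt}"       # 204 hex chars = 102 bytes
```

If xxd and socat happen to be installed, one common way to actually send it for a manual test is to pipe the hex through `xxd -r -p` into `socat - UDP-DATAGRAM:192.168.1.255:9,broadcast`. Treat that as an assumption about your tooling, not part of the setup described here.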

The Proxmox side: a 15-line watcher.

On the server, the whole detector is a bash loop run as a systemd service. Ping the witness; count consecutive failures; suspend after six of them (six × 30s = three minutes). That threshold is deliberate: it rides out a single dropped packet or a brief WiFi blip without overreacting.

REF=192.168.1.222     # the D1, on a non-UPS socket
POLL=30
THRESH=6              # 6 x 30s = 3 min before suspend
FAILS=0
while true; do
  if ping -c1 -W2 "$REF" >/dev/null 2>&1; then
    FAILS=0
  else
    FAILS=$((FAILS+1))
    if [ "$FAILS" -ge "$THRESH" ]; then
      logger "powermon: beacon gone, suspending"
      FAILS=0
      systemctl suspend
    fi
  fi
  sleep "$POLL"
done


When systemctl suspend runs, the loop freezes with the whole OS. On resume, triggered by the D1's magic packet, the loop picks up where it stopped, the next ping succeeds because the mains is back, and the counter resets. The suspend itself is transparent to the logic. That's a happy accident of the design, not something I had to engineer.
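The counter logic can be checked without waiting for a real outage. The sketch below is a dry run of the same consecutive-failure rule: it takes a made-up sequence of probe results (1 = the beacon answered, 0 = no reply) and reports the poll at which the watcher would suspend. The function name first_suspend_poll is invented for this example; it neither pings nor suspends anything.

```shell
#!/bin/sh
# Dry run of the watcher's threshold rule. No ping, no suspend.
THRESH=6    # same threshold as the watcher: 6 consecutive misses

# Print the 1-based poll number at which the watcher would suspend,
# or "none" if the sequence never reaches THRESH misses in a row.
first_suspend_poll() {
  fails=0
  n=0
  for result in "$@"; do
    n=$((n + 1))
    if [ "$result" -eq 1 ]; then
      fails=0                       # any reply resets the counter
    else
      fails=$((fails + 1))
      if [ "$fails" -ge "$THRESH" ]; then
        echo "$n"
        return 0
      fi
    fi
  done
  echo "none"
}

first_suspend_poll 1 0 1 0 0 1 1      # scattered drops: prints "none"
first_suspend_poll 1 0 0 0 0 0 0 1    # six misses in a row: prints "7"
```

The first sequence has three misses, but never more than two in a row, so nothing happens. The second trips on the seventh poll, which is the sixth consecutive miss.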

The part I got wrong the first time.

I tested it, declared victory, and then a longer outage broke it. On resume the SMB mount from the TrueNAS VM was dead, mount -a did nothing, and the dependent container refused to start. The error was the giveaway:

Couldn't chdir to /mnt/fileserver/hdd: No such device
mount error: could not resolve address for truenas.home.lan

Two failures, both about time. First: the host suspended holding a live CIFS connection; on resume that connection was a frozen corpse, but the kernel still thought the share was mounted, so it refused to remount over it. Second: the host tried to mount before the TrueNAS VM had even finished booting, when it was nowhere near serving SMB. A short suspend hid both, because it was over fast enough that the connection survived. A long one exposed them.

The fixes were unglamorous and correct:

  1. Force-clear the stale handle.
    umount -f -l before remounting, so the resume starts a fresh connection instead of trying to revive a dead one.
  2. Wait for the service, not the process.
    qm status = running only means the VM's process started, not that SMB is answering. Poll port 445 until it actually responds, then mount.
  3. Mount by IP, not hostname.
    Name resolution can fail on resume right when you need it, so drop DNS out of the critical path entirely.
  4. Add nofail in fstab.
    A share that depends on a guest VM must never block the host's own boot.

None of this is clever. It's just the difference between "works in the demo" and "works at 3am during a real outage." Which is the only kind of working that counts.
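Fixes 1 to 3 might look roughly like the sketch below. It is not the exact script from this setup: the NAS address 192.168.1.50, the function names, and the retry count are all placeholders, and how the remount gets triggered on resume is left out. The port check uses bash's /dev/tcp redirection wrapped in coreutils' timeout, so it assumes both are available on the host.

```shell
#!/bin/sh
# Sketch only. NAS_IP is a placeholder; the mount point comes from the error above.
NAS_IP="192.168.1.50"               # assumption: the TrueNAS VM's address
MOUNTPOINT="/mnt/fileserver/hdd"

# Return 0 as soon as host:port accepts a TCP connection,
# or 1 after `tries` failed attempts (one attempt per second).
wait_for_port() {
  host=$1; port=$2; tries=$3
  while [ "$tries" -gt 0 ]; do
    # bash opens a TCP socket when redirecting to /dev/tcp/HOST/PORT
    if timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}

remount_share() {
  # Fix 1: drop the stale handle; harmless if nothing is mounted.
  umount -f -l "$MOUNTPOINT" 2>/dev/null
  # Fix 2: wait until SMB actually answers, not just until the VM runs.
  wait_for_port "$NAS_IP" 445 120 || return 1
  # Fix 3: assumes the fstab entry for MOUNTPOINT points at the IP, not a hostname.
  mount "$MOUNTPOINT"
}

# Call remount_share (as root) from whatever runs after resume.
```

Nothing runs until remount_share is called, so the functions can be sourced and tried one at a time.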

A smart UPS with network management would have run $150+ and solved a narrower version of this. The board was already in the drawer. The right answer is usually the one that exists.

The lesson I keep relearning: the gap between a smart device and a dumb one is often just a signal you can synthesize yourself. The UPS couldn't tell the server anything. So I built the smallest possible thing that could, and let its mere existence carry the message.