Understanding the Problem
Issue Summary
A custom or third-party systemd service on Ubuntu fails to start during boot but starts successfully via systemctl start after login. This leads to application downtime, delayed monitoring agent startup, or partial service availability in clustered environments.
Common Symptoms
- Service logs show "dependency failed" or "network unavailable" errors
- systemctl status shows "inactive (dead)" at boot time
- journalctl -xe reports race conditions or "unit not found"
- Manual start works fine post-login
Root Cause Analysis
Systemd Boot Semantics
Systemd units are managed in parallel unless explicitly serialized via dependencies. Services relying on network interfaces, mounts, or sockets can attempt to start before their requirements are fully ready—especially in fast-booting systems or cloud VMs.
Dependency Graph Problems
Incorrect or missing After=, Requires=, or Wants= directives in the service file result in systemd launching services prematurely.
[Unit]
Description=Custom Daemon
After=network.target
Requires=network.target
Race Conditions and Timeouts
Services depending on interfaces like eth0 or docker0 might launch before DHCP or virtual interfaces are ready. Cloud-init or Netplan may not complete configuration before the service starts.
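To see whether an interface actually existed when the service launched, its state can be read from sysfs. The following is a minimal sketch, not a definitive check: the function name link_state is hypothetical, and the optional second argument exists only so the function can be pointed at a different directory. A link reporting "up" does not by itself mean an address or route has been assigned.

```shell
# link_state IFACE [SYSFS_NET_DIR]: print the kernel's operational state for
# IFACE (for example "up" or "down"), or "missing" if the interface does not
# exist yet. SYSFS_NET_DIR defaults to /sys/class/net.
link_state() {
    state_file="${2:-/sys/class/net}/$1/operstate"
    if [ -r "$state_file" ]; then
        cat "$state_file"
    else
        echo missing
    fi
}

# Usage: log the state at the top of a start script so it lands in the journal.
#   echo "eth0 is $(link_state eth0)"
```

Logging this value early in a start script makes the boot-time state visible later, when the interface has long since come up.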
Architectural Implications
System Reliability and Observability
Critical services failing at boot result in alert fatigue and brittle environments. These failures are hard to trace because post-boot state appears clean, masking underlying race conditions.
Cluster and Microservice Impact
In HA setups or service meshes, delayed service registration affects discovery and auto-healing. System readiness must be deterministic for orchestrators to function correctly.
Diagnostics and Verification
Use systemd-analyze
systemd-analyze blame
systemd-analyze critical-chain
The first command lists how long each unit took to start; the second shows the chain of units the boot sequence waited on. Together they highlight which services blocked or delayed the early boot sequence.
Check Unit File Consistency
systemctl cat my-service.service
systemd-analyze verify /etc/systemd/system/my-service.service
Verify that all Requires and After directives align with actual target units present in the system.
Monitor Logs Early in Boot
journalctl -b -0 -u my-service.service
This reveals early boot issues that do not appear once the service is started manually.
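The journal output for a boot can be long, so a small filter helps surface the lines that point at ordering or timeout problems. This is a sketch under assumptions: the function name boot_failures is hypothetical, and the three patterns are only examples of phrases systemd commonly logs; exact wording varies between systemd versions.

```shell
# boot_failures: read journal lines on stdin and print only those that
# suggest a dependency, timeout, or failed-start problem (case-insensitive).
boot_failures() {
    grep -Ei 'dependency failed|timed out|failed with result'
}

# Usage, assuming the unit is named my-service.service:
#   journalctl -b -0 -u my-service.service --no-pager | boot_failures
```

An empty result does not prove the boot was clean; it only means none of the assumed phrases appeared.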
Step-by-Step Fix
1. Harden Service Dependencies
Replace generic network.target with network-online.target if the service needs full connectivity.
[Unit]
After=network-online.target
Wants=network-online.target
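For a third-party unit, one way to apply these directives without editing the packaged file is a drop-in under the unit's .d directory. The sketch below assumes the unit is named my-service.service; the function name write_network_dropin and the file name 10-network-online.conf are hypothetical choices for illustration.

```shell
# write_network_dropin DIR: create DIR/10-network-online.conf containing the
# ordering directives for network-online.target.
write_network_dropin() {
    dropin_dir=$1
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/10-network-online.conf" <<'EOF'
[Unit]
After=network-online.target
Wants=network-online.target
EOF
}

# Usage (as root), then reload so systemd picks up the change:
#   write_network_dropin /etc/systemd/system/my-service.service.d
#   systemctl daemon-reload
```

A drop-in survives package upgrades that replace the original unit file, and systemctl cat my-service.service shows the merged result.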
2. Add Explicit Dependencies
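Ensure the services or mounts that the unit depends on are listed in Requires= or Wants= to avoid soft failures, and pair each with After= so they are also ordered before it. Requires= and Wants= on their own pull a unit into the boot transaction but do not delay startup until that unit is ready.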
3. Use systemd service conditions
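Gate or delay service execution using ConditionPathExists, ExecStartPre, or scripts that poll readiness. Note that ConditionPathExists does not wait: if the path is missing when the unit is started, systemd skips the unit. When the service must wait for a resource, use ExecStartPre with a script that polls for it.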
4. Extend Timeouts if Necessary
[Service]
TimeoutStartSec=90
ExecStartPre=/usr/local/bin/check-network.sh
Check network reachability or other required system states before launching the service.
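The contents of check-network.sh are not given here, so the following is only a sketch of what such a precheck might look like. The function name wait_for, the default-route probe, and the 30-second budget are all assumptions; substitute whatever condition the service really needs, and keep the polling budget below TimeoutStartSec.

```shell
#!/bin/sh
# Hypothetical sketch of /usr/local/bin/check-network.sh.

# wait_for PROBE TIMEOUT_SECONDS: run the shell command PROBE once per second
# until it succeeds or TIMEOUT_SECONDS have elapsed.
# Returns 0 on success, 1 on timeout.
wait_for() {
    probe=$1
    timeout=$2
    elapsed=0
    while ! sh -c "$probe" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Assumed readiness probe: an IPv4 default route is present.
has_default_route='ip -4 route show default | grep -q .'

# Usage (final line of the real script); a non-zero exit makes ExecStartPre
# fail, so the service is not started against a missing network:
#   wait_for "$has_default_route" 30 || exit 1
```

Taking the probe as a string keeps the loop reusable for other checks, such as testing that a mount point or socket exists.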
5. Rebuild Daemon and Reload
systemctl daemon-reexec
systemctl daemon-reload
systemctl enable my-service
Best Practices
- Use network-online.target instead of network.target for connectivity-based services
- Always verify service unit dependencies using systemd-analyze verify
- Use ExecStartPre to script critical prechecks
- Do not rely on login sessions or user timers for service readiness
- In cloud environments, ensure proper cloud-init finalization before dependent services start
Conclusion
Systemd is a powerful but strict initialization system. Misconfigured unit files or ambiguous dependencies can silently delay or prevent service startup at boot, leading to erratic behavior in production. By adopting deterministic boot sequencing and verifying system dependencies with appropriate tooling, Ubuntu administrators can eliminate this class of error and build more resilient operating system configurations.
FAQs
1. Why does my service fail only at boot but run fine when started manually?
At boot, required dependencies like network interfaces or mounts may not be ready. Manual starts occur after these resources are available, hiding timing issues.
2. How do I delay a systemd service until networking is ready?
Use After=network-online.target and Wants=network-online.target in your unit file. Also ensure the matching wait-online service is enabled: systemd-networkd-wait-online.service for systemd-networkd, or NetworkManager-wait-online.service for NetworkManager.
3. Can I visualize service startup timing?
Yes. Use systemd-analyze blame and systemd-analyze critical-chain to see boot-time delays and dependencies.
4. What is the role of systemd-analyze verify?
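It loads the unit file and reports problems such as unknown sections or directives, dependencies on units that cannot be found, and ExecStart commands that are missing or not executable. Catching these before a reboot prevents avoidable misbehavior during boot.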
5. Is network.target sufficient for networking services?
Not always. network.target only signals that the network management stack has been started, while network-online.target is reached once the network is reported as configured, which typically includes IP assignment and route readiness.