Intermittent Log Ingestion Failures in Loggly

Problem Overview

In production environments, log entries sometimes vanish without explanation. Developers verify log shipping at the application level, yet the logs never reach Loggly, or arrive only after significant delay. This creates blind spots, especially when logs are essential for incident response or compliance. The root causes typically involve throttling, misconfigured endpoints, network bottlenecks, or malformed payloads.

Architectural Context

Loggly Data Flow

Logs are pushed via syslog, HTTP/S endpoints, or third-party shippers like rsyslog, Fluentd, or Logstash. These clients send logs to Loggly's ingestion API, which applies parsing, token validation, and routing to storage and search indexes. Failures in any stage may silently discard logs without immediate feedback.

Rate Limits and Ingestion Policies

Loggly enforces rate limits per account or token, especially on shared tiers. Surpassing these thresholds can lead to dropped events, with little to no error surfaced on the sender side unless explicitly monitored.

Diagnostics and Root Cause Analysis

1. Confirm Log Delivery

Use curl or a test script to post logs to Loggly's HTTP endpoint and inspect response codes.

curl -X POST -H "Content-Type: text/plain" \
  -d "test log entry" \
  https://logs-01.loggly.com/inputs/LOGGLY_TOKEN/tag/http/
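For repeated checks, the same POST can be scripted. A minimal Python sketch using only the standard library; LOGGLY_TOKEN is a placeholder for a real customer token, and the endpoint and tag mirror the curl example above:

```python
import urllib.request

LOGGLY_ENDPOINT = "https://logs-01.loggly.com/inputs/{token}/tag/{tag}/"

def build_request(token: str, tag: str, payload: str) -> urllib.request.Request:
    """Build a plain-text POST request for Loggly's HTTP input endpoint."""
    return urllib.request.Request(
        LOGGLY_ENDPOINT.format(token=token, tag=tag),
        data=payload.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )

def send_test_event(token: str, tag: str = "http",
                    payload: str = "test log entry") -> int:
    """POST one event and return the HTTP status code."""
    with urllib.request.urlopen(build_request(token, tag, payload),
                                timeout=10) as resp:
        return resp.status

if __name__ == "__main__":
    # Replace LOGGLY_TOKEN with a real token before running.
    print(send_test_event("LOGGLY_TOKEN"))
```

Anything other than a 2xx status, or a timeout, indicates the problem is upstream of Loggly's parsing stage.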

2. Validate Application-Level Shippers

Check logs of Fluentd or rsyslog for backpressure or delivery retries. Look for high buffer queue sizes or output plugin errors.

tail -f /var/log/td-agent/td-agent.log
grep error /var/log/syslog | grep rsyslog

3. Monitor Rate Limits

Loggly does not surface rate-limit errors to the sender in real time, but ingestion delays and volume are visible in its web UI and API usage dashboards. Periodically query ingestion metrics via the API to catch silent drops early.

Common Pitfalls

  • Omitting tags or sending malformed JSON (Loggly drops unparseable events)
  • Fluentd plugins buffering too aggressively due to unreachable endpoints
  • Relying on UDP-based syslog, which offers no delivery guarantee
  • Sending logs in large bursts, triggering rate limiting silently
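The burst pitfall above can be mitigated on the client side. One common approach is a token-bucket limiter that smooths sends to a configured rate; the sketch below is illustrative, and the rate and capacity values are assumptions, not Loggly-published limits:

```python
import time

class TokenBucket:
    """Token-bucket limiter: allows `rate` events/sec on average,
    with bursts of up to `capacity` events."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill tokens based on elapsed time, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        # Caller should buffer or count the rejected event,
        # never silently lose it.
        return False
```

Gate each send through `allow()`; rejected events go back into the shipper's buffer rather than out in a burst that trips rate limiting.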

Step-by-Step Fix

1. Switch to HTTP(S)-Based Logging

Prefer HTTPS delivery where reliability is critical, and enable retry logic in the client.

curl -X POST -H "Content-Type: text/plain" \
  --retry 5 --retry-delay 2 \
  -d "log payload" \
  https://logs-01.loggly.com/inputs/LOGGLY_TOKEN/tag/script/
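In-house shippers can apply the same policy as the curl flags. A hedged Python sketch, where `send` is any callable you supply that performs the POST and returns an HTTP status code:

```python
import time
from typing import Callable

def send_with_retry(send: Callable[[], int],
                    retries: int = 5, delay: float = 2.0) -> bool:
    """Call `send` until it returns a 2xx status, retrying on failure
    with a fixed delay (like curl's --retry/--retry-delay).
    Returns True on success, False once retries are exhausted."""
    for attempt in range(retries + 1):
        try:
            status = send()
            if 200 <= status < 300:
                return True
        except OSError:
            pass  # network error: treat as retryable
        if attempt < retries:
            time.sleep(delay)
    return False
```

Payloads that still fail after the final retry are candidates for a dead-letter queue rather than being discarded.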

2. Add Monitoring to Your Shipper

Configure internal metrics for rsyslog or Fluentd to expose error counts, buffer overflows, and retry rates.

# Fluentd monitoring plugin example (td-agent.conf)

<source>
  @type monitor_agent
  bind 0.0.0.0
  port 24220
</source>
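Once monitor_agent is enabled, it serves per-plugin metrics as JSON at /api/plugins.json on the configured port. A small Python poller might flag unhealthy outputs; the queue_limit threshold here is an arbitrary example, not a Fluentd default:

```python
import json
import urllib.request

def check_plugins(doc: dict, queue_limit: int = 100) -> list:
    """Return warnings for plugins whose buffer queue or retry count
    looks unhealthy. Field names follow Fluentd monitor_agent's
    plugins.json output."""
    warnings = []
    for p in doc.get("plugins", []):
        qlen = p.get("buffer_queue_length") or 0
        retries = p.get("retry_count") or 0
        if qlen > queue_limit:
            warnings.append(f"{p.get('plugin_id')}: buffer_queue_length={qlen}")
        if retries > 0:
            warnings.append(f"{p.get('plugin_id')}: retry_count={retries}")
    return warnings

def poll(url: str = "http://localhost:24220/api/plugins.json") -> list:
    """Fetch monitor_agent metrics and return any warnings."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return check_plugins(json.load(resp))
```

Run `poll()` on a schedule and alert on any non-empty result; growing queues and retry counts are the earliest visible symptom of an unreachable endpoint.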

3. Optimize Throughput

  • Batch small logs into larger payloads where possible
  • Throttle application log levels during high-load scenarios
  • Enable backpressure alerts on queues
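For batching, Loggly also offers a bulk endpoint (https://logs-01.loggly.com/bulk/TOKEN/tag/bulk/) that accepts newline-delimited events. A sketch of client-side batching under an assumed payload-size cap (Loggly documents limits on event and payload size; check the current values before relying on a specific number):

```python
def batch_events(events, max_bytes: int = 4 * 1024 * 1024) -> list:
    """Join events into newline-delimited payloads for a bulk endpoint,
    keeping each payload under max_bytes. Events must not themselves
    contain newlines (Loggly's bulk input splits on them)."""
    payloads, current, size = [], [], 0
    for e in events:
        encoded = e.encode("utf-8")
        # +1 accounts for the joining newline.
        if current and size + len(encoded) + 1 > max_bytes:
            payloads.append(b"\n".join(current))
            current, size = [], 0
        current.append(encoded)
        size += len(encoded) + 1
    if current:
        payloads.append(b"\n".join(current))
    return payloads
```

Fewer, larger POSTs reduce per-request overhead and make bursts far less likely to trip rate limiting than one request per event.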

4. Validate with Loggly API

Use Loggly's Search API to validate presence of logs by tag or time window. Note that search runs against your account subdomain, not the logs-01 ingestion host; the search call returns an rsid, which is then passed to the events endpoint to page through matching events.

curl -u USERNAME:PASSWORD \
  "https://SUBDOMAIN.loggly.com/apiv2/search?q=tag:script&from=-10m&until=now"

Best Practices

  • Use HTTPS endpoints with retry logic for all logging clients
  • Set up dead-letter queues for critical logs that fail delivery
  • Tag logs systematically for traceability
  • Limit log verbosity in production using structured log levels
  • Review ingestion metrics weekly to catch trends or silent drops

Conclusion

Intermittent log delivery issues in Loggly can cripple DevOps workflows if left unaddressed. By understanding the full log ingestion lifecycle, validating shipper configurations, and monitoring rate limits and payload integrity, teams can ensure consistent observability. Prioritizing reliability in log transport, adopting monitoring for shippers, and using the Loggly API effectively form the cornerstone of a resilient logging pipeline.

FAQs

1. Why are my logs missing even when the application sends them?

They may be dropped due to rate limits, invalid formatting, or delivery issues in the shipper. Always verify end-to-end delivery using Loggly's API.

2. Does Loggly provide ingestion failure alerts?

Not by default. You must monitor shipper logs and configure dashboards to track missing log patterns or ingestion lags.

3. Is it better to use syslog or HTTP for Loggly?

HTTP is preferred for reliability and observability, especially in containerized or serverless environments. It allows retries and better control.

4. Can Fluentd lose logs during high load?

Yes, if buffers fill up and output plugins are blocked. Always configure persistent buffers and monitor queue size.

5. How can I test if Loggly is receiving my logs?

Send a sample payload via curl and search for it in Loggly using the UI or API. Tag your test log distinctly for easy querying.