Understanding TeamCity's Architecture

Key Components

  • Server: Central controller responsible for project configurations, build history, VCS polling, and result aggregation.
  • Build Agents: Execute build steps. Can be physical, virtual, or containerized.
  • Database: Stores metadata, settings, build history. Relational DB required for production-grade installs.
  • VCS Roots: Define connections to version control repositories (Git, Subversion, and others) that builds check out from and that VCS triggers monitor.

Pipeline Orchestration

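Build configurations are defined through the UI or the Kotlin DSL. Steps within a single build configuration run sequentially on one agent; parallelism comes from build chains, where independent configurations run on separate agents at the same time. Complex pipelines link configurations with snapshot dependencies and pass artifacts between builds, which introduces risks of orchestration failure and agent starvation.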
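To make the ordering role of snapshot dependencies concrete, here is a minimal sketch that treats a build chain as a directed graph and derives a valid run order. It is not TeamCity code; the configuration names are hypothetical and the standard-library graphlib module (Python 3.9+) stands in for the server's scheduler.

```python
from graphlib import CycleError, TopologicalSorter


def build_order(snapshot_deps):
    """Return a valid run order for a build chain.

    snapshot_deps maps each build configuration to the set of
    configurations it has snapshot dependencies on (its upstreams).
    """
    try:
        # static_order() yields upstream builds before their dependents.
        return list(TopologicalSorter(snapshot_deps).static_order())
    except CycleError as err:
        raise ValueError(f"dependency cycle: {err.args[1]}") from err


# Hypothetical three-stage chain.
chain = {
    "Deploy": {"IntegrationTest"},
    "IntegrationTest": {"Compile"},
    "Compile": set(),
}
print(build_order(chain))  # ['Compile', 'IntegrationTest', 'Deploy']
```

Every configuration in the order needs a compatible agent when its turn comes, which is why long chains are sensitive to agent starvation.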

Common Issues and Root Causes

1. Builds Hang Indefinitely on Step Execution

  • Cause: Resource contention, hanging test runners, or agent overload.
  • Diagnostic Tip: Check teamcity-agent.log and isolate which command is blocking.
  • Impact: Build queue pile-up, leading to environment-wide delays.

2. Agent is Registered but Does Not Pick Up Builds

Agent appears active but never executes builds.

  • Root Cause: Incompatible build requirements, unauthorized agent pool, or missing toolchains.
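  • Resolution: Check the agent's compatible configurations, confirm the agent is authorized and assigned to the right pool, and verify the parameters defined in buildAgent.properties.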

3. VCS Trigger Fires Late or Misses Changes

Builds do not start immediately after commits.

  • Root Cause: VCS polling interval set too high, rate limiting by the Git hosting service, or an incorrect branch specification.
  • Impact: Delayed feedback loops or skipped releases.

4. Kotlin DSL Changes Not Reflected in UI

  • Cause: DSL cache not updated or Git branch misalignment in settings repository.
  • Fix: Re-trigger synchronization from Versioned Settings tab and check DSL status report.

5. Build Dependencies Cause Unpredictable Failures

  • Root Cause: Snapshot dependencies create tight coupling between projects, leading to failure propagation.
  • Solution: Break cyclic dependencies, use artifact dependencies with clear versioning.
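The failure propagation described above can be estimated before it happens. The sketch below, using hypothetical configuration names, walks a snapshot dependency map and lists every configuration that would be affected if one upstream build failed. It is an offline illustration, not a TeamCity API.

```python
from collections import deque


def affected_downstream(snapshot_deps, failed):
    """Return every configuration that transitively depends on `failed`.

    snapshot_deps maps a configuration to the set of its upstreams.
    """
    # Invert the upstream map so we can walk from a build to its dependents.
    dependents = {}
    for build, upstreams in snapshot_deps.items():
        for upstream in upstreams:
            dependents.setdefault(upstream, set()).add(build)

    affected, queue = set(), deque([failed])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, ()):
            if child not in affected:
                affected.add(child)
                queue.append(child)
    return affected


# Hypothetical cross-project chain: App -> LibB -> LibA, Docs is independent.
deps = {"LibB": {"LibA"}, "App": {"LibB"}, "Docs": set()}
print(sorted(affected_downstream(deps, "LibA")))  # ['App', 'LibB']
```

A large result for a frequently failing upstream is a signal that the dependency is a candidate for artifact-based decoupling.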

Advanced Diagnostics

1. Debugging Agent Assignment

Use agent requirements view:

Build Configuration > Requirements Tab
Check unmet conditions or custom parameters

Check conf/buildAgent.properties for the agent name, environment variables, and custom parameters:

name=agent-01
env.JAVA_HOME=/usr/lib/jvm/java-17
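A quick way to compare an agent's properties file against a configuration's requirements is to do it offline. The following is a minimal sketch under simplifying assumptions: it handles only plain key=value lines (not the full Java properties syntax), it checks only "exists" and "equals" conditions, and the parameter system.docker.available is a made-up example rather than a built-in TeamCity parameter.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            params[key.strip()] = value.strip()
    return params


def unmet_requirements(agent_params, requirements):
    """Return names of required parameters the agent lacks or mismatches.

    requirements maps a parameter name to an expected value, or to None
    when the parameter only has to exist.
    """
    unmet = []
    for name, expected in requirements.items():
        actual = agent_params.get(name)
        if actual is None or (expected is not None and actual != expected):
            unmet.append(name)
    return unmet


sample = """
# excerpt of a buildAgent.properties file
env.JAVA_HOME=/usr/lib/jvm/java-17
"""
agent = parse_properties(sample)
required = {"env.JAVA_HOME": None, "system.docker.available": "true"}
print(unmet_requirements(agent, required))  # ['system.docker.available']
```

The Requirements tab in the UI remains the authoritative view; a script like this is only useful for auditing many agent files at once.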

2. Monitoring VCS Triggers

Increase VCS logging verbosity by selecting the debug-vcs logging preset:

Administration > Diagnostics > Troubleshooting > Debug logging > debug-vcs

View logs at:

logs/teamcity-vcs.log

3. Identifying Hanging Build Steps

Enable an execution timeout so that a stuck build fails instead of holding its agent:

Build Configuration > Failure Conditions > Fail build if it runs longer than: 20 minutes

Use diagnostic builds with set -x or verbose flags to trace stuck commands.
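When a particular command is the suspect, wrapping it with its own timeout inside the build script helps isolate it. Below is a minimal sketch using Python's standard subprocess module; the wrapped command is a placeholder that simply sleeps, standing in for a hanging test runner.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run a command and return (exit_code, timed_out)."""
    try:
        completed = subprocess.run(command, timeout=timeout_seconds)
        return completed.returncode, False
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child before raising TimeoutExpired.
        return None, True


# Placeholder for a hanging step: a child process that sleeps for 5 seconds.
slow_step = [sys.executable, "-c", "import time; time.sleep(5)"]
print(run_with_timeout(slow_step, 0.5))  # (None, True)
```

Only the direct child is killed on timeout; processes it spawned may survive, so a step that forks workers may need additional cleanup.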

Step-by-Step Fixes

1. Fix Agent Build Assignment

1. Open agent details in TeamCity UI.
2. Review compatible configurations.
3. Edit buildAgent.properties to match required parameters.
4. Restart agent: ./bin/agent.sh stop && ./bin/agent.sh start

2. Normalize VCS Polling

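1. Keep the changes-checking interval at or near the 60s default; lower it if it has been raised.
2. Configure a commit hook (webhook) on the Git server so TeamCity checks for changes immediately instead of waiting for the next poll.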
3. Validate VCS root uses proper branch spec: +:refs/heads/*
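To check a branch specification before saving it, the matching logic can be approximated offline. This is a simplified sketch, not TeamCity's implementation: it supports only +: and -: lines with * wildcards, ignores the parentheses syntax for logical branch names, and assumes that any matching exclude rule wins. The experimental/ branch prefix is a made-up example.

```python
import re


def parse_branch_spec(spec):
    """Turn '+:pattern' / '-:pattern' lines into (include, regex) rules."""
    rules = []
    for line in spec.splitlines():
        line = line.strip()
        if not line:
            continue
        sign, _, pattern = line.partition(":")
        if sign not in ("+", "-") or not pattern:
            raise ValueError(f"malformed branch spec line: {line!r}")
        # Treat '*' as "any run of characters"; everything else is literal.
        regex = ".*".join(re.escape(part) for part in pattern.split("*"))
        rules.append((sign == "+", re.compile(regex)))
    return rules


def is_monitored(ref, rules):
    """Assumed rule: monitored if an include matches and no exclude does."""
    included = any(inc and rx.fullmatch(ref) for inc, rx in rules)
    excluded = any(not inc and rx.fullmatch(ref) for inc, rx in rules)
    return bool(included and not excluded)


rules = parse_branch_spec("+:refs/heads/*\n-:refs/heads/experimental/*")
print(is_monitored("refs/heads/feature/login", rules))    # True
print(is_monitored("refs/tags/v1.0", rules))              # False
print(is_monitored("refs/heads/experimental/x", rules))   # False
```

A spec that silently excludes release or feature branches is one of the more common reasons a trigger appears to miss commits.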

3. Clean Up DSL Misalignment

1. Navigate to Project > Versioned Settings.
2. Trigger manual sync.
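3. Review the status message on the Versioned Settings tab and the server log for DSL errors.
4. Ensure the branch that holds settings.kts matches the branch configured in the settings VCS root.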

4. Break Snapshot Dependency Chains

Convert snapshot to artifact dependencies:

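1. Modify the upstream build to publish the artifact, for example with the artifact path target/app.war.
2. In the downstream build, add an artifact dependency rule such as +:app.war => deploy. The left side is the path within the published artifacts and the right side is a destination directory, not a new file name.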
3. Decouple build triggers for better fault isolation.

Best Practices for Enterprise-Scale TeamCity

  • Use Kotlin DSL for source-controlled configuration but validate sync status regularly.
  • Describe agent capabilities with custom parameters and use explicit agent requirements in build configurations.
  • Limit use of snapshot dependencies across projects; prefer artifact-based decoupling.
  • Use webhooks over polling for faster VCS trigger response.
  • Enable monitoring plugins and keep audit logs for VCS/trigger events.

Conclusion

TeamCity provides rich CI/CD features for complex enterprise workflows, but its flexibility can introduce subtle bugs without disciplined management. Issues like idle agents, stalled builds, or missed VCS triggers often stem from misaligned configurations, dependency misuse, or lack of observability. With proper diagnostics, dependency design, and trigger hygiene, engineering teams can maintain a fast, resilient, and scalable TeamCity pipeline.

FAQs

1. Why are my agents idle despite available builds?

Agents may not meet build requirements or may be in the wrong pool. Review agent compatibility and the parameters defined in buildAgent.properties.

2. What causes VCS triggers to delay or miss builds?

Polling interval might be too long or branch filters too narrow. Use Git webhooks for real-time triggering and validate root configuration.
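A webhook ultimately has to tell the server to check a VCS root for changes. The sketch below builds such a notification request with the standard library. The endpoint path reflects TeamCity's commit-hook REST call as I understand it and should be verified against your server version; the host name, VCS root ID, and token are all placeholders.

```python
import urllib.parse
import urllib.request


def commit_hook_request(server_url, vcs_root_id, token):
    """Build (but do not send) a commit-hook notification request."""
    locator = urllib.parse.quote(f"vcsRoot:(id:{vcs_root_id})", safe="():,")
    url = (
        f"{server_url.rstrip('/')}/app/rest/vcs-root-instances/"
        f"commitHookNotification?locator={locator}"
    )
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


# All three arguments are hypothetical values.
request = commit_hook_request(
    "https://teamcity.example.com", "MyProject_MainRepo", "<token>"
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request, timeout=10)
```

Keeping polling enabled at a relaxed interval alongside the hook gives a fallback if a notification is lost.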

3. How can I diagnose DSL sync problems?

Check the Versioned Settings tab for sync errors. Ensure that DSL branches and VCS roots match your main repository configuration.

4. Are snapshot dependencies safe to use?

They work well within a single project but can cause cascading failures across projects. Prefer artifact dependencies with versioned outputs for stability.

5. What should I monitor to detect build slowness?

Track build queue times, agent utilization, step execution durations, and VCS trigger logs. Use TeamCity's diagnostics and monitoring tools for insights.
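As one way to turn queue times into a trackable number, the sketch below summarizes a list of queue waits with the median and a nearest-rank 95th percentile. The sample values are invented for illustration; in practice they would come from whatever export or monitoring source you already have.

```python
import math
import statistics


def queue_wait_summary(wait_seconds):
    """Summarize queue waits (seconds): median, nearest-rank p95, max."""
    if not wait_seconds:
        raise ValueError("no samples")
    waits = sorted(wait_seconds)
    rank = math.ceil(0.95 * len(waits))  # nearest-rank percentile, 1-based
    return {
        "median": statistics.median(waits),
        "p95": waits[rank - 1],
        "max": waits[-1],
    }


# Invented sample: most builds start quickly, two wait a long time.
sample_waits = [5, 8, 12, 15, 20, 30, 45, 60, 240, 900]
print(queue_wait_summary(sample_waits))
```

A median that stays low while the p95 climbs usually points to a subset of configurations competing for a scarce agent type rather than to a general capacity shortage.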