Background: Plastic SCM Architecture Relevant to the Failure Mode

Core Concepts Recap

Plastic SCM centers on changesets, branches, and labels, with both centralized and distributed workflows. Two design features matter most for this troubleshooting topic:

  • Xlinks: References that mount a path in one repository to a specific changeset in another repository. Xlinks can be pinned to a fixed changeset or left "floating" to follow a branch tip. Writable Xlinks add another layer of complexity by allowing checkins inside the referenced repo from the parent workspace.
  • Partial Workspaces: Selection rules that load a subset of the tree into a workspace, crucial for artist workflows and large monorepos. Plastic supports developer workspaces and Gluon workspaces; both can be partial.

In large organizations, these features combine with platform diversity (Windows/macOS/Linux), case sensitivity differences, and CI pools that reuse workspaces. The side effect is a class of problems that do not manifest in small demos: cross-repo coupling through Xlinks diverges silently, partial selection hides critical files, and CI agents or developer machines start exhibiting phantom pending changes and repeated merges.

Why This Problem Is Strategic, Not Just Tactical

Non-reproducible workspaces erode release confidence. Build pipelines become flaky, hotfix merges reintroduce old content, and artists or engineers lose hours in conflict resolution that should never have existed. Because Xlinks encode supply-chain style dependencies between repositories, drift affects licensed content, physics engines, and vendor asset packs as much as source code. Architects need durable mechanisms that make a specific changeset deterministically produce the same workspace on every machine.

Problem Statement: Xlink Drift & Phantom Pending Changes

Symptoms You Will Observe

  • "cm update" on the same branch yields different content on different machines, especially across load-balanced CI agent pools.
  • Repeated merges between long-lived branches reopen conflicts already resolved the day before.
  • Pending changes appear after a clean update with no actual file edits, often in directories mounted via Xlinks or filtered by partial rules.
  • Writable Xlinks cause checkins that are invisible to some team members until a later "mystery change" lands.
  • Binary locks or exclusive checkout rules block checkins although no one appears to be editing the asset in the parent repo.

High-Level Root Causes

  • Floating Xlinks without governance: The parent repository references a branch tip in a child repository instead of a pinned changeset. Different agents update at different times; they see different tips and produce non-identical workspaces.
  • Workspace rule mismatch: Partial workspace rules differ between developers and CI, or a Gluon selection leaves out metadata (e.g., .plastic control data or per-platform files) that affects computed status.
  • Cross-platform normalization: Line ending and file-mode attributes (e.g., executable bit) differ across platforms. Case-insensitive vs case-sensitive file systems trigger rename/move churn invisible to one group and loud to another.
  • Writable Xlink rebases: Teams commit within Xlinks from the parent workspace, but other workspaces pin to different Xlink changesets. Merges now require resolving both parent and child histories, often in the wrong order.
  • Stale CI workspaces: CI agents cache large partial workspaces for performance. Selector or branch switches happen, but concealed partial rules or leftover local changes keep agents out of sync with the intended manifest.

Diagnostics: Build a Deterministic Picture

Capture the Workspace Intent and Reality

First collect the selector or workspace configuration, the Xlink bindings, and the exact changesets seen by the machine. These commands (or their GUI equivalents) establish ground truth.

rem Show the workspace selector / configuration
cm showselector

rem List Xlinks resolved in the current workspace tree
cm xlinks --tree

rem Print the current branch and last changeset
cm status --nochanges --compact

rem Log of the parent branch with changeset IDs
cm log --branch=br:/main --limit=20 --compact

Record the output and attach it to the incident. You are looking for evidence that the same parent branch changeset resolves to different child Xlink changesets on different machines.

Prove or Disprove Xlink Drift

Compare two machines (e.g., a developer box and a CI agent) that claim to be on the same parent branch/changeset. If their Xlink resolutions differ, drift is proven.

rem Export the resolved Xlink manifest as a list
cm xlinks --format="{PATH} {REPOSITORY} {BRANCH_OR_CS}" --list > xlink-manifest.txt

rem For deeper diff, also emit the concrete pinned cs IDs
cm xlinks --format="{PATH} {REPOSITORY} {PINNED_CS}" --list > xlink-pins.txt

rem Diff the manifests from two machines (any diff tool or CI step will do).
rem Copy each machine's xlink-pins.txt to xlink-pins_A.txt and xlink-pins_B.txt first.
diff -u xlink-pins_A.txt xlink-pins_B.txt

If the PINNED_CS differs per Xlink path while the parent changeset is allegedly identical, the workspace is non-reproducible by design (floating Xlink) or misconfigured (partial rules hiding updates).
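
The comparison can also be scripted so that a CI step reports exactly which Xlink paths drifted. The following is a minimal sketch, assuming each manifest line has the form "<path> <repository> <changeset>" with no spaces inside any field; the function names and the sample repository names are hypothetical and only illustrate the idea.

```python
def parse_manifest(text):
    """Map each Xlink path to its pinned changeset.

    Assumes one Xlink per line in the form "<path> <repository> <changeset>".
    Lines that do not have exactly three fields are skipped.
    """
    pins = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # blank or malformed line
        path, _repo, changeset = parts
        pins[path] = changeset
    return pins


def find_drift(manifest_a, manifest_b):
    """Return {path: (changeset_on_a, changeset_on_b)} for every differing Xlink.

    A path that exists on only one machine is reported with None on the other side.
    """
    a, b = parse_manifest(manifest_a), parse_manifest(manifest_b)
    drift = {}
    for path in sorted(set(a) | set(b)):
        if a.get(path) != b.get(path):
            drift[path] = (a.get(path), b.get(path))
    return drift


if __name__ == "__main__":
    # Hypothetical sample data for a developer box and a CI agent.
    dev_box = (
        "/ThirdParty/PhysicsEngine ChildRepo cs:12345\n"
        "/Content/WorldAssets AssetsRepo cs:880\n"
    )
    ci_agent = (
        "/ThirdParty/PhysicsEngine ChildRepo cs:12345\n"
        "/Content/WorldAssets AssetsRepo cs:902\n"
    )
    for path, (left, right) in find_drift(dev_box, ci_agent).items():
        print(f"DRIFT {path}: {left} != {right}")
```

An empty result means the two machines resolved every Xlink identically; any entry is direct evidence of drift for that path.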

Identify Phantom Pending Changes

Phantom changes commonly arise from normalization differences or partially loaded metadata. Systematically rule these out:

  • Ensure your status is computed "as if clean" by ignoring ignored files and temp files.
  • Check for executable bit, symlink, or EOL deltas that vary per OS.
  • Confirm that partial rules include any metadata folders or attribute files that influence status.

rem Status ignoring ignored files and temp artifacts
cm status --nochanges --ignored --nostats

rem Show attributes tracked for line endings and modes
cm attrs list --recursive .

# On Linux/macOS, list files that have the executable bit set
find . -type f -perm -u+x | sed 's#^\./##' > exec-list-unix.txt
# Compare with a Windows-generated list (empty is expected)

Trace Writable Xlink Checkins

When writable Xlinks are involved, you must reconstruct the order of operations across parent and child repositories. Confirm if a parent merge or update pulled a child repo changeset not yet seen elsewhere.

rem In the child repo, list changesets on the release branch since a "known good" label
cm log repo:ChildRepo@YourServer:8087 --branch=br:/release --since=label:Build_2025_08_01

rem From the parent workspace, show where the Xlink path points
cm xlinks --path=/ThirdParty/PhysicsEngine

Deep Dive: Architectural Pitfalls That Cause Recurrence

Floating Xlinks + Long-Lived Branches

Floating Xlinks (following a branch tip) seem convenient in active development. In practice, they bind the workspace to time, not to intent. A "green" build on Friday can turn red on Monday without any changes in the parent repository because the child tip moved. On long-lived integration or release branches, the effect is amplified: merges drag in unrelated vendor or content updates.
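
A toy model makes the "bound to time" point concrete. This is an illustrative sketch only: the dictionary shapes, the `resolve` function, and the changeset numbers are assumptions made for the example and do not mirror how Plastic stores Xlinks.

```python
def resolve(xlink, branch_tips):
    """Return the child changeset an Xlink resolves to at update time.

    xlink:       {"mode": "pinned", "changeset": int} or
                 {"mode": "floating", "branch": str}
    branch_tips: {branch_name: tip_changeset} as seen at the moment of update
    """
    if xlink["mode"] == "pinned":
        return xlink["changeset"]
    # Floating: the answer depends on when the update happens.
    return branch_tips[xlink["branch"]]


if __name__ == "__main__":
    friday = {"br:/main": 880}  # child tip when the build was green
    monday = {"br:/main": 902}  # child tip after weekend checkins

    floating = {"mode": "floating", "branch": "br:/main"}
    pinned = {"mode": "pinned", "changeset": 880}

    print("floating:", resolve(floating, friday), resolve(floating, monday))
    print("pinned:  ", resolve(pinned, friday), resolve(pinned, monday))
```

The parent changeset is identical in both calls; only the floating Xlink yields a different child changeset on Monday.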

Partial Rule Drift and Gluon vs Developer Workspaces

Gluon workspaces simplify artist workflows but can hide selection details. When an engineer using a developer workspace makes changes that rely on ignored or unselected files, a Gluon user may receive a seemingly identical update that lacks those files. CI agents often run developer workspaces for speed; if their partial rules differ from the canonical selection, their "clean" checkouts are not equivalent to production.

Cross-Platform EOL/Case Normalization

In mixed fleets, Windows' case-insensitive file systems and CRLF endings meet the LF-by-default environments of Linux and macOS. Linux file systems are typically case-sensitive, while macOS volumes are case-insensitive (but case-preserving) by default and can be formatted case-sensitive. Renames that only change case, or scripts toggling executable bits, register as pending changes on one platform and are ignored on another. Over time, these differences create merge churn, especially in directories brought in via Xlinks.
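
To see whether line endings are a factor, it can help to classify the files in a suspect directory by their EOL style. The sketch below works on raw bytes and is independent of Plastic; the function name is hypothetical, and lone CR characters (classic Mac endings) are deliberately not counted.

```python
def classify_eol(data: bytes) -> str:
    """Classify a file's line endings as 'crlf', 'lf', 'mixed', or 'none'."""
    crlf = data.count(b"\r\n")
    lf_only = data.count(b"\n") - crlf  # LF characters not preceded by CR
    if crlf and lf_only:
        return "mixed"
    if crlf:
        return "crlf"
    if lf_only:
        return "lf"
    return "none"


if __name__ == "__main__":
    samples = {
        "build.sh": b"#!/bin/sh\necho ok\n",
        "build.bat": b"@echo off\r\necho ok\r\n",
        "edited-on-both.txt": b"line one\r\nline two\n",
    }
    for name, content in samples.items():
        print(name, classify_eol(content))
```

Files reported as "mixed" are typical candidates for phantom changes, because each platform's tooling may rewrite them differently.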

Writable Xlinks and Merge Ordering

Writable Xlinks enable local edits in the child repo from the parent workspace. Unless teams enforce a merge order (child first, then parent) and pin the resulting child changeset in the parent, engineers will intermittently see parent merges that reference child changesets not yet present in their local clone or cache, producing spurious conflicts.
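
One way to detect the situation before a merge starts is to check that every child changeset the parent pins is already available locally. This is a minimal sketch under stated assumptions: the data shapes are invented for illustration, and collecting the list of locally available changesets is left to whatever query your tooling provides.

```python
def missing_child_changesets(parent_pins, available):
    """Return the parent pins whose child changeset is not available locally.

    parent_pins: {xlink_path: (child_repo, changeset)}
    available:   {child_repo: set of changesets present in the local clone/cache}
    """
    return {
        path: (repo, changeset)
        for path, (repo, changeset) in parent_pins.items()
        if changeset not in available.get(repo, set())
    }


if __name__ == "__main__":
    # Hypothetical pins and local state.
    pins = {
        "/ThirdParty/PhysicsEngine": ("ChildRepo", 12345),
        "/Content/WorldAssets": ("AssetsRepo", 902),
    }
    local = {"ChildRepo": {12000, 12345}, "AssetsRepo": {880}}

    for path, (repo, cs) in missing_child_changesets(pins, local).items():
        print(f"{path}: changeset {cs} of {repo} is not available locally yet")
```

A non-empty result suggests the child repository should be synchronized before the parent merge is attempted.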

Step-by-Step Fix: Remediate Today, Stabilize Tomorrow

Step 1: Freeze Xlinks for the Branch in Trouble

Pin every Xlink referenced by the problematic branch to a deterministic changeset. This converts "time-dependent" workspaces into "intent-dependent" ones.

rem Pin all Xlinks in the current workspace to their currently resolved cs IDs
cm xlinks --pin-all

rem Alternatively, pin a single Xlink path to a known changeset
cm xlinks --pin /ThirdParty/PhysicsEngine cs:12345

rem Check in the selector/config change so everyone shares the same pins
cm checkin -m "Pin Xlinks for release branch to stabilize builds"

Result: Updates on other machines now resolve the same child repo versions, removing build-to-build variation unrelated to the parent repo.

Step 2: Align Partial Workspace Rules and Publish a Canonical "Selection Contract"

Create a single source of truth for which paths are loaded in developer, Gluon, and CI workspaces. Ensure metadata, attribute files, and platform-specific scripts are included where they influence status or build outputs.

rem Export partial rules to a tracked file in the repo
cm partial export --file=.ci/partial-rules.txt

rem Apply the same rules consistently on CI agents
cm partial apply --file=.ci/partial-rules.txt
cm update

rem Validate no pending changes immediately after an update
cm status --nochanges --ignored

Result: CI and developer machines operate with the same selection, eliminating "works on my box" discrepancies due to hidden files.
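
A selection contract is easier to enforce when a script can say which required paths a given profile fails to load. The sketch below treats each rule as a plain directory or file prefix, which is a simplifying assumption; real partial rules may support other forms, and the profile contents shown are hypothetical.

```python
def is_selected(path, rules):
    """True if path equals a rule or lives underneath a rule directory."""
    for rule in rules:
        prefix = rule.rstrip("/") + "/"
        if path == rule or path.startswith(prefix):
            return True
    return False


def uncovered(required_paths, rules):
    """Return the required paths that the given rules do not load."""
    return [p for p in required_paths if not is_selected(p, rules)]


if __name__ == "__main__":
    required = ["/Tools/Exporter/config/export.ini", "/Content/WorldAssets/map.bin"]
    artist_profile = ["/Content", "/Tools/Exporter/bin"]
    ci_profile = ["/Content", "/Tools"]

    print("artist misses:", uncovered(required, artist_profile))
    print("ci misses:    ", uncovered(required, ci_profile))
```

Running such a check for every published profile shows where two roles would compute status from different file sets.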

Step 3: Normalize EOL, Executable Bit, and Case Policies

Codify line-ending and mode behavior through Plastic attributes and pre-checkin validation. Prevent case-only renames and accidental mode flips from entering history.

rem Example: set LF for shell scripts, CRLF for batch files
cm attrs set eol=lf **/*.sh
cm attrs set eol=crlf **/*.bat

rem Example: enforce executable bit on POSIX scripts
cm attrs set x=on **/*.sh

rem (Server-side) Pre-checkin trigger sketch: reject case-only renames
rem Pseudo-trigger script; reject_case_only_moves.py is a placeholder name
cm pendingchanges --format="{PATH} {CHANGE_TYPE}" | python reject_case_only_moves.py
exit %ERRORLEVEL%

Result: The same tree produces the same diffs across OSes, and "phantom changes" caused by normalization disappear.
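
The core test inside a script such as reject_case_only_moves.py could look like the following. This is a sketch of the detection logic only, not of a trigger's input or output format; how rename pairs reach the script is an assumption left open here.

```python
def is_case_only_rename(old_path, new_path):
    """True when two paths differ only in letter case."""
    return old_path != new_path and old_path.casefold() == new_path.casefold()


def case_collisions(paths):
    """Group paths that would collide on a case-insensitive file system."""
    groups = {}
    for path in paths:
        groups.setdefault(path.casefold(), set()).add(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    print(is_case_only_rename("assets/Tree.fbx", "assets/tree.fbx"))
    print(case_collisions(["assets/Tree.fbx", "assets/tree.fbx", "assets/rock.fbx"]))
```

The first helper suits a pre-checkin check on individual moves; the second can scan a whole tree for names that cannot coexist on case-insensitive machines.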

Step 4: Establish Merge Ordering for Writable Xlinks

When Xlinks are writable, require that child repositories are merged and labeled first, then the parent is updated to pin to that label. This ordering avoids conflict cycles.

rem In the child repo
cm switch br:/release
cm merge br:/main
cm checkin -m "Merge main into release (child repo)"
cm label create Build_2025_08_Release

rem In the parent repo workspace
cm xlinks --pin /ThirdParty/PhysicsEngine label:Build_2025_08_Release
cm checkin -m "Pin child to Build_2025_08_Release before parent merge"

rem Now perform the parent merge
cm merge br:/main
cm resolve --all
cm checkin -m "Merge main into release (parent) with child pinned"

Result: Parent merges never drag in unexpected child changes; conflicts diminish and become localized.

Step 5: Sanitize Stale CI Agents

Ensure CI agents start from a known-clean state. Long-lived caches are valuable but must be periodically validated against the expected manifest.

rem CI bootstrap script fragment
cm showselector > .ci/selector.before.txt
cm xlinks --list > .ci/xlinks.before.txt
cm status --nochanges --ignored || (cm undo . --all && cm update)

cm switch br:/release
cm update

cm xlinks --list > .ci/xlinks.after.txt
diff -u .ci/xlinks.before.txt .ci/xlinks.after.txt || echo "Xlink set changed (expected on branch switch)"

cm status --nochanges --ignored || (echo "Unexpected pending changes" & exit 1)

Result: Pipelines fail fast when agents drift, instead of producing silently corrupt workspaces.

Verification: Prove You Fixed the Right Thing

Reproducibility Test Matrix

Create a three-dimensional test: OS (Windows/macOS/Linux) × workspace type (developer/Gluon) × role (dev/artist/CI). For a chosen parent branch changeset, every combination in the matrix (3 × 2 × 3 = 18 in the full cross product, fewer if some pairings never occur in your organization) should produce identical Xlink pin manifests and zero pending changes immediately after update.
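
Enumerating the matrix programmatically keeps the cell count honest: the full cross product of the three axes has 3 × 2 × 3 = 18 cells, though a team may prune pairings that never occur. The sketch below builds the matrix and flags failing cells; the result format (a pin-manifest hash plus a pending-change count per cell) is an assumption for illustration.

```python
from itertools import product

OSES = ("Windows", "macOS", "Linux")
WORKSPACE_TYPES = ("developer", "Gluon")
ROLES = ("dev", "artist", "CI")


def build_matrix():
    """Return every (os, workspace_type, role) combination."""
    return list(product(OSES, WORKSPACE_TYPES, ROLES))


def failing_cells(results, expected_hash):
    """Return cells whose pin hash differs or that report pending changes.

    results: {(os, workspace_type, role): (pin_hash, pending_count)}
    """
    return sorted(
        cell
        for cell, (pin_hash, pending) in results.items()
        if pin_hash != expected_hash or pending != 0
    )


if __name__ == "__main__":
    matrix = build_matrix()
    print(len(matrix), "cells to verify")

    # Hypothetical results: everything clean except one cell.
    results = {cell: ("abc123", 0) for cell in matrix}
    results[("Linux", "developer", "CI")] = ("def456", 2)
    print(failing_cells(results, "abc123"))
```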

# Automated cross-machine verification snippet
cm xlinks --format="{PATH} {PINNED_CS}" --list | sort > xlink-pins.env.txt
cm status --nochanges --ignored || exit 1
sha256sum xlink-pins.env.txt > xlink-pins.hash
# Upload hash to artifact storage; gate the pipeline if hashes differ between agents
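
The gating step that compares hashes between agents can be sketched in a platform-neutral way. The version below normalizes the manifest before hashing so that line order and CRLF/LF differences between agents do not cause false alarms; the function names and the agent names are hypothetical.

```python
import hashlib


def manifest_hash(text):
    """Hash a pin manifest after sorting lines and dropping blank ones.

    Stripping each line also removes any trailing carriage return, so a
    manifest written on Windows hashes the same as one written on Linux.
    """
    lines = sorted(line.strip() for line in text.splitlines() if line.strip())
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


def agents_agree(hashes_by_agent):
    """True when every agent reported the same manifest hash."""
    return len(set(hashes_by_agent.values())) <= 1


if __name__ == "__main__":
    windows_agent = "/Tools/Exporter cs:77\r\n/Content/WorldAssets cs:902\r\n"
    linux_agent = "/Content/WorldAssets cs:902\n/Tools/Exporter cs:77\n"

    hashes = {
        "win-agent-01": manifest_hash(windows_agent),
        "linux-agent-07": manifest_hash(linux_agent),
    }
    print("agents agree:", agents_agree(hashes))
```

Normalizing first is a design choice: without it, the hash comparison would itself be exposed to the cross-platform noise it is meant to detect.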

Regression Guardrails

Introduce lightweight checks to catch future drift:

  • A server-side trigger that forbids floating Xlinks on protected branches.
  • A pre-merge check that the source and target branches reference identical Xlink pins, or else the merge is blocked pending a "child-first" pin update.
  • A nightly job that diffs the Xlink manifest of release branches across representative machines.
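
The first two guardrails reduce to simple checks over a manifest. The sketch below assumes a manifest maps each Xlink path to a target written as "cs:<id>", "label:<name>", or "br:<branch>", where a branch target stands for a floating Xlink; that notation and the function names are assumptions for the example, and wiring the checks into a trigger is left out.

```python
def floating_xlinks(manifest):
    """Return Xlink paths whose target is a branch rather than a fixed point."""
    return sorted(path for path, target in manifest.items() if target.startswith("br:"))


def pin_mismatches(source_manifest, target_manifest):
    """Return {path: (source_target, target_target)} where the branches disagree."""
    paths = set(source_manifest) | set(target_manifest)
    return {
        path: (source_manifest.get(path), target_manifest.get(path))
        for path in sorted(paths)
        if source_manifest.get(path) != target_manifest.get(path)
    }


if __name__ == "__main__":
    release = {
        "/ThirdParty/PhysicsEngine": "label:Build_2025_08_Release",
        "/Content/WorldAssets": "br:/main",
    }
    main = {
        "/ThirdParty/PhysicsEngine": "label:Build_2025_08_Release",
        "/Content/WorldAssets": "cs:902",
    }
    print("floating on release:", floating_xlinks(release))
    print("mismatches before merge:", pin_mismatches(main, release))
```

A protected-branch check would reject the checkin when the first list is non-empty, and a pre-merge check would block the merge when the second result is non-empty.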

Operational Playbook: What to Do Next Time in 10 Minutes

Rapid Containment Procedure

  1. Pin all Xlinks in the impacted branch to the currently resolved changesets.
  2. Export and publish the effective partial rules for the branch.
  3. Force CI agents to clean-update and verify that "cm status --nochanges" reports no pending changes.
  4. Normalize EOL/mode policies if phantom changes persist.
  5. If writable Xlinks are in play, merge and label child repos first, then re-pin in parent.

Communication Checklist

  • Share the before/after Xlink manifest with the team to explain the drift.
  • Call out policy changes (e.g., "no floating Xlinks on release branches") and how they are enforced.
  • Publish the canonical partial rules and where they live in the repository.

Long-Term Best Practices & Architectural Guidance

Govern Xlinks Like a Bill of Materials

Treat Xlinks as a software BOM. For every protected branch, maintain a machine-generated manifest mapping each Xlink path to an immutable changeset or label. Require a pull-request gate that diff-checks the manifest and demands human approval for any change.

Prefer Labels Over Branch Tips for Child Dependencies

Teach teams to advance dependencies by labeling child repos (e.g., "PhysicsEngine_2025_08_Stable") and pinning to the label in the parent. Labels are auditable intent; branch tips are moving targets. This practice improves root-cause traceability when a regression is introduced by a dependency bump.

Unify Partial Rules Across Roles and Environments

Maintain a single repository of selection profiles (dev, artist, CI). Changes to these profiles go through code review. This drives consistency and prevents invisible drift introduced by local experimentation on CI hosts or personal machines.

Codify Cross-Platform Normalization

Centralize EOL and mode attributes in version control. Add pre-checkin validation that rejects case-only renames and inconsistent executable bits. Decide early whether your repo is treated as case-sensitive or case-insensitive and enforce accordingly.

Define a Merge Order Contract for Writable Xlinks

Document and automate a "child-first" merge/pin protocol. In CI, implement a pipeline stage that validates the child repos are merged and labeled before the parent merge is allowed to proceed. This eliminates re-merge loops attributable to cross-repo ordering.

Control CI Workspace Lifecycles

Cache responsibly: periodically recycle agent workspaces and always verify selector, Xlink pins, and status before build. If performance demands caching, preserve and compare "xlink-pins.hash" artifacts across runs to detect stealth drift.

Leverage Role-Appropriate Tools

Adopt Gluon for artists with locked binary workflows and developer workspaces for engineers. Provide role-specific selection profiles, locking rules, and "how to update safely" guidance to minimize accidental cross-role side effects.

Institutionalize Observability

Log Xlink pin movements as first-class change events. Dashboards that chart "dependency moves" per branch help leaders see when a release line is stabilizing versus churning. During incidents, these timelines shortcut blameless root cause analysis.

Edge Cases & How to Handle Them

Case-Only Renames on Mixed Fleets

On Windows, and on macOS volumes using the default case-insensitive format, "assets/Tree.fbx" and "assets/tree.fbx" resolve to the same file on disk. On Linux, and on macOS volumes formatted as case-sensitive, they are two distinct files. Avoid case-only renames in code or content. If a historic rename exists, normalize the case in a controlled window and update all references atomically to prevent endless flip-flopping in merges.

Binary Lock Contention Across Xlinks

Exclusive locks taken in a child repo may not be obvious in the parent. Surface lock state in the parent's dashboards and teach users to check lock ownership before attempting merges that touch those assets. Automate polite fail-fast messages when a locked asset is encountered during a parent merge.

Replicated Servers & Latency

In globally distributed setups, Xlink target repos may be replicated across servers. Ensure pinning references the correct server/replica or uses labels that resolve identically across replicas. Monitor lag and consider promoting labels only after replicas are in sync to avoid "pin to a label that doesn't exist here yet" scenarios.
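
The "promote only after replicas are in sync" rule can be expressed as a small check. This sketch assumes you can obtain, per replica, the set of labels it currently knows about; how that inventory is collected is outside the example, and the replica names are hypothetical.

```python
def replicas_missing_label(label, labels_by_replica):
    """Return the replicas that do not have the given label yet.

    labels_by_replica: {replica_name: set of label names present on that replica}
    """
    return sorted(
        replica
        for replica, labels in labels_by_replica.items()
        if label not in labels
    )


if __name__ == "__main__":
    inventory = {
        "eu-server": {"Build_2025_08_01", "Build_2025_08_Release"},
        "us-server": {"Build_2025_08_01", "Build_2025_08_Release"},
        "apac-server": {"Build_2025_08_01"},
    }
    lagging = replicas_missing_label("Build_2025_08_Release", inventory)
    if lagging:
        print("Hold the pin; label not replicated to:", ", ".join(lagging))
    else:
        print("Label is present everywhere; safe to pin")
```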

Worked Example: From Chaos to Determinism

Initial Conditions

A game studio's "br:/release" in the parent repo references three Xlinks: "/ThirdParty/PhysicsEngine" (writable), "/Content/WorldAssets" (read-only), and "/Tools/Exporter" (read-only). CI builds break intermittently; developers see different physics behavior day to day.

Diagnosis

  • Two CI agents resolve different changesets for "/Content/WorldAssets" because the Xlink followed "br:/main" in the child repo.
  • Artists use Gluon with a selection that omits "/Tools/Exporter/config"; engineers include it. Status differs.
  • Executable bits on exporter scripts are set on macOS agents but absent on Windows, producing phantom pending changes on mixed merges.

Remediation

  1. Pin all Xlinks on "br:/release" to specific labels created in each child repo.
  2. Publish a unified selection profile; apply to both CI and Gluon users.
  3. Set attributes for "*.sh" (LF, executable) and "*.bat" (CRLF); add a pre-checkin rule to forbid case-only renames.
  4. Enforce a child-first merge order for "/ThirdParty/PhysicsEngine" with labels that the parent pins to before parent merges.

Outcome

All agents produce identical Xlink pin manifests. "cm status --nochanges" is clean post-update everywhere. Merge volume drops 70% and conflicts are now localized to true code/content intersections.

Conclusion

Plastic SCM's Xlinks and partial workspaces are powerful, but with power comes a duty to design for determinism. Floating dependencies, ungoverned selection rules, and cross-platform normalization gaps produce the illusion of instability where none should exist. By pinning Xlinks deliberately, unifying partial rules, normalizing EOL/mode/case behavior, and enforcing a child-first merge order for writable Xlinks, you turn an intermittently failing system into a reproducible supply chain of code and content. For architects and tech leads, the payoff is not just fewer incidents; it is faster flow with higher confidence that a changeset means the same thing on every machine, every time.

FAQs

1. Should we ever use floating Xlinks in enterprise branches?

Use floating Xlinks only on short-lived feature branches with a clear policy to pin before integration. On protected branches (release, stabilization, main), pin to labels or specific changesets to guarantee reproducibility.

2. How do Gluon workspaces affect reproducibility?

Gluon simplifies selective loading for artists, but differing selection profiles can hide files that influence status or builds. Publish a canonical selection contract and apply it to Gluon and developer workspaces alike.

3. Can we eliminate phantom pending changes across OSes?

Yes, by codifying EOL and file-mode attributes and rejecting case-only renames in pre-checkin validation. Once normalization rules are enforced, "status" reflects real edits rather than platform quirks.

4. What's the safest way to update writable Xlinks?

Merge and label in the child repo first, then pin that label in the parent and check in the pin change before performing parent merges. This ordering prevents recursive conflicts and "hidden" updates.

5. How should CI handle large cached workspaces?

Cache with guardrails: verify selector, Xlink pin hashes, and clean status at the start of each job. Periodically recycle caches and fail fast when drift is detected, rather than attempting to build on a compromised workspace.