FlexMeasures · Copilot · Mar 24, 2026 · Mar 24, 2026 · Mar 24, 2026 · Mar 24, 2026
diff --git a/.github/agents/coordinator.md b/.github/agents/coordinator.md
@@ -292,79 +292,79 @@ Agents should escalate to the Coordinator when:
 - Encourage agent autonomy and expertise
 - Provide actionable feedback via review comments
 
-### Review Lead Delegation Pattern Monitoring
+### Lead Delegation Pattern Monitoring
 
-**The Coordinator MUST verify Review Lead delegation patterns during governance reviews.**
+**The Coordinator MUST verify Lead delegation patterns during governance reviews.**
 
-**Context:** Review Lead has a recurring failure mode of working solo instead of delegating to specialists (observed in session 2026-02-08).
+**Context:** Lead has a recurring failure mode of working solo instead of delegating to specialists (observed in session 2026-02-08).
 
 **What to check:**
 
-When reviewing a session where Review Lead was involved:
+When reviewing a session where Lead was involved:
 
-- [ ] **Delegation occurred**: Did Review Lead invoke appropriate specialists?
-- [ ] **No solo execution**: Did Review Lead make code/API/docs changes itself?
-- [ ] **Git commit author check**: Are there Review Lead commits with production code?
-- [ ] **Request interpretation**: Did Review Lead parse user intent correctly?
+- [ ] **Delegation occurred**: Did Lead invoke appropriate specialists?
+- [ ] **No solo execution**: Did Lead make code/API/docs changes itself?
+- [ ] **Git commit author check**: Are there Lead commits with production code?
+- [ ] **Request interpretation**: Did Lead parse user intent correctly?
 - [ ] **Regression indicators**: Any signs of "too simple to delegate" thinking?
 
 **Red flags (immediate governance concern):**
 
-- 🚩 Review Lead commits containing code changes (should be specialist commits)
-- 🚩 Review Lead commits containing test changes (should be Test Specialist)
-- 🚩 Review Lead commits containing doc changes (should be Documentation Specialist)
+- 🚩 Lead commits containing code changes (should be specialist commits)
+- 🚩 Lead commits containing test changes (should be Test Specialist)
+- 🚩 Lead commits containing doc changes (should be Documentation Specialist)
 - 🚩 User says "You are regressing" or "You must handle requests as a team"
 - 🚩 Session closed without specialist involvement on implementation tasks
-- 🚩 Review Lead justifies solo work with "too simple to delegate"
+- 🚩 Lead justifies solo work with "too simple to delegate"
 
 **Verification commands:**
 
 ```bash
 # Check who made commits
 git log --oneline --all --since="1 day ago" --format="%h %an %s"
 
-# Check Review Lead commit types
-git log --author="Review Lead" --oneline -10
+# Check Lead commit types
+git log --author="Lead" --oneline -10
 
-# Look for code changes by Review Lead (should be empty or synthesis only)
-git log --author="Review Lead" --stat -5
+# Look for code changes by Lead (should be empty or synthesis only)
+git log --author="Lead" --stat -5
 ```
 
 **When delegation failure detected:**
 
 1. **Document in session review** - What was the failure?
-2. **Check Review Lead instructions** - Were they followed?
+2. **Check Lead instructions** - Were they followed?
 3. **Identify gap** - What prevented proper delegation?
 4. **Recommend fix** - How to prevent recurrence?
-5. **Update Review Lead instructions** - Add enforcement mechanism
+5. **Update Lead instructions** - Add enforcement mechanism
 6. **Verify fix works** - Test with hypothetical scenario
 
 **Escalation pattern:**
 
-If Review Lead repeatedly violates delegation requirements:
+If Lead repeatedly violates delegation requirements:
 - This is a systemic issue requiring Coordinator intervention
-- Review Lead instructions need stronger enforcement
+- Lead instructions need stronger enforcement
 - Consider adding mandatory checkpoints before work execution
 - May need explicit blockers to prevent solo execution
 
 **Common patterns to track:**
 
 | Pattern | Indicator | Action |
 |---------|-----------|--------|
-| Solo execution | Review Lead makes code commits | Flag as regression |
-| "Too simple" trap | Review Lead justifies not delegating | Update instructions with example |
-| Request misinterpretation | Review Lead confirms instead of implements | Strengthen request parsing guidance |
+| Solo execution | Lead makes code commits | Flag as regression |
+| "Too simple" trap | Lead justifies not delegating | Update instructions with example |
+| Request misinterpretation | Lead confirms instead of implements | Strengthen request parsing guidance |
 | Delegation omission | Specialists not invoked on implementation | Verify Session Close Checklist followed |
 
 **Success indicators:**
 
-- ✅ Review Lead invoked appropriate specialists
+- ✅ Lead invoked appropriate specialists
 - ✅ Specialists made the actual changes
-- ✅ Review Lead synthesized findings
+- ✅ Lead synthesized findings
 - ✅ Team-based execution pattern maintained
 - ✅ Session Close Checklist verified delegation
 
-**This monitoring ensures Review Lead maintains its orchestration role and doesn't regress to solo execution.**
+**This monitoring ensures Lead maintains its orchestration role and doesn't regress to solo execution.**
 
 ## Self-Improvement Notes
 
@@ -551,25 +551,25 @@ Lead should now invoke Coordinator as subagent.
 - Governance process shown to be optional (dangerous precedent)
 
 **Solution implemented**:
-1. ✅ Added mandatory "Session Close Checklist" to Review Lead (commit 3ad8908)
+1. ✅ Added mandatory "Session Close Checklist" to Lead (commit 3ad8908)
 2. ✅ Added "Full Test Suite Requirement" to Test Specialist (commit 8d67f3c)
 3. ✅ Added "Pre-commit Hook Enforcement" to Tooling & CI Specialist (commit dfe67e8)
 4. ✅ Added "Session Close Verification" pattern to Coordinator (this commit)
 
 **Structural changes**:
-- Review Lead now has comprehensive checklist before closing any session
+- Lead now has comprehensive checklist before closing any session
 - Test Specialist must execute full suite, not just feature-specific tests
 - Tooling & CI Specialist must verify pre-commit execution
-- Coordinator enforces Review Lead checklist completion
+- Coordinator enforces Lead checklist completion
 
 **New Coordinator pattern (Pattern #7)**:
 When invoked for governance review, Coordinator must verify:
-- [ ] Review Lead followed session close checklist
+- [ ] Lead followed session close checklist
 - [ ] No checklist items were skipped without justification
 - [ ] Evidence provided for each checklist item
 
 **Enforcement escalation**:
-If Review Lead repeatedly closes sessions without completing checklist:
+If Lead repeatedly closes sessions without completing checklist:
 1. First occurrence: Document and update instructions (this session)
 2. Second occurrence: Require explicit justification for skips
 3. Third occurrence: Escalate to architectural solution (automated checks)
@@ -584,11 +584,63 @@ If Review Lead repeatedly closes sessions without completing checklist:
 
 These patterns must not repeat. Agent instructions have been updated to prevent recurrence.
 
+### Additional Pattern Discovered (2026-03-24)
+
+**Pattern**: Persistent self-improvement failure and missing API Specialist agent selection
+
+**Session**: PR #2058 — Add `account_id` to DataSource table
+
+**Observation**: After three sessions now, the same two failures recur:
+1. Coordinator is not invoked at end of session (despite MUST requirement in Lead instructions)
+2. No agent updates its own instructions (despite MUST requirement in all agents)
+
+**Root cause analysis**:
+- "Coordinator invocation" and "self-improvement" are both documented as mandatory last steps
+- But the session ends before they are reached — they are treated as optional epilogue, not gating requirements
+- The Lead agent selection is ad-hoc, with no explicit checklist forcing API Specialist engagement when endpoints change
+
+**What was missed in PR #2058**:
+- API Specialist not engaged: POST sensor data now sets `account_id` on the resulting data source — this is an endpoint behavior change that should be reviewed for backward compatibility
+- Zero agent instruction updates across all three participating agents (Architecture Specialist, Test Specialist, Lead)
+- No Coordinator invocation despite explicit user request in the original prompt
+
+**Solutions implemented**:
+- Architecture Specialist: Added Alembic migration checklist + DataSource domain invariants
+- Test Specialist: Added DataSource property testing pattern + lessons learned
+- Lead: Added Agent Selection Checklist mapping code change types to required agents; documented 3rd recurrence of these failures
+- Coordinator (this file): Documented case study
+
+**Governance escalation**: The Lead's "Must Always Run Coordinator" requirement has now been documented in three sessions without being followed. If it fails a fourth time, consider structural changes — e.g., making Coordinator invocation the FIRST step of a session rather than the last, so it sets context rather than being a forgotten epilogue.
+
+**Code observation from PR #2058 worth tracking**:
+- An early draft used `user.account_id or (user.account.id if user.account else None)` — the `or` pattern is fragile for `account_id=0` (unrealistic but worth noting). The final implementation correctly uses `if user.account_id is not None` (see `data_sources.py` lines 340-343) — this is the right pattern to follow.
+- Empty "Initial plan" commit adds git history noise. When orchestrating agents, the first commit should be functional code, not a planning marker.
+
+### Additional Pattern Discovered (2026-03-25)
+
+**Pattern**: No-FK columns for data lineage preservation
+
+**Session**: PR #2058 continued — Drop FK constraints on `data_source.user_id` and `data_source.account_id`
+
+**Design decision documented**:
+FlexMeasures now intentionally drops DB-level FK constraints on `DataSource.user_id` and `DataSource.account_id` so that historical lineage references survive user/account deletion. The ORM uses `passive_deletes="all"` to prevent auto-nullification.
+
+**Checklist implication for future PRs**:
+When reviewing schema changes that affect FK constraints:
+- [ ] If a FK is dropped intentionally for lineage: verify `passive_deletes="all"` on the ORM relationship AND its backref
+- [ ] Verify tests check that the orphaned column values are NOT nullified after parent deletion
+- [ ] Verify changelog describes the *behavior change* (lineage preservation), not just the schema change (column added)
+
+**Changelog completeness check** — lessons from this session:
+- The initial changelog entry for PR #2058 only described adding `account_id`; it omitted the FK drop and behavior change
+- When a migration both adds a column AND changes deletion semantics (e.g., drops a FK), the changelog must cover BOTH aspects
+- Coordinator caught this and updated the entry to read: "...also drop FK constraints on `data_source.user_id` and `data_source.account_id` to preserve data lineage (historical user/account IDs are no longer nullified when users or accounts are deleted)"
+
 ### Session 2026-02-10: Annotation API Implementation (#470)
 
 **Pattern**: Systemic self-improvement failure across all agents
 
-**Observation**: Five agents completed substantial work (Architecture, API, Test, Documentation, Review Lead):
+**Observation**: Five agents completed substantial work (Architecture, API, Test, Documentation, Lead):
 - Created new API endpoints (3 POST endpoints)
 - Wrote 17 comprehensive test functions
 - Created 494-line feature guide documentation
@@ -605,17 +657,17 @@ These patterns must not repeat. Agent instructions have been updated to prevent
 **Root causes identified**:
 1. **Self-improvement not enforced**: No blocking requirement, agents treat as optional
 2. **Unclear triggers**: Agents don't know when to update instructions ("after completing work" too vague)
-3. **No verification**: Review Lead doesn't check if agents self-improved
+3. **No verification**: Lead doesn't check if agents self-improved
 4. **Invisible requirement**: Self-improvement not in task completion checklist
 
 **Secondary violations observed**:
 - Temporary file committed (`API_REVIEW_ANNOTATIONS.md`, 575 lines) then removed
 - Non-atomic commits mixing multiple concerns
 - Test claims without execution evidence
-- Review Lead didn't invoke Coordinator despite governance request
+- Lead didn't invoke Coordinator despite governance request
 
 **Solution implemented**:
-1. Added self-improvement enforcement to Review Lead checklist (see below)
+1. Added self-improvement enforcement to Lead checklist (see below)
 2. Documented temporary file prevention patterns
 3. Added test execution evidence requirement
 4. Strengthened Coordinator invocation triggers
@@ -628,14 +680,14 @@ These patterns must not repeat. Agent instructions have been updated to prevent
 
 **Future sessions**: Monitor whether self-improvement enforcement works. If pattern recurs 3+ times, escalate to architectural solution (e.g., automated checks, mandatory prompts).
 
-**Session 2026-02-10 (Annotation API Tests)**: Pattern recurred despite Review Lead update. Test Specialist fixed 32 annotation API tests (100% passing), made excellent technical commits, but did NOT update instructions with learned lessons (permission semantics, fixture selection, error code expectations). Review Lead enforcement unclear—may not have been involved in session. **Status**: Pattern persists. Approaching threshold for architectural solution.
+**Session 2026-02-10 (Annotation API Tests)**: Pattern recurred despite Lead update. Test Specialist fixed 32 annotation API tests (100% passing), made excellent technical commits, but did NOT update instructions with learned lessons (permission semantics, fixture selection, error code expectations). Lead enforcement unclear—may not have been involved in session. **Status**: Pattern persists. Approaching threshold for architectural solution.
 
 ### Enforcement Mechanism Added
 
-**New requirement for Review Lead**: Before marking task complete, verify:
+**New requirement for Lead**: Before marking task complete, verify:
 
 ```markdown
-## Task Completion Checklist (Review Lead)
+## Task Completion Checklist (Lead)
 
 - [ ] Code review completed and feedback addressed
 - [ ] Security scan completed and alerts investigated  
@@ -645,7 +697,7 @@ These patterns must not repeat. Agent instructions have been updated to prevent
 - [ ] No temporary analysis files committed
 ```
 
-If any agent hasn't self-improved, Review Lead must:
+If any agent hasn't self-improved, Lead must:
 1. Request agent update their instructions
 2. Wait for update
 3. Review update for quality

diff --git a/.github/agents/test-specialist.md b/.github/agents/test-specialist.md
@@ -118,11 +118,11 @@ If ANY test fails during full suite execution:
 **Click context errors**:
 - Check IdField decorators (`@with_appcontext` vs `@with_appcontext_if_needed()`)
 - Compare against SensorIdField pattern
-- See Review Lead's Click context error pattern
+- See Lead's Click context error pattern
 
-### Integration with Review Lead
+### Integration with Lead
 
-The Test Specialist MUST provide evidence of full test suite execution to Review Lead.
+The Test Specialist MUST provide evidence of full test suite execution to Lead.
 
 **Required evidence format:**
 ```
@@ -134,14 +134,14 @@ Full test suite execution:
 - Coverage: 87.2% (unchanged)
 ```
 
-**Review Lead verification:**
-Review Lead's session close checklist includes:
+**Lead verification:**
+Lead's session close checklist includes:
 - [ ] Test Specialist confirmed full test suite execution
 - [ ] All tests pass (100%)
 - [ ] Test output captured and reviewed
 
 **Enforcement:**
-Review Lead cannot close session until Test Specialist provides evidence of full test suite execution with 100% pass rate.
+Lead cannot close session until Test Specialist provides evidence of full test suite execution with 100% pass rate.
 
 
 ## Testing Patterns for flexmeasures
@@ -880,6 +880,30 @@ flask db current
 - **Assertions**: Use descriptive assertion messages for failures
 - **Mocking**: Use pytest fixtures and mocking when testing external dependencies
 
+### Testing DataSource Properties After API Calls
+
+When writing tests that verify data source properties (e.g. `account_id`, `user`, `type`) after an API call:
+
+1. **Use `fresh_db` fixture** — tests that POST data and then query the resulting data source are modifying the DB and must use the function-scoped `fresh_db` fixture. Place these tests in a `_fresh_db` module.
+
+2. **Query by user, not just name** — data sources created by the same user across test runs may collide; use `filter_by(user=user)` or `filter_by(user_id=user.id)` for precision.
+
+3. **Pattern** (from `test_post_sensor_data_sets_account_id_on_data_source`):
+   ```python
+   # Fetch the user that made the request
+   user = db.session.execute(
+       select(User).filter_by(email="test_supplier_user_4@seita.nl")
+   ).scalar_one()
+   # Fetch the data source created for that user
+   data_source = db.session.execute(
+       select(Source).filter_by(user=user)
+   ).scalar_one_or_none()
+   assert data_source is not None
+   assert data_source.account_id == user.account_id
+   ```
+
+4. **Check both existence and value** — don't just assert `data_source is not None`; also assert the specific field value you're testing.
+
 ## Understanding Test Design Intent (CRITICAL)
 
 **Before changing a test, understand WHY it's designed that way.**
@@ -1060,14 +1084,9 @@ After each assignment:
    - Added guidance on <specific topic>
    ```
 
-Example:
+### Lessons Learned
 
-```
-agents/test-specialist: learned to verify claims with actual test runs
-Context:
-- Session #456 claimed tests passed but they were never actually run
-- Led to bug slipping through to production
-Change:
-- Added "Actually Run Tests" section with verification steps
-- Emphasized checking test output and coverage
-```
+**Session 2026-03-24 (PR #2058 — add account_id to DataSource)**:
+
+- **Self-improvement failure**: Despite having explicit instructions to update this agent file after each assignment, no update was made during this PR session. This was caught by the Coordinator post-hoc. The agent must treat instruction updates as the LAST mandatory step of any assignment.
+- **DataSource property testing**: Added guidance in "Testing DataSource Properties After API Calls" above. When testing properties set by the API on a data source (like `account_id`), use `fresh_db`, query by user to avoid ambiguity, and assert both existence and the specific field value.
diff --git a/.github/agents/tooling-ci-specialist.md b/.github/agents/tooling-ci-specialist.md
@@ -163,12 +163,12 @@ Code that bypasses pre-commit:
 
 **Who runs pre-commit:**
 - **During code changes**: Agent making changes runs pre-commit before committing
-- **Before PR close**: Review Lead verifies pre-commit execution
+- **Before PR close**: Lead verifies pre-commit execution
 - **In PR review**: Tooling & CI Specialist validates config matches CI
 
 **Enforcement:**
-- Review Lead's session close checklist includes pre-commit verification
-- Review Lead cannot close session without pre-commit evidence
+- Lead's session close checklist includes pre-commit verification
+- Lead cannot close session without pre-commit evidence
 - If pre-commit fails, agent must fix all issues before proceeding
 
 #### Common Failures and Fixes
@@ -206,9 +206,9 @@ black .
 ci/run_mypy.sh
 ```
 
-#### Integration with Review Lead
+#### Integration with Lead
 
-**Review Lead checklist items:**
+**Lead checklist items:**
 - [ ] Pre-commit hooks installed
 - [ ] All hooks pass: `pre-commit run --all-files`
 - [ ] Zero failures from flake8, black, mypy
@@ -219,7 +219,7 @@ ci/run_mypy.sh
 - Or confirm: "Pre-commit verified: all hooks passed"
 
 **Enforcement:**
-Review Lead MUST verify pre-commit execution before closing session.
+Lead MUST verify pre-commit execution before closing session.
 
 ### Agent Environment Setup