    Specification: Telematics Migration into bf-manage-core with 5-Minute Freshness and Health Visibility

    TLDR (Solution Summary)

    • Keep a clear transition boundary: bf-manage-core owns EventBridge scheduling, provider polling tasks, provider interpretation, command handling, persistence, and read APIs; bf-telematics owns websocket transport/runtime only.
    • Workspace Poll Schedule: run each workspace on a staggered 5-minute cycle inside core, avoiding "all vehicles timeout at once" behavior without routing scheduled polls through telematics.
    • Canonical Telematics Snapshot: store one authoritative freshness/value record per vehicle/device in core to eliminate cross-page value mismatch.
    • Shared Command State in Core: persist cutover flags, configured devices, schedule metadata, and snapshots in shared SQL so that fresh and restarted core instances behave identically.
    • Request-Time Health and Issues: compute health/issues in the core query flow (no event bus/projections in this phase) to expose discoverable diagnostics.
    • Stale and Missing Data List: identify devices not updated recently and configured devices with no data, giving Accounts a direct outreach/worklist view.
    • Shared Core Ingest Path: send core-owned poll results and websocket-originated provider envelopes through the same core acceptance path to keep validation, stale/duplicate rejection, and persistence behavior consistent.
    • Migration Cutover Plan: route current telematics reads/writes through bf-manage-core with a toggle-based rollout, so rollback is fast and requires no destructive data changes.

    1. Summary

    Problem

    • Sales and Accounts are seeing synchronized timeout behavior, inconsistent values between pages, and poor discoverability when configured devices stop reporting or never report.
    • James (Accounts liaison) currently follows up with accounts manually, without a reliable stale-device signal.

    Goal and Success Criteria

    • Achieve predictable telematics updates at a 5-minute service level target.
    • Present consistent telematics values across pages from one canonical state.
    • Surface actionable health/reporting so support and Accounts can immediately identify non-reporting devices.

    What Will Be Built In This Phase

    • Core Poll Runtime: move EventBridge-triggered workspace polling and provider poll task execution into bf-manage-core, keeping scheduled polling ownership inside core.
    • Telematics Stream Transport Layer: keep websocket connection/runtime ownership in bf-telematics and forward provider-tagged raw messages to core.
    • Core Provider Interpretation + Persistence: interpret provider envelopes, normalize/validate accepted updates, and write shared workspace command state, canonical snapshots, and event history in bf-manage-core.
    • Shared Update Entry: route websocket-originated provider envelopes and core poll-produced updates into the same core ingest acceptance path to keep command validation, stale/duplicate rejection, and persistence uniform.
    • Request-Time Health and Issues: evaluate freshness/status at query time from persisted core data and expose "last update received" and stale/no-data lists.
    • Migration Cutover Plan: move read ownership and scheduled poll ownership to core while keeping websocket ingress isolated in telematics during shadow/canary phases.

    Scope (In)

    • Per-workspace poll cadence control and staggering.
    • Core poll-task orchestration for scheduled provider polling.
    • Telematics-service-to-core provider-ingest contract for websocket-originated raw provider messages.
    • One shared core acceptance path for scheduled poll results and websocket provider envelopes.
    • Freshness semantics and health classification in core read APIs.
    • Health visibility for operations and account-facing workflows.
    • Toggle-based migration of read and polling ownership.

    Scope (Out)

    • Replacing all provider connector internals beyond the provider-ingest contract requirements.
    • Event bus + projection read-model pipeline for this phase (deferred).
    • Detailed long-term persistence optimization work.

    Current Baseline (As of April 14, 2026)

    • bf-manage-core now owns EventBridge scheduling and scheduled provider polling execution for the MVP path.
    • bf-telematics owns the Viriciti websocket transport/runtime and forwards provider-tagged raw messages into core.
    • bf-manage-core interprets provider envelopes, applies shared ingest/idempotency rules, and persists canonical snapshots plus event history.
    • The legacy pain points that still motivate the migration are synchronized updates/timeouts and inconsistent values across UI pages.

    Future Evolution Guardrails

    • The model must remain compatible with both interval-based polling and provider push/stream updates.
    • Event-bus/projection architecture remains a planned evolution after this transitional phase.
    • Health states must remain extensible (for new failure classes and richer diagnostics) without changing user-facing semantics.

    2. Users and Use Cases

    Primary Personas

    • Accounts liaison (James): needs a reliable list of non-reporting devices to coordinate with customer accounts.
    • Fleet operations user: needs confidence that telematics values are fresh and consistent across pages.
    • Support/implementation user: needs fast discoverability of configured devices that are not receiving data.

    High-Level User Stories

    • As Accounts, I can see which devices have not updated recently so I can proactively contact relevant accounts.
    • As Operations, I see consistent telematics values and freshness timestamps regardless of page.
    • As Support, I can detect "configured but no data" devices without manual cross-checking.

    Edge Cases and Failure Modes

    • Workspace-wide poll failures causing many stale vehicles at once.
    • Device configured recently but never receiving first data.
    • Partial workspace updates where some vehicles are fresh and others are stale.
    • Provider latency spikes creating delayed but eventually successful updates.

    3. Conceptual Model Terms and Decisions

    Key Terms

    | Term | Definition | Notes |
    | --- | --- | --- |
    | Workspace Poll Cycle | One scheduled attempt to collect telematics updates for a workspace. | Target cadence is every 5 minutes with staggered start. |
    | Telematics Provider Ingest Contract | Provider-tagged raw message envelope sent from bf-telematics websocket runtime into bf-manage-core. | Boundary between transport/runtime concerns and core-side provider interpretation. |
    | Freshness Age | Time since the last accepted telematics update for a vehicle/device. | Derived from canonical "last update received" timestamp. |
    | Stale Device | Configured device whose freshness age exceeds stale threshold. | Default threshold is 15 minutes (configurable per workspace policy). |
    | Configured-No-Data | Configured device with no accepted update since activation beyond grace period. | Default grace period is 60 minutes (configurable) and distinct from stale. |
    | Canonical Telematics Snapshot | Single authoritative read model for current telematics values and freshness metadata. | All pages use this source to avoid inconsistencies. |
    | Shared Workspace Command State | Persisted workspace-level telematics state used by command validation and scheduling decisions. | Includes cutover flags, configured devices, schedule metadata, and snapshots in shared SQL. |
    | Request-Time Health Evaluation | Health and issues calculation performed during query execution from persisted core data. | Used while event bus/projection pipeline is deferred. |
    | Telematics Health Report | Workspace-level view listing freshness state, update lag, and issue reason by device/vehicle. | Primary operational and accounts-facing artifact (computed on request in this phase). |
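
The terms above imply a small classification rule. The following is a minimal sketch of it, not the bf-manage-core implementation. The function name, the `PENDING_FIRST_DATA` state for devices still inside the grace period, and the choice of the 5-minute target as the healthy/delayed boundary are all assumptions made for illustration; only the 15-minute and 60-minute defaults come from this specification.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class HealthState(str, Enum):
    HEALTHY = "healthy"
    DELAYED = "delayed"
    STALE = "stale"
    CONFIGURED_NO_DATA = "configured-no-data"
    # Hypothetical extra state: the spec does not say how a newly
    # configured device is reported before its grace period expires.
    PENDING_FIRST_DATA = "pending-first-data"


FRESHNESS_TARGET = timedelta(minutes=5)   # service-level target
STALE_THRESHOLD = timedelta(minutes=15)   # default stale threshold
NO_DATA_GRACE = timedelta(minutes=60)     # default configured-no-data grace


def classify_device(
    now: datetime,
    configured_at: datetime,
    last_update_received: Optional[datetime],
    stale_threshold: timedelta = STALE_THRESHOLD,
    no_data_grace: timedelta = NO_DATA_GRACE,
) -> HealthState:
    """Classify one configured device from its canonical timestamps."""
    if last_update_received is None:
        # Never reported: configured-no-data only once the grace period is over.
        if now - configured_at > no_data_grace:
            return HealthState.CONFIGURED_NO_DATA
        return HealthState.PENDING_FIRST_DATA
    freshness_age = now - last_update_received
    if freshness_age > stale_threshold:
        return HealthState.STALE
    if freshness_age > FRESHNESS_TARGET:
        # Assumption: "delayed" covers the gap between target and stale threshold.
        return HealthState.DELAYED
    return HealthState.HEALTHY
```

Because the thresholds are parameters, a per-workspace policy can override the defaults without changing the rule itself.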

    Decision Ledger

    | ID | Decision | Rationale | Alternatives Rejected | Implications |
    | --- | --- | --- | --- | --- |
    | D-001 | Set 5-minute freshness as the explicit service-level target. | Sales indicated reliable 5-minute consistency is high-value. | Keep mixed sub-minute targets as primary objective. | Prioritizes predictability and trust over ultra-low latency. |
    | D-002 | Stagger polling by workspace instead of synchronized starts. | Prevents simultaneous update and timeout waves. | Single global aligned polling window. | Requires deterministic workspace offset/jitter policy. |
    | D-003 | Use one canonical snapshot for all UI reads. | Resolves inconsistent values between pages. | Keep page/domain-specific derived snapshots. | Requires read paths to converge before final cutover. |
    | D-004 | Promote "configured-no-data" to a first-class health state. | Removes "wild goose chase" diagnosis pattern. | Fold into generic stale state. | Health report and workflows must distinguish root cause types. |
    | D-005 | Move EventBridge-triggered scheduling and provider polling tasks into bf-manage-core, while leaving websocket runtime in bf-telematics. | Scheduled poll ownership should sit with the same service that owns command state and EventBridge schedules, while stream connection management remains isolated. | Keep scheduler + polling in bf-telematics; move websocket runtime into core as well. | Core becomes the poll-runtime owner; telematics becomes a websocket transport adapter that never participates in scheduled poll execution. |
    | D-006 | Set default stale threshold to 15 minutes. | Balances detection speed with provider delay tolerance against a 5-minute freshness target. | 10-minute default threshold. | Lower false-positive risk; threshold remains configurable by workspace policy. |
    | D-007 | Set default configured-no-data grace period to 60 minutes. | Gives new device configurations a practical initial data window before flagging. | Shorter immediate flagging windows. | Faster discoverability than manual checks while avoiding premature alerts. |
    | D-008 | Defer event bus + projection pipeline in this phase. | Event bus path is not delivery-ready for immediate cutover. | Block migration until full evented architecture is complete. | Health/issues must be calculated at query time initially. |
    | D-009 | Compute health/issues at request time in core query flow. | Delivers customer-facing outcomes now without waiting for projections. | Keep stale/no-data in separate asynchronous projection-only path. | Query service reads multiple persisted sources and applies policy on demand. |
    | D-010 | Maintain toggle-based cutover/rollback per workspace. | Enables shadow, canary, and immediate fallback with low blast radius. | Big-bang migration cutover. | Requires clear ownership toggles and parity checks during migration. |
    | D-011 | Use one shared core ingest acceptance path for websocket provider envelopes and scheduled poll outputs. | Prevents duplicate validation/persistence behavior and keeps acceptance/idempotency rules in one place. | Separate core handlers by source path. | Scheduled poll outputs may enter as canonical updates; websocket ingress may enter as provider-tagged raw envelopes before core-side interpretation. |
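
D-002 calls for a deterministic workspace offset/jitter policy but does not define one. One possible approach, sketched here under that caveat, derives a stable offset inside the 5-minute cycle from a hash of the workspace identifier. The function names and the use of SHA-256 are illustrative assumptions, not the policy chosen by the team.

```python
import hashlib

CYCLE_SECONDS = 300  # 5-minute target cadence


def workspace_offset_seconds(workspace_id: str, cycle_seconds: int = CYCLE_SECONDS) -> int:
    """Map a workspace id to a stable start offset within the cycle."""
    # hashlib is used instead of the built-in hash(), which is salted per
    # process and would give a different offset after every restart.
    digest = hashlib.sha256(workspace_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % cycle_seconds


def next_cycle_start(workspace_id: str, now_epoch: int, cycle_seconds: int = CYCLE_SECONDS) -> int:
    """Return the first epoch second >= now_epoch at which this workspace polls."""
    offset = workspace_offset_seconds(workspace_id, cycle_seconds)
    window_start = now_epoch - (now_epoch % cycle_seconds)
    candidate = window_start + offset
    if candidate < now_epoch:
        # This window's slot has already passed; use the next window.
        candidate += cycle_seconds
    return candidate
```

A hash-based offset spreads workspaces across the window without shared coordination state, at the cost of not guaranteeing an even spread when the number of workspaces is small.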

    4. Domain Model and Eventstorming (Conceptual)

    • Bounded contexts:
      • Telematics Stream Transport (bf-telematics: websocket connection/runtime and provider-message forwarding)
      • Telematics Poll Runtime (bf-manage-core: EventBridge scheduling + provider polling tasks)
      • Telematics Core Processing (bf-manage-core: provider interpretation, command handling, UoW, persistence)
      • Telematics Health Query (bf-manage-core: request-time freshness/issues evaluation)
      • Account Follow-up (stale/no-data operational actions)
    • Core entities:
      • Workspace, Vehicle, Device, Telematics Provider Message, Telematics Snapshot, Health Status
    • Invariants and business rules:
      • Every accepted provider envelope or canonical update maps to exactly one workspace and device identity.
      • Every configured device has exactly one current health state.
      • Freshness state is always computed from the canonical last-update timestamp.
      • All user-facing pages must read from the canonical snapshot contract.
      • Core must reject stale or duplicate updates safely before persistence.
      • A device can only be in one terminal health classification at a time (healthy, delayed, stale, configured-no-data).
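
The rule that core must reject stale or duplicate updates before persistence can be illustrated with an in-memory stand-in for snapshot storage. This is a sketch under stated assumptions: the class names, the result strings, and the exact definitions of "duplicate" (identical to the stored update) and "stale" (not newer than the stored update) are hypothetical, since this specification does not spell them out.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Tuple


@dataclass(frozen=True)
class CanonicalUpdate:
    workspace_id: str
    device_id: str
    received_at: datetime
    normalized_values: Tuple[Tuple[str, float], ...]


@dataclass
class SnapshotStore:
    """In-memory stand-in for canonical snapshot storage (hypothetical)."""

    latest: Dict[Tuple[str, str], CanonicalUpdate] = field(default_factory=dict)

    def accept(self, update: CanonicalUpdate) -> str:
        # Each update maps to exactly one workspace and device identity.
        key = (update.workspace_id, update.device_id)
        current = self.latest.get(key)
        if current is not None:
            if update == current:
                return "rejected_duplicate"  # same timestamp and same values
            if update.received_at <= current.received_at:
                return "rejected_stale"      # not newer than what is stored
        self.latest[key] = update
        return "accepted"
```

Checking before writing keeps the snapshot's "last update received" timestamp monotonic per device, which is what the freshness calculation relies on.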

    Interaction Flow

    flowchart LR
      subgraph TS["bf-telematics (left of ownership boundary)"]
        WsProvider["Provider WebSocket Stream"] --> WsIngest["Ingress Adapter: Stream Message"]
        WsIngest --> Envelope["Provider Envelope Forwarder"]
      end
    
      subgraph CORE_POLL["bf-manage-core poll runtime"]
        Scheduler["EventBridge Workspace Schedule"] --> PollTasks["Provider Polling Tasks"]
        PollTasks --> PollProtocol["Canonical Poll Updates"]
      end
    
      Envelope --> Handoff["Provider Ingest Contract: Raw Provider Envelope"]
      PollProtocol --> Accept["Shared Core Ingest Acceptance Path"]
    
      subgraph CORE["bf-manage-core (right of ownership boundary)"]
        Handoff --> ProviderInterpret["Provider Interpretation / Compatibility Handler"]
        ProviderInterpret --> Accept
        Accept --> CmdService["Service Command Handler: Validate + Apply Update"]
        CmdService --> UoW["Unit of Work"]
        UoW --> RepoAgg["Repository + Aggregate"]
        RepoAgg --> EventStore["Event Store (event_store)"]
        RepoAgg --> Snapshot["Canonical Snapshot Storage"]
    
        ReadApi["Read API"] --> QueryService["Service Query Handler"]
        QueryService --> HealthCalc["Request-Time Health + Issues Calculator"]
        HealthCalc --> EventStore
        HealthCalc --> Snapshot
        HealthCalc --> DeviceRegistry["Configured Device Registry"]
        HealthCalc --> Response["Health / Issues / Snapshot Response"]
    
        ReadApi --> ManagePages["Manage UI Pages"]
        ReadApi --> Accounts["Accounts/Support Follow-up View"]
      end
    
      Deferred["Deferred in this phase: Outbox Processor + Event Bus + Projections"]
      EventStore -.-> Deferred
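
The flow above has two entry points that converge on one acceptance path. A rough sketch of that shape follows. Everything in it is illustrative: the function names, the dictionary field names, the `example-provider` key, and the raw message layout are invented for the example and do not describe any real provider payload or the actual core handlers.

```python
import json
from typing import Callable, Dict, List

Canonical = Dict[str, object]
accepted: List[Canonical] = []  # stand-in for the command handler + persistence


def accept_update(update: Canonical) -> None:
    """Single acceptance path; shared validation would live here."""
    for required in ("workspace_id", "device_id", "received_at", "values"):
        if required not in update:
            raise ValueError(f"missing field: {required}")
    accepted.append(update)


def interpret_example_provider(envelope: Dict[str, str]) -> Canonical:
    """Hypothetical interpreter; the raw message shape is invented."""
    raw = json.loads(envelope["raw_message"])
    return {
        "workspace_id": envelope["workspace_id"],
        "device_id": raw["vid"],
        "received_at": envelope["received_at"],
        "values": {"soc": raw["soc"]},
    }


INTERPRETERS: Dict[str, Callable[[Dict[str, str]], Canonical]] = {
    "example-provider": interpret_example_provider,
}


def ingest_provider_envelope(envelope: Dict[str, str]) -> None:
    """Websocket path: interpret the raw envelope, then use the shared path."""
    interpreter = INTERPRETERS[envelope["provider_type"]]
    accept_update(interpreter(envelope))


def ingest_poll_result(update: Canonical) -> None:
    """Scheduled-poll path: already canonical, so go straight to the shared path."""
    accept_update(update)
```

Keeping interpretation outside `accept_update` means a new provider only adds an interpreter, while validation and rejection rules stay in one place.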

    Control-Plane Flow

    flowchart LR
      SA["System Admin Telematics Dashboard"] -->|"GET/PUT /api/telematics-mvp/workspaces/{workspace_id}/cutover"| CutoverApi["Workspace Cutover API"]
      WS["Workspace Settings Telematics Tab (feature flag: workspace-telematics-mvp-settings; always enabled in DEV)"] -->|"GET/PUT /api/telematics-mvp/workspaces/{workspace_id}/policy"| PolicyApi["Workspace Policy API"]
    
      CutoverApi --> Cmd["TelematicsMvpCommandService"]
      PolicyApi --> Cmd
      Cmd --> UoW["Unit of Work + Workspace Aggregate"]
    
      UoW --> CutoverState["Workspace cutover state: device_registry_source, poll_owner, read_owner, provider_mode"]
      UoW --> PolicyState["Workspace policy + schedule state"]
    
      CutoverState -.->|"controls run_workspace_cycle polling path"| PollCycle["Poll cycle behavior (legacy | both_shadow | core)"]
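
As a rough illustration of how the cutover state could steer the poll cycle, the sketch below models the four state fields named in the diagram and the three `poll_owner` values. The class and function names are hypothetical, the other fields are left as plain strings because their allowed values are not listed here, and the reading of `both_shadow` as "run both paths" is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class PollOwner(str, Enum):
    LEGACY = "legacy"
    BOTH_SHADOW = "both_shadow"
    CORE = "core"


@dataclass(frozen=True)
class WorkspaceCutoverState:
    device_registry_source: str  # allowed values not specified in this document
    poll_owner: PollOwner
    read_owner: str              # allowed values not specified in this document
    provider_mode: str           # allowed values not specified in this document


def poll_paths_to_run(state: WorkspaceCutoverState) -> List[str]:
    """Decide which polling paths execute for one workspace cycle."""
    if state.poll_owner is PollOwner.LEGACY:
        return ["legacy"]
    if state.poll_owner is PollOwner.BOTH_SHADOW:
        # Assumption: shadow mode runs both paths so results can be compared.
        return ["legacy", "core"]
    return ["core"]
```

Rolling back a workspace then amounts to writing a different `poll_owner` value, with no data migration involved.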

    Event Timeline

    timeline
      title Workspace Telematics Health Timeline
      T-2 Control-plane setup: System Admin sets workspace cutover flags in core
      T-1 Policy setup: Workspace Settings tab updates workspace policy (feature-flag gated outside DEV)
      T0 Scheduled trigger: EventBridge starts workspace poll cycle in core
      T1 Ingress preparation: core poll results become canonical updates; telematics websocket messages become provider envelopes
      T2 Core interpretation + command transaction: provider envelopes are interpreted, then UoW writes accepted aggregate changes, event store, and snapshot state
      T3 Query-time evaluation: health/issues are computed from persisted core data when requested
      T4 Follow-up: Read API serves health/issues/snapshot responses for operations and accounts
      T5 Evolution point: event bus/projection path may replace request-time calculation later

    Event Dictionary

    • WorkspacePollCycleTriggered: workspace poll initiated in core | defines cycle boundary | workspaceId, cycleStartedAt | triggers provider collection.
    • TelematicsProviderMessageReceived: provider-tagged raw websocket message received in core | preserves transport/provider boundary while reusing core compatibility logic | workspaceId, providerType, providerId, receivedAt, rawMessage | triggers core-side provider interpretation.
    • TelematicsIngressPayloadPrepared: canonical update payload prepared either directly from a core poll task or after core-side provider interpretation | decouples provider protocol from core domain | workspaceId, providerType, deviceId, vehicleId, payloadPreparedAt, normalizedValues | triggers core command ingest.
    • TelematicsUpdateAccepted: core accepted update payload | refreshes canonical freshness/value truth | workspaceId, deviceId, vehicleId, receivedAt, normalizedValues | triggers snapshot/event persistence.
    • WorkspacePollIngestCompleted: poll ingest finished for workspace | measures ingestion coverage | workspaceId, cycleEndedAt, resultSummary | supports cycle reliability reporting.
    • DeviceHealthEvaluatedOnRequest: health classification computed during query | drives status visibility in MVP | workspaceId, deviceId, freshnessAge, healthState, evaluatedAt | returned in API response.
    • DeviceMarkedConfiguredNoData: configured device has no accepted data beyond grace | surfaces integration issue quickly | workspaceId, deviceId, configuredAt, graceExceededAt | returned in issues response.
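
One entry from the dictionary above could be represented as an immutable record. The field names are taken from the `TelematicsUpdateAccepted` entry; the dataclass form, the `to_record` helper, and the `eventType` key are illustrative assumptions about how such an event might be serialized for the event store.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Dict


@dataclass(frozen=True)
class TelematicsUpdateAccepted:
    # Field names follow the event dictionary entry.
    workspaceId: str
    deviceId: str
    vehicleId: str
    receivedAt: datetime
    normalizedValues: Dict[str, float]

    def to_record(self) -> dict:
        """Flatten to a plain dict suitable for JSON serialization."""
        record = asdict(self)
        record["eventType"] = type(self).__name__      # hypothetical key
        record["receivedAt"] = self.receivedAt.isoformat()
        return record
```

Using timezone-aware timestamps keeps the serialized `receivedAt` unambiguous when records are compared across services.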

    5. Requirements and Constraints

    Functional Requirements

    • FR-001: The system must execute a workspace poll cycle on a 5-minute target cadence for every active workspace via core-owned EventBridge scheduling.
    • FR-002: The system must stagger workspace poll start times to prevent synchronized update and timeout behavior.
    • FR-003: The system must execute scheduled provider polling tasks inside bf-manage-core without calling bf-telematics.
    • FR-004: The system must maintain one canonical telematics snapshot that includes current values and "last update received" per configured device/vehicle.
    • FR-005: The system must provide a Telematics Health Report for each workspace that includes device/vehicle health state and last update received timestamp.
    • FR-006: The system must classify configured devices into health states including at least healthy, delayed, stale, and configured-no-data.
    • FR-007: The system must provide a stale-device list suitable for Accounts workflows, including reason and recency context for follow-up.
    • FR-008: The system must ensure all manage pages that present telematics freshness/value information read from the canonical snapshot/health contract.
    • FR-009: The system must expose workspace-level performance indicators for telematics collection (for example cycle success, delay patterns, and recent update coverage).
    • FR-010: The system must enforce boundary ownership for transitional MVP: bf-manage-core handles scheduled polling, provider interpretation, command persistence, and read APIs, while bf-telematics handles websocket transport/runtime only.
    • FR-011: The system must send websocket-originated provider envelopes and scheduled poll results through the same core ingest acceptance path.
    • FR-012: Until event-bus projections are available, the system must calculate health/issues at request time from persisted core data and device configuration.
    • FR-013: Production telematics command behavior must validate against shared persisted workspace state, not process-local memory or a local state file.
    • FR-014: The core ingest and provider-ingest write endpoints must require M2M authorization.
    • FR-015: The shared core ingest path must reject stale or duplicate updates safely before persistence.
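One way to read FR-002 is to give each workspace a stable offset inside the 5-minute cycle so that poll starts spread out instead of firing together. The sketch below is illustrative only: `stagger_offset_seconds` and the hash-based strategy are assumptions, not the scheduling rule this spec defines, and the resulting offset would still need to be translated into whatever schedule entries EventBridge is given.

```python
import hashlib

CYCLE_SECONDS = 300  # 5-minute target cadence (FR-001)


def stagger_offset_seconds(workspace_id: str, cycle_seconds: int = CYCLE_SECONDS) -> int:
    """Return a stable start offset in [0, cycle_seconds) for a workspace.

    Hypothetical helper: hashing the workspace id is an assumed strategy,
    not something this spec prescribes.
    """
    digest = hashlib.sha256(workspace_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then fold it
    # into the cycle window.
    return int.from_bytes(digest[:8], "big") % cycle_seconds


if __name__ == "__main__":
    for ws in ("ws-alpha", "ws-beta", "ws-gamma"):
        print(ws, stagger_offset_seconds(ws))
```

Because the offset depends only on the workspace id, it stays the same across restarts and across instances, which keeps the spread predictable without shared scheduler state.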

    Non-Functional Requirements

    • NFR-001: Freshness reliability target: at least 95% of active configured devices should have an update age within 5 minutes during normal provider availability windows.
    • NFR-002: Consistency target: pages reading telematics freshness/value state should converge on the same canonical values within one refresh cycle.
    • NFR-003: Observability target: each workspace poll cycle must emit traceable lifecycle records across core poll execution, websocket provider-envelope forwarding, and core ingest (start, completion, outcome summary).
    • NFR-004: Discoverability target: configured-no-data devices must appear in health reporting no later than one evaluation cycle after grace-period breach.
    • NFR-005: Request-time query performance for health/issues must remain within acceptable UI latency budgets for first-wave workspace sizes.
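The NFR-001 target can be checked with simple arithmetic over "last update received" timestamps. The following is a minimal sketch under stated assumptions: `freshness_coverage` and `meets_nfr_001` are hypothetical names, devices with no update count as not fresh, and an empty device set is treated as fully covered.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


def freshness_coverage(
    last_updates: Iterable[Optional[datetime]],
    now: datetime,
    max_age: timedelta = timedelta(minutes=5),
) -> float:
    """Fraction of devices whose last accepted update is at most max_age old."""
    total = 0
    fresh = 0
    for ts in last_updates:
        total += 1
        # A device with no update at all (None) counts as not fresh.
        if ts is not None and now - ts <= max_age:
            fresh += 1
    # Assumption: an empty device set is reported as fully covered.
    return fresh / total if total else 1.0


def meets_nfr_001(coverage: float, target: float = 0.95) -> bool:
    """True when coverage reaches the 95% target stated in NFR-001."""
    return coverage >= target


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    updates = [now - timedelta(minutes=2)] * 19 + [now - timedelta(minutes=20)]
    coverage = freshness_coverage(updates, now)
    print(coverage, meets_nfr_001(coverage))
```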

    Constraints and Assumptions

    • EventBridge is the intended scheduler/orchestration mechanism for core-owned telematics poll cycles and reconciliation sweeps.
    • Providers may be integrated as core-owned scheduled poll tasks or telematics-owned websocket transport while sharing one core ingest acceptance path.
    • Event bus, outbox processor fan-out, and projection read models are intentionally deferred in this phase.
    • Providers may return partial or delayed data; health logic must distinguish delayed vs no-data conditions.
    • Workspace-specific policy values are configurable: the stale threshold defaults to 15 minutes and the no-data grace period defaults to 60 minutes.
    • Migration must preserve business continuity while moving ownership into bf-manage-core.
    • Shared SQL in bf-manage-core is the source of truth for workspace cutover state, configured devices, schedule metadata, and snapshots; in-memory repositories, schedulers, and state files are test/dev-only and not valid production dependencies.
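The policy values above, combined with the health states named in FR-006, suggest a classification along the following lines. This is a sketch, not the calculator's actual rules: the names are hypothetical, the 15-minute and 60-minute defaults are read as the stale threshold and the no-data grace period in that order, the 5-minute healthy boundary is assumed from the poll cadence, and returning `None` while a new device is still inside its grace period is an assumption, since the spec names no state for that case.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class HealthState(str, Enum):
    HEALTHY = "healthy"
    DELAYED = "delayed"
    STALE = "stale"
    CONFIGURED_NO_DATA = "configured-no-data"


@dataclass(frozen=True)
class WorkspacePolicy:
    fresh_within: timedelta = timedelta(minutes=5)    # assumption: poll cadence
    stale_after: timedelta = timedelta(minutes=15)    # default stale threshold
    no_data_grace: timedelta = timedelta(minutes=60)  # default no-data grace


def classify_device(
    configured_at: datetime,
    last_update_at: Optional[datetime],
    now: datetime,
    policy: WorkspacePolicy = WorkspacePolicy(),
) -> Optional[HealthState]:
    """Classify one configured device at request time."""
    if last_update_at is None:
        if now - configured_at > policy.no_data_grace:
            return HealthState.CONFIGURED_NO_DATA
        return None  # assumption: no state is reported inside the grace period
    age = now - last_update_at
    if age <= policy.fresh_within:
        return HealthState.HEALTHY
    if age <= policy.stale_after:
        return HealthState.DELAYED
    return HealthState.STALE


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    configured = now - timedelta(days=2)
    print(classify_device(configured, now - timedelta(minutes=10), now))
```

Keeping the thresholds in one policy object makes per-workspace overrides a matter of passing a different `WorkspacePolicy`, which matches the "configurable with defaults" constraint.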

    Build Item Coverage Mapping

    Build Item | Requirement Coverage
    Core Poll Runtime + Telematics Stream Transport | FR-001, FR-002, FR-003, FR-010, NFR-001
    Core Provider Interpretation + Canonical Snapshot Persistence | FR-004, FR-008, FR-010, FR-011, FR-015, NFR-002
    Request-Time Health and Issues Calculator | FR-005, FR-006, FR-007, FR-009, FR-012, NFR-004, NFR-005
    Migration Cutover Plan | FR-010, FR-011, FR-014, NFR-001, NFR-003

    Verification Notes

    • FR-001/FR-002/NFR-001: verify schedule execution distribution and freshness-age percentile reporting by workspace.
    • FR-004/FR-008/NFR-002: verify cross-page value/freshness parity against the canonical snapshot.
    • FR-005/FR-006/FR-007/NFR-004: verify request-time health-state transitions and stale/no-data surfacing behavior.
    • FR-009/NFR-003: verify performance indicators and end-to-end trace completeness across core poll execution, websocket provider-envelope forwarding, and core ingest.
    • FR-010/FR-011/FR-013/FR-014/FR-015/NFR-005: verify boundary responsibilities, M2M-protected write paths, stale/duplicate rejection, restart/multi-instance shared-state behavior, and request-time query latency under first-wave workspace load.

    6. Interaction and Flow

    • Journey overview:
      • EventBridge triggers a per-workspace cycle in core.
      • Core poll tasks produce canonical updates while telematics websocket ingress forwards provider-tagged raw envelopes.
      • Core provider interpretation and command flow validate against shared SQL-backed workspace state and persist canonical snapshot/event history.
      • Read API requests compute freshness/health/issues at request time from persisted core data and device configuration.
      • Manage pages and Accounts workflows consume one core contract for telematics values and health.

    Sequence Diagram

    sequenceDiagram
      participant SA as System Admin Telematics Dashboard
      participant WS as Workspace Settings Telematics Tab
    
      box Telematics Service (left of ownership boundary)
        participant TP as Websocket Provider
        participant TW as Websocket Runtime
        participant TH as Provider Envelope Client
      end
    
      box bf-manage-core poll runtime
        participant EB as EventBridge Scheduler
        participant CP as Polling Tasks
        participant CN as Canonical Poll Update Builder
      end
    
      participant RB as Ownership Boundary (red line)
    
      box bf-manage-core (right of ownership boundary)
        participant PI as Provider Interpreter
        participant CMD as Shared Command Handler
        participant UOW as Unit of Work
        participant RA as Repository + Aggregate
        participant ES as Event Store
        participant SNAP as Canonical Snapshot Store
        participant API as Read API
        participant QS as Query Service
        participant CALC as Health/Issues Calculator (request-time)
        participant REG as Configured Device Registry
      end
    
      participant UI as Manage UI + Accounts/Support
    
      SA->>API: GET/PUT workspace cutover flags
      API->>CMD: Persist workspace cutover settings
      CMD->>UOW: Save cutover to workspace aggregate
    
      WS->>API: GET/PUT workspace telematics policy
      API->>CMD: Persist workspace policy settings
      CMD->>UOW: Save policy + update schedule
    
      EB->>CP: Trigger workspace poll cycle
      CP->>CN: Run provider poll and build canonical updates
      CN->>CMD: Submit canonical poll update batch
    
      TP->>TW: Push websocket message
      TW->>TH: Forward provider-tagged raw message
      TH->>RB: Handoff provider envelope
      RB->>PI: Submit provider envelope
      PI->>CMD: Build canonical updates
      CMD->>UOW: Begin transaction
      UOW->>RA: Load + mutate aggregate
      RA-->>UOW: Domain events
      UOW->>ES: Append event history
      UOW->>SNAP: Upsert canonical snapshot
      UOW-->>CMD: Commit
      Note over CMD: Reject stale/duplicate updates before persistence.
    
      Note over CMD,SNAP: Event bus/outbox projections are deferred in this MVP phase.
    
      UI->>API: Request health / issues / snapshots
      API->>QS: Execute query use-case
      QS->>CALC: Evaluate freshness and issue states
      CALC->>ES: Read ingest/event history
      CALC->>SNAP: Read latest snapshot values
      CALC->>REG: Read configured devices
      CALC-->>QS: Health/issues/snapshot DTOs
      QS-->>API: Response DTO
      API-->>UI: API response
      Note over API,UI: Health and issues are calculated at request time.
      Note over CMD: workspace poll_owner flag governs legacy/shadow/core polling behavior for run_workspace_cycle.
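
The "Reject stale/duplicate updates before persistence" step in the diagram could be sketched as a timestamp comparison against the current snapshot entry. Everything here is illustrative: `CanonicalUpdate`, `AcceptResult`, and `accept_update` are hypothetical names, comparing on an `observed_at` timestamp is an assumed rule, and the dict stands in for the SQL-backed snapshot store (in-memory state is test/dev-only per the constraints).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Dict, Mapping, Tuple


class AcceptResult(str, Enum):
    ACCEPTED = "accepted"
    REJECTED_STALE = "rejected_stale"
    REJECTED_DUPLICATE = "rejected_duplicate"


@dataclass(frozen=True)
class CanonicalUpdate:
    workspace_id: str
    device_id: str
    observed_at: datetime
    values: Mapping[str, float]


SnapshotKey = Tuple[str, str]  # (workspace_id, device_id)


def accept_update(
    snapshot: Dict[SnapshotKey, CanonicalUpdate], update: CanonicalUpdate
) -> AcceptResult:
    """Apply an update only if it is newer than the stored snapshot entry."""
    key = (update.workspace_id, update.device_id)
    current = snapshot.get(key)
    if current is not None:
        if update.observed_at < current.observed_at:
            return AcceptResult.REJECTED_STALE
        if update.observed_at == current.observed_at:
            return AcceptResult.REJECTED_DUPLICATE
    snapshot[key] = update  # upsert; the decision is made before any write
    return AcceptResult.ACCEPTED


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    store: Dict[SnapshotKey, CanonicalUpdate] = {}
    first = CanonicalUpdate("ws-1", "dev-1", t0, {"soc": 50.0})
    print(accept_update(store, first))
    print(accept_update(store, first))
```

Because scheduled poll results and websocket-originated envelopes both pass through the same function, the rejection rule behaves identically for either source, which is the intent of FR-011.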

    7. Non-Technical Implementation Approach

    • Approach overview:
      • Define and align canonical freshness/health semantics first.
      • Move scheduled polling into core and formalize one provider-envelope contract for websocket transport plus one shared core acceptance path.
      • Run migration in controlled phases with parallel validation of consistency, freshness, and query-time health outcomes.
      • Cut over UI and account workflows to canonical report/snapshot outputs.
    • Delivery sequencing:
      • Phase 1: Define health taxonomy, policy defaults, provider-envelope contract, and shared core acceptance rules.
      • Phase 2: Implement core-owned scheduled polling and telematics-owned websocket provider-envelope forwarding.
      • Phase 3: Expose request-time health/issues/snapshot queries for Accounts and Support.
      • Phase 4: Switch all telematics consumers to canonical core contract and retire direct legacy read path.
      • Phase 5 (post-MVP): Introduce event bus/projection pipeline to replace request-time calculation where beneficial.
    • Cutover control model (per workspace):
      • device_registry_source: legacy_sync | core
      • poll_owner: legacy | both_shadow | core
      • read_owner: legacy | core
      • provider_mode: legacy_provider | depot_sim | mixed
    • UI rollout controls:
      • Workspace Settings Telematics tab is gated by feature flag workspace-telematics-mvp-settings.
      • In local development (import.meta.env.DEV), that tab is enabled regardless of flag state for implementation/testing.
      • Migration cutover flags are managed in System Admin Telematics dashboard; Workspace Settings is reserved for public-facing policy controls.
    • Rollback model:
      • Immediate rollback is done by setting read_owner=legacy and poll_owner=legacy.
      • The legacy service remains running during migration and soak; core data is retained for diagnostics.
    • Dependencies and prerequisites:
      • Product/Sales sign-off on stale and no-data policy thresholds.
      • Accounts workflow agreement for outreach handling and ownership.
      • Operational dashboards for cycle reliability, freshness indicators, and query-latency monitoring.
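
The per-workspace cutover control model and the rollback rule above can be modelled as a small immutable settings object. The flag names and values come from the list above; the class names, the all-legacy defaults, and the reading of `both_shadow` as "core also polls" are assumptions made for this sketch.

```python
from dataclasses import dataclass, replace
from enum import Enum


class DeviceRegistrySource(str, Enum):
    LEGACY_SYNC = "legacy_sync"
    CORE = "core"


class PollOwner(str, Enum):
    LEGACY = "legacy"
    BOTH_SHADOW = "both_shadow"
    CORE = "core"


class ReadOwner(str, Enum):
    LEGACY = "legacy"
    CORE = "core"


class ProviderMode(str, Enum):
    LEGACY_PROVIDER = "legacy_provider"
    DEPOT_SIM = "depot_sim"
    MIXED = "mixed"


@dataclass(frozen=True)
class WorkspaceCutover:
    # Assumption: a workspace starts fully on the legacy path.
    device_registry_source: DeviceRegistrySource = DeviceRegistrySource.LEGACY_SYNC
    poll_owner: PollOwner = PollOwner.LEGACY
    read_owner: ReadOwner = ReadOwner.LEGACY
    provider_mode: ProviderMode = ProviderMode.LEGACY_PROVIDER


def rollback(cutover: WorkspaceCutover) -> WorkspaceCutover:
    """Immediate rollback: read_owner=legacy and poll_owner=legacy."""
    return replace(cutover, read_owner=ReadOwner.LEGACY, poll_owner=PollOwner.LEGACY)


def core_should_poll(cutover: WorkspaceCutover) -> bool:
    """Assumed reading: core polls in both shadow and core-owned modes."""
    return cutover.poll_owner in (PollOwner.BOTH_SHADOW, PollOwner.CORE)


if __name__ == "__main__":
    live = WorkspaceCutover(poll_owner=PollOwner.CORE, read_owner=ReadOwner.CORE)
    print(rollback(live))
```

Rollback leaves `device_registry_source` and `provider_mode` untouched, mirroring the rule that only the two ownership flags are reset.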

    8. Open Questions

    • OQ-003: Should stale/no-data alerts be passive report-only in phase 1, or include active notifications?
    • OQ-004: Which exact manage pages are in first-wave scope for canonical snapshot cutover?
    • OQ-005: What query-latency threshold should trigger moving health/issues from request-time calculation to projected read models?
    • OQ-006: What criteria mark readiness to introduce event bus + projection infrastructure post-MVP?

    9. Appendices

    • Source feedback incorporated:
      • Vehicles update and time out in sync today.
      • Reliable updates every 5 minutes are considered high-value.
      • Values are inconsistent between pages.
      • Health reporting and "last update received" visibility are needed.
      • Accounts workflow needs a direct list of devices not updated recently.
      • Configured devices with no incoming data must be discoverable quickly.

    9.1 Current-State Risk Summary (Aligned to Slides)

    • UI currently consumes telematics through two pipelines (direct telematics service and via core), which increases cross-page inconsistency risk.
    • Shared legacy polling responsibilities create correlated failure patterns (update/timeout waves) and larger blast radius per cycle fault.
    • Split ownership across services fragments observability and makes canary/rollback controls harder without explicit cutover toggles.

    9.2 Related Customer-Facing Issues (Linear, IPT)

    • IPT-71: no-data alert does not identify impacted vehicles.
    • IPT-66: polling/refresh rate makes data feel not live.
    • IPT-58: stale/unknown SoC handling is inconsistent.
    • IPT-50: unknown SoC investigation for customer fleet.
    • IPT-46: SoC mismatch across Chargers/Power/live views.

    9.3 Design Review and Spec Walkthrough Checklist

    • Scope to review:
      • Bounded context responsibilities in the core telematics module.
      • Event flow from scheduler trigger to canonical snapshot and health evaluation.
      • Cutover toggles, rollout phases, and rollback path.
    • Questions to resolve:
      • Confirm where legacy provider x_client.py and x_polling_task.py patterns are reused vs replaced.
      • How the request-time health/issues path transitions to event bus/projections post-MVP.
      • What deviations require immediate spec updates before implementation continues.
    • Expected outputs:
      • Approved architecture direction and MVP scope boundaries.
      • Explicit list of spec deltas with owners.
      • Next-step worklist for migration stream execution.

    10. MVP Coverage and DDD Alignment

    MVP Satisfies (Transitional Design)

    • Core Poll Runtime + Telematics Stream Transport:
      • Satisfies FR-001, FR-002, FR-003, FR-010 at MVP level by running per-workspace schedule entries in core while keeping websocket connection/runtime management in bf-telematics.
    • Core Provider Interpretation + Canonical Snapshot:
      • Satisfies FR-004, FR-008, FR-010, FR-011, FR-015 at MVP level by accepting scheduled poll updates and provider envelopes, interpreting provider messages in core, and persisting shared workspace command state plus one workspace/device snapshot contract in core.
    • Request-Time Health Report + Issues:
      • Satisfies FR-005, FR-006, FR-007, FR-009, FR-012 at MVP level by calculating freshness/health/issues on query from persisted core data and device configuration.
    • Cutover Controls and Rollback:
      • Satisfies FR-010, FR-011, NFR-001, NFR-003 through workspace-level ownership toggles and reversible rollout phases.

    How the MVP Delivers It

    • EventBridge triggers per-workspace cycles in core.
    • Core poll tasks produce canonical updates; telematics websocket ingress forwards provider-tagged raw messages.
    • Core provider interpretation and command flow validate against shared SQL-backed workspace state, reject stale/duplicate updates, and write event history + canonical snapshot through UoW/repository boundaries.
    • Production command validation reads shared persisted workspace state; .telematics-mvp/state.json and in-memory adapters are reserved for tests/dev-only support.
    • Core query flow calculates health/issues at request time from event store, snapshot state, and configured device metadata.
    • Event bus/outbox projection fan-out is deferred to a post-MVP phase.

    Brief DDD Alignment Note

    • Aligned:
      • A clear bounded-context boundary exists between websocket transport (bf-telematics) and provider interpretation plus domain command/query handling and scheduled polling (bf-manage-core).
      • Command/query concerns are split (TelematicsMvpCommandService, TelematicsMvpQueryService).
      • Command writes flow through an explicit workspace unit-of-work + repository boundary.
      • Query-time policy evaluation is centralized in core query use cases rather than spread across UI pages.
    • Partial (next increments):
      • No projection read-model/event-bus pipeline in this phase; health/issues are computed on demand.
      • Query-time calculation may require optimization thresholds before broader workspace rollout.
      • The command/query layer is class-based in this module; docs/DDD/agent.md prefers function-style use cases, so this can be flattened later without behavior change.

    Provider Polling Pattern Reuse Decision

    • Reused from legacy service:
      • Keep provider integration seams in the familiar x_client.py + x_polling_task.py shape, with scheduled polling ownership and websocket provider interpretation both living in core-compatible adapters.
    • Transitional boundary decision:
      • Core becomes the scheduled polling/provider-interpretation owner; telematics only forwards websocket-originated provider envelopes.
      • Workspace poll_owner still gates legacy, shadow, and core rollout modes during migration, but the target steady state is poll_owner=core.
    • Evolution note:
      • When event bus/projections are introduced, provider seams remain unchanged; only downstream core processing shifts.
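
To make the x_client.py + x_polling_task.py seam concrete, the sketch below separates a provider client (fetches raw records) from a polling task (turns them into canonical updates). It is a guess at the shape only: `ProviderClient`, `fetch_latest`, `PollingTask`, `CanonicalUpdate`, and the raw record keys (`device_id`, `ts`, `soc`) are all hypothetical and do not describe the legacy code.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Protocol


@dataclass(frozen=True)
class CanonicalUpdate:
    device_id: str
    observed_at: datetime
    values: Dict[str, float]


class ProviderClient(Protocol):
    """Role of x_client.py: talk to one provider and return raw records."""

    def fetch_latest(self, workspace_id: str) -> List[dict]:
        ...


class PollingTask:
    """Role of x_polling_task.py: run one poll and build canonical updates."""

    def __init__(self, client: ProviderClient) -> None:
        self._client = client

    def run(self, workspace_id: str) -> List[CanonicalUpdate]:
        updates = []
        for raw in self._client.fetch_latest(workspace_id):
            updates.append(
                CanonicalUpdate(
                    device_id=str(raw["device_id"]),
                    # Assumed raw format: epoch seconds in "ts".
                    observed_at=datetime.fromtimestamp(raw["ts"], tz=timezone.utc),
                    values={"soc": float(raw["soc"])},
                )
            )
        return updates


class FakeClient:
    """Test double standing in for a real provider client."""

    def fetch_latest(self, workspace_id: str) -> List[dict]:
        return [{"device_id": "dev-1", "ts": 1704110400, "soc": 72}]


if __name__ == "__main__":
    print(PollingTask(FakeClient()).run("ws-1"))
```

With this split, the evolution note holds naturally: downstream processing can change (for example to an event bus) without touching either the client or the task, because both only produce canonical updates.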