BetterFleet Dev Wiki

    Code Indexing¶

    Purpose¶

    The code-index POC provides a local Qdrant-backed search index for this bf-dev workspace. It is intended for developer-only exploration and agent assistance, not for production traffic or CI hosting. By default, the Qdrant REST and gRPC ports bind to 127.0.0.1 only.

    What It Includes¶

    • A dedicated Qdrant service started with ./code-index-backend.
    • A Python CLI at ./code-index with rebuild, update, status, validate, and query.
    • Persisted local state under bfd-caches/code-index/, with SQLite metadata in bfd-caches/code-index/state.sqlite3 by default.
    • An optional .pre-commit-config.yaml hook that runs ./code-index update.
    • Separate Qdrant collections per repo root, with workspace-wide queries fanning out across them.
    • Tree-sitter-aware chunking when a parser is available, with a newline-aware fallback when it is not.
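As an illustration of the newline-aware fallback mentioned in the last item, here is a minimal sketch of window chunking that only breaks on line boundaries. It is not the indexer's actual code: the function name `chunk_by_lines`, the character-based size accounting, and the default values are assumptions made for this example.

```python
def chunk_by_lines(text: str, chunk_size: int = 1200, chunk_overlap: int = 200) -> list[str]:
    """Split text into windows of whole lines (hypothetical helper).

    Each window holds at most chunk_size characters unless a single line is
    longer than that. Trailing lines worth up to chunk_overlap characters are
    repeated at the start of the next window.
    """
    chunks: list[str] = []
    window: list[str] = []
    size = 0
    for line in text.splitlines(keepends=True):
        if window and size + len(line) > chunk_size:
            chunks.append("".join(window))
            # Carry whole trailing lines forward as overlap.
            carried: list[str] = []
            carried_size = 0
            for prev in reversed(window):
                if carried_size + len(prev) > chunk_overlap:
                    break
                carried.insert(0, prev)
                carried_size += len(prev)
            window, size = carried, carried_size
        window.append(line)
        size += len(line)
    if window:
        chunks.append("".join(window))
    return chunks
```

Breaking only at newlines keeps each chunk readable as source text even when no parser is available, at the cost of chunk sizes that vary with line length.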

    Setup¶

    1. Install the repo-managed tool dependencies:
    uv sync --dev
    
    2. Start the isolated Qdrant service:
    ./code-index-backend up
    

    The REST API listens on 127.0.0.1:${CODE_INDEX_QDRANT_PORT:-6333} and the built-in dashboard is available at http://127.0.0.1:6333/dashboard by default.

    3. Optionally copy the example config and adjust it:
    cp .code-index.json.example .code-index.json
    
    4. Build the first index:
    ./code-index rebuild
    

    To rebuild only selected repo roots, repeat --project. Use root for bf-dev files at the workspace root:

    ./code-index rebuild --project bf-manage-core --project bf-manage-web
    ./code-index rebuild --project root
    ./code-index rebuild --project bf-manage-roaming --ignore '**/build/**' --ignore '**/dist/**'
    

    The code-index CLI resolves the state database path from state_dir in .code-index.json and stores the database at <state_dir>/state.sqlite3. With the default config, that is bfd-caches/code-index/state.sqlite3.
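The path resolution described above can be sketched roughly as follows. This is an illustrative reading of the documented behavior, not the CLI's source: the function name `resolve_state_path` is hypothetical, and resolving a relative state_dir against the workspace root is an assumption.

```python
import json
from pathlib import Path

# Default state directory as documented on this page.
DEFAULT_STATE_DIR = "bfd-caches/code-index"


def resolve_state_path(workspace: Path) -> Path:
    """Return <state_dir>/state.sqlite3 for a workspace (hypothetical helper)."""
    config_file = workspace / ".code-index.json"
    state_dir = DEFAULT_STATE_DIR
    if config_file.is_file():
        config = json.loads(config_file.read_text(encoding="utf-8"))
        # Assumption: a missing state_dir key falls back to the default.
        state_dir = config.get("state_dir", DEFAULT_STATE_DIR)
    # Assumption: a relative state_dir is resolved against the workspace root.
    return workspace / state_dir / "state.sqlite3"
```

In practice, ./code-index status remains the authoritative way to see the resolved path.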

    Default Workflow¶

    Use update as the normal local workflow:

    ./code-index update
    

    Add --profile to rebuild, update, status, validate, or query when you want timing data in the JSON response:

    ./code-index update --profile
    ./code-index query --profile "incident model"
    

    update compares tracked files against the last saved manifest and refreshes only changed or deleted files, which keeps incremental indexing the default path for day-to-day work. It runs a cheap stat scan first and hashes file contents only when the saved metadata suggests a file may have changed, so a file whose mtime changed but whose contents did not is not reindexed. You can scope the same workflow to selected repo roots with --project, and add one-off path globs with --ignore.
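The stat-first, hash-second check can be sketched like this. The record layout, the choice of SHA-256, and the names `FileRecord`, `snapshot`, and `needs_reindex` are assumptions for illustration; the page only confirms that size and mtime_ns are stored per file.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical saved metadata for one tracked file."""
    size: int
    mtime_ns: int
    sha256: str  # assumption: a content digest is stored alongside the stat data


def snapshot(path: Path) -> FileRecord:
    st = path.stat()
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return FileRecord(st.st_size, st.st_mtime_ns, digest)


def needs_reindex(path: Path, saved: Optional[FileRecord]) -> bool:
    if saved is None:
        return True  # never indexed before
    st = path.stat()
    if st.st_size == saved.size and st.st_mtime_ns == saved.mtime_ns:
        return False  # cheap path: stat data unchanged, skip hashing
    # Stat data differs: hash the contents to rule out mtime-only churn.
    return hashlib.sha256(path.read_bytes()).hexdigest() != saved.sha256
```

The trade-off of this ordering is that an edit which preserves both size and mtime would go unnoticed, which is usually acceptable for a local developer index.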

    Use status to inspect the local state:

    ./code-index status
    ./code-index status --project bf-manage-core
    

    status prints the resolved SQLite path in config.state_path and includes the persisted metadata snapshot in state. For deeper inspection, use the local sqlite3 shell against that path.

    Use validate when you want an integrity check across the saved SQLite state, the current discovered file set, and the scoped Qdrant collections without mutating anything:

    ./code-index validate
    ./code-index validate --project bf-manage-core
    

    validate is a read-only command. It does not repair drift, rewrite SQLite metadata, re-embed files, or recreate collections. Use it when status shows something suspicious, after local migrations or manual recovery work, or when you want a machine-readable integrity report before deciding whether a later update or rebuild is needed.

    When validate completes successfully, it exits 0 if no integrity issues were found and 2 if the validation pass found drift. Operational failures still exit non-zero as normal command errors.
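A wrapper script could branch on these exit codes along the following lines. The helper names are hypothetical, and treating every code other than 0 and 2 as an operational error is an assumption based on the description above.

```python
import subprocess
from typing import Sequence


def classify_validate_exit(returncode: int) -> str:
    """Map a validate exit code to a coarse outcome (hypothetical helper)."""
    if returncode == 0:
        return "clean"  # validation ran and found no integrity issues
    if returncode == 2:
        return "drift"  # validation ran and found drift
    return "error"      # assumption: anything else is an operational failure


def run_validate(extra_args: Sequence[str] = ()) -> str:
    # check=False so that exit code 2 is handled here instead of raising.
    completed = subprocess.run(["./code-index", "validate", *extra_args], check=False)
    return classify_validate_exit(completed.returncode)
```

For example, run_validate(["--project", "bf-manage-core"]) returning "drift" would be the signal to consider a later update or rebuild.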

    At a high level, the JSON response includes a validation summary with issue categories such as:

    • state chunks missing from Qdrant
    • orphan chunks present in Qdrant but not in saved state
    • payload mismatches for chunks that exist in both places
    • collection count drift between saved state and Qdrant
    • file snapshot drift between saved state and the current working tree
    • discovery drift between saved state paths and the current Git-tracked discovery set
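The first two categories amount to set differences between chunk IDs in the saved state and chunk IDs in Qdrant. A minimal sketch, with hypothetical key names that are not taken from the actual JSON response:

```python
def compare_chunk_ids(state_ids: set[str], qdrant_ids: set[str]) -> dict[str, list[str]]:
    """Classify chunk-ID drift between saved state and Qdrant (illustrative only)."""
    return {
        # In saved state but absent from the collection.
        "missing_from_qdrant": sorted(state_ids - qdrant_ids),
        # In the collection but unknown to saved state.
        "orphans_in_qdrant": sorted(qdrant_ids - state_ids),
    }
```

Payload mismatches and snapshot drift need per-item comparison rather than set arithmetic, so they are left out of this sketch.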

    Use status for a fast local snapshot, validate for a deeper read-only integrity pass, and update when you want to bring the index forward to match current tracked files.

    Use query to search the indexed collections:

    ./code-index query "where is profile.ini loaded"
    ./code-index query --project bf-manage-core "incident model"
    

    SQLite State Inspection¶

    The code-index state store is a local SQLite database, not a committed project artifact. By default it lives at:

    bfd-caches/code-index/state.sqlite3
    

    You can confirm the resolved path at any time:

    ./code-index status
    

    Inspect the schema and metadata with sqlite3:

    sqlite3 bfd-caches/code-index/state.sqlite3 ".tables"
    sqlite3 bfd-caches/code-index/state.sqlite3 "SELECT key, value FROM metadata ORDER BY key;"
    sqlite3 bfd-caches/code-index/state.sqlite3 "SELECT path, collection_name, size, mtime_ns FROM files ORDER BY path LIMIT 20;"
    sqlite3 bfd-caches/code-index/state.sqlite3 "SELECT path, chunk_ordinal, chunk_id FROM file_chunks ORDER BY path, chunk_ordinal LIMIT 20;"
    

    Expected tables are metadata, files, and file_chunks.
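If you prefer scripting the same inspection, Python's standard sqlite3 module can open the database read-only and confirm the expected tables. This is a sketch: the function name `table_row_counts` is hypothetical, and opening with mode=ro is a choice made here to avoid accidental writes, not something the CLI requires.

```python
import sqlite3
from pathlib import Path

# Table names as documented above.
EXPECTED_TABLES = {"metadata", "files", "file_chunks"}


def table_row_counts(db_path: Path) -> dict[str, int]:
    """Return row counts for the expected tables, opening the file read-only."""
    uri = f"{db_path.resolve().as_uri()}?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
        present = {row[0] for row in rows}
        missing = EXPECTED_TABLES - present
        if missing:
            raise RuntimeError(f"missing tables: {sorted(missing)}")
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in sorted(EXPECTED_TABLES)
        }
    finally:
        conn.close()
```

Calling table_row_counts(Path("bfd-caches/code-index/state.sqlite3")) gives a quick sanity check before digging into individual rows.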

    SQLite runs in WAL mode for this state store. While the indexer or another SQLite client has the database open, sidecar files such as state.sqlite3-wal and state.sqlite3-shm may be present beside the main database file. That is expected local SQLite behavior; do not delete those files while the database is in use.

    Manual Recovery¶

    Use rebuild when the collection is missing, the schema/config changed, or validate confirms drift that you want to replace with a clean rebuild:

    ./code-index rebuild
    

    rebuild is the manual full refresh escape hatch. Do not use it as the default hook or commit-time workflow. With no --project, rebuild recreates the whole workspace index. With one or more --project flags, it only recreates those project collections and keeps the rest of the state intact.

    Optional Pre-commit Integration¶

    This repo includes a local-only .pre-commit-config.yaml entry for ./code-index update. It does not write directly into .git/hooks/.

    If you want to enable it:

    pre-commit install
    

    Config Contract¶

    .code-index.json is optional. If present, it can override:

    • state_dir
    • chunk_size
    • chunk_overlap
    • max_file_bytes
    • update_batch_size
    • query_limit
    • query_multiplier
    • include_extensions
    • exclude_dirs
    • exclude_path_globs
    • filetype_map
    • chunk_filters
    • qdrant.host
    • qdrant.port
    • qdrant.grpc_port
    • qdrant.https
    • qdrant.api_key_env
    • qdrant.collection_name acts as the collection name prefix. Each repo is indexed into <prefix>__root or <prefix>__<repo-name>.
    • qdrant.on_disk
    • qdrant.hnsw.distance
    • qdrant.hnsw.ef_construct
    • qdrant.hnsw.search_ef
    • qdrant.hnsw.m
    • qdrant.hnsw.full_scan_threshold
    • qdrant.hnsw.max_indexing_threads
    • qdrant.hnsw.on_disk
    • qdrant.hnsw.payload_m
    • embedding_provider.kind
    • embedding_provider.model
    • embedding_provider.base_url
    • embedding_provider.api_key_env
    • embedding_provider.options
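The dotted names in this list suggest nested objects in .code-index.json. The sketch below expands dotted keys into that nested shape and derives collection names from the prefix rule stated for qdrant.collection_name. Both helper names, the nesting interpretation, and the sample values are assumptions; check .code-index.json.example for the real layout.

```python
import json


def expand_dotted(flat: dict) -> dict:
    """Turn {"qdrant.port": 6333} into {"qdrant": {"port": 6333}} (assumed layout)."""
    nested: dict = {}
    for dotted, value in flat.items():
        *parents, leaf = dotted.split(".")
        node = nested
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return nested


def collection_name(prefix: str, repo: str) -> str:
    """Apply the documented <prefix>__root / <prefix>__<repo-name> rule."""
    return f"{prefix}__{repo}"


if __name__ == "__main__":
    # Sample values are hypothetical, not recommended settings.
    overrides = expand_dotted({"max_file_bytes": 512000, "qdrant.port": 6333})
    print(json.dumps(overrides, indent=2))
    print(collection_name("code_index", "root"))
```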

    query_multiplier controls query over-fetch before the service deduplicates chunk hits down to the best result per file.

    filetype_map lets you force a parser/language for unusual file names or extensions using a simple name-or-suffix-to-language mapping. chunk_filters applies regex-based chunk exclusion by language before chunks are upserted, using either a language-keyed object or an explicit rule list.

    The default provider uses local Qdrant FastEmbed integration, and the CLI also supports openai, ollama, and sentence-transformer embedding functions when the relevant local dependencies and credentials are available.

    Tree-sitter chunking is opportunistic for an allowlisted set of code-heavy languages such as Python, TypeScript, TSX, JavaScript, Go, Java, Rust, C, and C++; if parser resolution fails, the indexer falls back to newline-aware window chunking.

    The default max_file_bytes is 256000, so larger files are skipped unless you raise that limit in .code-index.json. CLI --ignore flags append extra globs on top of exclude_path_globs for a single run.
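To make the query_multiplier behavior concrete, here is a small sketch of over-fetching and then keeping the best hit per file. The `Hit` type, both function names, the limit-times-multiplier fetch size, and the "higher score is better" ordering are all assumptions for illustration.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical search hit for one chunk."""
    path: str
    chunk_ordinal: int
    score: float


def fetch_size(query_limit: int, query_multiplier: int) -> int:
    # Assumption: the over-fetch size is the product of the two settings.
    return query_limit * query_multiplier


def best_hit_per_file(hits: list[Hit], query_limit: int) -> list[Hit]:
    """Keep the highest-scoring chunk per file, then return the top query_limit."""
    best: dict[str, Hit] = {}
    for hit in hits:
        current = best.get(hit.path)
        if current is None or hit.score > current.score:
            best[hit.path] = hit
    ranked = sorted(best.values(), key=lambda h: h.score, reverse=True)
    return ranked[:query_limit]
```

Over-fetching matters because several top chunks often come from the same file; without it, deduplication could leave fewer distinct files than query_limit asks for.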

    Environment overrides are also supported:

    • CODE_INDEX_QDRANT_HOST
    • CODE_INDEX_QDRANT_PORT
    • CODE_INDEX_QDRANT_GRPC_PORT
    • CODE_INDEX_COLLECTION

    Migrating Existing Local State¶

    If you already have legacy local state from an older checkout that used bfd-caches/code-index/state.json, decide which path you want before running the new SQLite-backed workflow:

    • If you do not need to preserve the old incremental state, delete the old JSON file and run ./code-index rebuild.
    • If you do want to preserve it, first stop any running code-index work, ensure uv sync --dev has completed, ensure the Qdrant backend is available if you plan to validate counts, and import the JSON into bfd-caches/code-index/state.sqlite3 with a one-time local migration script.

    Migration prerequisites:

    • No concurrent ./code-index rebuild or ./code-index update process is running.
    • You know the target SQLite path (./code-index status reports it as config.state_path).
    • You keep the legacy state.json untouched until you have validated the imported row counts and spot-checked a few file-to-chunk mappings with sqlite3.

    The migration helper is intentionally temporary and local-only. Do not commit it. After a successful import and validation, delete the temporary migration script from your working tree.

    Rollback¶

    To stop using the POC entirely:

    1. Stop and remove the isolated service:
    ./code-index-backend down -v
    
    2. Remove the local code-index state, including the SQLite database and any WAL sidecars:
    rm -rf bfd-caches/code-index
    rm -f .code-index.json
    
    3. If you enabled pre-commit, uninstall it or remove the local hook from .pre-commit-config.yaml.

    If you only need to recover from bad local SQLite state, prefer ./code-index rebuild first. That recreates the indexed state without requiring a full feature rollback.

    If you created a one-time migration helper for legacy state.json import, delete that script as part of rollback or immediately after a successful migration. It is a temporary local tool and should not remain in the repo.

    Constraints¶

    • This is a local-only proof of concept.
    • Default indexing excludes common secret-bearing paths and file names, but it is still your responsibility not to index sensitive material carelessly.
    • The default provider uses local Qdrant FastEmbed integration unless you opt into another provider in .code-index.json.
    • Tests are intentionally written to avoid requiring a live Qdrant service.
    • The CLI runs through uv run --dev, so uv sync --dev must complete successfully first.
    • Query quality depends on the configured embedding provider, tree-sitter parser availability, and current chunk/query settings.
    • The standing Qdrant service is treated as persistent local state; destructive rebuild behavior remains an explicit CLI workflow.
    • Discovery currently follows Git-tracked files only; untracked files are ignored for now.
    • SQLite writes use WAL mode and immediate write transactions for local durability; avoid manually deleting state.sqlite3-wal or state.sqlite3-shm while code-index commands are running.