
statnett/Talk2PowerSystem_LLM

Large Language Model (LLM) part of Talk2PowerSystem

Overview

Talk2PowerSystem_LLM is the Large Language Model (LLM) component of the Talk2PowerSystem project. It provides the code and scripting needed to integrate and operate an LLM, covering data preprocessing, model training, inference, and integration with the other parts of the Talk2PowerSystem ecosystem.

Features

  • Data Preprocessing: Scripts to clean, normalize, and format data for LLM training.

  • Model Training: Pipelines and utilities for fine-tuning and training LLMs.

  • Inference Engine: Code for running real-time queries and generating model predictions.

  • System Integration: Tools and interfaces to connect the LLM with other components of the Talk2PowerSystem project.

  • Testing and Evaluation: Automated tests and performance evaluation scripts to ensure model reliability and accuracy.

Project Structure

The repository is organized as follows:

  • config/ - Configuration files for model parameters and environment settings.

  • docker/ - Dockerfile for the FastAPI chatbot application.

  • docs/ - Documentation, guides, and technical notes.

  • evaluation_results/ - Evaluation results of the system.

  • helm-chart/ - Helm chart resources for easier deployment on Kubernetes environments.

  • src/ - Main source code including training, inference, and integration scripts.

  • tests/ - Unit and integration tests for various modules.

Installation

Prerequisites

  • conda must be installed; Miniconda will suffice.

Setup

To set up the project locally, follow these steps:

  1. Clone the repository:

    git clone https://github.com/statnett/Talk2PowerSystem_LLM.git
  2. Create a conda environment and install dependencies

    conda create --name Talk2PowerSystemLLM --file conda-linux-64.lock
    conda activate Talk2PowerSystemLLM
    poetry install

Run tests

Unit tests

conda activate Talk2PowerSystemLLM
poetry install --with test
poetry run pytest --cov=talk2powersystemllm --cov-report=term-missing tests/unit_tests/

Acceptance tests

GraphDB License Management

The acceptance tests require a valid GraphDB license to run in CI. On GitHub, this is handled via the GRAPHDB_LICENSE GitHub Secret.

Current License Expiry: 2027-03-20

How to Update the License

Because the license is a binary file, it must be stored as a Base64-encoded string to prevent corruption in transit. When the license expires, follow these steps:

  1. Encode the new binary file

    Run this command on your local machine to generate the encoded string:

    # Linux/GNU (standard)
    base64 -w 0 /path/to/new/graphdb.license
    
    # macOS (BSD)
    base64 -i /path/to/new/graphdb.license
  2. Update GitHub Secrets

    • Navigate to the repository's Settings > Secrets and variables > Actions.

    • Locate the GRAPHDB_LICENSE secret and click the Edit (pencil) icon.

    • Paste the entire output from the command above into the value field and save.
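Before pasting the encoded string into the secret, it is worth confirming the round-trip is lossless. A minimal sketch, using a randomly generated stand-in file instead of the real license:

```shell
# Sketch: check that Base64 encoding/decoding preserves the binary exactly.
# A random 256-byte file stands in for the real graphdb.license here.
head -c 256 /dev/urandom > graphdb.license

# GNU coreutils: -w 0 disables line wrapping, producing a single-line string
base64 -w 0 graphdb.license > license.b64

# Decode again and compare byte-for-byte with the original
base64 -d license.b64 > roundtrip.license
cmp graphdb.license roundtrip.license && echo "round-trip OK"
```

If `cmp` reports a difference, the encoded string would corrupt the license and must not be stored in the secret.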

Local Execution

When running acceptance tests locally, the environment does not use the GitHub Secret. Instead, you must pass the absolute path to the license file on your machine via the LICENSE_PATH environment variable:

bash ./docker/generate-manifest.sh
docker buildx build --file docker/Dockerfile --tag talk2powersystem .
docker buildx build --file tests/acceptance_tests/docker-compose/DockerfileAcceptanceTests --tag talk2powersystem-acceptance-tests .
docker buildx build --file tests/acceptance_tests/docker-compose/DockerfileGraphDB --tag graphdb .
LICENSE_PATH=/path/to/graphdb.license docker compose -f tests/acceptance_tests/docker-compose/docker-compose.yaml run --rm talk2powersystem-acceptance-tests poetry run pytest tests/acceptance_tests/
docker compose -f tests/acceptance_tests/docker-compose/docker-compose.yaml down -v --remove-orphans
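For reference, LICENSE_PATH is presumably consumed by the compose file as a host-side bind mount into the GraphDB container. A hypothetical excerpt (the service name, mount target, and read-only flag are assumptions, not copied from the actual docker-compose.yaml):

```yaml
services:
  graphdb:
    image: graphdb
    volumes:
      # ${LICENSE_PATH} is resolved from the host environment when
      # `docker compose` is invoked, as in the commands above
      - ${LICENSE_PATH}:/opt/graphdb/home/conf/graphdb.license:ro
```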

Versioning

Chat bot releases follow the Semantic Versioning standard. However, Poetry follows the PEP 440 standard, so creating a pre-release requires updating the version manually. Official release versions take the form major.minor.patch, pre-release versions take the form major.minor.patch-rc<N>, and development versions take the form major.minor.patch-dev0.

Steps to create an official release (from the main branch):

  1. conda activate Talk2PowerSystemLLM

  2. poetry version <major|minor|patch>

  3. Update CHANGELOG.md

  4. git add pyproject.toml CHANGELOG.md

  5. git commit -m "Bumping version from <previous-version> to <current-version>"

  6. git push -u origin main

  7. Create a release from the GitHub interface. The tag and the release title must match the version from poetry!

  8. Edit the version in pyproject.toml - increase the patch and add -dev0 after it. For example, if the version is 2.0.0, the next version must be 2.0.1-dev0.

  9. git add pyproject.toml

  10. git commit -m "Bumping version from <previous-version> to <current-version>"

  11. git push -u origin main
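The post-release bump in step 8 (e.g. 2.0.0 becoming 2.0.1-dev0) can be sketched as a small shell helper; the function is hypothetical and not part of the repository:

```shell
# Hypothetical helper: compute the next development version after an
# official release, per step 8 above (increase patch, append -dev0).
next_dev_version() {
  v="$1"                      # e.g. 2.0.0
  major="${v%%.*}"            # text before the first dot
  rest="${v#*.}"              # text after the first dot, e.g. 0.0
  minor="${rest%%.*}"
  patch="${rest#*.}"
  echo "${major}.${minor}.$((patch + 1))-dev0"
}

next_dev_version 2.0.0   # prints 2.0.1-dev0
```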

Steps to create a pre-release (from the main branch):

  1. conda activate Talk2PowerSystemLLM

  2. Edit the version in pyproject.toml - If the current version is a pre-release version, then you must increase the release candidate number. For example, if the current version is 1.2.0-rc1, the next version must be 1.2.0-rc2. If the current version is a development version, then the next pre-release version must follow the semantic versioning convention on how to increment the major, minor and patch parts of the version and add -rc1 at the end. For example, if the current version is 1.1.0-dev0 and the next release will be a major one, the next version must be 2.0.0-rc1.

  3. Update CHANGELOG.md

  4. git add pyproject.toml CHANGELOG.md

  5. git commit -m "Bumping version from <previous-version> to <current-version>"

  6. git push -u origin main

  7. Create a release from the GitHub interface. The tag and the release title must match the version from poetry!
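The bump rules in step 2 can be sketched as a shell helper (hypothetical, for illustration only; in practice the version in pyproject.toml is edited by hand):

```shell
# Hypothetical helper implementing the pre-release bump rules above.
# From an rc version: increment the release candidate number.
# From a -dev0 version: bump major/minor/patch per SemVer and append -rc1.
next_prerelease() {
  version="$1"
  level="$2"   # major|minor|patch; only used when coming from a -dev0 version
  case "$version" in
    *-rc*)
      base="${version%-rc*}"
      n="${version##*-rc}"
      echo "${base}-rc$((n + 1))" ;;
    *-dev0)
      base="${version%-dev0}"
      major="${base%%.*}"; rest="${base#*.}"
      minor="${rest%%.*}"; patch="${rest#*.}"
      case "$level" in
        major) echo "$((major + 1)).0.0-rc1" ;;
        minor) echo "${major}.$((minor + 1)).0-rc1" ;;
        patch) echo "${major}.${minor}.$((patch + 1))-rc1" ;;
      esac ;;
  esac
}

next_prerelease 1.2.0-rc1          # prints 1.2.0-rc2
next_prerelease 1.1.0-dev0 major   # prints 2.0.0-rc1
```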

We release changes on demand. First we create pre-releases, deploy them, and test the changes. If the changes do not meet the acceptance criteria, new pre-releases with fixes are created. Once the changes meet the acceptance criteria, an official release is created and deployed.

License

Talk2PowerSystem_LLM is licensed under the Apache License 2.0. For more information, see the LICENSE file.
