Datagen inputs are too big for model context length #316

@changminbark

Description

What happened: The ShareGPT dataset seems to contain 2 inputs that are longer than the maximum input size for Qwen2.5-0.5B. This results in 2 request failures when running the config below.

What you expected to happen: I am not sure what the intended behavior is. Should the run just fail, or should the datagen input be truncated to fit within the model's context length?
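
If truncation is the chosen behavior, a minimal sketch of it could look like the code below. This is just an illustration, not inference-perf's actual code; the 32768 limit is Qwen2.5-0.5B's max_position_embeddings, and in practice the budget would also need to leave room for the requested output tokens.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
MAX_LEN = 32768  # Qwen2.5-0.5B context window (max_position_embeddings)

def truncate_prompt(prompt: str) -> str:
    # Tokenize with truncation so the prompt fits within the context
    # window, then decode back to text before issuing the request.
    ids = tokenizer(prompt, truncation=True, max_length=MAX_LEN)["input_ids"]
    return tokenizer.decode(ids, skip_special_tokens=True)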

How to reproduce it (as minimally and precisely as possible): I used Qwen/Qwen2.5-0.5B-Instruct along with the ShareGPT dataset. The following is the config.yml:

load:
  type: constant
  stages:
  - rate: 1
    duration: 30
api: 
  type: completion
  streaming: true
server:
  type: vllm
  model_name: Qwen/Qwen2.5-0.5B-Instruct
  base_url: http://0.0.0.0:8000
  ignore_eos: true
tokenizer:
  pretrained_model_name_or_path: Qwen/Qwen2.5-0.5B-Instruct
data:
  type: shareGPT
metrics:
  type: prometheus
  prometheus:
    url: http://localhost:9090
    scrape_interval: 15
report:
  request_lifecycle:
    summary: true
    per_stage: true
    per_request: false
    per_adapter: true
    per_adapter_stage: true
  prometheus:
    summary: true
    per_stage: false

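To confirm which samples trigger this, a quick check like the one below counts ShareGPT turns whose token length exceeds the context window. This assumes the standard ShareGPT JSON schema (a "conversations" list whose entries carry a "value" field) and the commonly used dump filename; adjust both for your local copy.

import json
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
MAX_LEN = 32768  # Qwen2.5-0.5B context window (max_position_embeddings)

# Filename assumed; point this at your local ShareGPT JSON.
with open("ShareGPT_V3_unfiltered_cleaned_split.json") as f:
    samples = json.load(f)

too_long = 0
for sample in samples:
    for turn in sample.get("conversations", []):
        if len(tokenizer(turn["value"])["input_ids"]) > MAX_LEN:
            too_long += 1

print(f"{too_long} turns exceed {MAX_LEN} tokens")
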
Anything else we need to know?:

Environment:

  • inference-perf version: Latest (v0.3.0)
  • config.yml (entire one printed by the benchmark run): See above
  • cloud provider or hardware configuration: local
  • others:

Labels: kind/bug, lifecycle/stale
