
Request for a complete environment config for reproduction #2

@KinnariyaMamaTanha

Description

Hello, I recently tried to reproduce your wonderful work but ran into a few problems. Following the instructions in your README, I set up the environment with the following commands:

conda create -n infercept python=3.10
conda activate infercept
# clone your repository into infercept
cd infercept/
pip install -e .

However, as described in this issue, the auto-installed torch==2.6.0 and triton==3.2.0 do not work, so I changed requirements.txt to:

ninja  # For faster builds.
psutil
ray >= 2.5.1
pandas  # Required for Ray data.
pyarrow  # Required for Ray data.
sentencepiece  # Required for LLaMA tokenizer.
numpy
torch == 2.0.1
transformers >= 4.33.1  # Required for Code Llama.
xformers >= 0.0.22
fastapi
uvicorn[standard]
pydantic < 2  # Required for OpenAI server.
gurobipy
rich
deepspeed == 0.12.3
deepspeed-kernels

which pins torch to 2.0.1 (just to give it a try).

I rebuilt the environment and then tried to use your AsyncLLMEngine class, but hit another error during engine initialization:

ERROR 03-24 08:41:49 async_llm_engine.py:296] Failed to initialize async LLM engine: /root/InferCept/vllm/attention_ops.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN3c1021throwNullDataPtrErrorEv
Traceback (most recent call last):
  File "bench_infercept.py", line 107, in <module>
    llm_servers = setup_infercept(infercept_config)
  File "/root/evaluation/infercept/setup_infercept.py", line 19, in setup_infercept
    servers = [
  File "/root/evaluation/infercept/setup_infercept.py", line 21, in <listcomp>
    AsyncLLMEngine.from_engine_args(infercept_config.engine_args)
  File "/root/InferCept/vllm/engine/async_llm_engine.py", line 564, in from_engine_args
    engine = cls(engine_args.worker_use_ray,
  File "/root/InferCept/vllm/engine/async_llm_engine.py", line 297, in __init__
    raise e
  File "/root/InferCept/vllm/engine/async_llm_engine.py", line 294, in __init__
    self.engine = self._init_engine(*args, **kwargs)
  File "/root/InferCept/vllm/engine/async_llm_engine.py", line 334, in _init_engine
    return ray.get(ray.remote(num_cpus=0)(self._engine_class(*args, **kwargs)).remote())
  File "/root/InferCept/vllm/engine/llm_engine.py", line 112, in __init__
    self._init_workers_ray(placement_group)
  File "/root/InferCept/vllm/engine/llm_engine.py", line 173, in _init_workers_ray
    from vllm.worker.worker import Worker  # pylint: disable=import-outside-toplevel
  File "/root/InferCept/vllm/worker/worker.py", line 10, in <module>
    from vllm.model_executor import get_model, InputMetadata, set_random_seed
  File "/root/InferCept/vllm/model_executor/__init__.py", line 2, in <module>
    from vllm.model_executor.model_loader import get_model
  File "/root/InferCept/vllm/model_executor/model_loader.py", line 10, in <module>
    from vllm.model_executor.models import *  # pylint: disable=wildcard-import
  File "/root/InferCept/vllm/model_executor/models/__init__.py", line 1, in <module>
    from vllm.model_executor.models.aquila import AquilaForCausalLM
  File "/root/InferCept/vllm/model_executor/models/aquila.py", line 35, in <module>
    from vllm.model_executor.layers.attention import PagedAttentionWithRoPE
  File "/root/InferCept/vllm/model_executor/layers/attention.py", line 10, in <module>
    from vllm import attention_ops
ImportError: /root/InferCept/vllm/attention_ops.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN3c1021throwNullDataPtrErrorEv
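For what it's worth, the missing symbol demangles to a function in `c10`, PyTorch's core C++ library, which is consistent with the extension having been compiled against a different libtorch than the one now installed. A minimal sketch of how I checked this (Linux-only; it calls libstdc++'s `__cxa_demangle` through ctypes, and `demangle` is my own helper, not part of InferCept or vllm):

```python
import ctypes
import ctypes.util

def demangle(mangled: str):
    """Demangle an Itanium C++ ABI symbol via libstdc++'s __cxa_demangle.

    Returns the human-readable name, or None if the input is not a
    valid mangled name.
    """
    lib = ctypes.CDLL(ctypes.util.find_library("stdc++") or "libstdc++.so.6")
    func = getattr(lib, "__cxa_demangle")
    func.restype = ctypes.c_void_p  # keep the raw pointer so we can free() it
    status = ctypes.c_int()
    ptr = func(mangled.encode(), None, None, ctypes.byref(status))
    if status.value != 0 or not ptr:
        return None
    try:
        return ctypes.cast(ptr, ctypes.c_char_p).value.decode()
    finally:
        # __cxa_demangle malloc()s the result; release it via libc's free().
        ctypes.CDLL(None).free(ctypes.c_void_p(ptr))

print(demangle("_ZN3c1021throwNullDataPtrErrorEv"))
# -> c10::throwNullDataPtrError()
```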

It seems the compiled vllm extension doesn't exactly match the torch version in the environment (the `cpython-38` suffix on the `.so` also looks like a stale artifact built under Python 3.8, even though the env uses Python 3.10). However, I cannot find the exact torch version anywhere in your repo, so I'd like to ask you for a complete requirements.txt with pinned versions. Thanks.
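In the meantime, I am double-checking that the intended pins actually landed in the environment before importing the engine. A minimal sketch (`find_mismatched_pins` is my own hypothetical helper, not part of InferCept):

```python
import importlib.metadata

def find_mismatched_pins(pins):
    """Return {package: (installed_or_None, expected)} for every exact
    pin whose installed version differs from the expected one."""
    mismatches = {}
    for package, expected in pins.items():
        try:
            installed = importlib.metadata.version(package)
        except importlib.metadata.PackageNotFoundError:
            installed = None  # package missing entirely
        if installed != expected:
            mismatches[package] = (installed, expected)
    return mismatches

# torch == 2.0.1 is the pin from my edited requirements.txt above;
# an empty dict means everything matches.
print(find_mismatched_pins({"torch": "2.0.1"}))
```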
