[Question]: run_pretrain.py fails with multiple GPUs but runs normally on a single GPU #11205

@Buddingpopp

Description

Please ask your question

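Training completes normally on a single GPU. With the same configuration, running on a single machine with two GPUs via python -u -m paddle.distributed.launch --gpus "0,1" run_pretrain.py ./config/qwen/pretrain_argument_0p5b.json
produces the following errors: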

OSError: [Errno 101] Network is unreachable

urllib3.exceptions.NewConnectionError: HTTPSConnection(host='bj.bcebos.com', port=443): Failed to establish a new connection: [Errno 101] Network is unreachable

urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='bj.bcebos.com', port=443): Max retries exceeded with url: /paddlenlp/models/community/Qwen/Qwen2.5-0.5B/added_tokens.json (Caused by NewConnectionError("HTTPSConnection(host='bj.bcebos.com', port=443): Failed to establish a new connection: [Errno 101] Network is unreachable"))

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='bj.bcebos.com', port=443): Max retries exceeded with url: /paddlenlp/models/community/Qwen/Qwen2.5-0.5B/added_tokens.json (Caused by NewConnectionError("HTTPSConnection(host='bj.bcebos.com', port=443): Failed to establish a new connection: [Errno 101] Network is unreachable"))

OSError: Can't load the model for 'Qwen/Qwen2.5-0.5B'. If you were trying to load it from 'BOS', make sure you don't have a local directory with the same name. Otherwise, make sure 'Qwen/Qwen2.5-0.5B' is the correct path to a directory containing one of the ['added_tokens.json']

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='bj.bcebos.com', port=443): Max retries exceeded with url: /paddlenlp/models/community/Qwen/Qwen2.5-0.5B/added_tokens.json (Caused by NewConnectionError("HTTPSConnection(host='bj.bcebos.com', port=443): Failed to establish a new connection: [Errno 101] Network is unreachable"))

OSError: Can't load the model for 'Qwen/Qwen2.5-0.5B'. If you were trying to load it from 'BOS', make sure you don't have a local directory with the same name. Otherwise, make sure 'Qwen/Qwen2.5-0.5B' is the correct path to a directory containing one of the ['added_tokens.json']
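
Every error above traces back to a failed connection to bj.bcebos.com on port 443 while fetching added_tokens.json. As a rough diagnostic sketch (not part of the original report), a plain TCP check can show whether that host is reachable from the environment in which the training processes run. The helper name can_reach is hypothetical, and the check uses only the Python standard library; it does not call any PaddleNLP API.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, timeouts, and
        # "[Errno 101] Network is unreachable" as seen in the traceback.
        return False
```

Calling can_reach("bj.bcebos.com", 443) from the same shell and environment used for the paddle.distributed.launch command would indicate whether the download in the traceback can succeed at all. If it returns False, the failure is a connectivity problem rather than something specific to the model files.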

Labels: question, stale
