Excessive confusion and stupidity #50

@runvnc

Backend impacted

The PyTorch implementation

Operating system

Linux

Hardware

GPU with CUDA

Description

Thank you so much for this work. I think you have improved over the original open source Moshi demo and this is a big step towards something useful without specific fine-tuning.

However, using the demo client/server, the model seems very easily confused, bad at following instructions, and generally stupid. I'm not sure whether that is because I am testing an outgoing rather than an incoming call scenario.

It confuses the important names with those of other people who might be involved, such as the person who answered, and it cannot adjust to the idea that it made an outgoing call. When I included an example name in a call script in the instructions, it mixed that name up with the real name that I had clearly indicated.

I am hoping someone will build something similar but a bit larger, maybe 24B or so.

Extra information

See above.

Environment

Fill in the following information on your system.

  • Operating system version: Ubuntu 24

Labels

bug