Arcee Trinity Mini: US-Trained Moe Model (arcee.ai)
59 points by hurrycane 12 hours ago | 14 comments
  • halJordan 11 hours ago

    Looks like a less capable version of Qwen3 30B-A3B, which makes sense because it is slightly smaller. If they can keep that efficiency going into the large one, it'll be sick.

    Trinity Large [will be] a 420B parameter model with 13B active parameters. Just perfect for a large RAM pool @ q4.
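
    A quick sanity check on that claim, as a rough sketch: the 420B total / 13B active figures come from the comment above; the bits-per-weight and overhead numbers below are assumptions.

        # Rough memory estimate for serving a 420B-parameter MoE at ~4-bit quantization.
        # Only the parameter counts come from the thread; the rest are assumptions.
        total_params = 420e9         # total parameters
        bits_per_weight = 4.5        # q4-style quants average a bit over 4 bits per weight
        kv_and_overhead_gb = 30      # rough allowance for KV cache and runtime buffers

        weights_gb = total_params * bits_per_weight / 8 / 1e9
        print(f"weights ~{weights_gb:.0f} GB, total ~{weights_gb + kv_and_overhead_gb:.0f} GB")
        # -> roughly 236 GB of weights, so a 256-512 GB RAM pool is the realistic target,
        #    while only the ~13B active parameters are touched per token.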

  • davidsainez 8 hours ago

    Excited to put this through its paces. It seems most directly comparable to GPT-OSS-20B. Comparing their numbers on the Together API: Trinity Mini is slightly less expensive ($0.045/$0.15 vs. $0.05/$0.20 per 1M input/output tokens) and seems to have better latency and throughput numbers.
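
    One way to reproduce the latency/throughput side of that comparison is to time both models against Together's OpenAI-compatible endpoint. A minimal sketch using the openai Python client; the model slugs are guesses and should be checked against Together's model catalog.

        # Crude latency/throughput spot-check via Together's OpenAI-compatible API.
        import os, time
        from openai import OpenAI

        client = OpenAI(base_url="https://api.together.xyz/v1",
                        api_key=os.environ["TOGETHER_API_KEY"])

        # Hypothetical model slugs; look up the real names in Together's model list.
        for model in ["arcee-ai/trinity-mini", "openai/gpt-oss-20b"]:
            start = time.perf_counter()
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": "Explain MoE routing in two sentences."}],
                max_tokens=256,
            )
            elapsed = time.perf_counter() - start
            tokens = resp.usage.completion_tokens
            print(f"{model}: {elapsed:.2f}s total, {tokens / elapsed:.1f} tok/s")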

  • htrp 11 hours ago

    Trinity Nano Preview: 6B parameter MoE (1B active, ~800M non-embedding), 56 layers, 128 experts with 8 active per token

    Trinity Mini: 26B parameter MoE (3B active), fully post-trained reasoning model

    They did the pretraining themselves and are still training the large version on 2048 B300 GPUs.
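
    For anyone unfamiliar with what "128 experts with 8 active per token" means mechanically, here is a minimal top-k routing sketch, a generic PyTorch illustration with made-up sizes rather than Arcee's actual layer.

        # Generic top-k MoE layer: each token is routed to k of n_experts expert MLPs,
        # so only a small fraction of the parameters run per token.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class TopKMoE(nn.Module):
            def __init__(self, d_model=256, d_ff=512, n_experts=128, k=8):
                super().__init__()
                self.k = k
                self.gate = nn.Linear(d_model, n_experts, bias=False)   # router
                self.experts = nn.ModuleList(
                    nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
                    for _ in range(n_experts)
                )

            def forward(self, x):                        # x: (tokens, d_model)
                scores = self.gate(x)                    # router logits (tokens, n_experts)
                topk_scores, topk_idx = scores.topk(self.k, dim=-1)
                weights = F.softmax(topk_scores, dim=-1) # renormalize over the chosen 8
                out = torch.zeros_like(x)
                for slot in range(self.k):               # only k experts run per token
                    idx = topk_idx[:, slot]
                    for e in idx.unique().tolist():
                        mask = idx == e
                        out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
                return out

        y = TopKMoE()(torch.randn(4, 256))               # route 4 token vectors through the layer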

  • ksynwa 8 hours ago

    > Trinity Large is currently training on 2048 B300 GPUs and will arrive in January 2026.

    How long does the training take?

    • arthurcolle 6 hours ago | parent

      Couple of days or weeks, usually. No one is doing 9-month training runs.
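
      Rough arithmetic behind that estimate, as a sketch: only the 2048-GPU cluster and the 13B active parameters come from the thread; the token budget and sustained per-GPU throughput are assumptions.

        # Back-of-envelope training time using FLOPs ≈ 6 * active_params * tokens.
        active_params = 13e9        # Trinity Large active parameters (from the thread)
        train_tokens = 15e12        # assumed token budget
        n_gpus = 2048               # from the thread
        sustained_flops = 3e15      # assumed ~3 PFLOP/s sustained per B300 (low precision, ~40% MFU)

        total_flops = 6 * active_params * train_tokens
        days = total_flops / (n_gpus * sustained_flops) / 86400
        print(f"~{days:.1f} days of pure compute")
        # -> about 2 days under these assumptions; real runs stretch to weeks with
        #    restarts, data-pipeline stalls, and larger token budgets.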

  • bitwize 11 hours ago

    A moe model you say? How kawaii is it? uwu

    • ghc 10 hours ago | parent

      Capitalization makes a surprising amount of difference here...

    • donw 9 hours ago | parent

      Meccha at present, but it may reach sugoi levels with fine-tuning.

    • noxa 11 hours ago | parent

      I hate that I laughed at this. Thanks ;)

  • trvz 6 hours ago

    Moe ≠ MoE

    • cachius 6 hours ago | parent

      ?

      • azinman2 6 hours ago | parent

        The HN title uses incorrect capitalization.

        • rbanffy 5 hours ago | parent

          I was eagerly waiting for the Larry and Curly models.

      • m4rtink 3 hours ago | parent

        ^_-