• 0 Posts
  • 6 Comments
Joined 6 months ago
Cake day: July 13th, 2024

  • BB84@mander.xyz to TechTakes@awful.systems · OpenAI is so cooked and I'm all here for it
    English · 2 upvotes, 5 downvotes · edited 2 days ago

    Can someone explain why I am being downvoted and attacked in this thread? I swear I am not sealioning. Genuinely confused.

    @sc_griffith@awful.systems asked how request frequency might impact cost per request. Batch inference is one reason it does: serving many requests in a single forward pass amortizes the fixed cost of each pass, so cost per request falls as batches fill up (ask anyone in the self-hosted LLM community). I noted that this effect only matters at very small scale, where there isn’t enough traffic to fill batches, which is probably far below the scale OpenAI operates at.
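
    For anyone unfamiliar with batching, here is a minimal sketch using the Hugging Face transformers library (gpt2 stands in for a real model and the prompts are made up; the point is that one generate() call serves several requests at once):

    ```python
    # Batch inference sketch: several requests answered by one forward pass.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "gpt2"  # stand-in; a real server would load a much larger model
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    tokenizer.pad_token = tokenizer.eos_token  # gpt2 ships without a pad token
    tokenizer.padding_side = "left"            # pad on the left for generation
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Three "user requests" batched together.
    prompts = ["Hello, my name is", "The capital of France is", "To bake bread you"]
    inputs = tokenizer(prompts, return_tensors="pt", padding=True)

    # The model weights are read from memory once for the whole batch, not once
    # per request, which is why cost per request drops as the batch fills.
    outputs = model.generate(**inputs, max_new_tokens=20,
                             pad_token_id=tokenizer.eos_token_id)
    for text in tokenizer.batch_decode(outputs, skip_special_tokens=True):
        print(text)
    ```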

    @dgerard@awful.systems, why did you say I am demanding that someone disprove the assertion? Are you misunderstanding “I would be very very surprised if they couldn’t fill [the optimal batch size] for any few-seconds window” to mean “I would be very very surprised if they are not profitable”?

    The tweet I linked shows that good LLMs can be much cheaper. I am saying that OpenAI is very inefficient and thus economically “cooked”, as the post title has it. How does this make me FYGM? @froztbyte@awful.systems

  • Stop depending on these proprietary LLMs. Go to !localllama@sh.itjust.works.

    There are open-source LLMs you can run on your own computer if you have a powerful GPU. Models like OLMo and Falcon are made by true non-profits and universities, and they reach GPT-3.5 level of capability.
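
    If you want to try one, here is a minimal sketch with the transformers library (the model id is my assumption of a current OLMo checkpoint, so check the Hugging Face hub, and note that 7B-class models want a GPU with plenty of VRAM):

    ```python
    # Local inference sketch: load an open model and generate on your own GPU.
    import torch
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="allenai/OLMo-7B-hf",  # assumed id; substitute any open model
        torch_dtype=torch.float16,   # half precision to fit consumer VRAM
        device_map="auto",           # use the GPU if one is available
    )
    result = generator("Open-source models are useful because", max_new_tokens=50)
    print(result[0]["generated_text"])
    ```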

    There are also open-weight models that you can run locally and fine-tune to your liking (although these don’t have open-source training data or code). The best of these (Alibaba’s Qwen, Meta’s Llama, Mistral, DeepSeek, etc.) match and sometimes exceed GPT-4o capabilities.
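
    And “fine-tune to your liking” is cheaper than it sounds: with LoRA you only train small adapter matrices, not the full weights. A sketch using the peft library (the model id and hyperparameters are illustrative, and the actual training loop is omitted):

    ```python
    # LoRA fine-tuning sketch: adapt an open-weight model on one consumer GPU.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

    config = LoraConfig(
        r=8,                                  # rank of the adapter matrices
        lora_alpha=16,                        # adapter scaling factor
        target_modules=["q_proj", "v_proj"],  # attention projections to adapt
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, config)
    model.print_trainable_parameters()  # typically well under 1% of all weights
    # ...train with the usual transformers Trainer loop and your own dataset...
    ```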