

    ArsTechnica


      New study accuses LM Arena of gaming its popular AI benchmark

      news.movim.eu / ArsTechnica • 1 May 2025 • 1 minute

    The rapid proliferation of AI chatbots has made it difficult to know which models are actually improving and which are falling behind. Traditional academic benchmarks only tell you so much, which has led many to lean on vibes-based analysis from LM Arena. However, a new study claims this popular AI ranking platform is rife with unfair practices, favoring large companies that just so happen to rank near the top of the index. The site's operators, however, say the study draws the wrong conclusions.

    LM Arena was created in 2023 as a research project at UC Berkeley. The pitch is simple—users feed a prompt into two unidentified AI models in the "Chatbot Arena" and evaluate the outputs to vote on the one they like more. This data is aggregated in the LM Arena leaderboard that shows which models people like the most, which can help track improvements in AI models.

    Companies are paying more attention to this ranking as the AI market heats up. Google noted when it released Gemini 2.5 Pro that the model debuted at the top of the LM Arena leaderboard, where it remains to this day. Meanwhile, DeepSeek's strong performance in the Chatbot Arena earlier this year helped to catapult it to the upper echelons of the LLM race.


    Tags: ai, artificial intelligence, google, lm arena, meta, openai
