Posts: 10 · Comments: 14 · Joined 2 mo. ago
  • Technically it supports fewer languages than Whisper, 40 vs. 99.

    The main problem isn't "bother", it's training data. You need hundreds of thousands of hours of high-quality transcripts to train models like these, and that just doesn't exist for, like, Zulu or whatever.

  • LocalLLaMA @sh.itjust.works
    morrowind @lemm.ee
  • I want to clarify something: "reranker" is a general term that can refer to any model used for reranking. It is independent of implementation.

    What you refer to:

    > because reranker models look at the two pieces of content simultaneously and can be fine-tuned to the domain in question. They shouldn't be used for the initial retrieval because the evaluation time is O(n²), as each combination of inputs has to be scored

    is a specific implementation, known as a cross-encoder, that is common for reranking models but not retrieval ones, for the reasons you described. But you can also use any other architecture.
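
    This distinction shows up directly in code. Below is a minimal sketch of cross-encoder reranking using the CrossEncoder class from the sentence-transformers library; the checkpoint name and toy documents are examples, not a recommendation.

    ```python
    from sentence_transformers import CrossEncoder

    # Example cross-encoder checkpoint; any cross-encoder model (or one
    # fine-tuned to your domain) is used the same way.
    model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

    query = "how do rerankers differ from retrievers?"
    candidates = [
        "Bi-encoders embed query and document separately, so retrieval is fast.",
        "Cross-encoders score the query and document jointly, one pair per forward pass.",
        "Reranking reorders a small candidate list by relevance.",
    ]

    # The model sees each (query, document) pair together. Scoring every
    # query-document combination of a corpus this way is what makes
    # cross-encoders too slow for initial retrieval, but fine for
    # reranking a short list.
    scores = model.predict([(query, doc) for doc in candidates])
    for score, doc in sorted(zip(scores, candidates), reverse=True):
        print(f"{score:.3f}  {doc}")
    ```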

  • LocalLLaMA @sh.itjust.works
    morrowind @lemm.ee

Sentence Transformers v4

    LocalLLaMA @sh.itjust.works
    morrowind @lemm.ee

    NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms

  • Claude frequently draws SVGs to illustrate things for me (I'm guessing it's in the prompt), but even though it's better at it than all the other models, it still kinda sucks. It's just a fundamentally dumb task for a purely language model, similar to the ARC-AGI benchmark; it just makes more sense for a vision model, and trying to get an LLM to do it is a waste.

  • LocalLLaMA @sh.itjust.works
    morrowind @lemm.ee

StarVector - a foundation model for generating SVGs

    LocalLLaMA @sh.itjust.works
    morrowind @lemm.ee

EXAONE Deep - Setting a New Standard for Reasoning AI - LG AI Research News

    LocalLLaMA @sh.itjust.works
    morrowind @lemm.ee

Reka Flash, an open-source 21B model comparable to QwQ 32B

    LocalLLaMA @sh.itjust.works
    morrowind @lemm.ee
    arxiv.org Chain of Draft: Thinking Faster by Writing Less

    Large Language Models (LLMs) have demonstrated remarkable performance in solving complex reasoning tasks through mechanisms like Chain-of-Thought (CoT) prompting, which emphasizes verbose, step-by-step reasoning. However, humans typically employ a more efficient strategy: drafting concise intermedia...

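
    As a rough illustration, here is a sketch of what a Chain-of-Draft-style prompt could look like against an OpenAI-compatible API; the system prompt paraphrases the paper's idea rather than quoting it, and the model name is just an example.

    ```python
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # Chain of Draft keeps the step-by-step structure of CoT but caps each
    # intermediate step to a terse draft, cutting output tokens and latency.
    COD_SYSTEM = (
        "Think step by step, but keep only a minimal draft for each step, "
        "at most five words per step. Give the final answer after '####'."
    )

    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative; any chat model works
        messages=[
            {"role": "system", "content": COD_SYSTEM},
            {"role": "user", "content": "Jason had 20 lollipops. He gave Denny some. Now he has 12. How many did he give Denny?"},
        ],
    )
    print(resp.choices[0].message.content)
    ```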
    LocalLLaMA @sh.itjust.works
    morrowind @lemm.ee

    Atom of Thoughts (AOT): lifts gpt-4o-mini to 80.6% F1 on HotpotQA, surpassing o3-mini and DeepSeek-R1

    bsky.app Sung Kim (@sungkim.bsky.social)

    Atom of Thoughts (AOT) lifts gpt-4o-mini to 80.6% F1 on HotpotQA, surpassing o3-mini and DeepSeek-R1! For each reasoning step, it:

    1. Decomposes the question into a DAG
    2. Contracts the subquestions into a NEW, simpler question
    3. Iterates until reaching an atomic question

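
    For illustration, here is a minimal sketch of that three-step loop; `llm` is a hypothetical prompt-to-completion callable the caller supplies, and the prompt wording is invented, not taken from the paper.

    ```python
    from typing import Callable

    def aot(question: str, llm: Callable[[str], str], max_iters: int = 5) -> str:
        """Atom-of-Thoughts loop: decompose, contract, iterate until atomic."""
        for _ in range(max_iters):
            # Stop once the question no longer decomposes (model-judged).
            verdict = llm(f"Is this question answerable in one step? yes/no\n{question}")
            if verdict.strip().lower().startswith("yes"):
                break
            # 1. Decompose the question into a DAG of dependent subquestions.
            dag = llm(f"Decompose into subquestions, noting dependencies:\n{question}")
            # 2. Contract the subquestions into a NEW, simpler question.
            question = llm(
                f"Given these subquestions and dependencies:\n{dag}\n"
                "Fold what is already resolvable into one simpler question."
            )
        # 3. After iterating to an (approximately) atomic question, answer it.
        return llm(f"Answer directly:\n{question}")
    ```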