Posts: 96 · Comments: 454 · Joined: 1 yr. ago

  • Doubt it; I don't think Bezos or Musk gives a single shit what you think.

  • I may very well be; still researching.

  • Funny, tell that to the billionaires who have private jets.

  • "AVIF is an image file format that uses AV1 compression algorithms." Yes, that's what I mean.

  • What's your knowledge of LLMs, if any at all?

  • Yes. As far as scalability goes, cheaper, more efficient models can be used in applications that require thousands of calls a day.

  • This is peak-bubble news. AI is rapidly becoming more energy efficient. These events will be looked back on like pets.com's valuation reaching hundreds of millions before it died.

  • The student loans are never being paid back, just like the federal debt.

  • AI Model Efficiency Index 2.1 — Methodology Summary

    Goal: Rank AI models by real-world value (performance per dollar) using harder, less-contaminated benchmarks.

    Benchmarks Used (8 metrics):

    • 20% SWE-bench – real-world coding tasks (repo-level bug fixes)
    • 15% MMLU-Pro – harder general knowledge (resists saturation)
    • 15% Humanity's Last Exam – extremely difficult academic reasoning
    • 15% GPQA Diamond – PhD-level science questions
    • 10% ARC-AGI – abstract reasoning and problem-solving
    • 15% Chatbot Arena Elo – human preference (crowdsourced rankings)
    • 10% RULER – long-context robustness (32k–128k tokens)
    • 10% EQBench – emotional intelligence and creative quality

    Why This Mix?

    • Reduces gaming and contamination (avoids relying on easy, memorized benchmarks like vanilla MMLU).
    • Captures multiple capability dimensions: coding, reasoning, long-context, human preference, and creativity.
    • Harder benchmarks are less saturated, making score differences meaningful.

    Calculation:

    1. Normalize all 8 benchmark scores to 0–100 scale.
    2. Compute weighted composite score for each model.
    3. Divide composite score by blended API cost (3:1 input:output token ratio).
    4. Rank by efficiency index (higher = better value).

    Coverage:

    • Includes only models with complete or near-complete data across all 8 metrics.
    • Excludes enterprise/niche models (Cohere, AI21, Baichuan) due to incomplete benchmark coverage or opaque pricing.
    • All models are 2025 releases with public pricing and APIs.
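    The calculation steps above can be sketched in Python. The benchmark keys, example prices, and the weight-normalization step are my assumptions for illustration, not the author's actual pipeline (note that the listed weights sum to 1.10 as written, so the sketch normalizes by the total weight):

    ```python
    # Sketch of the efficiency-index calculation described above.
    # Benchmark keys and any example figures are illustrative only.

    WEIGHTS = {
        "swe_bench": 0.20,      # real-world coding tasks
        "mmlu_pro": 0.15,       # harder general knowledge
        "hle": 0.15,            # Humanity's Last Exam
        "gpqa_diamond": 0.15,   # PhD-level science questions
        "arc_agi": 0.10,        # abstract reasoning
        "arena_elo": 0.15,      # human preference (normalized to 0-100)
        "ruler": 0.10,          # long-context robustness
        "eqbench": 0.10,        # emotional/creative quality
    }

    def blended_cost(input_price: float, output_price: float, ratio: float = 3.0) -> float:
        """Blend per-token prices at the stated 3:1 input:output token ratio."""
        return (ratio * input_price + output_price) / (ratio + 1)

    def efficiency_index(scores: dict, input_price: float, output_price: float) -> float:
        """Weighted composite of 0-100 scores divided by blended API cost."""
        total_w = sum(WEIGHTS.values())  # normalize in case weights don't sum to 1
        composite = sum(w * scores[k] for k, w in WEIGHTS.items()) / total_w
        return composite / blended_cost(input_price, output_price)
    ```

    Ranking is then just sorting models by `efficiency_index` in descending order.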
  • Okay, here's the new ranking with the more spread-out weighting, and honestly it looks a lot more reasonable.

    1. DeepSeek V3.2-Exp (Sep 2025) — 69.26
    2. Kimi K2 Thinking (Nov 2025) — 66.19
    3. Gemini 2.5 Flash (May 2025) — 58.73
    4. Qwen 3 Max (Jul 2025) — 55.56
    5. GPT-5 (Aug 2025) — 21.25
    6. o3 (Apr 2025) — 20.39
    7. Gemini 2.5 Pro (Mar 2025) — 19.98
    8. Gemini 3 Pro (Nov 2025) — 19.82
    9. Claude 3.5 Sonnet (Aug 2025) — 10.17
    10. GPT-5 Pro (Aug 2025) — 1.96
  • Thank you, I'll factor that into the index. If you have any other recommendations for making the index more robust, let me know. The goal is to make this dependent on real-world API costs; I don't care if the newest, smartest model is released if it costs $100 every time to use.

  • Thank you, but there's no need to be sorry. It's something that's been accepted and fuels my desire for deeper knowledge.

  • This image describes my soul.

  • We will rebuild with Rust when all is said and done.