
Welcome to LocalLLaMA! Here we discuss running and developing machine learning models at home. Let's explore cutting-edge open-source neural network technology together.
Get support from the community! Ask questions, share prompts, discuss benchmarks, and get hyped about the latest and greatest model releases! Enjoy talking about our awesome hobby.
As ambassadors of the self-hosting machine learning community, we strive to support each other and share our enthusiasm in a positive constructive way.
The Trump administration is considering barring Americans from accessing AI models created by the Chinese AI lab, DeepSeek.
The Trump administration is considering new restrictions on the Chinese AI lab DeepSeek that would limit it from buying Nvidia’s AI chips and potentially bar Americans from accessing its AI services, The New York Times reported on Wednesday.
This is as lightweight as it gets.
Which model are you using?
Just thinking about making this a monthly post: which model are you using? What are the positives and negatives?
Can it play Doom? - New VLM Benchmark
VideoGameBench is a benchmark for video game VLM agents.
Microsoft just released BitNet!
Official inference framework for 1-bit LLMs: microsoft/BitNet on GitHub.
Let's go! Lossless CPU inference
What do you use AI for?
Ok, so the best thing that I have been using my local AI for lately is acting as Socrates and helping me delve into my thinking.
::: spoiler Long text here
When considering the Socratic method, what questions did Socrates ask people, to start the process of enquiry
Okay, let's break down the types of questions Socrates used to initiate the Socratic method. It's important to understand that it wasn't a rigid formula, but a flexible approach designed to expose contradictions and ultimately lead to deeper understanding. Here's a breakdown, categorized by their function and with examples. I'll also include some key principles at the end.
1. Initial Clarification & Agreement (Establishing a Starting Point - Often Seemingly Simple)
:::
llama4 release discussion thread
The general consensus seems to be that Llama 4 was a flop. The head of Meta's AI research division was let go.
Do you think it was a bad fp32 conversion, or just underwhelming models all around?
2T parameters was a big increase without much gain. If throwing compute and parameters at the problem isn't enough to stay competitive anymore, how do you think the next big performance gains will be made? Better CoT reasoning patterns? Omnimodality? Something entirely new?
The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation
We’re introducing Llama 4 Scout and Llama 4 Maverick, the first open-weight natively multimodal models with unprecedented context support and our first built using a mixture-of-experts (MoE) architecture.
Tips for getting Ollama to be useful with home assistant?
So far I have tried to get it to read me my shopping list, turn on my desk light, and tell me how many hours until sundown. It wasn't able to do any of those things. I'm using llama3.1:8b. I have a 3080 12GB. I also tried mixtral:8x7b, but it just says "unexpected intent error".
This report introduces Dolphin, a large-scale multilingual automatic speech recognition (ASR) model that extends the Whisper architecture to support a wider range of languages. Our approach integrates in-house proprietary and open-source datasets to refine and optimize Dolphin's performance. The mod...
Timelapse of our current LocalLLaMA community thumbnail llama creation
I'm having some fun experimenting with GIF-making tonight; hope you don't mind the animation. This shows the iterative creation process of our current thumbnail.
I kind of knew what I wanted in my mind's eye: a front-facing llama to juxtapose the old thumbnail's sideways view. I went searching for AI-generated llama images, since that's fitting for the forum and they're copyright-free public domain (as far as I understand).
The original image was generated by Stable Diffusion. I like it a lot as-is, but to be a good thumbnail for the community it needed to be easily recognizable and renderable on small screens. First I picked the color. The purple is brighter and slightly reddish in tinge, which helps with pop. Then I expanded the neck to fill to the bottom.
All those detail lines created nasty artifacting when compressed into the small phone icons of the community, so they needed to go. I left the chin-hair lines and eye lines so as not to make it too simple. The nose ridge outline was thickened for some recognizability.
Latest release of kobold.cpp adds TTS voice cloning support via OuteTTS and updates multimodal vision mmproj projectors for Qwen2.5 VL
Every release from kobold has me hyped; it's one of the nicest engines, balancing cutting-edge features with ease of use and optimization. This is gonna be a great year for LocalLLaMA. Hype :)
Some updates on community changes and future goals (03-28-2025)
Hi everyone! I recently became moderator of this community and have been making some changes. I figured it would be good practice to be transparent with you and document what's been going on.
I've been experimenting with some different thumbnails for our community. I didn't really want to keep associating with r/localllama in any way; we don't need to copy them.
Old thumbnail:
New thumbnail:
Anthropic develops new tool to examine hidden processes in LLM generation
What the firm found challenges some basic assumptions about how this technology really works.
I liked reading this article. It's cool to really poke into the hidden complexity behind patterns of 'thought' in LLMs. They aren't merely simple 'autocomplete'.
The findings that Claude does math in a different way than it says it does, and that it can anticipate words ahead of generation time, are fascinating.
M4 Max 128GB vs M1 Ultra 128GB
I am unsure whether to buy the M4 Max Mac Studio with 128GB RAM and 500GB/s bandwidth or the M1 Ultra Mac Studio with 128GB RAM and 800GB/s bandwidth.
The Max costs 4400€; the Ultra costs 3500€ second-hand.
What would you buy? I am not going to use it only for LLMs, but also for coding, machine learning, and gaming.
Best model for programming?
I upgraded my PC: I just upped the RAM to 64GB and got a Radeon RX 9070 XT 16GB GPU.
I want to do some personal programming projects and was wondering what the best model is that would fit my HW.
I mostly use Python, but dabble a little in Go, and am interested in looking at Rust.
I am running this on my desktop; the processor is an AMD 3700X (8c/16t).