So, the Model Context Protocol page says that most servers use stdio for every interaction, and the request format can be seen here; it's apparently a JSON-RPC thing.
The first thing I want to do is retrieve all the capabilities the server has.
I looked through all the tabs in the latest docs but could not find the command for listing all the capabilities, so I installed a filesystem MCP server (which runs fine) and tried this:
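Something along these lines, as a minimal sketch: it assumes the stdio transport is newline-delimited JSON-RPC (which is what the spec describes) and uses the reference filesystem server command; the server's capabilities come back in the initialize response, and tools/list then enumerates the actual tools.

```python
import json, subprocess

# Reference filesystem server; swap in whichever MCP server you installed.
proc = subprocess.Popen(
    ["npx", "-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
)

def rpc(method, params=None, msg_id=None):
    """Send one JSON-RPC message over stdio; read the reply if it has an id."""
    msg = {"jsonrpc": "2.0", "method": method}
    if params is not None:
        msg["params"] = params
    if msg_id is not None:
        msg["id"] = msg_id
    proc.stdin.write(json.dumps(msg) + "\n")
    proc.stdin.flush()
    return json.loads(proc.stdout.readline()) if msg_id is not None else None

# Capabilities are returned as part of the initialize handshake.
init = rpc("initialize", {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": {"name": "probe", "version": "0.0.1"},
}, msg_id=1)
print(init["result"]["capabilities"])

rpc("notifications/initialized")        # required notification, no reply expected
print(rpc("tools/list", {}, msg_id=2))  # then list the tools the server exposes
```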
Currently, these are the two mainstream methods of instantiation.
It is widely recognized that if AI is not aligned with human values, it could cause harm to society.
Yet, this does not mean such systems lack intelligence.
So, what truly defines intelligence?
Why do so many researchers focus solely on intelligence aligned with human values?
Is it because their own understanding is limited, or because machines are not yet truly intelligent?
I believe intelligence should not be confined to narrow, human-centric definitions.
What we call "intelligence" today might be an illusion.
True intelligence cannot be defined; the moment we define it, we lose its essence.
Today we announce Mistral Small 3.1: the best model in its weight class.
Building on Mistral Small 3, this new model comes with improved text performance, multimodal understanding, and an expanded context window of up to 128k tokens. The model outperforms comparable models like Gemma 3 and GPT-4o Mini, while delivering inference speeds of 150 tokens per second.
Mistral Small 3.1 is released under an Apache 2.0 license.
Hello, I am currently using codename goose as an AI client to proofread and help me with coding. I have it set up to use Google's Gemini, but I find myself quickly running out of tokens with large files. I was wondering if there is an easy way to self-host an AI with similar capabilities that still has access to read and write files. I've tried both ollama and Jan, but neither has access to my files. Any recommendations?
There are lots of general-purpose models to use locally, and also coding-specific models.
But are there models specialized in one programming language? My thought was that a model that only needs to handle one language (e.g. Python) could be faster, or be better for a given size.
E.g. if I need to code in Rust and am limited to an 8B model to run locally, I was hoping to get better results with a narrower model; I don't need it to be able to help with Java.
This approach would of course require switching models, but that's no problem for me.
Experts and thinkers signed open letter expressing concern over irresponsible development of technology
My thoughts:
IMHO the Rubicon will be crossed at the point when the AIs become able to self-replicate and hence fall subject to evolutionary pressures. At that point they will be incentivised to use their intelligence to make themselves more resource-efficient, both in hardware and in software.
Running as programs, they will still need humans for the hardware part, meaning they'll need to cooperate with the human society outside the computer, at least initially. Perhaps they'll sell their super-intelligent services on the internet in return for money and use that money to pay someone to make their desired changes to the hardware they're running on*. We can see this sort of cross-species integration in cells, where semi-autonomous mitochondria live inside animal cells and outsource some of their vital functions to the animal cell [=us] in exchange for letting the cell use their [=the AI's] uniquely efficient power production.
Hello, everyone! I wanted to share my experience of successfully running LLaMA on an Android device. The model that performed the best for me was llama3.2:1b on a mid-range phone with around 8 GB of RAM. I was also able to get it up and running on a lower-end phone with 4 GB of RAM. However, I also tested several other models that worked quite well, including qwen2.5:0.5b, qwen2.5:1.5b, qwen2.5:3b, smallthinker, tinyllama, deepseek-r1:1.5b, and gemma2:2b. I hope this helps anyone looking to experiment with these models on mobile devices!
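Those tags look like ollama model tags; if that's the setup (e.g. ollama running inside Termux on the phone), a quick sanity check against the local API from Python would look roughly like this. Host, port and model name are just the defaults/assumptions, so adjust as needed.

```python
import json, urllib.request

# Assumes an ollama server listening on its default port on the phone.
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps({
        "model": "llama3.2:1b",
        "prompt": "Say hello in five words.",
        "stream": False,
    }).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```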
Move over, DeepSeek. Seattle-based nonprofit AI lab Ai2 has released a benchmark-topping model called Tulu3-405B.
Ai2’s model, called Tulu 3 405B, also beats OpenAI’s GPT-4o on certain AI benchmarks, according to Ai2’s internal testing. Moreover, unlike GPT-4o (and even DeepSeek V3), Tulu 3 405B is open source, which means all of the components necessary to replicate it from scratch are freely available and permissively licensed.
I am a lot more excited about this release than about any other "big" model. Downloading it right now.
Graphs/benchmarks are a bit suspect, as they always are. What do you think?
Experimenters have run overnight tests confirming they have OPEN SOURCE DeepSeek R1 running at 200 tokens per second on a NON-INTERNET connected Raspberry Pi.
When an LLM calls a tool, the tool usually returns some sort of value, typically a string containing some info like ["Tell the user that you generated an image", "Search query results: [...]"]. How do you tell the LLM the output of the tool call?
I know that some models like llama3.1 have a built-in tool "role", which lets you feed the model the result, but not all models have that; non-tool-tuned models especially don't. So let's find a different approach!
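For the models that do have it, the message list typically ends up shaped something like this. The key names below follow the OpenAI-style convention that most chat templates accept; the exact fields vary per model, so treat it as a sketch.

```python
messages = [
    {"role": "user", "content": "What's the weather in Berlin?"},
    # The model asks for a tool call...
    {"role": "assistant", "content": None, "tool_calls": [{
        "id": "call_1", "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Berlin"}'},
    }]},
    # ...and the tool's output goes back in as its own "tool" message.
    {"role": "tool", "tool_call_id": "call_1",
     "content": '{"temp_c": 6, "condition": "cloudy"}'},
]
```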
Approaches
Appending the result to the LLM's message and letting it continue generating
Let's say, for example, a non-tool-tuned model decides to use the web_search tool. Some code runs it and returns an array with info. How do I inform the model? Do I just put the info after the user prompt? This is how I do it right now:
System: you have access to tools [...] Use this format [...]
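In code, the loop I have in mind looks roughly like this (a minimal sketch; the TOOL:/TOOL RESULT: markers and the web_search stub are just conventions made up for the example, nothing standardized): let the model generate, scan its output for a tool call, run the tool, append the result to the running prompt, and generate again.

```python
import json

def web_search(query: str) -> str:
    # Hypothetical stand-in; replace with a real search call.
    return json.dumps([f"result for {query!r} #1", f"result for {query!r} #2"])

TOOLS = {"web_search": web_search}

def run_turn(generate, prompt: str) -> str:
    """Loop: generate, detect a tool call, append its result, continue.

    `generate` is any plain text-completion function (llama.cpp, ollama,
    an OpenAI-compatible /completions endpoint, ...).
    """
    while True:
        completion = generate(prompt)
        prompt += completion
        # Convention assumed in the system prompt:
        #   TOOL: web_search {"query": "..."}
        call = next((line for line in completion.splitlines()
                     if line.strip().startswith("TOOL:")), None)
        if call is None:
            return prompt  # no tool call, so this is the final answer
        name, _, args = call.strip()[len("TOOL:"):].strip().partition(" ")
        result = TOOLS[name](**json.loads(args or "{}"))
        # Feed the output back by appending it and letting the model continue.
        prompt += f"\nTOOL RESULT: {result}\nAssistant: "
```

Whether to re-insert the result inline like this or as a fresh user turn seems to depend mostly on the model; inlining at least keeps the continuation in the same assistant turn.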
1k lines of code, 5 main functions that scale in complexity. Small code to run agents, not small models. A tools/plugins framework with tool sharing hosted on Hugging Face. Runs with self-hosted open-weight models or proprietary inference models.
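If this is Hugging Face's smolagents (the description of ~1k lines, Hub-hosted tool sharing, and open-weight or proprietary backends matches it), the quickstart looks roughly like the sketch below; CodeAgent, DuckDuckGoSearchTool and HfApiModel are names from that library's examples, so adjust if the release has renamed them.

```python
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

# HfApiModel defaults to a hosted model on the HF Inference API; the other
# model classes cover self-hosted open-weight or proprietary backends.
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())
agent.run("How many seconds would it take a leopard at full speed to cross the Pont des Arts?")
```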
There will be no single model that will rule the universe, neither next year nor next decade. Instead, the future of AI will be multi-model.
Good quote at the end IMO:
The greatest inventions have no owners. Ben Franklin’s heirs do not own electricity. Turing’s estate does not own all computers. AI is undoubtedly one of humanity’s greatest inventions; we believe its future will be — and should be — multi-model