
The evidence-backed model delivered impressive results, but it doesn’t validate the wave of AI therapy bots flooding the market.

Welcome to the Generative AI community on Lemmy! This is a place where you can share and discuss anything related to generative AI, a family of technologies that can create new content, such as images, text, or audio, by learning from existing data. You can post your own creations, ask for feedback, share resources, or just chat with other fans. Whether you are a beginner or an expert, you are welcome here. Please follow the Lemmy etiquette and be respectful to each other. Have fun and enjoy the magic of generative AI!
P.S. Every aspect of this community was created with AI tools. Isn't that nifty?
Llama 4 Behemoth seems 👍
My go-to LLMs, in no specific order –
What's your go-to?
When in a hurry, I just use the Gemini voice assistant or Meta AI (I have the Messenger app).
AI Search Has A Citation Problem, Study Finds
We Compared Eight AI Search Engines. They’re All Bad at Citing News.
cross-posted from: https://slrpnk.net/post/19631567
The Tow Center for Digital Journalism at Columbia University in the U.S. conducted tests on eight generative search tools with live search features to assess how accurately they retrieve and cite news content, and how they behave when they cannot.
Results in brief:
- Chatbots were generally bad at declining to answer questions they couldn’t answer accurately, offering incorrect or speculative answers instead.
- Premium chatbots provided more confidently incorrect answers than their free counterparts.
- Multiple chatbots seemed to bypass Robot Exclusion Protocol preferences.
- Generative search tools fabricated links and cited syndicated and copied versions of articles.
- Content licensing deals with news sources provided no guarantee of accurate citation.
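On the third point: the Robots Exclusion Protocol check that well-behaved crawlers perform before fetching a page is simple to illustrate with Python's standard library. This is a toy sketch; the bot name and paths are made up, not from the study.

```python
from urllib import robotparser

# Hypothetical robots.txt rules: ExampleBot is told to stay out of /articles/.
rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: ExampleBot",
    "Disallow: /articles/",
])

# A crawler that honors the protocol asks before fetching.
print(rp.can_fetch("ExampleBot", "https://news.example/articles/x"))  # False
print(rp.can_fetch("ExampleBot", "https://news.example/home"))        # True
```

Bypassing the preference, as the chatbots reportedly did, simply means skipping this check and fetching anyway.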
can generative AI retell an Edgar Allan Poe story?
We've set up a generative AI to do an interactive retelling of Edgar Allan Poe's Cask of Amontillado. Please follow the link to give it a try. Does this work? What do you think?
https://www.loomers.world/cask/
The AI has access to Poe's original text, and can just tell the story, in Poe's words. But you can also interact with the narrator and try to take it in a different direction.
The Australian government says the Chinese AI app poses a threat to the government and its assets.
cross-posted from: https://lemmy.sdf.org/post/28980041
Australia has banned DeepSeek from all government devices and systems over what it says is the security risk the Chinese artificial intelligence (AI) startup poses.
...
Growing - and familiar - concerns
Western countries have a track record of suspicion toward Chinese tech, notably the telecoms firm Huawei and the social media platform TikTok, both of which have been restricted on national security grounds.
...
Australia's science minister said in January that countries needed to be "very careful" about DeepSeek, citing "data and privacy" concerns.
In Italy, the chatbot was removed from app stores after its privacy policy was questioned; the Italian government had previously, in March 2023, temporarily blocked ChatGPT over privacy concerns.
Regulators in South Korea, Ireland and France have all begun investigations into how DeepSeek handles user data, which it stores on servers in China.
No, DeepSeek isn’t uncensored if you run it locally
DeepSeek's model is censored at both the application and training layers, a Wired investigation shows.
cross-posted from: https://lemmy.sdf.org/post/28978937
There’s an idea floating around that DeepSeek’s well-documented censorship only exists at its application layer but goes away if you run it locally (that means downloading its AI model to your computer).
But DeepSeek’s censorship is baked-in, according to a Wired investigation which found that the model is censored on both the application and training levels.
For example, a locally run version of DeepSeek revealed to Wired, through its visible reasoning output, that it should "avoid mentioning" events like the Cultural Revolution and focus only on the "positive" aspects of the Chinese Communist Party.
A quick check by TechCrunch of a locally run version of DeepSeek available via Groq also showed clear censorship: DeepSeek happily answered a question about the Kent State shootings in the U.S., but replied "I cannot answer" when asked about what happened in Tiananmen Square.
The development of DeepSeek-V3 was probably much more expensive than suggested by the Chinese company, researchers say
The development of DeepSeek-V3 was probably much more expensive than suggested. The company is said to have access to 60,000 GPUs.
cross-posted from: https://lemmy.sdf.org/post/28971543
DeepSeek is said to have access to tens of thousands of GPU accelerators for the development of its own AI models, including H100 GPUs, which fall under the US export bans. The reported costs of just under 5.6 million US dollars for DeepSeek v3 probably only represent a small part of the total bill.
In the paper on the V3 model, DeepSeek writes of a comparatively small data center with 2048 H800 accelerators from Nvidia. The company calculates hypothetical rental costs of 2 US dollars per hour and H800 GPU. With a total of just under 2.8 million computing hours (distributed across 2048 GPUs), this comes to 5.6 million US dollars.
However, the developers themselves cite a caveat: "Please note that the above costs only include the official training of DeepSeek-V3 …"
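The cost arithmetic in the paper is easy to sanity-check with the figures quoted above (2048 H800 GPUs, a hypothetical $2 rental rate per GPU-hour, just under 2.8 million GPU-hours):

```python
# Back-of-the-envelope check of the reported DeepSeek-V3 training cost,
# using only the figures quoted in the post above.
gpu_hours = 2_800_000          # "just under 2.8 million computing hours"
rate_usd = 2.0                 # hypothetical rental cost per H800 GPU-hour
gpus = 2048                    # H800 accelerators in the data center

total_usd = gpu_hours * rate_usd
wall_clock_days = gpu_hours / gpus / 24

print(f"${total_usd / 1e6:.1f}M")      # $5.6M
print(f"~{wall_clock_days:.0f} days")  # ~57 days
```

The numbers line up with the reported $5.6 million, which is precisely why researchers argue the figure covers only the final training run, not experiments, data work, staff, or the hardware itself.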
Zuck's new Llama is a beast
cross-posted from: https://lemmy.world/post/17926715
Llama 3.1 (405B) seems 👍. It and Claude 3.5 Sonnet are my go-to large language models. I use chat.lmsys.org. OpenAI may be scrambling now to release ChatGPT-5?
Marques Brownlee's latest vid is kinda unneeded
The new Siri vs the Rabbit R1 and Humane Pin.
Rabbit R1: https://youtu.be/ddTV12hErTc?si=tLR_GSXyRFtpgpJb
Humane AI Pin: https://youtu.be/TitZV6k8zfA?si=vI4mZMhN...
cross-posted from: https://lemmy.world/post/16792709
I'm an avid Marques fan, but for me, he didn't have to make that vid. It was just a set of comparisons. No new info. No interesting discussion. Instead he should've just shared that Wired podcast episode on his X.
I wonder if Apple is making its own large language model (LLM), and whether it'll be released this year or next. Or are they still weighing the cost-benefit analysis? If they think an Apple LLM won't earn much profit, they may not make one.
Quantized model issues
Hey, so first off, this is my first time dabbling with LLMs; I found most of my information by rummaging through GitHub repos.

I have a fairly modest setup: an older gaming laptop with an RTX 3060 video card with 6 GB VRAM. I run inside WSL2.

I have had some success running FastChat with the Vicuna 7B model, but it's extremely slow, at roughly one word every 2-3 seconds of output with --load-8bit, lest I get a CUDA OOM error. It starts faster, at 1-2 words per second, but slows to a crawl later on (I suspect because it also uses a bit of the 'Shared video RAM', according to Task Manager). So I heard about quantization, which is supposed to compress models at the cost of some accuracy. I tried ready-quantized models (compatible with the FastChat implementation) from huggingface.co, but ran into an issue: whenever I'd ask something, the output would be repeated quite a lot. Say I'd say 'hello' and I'd get 200 'Hello!'s in response. I tried quantizing a model myself with exllamav2 (usin
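For what it's worth, looping output like that is often tamed with a decoding-time repetition penalty. Here's a toy illustration of the mechanism behind the `repetition_penalty` knob found in toolkits like Hugging Face transformers; it's a sketch of the idea, not FastChat's actual code:

```python
# Toy repetition penalty: logits of tokens already generated are divided
# (if positive) or multiplied (if negative) by a penalty > 1, making
# repeats less likely at the next sampling step.
def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    out = list(logits)
    for tok in set(generated_ids):
        if out[tok] > 0:
            out[tok] /= penalty
        else:
            out[tok] *= penalty
    return out

logits = [2.0, -1.0, 0.5]   # next-token scores for a 3-token vocabulary
history = [0, 2]            # tokens 0 and 2 have already been generated
penalized = apply_repetition_penalty(logits, history)
print(penalized)            # tokens 0 and 2 pushed down, token 1 untouched
```

If your runner exposes a repetition penalty (or a no-repeat n-gram setting), nudging it up is usually the first thing to try before re-quantizing.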
Guiding Language Models of Code with Global Context using Monitors
Language models of code (LMs) work well when the surrounding code in the vicinity of generation provides sufficient context. This is not true when it becomes necessary to use types or functionality defined in another module or library, especially those not seen during training. LMs suffer from limited awareness of such global context and end up hallucinating, e.g., using types defined in other files incorrectly. Recent work tries to overcome this issue by retrieving global information to augment the local context. However, this bloats the prompt or requires architecture modifications and additional training. Integrated development environments (IDEs) assist developers by bringing the global context at their fingertips using static analysis. We extend this assistance, enjoyed by developers, to the LMs. We propose a notion of monitors that use static analysis in the background to guide the decoding. Unlike a priori retrieval, static analysis is invoked iteratively during the entire decoding…
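The core idea can be sketched in a few lines: a "monitor" (a stand-in for real static analysis here) supplies the identifiers valid at the cursor, and decoding keeps only candidate completions consistent with them, filtering out hallucinated names. The API table and candidate fragments below are invented for illustration, not from the paper.

```python
# Toy monitor-guided decoding sketch.
def monitor_valid_members(type_name):
    # Stand-in for static analysis over the project and its libraries.
    api = {"Logger": ["info", "warning", "error"]}
    return api.get(type_name, [])

def constrain(candidates, typed_prefix, valid_members):
    # Keep candidates that could still extend into a valid identifier.
    return [c for c in candidates
            if any(m.startswith(typed_prefix + c) for m in valid_members)]

valid = monitor_valid_members("Logger")
# After `logger.`, the model proposes several next fragments:
print(constrain(["inf", "warn", "printx"], "", valid))  # hallucinated 'printx' removed
```

The paper's contribution is doing this iteratively over the token stream during decoding, rather than retrieving context up front into the prompt.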
@gai Adobe Firefly cannibalizes stock photo market for creators https://venturebeat.com/ai/adobe-stock-creators-arent-happy-with-firefly-the-companys-commercially-safe-gen-ai-tool/
What are some differences between the kind of output you get from Microsoft's image generator and Midjourney?
With minimal tweaking, just giving relatively simple prompts to each, would you say one is measurably better than the other? In what ways? Or is it more of a subjective judgement?
KoboldAI discussion allowed in this group?
A nice fork from a main dev: https://github.com/henk717/KoboldAI
Main release: https://github.com/KoboldAI/KoboldAI-Client
So... Alignment problem.
Thoughts? Ideas? How do we align these systems? Some food for thought: when we have these systems do chain-of-thought reasoning, or other methods of logically working through problems to a conclusion, we've found that they can give misleading accounts of their own method; the stated reasoning may be coherent and sensible yet not reflect how the model actually reached its answer.
Here's the study I'm poorly explaining; read that instead: https://arxiv.org/abs/2305.04388