DuckAI is an open and scalable academic lab and open-source community working on various Machine Learning projects. Our team consists of researchers from the Georgia Institute of Technology and beyond, driven by our passion for investigating large language models and multimodal systems.
Our current work focuses on developing and analyzing a variety of dataset projects, with the aim of understanding the depth and performance of these models across diverse domains.
Our objective is to welcome people with a variety of backgrounds to cutting-edge ML projects and rapidly scale up our community to make an impact on the ML landscape.
We are particularly devoted to open-sourcing datasets that can become important infrastructure for the community, and to exploring ways to improve the design of foundation models.
Summary:
This paper proposes a new neural network architecture called the Transformer that is based solely on attention mechanisms, without using sequence-aligned RNNs or convolutions. The Transformer achieves state-of-the-art results in machine translation while being more parallelizable and requiring significantly less time to train. Key contributions:
Proposes multi-head self-attention as a replacement for recurrence and convolutions in encoder-decoder architectures. Self-attention connects all positions with a constant number of sequentially executed operations, whereas a recurrent layer requires O(n) sequential operations.
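To make the mechanism concrete, here is a minimal NumPy sketch of scaled dot-product attention with a multi-head split. It is a simplified illustration (no masking, dropout, or learned layers beyond the projection matrices) and not the paper's reference implementation; the function names are our own.

```python
# Minimal sketch of scaled dot-product attention and a multi-head split.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (..., seq_len, d_k); every position attends to every other.
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)        # (..., seq, seq)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)             # softmax over keys
    return weights @ V                                     # (..., seq, d_k)

def multi_head_self_attention(X, W_q, W_k, W_v, W_o, num_heads):
    # X: (seq_len, d_model); projections are (d_model, d_model);
    # d_model must be divisible by num_heads.
    seq_len, d_model = X.shape
    d_k = d_model // num_heads
    def split(M):  # (seq, d_model) -> (heads, seq, d_k)
        return M.reshape(seq_len, num_heads, d_k).transpose(1, 0, 2)
    Q, K, V = split(X @ W_q), split(X @ W_k), split(X @ W_v)
    heads = scaled_dot_product_attention(Q, K, V)          # (heads, seq, d_k)
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ W_o
```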
This paper proposes a framework called LLMs As Tool Makers (LATM) that enables large language models (LLMs) to create and utilize their own tools for solving complex reasoning tasks. The key idea is to separate the process into two stages - tool making and tool using. In the tool making stage, a powerful yet expensive LLM acts as the "tool maker" to generate reusable Python functions for solving demonstrations of a task. In the tool using stage, a lightweight and cost-effective LLM acts as the "tool user" to call these tools to solve new instances of the task.
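To illustrate the two-stage split, below is a rough sketch of a tool-maker/tool-user pipeline using an OpenAI-style chat client. The model names, prompts, and helper functions are illustrative assumptions, not the paper's released code.

```python
# Rough sketch of the LATM two-stage idea (illustrative; prompts and model
# choices are placeholders, not the paper's code).
from openai import OpenAI

client = OpenAI()

def make_tool(task_demonstrations: str) -> str:
    """Tool making: a strong (expensive) model writes a reusable Python function."""
    resp = client.chat.completions.create(
        model="gpt-4",  # the "tool maker"
        messages=[{
            "role": "user",
            "content": "Write a reusable Python function that solves tasks like these:\n"
                       + task_demonstrations,
        }],
    )
    return resp.choices[0].message.content  # source code of the tool

def use_tool(tool_code: str, new_instance: str) -> str:
    """Tool using: a cheaper model is only asked to call the existing tool."""
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",  # the "tool user"
        messages=[{
            "role": "user",
            "content": f"Given this tool:\n{tool_code}\n"
                       f"Write the call that solves: {new_instance}",
        }],
    )
    return resp.choices[0].message.content
```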
U.S. District Judge William Orrick said during a hearing in San Francisco on Wednesday that he was inclined to dismiss most of a lawsuit brought by a group of artists against generative artificial intelligence companies, though he would allow them to file a new complaint.
An intriguing video discussing Falcon 40B, another LLM that seems to perform quite well, especially given that it is much smaller than models like GPT-4.
BabyCommandAGI, which is based on @yoheinakajima's BabyAGI, can now automatically create apps just by providing feedback.
The following example is for creating a Reversi game Flutter app.
Set the following OBJECTIVE and INITIAL_TASK, then wait for about 30 minutes.
OBJECTIVE:
"Please install the Flutter environment via git, implement a Flutter app to play Reversi with black and white stones, and make the Flutter app you created accessible from outside the container by running 'flutter run -d web-server --web-port 8080 --web-hostname 0.0.0.0'."
opensouls/LMYield: Lightweight language for controlling OpenAI Chat API generations
LMYield enables you to guide OpenAI's Chat API generations into arbitrary output patterns, and is specifically designed to enhance chain of thought prompting for agents.
The motivating concept behind LMYield is that, for a given context, an agentic entity will spawn some number of ordered, related chains of thought, and these should be yielded as a subscribable stream.
Features:
Simple, intuitive syntax, based on Handlebars templating.
Rich output structure, with speculative caching and multiple generations to ensure the desired structure is produced.
Designed specifically for agentic chain of thought.
TypeScript, not Python.
Summary:
This paper presents a method to reconstruct 3D shapes from a single image in an end-to-end manner, without time-consuming optimization. Their approach consists of three main parts:
Multi-view synthesis: They leverage a view-conditioned 2D diffusion model, Zero123, to generate multi-view images of the input object.
Pose estimation: They estimate the elevation angle of the input image to determine the camera poses of the multi-view images (see the sketch below).
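As a sketch of the pose-estimation step, the snippet below turns an estimated elevation angle into look-at camera poses for views placed at fixed azimuth offsets around the object. The conventions here (radius, axis order, number of views) are assumptions for illustration and may not match the paper's actual pipeline.

```python
# Hedged sketch: estimated elevation angle -> camera poses on a sphere,
# all looking at the origin. Conventions are illustrative only.
import numpy as np

def look_at_pose(elevation_deg: float, azimuth_deg: float, radius: float = 1.5):
    """Camera-to-world pose for a camera on a sphere, looking at the origin."""
    el, az = np.deg2rad(elevation_deg), np.deg2rad(azimuth_deg)
    eye = radius * np.array([np.cos(el) * np.cos(az),
                             np.cos(el) * np.sin(az),
                             np.sin(el)])
    forward = -eye / np.linalg.norm(eye)                  # look toward origin
    right = np.cross(forward, np.array([0.0, 0.0, 1.0]))  # world z is "up"
    right /= np.linalg.norm(right)
    up = np.cross(right, forward)
    pose = np.eye(4)
    pose[:3, :3] = np.stack([right, up, -forward], axis=1)  # camera axes
    pose[:3, 3] = eye
    return pose

# e.g. poses for 8 views around the object at the estimated elevation:
poses = [look_at_pose(elevation_deg=20.0, azimuth_deg=a) for a in range(0, 360, 45)]
```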
I've seen the posts about SuperHOT and, just recently, the paper from Meta that uses RoPE interpolation, and I've noticed an immediate improvement that can be brought to this method. Basically, if you apply Neural Tangent Kernel (NTK) theory to this problem, it becomes clear that simply interpolating RoPE's Fourier space "linearly" is very sub-optimal, as it prevents the network from distinguishing the order and positions of tokens that are very close to each other.
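For concreteness, here is a minimal sketch contrasting linear position interpolation with an NTK-style rescaling of the rotary base, which is the rough idea the post argues for. The exact constants and scaling formula are assumptions and may differ from the post's implementation.

```python
# Sketch: linear RoPE interpolation vs. an NTK-style base rescale.
# Linear interpolation squeezes all positions into the original window;
# the NTK-style variant instead enlarges the rotary base so high-frequency
# (short-range) components are barely touched.
import numpy as np

def rope_frequencies(dim: int, base: float = 10000.0) -> np.ndarray:
    # Standard RoPE inverse frequencies for a head dimension `dim`.
    return 1.0 / (base ** (np.arange(0, dim, 2) / dim))

def linear_interpolation_angles(positions, dim, scale):
    # Linear interpolation: divide positions by the extension factor.
    inv_freq = rope_frequencies(dim)
    return np.outer(positions / scale, inv_freq)

def ntk_scaled_angles(positions, dim, scale):
    # NTK-style: rescale the base so low-frequency components stretch to cover
    # the longer context while high-frequency ones stay nearly intact.
    new_base = 10000.0 * scale ** (dim / (dim - 2))
    inv_freq = rope_frequencies(dim, base=new_base)
    return np.outer(positions, inv_freq)

# e.g. compare rotation angles for a 2x context extension with 128-dim heads:
pos = np.arange(8192)
linear = linear_interpolation_angles(pos, 128, 2.0)
ntk = ntk_scaled_angles(pos, 128, 2.0)
```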