The DeepSeek-R1 Effect and Web3-AI
The artificial intelligence (AI) world was taken by storm a few days ago with the release of DeepSeek-R1, an open-weights reasoning model that matches the performance of top foundation models while claiming to have been built on a remarkably low training budget and with novel post-training techniques. The release of DeepSeek-R1 not only challenged the conventional wisdom surrounding the scaling laws of foundation models – which traditionally favor massive training budgets – but did so in the most active area of research in the field: reasoning. The open-weights (as opposed to open-source) nature of the release made the model readily accessible to the AI community, leading to a surge of clones within hours. Moreover, DeepSeek-R1 left its mark on the ongoing AI race between China and the United States, reinforcing what has become increasingly evident: Chinese models are of exceptionally high quality and fully capable of driving innovation with original ideas.

Unlike most advancements in generative AI, which seem to widen the gap between Web2 and Web3 in the realm of foundation models, the release of DeepSeek-R1 carries real implications and presents intriguing opportunities for Web3-AI. To assess these, we must first take a closer look at DeepSeek-R1’s key innovations and differentiators.

Inside DeepSeek-R1

DeepSeek-R1 is the result of introducing incremental innovations into a well-established pretraining framework for foundation models. In broad terms, it follows the same training methodology as most high-profile foundation models, which consists of three key steps (a minimal code sketch of these stages follows below):

1. Pretraining: The model is initially pretrained to predict the next word using massive amounts of unlabeled data.

2. Supervised Fine-Tuning (SFT): This step optimizes the model in two critical areas: following instructions and answering questions.

3. Alignment with Human Preferences: A final fine-tuning phase is conducted to align the model’s responses with human preferences.

Most major foundation models – including those developed…
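To make these three stages concrete, the sketch below expresses each one as a loss function in PyTorch. Everything here is an illustrative assumption rather than DeepSeek-R1’s actual training code: the shapes, function names, and the use of a DPO-style objective for the preference-alignment stage are simplifications for exposition (alignment is also commonly implemented with reinforcement-learning methods such as RLHF).

```python
# Minimal, illustrative sketch of the three training stages in PyTorch.
# Shapes, names, and objectives are simplified assumptions, not
# DeepSeek-R1's actual training code.
import torch
import torch.nn.functional as F


def pretraining_loss(logits, tokens):
    # Stage 1 - Pretraining: predict the next token over unlabeled text.
    # logits: (batch, seq, vocab); tokens: (batch, seq) integer token IDs.
    return F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),  # predictions for positions 1..n-1
        tokens[:, 1:].reshape(-1),                    # targets shifted left by one
    )


def sft_loss(logits, tokens, response_mask):
    # Stage 2 - SFT: the same next-token objective, but scored only on the
    # response tokens of (instruction, response) pairs, so the model learns
    # to follow instructions and answer questions.
    # response_mask: (batch, seq), 1 where a token belongs to the response.
    per_token = F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),
        tokens[:, 1:].reshape(-1),
        reduction="none",
    ).view(tokens.size(0), -1)
    mask = response_mask[:, 1:].float()
    return (per_token * mask).sum() / mask.sum().clamp(min=1)


def preference_loss(policy_chosen_lp, policy_rejected_lp,
                    ref_chosen_lp, ref_rejected_lp, beta=0.1):
    # Stage 3 - Alignment: a DPO-style objective, shown here as one common
    # instantiation of preference alignment. Inputs are summed
    # log-probabilities of the preferred ("chosen") and dispreferred
    # ("rejected") responses under the policy being trained and under a
    # frozen reference model.
    margin = (policy_chosen_lp - policy_rejected_lp) - (ref_chosen_lp - ref_rejected_lp)
    return -F.logsigmoid(beta * margin).mean()


if __name__ == "__main__":
    batch, seq, vocab = 2, 16, 100
    logits = torch.randn(batch, seq, vocab)
    tokens = torch.randint(0, vocab, (batch, seq))
    mask = torch.zeros(batch, seq)
    mask[:, 8:] = 1  # pretend the last 8 tokens are the model's response
    print(pretraining_loss(logits, tokens).item())
    print(sft_loss(logits, tokens, mask).item())
    print(preference_loss(torch.tensor([-10.0]), torch.tensor([-12.0]),
                          torch.tensor([-10.5]), torch.tensor([-11.5])).item())
```

The alignment stage is where implementations diverge the most in practice: some labs use RLHF with a PPO-style loop, others use direct preference optimization as sketched above, and reasoning models such as DeepSeek-R1 lean on reinforcement-learning-based post-training, which is where much of its claimed novelty lies.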