In Technology • 1 min read • Jan 28, 2025, 10:03 p.m.
Why Data Is the New Gold
<p>Recently, the Chinese AI company DeepSeek released a large language model called DeepSeek-R1. According to the paper the team published, this model builds on DeepSeek-R1-Zero, which was trained using large-scale reinforcement learning alone, without a supervised fine-tuning step.
</p><p>Reinforcement learning is a type of machine learning where a model learns by interacting with an environment and receiving rewards for its actions, rather than from a prepared set of labeled examples. However, the DeepSeek researchers noted that DeepSeek-R1-Zero, trained this way alone, struggled with issues such as poor readability and language mixing.
</p><p>To address this, they developed a new training pipeline, which produced DeepSeek-R1 and its much improved performance. The pipeline begins by collecting thousands of cold-start examples, which are used to fine-tune the base model before reinforcement learning starts.
</p><p>Their aim was to explore whether incorporating a small amount of high-quality data as a cold start could improve the model's reasoning performance. The DeepSeek researchers reported that the cold-start data contributed to DeepSeek-R1's performance gains, which supports the statement that “data is the new gold.”
</p>
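<p>The cold-start idea described above can be illustrated with a toy sketch. This is only an analogy under stated assumptions, not DeepSeek's training code: the three-action environment <code>TRUE_REWARDS</code>, the function <code>run_agent</code>, and the <code>explore_every</code> schedule are all hypothetical names invented for this example. An agent learns which action pays best purely from the rewards it receives, and can optionally be seeded with a small amount of trusted prior data before learning starts.</p>

```python
from typing import Dict, List, Optional

# Hypothetical toy environment: three actions, each with a fixed reward.
TRUE_REWARDS: List[float] = [0.2, 0.5, 0.9]


def run_agent(
    steps: int,
    cold_start: Optional[Dict[int, float]] = None,
    explore_every: int = 5,
) -> float:
    """Return the total reward collected by a simple value-estimating agent.

    cold_start maps an action index to a trusted reward estimate. It stands
    in for a small amount of high-quality data supplied before learning.
    """
    n = len(TRUE_REWARDS)
    estimates = [0.0] * n  # the agent's current guess of each action's reward
    counts = [0] * n       # how many observations back each estimate

    # Cold start: seed the estimates from the trusted examples, if any.
    if cold_start:
        for action, reward in cold_start.items():
            estimates[action] = reward
            counts[action] = 1

    total = 0.0
    explore_index = 0
    for step in range(steps):
        if step % explore_every == 0:
            # Deterministic exploration: try the actions in round-robin order.
            action = explore_index % n
            explore_index += 1
        else:
            # Exploitation: pick the action with the highest estimate so far.
            action = max(range(n), key=lambda a: estimates[a])

        reward = TRUE_REWARDS[action]
        counts[action] += 1
        # Incremental running average of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
        total += reward
    return total


if __name__ == "__main__":
    print("from scratch:", run_agent(15))
    print("with cold start:", run_agent(15, cold_start={2: 0.9}))
```

<p>Over 15 steps, the agent that starts from scratch collects a total reward of about 8.0, because it spends most of its time on weaker actions before it discovers the best one. The agent seeded with a single trusted example collects about 12.4. The exploration schedule is deliberately deterministic so that the comparison is reproducible; a real reinforcement-learning setup would be far more complex.</p>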