New study reveals alarming accuracy failures in DeepSeek chatbot


Highlights

DeepSeek's chatbot became the most downloaded app on Apple's App Store shortly after launch but has significant accuracy issues.
A NewsGuard study found that the chatbot failed to give accurate responses 83% of the time, and a separate report uncovered serious security vulnerabilities.
Despite its flaws, analysts argue that the chatbot's low cost may matter more to users than its accuracy.

Story

In January 2025, a study by NewsGuard highlighted significant accuracy problems with the new AI chatbot developed by the Chinese company DeepSeek. Although the chatbot quickly became the most downloaded app on Apple's App Store, it responded accurately to news and information prompts only 17% of the time. That 83% fail rate was worse than that of competitors such as OpenAI's ChatGPT, which had a fail rate of 62%. In 30% of tested cases the chatbot repeated false claims, and in 52% it gave vague or unhelpful responses. These results raise concerns about the chatbot's reliability, especially for users seeking accurate information.

Security problems have also surfaced. Reports indicated that the company's chat histories and internal data were publicly exposed because of a cybersecurity lapse. The cloud security firm Wiz found an open, unauthenticated database belonging to DeepSeek that contained sensitive information, including API secrets and operational details. Wiz experts noted that they could quickly gain full control of the database, signaling severe shortcomings in DeepSeek's data protection protocols. The lapse is especially troubling because the industry is rushing to adopt AI tools while often overlooking essential cybersecurity measures.

Despite these accuracy and security flaws, interest in DeepSeek's cost-effectiveness remains. Analysts suggest that the chatbot's financial appeal is significant, since it operates at a fraction of the cost of established AI models like OpenAI's ChatGPT. A spokesperson from D.A. Davidson emphasized that the crucial aspect is not the bot's accuracy but its ability to answer a vast array of questions at 1/30th of the cost of comparable models. This sentiment reflects a growing preference in the tech market for cheaper solutions over accurate ones, even in applications sensitive to misinformation.

The potential impact of DeepSeek on the tech landscape is considerable: volatility in tech stocks has been associated with the emergence of its chatbot, which was designed to challenge existing AI models from more established companies. This tension points to a broader competition among AI firms in which cost and functionality could overshadow accuracy and security in the long run, creating both challenges and opportunities for the sector.
