ChatGPT Is Getting Less Accurate, Study Reveals

A recent Stanford University study found that the performance of the popular chatbot ChatGPT fluctuated over time. The researchers tracked how well ChatGPT handled different tasks over a few months. These tasks included solving math problems, answering sensitive questions, generating software code, and visual reasoning.

The results were surprising: ChatGPT’s abilities were not consistent. The researchers looked at two versions of the technology, GPT-3.5 and GPT-4. On a math task, GPT-4 started off strong in March, correctly identifying prime numbers 97.6% of the time. Just three months later, its accuracy had dropped to a mere 2.4%. GPT-3.5, by contrast, improved on the same task, going from 7.4% accuracy to 86.8%.
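To make the accuracy figures concrete, here is a minimal Python sketch of how a prime-identification score could be computed. It is not the study's actual evaluation code: the `is_prime` and `accuracy` helpers and the `sample_answers` data are hypothetical, and the sketch assumes each model reply has already been reduced to a yes/no claim.

```python
def is_prime(n: int) -> bool:
    """Ground-truth primality check by trial division."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3 are prime
    if n % 2 == 0:
        return False
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def accuracy(answers: dict[int, bool]) -> float:
    """Fraction of numbers whose claimed primality matches the ground truth."""
    if not answers:
        raise ValueError("no answers to score")
    correct = sum(1 for n, claimed in answers.items() if claimed == is_prime(n))
    return correct / len(answers)


# Hypothetical model replies: number -> "the model said it is prime".
sample_answers = {7: True, 9: True, 11: True, 15: False, 17: False}
print(f"accuracy: {accuracy(sample_answers):.1%}")
```

Scoring the same fixed set of numbers against each dated version of a model is what makes a figure such as 97.6% in March comparable with one from three months later.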


Similar fluctuations occurred in tasks like writing code and visual reasoning. James Zou, a Stanford computer science professor involved in the study, was surprised by the significant changes in ChatGPT’s performance.

“When we are tuning a large language model to improve its performance on certain tasks, that can actually have a lot of unintended consequences, which might actually hurt this model’s performance on other tasks […]. There’s all sorts of interesting interdependencies in how the model answers things which can lead to some of the worsening behaviors that we observed.”

In other words, the shifts reflect the unintended consequences of fine-tuning the model rather than a decline in any single skill. Tweaking the model to improve one task can hurt its performance on others because of complex interdependencies within the model.


The Importance of Acknowledging the Performance Shifts

Unfortunately, because ChatGPT operates like a black box, researchers and the public can’t see how it works. This lack of transparency became more evident when OpenAI decided not to make its code open source. Zou emphasizes the importance of acknowledging these performance shifts and keeping an eye on how the models perform over time.
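One simple way to keep an eye on such shifts is to record a score for each dated version of a model and flag large changes between consecutive measurements. The Python sketch below illustrates the idea using the prime-number accuracies reported in the study. The `Snapshot` class, the `flag_shifts` function, and the 10-point threshold are illustrative assumptions, as is the March-to-June window for GPT-3.5. None of this is taken from the researchers' own tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    model: str
    month: str
    accuracy: float  # fraction between 0 and 1


def flag_shifts(snapshots, threshold=0.10):
    """Return (model, earlier month, later month, change) for every pair of
    consecutive snapshots whose accuracy moved by more than `threshold`.
    Snapshots for each model are assumed to be listed in time order."""
    by_model = {}
    for snap in snapshots:
        by_model.setdefault(snap.model, []).append(snap)

    flagged = []
    for model, history in by_model.items():
        for earlier, later in zip(history, history[1:]):
            change = later.accuracy - earlier.accuracy
            if abs(change) > threshold:
                flagged.append((model, earlier.month, later.month, change))
    return flagged


# Prime-identification accuracies as reported in the article.
reported = [
    Snapshot("GPT-4", "March", 0.976),
    Snapshot("GPT-4", "June", 0.024),
    Snapshot("GPT-3.5", "March", 0.074),  # assumed to cover the same window
    Snapshot("GPT-3.5", "June", 0.868),
]

for model, start, end, change in flag_shifts(reported):
    print(f"{model}: {start} to {end}, {change * 100:+.1f} percentage points")
```

Because the check looks at the size of a change rather than its direction, it flags GPT-3.5's improvement as well as GPT-4's decline, which matches the study's point that any large shift deserves attention.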

Not only did ChatGPT’s answers become less accurate, but it also stopped explaining its reasoning. Explaining the reasoning is akin to a student showing their work when solving a math problem step by step: it helps researchers understand how the AI arrives at its answers. However, ChatGPT started to skip this step, making it harder to study its reasoning process.

In the case of sensitive questions, both GPT-4 and GPT-3.5 initially refused to engage, stating that the questions were based on discriminatory ideas. But by June, ChatGPT simply declined to answer, providing less insight into its decision-making process.

To wrap it up, ChatGPT’s performance can be unpredictable, and understanding its inner workings remains a challenge. The study’s main message is the need to monitor and address these performance shifts in large language models.

