Date: 2023-03-15

The results show that the new system exceeds ChatGPT's capabilities on topics like math and verbal comprehension.

For example, while ChatGPT scored a 1 out of possible 5 on the Advanced Placement Calculus BC exam, GPT-4 scored a 4.

And while ChatGPT did poorly on the Uniform Bar Examination, scoring in the bottom 10% of test takers, GPT-4 scored in the 90th percentile.

However, the new system didn't do as well on tests related to writing.

GPT-4 didn't improve on ChatGPT's performance on the AP English Language or Literature exams (it got a 2 out of 5 on both) or on the writing portion of the Graduate Record Examination (GRE).

It also didn't score well on an advanced Leetcode exam, which tests developers' skills to prepare them for technical interviews.

Rumman Chowdhury, an expert in the responsible applications of AI, pointed out on Twitter that the results show the model is "incapable of abstract creativity." She added, "Humans aren’t replaceable."