
AI experts ready 'Humanity's Last Exam' to stump powerful tech

POSTED ON September 17, 2024 • Technology • BY Abiodun Saheed Omodara
Figurines with computers and smartphones are seen in front of the words "Artificial Intelligence AI" in this illustration taken February 19, 2024. Credit: Reuters

A team of technology experts issued a global call on Monday seeking the toughest questions to pose to artificial intelligence systems, which have increasingly handled popular benchmark tests like child's play.

Dubbed 'Humanity's Last Exam,' the project seeks to determine when expert-level AI has arrived.

It aims to stay relevant even as capabilities advance in future years, according to the organisers, a non-profit called the Centre for AI Safety (CAIS) and the startup Scale AI.

The call comes days after the maker of ChatGPT previewed a new model, known as OpenAI o1, which "destroyed the most popular reasoning benchmarks," said Dan Hendrycks, executive director of CAIS and an advisor to Elon Musk's xAI startup.

Hendrycks co-authored two 2021 papers that proposed tests of AI systems that are now widely used, one quizzing them on undergraduate-level knowledge of topics like US history, the other probing models' ability to reason through competition-level math. 

The undergraduate-style test has more downloads from the online AI hub Hugging Face than any such dataset.

At the time of those papers, AI was giving almost random answers to questions on the exams. "They're now crushed," Hendrycks told Reuters.

As one example, the Claude models from the AI lab Anthropic went from scoring about 77% on the undergraduate-level test in 2023 to nearly 89% a year later, according to a prominent capabilities leaderboard.

These common benchmarks have less meaning as a result.

AI has appeared to score poorly on lesser-used tests involving plan formulation and visual pattern-recognition puzzles, according to Stanford University’s AI Index Report from April. 

OpenAI o1 scored around 21% on one version of the pattern-recognition ARC-AGI test, for instance, the ARC organisers said on Friday.

Some AI researchers argue that results like this show planning and abstract reasoning to be better measures of intelligence, though Hendrycks said the visual aspect of ARC makes it less suited to assessing language models. "Humanity’s Last Exam" will require abstract reasoning, he said.

Answers from common benchmarks may also have ended up in data used to train AI systems, industry observers have said. 

Hendrycks said some questions on "Humanity's Last Exam" would remain private to make sure AI systems' answers are not from memorisation.

The exam will include at least 1,000 crowd-sourced questions, due on November 1, that are hard for non-experts to answer.

These will undergo peer review, with winning submissions offered co-authorship and up to $5,000 prizes sponsored by Scale AI.

"We desperately need harder tests for expert-level models to measure the rapid progress of AI," said Alexandr Wang, Scale's CEO.

One restriction: the organisers want no questions about weapons, which some say would be too dangerous for AI to study.
