Computer Agent Arena builds stronger AI models by assessing their ability to perform real-world tasks like web browsing and coding

By Mayuri Punithan

Cheriton School of Computer Science

Imagine asking AI to plan your trip itinerary, book and pay for all your flights, and arrange your airport transport, all with a single click. Fortunately, an international research team is making this vision a reality.

The team, composed of researchers from the University of Waterloo, the University of Hong Kong, Salesforce Research and Carnegie Mellon University, developed Computer Agent Arena, an evaluation platform that can be used to create and enhance computer agents.

A computer agent is a type of software that can perform tasks on behalf of a person or organization without needing constant human intervention. It can interpret the state of the computer and act autonomously to help users solve problems. Examples of computer agents include voice assistants like Siri and Alexa, which can help users send messages and schedule meetings.
