The European Union is tightening its grip on artificial intelligence with the introduction of a new compliance tool that evaluates major AI models against the bloc's stringent regulations. Developed by Swiss startup LatticeFlow AI in collaboration with ETH Zurich and Bulgaria's INSAIT, the Large Language Model (LLM) Checker has revealed notable shortcomings in some of the most prominent AI models from giants like Meta, OpenAI, Anthropic, and Mistral.
In the wake of OpenAI's ChatGPT launch in late 2022, the EU accelerated its efforts to regulate generative AI technologies, aiming to mitigate potential risks such as cybersecurity threats and discriminatory outputs. The LLM Checker assesses models across various categories, awarding scores between 0 and 1 based on technical robustness and safety.
According to a recent leaderboard published by LatticeFlow, Anthropic's Claude 3 Opus led the pack with an impressive average score of 0.89. OpenAI and Meta also fared well, each scoring above 0.75 on average. However, not all models passed with flying colors. Meta's Llama 2 13B Chat and Mistral's 8x7B Instruct received lower scores of 0.42 and 0.38 respectively in the critical area of \"prompt hijacking,\" a cyberattack where malicious prompts are disguised to extract sensitive information.
Petar Tsankov, CEO and co-founder of LatticeFlow, expressed optimism about the overall positive results. \"Our tool provides a clear roadmap for companies to enhance their AI models' compliance with the EU AI Act,\" Tsankov told Reuters.
The European Commission has welcomed the initiative, stating, \"The Commission welcomes this study and AI model evaluation platform as a first step in translating the EU AI Act into technical requirements.\"
As the AI Act rolls out over the next two years, companies that fail to comply could face hefty fines of up to 35 million euros or 7 percent of their global annual turnover.
Reference(s):
European Union AI Act checker reveals Big Tech's compliance pitfalls
cgtn.com