Systems using Artificial Intelligence (AI), machine learning and robotics are here, but the risks are often poorly understood, poorly articulated, and therefore seldom sufficiently considered during the system discovery and design phases. But where there is little risk, there is little reward. AI brings huge opportunities and, if implemented correctly from the start, has the potential to transform services and benefit millions of people.
How can we address risk at an appropriate point in the implementation of AI? Where there is technological risk, there will always be a need for testing, and a new approach to testing AI is required to help mitigate these new risks. This paper considers the testing implications of AI in all its guises. We’ve correlated our new testing techniques against KPMG’s AI Risk & Controls Framework, which describes the additional risks introduced with AI.
Why is the AI project failure rate so high?
In addition to the standard challenges of any IT development project, AI systems are inherently risky because their implementation relies so heavily on data, from which they continuously learn. This gives rise to five prominent reasons why AI projects struggle:
1. Data Quality
Machines learn from data and will therefore only be as good as the data they are trained on. If that data is not validated or cleaned correctly, its poor quality can unintentionally reinforce harmful biases, increase polarisation and result in damaging consequences (such as racial discrimination).
Read KPMG's paper on the Ethical Use of Customer Data in a Digital Economy to find out more.
2. Data Volume
Physical and regulatory limitations, e.g. GDPR, often prevent the datasets used for training and testing from being representative of the whole population. The more complex the pattern, the more training examples are needed: for example, it’s easier to distinguish gender than it is to determine race from a few samples. It’s hard to generalise about something you have never seen!
3. Evaluation Metrics
The ‘Test Oracle Problem’: has the test passed or failed? In traditional testing, expected results can be determined from a ‘test oracle’ such as specified requirements. However, this kind of empirical comparison against a fixed expected result is not always possible when an application is continuously changing its behaviour. Heuristic methods are more applicable, as are validation techniques more commonly associated with medical trials, which are conducted in phases to determine whether new treatments are safe and effective.
4. Regulatory Risk
It is often difficult to guarantee enduring compliance with the law, regulations, corporate policy and company values. AI can be a ‘black box’: it is not always clear how it makes predictions, and there is no way of guaranteeing the reasoning behind the decisions it makes. Real examples of these issues appear in the press regularly. One is the failure to comply with minimum-wage legislation, where the rules are not correctly ‘taught’ to the AI system and it allows, or even encourages, breaches to happen. Another arises when AI starts to apply racial, gender or other bias, which contravenes human rights and is usually at odds with an organisation’s values and policies around equality. Without such guarantees, many companies will be unable to accept the risk or to withstand the pressure of keeping up with new and emerging laws, particularly in highly regulated industries.
5. Lack of Knowledge and Skills
This field requires a skill set spanning business, mathematics, technology and logic; machine learning and data science are just two of the many in-demand skills. There is a significant demand-supply gap in the AI jobs market: the growth in projects is outpacing the supply of competent professionals with appropriate science, technology, engineering and maths (STEM) skills.
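To make the third reason above (evaluation metrics) more concrete, here is a minimal sketch of one heuristic way to test without a fixed expected result. The technique shown, metamorphic testing, is not named in this paper; it is offered only as an illustration. Instead of asserting an exact output, the test asserts a relation that should hold between two related inputs. The model `credit_score`, its inputs and the chosen relation (a higher income must never lower the score) are all hypothetical.

```python
import random


def credit_score(income: float, debt: float) -> float:
    """Hypothetical stand-in for a trained model; returns a score between 0 and 1."""
    return income / (income + debt + 1.0)


def broken_model(income: float, debt: float) -> float:
    """Deliberately faulty model, included only to show that the check can fail."""
    return 1.0 - credit_score(income, debt)


def find_monotonicity_violations(model, trials: int = 1000, seed: int = 0) -> list:
    """Metamorphic relation (assumed for illustration): with debt held fixed,
    raising income must not lower the score.

    Returns the violating cases; an empty list means the relation held
    for every sampled pair of inputs.
    """
    rng = random.Random(seed)
    violations = []
    for _ in range(trials):
        income = rng.uniform(0, 100_000)
        debt = rng.uniform(0, 50_000)
        increase = rng.uniform(1, 10_000)
        base = model(income, debt)
        follow_up = model(income + increase, debt)
        if follow_up < base:
            violations.append((income, debt, increase, base, follow_up))
    return violations


good_violations = find_monotonicity_violations(credit_score)
bad_violations = find_monotonicity_violations(broken_model)
print("violations (reference model):", len(good_violations))
print("violations (faulty model):", len(bad_violations))
```

A check of this kind never needs to know the "correct" score, so it can be re-run unchanged each time the model retrains. It is a heuristic, though: passing it shows only that the sampled cases respected the relation, not that the model is right.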
The right test framework for AI systems
An AI implementation with sufficient controls, checks and balances to satisfy regulatory and audit requirements needs focused design and sufficient investment. This can also address a dilemma already witnessed in industry: you cannot punish, fire or prosecute a computer, so the law requires a human to be accountable.
With so many potential failure points, it is important to mitigate the risk of failure with a carefully structured test framework that considers the following:
Challenges for Testing
Shift-right is the new Shift-left
As well as the risks and challenges outlined above, the fact that AI changes its behaviour in a live environment brings additional challenges and risk: we no longer have the safety net of change management and controlled release into production, and our scope for testing and managing that risk changes accordingly. KPMG have developed a test methodology specifically to address this problem. With our methodology, clients can understand the risks associated with AI, and build audit and control points into the AI design process that mitigate those risks as early as possible (‘Shift-Left’). However, there is also a need to operationalise the continuously changing behaviour of the system once live, and to measure and build trust in the AI application later in the process (‘Shift-Right’).
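As an illustration of what a ‘Shift-Right’ control might look like in practice, the sketch below compares the distribution of a model input (or output) observed in production against the distribution seen at sign-off, using the Population Stability Index (PSI). This is not KPMG's methodology, which this paper does not describe in detail; it is a generic monitoring technique chosen for the example. The function name `psi`, the equal-width binning and the alert thresholds (0.1 and 0.25 are commonly quoted rules of thumb) are assumptions.

```python
import math
import random


def psi(expected, actual, bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between a baseline sample and a live sample.

    Bins are equal-width over the baseline's range (an assumption made for
    simplicity; quantile bins are another common choice).
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width if the baseline is constant

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp so live values outside the baseline range fall in the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each proportion at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Synthetic data standing in for a feature captured at sign-off and in production.
rng = random.Random(42)
baseline = [rng.gauss(0.0, 1.0) for _ in range(5000)]
live_stable = [rng.gauss(0.0, 1.0) for _ in range(5000)]
live_drifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]

for name, sample in [("stable", live_stable), ("drifted", live_drifted)]:
    value = psi(baseline, sample)
    # Thresholds are illustrative rules of thumb, not regulatory limits.
    status = "investigate" if value > 0.25 else "watch" if value > 0.1 else "ok"
    print(f"{name}: PSI={value:.3f} -> {status}")
```

Run on a schedule against live traffic, a check like this gives an auditable signal that the system's operating conditions have moved away from those under which it was tested, which is one possible trigger for re-validation.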
Why act now?
There will be significant competitive advantage for those organisations that are successful in implementing this technology, and we believe that this can only be achieved by applying appropriate quality measures and testing. KPMG have a specialist team delivering Cognitive Test Assurance Services (CTAS), ranging from specialist technical consulting, AI test strategy development and operational continuous testing through to delivery of test design and execution for AI and related technologies. We can help you to understand and assess your risk, and work with internal compliance and risk functions to develop strategies that drive confidence in the use of AI, ultimately leading to successful AI implementation and unlocking the benefits of this exciting new technology!