Workplace Clinic: When Your Performance Review Includes an AI Score
“My company rolled out AI adoption metrics into our performance reviews three months ago. Every four weeks I receive a dashboard showing my ‘AI query frequency,’ ‘Copilot session duration,’ and ‘AI-assisted document creation rate’ — alongside a comparison to my team’s average. I work in legal compliance, where using AI without rigorous human verification creates real regulatory exposure. Last week my manager told me my adoption scores are ‘below average’ and I need to ‘show more engagement with our AI tools.’ I use AI — thoughtfully, for tasks where it genuinely helps, and deliberately not for tasks where it introduces risk. But that careful approach is apparently not what the metric measures. I don’t want to game the system. But I can’t afford a bad review. What do I actually do?” — Fatima, Senior Compliance Analyst, financial services sector
Your organization has confused the instrument for the outcome. That confusion is now being delivered to you as a performance problem.
The metric Fatima’s organization is using — frequency of AI tool sessions, query counts, AI-assisted document rate — measures tool use, not value created. These are not synonyms. In a well-functioning system, they might correlate. But in a compliance function, where AI hallucinations carry regulatory consequences and every AI-generated output must be verified before use, optimizing for tool-use frequency can actively make the work worse. The meter and the road are not the same thing.
The Clinical Read
This is a structural problem produced by a common organizational error. Economists call it Goodhart’s Law: when a measure becomes a target, it ceases to be a good measure. Named for the British economist Charles Goodhart, and given this familiar wording by anthropologist Marilyn Strathern, the principle has a long history in management science precisely because it is so reliably violated.
Applied to AI: when “Copilot session duration” becomes the metric, the rational employee response is to open Copilot and leave it running. When “AI query frequency” is the target, employees query the system repeatedly for things they already know the answer to. The dashboard shows excellent AI adoption. The output quality does not improve. In functions with regulatory stakes, it may decline.
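A toy sketch makes the inversion concrete. Everything in it is hypothetical: the `Analyst` record, the two scoring functions, and the numbers are illustrative assumptions, not your organization's actual dashboard logic.

```python
from dataclasses import dataclass


@dataclass
class Analyst:
    name: str
    ai_queries: int        # what a frequency dashboard counts
    verified_outputs: int  # AI outputs that were checked and actually used


def adoption_score(a: Analyst) -> int:
    """Frequency metric: every query counts, useful or not."""
    return a.ai_queries


def useful_share(a: Analyst) -> float:
    """Share of queries that led to a verified, usable output."""
    return a.verified_outputs / a.ai_queries if a.ai_queries else 0.0


# Hypothetical numbers for illustration only.
careful = Analyst("careful", ai_queries=20, verified_outputs=18)
gamer = Analyst("gamer", ai_queries=120, verified_outputs=18)

for a in (careful, gamer):
    print(a.name, adoption_score(a), round(useful_share(a), 2))
# careful 20 0.9
# gamer 120 0.15
```

Both analysts produce the same amount of usable work. The frequency score ranks the gamer six times higher.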
Fatima is not failing to adopt AI. She is refusing to perform adoption theater. The distinction matters — and it is worth naming clearly before she enters any conversation with her manager.
The scale of this problem is larger than one company. Microsoft’s 2024 Work Trend Index, which surveyed 31,000 workers across 31 countries, found that 52% of people who use AI at work are already reluctant to admit using it for their most important tasks, and 53% worry that using AI on important work makes them look replaceable (Microsoft Work Trend Index, May 8, 2024). The same survey found that only 39% of employees who use AI at work have received any AI training from their company, and only 25% of companies planned to offer generative AI training that year. Organizations are deploying measurement systems for a behavior they have not yet taught.
Fatima is experiencing the mirror of this dynamic. Where others hide their AI use to avoid looking replaceable, she is being penalized for using it with professional judgment rather than strategic visibility. Both behaviors are rational responses to poorly designed accountability systems.
There is a deeper risk Fatima should understand before she acts. Deloitte’s 2026 Global Human Capital Trends report, which surveyed more than 9,000 business and HR leaders across 89 countries, found that 59% of organizations are currently taking a tech-focused approach to AI. The same data showed that organizations taking a human-centric approach — one that intentionally designs how humans and AI collaborate, rather than simply maximizing tool usage — are 1.6 times more likely to exceed expectations on investment returns (Deloitte, March 4, 2026). Fatima’s organization is likely in the 59% that is, empirically, underperforming its potential by measuring the wrong things.
That is the clinical read. Now here is what to do.
The Three-Move Intervention
Move 1: Build an AI Outcomes Journal
Before you have any conversation with your manager, create a contemporaneous record — not a retrospective defense — of your actual AI use over the next three to four weeks.
For each instance of AI use, document:
- The task type (first-pass drafting, research summary, regulatory search, template creation, etc.)
- Whether the task was appropriate for AI use (low verification burden, low regulatory sensitivity)
- What the AI produced and whether it required correction
- Estimated time saved, or risk avoided by not using AI for adjacent tasks
This journal does two things. It shifts the conversation from “I don’t use AI enough” to “I use AI with documented purpose and traceable outcomes.” It also forces you to test your own assumption: are there tasks in your function where you could deploy AI more and should? If the journal reveals gaps, you can close them intentionally. If it confirms that your current level of use is proportionate to your function’s risk profile, you have a case.
The goal is not to produce a grievance document. It is to replace a vague metric comparison with a specific outcome record.
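If you would rather keep the journal as structured data than as free text, here is a minimal sketch. The field names, the `JournalEntry` record, and the sample entries are assumptions made for illustration. A spreadsheet with the same columns would serve equally well.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class JournalEntry:
    day: date
    task_type: str            # e.g. "first-pass drafting", "research summary"
    ai_used: bool             # False records a deliberate decision not to use AI
    appropriate_for_ai: bool  # low verification burden, low regulatory sensitivity
    needed_correction: bool   # did the AI output require fixing?
    minutes_saved: int        # your own estimate
    note: str = ""            # e.g. the risk avoided by not using AI


def summarize(entries):
    """Roll the journal up into the figures you would bring to a review."""
    used = [e for e in entries if e.ai_used]
    return {
        "ai_uses": len(used),
        "deliberate_non_uses": len(entries) - len(used),
        "needed_correction": sum(e.needed_correction for e in used),
        "minutes_saved": sum(e.minutes_saved for e in used),
    }


# Hypothetical entries for illustration only.
entries = [
    JournalEntry(date(2026, 6, 1), "research summary", True, True, False, 40),
    JournalEntry(date(2026, 6, 2), "template creation", True, True, True, 25),
    JournalEntry(date(2026, 6, 3), "regulatory filing review", False, False,
                 False, 0, "AI skipped: high verification burden"),
]

summary = summarize(entries)
print(summary)
```

Recording deliberate non-use alongside use is the point of the design: it turns "below average" into a documented judgment call rather than an absence.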
Move 2: Request a Conversation About Metric Design, Framed as Risk Protection
Do not frame this conversation as “I think this metric is unfair.” That framing positions you as a critic of management decisions and invites a defensive response.
Frame it as organizational risk. Specifically:
“I want to raise something that may have compliance implications for our team. As our AI adoption metrics are currently designed, the measure tracks frequency of AI tool engagement. In our function, some of the highest-risk activities are precisely those where AI use without careful verification creates regulatory exposure. If our team optimizes for the adoption score in those task areas, we may inadvertently increase the risk profile. I’d like to understand whether there’s a role-adjusted version of the metric, or whether we can map our function’s appropriate AI use cases so we’re measuring the right thing.”
This framing does three things simultaneously: it signals that you are engaged with AI adoption, not resistant to it; it establishes your professional judgment as a compliance asset, not an obstruction; and it surfaces the metric design flaw as a risk to the organization, which is a problem any competent manager will want to address.
The Center for Creative Leadership’s research on psychological safety consistently shows that teams where professionals can raise concerns directly and constructively — framed as problem-solving rather than complaint — produce better outcomes and handle organizational change more effectively (CCL, 2025). You are not criticizing a policy. You are doing your job.
Move 3: Ask for Role-Specific AI Guidelines in Writing
The third move converts the conversation from a one-off discussion to a documented policy improvement.
Once you have raised the metric design concern, make a specific request:
“Could we develop a framework that maps our compliance function’s tasks against appropriate AI use cases — noting where AI adds value, where it requires additional verification, and where it introduces risk? This would give our team clearer guidance and allow the adoption metrics to reflect actual performance improvement rather than tool frequency.”
This request serves you in two ways. If management responds positively, you get clarity — and a documented basis for evaluating your AI use that accounts for your role’s specific requirements. If management declines, or cannot engage with the substance of the request, that is important information about how seriously your organization has actually thought through its AI adoption strategy.
And if nothing changes and the adoption score continues to penalize professional judgment? That is a data point about whether this organization has designed its AI systems to reward competence or to reward adoption theater.
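The framework requested in Move 3 can be pictured as a simple task map plus a role-adjusted rate. This is a sketch under stated assumptions: the three categories mirror the wording of the request, while the task names, the `TASK_MAP` contents, and the `role_adjusted_rate` function are hypothetical and would need to come from your own compliance team.

```python
from enum import Enum


class AiUse(Enum):
    ADDS_VALUE = "AI adds value"
    VERIFY_FIRST = "AI requires additional verification"
    AVOID = "AI introduces risk"


# Hypothetical mapping; real entries would be agreed with management.
TASK_MAP = {
    "template creation": AiUse.ADDS_VALUE,
    "research summary": AiUse.VERIFY_FIRST,
    "regulatory filing review": AiUse.AVOID,
}


def raw_rate(log):
    """Frequency-style metric: share of all tasks where AI was used."""
    return sum(used for _, used in log) / len(log) if log else 0.0


def role_adjusted_rate(log):
    """Share of AI-eligible tasks where AI was used.

    Tasks mapped to AVOID are excluded. Unmapped tasks are treated as
    AVOID, which is a conservative assumption of this sketch.
    """
    eligible = [
        used for task, used in log
        if TASK_MAP.get(task, AiUse.AVOID) is not AiUse.AVOID
    ]
    return sum(eligible) / len(eligible) if eligible else 0.0


# Hypothetical work log: (task, ai_used) pairs.
log = [
    ("template creation", True),
    ("research summary", True),
    ("regulatory filing review", False),
    ("regulatory filing review", False),
]

print(raw_rate(log), role_adjusted_rate(log))
# 0.5 1.0
```

The same work log reads as 50% adoption on a frequency dashboard and 100% once the tasks where AI should not be used are taken out of the denominator.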
The Uncomfortable Truth About AI Adoption Scoring
Most organizations deploying AI adoption metrics have not asked the question that matters: what outcome are we actually trying to produce? The answer should be something like: “improved quality of outputs, faster execution of appropriate tasks, reduced cognitive burden on professionals for routine work.”
Instead, the metric most organizations have deployed measures something different: how often the tool is opened. These are operationally distinct, and the gap between them grows wider in functions with high verification requirements.
Gallup’s 2026 State of the Global Workplace report found that global employee engagement fell to 20% in 2025 — its lowest level since 2020 — at a cost of an estimated $10 trillion in lost productivity (Gallup, April 8, 2026). Organizations trying to recover engagement through AI adoption mandates backed by frequency metrics are likely adding another source of disengagement — the particular frustration of being evaluated on the wrong thing.
For Fatima: you are not failing. You are being measured by a metric that was not designed with your function in mind. The three moves above give you a path to address that without performing theater and without triggering a defensive response from your manager.
The clearest sign your organization is serious about AI is not that it measures how often you open a tool. It is whether it has thought carefully enough about your specific function to design metrics that distinguish between using AI well and using AI often.
Those are not the same thing. Any organization that has missed this distinction needs someone willing to name it.
That is your job right now. Do it directly, do it constructively — and do it in writing.
References
- Microsoft. (May 8, 2024). “2024 Work Trend Index: AI at Work Is Here. Now Comes the Hard Part.” https://www.microsoft.com/en-us/worklab/work-trend-index/ai-at-work-is-here-now-comes-the-hard-part (Accessed June 2026)
- Deloitte. (March 4, 2026). “2026 Global Human Capital Trends.” https://www.deloitte.com/us/en/insights/topics/talent/human-capital-trends.html (Accessed June 2026)
- Gallup. (April 8, 2026). “State of the Global Workplace 2026.” https://www.gallup.com/workplace/349484/state-of-the-global-workplace.aspx (Accessed June 2026)
- Gallup. (March 24, 2026). “U.S. Worker Thriving Declines as Job Market Pessimism Grows.” https://www.gallup.com/workplace/703280/worker-thriving-declines-job-market-pessimism-grows.aspx (Accessed June 2026)
- Center for Creative Leadership. (2025). “What Is Psychological Safety at Work?” https://www.ccl.org/articles/leading-effectively-articles/what-is-psychological-safety-at-work/ (Accessed June 2026)
AI-Generated Content Notice
This article was created using artificial intelligence technology. While we strive for accuracy and provide valuable insights, readers should independently verify information and use their own judgment when making business decisions. The content may not reflect real-time market conditions or personal circumstances.
Whenever possible, we include references and sources to support the information presented. Readers are encouraged to consult these sources for further information.