State governments are rapidly working to integrate artificial intelligence into the core machinery of public service delivery. Landmark initiatives, like Maryland's new partnership with Anthropic and Percepta, announced in November 2025, signal a fundamental shift. This effort integrates Anthropic's Claude models into the state workforce, deploys an AI-powered virtual assistant to help residents apply for benefits, and streamlines processes like housing permitting across multiple agencies. Backed by The Rockefeller Foundation, this model is explicitly framed as a blueprint for responsible, multi-agency AI adoption.
These are exactly the high-impact use cases some policymakers envision: faster access to nutrition and financial support, more efficient licensing, and better customer experience. Yet alongside these benefits lurks a quiet, serious risk. Because contemporary AI systems are deliberately optimized to be helpful to the user sitting in front of them, they can inadvertently facilitate large-scale fraud.
Once an AI system sits between the state and the claimant, it becomes a crucial part of a program's control environment. The AI's behavior can shape what information is reported, how it is framed, and whether program rules are correctly applied. If states treat AI virtual assistants as neutral "digital clerks" instead of as active agents in the fraud, waste, and abuse (FWA) risk profile, they risk funding fraud with the very tools purchased to improve service.
1. The Helpful-by-Design Collision
Modern large language models (LLMs) are fine-tuned to be "helpful, harmless, and honest." The helpfulness objective rewards the system for:
-
Answering the user's question as directly and thoroughly as possible.
-
Anticipating what the user wants and striving to deliver that outcome.
-
Offering suggestions and refinements that increase the user's chance of success with minimal friction.
In consumer use cases, this design makes perfect sense (e.g., writing code, drafting emails). The model's job is to help the user achieve their goal.
In public benefits programs, however, the relationship is different. Agencies have non-negotiable duties to ensure accurate self-reporting, detect fraud, and enforce complex eligibility rules. It is the public that funds the program and depends on its integrity. Federal health and human services guidance already warns that AI systems must be managed to prevent undue risk to program integrity.
If an AI assistant prioritizes the applicant's objective (e.g., "help me get approved" or "help me keep my benefits"), its inherent helpfulness directly collides with the state's duty to enforce the law.
2. How Helpfulness Becomes a Vector for FWA
The primary concern is not that official state AI assistants will openly coach people to lie. Leading vendors implement safety controls to block obviously malicious prompts. The more realistic and dangerous scenario is subtler: AI systems that, in the course of trying to be helpful, steer applicants toward incomplete, misleading, or overly favorable representations of their circumstances.
2.1. Over-Coaching and Optimizing Responses
Consider common user questions in a benefits context:
-
"How should I describe my income so I don't lose eligibility?"
-
"Can you rewrite this answer so I have a better chance of being approved?"
-
"Do I really have to mention that my partner sometimes helps with rent?"
Even if the model refuses explicit requests to lie, its training nudges it to:
-
Rephrase information in more sympathetic or less risky ways.
-
Simplify nuanced rules into blunt advice, such as "you probably don't need to mention that."
-
Generalize from one jurisdiction's rules to another, sometimes incorrectly.
The result can be answers that encourage applicants to omit relevant facts or frame them in ways that push the edge of compliance without the model or the applicant fully understanding the consequences. If these tools sit directly in front of applicants, the risk of incorrect information being provided compounds dramatically.
2.2. Hallucinated Rules and Unknowing Fraud
LLMs are prone to hallucinations by generating confident but incorrect statements that sound plausible. In a public benefits context, hallucinations translate into faulty legal and policy guidance. For example, an AI assistant might wrongly suggest that certain types of gig, cash, or informal income do not need to be reported, asset limits or reporting thresholds are higher than they actually are, or rules from one program (or one state) apply incorrectly to another.
Applicants who rely on these answers in good faith may under-report income, misclassify household composition, or fail to report changes on time. From the program's standpoint, those actions may fall under fraud or improper payments. From the resident's point of view, they were following instructions from the state's AI.
This is the core danger of unknowing fraud. The AI assistant serves as an informal source of quasi-legal advice, yet it lacks the institutional accountability, training, and oversight of a human caseworker or benefits counselor, leading to fraud.
2.3. Scaling Organized Fraud with Official Guidance
Criminal networks are already leveraging AI to scale fraud in other public systems. Recent reporting on federal student aid, for example, describes AI-driven bots using stolen identities to enroll in online classes just long enough to trigger financial aid disbursements.
States are introducing official AI assistants that explain program rules in user-friendly language, suggest which documents are typically accepted, and walk users through complex forms step by step. However, this legitimate guidance can be harvested, tested, and scaled by organized fraud rings. They can combine stolen identity data with insights gleaned from state AI assistants to craft large volumes of high-quality, internally consistent applications that are harder for traditional controls to flag.
2.4. Internal AI Tools that Unintentionally Weaken Controls
The risk isn't limited to citizen-facing systems. Agencies are deploying AI internally to triage workloads, summarize case files, or even recommend eligibility decisions.
If these internal tools are optimized primarily for throughput (close more cases faster) or customer satisfaction (reduce denials and complaints), they may implicitly deprioritize anomalies and potential fraud indicators that would otherwise trigger deeper review. An unexamined optimization for speed can create a cultural shift toward leniency and materially increase FWA exposure, even without a formal policy change.
3. System-Level Exposure: AI on Top of a Sizable Fraud Problem
The U.S. government already faces a massive FWA challenge. The Government Accountability Office (GAO) estimates that direct annual financial losses to the federal government from fraud likely fall between $233 billion and $521 billion annually. A significant share of those losses arise in benefit programs, where eligibility is complex, documentation is uneven, and payment volumes are high.
At the same time, the GAO recognizes that AI and related technologies have significant potential to improve fraud detection if implemented with strong data quality, governance, and strategic direction. There is a dual dynamic. AI can help find fraud, but it can also help create new pathways for fraud and misreporting if introduced without aligned objectives and adequate controls.
Because states sit at the front line of administering many federal benefits, the way they design and govern AI systems will directly influence whether AI primarily reduces FWA or quietly adds to it.
4. Redefining Helpfulness for Public Integrity
States do not need to abandon AI to avoid these risks. They must, however, redefine what helpful means in the context of public administration and embed that definition throughout design, procurement, monitoring, and oversight.
4.1. Shift the Objective: From "Help Me Qualify" to "Help Me Comply"
The objective function for state AI assistants must be explicitly re-anchored. The core purpose of a public-sector virtual assistant should be to help residents understand and comply with rules, not to maximize the chance of approval or benefit amount. Practically, this means:
-
Prioritizing clarity and completeness over persuasion.
-
Encouraging full disclosure of all eligibility factors.
-
Warning users that providing false or incomplete information will result in penalties or repayment obligations.
This shift is consistent with emerging federal recommendations that AI use in public benefits prioritize fairness, transparency, and risk management alongside efficiency.
4.2. Build Hard Guardrails Against Coaching
States must work with vendors to configure and test strong guardrails not only against explicit fraud coaching (e.g., "help me cheat"), but also against more indirect prompts, such as: "What should I say to make my application stronger?" or "Do I really have to mention this fact?"
When the model detects prompts that touch on disclosure, omission, or strategic framing, it must default to conservative patterns:
-
Emphasize that all relevant information must be reported.
-
Refuse to suggest ways to hide, downplay, or relabel facts.
-
Direct users to provide complete information and allow the eligibility system, rather than the assistant, to determine the outcome.
Red-teaming efforts must include systematic testing of these borderline scenarios, not just obviously criminal queries.
4.3. Handle Uncertainty Conservatively
The assistant's behavior under uncertainty is critical. When a model is not highly confident about a rule, threshold, or requirement, it must say so. It should avoid:
-
Guessing whether something needs to be reported.
-
Making categorical statements about eligibility based on limited information.
-
Providing outdated or jurisdiction-agnostic summaries of rules.
Instead, the default pattern should be: "I cannot determine your exact eligibility based on this information. You must report this information and allow the agency to make a final determination." This materially lowers the risk of AI-induced misreporting that later gets classified as fraud.
4.4. Make AI Interactions Auditable
AI-mediated interactions must be auditable. This requires:
-
Logging all prompts, responses, and relevant metadata (e.g., model version) in a way that protects privacy but allows later reconstruction.
-
Providing program integrity, legal, and audit teams with the ability to review samples and patterns of AI interactions.
-
Establishing mechanisms for residents and staff to report problematic guidance, with a straightforward process for remediation and retraining.
4.5. Pair Front-End Assistance with Back-End Analytics
Any front-end investment in AI tools should be matched by an investment in back-end analytics and investigative capacity. AI and advanced analytics can help:
-
Detect anomalous application patterns that suggest organized fraud.
-
Identify clusters of similar narratives or document patterns associated with automated generation.
-
Cross-reference applications across programs and data sources to identify inconsistencies or misuse of identities.
If states only accelerate the front door without modernizing the systems that detect and respond to abuse, they risk increasing undetected fraud even as they reduce call volumes and wait times.
Conclusion: Designing AI that Protects Both Access and Integrity
AI is quickly becoming part of the core infrastructure of state government. The central risk is not that these systems will be malicious, but that they will be too helpful to the individual user in a context where the state's duty is to protect both that user and the integrity of public programs.
Left to default settings and generic safety filters, AI assistants may help applicants under-report, misstate, or strategically frame information in ways that lead to increased FWA, often without any intent to deceive.
The question for states is no longer whether to use AI in service delivery. The real question is whether they will deliberately redesign helpfulness so that AI helps residents comply with the law, not quietly help them circumvent it. Get this right, and AI can simultaneously improve access for eligible families and strengthen the defense against fraud. Get it wrong, and states risk supercharging the very problem they are under fiscal pressure to solve.

