
Claude AI Contract Review: Risks, Limits & Better Alternatives
Claude Opus 4.7 scored 90.9% on Harvey's BigLaw Bench — the standard the legal industry uses to measure AI on real legal work. But that score comes with a caveat most teams miss: it reflects Claude running through Harvey's purpose-built platform, not a general deployment. In this post, Legartis CEO David A. Bloch breaks down what Claude Legal actually is, the four risk categories that emerge when enterprise legal teams use it without safeguards, and what purpose-built legal AI does differently.
Over the past year, Anthropic has made a serious push into legal — building out a dedicated set of legal solutions for Claude, including practice-area plugins, connectors to the tools legal teams already use such as iManage, NetDocuments, DocuSign, and Thomson Reuters, and purpose-built integrations with major legal platforms. Most recently, in April 2026, they released Claude Opus 4.7, which scored 90.9% on Harvey's BigLaw Bench — the benchmark that has become the industry standard for measuring AI performance on real legal work.
The questions every legal and procurement leader is asking right now: Is Claude actually ready to review real contracts? And does using it as-is pose any risks?
I have spent years building contract AI for enterprise legal teams across Europe. Below, I cover what Claude AI for legal teams offers, where it's actually useful, where it starts creating risk once real contracts get involved, and what you need to know if you're considering Claude for legal work right now.
What Claude Legal Actually Is
The name Claude Legal is a little misleading. Claude is Anthropic's general-purpose AI assistant — similar in category to ChatGPT — meaning it's built to answer questions, summarize documents, draft text, and assist with analysis across any topic. At its core, Claude is not a legal product. It's a general AI that happens to be capable of handling legal tasks when you prompt it to.
What changed over the past year is that Anthropic built out a dedicated legal solutions layer on top of Claude, accessible through Claude Cowork — their agentic desktop application. This includes practice-area plugins configured for commercial legal, corporate legal, IP, and litigation work, as well as connectors to iManage, NetDocuments, DocuSign, Ironclad, and Thomson Reuters.
The workflow is natural language-driven: you can ask Claude to review a counterparty's redlines against your playbook, flag change-of-control provisions in an M&A data room, or draft a research memo on regulatory guidance — and it will execute those multi-step tasks end to end. Anthropic's own documentation makes clear that all Claude-generated legal outputs should be reviewed by a licensed attorney before reliance.
At the model level, Claude can process up to 200,000 tokens in a single session, roughly 500 pages of text. For legal work, that means a long merger agreement or a full due diligence pack can sit in context at once, which makes it easier to follow cross-references between clauses.
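The "about 500 pages" figure can be sanity-checked with simple arithmetic. This is a rough sketch only: the words-per-token and words-per-page ratios below are assumptions for typical English prose, not numbers published by Anthropic, and dense contract formatting will shift them.

```python
# Rough conversion from a token budget to printed pages.
# Both ratios are assumptions for English prose, not vendor figures.
WORDS_PER_TOKEN = 0.75   # assumed average; varies by tokenizer and language
WORDS_PER_PAGE = 300     # assumed density for a typical contract page


def estimated_pages(tokens: int) -> int:
    """Return an approximate page count for a given number of tokens."""
    words = tokens * WORDS_PER_TOKEN
    return round(words / WORDS_PER_PAGE)


print(estimated_pages(200_000))  # 500
```

Under these assumptions the budget works out to about 150,000 words. A data room full of scanned exhibits or multilingual annexes could consume tokens considerably faster.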
For certain workflows, using Claude makes sense. A first-pass read of a standard NDA, a plain-English summary of a complex section for a non-lawyer, or quick legal research to point your team in the right direction — it's a useful tool for that. Claude is clearly a capable foundation. But the issue is what happens when enterprise legal teams try to use it without safeguards on real contracts.
The Risks of Using AI for Contract Review at the Enterprise Level
1. The Accuracy and Configuration Gap
This is where the numbers get important. Anthropic's latest model, Claude Opus 4.7, scored 90.9% on Harvey's BigLaw Bench — which sounds impressive, and it is. But that score reflects Claude running through Harvey's purpose-built legal platform, with all the configuration, connectors, and legal-specific tuning that entails.
When your team uses Claude directly, as a general assistant without a legal-specific playbook, structured connectors to your document systems, or specialized training on your contract types, you should not assume you're getting that 90.9%.
General AI models also still hallucinate, meaning they can generate confident, plausible-sounding outputs that are completely false. In one well-known 2023 case, Mata v. Avianca, a New York attorney named Steven Schwartz was sanctioned after submitting a brief with six AI-generated case citations that turned out to be fabricated, and that risk hasn't gone away with newer models. According to Pactly, relying on a general AI tool for anything beyond general brainstorming requires 100% human verification, which eliminates most of the efficiency gain you were going for in the first place.
2. No Playbook Enforcement — the AI Doesn't Know Your Rules
Claude has no embedded knowledge of your fallback positions, your negotiation history, or your internal risk tolerance. It might suggest an edit that looks fine on paper but violates a core business policy — like agreeing to an indemnity clause your CFO would never accept, or a liability cap that's far below your standard floor.
And because every prompt starts from zero, the AI has no memory of how your team has negotiated the last 500 contracts. You get different outputs depending on who's using it, what kind of deal they're working on, and where the contract is based.
3. Data Security and GDPR Exposure
This is the biggest risk for European companies. If a contract contains personal data (names, signatures, addresses, or any identifying information), uploading it to a general AI tool without the right security guarantees and a lawful basis for processing can put you in breach of GDPR. The same logic applies to CCPA in the US.
The penalty for the most serious GDPR violations is up to 4% of annual global turnover or €20 million, whichever is higher, and European authorities have already penalized OpenAI over GDPR concerns. There's also attorney-client privilege exposure: the moment you upload a privileged document to a consumer version of Claude, you risk compromising the confidentiality that privilege depends on.
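To make the "whichever is higher" rule concrete, here is a minimal sketch of the fine ceiling as a function of turnover. The function name and the example turnover figures are illustrative assumptions. It computes only the statutory maximum, not what a regulator would actually impose.

```python
# Upper-tier GDPR fine ceiling: the greater of a fixed amount
# and a share of annual global turnover.
FIXED_CAP_EUR = 20_000_000
TURNOVER_PERCENT = 4


def max_gdpr_fine_eur(annual_global_turnover_eur: int) -> int:
    """Return the statutory ceiling in euros (integer arithmetic)."""
    turnover_based = annual_global_turnover_eur * TURNOVER_PERCENT // 100
    return max(FIXED_CAP_EUR, turnover_based)


# Hypothetical companies, chosen to sit either side of the break-even point.
print(max_gdpr_fine_eur(100_000_000))    # 20000000
print(max_gdpr_fine_eur(2_000_000_000))  # 80000000
```

The two branches meet at €500 million in turnover. Below that, the €20 million figure is the ceiling. Above it, the 4% figure takes over.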
Anthropic does offer enterprise plans with data processing agreements, audit logs, and retention controls, but setting all of that up properly takes significant effort from your legal, IT, and compliance teams — and most teams that deploy Claude as-is are skipping all of that.
4. Your Prompts Can Be Pulled Into Litigation
This is the risk most teams don't find out about until they're already in litigation. A September 2025 advisory from the law firm Harris Beach Murtha makes it explicit: if any of your team's prompts or Claude's responses relate to something being argued in court, the other side's lawyers can demand to see them.
Their exact example is one every contract team should hear: if an employee used Claude to edit a draft contract section that's later alleged to be misleading, the content of the prompt and the AI's suggested edits could be used to show what the employee knew or what they were trying to do.
This is no longer theoretical. In May 2025, a federal judge in the Southern District of New York issued a preservation order in the New York Times v. OpenAI case, directing OpenAI to preserve all consumer output logs, meaning even chats that users manually deleted are now under legal hold and theoretically subject to subpoena. That order concerned OpenAI, but the same kind of exposure could arise with any consumer AI tool: any time you upload a sensitive legal document to the consumer version of Claude, you risk losing the legal protection that was keeping it private.
What Purpose-Built Legal AI Software Does Differently
If you want an AI contract review tool that doesn't come with the accuracy, compliance, and privilege issues above, book a demo with Legartis — an AI contract intelligence platform built specifically for enterprise legal, sales, and procurement teams.
Accuracy and Playbook Enforcement
Purpose-built legal AI platforms are trained on specific contract types and tuned for clause-level extraction — which is where the performance gap shows up most clearly in practice. A general Claude configuration without legal-specific tuning produces inconsistent results on structured tasks like clause extraction, particularly across different jurisdictions, contract languages, and deal types.
Purpose-built platforms validated against DACH contracts, for example, can achieve an F1 above 90% on clause extraction for those specific contract types — a meaningful difference for any legal team making a review decision based on what the AI found.
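For readers less familiar with the metric, F1 is the harmonic mean of precision (how many flagged clauses were correct) and recall (how many real clauses were found). A minimal sketch follows. The counts are made-up illustrations, not Legartis or Claude results.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), equivalent to the harmonic mean
    of precision and recall. Returns 0.0 when nothing was found or expected."""
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        return 0.0
    return 2 * true_positives / denominator


# Hypothetical extraction run: 90 clauses found correctly,
# 10 flagged wrongly, 10 missed.
print(round(f1_score(90, 10, 10), 3))  # 0.9

# Hypothetical run with few false alarms but many missed clauses.
print(round(f1_score(95, 5, 30), 3))   # 0.844
```

The second run shows why F1 matters for review decisions: a tool can look precise while still missing a large share of the clauses a reviewer needed to see.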
On the playbook side, these platforms let you turn your company's internal negotiation rules, fallback positions, and risk tolerances into review logic the AI actually applies to every contract. Instead of generic legal suggestions, the AI checks the contract against your own standards — giving you consistent output across your team, aligned with decisions your business has already made. This is what makes AI Quality measurable and auditable rather than a black-box number.
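As a rough illustration of what "review logic" can mean, the sketch below encodes two playbook rules as deterministic checks over values extracted from a contract. Every name here (`ExtractedTerms`, `Playbook`, `review`) is hypothetical, and the sketch does not describe how Legartis or any other product implements playbooks. It assumes an upstream extraction step has already produced structured values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedTerms:
    """Values an upstream extraction step is assumed to have produced."""
    liability_cap_eur: int
    governing_law: str


@dataclass(frozen=True)
class Playbook:
    """Hypothetical company rules: a liability floor and accepted laws."""
    min_liability_cap_eur: int
    accepted_governing_laws: frozenset


def review(terms: ExtractedTerms, playbook: Playbook) -> list:
    """Return a list of findings; an empty list means no rule was violated."""
    findings = []
    if terms.liability_cap_eur < playbook.min_liability_cap_eur:
        findings.append(
            f"Liability cap EUR {terms.liability_cap_eur:,} is below "
            f"the floor of EUR {playbook.min_liability_cap_eur:,}"
        )
    if terms.governing_law not in playbook.accepted_governing_laws:
        findings.append(f"Governing law '{terms.governing_law}' is not accepted")
    return findings


# Illustrative values only.
playbook = Playbook(1_000_000, frozenset({"Switzerland", "Germany"}))
draft = ExtractedTerms(liability_cap_eur=500_000, governing_law="England and Wales")
for finding in review(draft, playbook):
    print(finding)
```

Because the rules are data, not prompt wording, two reviewers running the same contract against the same playbook get the same findings, which is the consistency property the paragraph above describes.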
GDPR Compliance and Jurisdiction — Built In, Not Bolted On
A purpose-built European legal AI platform is GDPR-compliant out of the box, meaning you're not spending months configuring enterprise agreements, audit logs, and data processing addenda just to meet basic legal standards. Data is typically hosted in Europe and never shared with US hyperscalers or third parties, which means you avoid the privilege and data residency exposure covered above.
The better platforms are also trained on local legal corpora (German, Swiss, and Austrian contracts under civil law), which is the kind of jurisdictional nuance a general AI trained on the open web is likely to miss. If you're evaluating GDPR-compliant AI contract review software for a European or multinational team, this distinction is not optional.
Why Legartis
Legartis is built specifically for this use case. It achieves F1 above 90% on clause extraction, runs on Swiss infrastructure, is GDPR-compliant out of the box, and is trained on millions of contracts.
It also includes the Playbook Creator — an agentic feature that turns your internal rules into enforceable review logic across all your contracts in hours. One of our clients used the Playbook Creator to build a full Procurement Agreement playbook that his team had estimated would take two months of manual work — and he had it done in a single morning, with full transparency on how the AI was applying his rules.
The way I think about it: general AI is a great starting point. Purpose-built legal AI software is what you actually deploy.
Should Your Team Be Using Claude for Contract Review?
For low-stakes, one-off tasks — a quick summary, a plain-English read of a section for a non-lawyer, a first-pass on a standard NDA — Claude is a capable and accessible tool. There are use cases where it makes sense.
But for enterprise contract review at scale — where consistency matters, where your company's negotiation standards need to be enforced, where GDPR exposure and litigation discovery risk are real — using Claude Legal for contract review without a purpose-built layer underneath it is a significant risk that most teams don't fully account for until something goes wrong.
If your team is reviewing a high volume of contracts and you want to see exactly how a purpose-built platform handles accuracy, playbook enforcement, and data compliance without any of those risks, book a demo with Legartis. We'll walk you through how the platform works against your own contracts and map out exactly what onboarding your team would look like.
David A. Bloch is the CEO of Legartis, an enterprise AI contract review platform built for European legal teams.