TwinLadder Weekly

Issue #29 | March 2026


Editor's Note

Last month I sat in a conference room in Rotterdam, watching a European logistics company's procurement team evaluate AI vendor responses for a new freight optimisation platform. Four vendors, four glossy proposals, four promises of "full EU AI Act compliance." I was there as an advisor. Within twenty minutes, I knew we had a problem.

The procurement team had a solid IT evaluation framework — uptime guarantees, data residency, API documentation, SLA terms. They knew how to buy software. What they did not know was how to buy AI. Not one of the four vendor proposals specified the training data used in their models. Not one disclosed a measured error rate for the company's specific use case. Not one explained, in operational terms, how a warehouse manager should verify the system's routing recommendations before acting on them. And the procurement team had not asked for any of this — because they did not know they should.

When I raised these gaps, the head of procurement said something that has stayed with me: "We assumed the vendor handles the AI part. We just need it to work." That sentence captures the single most dangerous assumption in European AI procurement today. Because under the EU AI Act, the vendor does not handle the AI part. The deployer does. And what you buy either makes that obligation manageable or makes it impossible.


The Procurement Blind Spot: When Buying AI Means Buying Compliance Risk

Why Article 4 Changes Everything About How You Evaluate AI Vendors

Alex Blumentals, with legal analysis by Liga Paulina and technical analysis by Edgars Rozentals

Here is a number that should concern every procurement director in Europe: only 14% of organisations have formal AI vendor evaluation frameworks that go beyond standard IT procurement criteria. [cite:gartner-ai-vendor] Meanwhile, 65% of organisations now regularly use generative AI — nearly double the share from ten months earlier. [cite:mckinsey-ai-procurement] The gap between AI adoption speed and procurement sophistication is not a nuisance. It is a compliance liability.

Most companies still treat AI purchasing like any other software procurement. Feature comparison, price negotiation, security review, contract signature. The vendor says "AI-powered." The buyer says "sounds good." Nobody asks what that means in practice — what the model can and cannot do, where it fails, how staff should verify its outputs, and whether the documentation supports the deployer's legal obligations.

Article 4 changes this equation entirely. [cite:eu-ai-act-article4] The regulation places AI literacy obligations on the deployer — the company that buys and uses the AI system, not the company that built it. This means every AI tool you purchase either helps you meet your Article 4 obligations or creates a gap you must fill yourself. And most procurement teams have no way to tell which.

What Vendors Do Not Tell You

Let me be specific about what I have observed in vendor proposals and sales processes across Europe over the past six months. I have reviewed or advised on AI procurement for eleven organisations in seven countries. The patterns are consistent enough to generalise.

"AI-powered" is not a feature specification. It is a marketing term. When a vendor says their contract review platform is "AI-powered," that tells you approximately as much as a car manufacturer saying their vehicle is "engine-powered." Every vendor proposal I reviewed used the phrase. Not one defined it with technical precision. Does "AI-powered" mean a large language model? A fine-tuned classifier? A rule-based system with a machine learning layer? The answer matters enormously for understanding reliability, limitations, and compliance requirements — and it is almost never provided.

Edgars Rozentals, who evaluates AI systems for Baltic and Nordic organisations, has a rule: "If the vendor cannot tell you the architecture of the AI component in two sentences, they either do not understand their own product or they do not want you to understand it. Both are disqualifying."

Training data provenance is rarely disclosed. Of the eleven procurement processes I observed, only two vendors provided any information about what data their models were trained on. Neither provided sufficient detail to assess whether the training data was representative of the deployer's specific domain. The Stanford Foundation Model Transparency Index found an average transparency score of just 37 out of 100 across major providers, with training data documentation among the weakest areas. [cite:stanford-hai-transparency]

Error rates are almost never published. I asked every vendor I encountered a simple question: "What is the measured error rate of your AI system for this client's specific use case?" Not one could answer. Some offered generic accuracy benchmarks from controlled test environments. None had domain-specific error data for the industries they were selling into. One sales representative told me, memorably, "Our system is highly accurate." When I asked for the number, he said, "We don't publish that."

"Compliance" is marketing, not a legal guarantee. Seven of the eleven vendor proposals I reviewed claimed some form of "EU AI Act compliance." Liga Paulina reviewed three of these claims in detail and found that none constituted a contractual warranty. "Every compliance claim I examined was either a statement of intent — 'we are committed to compliance' — or a reference to features that partially address one article of the regulation while ignoring others," she says. "No vendor I reviewed offered a contractual guarantee that their system, as deployed, would satisfy the deployer's specific obligations under Articles 4, 13, 14, or 26."

This last point is critical. The vendor's compliance is not your compliance. Even if a vendor's AI system is perfectly compliant from the provider's perspective, that does not discharge the deployer's separate obligations. And the deployer's obligations — particularly the Article 4 literacy requirement and the Article 26 human oversight requirement — depend entirely on how the system is used, by whom, and with what understanding. [cite:eu-ai-act-article26]

The Deployer's Dilemma

Liga Paulina frames the legal architecture plainly: "The EU AI Act creates a shared responsibility model. The provider must build a transparent, documented system. But the deployer must ensure that the people using that system understand it well enough to use it appropriately. [cite:eu-ai-act-articles13-14] These are separate obligations, and they cannot be satisfied by the same actions. A vendor's product documentation does not automatically become the deployer's AI literacy programme."

She identifies three specific deployer obligations that procurement teams routinely overlook:

| Obligation | AI Act Article | What It Requires of the Deployer | What Procurement Teams Usually Ask |
| --- | --- | --- | --- |
| AI literacy | Article 4 | Staff must have sufficient understanding of AI system capabilities and limitations | "Does the vendor provide training?" (checkbox) |
| Human oversight | Article 26 | Staff assigned to oversight must have necessary competence, training, and authority | "Does the system have a human-in-the-loop?" (feature check) |
| Transparency to affected persons | Articles 13–14 | Deployer must be able to explain AI-assisted decisions to affected individuals | "Is the system explainable?" (yes/no) |

The right-hand column represents how most procurement teams currently address these requirements. The third column represents what the regulation actually demands. The gap between them is where compliance risk lives.

The Five Questions Every AI Vendor RFP Should Ask

Based on what I have observed — the procurement processes that went well, the ones that produced regret, and the regulatory guidance emerging across Europe — here are five questions that should appear in every AI vendor evaluation. None of them are standard in current procurement frameworks. All of them are necessary.

1. What are the documented limitations of your AI system in our specific use case?

Not generic limitations. Not "AI can sometimes make errors." Specific, documented, use-case-relevant limitations. If you are buying a contract review tool, what types of clauses does it consistently miss? If you are buying an HR screening system, what candidate profiles generate unreliable results? If the vendor cannot answer this question with specifics, they have not tested their system in your domain — or they have tested it and do not want you to see the results.

Edgars Rozentals suggests a practical test: "Ask the vendor to provide three examples of cases where their system produced incorrect or misleading results in a comparable deployment. If they say there are none, they are either not measuring or not being honest. Both are problems."

2. What training does your platform provide to help our staff understand AI output reliability?

Article 4 requires the deployer to ensure sufficient AI literacy. [cite:eu-ai-act-article4] If the vendor provides training, that training needs to go beyond product tutorials. It needs to teach your staff how to assess whether the AI output is reliable in their specific context. Ask for the training curriculum. Ask how it addresses limitations and failure modes. Ask whether it differentiates by user role and expertise level.

If the vendor's training is a 30-minute onboarding video followed by a quiz about interface features, that is product training, not AI literacy training. You will need to budget for the gap.

3. What is your measured error rate for our domain, and how was it established?

This question is uncomfortable for vendors. Ask it anyway. A vendor that cannot provide domain-specific performance data is asking you to deploy their system based on faith. The measurement methodology matters as much as the number — was it tested on a representative sample of your data? By whom? Under what conditions?

4. How does your system support human oversight as required by the EU AI Act?

Not "does your system have a human-in-the-loop option?" That is a feature question. The compliance question is: how does the system's design enable the specific oversight capabilities Article 26 requires? [cite:eu-ai-act-article26] Can the human overseer understand the AI's reasoning? Can they override the system? Can they identify when the system is operating outside its reliable parameters? How does the interface design support — not just permit — meaningful human oversight?

5. What documentation do you provide that we can use as Article 4 compliance evidence?

Ask this explicitly. The deployer needs to demonstrate compliance. That means documentation — about the system's capabilities and limitations, about the training provided to users, about human oversight mechanisms, about how the organisation ensures ongoing AI literacy. If the vendor's documentation is designed for their own compliance rather than yours, you have a gap that your legal and compliance teams will need to fill.
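One way to make these questions operational is to carry them into your RFP template as structured requirements rather than free-text prompts, pairing each question with the evidence you expect and the answers that should count as red flags. The sketch below is a minimal illustration in Python; the field names and red-flag phrasings are my own framing of the five questions above, not a standard or an official template.

```python
# Illustrative only: the five RFP questions above as a structured
# checklist. Field names and red-flag answers are my own framing.
RFP_AI_QUESTIONS = [
    {
        "question": "What are the documented limitations of your AI system "
                    "in our specific use case?",
        "evidence": "Use-case-specific limitation disclosures, including "
                    "examples of incorrect results in comparable deployments",
        "red_flag": "Generic statements such as 'AI can sometimes make errors'",
    },
    {
        "question": "What training does your platform provide to help our "
                    "staff understand AI output reliability?",
        "evidence": "A curriculum covering limitations and failure modes, "
                    "differentiated by user role and expertise level",
        "red_flag": "A 30-minute onboarding video followed by an interface quiz",
    },
    {
        "question": "What is your measured error rate for our domain, and "
                    "how was it established?",
        "evidence": "Domain-specific performance data plus the measurement "
                    "methodology: sample, testers, conditions",
        "red_flag": "Controlled-environment benchmarks only, or no number at all",
    },
    {
        "question": "How does your system support human oversight as "
                    "required by the EU AI Act?",
        "evidence": "Interface design that lets overseers understand, "
                    "override, and bound the system (Article 26)",
        "red_flag": "A human-in-the-loop checkbox with no design rationale",
    },
    {
        "question": "What documentation do you provide that we can use as "
                    "Article 4 compliance evidence?",
        "evidence": "Deployer-facing documentation of capabilities, "
                    "limitations, user training, and oversight mechanisms",
        "red_flag": "Provider-focused documentation, or 'we are committed "
                    "to compliance'",
    },
]
```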

The Vendor Competence Framework

These five questions are a starting point. But individual questions are not a framework. What European organisations need — and what almost none currently have — is a structured methodology for evaluating whether an AI vendor supports or undermines the deployer's Article 4 obligations.

Over the past three months, working with procurement teams, legal advisors, and technical evaluators across seven organisations, I have developed a scoring framework that applies TwinLadder's AI competence methodology to the vendor evaluation context. It assesses vendors across five dimensions:

| Dimension | What It Evaluates | Compliance-Supporting | Compliance-Neutral | Compliance-Undermining |
| --- | --- | --- | --- | --- |
| Transparency | How clearly the vendor documents system capabilities, limitations, and architecture | Detailed technical docs, domain-specific limitation disclosures, architecture explanation | Generic product documentation, high-level accuracy claims | "Proprietary" response to technical questions, no limitation disclosure |
| Training support | Whether vendor training builds genuine AI literacy or just product proficiency | Role-specific training, failure case studies, verification exercises | Product tutorials, feature walkthroughs, generic "AI awareness" | No training, or training that positions AI output as reliable by default |
| Documentation quality | Whether vendor docs serve the deployer's compliance needs | Deployer-facing compliance documentation, Article 4 mapping, oversight guides | Provider-focused documentation only, no deployer compliance guidance | No documentation, or documentation with disclaimers that shift all responsibility |
| Error transparency | Whether the vendor measures and discloses performance data | Published domain-specific error rates, ongoing monitoring data shared with deployer | Generic accuracy benchmarks, controlled-environment metrics only | No error data, or claims of near-perfect accuracy without supporting evidence |
| Human oversight design | Whether the system's UX supports meaningful human control | Interface designed for verification, uncertainty indicators, override mechanisms | Basic approval/rejection workflow, no uncertainty signals | Automated defaults, friction to override, confidence displays without calibration |

Each dimension can be scored on a 1–5 scale. A vendor that scores 3 or above across all five dimensions actively supports your Article 4 posture. A vendor that scores below 3 on any dimension creates a compliance gap you must fill with your own resources — training, documentation, oversight protocols — at your own cost. A vendor that scores 1 on transparency or error transparency is, in my assessment, not suitable for deployment in an Article 4 environment without substantial internal mitigation.
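The thresholds above are simple enough to express directly, which helps when several reviewers score the same vendor. Here is a minimal sketch in Python; the dimension names come from the framework table, but the function and type names are mine, not part of any existing TwinLadder tooling.

```python
from dataclasses import dataclass

# Dimensions from the vendor competence framework above.
DIMENSIONS = (
    "transparency",
    "training_support",
    "documentation_quality",
    "error_transparency",
    "human_oversight_design",
)

# A score of 1 on either of these is disqualifying on its own.
DISQUALIFYING = {"transparency", "error_transparency"}

@dataclass
class VendorScore:
    """One reviewer's 1-5 scores across the five dimensions."""
    transparency: int
    training_support: int
    documentation_quality: int
    error_transparency: int
    human_oversight_design: int

def classify(score: VendorScore) -> str:
    """Apply the framework's thresholds to a completed scorecard."""
    values = {d: getattr(score, d) for d in DIMENSIONS}
    # A 1 on transparency or error transparency: not suitable for an
    # Article 4 environment without substantial internal mitigation.
    if any(values[d] == 1 for d in DISQUALIFYING):
        return "unsuitable without substantial internal mitigation"
    # Anything below 3 is a gap the deployer must fill at its own cost.
    gaps = [d for d, v in values.items() if v < 3]
    if gaps:
        return "compliance gap to fill internally: " + ", ".join(gaps)
    # 3 or above across all five dimensions supports the deployer.
    return "actively supports your Article 4 posture"

# Example: strong on transparency, weak on training support.
print(classify(VendorScore(4, 2, 3, 4, 3)))
# -> compliance gap to fill internally: training_support
```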

This is not abstract. I have watched procurement teams use a version of this framework in live evaluations. It changes the conversation. When you tell a vendor, "We scored your training support at 2 out of 5 because your onboarding programme does not address system limitations," you get a very different response than when you ask, "Do you provide training?" The specificity creates accountability.

What Europe Is Already Doing

The smartest procurement frameworks I have seen are emerging from public sector organisations — not because governments are more sophisticated buyers, but because public procurement rules force transparency that the private sector can avoid.

The Dutch government maintains a public algorithm register that requires government agencies to document AI systems, including their purpose, data sources, human oversight mechanisms, and known limitations. [cite:dutch-algo-register] More importantly, Dutch public procurement frameworks now require vendors to complete transparency documentation before contract award. The Netherlands was early on this. Other jurisdictions are following.

Germany updated its federal procurement guidelines in 2025 to include AI-specific evaluation criteria. [cite:german-procurement-ai] Vendors bidding on public contracts must provide technical documentation covering training data provenance, known error rates, and human oversight design. The Bitkom industry association published complementary guidance helping procurement teams evaluate these submissions — recognising that asking for the documentation is only half the battle. You also need to know how to read it.

The European Commission published procurement guidance recommending that public buyers evaluate AI systems on transparency, explainability, and human oversight capabilities — not solely on functionality and price. [cite:ec-ai-procurement] The guidance is advisory, not binding. But it signals where the regulatory expectation is heading.

In the Nordics, I have observed private-sector organisations building their own vendor assessment frameworks. A Swedish financial services company developed a 40-point AI vendor evaluation checklist that includes questions about model retraining schedules, data retention policies, and deployer notification requirements when the model changes. A Finnish healthcare organisation now requires vendors to conduct a "failure mode workshop" as part of the procurement process — a structured session where the vendor walks the buyer through scenarios where their AI system is known to perform poorly.

The Baltic picture is less mature. In Latvia, Lithuania, and Estonia, most organisations I work with are still using standard IT procurement frameworks for AI purchases. The regulatory infrastructure is developing — the Latvian CDPC has signalled interest in AI procurement guidance — but the practical tools are not yet in place. This is a gap that will close, but it has not closed yet.

The Market That Does Not Yet Exist

Here is where this analysis becomes uncomfortable, because it points to a capability gap that no existing service adequately fills.

Procurement teams do not have AI literacy. Not in the Article 4 sense. They understand purchasing. They understand vendor management. They do not understand, in most cases, how to evaluate whether a vendor's AI system will support or undermine their organisation's compliance posture. They cannot assess training data quality. They cannot evaluate whether an error rate is acceptable. They cannot determine whether a vendor's human oversight design is genuine or cosmetic.

This is not a criticism. It is a structural problem. Article 4 created an obligation that cuts across every function that uses AI — including the function that buys it. The procurement team is not exempt from the literacy requirement. And yet procurement-specific AI competence is almost entirely absent from the training market, from consulting offerings, and from regulatory guidance.

Edgars Rozentals sees this from the technical evaluation side: "I regularly receive calls from procurement teams asking me to evaluate an AI vendor's technical claims. They know enough to sense that the vendor's pitch does not add up. They do not know enough to articulate why. That gap — between intuition that something is wrong and the technical knowledge to identify what — is where organisations get hurt."

The emerging role is something like an "AI procurement assessor" — someone who can sit between the procurement team, the legal team, and the technical evaluation to ask the questions that none of those functions, individually, knows to ask. Some organisations are building this capability internally. Most are not. The consultancies that serve procurement functions have been slow to develop AI-specific evaluation competence. The AI vendors themselves have an obvious conflict of interest.

This is, candidly, where we see TwinLadder's assessment methodology extending naturally. Our competence framework — the same six pillars and four maturity levels we use for internal AI literacy assessment — applies directly to vendor evaluation. If you can assess your own organisation's AI competence, you can assess whether a vendor supports or undermines it. The same diagnostic questions, the same evidence requirements, the same scoring methodology. Applied outward instead of inward.

We are building this. I mention it not as a pitch but as a disclosure — because the editorial observation and the product development come from the same analysis, and you should know that.


The Competence Question

A European insurance company buys an "AI-compliant" claims assessment tool in early 2026. The vendor's proposal emphasises speed — 70% faster claims processing — and includes a section titled "EU AI Act Compliance" that describes the provider's internal governance framework. The procurement team evaluates the tool on processing speed, integration requirements, and cost. They check the compliance section, confirm it exists, and move forward.

Eight months later, a policyholder disputes a claim denial. The claim was flagged by the AI system as potentially fraudulent and routed to a human reviewer, who affirmed the denial. The policyholder's lawyer asks a straightforward question: "On what basis did the AI system flag this claim?" The human reviewer cannot answer — the system provided a fraud probability score but no explanation of the underlying factors. The training the vendor provided covered how to read the dashboard and process claims efficiently. It did not cover how to interrogate the system's reasoning or when to override it.

The policyholder files a complaint with the national data protection authority. The authority asks the company to demonstrate that the staff involved had "sufficient AI literacy" to understand the system's capabilities and limitations — the Article 4 standard. [cite:eu-ai-act-article4] The company produces the vendor's training certificates. The authority asks a follow-up question: "Did the training address the specific limitations of the AI system in claims fraud detection? Can the reviewing staff explain how the model generates its risk scores?" The company cannot demonstrate this. The vendor's training was product training, not competence training. And the procurement team never asked for the difference.

The Article 4 obligation falls on the deployer — the insurance company — not the vendor. The compliance gap was built into the purchase.


What To Do

  1. Add Article 4 requirements to your standard AI vendor RFP template. Include the five questions above — limitations disclosure, training adequacy, error rates, human oversight design, and compliance documentation. Do not accept "we are committed to compliance" as an answer. Require specifics. If your RFP template does not have an AI-specific section, you are evaluating AI tools with a framework designed for traditional software. That is like using a vehicle safety checklist to evaluate an aircraft.

  2. Require vendor transparency documentation before purchase, not after. The time to discover that a vendor cannot explain their model's limitations is during evaluation, not after deployment. Request the vendor's technical documentation — architecture description, training data summary, known limitations, performance benchmarks — as a prerequisite for advancing to commercial negotiations. If the vendor declines, that is your answer.

  3. Test vendor claims with domain-specific scenarios. Do not accept generic demonstrations. Provide the vendor with test cases from your actual operations — representative documents, realistic edge cases, known-difficult scenarios. Ask them to run their system against your data and share the results, including errors. A vendor that refuses domain-specific testing is asking you to buy a product they have not validated for your use case.

  4. Budget for the competence gap your vendor does not close. If the vendor's training covers product features but not AI literacy in the Article 4 sense, budget for supplementary training. If the vendor's documentation does not map to your deployer obligations, budget for creating that mapping internally. The cost of filling these gaps should be part of your total cost of ownership calculation — not a surprise discovered six months after deployment.
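To make the budgeting point concrete, here is a back-of-envelope sketch — all figures are invented for illustration, and the point is the shape of the calculation, not the numbers. A vendor with a cheaper licence can still be the more expensive purchase once the gap costs are counted.

```python
# Back-of-envelope total cost of ownership over a contract term,
# folding in the one-off cost of competence gaps the vendor does not
# close. All figures are invented for illustration.
def total_cost_of_ownership(
    annual_licence: float,
    supplementary_training: float,    # Article 4 literacy beyond product training
    compliance_documentation: float,  # mapping vendor docs to deployer obligations
    oversight_protocols: float,       # verification workflows, override procedures
    years: int = 3,
) -> float:
    gap_costs = supplementary_training + compliance_documentation + oversight_protocols
    return annual_licence * years + gap_costs

# Vendor A: dearer licence, strong compliance support (small gaps).
# Vendor B: cheaper licence, weak compliance support (large gaps).
vendor_a = total_cost_of_ownership(60_000, 0, 5_000, 10_000)
vendor_b = total_cost_of_ownership(45_000, 40_000, 25_000, 30_000)
print(vendor_a, vendor_b)  # 195000 230000
```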


Quick Reads

  • European Commission, EU Guide for Public Procurement of AI — the closest thing to an official procurement playbook for AI purchases. Written for the public sector but applicable to any organisation evaluating AI vendors against European regulatory requirements.

  • Stanford HAI, Foundation Model Transparency Index — average transparency score: 37 out of 100. If you want to understand how little even the largest AI providers disclose, start here. Then ask yourself what your mid-market vendor is telling you.

  • Dutch Government Algorithm Register — the most mature public AI transparency initiative in Europe. Worth studying not for the technology but for the documentation requirements it imposes on AI system operators — requirements that mirror what your procurement team should be asking vendors.

  • McKinsey, The State of AI in 2025 — 65% adoption, 18% procurement adaptation. Those two numbers in a single sentence tell you everything about the gap this issue addresses.

  • EU AI Act, Article 26 — Deployer Obligations — read this before your next AI vendor evaluation. The obligations it places on deployers — competence, training, authority, understanding of limitations — apply regardless of what the vendor promises.


One Question

If your AI vendor cannot explain how their system makes decisions in language your procurement team understands, who bears the compliance burden when those decisions are challenged — and does your procurement contract actually allocate that responsibility, or does it just assume the vendor handles it?


TwinLadder Weekly | Issue #29 | March 2026

Helping professionals build AI capability through honest education.