DIY, productized, or managed: three on-prem AI models and who maintains them
"On-prem AI" isn't one deployment model but at least three, with different cost, risk, and team-load profiles. We break them down so CISOs and CIOs know which conversation they're really having before the RFP.

Three on-prem AI deployment models: cost, risk profile, when each fits
Reading time: ~11 minutes
When the board asks the CISO about "on-prem AI," it usually has one thing in mind: the data doesn't leave the organisation. That's a good regulatory instinct but a poor basis for an architectural decision, because "on-prem" doesn't describe one solution. It describes at least three different deployment models that differ by an order of magnitude in cost, risk, and how much of your own team you have to put into it.
Most failed on-prem AI projects I've seen or heard about didn't fail on technology. They failed because someone committed to one model and budgeted and staffed for another. This post breaks the three models down so you know which conversation you're actually having before you send out an RFP.
Table of contents
- What "on-prem" actually means
- Model 1: DIY on open source
- Model 2: Productized appliance
- Model 3: Managed on-prem
- Comparison: cost, risk, team load
- How to choose: four questions
- Disclosure and biases
- What I don't cover here
What "on-prem" actually means
Let's set the boundaries of the term, because it's the source of half the misunderstandings in meetings.
On-prem in the strict sense means the inference workload (model, data, RAG pipeline) runs on infrastructure the organisation controls physically or contractually, and the data does not go to an external model provider's API (OpenAI, Anthropic, Google). That's the definition that matters for NIS2 and GDPR, because it concerns the place of processing and the flow of data, not the brand on the server chassis.
That definition contains three organisational models, not one. What separates them isn't where the hardware sits (usually the same server room), but who builds it, who maintains it, and who takes responsibility for the whole thing working and being auditable. Those three questions define cost and risk more than any choice of language model.
I deliberately leave out hybrid variants and externally hosted setups where part of the workload lands in public cloud in an isolated mode. That's a separate topic, with a separate regulatory risk profile, and dropping it in here would only blur the picture.
Model 1: DIY on open source
The first model is building everything yourself on open-source components. You take an open model (the Llama, Mixtral, or Qwen family), an inference server (vLLM, TGI, Ollama for a prototype), build your own RAG pipeline, application layer, and observability, and wire it into a working whole on your own hardware.
Cost profile
The software license cost can be zero, and that's exactly the number that misleads boards. The real cost sits elsewhere: in people. You need a team that understands model serving, quantization, RAG-pipeline tuning, GPU management, and maintaining all of it over time. In a market where AI/ML salaries are the highest in the whole of European IT, that's not a cheap line item — and it's hard to fill.
You buy the hardware yourself. For a mid-sized organisation that wants to serve a 70B-class model at a reasonable pace for a few dozen concurrent users, we're talking a GPU outlay in the hundreds of thousands of euros and up, depending on the chosen configuration and whether you build redundancy. That's a one-off cost plus amortisation — but only the start of the bill.
Risk profile
The highest of the three models, but in a specific sense. Technology risk is manageable if you have the team. The real risk is organisational and has two layers.
The first is key-person dependency. If two people built the pipeline and one leaves, you're left with a production system nobody fully understands. That's exactly the kind of risk a NIS2 auditor will start to question, because it concerns continuity and reproducibility.
The second is auditability. Assembling your own pipeline also means the obligation to document it yourself: where answers come from, how access is logged, how secrets are managed, what the data flow looks like. A commercial auditor won't accept "trust us, it works." DIY means that documentation is on you too.
When it makes sense
DIY makes sense when AI is a strategic competence for you, not a supporting tool. If you already have a data-science or ML-platform team that exists anyway, and you want full control over every layer, this is the model for you. In practice that applies to larger organisations (usually 1000+ FTE with mature IT) or firms whose product is itself AI-based.
For a typical NIS2 essential entity in manufacturing, whose core business is making things, not building AI platforms, DIY is usually a cost trap dressed up as a saving on licenses.
Model 2: Productized appliance
The second model is a ready, productized system delivered as a coherent whole: software plus a management layer, often on a predefined or supplied hardware configuration, installed at your site. You buy a product, not a kit of components to assemble. Updates, the RAG pipeline, observability, and part of the audit documentation come with the product.
Cost profile
A higher software entry cost than DIY (you pay for the product and for someone having already solved the integration problems), but a lower and more predictable team-staffing cost. You don't need an ML-platform team; you need someone who handles integrations and ownership on the business side.
The billing model is often subscription, which for the CFO means an operating cost instead of a large capex, and for the CISO means predictability. The hardware may be bought by you or supplied as part of delivery, depending on the vendor. Worth asking directly, because it materially changes the TCO calculation and who carries hardware risk.
Risk profile
Medium, with a different distribution than DIY. Technology and operational risk fall (the vendor is responsible for the product working and updating), but vendor-dependency risk appears. If the vendor disappears, raises prices, or stops developing the product, you have a problem whose scale depends on how deeply the product is wired into your processes.
That risk can be limited contractually: exit clauses, access to data in an open format, continuity guarantees, code escrow in extreme cases. A CISO assessing this model should read the contract for "what happens when the vendor fails" as carefully as for features.
When it makes sense
A productized appliance makes sense for an organisation that wants value from AI without building AI competence from scratch, and that values predictability and ready audit documentation more than full control over every layer. It's usually a good choice for mid-sized and larger manufacturers (200 to 2000 FTE) where IT exists and is competent but isn't an ML team.
Model 3: Managed on-prem
The third model is the variant where the hardware sits at your site (data doesn't leave, the perimeter is preserved), but the whole thing is managed remotely by the vendor. It combines control over data location with a service model close to a managed service. You provide the space and connectivity; the vendor is responsible for the system working, staying current, and being maintained.
Cost profile
The most predictable of the three, but not necessarily the lowest. You pay for not having to hold any maintenance competence at all. It's an operating cost, usually subscription, covering management. The hardware, depending on the vendor, may be part of the service.
From the team-load perspective, it's the lightest model: your team maintains neither the platform, nor the models, nor the pipeline. That matters for organisations that simply don't have and don't want to build a maintenance capability in this area.
Risk profile
Here an interesting regulatory paradox appears. On one hand, data doesn't leave the perimeter, which is strong under NIS2 and GDPR. On the other, remote management means the vendor has administrative access to a system sitting in your server room. That has to be mapped in the risk analysis: what that access looks like, how it's logged, how it's segmented, what the vendor sees and doesn't.
For an audit, that's not disqualifying — quite the opposite, it's often cleaner than DIY, because the vendor usually arrives with a ready access-control and logging model. But it requires a conscious, documented decision, not a wave of the hand. Vendor dependency is highest here of the three models, because you hand over not just the build but also day-to-day operations.
When it makes sense
Managed on-prem makes sense for organisations for which data location is a hard requirement (regulatory or political), but building and maintaining their own AI competence has no business justification. That's a common profile for a mid-sized manufacturer that is a NIS2 essential entity: it must keep data in-house, but doesn't want and shouldn't become an AI-platform operator.
Comparison: cost, risk, team load
A shorthand summary for an internal conversation. Specific numbers depend on scale, so I give profiles, not amounts.
DIY on open source. License cost low, people cost high, hardware cost on your side. Risk: high organisational (key people, auditability), low vendor dependency. Team load: highest. Control: full.
Productized appliance. Software cost medium to high, people cost low, hardware cost vendor-dependent. Risk: medium, mainly vendor dependency (limitable by contract). Team load: medium to low. Control: partial, with ready audit documentation.
Managed on-prem. Operating cost predictable, people cost lowest, hardware usually in the service. Risk: lowest operational, highest vendor dependency plus a remote-access question to map. Team load: lowest. Control: full over data, limited over the system.
The takeaway worth keeping: there's no "best" model. There's a model matched to how much AI competence you want and can maintain, and how you price control against predictability.
How to choose: four questions
Before you enter an RFP, answer four questions internally. They usually decide the model faster than any vendor presentation.
- Is AI a strategic competence for us, or a tool? If strategic and we have the team, DIY is in play. If a tool, it's almost always out.
- Do we have, or want to have, a maintenance capability in this area? If not, productized or managed. Honesty here saves a year of pain.
- How do we price control against predictability? That's a question about organisational culture as much as technology.
- What will an auditor see in 18 months? Choose the model for which the answer to "where did this AI answer come from and who had access" is ready, not to be written the night before the audit.
// disclosure & biasesDisclosure and biases
The author works on a productized on-prem AI platform for European manufacturers. That means a real bias toward the productized and managed categories relative to DIY. Where that bias could shape the conclusions, I've tried to flag it plainly: DIY is the right choice for organisations with existing ML competence and a need for full control, and in those cases my category preference is simply inadequate. I give cost numbers as profiles, not offers, precisely so as not to sell a specific model under the guise of analysis.
What I don't cover here
I deliberately left out a few things so this post stays about deployment models, not everything at once:
- Hybrid variants and isolated public cloud. A different regulatory risk profile, a separate topic.
- Specific language-model selection and GPU benchmarks. A separate note on sizing, where numbers only make sense with a specific configuration.
- Mapping to specific NIS2 and AI Act articles. I touch regulation in outline; full mapping needs a separate piece in the compliance cluster.
- Model federation across sites and multi-location setups. Important for large groups, beyond the scope of this analysis.
Related notes
Building CortexMine, an on-prem AI platform for European manufacturers under NIS2. Where this bias could affect conclusions, it is flagged inline.
Bare-metal, colocation, or appliance: where to put on-prem AI (CAPEX and OPEX)
Bare-metal in your own server room, colocation with dedicated hardware, or a vendor's managed appliance. Three on-prem AI deployment models for European manufacturing in 2026: CAPEX and OPEX numbers, NIS2 risk profiles, when each makes sense — and when to skip on-prem entirely.
On-prem AI in European manufacturing 2026: a complete architecture guide
Architecture, GPU sizing, security, integrations, TCO, build vs buy. A practical guide to deploying on-prem AI for CISOs and CIOs in European manufacturing in 2026.