Agentic Mechanical Engineering Through Blender MCP

A language model can write a Blender script that makes a part-shaped mesh.

That is not mechanical engineering yet.

Mechanical work does not end when the viewport looks convincing. The part has to carry dimensions, tolerances, material assumptions, manufacturing exports, safety boundaries, and a record another person can rerun. If the agent cannot show the requirements, compile the geometry, export the files, and explain what would invalidate the result, it has made an image. It has not made an engineering artifact.

The shortest useful answer is this:

Use the official Blender Lab MCP project as the upstream base.
Fork it into a CAD-specific MCP server instead of forking Blender core first.
Keep Blender as the review surface: scene context, imported preview meshes, screenshots, visual inspection, and operator interaction.
Put mechanical source-of-truth geometry in a CAD runtime outside Blender.
Give the agent typed tools with schemas, validators, and structured evidence records.
Treat free-form Blender Python execution as a dangerous maintenance escape hatch, not the normal design path.

This paper tests that path with two source-controlled parts: an M8 service wrench and a 12 inch pipe flange validation model. Both were generated as parametric B-Rep solids with build123d==0.9.1, exported as STEP, STL, and glTF, checked against their declared requirements, and rendered in local Blender 4.2.4 LTS from the exported STL review geometry. That render step used Blender only as a headless renderer; the official Blender Lab MCP page currently lists Blender 5.1 or newer for MCP operation.[1]

Abstract

Blender Lab now maintains an official Model Context Protocol server for Blender. The project gives LLM clients a supported way to inspect Blender scenes, query documentation, render screenshots, and execute Blender Python through a small MCP server and Blender add-on bridge.[1] That is important because it gives agentic 3D work a real upstream surface instead of a pile of community bridges.

The mechanical engineering problem is different from the animation problem. Blender's mesh API exposes vertices, edges, loops, and polygons.[2] That is the right representation for many visual workflows, but it is the wrong source of truth for manufacturing decisions that depend on exact dimensions, stable features, fastener fits, and neutral CAD exports.

The proposed product is blender-cad-mcp: a GPL-compatible fork of Blender Lab MCP that adds constrained CAD tools. The fork should run a separate Python CAD runtime, generate B-Rep solids, export STEP/STL/glTF files, send review meshes into Blender, and write JSON evidence for every run. A Blender-core fork or standalone distribution may become justified later. It is not the first move.

Keywords

Agentic mechanical engineering; Blender MCP; Model Context Protocol; CAD agents; B-Rep; build123d; STEP; STL; glTF; 3MF; manufacturing validation; rapid R&D.

The Research Question

The decision is not "Can an LLM make a shape in Blender?"

The decision is:

Should Lyon Industries fork Blender Lab MCP into a CAD-specific agent harness before considering a deeper Blender fork?

The answer from this sprint is yes.

The MCP path is enough to prove the important loop: requirements become parametric CAD, CAD becomes exports, exports become Blender review geometry, and the run leaves evidence. Forking Blender core before that loop is repeatable would add maintenance burden before the product boundary is known.

What Changed With Blender Lab MCP

Blender Lab's MCP server matters because it gives the work an official foundation. The repository describes two components: a Blender add-on running inside Blender and an MCP server running as a separate process. The data path is MCP client over stdio to blender-mcp, then a TCP socket to the Blender add-on.[1]

That architecture is the right starting point for CAD agents because it separates concerns:

Layer	What it should own	Why it matters
LLM client	User dialogue, planning, tool calls	The model should request operations, not silently mutate files.
MCP server	Tool contracts, validation, runtime execution	This is where CAD-specific schemas and evidence records belong.
CAD runtime	B-Rep construction, dimensions, exports	Manufacturing geometry needs a CAD representation, rather than a scene mesh alone.
Blender add-on	Preview import, scene context, render capture	Blender is excellent for visual review and communication.

The upstream project also defines the main safety problem. The official Blender page warns that the server executes LLM-generated code in Blender without guards that protect data from deletion or remote exfiltration.[1] That warning is not a footnote. It is the product requirement.

For mechanical engineering, the default should not be "ask the model to run arbitrary Python in Blender." The default should be typed tools with input schemas, output schemas, validation, logs, user confirmations for sensitive actions, and workspace-limited file access. That matches the direction of the MCP tool specification, where tools are model-controlled capabilities with defined schemas and security expectations around input validation, access control, timeouts, logging, and client confirmation for sensitive operations.[3]

Why Blender Is The Review Surface, Not The CAD Kernel

Blender is not the weak link. The weak link is asking Blender's mesh model to carry engineering meaning it was not designed to carry.

Blender meshes are made of vertices, edges, loops, and polygons.[2] That is perfect when the question is how something looks, animates, shades, or appears in a scene. Mechanical engineering asks different questions:

Engineering question	Mesh-first answer	CAD-runtime answer
What is the bore diameter?	Measure or infer from polygons.	Read the parameter or B-Rep face.
Did the bolt circle remain 17.0 inches?	Hope the mesh transform did not drift.	Check the declared parameter and generated geometry.
Can a vendor open this file?	Maybe STL or OBJ.	STEP for CAD exchange, STL/glTF for review, 3MF later for additive metadata.
Why did the fillet fail?	Boolean or bevel result may be hard to explain.	Return the CAD operation, parameters, and kernel error.
Can another agent rerun it?	Scene state is a moving target.	Requirements, script, exports, and JSON report are source-controlled.

The practical rule:

Blender should show the engineer what happened. The CAD runtime should decide what exists.

That division gives Blender a stronger role, not a smaller one. It becomes the operator cockpit for inspection, context, screenshots, collision review, and communication with non-CAD stakeholders. It stops being the hidden source of mechanical truth.

What The Frontier Literature Suggests

Recent text-to-CAD work points toward constrained generation with programmatic feedback, not general-purpose shape improvisation.

Text2CAD-Bench evaluates text-to-parametric-CAD generation across 600 human-curated examples and reports the expected pattern: models handle simpler geometry better than complex topology and advanced features.[4] CADSmith frames the problem as a multi-agent loop with execution correction and programmatic geometric validation. Its reported validation features include bounding boxes, volume, and solid validity.[5]

The useful lesson is conservative. General agents are not ready to replace CAD engineers across arbitrary assemblies. A useful first product should give agents a smaller surface:

structured requirements
template-aware parametric generation
B-Rep compilation
geometry measurements
explicit pass/fail checks
export records
Blender review
human approval before manufacturing

That is enough to matter in rapid R&D. It is also narrow enough to test.

The Product Hypothesis

blender-cad-mcp should be a CAD-specific fork of Blender Lab MCP with three normal operating modes:

Mode	User job	Agent job	Hard boundary
Requirements capture	Turn intent into dimensions, materials, standards, and constraints.	Ask missing questions and write `requirements.json`.	No geometry generation until units and envelope are known.
CAD compile	Turn requirements into a parametric B-Rep model.	Generate or edit CAD code, compile, inspect metrics, repair failures.	No direct Blender mesh editing as source of truth.
Review and export	Show the result and package evidence.	Push preview geometry to Blender, render images, export files, write a report.	No manufacturing claim without human review and governing standards.
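The first hard boundary in that table can be enforced mechanically before any CAD code runs. A minimal sketch, assuming a hypothetical requirements layout with `units` and `envelope` keys; the field names and the accepted unit strings are illustrative and are not taken from the sprint's requirements file:

```python
"""Gate geometry generation on complete requirements (illustrative sketch)."""

ALLOWED_UNITS = {"mm", "inch"}  # assumption: unit labels this harness accepts


def missing_requirements(req: dict) -> list[str]:
    """Return the problems that must be resolved before CAD compile."""
    problems = []

    units = req.get("units")  # hypothetical key
    if not isinstance(units, str) or units not in ALLOWED_UNITS:
        problems.append("units must be one of: " + ", ".join(sorted(ALLOWED_UNITS)))

    envelope = req.get("envelope")  # hypothetical key: [x, y, z] size limits
    is_triplet = isinstance(envelope, (list, tuple)) and len(envelope) == 3
    if not is_triplet or not all(
        isinstance(v, (int, float)) and not isinstance(v, bool) and v > 0
        for v in envelope
    ):
        problems.append("envelope must be three positive numbers")

    return problems


def ready_to_compile(req: dict) -> bool:
    """True only when units and envelope are both known."""
    return not missing_requirements(req)


if __name__ == "__main__":
    print(missing_requirements({"units": "mm"}))
```

Returning the list of problems, instead of a bare boolean, gives the agent something concrete to ask the user about.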

The fork should preserve the upstream MCP/add-on shape, then add tools like these:

{
  "name": "compile_cad_model",
  "inputSchema": {
    "type": "object",
    "required": ["model_id", "requirements_path"],
    "properties": {
      "model_id": { "type": "string" },
      "requirements_path": { "type": "string" },
      "export_formats": {
        "type": "array",
        "items": { "enum": ["step", "stl", "gltf"] }
      }
    }
  },
  "outputSchema": {
    "type": "object",
    "required": ["valid", "metrics", "exports", "checks"],
    "properties": {
      "valid": { "type": "boolean" },
      "metrics": { "type": "object" },
      "exports": { "type": "object" },
      "checks": { "type": "array" }
    }
  }
}

The schema is not paperwork. It is how the system stops a persuasive model from replacing evidence with prose.
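One way to hold a tool result to that contract is a small structural check that runs before the result goes back to the model. The sketch below hand-rolls the check for the `outputSchema` shown above using only the standard library. A real server would more likely use a full JSON Schema validator, and the function name is hypothetical:

```python
import json

# Required keys and expected Python types, mirroring the outputSchema above.
OUTPUT_CONTRACT = {
    "valid": bool,
    "metrics": dict,
    "exports": dict,
    "checks": list,
}


def contract_errors(result: dict) -> list[str]:
    """List every way a tool result falls short of the declared contract."""
    errors = []
    for key, expected in OUTPUT_CONTRACT.items():
        if key not in result:
            errors.append(f"missing required key: {key}")
        elif not isinstance(result[key], expected):
            errors.append(f"{key} should be {expected.__name__}")
    return errors


if __name__ == "__main__":
    # A persuasive summary with no evidence fails on every required key.
    prose_only = {"summary": "The flange looks correct."}
    print(json.dumps(contract_errors(prose_only), indent=2))
```

A result that carries only prose is rejected on structure alone, before anyone has to judge whether the prose is convincing.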

Proposed Architecture

The first build should stay small.

LLM client
  |
  | MCP stdio
  v
blender-cad-mcp server
  |-- requirements writer
  |-- build123d CAD runtime
  |-- validators
  |-- export writer
  |-- evidence logger
  |
  | TCP socket, file path, or import command
  v
Blender Lab add-on fork
  |-- import preview mesh
  |-- place in scene
  |-- render evidence
  |-- return screenshot path

The CAD runtime should live outside Blender's bundled Python. That keeps dependency management, CAD package versions, and sandbox policy separate from the artist workstation. Blender receives tessellated review geometry and sends back visual state. STEP, STL, and glTF exports are generated from the CAD runtime. The build123d documentation supports this split: it provides exports for STEP, STL, and glTF, with STEP serving neutral CAD exchange and glTF serving visualization.[6]
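Keeping the CAD runtime in its own interpreter can be as simple as a subprocess call with a timeout. This is a sketch under stated assumptions: the function name is hypothetical, and the CAD script is assumed to print exactly one JSON object on stdout. It does not show how the sprint's scripts were invoked:

```python
import json
import subprocess
from pathlib import Path


def run_cad_job(python_exe: str, script: Path, requirements: Path,
                timeout_s: float = 120.0) -> dict:
    """Run a CAD script in a separate interpreter and parse its JSON report.

    python_exe is the interpreter of the CAD environment, not Blender's
    bundled Python. Assumption: the script prints one JSON object on stdout.
    """
    completed = subprocess.run(
        [python_exe, str(script), str(requirements)],
        capture_output=True,
        text=True,
        timeout=timeout_s,  # raises subprocess.TimeoutExpired when exceeded
    )
    if completed.returncode != 0:
        # Surface the runtime's error text so the agent can attempt a repair.
        return {"valid": False, "error": completed.stderr.strip()}
    return json.loads(completed.stdout)
```

Because the interpreter path is a parameter, the CAD environment can pin its own package versions without touching the Blender installation.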

3MF should be added in a later additive-manufacturing path, not claimed in this sprint. The 3MF specification is the relevant standards surface when material, units, color, and print metadata need a richer additive package than STL.[7]

Evidence Run: Two Parts

The evidence run used a structured input file with two requested models:

m8_wrench: a flat service wrench for an M8 hex fastener, using 13.2 mm across-flats clearance.
twelve_inch_pipe_flange: a 12 inch nominal pipe-flange validation geometry with 12 bolt holes and a raised face.

The script generated the solids with build123d==0.9.1, computed mass and bounding boxes, exported STEP/STL/glTF, wrote a JSON report, and used Blender background mode to render PNG evidence from the STL review geometry.[8]
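An evidence record is easier to trust when it pins each export to a content hash. The sketch below shows one possible shape for such a record. The field names are hypothetical and do not reproduce the layout of the sprint's JSON report:

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large exports do not load into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_evidence(model_id: str, exports: list[Path], checks: list[dict]) -> dict:
    """Assemble a JSON-serializable record for one run (hypothetical layout)."""
    return {
        "model_id": model_id,
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "exports": [
            {"file": p.name, "bytes": p.stat().st_size, "sha256": sha256_of(p)}
            for p in exports
        ],
        "checks": checks,
        # An empty check list counts as "not passed", never as a silent pass.
        "all_passed": bool(checks) and all(c.get("passed") is True for c in checks),
    }
```

With hashes in the record, a later reader can confirm that the STEP file on disk is the one the checks were run against.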

This is not a claim that either part is production-ready. It is a claim that the minimum evidence loop works.

M8 Service Wrench

The M8 wrench fixture tests small-part generation: a 150 mm envelope, 6 mm thickness, open-end clearance, box-end hex clearance, and repeated handle holes. The generated solid passed validity, positive-volume, envelope, thickness, and service-clearance checks.

The record:

Result	Value
B-Rep validity	Pass
Bounding box	149.313 x 30.0 x 6.0 mm
Volume	13.152 cm³
Steel mass estimate at 7.85 g/cm³	103.242 g
Exports	STEP, STL, glTF
Declared checks	6/6 passed

What this proves: the harness can generate a small mechanical hand-tool geometry with repeatable dimensions, export it, and render it.

What it does not prove: torque rating, heat treatment, ergonomics, manufacturing tolerances, edge finishing, or safety as a real wrench.
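Two of the declared checks can be illustrated against the recorded bounding box. This is a sketch, not the sprint's validator: the check functions, the thickness tolerance, and the shortcut of reading thickness from the smallest bounding-box extent (reasonable only for a flat part) are all assumptions:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    name: str
    passed: bool
    detail: str


def check_envelope(bbox_mm, max_length_mm: float) -> CheckResult:
    """Pass when the longest bounding-box side fits inside the envelope."""
    longest = max(bbox_mm)
    return CheckResult(
        "envelope",
        longest <= max_length_mm,
        f"longest side {longest} mm, limit {max_length_mm} mm",
    )


def check_thickness(bbox_mm, nominal_mm: float, tol_mm: float = 0.01) -> CheckResult:
    """Pass when the smallest extent matches the nominal plate thickness.

    Assumption: for a flat part the smallest bounding-box side is the thickness.
    """
    thinnest = min(bbox_mm)
    return CheckResult(
        "thickness",
        abs(thinnest - nominal_mm) <= tol_mm,
        f"measured {thinnest} mm, nominal {nominal_mm} mm, tolerance {tol_mm} mm",
    )


# Bounding box from the record above; 150 mm envelope and 6 mm thickness
# from the fixture description.
bbox = (149.313, 30.0, 6.0)
results = [check_envelope(bbox, 150.0), check_thickness(bbox, 6.0)]
for result in results:
    print(result.name, "PASS" if result.passed else "FAIL", "-", result.detail)
```

Each result carries its own measured value and limit, so a failed check explains itself without a rerun.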

12 Inch Pipe Flange

The pipe-flange fixture tests larger radial geometry: inch-to-mm conversion, circular arrays, a bore, bolt circle, bolt holes, and a raised face. The model uses public Class-150-style dimensional assumptions. It is not an ASME-certified procurement drawing and should not be treated as one.[9]
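The unit conversion and the circular array are both short enough to write out. The sketch below assumes the 17.0 inch bolt circle cited earlier in this paper applies to this fixture and places the first hole at angle zero, which is an illustrative choice and not a statement about the generated model:

```python
import math

MM_PER_INCH = 25.4  # exact by definition of the international inch


def inch_to_mm(value_in: float) -> float:
    return value_in * MM_PER_INCH


def bolt_hole_centers(bolt_circle_diameter_mm: float, count: int):
    """Return (x, y) centers of equally spaced holes on a bolt circle.

    Assumption: the first hole sits on the +X axis (angle 0).
    """
    radius = bolt_circle_diameter_mm / 2.0
    step = 2.0 * math.pi / count
    return [
        (radius * math.cos(i * step), radius * math.sin(i * step))
        for i in range(count)
    ]


centers = bolt_hole_centers(inch_to_mm(17.0), 12)
for x, y in centers[:3]:
    print(round(x, 3), round(y, 3))
```

The same conversion ties the inch-based inputs to the millimetre record: 19.0 in × 25.4 = 482.6 mm, which matches the recorded bounding-box width.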

The record:

Result	Value
B-Rep validity	Pass
Bounding box	482.6 x 482.6 x 25.463 mm
Volume	2282.445 cm³
Steel mass estimate at 7.85 g/cm³	17917.197 g
Exports	STEP, STL, glTF
Declared checks	6/6 passed

What this proves: the harness can generate a larger circular part with a repeated bolt pattern, preserve declared dimensions, export review and manufacturing files, and produce an inspection render.

What it does not prove: pressure rating, gasket compatibility, flange facing, material grade, bolting class, tolerance stack, corrosion allowance, or compliance with the governing standard.
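The mass estimates in both records follow from volume × density, and a reader can confirm that the reported figures agree with each other. The sketch below recomputes mass from the rounded volume and allows for the rounding in the displayed values; the tolerance rule is an assumption about how the figures were rounded, not something the report states:

```python
STEEL_DENSITY_G_PER_CM3 = 7.85


def mass_g(volume_cm3: float, density: float = STEEL_DENSITY_G_PER_CM3) -> float:
    return volume_cm3 * density


def is_consistent(volume_cm3: float, reported_mass_g: float,
                  density: float = STEEL_DENSITY_G_PER_CM3) -> bool:
    """Check reported mass against reported volume, allowing for rounding.

    Assumption: volume and mass were each rounded to three decimals, so the
    volume may be off by 0.0005 cm3 and the mass by 0.0005 g.
    """
    tolerance = 0.0005 * density + 0.0005
    return abs(mass_g(volume_cm3, density) - reported_mass_g) <= tolerance


# (volume in cm3, reported mass in g) from the two records above.
records = {
    "m8_wrench": (13.152, 103.242),
    "twelve_inch_pipe_flange": (2282.445, 17917.197),
}
for model_id, (volume, reported) in records.items():
    print(model_id, round(mass_g(volume), 3), is_consistent(volume, reported))
```

Recomputing from the rounded volumes gives about 103.243 g and 17917.193 g. Both differ from the reported values by less than the rounding allowance, which is consistent with the masses having been computed from unrounded volumes.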

What The Evidence Proves

The useful result is not that the models are impressive. They are intentionally ordinary. That is the point.

Rapid R&D is full of ordinary parts: brackets, plates, adapters, spacers, jigs, flanges, mounts, covers, strain-relief blocks, tool holders, and fixtures. A CAD agent earns trust by producing boring geometry with a clean record.

This run proves five things:

Claim	Evidence
Requirements can be made machine-readable.	The run starts from a structured JSON requirement file.
CAD can compile outside Blender.	The B-Rep solids were generated in a separate Python runtime.
Blender can remain in the loop.	Blender rendered the exported STL review geometry headlessly.
The output can be inspected later.	JSON and Markdown evidence reports record validity, dimensions, exports, and renders.
The product can start as an MCP fork.	No Blender-core changes were required for the first proof.

The run does not prove autonomous mechanical engineering. The agent did not take responsibility for standard selection, load cases, material certification, drawing control, tolerance design, manufacturing release, or safety sign-off. Those remain human and process responsibilities.

That boundary is a feature. A serious tool should know where it stops.

Safety Model

The official Blender MCP warning is direct enough to shape the fork. A model-controlled tool that can execute arbitrary Blender Python can delete files, alter scenes, or send data away if the surrounding environment allows it.[1]

A CAD-specific fork should have two tool classes:

Tool class	Default state	Example
Typed CAD tools	Enabled	compile model, inspect metrics, export STEP, render evidence
Free-form code tools	Disabled or approval-gated	execute arbitrary Blender Python
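The two tool classes reduce to a small dispatch rule. A minimal sketch, assuming a hypothetical in-memory registry; the tool names come from this paper, while the registry and the approval flag are illustrative:

```python
from enum import Enum


class ToolClass(Enum):
    TYPED_CAD = "typed_cad"
    FREE_FORM = "free_form"


# Hypothetical registry mapping tool names to their class.
TOOL_CLASSES = {
    "compile_cad_model": ToolClass.TYPED_CAD,
    "inspect_cad_model": ToolClass.TYPED_CAD,
    "render_evidence": ToolClass.TYPED_CAD,
    "execute_blender_code": ToolClass.FREE_FORM,
}


def is_call_allowed(tool_name: str, approved_by_user: bool = False) -> bool:
    """Typed CAD tools run by default; free-form code needs explicit approval."""
    tool_class = TOOL_CLASSES.get(tool_name)
    if tool_class is None:
        return False  # unknown tools are denied, never guessed at
    if tool_class is ToolClass.TYPED_CAD:
        return True
    return approved_by_user


if __name__ == "__main__":
    for name in ("compile_cad_model", "execute_blender_code"):
        print(name, is_call_allowed(name))
```

Denying unknown tool names by default keeps a newly added tool from running ungated until someone has classified it.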

The normal CAD path should include:

workspace-limited file access
no network access during CAD compile by default
allowlisted imports for CAD scripts
model-specific output folders
schema validation before tool execution
structured tool results
explicit user confirmation before destructive file operations
per-run logs with timestamps, tool inputs, outputs, and generated artifacts
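Workspace-limited file access, the first item in that list, can be sketched with the standard library. The function name is hypothetical, and `Path.is_relative_to` requires Python 3.9 or newer. This covers path traversal only; it is not a sandbox:

```python
from pathlib import Path


def resolve_in_workspace(workspace: Path, requested: str) -> Path:
    """Resolve a requested path and refuse anything outside the workspace.

    Resolving first collapses ".." segments and follows symlinks, so the
    containment check runs against the real target location.
    """
    root = workspace.resolve()
    candidate = (root / requested).resolve()
    if not candidate.is_relative_to(root):  # Python 3.9+
        raise PermissionError(f"path escapes workspace: {requested}")
    return candidate
```

An absolute path in `requested` replaces the workspace root when the two are joined, so it fails the containment check in the same way a `../` traversal does.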

This does not make the system safe enough for regulated manufacturing by itself. It makes the first R&D tool safer than a chat window with an unrestricted execute_blender_code path.

Why Not Fork Blender Core Yet

Forking Blender core may become a product path. It is not the next step.

A core fork would create a new release, dependency, packaging, governance, and support burden before the CAD-agent workflow has proven repeatability. The MCP/add-on path can answer the first expensive questions faster:

Do users accept Blender as the review cockpit for CAD-agent work?
Can typed tools cover 80 percent of early R&D parts?
Can agents repair CAD failures from structured error feedback?
Can the evidence record satisfy a serious engineer?
Which operations keep forcing users back into traditional CAD?

A core fork becomes rational when the MCP fork hits one of these limits:

Limit	Why it may justify a deeper fork
Native B-Rep interaction is required in the viewport.	Mesh previews hide too much feature and topology information.
The CAD runtime needs tighter UI affordances.	The add-on starts feeling like a bolted-on console instead of a tool.
Packaging becomes the product.	Industrial users may need a standalone, locked-down distribution.
Security boundaries cannot be enforced cleanly outside Blender.	The product may need stronger runtime isolation and permissions.
Scene-to-CAD round trips become central.	Users may need persistent CAD features attached to Blender objects.

Until then, the MCP fork is the sharper bet.

Repository Blueprint

The first public fork should look like this:

blender-cad-mcp/
  mcp/
    tools/
      capture_requirements.py
      compile_cad_model.py
      inspect_cad_model.py
      export_manufacturing_bundle.py
      push_review_mesh.py
      render_evidence.py
    schemas/
      requirements.schema.json
      cad_report.schema.json
      export_manifest.schema.json
  cad_runtime/
    requirements.txt
    templates/
      wrench.py
      flange.py
      bracket.py
      enclosure.py
    validators/
      dimensions.py
      fasteners.py
      wall_thickness.py
      export_status.py
  blender_addon/
    __init__.py
    preview_import.py
    render_capture.py
  examples/
    m8_wrench/
    twelve_inch_pipe_flange/
  tests/
    test_compile_examples.py
    test_schema_outputs.py

The examples should not be decorative demos. Each one should include:

requirements JSON
CAD source
STEP, STL, and glTF outputs
rendered PNG
evidence JSON
human-readable report
known limitations

That is how the work becomes repeatable instead of merely plausible.

Evaluation Tasks

The next benchmark should be small but unforgiving:

Task family	Example	Pass condition
Service tools	M8 wrench, spanner, simple socket adapter	Correct envelope, fastener clearance, export bundle.
Industrial fixtures	12 inch flange, bolt plate, gasket spacer	Correct bolt pattern, bore, thickness, unit handling.
R&D brackets	motor mount, sensor bracket, extrusion adapter	Hole spacing, minimum wall, mating envelope.
Additive parts	fan duct, cable guide, enclosure lid	STL export, minimum feature checks, orientation note.
Review workflows	push model to Blender and render evidence	Scene import, stable naming, screenshot record.

The benchmark should track failures as carefully as passes. A useful CAD agent is not one that never fails. It is one that fails in ways the engineer can inspect and repair.

Risks And Invalidators

This thesis is wrong if any of the following happens:

Agents cannot reliably repair CAD compile failures from structured feedback.
build123d or the selected CAD runtime is too fragile for common R&D parts.
Blender preview geometry hides errors that matter before manufacturing.
Users reject Blender as the review environment and prefer FreeCAD, Onshape, Fusion, or native CAD integrations.
The safety model becomes so restrictive that the agent loses useful autonomy.
GPL obligations or product packaging constraints make the fork unsuitable for the intended commercial surface.

The most important invalidator is practical: if the evidence record does not make a skeptical engineer more confident, the product is not working.

The Next Build

The next sprint should produce a real fork, not another article.

Fork projects.blender.org/lab/blender_mcp into blender-cad-mcp.
Keep the upstream MCP/add-on bridge intact and document GPL obligations.
Add typed CAD tools for requirements capture, compile, inspect, export, push preview mesh, and render evidence.
Move execute_blender_code behind an explicit dangerous-tool gate.
Port the M8 wrench and 12 inch flange into example folders.
Add at least 20 more mechanical fixtures with repeatable pass/fail records.
Publish a small benchmark table that includes failures.

The commercial product is not "AI in Blender." The commercial product is a repeatable mechanical R&D harness: a way to turn an ambiguous part request into a source-controlled CAD artifact, a visual review, and a record good enough for the next engineer to continue.

That is the standard for agentic mechanical engineering.

Footnotes

1. Blender Foundation, "MCP Server," https://www.blender.org/lab/mcp-server/; Blender Lab, blender_mcp repository, https://projects.blender.org/lab/blender_mcp. Accessed 2026-06-13. Workspace source card records the repository architecture, GPL-3.0-or-later license, and official security warning.
2. Blender Python API, bpy.types.Mesh, https://docs.blender.org/api/current/bpy.types.Mesh.html. Accessed 2026-06-13.
3. Model Context Protocol, "Tools - Specification 2025-06-18," https://modelcontextprotocol.io/specification/2025-06-18/server/tools. Accessed 2026-06-13.
4. Wang, Liang, et al., "Text2CAD-Bench: A Benchmark for LLM-based Text-to-Parametric CAD Generation," arXiv:2605.18430, 2026.
5. Barkley, Jesse, Rumi Loghmani, and Amir Barati Farimani, "CADSmith: Multi-Agent CAD Generation with Programmatic Geometric Validation," arXiv:2603.26512, 2026.
6. build123d documentation, "Import/Export," https://build123d.readthedocs.io/en/latest/import_export.html. Accessed 2026-06-13.
7. 3MF Consortium, "Specification," https://3mf.io/spec/. Accessed 2026-06-13.
8. Local reproducibility records for this white paper: cad_requirements.json, generate_cad_evidence.py, render_models_blender.py, cad_evidence.json, generated STEP/STL/glTF files, and rendered PNGs in the post workspace. Generated and checked on 2026-06-13.
9. The 12 inch flange fixture uses public Class-150-style dimensional assumptions recorded in the workspace source card. It is validation geometry only. A procurement-ready flange needs the governing standard, pressure class, material specification, facing, gasket, bolting, tolerances, and engineering review.