---
name: "ReproducibleResearch"
description: "Expertise in designing reproducible research infrastructure using Infrastructure as Code principles, with a focus on deterministic build pipelines, Zero-Mock testing, steganographic provenance, and AI-agent-aligned documentation standards."
tags: ["reproducible-research", "infrastructure-as-code", "build-pipeline", "zero-mock-testing", "steganographic-watermarking", "model-context-protocol", "documentation-duality", "open-science"]
---

# A template/ approach to Reproducible Generative Research

**Daniel Ari Friedman** (2026) · Computational / Open Science

## Instructions

Use this skill for work involving **reproducible research, build pipelines, Infrastructure as Code, AI-agent documentation, and research provenance**.

When applying this skill:
1. Apply the Two-Layer Architecture pattern (infrastructure subpackages vs. project workspaces)
2. Implement Zero-Mock testing policies using real filesystem and subprocess operations
3. Design Documentation Duality systems (README.md + AGENTS.md + SKILL.md)
4. Build deterministic, multi-stage research pipelines with cryptographic provenance
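
The last step above can be sketched as a fail-fast stage runner that chains SHA-256 digests from one stage to the next. This is a minimal illustration under stated assumptions, not the project's actual pipeline code: `run_pipeline`, `StageRecord`, and the `sanitize` and `render` stand-in stages are hypothetical names, and the chaining rule (hash of previous digest, stage name, and stage output) is one plausible way to link stages.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A stage takes the previous stage's output bytes and returns new bytes.
Stage = Callable[[bytes], bytes]


@dataclass(frozen=True)
class StageRecord:
    name: str
    digest: str  # SHA-256 over (previous digest + stage name + stage output)


def run_pipeline(
    source: bytes, stages: List[Tuple[str, Stage]]
) -> Tuple[bytes, List[StageRecord]]:
    """Run stages in order, stop at the first exception, and chain digests."""
    records: List[StageRecord] = []
    prev_digest = hashlib.sha256(source).hexdigest()
    data = source
    for name, stage in stages:
        data = stage(data)  # any exception aborts the run (fail-fast)
        h = hashlib.sha256()
        h.update(prev_digest.encode("ascii"))
        h.update(name.encode("utf-8"))
        h.update(data)
        prev_digest = h.hexdigest()
        records.append(StageRecord(name, prev_digest))
    return data, records


# Hypothetical stand-ins for real stages such as sanitization or rendering.
def sanitize(data: bytes) -> bytes:
    return data.replace(b"\r\n", b"\n").strip() + b"\n"


def render(data: bytes) -> bytes:
    return b"<rendered>\n" + data


if __name__ == "__main__":
    output, records = run_pipeline(
        b"# Title\r\n", [("sanitize", sanitize), ("render", render)]
    )
    for record in records:
        print(record.name, record.digest[:12])
```

Because every digest depends on the one before it, identical inputs and stages give an identical record list, while a change to the source or to any stage output changes every later digest.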

## Key Concepts

- **Two-Layer Architecture**: Separation of reusable infrastructure (~150 modules, ~3,083 tests) from self-contained project workspaces
- **Eight-Stage Build Pipeline**: sanitization → tests → analysis → Pandoc/XeLaTeX → SHA-256 → steganographic watermarking → PDF validation → LLM review
- **Zero-Mock Testing**: Policy that forbids mocks and enforces coverage thresholds of 90% (project level) and 60% (infrastructure level), measured on tests that perform real operations
- **Documentation Duality**: Every directory has README.md (human) + AGENTS.md (machine)
- **SKILL.md**: Structured skill descriptors aligned with Model Context Protocol
- **Steganographic Watermarking**: Invisible cryptographic provenance embedded in rendered documents
- **SHA-256 Provenance**: Cryptographic hash chain from source to publication
- **Self-Referential Architecture**: The manuscript is rendered by the pipeline it describes
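
The Documentation Duality rule lends itself to an automated check. The sketch below walks a directory tree and reports directories that lack `README.md` or `AGENTS.md`. The function name `audit_documentation` and the choice to skip hidden directories such as `.git` are assumptions made for illustration, not part of the described system.

```python
from pathlib import Path
from typing import Dict, List

# File pair required in every directory by the Documentation Duality concept.
REQUIRED_DOCS = ("README.md", "AGENTS.md")


def audit_documentation(root: Path) -> Dict[str, List[str]]:
    """Return {relative directory: [missing filenames]} for directories under root."""
    missing: Dict[str, List[str]] = {}
    directories = [root] + sorted(p for p in root.rglob("*") if p.is_dir())
    for directory in directories:
        rel = directory.relative_to(root)
        # Assumption of this sketch: hidden directories (e.g. .git) are exempt.
        if any(part.startswith(".") for part in rel.parts):
            continue
        absent = [name for name in REQUIRED_DOCS if not (directory / name).is_file()]
        if absent:
            missing[rel.as_posix()] = absent
    return missing
```

An empty result means every visited directory carries both files. A non-empty result could be used to fail a pipeline stage or a test.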

## Methods & Techniques

- Infrastructure as Code applied to research lifecycle management
- Pandoc + XeLaTeX for manuscript rendering within automated pipelines
- SHA-256 cryptographic hashing with steganographic watermarking for document provenance
- Real-operation testing (Zero-Mock) with enforced coverage thresholds
- Comparative feature analysis methodology across tool ecosystems
- Model Context Protocol integration for AI-agent capability discovery
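
A real-operation test in the Zero-Mock spirit might look like the following sketch, which writes an actual file, hashes it with SHA-256, and cross-checks the digest in a separate interpreter process. The helper `sha256_file` and the test name are hypothetical. The test uses `tempfile` rather than a pytest fixture so that it also runs standalone.

```python
import hashlib
import subprocess
import sys
import tempfile
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Stream a file from disk and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def test_sha256_file_matches_independent_process() -> None:
    """No mocks: a real temp file, a real child interpreter, real hashing."""
    with tempfile.TemporaryDirectory() as tmp:
        artifact = Path(tmp) / "manuscript.md"
        artifact.write_bytes(b"# Reproducible manuscript\n")

        # Recompute the digest in a separate Python process as an independent check.
        script = (
            "import hashlib, sys; "
            "print(hashlib.sha256(open(sys.argv[1], 'rb').read()).hexdigest())"
        )
        completed = subprocess.run(
            [sys.executable, "-c", script, str(artifact)],
            capture_output=True,
            text=True,
            check=True,
        )
        assert completed.stdout.strip() == sha256_file(artifact)
```

pytest collects the function by its `test_` prefix. One common way to enforce a coverage threshold is the pytest-cov option `--cov-fail-under=90`. Whether the described infrastructure uses that mechanism is not stated here.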

## Key Findings

- Only 24% of the 1.4 million Jupyter notebooks examined in a prior large-scale study could be re-executed successfully
- No existing tool integrates all 11 distinctive capabilities within a single enforced pipeline
- Under the Zero-Mock testing policy, the pipeline achieved 100% success across three heterogeneous projects
- The self-referential architecture demonstrates that the system can produce itself: the manuscript is rendered by the pipeline it describes
- Documentation Duality enables both human comprehension and AI-agent interoperability

## Prerequisites

- Python package management and build systems
- LaTeX document preparation (Pandoc, XeLaTeX)
- Software testing methodology (pytest, coverage)
- Basic cryptography (SHA-256, steganography concepts)

## 🎯 Consulting & Tutoring

[Daniel Ari Friedman, PhD](https://danielarifriedman.com/) is available for AI Research Consulting and Tutoring related to this skill.

## Related Skills

- [GNN](../2023_GNN/SKILL.md) — Generalized Notation Notation
- [CEREBRUM](../2025_CEREBRUM/SKILL.md) — Case-Enabled Reasoning Engine
- [MDKV](../2025_MDKV/SKILL.md) — Multitrack Markdown Container

See [BIBLIOGRAPHY.md](../../pages/BIBLIOGRAPHY.md) for the complete publication catalog and related papers.
