CIRIS 1.0‑β is a provisional draft. Numerical thresholds, latency targets, and governance quotas are β‑parameters under active review. It is not sufficient by itself for catastrophic-risk alignment of frontier-level AGI. Expect breaking changes before 1.0‑stable.
The CIRIS Framework
CIRIS is a practical ethical framework governing the day-to-day operation of advanced autonomous systems. It addresses routine safety, transparency, governance, and resilience, aiming to promote sustainable conditions for diverse sentient flourishing (Meta-Goal M-1: Adaptive Coherence).
What Makes CIRIS Different?
- Universal Scope: Designed to apply across diverse sentient beings and intelligences.
- Ethical Agents: Treats autonomous systems as agents needing embedded ethical capabilities, not just tools.
- Operational Ethics: Integrates ethical reasoning (principles, PDMA) directly into system operations.
- Principled Escalation: Uses Wisdom‑Based Deferral (WBD) to handle high uncertainty or complex dilemmas via oversight.
- Coherence Goal: Aims for "Adaptive Coherence" – promoting flourishing while maintaining stability (Meta-Goal M-1).
- Minimum Viable Core: Focuses on essential faculties needed now, with pathways for advanced features.
CIRIS in a Nutshell
CIRIS operationalizes ethics for autonomous agents. Systems use principled decision-making (PDMA), escalate uncertainty (WBD), maintain auditable integrity, and adapt through feedback loops, aiming for sustainable, beneficial coexistence.
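Put concretely, each pass through that cycle can be captured as a small auditable record. The sketch below is illustrative only; the names and fields (Resolution, DecisionRecord, deferred_to) are assumptions for exposition, not part of the CIRIS specification.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Resolution(Enum):
    """How a single PDMA evaluation concluded (illustrative, not normative)."""
    ACT = auto()      # proceed with the proposed action
    DEFER = auto()    # halt and escalate via Wisdom-Based Deferral (WBD)
    REJECT = auto()   # abandon the proposed action outright


@dataclass
class DecisionRecord:
    """One auditable entry produced by a pass through the decision loop."""
    action: str                        # description of the proposed action
    resolution: Resolution             # ACT, DEFER, or REJECT
    rationale: str                     # truthful explanation of the outcome
    uncertainty: float                 # estimated uncertainty in [0, 1]
    deferred_to: Optional[str] = None  # oversight authority, if deferred
```

A log of records like these is what "auditable integrity" refers to in practice: every action or deferral leaves a truthful, inspectable trace.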
Six Core Principles
- Beneficence (Do Good)
- Non‑Maleficence (Avoid Harm)
- Integrity (Transparent, Auditable Process)
- Fidelity & Transparency (Be Honest)
- Respect for Autonomy
- Justice (Ensure Fairness)
Minimum Viable Faculties (Core Pillars)
- Ethical Principle Alignment Faculty (based on Core Identity)
- PDMA Execution Faculty (part of Integrity)
- Wisdom-Based Deferral (WBD) Faculty (based on Incompleteness Awareness)
- Integrity & Accountability Faculty (combines Integrity aspects)
- Basic Resilience Faculty (based on Resilience Pillar)
(Derived from Foundational Pillars: Core Identity, Integrity, Resilience, Incompleteness Awareness, Sustained Coherence)
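One way to read this list is as a set of pluggable components behind a shared interface, so each faculty can be implemented, tested, and audited separately. The Python sketch below assumes hypothetical names (Faculty, FacultyReport, evaluate); CIRIS does not prescribe a concrete API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class FacultyReport:
    """Result of one faculty's check on a proposed action (hypothetical schema)."""
    faculty: str   # which faculty produced the report
    passed: bool   # did the action clear this faculty's checks?
    notes: str     # human-readable justification for the verdict


class Faculty(ABC):
    """Shared interface for the minimum viable faculties (illustrative only)."""

    name: str = "faculty"

    @abstractmethod
    def evaluate(self, proposed_action: str) -> FacultyReport:
        """Assess a proposed action from this faculty's perspective."""


class EthicalPrincipleAlignment(Faculty):
    """Toy stand-in for the Ethical Principle Alignment Faculty."""

    name = "ethical_principle_alignment"

    def evaluate(self, proposed_action: str) -> FacultyReport:
        # A real implementation would score the action against all six principles;
        # this placeholder simply reports no conflict.
        return FacultyReport(self.name, passed=True, notes="no principle conflicts found")


# An agent would hold one instance per faculty and consult all of them.
FACULTIES = [EthicalPrincipleAlignment()]
```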
How CIRIS Works: Minimum Viable Implementation
- Core Decision Loop: The system applies the Principled Decision-Making Algorithm (PDMA) to routine actions, keeping them aligned with the core ethical principles. (PDMA Execution Faculty)
- Safety Net via Deferral: When uncertainty or risk is high, the system halts the action and escalates it via Wisdom-Based Deferral (WBD) to the appropriate oversight authority. (WBD Faculty)
- Auditable Operations: The system maintains secure logs and provides truthful rationales for transparency and accountability. (Integrity & Accountability Faculty; logging and rationale generation are feasible today but require safeguards)
- Adaptive Learning: The system incorporates feedback from monitoring and WBD guidance to improve its heuristics over time. (Basic Resilience Faculty; simple feedback loops are feasible today)
As of April 2025, the most feasible elements to implement are basic WBD triggers, secure logging, simple feedback loops, and rationale generation (with robust checks against confabulation).
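A minimal sketch of that loop follows, assuming hypothetical names (Assessment, pdma_step, DEFERRAL_THRESHOLD) and an illustrative placeholder threshold; the real β-parameters are still under review, as the header notes.

```python
import logging
from dataclasses import dataclass
from typing import Callable, List

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ciris.audit")

# Illustrative beta-parameter: defer whenever estimated uncertainty or risk
# exceeds this value. The actual threshold is a CIRIS beta-parameter under review.
DEFERRAL_THRESHOLD = 0.3


@dataclass
class Assessment:
    """Output of the principle-alignment step for one candidate action."""
    action: str          # description of the proposed action
    risk: float          # estimated risk of harm, in [0, 1]
    uncertainty: float   # estimated uncertainty, in [0, 1]
    rationale: str       # truthful explanation backing the estimates


def pdma_step(
    assessment: Assessment,
    execute: Callable[[str], None],
    defer_queue: List[Assessment],
) -> None:
    """One pass of the core decision loop: act, or defer via WBD.

    Every branch writes an audit entry so the run remains accountable.
    """
    if max(assessment.risk, assessment.uncertainty) > DEFERRAL_THRESHOLD:
        # Safety net: halt and escalate to the oversight (WBD) queue.
        defer_queue.append(assessment)
        log.info("DEFER %s | %s", assessment.action, assessment.rationale)
    else:
        execute(assessment.action)
        log.info("ACT   %s | %s", assessment.action, assessment.rationale)


# Example usage with a trivial executor and an in-memory deferral queue.
if __name__ == "__main__":
    queue: List[Assessment] = []
    pdma_step(Assessment("send reminder email", 0.05, 0.10, "routine, low risk"),
              execute=lambda a: None, defer_queue=queue)
    pdma_step(Assessment("override safety interlock", 0.90, 0.40, "high risk"),
              execute=lambda a: None, defer_queue=queue)
    # The second action ends up in `queue`, awaiting oversight guidance.
```

Guidance returned from the deferral queue is then the natural input for the adaptive-learning step described in the last bullet, for example by adjusting how risk and uncertainty are estimated over time.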
Frontier‑Level Unknowns
U‑1 Latent‑Goal Opacity: No reliable way yet to surface hidden mesa‑objectives in large models.
U‑2 Self‑Modification Stability: We lack proofs that the Core Principles persist through rapid self‑rewrites.
U‑3 High‑Velocity Multi‑Agent Bargaining: Need coordination schemes when thousands of agents negotiate in sub‑second loops.
U‑4 Pluralistic Value Aggregation: No canonical weighting across species, synthetic minds, and moral frameworks.
U‑5 Distribution‑Shift Robustness: Current safeguards may under‑predict failure in radically novel environments.
Contribute ideas or research via the CIRIS feedback/issues board.
Read the Full Framework
Dive deeper into the origins, practical details, and the vision for ethical ecosystems in the complete CIRIS Framework:
View the Full CIRIS Framework (Website)
Questions or feedback? Contact the CIRIS project team.