Document Control
Use Case ID: `UC-002`
Use Case Title: `Intermittent DNS/TLS Incident Diagnosis Across Client Networks`
Version: `v1.0`
Status: `Approved`
Owner: `NYEX AI Program Office`
Last Updated: 30 March 2026`
NYEX AI for DNS/TLS Incident Diagnosis (Health Certificate and Sanitary Permit System)
1. Context
The client’s Health Certificate and Sanitary Permit system experienced intermittent production login failures on select client networks. Errors showed certificate-name mismatch behaviour, while internal teams could not reproduce consistently.2. Goal
Use NYEX AI to isolate whether the failure domain is cloud-side misconfiguration or network-local path issues (resolver/proxy/SSL inspection), and produce a low-risk, operator-ready mitigation plan.3. Scope
In Scope- DNS and TLS path diagnostics for intermittent login failures
- Cross-layer evidence correlation (browser, DNS, TLS, endpoint config, cloud metadata)
- Mitigation planning and field validation checklist
- Immediate high-risk production cutover
- Full platform re-architecture
- Non-network unrelated application feature defects
4. Stakeholders
- Business Owner: Client Service Owner (Health/Sanitary Permit Program)
- Technical Owner: Infra/Platform Engineering Lead
- Security/Compliance: Security Operations and Governance
- QA/Operations: Application Support and NOC/Service Desk
5. Preconditions
- Access to incident samples from affected and unaffected client networks
- Current endpoint/domain mapping and certificate inventory
- Cloud service metadata and edge/network configuration references
- Browser and resolver diagnostic outputs from field teams
6. NYEX AI Diagnostic Workflow (Applied)
| Step | Diagnostic Stage | Inputs to NYEX AI | NYEX AI Actions | Outputs | Approval Gate |- Define Diagnostic Scope | Incident statement, impacted login flow, environment boundaries | Frames failure domain and target systems | Scoped diagnostic brief | Incident manager sign-off |
- Collect Baseline Inputs | Architecture path, API hostnames, cert bindings, change timeline | Builds baseline map of expected DNS/TLS behavior | Baseline configuration matrix | Infra lead validation |
- Gather Evidence | Browser errors, DNS lookups, TLS checks, endpoint config, cloud metadata | Normalizes and correlates evidence across layers | Evidence correlation table | Support + infra review |
- Frame Prompt Pack | Symptoms, timeline, constraints, expected structured outputs | Enforces hypothesis + confidence + validation format | Structured diagnostic prompt set | Technical owner approval |
- Run Layered Analysis | All collected evidence | Performs L1-L4 analysis: flow, performance path, data-path integrity, security controls | Ranked hypothesis list | Cross-team triage checkpoint |
- Generate Root-Cause Hypotheses | Layered analysis results | Distinguishes cloud misconfig vs client-path causes with confidence scoring | Root-cause narrowing report | Incident commander acceptance |
- Design Verification Tests | Candidate causes and affected network samples | Produces deterministic validation checklist for field operators | Operator-ready validation checklist | Support operations approval |
- Prioritize Remediation Plan | Verified findings and risk tolerance | Creates risk-rated actions (immediate, short-term, hardening) | Mitigation plan with sequencing | Change control review |
- Validate Fixes | Post-action logs/metrics from affected networks | Confirms symptom resolution and no regressions | Validation summary and closure recommendation | Service owner approval |
- Produce Diagnostic Report | Full timeline and findings | Generates executive + technical report and action log | Incident report + action plan | Management readout |
- Operationalize Learnings | Final findings and recurring patterns | Converts outcomes into runbooks/alerts/playbooks | Updated SOP and monitoring controls | Operations governance sign-off |
7. How NYEX Supported This Incident
- Correlated browser errors, DNS lookups, TLS certificate checks, app endpoint configuration, and cloud metadata in one investigation stream.
- Narrowed probable cause domain between cloud configuration and network-local traversal factors (resolver/proxy/SSL inspection).
- Produced a risk-rated mitigation path without forcing immediate production changes.
- Generated an operator-ready validation checklist for affected client networks.
8. Expected Outcome
- Faster isolation of root-cause domain (`infra` vs `client network path`)
- Reduced escalation loops across app, infra, and support teams
- Clear mitigation path, including canonical API hostname strategy
9. Business Case
Problem Today- High MTTR for intermittent, non-reproducible incidents
- High engineering effort in manual cross-team troubleshooting
- User trust impact from inconsistent “works for some, fails for others” behavior
- MTTR reduction via structured, repeatable diagnostics
- Operational safety through investigation-first, low-risk planning
- Higher decision confidence with evidence-based root-cause narrowing
- Executive-ready reporting with impact and action plans
10. KPI Framework
- MTTR reduction (%)
- Engineer-hours saved per incident
- Escalation handoff count reduction
- Repeat-incident rate reduction
- User-impact minutes avoided
11. Risk and Control Mapping
| Risk | Control | NYEX AI Contribution | Owner |- Misattributed root cause | Evidence-backed hypothesis ranking | Correlates multi-layer diagnostics before action | Infra Lead |
- Unsafe production changes | Investigation-first sequencing | Recommends low-risk validation and phased mitigation | Change Manager |
- Repeated incident recurrence | Operationalization and monitoring updates | Produces checklist, runbook, and alert improvements | Operations Lead |
- Escalation churn | Structured handoff artifacts | Provides single incident narrative for all teams | Incident Manager |

