Verification, Testing, and Troubleshooting
Verify End-to-End Functionality
Check Service Startup
You can use the messages in the service logs to confirm that the service started correctly. A healthy startup produces output similar to the following:
✅ All required configuration loaded from ConfigMap environment variables
✅ v3 Subnets API: X subnets available
✅ v3 VMs API: X VMs available
✅ Ansible Tower connection successful
👀 Starting v3 event monitoring...
🔍 Watching SUBNETS & VMS & VIRTUAL SWITCHES for: CREATION | MODIFICATION | DELETION
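The startup check can also be scripted. The following is a minimal sketch, not part of the project: the function name `check_startup_markers` is hypothetical, and the marker strings are copied from the sample startup output, so they are assumed to match what your version of the service prints.

```shell
# Reads service log text on stdin and succeeds only if every expected
# startup marker is present. Marker strings are assumptions taken from
# the sample startup output; adjust them if your logs differ.
check_startup_markers() {
  log_text=$(cat)
  status=0
  for marker in \
    "All required configuration loaded" \
    "v3 Subnets API" \
    "v3 VMs API" \
    "Ansible Tower connection successful" \
    "Starting v3 event monitoring"
  do
    # -F treats the marker as a fixed string rather than a regular expression
    if ! printf '%s\n' "$log_text" | grep -qF -- "$marker"; then
      echo "missing startup marker: $marker" >&2
      status=1
    fi
  done
  return "$status"
}
```

With a Kubernetes deployment, it could be fed from `kubectl logs deployment/event-notification-service | check_startup_markers`; a non-zero exit status means at least one marker was not found.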
Test Infrastructure Event Detection
You can create a test subnet in Nutanix:
Log into Prism Central.
Go to Network and Security | Virtual Private Clouds.
Create a new subnet.
Monitor the service logs for event detection.
Here is the expected log output:
🌐 SUBNET CREATED! (v3) 🌐
📅 Detection Time: 2025-10-14 15:30:00
🆔 Subnet UUID: xxxxx-xxxxx-xxxxx
📛 Name: test-subnet
🎯 Processing NETWORK CREATED event for Ansible automation
✅ Found job template 'create-vnet' with ID: X
🚀 Launching job template: create-vnet
✅ Job launched successfully!
Job ID: X
Job URL: http://x.x.x.x:xxxxx/#/jobs/X
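To follow a detected event through to AWX, it can help to pull the launched job IDs out of the service logs. The sketch below assumes the `Job ID: <number>` text shown in the sample output; the function name `extract_job_ids` is hypothetical.

```shell
# Reads log text on stdin and prints each launched AWX job ID on its own line.
# Assumes the log contains "Job ID: <number>", as in the sample output.
extract_job_ids() {
  grep -oE 'Job ID: [0-9]+' | awk '{print $3}'
}
```

For example, `kubectl logs deployment/event-notification-service | extract_job_ids` would list the IDs to look up in the AWX Jobs tab.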
Verify AWX Job Execution
To verify that AWX jobs are being executed, follow these steps:
Log in to the AWX web interface.
Go to the Jobs tab.
Verify that jobs are triggered when an infrastructure change occurs.
Check the job output to confirm that the job ran successfully.
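The same check can be made from the command line through the AWX REST API, which exposes each job under `/api/v2/jobs/<id>/`. This is a rough sketch rather than a project script: the function names are hypothetical, and `AWX_USER` and `AWX_PASSWORD` are placeholder variables that this project does not define.

```shell
# Build the AWX REST API URL for a job's detail record.
# $1 = AWX base URL (for example http://AWX-IP:PORT), $2 = job ID
awx_job_api_url() {
  base_url=${1%/}   # strip one trailing slash, if present
  printf '%s/api/v2/jobs/%s/\n' "$base_url" "$2"
}

# Fetch the job record as JSON using basic authentication.
# AWX_USER and AWX_PASSWORD are placeholders you must set yourself.
fetch_awx_job() {
  curl -sS -u "${AWX_USER}:${AWX_PASSWORD}" "$(awx_job_api_url "$1" "$2")"
}
```

In the returned JSON, the top-level `status` field reads `successful` for a job that completed without errors.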
Troubleshooting
Kubernetes Installation Issues
To troubleshoot Kubernetes installation issues, check for the following:
- Insufficient Resources
# If k8s_deploy.sh fails due to resources:
# Check current usage
free -h
df -h
nproc
# The script requires minimum: 2 CPU, 4GB RAM, 20GB disk
# For better performance, use: 4 CPU, 8GB RAM, 50GB disk
- Network Issues During Installation
# If download fails, check internet connectivity
ping -c 3 8.8.8.8
# If behind proxy, set proxy environment variables
export http_proxy=http://proxy-server:port
export https_proxy=http://proxy-server:port
- Python and Ansible Issues
# If Python installation fails, install manually
sudo apt-get update
sudo apt-get install -y python3 python3-venv python3-pip
# If Ansible fails, check virtual environment
source ~/k8s-venv/bin/activate
pip install --upgrade ansible
- Kubernetes Cluster Not Ready
# Check kubelet status
sudo systemctl status kubelet
# Check kubernetes pods
kubectl get pods -A
# If pods are failing, check logs
kubectl logs -n kube-system <pod-name>
- Service Cannot Connect to Nutanix
# Check network connectivity
kubectl exec deployment/event-notification-service -- curl -k https://PRISM-IP:9440/api/nutanix/v3/clusters
# Verify credentials
kubectl get secret nutanix-eda-secrets -o yaml
- Service Cannot Connect to AWX
# Check AWX service
kubectl get svc -n aap ansible-awx-service
# Test connectivity
kubectl exec deployment/event-notification-service -- curl http://AWX-IP:PORT/api/v2/ping/
- No Events Detected
# Check monitoring flags
kubectl exec deployment/event-notification-service -- env | grep MONITORING
# Verify Nutanix API access
kubectl logs deployment/event-notification-service | grep "API"
- AWX Jobs Not Triggering
# Check job templates exist
kubectl logs deployment/event-notification-service | grep "job template"
# Verify AWX credentials
kubectl logs deployment/event-notification-service | grep "Ansible Tower"
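The minimum requirements quoted under Insufficient Resources (2 CPU, 4GB RAM, 20GB disk) can be checked before you run the installer. The sketch below is an illustration only: both function names are hypothetical, it assumes a Linux host, and it treats available space on `/` as the relevant disk figure, which the installer itself may measure differently.

```shell
# Succeeds when CPU count ($1), RAM in GB ($2), and disk in GB ($3)
# all meet the documented minimum of 2 CPU, 4GB RAM, 20GB disk.
meets_minimum() {
  [ "$1" -ge 2 ] && [ "$2" -ge 4 ] && [ "$3" -ge 20 ]
}

# Gather values from the local host and compare them with the minimum.
check_host_resources() {
  cpus=$(nproc)
  # MemTotal is reported in kB; round to the nearest GB because the kernel
  # reserves part of the installed memory.
  ram_gb=$(awk '/^MemTotal:/ {printf "%d", ($2 / 1048576) + 0.5}' /proc/meminfo)
  # Column 4 of POSIX df output is the available space in 1024-byte blocks.
  disk_gb=$(df -Pk / | awk 'NR==2 {printf "%d", $4 / 1048576}')
  echo "CPU: $cpus, RAM: ${ram_gb}GB, disk available on /: ${disk_gb}GB"
  meets_minimum "$cpus" "$ram_gb" "$disk_gb"
}
```

Running `check_host_resources` prints the measured values and returns a non-zero status if any of them falls below the minimum.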
Recovery Procedures
- Restart Kubernetes Installation
# If k8s_deploy.sh fails, clean up and retry
sudo kubeadm reset -f
sudo rm -rf ~/.kube
rm -rf ~/k8s-venv ~/kubespray
# Then run the script again
./k8s_deploy.sh
- Reinstall AWX
# Remove AWX completely
helm uninstall ansible-awx -n aap
kubectl delete namespace aap
# Wait for cleanup, then redeploy
./awx_deploy.sh
- Reinstall Nutanix Service
# Kubernetes deployment
kubectl delete deployment event-notification-service
kubectl delete configmap nutanix-eda-config
kubectl delete secret nutanix-eda-secrets
# Docker deployment
docker stop nutanix-event-service
docker rm nutanix-event-service
# Then redeploy
./deploy_nutanix_service.sh
Configuration Reference
The service uses the following default values for any settings that you do not specify in the configuration:
# Monitoring (all enabled by default)
MONITORING_CHECK_INTERVAL: "5" # seconds
MONITORING_MONITOR_NETWORKS: "true"
MONITORING_MONITOR_VMS: "true"
MONITORING_MONITOR_VIRTUAL_SWITCHES: "true"
# Job Templates (default AWX job names)
JOB_TEMPLATE_NETWORK_CREATE: "create-vnet"
JOB_TEMPLATE_NETWORK_DELETE: "delete-vnet"
JOB_TEMPLATE_VIRTUAL_SWITCH_CREATE: "create-vrf"
JOB_TEMPLATE_VIRTUAL_SWITCH_DELETE: "delete-vrf"
# Ansible Settings (enabled by default)
ANSIBLE_ENABLED: "true"
ANSIBLE_MAX_RETRIES: "3"
ANSIBLE_RETRY_DELAY: "5"
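The fallback behavior can be illustrated in shell with default-value parameter expansion. This is only a sketch of the idea, not the service's own configuration loader: the function name is hypothetical, and note that `${VAR:-default}` also applies the default when the variable is set but empty, which may differ from how the service treats empty values.

```shell
# Print the effective monitoring settings: the value from the environment
# if one is set, otherwise the documented default.
print_effective_monitoring_config() {
  echo "MONITORING_CHECK_INTERVAL=${MONITORING_CHECK_INTERVAL:-5}"
  echo "MONITORING_MONITOR_NETWORKS=${MONITORING_MONITOR_NETWORKS:-true}"
  echo "MONITORING_MONITOR_VMS=${MONITORING_MONITOR_VMS:-true}"
  echo "MONITORING_MONITOR_VIRTUAL_SWITCHES=${MONITORING_MONITOR_VIRTUAL_SWITCHES:-true}"
}
```

Comparing this output with `kubectl exec deployment/event-notification-service -- env | grep MONITORING` shows which settings have been overridden in the ConfigMap and which are left at their defaults.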
File Locations
Here is the file structure for the project:
eda-apstra-project/
├── deploy/nutanix/
│   ├── scripts/
│   │   ├── awx_deploy.sh                # AWX deployment
│   │   ├── configure_awx.sh             # AWX configuration
│   │   └── deploy_nutanix_service.sh    # Service deployment
│   └── files/
│       ├── deployment.yaml              # Kubernetes deployment
│       ├── unified-configmap.yaml       # Configuration template
│       ├── unified-secret.yaml          # Secrets template
│       └── nutanix-eda-docker.env       # Docker environment template
└── playbooks/                           # Ansible playbooks for job templates
    ├── ntx-create-sz.yml                # Create security zone
    ├── ntx-delete-sz.yml                # Delete security zone
    ├── ntx-create-vnet.yml              # Create virtual network
    └── ntx-delete-vnet.yml              # Delete virtual network
Maintenance and Operations
- Monitoring Service Health
- Updating Configuration for Kubernetes
- Scaling (Kubernetes Only)
- Updating Configuration for Docker
Monitoring Service Health
# Kubernetes deployment
kubectl get pods -l app=event-notification-service
kubectl logs -f deployment/event-notification-service
# Docker deployment
docker ps | grep nutanix-event-service
docker logs -f nutanix-event-service
Updating Configuration for Kubernetes
# Update ConfigMap
kubectl edit configmap nutanix-eda-config
# Update Secret
kubectl edit secret nutanix-eda-secrets
# Restart deployment
kubectl rollout restart deployment event-notification-service
Scaling (Kubernetes Only)
# Scale to multiple replicas
kubectl scale deployment event-notification-service --replicas=2
# Update resource limits
kubectl edit deployment event-notification-service
Updating Configuration for Docker
# Update environment file and restart container
docker stop nutanix-event-service
docker rm nutanix-event-service
# Edit nutanix-eda-docker.env
docker run -d --name nutanix-event-service --env-file nutanix-eda-docker.env <image>
Support and Documentation
Log Analysis
This service provides detailed logging for troubleshooting:
- Infrastructure event detection
- AWX job template execution
- Configuration loading
- API connectivity status
Useful Commands
# Confirm that the service container is running and responds to commands
kubectl exec deployment/event-notification-service -- python -c "print('Service running')"
# Test configuration
kubectl exec deployment/event-notification-service -- python unified_config_manager.py
# Manual job trigger (for testing)
# Access AWX web interface and manually run job templates