Skip to content

Feat: support getting control plane logs #73

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jul 16, 2025
Merged

Conversation

gossion
Copy link
Member

@gossion gossion commented Jul 16, 2025

This pull request introduces functionality for managing and querying AKS control plane diagnostics and logs, along with enhancements to command validation for security. The changes include new handler functions, tools registration, and specialized validation for KQL queries.

AKS Control Plane Diagnostics and Logs Management:

  • New Handlers for Diagnostics and Logs: Added HandleControlPlaneDiagnosticSettings and HandleControlPlaneLogs functions to manage diagnostic settings and query control plane logs with validation and safety constraints. (internal/components/monitor/diagnostics_handlers.go, internal/components/monitor/diagnostics_handlers.goR1-R400)
  • Tool Registration: Registered tools for checking diagnostic settings (aks_control_plane_diagnostic_settings) and querying control plane logs (aks_control_plane_logs) with detailed parameter validation and descriptions. (internal/components/monitor/registry.go, internal/components/monitor/registry.goR101-R155)
  • Service Initialization: Updated the service initialization to include registration of AKS control plane tools. (internal/server/server.go, [1] [2]

Security Enhancements:

  • KQL Query Validation: Added specialized validation logic for az monitor log-analytics query commands to ensure KQL queries are properly quoted and other command parts are free of dangerous patterns. (internal/security/validator.go, [1] [2]

Copy link
Member

@feiskyer feiskyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Detailed Code Suggestions

1. Extract hard-coded values to constants (line 194)

const (
    MaxLogRetentionDays = 7
    MaxQueryRangeDuration = 24 * time.Hour
    DefaultMaxRecords = 100
    MaxAllowedRecords = 1000
)

2. Extract resource ID construction utility (line 27)

func buildClusterResourceID(subscriptionID, resourceGroup, clusterName string) string {
    return fmt.Sprintf("/subscriptions/%s/resourceGroups/%s/providers/Microsoft.ContainerService/managedClusters/%s",
        subscriptionID, resourceGroup, clusterName)
}

3. Improve log level mapping (line 292)

var logLevelPrefixes = map[string]string{
    "info":    "I",
    "warning": "W",
    "error":   "E",
}

if prefix, exists := logLevelPrefixes[strings.ToLower(logLevel)]; exists {
    baseQuery += fmt.Sprintf(" | where log_s startswith '%s'", prefix)
}

4. Add more context to error messages (line 42)

return "", fmt.Errorf("failed to get diagnostic settings for cluster %s in resource group %s: %w", clusterName, resourceGroup, err)

5. Consider splitting large file for better organization

The 400-line handler file could benefit from being split:

internal/components/monitor/
├── diagnostics.go (main handlers)
├── validation.go (parameter validation)
├── kql.go (query building)
└── workspace.go (workspace GUID extraction)

@gossion
Copy link
Member Author

gossion commented Jul 16, 2025

failed to get diagnostic settings for cluster

fixed 1,2,3,4. skipping 5 for this time (I would change it later)

Copy link
Member

@feiskyer feiskyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@gossion gossion merged commit 50a809f into main Jul 16, 2025
9 checks passed
@gossion gossion deleted the guwe/controlplanelogs-p branch July 16, 2025 07:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants