Update README.md
.env (new file)
@@ -0,0 +1,26 @@
# Nomad connection settings
NOMAD_ADDR=http://pjmldk01.ds.meisheng.group:4646
NOMAD_TOKEN=
NOMAD_SKIP_VERIFY=true
NOMAD_NAMESPACE=development

# Gitea API configuration
GITEA_API_URL=https://gitea.dev.meisheng.group/api/v1
GITEA_API_TOKEN=a2de6c0014e6d0108edb94fb7d524777bb75d33a
# Alternative authentication (uncomment if needed)
# GITEA_USERNAME=your-gitea-username
# GITEA_PASSWORD=your-gitea-password
GITEA_VERIFY_SSL=false

# API settings
PORT=8000
HOST=0.0.0.0

# Configuration directory
CONFIG_DIR=./configs

# Logging level
LOG_LEVEL=INFO

# Enable to make development easier
RELOAD=true
.env.example (new file)
@@ -0,0 +1,22 @@
# Nomad connection settings
NOMAD_ADDR=http://localhost:4646
NOMAD_TOKEN=<your-nomad-token>
NOMAD_SKIP_VERIFY=false

# Gitea API configuration
GITEA_API_URL=http://gitea.internal.example.com/api/v1
GITEA_API_TOKEN=<your-gitea-api-token>
# Alternative authentication (if token is not available)
# GITEA_USERNAME=<your-gitea-username>
# GITEA_PASSWORD=<your-gitea-password>
GITEA_VERIFY_SSL=false

# API settings
PORT=8000
HOST=0.0.0.0

# Configuration directory
CONFIG_DIR=./configs

# Optional: Logging level
LOG_LEVEL=INFO
CLAUDE_API_INTEGRATION.md (new file)
@@ -0,0 +1,249 @@
# Claude Integration with Nomad MCP

This document explains how to configure Claude to connect to the Nomad MCP service and manage jobs.

## Overview

The Nomad MCP service provides a simplified REST API specifically designed for Claude to interact with Nomad jobs. This API allows Claude to:

1. List all jobs in a namespace
2. Get the status of a specific job
3. Start, stop, and restart jobs
4. Create new jobs with a simplified specification
5. Retrieve logs from jobs

## API Endpoints

The Claude-specific API is available at the `/api/claude` prefix. The following endpoints are available:

### List Jobs

```
GET /api/claude/list-jobs?namespace=development
```

Returns a list of all jobs in the specified namespace with their IDs, names, statuses, and types.

### Manage Jobs

```
POST /api/claude/jobs
```

Manages existing jobs with operations like status check, stop, and restart.

Request body:
```json
{
  "job_id": "example-job",
  "action": "status|stop|restart",
  "namespace": "development",
  "purge": false
}
```

### Create Jobs

```
POST /api/claude/create-job
```

Creates a new job with a simplified specification.

Request body:
```json
{
  "job_id": "example-job",
  "name": "Example Job",
  "type": "service",
  "datacenters": ["jm"],
  "namespace": "development",
  "docker_image": "nginx:latest",
  "count": 1,
  "cpu": 100,
  "memory": 128,
  "ports": [
    {
      "Label": "http",
      "Value": 0,
      "To": 80
    }
  ],
  "env_vars": {
    "ENV_VAR1": "value1",
    "ENV_VAR2": "value2"
  }
}
```
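
The service converts this simplified specification into a full Nomad job spec server-side (via `to_nomad_job_spec`, whose implementation is not shown in this document). A minimal sketch of the kind of mapping involved, assuming the input fields from the request body above and Nomad's standard job anatomy (TaskGroups, Tasks, Resources); treat it as illustrative, not as the service's actual code:

```python
def to_nomad_job_spec(spec: dict) -> dict:
    """Sketch: map the simplified Claude job fields onto Nomad's job structure."""
    return {
        "Job": {
            "ID": spec["job_id"],
            "Name": spec.get("name", spec["job_id"]),
            "Type": spec.get("type", "service"),
            "Datacenters": spec.get("datacenters", ["dc1"]),
            "Namespace": spec.get("namespace", "development"),
            "TaskGroups": [{
                "Name": spec["job_id"],
                "Count": spec.get("count", 1),
                # Port placement under Networks/DynamicPorts is an assumption.
                "Networks": [{"DynamicPorts": spec.get("ports", [])}],
                "Tasks": [{
                    "Name": spec["job_id"],
                    "Driver": "docker",
                    "Config": {"image": spec["docker_image"]},
                    "Env": spec.get("env_vars", {}),
                    "Resources": {
                        "CPU": spec.get("cpu", 100),
                        "MemoryMB": spec.get("memory", 128),
                    },
                }],
            }],
        }
    }
```

The nesting shown here follows Nomad's documented job structure; the real service may organize networking and defaults differently.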

### Get Job Logs

```
GET /api/claude/job-logs/{job_id}?namespace=development
```

Retrieves logs from the latest allocation of the specified job.

## Configuring Claude Desktop Application

To configure Claude to connect to the Nomad MCP service, follow these steps:

### 1. Set Up API Access

Claude needs to be configured with the base URL of your Nomad MCP service. This is typically:

```
http://your-server-address:8000
```

### 2. Create a Claude Tool Configuration

In the Claude desktop application, you can create a custom tool configuration that allows Claude to interact with the Nomad MCP API. Here's a sample configuration:

```json
{
  "tools": [
    {
      "name": "nomad_mcp",
      "description": "Manage Nomad jobs through the MCP service",
      "api_endpoints": [
        {
          "name": "list_jobs",
          "description": "List all jobs in a namespace",
          "method": "GET",
          "url": "http://your-server-address:8000/api/claude/list-jobs",
          "params": [
            {
              "name": "namespace",
              "type": "string",
              "description": "Nomad namespace",
              "required": false,
              "default": "development"
            }
          ]
        },
        {
          "name": "manage_job",
          "description": "Manage a job (status, stop, restart)",
          "method": "POST",
          "url": "http://your-server-address:8000/api/claude/jobs",
          "body": {
            "job_id": "string",
            "action": "string",
            "namespace": "string",
            "purge": "boolean"
          }
        },
        {
          "name": "create_job",
          "description": "Create a new job",
          "method": "POST",
          "url": "http://your-server-address:8000/api/claude/create-job",
          "body": {
            "job_id": "string",
            "name": "string",
            "type": "string",
            "datacenters": "array",
            "namespace": "string",
            "docker_image": "string",
            "count": "integer",
            "cpu": "integer",
            "memory": "integer",
            "ports": "array",
            "env_vars": "object"
          }
        },
        {
          "name": "get_job_logs",
          "description": "Get logs for a job",
          "method": "GET",
          "url": "http://your-server-address:8000/api/claude/job-logs/{job_id}",
          "params": [
            {
              "name": "namespace",
              "type": "string",
              "description": "Nomad namespace",
              "required": false,
              "default": "development"
            }
          ]
        }
      ]
    }
  ]
}
```

### 3. Import the Tool Configuration

1. Open the Claude desktop application
2. Go to Settings > Tools
3. Click "Import Tool Configuration"
4. Select the JSON file with the above configuration
5. Click "Save"

### 4. Test the Connection

You can test the connection by asking Claude to list all jobs:

```
Please list all jobs in the development namespace using the Nomad MCP service.
```

Claude should use the configured tool to make an API request to the Nomad MCP service and return the list of jobs.

## Example Prompts for Claude

Here are some example prompts you can use with Claude to interact with the Nomad MCP service:

### List Jobs

```
Please list all jobs in the development namespace.
```

### Check Job Status

```
What is the status of the job "example-job"?
```

### Start a New Job

```
Please create a new job with the following specifications:
- Job ID: test-nginx
- Docker image: nginx:latest
- Memory: 256MB
- CPU: 200MHz
- Port mapping: HTTP port 80
```

### Stop a Job

```
Please stop the job "test-nginx" and purge it from Nomad.
```

### Get Job Logs

```
Show me the logs for the job "example-job".
```

## Troubleshooting

If Claude is unable to connect to the Nomad MCP service, check the following:

1. Ensure the Nomad MCP service is running and accessible from Claude's network
2. Verify the base URL in the tool configuration is correct
3. Check that the Nomad MCP service has proper connectivity to the Nomad server
4. Review the logs of the Nomad MCP service for any errors

## Security Considerations

The Claude API integration does not include authentication by default. If you need to secure the API:

1. Add an API key requirement to the FastAPI application
2. Include the API key in the Claude tool configuration
3. Consider using HTTPS for all communications between Claude and the Nomad MCP service
Dockerfile (new file)
@@ -0,0 +1,19 @@
FROM python:3.11-slim

WORKDIR /app

# Copy requirements first for better layer caching
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Create configs directory
RUN mkdir -p configs

# Expose the API port
EXPOSE 8000

# Run the application
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
QUICK_START.md (new file)
@@ -0,0 +1,122 @@
# Nomad MCP - Quick Start Guide

This guide will help you quickly set up and start using the Nomad MCP service for managing Nomad jobs.

## 1. Installation

### Clone the Repository

```bash
git clone https://github.com/your-org/nomad-mcp.git
cd nomad-mcp
```

### Install Dependencies

```bash
pip install -r requirements.txt
```

## 2. Configuration

### Set Up Environment Variables

Create a `.env` file in the project root:

```
# Nomad connection settings
NOMAD_ADDR=http://your-nomad-server:4646
NOMAD_TOKEN=your-nomad-token
NOMAD_NAMESPACE=development
NOMAD_SKIP_VERIFY=true

# API settings
PORT=8000
HOST=0.0.0.0

# Logging level
LOG_LEVEL=INFO
```

Replace `your-nomad-server` and `your-nomad-token` with your actual Nomad server address and token.

## 3. Start the Service

```bash
python -m uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```

The service will be available at `http://localhost:8000`.

## 4. Access the Web UI

Open your browser and navigate to:

```
http://localhost:8000
```

You should see the Nomad Job Manager UI with a list of jobs in your default namespace.

## 5. Basic Operations

### View Jobs

1. Select a namespace from the dropdown in the header
2. Browse the list of jobs with their statuses

### Manage a Job

1. Click the "View" button next to a job to see its details
2. Use the "Restart" button to restart a job
3. Use the "Stop" button to stop a job

### View Logs

1. Select a job to view its details
2. Scroll down to the "Logs" section
3. Switch between stdout and stderr using the tabs

## 6. API Usage

### List Jobs

```bash
curl http://localhost:8000/api/claude/list-jobs?namespace=development
```

### Get Job Status

```bash
curl -X POST http://localhost:8000/api/claude/jobs \
  -H "Content-Type: application/json" \
  -d '{"job_id": "example-job", "action": "status", "namespace": "development"}'
```

### Stop a Job

```bash
curl -X POST http://localhost:8000/api/claude/jobs \
  -H "Content-Type: application/json" \
  -d '{"job_id": "example-job", "action": "stop", "namespace": "development", "purge": false}'
```
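
The same requests can be issued from Python using only the standard library. A sketch mirroring the curl calls above (the base URL assumes the service is running locally, as in the examples):

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed service address, as in the curl examples

def build_manage_job_request(job_id: str, action: str,
                             namespace: str = "development") -> urllib.request.Request:
    """Build the POST request that mirrors the curl examples above."""
    payload = json.dumps(
        {"job_id": job_id, "action": action, "namespace": namespace}
    ).encode()
    return urllib.request.Request(
        f"{BASE_URL}/api/claude/jobs",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def send(req: urllib.request.Request) -> dict:
    """Send a request and decode the JSON response (requires the service to be running)."""
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

`build_manage_job_request` only constructs the request object; `send` performs the HTTP call and needs the service to be reachable.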

## 7. Claude AI Integration

To set up Claude AI integration:

1. Configure Claude with the provided `claude_nomad_tool.json` file
2. Update the URLs in the configuration to point to your Nomad MCP service
3. Use natural language to ask Claude to manage your Nomad jobs

Example prompt for Claude:
```
Please list all jobs in the development namespace using the Nomad MCP service.
```

## Next Steps

- Read the full [README.md](README.md) for detailed information
- Check out the [User Guide](USER_GUIDE.md) for the web UI
- Explore the [Claude API Integration Documentation](CLAUDE_API_INTEGRATION.md) for AI integration
- Review the API documentation at `http://localhost:8000/docs`
README_NOMAD_API.md (new file)
@@ -0,0 +1,116 @@
# Nomad API Integration

This document explains how the Nomad API integration works in this application, the recent improvements made, and how to test the functionality.

## Overview

This application uses HashiCorp Nomad for job orchestration, interacting with Nomad through its HTTP API. The integration allows starting, stopping, and monitoring jobs in Nomad.

## Recent Improvements

The following improvements have been made to the Nomad service integration:

1. **Simplified Namespace Handling**:
   - Clear priority order for determining which namespace to use:
     1. Explicitly specified in the job spec (highest priority)
     2. Service instance namespace (default: "development")
   - Consistent namespace handling across all API operations
   - Better logging of namespace resolution

2. **Standardized Job Specification Formatting**:
   - Consistent normalization of job specifications to ensure proper structure
   - Always ensures job specs are wrapped in a "Job" key, as required by Nomad
   - Maintains any existing structure while normalizing as needed

3. **Enhanced Error Handling**:
   - Improved error messages with more context
   - Added logging of API responses for better troubleshooting
   - Returns namespace information in responses

4. **Automated Testing**:
   - Added pytest tests to verify job start/stop functionality
   - Tests cover different job specification formats
   - Auto-cleanup of test jobs
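
The first two improvements can be sketched as a small helper. This is an illustrative reconstruction of the behavior described above, not the actual code from the service:

```python
def normalize_job_spec(job_spec: dict, service_namespace: str = "development") -> dict:
    """Ensure the spec is wrapped in a "Job" key and carries a namespace.

    Namespace priority: a namespace explicitly set in the spec wins;
    otherwise the service instance's namespace is used.
    """
    if "Job" not in job_spec:
        # Wrap bare specs in a "Job" key, as the Nomad API requires.
        job_spec = {"Job": job_spec}
    job = job_spec["Job"]
    if not job.get("Namespace"):
        job["Namespace"] = service_namespace
    return job_spec
```

For example, `normalize_job_spec({"ID": "x"})` wraps the spec and assigns the "development" namespace, while a spec that already names a namespace keeps it.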

## How to Run Tests

### Prerequisites

1. Set up the environment variables:
   - `NOMAD_ADDR`: URL of your Nomad server (e.g., `http://pjmldk01.ds.meisheng.group:4646`)
   - `NOMAD_TOKEN`: Authentication token (if your Nomad cluster uses ACLs)
   - `NOMAD_NAMESPACE`: Default namespace to use (defaults to "development")

2. Install test dependencies:
   ```
   pip install pytest pytest-cov
   ```

### Running the Tests

From the project root directory:

```bash
python -m pytest tests/test_nomad_service.py -v
```

Add coverage reporting:

```bash
python -m pytest tests/test_nomad_service.py --cov=app.services.nomad_client -v
```

## Manual API Testing

You can use PowerShell to test Nomad API operations directly:

### List Jobs

```powershell
Invoke-RestMethod -Uri "http://pjmldk01.ds.meisheng.group:4646/v1/jobs?namespace=development" -Method GET
```

### Get Job Details

```powershell
Invoke-RestMethod -Uri "http://pjmldk01.ds.meisheng.group:4646/v1/job/example-job?namespace=development" -Method GET
```

### Start a Job

```powershell
$jobSpec = @{
    "Job" = @{
        "ID" = "example-job"
        "Name" = "example-job"
        "Namespace" = "development"
        # Other job properties
    }
} | ConvertTo-Json -Depth 20

Invoke-RestMethod -Uri "http://pjmldk01.ds.meisheng.group:4646/v1/jobs" -Method POST -Body $jobSpec -ContentType "application/json"
```

### Stop a Job

```powershell
Invoke-RestMethod -Uri "http://pjmldk01.ds.meisheng.group:4646/v1/job/example-job?namespace=development" -Method DELETE
```

## API Documentation

For more comprehensive documentation on the Nomad API integration, refer to the `nomad_job_api_docs.md` file.

## Troubleshooting

### Common Issues

1. **Job Not Found**: Ensure you're specifying the correct namespace
2. **Failed to Start Job**: Check the job specification format and resource requirements
3. **Permission Denied**: Verify the ACL token has appropriate permissions

### Debugging Tips

1. Check the application logs for detailed error messages
2. Use the `-v` flag with pytest to see more verbose output
3. Try direct API requests to isolate application vs. Nomad API issues
USER_GUIDE.md (new file)
@@ -0,0 +1,135 @@
# Nomad Job Manager UI - User Guide

This guide provides instructions on how to use the Nomad Job Manager web interface to monitor and manage your Nomad jobs.

## Accessing the UI

The Nomad Job Manager UI is available at the root URL of the Nomad MCP service:

```
http://your-server-address:8000
```

## Interface Overview

The UI is divided into two main sections:

1. **Job List** (left panel): Displays all jobs in the selected namespace
2. **Job Details** (right panel): Shows detailed information about the selected job and its logs

### Header Controls

- **Namespace Selector**: Dropdown to switch between different Nomad namespaces
- **Refresh Button**: Updates the job list with the latest information from Nomad

## Managing Jobs

### Viewing Jobs

1. Select the desired namespace from the dropdown in the header
2. The job list will display all jobs in that namespace with their:
   - Job ID
   - Type (service, batch, system)
   - Status (running, pending, dead)
   - Action buttons

### Job Actions

For each job in the list, you can perform the following actions:

- **View**: Display detailed information about the job and its logs
- **Restart**: Stop and restart the job with its current configuration
- **Stop**: Stop the job (with an option to purge it)

### Viewing Job Details

When you click the "View" button for a job, the right panel will display:

1. **Job Information**:
   - Job ID
   - Status
   - Type
   - Namespace
   - Datacenters

2. **Allocation Information** (if available):
   - Allocation ID
   - Status
   - Description

3. **Logs**:
   - Tabs to switch between stdout and stderr logs
   - Scrollable log content

## Working with Logs

The logs section allows you to view the output from your job's tasks:

1. Click on a job to view its details
2. Scroll down to the "Logs" section
3. Use the tabs to switch between:
   - **stdout**: Standard output logs
   - **stderr**: Standard error logs

The logs are automatically retrieved from the most recent allocation of the job.

## Common Tasks

### Restarting a Failed Job

1. Find the job in the job list
2. Click the "Restart" button
3. Confirm the restart when prompted
4. The job status will update once the restart is complete

### Stopping a Job

1. Find the job in the job list
2. Click the "Stop" button
3. Choose whether to purge the job when prompted
4. Confirm the stop operation
5. The job will be removed from the list if purged, or shown as "dead" if not purged

### Troubleshooting a Job

1. Select the job to view its details
2. Check the status and any error messages in the job details
3. Review the stderr logs for error information
4. If needed, restart the job to attempt recovery

## Tips and Tricks

- **Regular Refreshes**: Use the refresh button to get the latest job status
- **Log Navigation**: For large log files, use your browser's search function (Ctrl+F) to find specific messages
- **Multiple Namespaces**: Switch between namespaces to manage different environments (development, production, etc.)
- **Job Status Colors**:
  - Green: Running jobs
  - Orange: Pending jobs
  - Red: Dead or failed jobs

## Troubleshooting the UI

If you encounter issues with the UI:

1. **UI Doesn't Load**:
   - Check that the Nomad MCP service is running
   - Verify your browser can reach the server
   - Check the browser console for JavaScript errors

2. **Jobs Not Appearing**:
   - Ensure you've selected the correct namespace
   - Verify that your Nomad server is accessible
   - Check that your Nomad token has permission to list jobs

3. **Cannot Perform Actions**:
   - Verify that your Nomad token has appropriate permissions
   - Check the browser console for API errors
   - Review the Nomad MCP service logs for backend errors

## Next Steps

For more advanced operations or programmatic access, consider:

1. Using the REST API directly (see the API documentation)
2. Setting up Claude AI integration for natural language job management
3. Creating job configuration mappings for repository-based job management
__pycache__/test_gitea_integration.cpython-313.pyc (new binary file, not shown)
__pycache__/test_gitea_repos.cpython-313.pyc (new binary file, not shown)
__pycache__/test_nomad_connection.cpython-313.pyc (new binary file, not shown)
__pycache__/test_nomad_namespaces.cpython-313.pyc (new binary file, not shown)
app/__init__.py (new file)
@@ -0,0 +1,2 @@
# Import version
__version__ = "0.1.0"
app/__pycache__/__init__.cpython-313.pyc (new binary file, not shown)
app/__pycache__/main.cpython-313.pyc (new binary file, not shown)
app/main.py (new file)
@@ -0,0 +1,101 @@
from fastapi import FastAPI, HTTPException, Depends
from fastapi.middleware.cors import CORSMiddleware
from fastapi.staticfiles import StaticFiles
import os
import logging
from dotenv import load_dotenv

from app.routers import jobs, logs, configs, repositories, claude
from app.services.nomad_client import get_nomad_client
from app.services.gitea_client import GiteaClient

# Load environment variables
load_dotenv()

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
)
logger = logging.getLogger(__name__)

# Initialize the FastAPI app
app = FastAPI(
    title="Nomad MCP",
    description="Service for AI agents to manage Nomad jobs via MCP protocol",
    version="0.1.0",
)

# Add CORS middleware
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # Can be set to specific origins in production
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Include routers
app.include_router(jobs.router, prefix="/api/jobs", tags=["jobs"])
app.include_router(logs.router, prefix="/api/logs", tags=["logs"])
app.include_router(configs.router, prefix="/api/configs", tags=["configs"])
app.include_router(repositories.router, prefix="/api/repositories", tags=["repositories"])
app.include_router(claude.router, prefix="/api/claude", tags=["claude"])

@app.get("/api/health", tags=["health"])
async def health_check():
    """Health check endpoint."""
    health_status = {
        "status": "healthy",
        "services": {}
    }

    # Check Nomad connection
    try:
        client = get_nomad_client()
        nomad_status = client.agent.get_agent()
        health_status["services"]["nomad"] = {
            "status": "connected",
            "version": nomad_status.get("config", {}).get("Version", "unknown"),
        }
    except Exception as e:
        logger.error(f"Nomad health check failed: {str(e)}")
        health_status["services"]["nomad"] = {
            "status": "failed",
            "error": str(e),
        }

    # Check Gitea connection
    try:
        gitea_client = GiteaClient()
        if gitea_client.api_base_url:
            # Try to list repositories as a connection test
            repos = gitea_client.list_repositories(limit=1)
            health_status["services"]["gitea"] = {
                "status": "connected",
                "api_url": gitea_client.api_base_url,
            }
        else:
            health_status["services"]["gitea"] = {
                "status": "not_configured",
            }
    except Exception as e:
        logger.error(f"Gitea health check failed: {str(e)}")
        health_status["services"]["gitea"] = {
            "status": "failed",
            "error": str(e),
        }

    # Overall status is unhealthy if any service is failed
    if any(service["status"] == "failed" for service in health_status["services"].values()):
        health_status["status"] = "unhealthy"

    return health_status

# Mount static files
app.mount("/", StaticFiles(directory="static", html=True), name="static")

if __name__ == "__main__":
    import uvicorn
    port = int(os.getenv("PORT", "8000"))
    uvicorn.run("app.main:app", host="0.0.0.0", port=port, reload=True)
app/routers/__init__.py (new file)
@@ -0,0 +1 @@
# Import routers
app/routers/__pycache__/__init__.cpython-313.pyc (new binary file, not shown)
app/routers/__pycache__/claude.cpython-313.pyc (new binary file, not shown)
app/routers/__pycache__/configs.cpython-313.pyc (new binary file, not shown)
app/routers/__pycache__/jobs.cpython-313.pyc (new binary file, not shown)
app/routers/__pycache__/logs.cpython-313.pyc (new binary file, not shown)
app/routers/__pycache__/repositories.cpython-313.pyc (new binary file, not shown)
app/routers/claude.py (new file)
@@ -0,0 +1,230 @@
from fastapi import APIRouter, HTTPException, Body, Query, Depends
from typing import Dict, Any, List, Optional
import logging
import json

from app.services.nomad_client import NomadService
from app.schemas.claude_api import ClaudeJobRequest, ClaudeJobSpecification, ClaudeJobResponse

router = APIRouter()
logger = logging.getLogger(__name__)

@router.post("/jobs", response_model=ClaudeJobResponse)
async def manage_job(request: ClaudeJobRequest):
    """
    Endpoint for Claude to manage Nomad jobs with a simplified interface.

    This endpoint handles job operations like start, stop, restart, and status checks.
    """
    try:
        # Create a Nomad service instance with the specified namespace
        nomad_service = NomadService()
        if request.namespace:
            nomad_service.namespace = request.namespace

        # Handle different actions
        if request.action.lower() == "status":
            # Get job status
            job = nomad_service.get_job(request.job_id)

            # Get allocations for more detailed status
            allocations = nomad_service.get_allocations(request.job_id)
            latest_alloc = None
            if allocations:
                # Sort allocations by creation time (descending)
                sorted_allocations = sorted(
                    allocations,
                    key=lambda a: a.get("CreateTime", 0),
                    reverse=True
                )
                latest_alloc = sorted_allocations[0]

            return ClaudeJobResponse(
                success=True,
                job_id=request.job_id,
                status=job.get("Status", "unknown"),
                message=f"Job {request.job_id} is {job.get('Status', 'unknown')}",
                details={
                    "job": job,
                    "latest_allocation": latest_alloc
                }
            )

        elif request.action.lower() == "stop":
            # Stop the job
            result = nomad_service.stop_job(request.job_id, purge=request.purge)

            return ClaudeJobResponse(
                success=True,
                job_id=request.job_id,
                status="stopped",
                message=f"Job {request.job_id} has been stopped" + (" and purged" if request.purge else ""),
                details=result
            )

        elif request.action.lower() == "restart":
            # Get the current job specification
            job_spec = nomad_service.get_job(request.job_id)

            # Stop the job
            nomad_service.stop_job(request.job_id)

            # Start the job with the original specification
            result = nomad_service.start_job(job_spec)

            return ClaudeJobResponse(
                success=True,
                job_id=request.job_id,
                status="restarted",
                message=f"Job {request.job_id} has been restarted",
                details=result
            )

        else:
            # Unknown action
            raise HTTPException(status_code=400, detail=f"Unknown action: {request.action}")

    except Exception as e:
        logger.error(f"Error managing job {request.job_id}: {str(e)}")
        return ClaudeJobResponse(
            success=False,
            job_id=request.job_id,
            status="error",
            message=f"Error: {str(e)}",
            details=None
        )

@router.post("/create-job", response_model=ClaudeJobResponse)
async def create_job(job_spec: ClaudeJobSpecification):
    """
    Endpoint for Claude to create a new Nomad job with a simplified interface.

    This endpoint allows creating a job with minimal configuration.
    """
    try:
        # Create a Nomad service instance with the specified namespace
        nomad_service = NomadService()
        if job_spec.namespace:
            nomad_service.namespace = job_spec.namespace

        # Convert the simplified job spec to Nomad format
        nomad_job_spec = job_spec.to_nomad_job_spec()

        # Start the job
        result = nomad_service.start_job(nomad_job_spec)

        return ClaudeJobResponse(
            success=True,
            job_id=job_spec.job_id,
            status="started",
            message=f"Job {job_spec.job_id} has been created and started",
            details=result
        )

    except Exception as e:
        logger.error(f"Error creating job {job_spec.job_id}: {str(e)}")
        return ClaudeJobResponse(
            success=False,
            job_id=job_spec.job_id,
            status="error",
            message=f"Error: {str(e)}",
            details=None
        )

@router.get("/list-jobs", response_model=List[Dict[str, Any]])
async def list_jobs(namespace: str = Query("development")):
    """
    List all jobs in the specified namespace.

    Returns a simplified list of jobs with their IDs and statuses.
    """
    try:
        # Create a Nomad service instance with the specified namespace
        nomad_service = NomadService()
        nomad_service.namespace = namespace

        # Get all jobs
        jobs = nomad_service.list_jobs()

        # Return a simplified list
        simplified_jobs = []
        for job in jobs:
            simplified_jobs.append({
                "id": job.get("ID"),
                "name": job.get("Name"),
                "status": job.get("Status"),
                "type": job.get("Type"),
                "namespace": namespace
            })

        return simplified_jobs

    except Exception as e:
        logger.error(f"Error listing jobs: {str(e)}")
        raise HTTPException(status_code=500, detail=f"Error listing jobs: {str(e)}")

@router.get("/job-logs/{job_id}", response_model=Dict[str, Any])
async def get_job_logs(job_id: str, namespace: str = Query("development")):
    """
    Get logs for a job.

    Returns logs from the latest allocation of the job.
    """
    try:
        # Create a Nomad service instance with the specified namespace
        nomad_service = NomadService()
        nomad_service.namespace = namespace

        # Get allocations for the job
        allocations = nomad_service.get_allocations(job_id)
        if not allocations:
            return {
                "success": False,
                "job_id": job_id,
                "message": f"No allocations found for job {job_id}",
|
||||
"logs": None
|
||||
}
|
||||
|
||||
# Sort allocations by creation time (descending)
|
||||
sorted_allocations = sorted(
|
||||
allocations,
|
||||
key=lambda a: a.get("CreateTime", 0),
|
||||
reverse=True
|
||||
)
|
||||
latest_alloc = sorted_allocations[0]
|
||||
alloc_id = latest_alloc.get("ID")
|
||||
|
||||
# Get the task name from the allocation
|
||||
task_name = None
|
||||
if "TaskStates" in latest_alloc:
|
||||
task_states = latest_alloc["TaskStates"]
|
||||
if task_states:
|
||||
task_name = next(iter(task_states.keys()))
|
||||
|
||||
if not task_name:
|
||||
task_name = "app" # Default task name
|
||||
|
||||
# Get logs for the allocation
|
||||
stdout_logs = nomad_service.get_allocation_logs(alloc_id, task_name, "stdout")
|
||||
stderr_logs = nomad_service.get_allocation_logs(alloc_id, task_name, "stderr")
|
||||
|
||||
return {
|
||||
"success": True,
|
||||
"job_id": job_id,
|
||||
"allocation_id": alloc_id,
|
||||
"task_name": task_name,
|
||||
"message": f"Retrieved logs for job {job_id}",
|
||||
"logs": {
|
||||
"stdout": stdout_logs,
|
||||
"stderr": stderr_logs
|
||||
}
|
||||
}
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error getting logs for job {job_id}: {str(e)}")
|
||||
return {
|
||||
"success": False,
|
||||
"job_id": job_id,
|
||||
"message": f"Error getting logs: {str(e)}",
|
||||
"logs": None
|
||||
}
|
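The `/job-logs/{job_id}` endpoint above always reads from the most recent allocation and falls back to a default task name when the allocation reports no `TaskStates`. A minimal standalone sketch of that selection logic (the helper name `pick_latest_task` and the sample allocation dicts are illustrative, not part of the service):

```python
def pick_latest_task(allocations, default_task="app"):
    """Pick the newest allocation (by CreateTime) and a task name from its TaskStates."""
    latest = sorted(allocations, key=lambda a: a.get("CreateTime", 0), reverse=True)[0]
    task_states = latest.get("TaskStates") or {}
    # First task name if any exist, otherwise the default used by the endpoint
    task_name = next(iter(task_states), default_task)
    return latest.get("ID"), task_name

allocs = [
    {"ID": "a1", "CreateTime": 100, "TaskStates": {"web": {}}},
    {"ID": "a2", "CreateTime": 200, "TaskStates": {}},
]
print(pick_latest_task(allocs))  # ('a2', 'app')
```

The same sort-by-`CreateTime`-descending pattern recurs in the jobs and logs routers below.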
80
app/routers/configs.py
Normal file
@ -0,0 +1,80 @@
from fastapi import APIRouter, HTTPException, Body, Path
from typing import List, Dict, Any
import json

from app.services.config_service import ConfigService
from app.schemas.config import ConfigCreate, ConfigUpdate, ConfigResponse

router = APIRouter()
config_service = ConfigService()

@router.get("/", response_model=List[ConfigResponse])
async def list_configs():
    """List all available configurations."""
    return config_service.list_configs()

@router.get("/{name}", response_model=ConfigResponse)
async def get_config(name: str = Path(..., description="Configuration name")):
    """Get a specific configuration by name."""
    return config_service.get_config(name)

@router.post("/", response_model=ConfigResponse, status_code=201)
async def create_config(config_data: ConfigCreate):
    """Create a new configuration."""
    return config_service.create_config(config_data.name, config_data.dict(exclude={"name"}))

@router.put("/{name}", response_model=ConfigResponse)
async def update_config(name: str, config_data: ConfigUpdate):
    """Update an existing configuration."""
    return config_service.update_config(name, config_data.dict(exclude_unset=True))

@router.delete("/{name}", response_model=Dict[str, Any])
async def delete_config(name: str = Path(..., description="Configuration name")):
    """Delete a configuration."""
    return config_service.delete_config(name)

@router.get("/repository/{repository}")
async def get_config_by_repository(repository: str):
    """Find configuration by repository."""
    configs = config_service.list_configs()

    for config in configs:
        if config.get("repository") == repository:
            return config

    raise HTTPException(status_code=404, detail=f"No configuration found for repository: {repository}")

@router.get("/job/{job_id}")
async def get_config_by_job(job_id: str):
    """Find configuration by job ID."""
    configs = config_service.list_configs()

    for config in configs:
        if config.get("job_id") == job_id:
            return config

    raise HTTPException(status_code=404, detail=f"No configuration found for job_id: {job_id}")

@router.post("/link")
async def link_repository_to_job(
    repository: str = Body(..., embed=True),
    job_id: str = Body(..., embed=True),
    name: str = Body(None, embed=True)
):
    """Link a repository to a job."""
    # Generate a name if not provided
    if not name:
        name = f"{job_id.lower().replace('/', '_').replace(' ', '_')}"

    # Create the config
    config = {
        "repository": repository,
        "job_id": job_id,
    }

    return config_service.create_config(name, config)

@router.post("/unlink/{name}")
async def unlink_repository_from_job(name: str):
    """Unlink a repository from a job by deleting the configuration."""
    return config_service.delete_config(name)
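When `/link` is called without a `name`, the router derives one from the job ID by lowercasing and replacing slashes and spaces with underscores. A small sketch of that derivation (the function name `default_config_name` is illustrative only):

```python
def default_config_name(job_id):
    """Derive a config name from a job ID, as the /link endpoint does when name is omitted."""
    return job_id.lower().replace('/', '_').replace(' ', '_')

print(default_config_name("My Team/API Service"))  # my_team_api_service
```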
396
app/routers/jobs.py
Normal file
@ -0,0 +1,396 @@
from fastapi import APIRouter, Depends, HTTPException, Body, Query
from typing import Dict, Any, List, Optional
import json
import logging

from app.services.nomad_client import NomadService
from app.services.config_service import ConfigService
from app.schemas.job import JobResponse, JobOperation, JobSpecification

router = APIRouter()
nomad_service = NomadService()
config_service = ConfigService()

# Configure logging
logger = logging.getLogger(__name__)

@router.get("/", response_model=List[JobResponse])
async def list_jobs():
    """List all jobs."""
    jobs = nomad_service.list_jobs()
    # Enhance job responses with repository information if available
    for job in jobs:
        job_id = job.get("ID")
        if job_id:
            repository = config_service.get_repository_from_job(job_id)
            if repository:
                job["repository"] = repository
    return jobs

@router.get("/{job_id}", response_model=JobResponse)
async def get_job(job_id: str):
    """Get a job by ID."""
    job = nomad_service.get_job(job_id)
    # Add repository information if available
    repository = config_service.get_repository_from_job(job_id)
    if repository:
        job["repository"] = repository
    return job

@router.post("/", response_model=JobOperation)
async def start_job(job_spec: JobSpecification = Body(...)):
    """Start a Nomad job with the provided specification."""
    return nomad_service.start_job(job_spec.dict())

@router.delete("/{job_id}", response_model=JobOperation)
async def stop_job(job_id: str, purge: bool = Query(False)):
    """Stop a job by ID."""
    return nomad_service.stop_job(job_id, purge)

@router.get("/{job_id}/allocations")
async def get_job_allocations(job_id: str):
    """Get all allocations for a job."""
    return nomad_service.get_allocations(job_id)

@router.get("/{job_id}/latest-allocation")
async def get_latest_allocation(job_id: str):
    """Get the latest allocation for a job."""
    allocations = nomad_service.get_allocations(job_id)
    if not allocations:
        raise HTTPException(status_code=404, detail=f"No allocations found for job {job_id}")

    # Sort allocations by creation time (descending)
    sorted_allocations = sorted(
        allocations,
        key=lambda a: a.get("CreateTime", 0),
        reverse=True
    )

    return sorted_allocations[0]

@router.get("/{job_id}/status")
async def get_job_status(job_id: str, namespace: str = Query(None, description="Nomad namespace")):
    """Get the current status of a job, including deployment and latest allocation."""
    try:
        # Create a custom service with the specific namespace if provided
        custom_nomad = NomadService()
        if namespace:
            custom_nomad.namespace = namespace
            logger.info(f"Getting job status for {job_id} in namespace {namespace}")
        else:
            logger.info(f"Getting job status for {job_id} in default namespace (development)")

        job = custom_nomad.get_job(job_id)
        status = {
            "job_id": job_id,
            "namespace": namespace or custom_nomad.namespace,
            "status": job.get("Status", "unknown"),
            "stable": job.get("Stable", False),
            "submitted_at": job.get("SubmitTime", 0),
        }

        # Get the latest deployment if any
        try:
            deployment = custom_nomad.get_deployment_status(job_id)
            if deployment:
                status["deployment"] = {
                    "id": deployment.get("ID"),
                    "status": deployment.get("Status"),
                    "description": deployment.get("StatusDescription"),
                }
        except Exception as e:
            logger.warning(f"Failed to get deployment for job {job_id}: {str(e)}")
            pass  # Deployment info is optional

        # Get the latest allocation if any
        try:
            allocations = custom_nomad.get_allocations(job_id)
            if allocations:
                sorted_allocations = sorted(
                    allocations,
                    key=lambda a: a.get("CreateTime", 0),
                    reverse=True
                )
                latest_alloc = sorted_allocations[0]
                status["latest_allocation"] = {
                    "id": latest_alloc.get("ID"),
                    "status": latest_alloc.get("ClientStatus"),
                    "description": latest_alloc.get("ClientDescription", ""),
                    "created_at": latest_alloc.get("CreateTime", 0),
                }
        except Exception as e:
            logger.warning(f"Failed to get allocations for job {job_id}: {str(e)}")
            pass  # Allocation info is optional

        return status
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Failed to get job status: {str(e)}")

@router.get("/{job_id}/specification")
async def get_job_specification(job_id: str, namespace: str = Query(None, description="Nomad namespace"), raw: bool = Query(False)):
    """Get the job specification for a job."""
    try:
        # Create a custom service with the specific namespace if provided
        custom_nomad = NomadService()
        if namespace:
            custom_nomad.namespace = namespace
            logger.info(f"Getting job specification for {job_id} in namespace {namespace}")
        else:
            logger.info(f"Getting job specification for {job_id} in default namespace (development)")

        job = custom_nomad.get_job(job_id)

        if raw:
            return job

        # Extract just the job specification part if present
        if "JobID" in job:
            job_spec = {
                "id": job.get("ID"),
                "name": job.get("Name"),
                "type": job.get("Type"),
                "status": job.get("Status"),
                "datacenters": job.get("Datacenters", []),
                "namespace": job.get("Namespace"),
                "task_groups": job.get("TaskGroups", []),
                "meta": job.get("Meta", {}),
            }
            return job_spec

        return job
    except Exception as e:
        raise HTTPException(status_code=404, detail=f"Failed to get job specification: {str(e)}")

@router.post("/{job_id}/restart")
async def restart_job(job_id: str):
    """Restart a job by stopping it and starting it again."""
    try:
        # Get the current job specification
        job_spec = nomad_service.get_job(job_id)

        # Stop the job
        nomad_service.stop_job(job_id)

        # Start the job with the original specification
        result = nomad_service.start_job(job_spec)

        return {
            "job_id": job_id,
            "status": "restarted",
            "eval_id": result.get("eval_id"),
        }
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Failed to restart job: {str(e)}")

@router.get("/by-repository/{repository}")
async def get_job_by_repository(repository: str):
    """Get job information by repository URL or name."""
    job_info = config_service.get_job_from_repository(repository)
    if not job_info:
        raise HTTPException(status_code=404, detail=f"No job found for repository: {repository}")

    job_id = job_info.get("job_id")
    namespace = job_info.get("namespace")

    # Get the job using the specific namespace if provided
    try:
        if namespace:
            # Override the default namespace with the specific one
            custom_nomad = NomadService()
            custom_nomad.namespace = namespace
            job = custom_nomad.get_job(job_id)
        else:
            # Use the default namespace settings
            job = nomad_service.get_job(job_id)

        # Add repository information
        job["repository"] = repository
        return job
    except Exception as e:
        raise HTTPException(status_code=404, detail=f"Job not found: {job_id}, Error: {str(e)}")

@router.post("/by-repository/{repository}/start")
async def start_job_by_repository(repository: str):
    """Start a job by its associated repository."""
    logger = logging.getLogger(__name__)

    job_info = config_service.get_job_from_repository(repository)
    if not job_info:
        raise HTTPException(status_code=404, detail=f"No job found for repository: {repository}")

    job_id = job_info.get("job_id")
    namespace = job_info.get("namespace")

    logger.info(f"Starting job for repository {repository}, job_id: {job_id}, namespace: {namespace}")

    # Create a custom service with the specific namespace if provided
    custom_nomad = NomadService()
    if namespace:
        logger.info(f"Setting custom_nomad.namespace to {namespace}")
        custom_nomad.namespace = namespace

    # Log the current namespace being used
    logger.info(f"Nomad client namespace: {custom_nomad.namespace}")

    try:
        # Get the job specification from an existing job
        job_spec = custom_nomad.get_job(job_id)

        # Log the job specification
        logger.info(f"Retrieved job specification for {job_id} from existing job")

        # Ensure namespace is set in job spec
        if isinstance(job_spec, dict):
            # Ensure namespace is explicitly set
            if namespace:
                logger.info(f"Setting namespace in job spec to {namespace}")
                job_spec["Namespace"] = namespace

            # Log the keys in the job specification
            logger.info(f"Job spec keys: {job_spec.keys()}")

        # Start the job with the retrieved specification
        result = custom_nomad.start_job(job_spec)

        return {
            "job_id": job_id,
            "repository": repository,
            "status": "started",
            "eval_id": result.get("eval_id"),
            "namespace": namespace
        }
    except HTTPException as e:
        # If job not found, try to get spec from config
        if e.status_code == 404:
            logger.info(f"Job {job_id} not found, attempting to get specification from config")

            # Try to get job spec from repository config
            job_spec = config_service.get_job_spec_from_repository(repository)

            if not job_spec:
                logger.warning(f"No job specification found for repository {repository}, creating a default one")

                # Create a simple default job spec if none exists
                job_spec = {
                    "ID": job_id,
                    "Name": job_id,
                    "Type": "service",
                    "Datacenters": ["jm"],  # Default datacenter
                    "TaskGroups": [
                        {
                            "Name": "app",
                            "Count": 1,
                            "Tasks": [
                                {
                                    "Name": job_id.split('-')[0],  # Use first part of job ID as task name
                                    "Driver": "docker",
                                    "Config": {
                                        "image": f"registry.dev.meisheng.group/{repository}:latest",
                                        "force_pull": True,
                                        "ports": ["http"]
                                    },
                                    "Resources": {
                                        "CPU": 500,
                                        "MemoryMB": 512
                                    }
                                }
                            ],
                            "Networks": [
                                {
                                    "DynamicPorts": [
                                        {
                                            "Label": "http",
                                            "Value": 0,
                                            "To": 8000
                                        }
                                    ]
                                }
                            ]
                        }
                    ],
                    "Meta": {
                        "repository": repository
                    }
                }

            # Set the namespace explicitly in the job spec
            if namespace:
                logger.info(f"Setting namespace in default job spec to {namespace}")
                job_spec["Namespace"] = namespace

            logger.info(f"Starting job {job_id} with specification")

            # Log the job specification structure
            if isinstance(job_spec, dict):
                logger.info(f"Job spec keys: {job_spec.keys()}")
                if "Namespace" in job_spec:
                    logger.info(f"Job spec namespace: {job_spec['Namespace']}")

            # Start the job with the specification
            result = custom_nomad.start_job(job_spec)

            return {
                "job_id": job_id,
                "repository": repository,
                "status": "started",
                "eval_id": result.get("eval_id"),
                "namespace": namespace
            }

@router.post("/by-repository/{repository}/stop")
async def stop_job_by_repository(repository: str, purge: bool = Query(False)):
    """Stop a job by its associated repository."""
    job_info = config_service.get_job_from_repository(repository)
    if not job_info:
        raise HTTPException(status_code=404, detail=f"No job found for repository: {repository}")

    job_id = job_info.get("job_id")
    namespace = job_info.get("namespace")

    # Create a custom service with the specific namespace if provided
    custom_nomad = NomadService()
    if namespace:
        custom_nomad.namespace = namespace

    # Stop the job
    result = custom_nomad.stop_job(job_id, purge)

    return {
        "job_id": job_id,
        "repository": repository,
        "status": "stopped",
        "eval_id": result.get("eval_id"),
        "namespace": namespace
    }

@router.post("/by-repository/{repository}/restart")
async def restart_job_by_repository(repository: str):
    """Restart a job by its associated repository."""
    job_info = config_service.get_job_from_repository(repository)
    if not job_info:
        raise HTTPException(status_code=404, detail=f"No job found for repository: {repository}")

    job_id = job_info.get("job_id")
    namespace = job_info.get("namespace")

    # Create a custom service with the specific namespace if provided
    custom_nomad = NomadService()
    if namespace:
        custom_nomad.namespace = namespace

    # Get the job specification
    job_spec = custom_nomad.get_job(job_id)

    # Stop the job first
    custom_nomad.stop_job(job_id)

    # Start the job with the original specification
    result = custom_nomad.start_job(job_spec)

    return {
        "job_id": job_id,
        "repository": repository,
        "status": "restarted",
        "eval_id": result.get("eval_id"),
        "namespace": namespace
    }
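The fallback job spec in `start_job_by_repository` derives its task name from the first hyphen-separated segment of the job ID and builds the Docker image reference from the registry and repository name. A sketch of just that derivation (the helper name `default_task_and_image` is illustrative; the registry default mirrors the one in the spec above):

```python
def default_task_and_image(job_id, repository, registry="registry.dev.meisheng.group"):
    """Mirror how the fallback job spec derives its task name and Docker image."""
    task_name = job_id.split('-')[0]  # first part of the job ID
    image = f"{registry}/{repository}:latest"
    return task_name, image

print(default_task_and_image("api-service-dev", "team/api-service"))
# ('api', 'registry.dev.meisheng.group/team/api-service:latest')
```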
293
app/routers/logs.py
Normal file
@ -0,0 +1,293 @@
from fastapi import APIRouter, HTTPException, Query
from typing import List, Dict, Any, Optional
import logging

from app.services.nomad_client import NomadService
from app.services.config_service import ConfigService

# Configure logging
logger = logging.getLogger(__name__)

router = APIRouter()
nomad_service = NomadService()
config_service = ConfigService()

# More specific routes first
@router.get("/repository/{repository}")
async def get_repository_logs(
    repository: str,
    log_type: str = Query("stderr", description="Log type: stdout or stderr"),
    limit: int = Query(1, description="Number of allocations to return logs for"),
    plain_text: bool = Query(False, description="Return plain text logs instead of JSON")
):
    """Get logs for a repository's associated job."""
    # Get the job info for the repository
    job_info = config_service.get_job_from_repository(repository)
    if not job_info:
        raise HTTPException(status_code=404, detail=f"No job found for repository: {repository}")

    job_id = job_info.get("job_id")
    namespace = job_info.get("namespace")

    logger.info(f"Getting logs for job {job_id} in namespace {namespace}")

    # Create a custom service with the specific namespace if provided
    custom_nomad = NomadService()
    if namespace:
        custom_nomad.namespace = namespace

    # Get allocations for the job
    allocations = custom_nomad.get_allocations(job_id)
    if not allocations:
        raise HTTPException(status_code=404, detail=f"No allocations found for job {job_id}")

    logger.info(f"Found {len(allocations)} allocations for job {job_id}")

    # Sort allocations by creation time (descending)
    sorted_allocations = sorted(
        allocations,
        key=lambda a: a.get("CreateTime", 0),
        reverse=True
    )

    # Limit the number of allocations
    allocations_to_check = sorted_allocations[:limit]

    # Also get the job info to determine task names
    job = custom_nomad.get_job(job_id)

    # Collect logs for each allocation and task
    result = []
    error_messages = []

    for alloc in allocations_to_check:
        # Use the full UUID of the allocation
        alloc_id = alloc.get("ID")
        if not alloc_id:
            logger.warning(f"Allocation ID not found in allocation data")
            error_messages.append("Allocation ID not found in allocation data")
            continue

        logger.info(f"Processing allocation {alloc_id} for job {job_id}")

        # Get task name from the allocation's TaskStates
        task_states = alloc.get("TaskStates", {})
        if not task_states:
            logger.warning(f"No task states found in allocation {alloc_id}")
            error_messages.append(f"No task states found in allocation {alloc_id}")

        for task_name, task_state in task_states.items():
            try:
                logger.info(f"Retrieving logs for allocation {alloc_id}, task {task_name}")

                logs = custom_nomad.get_allocation_logs(alloc_id, task_name, log_type)

                # Check if logs is an error message
                if logs and isinstance(logs, str):
                    if logs.startswith("Error:") or logs.startswith("No "):
                        logger.warning(f"Error retrieving logs for {task_name}: {logs}")
                        error_messages.append(logs)
                        continue

                # Only add if we got some logs
                if logs:
                    result.append({
                        "alloc_id": alloc_id,
                        "task": task_name,
                        "type": log_type,
                        "create_time": alloc.get("CreateTime"),
                        "logs": logs
                    })
                    logger.info(f"Successfully retrieved logs for {task_name}")
                else:
                    error_msg = f"No logs found for {task_name}"
                    logger.warning(error_msg)
                    error_messages.append(error_msg)
            except Exception as e:
                # Log but continue to try other tasks
                error_msg = f"Failed to get logs for {alloc_id}/{task_name}: {str(e)}"
                logger.error(error_msg)
                error_messages.append(error_msg)

    # Return as plain text if requested
    if plain_text:
        if not result:
            if error_messages:
                return f"No logs found for this job. Errors: {'; '.join(error_messages)}"
            return "No logs found for this job"
        return "\n\n".join([f"=== {r.get('task')} ===\n{r.get('logs')}" for r in result])

    # Otherwise return as JSON
    return {
        "job_id": job_id,
        "repository": repository,
        "namespace": namespace,
        "allocation_logs": result,
        "errors": error_messages if error_messages else None
    }

@router.get("/job/{job_id}")
async def get_job_logs(
    job_id: str,
    namespace: str = Query(None, description="Nomad namespace"),
    log_type: str = Query("stderr", description="Log type: stdout or stderr"),
    limit: int = Query(1, description="Number of allocations to return logs for"),
    plain_text: bool = Query(False, description="Return plain text logs instead of JSON")
):
    """Get logs for the most recent allocations of a job."""
    # Create a custom service with the specific namespace if provided
    custom_nomad = NomadService()
    if namespace:
        custom_nomad.namespace = namespace
        logger.info(f"Getting logs for job {job_id} in namespace {namespace}")
    else:
        logger.info(f"Getting logs for job {job_id} in default namespace")

    # Get all allocations for the job
    allocations = custom_nomad.get_allocations(job_id)
    if not allocations:
        raise HTTPException(status_code=404, detail=f"No allocations found for job {job_id}")

    logger.info(f"Found {len(allocations)} allocations for job {job_id}")

    # Sort allocations by creation time (descending)
    sorted_allocations = sorted(
        allocations,
        key=lambda a: a.get("CreateTime", 0),
        reverse=True
    )

    # Limit the number of allocations
    allocations_to_check = sorted_allocations[:limit]

    # Collect logs for each allocation and task
    result = []
    for alloc in allocations_to_check:
        alloc_id = alloc.get("ID")
        if not alloc_id:
            logger.warning(f"Allocation ID not found in allocation data")
            continue

        logger.info(f"Processing allocation {alloc_id} for job {job_id}")

        # Get task names from the allocation's TaskStates
        task_states = alloc.get("TaskStates", {})
        for task_name, task_state in task_states.items():
            try:
                logger.info(f"Retrieving logs for allocation {alloc_id}, task {task_name}")

                logs = custom_nomad.get_allocation_logs(alloc_id, task_name, log_type)
                # Only add if we got some logs and not an error message
                if logs and not logs.startswith("No") and not logs.startswith("Error"):
                    result.append({
                        "alloc_id": alloc_id,
                        "task": task_name,
                        "type": log_type,
                        "create_time": alloc.get("CreateTime"),
                        "logs": logs
                    })
                    logger.info(f"Successfully retrieved logs for {task_name}")
                else:
                    logger.warning(f"No logs found for {task_name}: {logs}")
            except Exception as e:
                # Log but continue to try other tasks
                logger.error(f"Failed to get logs for {alloc_id}/{task_name}: {str(e)}")

    # Return as plain text if requested
    if plain_text:
        if not result:
            return "No logs found for this job"
        return "\n\n".join([f"=== {r.get('task')} ===\n{r.get('logs')}" for r in result])

    # Otherwise return as JSON
    return {
        "job_id": job_id,
        "namespace": namespace,
        "allocation_logs": result
    }

@router.get("/latest/{job_id}")
async def get_latest_allocation_logs(
    job_id: str,
    log_type: str = Query("stderr", description="Log type: stdout or stderr"),
    plain_text: bool = Query(False, description="Return plain text logs instead of JSON")
):
    """Get logs from the latest allocation of a job."""
    # Get all allocations for the job
    allocations = nomad_service.get_allocations(job_id)
    if not allocations:
        raise HTTPException(status_code=404, detail=f"No allocations found for job {job_id}")

    # Sort allocations by creation time (descending)
    sorted_allocations = sorted(
        allocations,
        key=lambda a: a.get("CreateTime", 0),
        reverse=True
    )

    # Get the latest allocation
    latest_alloc = sorted_allocations[0]
    alloc_id = latest_alloc.get("ID")

    # Get task group and task information
    job = nomad_service.get_job(job_id)
    task_groups = job.get("TaskGroups", [])

    # Collect logs for each task in the latest allocation
    result = []
    for task_group in task_groups:
        tasks = task_group.get("Tasks", [])
        for task in tasks:
            task_name = task.get("Name")
            try:
                logs = nomad_service.get_allocation_logs(alloc_id, task_name, log_type)
                result.append({
                    "alloc_id": alloc_id,
                    "task": task_name,
                    "type": log_type,
                    "create_time": latest_alloc.get("CreateTime"),
                    "logs": logs
                })
            except Exception as e:
                # Skip if logs cannot be retrieved for this task
                pass

    # Return as plain text if requested
    if plain_text:
        return "\n\n".join([f"=== {r['task']} ===\n{r['logs']}" for r in result])

    # Otherwise return as JSON
    return {
        "job_id": job_id,
        "latest_allocation": alloc_id,
        "task_logs": result
    }

@router.get("/build/{job_id}")
async def get_build_logs(job_id: str, plain_text: bool = Query(False)):
    """Get build logs for a job (usually stderr logs from the latest allocation)."""
    # This is a convenience endpoint that returns stderr logs from the latest allocation
    return await get_latest_allocation_logs(job_id, "stderr", plain_text)

# Generic allocation logs route last
@router.get("/allocation/{alloc_id}/{task}")
async def get_allocation_logs(
    alloc_id: str,
    task: str,
    log_type: str = Query("stderr", description="Log type: stdout or stderr"),
    plain_text: bool = Query(False, description="Return plain text logs instead of JSON")
):
    """Get logs for a specific allocation and task."""
    # Validate log_type
    if log_type not in ["stdout", "stderr"]:
        raise HTTPException(status_code=400, detail="Log type must be stdout or stderr")

    # Get logs from Nomad
    logs = nomad_service.get_allocation_logs(alloc_id, task, log_type)

    # Return as plain text if requested
    if plain_text:
        return logs

    # Otherwise return as JSON
    return {"alloc_id": alloc_id, "task": task, "type": log_type, "logs": logs}
89  app/routers/repositories.py  Normal file
@@ -0,0 +1,89 @@
from fastapi import APIRouter, HTTPException, Query
from typing import List, Dict, Any, Optional

from app.services.gitea_client import GiteaClient
from app.services.config_service import ConfigService

router = APIRouter()
gitea_client = GiteaClient()
config_service = ConfigService()


@router.get("/")
async def list_repositories(limit: int = Query(100, description="Maximum number of repositories to return")):
    """
    List all available repositories from Gitea.

    If Gitea integration is not configured, returns an empty list.
    """
    repositories = gitea_client.list_repositories(limit)

    # Enhance with linked job information
    for repo in repositories:
        # Create a URL from clone_url
        repo_url = repo.get("clone_url")
        if repo_url:
            # Check if repository is linked to a job
            configs = config_service.list_configs()
            for config in configs:
                if config.get("repository") == repo_url:
                    repo["linked_job"] = config.get("job_id")
                    repo["config_name"] = config.get("name")
                    break

    return repositories


@router.get("/{repository}")
async def get_repository_info(repository: str):
    """
    Get information about a specific repository.

    The repository parameter can be a repository URL or a repository alias.
    If it's a repository URL, we'll get the info directly from Gitea.
    If it's a repository alias, we'll get the info from the configuration and then from Gitea.
    """
    # First check if it's a repository URL
    repo_info = gitea_client.get_repository_info(repository)

    if repo_info:
        # Check if repository is linked to a job
        configs = config_service.list_configs()
        for config in configs:
            if config.get("repository") == repository:
                repo_info["linked_job"] = config.get("job_id")
                repo_info["config_name"] = config.get("name")
                repo_info["config"] = config
                break

        return repo_info
    else:
        # Check if it's a repository alias in our configs
        config = config_service.get_config_by_repository(repository)
        if config:
            repo_url = config.get("repository")
            repo_info = gitea_client.get_repository_info(repo_url)

            if repo_info:
                repo_info["linked_job"] = config.get("job_id")
                repo_info["config_name"] = config.get("name")
                repo_info["config"] = config
                return repo_info

    raise HTTPException(status_code=404, detail=f"Repository not found: {repository}")


@router.get("/{repository}/branches")
async def get_repository_branches(repository: str):
    """
    Get branches for a specific repository.

    The repository parameter can be a repository URL or a repository alias.
    """
    # If it's a repository alias, get the actual URL
    config = config_service.get_config_by_repository(repository)
    if config:
        repository = config.get("repository")

    branches = gitea_client.get_repository_branches(repository)
    if not branches:
        raise HTTPException(status_code=404, detail=f"No branches found for repository: {repository}")

    return branches
1  app/schemas/__init__.py  Normal file
@@ -0,0 +1 @@
# Import schemas
BIN  app/schemas/__pycache__/__init__.cpython-313.pyc  Normal file  Binary file not shown.
BIN  app/schemas/__pycache__/claude_api.cpython-313.pyc  Normal file  Binary file not shown.
BIN  app/schemas/__pycache__/config.cpython-313.pyc  Normal file  Binary file not shown.
BIN  app/schemas/__pycache__/job.cpython-313.pyc  Normal file  Binary file not shown.
78  app/schemas/claude_api.py  Normal file
@@ -0,0 +1,78 @@
from pydantic import BaseModel, Field
from typing import Dict, Any, List, Optional, Union


class ClaudeJobRequest(BaseModel):
    """Request model for Claude to start or manage a job"""
    job_id: str = Field(..., description="The ID of the job to manage")
    action: str = Field(..., description="Action to perform: start, stop, restart, status")
    namespace: Optional[str] = Field("development", description="Nomad namespace")
    purge: Optional[bool] = Field(False, description="Whether to purge the job when stopping")


class ClaudeJobSpecification(BaseModel):
    """Simplified job specification for Claude to create a new job"""
    job_id: str = Field(..., description="The ID for the new job")
    name: Optional[str] = Field(None, description="Name of the job (defaults to job_id)")
    type: str = Field("service", description="Job type: service, batch, or system")
    datacenters: List[str] = Field(["jm"], description="List of datacenters")
    namespace: str = Field("development", description="Nomad namespace")
    docker_image: str = Field(..., description="Docker image to run")
    count: int = Field(1, description="Number of instances to run")
    cpu: int = Field(100, description="CPU resources in MHz")
    memory: int = Field(128, description="Memory in MB")
    ports: Optional[List[Dict[str, Any]]] = Field(None, description="Port mappings")
    env_vars: Optional[Dict[str, str]] = Field(None, description="Environment variables")

    def to_nomad_job_spec(self) -> Dict[str, Any]:
        """Convert to Nomad job specification format"""
        # Create a task with the specified Docker image
        task = {
            "Name": "app",
            "Driver": "docker",
            "Config": {
                "image": self.docker_image,
            },
            "Resources": {
                "CPU": self.cpu,
                "MemoryMB": self.memory
            }
        }

        # Add environment variables if specified
        if self.env_vars:
            task["Env"] = self.env_vars

        # Create network configuration
        network = {}
        if self.ports:
            network["DynamicPorts"] = self.ports
            task["Config"]["ports"] = [port["Label"] for port in self.ports]

        # Create the full job specification
        job_spec = {
            "ID": self.job_id,
            "Name": self.name or self.job_id,
            "Type": self.type,
            "Datacenters": self.datacenters,
            "Namespace": self.namespace,
            "TaskGroups": [
                {
                    "Name": "app",
                    "Count": self.count,
                    "Tasks": [task],
                    "Networks": [network] if network else []
                }
            ]
        }

        return job_spec


class ClaudeJobResponse(BaseModel):
    """Response model for Claude job operations"""
    success: bool = Field(..., description="Whether the operation was successful")
    job_id: str = Field(..., description="The ID of the job")
    status: str = Field(..., description="Current status of the job")
    message: str = Field(..., description="Human-readable message about the operation")
    details: Optional[Dict[str, Any]] = Field(None, description="Additional details about the job")
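The conversion performed by `to_nomad_job_spec` can be sketched as a plain function, without Pydantic, to show the shape of the resulting Nomad job spec. This is an illustrative rewrite, not the service's actual code path; field names and defaults mirror the schema above.

```python
# Minimal sketch of the ClaudeJobSpecification -> Nomad job spec conversion,
# written without Pydantic for illustration only.
from typing import Any, Dict, List, Optional


def to_nomad_job_spec(
    job_id: str,
    docker_image: str,
    name: Optional[str] = None,
    job_type: str = "service",
    datacenters: Optional[List[str]] = None,
    namespace: str = "development",
    count: int = 1,
    cpu: int = 100,
    memory: int = 128,
    ports: Optional[List[Dict[str, Any]]] = None,
    env_vars: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    # One Docker task with the requested resources
    task: Dict[str, Any] = {
        "Name": "app",
        "Driver": "docker",
        "Config": {"image": docker_image},
        "Resources": {"CPU": cpu, "MemoryMB": memory},
    }
    if env_vars:
        task["Env"] = env_vars

    # Dynamic ports are declared on the network and referenced by label
    network: Dict[str, Any] = {}
    if ports:
        network["DynamicPorts"] = ports
        task["Config"]["ports"] = [port["Label"] for port in ports]

    return {
        "ID": job_id,
        "Name": name or job_id,
        "Type": job_type,
        "Datacenters": datacenters or ["jm"],
        "Namespace": namespace,
        "TaskGroups": [
            {
                "Name": "app",
                "Count": count,
                "Tasks": [task],
                "Networks": [network] if network else [],
            }
        ],
    }


spec = to_nomad_job_spec("demo", "nginx:alpine", ports=[{"Label": "http", "To": 80}])
print(spec["TaskGroups"][0]["Tasks"][0]["Config"]["ports"])  # ['http']
```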
56  app/schemas/config.py  Normal file
@@ -0,0 +1,56 @@
from pydantic import BaseModel, Field
from typing import Dict, Any, Optional


class ConfigBase(BaseModel):
    """Base class for configuration schemas."""
    repository: str = Field(..., description="Repository URL or identifier")
    job_id: str = Field(..., description="Nomad job ID")
    description: Optional[str] = Field(None, description="Description of this configuration")
    repository_alias: Optional[str] = Field(None, description="Short name or alias for the repository")

    # Additional metadata can be stored in the meta field
    meta: Optional[Dict[str, Any]] = Field(None, description="Additional metadata")


class ConfigCreate(ConfigBase):
    """Schema for creating a new configuration."""
    name: str = Field(..., description="Configuration name (used as the file name)")


class ConfigUpdate(BaseModel):
    """Schema for updating an existing configuration."""
    repository: Optional[str] = Field(None, description="Repository URL or identifier")
    job_id: Optional[str] = Field(None, description="Nomad job ID")
    description: Optional[str] = Field(None, description="Description of this configuration")
    repository_alias: Optional[str] = Field(None, description="Short name or alias for the repository")
    meta: Optional[Dict[str, Any]] = Field(None, description="Additional metadata")


class ConfigResponse(ConfigBase):
    """Schema for configuration response."""
    name: str = Field(..., description="Configuration name")
    repository_info: Optional[Dict[str, Any]] = Field(None, description="Repository information from Gitea if available")

    class Config:
        schema_extra = {
            "example": {
                "name": "my-web-app",
                "repository": "http://gitea.internal.example.com/username/repo-name",
                "repository_alias": "web-app",
                "job_id": "web-app",
                "description": "Web application running in Nomad",
                "meta": {
                    "owner": "devops-team",
                    "environment": "production"
                },
                "repository_info": {
                    "description": "A web application",
                    "default_branch": "main",
                    "stars": 5,
                    "forks": 2,
                    "owner": "username",
                    "html_url": "http://gitea.internal.example.com/username/repo-name"
                }
            }
        }
80  app/schemas/job.py  Normal file
@@ -0,0 +1,80 @@
from pydantic import BaseModel, Field
from typing import Dict, Any, List, Optional


class JobSpecification(BaseModel):
    """
    Nomad job specification. This is a simplified schema as the actual
    Nomad job spec is quite complex and varies by job type.
    """
    id: Optional[str] = Field(None, description="Job ID")
    ID: Optional[str] = Field(None, description="Job ID (Nomad format)")
    name: Optional[str] = Field(None, description="Job name")
    Name: Optional[str] = Field(None, description="Job name (Nomad format)")
    type: Optional[str] = Field(None, description="Job type (service, batch, system)")
    Type: Optional[str] = Field(None, description="Job type (Nomad format)")
    datacenters: Optional[List[str]] = Field(None, description="List of datacenters")
    Datacenters: Optional[List[str]] = Field(None, description="List of datacenters (Nomad format)")
    task_groups: Optional[List[Dict[str, Any]]] = Field(None, description="Task groups")
    TaskGroups: Optional[List[Dict[str, Any]]] = Field(None, description="Task groups (Nomad format)")
    meta: Optional[Dict[str, str]] = Field(None, description="Job metadata")
    Meta: Optional[Dict[str, str]] = Field(None, description="Job metadata (Nomad format)")

    # Allow additional fields (to handle the complete Nomad job spec)
    class Config:
        extra = "allow"


class JobOperation(BaseModel):
    """Response after a job operation (start, stop, etc.)"""
    job_id: str = Field(..., description="The ID of the job")
    eval_id: Optional[str] = Field(None, description="The evaluation ID")
    status: str = Field(..., description="The status of the operation")
    warnings: Optional[str] = Field(None, description="Any warnings from Nomad")


class JobResponse(BaseModel):
    """
    Job response schema. This is a simplified version as the actual
    Nomad job response is quite complex and varies by job type.
    """
    ID: str = Field(..., description="Job ID")
    Name: str = Field(..., description="Job name")
    Status: str = Field(..., description="Job status")
    Type: str = Field(..., description="Job type")
    repository: Optional[str] = Field(None, description="Associated repository if any")

    # Allow additional fields (to handle the complete Nomad job response)
    class Config:
        extra = "allow"


class TaskGroup(BaseModel):
    """Task group schema."""
    Name: str
    Count: int
    Tasks: List[Dict[str, Any]]

    class Config:
        extra = "allow"


class Task(BaseModel):
    """Task schema."""
    Name: str
    Driver: str
    Config: Dict[str, Any]

    class Config:
        extra = "allow"


class Allocation(BaseModel):
    """Allocation schema."""
    ID: str
    JobID: str
    TaskGroup: str
    ClientStatus: str

    class Config:
        extra = "allow"
1  app/services/__init__.py  Normal file
@@ -0,0 +1 @@
# Import services
BIN  app/services/__pycache__/__init__.cpython-313.pyc  Normal file  Binary file not shown.
BIN  app/services/__pycache__/config_service.cpython-313.pyc  Normal file  Binary file not shown.
BIN  app/services/__pycache__/gitea_client.cpython-313.pyc  Normal file  Binary file not shown.
BIN  app/services/__pycache__/nomad_client.cpython-313.pyc  Normal file  Binary file not shown.
299  app/services/config_service.py  Normal file
@@ -0,0 +1,299 @@
import os
import yaml
import logging
import json
from typing import Dict, Any, Optional, List
from fastapi import HTTPException
from pathlib import Path

from app.services.gitea_client import GiteaClient

# Configure logging
logger = logging.getLogger(__name__)

# Default configs directory
CONFIG_DIR = os.getenv("CONFIG_DIR", "./configs")


class ConfigService:
    """Service for managing repository to job mappings."""

    def __init__(self, config_dir: str = CONFIG_DIR):
        self.config_dir = Path(config_dir)
        self._ensure_config_dir()
        self.gitea_client = GiteaClient()

    def _ensure_config_dir(self):
        """Ensure the config directory exists."""
        try:
            self.config_dir.mkdir(parents=True, exist_ok=True)
        except Exception as e:
            logger.error(f"Failed to create config directory {self.config_dir}: {str(e)}")
            raise HTTPException(status_code=500, detail=f"Failed to create config directory: {str(e)}")

    def list_configs(self) -> List[Dict[str, Any]]:
        """List all available configurations."""
        configs = []
        try:
            for file_path in self.config_dir.glob("*.yaml"):
                with open(file_path, "r") as f:
                    config = yaml.safe_load(f)
                    config["name"] = file_path.stem
                    configs.append(config)
            return configs
        except Exception as e:
            logger.error(f"Failed to list configurations: {str(e)}")
            raise HTTPException(status_code=500, detail=f"Failed to list configurations: {str(e)}")

    def get_config(self, name: str) -> Dict[str, Any]:
        """Get a specific configuration by name."""
        file_path = self.config_dir / f"{name}.yaml"
        try:
            if not file_path.exists():
                raise HTTPException(status_code=404, detail=f"Configuration not found: {name}")

            with open(file_path, "r") as f:
                config = yaml.safe_load(f)
                config["name"] = name

            # Enrich with repository information if available
            if repository := config.get("repository"):
                repo_info = self.gitea_client.get_repository_info(repository)
                if repo_info:
                    config["repository_info"] = {
                        "description": repo_info.get("description"),
                        "default_branch": repo_info.get("default_branch"),
                        "stars": repo_info.get("stars_count"),
                        "forks": repo_info.get("forks_count"),
                        "owner": repo_info.get("owner", {}).get("login"),
                        "html_url": repo_info.get("html_url"),
                    }

            return config
        except HTTPException:
            raise
        except Exception as e:
            logger.error(f"Failed to read configuration {name}: {str(e)}")
            raise HTTPException(status_code=500, detail=f"Failed to read configuration: {str(e)}")

    def create_config(self, name: str, config: Dict[str, Any]) -> Dict[str, Any]:
        """Create a new configuration."""
        file_path = self.config_dir / f"{name}.yaml"
        try:
            if file_path.exists():
                raise HTTPException(status_code=409, detail=f"Configuration already exists: {name}")

            # Validate required fields
            required_fields = ["repository", "job_id"]
            for field in required_fields:
                if field not in config:
                    raise HTTPException(status_code=400, detail=f"Missing required field: {field}")

            # Validate repository exists if Gitea integration is configured
            if not self.gitea_client.check_repository_exists(config["repository"]):
                raise HTTPException(status_code=400, detail=f"Repository not found: {config['repository']}")

            # Add name to the config
            config["name"] = name

            # Get repository alias if not provided
            if "repository_alias" not in config:
                try:
                    owner, repo = self.gitea_client.parse_repo_url(config["repository"])
                    config["repository_alias"] = repo
                except Exception:
                    # Use job_id as fallback
                    config["repository_alias"] = config["job_id"]

            # Write config to file
            with open(file_path, "w") as f:
                yaml.dump(config, f, default_flow_style=False)

            return config
        except HTTPException:
            raise
        except Exception as e:
            logger.error(f"Failed to create configuration {name}: {str(e)}")
            raise HTTPException(status_code=500, detail=f"Failed to create configuration: {str(e)}")

    def update_config(self, name: str, config: Dict[str, Any]) -> Dict[str, Any]:
        """Update an existing configuration."""
        file_path = self.config_dir / f"{name}.yaml"
        try:
            if not file_path.exists():
                raise HTTPException(status_code=404, detail=f"Configuration not found: {name}")

            # Read existing config
            with open(file_path, "r") as f:
                existing_config = yaml.safe_load(f)

            # Update with new values
            for key, value in config.items():
                existing_config[key] = value

            # Validate repository exists if changed and Gitea integration is configured
            if "repository" in config and config["repository"] != existing_config.get("repository"):
                if not self.gitea_client.check_repository_exists(config["repository"]):
                    raise HTTPException(status_code=400, detail=f"Repository not found: {config['repository']}")

            # Validate required fields
            required_fields = ["repository", "job_id"]
            for field in required_fields:
                if field not in existing_config:
                    raise HTTPException(status_code=400, detail=f"Missing required field: {field}")

            # Add name to the config
            existing_config["name"] = name

            # Update repository alias if repository changed
            if "repository" in config and "repository_alias" not in config:
                try:
                    owner, repo = self.gitea_client.parse_repo_url(existing_config["repository"])
                    existing_config["repository_alias"] = repo
                except Exception:
                    pass

            # Write config to file
            with open(file_path, "w") as f:
                yaml.dump(existing_config, f, default_flow_style=False)

            return existing_config
        except HTTPException:
            raise
        except Exception as e:
            logger.error(f"Failed to update configuration {name}: {str(e)}")
            raise HTTPException(status_code=500, detail=f"Failed to update configuration: {str(e)}")

    def delete_config(self, name: str) -> Dict[str, Any]:
        """Delete a configuration."""
        file_path = self.config_dir / f"{name}.yaml"
        try:
            if not file_path.exists():
                raise HTTPException(status_code=404, detail=f"Configuration not found: {name}")

            # Get the config before deleting
            with open(file_path, "r") as f:
                config = yaml.safe_load(f)
                config["name"] = name

            # Delete the file
            file_path.unlink()

            return {"name": name, "status": "deleted"}
        except HTTPException:
            raise
        except Exception as e:
            logger.error(f"Failed to delete configuration {name}: {str(e)}")
            raise HTTPException(status_code=500, detail=f"Failed to delete configuration: {str(e)}")

    def get_job_from_repository(self, repository: str) -> Optional[Dict[str, str]]:
        """Find job_id and namespace associated with a repository."""
        try:
            for config in self.list_configs():
                if config.get("repository") == repository or config.get("repository_alias") == repository:
                    return {
                        "job_id": config.get("job_id"),
                        "namespace": config.get("namespace")
                    }
            return None
        except Exception as e:
            logger.error(f"Failed to find job for repository {repository}: {str(e)}")
            raise HTTPException(status_code=500, detail=f"Failed to find job for repository: {str(e)}")

    def get_repository_from_job(self, job_id: str) -> Optional[str]:
        """Find repository associated with a job_id."""
        try:
            for config in self.list_configs():
                if config.get("job_id") == job_id:
                    return config.get("repository")
            return None
        except Exception as e:
            logger.error(f"Failed to find repository for job {job_id}: {str(e)}")
            raise HTTPException(status_code=500, detail=f"Failed to find repository for job: {str(e)}")

    def get_config_by_repository(self, repository: str) -> Optional[Dict[str, Any]]:
        """Find configuration by repository URL or alias."""
        try:
            for config in self.list_configs():
                if config.get("repository") == repository or config.get("repository_alias") == repository:
                    return self.get_config(config.get("name"))
            return None
        except Exception as e:
            logger.error(f"Failed to find config for repository {repository}: {str(e)}")
            return None

    def get_job_spec_from_repository(self, repository: str) -> Optional[Dict[str, Any]]:
        """Get job specification from repository config and template."""
        try:
            # Get the repository configuration
            config = self.get_config_by_repository(repository)
            if not config:
                logger.error(f"No configuration found for repository: {repository}")
                return None

            # Check if the job template is specified
            job_template = config.get("job_template")
            if not job_template:
                logger.error(f"No job template specified for repository: {repository}")
                return None

            # Read the job template file
            template_path = Path(self.config_dir) / "templates" / f"{job_template}.json"
            if not template_path.exists():
                logger.error(f"Job template not found: {job_template}")
                return None

            try:
                with open(template_path, "r") as f:
                    job_spec = json.load(f)
            except Exception as e:
                logger.error(f"Failed to read job template {job_template}: {str(e)}")
                return None

            # Apply configuration parameters to the template
            job_spec["ID"] = config.get("job_id")
            job_spec["Name"] = config.get("job_id")

            # Apply other customizations from config
            if env_vars := config.get("environment_variables"):
                for task_group in job_spec.get("TaskGroups", []):
                    for task in task_group.get("Tasks", []):
                        if "Env" not in task:
                            task["Env"] = {}
                        task["Env"].update(env_vars)

            if meta := config.get("metadata"):
                job_spec["Meta"] = meta

            # Add repository info to the metadata
            if "Meta" not in job_spec:
                job_spec["Meta"] = {}
            job_spec["Meta"]["repository"] = repository

            # Override specific job parameters if specified in config
            if job_params := config.get("job_parameters"):
                for param_key, param_value in job_params.items():
                    # Handle nested parameters with dot notation (e.g., "TaskGroups.0.Tasks.0.Config.image")
                    if "." in param_key:
                        parts = param_key.split(".")
                        current = job_spec
                        for part in parts[:-1]:
                            # Handle array indices
                            if part.isdigit() and isinstance(current, list):
                                current = current[int(part)]
                            elif part in current:
                                current = current[part]
                            else:
                                break
                        else:
                            # Only set the value if we successfully navigated the path
                            current[parts[-1]] = param_value
                    else:
                        # Direct parameter
                        job_spec[param_key] = param_value

            logger.info(f"Generated job specification for repository {repository} using template {job_template}")
            return job_spec

        except Exception as e:
            logger.error(f"Failed to get job specification for repository {repository}: {str(e)}")
            return None
180
app/services/gitea_client.py
Normal file
180
app/services/gitea_client.py
Normal file
@ -0,0 +1,180 @@
|
||||
import os
|
||||
import logging
|
||||
import requests
|
||||
from typing import Dict, Any, List, Optional, Tuple
|
||||
from urllib.parse import urlparse
|
||||
from fastapi import HTTPException
|
||||
|
||||
# Configure logging
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
class GiteaClient:
|
||||
"""Client for interacting with Gitea API."""
|
||||
|
||||
def __init__(self):
|
||||
"""Initialize Gitea client with configuration from environment variables."""
|
||||
self.api_base_url = os.getenv("GITEA_API_URL", "").rstrip("/")
|
||||
self.token = os.getenv("GITEA_API_TOKEN")
|
||||
self.username = os.getenv("GITEA_USERNAME")
|
||||
self.verify_ssl = os.getenv("GITEA_VERIFY_SSL", "true").lower() == "true"
|
||||
|
||||
if not self.api_base_url:
|
||||
logger.warning("GITEA_API_URL is not configured. Gitea integration will not work.")
|
||||
|
||||
if not self.token and (self.username and os.getenv("GITEA_PASSWORD")):
|
||||
self.token = self._get_token_from_credentials()
|
||||
|
||||
def _get_token_from_credentials(self) -> Optional[str]:
|
||||
"""Get a token using username and password if provided."""
|
||||
try:
|
||||
response = requests.post(
|
||||
f"{self.api_base_url}/users/{self.username}/tokens",
|
||||
auth=(self.username, os.getenv("GITEA_PASSWORD", "")),
|
||||
json={
|
||||
"name": "nomad-mcp-service",
|
||||
"scopes": ["repo", "read:org"]
|
||||
},
|
||||
verify=self.verify_ssl
|
||||
)
|
||||
|
||||
if response.status_code == 201:
|
||||
return response.json().get("sha1")
|
||||
else:
|
||||
logger.error(f"Failed to get Gitea token: {response.text}")
|
||||
return None
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to get Gitea token: {str(e)}")
|
||||
return None
|
||||
|
||||
def _get_headers(self) -> Dict[str, str]:
|
||||
"""Get request headers with authentication."""
|
||||
headers = {
|
||||
"Content-Type": "application/json",
|
||||
"Accept": "application/json"
|
||||
}
|
||||
|
||||
if self.token:
|
||||
headers["Authorization"] = f"token {self.token}"
|
||||
|
||||
return headers
|
||||
|
||||
def parse_repo_url(self, repo_url: str) -> Tuple[str, str]:
|
||||
"""
|
||||
Parse a Gitea repository URL to extract owner and repo name.
|
||||
|
||||
Examples:
|
||||
- http://gitea.internal.example.com/username/repo-name -> (username, repo-name)
|
||||
- https://gitea.example.com/org/project -> (org, project)
|
||||
"""
|
||||
try:
|
||||
# Parse the URL
|
||||
parsed_url = urlparse(repo_url)
|
||||
|
||||
# Get the path and remove leading/trailing slashes
|
||||
path = parsed_url.path.strip("/")
|
||||
|
||||
# Split the path
|
||||
parts = path.split("/")
|
||||
|
||||
if len(parts) < 2:
|
||||
raise ValueError(f"Invalid repository URL: {repo_url}")
|
||||
|
||||
# Extract owner and repo
|
||||
owner = parts[0]
|
||||
repo = parts[1]
|
||||
|
||||
return owner, repo
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to parse repository URL: {repo_url}, error: {str(e)}")
|
||||
raise ValueError(f"Invalid repository URL: {repo_url}")
|
||||
|
||||
def check_repository_exists(self, repo_url: str) -> bool:
|
||||
"""Check if a repository exists in Gitea."""
|
||||
if not self.api_base_url:
|
||||
# No Gitea integration configured, assume repository exists
|
||||
return True
|
||||
|
||||
try:
|
||||
owner, repo = self.parse_repo_url(repo_url)
|
||||
|
||||
response = requests.get(
|
||||
f"{self.api_base_url}/repos/{owner}/{repo}",
|
||||
headers=self._get_headers(),
|
||||
verify=self.verify_ssl
|
||||
)
|
||||
|
||||
return response.status_code == 200
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to check repository: {repo_url}, error: {str(e)}")
|
||||
return False
|
||||
|
||||
def get_repository_info(self, repo_url: str) -> Optional[Dict[str, Any]]:
|
||||
"""Get repository information from Gitea."""
|
||||
if not self.api_base_url:
|
||||
# No Gitea integration configured
|
||||
return None
|
||||
|
||||
try:
|
||||
owner, repo = self.parse_repo_url(repo_url)
|
||||
|
||||
response = requests.get(
|
||||
f"{self.api_base_url}/repos/{owner}/{repo}",
|
||||
headers=self._get_headers(),
|
||||
verify=self.verify_ssl
|
||||
)
|
||||
|
||||
if response.status_code == 200:
|
||||
return response.json()
|
||||
else:
|
||||
logger.error(f"Failed to get repository info: {response.text}")
|
||||
return None
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to get repository info: {repo_url}, error: {str(e)}")
|
||||
return None
|
||||
|
||||
def list_repositories(self, limit: int = 100) -> List[Dict[str, Any]]:
|
||||
"""List available repositories from Gitea."""
|
||||
if not self.api_base_url:
|
||||
# No Gitea integration configured
|
||||
return []
|
||||
|
||||
try:
|
||||
response = requests.get(
|
||||
f"{self.api_base_url}/user/repos",
|
||||
headers=self._get_headers(),
|
||||
params={"limit": limit},
|
||||
verify=self.verify_ssl
|
||||
)
|
||||
|
||||
if response.status_code == 200:
|
||||
return response.json()
|
||||
else:
|
||||
logger.error(f"Failed to list repositories: {response.text}")
|
||||
return []
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to list repositories: {str(e)}")
|
||||
return []
|
||||
|
||||
def get_repository_branches(self, repo_url: str) -> List[Dict[str, Any]]:
|
||||
"""Get branches for a repository."""
|
||||
if not self.api_base_url:
|
||||
# No Gitea integration configured
|
||||
return []
|
||||
|
||||
try:
|
||||
owner, repo = self.parse_repo_url(repo_url)
|
||||
|
||||
response = requests.get(
|
||||
f"{self.api_base_url}/repos/{owner}/{repo}/branches",
|
||||
headers=self._get_headers(),
|
||||
verify=self.verify_ssl
|
||||
)
|
||||
|
||||
if response.status_code == 200:
|
||||
return response.json()
|
||||
else:
|
||||
logger.error(f"Failed to get repository branches: {response.text}")
|
||||
return []
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to get repository branches: {repo_url}, error: {str(e)}")
|
||||
return []
|
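These Gitea helpers share a fail-soft convention: a non-200 response or an exception is logged and mapped to a neutral value (`None` or `[]`) rather than raised, so callers can consume the result unconditionally. A minimal sketch of that convention, using a hypothetical `FakeResponse` stand-in instead of a live Gitea server (`parse_listing` is not part of the repo):

```python
from typing import Any, Dict, List


class FakeResponse:
    """Hypothetical stand-in for requests.Response, for illustration only."""

    def __init__(self, status_code: int, payload: Any):
        self.status_code = status_code
        self._payload = payload
        self.text = str(payload)

    def json(self) -> Any:
        return self._payload


def parse_listing(response: FakeResponse) -> List[Dict[str, Any]]:
    # Mirrors the handling in list_repositories: 200 -> body, anything else -> [].
    if response.status_code == 200:
        return response.json()
    return []
```

Callers can then iterate the result of `parse_listing` without a try/except, which is the design trade-off these methods make: availability of a usable value over surfacing the exact failure.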
505
app/services/nomad_client.py
Normal file
@ -0,0 +1,505 @@
import os
import logging
import time
from typing import Dict, Any, Optional, List

import nomad
import requests
from fastapi import HTTPException
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Configure logging
logger = logging.getLogger(__name__)


def get_nomad_client():
    """
    Create and return a Nomad client using environment variables.
    """
    try:
        nomad_addr = os.getenv("NOMAD_ADDR", "http://localhost:4646").rstrip('/')
        nomad_token = os.getenv("NOMAD_TOKEN")
        # Use "development" as the default namespace since all jobs are likely to be in it
        nomad_namespace = os.getenv("NOMAD_NAMESPACE", "development")

        # Ensure the namespace is never "*" (wildcard)
        if nomad_namespace == "*":
            nomad_namespace = "development"
            logger.info("Replaced wildcard namespace '*' with 'development'")

        # Extract host and port from the address
        host_with_port = nomad_addr.replace("http://", "").replace("https://", "")
        host = host_with_port.split(":")[0]

        # Safely extract the port
        port_part = host_with_port.split(":")[-1] if ":" in host_with_port else "4646"
        port = int(port_part.split('/')[0])  # Remove any path components

        logger.info(f"Creating Nomad client with host={host}, port={port}, namespace={nomad_namespace}")

        return nomad.Nomad(
            host=host,
            port=port,
            secure=nomad_addr.startswith("https"),
            token=nomad_token,
            timeout=10,
            namespace=nomad_namespace,  # Query the development namespace by default
            verify=False if os.getenv("NOMAD_SKIP_VERIFY", "false").lower() == "true" else True
        )
    except Exception as e:
        logger.error(f"Failed to create Nomad client: {str(e)}")
        raise HTTPException(status_code=500, detail=f"Failed to connect to Nomad: {str(e)}")


class NomadService:
    """Service for interacting with the Nomad API."""

    def __init__(self):
        self.client = get_nomad_client()
        self.namespace = os.getenv("NOMAD_NAMESPACE", "development")  # Use "development" as the default namespace

    def get_job(self, job_id: str, max_retries: int = 3, retry_delay: int = 2) -> Dict[str, Any]:
        """
        Get a job by ID with retry logic.

        Args:
            job_id: The ID of the job to retrieve
            max_retries: Maximum number of retry attempts (default: 3)
            retry_delay: Delay between retries in seconds (default: 2)

        Returns:
            Dict containing job details
        """
        last_exception = None

        # Try multiple times to get the job
        for attempt in range(max_retries):
            try:
                # Get the Nomad address from the client
                nomad_addr = f"http://{self.client.host}:{self.client.port}"

                # Build the URL for the job endpoint
                url = f"{nomad_addr}/v1/job/{job_id}"

                # Set up headers
                headers = {}
                if hasattr(self.client, 'token') and self.client.token:
                    headers["X-Nomad-Token"] = self.client.token

                # Set up params with the correct namespace
                params = {"namespace": self.namespace}

                # Make the request directly
                response = requests.get(
                    url=url,
                    headers=headers,
                    params=params,
                    verify=False if os.getenv("NOMAD_SKIP_VERIFY", "false").lower() == "true" else True
                )

                # Check if the request was successful
                if response.status_code == 200:
                    return response.json()
                elif response.status_code == 404:
                    # If not the last attempt, log and retry
                    if attempt < max_retries - 1:
                        logger.warning(f"Job {job_id} not found on attempt {attempt+1}/{max_retries}, retrying in {retry_delay}s...")
                        time.sleep(retry_delay)
                        continue
                    else:
                        raise ValueError(f"Job not found after {max_retries} attempts: {job_id}")
                else:
                    raise ValueError(f"Failed to get job: {response.text}")

            except Exception as e:
                last_exception = e
                # If not the last attempt, log and retry
                if attempt < max_retries - 1:
                    logger.warning(f"Error getting job {job_id} on attempt {attempt+1}/{max_retries}: {str(e)}, retrying in {retry_delay}s...")
                    time.sleep(retry_delay)
                    continue
                else:
                    logger.error(f"Failed to get job {job_id} after {max_retries} attempts: {str(e)}")
                    raise HTTPException(status_code=404, detail=f"Job not found: {job_id}")

        # If we get here, all retries failed
        logger.error(f"Failed to get job {job_id} after {max_retries} attempts")
        raise HTTPException(status_code=404, detail=f"Job not found: {job_id}")

    def list_jobs(self) -> List[Dict[str, Any]]:
        """List all jobs."""
        try:
            # Get the Nomad address from the client
            nomad_addr = f"http://{self.client.host}:{self.client.port}"

            # Build the URL for the jobs endpoint
            url = f"{nomad_addr}/v1/jobs"

            # Set up headers
            headers = {}
            if hasattr(self.client, 'token') and self.client.token:
                headers["X-Nomad-Token"] = self.client.token

            # Set up params with the correct namespace
            params = {"namespace": self.namespace}

            # Make the request directly
            response = requests.get(
                url=url,
                headers=headers,
                params=params,
                verify=False if os.getenv("NOMAD_SKIP_VERIFY", "false").lower() == "true" else True
            )

            # Check if the request was successful
            if response.status_code == 200:
                return response.json()
            else:
                raise ValueError(f"Failed to list jobs: {response.text}")
        except Exception as e:
            logger.error(f"Failed to list jobs: {str(e)}")
            raise HTTPException(status_code=500, detail=f"Failed to list jobs: {str(e)}")

    def start_job(self, job_spec: Dict[str, Any]) -> Dict[str, Any]:
        """
        Start a job using the provided specification.

        Args:
            job_spec: The job specification to submit. Can be a raw job spec or wrapped in a "Job" key.

        Returns:
            Dict containing job_id, eval_id, status, and any warnings
        """
        try:
            # Extract the job ID from the specification
            job_id = None
            if "Job" in job_spec:
                job_id = job_spec["Job"].get("ID") or job_spec["Job"].get("id")
            else:
                job_id = job_spec.get("ID") or job_spec.get("id")

            if not job_id:
                raise ValueError("Job ID is required in the job specification")

            logger.info(f"Processing job start request for job ID: {job_id}")

            # Determine the namespace to use, with clear priorities:
            # 1. Explicitly provided in the job spec (highest priority)
            # 2. Service instance namespace
            # 3. Fallback to "development"
            namespace = self.namespace

            # Normalize the job structure to ensure it has a "Job" wrapper
            normalized_job_spec = {}
            if "Job" in job_spec:
                normalized_job_spec = job_spec
                # Check if a namespace is specified in the job spec
                if "Namespace" in job_spec["Job"]:
                    namespace = job_spec["Job"]["Namespace"]
                    logger.info(f"Using namespace from job spec: {namespace}")
            else:
                # Check if a namespace is specified in the job spec
                if "Namespace" in job_spec:
                    namespace = job_spec["Namespace"]
                    logger.info(f"Using namespace from job spec: {namespace}")

                # Wrap the job spec in a "Job" key
                normalized_job_spec = {"Job": job_spec}

            # Replace wildcard namespaces with the default
            if namespace == "*":
                namespace = "development"
                logger.info(f"Replaced wildcard namespace with default: {namespace}")

            # Always explicitly set the namespace in the job spec
            normalized_job_spec["Job"]["Namespace"] = namespace

            logger.info(f"Submitting job {job_id} to namespace {namespace}")
            logger.info(f"Job specification structure: {list(normalized_job_spec.keys())}")
            logger.info(f"Job keys: {list(normalized_job_spec['Job'].keys())}")

            # Submit the job - pass the job_id and job spec directly
            # The namespace is already set in the job spec
            response = self.client.job.register_job(job_id, normalized_job_spec)

            logger.info(f"Job registration response: {response}")

            return {
                "job_id": job_id,
                "eval_id": response.get("EvalID"),
                "status": "started",
                "warnings": response.get("Warnings"),
                "namespace": namespace
            }
        except Exception as e:
            logger.error(f"Failed to start job: {str(e)}")
            raise HTTPException(status_code=500, detail=f"Failed to start job: {str(e)}")

    def stop_job(self, job_id: str, purge: bool = False) -> Dict[str, Any]:
        """
        Stop a job by ID.

        Args:
            job_id: The ID of the job to stop
            purge: If true, the job will be purged from Nomad's state entirely

        Returns:
            Dict containing job_id, eval_id, and status
        """
        try:
            logger.info(f"Stopping job {job_id} in namespace {self.namespace} (purge={purge})")

            # Get the Nomad address from the client
            nomad_addr = f"http://{self.client.host}:{self.client.port}"

            # Build the URL for the job endpoint
            url = f"{nomad_addr}/v1/job/{job_id}"

            # Set up headers
            headers = {}
            if hasattr(self.client, 'token') and self.client.token:
                headers["X-Nomad-Token"] = self.client.token

            # Set up params with the correct namespace and purge option
            params = {
                "namespace": self.namespace,
                "purge": str(purge).lower()
            }

            # Make the request directly
            response = requests.delete(
                url=url,
                headers=headers,
                params=params,
                verify=False if os.getenv("NOMAD_SKIP_VERIFY", "false").lower() == "true" else True
            )

            # Check if the request was successful
            if response.status_code == 200:
                response_data = response.json()
                logger.info(f"Job stop response: {response_data}")

                return {
                    "job_id": job_id,
                    "eval_id": response_data.get("EvalID"),
                    "status": "stopped",
                    "namespace": self.namespace
                }
            else:
                raise ValueError(f"Failed to stop job: {response.text}")

        except Exception as e:
            logger.error(f"Failed to stop job {job_id}: {str(e)}")
            raise HTTPException(status_code=500, detail=f"Failed to stop job: {str(e)}")

    def get_allocations(self, job_id: str) -> List[Dict[str, Any]]:
        """Get all allocations for a job."""
        try:
            # Get the Nomad address from the client
            nomad_addr = f"http://{self.client.host}:{self.client.port}"

            # Build the URL for the job allocations endpoint
            url = f"{nomad_addr}/v1/job/{job_id}/allocations"

            # Set up headers
            headers = {}
            if hasattr(self.client, 'token') and self.client.token:
                headers["X-Nomad-Token"] = self.client.token

            # Set up params with the correct namespace
            params = {"namespace": self.namespace}

            # Make the request directly
            response = requests.get(
                url=url,
                headers=headers,
                params=params,
                verify=False if os.getenv("NOMAD_SKIP_VERIFY", "false").lower() == "true" else True
            )

            # Check if the request was successful
            if response.status_code == 200:
                return response.json()
            elif response.status_code == 404:
                logger.warning(f"No allocations found for job {job_id}")
                return []
            else:
                raise ValueError(f"Failed to get allocations: {response.text}")
        except Exception as e:
            logger.error(f"Failed to get allocations for job {job_id}: {str(e)}")
            raise HTTPException(status_code=500, detail=f"Failed to get allocations: {str(e)}")

    def get_allocation_logs(self, alloc_id: str, task: str, log_type: str = "stderr") -> str:
        """Get logs for a specific allocation and task."""
        try:
            # More detailed debugging to understand what's happening
            logger.info(f"Getting logs for allocation {alloc_id}, task {task}, type {log_type}")

            if alloc_id == "repository":
                logger.error("Invalid allocation ID 'repository' detected")
                return "Error: Invalid allocation ID 'repository'"

            # Verify the allocation ID is a valid UUID (must be 36 characters)
            if not alloc_id or len(alloc_id) != 36:
                logger.error(f"Invalid allocation ID format: {alloc_id} (length: {len(alloc_id) if alloc_id else 0})")
                return "Error: Invalid allocation ID format - must be 36 character UUID"

            # Get allocation info to verify it exists
            try:
                allocation = self.client.allocation.get_allocation(alloc_id)
                if not allocation:
                    logger.warning(f"Allocation {alloc_id} not found")
                    return f"Allocation {alloc_id} not found"
            except Exception as e:
                logger.error(f"Error checking allocation: {str(e)}")
                return f"Error checking allocation: {str(e)}"

            # Try multiple approaches to get logs
            log_content = None
            error_messages = []

            # Approach 1: Standard API
            try:
                logger.info("Attempting to get logs using the standard API")
                logs = self.client.allocation.logs.get_logs(
                    alloc_id,
                    task,
                    log_type,
                    plain=True
                )

                if logs:
                    if isinstance(logs, dict) and logs.get("Data"):
                        log_content = logs.get("Data")
                        logger.info("Successfully retrieved logs using the standard API")
                    elif isinstance(logs, str):
                        log_content = logs
                        logger.info("Successfully retrieved logs as a string")
                    else:
                        error_messages.append(f"Unexpected log format: {type(logs)}")
                        logger.warning(f"Unexpected log format: {type(logs)}")
                else:
                    error_messages.append("No logs returned from standard API")
                    logger.warning("No logs returned from standard API")
            except Exception as e:
                error_str = str(e)
                error_messages.append(f"Standard API error: {error_str}")
                logger.warning(f"Standard API failed: {error_str}")

            # Approach 2: Raw HTTP request if the standard API didn't work
            if not log_content:
                try:
                    # Get the Nomad address from the environment or use the default
                    nomad_addr = os.getenv("NOMAD_ADDR", "http://localhost:4646").rstrip('/')
                    nomad_token = os.getenv("NOMAD_TOKEN")

                    # Construct the URL for logs
                    logs_url = f"{nomad_addr}/v1/client/fs/logs/{alloc_id}"

                    # Set up headers
                    headers = {}
                    if nomad_token:
                        headers["X-Nomad-Token"] = nomad_token

                    # Set up query parameters
                    params = {
                        "task": task,
                        "type": log_type,
                        "plain": "true"
                    }

                    if self.namespace and self.namespace != "*":
                        params["namespace"] = self.namespace

                    logger.info(f"Attempting to get logs using a direct HTTP request to: {logs_url}")
                    response = requests.get(logs_url, headers=headers, params=params, verify=False)

                    if response.status_code == 200:
                        log_content = response.text
                        logger.info("Successfully retrieved logs using a direct HTTP request")
                    else:
                        error_messages.append(f"HTTP request failed with status {response.status_code}: {response.text}")
                        logger.warning(f"HTTP request failed: {response.status_code} - {response.text}")
                except Exception as e:
                    error_str = str(e)
                    error_messages.append(f"HTTP request error: {error_str}")
                    logger.warning(f"HTTP request failed: {error_str}")

            # Approach 3: Direct system call as a last resort
            if not log_content:
                try:
                    import subprocess

                    # Get the Nomad command-line client path
                    nomad_cmd = "nomad"  # Default, assumes nomad is in PATH

                    # Build the command: flags first, then the positional arguments
                    cmd_parts = [
                        nomad_cmd,
                        "alloc", "logs",
                        "-verbose",
                    ]

                    # Add namespace if specified
                    if self.namespace and self.namespace != "*":
                        cmd_parts.extend(["-namespace", self.namespace])

                    # stdout is the CLI default; pass -stderr to read the error stream
                    if log_type == "stderr":
                        cmd_parts.append("-stderr")

                    # Positional arguments: the allocation ID and the task name
                    cmd_parts.extend([alloc_id, task])

                    logger.info(f"Attempting to get logs using command: {' '.join(cmd_parts)}")
                    process = subprocess.run(cmd_parts, capture_output=True, text=True)

                    if process.returncode == 0:
                        log_content = process.stdout
                        logger.info("Successfully retrieved logs using the command-line client")
                    else:
                        error_messages.append(f"Command-line client failed: {process.stderr}")
                        logger.warning(f"Command-line client failed: {process.stderr}")
                except Exception as e:
                    error_str = str(e)
                    error_messages.append(f"Command-line client error: {error_str}")
                    logger.warning(f"Command-line client failed: {error_str}")

            # Return the logs if we got them, otherwise return the accumulated errors
            if log_content:
                return log_content
            else:
                error_msg = "; ".join(error_messages)
                logger.error(f"Failed to get logs after multiple attempts: {error_msg}")
                return f"Error retrieving {log_type} logs: {error_msg}"

        except Exception as e:
            error_str = str(e)
            logger.error(f"Failed to get logs for allocation {alloc_id}, task {task}: {error_str}")
            raise HTTPException(status_code=500, detail=f"Failed to get logs: {error_str}")

    def get_deployment_status(self, job_id: str) -> Dict[str, Any]:
        """Get the deployment status for a job."""
        try:
            return self.client.job.get_deployment(job_id, namespace=self.namespace)
        except Exception as e:
            logger.error(f"Failed to get deployment status for job {job_id}: {str(e)}")
            raise HTTPException(status_code=500, detail=f"Failed to get deployment status: {str(e)}")

    def get_job_evaluations(self, job_id: str) -> List[Dict[str, Any]]:
        """Get evaluations for a job."""
        try:
            return self.client.job.get_evaluations(job_id, namespace=self.namespace)
        except Exception as e:
            logger.error(f"Failed to get evaluations for job {job_id}: {str(e)}")
            raise HTTPException(status_code=500, detail=f"Failed to get evaluations: {str(e)}")
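The namespace handling in `start_job` above is the subtle part of this file: the spec may or may not carry a `"Job"` wrapper, may name its own namespace, and wildcard namespaces must be replaced. That logic can be distilled into a small pure function for testing; the sketch below is a hypothetical distillation (`normalize_job_spec` is not part of the service itself):

```python
from typing import Any, Dict, Tuple


def normalize_job_spec(job_spec: Dict[str, Any],
                       default_namespace: str = "development") -> Tuple[Dict[str, Any], str]:
    """Wrap a raw spec in a "Job" key and resolve its namespace, mirroring start_job."""
    # Ensure the "Job" wrapper exists.
    if "Job" in job_spec:
        normalized = job_spec
    else:
        normalized = {"Job": job_spec}

    # Spec-provided namespace wins; wildcard falls back to the default.
    namespace = normalized["Job"].get("Namespace", default_namespace)
    if namespace == "*":
        namespace = default_namespace

    # Always set the namespace explicitly before submission.
    normalized["Job"]["Namespace"] = namespace
    return normalized, namespace
```

Keeping this as a pure function would let the priority rules (spec over service default, wildcard replaced) be unit-tested without a running Nomad cluster.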
33
check_path.py
Normal file
@ -0,0 +1,33 @@
#!/usr/bin/env python
"""
Script to check Python path and help diagnose import issues.
"""

import sys
import os

def main():
    print("Current working directory:", os.getcwd())
    print("\nPython path:")
    for path in sys.path:
        print(f"  - {path}")

    print("\nChecking for app directory:")
    if os.path.exists("app"):
        print("✅ 'app' directory exists in current working directory")
        print("Contents of app directory:")
        for item in os.listdir("app"):
            print(f"  - {item}")
    else:
        print("❌ 'app' directory does not exist in current working directory")

    print("\nChecking for app module:")
    try:
        import app
        print("✅ 'app' module can be imported")
        print(f"app module location: {app.__file__}")
    except ImportError as e:
        print(f"❌ Cannot import 'app' module: {e}")

if __name__ == "__main__":
    main()
71
claude_nomad_tool.json
Normal file
@ -0,0 +1,71 @@
{
  "tools": [
    {
      "name": "nomad_mcp",
      "description": "Manage Nomad jobs through the MCP service",
      "api_endpoints": [
        {
          "name": "list_jobs",
          "description": "List all jobs in a namespace",
          "method": "GET",
          "url": "http://127.0.0.1:8000/api/claude/list-jobs",
          "params": [
            {
              "name": "namespace",
              "type": "string",
              "description": "Nomad namespace",
              "required": false,
              "default": "development"
            }
          ]
        },
        {
          "name": "manage_job",
          "description": "Manage a job (status, stop, restart)",
          "method": "POST",
          "url": "http://127.0.0.1:8000/api/claude/jobs",
          "body": {
            "job_id": "string",
            "action": "string",
            "namespace": "string",
            "purge": "boolean"
          }
        },
        {
          "name": "create_job",
          "description": "Create a new job",
          "method": "POST",
          "url": "http://127.0.0.1:8000/api/claude/create-job",
          "body": {
            "job_id": "string",
            "name": "string",
            "type": "string",
            "datacenters": "array",
            "namespace": "string",
            "docker_image": "string",
            "count": "integer",
            "cpu": "integer",
            "memory": "integer",
            "ports": "array",
            "env_vars": "object"
          }
        },
        {
          "name": "get_job_logs",
          "description": "Get logs for a job",
          "method": "GET",
          "url": "http://127.0.0.1:8000/api/claude/job-logs/{job_id}",
          "params": [
            {
              "name": "namespace",
              "type": "string",
              "description": "Nomad namespace",
              "required": false,
              "default": "development"
            }
          ]
        }
      ]
    }
  ]
}
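A client can drive the endpoints described in this manifest by reading the entries directly: each GET endpoint declares its parameters, including defaults. The sketch below shows a hypothetical `build_request` helper (not part of the repo) that applies a parameter's declared default before a caller issues the request; it only constructs the URL, assuming the service runs at the address given in the manifest:

```python
from typing import Any, Dict
from urllib.parse import urlencode

# Subset of the list_jobs entry from the manifest above.
LIST_JOBS: Dict[str, Any] = {
    "method": "GET",
    "url": "http://127.0.0.1:8000/api/claude/list-jobs",
    "params": [
        {"name": "namespace", "type": "string", "required": False, "default": "development"}
    ],
}


def build_request(endpoint: Dict[str, Any], **overrides: str) -> str:
    """Fill an endpoint entry's URL with declared defaults plus caller overrides."""
    params = {p["name"]: p["default"] for p in endpoint.get("params", []) if "default" in p}
    params.update(overrides)
    query = urlencode(params)
    return f"{endpoint['url']}?{query}" if query else endpoint["url"]
```

Used as `build_request(LIST_JOBS)`, this yields the list-jobs URL with the default `development` namespace; `build_request(LIST_JOBS, namespace="production")` overrides it.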
70
cleanup_test_jobs.py
Normal file
@ -0,0 +1,70 @@
#!/usr/bin/env python
"""
Script to clean up test jobs from Nomad.
"""

import os
import sys
from dotenv import load_dotenv
from app.services.nomad_client import NomadService

# Load environment variables from .env file
load_dotenv()

def main():
    print("Cleaning up test jobs from Nomad...")

    # Check if NOMAD_ADDR is configured
    nomad_addr = os.getenv("NOMAD_ADDR")
    if not nomad_addr:
        print("Error: NOMAD_ADDR is not configured in .env file.")
        sys.exit(1)

    print(f"Connecting to Nomad at: {nomad_addr}")

    try:
        # Initialize the Nomad service
        nomad_service = NomadService()

        # List all jobs
        print("\nListing all jobs...")
        jobs = nomad_service.list_jobs()
        print(f"Found {len(jobs)} jobs")

        # Filter for test jobs (starting with "test-")
        test_jobs = [job for job in jobs if job.get('ID', '').startswith('test-')]
        print(f"Found {len(test_jobs)} test jobs:")

        # Print each test job's ID and status
        for job in test_jobs:
            print(f"  - {job.get('ID')}: {job.get('Status')}")

        # Confirm before proceeding
        if test_jobs:
            print("\nDo you want to stop and purge all these test jobs? (y/n)")
            response = input().strip().lower()

            if response == 'y':
                print("\nStopping and purging test jobs...")

                for job in test_jobs:
                    job_id = job.get('ID')
                    try:
                        print(f"Stopping and purging job: {job_id}...")
                        stop_response = nomad_service.stop_job(job_id, purge=True)
                        print(f"  - Success: {stop_response}")
                    except Exception as e:
                        print(f"  - Error stopping job {job_id}: {str(e)}")

                print("\nCleanup completed.")
            else:
                print("\nCleanup cancelled.")
        else:
            print("\nNo test jobs found to clean up.")

    except Exception as e:
        print(f"Error during cleanup: {str(e)}")
        sys.exit(1)

if __name__ == "__main__":
    main()
9
configs/example.yaml
Normal file
@ -0,0 +1,9 @@
repository: https://github.com/example/my-service
job_id: my-service
description: Example service managed by MCP
meta:
  owner: ai-team
  environment: development
tags:
  - api
  - example
11
configs/ms-qc-db.yaml
Normal file
@ -0,0 +1,11 @@
repository: https://gitea.dev.meisheng.group/Mei_Sheng_Textiles/MS_QC_DB
repository_alias: ms-qc-db
job_id: ms-qc-db-dev
namespace: development
description: MS QC Database application for quality control tracking
meta:
  owner: ms-team
  environment: development
tags:
  - database
  - qc
10
configs/test-service.yaml
Normal file
@ -0,0 +1,10 @@
repository: http://gitea.internal/username/test-service
repository_alias: test-service
job_id: test-service
description: Test service managed by MCP for Gitea integration
meta:
  owner: ai-team
  environment: development
tags:
  - test
  - api
152
deploy_nomad_mcp.py
Normal file
@ -0,0 +1,152 @@
#!/usr/bin/env python
"""
Script to deploy the Nomad MCP service using our own Nomad client.
"""

import os
import sys
import json
from dotenv import load_dotenv
from app.services.nomad_client import NomadService

# Load environment variables from .env file
load_dotenv()

def read_job_spec(file_path):
    """Read the Nomad job specification from a file."""
    try:
        with open(file_path, 'r') as f:
            content = f.read()

        # Convert HCL to JSON (simplified approach)
        # In a real scenario, you might want to use a proper HCL parser
        # This is a very basic approach that assumes the job spec is valid
        job_id = "nomad-mcp"

        # Create a basic job structure
        job_spec = {
            "ID": job_id,
            "Name": job_id,
            "Type": "service",
            "Datacenters": ["jm"],
            "Namespace": "development",
            "TaskGroups": [
                {
                    "Name": "app",
                    "Count": 1,
                    "Networks": [
                        {
                            "DynamicPorts": [
                                {
                                    "Label": "http",
                                    "To": 8000
                                }
                            ]
                        }
                    ],
                    "Tasks": [
                        {
                            "Name": "nomad-mcp",
                            "Driver": "docker",
                            "Config": {
                                "image": "registry.dev.meisheng.group/nomad_mcp:20250226",
                                "ports": ["http"],
                                "command": "python",
                                "args": ["-m", "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
                            },
                            "Env": {
                                "NOMAD_ADDR": "http://pjmldk01.ds.meisheng.group:4646",
                                "NOMAD_NAMESPACE": "development",
                                "NOMAD_SKIP_VERIFY": "true",
                                "PORT": "8000",
                                "HOST": "0.0.0.0",
                                "LOG_LEVEL": "INFO",
                                "RELOAD": "true"
                            },
                            "Resources": {
                                "CPU": 200,
                                "MemoryMB": 256
                            },
                            "Services": [
                                {
                                    "Name": "nomad-mcp",
                                    "PortLabel": "http",
                                    "Tags": [
                                        "traefik.enable=true",
                                        "traefik.http.routers.nomad-mcp.entryPoints=https",
                                        "traefik.http.routers.nomad-mcp.rule=Host(`nomad_mcp.dev.meisheng.group`)",
                                        "traefik.http.routers.nomad-mcp.middlewares=proxyheaders@consulcatalog"
                                    ],
                                    "Checks": [
                                        {
                                            "Type": "http",
                                            "Path": "/api/health",
                                            "Interval": 10000000000,
                                            "Timeout": 2000000000,
                                            "CheckRestart": {
                                                "Limit": 3,
                                                "Grace": 60000000000
                                            }
                                        }
                                    ]
                                }
                            ]
                        }
                    ]
                }
            ],
            "Update": {
                "MaxParallel": 1,
                "MinHealthyTime": 30000000000,
                "HealthyDeadline": 300000000000,
                "AutoRevert": True
            }
        }

        return job_spec
    except Exception as e:
        print(f"Error reading job specification: {str(e)}")
        sys.exit(1)

def main():
    print("Deploying Nomad MCP service using our own Nomad client...")

    # Check if NOMAD_ADDR is configured
    nomad_addr = os.getenv("NOMAD_ADDR")
    if not nomad_addr:
        print("Error: NOMAD_ADDR is not configured in .env file.")
        sys.exit(1)

    print(f"Connecting to Nomad at: {nomad_addr}")

    try:
        # Initialize the Nomad service
        nomad_service = NomadService()

        # Read the job specification
        job_spec = read_job_spec("nomad_mcp_job.nomad")
        print("Job specification loaded successfully.")

        # Start the job
        print("Registering and starting the nomad-mcp job...")
        response = nomad_service.start_job(job_spec)

        print("\nJob registration response:")
        print(json.dumps(response, indent=2))

        if response.get("status") == "started":
            print("\n✅ Nomad MCP service deployed successfully!")
            print(f"Job ID: {response.get('job_id')}")
            print(f"Evaluation ID: {response.get('eval_id')}")
            print("\nThe service will be available at: https://nomad_mcp.dev.meisheng.group")
        else:
            print("\n❌ Failed to deploy Nomad MCP service.")
            print(f"Status: {response.get('status')}")
            print(f"Message: {response.get('message', 'Unknown error')}")

    except Exception as e:
        print(f"Error deploying Nomad MCP service: {str(e)}")
        sys.exit(1)

if __name__ == "__main__":
    main()
97
deploy_with_claude_api.py
Normal file
@ -0,0 +1,97 @@
#!/usr/bin/env python
"""
Script to deploy the Nomad MCP service using the Claude API.
"""

import os
import sys
import json
import requests
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

def main():
    print("Deploying Nomad MCP service using the Claude API...")

    # Define the API endpoint
    api_url = "http://localhost:8000/api/claude/create-job"

    # Create the job specification for the Claude API
    job_spec = {
        "job_id": "nomad-mcp",
        "name": "Nomad MCP Service",
        "type": "service",
        "datacenters": ["jm"],
        "namespace": "development",
        "docker_image": "registry.dev.meisheng.group/nomad_mcp:20250226",
        "count": 1,
        "cpu": 200,
        "memory": 256,
        "ports": [
            {
                "Label": "http",
                "Value": 0,
                "To": 8000
            }
        ],
        "env_vars": {
            "NOMAD_ADDR": "http://pjmldk01.ds.meisheng.group:4646",
            "NOMAD_NAMESPACE": "development",
            "NOMAD_SKIP_VERIFY": "true",
            "PORT": "8000",
            "HOST": "0.0.0.0",
            "LOG_LEVEL": "INFO",
            "RELOAD": "true"
        }
        # Note: The Claude API doesn't directly support command and args,
        # so we'll need to add a note about this limitation
    }

    try:
        # Make the API request
        print("Sending request to Claude API...")
        response = requests.post(
            api_url,
            json=job_spec,
            headers={"Content-Type": "application/json"}
        )

        # Check if the request was successful
        if response.status_code == 200:
            result = response.json()
            print("\nJob registration response:")
            print(json.dumps(result, indent=2))

            if result.get("success"):
                print("\n✅ Nomad MCP service deployed successfully!")
                print(f"Job ID: {result.get('job_id')}")
                print(f"Status: {result.get('status')}")
                print("\nThe service will be available at: https://nomad_mcp.dev.meisheng.group")

                # Add Traefik configuration and command information
                print("\nImportant Notes:")
                print("1. The Claude API doesn't directly support adding Traefik tags.")
                print("   You may need to update the job manually to add the following tags:")
                print("   - traefik.enable=true")
                print("   - traefik.http.routers.nomad-mcp.entryPoints=https")
                print("   - traefik.http.routers.nomad-mcp.rule=Host(`nomad_mcp.dev.meisheng.group`)")
                print("   - traefik.http.routers.nomad-mcp.middlewares=proxyheaders@consulcatalog")
                print("\n2. The Claude API doesn't directly support specifying command and args.")
                print("   You need to update the job manually to add the following:")
                print("   - command: python")
                print("   - args: [\"-m\", \"uvicorn\", \"app.main:app\", \"--host\", \"0.0.0.0\", \"--port\", \"8000\"]")
            else:
                print("\n❌ Failed to deploy Nomad MCP service.")
                print(f"Message: {result.get('message', 'Unknown error')}")
        else:
            print(f"\n❌ API request failed with status code: {response.status_code}")
            print(f"Response: {response.text}")

    except Exception as e:
        print(f"Error deploying Nomad MCP service: {str(e)}")
        sys.exit(1)

if __name__ == "__main__":
    main()
14
docker-compose.yml
Normal file
@ -0,0 +1,14 @@
version: '3'

services:
  nomad-mcp:
    build: .
    ports:
      - "8000:8000"
    volumes:
      - ./configs:/app/configs
    env_file:
      - .env
    environment:
      - CONFIG_DIR=/app/configs
    restart: unless-stopped
307
job_spec.json
Normal file
@ -0,0 +1,307 @@
{
  "Job": {
    "Stop": false,
    "Region": "global",
    "Namespace": "development",
    "ID": "ms-qc-db-dev",
    "ParentID": "",
    "Name": "ms-qc-db-dev",
    "Type": "service",
    "Priority": 50,
    "AllAtOnce": false,
    "Datacenters": [
      "jm"
    ],
    "NodePool": "default",
    "Constraints": null,
    "Affinities": null,
    "Spreads": null,
    "TaskGroups": [
      {
        "Name": "app",
        "Count": 1,
        "Update": {
          "Stagger": 30000000000,
          "MaxParallel": 1,
          "HealthCheck": "checks",
          "MinHealthyTime": 10000000000,
          "HealthyDeadline": 300000000000,
          "ProgressDeadline": 600000000000,
          "AutoRevert": false,
          "AutoPromote": false,
          "Canary": 0
        },
        "Migrate": {
          "MaxParallel": 1,
          "HealthCheck": "checks",
          "MinHealthyTime": 10000000000,
          "HealthyDeadline": 300000000000
        },
        "Constraints": [
          {
            "LTarget": "${attr.consul.version}",
            "RTarget": "\u003e= 1.8.0",
            "Operand": "semver"
          }
        ],
        "Scaling": null,
        "RestartPolicy": {
          "Attempts": 2,
          "Interval": 1800000000000,
          "Delay": 15000000000,
          "Mode": "fail",
          "RenderTemplates": false
        },
        "Tasks": [
          {
            "Name": "ms-qc-db",
            "Driver": "docker",
            "User": "",
            "Config": {
              "command": "uvicorn",
              "args": [
                "app.main:app",
                "--host",
                "0.0.0.0",
                "--port",
                "8000",
                "--workers",
                "2",
                "--proxy-headers",
                "--forwarded-allow-ips",
                "*"
              ],
              "image": "registry.dev.meisheng.group/ms_qc_db:20250211",
              "force_pull": true,
              "ports": [
                "http"
              ]
            },
            "Env": {
              "PYTHONPATH": "/local/MS_QC_DB",
              "LOG_LEVEL": "INFO",
              "USE_SQLITE": "false"
            },
            "Services": null,
            "Vault": null,
            "Consul": null,
            "Templates": [
              {
                "SourcePath": "",
                "DestPath": "secrets/app.env",
                "EmbeddedTmpl": "{{with secret \"infrastructure/nomad/msqc\"}}\nDB_USER=\"{{ .Data.data.DB_USER }}\"\nDB_PASSWORD=\"{{ .Data.data.DB_PASSWORD }}\"\nDB_HOST=\"{{ .Data.data.DB_HOST }}\"\nDB_PORT=\"{{ .Data.data.DB_PORT }}\"\nDB_NAME=\"qc_rolls_dev\"\nWEBHOOK_SECRET=\"{{ .Data.data.WEBHOOK_SECRET }}\"\n{{end}}\n",
                "ChangeMode": "restart",
                "ChangeSignal": "",
                "ChangeScript": null,
                "Splay": 5000000000,
                "Perms": "0644",
                "Uid": null,
                "Gid": null,
                "LeftDelim": "{{",
                "RightDelim": "}}",
                "Envvars": true,
                "VaultGrace": 0,
                "Wait": null,
                "ErrMissingKey": false
              }
            ],
            "Constraints": null,
            "Affinities": null,
            "Resources": {
              "CPU": 500,
              "Cores": 0,
              "MemoryMB": 512,
              "MemoryMaxMB": 0,
              "DiskMB": 0,
              "IOPS": 0,
              "Networks": null,
              "Devices": null,
              "NUMA": null
            },
            "RestartPolicy": {
              "Attempts": 2,
              "Interval": 1800000000000,
              "Delay": 15000000000,
              "Mode": "fail",
              "RenderTemplates": false
            },
            "DispatchPayload": null,
            "Lifecycle": null,
            "Meta": null,
            "KillTimeout": 5000000000,
            "LogConfig": {
              "MaxFiles": 10,
              "MaxFileSizeMB": 10,
              "Disabled": false
            },
            "Artifacts": [
              {
                "GetterSource": "git::ssh://git@gitea.service.mesh:2222/Mei_Sheng_Textiles/MS_QC_DB.git",
                "GetterOptions": {
                  "sshkey": "LS0tLS1CRUdJTiBPUEVOU1NIIFBSSVZBVEUgS0VZLS0tLS0KYjNCbGJuTnphQzFyWlhrdGRqRUFBQUFBQkc1dmJtVUFBQUFFYm05dVpRQUFBQUFBQUFBQkFBQUFNd0FBQUF0emMyZ3RaVwpReU5UVXhPUUFBQUNENHJwM05hZXA4K2lwVnlOZXNEbEVKckE0Rlg3MXA5VW5BWmxZcEJCNDh6d0FBQUppQ1ZWczhnbFZiClBBQUFBQXR6YzJndFpXUXlOVFV4T1FBQUFDRDRycDNOYWVwOCtpcFZ5TmVzRGxFSnJBNEZYNzFwOVVuQVpsWXBCQjQ4encKQUFBRUNuckxjc1JDeUQyNmRnQ3dqdG5PUnNOK1VzUjdxZ1pqbXZpU2tVNmozalVmaXVuYzFwNm56NktsWEkxNndPVVFtcwpEZ1ZmdlduMVNjQm1WaWtFSGp6UEFBQUFFMjF6WDNGalgyUmlYMlJsY0d4dmVTQnJaWGtCQWc9PQotLS0tLUVORCBPUEVOU1NIIFBSSVZBVEUgS0VZLS0tLS0K",
                  "ref": "main"
                },
                "GetterHeaders": null,
                "GetterMode": "any",
                "RelativeDest": "local/MS_QC_DB"
              }
            ],
            "Leader": false,
            "ShutdownDelay": 0,
            "VolumeMounts": null,
            "ScalingPolicies": null,
            "KillSignal": "",
            "Kind": "",
            "CSIPluginConfig": null,
            "Identity": {
              "Name": "default",
              "Audience": [
                "nomadproject.io"
              ],
              "ChangeMode": "",
              "ChangeSignal": "",
              "Env": false,
              "File": false,
              "ServiceName": "",
              "TTL": 0
            },
            "Identities": null,
            "Actions": null
          }
        ],
        "EphemeralDisk": {
          "Sticky": false,
          "SizeMB": 300,
          "Migrate": false
        },
        "Meta": null,
        "ReschedulePolicy": {
          "Attempts": 0,
          "Interval": 0,
          "Delay": 30000000000,
          "DelayFunction": "exponential",
          "MaxDelay": 3600000000000,
          "Unlimited": true
        },
        "Affinities": null,
        "Spreads": null,
        "Networks": [
          {
            "Mode": "",
            "Device": "",
            "CIDR": "",
            "IP": "",
            "Hostname": "",
            "MBits": 0,
            "DNS": null,
            "ReservedPorts": null,
            "DynamicPorts": [
              {
                "Label": "http",
                "Value": 0,
                "To": 8000,
                "HostNetwork": "default"
              }
            ]
          }
        ],
        "Consul": {
          "Namespace": "",
          "Cluster": "default",
          "Partition": ""
        },
        "Services": [
          {
            "Name": "${NOMAD_JOB_NAME}",
            "TaskName": "",
            "PortLabel": "http",
            "AddressMode": "auto",
            "Address": "",
            "EnableTagOverride": false,
            "Tags": [
              "traefik.http.routers.${NOMAD_JOB_NAME}.entryPoints=https",
              "traefik.http.routers.${NOMAD_JOB_NAME}.rule=Host(`dev_qc.dev.meisheng.group`)",
              "traefik.http.routers.${NOMAD_JOB_NAME}.middlewares=proxyheaders@consulcatalog",
              "traefik.enable=true"
            ],
            "CanaryTags": null,
            "Checks": [
              {
                "Name": "service: \"${NOMAD_JOB_NAME}\" check",
                "Type": "http",
                "Command": "",
                "Args": null,
                "Path": "/api/v1/health",
                "Protocol": "",
                "PortLabel": "http",
                "Expose": false,
                "AddressMode": "",
                "Interval": 10000000000,
                "Timeout": 2000000000,
                "InitialStatus": "",
                "TLSServerName": "",
                "TLSSkipVerify": false,
                "Method": "",
                "Header": null,
                "CheckRestart": null,
                "GRPCService": "",
                "GRPCUseTLS": false,
                "TaskName": "",
                "SuccessBeforePassing": 0,
                "FailuresBeforeCritical": 0,
                "FailuresBeforeWarning": 0,
                "Body": "",
                "OnUpdate": "require_healthy"
              }
            ],
            "Connect": null,
            "Meta": null,
            "CanaryMeta": null,
            "TaggedAddresses": null,
            "Namespace": "default",
            "OnUpdate": "require_healthy",
            "Provider": "consul",
            "Cluster": "default",
            "Identity": null
          }
        ],
        "Volumes": null,
        "ShutdownDelay": null,
        "StopAfterClientDisconnect": null,
        "MaxClientDisconnect": null,
        "PreventRescheduleOnLost": false
      }
    ],
    "Update": {
      "Stagger": 30000000000,
      "MaxParallel": 1,
      "HealthCheck": "",
      "MinHealthyTime": 0,
      "HealthyDeadline": 0,
      "ProgressDeadline": 0,
      "AutoRevert": false,
      "AutoPromote": false,
      "Canary": 0
    },
    "Multiregion": null,
    "Periodic": null,
    "ParameterizedJob": null,
    "Dispatched": false,
    "DispatchIdempotencyToken": "",
    "Payload": null,
    "Meta": null,
    "ConsulToken": "",
    "ConsulNamespace": "",
    "VaultToken": "",
    "VaultNamespace": "",
    "NomadTokenID": "",
    "Status": "dead",
    "StatusDescription": "",
    "Stable": true,
    "Version": 4,
    "SubmitTime": 1740554361561458507,
    "CreateIndex": 3415698,
    "ModifyIndex": 3416318,
    "JobModifyIndex": 3416317
  }
}
182
nomad_job_api_docs.md
Normal file
@ -0,0 +1,182 @@
# Nomad Job Management API Documentation

## Overview

This document outlines the process for managing jobs (starting, stopping, and monitoring) in HashiCorp Nomad via its HTTP API. These operations are essential for deploying, updating, and terminating workloads in a Nomad cluster.

## Prerequisites

- A running Nomad cluster
- Network access to the Nomad API endpoint (default port 4646)
- Proper authentication credentials (if ACLs are enabled)

## API Basics

- Base URL: `http://<nomad-server>:4646`
- API Version: `v1`
- Content Type: `application/json`

## Job Lifecycle

A Nomad job goes through multiple states during its lifecycle:

1. **Pending**: The job has been submitted but not yet scheduled
2. **Running**: The job is active and its tasks are running
3. **Dead**: The job has been stopped or failed

## Job Management Operations

### 1. List Jobs

List all jobs in a namespace to get an overview of the cluster's workloads.

```
GET /v1/jobs?namespace=<namespace>
```

Example PowerShell command:
```powershell
Invoke-RestMethod -Uri "http://nomad-server:4646/v1/jobs?namespace=development" -Method GET
```

### 2. Starting a Job

Starting a job in Nomad involves registering a job specification with the API server.

```
POST /v1/jobs
```

With a job specification in the request body:

```json
{
  "Job": {
    "ID": "example-job",
    "Name": "example-job",
    "Namespace": "development",
    "Type": "service",
    "Datacenters": ["dc1"],
    "TaskGroups": [
      {
        "Name": "app",
        "Count": 1,
        "Tasks": [
          {
            "Name": "server",
            "Driver": "docker",
            "Config": {
              "image": "nginx:latest"
            }
          }
        ]
      }
    ]
  }
}
```

Example PowerShell command:
```powershell
$jobSpec = @{
    Job = @{
        ID = "example-job"
        # ... other job properties
    }
} | ConvertTo-Json -Depth 20

Invoke-RestMethod -Uri "http://nomad-server:4646/v1/jobs" -Method POST -Body $jobSpec -ContentType "application/json"
```

To start an existing (stopped) job:
1. Retrieve the job specification with `GET /v1/job/<job_id>?namespace=<namespace>`
2. Set `Stop = false` in the job specification
3. Submit the modified spec with `POST /v1/jobs`
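The three steps above can be sketched in Python with the `requests` library (a minimal sketch; `NOMAD_ADDR` and the default namespace are assumptions to adjust for your cluster):

```python
import requests

NOMAD_ADDR = "http://localhost:4646"  # assumption: point at your Nomad server


def build_restart_payload(job_doc):
    """Clear the Stop flag on a job document returned by GET /v1/job/<id>
    and wrap it in the {"Job": ...} envelope that POST /v1/jobs expects."""
    job_doc["Stop"] = False
    return {"Job": job_doc}


def start_stopped_job(job_id, namespace="development"):
    # 1. Retrieve the existing job specification
    resp = requests.get(f"{NOMAD_ADDR}/v1/job/{job_id}",
                        params={"namespace": namespace})
    resp.raise_for_status()
    # 2. Set Stop = false, then 3. submit the modified spec
    payload = build_restart_payload(resp.json())
    resp = requests.post(f"{NOMAD_ADDR}/v1/jobs",
                         params={"namespace": namespace}, json=payload)
    resp.raise_for_status()
    return resp.json()
```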

### 3. Stopping a Job

Stopping a job is simpler and requires a DELETE request:

```
DELETE /v1/job/<job_id>?namespace=<namespace>
```

This marks the job for stopping but preserves its configuration in Nomad.

Example PowerShell command:
```powershell
Invoke-RestMethod -Uri "http://nomad-server:4646/v1/job/example-job?namespace=development" -Method DELETE
```

Optional parameters:
- `purge=true` - Completely removes the job from Nomad's state
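A Python version of the stop call might look like this (a sketch under the same assumptions as above; the helper only assembles the query parameters the DELETE endpoint accepts):

```python
import requests

NOMAD_ADDR = "http://localhost:4646"  # assumption: point at your Nomad server


def stop_query_params(namespace, purge=False):
    """Build the query parameters for DELETE /v1/job/<job_id>."""
    params = {"namespace": namespace}
    if purge:
        params["purge"] = "true"
    return params


def stop_job(job_id, namespace="development", purge=False):
    resp = requests.delete(f"{NOMAD_ADDR}/v1/job/{job_id}",
                           params=stop_query_params(namespace, purge))
    resp.raise_for_status()
    return resp.json()
```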

### 4. Reading Job Status

To check the status of a job:

```
GET /v1/job/<job_id>?namespace=<namespace>
```

This returns detailed information about the job, including:
- Current status (`running`, `pending`, `dead`)
- Task group count and health
- Version information

Example PowerShell command:
```powershell
Invoke-RestMethod -Uri "http://nomad-server:4646/v1/job/example-job?namespace=development" -Method GET
```

### 5. Reading Job Allocations

To see all allocations (instances) of a job:

```
GET /v1/job/<job_id>/allocations?namespace=<namespace>
```

This returns information about where the job is running and in what state.

Example PowerShell command:
```powershell
Invoke-RestMethod -Uri "http://nomad-server:4646/v1/job/example-job/allocations?namespace=development" -Method GET
```

## Common Issues and Troubleshooting

### Namespace Issues

Nomad requires specifying the correct namespace when managing jobs. If not specified, operations will default to the "default" namespace, which may not contain your jobs.

### Job Specification Formatting

When starting a job, ensure the job specification is properly wrapped in a "Job" object:

```json
{
  "Job": {
    // job details go here
  }
}
```
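In Python this wrapping step is easy to forget; a small guard like the following (a hypothetical helper, not part of the service code) avoids the resulting 400 error:

```python
def wrap_job(job_body):
    """Ensure a job document is wrapped in the {"Job": ...} envelope
    before it is posted to /v1/jobs."""
    if "Job" in job_body:
        return job_body  # already wrapped
    return {"Job": job_body}


payload = wrap_job({"ID": "example-job", "Type": "service"})
assert "Job" in payload
```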

### Error Codes

- **400**: Bad request, often due to malformed job specification
- **403**: Permission denied, check ACL tokens
- **404**: Job not found, verify job ID and namespace
- **500**: Server error, check Nomad server logs

## Best Practices

1. Always specify the namespace explicitly in API calls
2. Use the job's existing specification when updating, to avoid losing configuration
3. Log API responses to aid in troubleshooting
4. Implement proper error handling for API failures
5. Consider using official client libraries instead of direct API calls when possible

## Conclusion

The Nomad HTTP API provides a robust interface for job lifecycle management. Understanding these API workflows is crucial for building reliable automation and integration with Nomad clusters.
79
nomad_mcp_job.nomad
Normal file
@ -0,0 +1,79 @@
job "nomad-mcp" {
  datacenters = ["jm"]
  type = "service"
  namespace = "development"

  group "app" {
    count = 1

    network {
      port "http" {
        to = 8000
      }
    }

    task "nomad-mcp" {
      driver = "docker"

      config {
        image = "registry.dev.meisheng.group/nomad_mcp:20250226"
        ports = ["http"]
        command = "python"
        args = ["-m", "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
      }

      env {
        # Nomad connection settings
        NOMAD_ADDR = "http://pjmldk01.ds.meisheng.group:4646"
        NOMAD_NAMESPACE = "development"
        NOMAD_SKIP_VERIFY = "true"

        # API settings
        PORT = "8000"
        HOST = "0.0.0.0"

        # Logging level
        LOG_LEVEL = "INFO"

        # Enable to make development easier
        RELOAD = "true"
      }

      resources {
        cpu = 200
        memory = 256
      }

      service {
        name = "nomad-mcp"
        port = "http"
        tags = [
          "traefik.enable=true",
          "traefik.http.routers.nomad-mcp.entryPoints=https",
          "traefik.http.routers.nomad-mcp.rule=Host(`nomad_mcp.dev.meisheng.group`)",
          "traefik.http.routers.nomad-mcp.middlewares=proxyheaders@consulcatalog"
        ]

        check {
          type = "http"
          path = "/api/health"
          interval = "10s"
          timeout = "2s"

          check_restart {
            limit = 3
            grace = "60s"
          }
        }
      }
    }
  }

  # Define update strategy
  update {
    max_parallel = 1
    min_healthy_time = "30s"
    healthy_deadline = "5m"
    auto_revert = true
  }
}
9
requirements.txt
Normal file
@ -0,0 +1,9 @@
fastapi
uvicorn
python-nomad
pydantic
python-dotenv
httpx
python-multipart
pyyaml
requests
23
run.py
Normal file
@ -0,0 +1,23 @@
#!/usr/bin/env python
import uvicorn
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Configuration from environment
host = os.getenv("HOST", "0.0.0.0")
port = int(os.getenv("PORT", "8000"))
reload = os.getenv("RELOAD", "false").lower() == "true"

if __name__ == "__main__":
    print(f"Starting Nomad MCP service on {host}:{port}")
    print(f"API documentation available at http://{host}:{port}/docs")

    uvicorn.run(
        "app.main:app",
        host=host,
        port=port,
        reload=reload,
    )
355
static/app.js
Normal file
@ -0,0 +1,355 @@
// API endpoints
const API_BASE_URL = '/api/claude';
const ENDPOINTS = {
    listJobs: `${API_BASE_URL}/list-jobs`,
    manageJob: `${API_BASE_URL}/jobs`,
    jobLogs: `${API_BASE_URL}/job-logs`
};

// DOM elements
const elements = {
    namespaceSelector: document.getElementById('namespace-selector'),
    refreshBtn: document.getElementById('refresh-btn'),
    jobList: document.getElementById('job-list'),
    jobTable: document.getElementById('job-table'),
    jobDetails: document.getElementById('job-details'),
    logContent: document.getElementById('log-content'),
    logTabs: document.querySelectorAll('.log-tab'),
    loading: document.getElementById('loading'),
    errorMessage: document.getElementById('error-message')
};

// State
let state = {
    jobs: [],
    selectedJob: null,
    selectedNamespace: 'development',
    logs: {
        stdout: '',
        stderr: '',
        currentTab: 'stdout'
    }
};

// Initialize the app
function init() {
    // Set up event listeners
    elements.namespaceSelector.addEventListener('change', handleNamespaceChange);
    elements.refreshBtn.addEventListener('click', loadJobs);
    elements.logTabs.forEach(tab => {
        tab.addEventListener('click', () => {
            const logType = tab.getAttribute('data-log-type');
            switchLogTab(logType);
        });
    });

    // Load initial jobs
    loadJobs();
}

// Load jobs from the API
async function loadJobs() {
    showLoading(true);
    hideError();

    try {
        const namespace = elements.namespaceSelector.value;
        const response = await fetch(`${ENDPOINTS.listJobs}?namespace=${namespace}`);

        if (!response.ok) {
            throw new Error(`Failed to load jobs: ${response.statusText}`);
        }

        const jobs = await response.json();
        state.jobs = jobs;
        state.selectedNamespace = namespace;

        renderJobList();
        showLoading(false);
    } catch (error) {
        console.error('Error loading jobs:', error);
        showError(`Failed to load jobs: ${error.message}`);
        showLoading(false);
    }
}

// Render the job list
function renderJobList() {
    elements.jobList.innerHTML = '';

    if (state.jobs.length === 0) {
        const row = document.createElement('tr');
        row.innerHTML = `<td colspan="4" class="no-jobs">No jobs found in the ${state.selectedNamespace} namespace</td>`;
        elements.jobList.appendChild(row);
        return;
    }

    state.jobs.forEach(job => {
        const row = document.createElement('tr');
        row.setAttribute('data-job-id', job.id);
        row.innerHTML = `
            <td>${job.id}</td>
            <td>${job.type}</td>
            <td><span class="status status-${job.status.toLowerCase()}">${job.status}</span></td>
            <td class="job-actions">
                <button class="btn btn-primary btn-view" data-job-id="${job.id}">View</button>
                <button class="btn btn-success btn-restart" data-job-id="${job.id}">Restart</button>
                <button class="btn btn-danger btn-stop" data-job-id="${job.id}">Stop</button>
            </td>
        `;

        elements.jobList.appendChild(row);
    });

    // Add event listeners to buttons
    document.querySelectorAll('.btn-view').forEach(btn => {
        btn.addEventListener('click', () => viewJob(btn.getAttribute('data-job-id')));
    });

    document.querySelectorAll('.btn-restart').forEach(btn => {
        btn.addEventListener('click', () => restartJob(btn.getAttribute('data-job-id')));
    });

    document.querySelectorAll('.btn-stop').forEach(btn => {
        btn.addEventListener('click', () => stopJob(btn.getAttribute('data-job-id')));
    });
}

// View job details
async function viewJob(jobId) {
    showLoading(true);

    try {
        // Get job status
        const statusResponse = await fetch(ENDPOINTS.manageJob, {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json'
            },
            body: JSON.stringify({
                job_id: jobId,
                action: 'status',
                namespace: state.selectedNamespace
            })
        });

        if (!statusResponse.ok) {
            throw new Error(`Failed to get job status: ${statusResponse.statusText}`);
        }

        const jobStatus = await statusResponse.json();
        state.selectedJob = jobStatus;

        // Get job logs
        const logsResponse = await fetch(`${ENDPOINTS.jobLogs}/${jobId}?namespace=${state.selectedNamespace}`);

        if (logsResponse.ok) {
            const logsData = await logsResponse.json();

            if (logsData.success) {
                state.logs.stdout = logsData.logs.stdout || 'No stdout logs available';
                state.logs.stderr = logsData.logs.stderr || 'No stderr logs available';
            } else {
                state.logs.stdout = 'Logs not available';
                state.logs.stderr = 'Logs not available';
            }
        } else {
            state.logs.stdout = 'Failed to load logs';
            state.logs.stderr = 'Failed to load logs';
        }

        renderJobDetails();
        renderLogs();
        showLoading(false);

        // Highlight the selected job in the table
        document.querySelectorAll('#job-list tr').forEach(row => {
            row.classList.remove('selected');
        });

        const selectedRow = document.querySelector(`#job-list tr[data-job-id="${jobId}"]`);
        if (selectedRow) {
            selectedRow.classList.add('selected');
        }
    } catch (error) {
        console.error('Error viewing job:', error);
        showError(`Failed to view job: ${error.message}`);
        showLoading(false);
    }
}

// Restart a job
async function restartJob(jobId) {
    if (!confirm(`Are you sure you want to restart job "${jobId}"?`)) {
        return;
    }

    showLoading(true);

    try {
        const response = await fetch(ENDPOINTS.manageJob, {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json'
            },
            body: JSON.stringify({
                job_id: jobId,
                action: 'restart',
                namespace: state.selectedNamespace
            })
        });

        if (!response.ok) {
            throw new Error(`Failed to restart job: ${response.statusText}`);
        }

        const result = await response.json();

        if (result.success) {
            alert(`Job "${jobId}" has been restarted successfully.`);
            loadJobs();
        } else {
            throw new Error(result.message);
        }

        showLoading(false);
    } catch (error) {
        console.error('Error restarting job:', error);
        showError(`Failed to restart job: ${error.message}`);
        showLoading(false);
    }
}

// Stop a job
async function stopJob(jobId) {
    const purge = confirm(`Do you want to purge job "${jobId}" after stopping?`);

    if (!confirm(`Are you sure you want to stop job "${jobId}"?`)) {
        return;
    }

    showLoading(true);

    try {
        const response = await fetch(ENDPOINTS.manageJob, {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json'
            },
            body: JSON.stringify({
                job_id: jobId,
                action: 'stop',
                namespace: state.selectedNamespace,
                purge: purge
            })
        });

        if (!response.ok) {
            throw new Error(`Failed to stop job: ${response.statusText}`);
        }

        const result = await response.json();

        if (result.success) {
            alert(`Job "${jobId}" has been stopped${purge ? ' and purged' : ''} successfully.`);
            loadJobs();
        } else {
            throw new Error(result.message);
        }

        showLoading(false);
    } catch (error) {
        console.error('Error stopping job:', error);
        showError(`Failed to stop job: ${error.message}`);
        showLoading(false);
    }
}

// Render job details
function renderJobDetails() {
    if (!state.selectedJob) {
        elements.jobDetails.innerHTML = '<p class="select-job-message">Select a job to view details</p>';
        return;
    }

    const job = state.selectedJob;
    const details = job.details?.job || {};
    const allocation = job.details?.latest_allocation || {};

    let detailsHtml = `
        <h3>${job.job_id}</h3>
        <p><span class="label">Status:</span> <span class="status status-${job.status.toLowerCase()}">${job.status}</span></p>
    `;

    if (details.Type) {
        detailsHtml += `<p><span class="label">Type:</span> ${details.Type}</p>`;
    }

    if (details.Namespace) {
        detailsHtml += `<p><span class="label">Namespace:</span> ${details.Namespace}</p>`;
    }

    if (details.Datacenters) {
        detailsHtml += `<p><span class="label">Datacenters:</span> ${details.Datacenters.join(', ')}</p>`;
    }

    if (allocation.ID) {
        detailsHtml += `
            <h3>Latest Allocation</h3>
            <p><span class="label">ID:</span> ${allocation.ID}</p>
            <p><span class="label">Status:</span> ${allocation.ClientStatus || 'Unknown'}</p>
        `;

        if (allocation.ClientDescription) {
            detailsHtml += `<p><span class="label">Description:</span> ${allocation.ClientDescription}</p>`;
        }
    }

    elements.jobDetails.innerHTML = detailsHtml;
}

// Render logs
function renderLogs() {
    elements.logContent.textContent = state.logs[state.logs.currentTab];
}

// Switch log tab
function switchLogTab(logType) {
    state.logs.currentTab = logType;

    // Update active tab
    elements.logTabs.forEach(tab => {
        if (tab.getAttribute('data-log-type') === logType) {
            tab.classList.add('active');
        } else {
            tab.classList.remove('active');
        }
    });

    renderLogs();
}

// Handle namespace change
function handleNamespaceChange() {
    loadJobs();
}

// Show/hide loading indicator
function showLoading(show) {
    elements.loading.style.display = show ? 'block' : 'none';
    elements.jobTable.style.display = show ? 'none' : 'table';
}

// Show error message
function showError(message) {
    elements.errorMessage.textContent = message;
    elements.errorMessage.style.display = 'block';
}

// Hide error message
function hideError() {
    elements.errorMessage.style.display = 'none';
}

// Initialize the app when the DOM is loaded
document.addEventListener('DOMContentLoaded', init);
66
static/index.html
Normal file
@ -0,0 +1,66 @@
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Nomad Job Manager</title>
    <link rel="stylesheet" href="styles.css">
</head>
<body>
    <div class="container">
        <header>
            <h1>Nomad Job Manager</h1>
            <div class="controls">
                <select id="namespace-selector">
                    <option value="development">development</option>
                    <option value="default">default</option>
                    <option value="system">system</option>
                </select>
                <button id="refresh-btn" class="btn btn-primary">Refresh</button>
            </div>
        </header>

        <main>
            <div class="job-list-container">
                <h2>Jobs</h2>
                <div id="loading" class="loading">Loading jobs...</div>
                <div id="error-message" class="error-message"></div>
                <table id="job-table" class="job-table">
                    <thead>
                        <tr>
                            <th>Job ID</th>
                            <th>Type</th>
                            <th>Status</th>
                            <th>Actions</th>
                        </tr>
                    </thead>
                    <tbody id="job-list">
                        <!-- Jobs will be populated here -->
                    </tbody>
                </table>
            </div>

            <div class="job-details-container">
                <h2>Job Details</h2>
                <div id="job-details">
                    <p class="select-job-message">Select a job to view details</p>
                </div>
                <div id="job-logs" class="job-logs">
                    <h3>Logs</h3>
                    <div class="log-tabs">
                        <button class="log-tab active" data-log-type="stdout">stdout</button>
                        <button class="log-tab" data-log-type="stderr">stderr</button>
                    </div>
                    <pre id="log-content" class="log-content">Select a job to view logs</pre>
                </div>
            </div>
        </main>

        <footer>
            <p>Nomad MCP Service - Claude Integration</p>
        </footer>
    </div>

    <script src="app.js"></script>
</body>
</html>
244 static/styles.css Normal file
@@ -0,0 +1,244 @@
/* Base styles */
:root {
    --primary-color: #1976d2;
    --secondary-color: #424242;
    --success-color: #4caf50;
    --danger-color: #f44336;
    --warning-color: #ff9800;
    --light-gray: #f5f5f5;
    --border-color: #e0e0e0;
    --text-color: #333;
    --text-light: #666;
}

* {
    box-sizing: border-box;
    margin: 0;
    padding: 0;
}

body {
    font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
    line-height: 1.6;
    color: var(--text-color);
    background-color: #f9f9f9;
}

.container {
    max-width: 1200px;
    margin: 0 auto;
    padding: 20px;
}

/* Header */
header {
    display: flex;
    justify-content: space-between;
    align-items: center;
    margin-bottom: 20px;
    padding-bottom: 10px;
    border-bottom: 1px solid var(--border-color);
}

.controls {
    display: flex;
    gap: 10px;
}

/* Buttons */
.btn {
    padding: 8px 16px;
    border: none;
    border-radius: 4px;
    cursor: pointer;
    font-weight: 500;
    transition: background-color 0.2s;
}

.btn-primary {
    background-color: var(--primary-color);
    color: white;
}

.btn-success {
    background-color: var(--success-color);
    color: white;
}

.btn-danger {
    background-color: var(--danger-color);
    color: white;
}

.btn-warning {
    background-color: var(--warning-color);
    color: white;
}

.btn:hover {
    opacity: 0.9;
}

/* Form elements */
select {
    padding: 8px;
    border: 1px solid var(--border-color);
    border-radius: 4px;
    background-color: white;
}

/* Main content */
main {
    display: grid;
    grid-template-columns: 1fr 1fr;
    gap: 20px;
}

/* Job list */
.job-list-container {
    background-color: white;
    border-radius: 8px;
    box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
    padding: 20px;
}

.job-table {
    width: 100%;
    border-collapse: collapse;
    margin-top: 10px;
}

.job-table th,
.job-table td {
    padding: 12px;
    text-align: left;
    border-bottom: 1px solid var(--border-color);
}

.job-table th {
    background-color: var(--light-gray);
    font-weight: 600;
}

.job-table tr:hover {
    background-color: var(--light-gray);
}

.job-actions {
    display: flex;
    gap: 5px;
}

/* Job details */
.job-details-container {
    background-color: white;
    border-radius: 8px;
    box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
    padding: 20px;
}

.job-details {
    margin-bottom: 20px;
}

.job-details h3 {
    margin-top: 15px;
    margin-bottom: 5px;
    color: var(--secondary-color);
}

.job-details p {
    margin-bottom: 10px;
}

.job-details .label {
    font-weight: 600;
    color: var(--text-light);
}

/* Logs */
.job-logs {
    margin-top: 20px;
}

.log-tabs {
    display: flex;
    margin-bottom: 10px;
}

.log-tab {
    padding: 8px 16px;
    background-color: var(--light-gray);
    border: 1px solid var(--border-color);
    border-bottom: none;
    cursor: pointer;
}

.log-tab.active {
    background-color: white;
    border-bottom: 2px solid var(--primary-color);
}

.log-content {
    background-color: #282c34;
    color: #abb2bf;
    padding: 15px;
    border-radius: 4px;
    overflow: auto;
    max-height: 300px;
    font-family: 'Courier New', Courier, monospace;
    white-space: pre-wrap;
}

/* Status indicators */
.status {
    display: inline-block;
    padding: 4px 8px;
    border-radius: 4px;
    font-size: 0.85em;
    font-weight: 500;
}

.status-running {
    background-color: rgba(76, 175, 80, 0.2);
    color: #2e7d32;
}

.status-pending {
    background-color: rgba(255, 152, 0, 0.2);
    color: #ef6c00;
}

.status-dead {
    background-color: rgba(244, 67, 54, 0.2);
    color: #c62828;
}

/* Loading and error states */
.loading {
    padding: 20px;
    text-align: center;
    color: var(--text-light);
}

.error-message {
    padding: 10px;
    background-color: rgba(244, 67, 54, 0.1);
    color: var(--danger-color);
    border-radius: 4px;
    margin: 10px 0;
    display: none;
}

.select-job-message {
    color: var(--text-light);
    font-style: italic;
}

/* Footer */
footer {
    margin-top: 40px;
    text-align: center;
    color: var(--text-light);
    font-size: 0.9em;
}
123 test_direct_nomad.py Normal file
@@ -0,0 +1,123 @@
#!/usr/bin/env python
"""
Test script to directly use the Nomad client library.
"""

import os
import sys
import uuid
import nomad
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

def get_test_job_spec(job_id):
    """Create a simple test job specification."""
    return {
        "Job": {
            "ID": job_id,
            "Name": job_id,
            "Type": "service",
            "Datacenters": ["jm"],
            "Namespace": "development",
            "Priority": 50,
            "TaskGroups": [
                {
                    "Name": "app",
                    "Count": 1,
                    "Tasks": [
                        {
                            "Name": "nginx",
                            "Driver": "docker",
                            "Config": {
                                "image": "nginx:latest",
                                "ports": ["http"],
                            },
                            "Resources": {
                                "CPU": 100,
                                "MemoryMB": 128
                            }
                        }
                    ],
                    "Networks": [
                        {
                            "DynamicPorts": [
                                {
                                    "Label": "http",
                                    "Value": 0,
                                    "To": 80
                                }
                            ]
                        }
                    ]
                }
            ]
        }
    }

def main():
    print("Testing direct Nomad client...")

    # Check if NOMAD_ADDR is configured
    nomad_addr = os.getenv("NOMAD_ADDR")
    if not nomad_addr:
        print("Error: NOMAD_ADDR is not configured in .env file.")
        sys.exit(1)

    print(f"Connecting to Nomad at: {nomad_addr}")

    try:
        # Extract host and port from the address
        host_with_port = nomad_addr.replace("http://", "").replace("https://", "")
        host = host_with_port.split(":")[0]

        # Safely extract port
        port_part = host_with_port.split(":")[-1] if ":" in host_with_port else "4646"
        port = int(port_part.split('/')[0])  # Remove any path components

        # Initialize the Nomad client
        client = nomad.Nomad(
            host=host,
            port=port,
            secure=nomad_addr.startswith("https"),
            timeout=10,
            namespace="development",  # Set namespace explicitly
            verify=False
        )

        # Create a unique job ID for testing
        job_id = f"test-job-{uuid.uuid4().hex[:8]}"
        print(f"Created test job ID: {job_id}")

        # Create job specification
        job_spec = get_test_job_spec(job_id)
        print("Created job specification with explicit namespace: development")

        # Start the job
        print(f"Attempting to start job {job_id}...")

        # Print the job spec for debugging
        print(f"Job spec structure: {list(job_spec.keys())}")
        print(f"Job keys: {list(job_spec['Job'].keys())}")

        # Register the job
        response = client.job.register_job(job_id, job_spec)

        print(f"Job registration response: {response}")
        print(f"Job {job_id} started successfully!")

        # Clean up - stop the job
        print(f"Stopping job {job_id}...")
        stop_response = client.job.deregister_job(job_id, purge=True)
        print(f"Job stop response: {stop_response}")
        print(f"Job {job_id} stopped and purged successfully!")

        print("\nDirect Nomad client test completed successfully.")

    except Exception as e:
        print(f"Error during direct Nomad client test: {str(e)}")
        sys.exit(1)

if __name__ == "__main__":
    main()
90 test_gitea_integration.py Normal file
@@ -0,0 +1,90 @@
#!/usr/bin/env python
"""
Test script to verify Gitea integration with Nomad MCP.
This script tests the basic functionality of the Gitea client.
"""

import os
import sys
from dotenv import load_dotenv
from app.services.gitea_client import GiteaClient

# Load environment variables from .env file
load_dotenv()

def main():
    print("Testing Gitea integration with Nomad MCP...")

    # Check if Gitea API URL is configured
    gitea_api_url = os.getenv("GITEA_API_URL")
    if not gitea_api_url:
        print("Error: GITEA_API_URL is not configured in .env file.")
        print("Please configure the Gitea API URL and try again.")
        sys.exit(1)

    # Check if authentication is configured
    gitea_token = os.getenv("GITEA_API_TOKEN")
    gitea_username = os.getenv("GITEA_USERNAME")
    gitea_password = os.getenv("GITEA_PASSWORD")

    if not gitea_token and not (gitea_username and gitea_password):
        print("Warning: No authentication configured for Gitea API.")
        print("You might not be able to access protected repositories.")

    # Initialize the Gitea client
    gitea_client = GiteaClient()

    # Test listing repositories
    print("\nTesting repository listing...")
    repositories = gitea_client.list_repositories(limit=5)

    if not repositories:
        print("No repositories found or error occurred.")
    else:
        print(f"Found {len(repositories)} repositories:")
        for repo in repositories:
            print(f"  - {repo.get('full_name')}: {repo.get('html_url')}")

    # Test parsing repository URLs
    print("\nTesting repository URL parsing...")
    test_urls = [
        f"{gitea_api_url.replace('/api/v1', '')}/username/repo-name",
        "http://gitea.internal.example.com/org/project",
        "https://gitea.example.com/user/repository",
    ]

    for url in test_urls:
        try:
            owner, repo = gitea_client.parse_repo_url(url)
            print(f"  {url} -> Owner: {owner}, Repo: {repo}")
        except ValueError as e:
            print(f"  {url} -> Error: {str(e)}")

    # If we have repositories, test getting repository info for the first one
    if repositories:
        print("\nTesting repository info retrieval...")
        first_repo = repositories[0]
        repo_url = first_repo.get("html_url")

        repo_info = gitea_client.get_repository_info(repo_url)
        if repo_info:
            print(f"Repository info for {repo_url}:")
            print(f"  Name: {repo_info.get('name')}")
            print(f"  Description: {repo_info.get('description')}")
            print(f"  Default branch: {repo_info.get('default_branch')}")
            print(f"  Stars: {repo_info.get('stars_count')}")
            print(f"  Forks: {repo_info.get('forks_count')}")

            # Test getting branches
            branches = gitea_client.get_repository_branches(repo_url)
            if branches:
                print(f"  Branches: {', '.join([b.get('name') for b in branches])}")
            else:
                print("  No branches found or error occurred.")
        else:
            print(f"Error retrieving repository info for {repo_url}")

    print("\nGitea integration test completed.")

if __name__ == "__main__":
    main()
54 test_gitea_repos.py Normal file
@@ -0,0 +1,54 @@
#!/usr/bin/env python
"""
Test script to list all accessible Gitea repositories grouped by owner.
This will show both personal and organization repositories.
"""

import os
import sys
from collections import defaultdict
from dotenv import load_dotenv
from app.services.gitea_client import GiteaClient

# Load environment variables from .env file
load_dotenv()

def main():
    print("Testing Gitea Repository Access for Personal and Organization Accounts...")

    # Check if Gitea API URL is configured
    gitea_api_url = os.getenv("GITEA_API_URL")
    if not gitea_api_url:
        print("Error: GITEA_API_URL is not configured in .env file.")
        sys.exit(1)

    # Initialize the Gitea client
    gitea_client = GiteaClient()

    # Get all repositories (increase limit if you have many)
    repositories = gitea_client.list_repositories(limit=100)

    if not repositories:
        print("No repositories found or error occurred.")
        sys.exit(1)

    # Group repositories by owner
    owners = defaultdict(list)
    for repo in repositories:
        owner_name = repo.get('owner', {}).get('login', 'unknown')
        owners[owner_name].append(repo)

    # Display repositories grouped by owner
    print(f"\nFound {len(repositories)} repositories across {len(owners)} owners:")

    for owner, repos in owners.items():
        print(f"\n== {owner} ({len(repos)} repositories) ==")
        for repo in repos:
            print(f"  - {repo.get('name')}: {repo.get('html_url')}")
            print(f"    Description: {repo.get('description') or 'No description'}")
            print(f"    Default branch: {repo.get('default_branch')}")

    print("\nTest completed successfully.")

if __name__ == "__main__":
    main()
100 test_job_registration.py Normal file
@@ -0,0 +1,100 @@
#!/usr/bin/env python
"""
Test script to verify job registration with explicit namespace.
"""

import os
import sys
import uuid
from dotenv import load_dotenv
from app.services.nomad_client import NomadService

# Load environment variables from .env file
load_dotenv()

def get_test_job_spec(job_id):
    """Create a simple test job specification."""
    return {
        "ID": job_id,
        "Name": job_id,
        "Type": "service",
        "Datacenters": ["jm"],
        "Namespace": "development",
        "Priority": 50,
        "TaskGroups": [
            {
                "Name": "app",
                "Count": 1,
                "Tasks": [
                    {
                        "Name": "nginx",
                        "Driver": "docker",
                        "Config": {
                            "image": "nginx:latest",
                            "ports": ["http"],
                        },
                        "Resources": {
                            "CPU": 100,
                            "MemoryMB": 128
                        }
                    }
                ],
                "Networks": [
                    {
                        "DynamicPorts": [
                            {
                                "Label": "http",
                                "Value": 0,
                                "To": 80
                            }
                        ]
                    }
                ]
            }
        ]
    }

def main():
    print("Testing Nomad job registration...")

    # Check if NOMAD_ADDR is configured
    nomad_addr = os.getenv("NOMAD_ADDR")
    if not nomad_addr:
        print("Error: NOMAD_ADDR is not configured in .env file.")
        sys.exit(1)

    print(f"Connecting to Nomad at: {nomad_addr}")

    try:
        # Initialize the Nomad service
        nomad_service = NomadService()

        # Create a unique job ID for testing
        job_id = f"test-job-{uuid.uuid4().hex[:8]}"
        print(f"Created test job ID: {job_id}")

        # Create job specification
        job_spec = get_test_job_spec(job_id)
        print("Created job specification with explicit namespace: development")

        # Start the job
        print(f"Attempting to start job {job_id}...")
        start_response = nomad_service.start_job(job_spec)

        print(f"Job start response: {start_response}")
        print(f"Job {job_id} started successfully!")

        # Clean up - stop the job
        print(f"Stopping job {job_id}...")
        stop_response = nomad_service.stop_job(job_id, purge=True)
        print(f"Job stop response: {stop_response}")
        print(f"Job {job_id} stopped and purged successfully!")

        print("\nNomad job registration test completed successfully.")

    except Exception as e:
        print(f"Error during job registration test: {str(e)}")
        sys.exit(1)

if __name__ == "__main__":
    main()
66 test_nomad_connection.py Normal file
@@ -0,0 +1,66 @@
#!/usr/bin/env python
"""
Test script to verify Nomad connection and check for specific jobs.
"""

import os
import sys
from dotenv import load_dotenv
from pprint import pprint
from app.services.nomad_client import NomadService

# Load environment variables from .env file
load_dotenv()

def main():
    print("Testing Nomad connection...")

    # Check if NOMAD_ADDR is configured
    nomad_addr = os.getenv("NOMAD_ADDR")
    if not nomad_addr:
        print("Error: NOMAD_ADDR is not configured in .env file.")
        sys.exit(1)

    print(f"Connecting to Nomad at: {nomad_addr}")

    try:
        # Initialize the Nomad service
        nomad_service = NomadService()

        # List all jobs
        print("\nListing all jobs...")
        jobs = nomad_service.list_jobs()
        print(f"Found {len(jobs)} jobs:")

        # Print each job's ID and status
        for job in jobs:
            print(f"  - {job.get('ID')}: {job.get('Status')}")

        # Look for specific job
        job_id = "ms-qc-db-dev"
        print(f"\nLooking for job '{job_id}'...")

        job_found = False
        for job in jobs:
            if job.get('ID') == job_id:
                job_found = True
                print(f"Found job '{job_id}'!")
                print(f"  Status: {job.get('Status')}")
                print(f"  Type: {job.get('Type')}")
                print(f"  Priority: {job.get('Priority')}")
                break

        if not job_found:
            print(f"Job '{job_id}' not found in the list of jobs.")
            print("Available jobs:")
            for job in jobs:
                print(f"  - {job.get('ID')}")

        print("\nNomad connection test completed successfully.")

    except Exception as e:
        print(f"Error connecting to Nomad: {str(e)}")
        sys.exit(1)

if __name__ == "__main__":
    main()
86 test_nomad_namespaces.py Normal file
@@ -0,0 +1,86 @@
#!/usr/bin/env python
"""
Test script to identify the exact namespace of the ms-qc-db-dev job.
"""

import os
import sys
from dotenv import load_dotenv
import nomad
from pprint import pprint

# Load environment variables from .env file
load_dotenv()

def get_nomad_client():
    """Create a direct nomad client without going through our service layer."""
    nomad_addr = os.getenv("NOMAD_ADDR", "http://localhost:4646").rstrip('/')
    host_with_port = nomad_addr.replace("http://", "").replace("https://", "")
    host = host_with_port.split(":")[0]

    # Safely extract port
    port_part = host_with_port.split(":")[-1] if ":" in host_with_port else "4646"
    port = int(port_part.split('/')[0])

    return nomad.Nomad(
        host=host,
        port=port,
        timeout=10,
        namespace="*",  # Try with explicit wildcard
        verify=False
    )

def main():
    print("Creating Nomad client...")
    client = get_nomad_client()

    print("\n=== Testing with namespace='*' ===")
    try:
        # List all jobs with namespace '*'
        jobs = client.jobs.get_jobs(namespace="*")
        print(f"Found {len(jobs)} jobs using namespace='*'")

        # Look for our specific job and show its namespace
        found = False
        for job in jobs:
            if job.get('ID') == 'ms-qc-db-dev':
                found = True
                print(f"\nFound job 'ms-qc-db-dev' in namespace: {job.get('Namespace', 'unknown')}")
                print(f"Job status: {job.get('Status')}")
                print(f"Job type: {job.get('Type')}")
                print(f"Job priority: {job.get('Priority')}")
                break

        if not found:
            print("\nJob 'ms-qc-db-dev' not found with namespace='*'")
    except Exception as e:
        print(f"Error with namespace='*': {str(e)}")

    # Try listing all available namespaces
    print("\n=== Listing available namespaces ===")
    try:
        namespaces = client.namespaces.get_namespaces()
        print(f"Found {len(namespaces)} namespaces:")
        for ns in namespaces:
            print(f"  - {ns.get('Name')}")

        # Try finding the job in each namespace specifically
        print("\n=== Searching for job in each namespace ===")
        for ns in namespaces:
            ns_name = ns.get('Name')
            try:
                job = client.job.get_job('ms-qc-db-dev', namespace=ns_name)
                print(f"Found job in namespace '{ns_name}'!")
                print(f"  Status: {job.get('Status')}")
                print(f"  Type: {job.get('Type')}")
                break
            except Exception:
                print(f"Not found in namespace '{ns_name}'")

    except Exception as e:
        print(f"Error listing namespaces: {str(e)}")

    print("\nTest completed.")

if __name__ == "__main__":
    main()
Binary file not shown.
193 tests/test_nomad_service.py Normal file
@@ -0,0 +1,193 @@
import os
import pytest
import time
import uuid
from dotenv import load_dotenv
from app.services.nomad_client import NomadService

# Load environment variables
load_dotenv()

# Skip tests if Nomad server is not configured
nomad_addr = os.getenv("NOMAD_ADDR")
if not nomad_addr:
    pytest.skip("NOMAD_ADDR not configured", allow_module_level=True)

# Test job ID prefix - each test will append a unique suffix
TEST_JOB_ID_PREFIX = "test-job-"

# Simple nginx job specification template for testing
def get_test_job_spec(job_id):
    return {
        "ID": job_id,
        "Name": job_id,
        "Type": "service",
        "Datacenters": ["jm"],  # Adjust to match your Nomad cluster
        "Namespace": "development",
        "Priority": 50,
        "TaskGroups": [
            {
                "Name": "app",
                "Count": 1,
                "Tasks": [
                    {
                        "Name": "nginx",
                        "Driver": "docker",
                        "Config": {
                            "image": "nginx:latest",
                            "ports": ["http"],
                        },
                        "Resources": {
                            "CPU": 100,
                            "MemoryMB": 128
                        }
                    }
                ],
                "Networks": [
                    {
                        "DynamicPorts": [
                            {
                                "Label": "http",
                                "Value": 0,
                                "To": 80
                            }
                        ]
                    }
                ]
            }
        ]
    }

@pytest.fixture
def nomad_service():
    """Fixture to provide a NomadService instance."""
    return NomadService()

@pytest.fixture
def test_job_id():
    """Fixture to provide a unique job ID for each test."""
    job_id = f"{TEST_JOB_ID_PREFIX}{uuid.uuid4().hex[:8]}"
    yield job_id

    # Cleanup: ensure job is stopped after the test
    try:
        service = NomadService()
        service.stop_job(job_id, purge=True)
        print(f"Cleaned up job {job_id}")
    except Exception as e:
        print(f"Error cleaning up job {job_id}: {str(e)}")

def test_job_start_and_stop(nomad_service, test_job_id):
    """Test starting and stopping a job."""
    # Create job specification
    job_spec = get_test_job_spec(test_job_id)

    # Start the job
    start_response = nomad_service.start_job(job_spec)
    assert start_response["job_id"] == test_job_id
    assert start_response["status"] == "started"
    assert "eval_id" in start_response

    # Wait longer for job to be registered (increased from 2 to 10 seconds)
    time.sleep(10)

    # Verify job exists
    job = nomad_service.get_job(test_job_id)
    assert job["ID"] == test_job_id

    # Stop the job
    stop_response = nomad_service.stop_job(test_job_id)
    assert stop_response["job_id"] == test_job_id
    assert stop_response["status"] == "stopped"

    # Wait for job to be stopped
    time.sleep(5)

    # Verify job is stopped
    job = nomad_service.get_job(test_job_id)
    assert job["Stop"] is True

def test_job_with_namespace(nomad_service, test_job_id):
    """Test job with explicit namespace."""
    # Create job specification with explicit namespace
    job_spec = get_test_job_spec(test_job_id)
    job_spec["Namespace"] = "development"

    # Start the job
    start_response = nomad_service.start_job(job_spec)
    assert start_response["job_id"] == test_job_id
    assert start_response["namespace"] == "development"

    # Wait longer for job to be registered (increased from 2 to 10 seconds)
    time.sleep(10)

    # Verify job exists in the correct namespace
    job = nomad_service.get_job(test_job_id)
    assert job["Namespace"] == "development"

    # Clean up
    nomad_service.stop_job(test_job_id)

def test_job_with_job_wrapper(nomad_service, test_job_id):
    """Test job specification already wrapped in 'Job' key."""
    # Create job specification with Job wrapper
    job_spec = {
        "Job": get_test_job_spec(test_job_id)
    }

    # Start the job
    start_response = nomad_service.start_job(job_spec)
    assert start_response["job_id"] == test_job_id

    # Wait longer for job to be registered (increased from 2 to 10 seconds)
    time.sleep(10)

    # Verify job exists
    job = nomad_service.get_job(test_job_id)
    assert job["ID"] == test_job_id

    # Clean up
    nomad_service.stop_job(test_job_id)

def test_list_jobs(nomad_service):
    """Test listing jobs."""
    jobs = nomad_service.list_jobs()
    assert isinstance(jobs, list)

    # List should contain job details
    if jobs:
        assert "ID" in jobs[0]
        assert "Status" in jobs[0]

def test_job_lifecycle(nomad_service, test_job_id):
    """Test the full job lifecycle - start, check status, get allocations, stop."""
    # Start the job
    job_spec = get_test_job_spec(test_job_id)
    start_response = nomad_service.start_job(job_spec)
    assert start_response["status"] == "started"

    # Wait longer for job to be scheduled (increased from 5 to 15 seconds)
    time.sleep(15)

    # Check job status
    job = nomad_service.get_job(test_job_id)
    assert job["ID"] == test_job_id

    # Get allocations
    try:
        allocations = nomad_service.get_allocations(test_job_id)
        assert isinstance(allocations, list)
    except Exception:
        # It's possible allocations aren't available yet, which is okay for the test
        pass

    # Stop the job
    stop_response = nomad_service.stop_job(test_job_id)
    assert stop_response["status"] == "stopped"

    # Wait longer for job to be stopped (increased from 2 to 5 seconds)
    time.sleep(5)

    # Verify job is stopped
    job = nomad_service.get_job(test_job_id)
    assert job["Stop"] is True