How operations work
When you deploy applications through Tensor9, each customer appliance runs in isolated infrastructure that you don’t directly access. Without an operations capability, you would need to ask customers to grant you temporary access credentials, coordinate SSH sessions, or have customers run commands on your behalf - all of which are slow, error-prone, and don’t scale. Tensor9’s operations system solves this by providing secure, audited remote command execution:

1. Initiate operation: You run a Tensor9 CLI command specifying which appliance to target, which resource to access, and what command to execute (e.g., tensor9 ops kubectl -appName my-app -customerName acme-corp -originResourceId "aws_eks_cluster.main_cluster" -command "kubectl get pods").
2. Customer approval (optional): If the customer has configured approval workflows for operations, they review and approve the command before it executes. The customer sees the exact command you want to run.
3. Assume operate permissions: Your control plane assumes operate permissions in the customer’s appliance. Operations require operate permissions to execute commands and access resources for troubleshooting.
4. Execute command: The command executes within the customer’s appliance infrastructure. The execution happens in the customer’s environment, not in your control plane.
5. Return output: Command output is streamed back to your workstation via the control plane. You see the output as if you had run the command directly in the customer’s environment.
6. Audit logging: The operation (command, output, timestamp, operator identity) is logged in both the customer’s audit trail and your control plane’s audit trail.
Operations require operate permissions, which are time-bounded and may require customer approval. You cannot run operations during steady-state - you need active operate access to the appliance.
Types of operations
Tensor9 supports various types of remote operations depending on your application’s infrastructure:

Kubernetes operations
Execute kubectl commands on customer Kubernetes clusters. Your origin stack defines the Kubernetes cluster resource.
Multiple resources: If your origin stack defines multiple Kubernetes clusters, databases, or other resources, use the specific origin resource ID to target the correct resource. Each resource in your origin stack has a unique Terraform resource address (e.g., aws_eks_cluster.main_cluster vs aws_eks_cluster.analytics_cluster).
Kubernetes namespaces: Namespace names in kubectl commands should match your origin stack’s namespace configuration. If your origin stack creates namespaces dynamically (e.g., my-app-${var.instance_id}), use the actual namespace name as deployed in the target appliance. Tensor9 does not perform variable substitution for namespace names in kubectl commands - specify the literal namespace name.
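For example, to target the analytics cluster and a specific namespace, the command might look like the sketch below; the namespace value is an illustrative placeholder, while the CLI flags mirror the example at the top of this page.

```
# Target a specific cluster from the origin stack and a literal (already-substituted) namespace.
# The namespace value below is illustrative; use the name actually deployed in the appliance.
tensor9 ops kubectl -appName my-app -customerName acme-corp \
  -originResourceId "aws_eks_cluster.analytics_cluster" \
  -command "kubectl get pods -n my-app-prod-42"
```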
Container operations
Execute commands inside running containers.
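A minimal sketch of a container-level command, assuming you reach the container through the same ops kubectl form shown above; the deployment and container names are illustrative placeholders.

```
# Run a one-off command inside a running container via kubectl exec.
# Deployment and container names are placeholders for illustration.
tensor9 ops kubectl -appName my-app -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl exec deploy/my-app-api -c api -- env"
```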
Database operations
Run queries or administrative commands on databases. Your origin stack defines the database resource.
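A sketch of what a database operation could look like, assuming a SQL-style subcommand that mirrors the kubectl form above; the subcommand name, the aws_db_instance.main_db resource address, and the query are illustrative assumptions, not a documented CLI surface.

```
# Hypothetical sketch - the sql subcommand name and the database resource address are
# assumptions for illustration; only ops kubectl appears verbatim on this page.
tensor9 ops sql -appName my-app -customerName acme-corp \
  -originResourceId "aws_db_instance.main_db" \
  -command "SELECT count(*) FROM jobs WHERE status = 'failed';"
```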
Cloud resource operations
Perform operations on cloud resources. Your origin stack defines the resources.
Customer approval workflows
Your customers can configure whether operations require approval and what approval process to use:

Automatic approval
Operations execute immediately without customer approval. This is useful for:
- Trusted vendor relationships
- Non-production appliances (development, staging)
- Operations during maintenance windows
Manual approval
Customer administrators review and approve each operation before execution. Results are returned asynchronously - it could take hours or days for your customers to approve the command and then approve the output. The approval request shows:
- The exact command you want to execute
- Which appliance it targets
- Your identity (operator email/name)
- Timestamp of the request
Approval via time windows
Customers can configure time windows when operations are automatically approved. During these windows, both command execution and output delivery are automatically approved without manual review:
- Business hours: Operations during Monday-Friday 9 AM - 5 PM are auto-approved (command and output)
- Maintenance windows: Operations during scheduled maintenance are auto-approved (command and output)
- Always require approval: All operations require explicit approval for both command and output, regardless of timing
Operations and permissions
Operations require operate permissions to execute commands and access resources for troubleshooting.

Permission requirements
- Steady-state permissions: NOT sufficient for operations (read-only)
- Operate permissions: Required for operations (read-write access for troubleshooting and debugging)
- Deploy permissions: NOT used for operations (reserved for deployments and updates)
- Install permissions: NOT used for operations (reserved for major infrastructure changes)
Requesting operate access
Before running operations, you need operate access to the target appliance.
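A minimal sketch of the flow, based on the tensor9 access request command referenced in the error-handling table below; any appliance-targeting flags for the access request are omitted rather than guessed.

```
# Request time-bounded operate access before running operations.
# Appliance-targeting flags for this command are omitted here; the -level flag is
# taken from the error-handling guidance later on this page.
tensor9 access request -level operate

# Once access is granted (and approved, if required), run the operation:
tensor9 ops kubectl -appName my-app -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl get pods"
```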
Permission scope
Operate permissions for operations are scoped to:
- Resources owned by your application (tagged with your instance_id)
- The specific appliance you’re accessing
- The time window granted by the customer
Operations across form factors
Operations work across all form factors, with commands adapted to the target environment:

| Form Factor | Operations Support | Example Commands |
|---|---|---|
| AWS | kubectl (EKS), aws CLI, docker exec, Lambda invoke | kubectl get pods, aws s3 ls, aws lambda invoke |
| Google Cloud | kubectl (GKE), gcloud CLI, docker exec, Cloud Functions invoke | kubectl get pods, gcloud functions call |
| Azure | kubectl (AKS), az CLI, docker exec, Azure Functions invoke | kubectl get pods, az functionapp invoke |
| DigitalOcean | kubectl (DOKS), docker exec, doctl CLI | kubectl get pods, doctl compute droplet list |
| Private Kubernetes | kubectl, docker exec | kubectl get pods, docker exec |
| On-prem | kubectl, docker exec, SSH access | kubectl get pods, ssh user@host |
kubectl get pods works the same way whether the appliance runs on EKS, GKE, AKS, or private Kubernetes.
Audit logging
All operations are logged for security and compliance.

What gets logged
| Event | What Gets Logged |
|---|---|
| Operation request | Command, target appliance, operator identity, timestamp |
| Approval workflow | Approval status, approver identity, approval timestamp |
| Command execution | Full command executed, execution start/end time, exit code |
| Command output | Full stdout/stderr output from the command (may be truncated for very large outputs) |
| Permission assumption | When operate permissions were assumed, duration, approving principal |
Where logs are stored
| Audit Trail | Location | What It Contains |
|---|---|---|
| Customer audit trail | Customer’s CloudTrail (AWS), Cloud Audit Logs (GCP), Azure Monitor (Azure), or SIEM (private) | All operations executed in their appliance, including command, output, operator, and timestamp |
| Control plane audit trail | Your Tensor9 control plane | All operations your team initiated, approval workflows, and which appliances were accessed |
Interactive vs. non-interactive operations
Operations can be non-interactive (one-off commands) or interactive (persistent sessions).

Non-interactive operations
Execute a single command and return the output (see the example after this list):
- Status checks
- Log retrieval
- Database queries
- Resource inspection
- One-off commands that don’t require interaction
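For example, a one-off log retrieval might look like the sketch below, reusing the ops kubectl form from earlier; the deployment name is an illustrative placeholder.

```
# Retrieve recent logs without opening an interactive session.
# The deployment name is a placeholder for illustration.
tensor9 ops kubectl -appName my-app -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl logs deploy/my-app-api --tail=200"
```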
Interactive operations
For interactive work, create an ops endpoint that establishes a secure tunnel to the appliance. The ops endpoint requires customer approval to create the tunnel, but once established, commands executed through the tunnel do not require individual approval. This allows you to work interactively without waiting for approval on each command.

Create a kubectl ops endpoint
First, your origin stack defines a Kubernetes cluster resource (e.g., aws_eks_cluster.main_cluster). The -originResourceId parameter tells Tensor9 which resource from your origin stack to connect to. Tensor9 maps this to the corresponding deployed resource in the customer’s appliance (e.g., the EKS cluster that was created from aws_eks_cluster.main_cluster).
The kubeconfig file written to ~/.kube/kubectl-ops-main-cluster-abc123.tensor9.your-domain.co.yaml contains the connection details for the tunnel: the endpoint address, credentials, and a preconfigured context, so standard kubectl tooling works against it.
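A sketch of creating and then using a kubectl ops endpoint. The creation subcommand is an assumption (only ops kubectl appears verbatim on this page); the kubectl usage afterward relies only on the standard --kubeconfig flag and the generated file path shown above.

```
# Hypothetical creation command - the subcommand name is an assumption for illustration;
# the flags mirror the one-off ops command format shown earlier.
tensor9 ops endpoint create kubectl -appName my-app -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster"

# Once the endpoint is approved and the kubeconfig is written, use standard kubectl:
kubectl --kubeconfig ~/.kube/kubectl-ops-main-cluster-abc123.tensor9.your-domain.co.yaml \
  get pods -n my-app-prod-42

# Subsequent commands through the tunnel do not require individual approval:
kubectl --kubeconfig ~/.kube/kubectl-ops-main-cluster-abc123.tensor9.your-domain.co.yaml \
  describe deploy/my-app-api -n my-app-prod-42
```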
RBAC permissions for ops endpoints
The ops endpoint tunnel has Kubernetes RBAC permissions scoped to:
- Namespaces: Only your application’s namespaces (e.g., my-app-${instance_id})
- Resources: Pods, deployments, services, configmaps, secrets owned by your application
- Operations: Read (get, list, describe), write (create, update, delete), and exec permissions
Other ops endpoint types
Create ops endpoints for different access types by referencing origin stack resources.

PostgreSQL database access: Origin stack defines the database; the endpoint is scoped to the following (see the connection sketch after this list):
- Database: Only the vendor application’s database
- Schema access: Tables and schemas owned by the vendor application
- Operations: Full SQL access (SELECT, INSERT, UPDATE, DELETE, DDL) within vendor-owned schemas
- Restrictions: Cannot access system tables, other databases, or customer-owned schemas outside the vendor application
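A sketch of connecting through a PostgreSQL ops endpoint with standard psql; the hostname, user, and database name are illustrative assumptions - in practice you would use the connection details generated when the endpoint is created.

```
# Connect with standard psql using the connection details generated for the endpoint.
# Host, user, and database names below are placeholders for illustration.
psql "host=pg-ops-main-db-abc123.tensor9.your-domain.co port=5432 user=vendor_ops dbname=my_app sslmode=require"
```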
VM SSH access: Origin stack defines the VM; the endpoint is scoped to the following (see the session sketch after this list):
- Instance: Only the specified vendor-owned VM
- User permissions: Standard user permissions (not root by default)
- File system access: Vendor application directories and logs
- Commands: Standard shell commands for debugging and troubleshooting
- Restrictions: Cannot access other VMs, customer data directories, or system-critical files outside vendor application scope
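A sketch of an interactive session through a VM SSH endpoint; the hostname, user, and key path are illustrative assumptions based on the generated-credential pattern described above.

```
# Open an interactive shell on the vendor-owned VM through the ops endpoint.
# Hostname, user, and key path are placeholders for illustration.
ssh -i ~/.ssh/ssh-ops-app-vm-abc123.pem vendor-ops@ssh-ops-app-vm-abc123.tensor9.your-domain.co
```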
Use interactive ops endpoints for:
- Debugging complex issues requiring multiple commands
- Exploring file systems and logs interactively
- Testing and iterating on solutions
- Extended troubleshooting sessions
Managing ops endpoint lifecycle
Ops endpoints have their own lifecycle independent of operate access. Retire an endpoint when you’re finished working. Ops endpoints persist independently of operate access windows: you can maintain an active ops endpoint across multiple operate access sessions, or retire it when finished regardless of operate access status.
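A sketch of retiring an endpoint when you are finished; the subcommand name is an assumption that mirrors the creation sketch above, not a documented command.

```
# Hypothetical retire command - the subcommand name is an assumption for illustration.
tensor9 ops endpoint retire -appName my-app -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster"
```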
Cleaning up ops endpoints
Ops endpoints are cleaned up automatically or manually.

Automatic cleanup - Endpoints are automatically destroyed when the endpoint’s configured expiration time is reached; active connections are gracefully terminated.

Manual cleanup - When you retire an endpoint:
- All active connections (kubectl sessions, database connections, SSH sessions) are immediately terminated
- The endpoint domain becomes invalid
- Generated credentials (kubeconfig files, connection strings, SSH keys) stop working
- You must create a new endpoint to resume work
Error handling
Operations can fail for various reasons. Understanding how Tensor9 handles errors helps you troubleshoot effectively.

Common error scenarios
| Error Type | What Happens | How to Resolve |
|---|---|---|
| Customer denies approval | Operation is cancelled and you receive a denial notification with optional customer message | Review the operation request, adjust the command if needed, and resubmit with additional context |
| Operate access expired | Operation is rejected before execution | Request new operate access with tensor9 access request -level operate before retrying the operation |
| Command timeout | Operation terminates after timeout period (default: 5 minutes for non-interactive operations) | Break complex operations into smaller steps, or use interactive ops endpoints for long-running tasks |
| Command execution failure | Error output from the command is returned (exit code, stderr) | Review command syntax, check resource availability, verify permissions |
| Ops endpoint expired | Active sessions are terminated when endpoint expiration time is reached | Create a new ops endpoint to continue work |
| Invalid origin resource ID | Operation fails with error indicating resource not found in origin stack | Verify the origin resource ID matches a resource in your origin stack definition |
Error response format
Failed operations return structured error information describing what failed and why.

Operations and observability
Operations and observability work together:
- Observability identifies issues: Your observability platform alerts you to high error rates in a specific appliance
- Operations investigates: You use operations to inspect pods, check logs, query databases
- Operations remediates: You run commands to restart services, clear caches, or apply fixes
- Observability verifies: You confirm the issue is resolved through your observability dashboard
Best practices
Request minimal operate access duration
Request only the operate access duration you need for the operation. If you’re running a quick diagnostic command, request 15-30 minutes instead of several hours. This minimizes the window of elevated permissions.
Use read-only commands when possible
Prefer read-only commands (kubectl get, kubectl describe, SELECT queries) over commands that modify state. Only use write operations (kubectl delete, kubectl rollout restart, UPDATE queries) when necessary.
Test operations on test appliances first
Before running operations on customer appliances, test your commands on test appliances to ensure they work correctly and produce the expected results.
Never access customer data
Operations should be used for application debugging and infrastructure troubleshooting, not for accessing customer business data. Do not run queries that return customer PII, financial data, or proprietary information.
Related topics
- Permissions Model: Understanding operate permissions required for operations
- Observability: Using observability to identify when operations are needed
- Appliances: Customer environments where operations execute
- Deployments: Deployments vs. operations