Operations in Tensor9 enable you to execute remote commands on customer appliances for debugging, troubleshooting, and maintenance. Using the Tensor9 CLI, you can run kubectl commands, execute shell commands in containers, query databases, or perform other operational tasks across all your customer appliances from your workstation.

How operations work

When you deploy applications through Tensor9, each customer appliance runs in isolated infrastructure that you don’t directly access. Without operations capability, you would need to ask customers to grant you temporary access credentials, coordinate SSH sessions, or have customers run commands on your behalf - all of which are slow, error-prone, and don’t scale. Tensor9’s operations system solves this by providing secure, audited remote command execution:
1. Initiate operation

You run a Tensor9 CLI command specifying which appliance to target, which resource to access, and what command to execute (e.g., tensor9 ops kubectl -appName my-app -customerName acme-corp -originResourceId "aws_eks_cluster.main_cluster" -command "kubectl get pods").

2. Customer approval (optional)

If the customer has configured approval workflows for operations, they review and approve the command before it executes. The customer sees the exact command you want to run.

3. Assume operate permissions

Your control plane assumes operate permissions in the customer’s appliance. Operations require operate permissions to execute commands and access resources for troubleshooting.

4. Execute command

The command executes within the customer’s appliance infrastructure. The execution happens in the customer’s environment, not in your control plane.

5. Return output

Command output is streamed back to your workstation via the control plane. You see the output as if you had run the command directly in the customer’s environment.

6. Audit logging

The operation (command, output, timestamp, operator identity) is logged in both the customer’s audit trail and your control plane’s audit trail.
Operations require operate permissions, which are time-bounded and may require customer approval. You cannot run operations during steady-state - you need active operate access to the appliance.

Types of operations

Tensor9 supports various types of remote operations depending on your application’s infrastructure:

Kubernetes operations

Execute kubectl commands on customer Kubernetes clusters. Your origin stack defines the Kubernetes cluster resource:
# Origin stack snippet
resource "aws_eks_cluster" "main_cluster" {
  name     = "my-app-cluster-${var.instance_id}"
  role_arn = aws_iam_role.cluster.arn
  # ...
}

# If your origin stack defines multiple clusters, use unique resource names
resource "aws_eks_cluster" "analytics_cluster" {
  name     = "my-app-analytics-${var.instance_id}"
  role_arn = aws_iam_role.analytics.arn
  # ...
}
Reference the specific cluster when executing kubectl operations:
# Get pod status
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl get pods -n my-app-namespace"

# View logs from a specific pod
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl logs deployment/api -n my-app-namespace --tail=100"

# Restart a deployment
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl rollout restart deployment/api -n my-app-namespace"

# Describe a resource
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl describe pod api-7d9f8c5b4-xk2mn -n my-app-namespace"

# Target a different cluster using its origin resource ID
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.analytics_cluster" \
  -command "kubectl get pods -n analytics-namespace"
Multiple resources: If your origin stack defines multiple Kubernetes clusters, databases, or other resources, use the specific origin resource ID to target the correct resource. Each resource in your origin stack has a unique Terraform resource address (e.g., aws_eks_cluster.main_cluster vs aws_eks_cluster.analytics_cluster).
Kubernetes namespaces: Namespace names in kubectl commands should match your origin stack’s namespace configuration. If your origin stack creates namespaces dynamically (e.g., my-app-${var.instance_id}), use the actual namespace name as deployed in the target appliance. Tensor9 does not perform variable substitution for namespace names in kubectl commands - specify the literal namespace name.
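Because the namespace must be passed literally, one option is to expand it on your workstation before the command reaches Tensor9. The sketch below assumes the namespace follows the my-app-${var.instance_id} pattern and that you already know the appliance's instance ID. The helper name t9_get_pods is hypothetical and not part of the Tensor9 CLI.

```shell
# Hypothetical helper: expand the namespace locally, then pass the literal name.
# Usage: t9_get_pods <customer-name> <instance-id>
t9_get_pods() (
  customer=$1
  instance_id=$2
  # Assumed naming pattern from the origin stack: my-app-${var.instance_id}
  namespace="my-app-${instance_id}"
  tensor9 ops kubectl \
    -appName my-app \
    -customerName "$customer" \
    -originResourceId "aws_eks_cluster.main_cluster" \
    -command "kubectl get pods -n ${namespace}"
)
```

For example, t9_get_pods acme-corp abc123 would send "kubectl get pods -n my-app-abc123", where abc123 is a placeholder instance ID.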

Container operations

Execute commands inside running containers:
# Execute a shell command in a container (non-interactive)
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl exec deployment/api -n my-app-namespace -- curl http://localhost:8080/health"

# For interactive shell access, use an ops endpoint (see Interactive operations section)

Database operations

Run queries or administrative commands on databases. Your origin stack defines the database resource:
# Origin stack snippet
resource "aws_db_instance" "postgres" {
  identifier = "myapp-db-${var.instance_id}"
  engine     = "postgres"
  # ...
}
Reference the database when executing operations:
# Execute a PostgreSQL query
tensor9 ops db \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_db_instance.postgres" \
  -command "SELECT count(*) FROM users WHERE created_at > NOW() - INTERVAL '24 hours'"

# Check database replication lag
tensor9 ops db \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_db_instance.postgres" \
  -command "SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag"

Cloud resource operations

Perform operations on cloud resources. Your origin stack defines the resources:
# Origin stack snippet
resource "aws_s3_bucket" "data" {
  bucket = "my-app-data-${var.instance_id}"
  # ...
}

resource "aws_lambda_function" "api" {
  function_name = "my-app-api-${var.instance_id}"
  # ...
}
Reference the resources when executing operations:
# List S3 bucket contents
# (single quotes stop your local shell from trying to expand ${resource.bucket})
tensor9 ops aws \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_s3_bucket.data" \
  -command 'aws s3 ls s3://${resource.bucket}'

# Invoke a Lambda function
tensor9 ops aws \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_lambda_function.api" \
  -command 'aws lambda invoke --function-name ${resource.function_name} output.json'

Customer approval workflows

Your customers can configure whether operations require approval and what approval process to use:

Automatic approval

Operations execute immediately without customer approval. This is useful for:
  • Trusted vendor relationships
  • Non-production appliances (development, staging)
  • Operations during maintenance windows
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl get pods"

# Output:
# Executing: kubectl get pods
# NAME                   READY   STATUS    RESTARTS   AGE
# api-7d9f8c5b4-xk2mn   1/1     Running   0          2d

Manual approval

Customer administrators review and approve each operation before execution. Results are returned asynchronously - it could take hours or days for your customers to approve the command and then approve the output.
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl rollout restart deployment/api"

# Output:
# Operation request submitted. Waiting for customer approval...
# [Customer reviews the exact command: "kubectl rollout restart deployment/api"]
# [Could take hours or days for approval]
# Approved by <approver email> at 2024-01-15 14:32:18 UTC
# Executing: kubectl rollout restart deployment/api
# deployment.apps/api restarted
# [Customer reviews and approves the output before you receive results]
Customers see:
  • The exact command you want to execute
  • Which appliance it targets
  • Your identity (operator email/name)
  • Timestamp of the request
After execution, customers also review and approve the command output before it’s returned to you.

Approval via time windows

Customers can configure time windows when operations are automatically approved. During these windows, both command execution and output delivery are automatically approved without manual review:
  • Business hours: Operations during Monday-Friday 9 AM - 5 PM are auto-approved (command and output)
  • Maintenance windows: Operations during scheduled maintenance are auto-approved (command and output)
  • Always require approval: All operations require explicit approval for both command and output, regardless of timing
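Before submitting a request, it can help to estimate whether it would fall inside an auto-approval window. The sketch below checks the Monday-Friday 9 AM - 5 PM example only. The function name in_business_hours is hypothetical, and the check uses your workstation's local time, which is an assumption: the customer's real window and time zone are whatever they configured.

```shell
# Hypothetical pre-check: would a request made now fall inside a
# Monday-Friday 9 AM - 5 PM window? Arguments default to the current time.
# Usage: in_business_hours [day-of-week 1-7, Monday=1] [hour 0-23]
in_business_hours() (
  dow=${1:-$(date +%u)}
  hour=${2:-$(date +%H)}
  hour=${hour#0}        # strip a leading zero ("09" -> "9")
  hour=${hour:-0}       # a bare "0" was stripped to empty, so restore it
  [ "$dow" -ge 1 ] && [ "$dow" -le 5 ] && [ "$hour" -ge 9 ] && [ "$hour" -lt 17 ]
)
```

The function returns success inside the window and failure outside it, so it can gate a request: in_business_hours || echo "expect manual approval".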

Operations and permissions

Operations require operate permissions to execute commands and access resources for troubleshooting:

Permission requirements

  • Steady-state permissions: NOT sufficient for operations (read-only)
  • Operate permissions: Required for operations (read-write access for troubleshooting and debugging)
  • Deploy permissions: NOT used for operations (reserved for deployments and updates)
  • Install permissions: NOT used for operations (reserved for major infrastructure changes)

Requesting operate access

Before running operations, you need operate access to the target appliance:
# Request operate access
tensor9 access request \
  -appName my-app \
  -customerName acme-corp \
  -level operate \
  -duration 1h \
  -reason "Investigating API latency issues"

# Output:
# Operate access granted for 1 hour
# Expires at: 2024-01-15 15:30:00 UTC
Once you have operate access, you can run operations until the access window expires.
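The request-then-operate sequence can be wrapped in a small function. This is a minimal sketch: t9_diagnose is a hypothetical helper, and it assumes the CLI exits with a non-zero status when access is not granted, which this page does not confirm.

```shell
# Hypothetical wrapper: request operate access, then run one kubectl operation.
# Assumes the CLI exits non-zero when the access request is not granted.
# Usage: t9_diagnose <customer-name> <reason> <kubectl command>
t9_diagnose() (
  customer=$1
  reason=$2
  kubectl_cmd=$3
  tensor9 access request \
    -appName my-app \
    -customerName "$customer" \
    -level operate \
    -duration 1h \
    -reason "$reason" || exit 1
  tensor9 ops kubectl \
    -appName my-app \
    -customerName "$customer" \
    -originResourceId "aws_eks_cluster.main_cluster" \
    -command "$kubectl_cmd"
)
```

For example: t9_diagnose acme-corp "Investigating API latency issues" "kubectl get pods -n my-app-namespace".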

Permission scope

Operate permissions for operations are scoped to:
  • Resources owned by your application (tagged with your instance_id)
  • The specific appliance you’re accessing
  • The time window granted by the customer
You cannot access resources outside your application or other customers’ appliances.

Operations across form factors

Operations work across all form factors, with commands adapted to the target environment:
Form Factor | Operations Support | Example Commands
AWS | kubectl (EKS), aws CLI, docker exec, Lambda invoke | kubectl get pods, aws s3 ls, aws lambda invoke
Google Cloud | kubectl (GKE), gcloud CLI, docker exec, Cloud Functions invoke | kubectl get pods, gcloud functions call
Azure | kubectl (AKS), az CLI, docker exec, Azure Functions invoke | kubectl get pods, az functionapp invoke
DigitalOcean | kubectl (DOKS), docker exec, doctl CLI | kubectl get pods, doctl compute droplet list
Private Kubernetes | kubectl, docker exec | kubectl get pods, docker exec
On-prem | kubectl, docker exec, SSH access | kubectl get pods, ssh user@host
Tensor9 translates your operation command to work correctly in the target environment. For example, kubectl get pods works the same way whether the appliance runs on EKS, GKE, AKS, or private Kubernetes.

Audit logging

All operations are logged for security and compliance:

What gets logged

Event | What Gets Logged
Operation request | Command, target appliance, operator identity, timestamp
Approval workflow | Approval status, approver identity, approval timestamp
Command execution | Full command executed, execution start/end time, exit code
Command output | Full stdout/stderr output from the command (may be truncated for very large outputs)
Permission assumption | When operate permissions were assumed, duration, approving principal

Where logs are stored

Audit Trail | Location | What It Contains
Customer audit trail | Customer’s CloudTrail (AWS), Cloud Audit Logs (GCP), Azure Monitor (Azure), or SIEM (private) | All operations executed in their appliance, including command, output, operator, and timestamp
Control plane audit trail | Your Tensor9 control plane | All operations your team initiated, approval workflows, and which appliances were accessed
Customers have complete visibility into what operations your team executes on their infrastructure.

Interactive vs. non-interactive operations

Operations can be non-interactive (one-off commands) or interactive (persistent sessions):

Non-interactive operations

Execute a single command and return the output:
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl get pods"
Use non-interactive operations for:
  • Status checks
  • Log retrieval
  • Database queries
  • Resource inspection
  • One-off commands that don’t require interaction
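Since each non-interactive operation is a single CLI call, the same check can be repeated for several customers with a loop. The sketch below uses a hypothetical helper, t9_fleet_status, and assumes that a failed operation exits non-zero. If a customer requires manual approval, the loop may block on that customer until approval arrives.

```shell
# Hypothetical helper: run the same read-only status check for several customers.
# Usage: t9_fleet_status <customer-name>...
t9_fleet_status() {
  for customer in "$@"; do
    echo "== ${customer} =="
    tensor9 ops kubectl \
      -appName my-app \
      -customerName "$customer" \
      -originResourceId "aws_eks_cluster.main_cluster" \
      -command "kubectl get pods" \
      || echo "operation failed for ${customer}" >&2
  done
}
```

A failure for one customer is reported on stderr and the loop continues with the next.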

Interactive operations

For interactive work, create an ops endpoint that establishes a secure tunnel to the appliance. The ops endpoint requires customer approval to create the tunnel, but once established, commands executed through the tunnel do not require individual approval. This allows you to work interactively without waiting for approval on each command.

Create a kubectl ops endpoint

First, your origin stack defines a Kubernetes cluster resource:
# Origin stack snippet
resource "aws_eks_cluster" "main_cluster" {
  name     = "my-app-cluster-${var.instance_id}"
  role_arn = aws_iam_role.cluster.arn

  vpc_config {
    subnet_ids = var.subnet_ids
  }
}
Request an ops endpoint by referencing the origin stack resource address:
# Request an ops endpoint for kubectl access
tensor9 ops endpoint create \
  -appName my-app \
  -customerName acme-corp \
  -type kubectl \
  -originResourceId "aws_eks_cluster.main_cluster"

# Output:
# Ops endpoint request submitted. Waiting for customer approval...
# Approved by <approver email> at 2024-01-15 14:32:18 UTC
# Creating secure tunnel to appliance...
# Mapping aws_eks_cluster.main_cluster to deployed cluster in acme-corp appliance...
# Ops endpoint created successfully.
#
# Kubeconfig written to: ~/.kube/kubectl-ops-main-cluster-abc123.tensor9.your-domain.co.yaml
# Endpoint expires at: 2024-01-15 15:32:18 UTC
#
# To use this config:
#   export KUBECONFIG=~/.kube/kubectl-ops-main-cluster-abc123.tensor9.your-domain.co.yaml
#   kubectl get pods
The -originResourceId parameter tells Tensor9 which resource from your origin stack to connect to. Tensor9 maps this to the corresponding deployed resource in the customer’s appliance (e.g., the EKS cluster that was created from aws_eks_cluster.main_cluster). The kubeconfig file written to ~/.kube/kubectl-ops-main-cluster-abc123.tensor9.your-domain.co.yaml contains:
apiVersion: v1
kind: Config
clusters:
- cluster:
    server: https://kubectl-ops-main-cluster-abc123.tensor9.your-domain.co
    certificate-authority-data: LS0tLS1CRUdJTi...
  name: acme-corp-cluster
contexts:
- context:
    cluster: acme-corp-cluster
    user: vendor-operator
  name: acme-corp
current-context: acme-corp
users:
- name: vendor-operator
  user:
    token: eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...
Use the kubeconfig with standard kubectl commands:
# Set the KUBECONFIG environment variable
export KUBECONFIG=~/.kube/kubectl-ops-main-cluster-abc123.tensor9.your-domain.co.yaml

# Now use regular kubectl commands
kubectl get pods -n my-app-namespace
kubectl logs deployment/api -n my-app-namespace --follow
kubectl exec -it deployment/api -n my-app-namespace -- /bin/sh
kubectl describe pod api-7d9f8c5b4-xk2mn -n my-app-namespace

# The tunnel has specific RBAC permissions scoped to your application's namespace
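Exporting KUBECONFIG changes every later kubectl call in the session, including calls meant for your own clusters. As an alternative sketch, the hypothetical wrapper below (ops_kubectl is not part of Tensor9 or kubectl) sets the variable inside a subshell so it applies to one invocation only.

```shell
# Hypothetical wrapper: run kubectl against an ops endpoint kubeconfig
# without exporting KUBECONFIG into the rest of the shell session.
# Usage: ops_kubectl <kubeconfig-path> <kubectl args>...
ops_kubectl() (
  KUBECONFIG=$1
  export KUBECONFIG
  shift
  kubectl "$@"
)
```

For example: ops_kubectl ~/.kube/kubectl-ops-main-cluster-abc123.tensor9.your-domain.co.yaml get pods -n my-app-namespace.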

RBAC permissions for ops endpoints

The ops endpoint tunnel has Kubernetes RBAC permissions scoped to:
  • Namespaces: Only your application’s namespaces (e.g., my-app-${instance_id})
  • Resources: Pods, deployments, services, configmaps, secrets owned by your application
  • Operations: Read (get, list, describe), write (create, update, delete), and exec permissions
You cannot access resources outside your application’s namespaces or other customers’ resources.

Other ops endpoint types

Create ops endpoints for different access types by referencing origin stack resources:
PostgreSQL database access:
Origin stack defines the database:
resource "aws_db_instance" "postgres" {
  identifier = "myapp-db-${var.instance_id}"
  engine     = "postgres"
  # ...
}
Create ops endpoint:
tensor9 ops endpoint create \
  -appName my-app \
  -customerName acme-corp \
  -type postgres \
  -originResourceId "aws_db_instance.postgres"

# Returns connection string:
# postgresql://vendor-operator:<password>@<ops-endpoint-host>:5432/myapp
The database connection is scoped to:
  • Database: Only the vendor application’s database
  • Schema access: Tables and schemas owned by the vendor application
  • Operations: Full SQL access (SELECT, INSERT, UPDATE, DELETE, DDL) within vendor-owned schemas
  • Restrictions: Cannot access system tables, other databases, or customer-owned schemas outside the vendor application
SSH access to a VM:
Origin stack defines the compute instance:
resource "aws_instance" "api_server" {
  instance_type = "t3.medium"
  ami          = var.ami_id
  # ...
}
Create ops endpoint:
tensor9 ops endpoint create \
  -appName my-app \
  -customerName acme-corp \
  -type ssh \
  -originResourceId "aws_instance.api_server"

# Returns SSH connection details:
# ssh -i /path/to/generated-key <user>@<ops-endpoint-host>
The SSH session is scoped to:
  • Instance: Only the specified vendor-owned VM
  • User permissions: Standard user permissions (not root by default)
  • File system access: Vendor application directories and logs
  • Commands: Standard shell commands for debugging and troubleshooting
  • Restrictions: Cannot access other VMs, customer data directories, or system-critical files outside vendor application scope
Use interactive operations for:
  • Debugging complex issues requiring multiple commands
  • Exploring file systems and logs interactively
  • Testing and iterating on solutions
  • Extended troubleshooting sessions
Interactive sessions persist independently of deploy access. Ops endpoints remain active until explicitly retired or until their configured expiration time.

Managing ops endpoint lifecycle

Ops endpoints have their own lifecycle independent of deploy access.
Retire an endpoint when you’re finished working:
tensor9 ops endpoint retire \
  -appName my-app \
  -customerName acme-corp \
  -endpointId kubectl-ops-main-cluster-abc123
Create a new endpoint for a new troubleshooting session:
tensor9 ops endpoint create \
  -appName my-app \
  -customerName acme-corp \
  -type kubectl \
  -originResourceId "aws_eks_cluster.main_cluster"
Ops endpoints persist independently of deploy access windows. You can maintain an active ops endpoint across multiple deploy access sessions, or retire it when finished regardless of deploy access status.

Cleaning up ops endpoints

Ops endpoints are cleaned up automatically or manually.
Automatic cleanup - An endpoint is automatically destroyed when its configured expiration time is reached. Active connections are gracefully terminated at that point.
Manual cleanup - Retire an endpoint when you’re done working:
tensor9 ops endpoint retire \
  -appName my-app \
  -customerName acme-corp \
  -endpointId kubectl-ops-main-cluster-abc123

# Output:
# Terminating active connections...
# Ops endpoint kubectl-ops-main-cluster-abc123 retired successfully
# Kubeconfig file ~/.kube/kubectl-ops-main-cluster-abc123.tensor9.your-domain.co.yaml is no longer valid
When an endpoint is retired:
  • All active connections (kubectl sessions, database connections, SSH sessions) are immediately terminated
  • The endpoint domain becomes invalid
  • Generated credentials (kubeconfig files, connection strings, SSH keys) stop working
  • You must create a new endpoint to resume work
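To avoid leaving an endpoint open after a session, the retire call can be tied to the end of your work. This is a sketch only: with_ops_endpoint is a hypothetical wrapper, and the endpoint ID is assumed to be copied by hand from the output of tensor9 ops endpoint create.

```shell
# Hypothetical wrapper: run a command, then retire the ops endpoint even if
# the command fails. The endpoint ID is copied from the create output.
# Usage: with_ops_endpoint <customer-name> <endpoint-id> <command>...
with_ops_endpoint() (
  customer=$1
  endpoint_id=$2
  shift 2
  # The EXIT trap fires when this subshell ends, whatever the exit status.
  trap 'tensor9 ops endpoint retire -appName my-app -customerName "$customer" -endpointId "$endpoint_id"' EXIT
  "$@"
  status=$?
  # Exit explicitly so the command's status becomes the wrapper's status.
  exit "$status"
)
```

For example: with_ops_endpoint acme-corp kubectl-ops-main-cluster-abc123 kubectl get pods -n my-app-namespace.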

Error handling

Operations can fail for various reasons. Understanding how Tensor9 handles errors helps you troubleshoot effectively:

Common error scenarios

Error Type | What Happens | How to Resolve
Customer denies approval | Operation is cancelled and you receive a denial notification with optional customer message | Review the operation request, adjust the command if needed, and resubmit with additional context
Operate access expired | Operation is rejected before execution | Request new operate access with tensor9 access request -level operate before retrying the operation
Command timeout | Operation terminates after timeout period (default: 5 minutes for non-interactive operations) | Break complex operations into smaller steps, or use interactive ops endpoints for long-running tasks
Command execution failure | Error output from the command is returned (exit code, stderr) | Review command syntax, check resource availability, verify permissions
Ops endpoint expired | Active sessions are terminated when endpoint expiration time is reached | Create a new ops endpoint to continue work
Invalid origin resource ID | Operation fails with error indicating resource not found in origin stack | Verify the origin resource ID matches a resource in your origin stack definition

Error response format

Failed operations return structured error information:
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl get pods -n wrong-namespace"

# Output:
# Error: Command execution failed
# Exit code: 1
# Error: namespaces "wrong-namespace" not found
Customer approval denials include the denial reason:
# Output:
# Operation denied by <approver email> at 2024-01-15 14:32:18 UTC
# Reason: This operation requires change management approval. Please submit a change request first.
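If you capture operation output in scripts, the two message shapes above can be told apart by their opening words. The sketch below assumes the real CLI prints lines beginning with "Operation denied by" and "Error: Command execution failed" exactly as shown here. The function classify_op_output is hypothetical, and the actual wording may differ.

```shell
# Hypothetical classifier for captured operation output, keyed on the
# message prefixes shown in this section. Reads the output on stdin.
# Prints: denied | failed | ok
classify_op_output() (
  result=ok
  # The "|| [ -n ... ]" clause handles a final line without a newline.
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      "Operation denied by "*) result=denied; break ;;
      "Error: Command execution failed"*) result=failed; break ;;
    esac
  done
  echo "$result"
)
```

Output that matches neither prefix is reported as ok, so this is a coarse first pass rather than a substitute for checking the exit code.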

Operations and observability

Operations and observability work together:
  1. Observability identifies issues: Your observability platform alerts you to high error rates in a specific appliance
  2. Operations investigates: You use operations to inspect pods, check logs, query databases
  3. Operations remediates: You run commands to restart services, clear caches, or apply fixes
  4. Observability verifies: You confirm the issue is resolved through your observability dashboard
Example workflow:
# 1. Observability: Alert shows high error rate for acme-corp appliance

# 2. Investigate: Check pod status
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl get pods -n my-app-namespace"

# 3. Investigate: Check recent logs
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl logs deployment/api -n my-app-namespace --tail=50"

# 4. Remediate: Restart the deployment
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl rollout restart deployment/api -n my-app-namespace"

# 5. Observability: Verify error rate returns to normal
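The investigate-then-remediate sequence can also be run as one ordered list that stops at the first failure, so a restart is not attempted if an earlier check could not run. This is a sketch under assumptions: t9_run_steps is a hypothetical helper, and it relies on a failed operation exiting non-zero.

```shell
# Hypothetical helper: run kubectl commands as separate operations, in order,
# stopping at the first failure. Assumes a failed operation exits non-zero.
# Usage: t9_run_steps <customer-name> <kubectl command>...
t9_run_steps() (
  customer=$1
  shift
  for step in "$@"; do
    echo ">> ${step}"
    tensor9 ops kubectl \
      -appName my-app \
      -customerName "$customer" \
      -originResourceId "aws_eks_cluster.main_cluster" \
      -command "$step" || exit 1
  done
)
```

Each step is still a separate operation, so it is approved and audited individually.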

Best practices

Request only the operate access duration you need for the operation. If you’re running a quick diagnostic command, request 15-30 minutes instead of several hours. This minimizes the window of elevated permissions.
Prefer read-only commands (kubectl get, kubectl describe, SELECT queries) over commands that modify state. Only use write operations (kubectl delete, kubectl rollout restart, UPDATE queries) when necessary.
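A local guard can make the read-only preference explicit before a command is submitted. The sketch below is hypothetical (is_read_only_kubectl is not a Tensor9 feature) and its verb list is illustrative, not exhaustive. It looks only at the word directly after kubectl, so a command with flags before the verb is conservatively treated as not read-only.

```shell
# Hypothetical guard: classify a kubectl command line by its verb.
# The verb list is illustrative and not exhaustive.
# Exits 0 for read-only verbs, 1 otherwise.
is_read_only_kubectl() (
  set -f        # disable globbing while the command line is word-split
  set -- $1     # $1 becomes "kubectl", $2 becomes the verb
  [ "$1" = kubectl ] || exit 1
  case $2 in
    get|describe|logs|top|explain|version) exit 0 ;;
    *) exit 1 ;;
  esac
)
```

For example: is_read_only_kubectl "kubectl delete pod api" || echo "this command modifies state".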
Before running operations on customer appliances, test your commands on test appliances to ensure they work correctly and produce the expected results.
Operations should be used for application debugging and infrastructure troubleshooting, not for accessing customer business data. Do not run queries that return customer PII, financial data, or proprietary information.