Operations

Operations in Tensor9 enable you to execute remote commands on customer appliances for debugging, troubleshooting, and maintenance. Using the Tensor9 CLI, you can run kubectl commands, execute shell commands in containers, query databases, or perform other operational tasks across all your customer appliances from your workstation.

How operations work

When you deploy applications through Tensor9, each customer appliance runs in isolated infrastructure that you don’t directly access. Without operations capability, you would need to ask customers to grant you temporary access credentials, coordinate SSH sessions, or have customers run commands on your behalf - all of which are slow, error-prone, and don’t scale. Tensor9’s operations system solves this by providing secure, audited remote command execution:

Initiate operation

You run a Tensor9 CLI command specifying which appliance to target, which resource to access, and what command to execute. For example:

tensor9 ops kubectl -appName my-app -customerName acme-corp -originResourceId "aws_eks_cluster.main_cluster" -command "kubectl get pods"

Customer approval (optional)

If the customer has configured approval workflows for operations, they review and approve the command before it executes. The customer sees the exact command you want to run.

Assume operate permissions

Your control plane assumes operate permissions in the customer’s appliance. Operations require operate permissions to execute commands and access resources for troubleshooting.

Execute command

The command executes within the customer’s appliance infrastructure, in the customer’s environment rather than in your control plane.

Return output

Command output is streamed back to your workstation via the control plane. You see the output as if you had run the command directly in the customer’s environment.

Audit logging

The operation (command, output, timestamp, operator identity) is logged in both the customer’s audit trail and your control plane’s audit trail.

Operations require operate permissions, which are time-bounded and may require customer approval. Steady-state permissions are not sufficient: you need active operate access to the appliance before you can run operations.

Types of operations

Tensor9 supports various types of remote operations depending on your application’s infrastructure:

Kubernetes operations

Execute kubectl commands on customer Kubernetes clusters. Your origin stack defines the Kubernetes cluster resource:

# Origin stack snippet
resource "aws_eks_cluster" "main_cluster" {
  name     = "my-app-cluster-${var.instance_id}"
  role_arn = aws_iam_role.cluster.arn
  # ...
}

# If your origin stack defines multiple clusters, use unique resource names
resource "aws_eks_cluster" "analytics_cluster" {
  name     = "my-app-analytics-${var.instance_id}"
  role_arn = aws_iam_role.analytics.arn
  # ...
}

Reference the specific cluster when executing kubectl operations:

# Get pod status
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl get pods -n my-app-namespace"

# View logs from a specific pod
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl logs deployment/api -n my-app-namespace --tail=100"

# Restart a deployment
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl rollout restart deployment/api -n my-app-namespace"

# Describe a resource
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl describe pod api-7d9f8c5b4-xk2mn -n my-app-namespace"

# Target a different cluster using its origin resource ID
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.analytics_cluster" \
  -command "kubectl get pods -n analytics-namespace"

Multiple resources: If your origin stack defines multiple Kubernetes clusters, databases, or other resources, use the specific origin resource ID to target the correct resource. Each resource in your origin stack has a unique Terraform resource address (e.g., aws_eks_cluster.main_cluster vs aws_eks_cluster.analytics_cluster).

Kubernetes namespaces: Namespace names in kubectl commands should match your origin stack’s namespace configuration. If your origin stack creates namespaces dynamically (e.g., my-app-${var.instance_id}), use the actual namespace name as deployed in the target appliance. Tensor9 does not perform variable substitution for namespace names in kubectl commands - specify the literal namespace name.
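When you script operations against many appliances, the flag layout shown in the examples above can be assembled programmatically. The following Python sketch is illustrative only and is not part of the Tensor9 CLI: `render_namespace` and `build_kubectl_ops_argv` are hypothetical helper names, `abc123` is a made-up instance ID, and the only CLI flags used are the ones that appear in this section. Because Tensor9 does not substitute variables in kubectl commands, the sketch resolves the namespace template locally and refuses to build a command that still contains a `${...}` placeholder.

```python
import shlex
from string import Template


def render_namespace(template: str, instance_id: str) -> str:
    """Resolve a namespace template such as 'my-app-${instance_id}' locally.

    Note: string.Template does not accept dotted names, so a Terraform-style
    '${var.instance_id}' must be written as '${instance_id}' for this helper.
    """
    return Template(template).substitute(instance_id=instance_id)


def build_kubectl_ops_argv(app_name, customer_name, origin_resource_id, command):
    """Return the argv list for a `tensor9 ops kubectl` invocation."""
    if "${" in command:
        # The namespace must be literal by the time the command is submitted.
        raise ValueError("unresolved placeholder in kubectl command: " + command)
    return [
        "tensor9", "ops", "kubectl",
        "-appName", app_name,
        "-customerName", customer_name,
        "-originResourceId", origin_resource_id,
        "-command", command,
    ]


# Hypothetical instance ID, used only for illustration.
namespace = render_namespace("my-app-${instance_id}", instance_id="abc123")
argv = build_kubectl_ops_argv(
    "my-app",
    "acme-corp",
    "aws_eks_cluster.main_cluster",
    f"kubectl get pods -n {namespace}",
)

# Print a shell-quoted version of the command for review before running it.
print(shlex.join(argv))
```

On a workstation where the `tensor9` CLI is installed and operate access is active, the resulting list could be passed to `subprocess.run(argv, check=True)`.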

Container operations

Execute commands inside running containers:

# Execute a shell command in a container (non-interactive)
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl exec deployment/api -n my-app-namespace -- curl http://localhost:8080/health"

# For interactive shell access, use an ops endpoint (see Interactive operations section)

Database operations

Run queries or administrative commands on databases. Your origin stack defines the database resource:

# Origin stack snippet
resource "aws_db_instance" "postgres" {
  identifier = "myapp-db-${var.instance_id}"
  engine     = "postgres"
  # ...
}

Reference the database when executing operations:

# Execute a PostgreSQL query
tensor9 ops db \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_db_instance.postgres" \
  -command "SELECT count(*) FROM users WHERE created_at > NOW() - INTERVAL '24 hours'"

# Check database replication lag
tensor9 ops db \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_db_instance.postgres" \
  -command "SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag"

Cloud resource operations

Perform operations on cloud resources. Your origin stack defines the resources:

# Origin stack snippet
resource "aws_s3_bucket" "data" {
  bucket = "my-app-data-${var.instance_id}"
  # ...
}

resource "aws_lambda_function" "api" {
  function_name = "my-app-api-${var.instance_id}"
  # ...
}

Reference the resources when executing operations:

# List S3 bucket contents
# (single quotes keep your local shell from expanding ${resource.bucket})
tensor9 ops aws \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_s3_bucket.data" \
  -command 'aws s3 ls s3://${resource.bucket}'

# Invoke a Lambda function
tensor9 ops aws \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_lambda_function.api" \
  -command 'aws lambda invoke --function-name ${resource.function_name} output.json'

Customer approval workflows

Your customers can configure whether operations require approval and what approval process to use:

Automatic approval

Operations execute immediately without customer approval. This is useful for:

Trusted vendor relationships
Non-production appliances (development, staging)
Operations during maintenance windows

tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl get pods"

# Output:
# Executing: kubectl get pods
# NAME                   READY   STATUS    RESTARTS   AGE
# api-7d9f8c5b4-xk2mn   1/1     Running   0          2d

Manual approval

Customer administrators review and approve each operation before execution. Results are returned asynchronously - it could take hours or days for your customers to approve the command and then approve the output.

tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl rollout restart deployment/api"

# Output:
# Operation request submitted. Waiting for customer approval...
# [Customer reviews the exact command: "kubectl rollout restart deployment/api"]
# [Could take hours or days for approval]
# Approved by [email protected] at 2024-01-15 14:32:18 UTC
# Executing: kubectl rollout restart deployment/api
# deployment.apps/api restarted
# [Customer reviews and approves the output before you receive results]

Customers see:

The exact command you want to execute
Which appliance it targets
Your identity (operator email/name)
Timestamp of the request

After execution, customers also review and approve the command output before it’s returned to you.

Approval via time windows

Customers can configure time windows when operations are automatically approved. During these windows, both command execution and output delivery are automatically approved without manual review:

Business hours: Operations during Monday-Friday 9 AM - 5 PM are auto-approved (command and output)
Maintenance windows: Operations during scheduled maintenance are auto-approved (command and output)
Always require approval: All operations require explicit approval for both command and output, regardless of timing
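To make the time-window rules concrete, here is a small Python sketch of how such a policy could be evaluated. It illustrates the rules listed above and is not Tensor9's implementation: the function names, the treatment of the three rules as options on a single policy, the half-open 9:00 to 17:00 interval, and the assumption that timestamps are already in the customer's local time zone are all choices made for this example.

```python
from datetime import datetime, time


def in_business_hours(ts: datetime) -> bool:
    """True on Monday-Friday from 09:00 (inclusive) to 17:00 (exclusive)."""
    is_weekday = ts.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and time(9, 0) <= ts.time() < time(17, 0)


def approval_mode(ts, always_require_approval=False, maintenance_windows=()):
    """Return 'auto' or 'manual' for an operation requested at ts.

    maintenance_windows is an iterable of (start, end) datetime pairs.
    """
    if always_require_approval:
        return "manual"
    if in_business_hours(ts):
        return "auto"
    if any(start <= ts < end for start, end in maintenance_windows):
        return "auto"
    return "manual"


monday_afternoon = datetime(2024, 1, 15, 14, 32)
saturday_night = datetime(2024, 1, 13, 2, 30)
window = (datetime(2024, 1, 13, 2, 0), datetime(2024, 1, 13, 4, 0))

print(approval_mode(monday_afternoon))                                # auto
print(approval_mode(saturday_night))                                  # manual
print(approval_mode(saturday_night, maintenance_windows=[window]))    # auto
print(approval_mode(monday_afternoon, always_require_approval=True))  # manual
```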

Operations and permissions

Operations require operate permissions to execute commands and access resources for troubleshooting:

Permission requirements

Steady-state permissions: NOT sufficient for operations (read-only)
Operate permissions: Required for operations (read-write access for troubleshooting and debugging)
Deploy permissions: NOT used for operations (reserved for deployments and updates)
Install permissions: NOT used for operations (reserved for major infrastructure changes)

Requesting operate access

Before running operations, you need operate access to the target appliance:

# Request operate access
tensor9 access request \
  -appName my-app \
  -customerName acme-corp \
  -level operate \
  -duration 1h \
  -reason "Investigating API latency issues"

# Output:
# Operate access granted for 1 hour
# Expires at: 2024-01-15 15:30:00 UTC

Once you have operate access, you can run operations until the access window expires.
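If you automate operations, it can help to check locally whether the access window is still open before submitting a command. The sketch below is a hedged illustration: `parse_duration`, `access_expiry`, and `has_operate_access` are hypothetical helpers, and the `30m` style suffix is an assumption (only `1h` appears in this document). The grant time of 14:30 UTC is inferred from the sample output above.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed suffixes; only the "h" form is shown in this document.
_UNITS = {"m": "minutes", "h": "hours"}


def parse_duration(text: str) -> timedelta:
    """Parse a duration such as '1h' or '30m' into a timedelta."""
    match = re.fullmatch(r"(\d+)([mh])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def access_expiry(granted_at: datetime, duration: str) -> datetime:
    """Return the moment at which an access grant stops being valid."""
    return granted_at + parse_duration(duration)


def has_operate_access(now: datetime, expires_at: datetime) -> bool:
    """True while the access window is still open."""
    return now < expires_at


granted_at = datetime(2024, 1, 15, 14, 30, tzinfo=timezone.utc)
expires_at = access_expiry(granted_at, "1h")
print(expires_at.isoformat())
print(has_operate_access(datetime(2024, 1, 15, 15, 0, tzinfo=timezone.utc), expires_at))
```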

Permission scope

Operate permissions for operations are scoped to:

Resources owned by your application (tagged with your instance_id)
The specific appliance you’re accessing
The time window granted by the customer

You cannot access resources outside your application, and you cannot access other customers’ appliances.

Operations across form factors

Operations work across all form factors, with commands adapted to the target environment:

Form Factor	Operations Support	Example Commands
AWS	kubectl (EKS), aws CLI, docker exec, Lambda invoke	`kubectl get pods`, `aws s3 ls`, `aws lambda invoke`
Google Cloud	kubectl (GKE), gcloud CLI, docker exec, Cloud Functions invoke	`kubectl get pods`, `gcloud functions call`
Azure	kubectl (AKS), az CLI, docker exec, Azure Functions invoke	`kubectl get pods`, `az functionapp invoke`
DigitalOcean	kubectl (DOKS), docker exec, doctl CLI	`kubectl get pods`, `doctl compute droplet list`
Private Kubernetes	kubectl, docker exec	`kubectl get pods`, `docker exec`
On-prem	kubectl, docker exec, SSH access	`kubectl get pods`, `ssh user@host`

Tensor9 translates your operation command to work correctly in the target environment. For example, kubectl get pods works the same way whether the appliance runs on EKS, GKE, AKS, or private Kubernetes.

Audit logging

All operations are logged for security and compliance:

What gets logged

Event	What Gets Logged
Operation request	Command, target appliance, operator identity, timestamp
Approval workflow	Approval status, approver identity, approval timestamp
Command execution	Full command executed, execution start/end time, exit code
Command output	Full stdout/stderr output from the command (may be truncated for very large outputs)
Permission assumption	When operate permissions were assumed, duration, approving principal

Where logs are stored

Audit Trail	Location	What It Contains
Customer audit trail	Customer’s CloudTrail (AWS), Cloud Audit Logs (GCP), Azure Monitor (Azure), or SIEM (private)	All operations executed in their appliance, including command, output, operator, and timestamp
Control plane audit trail	Your Tensor9 control plane	All operations your team initiated, approval workflows, and which appliances were accessed

Customers have complete visibility into what operations your team executes on their infrastructure.
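If you forward operation records into your own tooling, a structured record makes them easier to search. The following Python sketch shows one possible shape. The field names, the `OperationAuditRecord` class, the 4096-character truncation limit, and the example operator address are all hypothetical; they mirror the table above but are not a schema documented by Tensor9.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class OperationAuditRecord:
    command: str
    target_appliance: str
    operator: str
    requested_at: str                # ISO 8601 timestamp
    approval_status: str
    approver: Optional[str] = None
    exit_code: Optional[int] = None
    output: str = ""


def to_log_line(record: OperationAuditRecord, max_output_chars: int = 4096) -> str:
    """Serialize a record as one JSON line, truncating very large outputs."""
    data = asdict(record)
    data["output_truncated"] = len(record.output) > max_output_chars
    data["output"] = record.output[:max_output_chars]
    return json.dumps(data, sort_keys=True)


record = OperationAuditRecord(
    command="kubectl get pods",
    target_appliance="acme-corp",
    operator="operator@vendor.example",  # placeholder identity
    requested_at="2024-01-15T14:32:18Z",
    approval_status="approved",
    exit_code=0,
    output="api-7d9f8c5b4-xk2mn   1/1     Running   0          2d",
)
print(to_log_line(record))
```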

Interactive vs. non-interactive operations

Operations can be non-interactive (one-off commands) or interactive (persistent sessions):

Non-interactive operations

Execute a single command and return the output:

tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl get pods"

Use non-interactive operations for:

Status checks
Log retrieval
Database queries
Resource inspection
One-off commands that don’t require interaction

Interactive operations

For interactive work, create an ops endpoint that establishes a secure tunnel to the appliance. The ops endpoint requires customer approval to create the tunnel, but once established, commands executed through the tunnel do not require individual approval. This allows you to work interactively without waiting for approval on each command.

Create a kubectl ops endpoint

First, your origin stack defines a Kubernetes cluster resource:

# Origin stack snippet
resource "aws_eks_cluster" "main_cluster" {
  name     = "my-app-cluster-${var.instance_id}"
  role_arn = aws_iam_role.cluster.arn

  vpc_config {
    subnet_ids = var.subnet_ids
  }
}

Request an ops endpoint by referencing the origin stack resource address:

# Request an ops endpoint for kubectl access
tensor9 ops endpoint create \
  -appName my-app \
  -customerName acme-corp \
  -type kubectl \
  -originResourceId "aws_eks_cluster.main_cluster"

# Output:
# Ops endpoint request submitted. Waiting for customer approval...
# Approved by [email protected] at 2024-01-15 14:32:18 UTC
# Creating secure tunnel to appliance...
# Mapping aws_eks_cluster.main_cluster to deployed cluster in acme-corp appliance...
# Ops endpoint created successfully.
#
# Kubeconfig written to: ~/.kube/kubectl-ops-main-cluster-abc123.tensor9.your-domain.co.yaml
# Endpoint expires at: 2024-01-15 15:32:18 UTC
#
# To use this config:
#   export KUBECONFIG=~/.kube/kubectl-ops-main-cluster-abc123.tensor9.your-domain.co.yaml
#   kubectl get pods

The -originResourceId parameter tells Tensor9 which resource from your origin stack to connect to. Tensor9 maps this to the corresponding deployed resource in the customer’s appliance (e.g., the EKS cluster that was created from aws_eks_cluster.main_cluster). The kubeconfig file written to ~/.kube/kubectl-ops-main-cluster-abc123.tensor9.your-domain.co.yaml contains:

apiVersion: v1
kind: Config
clusters:
- cluster:
    server: https://kubectl-ops-main-cluster-abc123.tensor9.your-domain.co
    certificate-authority-data: LS0tLS1CRUdJTi...
  name: acme-corp-cluster
contexts:
- context:
    cluster: acme-corp-cluster
    user: vendor-operator
  name: acme-corp
current-context: acme-corp
users:
- name: vendor-operator
  user:
    token: eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...

Use the kubeconfig with standard kubectl commands:

# Set the KUBECONFIG environment variable
export KUBECONFIG=~/.kube/kubectl-ops-main-cluster-abc123.tensor9.your-domain.co.yaml

# Now use regular kubectl commands
kubectl get pods -n my-app-namespace
kubectl logs deployment/api -n my-app-namespace --follow
kubectl exec -it deployment/api -n my-app-namespace -- /bin/sh
kubectl describe pod api-7d9f8c5b4-xk2mn -n my-app-namespace

# The tunnel has specific RBAC permissions scoped to your application's namespace
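Exporting `KUBECONFIG` in your shell affects every later kubectl command in that session. If you drive kubectl from a script, one alternative is to scope the variable to a single child process so that unrelated commands do not accidentally target the customer cluster. A minimal Python sketch, with `kubectl_env` as a hypothetical helper name:

```python
import os


def kubectl_env(kubeconfig_path, base_env=None):
    """Return a copy of the environment with KUBECONFIG set for one process."""
    env = dict(os.environ if base_env is None else base_env)
    env["KUBECONFIG"] = os.path.expanduser(kubeconfig_path)
    return env


# Path taken from the sample endpoint output in this section.
env = kubectl_env("~/.kube/kubectl-ops-main-cluster-abc123.tensor9.your-domain.co.yaml")
print(env["KUBECONFIG"])
```

The returned mapping can then be passed as the `env` argument of `subprocess.run`, for example with `["kubectl", "get", "pods", "-n", "my-app-namespace"]` as the command. The parent process environment is left unchanged.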

RBAC permissions for ops endpoints

The ops endpoint tunnel has Kubernetes RBAC permissions scoped to:

Namespaces: Only your application’s namespaces (e.g., my-app-${instance_id})
Resources: Pods, deployments, services, configmaps, secrets owned by your application
Operations: Read (get, list, describe), write (create, update, delete), and exec permissions

You cannot access resources outside your application’s namespaces, and you cannot access other customers’ resources.

Other ops endpoint types

Create ops endpoints for different access types by referencing origin stack resources.

PostgreSQL database access. Your origin stack defines the database:

resource "aws_db_instance" "postgres" {
  identifier = "myapp-db-${var.instance_id}"
  engine     = "postgres"
  # ...
}

Create ops endpoint:

tensor9 ops endpoint create \
  -appName my-app \
  -customerName acme-corp \
  -type postgres \
  -originResourceId "aws_db_instance.postgres"

# Returns connection string:
# postgresql://vendor-operator:[email protected]:5432/myapp

The database connection is scoped to:

Database: Only the vendor application’s database
Schema access: Tables and schemas owned by the vendor application
Operations: Full SQL access (SELECT, INSERT, UPDATE, DELETE, DDL) within vendor-owned schemas
Restrictions: Cannot access system tables, other databases, or customer-owned schemas outside the vendor application

SSH access to a VM. Your origin stack defines the compute instance:

resource "aws_instance" "api_server" {
  instance_type = "t3.medium"
  ami           = var.ami_id
  # ...
}

Create ops endpoint:

tensor9 ops endpoint create \
  -appName my-app \
  -customerName acme-corp \
  -type ssh \
  -originResourceId "aws_instance.api_server"

# Returns SSH connection details:
# ssh -i /path/to/generated-key [email protected]

The SSH session is scoped to:

Instance: Only the specified vendor-owned VM
User permissions: Standard user permissions (not root by default)
File system access: Vendor application directories and logs
Commands: Standard shell commands for debugging and troubleshooting
Restrictions: Cannot access other VMs, customer data directories, or system-critical files outside vendor application scope

Use interactive operations for:

Debugging complex issues requiring multiple commands
Exploring file systems and logs interactively
Testing and iterating on solutions
Extended troubleshooting sessions

Interactive sessions persist independently of operate access. Ops endpoints remain active until explicitly retired or until their configured expiration time.

Managing ops endpoint lifecycle

Ops endpoints have their own lifecycle, independent of operate access.

Retire an endpoint when you’re finished working:

tensor9 ops endpoint retire \
  -appName my-app \
  -customerName acme-corp \
  -endpointId kubectl-ops-main-cluster-abc123

Create a new endpoint for a new troubleshooting session:

tensor9 ops endpoint create \
  -appName my-app \
  -customerName acme-corp \
  -type kubectl \
  -originResourceId "aws_eks_cluster.main_cluster"

Ops endpoints persist independently of operate access windows. You can maintain an active ops endpoint across multiple operate access sessions, or retire it when finished regardless of operate access status.

Cleaning up ops endpoints

Ops endpoints are cleaned up automatically or manually.

Automatic cleanup - Endpoints are automatically destroyed when:

The endpoint’s configured expiration time is reached
Active connections are gracefully terminated

Manual cleanup - Retire an endpoint when you’re done working:

tensor9 ops endpoint retire \
  -appName my-app \
  -customerName acme-corp \
  -endpointId kubectl-ops-main-cluster-abc123

# Output:
# Terminating active connections...
# Ops endpoint kubectl-ops-main-cluster-abc123 retired successfully
# Kubeconfig file ~/.kube/kubectl-ops-main-cluster-abc123.tensor9.your-domain.co.yaml is no longer valid

When an endpoint is retired:

All active connections (kubectl sessions, database connections, SSH sessions) are immediately terminated
The endpoint domain becomes invalid
Generated credentials (kubeconfig files, connection strings, SSH keys) stop working
You must create a new endpoint to resume work
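The lifecycle rules above amount to a small state model: an endpoint is usable only while it has neither expired nor been retired. The Python sketch below captures that logic for use in local tooling. `OpsEndpoint` is a hypothetical class written for this example and does not call the Tensor9 CLI.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class OpsEndpoint:
    endpoint_id: str
    expires_at: datetime
    retired: bool = False

    def is_active(self, now: datetime) -> bool:
        """An endpoint is active until it is retired or its expiration passes."""
        return not self.retired and now < self.expires_at

    def retire(self) -> None:
        """Mark the endpoint as retired; its credentials stop working."""
        self.retired = True


endpoint = OpsEndpoint(
    endpoint_id="kubectl-ops-main-cluster-abc123",
    expires_at=datetime(2024, 1, 15, 15, 32, 18, tzinfo=timezone.utc),
)
during = datetime(2024, 1, 15, 15, 0, tzinfo=timezone.utc)
after = datetime(2024, 1, 15, 16, 0, tzinfo=timezone.utc)

print(endpoint.is_active(during))  # True: before expiration, not retired
print(endpoint.is_active(after))   # False: expiration time has passed
endpoint.retire()
print(endpoint.is_active(during))  # False: retired endpoints are never active
```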

Error handling

Operations can fail for various reasons. Understanding how Tensor9 handles errors helps you troubleshoot effectively:

Common error scenarios

Error Type	What Happens	How to Resolve
Customer denies approval	Operation is cancelled and you receive a denial notification with optional customer message	Review the operation request, adjust the command if needed, and resubmit with additional context
Operate access expired	Operation is rejected before execution	Request new operate access with `tensor9 access request -level operate` before retrying the operation
Command timeout	Operation terminates after timeout period (default: 5 minutes for non-interactive operations)	Break complex operations into smaller steps, or use interactive ops endpoints for long-running tasks
Command execution failure	Error output from the command is returned (exit code, stderr)	Review command syntax, check resource availability, verify permissions
Ops endpoint expired	Active sessions are terminated when endpoint expiration time is reached	Create a new ops endpoint to continue work
Invalid origin resource ID	Operation fails with error indicating resource not found in origin stack	Verify the origin resource ID matches a resource in your origin stack definition

Error response format

Failed operations return structured error information:

tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl get pods -n wrong-namespace"

# Output:
# Error: Command execution failed
# Exit code: 1
# Error: namespaces "wrong-namespace" not found

Customer approval denials include the denial reason:

# Output:
# Operation denied by [email protected] at 2024-01-15 14:32:18 UTC
# Reason: This operation requires change management approval. Please submit a change request first.
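If you wrap the CLI in scripts, you may want to pull the exit code and messages out of a failure report. The sketch below parses text laid out like the sample failure output above, without the leading `# ` comment markers. It assumes that plain-text layout, which may differ from what the CLI actually prints, and `parse_failure` is a hypothetical helper.

```python
import re


def parse_failure(output: str) -> dict:
    """Extract the exit code and error messages from a failure report."""
    exit_match = re.search(r"^Exit code: (\d+)$", output, re.MULTILINE)
    messages = re.findall(r"^Error: (.+)$", output, re.MULTILINE)
    return {
        "exit_code": int(exit_match.group(1)) if exit_match else None,
        "messages": messages,
    }


sample = """Error: Command execution failed
Exit code: 1
Error: namespaces "wrong-namespace" not found"""

failure = parse_failure(sample)
print(failure["exit_code"])
for message in failure["messages"]:
    print(message)
```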

Operations and observability

Operations and observability work together:

Observability identifies issues: Your observability platform alerts you to high error rates in a specific appliance
Operations investigates: You use operations to inspect pods, check logs, query databases
Operations remediates: You run commands to restart services, clear caches, or apply fixes
Observability verifies: You confirm the issue is resolved through your observability dashboard

Example workflow:

# 1. Observability: Alert shows high error rate for acme-corp appliance

# 2. Investigate: Check pod status
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl get pods -n my-app-namespace"

# 3. Investigate: Check recent logs
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl logs deployment/api -n my-app-namespace --tail=50"

# 4. Remediate: Restart the deployment
tensor9 ops kubectl \
  -appName my-app \
  -customerName acme-corp \
  -originResourceId "aws_eks_cluster.main_cluster" \
  -command "kubectl rollout restart deployment/api -n my-app-namespace"

# 5. Observability: Verify error rate returns to normal

Best practices

Request minimal operate access duration

Request only the operate access duration you need for the operation. If you’re running a quick diagnostic command, request 15-30 minutes instead of several hours. This minimizes the window of elevated permissions.

Use read-only commands when possible

Prefer read-only commands (kubectl get, kubectl describe, SELECT queries) over commands that modify state. Only use write operations (kubectl delete, kubectl rollout restart, UPDATE queries) when necessary.
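One way to encourage this habit in scripted workflows is a guard that flags anything other than a known read-only verb before the command is submitted. The following Python sketch is a conservative illustration: the verb list and the `is_read_only_kubectl` helper are assumptions for this example, and any command the function does not recognize is treated as potentially state-changing.

```python
import shlex

# Verbs treated as read-only in this sketch; extend with care.
READ_ONLY_KUBECTL_VERBS = {"get", "describe", "logs", "top"}


def is_read_only_kubectl(command: str) -> bool:
    """Return True only when the verb directly after 'kubectl' is read-only.

    Commands with global flags before the verb (for example
    'kubectl -n ns get pods') are reported as not read-only, which errs
    on the side of asking for a manual review.
    """
    tokens = shlex.split(command)
    if len(tokens) < 2 or tokens[0] != "kubectl":
        return False
    return tokens[1] in READ_ONLY_KUBECTL_VERBS


for candidate in (
    "kubectl get pods -n my-app-namespace",
    "kubectl rollout restart deployment/api -n my-app-namespace",
):
    label = "read-only" if is_read_only_kubectl(candidate) else "review first"
    print(f"{label}: {candidate}")
```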

Test operations on test appliances first

Before running operations on customer appliances, test your commands on test appliances to ensure they work correctly and produce the expected results.

Never access customer data

Operations should be used for application debugging and infrastructure troubleshooting, not for accessing customer business data. Do not run queries that return customer PII, financial data, or proprietary information.

Related topics

Permissions Model: Understanding the operate permissions required for operations
Observability: Using observability to identify when operations are needed
Appliances: Customer environments where operations execute
Deployments: Deployments vs. operations
