Open
Labels: type: bug (Something isn't working)
Description
I tried this:
We have deployed Synapse to AWS EKS, but are having difficulty getting workflows to run.
Our operator config is as follows:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: operator-1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: operator
  template:
    metadata:
      labels:
        app: operator
    spec:
      volumes:
        - name: synapse-pvc
          persistentVolumeClaim:
            claimName: synapse-pvc
      serviceAccountName: operator
      containers:
        - name: operator
          image: ghcr.io/serverlessworkflow/synapse/operator:1.0.0-alpha5.15
          env:
            - name: CONNECTIONSTRINGS__REDIS
              value: garnet:6379
            - name: SYNAPSE_RUNNER_IMAGE
              value: ghcr.io/serverlessworkflow/synapse/runner:1.0.0-alpha5.15
            - name: SYNAPSE_OPERATOR_NAMESPACE
              value: default
            - name: SYNAPSE_OPERATOR_NAME
              value: operator-1
            - name: SYNAPSE_RUNNER_API
              value: <DOMAIN>
            - name: SYNAPSE_RUNNER_LIFECYCLE_EVENTS
              value: "true"
            - name: SYNAPSE_RUNNER_CONTAINER_PLATFORM
              value: kubernetes
            - name: SYNAPSE_RUNTIME_MODE
              value: kubernetes
            - name: SYNAPSE_RUNTIME_K8S_SERVICE_ACCOUNT
              value: operator
            - name: SYNAPSE_RUNTIME_K8S_NAMESPACE
              value: ssmo-dev-shared-synapse
          volumeMounts:
            - name: synapse-pvc
              mountPath: /run/secrets/synapse
---
apiVersion: v1
kind: Service
metadata:
  name: operator
  namespace: ssmo-dev-shared-synapse
spec:
  ports:
    - port: 80
      targetPort: 8080
  selector:
    app: operator
  type: ClusterIP
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: operator-role
rules:
  - apiGroups: [""]
    resources: ["pods", "secrets", "configmaps", "persistentvolumeclaims", "serviceaccounts", "services"]
    verbs: ["get", "list", "watch", "create", "update", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: operator-role-binding
subjects:
  - kind: ServiceAccount
    name: operator
    namespace: ssmo-dev-shared-synapse
roleRef:
  kind: ClusterRole
  name: operator-role
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: synapse-pv
spec:
  capacity:
    storage: "10Gi"
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: "efs-sc"
  csi:
    driver: efs.csi.aws.com
    volumeHandle: "${AWS_EFS_FILESYSTEM_ID}::${AWS_EFS_FULL_ACCESS_AP_ID}"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: synapse-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: "10Gi"
  storageClassName: "efs-sc"
  volumeName: synapse-pv
This happened:
Whenever we try to run a workflow, the runner logs the warning below and the workflow request then times out:
[14:57:32] warn: Synapse.Runner.Services.SecretsManager[0]
      Failed to load secrets because there are none or because they are improperly configured. Error: Access to the path '/run/secrets/synapse' is denied.
[14:57:32] info: Microsoft.Hosting.Lifetime[0]
      Application started. Press Ctrl+C to shut down.
[14:57:32] info: Microsoft.Hosting.Lifetime[0]
      Hosting environment: Production
[14:57:32] info: Microsoft.Hosting.Lifetime[0]
      Content root path: /app
[14:57:32] info: System.Net.Http.HttpClient.Default.LogicalHandler[100]
      Start processing HTTP request GET https://synapse.dev-shared.ssmo.appdat.jsc.nasa.gov:8080/.well-known/openid-configuration
[14:57:32] info: System.Net.Http.HttpClient.Default.ClientHandler[100]
      Sending HTTP request GET https://synapse.dev-shared.ssmo.appdat.jsc.nasa.gov:8080/.well-known/openid-configuration
[14:59:12] fail: Synapse.Runner.Services.RunnerApplication[0]
      An error occurred while running the specified workflow instance: System.Threading.Tasks.TaskCanceledException: The request was canceled due to the configured HttpClient.Timeout of 100 seconds elapsing.
       ---> System.TimeoutException: A task was canceled.
       ---> System.Threading.Tasks.TaskCanceledException: A task was canceled.
         at System.Threading.Tasks.TaskCompletionSourceWithCancellation`1.WaitWithCancellationAsync(CancellationToken cancellationToken)
         at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
         at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
         at Microsoft.Extensions.Http.Logging.LoggingHttpMessageHandler.<SendCoreAsync>g__Core|4_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)
         at Microsoft.Extensions.Http.Logging.LoggingScopeHttpMessageHandler.<SendCoreAsync>g__Core|4_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)
         at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellat
         --- End of inner exception stack trace ---
         --- End of inner exception stack trace ---
         at System.Net.Http.HttpClient.HandleFailure(Exception e, Boolean telemetryStarted, HttpResponseMessage response, CancellationTokenSource cts, CancellationToken cancellationToken, CancellationTokenSource pendingRequestsCts)
         at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellat
         at IdentityModel.Client.HttpClientDiscoveryExtensions.GetDiscoveryDocumentAsync(HttpMessageInvoker client, DiscoveryDocumentRequest request, CancellationToken cancellationToken)
         at Synapse.Core.Infrastructure.Services.OAuth2TokenManager.GetTokenAsync(OAuth2AuthenticationSchemeDefinitionBase configuration, CancellationToken cancellationToken) in /src/src/core/Synapse.Core.Infrastructure/Services/OAuth2TokenManager.cs:line 77
         at Program.<>c__DisplayClass0_2.<<<Main>$>b__14>d.MoveNext() in /src/src/runner/Synapse.Runner/Program.cs:line 60
         --- End of stack trace from previous location ---
         at Synapse.Api.Client.Services.ApiClientBase.ProcessRequestAsync(HttpRequestMessage request, CancellationToken cancellationToken) in /src/src/api/Synapse.Api.Client.Http/Services/ApiClientBase.cs:line 62
         at Synapse.Api.Client.Services.ResourceHttpApiClient`1.GetAsync(String name, String namespace, CancellationToken cancellationToken) in /src/src/api/Synapse.Api.Client.Http/Services/ResourceHttpApiClient.cs:line 63
         at Synapse.Runner.Services.RunnerApplication.RunAsync(CancellationToken cancellationToken) in /src/src/runner/Synapse.Runner/Services/RunnerApplication.cs:line 117
[14:59:12] info: Microsoft.Hosting.Lifetime[0]
      Application is shutting down...
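The fatal error in this log is the 100-second timeout on the OpenID discovery request, not the secrets warning. To tell apart a DNS failure, a refused connection, and a silently dropped connection, a TCP-level probe of the host and port from the log URL can help. The following is a minimal sketch, not part of Synapse: `split_host_port` and `probe_tcp` are hypothetical helper names, and it assumes Python 3.9+ is available in a debug pod in the same namespace as the runner. A result of "open" only shows that the port accepts connections; it does not prove that TLS or the HTTP endpoint works.

```python
import socket
import sys
from urllib.parse import urlsplit

# Default ports used when the URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def split_host_port(url: str) -> tuple[str, int]:
    """Return (host, port) for a URL, falling back to the scheme's default port."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"no host in URL: {url!r}")
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    if port is None:
        raise ValueError(f"no port and unknown scheme in URL: {url!r}")
    return parts.hostname, port


def probe_tcp(host: str, port: int, timeout: float = 5.0) -> str:
    """Classify one TCP connection attempt.

    Returns 'open', 'dns-failure', 'refused', 'timeout', or 'error: <detail>'.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.gaierror:
        return "dns-failure"   # the name did not resolve from this pod
    except ConnectionRefusedError:
        return "refused"       # host reachable, nothing listening on the port
    except TimeoutError:
        return "timeout"       # packets dropped, e.g. by a firewall or wrong route
    except OSError as exc:
        return f"error: {exc}"


if __name__ == "__main__" and len(sys.argv) == 2:
    # Pass the discovery URL from the log as the only argument.
    host, port = split_host_port(sys.argv[1])
    print(f"{host}:{port} -> {probe_tcp(host, port)}")
```

Run it with the discovery URL from the log as its argument. A "timeout" result would match the behaviour seen in the runner log, while "refused" or "dns-failure" would point at a different kind of misconfiguration.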
I expected this:
Since an EFS volume is mounted at the path the operator config specifies (/run/secrets/synapse), we expected not to see an access error on that path.
Should we also mount the EFS volume into the correlator, api, and garnet pods? I'm not sure which file system paths a local Synapse deployment binds to.
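One way to narrow down the "Access to the path '/run/secrets/synapse' is denied" warning is to compare the identity of the process with the ownership and mode bits of the directory. The warning is logged by `Synapse.Runner.Services.SecretsManager`, whereas the manifest above mounts the PVC into the operator container, so the check is most informative when run in the pod that actually logs the warning. The following is a minimal sketch under the assumption that the denial is a POSIX permission problem and that Python is available in a Linux debug container sharing the same mount; `describe_access` is a hypothetical helper, not a Synapse API.

```python
import os
import stat


def describe_access(path: str) -> dict:
    """Collect the facts relevant to a 'permission denied' error on a path."""
    info = {
        "path": path,
        "exists": os.path.exists(path),
        "process_uid": os.getuid(),
        "process_gid": os.getgid(),
        "process_groups": sorted(os.getgroups()),
    }
    if not info["exists"]:
        # Either the path is missing or a parent directory cannot be traversed.
        return info

    st = os.stat(path)
    info.update(
        owner_uid=st.st_uid,
        owner_gid=st.st_gid,
        mode=stat.filemode(st.st_mode),           # e.g. 'drwxr-x---'
        can_read=os.access(path, os.R_OK),        # list directory entries
        can_write=os.access(path, os.W_OK),       # create or delete entries
        can_traverse=os.access(path, os.X_OK),    # open files inside it
    )
    return info


if __name__ == "__main__":
    for key, value in describe_access("/run/secrets/synapse").items():
        print(f"{key}: {value}")
```

If `exists` is false in the pod that logs the warning, the volume is probably not mounted there at all. If it exists but `can_read` or `can_traverse` is false, comparing `process_uid` with `owner_uid` and `mode` shows which permission is missing.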
Is there a workaround?
No response
Anything else?
No response
Platform(s)
No response
Community Notes
- Please vote by adding a 👍 reaction to the issue to help us prioritize.
- If you are interested in working on this issue, please leave a comment.