The natural way to use the KCL
As Sylvite is a wrapper around the KCL, all of the KCL configuration is supported.
Configuration values can be set via either environment variables or a configuration file. See the usage page for examples.
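As a rough illustration of environment-based configuration, the sketch below reads two integer settings named later on this page, MAX_RECORDS and IDLE_TIME_BETWEEN_READS_IN_MILLIS, and falls back to their documented defaults. It assumes the upper-case names used on this page are also the environment variable names. The helper functions are hypothetical and are not part of Sylvite or the KCL. The second helper estimates a per-shard polling ceiling, which may help when tuning the two values together.

```python
import os

# Defaults as documented on this page.
DEFAULTS = {
    "MAX_RECORDS": 10000,
    "IDLE_TIME_BETWEEN_READS_IN_MILLIS": 1000,
}


def read_int_settings(environ=os.environ):
    """Return the integer settings, falling back to the documented defaults.

    Hypothetical helper: assumes the environment variable names match the
    upper-case setting names used on this page.
    """
    return {name: int(environ.get(name, default)) for name, default in DEFAULTS.items()}


def max_records_per_second(settings):
    """Rough upper bound on records fetched per second for one shard.

    Ignores request latency and any service-side limits, so real
    throughput will be lower.
    """
    idle_millis = settings["IDLE_TIME_BETWEEN_READS_IN_MILLIS"]
    if idle_millis <= 0:
        raise ValueError("idle time must be positive for this estimate")
    return settings["MAX_RECORDS"] / (idle_millis / 1000)


if __name__ == "__main__":
    settings = read_int_settings()
    print(settings, max_records_per_second(settings))
```

If the stream receives records faster than this ceiling, the worker would fall behind, which suggests raising MAX_RECORDS or lowering the idle time.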
The following settings are required.
Type: string
Used by the KCL as the name of this application. It is also used as the name of an Amazon DynamoDB table that stores the lease and checkpoint information for workers with this application name.
Type: string
The child executable to spawn for each shard. The executable must speak the KCL multi-lang protocol. All environment variables passed to the parent process will be inherited by the child process.
Type: string
The Kinesis stream to process.
The following settings are optional and all have default values if not set.
Type: string
Default: DefaultAWSCredentialsProviderChain
The default credential provider used to sign AWS requests.
Type: boolean
Default: false
Call the RecordProcessor::processRecords() API even if GetRecords returned an empty record list.
Type: boolean
Default: true
Clean up leases upon shard completion (don't wait until they expire in Kinesis). Keeping leases takes some tracking/resources (e.g. they need to be renewed and assigned), so by default we try to delete the ones we no longer need.
Type: string
Default: DefaultAWSCredentialsProviderChain
AWSCredentialsProvider for CloudWatch access. Should only be set if you want to override the default AWS_CREDENTIALS_PROVIDER.
Type: string
Default: DefaultAWSCredentialsProviderChain
AWSCredentialsProvider for DynamoDB access. Should only be set if you want to override the default AWS_CREDENTIALS_PROVIDER.
Type: integer
Default: 10000
Failover time in milliseconds. A worker that does not renew its lease within this interval is regarded as having problems, and its shards are assigned to other workers. For applications that have a large number of shards, this may be set to a higher number to reduce the number of DynamoDB IOPS required for tracking leases.
Type: integer
Default: 1000
Idle time between calls to fetch data from Kinesis. This should be tuned with MAX_RECORDS in order to ensure you are not falling behind.
Type: integer
Default: 10
The Amazon DynamoDB table used for tracking leases will be provisioned with this read capacity. Only applies if the table does not exist; otherwise the capacity is not changed.
Type: integer
Default: 10
The Amazon DynamoDB table used for tracking leases will be provisioned with this write capacity. Only applies if the table does not exist; otherwise the capacity is not changed.
Type: one of [LATEST, TRIM_HORIZON]
Default: TRIM_HORIZON
One of LATEST or TRIM_HORIZON. The Amazon Kinesis Client Library will start fetching records from this position when the application starts up if there are no checkpoints. If there are checkpoints, it will process records from the checkpoint position.
Type: string
Default: DefaultAWSCredentialsProviderChain
AWSCredentialsProvider for Kinesis access. Should only be set if you want to override the default AWS_CREDENTIALS_PROVIDER.
Type: integer
Default: 0
The maximum number of threads the multi-lang daemon will use. The default value of 0 does not limit the number of threads and should only be changed if you really know what you're doing.
Type: integer
Default: 2,147,483,647
The max number of leases (shards) this worker should process. This can be useful to avoid overloading (and thrashing) a worker when a host has resource constraints or during deployment. NOTE: Setting this to a low value can cause data loss if workers are not able to pick up all shards in the stream due to the max limit.
Type: integer
Default: 1
Max leases to steal from a more loaded Worker at one time (for load balancing). Setting this to a higher number can allow for faster load convergence (e.g. during deployments, cold starts), but can cause higher churn in the system.
Type: integer
Default: 10000
Max records to fetch in a Kinesis getRecords() call. This should be tuned with IDLE_TIME_BETWEEN_READS_IN_MILLIS in order to ensure you are not falling behind.
Type: integer
Default: 10000
Metrics are buffered for at most this long before publishing to CloudWatch.
Type: string
Default: null
Sets the dimensions that are allowed to be emitted in metrics.
Type: one of [NONE, SUMMARY, DETAILED]
Default: DETAILED
Sets the metrics level that should be enabled. Possible values are NONE, SUMMARY, and DETAILED.
Type: integer
Default: 10000
Max number of metrics to buffer before publishing to CloudWatch.
Type: integer
Default: 10000
Interval in milliseconds between polling to check for parent shard completion. Polling frequently will take up more DynamoDB IOPS (when there are leases for shards waiting on completion of parent shards).
Type: string
Default: null
The language you are using to process the stream. This has no purpose other than augmenting the multi-lang user-agent string.
Type: string
Default: null
The AWS region name for the service.
Type: integer
Default: 60000
Shard sync interval in milliseconds, i.e. how long to wait between shard sync tasks.
Type: integer
Default: 500
Backoff time in milliseconds for Amazon Kinesis Client Library tasks (in the event of failures).
Type: string
Default: null
Override the default user-agent used in AWS requests.
Type: boolean
Default: true
Whether the KCL should validate client-provided sequence numbers with a call to Amazon Kinesis before actually checkpointing. If true, this makes a Kinesis getIterator() API call, which may cause throttling errors if you are checkpointing frequently.
Type: string
Default: null
Explicit worker id for the given worker, used to distinguish different workers/processes of a Kinesis application. These must be unique between workers.
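One way to satisfy the uniqueness requirement is to combine the host name with a random UUID. The sketch below is only an illustration: make_worker_id is a hypothetical helper, not part of Sylvite or the KCL, and the resulting string would still need to be supplied through whichever configuration mechanism you use.

```python
import socket
import uuid


def make_worker_id():
    """Build a worker id that is very unlikely to collide across workers.

    Hypothetical helper: the host name makes the id easy to trace back to
    a machine, and the random UUID separates processes on the same host.
    """
    return f"{socket.gethostname()}:{uuid.uuid4()}"


if __name__ == "__main__":
    print(make_worker_id())
```

Because the UUID is regenerated on every call, a restarted process gets a new id. That may be what you want if a restarted worker should not be treated as the same worker it replaced.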