taskmanager.data.bind-port |
(none) |
Integer |
The task manager's bind port used for data exchange operations. If not configured, 'taskmanager.data.port' will be used. |
taskmanager.data.port |
0 |
Integer |
The task manager’s external port used for data exchange operations. |
taskmanager.data.ssl.enabled |
true |
Boolean |
Enable SSL support for the taskmanager data transport. This is applicable only when the global flag for internal SSL (security.ssl.internal.enabled) is set to true. |
taskmanager.network.batch-shuffle.compression.enabled |
true |
Boolean |
Boolean flag indicating whether the shuffle data will be compressed for batch shuffle mode. Note that data is compressed per buffer and compression can incur extra CPU overhead, so it is most effective in I/O-bound scenarios where the compression ratio is high. |
taskmanager.network.blocking-shuffle.type |
"file" |
String |
The blocking shuffle type, either "mmap" or "file". The value "auto" selects the appropriate type automatically based on the system memory architecture (mmap on 64-bit, file on 32-bit). Note that the memory usage of mmap is not accounted for by the configured memory limits, but some resource frameworks such as YARN track this memory usage and kill the container once memory usage exceeds some threshold. Also note that this option is experimental and might be changed in the future. |
taskmanager.network.compression.codec |
"LZ4" |
String |
The codec to be used when compressing shuffle data; only "LZ4", "LZO" and "ZSTD" are currently supported. In TPC-DS tests of these three algorithms, "LZ4" had the highest compression and decompression speed but the lowest compression ratio, "ZSTD" had the highest compression ratio but the slowest compression and decompression speed, and "LZO" fell between the two. Also note that this option is experimental and might be changed in the future. |
taskmanager.network.detailed-metrics |
false |
Boolean |
Boolean flag to enable/disable more detailed metrics about inbound/outbound network queue lengths. |
taskmanager.network.hybrid-shuffle.enable-new-mode |
true |
Boolean |
This option enables the new mode of hybrid shuffle, which resolves existing issues in the legacy mode. First, the new mode requires less network memory. Second, the new mode can store shuffle data in remote storage when disk space is insufficient, which can avoid insufficient disk space errors; this is only supported when taskmanager.network.hybrid-shuffle.remote.path is configured. The new mode is currently in an experimental phase. It can be set to false to fall back to the legacy mode if something unexpected happens. Once the new mode reaches a stable state, the legacy mode as well as this option will be removed. |
taskmanager.network.hybrid-shuffle.num-retained-in-memory-regions-max |
1048576 |
Long |
Controls the maximum number of hybrid retained regions in memory. Note: This option will be ignored if taskmanager.network.hybrid-shuffle.enable-new-mode is set to true. |
taskmanager.network.hybrid-shuffle.remote.path |
(none) |
String |
This option configures the base path of remote storage for hybrid shuffle. The shuffle data will be stored in remote storage when disk space is insufficient. Note: If this option is configured and taskmanager.network.hybrid-shuffle.enable-new-mode is false, this option will be ignored. If this option is not configured and taskmanager.network.hybrid-shuffle.enable-new-mode is true, remote storage will be disabled. |
taskmanager.network.hybrid-shuffle.spill-index-region-group-size |
1024 |
Integer |
Controls the region group size (in bytes) of the hybrid spilled file data index. Note: This option will be ignored if taskmanager.network.hybrid-shuffle.enable-new-mode is set to true. |
taskmanager.network.max-num-tcp-connections |
1 |
Integer |
The maximum number of TCP connections between taskmanagers for data communication. |
taskmanager.network.memory.buffers-per-channel |
2 |
Integer |
Number of exclusive network buffers for each outgoing/incoming channel (subpartition/input channel) in the credit-based flow control model. For the outgoing channel (subpartition), this value is the effective number of exclusive buffers per channel. For the incoming channel (input channel), this value is the maximum number of exclusive buffers per channel; the effective number of exclusive network buffers per channel is dynamically calculated from taskmanager.network.memory.read-buffer.required-per-gate.max and ranges from 0 to the configured value. The minimum valid value for the option is 0. When the option is configured as 0, the number of exclusive network buffers used by each downstream incoming channel will be 0, but for each upstream outgoing channel, max(1, configured value) will be used. In other words, for performance reasons, at least one buffer is used per outgoing channel regardless of the configuration. |
taskmanager.network.memory.floating-buffers-per-gate |
8 |
Integer |
Number of floating network buffers for each outgoing/incoming gate (result partition/input gate). In credit-based flow control mode, this indicates how many floating credits are shared among all the channels. The floating buffers can help relieve back pressure caused by unbalanced data distribution among the subpartitions. For the outgoing gate (result partition), this value is the effective number of floating buffers per gate. For the incoming gate (input gate), this value is a recommended number of floating buffers; the effective number of floating network buffers per gate is dynamically calculated from taskmanager.network.memory.read-buffer.required-per-gate.max and ranges from 0 to (parallelism - 1). |
taskmanager.network.memory.max-buffers-per-channel |
10 |
Integer |
Maximum number of buffers that can be used for each channel. If a channel exceeds this maximum, the task becomes unavailable, which causes back pressure and blocks data processing. This might speed up checkpoint alignment by preventing excessive growth of the buffered in-flight data in case of data skew and a high number of configured floating buffers. This limit is not strictly guaranteed and can be exceeded by things like flatMap operators, records spanning multiple buffers, or a single timer producing a large amount of data. |
taskmanager.network.memory.max-overdraft-buffers-per-gate |
5 |
Integer |
Maximum number of overdraft network buffers to use for each ResultPartition. The overdraft buffers are used when the subtask cannot request normal buffers due to back pressure while it is performing an action that cannot be interrupted in the middle, such as serializing a large record, a flatMap operator producing multiple records for a single input record, or a processing time timer producing large output. In such situations the system allows the subtask to request overdraft buffers so that it can finish the uninterruptible action without blocking unaligned checkpoints for a long period of time. Overdraft buffers are provided on a best-effort basis, only if the system has some unused buffers available. A subtask that has used overdraft buffers won't be allowed to process any more records until the overdraft buffers are returned to the pool. Note that this config option only takes effect for Pipelined Shuffle. |
taskmanager.network.memory.read-buffer.required-per-gate.max |
(none) |
Integer |
The maximum number of network read buffers that are required by an input gate. (An input gate is responsible for reading data from all subtasks of an upstream task.) The number of buffers needed by an input gate is dynamically calculated at runtime, depending on various factors (e.g., the parallelism of the upstream task). Among the calculated number of needed buffers, the part below this configured value is required, while the excess part, if any, is optional. A task will fail if the required buffers cannot be obtained at runtime. A task will not fail due to not obtaining optional buffers, but may suffer a performance reduction. If not explicitly configured, the default value is Integer.MAX_VALUE for streaming workloads, and 1000 for batch workloads. If explicitly configured, the configured value should be at least 1. |
taskmanager.network.netty.client.connectTimeoutSec |
120 |
Integer |
The Netty client connection timeout. |
taskmanager.network.netty.client.numThreads |
-1 |
Integer |
The number of Netty client threads. |
taskmanager.network.netty.client.tcp.keepCount |
(none) |
Integer |
The maximum number of keepalive probes TCP should send before the Netty client drops the connection. Note: This will not take effect when using netty transport type of nio with an older version of JDK 8, refer to https://bugs.openjdk.org/browse/JDK-8194298. |
taskmanager.network.netty.client.tcp.keepIdleSec |
(none) |
Integer |
The time (in seconds) the connection needs to remain idle before TCP starts sending keepalive probes. Note: This will not take effect when using netty transport type of nio with an older version of JDK 8, refer to https://bugs.openjdk.org/browse/JDK-8194298. |
taskmanager.network.netty.client.tcp.keepIntervalSec |
(none) |
Integer |
The time (in seconds) between individual keepalive probes. Note: This will not take effect when using netty transport type of nio with an older version of JDK 8, refer to https://bugs.openjdk.org/browse/JDK-8194298. |
taskmanager.network.netty.num-arenas |
-1 |
Integer |
The number of Netty arenas. |
taskmanager.network.netty.sendReceiveBufferSize |
0 |
Integer |
The Netty send and receive buffer size. This defaults to the system buffer size (cat /proc/sys/net/ipv4/tcp_[rw]mem) and is 4 MiB in modern Linux. |
taskmanager.network.netty.server.backlog |
0 |
Integer |
The Netty server connection backlog. |
taskmanager.network.netty.server.numThreads |
-1 |
Integer |
The number of Netty server threads. |
taskmanager.network.netty.transport |
"auto" |
String |
The Netty transport type, either "nio" or "epoll". The value "auto" selects the appropriate mode automatically based on the platform. Note that the "epoll" mode can achieve better performance and less GC, and offers more advanced features that are only available on modern Linux. |
taskmanager.network.request-backoff.initial |
100 |
Integer |
Minimum backoff in milliseconds for partition requests of input channels. |
taskmanager.network.request-backoff.max |
10000 |
Integer |
Maximum backoff in milliseconds for partition requests of input channels. |
taskmanager.network.retries |
0 |
Integer |
The number of retry attempts for network communication. Currently it is only used for establishing input/output channel connections. |
taskmanager.network.sort-shuffle.min-buffers |
512 |
Integer |
Minimum number of network buffers required per blocking result partition for sort-shuffle. For production usage, it is suggested to increase this config value to at least 2048 (64 MB of memory if the default 32 KB memory segment size is used) to improve the data compression ratio and reduce the number of small network packets. Usually, several hundred megabytes of memory is enough for large-scale batch jobs. Note: you may also need to increase the size of total network memory to avoid the 'insufficient number of network buffers' error if you are increasing this config value. |
taskmanager.network.sort-shuffle.min-parallelism |
1 |
Integer |
Parallelism threshold for switching between sort-based and hash-based blocking shuffle: batch jobs with parallelism below this value use hash-shuffle, and batch jobs with parallelism greater than or equal to it use sort-shuffle. The value 1 means that sort-shuffle is the default option. Note: For production usage, you may also need to tune 'taskmanager.network.sort-shuffle.min-buffers' and 'taskmanager.memory.framework.off-heap.batch-shuffle.size' for better performance. |
taskmanager.network.tcp-connection.enable-reuse-across-jobs |
true |
Boolean |
Whether to reuse TCP connections across multiple jobs. If set to true, TCP connections will not be released after a job finishes, so subsequent jobs avoid the overhead of re-establishing connections. However, this may increase the total number of connections on your machine. When the number reaches the upper limit, you can set this option to false to release idle connections. Note that to avoid connection leaks, you must set taskmanager.network.max-num-tcp-connections to a smaller value before you enable TCP connection reuse. |
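
Several descriptions in the table above state small rules that are easy to check with arithmetic: the memory implied by taskmanager.network.sort-shuffle.min-buffers, the sort/hash threshold of taskmanager.network.sort-shuffle.min-parallelism, the max(1, configured value) rule of taskmanager.network.memory.buffers-per-channel, and the required/optional split of taskmanager.network.memory.read-buffer.required-per-gate.max. The following is a minimal Python sketch of those rules as they are worded in the descriptions. The function names are hypothetical and are not part of any Flink API, and the sketch does not reproduce how Flink computes the number of needed buffers at runtime.

```python
from typing import Optional, Tuple

# Hypothetical helpers that restate rules from the option descriptions.
# None of these names exist in Flink; they are for illustration only.

INT_MAX = 2**31 - 1  # value of Java's Integer.MAX_VALUE


def sort_shuffle_memory_bytes(min_buffers: int, segment_size_bytes: int = 32 * 1024) -> int:
    """Memory implied by sort-shuffle.min-buffers, assuming one memory segment per buffer."""
    return min_buffers * segment_size_bytes


def blocking_shuffle_kind(parallelism: int, min_parallelism: int = 1) -> str:
    """Sort-shuffle at or above the threshold, hash-shuffle below it."""
    return "sort" if parallelism >= min_parallelism else "hash"


def outgoing_exclusive_buffers(buffers_per_channel: int = 2) -> int:
    """Outgoing channels always get at least one exclusive buffer."""
    if buffers_per_channel < 0:
        raise ValueError("buffers-per-channel must be at least 0")
    return max(1, buffers_per_channel)


def split_read_buffers(needed: int, required_max: Optional[int] = None,
                       batch: bool = False) -> Tuple[int, int]:
    """Split the needed read buffers of an input gate into (required, optional)."""
    if required_max is None:
        # Documented defaults when the option is not explicitly configured.
        required_max = 1000 if batch else INT_MAX
    elif required_max < 1:
        raise ValueError("an explicitly configured value should be at least 1")
    required = min(needed, required_max)
    return required, needed - required


if __name__ == "__main__":
    print(sort_shuffle_memory_bytes(2048) // (1024 * 1024), "MiB")
    print(blocking_shuffle_kind(parallelism=4, min_parallelism=8))
    print(outgoing_exclusive_buffers(0))
    print(split_read_buffers(1500, batch=True))
```

With the suggested production value of 2048 buffers and the default 32 KB segment size, the first helper returns 64 × 1024 × 1024 bytes, which matches the 64 MB figure quoted in the sort-shuffle.min-buffers description.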