bufferByteSizeThresholdForFlush

bufferByteSizeThresholdForFlush triggers flushing the record buffer to stdout once the buffer's size (in bytes) grows past this value.

Flushing the record buffer to stdout is done by calling println, which is synchronized. The choice of println is imposed by our use of the ConsoleJSONAppender log4j appender, which concurrently calls println to print AirbyteMessages of type LOG to standard output.

Because calling println incurs both a synchronization overhead and a syscall overhead, the connector's performance will noticeably degrade if it's called too often. This happens primarily when emitting lots of tiny RECORD messages, which is typical of source connectors.

For this reason, the bufferByteSizeThresholdForFlush value should not be too small. The default value of 4 kB is good in this respect. For example, if the average serialized record size is 100 bytes, this will reduce the number of println calls by a factor of about 40.
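
To make the arithmetic concrete, here is a minimal Java sketch of threshold-based flushing. It is not the actual implementation: the class BufferedRecordWriter, its methods, and the println counter are hypothetical names introduced for illustration, and the sketch assumes that records are newline-delimited and that 4 kB means 4096 bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: accumulates serialized records and flushes them with one println. */
class BufferedRecordWriter {
    private final PrintStream out;
    private final int bufferByteSizeThresholdForFlush;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private int printlnCalls = 0; // Illustration only: counts how often println is invoked.

    BufferedRecordWriter(PrintStream out, int bufferByteSizeThresholdForFlush) {
        this.out = out;
        this.bufferByteSizeThresholdForFlush = bufferByteSizeThresholdForFlush;
    }

    /** Appends one serialized record; flushes once the buffer grows past the threshold. */
    void accept(String serializedRecord) {
        byte[] bytes = serializedRecord.getBytes(StandardCharsets.UTF_8);
        if (buffer.size() > 0) {
            buffer.write('\n'); // Assumption: records are separated by newlines.
        }
        buffer.write(bytes, 0, bytes.length);
        if (buffer.size() > bufferByteSizeThresholdForFlush) {
            flush();
        }
    }

    /** Writes the whole buffer with a single println call, then empties it. */
    void flush() {
        if (buffer.size() == 0) {
            return;
        }
        out.println(new String(buffer.toByteArray(), StandardCharsets.UTF_8));
        buffer.reset();
        printlnCalls++;
    }

    int printlnCalls() {
        return printlnCalls;
    }
}

class FlushThresholdDemo {
    public static void main(String[] args) {
        // Discard the output; only the number of println calls matters here.
        PrintStream sink = new PrintStream(new ByteArrayOutputStream(), true);
        BufferedRecordWriter writer = new BufferedRecordWriter(sink, 4096);
        String record = "x".repeat(100); // A 100-byte serialized record.
        for (int i = 0; i < 1000; i++) {
            writer.accept(record);
        }
        writer.flush(); // Emit whatever is left in the buffer.
        System.out.println(writer.printlnCalls()); // prints 25
    }
}
```

With these assumptions, 1000 records of 100 bytes each result in 25 println calls instead of 1000, which matches the factor of about 40 mentioned above.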

Conversely, the bufferByteSizeThresholdForFlush value should also not be too large. Otherwise, the output becomes bursty and this also degrades performance. As of today (and hopefully not for long) the platform still pipes the connector's stdout into socat to emit the output as TCP packets. While socat is buffered, its buffer size is only 8 kB. In any case, TCP packet sizes (capped by the MTU) are also in the low kilobytes.