perf(format): cache compact row layout per nested slot#3717
Open
stevenschlansker wants to merge 2 commits into
CompactBinaryRowWriter.writeUnaligned(byte[]) / writeUnaligned(MemoryBuffer) / writeAlignedBytes(MemoryBuffer) passed 0 as the source offset on the inline-fixed-width branch instead of the caller's offset/baseOffset. Any copy from a non-zero source offset (e.g. re-emitting a struct read from a sorted-schema row whose inline field sits past byte 0) wrote the wrong bytes. Pass the caller's offset through in all three methods.
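A minimal sketch of the bug class described in this commit, using plain byte arrays rather than the real writer. The class and method names here (InlineCopySketch, copyInlineBuggy, copyInlineFixed) are hypothetical and are not part of CompactBinaryRowWriter; the sketch only shows how hard-coding 0 as the source offset copies the wrong bytes once the inline field does not start at byte 0.

```java
import java.util.Arrays;

public class InlineCopySketch {
    // Buggy shape: the caller's srcOffset is ignored and the copy always starts at byte 0.
    static void copyInlineBuggy(byte[] src, int srcOffset, byte[] dst, int dstOffset, int width) {
        System.arraycopy(src, 0, dst, dstOffset, width);
    }

    // Fixed shape: the caller's srcOffset is honored.
    static void copyInlineFixed(byte[] src, int srcOffset, byte[] dst, int dstOffset, int width) {
        System.arraycopy(src, srcOffset, dst, dstOffset, width);
    }

    public static void main(String[] args) {
        // Pretend the inline field occupies bytes 4..7 of the source row.
        byte[] src = {10, 11, 12, 13, 14, 15, 16, 17};
        byte[] buggy = new byte[4];
        byte[] fixed = new byte[4];

        copyInlineBuggy(src, 4, buggy, 0, 4);
        copyInlineFixed(src, 4, fixed, 0, 4);

        System.out.println(Arrays.toString(buggy)); // [10, 11, 12, 13]
        System.out.println(Arrays.toString(fixed)); // [14, 15, 16, 17]
    }
}
```

The two variants agree whenever the source offset is 0, which is likely why a bug of this shape can go unnoticed until a field is read from the middle of a row.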
4727dc7 to 7c138fe
CompactRowLayout precomputes per-schema fixed offsets, fixed widths, bitmap width, and child layouts, and is shared across all CompactBinaryRow instances built from the same schema. The writer holds the cached layout and exposes it via BaseBinaryRowWriter.newRow(), so each BinaryRowEncoder.decode call allocates a fresh BinaryRow over the already-built layout tree rather than rebuilding it. A fresh row per decode keeps lazy interface-backed decoders safe: those wrap and retain the source BinaryRow, so reusing one row across decode() calls would silently alias every previously returned proxy onto the next payload. The unused Encoding.newRow(Schema) entry point is removed along with the now-dead codecFactory field on BinaryRowEncoder. Also drops the per-inline-struct getBuffer().slice(...) allocation in CompactBinaryRow.getStruct by pointing the nested row directly at the parent buffer, since all nested reads already add baseOffset.
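The idea of sharing one precomputed layout while still allocating a fresh row per decode can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: Layout, Row, and Decoder are hypothetical stand-ins for CompactRowLayout, CompactBinaryRow, and BinaryRowEncoder, and the byte layout used (a null bitmap of one bit per field followed by the fixed-width fields) is assumed only to make the example concrete.

```java
import java.nio.ByteBuffer;

public class LayoutCacheSketch {
    // Hypothetical stand-in for the cached layout: immutable, computed once per schema.
    static final class Layout {
        final int bitmapWidth;
        final int[] fixedOffsets;

        Layout(int[] fixedWidths) {
            // Assumed layout: one bitmap bit per field, then the fields back to back.
            bitmapWidth = (fixedWidths.length + 7) / 8;
            fixedOffsets = new int[fixedWidths.length];
            int offset = bitmapWidth;
            for (int i = 0; i < fixedWidths.length; i++) {
                fixedOffsets[i] = offset;
                offset += fixedWidths[i];
            }
        }
    }

    // Hypothetical row: cheap to allocate because it only references the shared layout.
    static final class Row {
        final Layout layout;
        ByteBuffer buffer;
        int baseOffset;

        Row(Layout layout) {
            this.layout = layout;
        }

        void pointTo(ByteBuffer buffer, int baseOffset) {
            this.buffer = buffer;
            this.baseOffset = baseOffset;
        }

        int getInt(int field) {
            // Every read adds baseOffset, so a row can sit anywhere inside a buffer.
            return buffer.getInt(baseOffset + layout.fixedOffsets[field]);
        }
    }

    // Hypothetical decoder: holds the layout, hands out a new Row for each payload.
    static final class Decoder {
        final Layout layout;

        Decoder(Layout layout) {
            this.layout = layout;
        }

        Row decode(ByteBuffer payload) {
            Row row = new Row(layout); // fresh row, shared layout
            row.pointTo(payload, 0);
            return row;
        }
    }

    // Builds a two-int payload matching Layout(new int[] {4, 4}).
    static ByteBuffer payload(int a, int b) {
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + 4);
        buf.putInt(1, a);
        buf.putInt(5, b);
        return buf;
    }

    public static void main(String[] args) {
        Decoder decoder = new Decoder(new Layout(new int[] {4, 4}));
        Row first = decoder.decode(payload(1, 2));
        Row second = decoder.decode(payload(3, 4));

        System.out.println(first.layout == second.layout); // true
        System.out.println(first.getInt(0) + " " + second.getInt(0)); // 1 3
    }
}
```

Because `first` keeps its own buffer reference, it still reads 1 after the second decode. If the decoder instead reused a single Row and re-pointed it, anything retaining `first` would start reading the second payload, which is the aliasing hazard the commit message describes for lazy interface-backed decoders.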
7c138fe to 6292e96
Why?
Avoid redundant recomputation of the compact codec's offset layout, saving CPU time and memory allocations.
Also includes fix(format): honor input offset when copying inline-struct bytes.
What does this PR do?
Related issues
AI Contribution Checklist
yes/no
AI Usage Disclosure
Does this PR introduce any user-facing change?
Compact codec users get more CPU and memory left over for business logic, with no changes needed on their side.
No public API or wire changes.
Benchmark