feat: extract compute_hash function #928
Conversation
Walkthrough

This change adds two PostgreSQL functions: `compute_hash`, which computes a log record's hash from the previous hash, and `set_log_hash`, a trigger function that assigns the chained hash to each newly inserted log.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant DB as PostgreSQL
    participant Trigger as set_log_hash (trigger)
    participant Func as compute_hash (function)
    Note over Trigger: On new log insertion
    Trigger->>DB: Query previous hash from logs table
    Trigger->>Func: Call compute_hash(previous_hash, new_record)
    Func-->>Trigger: Return computed hash
    Trigger->>DB: Assign new hash to inserted log record
```
Force-pushed from 192f849 to 57cbcd2
Force-pushed from 57cbcd2 to 62106f4
Actionable comments posted: 1
🧹 Nitpick comments (4)
internal/storage/bucket/migrations/35-create-compute-hash-function/up.sql (4)

Lines 1-7: Declare function immutability and use `TEXT` for JSON

`compute_hash` is a pure, deterministic function: consider marking it `IMMUTABLE` so Postgres can optimize calls. Also prefer `TEXT` over `VARCHAR` without length for potentially large JSON strings. For example:

```diff
 create or replace function compute_hash(previous_hash bytea, r logs)
     returns bytea
-    language plpgsql
+    language plpgsql
+    immutable
 as
 $$
-declare
-    marshalledAsJSON varchar;
+declare
+    marshalledAsJSON text;
 begin
```
Lines 19-25: Use `convert_to` for reliable encoding

Casting text to `bytea` with `::bytea` depends on server encoding and can vary. Replace it with `convert_to(..., 'UTF8')` for deterministic UTF-8 byte conversion. For example:

```diff
-return (select public.digest(
-    case
-        when previous_hash is null
-        then marshalledAsJSON::bytea
-        else '"' || encode(previous_hash, 'base64')::bytea || E'"\n' || marshalledAsJSON::bytea
-    end || E'\n', 'sha256'
-));
+return public.digest(
+    case
+        when previous_hash is null
+        then convert_to(marshalledAsJSON || E'\n', 'UTF8')
+        else convert_to('"' || encode(previous_hash, 'base64') || '"' || E'\n' || marshalledAsJSON || E'\n', 'UTF8')
+    end,
+    'sha256'
+);
```
Line 27: Avoid relying on `search_path` for schema resolution

`SET search_path` can be overridden and can introduce ambiguity. It's safer to schema-qualify object names (e.g., `{{ .Schema }}.logs`) in function bodies so you're always referencing the intended schema.

Also applies to: line 48
Lines 38-42: Add index for optimized lookup

Every insert queries the latest `hash` by `ledger` ordered by `seq DESC`. On a large `logs` table, this can lead to full scans. Consider creating a b-tree index on `(ledger, seq DESC)` or a covering index to speed up this lookup.
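Such an index could be sketched as a follow-up migration (the index name is hypothetical; this assumes the templated `{{ .Schema }}` and the `ledger`/`seq`/`hash` columns discussed in this review):

```sql
-- Hypothetical sketch: b-tree index matching the "latest hash per ledger" lookup.
-- INCLUDE (hash) makes it covering, so the lookup can be answered from the index alone.
create index if not exists logs_ledger_seq_desc_idx
    on {{ .Schema }}.logs (ledger, seq desc)
    include (hash);
```

If the migration runner permits it, `CREATE INDEX CONCURRENTLY` avoids blocking writes on a large table, though it cannot run inside a transaction block.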
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)

- `internal/storage/bucket/migrations/35-create-compute-hash-function/notes.yaml` is excluded by `!**/*.yaml`

📒 Files selected for processing (1)

- `internal/storage/bucket/migrations/35-create-compute-hash-function/up.sql` (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Tests
🔇 Additional comments (3)
internal/storage/bucket/migrations/35-create-compute-hash-function/up.sql (3)
Lines 29-33: Verify trigger attachment and security definer usage

You’ve defined `set_log_hash()` as a `security definer` trigger function. Please confirm:

- A `CREATE TRIGGER ... BEFORE INSERT ON {{ .Schema }}.logs` exists in a separate migration to invoke this function.
- The function owner and `search_path` are locked down to prevent privilege escalation.
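One shape the companion migration might take (a sketch only; the trigger name and the `search_path` pin are assumptions, not code from this PR):

```sql
-- Hypothetical sketch: attach set_log_hash() to the logs table and pin the
-- function's search_path so a security definer function cannot resolve
-- objects from a caller-controlled schema.
create trigger set_log_hash
    before insert on {{ .Schema }}.logs
    for each row
execute function {{ .Schema }}.set_log_hash();

alter function {{ .Schema }}.set_log_hash()
    set search_path = {{ .Schema }}, pg_temp;
```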
Lines 35-37: Declaring `previousHash bytea` is straightforward and requires no changes.
Line 44: Verify behavior on first record

When there’s no previous log for a ledger, `previousHash` is `NULL` and you rely on `compute_hash`’s branch for `NULL`. Please confirm this is the intended behavior for the genesis log entry.
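One way to exercise the genesis branch by hand (a psql sketch; assumes `compute_hash` and the `logs` table exist in the templated schema):

```sql
-- With a NULL previous hash, compute_hash should hash only the marshalled
-- record, which is exactly the genesis-log situation described above.
select compute_hash(null::bytea, l)
from {{ .Schema }}.logs l
order by seq
limit 1;
```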
```sql
select '{' ||
       '"type":"' || r.type || '",' ||
       '"data":' || encode(r.memento, 'escape') || ',' ||
       '"date":"' || (to_json(r.date::timestamp)#>>'{}') || 'Z",' ||
       '"idempotencyKey":"' || coalesce(r.idempotency_key, '') || '",' ||
       '"id":0,' ||
       '"hash":null' ||
       '}' into marshalledAsJSON;
```
🛠️ Refactor suggestion
Use JSONB constructors to build JSON safely
Manual string concatenation can produce invalid or insecure JSON if any field contains quotes, backslashes, or other special characters. Leverage `jsonb_build_object` (or `row_to_json`) to serialize the record reliably.
Example refactor:
```diff
-declare
-    marshalledAsJSON varchar;
+declare
+    marshalledAsJSON text;
 begin
-    select '{' ||
-        '"type":"' || r.type || '",' ||
-        '"data":' || encode(r.memento, 'escape') || ',' ||
-        '"date":"' || (to_json(r.date::timestamp)#>>'{}') || 'Z",' ||
-        '"idempotencyKey":"' || coalesce(r.idempotency_key, '') || '",' ||
-        '"id":0,' ||
-        '"hash":null' ||
-        '}' into marshalledAsJSON;
+    marshalledAsJSON := to_jsonb(
+        jsonb_build_object(
+            'type', r.type,
+            'data', encode(r.memento, 'escape'),
+            'date', to_char(r.date, 'YYYY-MM-DD"T"HH24:MI:SS"Z"'),
+            'idempotencyKey', coalesce(r.idempotency_key, ''),
+            'id', 0,
+            'hash', null
+        )
+    )::text;
```
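The escaping difference is easy to demonstrate in isolation (a standalone psql example, not part of the migration):

```sql
-- jsonb_build_object escapes embedded quotes and backslashes automatically,
-- whereas the concatenation approach would emit them verbatim and break the JSON.
select jsonb_build_object('type', 'NEW_TRANSACTION"x')::text;
-- {"type": "NEW_TRANSACTION\"x"}
```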
🤖 Prompt for AI Agents
In internal/storage/bucket/migrations/35-create-compute-hash-function/up.sql
around lines 10 to 17, the JSON string is built using manual string
concatenation, which risks invalid or insecure JSON if fields contain special
characters. Replace the concatenation with a call to jsonb_build_object (or
row_to_json) to construct the JSON object safely and correctly serialize all
fields, ensuring proper escaping and formatting.
No description provided.