Postgres 序列可以无序提交

  • Sequin: A Postgres CDC tool like Kafka, SQS, etc. Stable, sequential ordering is useful in streaming systems but Postgres' behavior can be surprising.
  • MVCC and Postgres: MVCC systems like Postgres are strongly consistent but can seem eventually consistent from a client's perspective. For example, in a table with inserted_at field, query results can change even if the table is effectively immutable. This is because operations on different processes can finish in different orders and default values like now() are resolved at the beginning of queries.
  • Cursor-based pagination issue: In cursor-based pagination, due to query volatility at the tail, rows can be skipped. This is a common problem in many APIs that use cursor-based pagination, like Salesforce, Stripe, and HubSpot.
  • Mitigation strategies:

    • Serializing writes with a sequence table: Create a new table to serialize writes and eliminate race conditions. Use a trigger function to set access_logs.seq on insert. Optimization includes making sure my_seq is not touched until the last possible moment in a transaction and putting the seq on a different table if rows are heavy.
    • Intelligently restrict reads: Avoid the tail of a table by setting a max/upper limit on queries or finding a way to read the table up to the current consistent state. Options include using pg_snapshot_xmin, pg_stat_activity, and advisory locks. Advisory locks are a key property needed as they can affect observable state from inside a transaction for other sessions to read. Stacking locks can be used to communicate two pieces of information. Functions can be created to encapsulate this logic for easier use.
  • Suggestions for Postgres: Provide more support on the read side, such as making the 2-arity version of pg_try_advisory_xact_lock accept a bigint and having something like max_safe_seq supported first-class for sequences.
阅读 14
0 条评论