
Crash (null-pointer access) in rd_kafka_metadata_cache_entry_by_id_cmp() during rd_avl_insert() #4778

@GerKr

Description

After about two months of use, the application crashed. The crash dump shows that the crash happened in rd_kafka_metadata_cache_entry_by_id_cmp(const void *_a, const void *_b), where _b is 0x0000000000000000. This leads to a write access at address 0x0000000000000088.
During the call to rd_avl_insert(ravl, elm, ran), the variable ravl contains:
ravl->ravl_root == 0x12eaf651500 (not a null pointer)
ravl->ravl_root->ran_height == 0x61657268 (read as text, these bytes are "hrea", part of the "hread_0" string found in the memory dump below)
ravl->ravl_root->ran_elm == 0x0000000000000000 (null pointer!)
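
The fault address 0x88 would be consistent with a field access through the null _b pointer: the address touched is the null base plus the offset of the field being compared. The following is only a sketch of that arithmetic. The struct `entry_view` and its field names are hypothetical and are not taken from the librdkafka sources; the only assumption is that the compared field sits 0x88 bytes into the cache entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the cache entry (NOT the librdkafka struct):
 * all that matters here is that the compared field starts at offset 0x88. */
struct entry_view {
    unsigned char before[0x88];   /* whatever precedes the compared field */
    unsigned char compared_field[16];
};

/* Compute the address a field access would touch for a given base pointer,
 * without dereferencing anything. */
static uintptr_t field_address(const void *base) {
    return (uintptr_t)base + offsetof(struct entry_view, compared_field);
}
```

Under that assumption, `field_address(NULL)` evaluates to 0x88 on common platforms, which matches the reported address.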

The memory that ravl->ravl_root points to looks as follows:
0f 00 00 00 00 00 00 00 00 40 89 b8 2e 01 00 00 68 72 65 61 64 5f 30 00 00 00 00 00 00 00 00 00 0f 00 00 00 00 00 00 00 00 74 e2 b6 2e 01 00 00 06 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0f 00 00 00 00 00 00 00 00 65 74 43 68 61 6e
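
To make the dump easier to read, here is a small sketch that decodes its first 32 bytes as one AVL node. The layout in `node_view` (two 8-byte child pointers, a 4-byte height, 4 bytes of padding, an 8-byte element pointer) is an assumption about how the node is laid out on 64-bit Windows, not something copied from the sources. All helper names are made up for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed 64-bit node layout (hypothetical mirror, not the real struct). */
struct node_view {
    uint64_t child[2]; /* bytes 0..15: left/right child pointers */
    int32_t  height;   /* bytes 16..19 (bytes 20..23 assumed padding) */
    uint64_t elm;      /* bytes 24..31: pointer to the containing element */
};

/* First 32 bytes of the memory dump from the report. */
static const unsigned char dump[32] = {
    0x0f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x40, 0x89, 0xb8, 0x2e, 0x01, 0x00, 0x00,
    0x68, 0x72, 0x65, 0x61, 0x64, 0x5f, 0x30, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
};

/* Read 8 bytes as a little-endian integer, independent of the host. */
static uint64_t le64(const unsigned char *p) {
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Decode one node from raw bytes using the assumed layout. */
static struct node_view decode_node(const unsigned char *p) {
    struct node_view n;
    n.child[0] = le64(p);
    n.child[1] = le64(p + 8);
    n.height   = (int32_t)(le64(p + 16) & 0xffffffffu);
    n.elm      = le64(p + 24);
    return n;
}

/* Print the decoded fields plus the height bytes interpreted as text. */
static void print_node(const unsigned char *p) {
    struct node_view n = decode_node(p);
    char text[5];
    memcpy(text, p + 16, 4);
    text[4] = '\0';
    printf("child[0]=0x%" PRIx64 " child[1]=0x%" PRIx64 "\n",
           n.child[0], n.child[1]);
    printf("height=0x%" PRIx32 " (\"%s\") elm=0x%" PRIx64 "\n",
           (uint32_t)n.height, text, n.elm);
}
```

Calling `print_node(dump)` with this layout gives a height of 0x61657268 and a null element pointer, which agrees with the debugger values listed above. The two child values (0xf and a pointer-sized value) do not look like a valid pair of child pointers either, which suggests the whole node was overwritten by unrelated data.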

The call stack is as follows:

librdkafka.dll!rd_kafka_metadata_cache_entry_by_id_cmp(const void * _a, const void * _b) Line 700 C
librdkafka.dll!rd_avl_insert_node(rd_avl_s * ravl, rd_avl_node_s * parent, rd_avl_node_s * ran, rd_avl_node_s * * existing) Line 104 C
librdkafka.dll!rd_kafka_metadata_cache_insert(rd_kafka_s * rk, const rd_kafka_metadata_topic * mtopic, const rd_kafka_metadata_topic_internal_s * metadata_internal_topic, __int64 now, __int64 ts_expires, unsigned char include_racks, rd_kafka_metadata_broker_internal_s * brokers_internal, unsigned __int64 broker_cnt) Line 380 C
librdkafka.dll!rd_kafka_metadata_cache_topic_update(rd_kafka_s * rk, const rd_kafka_metadata_topic * mdt, const rd_kafka_metadata_topic_internal_s * mdit, unsigned char propagate, unsigned char include_racks, rd_kafka_metadata_broker_internal_s * brokers, unsigned __int64 broker_cnt, unsigned char only_existing) Line 508 C
librdkafka.dll!rd_kafka_parse_Metadata0(rd_kafka_broker_s * rkb, rd_kafka_buf_s * request, rd_kafka_buf_s * rkbuf, rd_kafka_metadata_internal_s * * mdip, rd_list_s * request_topics, const char * reason) Line 857 C
librdkafka.dll!rd_kafka_parse_Metadata(rd_kafka_broker_s * rkb, rd_kafka_buf_s * request, rd_kafka_buf_s * rkbuf, rd_kafka_metadata_internal_s * * mdip) Line 1113 C
librdkafka.dll!rd_kafka_handle_Metadata(rd_kafka_s * rk, rd_kafka_broker_s * rkb, rd_kafka_resp_err_t err, rd_kafka_buf_s * rkbuf, rd_kafka_buf_s * request, void * opaque) Line 2490 C
librdkafka.dll!rd_kafka_buf_callback(rd_kafka_s * rk, rd_kafka_broker_s * rkb, rd_kafka_resp_err_t err, rd_kafka_buf_s * response, rd_kafka_buf_s * request) Line 512 C
librdkafka.dll!rd_kafka_buf_handle_op(rd_kafka_op_s * rko, rd_kafka_resp_err_t err) Line 453 C
librdkafka.dll!rd_kafka_op_handle_std(rd_kafka_s * rk, rd_kafka_q_s * rkq, rd_kafka_op_s * rko, int cb_type) Line 884 C
librdkafka.dll!rd_kafka_op_handle(rd_kafka_s * rk, rd_kafka_q_s * rkq, rd_kafka_op_s * rko, rd_kafka_q_cb_type_t cb_type, void * opaque, rd_kafka_op_res_t(*)(rd_kafka_s *, rd_kafka_q_s *, rd_kafka_op_s *, rd_kafka_q_cb_type_t, void *) callback) Line 916 C
librdkafka.dll!rd_kafka_q_serve(rd_kafka_q_s * rkq, int timeout_ms, int max_cnt, rd_kafka_q_cb_type_t cb_type, rd_kafka_op_res_t(*)(rd_kafka_s *, rd_kafka_q_s *, rd_kafka_op_s *, rd_kafka_q_cb_type_t, void *) callback, void * opaque) Line 581 C
librdkafka.dll!rd_kafka_thread_main(void * arg) Line 2138 C
librdkafka.dll!_thrd_wrapper_function(void * aArg) Line 589 C
kernel32.dll!00007ffd36527e94() Unknown
ntdll.dll!RtlUserThreadStart() Unknown

How to reproduce

I don't know how to reproduce this. It did not occur during the previous two months of operation, so it appears to be a sporadic problem.
The crash dump, the matching PDB file, and the sources (v2.4.0) are available. On request I can provide the values of variables or memory dumps.

Checklist

Please provide the following information:

  • librdkafka version (release number or git tag): v2.4.0
  • Apache Kafka version: v2.8.1
  • librdkafka client configuration: N/A
  • Operating system: Windows Server 2019
  • Provide logs: call stack, see "Description"
  • Provide broker log excerpts: N/A
  • Critical issue: the crash kills the application
