
[fix](binlog) Fix missing binlog column index when converting TTabletSchema to TabletSchemaPB #64484

Open
heguanhui wants to merge 4 commits into apache:master from heguanhui:fix/fix-binlog-be-coredump-issue

@heguanhui commented Jun 13, 2026

Contributor

What problem does this PR solve?

Issue Number: https://github.com/apache/doris/issues/64483

Problem Summary:
Running GroupRowsetWriterTest.sub_writer_rollback trips a fatal CHECK and produces a coredump:
F20260613 19:13:27.347458 row_binlog_segment_writer.cpp:69] Check failed: lsn_col_id >= 0 binlog schema missing DORIS_BINLOG_LSN

Root Cause

TabletMeta::init_schema_from_thrift() does not set binlog_lsn_col_idx and binlog_timestamp_col_idx in TabletSchemaPB when converting from TTabletSchema. Both fields therefore remain -1 after deserialization, and the lsn_col_id >= 0 CHECK in row_binlog_segment_writer.cpp fails.

Solution

Add logic in init_schema_from_thrift() to:

  1. Detect binlog special columns (__DORIS_BINLOG_LSN__, __DORIS_BINLOG_TIMESTAMP__) by name
  2. Record their column indices
  3. Set the corresponding fields in TabletSchemaPB
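
The three steps above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: ColumnDesc, SchemaIndices and find_binlog_col_indices are hypothetical stand-ins for the thrift column list, the two TabletSchemaPB fields, and the new logic in init_schema_from_thrift(). Only the column names and the -1 "not set" value come from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a thrift column; the real type has many more fields.
struct ColumnDesc {
    std::string name;
};

// Hypothetical stand-in for the two binlog fields of TabletSchemaPB.
struct SchemaIndices {
    int32_t binlog_lsn_col_idx = -1;        // -1 means "not present"
    int32_t binlog_timestamp_col_idx = -1;  // -1 means "not present"
};

// Special column names as given in the PR description.
const std::string kBinlogLsnCol = "__DORIS_BINLOG_LSN__";
const std::string kBinlogTimestampCol = "__DORIS_BINLOG_TIMESTAMP__";

// Scan the columns once, detect the binlog columns by name,
// and record the position of each one that is found.
SchemaIndices find_binlog_col_indices(const std::vector<ColumnDesc>& columns) {
    SchemaIndices out;
    for (std::size_t i = 0; i < columns.size(); ++i) {
        if (columns[i].name == kBinlogLsnCol) {
            out.binlog_lsn_col_idx = static_cast<int32_t>(i);
        } else if (columns[i].name == kBinlogTimestampCol) {
            out.binlog_timestamp_col_idx = static_cast<int32_t>(i);
        }
    }
    return out;
}
```

With indices recorded this way, a schema without binlog columns keeps -1 in both fields, so a later check such as lsn_col_id >= 0 still distinguishes "no binlog column" from "index was never filled in".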

Additionally, fix a nullable type mismatch for the timestamp column in _fill_binlog_columns() by using check_and_get_column instead of assert_cast.
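
The difference between a checked lookup and an asserting cast can be illustrated with a small sketch. It assumes the timestamp column may arrive either plain or wrapped as nullable, and that its payload is a 64-bit integer; neither detail is spelled out in the description. Column, Int64Column, NullableColumn, checked_get and resolve_int64 are hypothetical simplifications, not the Doris column classes or the actual _fill_binlog_columns() code.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical, heavily simplified column hierarchy.
struct Column {
    virtual ~Column() = default;
};

struct Int64Column : Column {
    std::vector<int64_t> data;
};

struct NullableColumn : Column {
    std::shared_ptr<Column> nested;  // the wrapped values
    std::vector<uint8_t> null_map;   // 1 marks a NULL row
};

// Checked downcast: yields nullptr on a type mismatch instead of aborting,
// which is the behavior the fix relies on.
template <typename T>
const T* checked_get(const Column& col) {
    return dynamic_cast<const T*>(&col);
}

// Resolve the Int64 payload whether or not the column is wrapped as nullable.
const Int64Column* resolve_int64(const Column& col) {
    if (const auto* nullable = checked_get<NullableColumn>(col)) {
        if (!nullable->nested) {
            return nullptr;
        }
        return checked_get<Int64Column>(*nullable->nested);
    }
    return checked_get<Int64Column>(col);
}
```

A caller can then branch on the returned pointer, whereas an asserting cast to the non-nullable type would fail outright when handed the nullable wrapper.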

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen

Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@heguanhui

Contributor Author

/review

@heguanhui force-pushed the fix/fix-binlog-be-coredump-issue branch from f707b1d to ca44c72 on June 13, 2026 20:06
@heguanhui

Contributor Author

run buildall

@hello-stephen

Contributor
TPC-H: Total hot run time: 29004 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ca44c726c9401866066b89731eb6c9af0f488c6a, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17692	4004	3978	3978
q2	q3	10767	1338	812	812
q4	4684	465	344	344
q5	7535	875	571	571
q6	184	170	137	137
q7	776	844	620	620
q8	9485	1613	1622	1613
q9	7218	4524	4477	4477
q10	6825	1813	1513	1513
q11	432	265	245	245
q12	635	420	290	290
q13	18094	3412	2797	2797
q14	268	259	242	242
q15	q16	820	778	708	708
q17	1165	1048	818	818
q18	6662	5677	5549	5549
q19	1320	1271	1065	1065
q20	498	401	266	266
q21	5958	2808	2640	2640
q22	462	379	319	319
Total cold run time: 101480 ms
Total hot run time: 29004 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4793	4742	4705	4705
q2	q3	5314	5215	4586	4586
q4	2114	2179	1364	1364
q5	4884	4923	4759	4759
q6	237	178	135	135
q7	1884	1751	1563	1563
q8	2385	2017	1945	1945
q9	7321	7370	7341	7341
q10	5047	4652	4229	4229
q11	545	382	352	352
q12	731	731	527	527
q13	3007	3394	2795	2795
q14	274	293	266	266
q15	q16	685	700	611	611
q17	1279	1265	1250	1250
q18	7390	6830	6761	6761
q19	1140	1075	1099	1075
q20	2223	2219	1972	1972
q21	5295	4580	4437	4437
q22	554	464	399	399
Total cold run time: 57102 ms
Total hot run time: 51072 ms

@hello-stephen

Contributor
TPC-DS: Total hot run time: 168871 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ca44c726c9401866066b89731eb6c9af0f488c6a, data reload: false

query5	4318	619	495	495
query6	433	192	173	173
query7	4870	574	302	302
query8	362	216	201	201
query9	8751	4051	4061	4051
query10	452	313	251	251
query11	5969	2367	2131	2131
query12	152	101	100	100
query13	1304	589	423	423
query14	6341	5362	5085	5085
query14_1	4391	4424	4419	4419
query15	202	196	182	182
query16	1022	458	418	418
query17	1094	672	574	574
query18	2415	468	338	338
query19	194	179	137	137
query20	110	107	106	106
query21	208	143	117	117
query22	13715	13650	13486	13486
query23	17196	16579	16161	16161
query23_1	16255	16232	16286	16232
query24	7646	1798	1303	1303
query24_1	1326	1324	1320	1320
query25	580	480	392	392
query26	1323	321	166	166
query27	2656	560	335	335
query28	4484	2053	2023	2023
query29	1130	651	501	501
query30	314	231	200	200
query31	1114	1074	961	961
query32	118	65	61	61
query33	525	331	266	266
query34	1187	1159	641	641
query35	760	820	674	674
query36	1400	1398	1249	1249
query37	155	113	101	101
query38	3174	3164	3041	3041
query39	953	913	895	895
query39_1	896	869	860	860
query40	222	128	108	108
query41	72	67	67	67
query42	100	98	134	98
query43	328	335	294	294
query44	
query45	194	187	179	179
query46	1058	1231	778	778
query47	2321	2309	2261	2261
query48	402	416	293	293
query49	637	492	348	348
query50	1012	358	266	266
query51	4379	4294	4199	4199
query52	87	88	78	78
query53	239	268	187	187
query54	264	222	196	196
query55	77	74	76	74
query56	228	214	218	214
query57	1423	1396	1325	1325
query58	236	209	210	209
query59	1572	1629	1393	1393
query60	277	244	225	225
query61	149	146	148	146
query62	701	648	579	579
query63	238	185	188	185
query64	2549	791	602	602
query65	
query66	1818	468	338	338
query67	29660	29646	29491	29491
query68	
query69	428	297	265	265
query70	1001	972	968	968
query71	286	221	241	221
query72	2902	2627	2373	2373
query73	871	744	427	427
query74	5111	4978	4779	4779
query75	2664	2597	2242	2242
query76	2319	1167	799	799
query77	353	376	300	300
query78	12261	12359	11934	11934
query79	1462	1052	791	791
query80	720	477	408	408
query81	468	280	236	236
query82	580	162	124	124
query83	340	281	253	253
query84	
query85	899	510	409	409
query86	407	303	274	274
query87	3390	3342	3237	3237
query88	3697	2752	2777	2752
query89	427	387	326	326
query90	1807	186	180	180
query91	175	161	134	134
query92	66	60	56	56
query93	1538	1445	883	883
query94	619	344	273	273
query95	681	372	444	372
query96	1019	826	335	335
query97	2708	2673	2623	2623
query98	210	205	205	205
query99	1126	1192	1019	1019
Total cold run time: 250510 ms
Total hot run time: 168871 ms

@hello-stephen

Contributor

BE UT Coverage Report

Increment line coverage 90.00% (18/20) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 54.36% (21306/39196)
Line Coverage 37.96% (203358/535659)
Region Coverage 33.99% (159594/469596)
Branch Coverage 34.97% (69788/199561)

@hello-stephen

Contributor

BE Regression && UT Coverage Report

Increment line coverage 90.00% (18/20) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 74.07% (28348/38274)
Line Coverage 58.06% (309172/532521)
Region Coverage 54.96% (259295/471748)
Branch Coverage 56.29% (112479/199820)


Labels

None yet

Projects

None yet

2 participants