[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
RE: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the conn_li
From: |
Zhang, Chen |
Subject: |
RE: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the conn_list |
Date: |
Fri, 1 Apr 2022 03:33:05 +0000 |
> -----Original Message-----
> From: lizhijian@fujitsu.com <lizhijian@fujitsu.com>
> Sent: Friday, April 1, 2022 9:47 AM
> To: Zhang, Chen <chen.zhang@intel.com>; Jason Wang
> <jasowang@redhat.com>
> Cc: qemu-dev <qemu-devel@nongnu.org>; Like Xu <like.xu@linux.intel.com>
> Subject: Re: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the
> conn_list
>
>
>
> On 31/03/2022 10:25, Zhang, Chen wrote:
> >
> >> -----Original Message-----
> >> From: lizhijian@fujitsu.com <lizhijian@fujitsu.com>
> >> Sent: Thursday, March 31, 2022 9:15 AM
> >> To: Zhang, Chen <chen.zhang@intel.com>; Jason Wang
> >> <jasowang@redhat.com>
> >> Cc: qemu-dev <qemu-devel@nongnu.org>; Like Xu
> >> <like.xu@linux.intel.com>
> >> Subject: Re: [PATCH 2/4] net/colo: Fix a "double free" crash to clear
> >> the conn_list
> >>
> >>
> >> connection_track_table
> >> -----+----------
> >> key1 | conn
> >> |-----------------------------------------------------------+
> >> -----+----------
> >> |
> >> key2 | conn |----------------------------------+
> >> |
> >> -----+---------- |
> >> |
> >> key3 | conn |---------+ |
> >> |
> >> -----+---------- | |
> >> |
> >> | |
> >> |
> >> | |
> >> |
> >> + CompareState ++ | |
> >> |
> >> | | V V
> >> V
> >> +---------------+ +---------------+ +---------------+
> >> |conn_list +--->conn +--------->conn |
> >> connx
> >> +---------------+ +---------------+ +---------------+
> >> | | | | | |
> >> +---------------+ +---v----+ +---v----+ +---v----+ +---v----+
> >> |primary | |secondary |primary | |secondary
> >> |packet | |packet + |packet | |packet +
> >> +--------+ +--------+ +--------+ +--------+
> >> | | | |
> >> +---v----+ +---v----+ +---v----+ +---v----+
> >> |primary | |secondary |primary | |secondary
> >> |packet | |packet + |packet | |packet +
> >> +--------+ +--------+ +--------+ +--------+
> >> | | | |
> >> +---v----+ +---v----+ +---v----+ +---v----+
> >> |primary | |secondary |primary | |secondary
> >> |packet | |packet + |packet | |packet +
> >> +--------+ +--------+ +--------+ +--------+
> >>
> >> I recalled that we should above relationships between
> >> connection_track_table conn_list and conn.
> >> That means both connection_track_table and conn_list reference to the
> >> same conn instance.
> >>
> >> So before this patch, connection_get() is possible to
> >> use-after-free/double free conn. where 1st was in
> >> connection_hashtable_reset() and 2nd was
> >> 221 while (!g_queue_is_empty(conn_list)) {
> >> 222 connection_destroy(g_queue_pop_head(conn_list));
> >> 223 }
> >>
> >> I also doubt that your current abort was just due to above use-after-
> >> free/double free.
> >> If so, looks it's enough we just update to g_queue_clear(conn_list)
> >> in the 2nd place.
> > Make sense, but It also means the original patch works here, skip free conn
> in connection_hashtable_reset() and do it in:
> > 221 while (!g_queue_is_empty(conn_list)) {
> > 222 connection_destroy(g_queue_pop_head(conn_list));
> > 223 }.
> > It also avoid use-after-free/double free conn.
> Although you will not use-after-free here, you have to consider other
> situations carefully that
> g_hash_table_remove_all() g_hash_table_destroy() were called where the
> conn_list should also be freed with you approach.
>
>
I re-checked the code, it looks fine to me.
>
>
> > Maybe we can keep the original version to fix it?
> And your commit log should be more clear.
OK, I will update V2 for commit log.
Thanks
Chen
>
> Thanks
> Zhijian
>
> >
> > Thanks
> > Chen
> >
> >> Thanks
> >> Zhijian
> >>
> >>
> >> On 28/03/2022 17:13, Zhang, Chen wrote:
> >>>> -----Original Message-----
> >>>> From: lizhijian@fujitsu.com <lizhijian@fujitsu.com>
> >>>> Sent: Monday, March 21, 2022 11:06 AM
> >>>> To: Zhang, Chen <chen.zhang@intel.com>; Jason Wang
> >>>> <jasowang@redhat.com>; lizhijian@fujitsu.com
> >>>> Cc: qemu-dev <qemu-devel@nongnu.org>; Like Xu
> >>>> <like.xu@linux.intel.com>
> >>>> Subject: Re: [PATCH 2/4] net/colo: Fix a "double free" crash to
> >>>> clear the conn_list
> >>>>
> >>>>
> >>>>
> >>>> On 09/03/2022 16:38, Zhang Chen wrote:
> >>>>> We notice the QEMU may crash when the guest has too many
> incoming
> >>>>> network connections with the following log:
> >>>>>
> >>>>> 15197@1593578622.668573:colo_proxy_main : colo proxy connection
> >>>>> hashtable full, clear it
> >>>>> free(): invalid pointer
> >>>>> [1] 15195 abort (core dumped) qemu-system-x86_64 ....
> >>>>>
> >>>>> This is because we create the s->connection_track_table with
> >>>>> g_hash_table_new_full() which is defined as:
> >>>>>
> >>>>> GHashTable * g_hash_table_new_full (GHashFunc hash_func,
> >>>>> GEqualFunc key_equal_func,
> >>>>> GDestroyNotify key_destroy_func,
> >>>>> GDestroyNotify value_destroy_func);
> >>>>>
> >>>>> The fourth parameter connection_destroy() will be called to free
> >>>>> the memory allocated for all 'Connection' values in the hashtable
> >>>>> when we call g_hash_table_remove_all() in the
> connection_hashtable_reset().
> >>>>>
> >>>>> It's unnecessary because we clear the conn_list explicitly later,
> >>>>> and it's buggy when other agents try to call connection_get() with
> >>>>> the same connection_track_table.
> >>>>>
> >>>>> Signed-off-by: Like Xu <like.xu@linux.intel.com>
> >>>>> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> >>>>> ---
> >>>>> net/colo-compare.c | 2 +-
> >>>>> net/filter-rewriter.c | 2 +-
> >>>>> 2 files changed, 2 insertions(+), 2 deletions(-)
> >>>>>
> >>>>> diff --git a/net/colo-compare.c b/net/colo-compare.c index
> >>>>> 62554b5b3c..ab054cfd21 100644
> >>>>> --- a/net/colo-compare.c
> >>>>> +++ b/net/colo-compare.c
> >>>>> @@ -1324,7 +1324,7 @@ static void
> >>>> colo_compare_complete(UserCreatable *uc, Error **errp)
> >>>>> s->connection_track_table =
> >>>> g_hash_table_new_full(connection_key_hash,
> >>>>>
> >>>>> connection_key_equal,
> >>>>> g_free,
> >>>>> -
> >>>>> connection_destroy);
> >>>>> + NULL);
> >>>> 202 /* if not found, create a new connection and add to hash table
> >>>> */
> >>>> 203 Connection *connection_get(GHashTable *connection_track_table,
> >>>> 204 ConnectionKey *key,
> >>>> 205 GQueue *conn_list)
> >>>> 206 {
> >>>> 207 Connection *conn =
> >>>> g_hash_table_lookup(connection_track_table,
> >>>> key);
> >>>> 208
> >>>> 209 if (conn == NULL) {
> >>>> 210 ConnectionKey *new_key = g_memdup(key, sizeof(*key));
> >>>> 211
> >>>> 212 conn = connection_new(key);
> >>>> 213
> >>>> 214 if (g_hash_table_size(connection_track_table) >
> >>>> HASHTABLE_MAX_SIZE) {
> >>>> 215 trace_colo_proxy_main("colo proxy connection hashtable
> full,"
> >>>> 216 " clear it");
> >>>> 217 connection_hashtable_reset(connection_track_table);
> >>>>
> >>>> 197 void connection_hashtable_reset(GHashTable
> >>>> *connection_track_table)
> >>>> 198 {
> >>>> 199 g_hash_table_remove_all(connection_track_table);
> >>>> 200 }
> >>>>
> >>>> IIUC, above subroutine will do some cleanup explicitly. And before
> >>>> your patch, connection_hashtable_reset() will release all keys and
> >>>> their values in this hashtable. But now, you remove all keys and
> >>>> just one value(conn_list) instead. Does it means other values will be
> leaked ?
> >>> Thanks Zhijian. Re-think about it. Yes, I think you are right.
> >>> It looks need a lock to avoid into connection_get() when triggered
> >>> the clear
> >> hashtable operation.
> >>> What do you think?
> >>>
> >>> Thanks
> >>> Chen
> >>>
> >>>
> >>>> 218 /*
> >>>> 219 * clear the conn_list
> >>>> 220 */
> >>>> 221 while (!g_queue_is_empty(conn_list)) {
> >>>> 222 connection_destroy(g_queue_pop_head(conn_list));
> >>>> 223 }
> >>>> 224 }
> >>>> 225
> >>>> 226 g_hash_table_insert(connection_track_table, new_key,
> >>>> conn);
> >>>> 227 }
> >>>> 228
> >>>> 229 return conn;
> >>>> 230 }
> >>>>
> >>>>
> >>>> Thanks
> >>>> Zhijian
> >>>>
> >>>>> colo_compare_iothread(s);
> >>>>>
> >>>>> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c index
> >>>>> bf05023dc3..c18c4c2019 100644
> >>>>> --- a/net/filter-rewriter.c
> >>>>> +++ b/net/filter-rewriter.c
> >>>>> @@ -383,7 +383,7 @@ static void colo_rewriter_setup(NetFilterState
> >>>>> *nf,
> >>>> Error **errp)
> >>>>> s->connection_track_table =
> >>>> g_hash_table_new_full(connection_key_hash,
> >>>>>
> >>>>> connection_key_equal,
> >>>>> g_free,
> >>>>> -
> >>>>> connection_destroy);
> >>>>> + NULL);
> >>>>> s->incoming_queue =
> >>>> qemu_new_net_queue(qemu_netfilter_pass_to_next, nf);
> >>>>> }
> >>>>>
- [PATCH 0/4] COLO net and runstate bugfix/optimization, Zhang Chen, 2022/03/09
- [PATCH 1/4] softmmu/runstate.c: add RunStateTransition support form COLO to PRELAUNCH, Zhang Chen, 2022/03/09
- [PATCH 2/4] net/colo: Fix a "double free" crash to clear the conn_list, Zhang Chen, 2022/03/09
- Re: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the conn_list, address@hidden, 2022/03/20
- RE: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the conn_list, Zhang, Chen, 2022/03/28
- Re: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the conn_list, address@hidden, 2022/03/30
- RE: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the conn_list, Zhang, Chen, 2022/03/30
- Re: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the conn_list, address@hidden, 2022/03/31
- RE: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the conn_list,
Zhang, Chen <=
[PATCH 4/4] net/colo.c: fix segmentation fault when packet is not parsed correctly, Zhang Chen, 2022/03/09
[PATCH 3/4] net/colo.c: No need to track conn_list for filter-rewriter, Zhang Chen, 2022/03/09