Read view是实现多版本并发控制(mvcc)中的一个重要的数据结构。Read view也可以称作是快照。其用来描述当前数据库中的存在的事务情况。因为数据的可见性与事务息息相关。
本文对其实现进行源码分析。
Read view是实现多版本并发控制(mvcc)中的一个重要的数据结构。Read view也可以称作是快照。其用来描述当前数据库中的存在的事务情况。因为数据的可见性与事务息息相关。
/*
-------------------------------------------------------------------------------
FACT A: Cursor read view on a secondary index sees only committed versions
-------
of the records in the secondary index or those versions of rows created
by transaction which created a cursor before cursor was created even
if transaction which created the cursor has changed that clustered index page.
PROOF: We must show that read goes always to the clustered index record
to see that record is visible in the cursor read view. Consider e.g.
following table and SQL-clauses:
create table t1(a int not null, b int, primary key(a), index(b));
insert into t1 values (1,1),(2,2);
commit;
Now consider that we have a cursor for a query
select b from t1 where b >= 1;
This query will use secondary key on the table t1. Now after the first fetch
on this cursor if we do a update:
update t1 set b = 5 where b = 2;
Now second fetch of the cursor should not see record (2,5) instead it should
see record (2,2).
We also should show that if we have delete t1 where b = 5; we still
can see record (2,2).
When we access a secondary key record maximum transaction id is fetched
from this record and this trx_id is compared to up_limit_id in the view.
If trx_id in the record is greater or equal than up_limit_id in the view
cluster record is accessed. Because trx_id of the creating
transaction is stored when this view was created to the list of
trx_ids not seen by this read view previous version of the
record is requested to be built. This is build using clustered record.
If the secondary key record is delete-marked, its corresponding
clustered record can be already be purged only if records
trx_id < low_limit_no. Purge can't remove any record deleted by a
transaction which was active when cursor was created. But, we still
may have a deleted secondary key record but no clustered record. But,
this is not a problem because this case is handled in
row_sel_get_clust_rec() function which is called
whenever we note that this read view does not see trx_id in the
record. Thus, we see correct version. Q. E. D.
-------------------------------------------------------------------------------
FACT B: Cursor read view on a clustered index sees only committed versions
-------
of the records in the clustered index or those versions of rows created
by transaction which created a cursor before cursor was created even
if transaction which created the cursor has changed that clustered index page.
PROOF: Consider e.g.following table and SQL-clauses:
create table t1(a int not null, b int, primary key(a));
insert into t1 values (1),(2);
commit;
Now consider that we have a cursor for a query
select a from t1 where a >= 1;
This query will use clustered key on the table t1. Now after the first fetch
on this cursor if we do a update:
update t1 set a = 5 where a = 2;
Now second fetch of the cursor should not see record (5) instead it should
see record (2).
We also should show that if we have execute delete t1 where a = 5; after
the cursor is opened we still can see record (2).
When accessing clustered record we always check if this read view sees
trx_id stored to clustered record. By default we don't see any changes
if record trx_id >= low_limit_id i.e. change was made transaction
which started after transaction which created the cursor. If row
was changed by the future transaction a previous version of the
clustered record is created. Thus we see only committed version in
this case. We see all changes made by committed transactions i.e.
record trx_id < up_limit_id. In this case we don't need to do anything,
we already see correct version of the record. We don't see any changes
made by active transaction except creating transaction. We have stored
trx_id of creating transaction to list of trx_ids when this view was
created. Thus we can easily see if this record was changed by the
creating transaction. Because we already have clustered record we can
access roll_ptr. Using this roll_ptr we can fetch undo record.
We can now check that undo_no of the undo record is less than undo_no of the
trancaction which created a view when cursor was created. We see this
clustered record only in case when record undo_no is less than undo_no
in the view. If this is not true we build based on undo_rec previous
version of the record. This record is found because purge can't remove
records accessed by active transaction. Thus we see correct version. Q. E. D.
-------------------------------------------------------------------------------
FACT C: Purge does not remove any delete-marked row that is visible
-------
in any cursor read view.
PROOF: We know that:
1: Currently active read views in trx_sys_t::view_list are ordered by
ReadView::low_limit_no in descending order, that is,
newest read view first.
2: Purge clones the oldest read view and uses that to determine whether there
are any active transactions that can see the to be purged records.
Therefore any joining or active transaction will not have a view older
than the purge view, according to 1.
When purge needs to remove a delete-marked row from a secondary index,
it will first check that the DB_TRX_ID value of the corresponding
record in the clustered index is older than the purge view. It will
also check if there is a newer version of the row (clustered index
record) that is not delete-marked in the secondary index. If such a
row exists and is collation-equal to the delete-marked secondary index
record then purge will not remove the secondary index record.
Delete-marked clustered index records will be removed by
row_purge_remove_clust_if_poss(), unless the clustered index record
(and its DB_ROLL_PTR) has been updated. Every new version of the
clustered index record will update DB_ROLL_PTR, pointing to a new UNDO
log entry that allows the old version to be reconstructed. The
DB_ROLL_PTR in the oldest remaining version in the old-version chain
may be pointing to garbage (an undo log record discarded by purge),
but it will never be dereferenced, because the purge view is older
than any active transaction.
For details see: row_vers_old_has_index_entry() and row_purge_poss_sec()
Some additional issues:
What if trx_sys->view_list == NULL and some transaction T1 and Purge both
try to open read_view at same time. Only one can acquire trx_sys->mutex.
In which order will the views be opened? Should it matter? If no, why?
The order does not matter. No new transactions can be created and no running
RW transaction can commit or rollback (or free views). AC-NL-RO transactions
will mark their views as closed but not actually free their views.
*/
数据的可见性除了与read view相关外,还与当前的隔离级别和锁有关系,而这些技术则完整的构成了MVCC(多版本并发控制)的概念,用来给mysql提供一致性非锁读的能力。例如:read view创建的时间与事务的隔离级别有关系,RC下readview 创建的时候和RR下是有所不同。read view相当于存在一个上帝角色,在系统的运行的时候拿一个照相机对着系统进行照相,记录下当前系统的正在运行时候的状态。例如:当前运行着哪些事务,这些事务的ID是多少,并将这些事务的id保存至 ReadView::ids_t 对象中。MySQL中 ReadView 类用来描述了那些记录对哪些事务可见。
这里提一句是mysql官方的文档是非常非常非常好的文档。对于里面很多文章值得我们仔细研读。
是reaadview还有一个作用是其会影响到purge线程对于deleted rows的操作。 具体的描述可以从在上述的注释中可得到,这里我们先不进行详细的论述。这里需要提及一下,对于RC隔离级别和RR隔离级别下,ReadView不同。从RC定义中其实对于read committed,即:那些已经提交的事务中的数据会对本次查询可见。从上述的图中我们可以看成,当时Transaction1开始的时候,此时ReadView::ids_t中的所有事务的id都大于 Transaction1的事务ID,即:trx_id < ReadView::ids_t::low_limit_no。当Transaction1启动的时候,ids_t中的这些事务还没有启动,那么此时对于RC和RR所能看到的数据是不一样的。在RC级别下,对于那些已经提及的数据对于Transaction1是可见的,而且在RR级别下,即使这些记录已经提交了,Transaction1仍然无法看到这些已提交的数据。在访问这些记录的时候都需要检查其对于该事务的可见性。那么此时的ReadView就是至关重要的。trx_sys->view_list则记录当前系统中所有的活动的readview情况。
/** Read view lists the trx ids of those transactions for which a consistent
read should not see the modifications to the database. */class ReadView {
从ReadView的定义可以看成,其可以类比为trx ids的一个集合。
对于low_limit_no和up_limit_no 代码给出的定义如下:
/** The read should not see any transaction with trx id >= thisvalue. In other words, this is the "high water mark". */trx_id_t m_low_limit_id;/** The read should see all trx ids which are strictlysmaller (<) than this value. In other words, this is thelow water mark". */trx_id_t m_up_limit_id;
/** Allocate and create a view.
@param viewView owned by this class created for the caller. Must be
freed by calling view_close()
@param trxTransaction instance of caller */
void MVCC::view_open(ReadView *&view, trx_t *trx) {
view_open函数用来创建一个readview。例如:当我们在进行查询操作的时候,在我们开启一个事务的时候,系统会依据隔离级别,在相应的位置通过trx_assign_read_view来创建一个属于该事务的ReadView,在该事务的什么周期内会对于每个要访问的记录会与该ReadView进行可见性判断view_open函数中,会使用
view = get_view();
来具体的获取一个,如果当前无readview则创建一个,否则使用当前事务已有的。
/**
Find a free view from the active list, if none found then allocate
a new view.
@return a view to use */ReadView *MVCC::get_view() {ut_ad(mutex_own(&trx_sys->mutex));ReadView *view;if (UT_LIST_GET_LEN(m_free) > 0) {view = UT_LIST_GET_FIRST(m_free);UT_LIST_REMOVE(m_free, view);} else {view = UT_NEW_NOKEY(ReadView());if (view == nullptr) {ib::error(ER_IB_MSG_918) << "Failed to allocate MVCC view";}}return (view);
}
原文发表于 ReadView 和MVCC (qq.com)