This article explains what the ReadBuffer_common function does in PostgreSQL. It first reviews the data structures involved (BufferDesc and BufferTag), then walks through the annotated source of the function, and finally traces two test scenarios with gdb.
BufferDesc
Shared descriptor (state) data for a single shared buffer.
/*
 * Flags for buffer descriptors
 *
 * Note: TAG_VALID essentially means that there is a buffer hashtable
 * entry associated with the buffer's tag.
 */
#define BM_LOCKED               (1U << 22)  /* buffer header is locked */
#define BM_DIRTY                (1U << 23)  /* data needs writing */
#define BM_VALID                (1U << 24)  /* data is valid */
#define BM_TAG_VALID            (1U << 25)  /* tag is assigned */
#define BM_IO_IN_PROGRESS       (1U << 26)  /* read or write in progress */
#define BM_IO_ERROR             (1U << 27)  /* previous I/O failed */
#define BM_JUST_DIRTIED         (1U << 28)  /* dirtied since write started */
#define BM_PIN_COUNT_WAITER     (1U << 29)  /* have waiter for sole pin */
#define BM_CHECKPOINT_NEEDED    (1U << 30)  /* must write for checkpoint */
#define BM_PERMANENT            (1U << 31)  /* permanent buffer (not unlogged,
                                             * or init fork) */
/*
 * BufferDesc -- shared descriptor/state data for a single shared buffer.
 *
 * Note: Buffer header lock (BM_LOCKED flag) must be held to examine or change
 * the tag, state or wait_backend_pid fields.  In general, buffer header lock
 * is a spinlock which is combined with flags, refcount and usagecount into
 * single atomic variable.  This layout allow us to do some operations in a
 * single atomic operation, without actually acquiring and releasing spinlock;
 * for instance, increase or decrease refcount.  buf_id field never changes
 * after initialization, so does not need locking.  freeNext is protected by
 * the buffer_strategy_lock not buffer header lock.  The LWLock can take care
 * of itself.  The buffer header lock is *not* used to control access to the
 * data in the buffer!
 *
 * It's assumed that nobody changes the state field while buffer header lock
 * is held.  Thus buffer header lock holder can do complex updates of the
 * state variable in single write, simultaneously with lock release (cleaning
 * BM_LOCKED flag).  On the other hand, updating of state without holding
 * buffer header lock is restricted to CAS, which insure that BM_LOCKED flag
 * is not set.  Atomic increment/decrement, OR/AND etc. are not allowed.
 *
 * An exception is that if we have the buffer pinned, its tag can't change
 * underneath us, so we can examine the tag without locking the buffer header.
 * Also, in places we do one-time reads of the flags without bothering to
 * lock the buffer header; this is generally for situations where we don't
 * expect the flag bit being tested to be changing.
 *
 * We can't physically remove items from a disk page if another backend has
 * the buffer pinned.  Hence, a backend may need to wait for all other pins
 * to go away.  This is signaled by storing its own PID into
 * wait_backend_pid and setting flag bit BM_PIN_COUNT_WAITER.  At present,
 * there can be only one such waiter per buffer.
 *
 * We use this same struct for local buffer headers, but the locks are not
 * used and not all of the flag bits are useful either. To avoid unnecessary
 * overhead, manipulations of the state field should be done without actual
 * atomic operations (i.e. only pg_atomic_read_u32() and
 * pg_atomic_unlocked_write_u32()).
 *
 * Be careful to avoid increasing the size of the struct when adding or
 * reordering members.  Keeping it below 64 bytes (the most common CPU
 * cache line size) is fairly important for performance.
 */
typedef struct BufferDesc
{
    BufferTag   tag;            /* ID of page contained in buffer */
    int         buf_id;         /* buffer's index number (from 0) */

    /* state of the tag, containing flags, refcount and usagecount */
    pg_atomic_uint32 state;

    int         wait_backend_pid;   /* backend PID of pin-count waiter */
    int         freeNext;       /* link in freelist chain */

    LWLock      content_lock;   /* to lock access to buffer contents */
} BufferDesc;
BufferTag
A buffer tag identifies which disk block the buffer contains.
/*
 * Buffer tag identifies which disk block the buffer contains.
 *
 * Note: the BufferTag data must be sufficient to determine where to write the
 * block, without reference to pg_class or pg_tablespace entries.  It's
 * possible that the backend flushing the buffer doesn't even believe the
 * relation is visible yet (its xact may have started before the xact that
 * created the rel).  The storage manager must be able to cope anyway.
 *
 * Note: if there's any pad bytes in the struct, INIT_BUFFERTAG will have
 * to be fixed to zero them, since this struct is used as a hash key.
 */
typedef struct buftag
{
    RelFileNode rnode;          /* physical relation identifier */
    ForkNumber  forkNum;
    BlockNumber blockNum;       /* blknum relative to begin of reln */
} BufferTag;
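ReadBuffer_common implements the logic shared by all ReadBuffer variants. It works as follows:
1. Initialize variables and evaluate the basic conditions (is this an extension, isExtend? is this a temporary table, isLocalBuf?).
2. For a temporary table, call LocalBufferAlloc to obtain the buffer descriptor; otherwise call BufferAlloc.
Either call also sets the flag that reports whether the block was found in the cache (variable found).
3. If the block was found in the cache:
3.1 If this is not an extension, update the statistics, lock the buffer if the mode requires it, and return.
3.2 If this is an extension, get the block:
3.2.1 If PageIsNew returns false, raise an error.
3.2.2 For a local buffer (temporary table), just adjust the flags.
3.2.3 For a shared buffer, clear the BM_VALID flag (retrying until StartBufferIO succeeds).
4. If the block was not found in the cache, get the block:
4.1 For an extension, zero-fill the buffer and call smgrextend to extend the relation.
4.2 For an ordinary read:
4.2.1 If the mode is RBM_ZERO_AND_LOCK or RBM_ZERO_AND_CLEANUP_LOCK, zero-fill the buffer.
4.2.2 Otherwise, read the block through smgr (the storage manager), track the I/O time if required, and check the page for garbage data.
5. The buffer has now been extended or the block has been read:
5.1 Lock the buffer if the mode requires it.
5.2 For a temporary table, adjust the flags; otherwise set BM_VALID, terminate the I/O, and wake up any waiting processes.
5.3 Update the statistics.
5.4 Return the buffer.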
/*
 * ReadBuffer_common -- common logic for all ReadBuffer variants
 *
 * *hit is set to true if the request was satisfied from shared buffer cache.
 */
static Buffer
ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
                  BlockNumber blockNum, ReadBufferMode mode,
                  BufferAccessStrategy strategy, bool *hit)
{
    BufferDesc *bufHdr;         // buffer descriptor
    Block       bufBlock;       // the corresponding block
    bool        found;          // found in the buffer pool?
    bool        isExtend;       // extending the relation?
    bool        isLocalBuf = SmgrIsTemp(smgr);  // local buffer (temporary table)?

    *hit = false;

    /* Make sure we will have room to remember the buffer pin */
    ResourceOwnerEnlargeBuffers(CurrentResourceOwner);

    // P_NEW means the relation has to be extended
    isExtend = (blockNum == P_NEW);

    // trace point
    TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
                                       smgr->smgr_rnode.node.spcNode,
                                       smgr->smgr_rnode.node.dbNode,
                                       smgr->smgr_rnode.node.relNode,
                                       smgr->smgr_rnode.backend,
                                       isExtend);

    /* Substitute proper block number if caller asked for P_NEW */
    if (isExtend)
        blockNum = smgrnblocks(smgr, forkNum);
    if (isLocalBuf)
    {
        // local buffer (temporary table)
        bufHdr = LocalBufferAlloc(smgr, forkNum, blockNum, &found);
        if (found)
            pgBufferUsage.local_blks_hit++;
        else if (isExtend)
            pgBufferUsage.local_blks_written++;
        else if (mode == RBM_NORMAL || mode == RBM_NORMAL_NO_LOG ||
                 mode == RBM_ZERO_ON_ERROR)
            pgBufferUsage.local_blks_read++;
    }
    else
    {
        // not a temporary table
        /*
         * lookup the buffer.  IO_IN_PROGRESS is set if the requested block is
         * not currently in memory.
         */
        bufHdr = BufferAlloc(smgr, relpersistence, forkNum, blockNum,
                             strategy, &found);
        if (found)
            // hit in memory
            pgBufferUsage.shared_blks_hit++;
        else if (isExtend)
            // new block added by extending the relation
            pgBufferUsage.shared_blks_written++;
        else if (mode == RBM_NORMAL || mode == RBM_NORMAL_NO_LOG ||
                 mode == RBM_ZERO_ON_ERROR)
            // the block will be read from storage
            pgBufferUsage.shared_blks_read++;
    }

    /* At this point we do NOT hold any locks. */
    /* if it was already in the buffer pool, we're done */
    if (found)
    {
        // ---------- the buffer is already in the buffer pool
        if (!isExtend)
        {
            // not an extension
            /* Just need to update stats before we exit */
            *hit = true;
            VacuumPageHit++;

            if (VacuumCostActive)
                VacuumCostBalance += VacuumCostPageHit;

            TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
                                              smgr->smgr_rnode.node.spcNode,
                                              smgr->smgr_rnode.node.dbNode,
                                              smgr->smgr_rnode.node.relNode,
                                              smgr->smgr_rnode.backend,
                                              isExtend,
                                              found);

            /*
             * In RBM_ZERO_AND_LOCK mode the caller expects the page to be
             * locked on return.
             */
            if (!isLocalBuf)
            {
                // shared (non-temporary) buffer
                if (mode == RBM_ZERO_AND_LOCK)
                    LWLockAcquire(BufferDescriptorGetContentLock(bufHdr),
                                  LW_EXCLUSIVE);
                else if (mode == RBM_ZERO_AND_CLEANUP_LOCK)
                    LockBufferForCleanup(BufferDescriptorGetBuffer(bufHdr));
            }

            // convert the descriptor into a Buffer number and return it
            // #define BufferDescriptorGetBuffer(bdesc) ((bdesc)->buf_id + 1)
            return BufferDescriptorGetBuffer(bufHdr);
        }
        /*
         * We get here only in the corner case where we are trying to extend
         * the relation but we found a pre-existing buffer marked BM_VALID.
         * This can happen because mdread doesn't complain about reads beyond
         * EOF (when zero_damaged_pages is ON) and so a previous attempt to
         * read a block beyond EOF could have left a "valid" zero-filled
         * buffer.  Unfortunately, we have also seen this case occurring
         * because of buggy Linux kernels that sometimes return an
         * lseek(SEEK_END) result that doesn't account for a recent write. In
         * that situation, the pre-existing buffer would contain valid data
         * that we don't want to overwrite.  Since the legitimate case should
         * always have left a zero-filled buffer, complain if not PageIsNew.
         */
        // get the block
        bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : BufHdrGetBlock(bufHdr);
        if (!PageIsNew((Page) bufBlock))
            // the page is not new (not zero-filled): raise an error
            ereport(ERROR,
                    (errmsg("unexpected data beyond EOF in block %u of relation %s",
                            blockNum, relpath(smgr->smgr_rnode, forkNum)),
                     errhint("This has been seen to occur with buggy kernels; consider updating your system.")));

        /*
         * We *must* do smgrextend before succeeding, else the page will not
         * be reserved by the kernel, and the next P_NEW call will decide to
         * return the same page.  Clear the BM_VALID bit, do the StartBufferIO
         * call that BufferAlloc didn't, and proceed.
         */
        if (isLocalBuf)
        {
            // temporary table
            /* Only need to adjust flags */
            uint32      buf_state = pg_atomic_read_u32(&bufHdr->state);

            Assert(buf_state & BM_VALID);
            buf_state &= ~BM_VALID;
            pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
        }
        else
        {
            // not a temporary table
            /*
             * Loop to handle the very small possibility that someone re-sets
             * BM_VALID between our clearing it and StartBufferIO inspecting
             * it.
             */
            // i.e. repeat until StartBufferIO returns true
            do
            {
                uint32      buf_state = LockBufHdr(bufHdr);

                Assert(buf_state & BM_VALID);
                // clear the BM_VALID flag
                buf_state &= ~BM_VALID;
                UnlockBufHdr(bufHdr, buf_state);
            } while (!StartBufferIO(bufHdr, true));
        }
    }
    // ---------- the buffer contents are not valid yet (block not in the buffer pool)
    /*
     * if we have gotten to this point, we have allocated a buffer for the
     * page but its contents are not yet valid.  IO_IN_PROGRESS is set for it,
     * if it's a shared buffer.
     *
     * Note: if smgrextend fails, we will end up with a buffer that is
     * allocated but not marked BM_VALID.  P_NEW will still select the same
     * block number (because the relation didn't get any longer on disk) and
     * so future attempts to extend the relation will find the same buffer (if
     * it's not been recycled) but come right back here to try smgrextend
     * again.
     */
    // sanity check
    Assert(!(pg_atomic_read_u32(&bufHdr->state) & BM_VALID));   /* spinlock not needed */

    // get the block
    bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : BufHdrGetBlock(bufHdr);

    if (isExtend)
    {
        // -------- extending the relation
        /* new buffers are zero-filled */
        MemSet((char *) bufBlock, 0, BLCKSZ);
        /* don't set checksum for all-zero page */
        smgrextend(smgr, forkNum, blockNum, (char *) bufBlock, false);

        /*
         * NB: we're *not* doing a ScheduleBufferTagForWriteback here;
         * although we're essentially performing a write. At least on linux
         * doing so defeats the 'delayed allocation' mechanism, leading to
         * increased file fragmentation.
         */
    }
    else
    {
        // -------- ordinary block
        /*
         * Read in the page, unless the caller intends to overwrite it and
         * just wants us to allocate a buffer.
         */
        if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
            // zero-fill in RBM_ZERO_AND_LOCK / RBM_ZERO_AND_CLEANUP_LOCK mode
            MemSet((char *) bufBlock, 0, BLCKSZ);
        else
        {
            // all other modes
            instr_time  io_start,   // start time and elapsed time of the I/O
                        io_time;

            if (track_io_timing)
                INSTR_TIME_SET_CURRENT(io_start);

            // read the block through smgr (the storage manager)
            smgrread(smgr, forkNum, blockNum, (char *) bufBlock);

            if (track_io_timing)
            {
                // I/O timing is being tracked
                INSTR_TIME_SET_CURRENT(io_time);
                INSTR_TIME_SUBTRACT(io_time, io_start);
                pgstat_count_buffer_read_time(INSTR_TIME_GET_MICROSEC(io_time));
                INSTR_TIME_ADD(pgBufferUsage.blk_read_time, io_time);
            }

            /* check for garbage data */
            if (!PageIsVerified((Page) bufBlock, blockNum))
            {
                // the page failed verification
                if (mode == RBM_ZERO_ON_ERROR || zero_damaged_pages)
                {
                    // emit a warning, then zero the page
                    ereport(WARNING,
                            (errcode(ERRCODE_DATA_CORRUPTED),
                             errmsg("invalid page in block %u of relation %s; zeroing out page",
                                    blockNum,
                                    relpath(smgr->smgr_rnode, forkNum))));
                    MemSet((char *) bufBlock, 0, BLCKSZ);
                }
                else
                    // otherwise raise an error
                    ereport(ERROR,
                            (errcode(ERRCODE_DATA_CORRUPTED),
                             errmsg("invalid page in block %u of relation %s",
                                    blockNum,
                                    relpath(smgr->smgr_rnode, forkNum))));
            }
        }
    }
    // ---------- the relation has been extended or the block has been read
    /*
     * In RBM_ZERO_AND_LOCK mode, grab the buffer content lock before marking
     * the page as valid, to make sure that no other backend sees the zeroed
     * page before the caller has had a chance to initialize it.
     *
     * Since no-one else can be looking at the page contents yet, there is no
     * difference between an exclusive lock and a cleanup-strength lock. (Note
     * that we cannot use LockBuffer() or LockBufferForCleanup() here, because
     * they assert that the buffer is already valid.)
     */
    if ((mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK) &&
        !isLocalBuf)
    {
        // take the content lock
        LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_EXCLUSIVE);
    }

    if (isLocalBuf)
    {
        // temporary table
        /* Only need to adjust flags */
        uint32      buf_state = pg_atomic_read_u32(&bufHdr->state);

        buf_state |= BM_VALID;
        pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
    }
    else
    {
        // ordinary table
        /* Set BM_VALID, terminate IO, and wake up any waiters */
        TerminateBufferIO(bufHdr, false, BM_VALID);
    }

    // update statistics
    VacuumPageMiss++;
    if (VacuumCostActive)
        VacuumCostBalance += VacuumCostPageMiss;

    // trace point
    TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
                                      smgr->smgr_rnode.node.spcNode,
                                      smgr->smgr_rnode.node.dbNode,
                                      smgr->smgr_rnode.node.relNode,
                                      smgr->smgr_rnode.backend,
                                      isExtend,
                                      found);

    // return the buffer
    // #define BufferDescriptorGetBuffer(bdesc) ((bdesc)->buf_id + 1)
    return BufferDescriptorGetBuffer(bufHdr);
}
Test scenario 1: the block is not in the buffer pool
Script:
16:42:48 (xdb@[local]:5432)testdb=# select * from t1 limit 10;
Start gdb and set a breakpoint:
(gdb) b ReadBuffer_common
Breakpoint 1 at 0x876e28: file bufmgr.c, line 711.
(gdb) c
Continuing.
Breakpoint 1, ReadBuffer_common (smgr=0x2b7cce0, relpersistence=112 'p', forkNum=MAIN_FORKNUM, blockNum=0, mode=RBM_NORMAL,
strategy=0x0, hit=0x7ffc7761dfab) at bufmgr.c:711
711 bool isLocalBuf = SmgrIsTemp(smgr);
(gdb)
1. Initialize variables and evaluate the basic conditions (is this an extension, isExtend? is this a temporary table, isLocalBuf?).
(gdb) n
713 *hit = false;
(gdb)
716 ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
(gdb)
718 isExtend = (blockNum == P_NEW);
(gdb)
720 TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
(gdb)
728 if (isExtend)
(gdb)
731 if (isLocalBuf)
(gdb)
745 bufHdr = BufferAlloc(smgr, relpersistence, forkNum, blockNum,
(gdb)
2. Call BufferAlloc to obtain the buffer descriptor.
(gdb)
747 if (found)
(gdb) p *bufHdr
$1 = {tag = {rnode = {spcNode = 1663, dbNode = 16402, relNode = 51439}, forkNum = MAIN_FORKNUM, blockNum = 0},
buf_id = 108, state = {value = 2248409089}, wait_backend_pid = 0, freeNext = -2, content_lock = {tranche = 54, state = {
value = 536870912}, waiters = {head = 2147483647, tail = 2147483647}}}
(gdb) p found
$2 = false
(gdb)
(gdb) n
750 pgBufferUsage.shared_blks_read++; --> update statistics
(gdb)
4. The block was not found in the cache, so get the block.
756 if (found)
(gdb)
856 Assert(!(pg_atomic_read_u32(&bufHdr->state) & BM_VALID)); /* spinlock not needed */
(gdb)
858 bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : BufHdrGetBlock(bufHdr);
(gdb)
860 if (isExtend)
(gdb) p bufBlock
$4 = (Block) 0x7fe8c240e380
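4.2 For an ordinary read:
4.2.1 If the mode is RBM_ZERO_AND_LOCK or RBM_ZERO_AND_CLEANUP_LOCK, zero-fill the buffer.
4.2.2 Otherwise, read the block through smgr (the storage manager), track the I/O time if required, and check the page for garbage data.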
(gdb) p mode
$5 = RBM_NORMAL
(gdb)
(gdb) n
880 if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
(gdb)
887 if (track_io_timing)
(gdb)
890 smgrread(smgr, forkNum, blockNum, (char *) bufBlock);
(gdb)
892 if (track_io_timing)
(gdb) p *smgr
$6 = {smgr_rnode = {node = {spcNode = 1663, dbNode = 16402, relNode = 51439}, backend = -1}, smgr_owner = 0x7fe8ee2bc7b8,
smgr_targblock = 4294967295, smgr_fsm_nblocks = 4294967295, smgr_vm_nblocks = 4294967295, smgr_which = 0,
md_num_open_segs = {1, 0, 0, 0}, md_seg_fds = {0x2b0dd78, 0x0, 0x0, 0x0}, next_unowned_reln = 0x0}
(gdb) p forkNum
$7 = MAIN_FORKNUM
(gdb) p blockNum
$8 = 0
(gdb) p (char *) bufBlock
$9 = 0x7fe8c240e380 "\001"
(gdb)
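5. The buffer has now been extended or the block has been read:
5.1 Lock the buffer if the mode requires it.
5.2 For a temporary table, adjust the flags; otherwise set BM_VALID, terminate the I/O, and wake up any waiting processes.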
(gdb) n
901 if (!PageIsVerified((Page) bufBlock, blockNum))
(gdb)
932 if ((mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK) &&
(gdb) n
938 if (isLocalBuf)
(gdb)
949 TerminateBufferIO(bufHdr, false, BM_VALID);
(gdb)
5.3 Update the statistics.
5.4 Return the buffer.
(gdb)
952 VacuumPageMiss++;
(gdb)
953 if (VacuumCostActive)
(gdb)
956 TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
(gdb)
964 return BufferDescriptorGetBuffer(bufHdr);
(gdb)
965 }
(gdb)
The returned buf is 109 (buf_id 108 + 1):
(gdb)
ReadBufferExtended (reln=0x7fe8ee2bc7a8, forkNum=MAIN_FORKNUM, blockNum=0, mode=RBM_NORMAL, strategy=0x0) at bufmgr.c:666
666 if (hit)
(gdb)
668 return buf;
(gdb) p buf
$10 = 109
(gdb)
Test scenario 2: the block is already in the buffer pool
Run the same SQL statement again; this time the block has already been read into a buffer.
(gdb) del
Delete all breakpoints? (y or n) y
(gdb) c
Continuing.
^C
Program received signal SIGINT, Interrupt.
0x00007fe8ec448903 in __epoll_wait_nocancel () at ../sysdeps/unix/syscall-template.S:81
81 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) b ReadBuffer_common
Breakpoint 2 at 0x876e28: file bufmgr.c, line 711.
(gdb)
This time the found variable is true:
...
(gdb)
745 bufHdr = BufferAlloc(smgr, relpersistence, forkNum, blockNum,
(gdb)
747 if (found)
(gdb) p found
$11 = true
(gdb)
(gdb) n
748 pgBufferUsage.shared_blks_hit++;
(gdb)
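Execution enters the corresponding branch:
3. If the block was found in the cache:
3.1 If this is not an extension, update the statistics, lock the buffer if the mode requires it, and return.
3.2 If this is an extension, get the block:
3.2.1 If PageIsNew returns false, raise an error.
3.2.2 For a local buffer (temporary table), just adjust the flags.
3.2.3 For a shared buffer, clear the BM_VALID flag (retrying until StartBufferIO succeeds).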
(gdb)
756 if (found)
(gdb)
758 if (!isExtend)
(gdb)
761 *hit = true;
(gdb)
762 VacuumPageHit++;
(gdb)
764 if (VacuumCostActive)
(gdb)
767 TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
(gdb)
779 if (!isLocalBuf)
(gdb)
781 if (mode == RBM_ZERO_AND_LOCK)
(gdb)
784 else if (mode == RBM_ZERO_AND_CLEANUP_LOCK)
(gdb)
788 return BufferDescriptorGetBuffer(bufHdr);
(gdb)
965 }
(gdb)