How to Analyze Safepoints on Linux from the Source Code

Published: 2021-10-29 10:50:41  Views: 145  Author: 柒染  Column: Programming Languages

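This article explains in detail how safepoints work on Linux by reading the JVM source code. It is shared here as a reference, and readers should come away with a working understanding of the topic.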

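A safepoint is, as the name suggests, a safe point. When the JVM needs to perform certain operations, it must first bring the currently running threads into a safepoint state (in effect, a stopped state). Only then can it safely carry out operations such as a thread dump or collecting stack information.

Inside the JVM, the operations that require this are usually performed by the VM thread (vm_thread, the thread discussed in earlier posts that handles the VM's own housekeeping) and the CMS thread (cms_thread, the memory reclamation thread). They call SafepointSynchronize::begin and SafepointSynchronize::end to make the other threads enter or leave the safepoint state.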

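A safepoint is normally in one of three states:

_not_synchronized	No operation is interrupting the running threads; the VM thread and CMS thread have not received any operation request.
_synchronizing	The VM thread or CMS thread has received an operation request and is waiting for all threads to reach a safepoint.
_synchronized	All threads have reached a safepoint; the VM thread or CMS thread can start the operation.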
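
The three states form a small state machine, which can be sketched as follows. This is only an illustrative model, not HotSpot code: the class SafepointModel, its methods, and the thread counter are assumptions made up for this example, and only the state names follow the list above.

```cpp
#include <cassert>

namespace safepoint_model {

// State names follow the article; everything else here is hypothetical.
enum class SyncState { NotSynchronized, Synchronizing, Synchronized };

// Single-threaded toy model of the begin/end protocol.
class SafepointModel {
 public:
  explicit SafepointModel(int running_threads)
      : state_(SyncState::NotSynchronized),
        still_running_(running_threads),
        total_(running_threads) {}

  // The VM thread received an operation: start waiting for threads to stop.
  void begin() {
    assert(state_ == SyncState::NotSynchronized);
    state_ = (still_running_ == 0) ? SyncState::Synchronized
                                   : SyncState::Synchronizing;
  }

  // One thread reports that it has reached a safepoint.
  void thread_reached_safepoint() {
    assert(state_ == SyncState::Synchronizing && still_running_ > 0);
    if (--still_running_ == 0) {
      state_ = SyncState::Synchronized;  // last thread in: operation may run
    }
  }

  // The operation is finished: release every thread.
  void end() {
    assert(state_ == SyncState::Synchronized);
    still_running_ = total_;
    state_ = SyncState::NotSynchronized;
  }

  SyncState state() const { return state_; }

 private:
  SyncState state_;
  int still_running_;
  int total_;
};

}  // namespace safepoint_model
```

With two running threads, begin() moves the model to Synchronizing, the second call to thread_reached_safepoint() moves it to Synchronized, and end() returns it to NotSynchronized.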

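Java thread states

The Java threads in a Java process can be in several different states, and the JVM uses a different mechanism for each state to bring the thread to a safepoint.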

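a. Interpreting bytecode

Java bytecode is interpreted, and a thread that is interpreting bytecode uses a dispatch table that records the addresses to jump to. That makes stopping such a thread fairly easy: replacing the dispatch table is enough to let the thread know that a safepoint is in progress.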

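The JVM sets up three DispatchTables: _active_table, _normal_table, and _safept_table.

_active_table is the dispatch table used by threads that are currently interpreting.

_normal_table is the dispatch table initialized for normal execution.

_safept_table is the dispatch table needed for a safepoint.

Interpreting threads always dispatch through _active_table. The key point is that on entering a safepoint, _safept_table is copied over _active_table, and on leaving the safepoint, _normal_table is copied over _active_table.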

The implementation can be seen in the source:

void TemplateInterpreter::notice_safepoints() {
  if (!_notice_safepoints) {
    // switch to safepoint dispatch table
    _notice_safepoints = true;
    copy_table((address*)&_safept_table, (address*)&_active_table, sizeof(_active_table) / sizeof(address));
  }
}

// switch from the dispatch table which notices safepoints back to the
// normal dispatch table.  So that we can notice single stepping points,
// keep the safepoint dispatch table if we are single stepping in JVMTI.
// Note that the should_post_single_step test is exactly as fast as the
// JvmtiExport::_enabled test and covers both cases.
void TemplateInterpreter::ignore_safepoints() {
  if (_notice_safepoints) {
    if (!JvmtiExport::should_post_single_step()) {
      // switch to normal dispatch table
      _notice_safepoints = false;
      copy_table((address*)&_normal_table, (address*)&_active_table, sizeof(_active_table) / sizeof(address));
    }
  }
}
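
To make the table swap concrete, here is a toy interpreter with the same three tables. It is a sketch under assumptions, not the real TemplateInterpreter: the ToyInterpreter struct, the two-entry table size, and the string-returning handlers are invented for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

namespace dispatch_demo {

// Hypothetical bytecode handler type; real handlers are generated machine code.
using Handler = std::string (*)();
constexpr std::size_t kNumBytecodes = 2;  // toy size, chosen for the example
using DispatchTable = std::array<Handler, kNumBytecodes>;

inline std::string run_normally() { return "execute"; }
inline std::string check_safepoint_first() { return "safepoint check, then execute"; }

struct ToyInterpreter {
  DispatchTable normal_table{run_normally, run_normally};
  DispatchTable safept_table{check_safepoint_first, check_safepoint_first};
  DispatchTable active_table = normal_table;  // what running threads use
  bool noticing = false;

  // Entering a safepoint: copy the safepoint table over the active table.
  void notice_safepoints() {
    if (!noticing) {
      noticing = true;
      active_table = safept_table;
    }
  }

  // Leaving a safepoint: copy the normal table back.
  void ignore_safepoints() {
    if (noticing) {
      noticing = false;
      active_table = normal_table;
    }
  }

  // Every bytecode goes through the active table, so a swap takes effect
  // at the next dispatch without touching the interpreting thread.
  std::string dispatch(std::size_t bytecode) const {
    assert(bytecode < kNumBytecodes);
    return active_table[bytecode]();
  }
};

}  // namespace dispatch_demo
```

The point of the design is that the interpreting thread never tests a flag itself; it simply lands in a different handler once the table has been replaced.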

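b. Running native code

If a thread is running native code, the VM thread does not need to wait for it to finish. The thread only has to check _state when it returns from native code.

In the code, this check is SafepointSynchronize::do_call_back(), which also appeared in an earlier post: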

inline static bool do_call_back() {
  return (_state != _not_synchronized);
}

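It returns true whenever _state is anything other than _not_synchronized.

For a thread returning from native code to Java to read and set the correct thread state, the usual solution is a memory barrier. The JVM uses OrderAccess::fence(), which in assembly is __asm__ volatile ("lock; addl $0,0(%%rsp)" : : : "cc", "memory");, to guarantee that the correct value is read from memory. This approach hurts performance significantly, however, so the JVM instead uses a separate memory page to set the thread state. The -XX:+UseMembar flag selects the memory barrier; it is off by default, which means the separate memory page is used.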
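
The memory-barrier variant can be sketched in portable C++. This is an illustration only: the ToyThread type, the state constants, and return_from_native are assumptions for this example, and std::atomic_thread_fence merely plays the role that OrderAccess::fence() plays in the text above.

```cpp
#include <atomic>
#include <cassert>

namespace fence_demo {

// Hypothetical state values, loosely modelled on a native-to-Java transition.
enum ThreadState : int { kInNative = 0, kInNativeTrans = 1, kInJava = 2 };
enum SyncState : int { kNotSynchronized = 0, kSynchronizing = 1, kSynchronized = 2 };

struct ToyThread {
  std::atomic<int> state{kInNative};
};

// Called by a thread coming back from native code.
// Returns true when the thread would have to block for a safepoint.
inline bool return_from_native(ToyThread& t, const std::atomic<int>& sync_state) {
  assert(t.state.load(std::memory_order_relaxed) == kInNative);

  // 1. Publish that this thread is on its way back into Java code.
  t.state.store(kInNativeTrans, std::memory_order_relaxed);

  // 2. Full fence: keeps the store above from being reordered after the
  //    load below. The VM thread side would need a matching fence between
  //    writing sync_state and reading the thread states.
  std::atomic_thread_fence(std::memory_order_seq_cst);

  // 3. Same test as do_call_back(): anything but "not synchronized" means stop.
  if (sync_state.load(std::memory_order_relaxed) != kNotSynchronized) {
    return true;  // a real VM would block here; state stays kInNativeTrans
  }
  t.state.store(kInJava, std::memory_order_release);
  return false;
}

}  // namespace fence_demo
```

Without the fence in step 2, the thread could read a stale sync_state while the VM thread reads a stale thread state, and both would proceed. Paying for a full fence on every native return is the cost that the page-based scheme described above is meant to avoid.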

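c. Running compiled code

1. Polling page

The polling page is a separate memory page that is set up when the JVM starts. It is the key to stopping threads that are running compiled code.

On Linux it is created with mmap, as the source shows: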

address polling_page = (address) ::mmap(NULL, Linux::page_size(), PROT_READ, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
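
The same mmap call can be tried outside the JVM. The sketch below is a standalone illustration for Linux: the PollingPage struct and the helper names are assumptions for this example, and sysconf(_SC_PAGESIZE) stands in for Linux::page_size().

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>

namespace polling_page_demo {

// Hypothetical holder for the mapping; addr is nullptr if mmap failed.
struct PollingPage {
  void* addr;
  std::size_t size;
};

// Map one anonymous, read-only page, mirroring the flags quoted above.
inline PollingPage create_polling_page() {
  const std::size_t size = static_cast<std::size_t>(::sysconf(_SC_PAGESIZE));
  void* p = ::mmap(nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return PollingPage{p == MAP_FAILED ? nullptr : p, size};
}

// A "poll" is just a load from the page; the value itself is irrelevant.
inline int poll(const PollingPage& page) {
  assert(page.addr != nullptr);
  return *static_cast<const volatile int*>(page.addr);
}

// Unmap the page; returns true on success.
inline bool destroy_polling_page(PollingPage& page) {
  if (page.addr == nullptr) return true;
  const bool ok = ::munmap(page.addr, page.size) == 0;
  page.addr = nullptr;
  return ok;
}

}  // namespace polling_page_demo
```

While the page is readable, a poll costs one ordinary load. Anonymous mappings are zero-filled, so the load reads 0 here.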

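2. Compilation

The JIT compiles hot code directly to machine code, which is then executed without being interpreted, improving performance. In the compiled code, a function or method accesses the polling page when it returns.

On the x86 architecture: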

void LIR_Assembler::return_op(LIR_Opr result) {
  assert(result->is_illegal() || !result->is_single_cpu() || result->as_register() == rax, "word returns are in rax,");
  if (!result->is_illegal() && result->is_float_kind() && !result->is_xmm_register()) {
    assert(result->fpu() == 0, "result must already be on TOS");
  }

  // Pop the stack before the safepoint code
  __ remove_frame(initial_frame_size_in_bytes());

  bool result_is_oop = result->is_valid() ? result->is_oop() : false;

  // Note: we do not need to round double result; float result has the right precision
  // the poll sets the condition code, but no data registers
  AddressLiteral polling_page(os::get_polling_page() + (SafepointPollOffset % os::vm_page_size()),
                              relocInfo::poll_return_type);

  // NOTE: the requires that the polling page be reachable else the reloc
  // goes to the movq that loads the address and not the faulting instruction
  // which breaks the signal handler code

  __ test32(rax, polling_page);

  __ ret(0);
}

The source of SafepointSynchronize::begin, mentioned earlier, contains:

if (UseCompilerSafepoints && DeferPollingPageLoopCount < 0) {
  // Make polling safepoint aware
  guarantee (PageArmed == 0, "invariant") ;
  PageArmed = 1 ;
  os::make_polling_page_unreadable();
}

Two flags appear here, UseCompilerSafepoints and DeferPollingPageLoopCount. Their defaults are true and -1 respectively.

The body calls os::make_polling_page_unreadable(), which on Linux is implemented by calling mprotect(bottom, size, prot) to make the polling page unreadable.

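3. Signals

When compiled code then tries to access the unreadable polling page, the operating system raises a SIGSEGV signal. An earlier post by the author covered Java's signal handling. On x86, the handler for SIGSEGV looks like this: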

JVM_handle_linux_signal(int sig,
                        siginfo_t* info,
                        void* ucVoid,
                        int abort_if_unrecognized) {
  ....
  if (sig == SIGSEGV && os::is_poll_address((address)info->si_addr)) {
    stub = SharedRuntime::get_poll_stub(pc);
  }
  ....
}

On 64-bit Linux x86, the poll stub leads to SafepointSynchronize::handle_polling_page_exception; see sharedRuntime_x86_64.cpp for the details.

Back in safepoint.cpp, SafepointSynchronize::handle_polling_page_exception fetches the thread's safepoint_state and calls void ThreadSafepointState::handle_polling_page_exception, which in turn calls SafepointSynchronize::block(thread()); to block the current thread.
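
The whole chain (arm the page, fault on the poll, recognize the address in the signal handler) can be reproduced in a small Linux program. This is a sketch under assumptions and differs from HotSpot in one important way: HotSpot redirects the thread to a stub that blocks it, whereas the hypothetical handler below only counts the trap and makes the page readable again so the faulting load can be retried. All names in namespace poll_demo are invented for this example. Calling mprotect inside a signal handler is not guaranteed async-signal-safe by POSIX; it is assumed to be acceptable here because the fault is synchronous and the call is a plain system call on Linux.

```cpp
#include <signal.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>

namespace poll_demo {

// Globals, because a signal handler cannot receive user arguments.
char* g_page = nullptr;
std::size_t g_page_size = 0;
volatile sig_atomic_t g_polls_trapped = 0;

// Toy counterpart of the os::is_poll_address check quoted above.
bool is_poll_address(const void* addr) {
  const char* a = static_cast<const char*>(addr);
  return g_page != nullptr && a >= g_page && a < g_page + g_page_size;
}

// SIGSEGV handler: a fault on the polling page is a safepoint request.
void on_sigsegv(int, siginfo_t* info, void*) {
  if (is_poll_address(info->si_addr)) {
    g_polls_trapped = g_polls_trapped + 1;
    // Disarm the page so the faulting load succeeds when it is re-executed.
    ::mprotect(g_page, g_page_size, PROT_READ);
    return;
  }
  // Unrelated fault: restore the default action; the retried access then
  // terminates the process as usual.
  ::signal(SIGSEGV, SIG_DFL);
}

// Map the polling page and install the handler. Returns true on success.
bool setup() {
  g_page_size = static_cast<std::size_t>(::sysconf(_SC_PAGESIZE));
  void* p = ::mmap(nullptr, g_page_size, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED) return false;
  g_page = static_cast<char*>(p);

  struct sigaction sa {};
  sa.sa_sigaction = on_sigsegv;
  sa.sa_flags = SA_SIGINFO;  // needed so the handler receives si_addr
  ::sigemptyset(&sa.sa_mask);
  return ::sigaction(SIGSEGV, &sa, nullptr) == 0;
}

// Arm the page: the next poll will fault.
bool arm() {
  assert(g_page != nullptr);
  return ::mprotect(g_page, g_page_size, PROT_NONE) == 0;
}

// The poll that compiled code would perform on method return.
int poll() { return *reinterpret_cast<volatile int*>(g_page); }

int polls_trapped() { return g_polls_trapped; }

}  // namespace poll_demo
```

After setup(), a poll() reads the page without any trap. After arm(), the next poll() raises SIGSEGV, the handler recognizes the address and disarms the page, and the retried load then completes normally.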

d. Blocked state

A thread that is already blocked simply stays blocked.

That concludes this source-level look at safepoints on Linux. Hopefully it has been helpful.
