记一次 .NET 某娱乐聊天流平台 CPU 爆高分析

一:背景1.讲故事前段时间有位朋友加微信,说他的程序直接 CPU=100%,每次只能手工介入重启,让我帮忙看下到底怎么回事,哈哈,这种CPU打满的事故,程序员压力会非常大, 我让朋友在 CPU 高的时候抓 2 个 dump 下来,然后发给我分析 。
二:WinDbg 分析1. CPU 真的被打满了吗?为了防止南辕北辙 , 一定要用 !tp 命令去验证下是不是真的 CPU 爆高 。
0:000> !tpCPU utilization: 100%Worker Thread: Total: 21 Running: 7 Idle: 0 MaxLimit: 32767 MinLimit: 4Work Request in Queue: 3    AsyncTimerCallbackCompletion TimerInfo@00000000042d2430    AsyncTimerCallbackCompletion TimerInfo@00000000042d2f90    AsyncTimerCallbackCompletion TimerInfo@000000000420c150--------------------------------------Number of Timers: 0--------------------------------------Completion Port Thread:Total: 18 Free: 9 MaxFree: 8 CurrentLimit: 18 MaxLimit: 1000 MinLimit: 4从卦中看确实 100%,太牛了 , 而且 WorkRequest 还有任务堆积现象,确认无疑后 , 接下来看下是谁引发的?
2. 谁导致的 CPU 爆高根据惯例首先怀疑是不是 GC 触发所致,可以用 !t 查看下线程列表,观察下有没有 GC 字样 。
:000> !tThreadCount:      53UnstartedThread:  0BackgroundThread: 53PendingThread:    0DeadThread:       0Hosted Runtime:   no                                                                                                        Lock       ID OSID ThreadOBJ           State GC Mode     GC Alloc Context                  Domain           Count Apt Exception   4    1 1240 00000000021cdf30    2a220 Preemptive  0000000000000000:0000000000000000 00000000021d94c0 0     MTA  23    2 4db4 00000000041cdaa0    2b220 Preemptive  0000000000000000:0000000000000000 00000000021d94c0 0     MTA (Finalizer)  ...  65  156 22f4 000000000b1a3f60  8029220 Preemptive  00000004527751F0:0000000452775EE8 00000000021d94c0 0     MTA (Threadpool Completion Port)  66  205 2ef8 000000000b1a1080  8029220 Preemptive  0000000157641DE0:00000001576435B0 00000000021d94c0 0     MTA (Threadpool Completion Port)  ...从卦中看没有 GC 字样 , 也表明这个程序并不是由 GC 触发所致,接下来该怎么排查呢? 一般来说 CPU 的爆高是由线程抬起来的,所以接下来就是看下 CPU 的档次和各个线程栈,看有没有什么新线索, 可以使用 ~*e !clrstack。
0:000> !cpuidCP  F/M/S  Manufacturer     MHz 0  6,79,1  <unavailable>   2299 1  6,79,1  <unavailable>   2299 2  6,79,1  <unavailable>   2299 3  6,79,1  <unavailable>   22990:000> ~*e !clrstackOS Thread Id: 0x2cc4 (68)        Child SP               IP Call Site000000000c14e758 00007ffadeb86e4a [GCFrame: 000000000c14e758]000000000c14e840 00007ffadeb86e4a [GCFrame: 000000000c14e840]000000000c14e878 00007ffadeb86e4a [HelperMethodFrame: 000000000c14e878] System.Threading.Monitor.Enter(System.Object)000000000c14e970 00007ffaceb40491 System.Net.ConnectionGroup.Disassociate(System.Net.Connection) [f:\dd\NDP\fx\src\net\System\Net\_ConnectionGroup.cs @ 148]000000000c14e9d0 00007ffaceb3fc93 System.Net.Connection.PrepareCloseConnectionSocket(System.Net.ConnectionReturnResult ByRef) [f:\dd\NDP\fx\src\net\System\Net\_Connection.cs @ 3048]000000000c14eaa0 00007ffacf139bfb System.Net.Connection.HandleError(Boolean, Boolean, System.Net.WebExceptionStatus, System.Net.ConnectionReturnResult ByRef) [f:\dd\NDP\fx\src\net\System\Net\_Connection.cs @ 3119]000000000c14eb00 00007ffacebc4118 System.Net.Connection.ReadComplete(Int32, System.Net.WebExceptionStatus) [f:\dd\NDP\fx\src\net\System\Net\_Connection.cs @ 3387]000000000c14eb80 00007ffacebe3dc5 System.Net.LazyAsyncResult.Complete(IntPtr) [f:\dd\NDP\fx\src\net\System\Net\_LazyAsyncResult.cs @ 415]000000000c14ebe0 00007ffacebe3d07 System.Net.LazyAsyncResult.ProtectedInvokeCallback(System.Object, IntPtr) [f:\dd\NDP\fx\src\net\System\Net\_LazyAsyncResult.cs @ 368]000000000c14ec20 00007ffacf3a476f System.Net.Security._SslStream.StartFrameBody(Int32, Byte[], Int32, Int32, System.Net.AsyncProtocolRequest)000000000c14ec80 00007ffacebb3ed8 System.Net.Security._SslStream.ReadHeaderCallback(System.Net.AsyncProtocolRequest) [f:\dd\NDP\fx\src\net\System\Net\SecureProtocols\_SslStream.cs @ 1007]000000000c14ece0 00007ffacebae5ee System.Net.AsyncProtocolRequest.CompleteRequest(Int32) [f:\dd\NDP\fx\src\net\System\Net\SecureProtocols\_HelperAsyncResults.cs @ 142]000000000c14ed10 00007ffacf3a3567 System.Net.FixedSizeReader.CheckCompletionBeforeNextRead(Int32)000000000c14ed40 00007ffacebae507 System.Net.FixedSizeReader.ReadCallback(System.IAsyncResult) [f:\dd\NDP\fx\src\net\System\Net\SecureProtocols\_FixedSizeReader.cs @ 148]000000000c14ed90 00007ffacebe3dc5 System.Net.LazyAsyncResult.Complete(IntPtr) [f:\dd\NDP\fx\src\net\System\Net\_LazyAsyncResult.cs @ 415]000000000c14edf0 00007ffadcbe3a63 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 954]000000000c14eec0 00007ffadcbe38f4 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 902]000000000c14eef0 00007ffadcbe38c2 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 891]000000000c14ef40 00007ffaceba60cf System.Net.ContextAwareResult.Complete(IntPtr) [f:\dd\NDP\fx\src\net\System\Net\_ContextAwareResult.cs @ 463]000000000c14ef90 00007ffacebe3d07 System.Net.LazyAsyncResult.ProtectedInvokeCallback(System.Object, IntPtr) [f:\dd\NDP\fx\src\net\System\Net\_LazyAsyncResult.cs @ 368]000000000c14efd0 00007ffaceba5e2f System.Net.Sockets.BaseOverlappedAsyncResult.CompletionPortCallback(UInt32, UInt32, System.Threading.NativeOverlapped*) [f:\dd\NDP\fx\src\net\System\Net\Sockets\_BaseOverlappedAsyncResult.cs @ 399]000000000c14f040 00007ffadcc2ffef System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32, UInt32, System.Threading.NativeOverlapped*) [f:\dd\ndp\clr\src\BCL\system\threading\overlapped.cs @ 135]000000000c14f1f0 00007ffade9a6d93 [GCFrame: 000000000c14f1f0] OS Thread Id: 0x5d6c (70)        Child SP               IP Call Site000000000943e920 00007ffacf139bfb System.Net.Connection.HandleError(Boolean, Boolean, System.Net.WebExceptionStatus, System.Net.ConnectionReturnResult ByRef) OS Thread Id: 0x4ad4 (75)        Child SP               IP Call Site...000000000c94e5a0 00007ffacf139bfb System.Net.Connection.HandleError(Boolean, Boolean, System.Net.WebExceptionStatus, System.Net.ConnectionReturnResult ByRef)...OS Thread Id: 0x1d70 (80)        Child SP               IP Call Site...000000000d24e3a0 00007ffacf139bfb System.Net.Connection.HandleError(Boolean, Boolean, System.Net.WebExceptionStatus, System.Net.ConnectionReturnResult ByRef) [f:\dd\NDP\fx\src\net\System\Net\_Connection.cs @ 3119]...

推荐阅读