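Overview
After the TeaCon 2020 exhibition server opened, there were several incidents in which every player was disconnected and the server stopped responding. After some investigation, we found that someone was maliciously attacking the server by exploiting a vulnerability in the packet handling of the server-side TheOneProbe mod.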
Investigation
Initial investigation
On finding the server frozen, the author immediately took a thread dump to see what was going on:
2020-08-17 20:08:12
Full thread dump OpenJDK 64-Bit Server VM (25.252-b09 mixed mode):
...
"Server thread" #28 prio=5 os_prio=0 tid=0x00007f0d915b1000 nid=0xacb4 runnable [0x00007f0d644eb000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000003c954ad68> (a java.lang.String)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at net.minecraft.util.concurrent.ThreadTaskExecutor.func_223705_bi(SourceFile:139)
    at net.minecraft.util.concurrent.ThreadTaskExecutor.func_213161_c(SourceFile:129)
    at net.minecraft.world.server.ServerChunkProvider.func_212849_a_(SourceFile:139)
    at net.minecraft.world.server.ServerChunkProvider.func_222868_e(SourceFile:122)
    at net.minecraft.world.server.ServerChunkProvider$$Lambda$4762/1079846110.get(Unknown Source)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
    at net.minecraft.util.concurrent.ThreadTaskExecutor.func_213166_h(SourceFile:144)
    at net.minecraft.world.server.ServerChunkProvider$ChunkExecutor.func_213166_h(SourceFile:551)
    at net.minecraft.util.concurrent.ThreadTaskExecutor.func_213168_p(SourceFile:118)
    at net.minecraft.world.server.ServerChunkProvider$ChunkExecutor.func_213168_p(SourceFile:560)
    at net.minecraft.world.server.ServerChunkProvider.func_217234_d(SourceFile:278)
    at net.minecraft.server.MinecraftServer.func_213205_aW(MinecraftServer.java:719)
    at net.minecraft.server.MinecraftServer.func_213168_p(MinecraftServer.java:708)
    at net.minecraft.util.concurrent.ThreadTaskExecutor.func_213161_c(SourceFile:127)
    at net.minecraft.server.MinecraftServer.func_213202_o(MinecraftServer.java:694)
    at net.minecraft.server.MinecraftServer.func_213186_a(MinecraftServer.java:467)
    at net.minecraft.server.MinecraftServer.func_71247_a(MinecraftServer.java:366)
    at net.minecraft.server.dedicated.DedicatedServer.func_71197_b(DedicatedServer.java:212)
    at net.minecraft.server.MinecraftServer.run(MinecraftServer.java:613)
    at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
    - None
...
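The dump shows that the Server thread is waiting on a large amount of chunk generation and saving. The next goal was therefore obvious: find the code that was loading so many chunks.
Locating the problem
With help from @海螺, we inserted a hook into the getChunk method that prints a stack trace whenever a chunk at abnormal coordinates is loaded. This yielded the following log: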
[23Aug2020 22:38:59.731] [Server thread/WARN] [Debug/]: Large coord chunk load 117920 561984
[23Aug2020 22:38:59.731] [Server thread/DEBUG] [Debug/]: chunk load debug
java.lang.RuntimeException: chunk load debug
    at net.minecraft.world.server.ChunkHolder.handler$zza000$arclight$onChunkLoad(ChunkHolderMixin1.java:536) ~[?:?]
    at net.minecraft.world.server.ChunkHolder.func_219291_a(ChunkHolderMixin1.java:360) ~[?:?]
    at net.minecraft.world.server.TicketManager.func_219343_a(SourceFile:113) ~[?:?]
    at java.lang.Iterable.forEach(Iterable.java:75) ~[?:1.8.0_252]
    at net.minecraft.world.server.TicketManager.func_219353_a(SourceFile:113) ~[?:?]
    at net.minecraft.world.server.ServerChunkProvider.func_217235_l(SourceFile:282) ~[?:?]
    at net.minecraft.world.server.ServerChunkProvider.func_217233_c(SourceFile:221) ~[?:?]
    at net.minecraft.world.server.ServerChunkProvider.func_212849_a_(SourceFile:138) ~[?:?]
    at net.minecraft.world.World.func_217353_a(World.java:149) ~[?:?]
    at net.minecraft.world.IWorldReader.func_217348_a(IWorldReader.java:101) ~[?:?]
    at net.minecraft.world.World.func_212866_a_(World.java:145) ~[?:?]
    at net.minecraft.world.World.func_180495_p(World.java:361) ~[?:?]
    at mcjty.theoneprobe.network.PacketGetInfo.getProbeInfo(PacketGetInfo.java:122) ~[?:1.15-2.0.5]
    at mcjty.theoneprobe.network.PacketGetInfo.lambda$handle$0(PacketGetInfo.java:100) ~[?:1.15-2.0.5]
    at net.minecraftforge.fml.network.NetworkEvent$Context.enqueueWork(NetworkEvent.java:215) ~[?:?]
    at mcjty.theoneprobe.network.PacketGetInfo.handle(PacketGetInfo.java:97) ~[?:1.15-2.0.5]
    at net.minecraftforge.fml.network.simple.IndexedMessageCodec.lambda$tryDecode$3(IndexedMessageCodec.java:128) ~[?:?]
    at java.util.Optional.ifPresent(Optional.java:159) ~[?:1.8.0_252]
    at net.minecraftforge.fml.network.simple.IndexedMessageCodec.tryDecode(IndexedMessageCodec.java:128) ~[?:?]
    at net.minecraftforge.fml.network.simple.IndexedMessageCodec.consume(IndexedMessageCodec.java:162) ~[?:?]
    at net.minecraftforge.fml.network.simple.SimpleChannel.networkEventListener(SimpleChannel.java:80) ~[?:?]
    at net.minecraftforge.eventbus.EventBus.doCastFilter(EventBus.java:212) ~[eventbus-2.2.0-service.jar:?]
    at net.minecraftforge.eventbus.EventBus.lambda$addListener$11(EventBus.java:204) ~[eventbus-2.2.0-service.jar:?]
    at net.minecraftforge.eventbus.EventBus.post(EventBus.java:258) ~[eventbus-2.2.0-service.jar:?]
    at net.minecraftforge.fml.network.NetworkInstance.dispatch(NetworkInstance.java:84) ~[?:?]
    at net.minecraftforge.fml.network.NetworkHooks.lambda$onCustomPayload$1(NetworkHooks.java:78) ~[?:?]
    at java.util.Optional.map(Optional.java:215) ~[?:1.8.0_252]
    at net.minecraftforge.fml.network.NetworkHooks.onCustomPayload(NetworkHooks.java:78) ~[?:?]
    at net.minecraft.network.play.ServerPlayNetHandler.func_147349_a(ServerPlayNetHandler.java:1279) ~[?:?]
    at net.minecraft.network.play.client.CCustomPayloadPacket.func_148833_a(CCustomPayloadPacket.java:42) ~[?:?]
    at net.minecraft.network.play.client.CCustomPayloadPacket.func_148833_a(CCustomPayloadPacket.java:12) ~[?:?]
    at net.minecraft.network.PacketThreadUtil.func_225383_a(SourceFile:21) ~[?:?]
    at net.minecraft.util.concurrent.TickDelayedTask.run(SourceFile:18) [?:?]
    at net.minecraft.util.concurrent.ThreadTaskExecutor.func_213166_h(SourceFile:144) [?:?]
    at net.minecraft.util.concurrent.RecursiveEventLoop.func_213166_h(SourceFile:23) [?:?]
    at net.minecraft.server.MinecraftServer.func_213166_h(MinecraftServer.java:731) [?:?]
    at net.minecraft.server.MinecraftServer.func_213166_h(MinecraftServer.java:141) [?:?]
    at net.minecraft.util.concurrent.ThreadTaskExecutor.func_213168_p(SourceFile:118) [?:?]
    at net.minecraft.server.MinecraftServer.func_213205_aW(MinecraftServer.java:714) [?:?]
    at net.minecraft.server.MinecraftServer.func_213168_p(MinecraftServer.java:708) [?:?]
    at net.minecraft.util.concurrent.ThreadTaskExecutor.func_213160_bf(SourceFile:103) [?:?]
    at net.minecraft.server.MinecraftServer.func_213202_o(MinecraftServer.java:693) [?:?]
    at net.minecraft.server.MinecraftServer.run(MinecraftServer.java:641) [?:?]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
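The kind of hook that produces a log like this can be sketched in plain Java. This is only an illustration, not the actual mixin used on the server: the class name ChunkLoadDebugHook, the threshold MAX_REASONABLE_CHUNK_COORD, and the reports list are all hypothetical, and a real hook would be injected into the chunk-loading path rather than called by hand.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a debug hook on chunk loading; not the real mixin. */
final class ChunkLoadDebugHook {
    /** Assumed threshold: chunk coordinates beyond this count as abnormal. */
    static final int MAX_REASONABLE_CHUNK_COORD = 10_000;

    /** Messages recorded so far, kept so callers can inspect them. */
    static final List<String> reports = new ArrayList<>();

    static boolean isAbnormal(int chunkX, int chunkZ) {
        // Widen to long first so Math.abs cannot overflow on Integer.MIN_VALUE.
        return Math.abs((long) chunkX) > MAX_REASONABLE_CHUNK_COORD
            || Math.abs((long) chunkZ) > MAX_REASONABLE_CHUNK_COORD;
    }

    /** Intended to run at the start of the chunk-loading path. */
    static void onChunkLoad(int chunkX, int chunkZ) {
        if (!isAbnormal(chunkX, chunkZ)) {
            return; // ordinary loads stay silent
        }
        String message = "Large coord chunk load " + chunkX + " " + chunkZ;
        reports.add(message);
        System.err.println(message);
        // Printing an exception's stack trace reveals who requested the chunk.
        new RuntimeException("chunk load debug").printStackTrace();
    }
}
```

Constructing an exception purely to print its stack trace is cheap enough for a temporary diagnostic, and it points straight at the caller that requested the far-away chunk.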
The trace shows that the problem lies in the PacketGetInfo.getProbeInfo method of TheOneProbe:
public void handle(Supplier<NetworkEvent.Context> ctx) {
    ctx.get().enqueueWork(() -> {
        ServerWorld world = DimensionManager.getWorld(ctx.get().getSender().server, dim, true, false);
        if (world != null) {
            ProbeInfo probeInfo = getProbeInfo(ctx.get().getSender(),
                    mode, world, pos, sideHit, hitVec, pickBlock);
            PacketHandler.INSTANCE.sendTo(new PacketReturnInfo(dim, pos, probeInfo),
                    ctx.get().getSender().connection.getNetworkManager(), NetworkDirection.PLAY_TO_CLIENT);
        }
    });
    ctx.get().setPacketHandled(true);
}

private static ProbeInfo getProbeInfo(PlayerEntity player, ProbeMode mode, World world, BlockPos blockPos,
                                      Direction sideHit, Vec3d hitVec, @Nonnull ItemStack pickBlock) {
    if (Config.needsProbe.get() == PROBE_NEEDEDFOREXTENDED) {
        // We need a probe only for extended information
        if (!ModItems.hasAProbeSomewhere(player)) {
            // No probe anywhere, switch EXTENDED to NORMAL
            if (mode == ProbeMode.EXTENDED) {
                mode = ProbeMode.NORMAL;
            }
        }
    } else if (Config.needsProbe.get() == PROBE_NEEDEDHARD && !ModItems.hasAProbeSomewhere(player)) {
        // The server says we need a probe but we don't have one in our hands
        return null;
    }

    BlockState state = world.getBlockState(blockPos);
    ProbeInfo probeInfo = TheOneProbe.theOneProbeImp.create();
    IProbeHitData data = new ProbeHitData(blockPos, hitVec, sideHit, pickBlock);

    IProbeConfig probeConfig = TheOneProbe.theOneProbeImp.createProbeConfig();
    List<IProbeConfigProvider> configProviders = TheOneProbe.theOneProbeImp.getConfigProviders();
    for (IProbeConfigProvider configProvider : configProviders) {
        configProvider.getProbeConfig(probeConfig, player, world, state, data);
    }
    Config.setRealConfig(probeConfig);

    List<IProbeInfoProvider> providers = TheOneProbe.theOneProbeImp.getProviders();
    for (IProbeInfoProvider provider : providers) {
        try {
            provider.addProbeInfo(mode, probeInfo, player, world, state, data);
        } catch (Throwable e) {
            ThrowableIdentity.registerThrowable(e);
            probeInfo.text(LABEL + "Error: " + ERROR + provider.getID());
        }
    }
    return probeInfo;
}
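This handler calls getBlockState without first checking whether the chunk containing the requested coordinates is loaded. An attacker can abuse this to make the server generate large numbers of chunks beyond the map boundary, leaving it unresponsive for long periods. That is exactly what happened here: the attacker sent a flood of forged PacketGetInfo packets to force chunk generation, and the server stopped responding.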
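The missing check can be illustrated with a small self-contained model. This is a sketch under stated assumptions, not the fix that was applied to TheOneProbe: GuardedProbeLookup and all of its members are hypothetical stand-ins for the server world, and the only point is the order of operations, namely refusing the request when the chunk is not already loaded instead of loading it on demand.

```java
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

/** Hypothetical, simplified world model; stands in for the real server world. */
final class GuardedProbeLookup {
    private final Set<Long> loadedChunks = new HashSet<>();

    /** Counts how many chunk loads a lookup has forced. */
    int chunkLoadsTriggered = 0;

    private static long chunkKey(int blockX, int blockZ) {
        // 16 blocks per chunk: block coordinate -> chunk coordinate is a shift by 4.
        long chunkX = blockX >> 4;
        long chunkZ = blockZ >> 4;
        return (chunkX << 32) | (chunkZ & 0xFFFFFFFFL);
    }

    void markLoaded(int blockX, int blockZ) {
        loadedChunks.add(chunkKey(blockX, blockZ));
    }

    boolean isBlockLoaded(int blockX, int blockZ) {
        return loadedChunks.contains(chunkKey(blockX, blockZ));
    }

    /** Unguarded lookup: loads the chunk on demand, like the vulnerable path. */
    String getBlockState(int blockX, int blockZ) {
        if (loadedChunks.add(chunkKey(blockX, blockZ))) {
            chunkLoadsTriggered++;
        }
        return "block@" + blockX + "," + blockZ;
    }

    /** Guarded handler: ignores requests that would force a chunk load. */
    Optional<String> handleProbeRequest(int blockX, int blockZ) {
        if (!isBlockLoaded(blockX, blockZ)) {
            return Optional.empty();
        }
        return Optional.of(getBlockState(blockX, blockZ));
    }
}
```

With the guard in place, a request for a block in an unloaded chunk costs a single set lookup and returns nothing, so client-supplied coordinates can no longer drive chunk generation.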
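Finding the attacker
Since the server had online-mode (genuine account) authentication enabled, we decided to identify the attacker's ID directly. All that was needed was to add a coordinate validity check to the relevant packet handler and print the details of any player behaving illegitimately. The day after "casting the net", we caught the attacker: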
[24Aug2020 17:19:28.573] [Server thread/INFO] [STDOUT/]: [mcjty.theoneprobe.network.PacketGetInfo:lambda$handle$0:101]: Hello player quasimodito, what are you fucking doing?
[24Aug2020 17:19:28.573] [Server thread/INFO] [STDOUT/]: [mcjty.theoneprobe.network.PacketGetInfo:lambda$handle$0:102]: Your ip is 14.18.251.97.
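A check of this kind might look roughly like the sketch below. The source does not say which validity rule was actually used, so the distance-based rule, the MAX_DISTANCE value, and the SuspiciousRequestLog class are all assumptions made for illustration; the player name and address are passed in as plain strings rather than read from a real connection.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that rejects and records implausible probe requests. */
final class SuspiciousRequestLog {
    /** Assumed limit, in blocks, between a player and the block being probed. */
    static final int MAX_DISTANCE = 64;

    private final Map<String, Integer> offences = new LinkedHashMap<>();

    static boolean isLegal(int playerX, int playerZ, int targetX, int targetZ) {
        // Use long arithmetic so extreme coordinates cannot overflow.
        long dx = (long) targetX - playerX;
        long dz = (long) targetZ - playerZ;
        return Math.abs(dx) <= MAX_DISTANCE && Math.abs(dz) <= MAX_DISTANCE;
    }

    /** Returns true if the request may proceed; otherwise logs the sender. */
    boolean check(String playerName, String address,
                  int playerX, int playerZ, int targetX, int targetZ) {
        if (isLegal(playerX, playerZ, targetX, targetZ)) {
            return true;
        }
        offences.merge(playerName, 1, Integer::sum);
        System.out.println("Suspicious probe request from " + playerName
                + " (" + address + ") for block " + targetX + " " + targetZ);
        return false;
    }

    int offenceCount(String playerName) {
        return offences.getOrDefault(playerName, 0);
    }
}
```

Counting offences per player, rather than only printing them, makes it easier to tell a single stray packet from a sustained flood.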
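Afterword
Network security is a very important topic. In Minecraft mod development, even an inconspicuous problem like this one in a small network handler can cause enormous damage to a server.
Issues such as chunk-load checks are even mentioned in the Forge documentation on SimpleChannel, yet even TheOneProbe, a mod that is practically a "must-have" in modpacks, contained such a serious problem.
Shortly after the problem in TOP was discovered, we found a similar issue in one of the mods entered in the contest. After 3T raised the matter with McJty, McJty fixed 23 bugs of the same kind across all of his mods, which shows how widespread the problem is.
We hope this incident leads Minecraft modders to take network security more seriously. The author also gained quite a lot from this debugging process, such as this piece of 海螺's embarrassing history.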