Tinyhttp的学习剖析

it’s fucking good!!这是源码最新的一条评论,忍不住让人爆粗的httpserver,怎么能不好好学习一番呢::joy:

源码地址:https://sourceforge.net/projects/tinyhttpd/?source=typ_redirect

tar -xzvf解压后,直接make,各种警告。。还有一个错误。解决错误先。。

修改错误

先让其能make生成一个可执行程序。

Makefile修改

由于是Centos所以不需要-lsocket这个选项。而且把gcc -W -Wall -lsocket -lpthread -o httpd httpd.c ,修改为:gcc -W -Wall -o httpd httpd.c -lpthread

pthread_create 参数错误

pthread_create 函数第三个参数即回调函数名,要求是(void*)(*start_rtn)(void*)类型,但是代码中提供的 accept_request 函数是 void (*)(int) 类型的。所以将函数accpet_request的声明和定义改为如下:

1
2
3
4
void *accept_request(void *);
void *accept_request(void *pclient)
{...}

剖析

tinyHttp的处理流程如下:

tinyhttp

建议源码阅读顺序:main -> startup -> accept_request -> execute_cgi。再加上上面的流程图,会对源码有一个很好的理解。

各函数声明:

1
2
3
4
5
6
7
8
9
10
11
12
void *accept_request(void *);//甄别处理http请求
void bad_request(int);//给客户端返回错误请求
void cat(int, FILE *);//读取数据到socket
void cannot_execute(int);//回应客户端无法执行cgi
void error_die(const char *);//打印出错信息
void execute_cgi(int, const char *, const char *, const char *);//运行cgi
int get_line(int, char *, int);//按行读取套接字,
void headers(int, const char *);//把http请求send入套接字
void not_found(int);//404的处理
void serve_file(int, const char *);//调用cat返回包含有header和error的文件给client
int startup(u_short *);//初始化 httpd 服务,包括建立套接字,绑定端口,进行监听等
void unimplemented(int);//返回给浏览器表明收到的 HTTP 请求所用的 method 不被支持

以下就是关于源码的注释:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
/* J. David's webserver */
/* This is a simple webserver.
* Created November 1999 by J. David Blackstone.
* CSE 4344 (Network concepts), Prof. Zeigler
* University of Texas at Arlington
*/
/* This program compiles for Sparc Solaris 2.6.
* To compile for Linux:
* 1) Comment out the #include <pthread.h> line.
* 2) Comment out the line that defines the variable newthread.
* 3) Comment out the two lines that run pthread_create().
* 4) Uncomment the line that runs accept_request().
* 5) Remove -lsocket from the Makefile.
*/
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <ctype.h>
#include <strings.h>
#include <string.h>
#include <sys/stat.h>
#include <pthread.h>
#include <sys/wait.h>
#include <stdlib.h>
#define ISspace(x) isspace((int)(x))
#define SERVER_STRING "Server: jdbhttpd/0.1.0\r\n"
void *accept_request(void *);
void bad_request(int);
void cat(int, FILE *);
void cannot_execute(int);
void error_die(const char *);
void execute_cgi(int, const char *, const char *, const char *);
int get_line(int, char *, int);
void headers(int, const char *);
void not_found(int);
void serve_file(int, const char *);
int startup(u_short *);
void unimplemented(int);
/**********************************************************************/
/* A request has caused a call to accept() on the server port to
* return. Process the request appropriately.
* Parameters: the socket connected to the client */
/**********************************************************************/
void *accept_request(void *client)
{
char buf[1024];
int numchars;
char method[255];
char url[255];
char path[512];
size_t i, j;
struct stat st;
int cgi = 0; /* becomes true if server decides this is a CGI
* program */
char *query_string = NULL;
//得到请求第一行,并保存第一行字节数
numchars = get_line(client, buf, sizeof(buf));
i = 0; j = 0;
//将客户端的请求方法存入method数组
while (!ISspace(buf[j]) && (i < sizeof(method) - 1))
{
method[i] = buf[j];
i++; j++;
}
method[i] = '\0';
//方法既不是GET又不是POST则返回错误信息
if (strcasecmp(method, "GET") && strcasecmp(method, "POST"))
{
unimplemented(client);
return;
}
//是POST,cgi置1
if (strcasecmp(method, "POST") == 0)
cgi = 1;
//读取uri,存入uri[]
i = 0;
while (ISspace(buf[j]) && (j < sizeof(buf)))
j++;
while (!ISspace(buf[j]) && (i < sizeof(url) - 1) && (j < sizeof(buf)))
{
url[i] = buf[j];
i++; j++;
}
url[i] = '\0';
//方法为GET
if (strcasecmp(method, "GET") == 0)
{
query_string = url;
while ((*query_string != '?') && (*query_string != '\0'))
query_string++;
//读取?后参数
if (*query_string == '?')
{
cgi = 1;
*query_string = '\0';
query_string++;
}
}
//将url格式化输出到path数组,并且在url前加上htocs,index.html在htdoc目录下
sprintf(path, "htdocs%s", url);
//path加上index.html
if (path[strlen(path) - 1] == '/')
strcat(path, "index.html");
//根据path找文件
if (stat(path, &st) == -1) {
//把所有headers的信息都丢掉
while ((numchars > 0) && strcmp("\n", buf)) /* read & discard headers */
numchars = get_line(client, buf, sizeof(buf));
not_found(client);//找不到客户端
}
else
{
//如果是个目录,则默认使用该目录下 index.html 文件
if ((st.st_mode & S_IFMT) == S_IFDIR)
strcat(path, "/index.html");
if ((st.st_mode & S_IXUSR) ||(st.st_mode & S_IXGRP) ||(st.st_mode & S_IXOTH) )
cgi = 1;
//不是 cgi,直接把服务器文件返回,否则执行 cgi
if (!cgi)
serve_file(client, path);
else
execute_cgi(client, path, method, query_string);
}
//断开与客户端的连接(HTTP 特点:无连接)
close(client);
}
/**********************************************************************/
/* Inform the client that a request it has made has a problem.
* Parameters: client socket */
/**********************************************************************/
void bad_request(int client)
{
char buf[1024];
//没有找到content-length
sprintf(buf, "HTTP/1.0 400 BAD REQUEST\r\n");
send(client, buf, sizeof(buf), 0);
sprintf(buf, "Content-type: text/html\r\n");
send(client, buf, sizeof(buf), 0);
sprintf(buf, "\r\n");
send(client, buf, sizeof(buf), 0);
sprintf(buf, "<P>Your browser sent a bad request, ");
send(client, buf, sizeof(buf), 0);
sprintf(buf, "such as a POST without a Content-Length.\r\n");
send(client, buf, sizeof(buf), 0);
}
/**********************************************************************/
/* Put the entire contents of a file out on a socket. This function
* is named after the UNIX "cat" command, because it might have been
* easier just to do something like pipe, fork, and exec("cat").
* Parameters: the client socket descriptor
* FILE pointer for the file to cat */
/**********************************************************************/
void cat(int client, FILE *resource)
{
char buf[1024];
//读取文件信息写入套接字
fgets(buf, sizeof(buf), resource);
while (!feof(resource))
{
send(client, buf, strlen(buf), 0);
fgets(buf, sizeof(buf), resource);
}
}
/**********************************************************************/
/* Inform the client that a CGI script could not be executed.
* Parameter: the client socket descriptor. */
/**********************************************************************/
void cannot_execute(int client)
{
char buf[1024];
//建立管道或者fork时失败
sprintf(buf, "HTTP/1.0 500 Internal Server Error\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "Content-type: text/html\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "<P>Error prohibited CGI execution.\r\n");
send(client, buf, strlen(buf), 0);
}
/**********************************************************************/
/* Print out an error message with perror() (for system errors; based
* on value of errno, which indicates system call errors) and exit the
* program indicating an error. */
/**********************************************************************/
void error_die(const char *sc)
{
//用perror进行报错
perror(sc);
exit(1);
}
/**********************************************************************/
/* Execute a CGI script. Will need to set environment variables as
* appropriate.
* Parameters: client socket descriptor
* path to the CGI script */
/**********************************************************************/
void execute_cgi(int client, const char *path,
const char *method, const char *query_string)
{
char buf[1024];
int cgi_output[2];
int cgi_input[2];
pid_t pid;
int status;
int i;
char c;
int numchars = 1;
int content_length = -1;
buf[0] = 'A'; buf[1] = '\0';
if (strcasecmp(method, "GET") == 0)
//读取整个报头并丢弃
while ((numchars > 0) && strcmp("\n", buf)) /* read & discard headers */
numchars = get_line(client, buf, sizeof(buf));
else /* POST */
{
//找出content-length
numchars = get_line(client, buf, sizeof(buf));
while ((numchars > 0) && strcmp("\n", buf))
{
buf[15] = '\0';
if (strcasecmp(buf, "Content-Length:") == 0)
content_length = atoi(&(buf[16]));
numchars = get_line(client, buf, sizeof(buf));
}
//没有找到
if (content_length == -1) {
bad_request(client);
return;
}
}
//添加状态码
sprintf(buf, "HTTP/1.0 200 OK\r\n");
send(client, buf, strlen(buf), 0);
//建立管道output
if (pipe(cgi_output) < 0) {
cannot_execute(client);
return;
}
//建立input管道
if (pipe(cgi_input) < 0) {
cannot_execute(client);
return;
}
if ( (pid = fork()) < 0 ) {
cannot_execute(client);
return;
}
if (pid == 0) /* child: CGI script */
{
char meth_env[255];
char query_env[255];
char length_env[255];
//把 STDOUT 重定向到 cgi_output 的写入端
dup2(cgi_output[1], 1);
//把 STDIN 重定向到 cgi_input 的读取端
dup2(cgi_input[0], 0);
//关闭 cgi_input 的写入端 和 cgi_output 的读取端
close(cgi_output[0]);
close(cgi_input[1]);
//设置request_method的环境变量
sprintf(meth_env, "REQUEST_METHOD=%s", method);
putenv(meth_env);
if (strcasecmp(method, "GET") == 0) {
//设置query_string的环境变量
sprintf(query_env, "QUERY_STRING=%s", query_string);
putenv(query_env);
}
else { /* POST */
//设置 content_length 的环境变量
sprintf(length_env, "CONTENT_LENGTH=%d", content_length);
putenv(length_env);
}
//子进程execl替换运行cgi程序
execl(path, path, NULL);
exit(0);
} else { /* parent */
//关闭 cgi_input 的读取端 和 cgi_output 的写入端
close(cgi_output[1]);
close(cgi_input[0]);
if (strcasecmp(method, "POST") == 0)
//接收 POST 过来的数据
for (i = 0; i < content_length; i++) {
recv(client, &c, 1, 0);
//把 POST 数据写入 cgi_input,现在重定向到 STDIN
write(cgi_input[1], &c, 1);
}
while (read(cgi_output[0], &c, 1) > 0)
send(client, &c, 1, 0);
//关闭相应管道
close(cgi_output[0]);
close(cgi_input[1]);
//等待子进程
waitpid(pid, &status, 0);
}
}
/**********************************************************************/
/* Get a line from a socket, whether the line ends in a newline,
* carriage return, or a CRLF combination. Terminates the string read
* with a null character. If no newline indicator is found before the
* end of the buffer, the string is terminated with a null. If any of
* the above three line terminators is read, the last character of the
* string will be a linefeed and the string will be terminated with a
* null character.
* Parameters: the socket descriptor
* the buffer to save the data in
* the size of the buffer
* Returns: the number of bytes stored (excluding null) */
/**********************************************************************/
int get_line(int sock, char *buf, int size)
{
int i = 0;
char c = '\0';
int n;
//以换行符结束
while ((i < size - 1) && (c != '\n'))
{
//按字节接收
n = recv(sock, &c, 1, 0);
/* DEBUG printf("%02X\n", c); */
if (n > 0)
{
//收到\r则需判断下一字节,可能是\r\n
if (c == '\r')
{
//使用 MSG_PEEK 标志使下一次读取依然可以得到这次读取的内容,可认为接收窗口不滑动
n = recv(sock, &c, 1, MSG_PEEK);
/* DEBUG printf("%02X\n", c); */
//下一字节是\n,则把它接收了
if ((n > 0) && (c == '\n'))
recv(sock, &c, 1, 0);
else
//否则直接置为\n
c = '\n';
}
//存入缓冲区,i++读下一字节
buf[i] = c;
i++;
}
else
c = '\n';
}
//整行最后置为null
buf[i] = '\0';
//返回数组大小
return(i);
}
/**********************************************************************/
/* Return the informational HTTP headers about a file. */
/* Parameters: the socket to print the headers on
* the name of the file */
/**********************************************************************/
void headers(int client, const char *filename)
{
char buf[1024];
(void)filename; /* could use filename to determine file type */
strcpy(buf, "HTTP/1.0 200 OK\r\n");
send(client, buf, strlen(buf), 0);
//服务器信息
strcpy(buf, SERVER_STRING);
send(client, buf, strlen(buf), 0);
sprintf(buf, "Content-Type: text/html\r\n");
send(client, buf, strlen(buf), 0);
strcpy(buf, "\r\n");
send(client, buf, strlen(buf), 0);
}
/**********************************************************************/
/* Give a client a 404 not found status message. */
/**********************************************************************/
void not_found(int client)
{
char buf[1024];
sprintf(buf, "HTTP/1.0 404 NOT FOUND\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, SERVER_STRING);
send(client, buf, strlen(buf), 0);
sprintf(buf, "Content-Type: text/html\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "<HTML><TITLE>Not Found</TITLE>\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "<BODY><P>The server could not fulfill\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "your request because the resource specified\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "is unavailable or nonexistent.\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "</BODY></HTML>\r\n");
send(client, buf, strlen(buf), 0);
}
/**********************************************************************/
/* Send a regular file to the client. Use headers, and report
* errors to client if they occur.
* Parameters: a pointer to a file structure produced from the socket
* file descriptor
* the name of the file to serve */
/**********************************************************************/
void serve_file(int client, const char *filename)
{
FILE *resource = NULL;
int numchars = 1;
char buf[1024];
buf[0] = 'A'; buf[1] = '\0';
while ((numchars > 0) && strcmp("\n", buf)) /* read & discard headers */
numchars = get_line(client, buf, sizeof(buf));
//打开sever的文件
resource = fopen(filename, "r");
if (resource == NULL)
not_found(client);
else
{
//添加header
headers(client, filename);
//复制文件到client
cat(client, resource);
}
fclose(resource);
}
/**********************************************************************/
/* This function starts the process of listening for web connections
* on a specified port. If the port is 0, then dynamically allocate a
* port and modify the original port variable to reflect the actual
* port.
* Parameters: pointer to variable containing the port to connect on
* Returns: the socket */
/**********************************************************************/
int startup(u_short *port)
{
int httpd = 0;
struct sockaddr_in name;
//建立套接字
httpd = socket(PF_INET, SOCK_STREAM, 0);
if (httpd == -1)
error_die("socket");
memset(&name, 0, sizeof(name));
name.sin_family = AF_INET;
name.sin_port = htons(*port);
name.sin_addr.s_addr = htonl(INADDR_ANY);
if (bind(httpd, (struct sockaddr *)&name, sizeof(name)) < 0)
error_die("bind");
//port为0,则通过getsockname动态获取一个(bind之后)
if (*port == 0) /* if dynamically allocating a port */
{
int namelen = sizeof(name);
if (getsockname(httpd, (struct sockaddr *)&name, &namelen) == -1)
error_die("getsockname");
*port = ntohs(name.sin_port);
}
//开始监听
if (listen(httpd, 5) < 0)
error_die("listen");
//返回socket id
return(httpd);
}
/**********************************************************************/
/* Inform the client that the requested web method has not been
* implemented.
* Parameter: the client socket */
/**********************************************************************/
void unimplemented(int client)
{
char buf[1024];
sprintf(buf, "HTTP/1.0 501 Method Not Implemented\r\n");
send(client, buf, strlen(buf), 0);
//sever信息
sprintf(buf, SERVER_STRING);
send(client, buf, strlen(buf), 0);
sprintf(buf, "Content-Type: text/html\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "<HTML><HEAD><TITLE>Method Not Implemented\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "</TITLE></HEAD>\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "<BODY><P>HTTP request method not supported.\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "</BODY></HTML>\r\n");
send(client, buf, strlen(buf), 0);
}
/**********************************************************************/
int main(void)
{
int server_sock = -1;
u_short port = 0;
int client_sock = -1;
struct sockaddr_in client_name;
int client_name_len = sizeof(client_name);
pthread_t newthread;
//在port端口处建立http服务,如若port为0随机分配一个端口,sever_sock接收套接字id
server_sock = startup(&port);
printf("httpd running on port %d\n", port);
//不停循环等待接收客户端请求
while (1)
{
//accept客户端请求
client_sock = accept(server_sock,(struct sockaddr *)&client_name, &client_name_len);
//accept失败,打印错误
if (client_sock == -1)
error_die("accept");
/* accept_request(client_sock); */
//创建线程,accept_request处理请求
if (pthread_create(&newthread , NULL, accept_request, client_sock) != 0)
perror("pthread_create");
}
//关闭套接字
close(server_sock);
return(0);
}

最后Google看到一个流程的详细总结,看完了上面的一切,看看总结也是极好的2333

1) 服务器启动,在指定端口或随机选取端口绑定 httpd 服务。

​ (2)收到一个 HTTP 请求时(其实就是 listen 的端口 accpet 的时候),派生一个线程运行 accept_request 函数。

​ (3)取出 HTTP 请求中的 method (GET 或 POST) 和 url,。对于 GET 方法,如果有携带参数,则 query_string 指针指向 url 中 ? 后面的 GET 参数。

​ (4) 格式化 url 到 path 数组,表示浏览器请求的服务器文件路径,在 tinyhttpd 中服务器文件是在 htdocs 文件夹下。当 url 以 / 结尾,或 url 是个目录,则默认在 path 中加上 index.html,表示访问主页。

​ (5)如果文件路径合法,对于无参数的 GET 请求,直接输出服务器文件到浏览器,即用 HTTP 格式写到套接字上,跳到(10)。其他情况(带参数 GET,POST 方式,url 为可执行文件),则调用 excute_cgi 函数执行 cgi 脚本。

​ (6)读取整个 HTTP 请求并丢弃,如果是 POST 则找出 Content-Length. 把 HTTP 200 状态码写到套接字。

​ (7) 建立两个管道,cgi_input 和 cgi_output, 并 fork 一个进程。

​ (8) 在子进程中,把 STDOUT 重定向到 cgi_outputt 的写入端,把 STDIN 重定向到 cgi_input 的读取端,关闭 cgi_input 的写入端 和 cgi_output 的读取端,设置 request_method 的环境变量,GET 的话设置 query_string 的环境变量,POST 的话设置 content_length 的环境变量,这些环境变量都是为了给 cgi 脚本调用,接着用 execl 运行 cgi 程序。

​ (9) 在父进程中,关闭 cgi_input 的读取端 和 cgi_output 的写入端,如果 POST 的话,把 POST 数据写入 cgi_input,已被重定向到 STDIN,读取 cgi_output 的管道输出到客户端,该管道输入是 STDOUT。接着关闭所有管道,等待子进程结束。

(10) 关闭与浏览器的连接,完成了一次 HTTP 请求与回应,因为 HTTP 是无连接的。