
Composer and PSR-4

· 10 min read

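This post mainly covers Composer's PSR-4 autoloading. PHP carries a lot of historical baggage and has had to make many compromises; namespaces and autoloading are among them.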

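The big pitfall of include and require

Example

The difference between include and require may still be a favorite question of some interviewers, but both share one nasty pitfall: a relative path passed to include or require is resolved against the current working directory, not against the file that contains the statement.

For example, suppose we are in the directory that contains index.php: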

# tree

test
├── index.php
├── relative.php
└── subdir
    ├── a.php
    └── relative.php

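index.php is very simple: it just includes one path. (The source of subdir/a.php is not shown; judging from the output below, it in turn includes ./relative.php.)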

<?php
include "./subdir/a.php";

Each of the two relative.php files prints its own path. subdir/relative.php:

<?php
echo "test/subdir/relative.php";

relative.php:

<?php
echo "test/relative.php";

So if we run it from the same directory as index.php, which relative.php gets included?

The answer:

# php index.php 
test/relative.php

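It included the relative.php that sits next to index.php, not the one next to a.php.

And if you run it from one level up, that is, from the directory that contains test/, it fails outright: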

php test/index.php 
PHP Warning: include(./subdir/a.php): failed to open stream: No such file or directory in /root/test/index.php on line 2
PHP Warning: include(): Failed opening './subdir/a.php' for inclusion (include_path='.:/usr/share/php') in /root/test/index.php on line 2

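All of this happens because, for a relative path, PHP calls getcwd() to obtain the working directory. You can check your own working directory with the shell's pwd.

Because of this trap, PHP code that calls include by hand with relative paths becomes very hard to maintain, so we should avoid relative-path includes as much as possible. You know how it goes: once you write one, someone will eventually copy and paste your code, include statement and all, and that is where the next mess begins.

Autoloading mitigates this pitfall because it reduces the need for hand-written relative includes. Autoloaders typically include files like this: include __DIR__ . '/aaa/bbb/ccc.php'. Since that path is absolute rather than relative, it is much safer.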
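The same trap exists in any language that resolves relative paths against the process working directory. Here is a minimal Python sketch of the test/ layout above, assuming that Python's open() stands in for PHP's include; the helper names read_relative, read_from_base and demo are made up for illustration.

```python
import os
import tempfile
from pathlib import Path


def read_relative(path):
    """Open a relative path; the OS resolves it against os.getcwd()."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def read_from_base(base_dir, path):
    """Open a path anchored to a fixed directory (the __DIR__ idea)."""
    with open(Path(base_dir) / path, encoding="utf-8") as handle:
        return handle.read()


def demo():
    """Rebuild test/ and test/subdir/ in a temp dir and read both ways."""
    old_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        test = Path(tmp) / "test"
        subdir = test / "subdir"
        subdir.mkdir(parents=True)
        (test / "relative.php").write_text("test/relative.php", encoding="utf-8")
        (subdir / "relative.php").write_text("test/subdir/relative.php", encoding="utf-8")
        try:
            os.chdir(test)  # like running `php index.php` inside test/
            via_cwd = read_relative("./relative.php")
            via_base = read_from_base(subdir, "relative.php")
        finally:
            os.chdir(old_cwd)  # leave the temp dir before it is deleted
    return via_cwd, via_base


print(demo())
```

The cwd-based read picks up test/relative.php no matter which file "asked" for it, while the anchored read always finds the file next to its base directory.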

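Working directory in CLI mode versus CGI/FastCGI

The CLI SAPI does not change the current directory to the directory of the executed script.

The following example shows the difference between this module and the CGI SAPI: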

<?php
// a simple test program named test.php
echo getcwd(), "\n";
?>

With the CGI version, the output is:

$ pwd
/tmp

$ php-cgi -f another_directory/test.php
/tmp/another_directory

Clearly, PHP changed the current directory to the directory of the script it just ran.

With the CLI SAPI, we get:

$ pwd
/tmp

$ php -q another_directory/test.php
/tmp

The opcode behind include and require, and getcwd

After lexical and syntax analysis, require and include both compile to opcode 73, ZEND_INCLUDE_OR_EVAL. When that opcode runs with a relative path, it eventually calls:

VCWD_GETCWD(cwd, MAXPATHLEN)

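which in turn ends up calling glibc's getcwd.

The getcwd system call

Each process's task_struct has an fs_struct, which holds pwd and root. When you call getcwd(), glibc issues the getcwd system call, which reads the pwd field of that fs_struct and returns it.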

static void get_fs_root_and_pwd_rcu(struct fs_struct *fs, struct path *root,
                                    struct path *pwd)
{
    ...
    *root = fs->root;
    *pwd = fs->pwd;
    ...
}

SYSCALL_DEFINE2(getcwd, char __user *, buf, unsigned long, size)
{
    int error;
    struct path pwd, root;
    char *page = __getname();

    if (!page)
        return -ENOMEM;

    rcu_read_lock();
    /* every process has an associated fs_struct; its root and pwd fields
       describe the root directory and the current working directory */
    get_fs_root_and_pwd_rcu(current->fs, &root, &pwd);

    char *cwd = page + PATH_MAX;
    int buflen = PATH_MAX;

    prepend(&cwd, &buflen, "\0", 1);
    error = prepend_path(&pwd, &root, &cwd, &buflen);
    ...
    copy_to_user(buf, cwd, len) /* copy the resulting path back to user space */
    ...

}

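Summary of include and require

When include or require is given a relative path, that path is resolved against getcwd(), the current working directory.

The CGI and CLI modes differ:

  • In CLI mode, the working directory is whatever the shell's pwd reports.
  • The CGI SAPI differs from the CLI SAPI in that it first changes the working directory to the directory of the entry PHP script.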

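Namespaces

What is a namespace?

It is really just a set of qualifiers in front of a name.

Why do we need namespaces?
Because we reuse other people's code. Suppose you pull in a library that defines a function called hello, and you have written a hello function too. That is a conflict. With namespaces, as long as everyone's namespace is different, identical function names no longer collide.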

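Autoloading

Now for autoloading. What is it?

It is essentially a dynamic include, an include performed at runtime.

How do we normally include files?

By hand, one file after another, as in the example above. That carries at least two risks:

  • A newcomer uses a relative path in an include.
  • Files have to be pulled in manually, and a plain include can load the same file twice, so you need include_once or require_once.

Of the two, the relative path is by far the more dangerous. Guarding against repeated inclusion only costs an extra check and a little performance.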

spl_autoload_register

The spl_autoload_* functions are the core of PHP autoloading, and registering a loader is done with spl_autoload_register.


/* {{{ proto bool spl_autoload_register([mixed autoload_function [, bool throw [, bool prepend]]])
Register given function as __autoload() implementation */
PHP_FUNCTION(spl_autoload_register)
{

...

if (zend_hash_add_mem(SPL_G(autoload_functions), lc_name, &alfi, sizeof(autoload_func_info)) == NULL) {
...
}
...
} /* }}} */

Later, the lookup code uses zend_hash_exists(EG(class_table), lc_name) to check whether the class is already in the global EG(class_table).
spl_autoload_call below is one example:

PHP_FUNCTION(spl_autoload_call)
{

    if (SPL_G(autoload_functions)) { /* the table filled by spl_autoload_register */
        int l_autoload_running = SPL_G(autoload_running);
        SPL_G(autoload_running) = 1;
        lc_name = zend_string_alloc(Z_STRLEN_P(class_name), 0);
        zend_str_tolower_copy(ZSTR_VAL(lc_name), Z_STRVAL_P(class_name), Z_STRLEN_P(class_name));
        zend_hash_internal_pointer_reset_ex(SPL_G(autoload_functions), &pos);
        /* loop over the registered callbacks */
        while (zend_hash_get_current_key_ex(SPL_G(autoload_functions), &func_name, &num_idx, &pos) == HASH_KEY_IS_STRING) {
            alfi = zend_hash_get_current_data_ptr_ex(SPL_G(autoload_functions), &pos);
            /* invoke the registered callback */
            zend_call_method(Z_ISUNDEF(alfi->obj)? NULL : &alfi->obj, alfi->ce, &alfi->func_ptr, ZSTR_VAL(func_name), ZSTR_LEN(func_name), retval, 1, class_name, NULL);

            if (zend_hash_exists(EG(class_table), lc_name)) { /* a callback defined the class: leave the loop */

                break;
            }
            zend_hash_move_forward_ex(SPL_G(autoload_functions), &pos);
        }
        ...
    }
    ..
} /* }}} */

The autoload flow is actually simple. Here is an autoload example:

<?php
// test.php
spl_autoload_register(function ($class) {
include "$class" . '.php';
});
$obj = new ClassA();

and the class file ClassA.php:

<?php
class ClassA{}

Here is the backtrace:

(gdb) bt
#0 zif_spl_autoload_call (execute_data=0x7fffef61e0a0, return_value=0x7fffffffa2f0) at /home/dinosaur/Downloads/php-7.2.2/ext/spl/php_spl.c:393
#1 0x0000000000932807 in zend_call_function (fci=0x7fffffffa330, fci_cache=0x7fffffffa300) at /home/dinosaur/Downloads/php-7.2.2/Zend/zend_execute_API.c:833
#2 0x0000000000933000 in zend_lookup_class_ex (name=0x7fffe6920b58, key=0x7fffe70e63f0, use_autoload=1) at /home/dinosaur/Downloads/php-7.2.2/Zend/zend_execute_API.c:990
#3 0x0000000000933dbd in zend_fetch_class_by_name (class_name=0x7fffe6920b58, key=0x7fffe70e63f0, fetch_type=512) at /home/dinosaur/Downloads/php-7.2.2/Zend/zend_execute_API.c:1425
#4 0x00000000009b7e46 in ZEND_NEW_SPEC_CONST_HANDLER () at /home/dinosaur/Downloads/php-7.2.2/Zend/zend_vm_execute.h:3211
#5 0x0000000000a380a4 in execute_ex (ex=0x7fffef61e030) at /home/dinosaur/Downloads/php-7.2.2/Zend/zend_vm_execute.h:59929
#6 0x0000000000a3d0ab in zend_execute (op_array=0x7fffef683300, return_value=0x0) at /home/dinosaur/Downloads/php-7.2.2/Zend/zend_vm_execute.h:63760
#7 0x000000000094cd22 in zend_execute_scripts (type=8, retval=0x0, file_count=3) at /home/dinosaur/Downloads/php-7.2.2/Zend/zend.c:1496
#8 0x00000000008b0b4a in php_execute_script (primary_file=0x7fffffffcaa0) at /home/dinosaur/Downloads/php-7.2.2/main/main.c:2590
#9 0x0000000000a3fd23 in do_cli (argc=2, argv=0x1441a60) at /home/dinosaur/Downloads/php-7.2.2/sapi/cli/php_cli.c:1011
#10 0x0000000000a40ee0 in main (argc=2, argv=0x1441a60) at /home/dinosaur/Downloads/php-7.2.2/sapi/cli/php_cli.c:1404

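So the core of autoloading is this: when a class lookup misses, the engine calls spl_autoload_call, which invokes the registered autoload callbacks one by one until either one of them defines the class, at which point it returns immediately, or every callback has been tried without success.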
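The loop can be modeled in a few lines. This is a toy Python sketch, not PHP's implementation: AutoloadRegistry and the three loader functions are invented names, with a dict standing in for EG(class_table) and a list for SPL_G(autoload_functions).

```python
class AutoloadRegistry:
    """Toy model of SPL_G(autoload_functions) plus EG(class_table)."""

    def __init__(self):
        self.class_table = {}  # lowercased class name -> class object
        self.loaders = []      # callbacks, in registration order

    def register(self, loader):
        self.loaders.append(loader)

    def lookup(self, class_name):
        key = class_name.lower()  # the C code lowercases the name too
        if key not in self.class_table:
            for loader in self.loaders:
                loader(self, class_name)
                if key in self.class_table:
                    break  # a callback defined the class: stop iterating
        return self.class_table.get(key)


calls = []


def miss_loader(registry, class_name):
    calls.append("miss")  # knows nothing about this class


def hit_loader(registry, class_name):
    calls.append("hit")
    registry.class_table[class_name.lower()] = type(class_name, (), {})


def never_reached(registry, class_name):
    calls.append("never")


registry = AutoloadRegistry()
registry.register(miss_loader)
registry.register(hit_loader)
registry.register(never_reached)
cls = registry.lookup("ClassA")
print(cls.__name__, calls)
```

The third loader is never called, because the second one already put the class into the table.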

The PSR standards and PSR-4

PSR is short for PHP Standards Recommendations. PSR-4 and PSR-0 both deal with autoloading.

In essence, PSR-4 specifies a simple substitution:

Fully qualified class name: \Aura\Web\Response\Status
Namespace prefix: Aura\Web
Base directory: /path/to/aura-web/src/
Resulting file path: /path/to/aura-web/src/Response/Status.php

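PSR-4 specifies how to locate the file for a class: the namespace prefix of the fully qualified class name is replaced by a base directory, and the rest of the name becomes the relative file path. For example, suppose the class you want to load is: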

\Aura\Web\Response\Status

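You can map the prefix Aura\Web to /path/to/aura-web/src/, and the class \Aura\Web\Response\Status will then be looked up in /path/to/aura-web/src/Response/Status.php.

You could say this is a bit like routing configuration in nginx. Here is an nginx configuration: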

location ^~ /images/ {
    # Matches any request that starts with /images/ and stops searching; no regular expressions are tested.
}

The PSR-4 mapping for \Aura\Web\Response\Status above then looks a bit like this:

location ^~ /Aura/Web/ {
    root /path/to/aura-web/src/;
}
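The substitution itself fits in one small function. This is a minimal sketch of the prefix-to-directory rule only, assuming a single prefix; psr4_path is a hypothetical helper, not part of Composer, which additionally handles multiple prefixes and fallback directories.

```python
def psr4_path(class_name, prefix, base_dir):
    """Map a fully qualified class name to a file path, PSR-4 style.

    Returns None when the class does not live under the given prefix.
    """
    fqcn = class_name.lstrip("\\")        # drop the leading backslash
    prefix = prefix.strip("\\") + "\\"    # normalize to "Aura\Web\"
    if not fqcn.startswith(prefix):
        return None
    relative = fqcn[len(prefix):]         # "Response\Status"
    return base_dir.rstrip("/") + "/" + relative.replace("\\", "/") + ".php"


print(psr4_path("\\Aura\\Web\\Response\\Status", "Aura\\Web", "/path/to/aura-web/src/"))
```

A class outside the prefix, such as \Other\Thing, yields None, which is the point where a real autoloader would try the next registered prefix.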


PHP 7 exceptions, errors, and related pitfalls

· 3 min read

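PHP has a great many pitfalls: some differ between older and newer versions, some come from historical baggage, and some arise when knowledge carried over from other languages does not apply.

Background

Throwable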

PHP 7 changes how most errors are reported by PHP. Instead of reporting errors through the traditional error reporting mechanism used by PHP 5, most errors are now reported by throwing Error exceptions.

In short: where PHP 5 went through its traditional error-reporting mechanism, PHP 7 reports most errors by throwing Error exceptions, which can be caught.

Into the pitfalls

Example 1

  • In PHP 7, a modulo-by-zero error becomes an exception
<?php
// test.php
try {
echo 1%0;
} catch (DivisionByZeroError $e) {
echo "bbb";
}
?>

Then run it:

php test.php 
bbb

It prints bbb, which means the error was caught by the try/catch.

Let's first look at how PHP manages to catch this error.

The backtrace:

Breakpoint 1, zend_throw_exception_ex (exception_ce=0x14cfe70, code=0, format=0x1087ea4 "Modulo by zero") at /home/dinosaur/Downloads/php-7.2.2/Zend/zend_exceptions.c:913
913 {
(gdb) bt
#0 zend_throw_exception_ex (exception_ce=0x14cfe70, code=0, format=0x1087ea4 "Modulo by zero") at /home/dinosaur/Downloads/php-7.2.2/Zend/zend_exceptions.c:913
#1 0x00000000009b9feb in ZEND_MOD_SPEC_CONST_CONST_HANDLER () at /home/dinosaur/Downloads/php-7.2.2/Zend/zend_vm_execute.h:4270
#2 0x0000000000a381e4 in execute_ex (ex=0x7fffef61e030) at /home/dinosaur/Downloads/php-7.2.2/Zend/zend_vm_execute.h:59989
#3 0x0000000000a3d0ab in zend_execute (op_array=0x7fffef684300, return_value=0x0) at /home/dinosaur/Downloads/php-7.2.2/Zend/zend_vm_execute.h:63760
#4 0x000000000094cd22 in zend_execute_scripts (type=8, retval=0x0, file_count=3) at /home/dinosaur/Downloads/php-7.2.2/Zend/zend.c:1496
#5 0x00000000008b0b4a in php_execute_script (primary_file=0x7fffffffca10) at /home/dinosaur/Downloads/php-7.2.2/main/main.c:2590
#6 0x0000000000a3fd23 in do_cli (argc=2, argv=0x1441f40) at /home/dinosaur/Downloads/php-7.2.2/sapi/cli/php_cli.c:1011
#7 0x0000000000a40ee0 in main (argc=2, argv=0x1441f40) at /home/dinosaur/Downloads/php-7.2.2/sapi/cli/php_cli.c:1404


Example 2

PHP 7:

<?php
try {
echo 1/0; // modulo changed to division
} catch (DivisionByZeroError $e) {
echo "bbb";
}
?>

Output:

Warning: Division by zero in /home/dinosaur/test/test.php on line 3
INF

Notice the difference?

① A warning was raised, and the try/catch did not catch it.

② The PHP script kept running (and printed INF).

Let's look at the backtrace:

(gdb) bt
#0 zend_error (type=2, format=0x107dcfc "Division by zero") at /home/dinosaur/Downloads/php-7.2.2/Zend/zend.c:1105
#1 0x000000000093fb5b in div_function (result=0x7fffef61e090, op1=0x7fffe70e61c0, op2=0x7fffe70e61d0) at /home/dinosaur/Downloads/php-7.2.2/Zend/zend_operators.c:1173
#2 0x00000000009a82a0 in fast_div_function (result=0x7fffef61e090, op1=0x7fffe70e61c0, op2=0x7fffe70e61d0) at /home/dinosaur/Downloads/php-7.2.2/Zend/zend_operators.h:738
#3 0x00000000009b9f22 in ZEND_DIV_SPEC_CONST_CONST_HANDLER () at /home/dinosaur/Downloads/php-7.2.2/Zend/zend_vm_execute.h:4251
#4 0x0000000000a381d4 in execute_ex (ex=0x7fffef61e030) at /home/dinosaur/Downloads/php-7.2.2/Zend/zend_vm_execute.h:59986
#5 0x0000000000a3d0ab in zend_execute (op_array=0x7fffef684300, return_value=0x0) at /home/dinosaur/Downloads/php-7.2.2/Zend/zend_vm_execute.h:63760
#6 0x000000000094cd22 in zend_execute_scripts (type=8, retval=0x0, file_count=3) at /home/dinosaur/Downloads/php-7.2.2/Zend/zend.c:1496
#7 0x00000000008b0b4a in php_execute_script (primary_file=0x7fffffffca10) at /home/dinosaur/Downloads/php-7.2.2/main/main.c:2590
#8 0x0000000000a3fd23 in do_cli (argc=2, argv=0x1441f40) at /home/dinosaur/Downloads/php-7.2.2/sapi/cli/php_cli.c:1011
#9 0x0000000000a40ee0 in main (argc=2, argv=0x1441f40) at /home/dinosaur/Downloads/php-7.2.2/sapi/cli/php_cli.c:1404

Follow zend_error all the way down and you end up at the write system call.

if (Z_LVAL_P(op2) == 0) {
    zend_error(E_WARNING, "Division by zero");
    ZVAL_DOUBLE(result, ((double) Z_LVAL_P(op1) / (double) Z_LVAL_P(op2)));
    return SUCCESS;
}

After zend_error, the function simply returns, so the rest of the program keeps running.

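Comparison and summary

In PHP 7, 1/0 does not throw an exception; it raises a warning and execution continues. (PHP 8 changed this: the / operator now throws DivisionByZeroError as well.)

The pitfalls:

  • Not every error can be caught.
  • An error that is not caught does not stop the script; execution continues.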
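To make the two control flows concrete, here is a small Python model of the PHP 7 behavior described above. It is only an imitation: php7_mod, php7_div, run and the DivisionByZeroError class are stand-ins written for this sketch, with Python's warnings module playing the role of E_WARNING.

```python
import math
import warnings


class DivisionByZeroError(ArithmeticError):
    """Stand-in for PHP's DivisionByZeroError."""


def php7_mod(a, b):
    """Model of %: a zero divisor throws, so try/catch can see it."""
    if b == 0:
        raise DivisionByZeroError("Modulo by zero")
    return int(math.fmod(a, b))


def php7_div(a, b):
    """Model of / in PHP 7: warn, then return the double result anyway."""
    if b == 0:
        warnings.warn("Division by zero", RuntimeWarning)
        if a == 0:
            return math.nan  # 0.0 / 0.0 in IEEE 754 double arithmetic
        return math.copysign(math.inf, a)
    return a / b


def run(op):
    """The try/catch from both examples, applied to op(1, 0)."""
    try:
        return op(1, 0)
    except DivisionByZeroError:
        return "bbb"


print(run(php7_mod))
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    print(run(php7_div))
```

The first call lands in the except branch; the second never raises, so the handler is skipped and the infinite result flows on to the caller.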

lex and yacc

· 5 min read

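Lexical analysis

lex is mainly used for lexical analysis; put simply, it splits the input into tokens.
Each call to yylex returns one token. Lucene's standard tokenizer is also produced by a lex-style tool.
Once Lucene has tokenized the text, it builds an inverted index.

A lex example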


%{
%}
%%

end { ECHO ;return 2 ;}

aaa {ECHO ;}

.|\n {}

%%
int main(){
yylex();
}
int yywrap(){
return 1;
}

Then run:

lex test.lex
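For readers without lex at hand, the three rules above can be imitated with a regular expression. This Python sketch is an approximation of what the generated scanner does, not its actual code: it collects the text that ECHO would print instead of writing to stdout, and the (return code, output) tuple is a convention chosen for this example.

```python
import re

# Alternatives are tried left to right, so the two keywords win over the
# catch-all "any single character" rule, as in the lex file above.
RULES = re.compile(r"end|aaa|.", re.DOTALL)


def yylex(text):
    """Scan text and return (return_code, echoed_output)."""
    echoed = []
    for match in RULES.finditer(text):
        lexeme = match.group()
        if lexeme == "end":      # end { ECHO; return 2; }
            echoed.append(lexeme)
            return 2, "".join(echoed)
        if lexeme == "aaa":      # aaa { ECHO; }
            echoed.append(lexeme)
        # .|\n {}  everything else is silently dropped
    return 0, "".join(echoed)    # end of input: yywrap() returned 1


print(yylex("xxaaayyend zz"))
```

Scanning stops at the first "end", so the trailing " zz" is never looked at.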

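Syntax analysis

What is syntax analysis?

Syntax analysis is driven by a special rule system, a grammar. A parser for a context-free grammar works like a pushdown automaton, so it can express things that regular expressions cannot, such as arbitrarily nested structures.

How does the parser choose? A key problem in syntax analysis is picking one production, and exactly one, among several candidates.

bison is the GNU version of yacc; it is the open-source yacc implementation on Linux.
Like flex, a bison file is divided into three parts separated by %%: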

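...definitions section...

%%

...rules section...

%%

...user subroutines section...

The first part mainly holds C declarations, token declarations, and declarations for nonterminals.
The second part holds the productions and their semantic actions.
The third part holds the supporting C functions that run the parser.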

/* Infix notation calculator--calc */

%{
#define YYSTYPE double
#include <ctype.h>
#include <math.h>
#include <stdio.h>

/* prototypes, so that yyparse can call these before they are defined */
int yylex (void);
void yyerror (const char *s);
%}

/* BISON Declarations */
%token NUM
%left '-' '+'
%left '*' '/'
%left NEG     /* negation--unary minus */
%right '^'    /* exponentiation */

/* Grammar follows */
%%
input:    /* empty string */
        | input line
;

line:     '\n'
        | exp '\n'  { printf ("\t%.10g\n", $1); }
;

exp:      NUM                { $$ = $1;           }
        | exp '+' exp        { $$ = $1 + $3;      }
        | exp '-' exp        { $$ = $1 - $3;      }
        | exp '*' exp        { $$ = $1 * $3;      }
        | exp '/' exp        { $$ = $1 / $3;      }
        | '-' exp  %prec NEG { $$ = -$2;          }
        | exp '^' exp        { $$ = pow ($1, $3); }
        | '(' exp ')'        { $$ = $2;           }
;
%%
int
main (void)
{
  return yyparse ();
}

/* Called by yyparse on error */
void
yyerror (const char *s)
{
  printf ("%s\n", s);
}

/* Hand-written lexer: returns NUM for numbers, 0 at end of file,
   and the character itself for everything else */
int
yylex (void)
{
  int c;

  /* skip white space */
  while ((c = getchar ()) == ' ' || c == '\t')
    ;
  /* process numbers */
  if (c == '.' || isdigit (c))
    {
      ungetc (c, stdin);
      scanf ("%lf", &yylval);
      return NUM;
    }
  /* return end-of-file */
  if (c == EOF)
    return 0;
  /* return single chars */
  return c;
}

Generate and compile:

bison parse.y
gcc parse.tab.c -lm
# ./a.out
3+2
5
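To see what the precedence declarations buy you, here is the same calculator written by hand in Python. Note the swapped technique: bison generates a table-driven LALR(1) parser, while this sketch uses precedence climbing. The names tokenize, evaluate, BINARY and NEG_PRECEDENCE are my own; the precedence numbers simply follow the order of the %left/%right lines.

```python
import math
import operator
import re

TOKEN = re.compile(r"\s*(\d+\.?\d*|\.\d+|[-+*/^()])")

# operator -> (precedence, associativity, function), mirroring %left/%right
BINARY = {
    "+": (1, "left", operator.add), "-": (1, "left", operator.sub),
    "*": (2, "left", operator.mul), "/": (2, "left", operator.truediv),
    "^": (4, "right", math.pow),
}
NEG_PRECEDENCE = 3  # %left NEG sits between '*' '/' and '^'


def tokenize(source):
    source = source.strip()
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(source):
    tokens = tokenize(source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        pos += 1
        return tokens[pos - 1]

    def primary():
        token = advance()
        if token == "(":
            value = expression(1)
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token == "-":  # unary minus binds looser than '^'
            return -expression(NEG_PRECEDENCE)
        if token in BINARY or token == ")":
            raise SyntaxError(f"unexpected token {token!r}")
        return float(token)

    def expression(min_precedence):
        left = primary()
        while peek() in BINARY and BINARY[peek()][0] >= min_precedence:
            precedence, assoc, apply = BINARY[advance()]
            # left-associative operators must not re-enter at their own level
            next_min = precedence + 1 if assoc == "left" else precedence
            left = apply(left, expression(next_min))
        return left

    result = expression(1)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result


print(evaluate("3+2"))
```

Everything the hand-written version encodes in BINARY, NEG_PRECEDENCE and the next_min rule is what the four %left/%right lines tell bison declaratively.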

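The following describes some commonly used declarations.

%token

%token goes in the definitions section:

%token NUMBER

When the C file is generated, this becomes

#define NUMBER 258 

So %token can be seen as shorthand that saves you from writing #define by hand.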

YYSTYPE

In real parsers, the values of different symbols use different data types, e.g., int and double for numeric symbols, char * for strings, and pointers to structures for higher level symbols. If you have multiple value types, you have to list all the value types used in a parser so that yacc can create a C union typedef called YYSTYPE to contain them. (Fortunately, yacc gives you a lot of help ensuring that you use the right value type for each symbol .)

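Quoted from lex & yacc.

YYSTYPE is a macro for a type; its purpose is to collect the value types used by terminals and nonterminals.

%union is shorthand for defining YYSTYPE.

%token is shorthand for defining the token enumeration values.

%type is shorthand for declaring the type of a nonterminal.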

%union {
double dval;
int vblno;
}

%token NUMBER

Use the --defines option to generate a header file:

# bison --defines test.y

The generated file will contain the following:

enum yytokentype{
NUMBER = 258
};

union YYSTYPE{
double dval;
int vblno;
};

If you give the tokens types:

%token <vblno> NAME
%token <dval> NUMBER
%type <dval> expression

In action code, yacc automatically qualifies symbol value references with the appropriate field name, e.g., if the third symbol is a NUMBER, a reference to $3 acts like $3.dval.

Quoted from lex & yacc.

In other words, inside a semantic action, if the third symbol is a NUMBER, then $3 is equivalent to $3.dval.


https_tls_ssl

· 9 min read

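I recently got a VPS on Huawei Cloud and wanted to put up a simple website, so I went through registering a domain and obtaining an HTTPS certificate.

After a lot of fiddling, it turned out the site still could not be reached.

The actual cause was that the domain had no ICP filing (备案); my certificate configuration was fine.

Troubleshooting

Locating the problem with curl

Running curl -v url shows the handshake in detail: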

./curl https://gitlab.shakudada.xyz -v
* STATE: INIT => CONNECT handle 0x1a23898; line 1491 (connection #-5000)
* Added connection 0. The cache now contains 1 members
* STATE: CONNECT => WAITRESOLVE handle 0x1a23898; line 1532 (connection #0)
* Trying 139.9.222.124:443...
* TCP_NODELAY set
* STATE: WAITRESOLVE => WAITCONNECT handle 0x1a23898; line 1611 (connection #0)
* Connected to gitlab.shakudada.xyz (139.9.222.124) port 443 (#0)
* STATE: WAITCONNECT => SENDPROTOCONNECT handle 0x1a23898; line 1667 (connection #0)
* Marked for [keep alive]: HTTP default
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: none
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* STATE: SENDPROTOCONNECT => PROTOCONNECT handle 0x1a23898; line 1682 (connection #0)
* error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol
* Marked for [closure]: Failed HTTPS connection
* multi_done
* Closing connection 0
* The cache now contains 0 members
* Expire cleared (transfer 0x1a23898)
curl: (35) error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol
dinosaur@dinosaur-X550VXK:~/curl/mycurl/bin$

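The connection failed right after the Client Hello, so let's look at the packets with tcpdump:

tcpdump -i wlp3s0  host 139.9.222.124 and  port 443 -A -X

Here 139.9.222.124 is the IP of my server without the filing.

Below is the capture, with the initial TCP three-way handshake removed: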

22:22:23.280367 IP 192.168.1.106.33170 > 139.9.222.124.https: Flags [P.], seq 1:518, ack 1, win 229, options [nop,nop,TS val 434651673 ecr 3365805465], length 517
0x0000: 4500 0239 15ab 4000 4006 f77b c0a8 016a E..9..@.@..{...j
0x0010: 8b09 de7c 8192 01bb f6ab 981d 4d9a ceb8 ...|........M...
0x0020: 8018 00e5 22f2 0000 0101 080a 19e8 4219 ....".........B.
0x0030: c89e 1d99 1603 0102 0001 0001 fc03 0304 ................
0x0040: 6e2a ea14 6844 e2e1 db8c 1ee3 3582 e33f n*..hD......5..?
0x0050: 9128 2ad2 cd1c bac2 1e70 dd4f 6587 d700 .(*......p.Oe...
0x0060: 009e c030 c02c c028 c024 c014 c00a 00a5 ...0.,.(.$......
0x0070: 00a3 00a1 009f 006b 006a 0069 0068 0039 .......k.j.i.h.9
0x0080: 0038 0037 0036 0088 0087 0086 0085 c032 .8.7.6.........2
0x0090: c02e c02a c026 c00f c005 009d 003d 0035 ...*.&.......=.5
0x00a0: 0084 c02f c02b c027 c023 c013 c009 00a4 .../.+.'.#......
0x00b0: 00a2 00a0 009e 0067 0040 003f 003e 0033 .......g.@.?.>.3
0x00c0: 0032 0031 0030 009a 0099 0098 0097 0045 .2.1.0.........E
0x00d0: 0044 0043 0042 c031 c02d c029 c025 c00e .D.C.B.1.-.).%..
0x00e0: c004 009c 003c 002f 0096 0041 c012 c008 .....<./...A....
0x00f0: 0016 0013 0010 000d c00d c003 000a 00ff ................
0x0100: 0100 0135 0000 0019 0017 0000 1467 6974 ...5.........git
0x0110: 6c61 622e 7368 616b 7564 6164 612e 7879 lab.shakudada.xy
0x0120: 7a00 0b00 0403 0001 0200 0a00 1c00 1a00 z...............
0x0130: 1700 1900 1c00 1b00 1800 1a00 1600 0e00 ................
0x0140: 0d00 0b00 0c00 0900 0a00 0d00 2000 1e06 ................
0x0150: 0106 0206 0305 0105 0205 0304 0104 0204 ................
0x0160: 0303 0103 0203 0302 0102 0202 0300 0f00 ................
0x0170: 0101 3374 0000 0010 000b 0009 0868 7474 ..3t.........htt
0x0180: 702f 312e 3100 1500 b000 0000 0000 0000 p/1.1...........
0x0190: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01a0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01b0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01c0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01e0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01f0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x0200: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x0210: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x0220: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x0230: 0000 0000 0000 0000 00 .........
22:22:23.292507 IP 139.9.222.124.https > 192.168.1.106.33170: Flags [FP.], seq 1:650, ack 518, win 8192, length 649
0x0000: 4500 02b1 15ac 4000 f606 4102 8b09 de7c E.....@...A....|
0x0010: c0a8 016a 01bb 8192 4d9a ceb8 f6ab 9a22 ...j....M......"
0x0020: 5019 2000 d25b 0000 4854 5450 2f31 2e31 P....[..HTTP/1.1
0x0030: 2034 3033 2046 6f72 6269 6464 656e 0a43 .403.Forbidden.C
0x0040: 6f6e 7465 6e74 2d54 7970 653a 2074 6578 ontent-Type:.tex
0x0050: 742f 6874 6d6c 3b20 6368 6172 7365 743d t/html;.charset=
0x0060: 7574 662d 380a 5365 7276 6572 3a20 4144 utf-8.Server:.AD
0x0070: 4d2f 322e 312e 310a 436f 6e6e 6563 7469 M/2.1.1.Connecti
0x0080: 6f6e 3a20 636c 6f73 650a 436f 6e74 656e on:.close.Conten
0x0090: 742d 4c65 6e67 7468 3a20 3533 300a 0a3c t-Length:.530..<
0x00a0: 6874 6d6c 3e0a 3c68 6561 643e 0a3c 6d65 html>.<head>.<me
0x00b0: 7461 2068 7474 702d 6571 7569 763d 2243 ta.http-equiv="C
0x00c0: 6f6e 7465 6e74 2d54 7970 6522 2063 6f6e ontent-Type".con
0x00d0: 7465 6e74 3d22 7465 7874 6d6c 3b63 6861 tent="textml;cha
0x00e0: 7273 6574 3d47 4232 3331 3222 202f 3e0a rset=GB2312"./>.
0x00f0: 2020 203c 7374 796c 653e 626f 6479 7b62 ...<style>body{b
0x0100: 6163 6b67 726f 756e 642d 636f 6c6f 723a ackground-color:
0x0110: 2346 4646 4646 467d 3c2f 7374 796c 653e #FFFFFF}</style>
0x0120: 200a 3c74 6974 6c65 3ee9 9d9e e6b3 95e9 ..<title>.......
0x0130: 98bb e696 ad32 3334 3c2f 7469 746c 653e .....234</title>
0x0140: 0a20 203c 7363 7269 7074 206c 616e 6775 ...<script.langu
0x0150: 6167 653d 226a 6176 6173 6372 6970 7422 age="javascript"
0x0160: 2074 7970 653d 2274 6578 742f 6a61 7661 .type="text/java
0x0170: 7363 7269 7074 223e 0a20 2020 2020 2020 script">........
0x0180: 2020 7769 6e64 6f77 2e6f 6e6c 6f61 6420 ..window.onload.
0x0190: 3d20 6675 6e63 7469 6f6e 2028 2920 7b20 =.function.().{.
0x01a0: 0a20 2020 2020 2020 2020 2020 646f 6375 ............docu
0x01b0: 6d65 6e74 2e67 6574 456c 656d 656e 7442 ment.getElementB
0x01c0: 7949 6428 226d 6169 6e46 7261 6d65 2229 yId("mainFrame")
0x01d0: 2e73 7263 3d20 2268 7474 703a 2f2f 3131 .src=."http://11
0x01e0: 342e 3131 352e 3139 322e 3234 363a 3930 4.115.192.246:90
0x01f0: 3830 2f65 7272 6f72 2e68 746d 6c22 3b0a 80/error.html";.
0x0200: 2020 2020 2020 2020 2020 2020 7d0a 3c2f ............}.</
0x0210: 7363 7269 7074 3e20 2020 0a3c 2f68 6561 script>....</hea
0x0220: 643e 0a20 203c 626f 6479 3e0a 2020 2020 d>...<body>.....
0x0230: 3c69 6672 616d 6520 7374 796c 653d 2277 <iframe.style="w
0x0240: 6964 7468 3a31 3030 253b 2068 6569 6768 idth:100%;.heigh
0x0250: 743a 3130 3025 3b22 2069 643d 226d 6169 t:100%;".id="mai
0x0260: 6e46 7261 6d65 2220 7372 633d 2222 2066 nFrame".src="".f
0x0270: 7261 6d65 626f 7264 6572 3d22 3022 2073 rameborder="0".s
0x0280: 6372 6f6c 6c69 6e67 3d22 6e6f 223e 3c2f crolling="no"></
0x0290: 6966 7261 6d65 3e0a 2020 2020 3c2f 626f iframe>.....</bo
0x02a0: 6479 3e0a 2020 2020 2020 3c2f 6874 6d6c dy>.......</html
0x02b0: 3e >
22:22:23.292552 IP 192.168.1.106.33170 > 139.9.222.124.https: Flags [.], ack 651, win 239, options [nop,nop,TS val 434651685 ecr 3365805465], length 0
0x0000: 4500 0034 15ac 4000 4006 f97f c0a8 016a E..4..@.@......j
0x0010: 8b09 de7c 8192 01bb f6ab 9a22 4d9a d142 ...|......."M..B
0x0020: 8010 00ef d4f7 0000 0101 080a 19e8 4225 ..............B%
0x0030: c89e 1d99 ....
22:22:23.292562 IP 139.9.222.124.https > 192.168.1.106.33170: Flags [.], ack 518, win 235, options [nop,nop,TS val 3365805485 ecr 434651673], length 0
0x0000: 4500 0034 1ff1 4000 3106 fe3a 8b09 de7c E..4..@.1..:...|
0x0010: c0a8 016a 01bb 8192 4d9a ceb8 f6ab 9a22 ...j....M......"
0x0020: 8010 00eb d77d 0000 0101 080a c89e 1dad .....}..........
0x0030: 19e8 4219

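The leading bytes 4500 are clearly the start of an IP header: 4 means IPv4, and 5 is the header length in 32-bit words, so the IP header is 5*4=20 bytes long.

IP header

That is: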

	0x0000:  4500 0239 15ab 4000 4006 f77b c0a8 016a  E..9..@.@..{...j
0x0010: 8b09 de7c

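Everything up to de7c is the IP header.

TCP header

	0x0000:  4500 0239 15ab 4000 4006 f77b c0a8 016a  E..9..@.@..{...j
0x0010: 8b09 de7c 8192 01bb <- 01bb is 443, the destination port

1*16*16+11*16+11=443

Version: the IP protocol version. The current version is 4; the next-generation version is 6.

Header length: the length of the IP header, i.e. the fixed part (20 bytes) plus the variable part. The field is 4 bits wide, so its maximum is 1111 in binary, or 15 in decimal, meaning the header can be at most fifteen 32-bit (4-byte) words, i.e. 15*4=60 bytes. Subtracting the 20-byte fixed part leaves at most 40 bytes for the variable part.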
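The arithmetic above is easy to check mechanically. A small Python sketch that decodes the first 24 bytes of the Client Hello packet from the capture; parse_ip_and_ports is a helper written for this post, and it only reads the handful of fields discussed here.

```python
import struct

# First 24 bytes of the Client Hello packet from the capture above:
# the 20-byte IP header followed by the two TCP port fields.
PACKET_HEX = "4500 0239 15ab 4000 4006 f77b c0a8 016a 8b09 de7c 8192 01bb"


def parse_ip_and_ports(hex_dump):
    data = bytes.fromhex(hex_dump.replace(" ", ""))
    version = data[0] >> 4               # high nibble of the first byte
    header_len = (data[0] & 0x0F) * 4    # low nibble counts 32-bit words
    (total_len,) = struct.unpack("!H", data[2:4])
    src_ip = ".".join(str(b) for b in data[12:16])
    dst_ip = ".".join(str(b) for b in data[16:20])
    # the TCP header starts right after the IP header
    src_port, dst_port = struct.unpack("!HH", data[header_len:header_len + 4])
    return {
        "version": version,
        "header_len": header_len,
        "total_len": total_len,
        "src": (src_ip, src_port),
        "dst": (dst_ip, dst_port),
    }


print(parse_ip_and_ports(PACKET_HEX))
```

The decoded addresses and ports match the tcpdump summary line for that packet (192.168.1.106.33170 > 139.9.222.124.https).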

I looked through RFC 8446: the version magic number in a TLS 1.2 Client Hello is 0x0303, and a quick search indeed turns it up:

 struct {
ProtocolVersion legacy_version = 0x0303; /* TLS v1.2 */
Random random;
opaque legacy_session_id<0..32>;
CipherSuite cipher_suites<2..2^16-2>;
opaque legacy_compression_methods<1..2^8-1>;
Extension extensions<8..2^16-1>;
} ClientHello;

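But the reply came back as plaintext (an HTTP/1.1 403 Forbidden page), which is clearly not what an ordinary failed connection looks like.

So the connection was blocked based on the SNI.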
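For reference, the 0x0303 can be located in the capture by decoding the start of the TCP payload of the first packet (offset 0x34, right after the 20-byte IP header and the 32-byte TCP header). A Python sketch; parse_client_hello_prefix is a name made up here, and it reads only the TLS record header, the handshake header, and the legacy_version field.

```python
# First 11 bytes of the TCP payload of the Client Hello packet above.
PAYLOAD_HEX = "16 03 01 02 00 01 00 01 fc 03 03"


def parse_client_hello_prefix(hex_bytes):
    data = bytes.fromhex(hex_bytes)  # fromhex skips the spaces
    return {
        "content_type": data[0],                                # 22 = handshake
        "record_version": data[1:3].hex(),                      # record-layer version
        "record_length": int.from_bytes(data[3:5], "big"),
        "handshake_type": data[5],                              # 1 = client_hello
        "handshake_length": int.from_bytes(data[6:9], "big"),   # 24-bit length
        "legacy_version": data[9:11].hex(),                     # ClientHello.legacy_version
    }


print(parse_client_hello_prefix(PAYLOAD_HEX))
```

The record length plus the 5-byte record header equals the 517-byte TCP payload reported by tcpdump, so the whole Client Hello, including the server name in cleartext, went out in that single packet.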

In the OCSPStatusRequest, the "ResponderIDs" provides a list of OCSP
responders that the client trusts. A zero-length "responder_id_list"
sequence has the special meaning that the responders are implicitly
known to the server - e.g., by prior arrangement. "Extensions" is a
DER encoding of OCSP request extensions.

Both "ResponderID" and "Extensions" are DER-encoded ASN.1 types as
defined in [OCSP]. "Extensions" is imported from [PKIX]. A zero-
length "request_extensions" value means that there are no extensions
(as opposed to a zero-length ASN.1 SEQUENCE, which is not valid for
the "Extensions" type).

In the case of the "id-pkix-ocsp-nonce" OCSP extension, [OCSP] is
unclear about its encoding; for clarification, the nonce MUST be a
DER-encoded OCTET STRING, which is encapsulated as another OCTET
STRING (note that implementations based on an existing OCSP client
will need to be checked for conformance to this requirement).

Servers that receive a client hello containing the "status_request"
extension, MAY return a suitable certificate status response to the
client along with their certificate. If OCSP is requested, they
SHOULD use the information contained in the extension when selecting
an OCSP responder, and SHOULD include request_extensions in the OCSP
request.

Maximum string lengths in MySQL

· 5 min read

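This post records what limits the various MySQL string types.

Preface

Today I ran into something odd: I converted a PDF document to HTML and stored it in MySQL in a TEXT column. When I read it back, the tail was missing. After some searching I found that TEXT holds at most 65,535 bytes (64 KB), which is only about 16K characters when every character takes 4 bytes in utf8mb4.

Bytes and characters

If you have written PHP, you know the difference between strlen("你好") and mb_strlen("你好").
If you use Java, the difference between the byte-stream family InputStream/OutputStream and the character-stream family Writer/Reader will be just as familiar.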
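The same distinction in Python, as a short sketch: len() on a str counts characters, while len() on the encoded bytes counts bytes. The fits_in_text helper and the TEXT_MAX_BYTES constant are illustrative names of mine; the limit itself is the L < 2^16 bound for TEXT from the table in the next section, and UTF-8 is assumed as the encoding.

```python
text = "你好"

char_count = len(text)                  # what mb_strlen reports
byte_count = len(text.encode("utf-8"))  # what strlen reports for UTF-8 input

TEXT_MAX_BYTES = 2**16 - 1  # TEXT stores L bytes, where L < 2^16


def fits_in_text(value):
    """True if value, encoded as UTF-8, fits into a TEXT column."""
    return len(value.encode("utf-8")) <= TEXT_MAX_BYTES


print(char_count, byte_count)
```

Because the limit is counted in bytes, a document full of multi-byte characters hits it far earlier than its character count suggests.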

How MySQL string length relates to the type

String Type Storage Requirements

In the following table, M represents the declared column length in characters for nonbinary string types and bytes for binary string types. L represents the actual length in bytes of a given string value.

Data Type | Storage Required
CHAR(M) | The compact family of InnoDB row formats optimize storage for variable-length character
BINARY(M) | M bytes, 0 <= M <= 255
VARCHAR(M), VARBINARY(M) | L + 1 bytes if column values require 0 − 255 bytes, L + 2 bytes if values may require more than 255 bytes
TINYBLOB, TINYTEXT | L + 1 bytes, where L < 2^8
BLOB, TEXT | L + 2 bytes, where L < 2^16
MEDIUMBLOB, MEDIUMTEXT | L + 3 bytes, where L < 2^24
LONGBLOB, LONGTEXT | L + 4 bytes, where L < 2^32
ENUM('value1','value2',...) | 1 or 2 bytes, depending on the number of enumeration values (65,535 values maximum)
SET('value1','value2',...) | 1, 2, 3, 4, or 8 bytes, depending on the number of set members (64 members maximum)

Source: MySQL Reference Manual, String Type Storage Requirements

CHAR

CHAR holds at most 255 characters.

Creating a CHAR column of 256 characters with the SQL below fails with this error:

ERROR 1074 (42000): Column length too big for column 'name' (max = 255); use BLOB or TEXT instead

CREATE TABLE `test123` ( `name` char(256)) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4;
(gdb) bt
#0 my_error (nr=1074, MyFlags=0) at /home/dinosaur/Downloads/mysql-5.7.21/mysys/my_error.c:194
#1 0x0000000000f93e75 in Create_field::init (this=0x7fb9b8006740, thd=0x7fb9b8000b70, fld_name=0x7fb9b8006730 "name", fld_type=MYSQL_TYPE_STRING, fld_length=0x7fb9b8006738 "256", fld_decimals=0x0, fld_type_modifier=0,
fld_default_value=0x0, fld_on_update_value=0x0, fld_comment=0x7fb9b8002fe0, fld_change=0x0, fld_interval_list=0x7fb9b8003150, fld_charset=0x0, fld_geom_type=0, fld_gcol_info=0x0)
at /home/dinosaur/Downloads/mysql-5.7.21/sql/field.cc:10962
#2 0x000000000163ae21 in add_field_to_list (thd=0x7fb9b8000b70, field_name=0x7fba3d30c460, type=MYSQL_TYPE_STRING, length=0x7fb9b8006738 "256", decimals=0x0, type_modifier=0, default_value=0x0, on_update_value=0x0,
comment=0x7fb9b8002fe0, change=0x0, interval_list=0x7fb9b8003150, cs=0x0, uint_geom_type=0, gcol_info=0x0) at /home/dinosaur/Downloads/mysql-5.7.21/sql/sql_parse.cc:5798
#3 0x000000000178e3f6 in MYSQLparse (YYTHD=0x7fb9b8000b70) at /home/dinosaur/Downloads/mysql-5.7.21/sql/sql_yacc.yy:6337
#4 0x000000000163d75a in parse_sql (thd=0x7fb9b8000b70, parser_state=0x7fba3d30d550, creation_ctx=0x0) at /home/dinosaur/Downloads/mysql-5.7.21/sql/sql_parse.cc:7131
#5 0x0000000001639f07 in mysql_parse (thd=0x7fb9b8000b70, parser_state=0x7fba3d30d550) at /home/dinosaur/Downloads/mysql-5.7.21/sql/sql_parse.cc:5469
#6 0x000000000162f0a3 in dispatch_command (thd=0x7fb9b8000b70, com_data=0x7fba3d30de00, command=COM_QUERY) at /home/dinosaur/Downloads/mysql-5.7.21/sql/sql_parse.cc:1458
#7 0x000000000162df32 in do_command (thd=0x7fb9b8000b70) at /home/dinosaur/Downloads/mysql-5.7.21/sql/sql_parse.cc:999
#8 0x0000000001770f97 in handle_connection (arg=0x570d510) at /home/dinosaur/Downloads/mysql-5.7.21/sql/conn_handler/connection_handler_per_thread.cc:300
#9 0x0000000001de0b41 in pfs_spawn_thread (arg=0x5749fc0) at /home/dinosaur/Downloads/mysql-5.7.21/storage/perfschema/pfs.cc:2190
#10 0x00007fba478aa6ba in start_thread (arg=0x7fba3d30e700) at pthread_create.c:333
#11 0x00007fba46d3341d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Maximum length of VARCHAR

As with CHAR, let's try to create a VARCHAR column of 65533 characters:

CREATE TABLE `test123` ( `name` varchar(65533)) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4;

The result is the same kind of error:

ERROR 1074 (42000): Column length too big for column 'name' (max = 16383); use BLOB or TEXT instead

(gdb) bt
#0 my_error (nr=1074, MyFlags=0) at /home/dinosaur/Downloads/mysql-5.7.21/mysys/my_error.c:194
#1 0x00000000016c9998 in prepare_blob_field (thd=0x7fb9b8000b70, sql_field=0x7fb9b8006840) at /home/dinosaur/Downloads/mysql-5.7.21/sql/sql_table.cc:4715
#2 0x00000000016c6a33 in mysql_prepare_create_table (thd=0x7fb9b8000b70, error_schema_name=0x7fb9b8006728 "test", error_table_name=0x7fb9b8006168 "test123", create_info=0x7fba3d30c6b0, alter_info=0x7fba3d30c600,
tmp_table=false, db_options=0x7fba3d30b080, file=0x7fb9b8006ac0, key_info_buffer=0x7fba3d30c170, key_count=0x7fba3d30c16c, select_field_count=0) at /home/dinosaur/Downloads/mysql-5.7.21/sql/sql_table.cc:3721
#3 0x00000000016cac22 in create_table_impl (thd=0x7fb9b8000b70, db=0x7fb9b8006728 "test", table_name=0x7fb9b8006168 "test123", error_table_name=0x7fb9b8006168 "test123", path=0x7fba3d30c180 "./test/test123",
create_info=0x7fba3d30c6b0, alter_info=0x7fba3d30c600, internal_tmp_table=false, select_field_count=0, no_ha_table=false, is_trans=0x7fba3d30c3da, key_info=0x7fba3d30c170, key_count=0x7fba3d30c16c)
at /home/dinosaur/Downloads/mysql-5.7.21/sql/sql_table.cc:5131
#4 0x00000000016cb884 in mysql_create_table_no_lock (thd=0x7fb9b8000b70, db=0x7fb9b8006728 "test", table_name=0x7fb9b8006168 "test123", create_info=0x7fba3d30c6b0, alter_info=0x7fba3d30c600, select_field_count=0,
is_trans=0x7fba3d30c3da) at /home/dinosaur/Downloads/mysql-5.7.21/sql/sql_table.cc:5417
#5 0x00000000016cb9a2 in mysql_create_table (thd=0x7fb9b8000b70, create_table=0x7fb9b80061a0, create_info=0x7fba3d30c6b0, alter_info=0x7fba3d30c600) at /home/dinosaur/Downloads/mysql-5.7.21/sql/sql_table.cc:5463
#6 0x00000000016335be in mysql_execute_command (thd=0x7fb9b8000b70, first_level=true) at /home/dinosaur/Downloads/mysql-5.7.21/sql/sql_parse.cc:3248
#7 0x000000000163a31c in mysql_parse (thd=0x7fb9b8000b70, parser_state=0x7fba3d30d550) at /home/dinosaur/Downloads/mysql-5.7.21/sql/sql_parse.cc:5582
#8 0x000000000162f0a3 in dispatch_command (thd=0x7fb9b8000b70, com_data=0x7fba3d30de00, command=COM_QUERY) at /home/dinosaur/Downloads/mysql-5.7.21/sql/sql_parse.cc:1458
#9 0x000000000162df32 in do_command (thd=0x7fb9b8000b70) at /home/dinosaur/Downloads/mysql-5.7.21/sql/sql_parse.cc:999
#10 0x0000000001770f97 in handle_connection (arg=0x570d510) at /home/dinosaur/Downloads/mysql-5.7.21/sql/conn_handler/connection_handler_per_thread.cc:300
#11 0x0000000001de0b41 in pfs_spawn_thread (arg=0x5749fc0) at /home/dinosaur/Downloads/mysql-5.7.21/storage/perfschema/pfs.cc:2190
#12 0x00007fba478aa6ba in start_thread (arg=0x7fba3d30e700) at pthread_create.c:333
#13 0x00007fba46d3341d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

(gdb) p sql_field->length
$2 = 262132
static bool prepare_blob_field(THD *thd, Create_field *sql_field)
{
  DBUG_ENTER("prepare_blob_field");

  if (sql_field->length > MAX_FIELD_VARCHARLENGTH &&   // sql_field->length = 262132
      !(sql_field->flags & BLOB_FLAG))
  {
    /* Convert long VARCHAR columns to TEXT or BLOB */
    char warn_buff[MYSQL_ERRMSG_SIZE];

    // In strict mode this reports: ERROR 1074 (42000): Column length too big for
    // column 'name' (max = 16383); use BLOB or TEXT instead
    if (sql_field->def || thd->is_strict_mode())
    {
      my_error(ER_TOO_BIG_FIELDLENGTH, MYF(0), sql_field->field_name,
               static_cast<ulong>(MAX_FIELD_VARCHARLENGTH /       // MAX_FIELD_VARCHARLENGTH = 65535
                                  sql_field->charset->mbmaxlen)); // sql_field->charset->mbmaxlen = 4
      DBUG_RETURN(1);
    }
    ...
}

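So in strict mode a VARCHAR can hold at most 65535 bytes, which with utf8mb4 (up to 4 bytes per character) means 16383 characters. Let's try varchar(16383):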

mysql> CREATE TABLE `test123` ( `name` varchar(16383)) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4;
Query OK, 0 rows affected (0.26 sec)

OK, no problem.
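The numbers in the error message and in the gdb session follow from simple integer arithmetic. A Python sketch of just the length check in prepare_blob_field above; the function names are mine, mbmaxlen is passed in by hand, and the separate row-size limits MySQL enforces elsewhere are deliberately ignored.

```python
MAX_FIELD_VARCHARLENGTH = 65535  # value shown in the source comment above


def declared_length_in_bytes(chars, mbmaxlen):
    """What sql_field->length holds: characters times max bytes per character."""
    return chars * mbmaxlen


def max_varchar_chars(mbmaxlen):
    """The 'max = ...' figure printed in ERROR 1074."""
    return MAX_FIELD_VARCHARLENGTH // mbmaxlen


def passes_length_check(chars, mbmaxlen):
    """True if the column would not trip the check in prepare_blob_field."""
    return declared_length_in_bytes(chars, mbmaxlen) <= MAX_FIELD_VARCHARLENGTH


print(declared_length_in_bytes(65533, 4), max_varchar_chars(4))
```

With mbmaxlen = 4 (utf8mb4), varchar(65533) is declared as 262132 bytes and is rejected, while varchar(16383) comes to 65532 bytes and passes.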

Inverted index

· 6 min read

Building Elasticsearch

gradle idea

It ran for a long time:

BUILD SUCCESSFUL in 49m 34s 334 actionable tasks: 334 executed

Elasticsearch stack trace

prepareRequest:61, RestCatAction (org.elasticsearch.rest.action.cat)
handleRequest:80, BaseRestHandler (org.elasticsearch.rest)
handleRequest:69, SecurityRestFilter (org.elasticsearch.xpack.security.rest)
dispatchRequest:240, RestController (org.elasticsearch.rest)
tryAllHandlers:337, RestController (org.elasticsearch.rest)
dispatchRequest:174, RestController (org.elasticsearch.rest)
dispatchRequest:324, AbstractHttpServerTransport (org.elasticsearch.http)
handleIncomingRequest:374, AbstractHttpServerTransport (org.elasticsearch.http)
incomingRequest:303, AbstractHttpServerTransport (org.elasticsearch.http)
channelRead0:66, Netty4HttpRequestHandler (org.elasticsearch.http.netty4)
channelRead0:31, Netty4HttpRequestHandler (org.elasticsearch.http.netty4)
channelRead:105, SimpleChannelInboundHandler (io.netty.channel)
invokeChannelRead:362, AbstractChannelHandlerContext (io.netty.channel)
invokeChannelRead:348, AbstractChannelHandlerContext (io.netty.channel)
fireChannelRead:340, AbstractChannelHandlerContext (io.netty.channel)
channelRead:58, Netty4HttpPipeliningHandler (org.elasticsearch.http.netty4)
invokeChannelRead:362, AbstractChannelHandlerContext (io.netty.channel)
invokeChannelRead:348, AbstractChannelHandlerContext (io.netty.channel)
fireChannelRead:340, AbstractChannelHandlerContext (io.netty.channel)
channelRead:102, MessageToMessageDecoder (io.netty.handler.codec)
channelRead:111, MessageToMessageCodec (io.netty.handler.codec)
invokeChannelRead:362, AbstractChannelHandlerContext (io.netty.channel)
invokeChannelRead:348, AbstractChannelHandlerContext (io.netty.channel)
fireChannelRead:340, AbstractChannelHandlerContext (io.netty.channel)
channelRead:102, MessageToMessageDecoder (io.netty.handler.codec)
invokeChannelRead:362, AbstractChannelHandlerContext (io.netty.channel)
invokeChannelRead:348, AbstractChannelHandlerContext (io.netty.channel)
fireChannelRead:340, AbstractChannelHandlerContext (io.netty.channel)
channelRead:102, MessageToMessageDecoder (io.netty.handler.codec)
invokeChannelRead:362, AbstractChannelHandlerContext (io.netty.channel)
invokeChannelRead:348, AbstractChannelHandlerContext (io.netty.channel)
fireChannelRead:340, AbstractChannelHandlerContext (io.netty.channel)
fireChannelRead:323, ByteToMessageDecoder (io.netty.handler.codec)
channelRead:297, ByteToMessageDecoder (io.netty.handler.codec)
invokeChannelRead:362, AbstractChannelHandlerContext (io.netty.channel)
invokeChannelRead:348, AbstractChannelHandlerContext (io.netty.channel)
fireChannelRead:340, AbstractChannelHandlerContext (io.netty.channel)
channelRead:286, IdleStateHandler (io.netty.handler.timeout)
invokeChannelRead:362, AbstractChannelHandlerContext (io.netty.channel)
invokeChannelRead:348, AbstractChannelHandlerContext (io.netty.channel)
fireChannelRead:340, AbstractChannelHandlerContext (io.netty.channel)
channelRead:1434, DefaultChannelPipeline$HeadContext (io.netty.channel)
invokeChannelRead:362, AbstractChannelHandlerContext (io.netty.channel)
invokeChannelRead:348, AbstractChannelHandlerContext (io.netty.channel)
fireChannelRead:965, DefaultChannelPipeline (io.netty.channel)
read:163, AbstractNioByteChannel$NioByteUnsafe (io.netty.channel.nio)
processSelectedKey:644, NioEventLoop (io.netty.channel.nio)
processSelectedKeysPlain:544, NioEventLoop (io.netty.channel.nio)
processSelectedKeys:498, NioEventLoop (io.netty.channel.nio)
run:458, NioEventLoop (io.netty.channel.nio)
run:897, SingleThreadEventExecutor$5 (io.netty.util.concurrent)
run:834, Thread (java.lang)

and

handleRequest:97, BaseRestHandler (org.elasticsearch.rest)
handleRequest:69, SecurityRestFilter (org.elasticsearch.xpack.security.rest)
dispatchRequest:240, RestController (org.elasticsearch.rest)
tryAllHandlers:337, RestController (org.elasticsearch.rest)
dispatchRequest:174, RestController (org.elasticsearch.rest)
dispatchRequest:324, AbstractHttpServerTransport (org.elasticsearch.http)
handleIncomingRequest:374, AbstractHttpServerTransport (org.elasticsearch.http)
incomingRequest:303, AbstractHttpServerTransport (org.elasticsearch.http)
channelRead0:66, Netty4HttpRequestHandler (org.elasticsearch.http.netty4)
channelRead0:31, Netty4HttpRequestHandler (org.elasticsearch.http.netty4)
channelRead:105, SimpleChannelInboundHandler (io.netty.channel)
invokeChannelRead:362, AbstractChannelHandlerContext (io.netty.channel)
invokeChannelRead:348, AbstractChannelHandlerContext (io.netty.channel)
fireChannelRead:340, AbstractChannelHandlerContext (io.netty.channel)
channelRead:58, Netty4HttpPipeliningHandler (org.elasticsearch.http.netty4)
invokeChannelRead:362, AbstractChannelHandlerContext (io.netty.channel)
invokeChannelRead:348, AbstractChannelHandlerContext (io.netty.channel)
fireChannelRead:340, AbstractChannelHandlerContext (io.netty.channel)
channelRead:102, MessageToMessageDecoder (io.netty.handler.codec)
channelRead:111, MessageToMessageCodec (io.netty.handler.codec)
invokeChannelRead:362, AbstractChannelHandlerContext (io.netty.channel)
invokeChannelRead:348, AbstractChannelHandlerContext (io.netty.channel)
fireChannelRead:340, AbstractChannelHandlerContext (io.netty.channel)
channelRead:102, MessageToMessageDecoder (io.netty.handler.codec)
invokeChannelRead:362, AbstractChannelHandlerContext (io.netty.channel)
invokeChannelRead:348, AbstractChannelHandlerContext (io.netty.channel)
fireChannelRead:340, AbstractChannelHandlerContext (io.netty.channel)
channelRead:102, MessageToMessageDecoder (io.netty.handler.codec)
invokeChannelRead:362, AbstractChannelHandlerContext (io.netty.channel)
invokeChannelRead:348, AbstractChannelHandlerContext (io.netty.channel)
fireChannelRead:340, AbstractChannelHandlerContext (io.netty.channel)
fireChannelRead:323, ByteToMessageDecoder (io.netty.handler.codec)
channelRead:297, ByteToMessageDecoder (io.netty.handler.codec)
invokeChannelRead:362, AbstractChannelHandlerContext (io.netty.channel)
invokeChannelRead:348, AbstractChannelHandlerContext (io.netty.channel)
fireChannelRead:340, AbstractChannelHandlerContext (io.netty.channel)
channelRead:286, IdleStateHandler (io.netty.handler.timeout)
invokeChannelRead:362, AbstractChannelHandlerContext (io.netty.channel)
invokeChannelRead:348, AbstractChannelHandlerContext (io.netty.channel)
fireChannelRead:340, AbstractChannelHandlerContext (io.netty.channel)
channelRead:1434, DefaultChannelPipeline$HeadContext (io.netty.channel)
invokeChannelRead:362, AbstractChannelHandlerContext (io.netty.channel)
invokeChannelRead:348, AbstractChannelHandlerContext (io.netty.channel)
fireChannelRead:965, DefaultChannelPipeline (io.netty.channel)
read:163, AbstractNioByteChannel$NioByteUnsafe (io.netty.channel.nio)
processSelectedKey:644, NioEventLoop (io.netty.channel.nio)
processSelectedKeysPlain:544, NioEventLoop (io.netty.channel.nio)
processSelectedKeys:498, NioEventLoop (io.netty.channel.nio)
run:458, NioEventLoop (io.netty.channel.nio)
run:897, SingleThreadEventExecutor$5 (io.netty.util.concurrent)
run:834, Thread (java.lang)

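Introduction to the inverted index

What problem does an inverted index solve?

Suppose we have a document a.txt containing the text hello world, i am dinosaur.

How do we check whether this document contains the word world?

When there are only a few documents, we can:

  • 1 Open the file
  • 2 Read its contents from the beginning and check whether they contain world

But what if we have not only a.txt but also b.txt and c.txt? How do we tell whether some word appears in these documents and, if it does, in which documents and on which lines?

If we keep reading every file from start to finish, that is fine for a handful of documents, but with many documents it takes a very long time to get through them all.

One of the problems an inverted index solves is how to quickly determine whether a word appears in these documents and, if so, in which ones.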

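Related example

Baseline inverted index

An inverted index has two main parts:

  • The first part holds two fields for each word $t$:
    • $f_{t}$: the number of documents that contain the word $t$.
    • a pointer to the inverted list of $t$
  • The second part, the inverted list, is a list whose elements each have two fields:
    • $docid$: the id of a document, which can be thought of as its primary key
    • $f_{d}$: the number of times the word $t$ occurs in document $docid$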
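The two-part layout above can be sketched in a few lines of Java. This is a minimal illustration and not the demo code mentioned below: the class name BaselineInvertedIndex, the Posting type and the whitespace-only tokenization are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class BaselineInvertedIndex {
    /** One element of an inverted list: the pair (docid, f_d). */
    public static final class Posting {
        public final int docId;
        public final int termCount;

        Posting(int docId, int termCount) {
            this.docId = docId;
            this.termCount = termCount;
        }

        @Override
        public String toString() {
            return "{" + docId + " " + termCount + "}";
        }
    }

    // word t -> inverted list of t; f_t is simply the length of that list.
    private final Map<String, List<Posting>> index = new TreeMap<>();

    /** Adds one document. Tokens are split on whitespace only (an assumption). */
    public void addDocument(int docId, String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        counts.forEach((term, count) ->
                index.computeIfAbsent(term, k -> new ArrayList<>())
                     .add(new Posting(docId, count)));
    }

    /** f_t: how many documents contain the term. */
    public int documentFrequency(String term) {
        return postings(term).size();
    }

    /** The inverted list of the term, empty if the term was never seen. */
    public List<Posting> postings(String term) {
        return index.getOrDefault(term, Collections.emptyList());
    }

    public static void main(String[] args) {
        BaselineInvertedIndex idx = new BaselineInvertedIndex();
        idx.addDocument(1, "hello world i am dinosaur");
        idx.addDocument(2, "hello hello again");
        System.out.println("hello " + idx.documentFrequency("hello") + "|" + idx.postings("hello"));
    }
}
```

Looking up a word is then a single map access instead of a scan over every document, which is the point of the structure.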


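I wrote some demo code (GitHub repository). Its output is shown below; each line gives a word, its $f_{t}$, and its inverted list of {docid $f_{d}$} pairs: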

 keeper  3|[{1 1} {4 1} {5 1}]
In 1|[{2 1}]
house 2|[{2 1} {3 1}]
nignt 2|[{4 1} {5 1}]
did 1|[{4 1}]
dark 1|[{6 1}]
old 4|[{1 1} {2 1} {3 1} {4 1}]
night 3|[{1 1} {5 1} {6 1}]
had 1|[{3 1}]
sleeps 1|[{6 1}]
keep 3|[{1 1} {3 1} {5 1}]
big 2|[{2 1} {3 1}]
keeps 3|[{1 1} {5 1} {6 1}]
the 6|[{1 1} {2 1} {3 1} {4 1} {5 1} {6 1}]
never 1|[{4 1}]
and 1|[{6 1}]
And 1|[{6 1}]
in 5|[{1 1} {2 1} {3 1} {5 1} {6 1}]
The 3|[{1 1} {3 1} {5 1}]
sleep 1|[{4 1}]
Where 1|[{4 1}]
town 2|[{1 1} {3 1}]
gown 1|[{2 1}]

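Steps to build an inverted index

  • 1 Read the documents
  • 2 Tokenize the text
  • 3 Normalize the tokens
  • 4 Build the inverted index, including term frequencies and offsets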
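The four steps can be strung together as follows. This is a rough sketch under stated assumptions: tokenization splits on anything that is not a letter or digit, normalization is only lowercasing, and the "offset" stored is the token position within the document. The class IndexBuildSteps and its method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class IndexBuildSteps {
    /** Step 2: split on every run of characters that are not letters or digits (assumed rule). */
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Step 3: normalization is reduced to lowercasing in this sketch. */
    public static String normalize(String token) {
        return token.toLowerCase(Locale.ROOT);
    }

    /** Step 4: term -> docid -> token positions; the term frequency is the size of the position list. */
    public static Map<String, Map<Integer, List<Integer>>> build(Map<Integer, String> docs) {
        Map<String, Map<Integer, List<Integer>>> index = new TreeMap<>();
        for (Map.Entry<Integer, String> doc : docs.entrySet()) { // Step 1: "read" each document
            List<String> tokens = tokenize(doc.getValue());
            for (int pos = 0; pos < tokens.size(); pos++) {
                index.computeIfAbsent(normalize(tokens.get(pos)), k -> new TreeMap<>())
                     .computeIfAbsent(doc.getKey(), k -> new ArrayList<>())
                     .add(pos);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        Map<Integer, String> docs = new TreeMap<>();
        docs.put(1, "hello world, i am dinosaur");
        docs.put(2, "Hello again, World");
        build(docs).forEach((term, postings) -> System.out.println(term + " " + postings));
    }
}
```

With lowercasing in place, Hello and hello end up in the same entry. Without a normalization step, differently cased forms of a word are indexed as separate terms.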

Tokenization

https://www.cnblogs.com/forfuture1978/archive/2010/06/06/1752837.html

The Lucene call stack; the main logic all lives in the invert method:

incrementToken:48, FilteringTokenFilter (org.apache.lucene.analysis)
invert:812, DefaultIndexingChain$PerField (org.apache.lucene.index)
processField:442, DefaultIndexingChain (org.apache.lucene.index)
processDocument:406, DefaultIndexingChain (org.apache.lucene.index)
updateDocument:250, DocumentsWriterPerThread (org.apache.lucene.index)
updateDocument:495, DocumentsWriter (org.apache.lucene.index)
updateDocument:1594, IndexWriter (org.apache.lucene.index)
addDocument:1213, IndexWriter (org.apache.lucene.index)
indexDoc:198, IndexFiles (com.dinosaur)
visitFile:155, IndexFiles$1 (com.dinosaur)
visitFile:151, IndexFiles$1 (com.dinosaur)
walkFileTree:2670, Files (java.nio.file)
walkFileTree:2742, Files (java.nio.file)
indexDocs:151, IndexFiles (com.dinosaur)
main:113, IndexFiles (com.dinosaur)

The core of Lucene tokenization is incrementToken, which fetches the next token.

For example:

Lucene's standard tokenizer

private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class); // final, a single shared instance

@Override
public final boolean incrementToken() throws IOException {
  ...
  scanner.getText(termAtt); // the scanner reads one term and stores it in termAtt
  ...
}
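The pattern in this snippet (one shared attribute object that every call to incrementToken overwrites) can be imitated in plain Java. The sketch below does not use Lucene: WhitespaceTokens, term() and collect() are hypothetical names, and the whitespace splitting stands in for the real scanner.

```java
import java.util.ArrayList;
import java.util.List;

public class IncrementTokenSketch {
    /** Hypothetical stand-in for a token stream; this is not a Lucene class. */
    static final class WhitespaceTokens {
        private final String input;
        private int pos = 0;
        // One shared buffer, overwritten on every call (plays the role of termAtt).
        private final StringBuilder termAtt = new StringBuilder();

        WhitespaceTokens(String input) {
            this.input = input;
        }

        /** Advances to the next token; returns false once the input is exhausted. */
        boolean incrementToken() {
            while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) {
                pos++;
            }
            if (pos >= input.length()) {
                return false;
            }
            termAtt.setLength(0);
            while (pos < input.length() && !Character.isWhitespace(input.charAt(pos))) {
                termAtt.append(input.charAt(pos++));
            }
            return true;
        }

        /** The current term; only valid until the next incrementToken call. */
        CharSequence term() {
            return termAtt;
        }
    }

    /** Drains the stream, copying each term because the buffer is reused. */
    public static List<String> collect(String text) {
        WhitespaceTokens stream = new WhitespaceTokens(text);
        List<String> terms = new ArrayList<>();
        while (stream.incrementToken()) {
            terms.add(stream.term().toString());
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(collect("hello world  i am dinosaur"));
    }
}
```

The consumer has to copy the term before asking for the next token, since the same buffer is refilled each time.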

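NoClassDefFoundError when packaging with Maven

· 2 min read

NoClassDefFoundError when packaging with Maven

I was just learning how to use Maven. The project compiled fine, but running java -jar xxx.jar failed with NoClassDefFoundError.

Where the trouble started

My first step was to look for an answer on Stack Overflow, which said to use the maven-shade-plugin plugin. That was in fact the correct answer.

This is the correct answer: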

<project>
  ...
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>3.2.1</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
  ...
</project>

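Related link

So where did I go wrong?

At the time I did not understand the plugins node under the <pluginManagement> XML node. Plugins declared inside pluginManagement only supply default configuration; they are not run unless the plugin is also listed in the <plugins> node directly under build.

  • This is the wrong way to write it:

<project>
  ...
  <build>
    <pluginManagement>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-shade-plugin</artifactId>
          <version>3.2.1</version>
          <executions>
            <execution>
              <phase>package</phase>
              <goals>
                <goal>shade</goal>
              </goals>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </pluginManagement>
  </build>
  ...
</project>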

What I ended up with

In the end the plugin has to go in the <plugins> node directly under build, not in the <plugins> node inside pluginManagement.

<project>
  ...
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>3.2.1</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
    <pluginManagement>
      ...
    </pluginManagement>
  </build>
  ...
</project>

Then running mvn package bundles all the dependencies into the jar.
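One way to confirm that a dependency really ended up in the packaged jar is to look for its class entry. The sketch below uses only java.util.jar.JarFile from the JDK. The class name JarClassCheck is made up, and the jar path and class name are whatever you pass on the command line.

```java
import java.io.IOException;
import java.util.jar.JarFile;

public class JarClassCheck {
    /** Returns true if the jar has an entry for the given fully qualified class name. */
    public static boolean containsClass(String jarPath, String className) throws IOException {
        // com.example.Foo is stored in the jar as com/example/Foo.class
        String entryName = className.replace('.', '/') + ".class";
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getJarEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java JarClassCheck <jar> <fully.qualified.ClassName>");
            return;
        }
        System.out.println(containsClass(args[0], args[1]));
    }
}
```

If the class named in the NoClassDefFoundError message is reported as missing, the shade plugin most likely did not run during packaging.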

hello world java

· 8 min read

hello

java hello world

public class HelloWorld {

    public static void main(String[] args) {
        // Prints "Hello, World" to the terminal window.
        System.out.println("Hello, World");
    }

}


Compiling

Compile with the -g option so that debugging information is generated:

javac -g HelloWorld.java 

Debugging

Method 1:

Debug hello world with jdb:

jdb -classpath . HelloWorld
> stop  in HelloWorld.main                                
Deferring breakpoint HelloWorld.main.
It will be set after the class is loaded.
> run
run HelloWorld
Set uncaught java.lang.Throwable
Set deferred uncaught java.lang.Throwable
>
VM Started: Set deferred breakpoint HelloWorld.main

Breakpoint hit: "thread=main", HelloWorld.main(), line=5 bci=0
5 System.out.println("Hello, World");

main[1]

Writing hello world with Maven

If, after mvn package, running java -jar some.jar reports that the main class cannot be found, see this answer:
https://stackoverflow.com/a/9689877/6229548

Loading a class

(gdb) bt
#0 open64 () at ../sysdeps/unix/syscall-template.S:84
#1 0x00007ffff695b544 in os::open (path=0x7ffff7fcefd0 "/home/dinosaur/jdk8/build/linux-x86_64-normal-server-slowdebug/jdk/classes/java/lang/Class.class", oflag=0, mode=0)
at /home/dinosaur/jdk8/hotspot/src/os/linux/vm/os_linux.cpp:5188
#2 0x00007ffff63ffdfc in ClassPathDirEntry::open_stream (this=0x7ffff006f178, name=0x7ffff000cce8 "java/lang/Class.class", __the_thread__=0x7ffff000c000)
at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/classLoader.cpp:210
#3 0x00007ffff640055b in LazyClassPathEntry::open_stream (this=0x7ffff001ad48, name=0x7ffff000cce8 "java/lang/Class.class", __the_thread__=0x7ffff000c000)
at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/classLoader.cpp:330
#4 0x00007ffff640209b in ClassLoader::load_classfile (h_name=0x7ffff4062108, __the_thread__=0x7ffff000c000) at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/classLoader.cpp:909
#5 0x00007ffff6a8570a in SystemDictionary::load_instance_class (class_name=0x7ffff4062108, class_loader=..., __the_thread__=0x7ffff000c000)
at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/systemDictionary.cpp:1304
#6 0x00007ffff6a838b8 in SystemDictionary::resolve_instance_class_or_null (name=0x7ffff4062108, class_loader=..., protection_domain=..., __the_thread__=0x7ffff000c000)
at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/systemDictionary.cpp:779
#7 0x00007ffff6a81ff7 in SystemDictionary::resolve_or_null (class_name=0x7ffff4062108, class_loader=..., protection_domain=..., __the_thread__=0x7ffff000c000)
at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/systemDictionary.cpp:232
#8 0x00007ffff6a819f2 in SystemDictionary::resolve_or_fail (class_name=0x7ffff4062108, class_loader=..., protection_domain=..., throw_error=true, __the_thread__=0x7ffff000c000)
at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/systemDictionary.cpp:171
#9 0x00007ffff6a81d64 in SystemDictionary::resolve_or_fail (class_name=0x7ffff4062108, throw_error=true, __the_thread__=0x7ffff000c000)
at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/systemDictionary.cpp:212
#10 0x00007ffff6a87277 in SystemDictionary::initialize_wk_klass (id=SystemDictionary::Class_klass_knum, init_opt=0, __the_thread__=0x7ffff000c000)
at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/systemDictionary.cpp:1866
#11 0x00007ffff6a873a7 in SystemDictionary::initialize_wk_klasses_until (limit_id=SystemDictionary::Cloneable_klass_knum, start_id=@0x7ffff7fd0a84: SystemDictionary::Object_klass_knum,
__the_thread__=0x7ffff000c000) at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/systemDictionary.cpp:1882
#12 0x00007ffff6a8b13c in SystemDictionary::initialize_wk_klasses_through (end_id=SystemDictionary::Class_klass_knum, start_id=@0x7ffff7fd0a84: SystemDictionary::Object_klass_knum,
__the_thread__=0x7ffff000c000) at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/systemDictionary.hpp:408
#13 0x00007ffff6a874e0 in SystemDictionary::initialize_preloaded_classes (__the_thread__=0x7ffff000c000) at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/systemDictionary.cpp:1901
#14 0x00007ffff6a87199 in SystemDictionary::initialize (__the_thread__=0x7ffff000c000) at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/systemDictionary.cpp:1843
#15 0x00007ffff6ad68c9 in Universe::genesis (__the_thread__=0x7ffff000c000) at /home/dinosaur/jdk8/hotspot/src/share/vm/memory/universe.cpp:288
#16 0x00007ffff6ad8db6 in universe2_init () at /home/dinosaur/jdk8/hotspot/src/share/vm/memory/universe.cpp:991
#17 0x00007ffff66463b3 in init_globals () at /home/dinosaur/jdk8/hotspot/src/share/vm/runtime/init.cpp:114
#18 0x00007ffff6ab93ef in Threads::create_vm (args=0x7ffff7fd0e80, canTryAgain=0x7ffff7fd0e03) at /home/dinosaur/jdk8/hotspot/src/share/vm/runtime/thread.cpp:3424
#19 0x00007ffff6702ed0 in JNI_CreateJavaVM (vm=0x7ffff7fd0ed8, penv=0x7ffff7fd0ee0, args=0x7ffff7fd0e80) at /home/dinosaur/jdk8/hotspot/src/share/vm/prims/jni.cpp:5166
#20 0x00007ffff7bc3bda in InitializeJVM (pvm=0x7ffff7fd0ed8, penv=0x7ffff7fd0ee0, ifn=0x7ffff7fd0f30) at /home/dinosaur/jdk8/jdk/src/share/bin/java.c:1145
#21 0x00007ffff7bc1a36 in JavaMain (_args=0x7fffffffa910) at /home/dinosaur/jdk8/jdk/src/share/bin/java.c:371
#22 0x00007ffff73d66ba in start_thread (arg=0x7ffff7fd1700) at pthread_create.c:333
#23 0x00007ffff78f741d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Loading ClassLoader

(gdb) bt
#0 open64 () at ../sysdeps/unix/syscall-template.S:84
#1 0x00007ffff695b544 in os::open (path=0x7ffff7fcefd0 "/home/dinosaur/jdk8/build/linux-x86_64-normal-server-slowdebug/jdk/classes/java/lang/ClassLoader.class", oflag=0, mode=0)
at /home/dinosaur/jdk8/hotspot/src/os/linux/vm/os_linux.cpp:5188
#2 0x00007ffff63ffdfc in ClassPathDirEntry::open_stream (this=0x7ffff006f178, name=0x7ffff000cd08 "java/lang/ClassLoader.class", __the_thread__=0x7ffff000c000)
at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/classLoader.cpp:210
#3 0x00007ffff640055b in LazyClassPathEntry::open_stream (this=0x7ffff001ad48, name=0x7ffff000cd08 "java/lang/ClassLoader.class", __the_thread__=0x7ffff000c000)
at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/classLoader.cpp:330
#4 0x00007ffff640209b in ClassLoader::load_classfile (h_name=0x7ffff40621c8, __the_thread__=0x7ffff000c000) at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/classLoader.cpp:909
#5 0x00007ffff6a8570a in SystemDictionary::load_instance_class (class_name=0x7ffff40621c8, class_loader=..., __the_thread__=0x7ffff000c000)
at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/systemDictionary.cpp:1304
#6 0x00007ffff6a838b8 in SystemDictionary::resolve_instance_class_or_null (name=0x7ffff40621c8, class_loader=..., protection_domain=..., __the_thread__=0x7ffff000c000)
at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/systemDictionary.cpp:779
#7 0x00007ffff6a81ff7 in SystemDictionary::resolve_or_null (class_name=0x7ffff40621c8, class_loader=..., protection_domain=..., __the_thread__=0x7ffff000c000)
at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/systemDictionary.cpp:232
#8 0x00007ffff6a819f2 in SystemDictionary::resolve_or_fail (class_name=0x7ffff40621c8, class_loader=..., protection_domain=..., throw_error=true, __the_thread__=0x7ffff000c000)
at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/systemDictionary.cpp:171
#9 0x00007ffff6a81d64 in SystemDictionary::resolve_or_fail (class_name=0x7ffff40621c8, throw_error=true, __the_thread__=0x7ffff000c000)
at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/systemDictionary.cpp:212
#10 0x00007ffff6a87277 in SystemDictionary::initialize_wk_klass (id=SystemDictionary::ClassLoader_klass_knum, init_opt=0, __the_thread__=0x7ffff000c000)
at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/systemDictionary.cpp:1866
#11 0x00007ffff6a873a7 in SystemDictionary::initialize_wk_klasses_until (limit_id=SystemDictionary::SoftReference_klass_knum, start_id=@0x7ffff7fd0a84: SystemDictionary::Cloneable_klass_knum,
__the_thread__=0x7ffff000c000) at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/systemDictionary.cpp:1882
#12 0x00007ffff6a8b13c in SystemDictionary::initialize_wk_klasses_through (end_id=SystemDictionary::Reference_klass_knum, start_id=@0x7ffff7fd0a84: SystemDictionary::Cloneable_klass_knum,
__the_thread__=0x7ffff000c000) at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/systemDictionary.hpp:408
#13 0x00007ffff6a87553 in SystemDictionary::initialize_preloaded_classes (__the_thread__=0x7ffff000c000) at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/systemDictionary.cpp:1918
#14 0x00007ffff6a87199 in SystemDictionary::initialize (__the_thread__=0x7ffff000c000) at /home/dinosaur/jdk8/hotspot/src/share/vm/classfile/systemDictionary.cpp:1843
#15 0x00007ffff6ad68c9 in Universe::genesis (__the_thread__=0x7ffff000c000) at /home/dinosaur/jdk8/hotspot/src/share/vm/memory/universe.cpp:288
#16 0x00007ffff6ad8db6 in universe2_init () at /home/dinosaur/jdk8/hotspot/src/share/vm/memory/universe.cpp:991
#17 0x00007ffff66463b3 in init_globals () at /home/dinosaur/jdk8/hotspot/src/share/vm/runtime/init.cpp:114
#18 0x00007ffff6ab93ef in Threads::create_vm (args=0x7ffff7fd0e80, canTryAgain=0x7ffff7fd0e03) at /home/dinosaur/jdk8/hotspot/src/share/vm/runtime/thread.cpp:3424
#19 0x00007ffff6702ed0 in JNI_CreateJavaVM (vm=0x7ffff7fd0ed8, penv=0x7ffff7fd0ee0, args=0x7ffff7fd0e80) at /home/dinosaur/jdk8/hotspot/src/share/vm/prims/jni.cpp:5166
#20 0x00007ffff7bc3bda in InitializeJVM (pvm=0x7ffff7fd0ed8, penv=0x7ffff7fd0ee0, ifn=0x7ffff7fd0f30) at /home/dinosaur/jdk8/jdk/src/share/bin/java.c:1145
#21 0x00007ffff7bc1a36 in JavaMain (_args=0x7fffffffa910) at /home/dinosaur/jdk8/jdk/src/share/bin/java.c:371
#22 0x00007ffff73d66ba in start_thread (arg=0x7ffff7fd1700) at pthread_create.c:333
#23 0x00007ffff78f741d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Printing hello world

This is the call stack when hello world is printed. The full stack cannot be shown, presumably because of optimization:

(gdb) bt
#0 write () at ../sysdeps/unix/syscall-template.S:84
#1 0x00007ffff556779a in handleWrite (fd=1, buf=0x7ffff7fce270, len=12)
at /home/dinosaur/jdk8/jdk/src/solaris/native/java/io/io_util_md.c:164
#2 0x00007ffff556710a in writeBytes (env=0x7ffff000c210, this=0x7ffff7fd0398, bytes=0x7ffff7fd0390, off=0, len=12, append=0 '\000',
fid=0x47e1043) at /home/dinosaur/jdk8/jdk/src/share/native/java/io/io_util.c:189
#3 0x00007ffff555a79c in Java_java_io_FileOutputStream_writeBytes (env=0x7ffff000c210, this=0x7ffff7fd0398, bytes=0x7ffff7fd0390,
off=0, len=12, append=0 '\000') at /home/dinosaur/jdk8/jdk/src/solaris/native/java/io/FileOutputStream_md.c:70
#4 0x00007fffe10298dc in ?? ()
#5 0x0000000000000008 in ?? ()
#6 0x0000000000000008 in ?? ()
#7 0x00007ffff000c000 in ?? ()
#8 0x00007fffe02c74d8 in ?? ()
#9 0x00007fffe1028ee3 in ?? ()
#10 0x00007ffff7fd0318 in ?? ()
#11 0x00007fffe0173f60 in ?? ()
#12 0x00007ffff7fd0398 in ?? ()
#13 0x00007fffe0175120 in ?? ()
#14 0x0000000000000000 in ?? ()
(gdb) c
Continuing.
Hello, World

(gdb) bt
#0 write () at ../sysdeps/unix/syscall-template.S:84
#1 0x00007ffff556779a in handleWrite (fd=1, buf=0x7ffff7fce2d0, len=1)
at /home/dinosaur/jdk8/jdk/src/solaris/native/java/io/io_util_md.c:164
#2 0x00007ffff556710a in writeBytes (env=0x7ffff000c210, this=0x7ffff7fd0400, bytes=0x7ffff7fd03f8, off=0, len=1, append=0 '\000',
fid=0x47e1043) at /home/dinosaur/jdk8/jdk/src/share/native/java/io/io_util.c:189
#3 0x00007ffff555a79c in Java_java_io_FileOutputStream_writeBytes (env=0x7ffff000c210, this=0x7ffff7fd0400, bytes=0x7ffff7fd03f8,
off=0, len=1, append=0 '\000') at /home/dinosaur/jdk8/jdk/src/solaris/native/java/io/FileOutputStream_md.c:70
#4 0x00007fffe10298dc in ?? ()
#5 0x00007ffff7fd0410 in ?? ()
#6 0x00007ffff672dd43 in JVM_ArrayCopy (env=0x7ffff000c210, ignored=0x7ffff7fd0400, src=0x7ffff7fd03f8, src_pos=0,
dst=0x7f00f6265bea, dst_pos=1, length=0) at /home/dinosaur/jdk8/hotspot/src/share/vm/prims/jvm.cpp:298
#7 0x00007fffe1007500 in ?? ()
#8 0x0000000000000000 in ?? ()

java class file

4.1 The ClassFile Structure

A class file consists of a stream of 8-bit bytes. All 16-bit, 32-bit, and 64-bit quantities are constructed by reading in two, four, and eight consecutive 8-bit bytes, respectively. Multibyte data items are always stored in big-endian order, where the high bytes come first. In the Java SE platform, this format is supported by interfaces java.io.DataInput and java.io.DataOutput and classes such as java.io.DataInputStream and java.io.DataOutputStream.

From the JVM specification we can see that a class file carries the magic number 0xCAFEBABE and that multibyte values are stored in big-endian order.

4.1 The ClassFile Structure

A class file consists of a single ClassFile structure:

ClassFile {
    u4             magic;
    u2             minor_version;
    u2             major_version;
    u2             constant_pool_count;
    cp_info        constant_pool[constant_pool_count-1];
    u2             access_flags;
    u2             this_class;
    u2             super_class;
    u2             interfaces_count;
    u2             interfaces[interfaces_count];
    u2             fields_count;
    field_info     fields[fields_count];
    u2             methods_count;
    method_info    methods[methods_count];
    u2             attributes_count;
    attribute_info attributes[attributes_count];
}

The magic item supplies the magic number identifying the class file format; it has the value 0xCAFEBABE.

Viewing the class file with hexdump:

dinosaur@dinosaur-X550VXK:~/jdk8/build$ hexdump  HelloWorld.class -C

00000000 ca fe ba be 00 00 00 34 00 1d 0a 00 06 00 0f 09 |.......4........|
00000010 00 10 00 11 08 00 12 0a 00 13 00 14 07 00 15 07 |................|
00000020 00 16 01 00 06 3c 69 6e 69 74 3e 01 00 03 28 29 |.....<init>...()|
00000030 56 01 00 04 43 6f 64 65 01 00 0f 4c 69 6e 65 4e |V...Code...LineN|
00000040 75 6d 62 65 72 54 61 62 6c 65 01 00 04 6d 61 69 |umberTable...mai|
00000050 6e 01 00 16 28 5b 4c 6a 61 76 61 2f 6c 61 6e 67 |n...([Ljava/lang|
00000060 2f 53 74 72 69 6e 67 3b 29 56 01 00 0a 53 6f 75 |/String;)V...Sou|
00000070 72 63 65 46 69 6c 65 01 00 0f 48 65 6c 6c 6f 57 |rceFile...HelloW|
00000080 6f 72 6c 64 2e 6a 61 76 61 0c 00 07 00 08 07 00 |orld.java.......|
00000090 17 0c 00 18 00 19 01 00 0c 48 65 6c 6c 6f 2c 20 |.........Hello, |
000000a0 57 6f 72 6c 64 07 00 1a 0c 00 1b 00 1c 01 00 0a |World...........|
000000b0 48 65 6c 6c 6f 57 6f 72 6c 64 01 00 10 6a 61 76 |HelloWorld...jav|
000000c0 61 2f 6c 61 6e 67 2f 4f 62 6a 65 63 74 01 00 10 |a/lang/Object...|
000000d0 6a 61 76 61 2f 6c 61 6e 67 2f 53 79 73 74 65 6d |java/lang/System|
000000e0 01 00 03 6f 75 74 01 00 15 4c 6a 61 76 61 2f 69 |...out...Ljava/i|
000000f0 6f 2f 50 72 69 6e 74 53 74 72 65 61 6d 3b 01 00 |o/PrintStream;..|
00000100 13 6a 61 76 61 2f 69 6f 2f 50 72 69 6e 74 53 74 |.java/io/PrintSt|
00000110 72 65 61 6d 01 00 07 70 72 69 6e 74 6c 6e 01 00 |ream...println..|
00000120 15 28 4c 6a 61 76 61 2f 6c 61 6e 67 2f 53 74 72 |.(Ljava/lang/Str|
00000130 69 6e 67 3b 29 56 00 21 00 05 00 06 00 00 00 00 |ing;)V.!........|
00000140 00 02 00 01 00 07 00 08 00 01 00 09 00 00 00 1d |................|
00000150 00 01 00 01 00 00 00 05 2a b7 00 01 b1 00 00 00 |........*.......|
00000160 01 00 0a 00 00 00 06 00 01 00 00 00 01 00 09 00 |................|
00000170 0b 00 0c 00 01 00 09 00 00 00 25 00 02 00 01 00 |..........%.....|
00000180 00 00 09 b2 00 02 12 03 b6 00 04 b1 00 00 00 01 |................|
00000190 00 0a 00 00 00 0a 00 02 00 00 00 05 00 08 00 06 |................|
000001a0 00 01 00 0d 00 00 00 02 00 0e |..........|

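Let's walk through the contents of this hello world class file:

First comes the magic number: the four bytes ca fe ba be
Then minor_version: 00 00
major_version: 00 34 (decimal 52, the class file version of Java 8)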
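The same three fields can be read programmatically. A minimal sketch, relying on the fact quoted above that java.io.DataInputStream reads big-endian values. The class name ClassFileHeader is made up, and the default input is just the first eight bytes of the hexdump above.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ClassFileHeader {
    public final int magic;
    public final int minorVersion;
    public final int majorVersion;

    private ClassFileHeader(int magic, int minorVersion, int majorVersion) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
    }

    /** Reads the first 8 bytes of a class file: u4 magic, u2 minor_version, u2 major_version. */
    public static ClassFileHeader read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int magic = data.readInt();            // u4, big-endian
        int minor = data.readUnsignedShort();  // u2
        int major = data.readUnsignedShort();  // u2
        if (magic != 0xCAFEBABE) {
            throw new IOException("not a class file: bad magic number");
        }
        return new ClassFileHeader(magic, minor, major);
    }

    public static void main(String[] args) throws IOException {
        // With no argument, fall back to the first 8 bytes shown in the hexdump.
        byte[] sample = {(byte) 0xca, (byte) 0xfe, (byte) 0xba, (byte) 0xbe, 0x00, 0x00, 0x00, 0x34};
        try (InputStream in = args.length > 0
                ? new FileInputStream(args[0])
                : new ByteArrayInputStream(sample)) {
            ClassFileHeader h = read(in);
            System.out.println(String.format("magic=0x%08X minor=%d major=%d",
                    h.magic, h.minorVersion, h.majorVersion));
        }
    }
}
```

Passing the path of HelloWorld.class as the first argument reads the real file instead of the sample bytes.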

Related reading

What are environment variables

· 3 min read

When we run the shell command env we see a lot of strings; those are the environment variables of the process.

How environment variables are stored

The first two arguments are just the same. The third argument envp gives the program’s environment; it is the same as the value of environ. See Environment Variables. POSIX.1 does not allow this three-argument form, so to be portable it is best to write main to take two arguments, and use the value of environ.

The relevant POSIX documentation:

where argc is the argument count and argv is an array of character pointers to the arguments themselves. In addition, the following variable:

extern char **environ;

is initialized as a pointer to an array of character pointers to the environment strings. The argv and environ arrays are each terminated by a null pointer. The null pointer terminating the argv array is not counted in argc.

Example

Here is an example:

#include <stdio.h>

extern char **environ;

int main(int argc, const char *argv[]) {
    /* environ is a NULL-terminated array of "NAME=value" strings */
    printf("environment variables:\n");
    for (int i = 0; environ[i] != NULL; i++) {
        printf("%p\t%s\n", (void *)environ[i], environ[i]);
    }

    /* argv holds argc entries */
    printf("argv:\n");
    for (int i = 0; i < argc; i++) {
        printf("%p\t%s\n", (const void *)argv[i], argv[i]);
    }
    return 0;
}

After compiling, running it prints the following:

gcc main.c -o main
dinosaur@dinosaur-X550VXK:~/test$ ./main
environment variables:
0x7ffc250920c7 XDG_VTNR=7
0x7ffc250920d2 LC_PAPER=zh_CN.UTF-8
0x7ffc250920e7 LC_ADDRESS=zh_CN.UTF-8
0x7ffc250920fe XDG_SESSION_ID=c1
0x7ffc25092110 XDG_GREETER_DATA_DIR=/var/lib/lightdm-data/dinosaur
0x7ffc25092144 LC_MONETARY=zh_CN.UTF-8
0x7ffc2509215c CLUTTER_IM_MODULE=xim
...

The glibc variable

Definition

It is defined in glibc-master/posix/environ.c:

/* This file just defines the `__environ' variable (and alias `environ').  */

#include <unistd.h>
#include <stddef.h>

/* This must be initialized; we cannot have a weak alias into bss. */
char **__environ = NULL;
weak_alias (__environ, environ) // weak alias: environ is really just __environ
...

Reading: getenv

char *
getenv (const char *name)
{
  size_t len = strlen (name);
  char **ep;
  uint16_t name_start;

  ...
  name_start = *(const uint16_t *) name;
  ...
  len -= 2;
  name += 2;

  for (ep = __environ; *ep != NULL; ++ep)
    {
      uint16_t ep_start = *(uint16_t *) *ep;

      if (name_start == ep_start && !strncmp (*ep + 2, name, len)
          && (*ep)[len + 2] == '=')
        return &(*ep)[len + 3];
    }
  ...

  return NULL;
}

Writing: putenv and setenv

Both putenv and setenv call __add_to_environ:

int
__add_to_environ (const char *name, const char *value, const char *combined,
                  int replace)
{
  const size_t namelen = strlen (name);
  size_t vallen;
  ...
  vallen = strlen (value) + 1;
  ...
  const size_t varlen = namelen + 1 + vallen;
  ...
  memcpy (new_value, name, namelen);
  new_value[namelen] = '=';
  memcpy (&new_value[namelen + 1], value, vallen);
  ...
}

So the char **environ variable simply holds key=value strings.
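To make the storage model concrete, here is a small Java imitation of the lookup: a linear scan over "NAME=value" strings that returns whatever follows the first '='. It leaves out glibc's two-byte prefix comparison, and the class EnvLookup and its lookup method are made-up names. The sample strings come from the program output shown earlier.

```java
public class EnvLookup {
    /** Linear scan over "NAME=value" strings, in the spirit of the loop over __environ. */
    public static String lookup(String[] environ, String name) {
        for (String entry : environ) {
            // The name must be followed directly by '=', so "XDG" does not match "XDG_VTNR=7".
            if (entry.length() > name.length()
                    && entry.startsWith(name)
                    && entry.charAt(name.length()) == '=') {
                return entry.substring(name.length() + 1);
            }
        }
        return null; // not set
    }

    public static void main(String[] args) {
        String[] environ = {"XDG_VTNR=7", "LC_PAPER=zh_CN.UTF-8"};
        System.out.println(lookup(environ, "LC_PAPER"));
        // The JVM exposes the real process environment through System.getenv.
        System.out.println(System.getenv("PATH"));
    }
}
```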

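How and when they are inherited

// TODO: dig into this when I have time

Summary

Environment variables are a bunch of strings, and they are inherited through the parent-child relationship between processes.

1 Environment variables: origin, principles and applications

2 glibc documentation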

About CORS

· 3 min read

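Preface

Before talking about CORS, we first have to talk about the same-origin policy and cross-origin requests.

Why cross-origin

The same-origin policy restricts how a document or script loaded from one origin can interact with resources from another origin. It is an important security mechanism for isolating potentially malicious documents, and it is enforced by the browser itself.

RFC 6454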

User agents interact with content created by a large number of authors. Although many of those authors are well-meaning, some authors might be malicious. To the extent that user agents undertake actions based on content they process, user agent implementors might wish to restrict the ability of malicious authors to disrupt the confidentiality or integrity of other content or servers.

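More details

CORS

Related reading

Related reading 2

Simple requests

A simple request is a request that does not trigger a CORS preflight request.

Some requests do not trigger a CORS preflight request. This article calls such requests "simple requests"; note that the term is not part of the Fetch specification (which defines CORS). A request counts as a "simple request" if it meets all of the following conditions:

  • It uses one of the following methods: GET, HEAD, POST
  • The Fetch specification defines a set of CORS-safelisted request headers, and no header outside that set may be set manually. The set is: Accept, Accept-Language, Content-Language, Content-Type (note the additional restrictions), DPR, Downlink, Save-Data, Viewport-Width, Width
  • The value of Content-Type is limited to one of: text/plain, multipart/form-data, application/x-www-form-urlencoded
  • No event listener is registered on any XMLHttpRequestUpload object used in the request; an XMLHttpRequestUpload object can be accessed through the XMLHttpRequest.upload property
  • No ReadableStream object is used in the request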
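The first three conditions can be checked mechanically. The sketch below is a simplified illustration built only from the lists quoted above. It is not how any browser implements the check, it ignores the XMLHttpRequestUpload and ReadableStream conditions (those depend on browser-side state), and it treats method and header names case-insensitively as an assumption. The class SimpleRequestCheck and its isSimple method are made-up names.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class SimpleRequestCheck {
    private static final Set<String> METHODS = set("GET", "HEAD", "POST");
    private static final Set<String> SAFE_HEADERS = set(
            "accept", "accept-language", "content-language", "content-type",
            "dpr", "downlink", "save-data", "viewport-width", "width");
    private static final Set<String> CONTENT_TYPES = set(
            "text/plain", "multipart/form-data", "application/x-www-form-urlencoded");

    private static Set<String> set(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    /**
     * authorHeaders holds only the headers set manually by the caller,
     * not the ones the user agent adds on its own.
     */
    public static boolean isSimple(String method, Map<String, String> authorHeaders) {
        if (!METHODS.contains(method.toUpperCase(Locale.ROOT))) {
            return false;
        }
        for (Map.Entry<String, String> header : authorHeaders.entrySet()) {
            String name = header.getKey().toLowerCase(Locale.ROOT);
            if (!SAFE_HEADERS.contains(name)) {
                return false;
            }
            if (name.equals("content-type")) {
                // Compare only the media type, ignoring parameters such as "; charset=UTF-8".
                String type = header.getValue().split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
                if (!CONTENT_TYPES.contains(type)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        headers.put("Content-Type", "application/json");
        System.out.println(isSimple("POST", headers)); // a JSON POST needs a preflight
    }
}
```

A POST with Content-Type application/json fails the check, which is why such requests are preceded by an OPTIONS preflight.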