Recording

Commit volatile memory to persistent append-only log

Recently, a colleague and I built a small web project together using Spring.
We used a context path other than “/”, say “/abc”, which is the project name.
This setting is consistent with existing projects in my team.

But after we deployed this project behind nginx, we ran into a problem. In nginx, we
configured something like this:

location / {
    proxy_pass http://127.0.0.1:5000/abc/;
}

It works fine for URLs typed by hand on the client. But the URL links in the response HTML,
which are generated by the application and point to resources inside it, do not work.
nginx prefixes every URL it forwards with “/abc”, the context path of the application in the
web container, while the links generated by the application already carry that prefix.
The prefix is thus applied twice, and the links to those resources are wrong.

There are workarounds for this, but after some investigation and thought, I
came to this conclusion:

If you use a reverse proxy to forward HTTP requests, don’t use a context path in your
application; instead, let the reverse proxy set the context path for you.

First, if you use a context path other than “/” in your application and you want the application
to be reachable at the top of the domain without a subpath, then you must hardcode that context path
in the reverse proxy. The same information is then hardcoded in two places: your application and the
reverse proxy.

Second, if you use the context path “/” in your application and you want the application to be
reachable under a subpath, a reverse proxy such as nginx can set the X-Forwarded-Prefix header so that
your application generates URL links prefixed with that subpath. If the framework behind your application
respects this header, you probably won’t even notice.

The underlying reason for this conclusion is that both the web application and the reverse proxy are
passive: they accept a request, handle it, and respond to it. The solution comes from the fact that the
reverse proxy stands in front of the web application, so it can attach extra information for the web
application to handle programmatically, without the need to hardcode that information.

Links:
http://stackoverflow.com/questions/10429487/context-path-for-tomcat-web-application-fronted-with-nginx-as-reverse-proxy
http://stackoverflow.com/questions/19866203/nginx-configuration-to-pass-site-directly-to-tomcat-webapp-with-context

Recently, I wrote a LevelDB implementation in Go. In this post, I summarize
some invariants of the algorithms used by LevelDB implementations.

Sequence Number

A sequence number is a monotonically increasing 56-bit integer. Every
time a key is written to LevelDB, it is tagged with a sequence number one
larger than the sequence number of the previously written key.
If two entries in LevelDB have the same user-level key, the one with the larger
sequence number must shadow the other.
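
The shadowing rule can be sketched in Go as an ordering on entries. This is a simplified illustration rather than the code of any real implementation: the entry type and helper names are hypothetical, and it assumes the 56-bit sequence number shares a 64-bit tag with an 8-bit value kind, as in the original C++ LevelDB.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// maxSequence is the largest value that fits in 56 bits.
const maxSequence = uint64(1)<<56 - 1

// entry is a hypothetical internal key: a user key plus a 64-bit tag.
type entry struct {
	userKey []byte
	tag     uint64 // upper 56 bits: sequence number, lower 8 bits: kind
}

// packTag stores the sequence number and the kind in one 64-bit word.
func packTag(seq uint64, kind uint8) uint64 {
	return seq<<8 | uint64(kind)
}

// sequence extracts the sequence number from a tag.
func sequence(tag uint64) uint64 {
	return tag >> 8
}

// less orders entries by user key ascending, then by sequence number
// descending, so the entry that shadows the others comes first.
func less(a, b entry) bool {
	if c := bytes.Compare(a.userKey, b.userKey); c != 0 {
		return c < 0
	}
	return sequence(a.tag) > sequence(b.tag)
}

func main() {
	entries := []entry{
		{[]byte("a"), packTag(1, 1)},
		{[]byte("b"), packTag(2, 1)},
		{[]byte("a"), packTag(3, 1)},
	}
	sort.Slice(entries, func(i, j int) bool { return less(entries[i], entries[j]) })
	for _, e := range entries {
		fmt.Printf("%s@%d\n", e.userKey, sequence(e.tag))
	}
}
```

With this ordering, a scan that stops at the first entry for a user key always sees the entry with the largest sequence number.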

Sorted Memory Tables

Writes are first recorded in a mutable memory table. When that memory table is
full, it is marked as immutable and a new memory table is created to record
writes. The immutable memory table is then compacted to a sorted
disk table in level 0 and deleted. Thus we conclude that:

If a key appears in the mutable memory table, that entry is the newest. Otherwise,
if it appears in the immutable memory table, that entry is the newest.
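
This invariant translates directly into a lookup order. The Go sketch below is only an illustration: the table and db types are hypothetical, and plain maps stand in for the sorted tables.

```go
package main

import "fmt"

// table is a hypothetical stand-in for a sorted table; a map keeps the sketch short.
type table map[string]string

type db struct {
	mem  table   // mutable memory table, receives all new writes
	imm  table   // immutable memory table waiting for compaction, may be nil
	disk []table // stand-in for the on-disk levels, searched last
}

// get returns the newest value for key by searching the tables from newest
// to oldest and stopping at the first hit.
func (d *db) get(key string) (string, bool) {
	if v, ok := d.mem[key]; ok {
		return v, true
	}
	if v, ok := d.imm[key]; ok { // reading from a nil map is safe in Go
		return v, true
	}
	for _, t := range d.disk {
		if v, ok := t[key]; ok {
			return v, true
		}
	}
	return "", false
}

func main() {
	d := &db{
		mem:  table{"a": "a-new"},
		imm:  table{"a": "a-old", "b": "b-imm"},
		disk: []table{{"b": "b-disk", "c": "c-disk"}},
	}
	for _, k := range []string{"a", "b", "c", "d"} {
		v, ok := d.get(k)
		fmt.Println(k, v, ok)
	}
}
```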

Compaction

When an immutable memory table is compacted to a sorted table in level 0, it
is assigned a file number larger than all existing file numbers in this
level. Thus we have:

Entries stored in a newer file in level 0 shadow entries with the same user keys
in older files.
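
A level 0 lookup can rely on this by visiting files from the largest file number to the smallest. The following Go sketch is a simplified illustration with a hypothetical file type; a map stands in for the file contents.

```go
package main

import (
	"fmt"
	"sort"
)

// file is a hypothetical description of a level 0 table.
type file struct {
	number            uint64
	smallest, largest string            // user key range covered by the file
	data              map[string]string // stand-in for the file contents
}

// getLevel0 searches level 0 files from the largest file number to the
// smallest, so a hit in a newer file shadows older files.
func getLevel0(files []file, key string) (string, bool) {
	sorted := append([]file(nil), files...) // do not reorder the caller's slice
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].number > sorted[j].number })
	for _, f := range sorted {
		if key < f.smallest || key > f.largest {
			continue // key is outside this file's range
		}
		if v, ok := f.data[key]; ok {
			return v, true
		}
	}
	return "", false
}

func main() {
	files := []file{
		{number: 5, smallest: "a", largest: "m", data: map[string]string{"k": "old"}},
		{number: 9, smallest: "g", largest: "z", data: map[string]string{"k": "new"}},
	}
	fmt.Println(getLevel0(files, "k"))
}
```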

When there are too many files in level 0 or too much data in level 1 or above,
we start a level compaction. In a level N compaction, we select a file set from
level N such that no remaining file overlaps this file set in terms of user keys.
Then we compact this file set with all overlapping files from level N+1 and
produce sorted tables in level N+1. From this we derive two invariants.
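
The second half of that selection, finding the level N+1 files that overlap the chosen file set, might be sketched in Go as follows. The file type and function names are hypothetical, and keys are compared as plain strings for brevity.

```go
package main

import "fmt"

// file is a hypothetical description of a sorted table and its user key range.
type file struct {
	number            uint64
	smallest, largest string
}

// keyRange returns the smallest and largest user keys covered by files,
// which must not be empty.
func keyRange(files []file) (lo, hi string) {
	lo, hi = files[0].smallest, files[0].largest
	for _, f := range files[1:] {
		if f.smallest < lo {
			lo = f.smallest
		}
		if f.largest > hi {
			hi = f.largest
		}
	}
	return lo, hi
}

// overlapping returns the files in level whose key range intersects [lo, hi].
func overlapping(level []file, lo, hi string) []file {
	var out []file
	for _, f := range level {
		if f.smallest <= hi && f.largest >= lo {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	picked := []file{{number: 7, smallest: "c", largest: "f"}} // file set from level N
	next := []file{ // files in level N+1
		{number: 3, smallest: "a", largest: "b"},
		{number: 4, smallest: "e", largest: "h"},
		{number: 5, smallest: "k", largest: "m"},
	}
	lo, hi := keyRange(picked)
	for _, f := range overlapping(next, lo, hi) {
		fmt.Println("compact with file", f.number)
	}
}
```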

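Battlefield State

Battlefield state = the sum of the states of all entities

Entities here include, but are not limited to, players, AI, bullets, and shops.

Time

Every unit of time, starting from the battlefield state at the previous tick, compute all operations initiated and all state changes received during that time unit. Time itself is also an input that drives the continuous actions of entities.
The unit of time here can be an interval such as 10ms.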

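Computation

At any moment, an operation that the player performs on the client is sent to the server and, at the same time, applied to the local copy of the state in the next local computation. Once the server's response arrives,
it is recomputed, together with all responses and other operations received in the meantime, on top of the previously saved battlefield state, and the client has to correct the copy it computed at the previous tick. For example:

  1. Suppose that at time Ti the local state and the server state agree, and both are Si.
  2. At time Ti+1, operation Ai+1 is initiated; the client computes it locally and gets local state LSi+1.
  3. At time Ti+4, operation SBi+4 from another player arrives from the server; the client computes it locally and gets local state LSi+4.
  4. At time Ti+8, operation Ai+8 is initiated; the client computes it locally and gets local state LSi+8.
  5. At time Ti+k, the response SAi+5 to operation Ai+1 arrives. The client computes SBi+4 on state Si to get Si+4, computes SAi+5 on Si+4 to get Si+5, and computes the subsequent operations on the new
    Si+5 to get a new local state LSi+k-1. After comparing it with the previous local state LSi+k-1 and making corrections, the client computes the operations of the current tick.

For actions that are hard to correct, such as death, the client can defer handling until the server responds.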
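
The five steps above can be sketched in Go. This is a toy model under loud assumptions: the state is a single integer, operations either add to it or multiply it (so that ordering matters), and the op and client types with their fields are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// op is a hypothetical operation on a toy integer state.
type op struct {
	id   int  // matches a local operation with the server's response to it
	tick int  // tick at which the operation takes effect
	mul  bool // multiply by n when true, add n otherwise
	n    int
}

func apply(s int, o op) int {
	if o.mul {
		return s * o.n
	}
	return s + o.n
}

type client struct {
	base      int  // last state known to agree with the server (Si)
	confirmed []op // operations received from the server since base
	pending   []op // local operations not yet acknowledged by the server
	local     int  // predicted local state (LS)
}

// act applies a local operation right away and remembers it as pending.
func (c *client) act(o op) {
	c.pending = append(c.pending, o)
	c.local = apply(c.local, o)
}

// receive handles an operation from the server and returns the previous
// local state. Another player's operation is applied on top of the local
// state. A response to one of our own operations triggers a recomputation
// from base: confirmed operations in tick order, then the pending ones.
func (c *client) receive(o op, ack bool) (old int) {
	old = c.local
	c.confirmed = append(c.confirmed, o)
	if !ack {
		c.local = apply(c.local, o)
		return old
	}
	kept := c.pending[:0]
	for _, p := range c.pending {
		if p.id != o.id {
			kept = append(kept, p)
		}
	}
	c.pending = kept
	sort.SliceStable(c.confirmed, func(i, j int) bool { return c.confirmed[i].tick < c.confirmed[j].tick })
	s := c.base
	for _, x := range c.confirmed {
		s = apply(s, x)
	}
	for _, x := range c.pending {
		s = apply(s, x)
	}
	c.local = s
	return old
}

func main() {
	c := &client{base: 1, local: 1}
	c.act(op{id: 1, tick: 1, n: 2})                          // local operation
	c.receive(op{id: 100, tick: 4, mul: true, n: 3}, false)  // another player's operation
	c.act(op{id: 2, tick: 8, n: 1})                          // second local operation
	old := c.receive(op{id: 1, tick: 5, n: 2}, true)         // response to the first operation
	fmt.Println("predicted:", old, "corrected:", c.local)    // prints "predicted: 10 corrected: 6"
}
```

The prediction was 10 because the local addition ran before the other player's multiplication, while the server ordered them the other way round; replaying from the saved state yields 6.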

Cheating

People coming from C or C++ may think that there are no memory leaks in languages with garbage collection. But the truth is:

Memory leaks do exist in languages with automatic garbage collection.

From a human perspective, garbage is something that will never be used again after some point. Unfortunately, a computer can’t
understand this. It uses the concept of unreachability to decide whether
an object is garbage or not. There is a gap, as you may already know: objects that are “reachable but no longer used”.

Here, I show a memory leak case in Go. The same can happen in other garbage-collected languages, like Java.
Goroutines are lightweight and convenient in Go, so you may sometimes use one to do background jobs.
Suppose I write the following code for a database library:

package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

type Request struct {
}

type DB struct {
	closed bool
	reqC   chan Request
	mutex  sync.Mutex
}

// Close shuts the database down once; later calls are no-ops.
func (db *DB) Close() {
	db.mutex.Lock()
	if db.closed {
		db.mutex.Unlock()
		return
	}
	db.closed = true
	db.mutex.Unlock()

	fmt.Println("db closing")
	close(db.reqC)
	runtime.SetFinalizer(db, nil)
}

// backgroundWork drains requests until the channel is closed.
func backgroundWork(db *DB) {
	for range db.reqC {
	}
	fmt.Println("db channel closed")
}

func Open(address string, name string) (*DB, error) {
	db := &DB{reqC: make(chan Request, 100)}
	// Ask the runtime to call Close once db becomes unreachable.
	runtime.SetFinalizer(db, (*DB).Close)
	go backgroundWork(db)
	return db, nil
}

func test() {
	// The returned *DB is dropped immediately.
	Open("tcp://127.0.0.1:4444", "test")
}

func main() {
	test()
	for {
		// time.Sleep takes a time.Duration; a bare 5 would mean 5 nanoseconds.
		time.Sleep(5 * time.Second)
	}
}

You may notice that the database object created in function test is garbage, but it has not been and never will be collected.
The object is reachable from function backgroundWork, which runs in the goroutine started by go backgroundWork(db) in Open.

If we want the garbage collector to collect an object for us, nothing may reference that object. In the code above,
we should not reference the database object in backgroundWork. Let us make some changes:

func backgroundWork(reqC chan Request) {
	for range reqC {
	}
	fmt.Println("db channel closed")
}

func Open(address string, name string) (*DB, error) {
	db := &DB{reqC: make(chan Request, 100)}
	runtime.SetFinalizer(db, (*DB).Close)
	go backgroundWork(db.reqC)
	return db, nil
}

Now, we run the program and find that the database object gets collected.

I recommend three principles when you use a thread or goroutine as a background workhorse for API objects:

Recently I tried several C++ build systems.

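  • CMake: For a build system, I personally find it overly complex. Project dependencies are hard to handle; maybe I am doing it wrong and just can't use ExternalProject well.

  • CPM: Based on CMake, with extra requirements on project structure and namespaces. I write my code and export my interfaces, and you still want to dictate how I write it?

  • biicode: Automatically generates a lot of implicit configuration, which you will very likely have to modify when you use it. The author's imagination runs a bit wild. Not recommended.

  • meson: A very lightweight build system with a pleasant configuration language. Dependencies can be switched between system packages and source. Its patch support for existing projects is quite good and requires no destructive changes to upstream. Documentation is still sparse; for some things you probably have to read the code to learn how to use them. Worth a try.

  • bazel: From Google. Still far from 1.0, but already usable. It suits a centralized code base like Google's very well: one WORKSPACE at the root, every other project is just a BUILD file, and dependencies between projects are easy to specify. It does not cooperate well with system packages; there isn't even a make install. For patching, a single BUILD file can do the job, though complex cases may be more troublesome. Documentation is quite good. Recommended.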

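None of these build systems solves one problem: consistency, at build time, between the source and the package of a third-party dependency. For example:

If packageA's header header.hpp is installed to packageA/header.hpp by make install or some other package manager, then packageB must be able to reference it with #include <packageA/header.hpp>, regardless of whether packageB depends on packageA's source or on the system package. header.hpp may live anywhere inside packageA, and may even be a header generated at build time.

To achieve this, my idea is: when building packageB, if it depends on packageA's source, first build packageA and install it into a private directory, then adjust the compiler flags to specify the header include directory and the link directory. If it depends on the packageA package, use the system include and link directories directly. That way, the configuration language can switch between packageA's source and its package.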
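
A minimal Go sketch of that idea, with everything in it hypothetical: the dependency type, the buildFlags function, and the private prefix /tmp/deps/packageA are made up for illustration. Only -I and -L are the usual compiler options for include and library search directories.

```go
package main

import (
	"fmt"
	"path"
)

// dependency is a hypothetical description of how packageB depends on a package.
type dependency struct {
	name       string
	fromSource bool   // true: built from source and installed into prefix
	prefix     string // private install prefix, used only when fromSource is true
}

// buildFlags returns the extra compiler flags needed to find the dependency.
// A system package needs none, because the compiler already searches the
// system include and link directories.
func buildFlags(d dependency) []string {
	if !d.fromSource {
		return nil
	}
	return []string{
		"-I" + path.Join(d.prefix, "include"),
		"-L" + path.Join(d.prefix, "lib"),
	}
}

func main() {
	src := dependency{name: "packageA", fromSource: true, prefix: "/tmp/deps/packageA"}
	pkg := dependency{name: "packageA"}
	fmt.Println(buildFlags(src))
	fmt.Println(buildFlags(pkg))
}
```

In both cases packageB keeps writing #include <packageA/header.hpp>; only the flags differ.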

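meson can switch a dependency between source and package, but it needs special customization for the differences between the source layout and the package layout. bazel is simply a centralized code repository; include paths can only be relative to the root directory or the current directory.

Maybe I have to roll my own? Damnit!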
