Recording

Commit volatile memory to persistent append-only log



A step-by-step approach to the Raft consensus algorithm

Posted on Mar 20 2018

This is my second time reading through the Raft algorithm, and it is hard to recall what I learned the first time. This time I decided to record my thoughts for future reference. I hope it is useful to newcomers to distributed systems like me.

What does consensus algorithm mean?

A consensus algorithm is the process used to achieve agreement on shared state among faulty processes in a distributed system.

Introduce Replicated State Machines

Here, for simplicity, we define a state machine as State' = Machine(State, Input). A state machine is defined by its start state; it makes progress by consuming a sequence of inputs and produces a sequence of intermediate states. For example:

    State' = Machine(State, Input)
    State'' = Machine(State', Input')
    State''' = Machine(State'', Input'')

Given a state machine, how can we tell whether it is a replicated state machine?

First, a replicated state machine is deterministic. Given the same start state and the same sequence of inputs, the state machine always produces the same intermediate states.

Second, two state machines built from the same logic must be the same, even if they run on different processes or nodes and are possibly written in different languages. Here we define sameness in terms of determinism: given the same start state and the same sequence of inputs, if two state machines produce the same sequence of intermediate states, we say the two state machines are the same. Thus multiple copies of the same state machine, starting from the same start state and fed the same sequence of inputs, must produce the same sequence of intermediate states.

Third, states (both the start state and intermediate states) and inputs must be self-contained, so that they can be replicated to other processes or nodes through serialization and deserialization.
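The first two properties can be illustrated with a minimal sketch. The class names below are hypothetical and not taken from Raft; the only assumption is that Machine is a pure function of the current state and the input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;

// A hypothetical, minimal model of State' = Machine(State, Input).
final class StateMachine<S, I> {
    private final BiFunction<S, I, S> machine; // assumed to be pure and deterministic
    private S state;

    StateMachine(S startState, BiFunction<S, I, S> machine) {
        this.state = startState;
        this.machine = machine;
    }

    // Applies one input and returns the resulting intermediate state.
    S apply(I input) {
        state = machine.apply(state, input);
        return state;
    }

    // Feeds a whole sequence of inputs and records every intermediate state.
    List<S> applyAll(List<I> inputs) {
        List<S> states = new ArrayList<>();
        for (I input : inputs) {
            states.add(apply(input));
        }
        return states;
    }
}

final class ReplicaDemo {
    public static void main(String[] args) {
        List<Integer> inputs = Arrays.asList(5, -2, 10);
        // Two replicas: same logic, same start state, same inputs.
        StateMachine<Integer, Integer> a = new StateMachine<>(0, Integer::sum);
        StateMachine<Integer, Integer> b = new StateMachine<>(0, Integer::sum);
        System.out.println(a.applyAll(inputs)); // [5, 3, 13]
        System.out.println(b.applyAll(inputs)); // [5, 3, 13]
    }
}
```

Both replicas print the same sequence of intermediate states, which is all that "same" means in the definition above.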


AspectJ Load-Time Weaving for Spring

Posted on Aug 31 2017

Spring AOP is proxy-based, using either JDK dynamic proxies or CGLIB. Spring’s Cache Abstraction, Transaction Management and Asynchronous Execution are all built upon AOP proxies.

However, a proxy can intercept only external method calls. This means that self-invocation, that is, a method within the target object calling another method of the target object, will not lead to an actual interception at runtime.

Thus, the following code will not work as intended: the calls to methodBase from methodA and methodB bypass the proxy, so nothing is cached.

    @Service
    public class SomeServiceImpl implements SomeService {

        @Cacheable(value = "SomeCache", key = "#kind + '#' + #id")
        private Object methodBase(String kind, String id) {
            // ...
            return result;
        }

        @Override
        public Object methodA(String id) {
            // Self-invocation: this call never goes through the caching proxy.
            return methodBase("a", id);
        }

        @Override
        public Object methodB(String id) {
            // Same problem here.
            return methodBase("b", id);
        }

    }
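The effect can be reproduced without Spring. The following is a minimal sketch using a plain JDK dynamic proxy; Greeter, GreeterImpl and SelfInvocationDemo are hypothetical names, and the handler models proxy-based interception in general rather than Spring's actual implementation.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface and implementation, for illustration only.
interface Greeter {
    String base(String kind);
    String methodA();
}

final class GreeterImpl implements Greeter {
    @Override
    public String base(String kind) {
        return "base:" + kind;
    }

    @Override
    public String methodA() {
        // Self-invocation: "this" is the raw target, not the proxy,
        // so this call never passes through the InvocationHandler.
        return base("a");
    }
}

final class SelfInvocationDemo {
    // Wraps the target in a JDK dynamic proxy that records each intercepted method name.
    static Greeter proxy(Greeter target, List<String> intercepted) {
        InvocationHandler handler = (p, method, args) -> {
            intercepted.add(method.getName());
            return method.invoke(target, args);
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(), new Class<?>[] {Greeter.class}, handler);
    }

    public static void main(String[] args) {
        List<String> intercepted = new ArrayList<>();
        Greeter greeter = proxy(new GreeterImpl(), intercepted);
        greeter.methodA();
        System.out.println(intercepted); // [methodA]
    }
}
```

Only methodA shows up in the list: the inner call to base happens on the target itself, which is exactly why the @Cacheable advice above is skipped.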

AspectJ Load-Time Weaving

Spring provides a library named spring-aspects, along with AdviceMode.ASPECTJ, which can be used as the value of the mode attribute of the @EnableCaching, @EnableTransactionManagement and @EnableAsync annotations to switch from proxies to AspectJ weaving. But that is not enough: an AspectJ load-time weaver is also required to weave the aspects into the target classes.

The AspectJ load-time weaver weaves a target class by transforming its class file bytecode using a ClassFileTransformer named ClassPreProcessorAgentAdapter from aspectjweaver.jar. This transformation can be performed either at the JVM level through the Instrumentation interface or at the level of an individual ClassLoader. A method with a signature similar to void addTransformer(ClassFileTransformer transformer) is required to apply the transformation in the class loading phase. Instrumentation supports this method natively, but not all class loaders do.
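To make the JVM-level path concrete, here is a minimal sketch of a Java agent that registers a ClassFileTransformer through Instrumentation. LoggingTransformer and SketchAgent are hypothetical names; the transformer only records class names, whereas a real weaver such as ClassPreProcessorAgentAdapter would return rewritten bytecode.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical transformer: records class names and leaves the bytecode untouched.
final class LoggingTransformer implements ClassFileTransformer {
    final List<String> seen = new CopyOnWriteArrayList<>();

    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain, byte[] classfileBuffer) {
        seen.add(className);
        return null; // null tells the JVM that no transformation was performed
    }
}

final class SketchAgent {
    // Called by the JVM before main() when it is started with -javaagent:<agent jar>,
    // provided the jar manifest names this class in its Premain-Class attribute.
    public static void premain(String agentArgs, Instrumentation instrumentation) {
        instrumentation.addTransformer(new LoggingTransformer());
    }
}
```

Every class loaded after premain runs passes through transform, regardless of which class loader loads it. That is the difference from the per-ClassLoader route, which depends on the class loader offering its own addTransformer-style method.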

Custom class loader to apply class file transformation

Spring’s @EnableLoadTimeWeaving creates a bean named loadTimeWeaver that injects a ClassFileTransformer, capable of weaving target classes with the desired aspects, into the bean class loader. Unfortunately, Spring Boot does not support this approach.

Because loadTimeWeaver is a bean, and class loading happens in the bean definition parsing phase, which certainly precedes the bean creation phase in the same application context, @EnableLoadTimeWeaving should be enabled in an application context that is an ancestor of the application context where the target class is located.


Spring Profile

Posted on Apr 8 2017

Spring profiles allow conditional registration of beans/components at runtime, based on profile names specified declaratively in configuration or programmatically in code. Let's see some use cases.

Activate @Component based on active profile

Suppose we have a service interface called SomeService which has a method called doSomething:

    // In SomeService.java
    public interface SomeService {
        void doSomething();
    }

SomeService should do real work only in the production environment and trivial things elsewhere. We can do it this way:

    // In SomeServiceImpl.java
    @Service
    @Profile("production")
    public class SomeServiceImpl implements SomeService {
        @Override
        public void doSomething() {
            // Do real work ...
        }
    }

    // In NoOpSomeServiceImpl.java
    @Service
    @Profile("!production")
    public class NoOpSomeServiceImpl implements SomeService {
        @Override
        public void doSomething() {
            // Do trivial things ...
        }
    }

Now, depending on whether the production profile is active, either SomeServiceImpl or NoOpSomeServiceImpl will serve requests as SomeService.

Register @Bean based on active profile

    @Configuration
    public class WebConfiguration {
        // Registered when "dev" is active or "featureA" is not active.
        @Bean
        @Profile({"dev", "!featureA"})
        public CustomBean customBean() {
            // instantiate, configure and return bean ...
            return new CustomBean();
        }
    }
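When @Profile lists several values, the bean is registered if any one of them matches, and a leading "!" means "this profile is not active". The sketch below models just that rule in plain Java. ProfileMatcher is a hypothetical helper and not Spring's own code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Simplified, hypothetical model of how a @Profile value list is evaluated.
final class ProfileMatcher {
    // Returns true if at least one expression matches the active profiles.
    // "name" matches when the profile is active, "!name" when it is not.
    static boolean matches(Set<String> activeProfiles, String... expressions) {
        for (String expression : expressions) {
            boolean negated = expression.startsWith("!");
            String name = negated ? expression.substring(1) : expression;
            if (activeProfiles.contains(name) != negated) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] expr = {"dev", "!featureA"};
        System.out.println(matches(new HashSet<>(Arrays.asList("dev", "featureA")), expr)); // true
        System.out.println(matches(new HashSet<>(Arrays.asList("featureA")), expr));        // false
        System.out.println(matches(new HashSet<String>(), expr));                           // true
    }
}
```

So customBean above is skipped only when featureA is active and dev is not.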

Don't use context path in web application behind reverse proxy

Posted on Aug 20 2016

Recently, a colleague and I built a small web project together using Spring. We used a context path other than “/”, say “/abc”, which is the project name. This setting is consistent with existing projects in my team.

But after we deployed this project behind nginx, we ran into a problem. In nginx, we configured something like this:

    location / {
        proxy_pass http://127.0.0.1:5000/abc/;
    }

It works fine for URLs typed by hand on the client. But the links in the response HTML, which are generated by the application and point to resources inside it, do not work. Every URL forwarded by nginx is prefixed with “/abc”, the context path of the application in the web container, while the links generated by the application already carry that prefix. The result is wrong URLs for those resources.

There are workarounds, but after some investigation and thought, I came to this conclusion:

If you use a reverse proxy to forward HTTP requests, then don’t use a context path in your application; instead, let the reverse proxy set the context path for you.

First, if you use a context path other than “/” in your application and you want your application to be reachable at the top of the domain without a subpath, then you must hardcode that context path in the reverse proxy. The same information is hardcoded in two places: your application and the reverse proxy.

Second, if you use the context path “/” in your application and you want your application to be reachable via a subpath, a reverse proxy such as nginx can set X-Forwarded-Prefix so that your application generates links prefixed with that subpath. If the framework backing your application respects this header, you probably won’t even notice.

The underlying reason for this conclusion is that both the web application and the reverse proxy are passive: they accept a request, handle it, and respond to it. The solution comes from the fact that the reverse proxy stands in front of the web application, so it can pass extra information for the web application to handle programmatically, without the need to hardcode such information.
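As a rough illustration of what a framework does with that header, here is a minimal sketch of link generation. LinkBuilder is a hypothetical helper and not part of Spring or any other framework; it simply concatenates the prefix announced by the proxy, the local context path and the resource path.

```java
// Hypothetical helper: builds a link from the prefix announced by the reverse
// proxy (for example via X-Forwarded-Prefix) and the local context path.
final class LinkBuilder {
    static String link(String forwardedPrefix, String contextPath, String resourcePath) {
        String url = normalize(forwardedPrefix) + normalize(contextPath) + normalize(resourcePath);
        return url.isEmpty() ? "/" : url;
    }

    // Turns null, "" or "/" into "", and "abc/" or "/abc/" into "/abc".
    private static String normalize(String segment) {
        if (segment == null) {
            return "";
        }
        String s = segment.trim();
        while (s.endsWith("/")) {
            s = s.substring(0, s.length() - 1);
        }
        if (s.isEmpty()) {
            return "";
        }
        return s.startsWith("/") ? s : "/" + s;
    }

    public static void main(String[] args) {
        // Context path "/abc" behind "location /": the link already carries "/abc",
        // and the proxy_pass rule prepends "/abc" a second time when forwarding.
        System.out.println(link(null, "/abc", "/css/site.css")); // /abc/css/site.css
        // Context path "/" with the proxy announcing the subpath.
        System.out.println(link("/abc", "/", "/css/site.css"));  // /abc/css/site.css
        // Context path "/" served at the top of the domain.
        System.out.println(link(null, "/", "/css/site.css"));    // /css/site.css
    }
}
```

In the second and third cases the application itself holds no path information; the only place that knows the public prefix is the reverse proxy.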

Links:
http://stackoverflow.com/questions/10429487/context-path-for-tomcat-web-application-fronted-with-nginx-as-reverse-proxy
http://stackoverflow.com/questions/19866203/nginx-configuration-to-pass-site-directly-to-tomcat-webapp-with-context


Invariants in LevelDB algorithm

Posted on Jun 14 2016

Recently, I wrote a LevelDB implementation in Go. In this post, I summarize some invariants in the algorithms used by LevelDB implementations.

Sequence Number

A sequence number is a monotonically increasing 56-bit integer value. Every time a key is written to LevelDB, it is tagged with a sequence number one larger than the sequence number of the previous key written. If two entries in LevelDB have the same user key, the one with the larger sequence number must shadow the other.
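The shadowing rule falls out of how entries are ordered. LevelDB packs the 56-bit sequence number together with an 8-bit value type into one 64-bit tag, and sorts entries by user key ascending, then by sequence number descending. The sketch below shows that ordering in Java; the class name is hypothetical, and comparing user keys as String is a simplification of LevelDB's byte-wise comparison.

```java
import java.util.Comparator;

// Sketch of an internal key: a user key plus a 64-bit tag holding
// a 56-bit sequence number and an 8-bit value type.
final class InternalKey {
    static final long MAX_SEQUENCE = (1L << 56) - 1;

    final String userKey;
    final long tag;

    InternalKey(String userKey, long sequence, int type) {
        if (sequence < 0 || sequence > MAX_SEQUENCE) {
            throw new IllegalArgumentException("sequence out of 56-bit range");
        }
        if (type < 0 || type > 0xFF) {
            throw new IllegalArgumentException("type out of 8-bit range");
        }
        this.userKey = userKey;
        this.tag = (sequence << 8) | type;
    }

    long sequence() {
        return tag >>> 8; // unsigned shift: the top bit may be set
    }

    int type() {
        return (int) (tag & 0xFF);
    }

    // User keys ascending, then tags descending (unsigned), so the newest
    // entry for a user key sorts first and shadows the older ones.
    static final Comparator<InternalKey> ORDER =
            Comparator.comparing((InternalKey k) -> k.userKey)
                    .thenComparing((a, b) -> Long.compareUnsigned(b.tag, a.tag));

    public static void main(String[] args) {
        InternalKey older = new InternalKey("k", 7, 1);
        InternalKey newer = new InternalKey("k", 8, 1);
        System.out.println(ORDER.compare(newer, older) < 0); // true
    }
}
```

A reader that scans in this order and stops at the first entry for a user key always sees the latest write.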

Sorted Memory Tables

Writes are first recorded in a mutable memory table. When that memory table is full, it is marked as immutable and a new memory table is created to record writes. The immutable memory table is then compacted to a sorted disk table in level 0 and deleted. Thus we conclude that:

If a key appears in the mutable memory table, that entry is the newest. Otherwise, if it appears in the immutable memory table, that entry is the newest.

Compaction

When an immutable memory table is compacted to a sorted table in level 0, it is assigned a file number larger than all existing file numbers in this level. Thus we have:

Entries stored in a newer file in level 0 shadow entries with the same user keys in older files.
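Taken together, the memory table and level 0 invariants give a read order: mutable memory table first, then the immutable one, then level 0 files from the largest file number down. A minimal sketch follows, with in-memory maps standing in for tables; the names are hypothetical, and deletion markers and deeper levels are left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a sorted table file in level 0.
final class Level0File {
    final long fileNumber;
    final Map<String, String> entries = new TreeMap<>();

    Level0File(long fileNumber) {
        this.fileNumber = fileNumber;
    }
}

final class ReadPath {
    final Map<String, String> mutableTable = new TreeMap<>();
    Map<String, String> immutableTable = null; // absent when no compaction is pending
    final List<Level0File> level0 = new ArrayList<>();

    // Returns the newest value for the key, searching from newest source to oldest.
    Optional<String> get(String key) {
        if (mutableTable.containsKey(key)) {
            return Optional.of(mutableTable.get(key));
        }
        if (immutableTable != null && immutableTable.containsKey(key)) {
            return Optional.of(immutableTable.get(key));
        }
        // Level 0 files may overlap, so visit them by descending file number.
        List<Level0File> newestFirst = new ArrayList<>(level0);
        newestFirst.sort(Comparator.comparingLong((Level0File f) -> f.fileNumber).reversed());
        for (Level0File file : newestFirst) {
            if (file.entries.containsKey(key)) {
                return Optional.of(file.entries.get(key));
            }
        }
        return Optional.empty(); // deeper levels are omitted in this sketch
    }
}
```

The search can stop at the first hit because each earlier source is guaranteed to hold newer data than every later one.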

When there are too many files in level 0 or too much data in level 1 or above, we start a level compaction. In a level N compaction, we select a file set from level N such that no remaining file in that level overlaps this file set in user key range. Then we compact this file set with all overlapping files from level N+1 and produce sorted tables in level N+1. Thus we arrive at two invariants.
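The step of pulling in overlapping files from level N+1 can be sketched as follows. TableFile and CompactionInputs are hypothetical names, the level N selection is taken as given, and user keys are compared as String for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical file metadata: the smallest and largest user key stored in a table file.
final class TableFile {
    final long number;
    final String smallest;
    final String largest;

    TableFile(long number, String smallest, String largest) {
        this.number = number;
        this.smallest = smallest;
        this.largest = largest;
    }

    // Two closed key ranges overlap unless one ends before the other begins.
    boolean overlaps(String lo, String hi) {
        return smallest.compareTo(hi) <= 0 && largest.compareTo(lo) >= 0;
    }
}

final class CompactionInputs {
    // Returns the chosen level N files plus every level N+1 file whose
    // user key range overlaps the range covered by the chosen files.
    static List<TableFile> pick(List<TableFile> fromLevelN, List<TableFile> levelNPlus1) {
        String lo = null;
        String hi = null;
        for (TableFile file : fromLevelN) {
            if (lo == null || file.smallest.compareTo(lo) < 0) {
                lo = file.smallest;
            }
            if (hi == null || file.largest.compareTo(hi) > 0) {
                hi = file.largest;
            }
        }
        List<TableFile> inputs = new ArrayList<>(fromLevelN);
        if (lo == null) {
            return inputs; // nothing was selected from level N
        }
        for (TableFile file : levelNPlus1) {
            if (file.overlaps(lo, hi)) {
                inputs.add(file);
            }
        }
        return inputs;
    }
}
```

For example, a level N file covering "c" to "f" pulls in level N+1 files covering "b" to "d" and "e" to "g", but not ones covering "a" to "b" or "h" to "k".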


Kezhu Wang

© 2012 — 2019 Kezhu Wang