Boom! Skull-Engine Released

I’m so excited to announce that Skull-Engine is released today! Read on for the details.

What’s Skull?

Skull is a serving framework: it starts fast, offers high development productivity, and is easy to maintain.

Why was Skull born?

To start a new project, we usually need to read tons of documentation for the chosen framework/tech stack, then decide whether to use it and how to use it. That may take a week or more for a newbie; even an expert still needs a batch of trivial effort to build everything from scratch. It’s a terrible experience. Life is short; let’s spend the time on the most important things instead.

Once we have a project, we need to modify the code and configuration again and again, and test it. Sometimes we really want to switch to another language to solve a specific problem, but that’s extremely hard, since we already chose a programming language at the beginning. Have you ever regretted that decision? For example, C/C++ provides high runtime performance, but the development cycle is long, and people must be really careful with low-level memory issues; on the other hand, Python has rich packages for solving problems, but its performance is much worse than C/C++’s. Could we use them all in one program, choosing the right language for the right place, with the different languages communicating with each other easily and smoothly?

Besides, it’s hard for people to understand a complex program. As the code grows bigger and more complex, it becomes a big black box: no one knows how it works, and it turns unmaintainable because no one can see even a basic workflow clearly. Imagine it: how could a programmer figure out the basic workflow of a complex program, without documentation, in 5 minutes?

Skull was born to solve these kinds of problems.

Key Components and Features

Skull is based on Google Protobuf and Flibs, and targets the Linux platform. It consists of 3 components:

  • skull-core
  • skull-user-api
  • skull-project-management-scripts

It provides the following key features for users who want to build a project easily:

  • Modular Development Environment
  • Project Management
  • Processize
  • Lockfree Environment
  • Native Monitoring
  • Native Async Network IO
  • Native Background IO Job
  • Native Timer
  • Multi-Language Support (C/C++, Python)
  • Integrated with Nginx
  • Service Shareable
  • Basic Functional Test Environment provided

Key Concepts

There are 3 major concepts in Skull: Workflow, Module and Service. Before using Skull, let’s understand these core concepts first.


A Module is an independent logic unit; it defines what data we use and what we do in a given step.



A Workflow is more like a set of transaction rules, an automator or pipeline: it controls how a transaction works, executing its modules one by one until it finishes. Multiple modules can be chosen to join a workflow, and there can be multiple workflows in Skull.
Each Workflow has its own SharedData, which every Module belonging to that Workflow can read and write.



A Service is designed to manage data, and it provides a group of APIs to access that data. A Module can use these APIs to access or consume the data, then decide what to do next. A Service is also shareable; users are highly encouraged to share their Services with other Skull projects, to make the world better.


Example of a Skull Application



No one would care about or adopt a program with poor system performance, so let’s see how Skull performs.

  • Testing Environment

    Role          CPU             Mem   NIC
    perf client   4 vcpus 2.3GHz  8GB   100 Mbps
    skull-engine  8 vcpus 2.3GHz  16GB  1 Gbps

  • Hard KPI
    • The response latency must be lower than 50 milliseconds

Generally, the results are excellent across the different scenarios, and as expected, the C++ modules’ results are much better than Python’s. That is because Python’s GIL, which keeps its global state correct, hurts performance in multi-threaded cases. The detailed charts are listed below.

Single C++ Module


A C++ Module Calling a Service


A C++ Module Calling a Service (with EndPoint)


Service Timer Job (Read)


Service Timer Job (Write)


Single Python Module


A Python Module Calling a Service


A Python Module Parsing an HTTP Request



Before finishing this article, let’s see how to create a Skull project within minutes:
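Skull ships with project-management scripts, so bootstrapping looks roughly like the sketch below. Treat this as a minimal sketch only: the subcommand names are my assumptions, so check the project README for the real CLI.

# NOTE: hypothetical subcommands, for illustration only; see the README
skull create my-project        # scaffold a new project skeleton
cd my-project
skull workflow --add           # register a workflow
skull module --add             # attach a module to it (C/C++ or Python)
skull build                    # compile everything
skull start                    # run the engine locally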

This is definitely a new experience of doing a project! Write code more happily and make life easier! Share your Skull Services with others, and help them build their stuff much more easily. Join me, let’s make the world better 🙂 Visit the project home page, and press the Star button if you like it 😀

Next, I’ll write some articles to dive deep into the details of Skull. Stay tuned…

Control your Traffic — Principle

Every day we visit many websites and watch many videos without caring how it all works. Do you want to know the details? Follow along to take a look, and to see how to control the traffic as well.

Basic workflow

The workflow between browser and website:

  • User inputs a website url
  • Browser sends a dns query to dns server for this domain
  • Browser receives the domain ip records
  • Browser sends a query to the website server
  • Browser receives the response data from the website server
  • Browser draws the website by the data
  • User views the website contents
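
By the way, you can observe the normally hidden DNS step yourself with a standard lookup tool such as dig (example.com here is just a placeholder domain):

# Ask your configured DNS server for a domain's A records
dig example.com +short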

The two major transactions in the workflow:
1) DNS server transaction
2) Web server transaction

What’s the Meaning of Controlling the Traffic

So controlling the traffic means we need to control the DNS query as well as the web server query.

Note: from the above we can see that, before sending a query to the website server, we need to know the website’s IP address, and the “domain <-> IP” mapping is stored in the DNS server. Normally, the DNS query step is hidden from users.

Why Do We Need to Control the Traffic?

Before that, we need to know the reason, right?

In some cases, we want to:
1) Connect some websites directly
2) Connect some websites by proxy A (high speed)
3) Connect some websites by proxy B (more secure)

How to Control

How to Control DNS Query

Here we only talk about controlling DNS from the client side; since we cannot touch the DNS server, we cannot control that part.

There are many DNS clients we can use. I recommend dnsmasq for beginners, since it’s a full-featured DNS client that’s easy to start with.


For example, you can write your DNS configuration and put it into /etc/dnsmasq.d/example.conf:
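A minimal sketch (example.com stands in for whichever domains you want to redirect, and 8.8.8.8 is Google’s public DNS server):

# /etc/dnsmasq.d/example.conf
# Resolve example.com and *.example.com via Google's DNS server
server=/example.com/8.8.8.8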


This configuration tells dnsmasq to resolve the matching domains via Google’s DNS server. Note that the choice of upstream server matters: geo-based DNS services, like OpenDNS, return the IPs closest to your location, which speeds up your queries.

How to Control the Website Query (HTTP query)

At this step we already have the website’s IP, and we want to connect to that IP through a proxy. Here we can use iptables: maintain an IP list in iptables, and forward the traffic to the specific proxy when it matches, as in the sketch below.
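
A rough sketch of the idea, assuming a transparent proxy (e.g. redsocks) is already listening on local port 12345, and using a documentation IP range as the placeholder list:

# Create a dedicated NAT chain for proxied destinations
iptables -t nat -N PROXY_OUT
# Redirect TCP traffic for the listed IPs to the local transparent proxy
iptables -t nat -A PROXY_OUT -d 203.0.113.0/24 -p tcp -j REDIRECT --to-ports 12345
# Hook the chain into OUTPUT so locally generated traffic gets matched
iptables -t nat -A OUTPUT -p tcp -j PROXY_OUT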

The End

Now, you can write your own dnsmasq configuration file and plan your traffic path with dnsmasq + iptables.

Ok, this article is just a brief introduction. There are still some problems we have to face:
1) What if the DNS server returns wrong IP records?
2) What if the DNS query is hijacked?
3) It’s really hard to maintain the IP list in iptables. Is there any other way to handle it? (IPSet)

I’ll talk about these next time. Have fun 🙂

Compile mutrace on RHEL6

Why mutrace

Recently, I wanted to profile the Linux user-space lock (pthread mutex/rwlock) performance of a project, and there are a few options for this purpose:

  • valgrind(drd)
  • systemtap(futexes.stp)
  • lttng
  • mutrace

Finally, I selected mutrace, because:

  1. valgrind (drd): It’s really slow and cannot provide credible information here.
  2. systemtap (futexes.stp): It’s really great for profiling the kernel, but user-level profiling needs additional effort to set up the environment (some debuginfo packages).
  3. lttng: There is an rpm package only for RHEL7, so I would have to compile it myself; and it has more than one dependency library, which makes the dependency issues hard to fix.
  4. mutrace: There is no rpm package available for RHEL6 either, so I had to compile it myself, but fortunately it’s not very hard to get it through compilation.

Installation Steps

To get it done, we need a few steps:

  1. Get the source code
    git clone git://
  2. Modify the autoconf required version from 2.68 to 2.63
    diff --git a/configure.ac b/configure.ac
    index fcb1397..0d36e41 100644
    --- a/configure.ac
    +++ b/configure.ac
    @@ -18,7 +18,7 @@
     # You should have received a copy of the GNU Lesser General Public
     # License along with mutrace. If not, see <http://www.gnu.org/licenses/>.
    -AC_PREREQ(2.68)
    +AC_PREREQ(2.63)
     AC_INIT([mutrace], [0.2], [mzzhgenpr (at) 0pointer (dot) net])
  3. Upgrade gcc version to at least gcc 4.6

  4. Make some code changes in backtrace-symbols.c to pass compilation

    diff --git a/backtrace-symbols.c b/backtrace-symbols.c
    index 0a0d751..6f84c56 100644
    --- a/backtrace-symbols.c
    +++ b/backtrace-symbols.c
    @@ -34,6 +34,8 @@
      along with this program; if not, write to the Free Software
      Foundation, 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.  */
    +#include "config.h"
    #define fatal(a, b) exit(1)
    #define bfd_fatal(a) exit(1)
    #define bfd_nonfatal(a) exit(1)
    @@ -44,13 +46,13 @@
    #define true 1
    #define false 0
    -#define _GNU_SOURCE
    +//#define _GNU_SOURCE
    #include <string.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <execinfo.h>
    #include <bfd.h>
    -#include <libiberty.h>
    +//#include <libiberty.h>
    #include <dlfcn.h>
    #include <link.h>
    #if 0
  5. Run ./ to generate the Makefile

  6. Change ‘-O0’ to ‘-O2’ in CFLAGS in the Makefile
  7. Run make and make install

In the end

Now mutrace, matrace, and the other related libraries have been generated. Profile your program with:

mutrace -r $test_app ...

And big thanks to the original author, @Lennart Poettering.


Have fun 🙂

Flibs 0.7.4 Released

Flibs 0.7.4 has been released; it includes the following changes:

1. Refactored all the makefiles, so the project can now be built in parallel
2. Refactored the header folder structure, to make it more user-friendly
3. Refactored some APIs, to make them more user-friendly
4. Replaced malloc with calloc, to avoid potential uninitialized-memory issues
5. Fixed some incorrect type conversions, to make the code more stable
6. Fixed the valgrind errors when building the libs on 32-bit platforms

For example, if you want to compile the statically linked libraries, now just run:

make -j4
make install

And all the header files have been moved to include/flibs, so whether you include them from a submodule of your project or from /usr/local, you write the include only one way:

#include <flibs/xxx.h>

Ok, so now all the API prefixes start with “f”, and you can identify them more easily than before.

Just enjoy and have fun 🙂

Summary of 2014

In the past year I made some progress on my personal projects and created some new ones as well. Here I summarize them for the record.

Existing Projects

  • flibs: upgraded to 0.6.6, including:
    • A code base refactor
    • The fhash refactor
    • flog optimization
    • Other bug fixes
  • fenv: more user-friendly, with 30+ new commits

New Projects

  • ftracer: a C/C++ program tracer, useful for diving deep into a complex project quickly
  • openwrt-scripts: for now, it contains a dnsmasq config generator

What’s Next

For the next year, the draft plan is as follows:

  1. Separate fmem and fpcap from flibs
  2. Refine the makefiles for flibs, to make them more user-friendly, faster and more portable
  3. Create an fstr in flibs, which will be greatly useful for many C programs
  4. Continue adding more scripts to the openwrt-scripts project, such as:
    • A geo-based IP record selector for dnsmasq or pdnsd
    • A secure DNS reply policy for dnsmasq
  5. A new server-side framework will be announced

Good bye 2014, good luck 2015 🙂

The fhash Refactor Is Done


Aha! After some days, I have finished the refactoring of fhash. It’s a really big improvement for flibs, since many components depend on it, such as:

  • thread pool
  • event framework
  • pcap conversion lib
  • log

Why the Refactor Was Needed

The old fhash has some defects, such as:

  • hard to iterate
  • hard to extend
  • hard to modify the fhash during iteration
  • some performance issues in iteration and in the fhash_set interface

So finally I decided to rewrite it. Before we go through the new design, let’s take a look at the old design in the section below.

Graph of Old/New design


The new design

The new design fixes all the issues of the old design, and it is much cleaner and more user-friendly.

Let’s take a look at the core APIs:

fhash* fhash_create(uint32_t init_size, fhash_opt opt, uint32_t flags);
void fhash_delete(fhash* table);

void fhash_set(fhash* table,
               const void* key, key_sz_t key_sz,
               const void* value, value_sz_t value_sz);
void* fhash_get(fhash* table, const void* key, key_sz_t key_sz,
               value_sz_t* value_sz);
void* fhash_fetch_and_del(fhash* table,
               const void* key, key_sz_t key_sz,
               void* value, value_sz_t value_sz);

fhash_iter fhash_iter_new(fhash* table);
void fhash_iter_release(fhash_iter* iter);
void* fhash_next(fhash_iter* iter);
void fhash_foreach(fhash* table, fhash_each_cb cb, void* ud);

int fhash_rehash(fhash* table, uint32_t new_size);
void fhash_profile(fhash* table, int flags, fhash_profile_data* data);
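
To get a feel for these APIs, here is a minimal usage sketch. Two assumptions of mine: a zero-initialized fhash_opt falls back to the default hash/compare behavior, and flags = 0 means no optional features; check the wiki for the exact semantics.

#include <stdio.h>
#include <string.h>
#include <flibs/fhash.h>

int main(void) {
    fhash_opt opt;
    memset(&opt, 0, sizeof(opt));            /* assumption: zeroed opt = defaults */
    fhash* table = fhash_create(10, opt, 0); /* assumption: flags 0 = plain table */

    const char* key = "hello";
    int value = 42;
    fhash_set(table, key, strlen(key) + 1, &value, sizeof(value));

    value_sz_t vsz = 0;
    int* got = fhash_get(table, key, strlen(key) + 1, &vsz);
    if (got) printf("%s -> %d (%d bytes)\n", key, *got, (int)vsz);

    fhash_delete(table);
    return 0;
}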

Full API and Example Documents

Please refer to the wiki.

The benchmark

This time I also added a benchmark tool for fhash, so that I can review its performance whenever I need to. Let’s take a look at one result from a run on my VirtualBox VM:

========= fhash testing without auto rehash =========
fhash_set x100000 spend time: 9189450 usec
fhash_get x100000 spend time: 6535296 usec
fhash_next x100000 spend time: 1111 usec
fhash_rehash (index double), ret: 0, spend time: 42825 usec
[index]: used: 63226, total: 100000, usage rate: 0.632260
[slots]: used: 100000, total: 107241, usage rate: 0.932479
fhash_del x100000 spend time: 32075 usec
========= fhash testing with auto rehash =========
fhash_set x100000 spend time: 112542 usec
fhash_get x100000 spend time: 35542 usec
fhash_next x100000 spend time: 3333 usec
fhash_rehash (index double), ret: 0, spend time: 57153 usec
[index]: used: 63226, total: 100000, usage rate: 0.632260
[slots]: used: 100000, total: 107241, usage rate: 0.932479
fhash_del x100000 spend time: 37410 usec

From the above we can see the performance comparison between disabling and enabling auto rehash; the run with auto rehash is the clear winner. That means in the normal case (when the user cannot be sure how many items will be put into the hash table), the best solution is to enable the auto rehash feature, which prevents the hash table from degrading into a list.

The End

After rewriting fhash, I realized that:

  • Creating a user-friendly, extendable program is more important than raw performance, since the most important question is: how do we get started with a new library? If the library is hard to use, users will give up and look for another one.
  • On the other hand, documentation is also a very important part of a project, since it’s the easiest way to tell users what the project is and how to use it.

Next Step

In the future, I have some ideas for optimizing it:

  • Add a counter to every hash node to record the get/set frequency, so that fhash can reorder the node list at every rehash (or some other trigger point), putting the hottest nodes at the front of the list.
  • Use a list instead of an array, and compare the performance impact.

Ok, let the hacking begin!~

The Steps of Creating a New Open Source Library

Recently I’ve been doing the refactoring of the fhash lib, which is part of flibs. During the refactoring, I realized that to create a new open source project we should follow some basic steps; sticking to them keeps us moving in the right direction without getting lost.


Ok, let’s check the core steps:

  1. API Design Lock Down
  2. Write the code
  3. Write the UT
  4. Write the benchmarking tools
  5. Create a document of these APIs
  6. Announce your library

The Benefits of Writing Documentation

Most developers don’t like writing documentation; it’s boring. But the documentation is not written only for yourself: it is a great help for all the people who will maintain the project. Think about it: if an open source library had no API documentation and no comments in the code, would you use it in your project? The answer is obviously NO. So let me brief the benefits of writing documentation:

  • Documentation guides people to understand quickly what the project is
  • Documentation helps people move in the right direction
  • Documentation helps the people who want to use the project

In The End

There are many open source projects without any documentation, and almost all of them failed: no one knows what they are, and no one knows how to use them. So let’s create open source projects with more documentation 😀

Ftracer: The C/C++ callgraph generator

Sometimes we need a callgraph to help us read source code; especially in a big C++ project, it’s really hard to understand the program quickly and easily.

So I wrote a tracer to record the whole code path and then generate the callgraph from the trace file. It helped me follow the correct code path and saved a lot of time.

The full introduction page is at:

Have fun

A Tip of Writing Makefile — Using @

Ok, let’s talk about one of the tips for writing a Makefile: the magic character ‘@’.

Before continuing, let’s think about one thing: when you want to build a project, would you write a shell script, or use Make? Which one would you prefer?

Ok, I’d choose Make. The reasons are:

  1. A shell script may contain code specific to the target shell, for example bash. So if some users DO NOT use bash, the build process may stop unexpectedly. Make stands alone: it runs on every UNIX-like system and does not depend on a specific shell environment.
  2. In a shell script, you’d need to handle all the build actions yourself, for example entering the target folder, returning to the previous folder, and running the dependency actions before the actual build actions. Make already supports all of the above (see the sketch below).
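
For instance, here is a tiny sketch of what Make gives you for free; the file names are purely illustrative:

# Relink the app only when one of its object files changed
app: main.o util.o
	$(CC) -o $@ $^

# Rebuild an object only when its .c file (or the shared header) changed
%.o: %.c util.h
	$(CC) -c -o $@ $<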

If you agree with me, follow me to the next step — Using ‘@’ in your Makefile.

Let’s take a look at an example Makefile:

all:
	echo "hello Makefile"

When we run it, the output is:

bash $ make
echo "hello Makefile"
hello Makefile

From the above, we can see the original command is shown as well; it’s useless noise and not pretty. So let’s use ‘@’ to rewrite the Makefile:

all:
	@echo "hello Makefile"

And run it again:

bash $ make
hello Makefile

From the above, we can see the original command (echo xxx) is gone; only the output of the command is shown. So when you build a very large project that contains a lot of commands, you can use ‘@’ to avoid printing each command and only show its result.


Compile Clang 3.4

As we know, LLVM is a great project, and it also includes the Clang compiler. Clang 3.4 supports dumping a format configuration file, so that people can generate a self-styled format config for their own project. Details: ClangFormat and ClangFormatStyleOptions.
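
For example, you can dump one of the built-in styles into an editable config file with the -dump-config option (LLVM here is just one of the predefined presets):

# Generate a .clang-format file seeded with the built-in LLVM style
clang-format -style=LLVM -dump-config > .clang-format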

As of now, Clang 3.4 has not been released, but we can pull its latest code and compile it.
For RHEL 6.x, I recommend compiling it yourself; for Ubuntu/Debian there are already nightly build packages (the llvm apt source), so people can install it directly.

1. Ok, the compile instructions are as follows; I tested them on RHEL 6.x and they worked.

git clone
cd llvm/tools
git clone
cd ../projects
git clone
cd ../../
mkdir build
cd build
../llvm/configure --prefix=~/bin/clang34 --enable-optimized --enable-targets=x86_64
make -j4
make install

2. Now, add Clang 3.4’s binaries to your PATH (they were installed under ~/bin/clang34 by the --prefix above).

vim ~/.bash_profile
export PATH=$PATH:~/bin/clang34/bin

3. If you are using vim, add the following lines to your .vimrc (replace $path-to-this-file with your real path):

map <C-K> :pyf $path-to-this-file/<CR>
imap <C-K> <ESC>:pyf $path-to-this-file/<CR>i

4. Ok, finally, you can try it with clang-format directly, or open a file with vim and use Control+K to format your code.


Final 🙂