The fhash refactor is done

Overview

Aha, after a few days I have finished the refactoring of fhash. It is a really big improvement for flibs, since many components depend on it, such as:

  • thread pool
  • event framework
  • pcap conversion lib
  • log

Why the refactoring was needed

The old fhash had some defects, such as:

  • hard to iterate
  • hard to extend
  • hard to modify the fhash during iteration
  • performance issues in iteration and in the fhash_set interface

So finally I decided to rewrite it. Before we go through the new design, let’s take a look at the old design in the section below.

Graph of Old/New design

(Diagram comparing the old and new designs, drawn with mind-mapping software.)

The new design

The new design fixes all the issues of the old design, and it is much cleaner and more user friendly.

Let’s take a look at the core APIs:

fhash* fhash_create(uint32_t init_size, fhash_opt opt, uint32_t flags);
void fhash_delete(fhash* table);

void fhash_set(fhash* table,
               const void* key, key_sz_t key_sz,
               const void* value, value_sz_t value_sz);
void* fhash_get(fhash* table, const void* key, key_sz_t key_sz,
               value_sz_t* value_sz);
void* fhash_fetch_and_del(fhash* table,
               const void* key, key_sz_t key_sz,
               void* value, value_sz_t value_sz);

fhash_iter fhash_iter_new(fhash* table);
void fhash_iter_release(fhash_iter* iter);
void* fhash_next(fhash_iter* iter);
void fhash_foreach(fhash* table, fhash_each_cb cb, void* ud);

int fhash_rehash(fhash* table, uint32_t new_size);
void fhash_profile(fhash* table, int flags, fhash_profile_data* data);
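
To make the calling flow concrete, here is a minimal usage sketch based only on the signatures above. The header name, the fhash_opt setup, and the flags value are assumptions; check the wiki for the real fields and constants.

#include <stdio.h>
#include <string.h>
#include "fhash.h"   /* assumed header name; adjust to your flibs setup */

int main(void) {
    /* Assumption: fhash_opt carries the user's hash/compare callbacks;
     * it is zero-initialized here only to keep the sketch short. */
    fhash_opt opt;
    memset(&opt, 0, sizeof(opt));

    fhash* table = fhash_create(10, opt, 0 /* flags, e.g. auto rehash */);

    const char* key   = "name";
    const char* value = "fhash";
    fhash_set(table, key, (key_sz_t)strlen(key),
              value, (value_sz_t)(strlen(value) + 1));

    value_sz_t vsz = 0;
    const char* got = fhash_get(table, key, (key_sz_t)strlen(key), &vsz);
    printf("got: %s (%d bytes)\n", got, (int)vsz);

    /* Iterate over every value in the table */
    fhash_iter iter = fhash_iter_new(table);
    void* data;
    while ((data = fhash_next(&iter)) != NULL) {
        /* ... use data ... */
    }
    fhash_iter_release(&iter);

    fhash_delete(table);
    return 0;
}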

Full API and Example Documents

Please refer to the wiki.

The benchmark

This time I also added a benchmark tool for fhash, so that I can review the performance whenever I need to. Let’s take a look at one result from a run in my VirtualBox VM:

========= fhash testing without auto rehash =========
fhash_set x100000 spend time: 9189450 usec
fhash_get x100000 spend time: 6535296 usec
fhash_next x100000 spend time: 1111 usec
fhash_rehash (index double), ret: 0, spend time: 42825 usec
[index]: used: 63226, total: 100000, usage rate: 0.632260
[slots]: used: 100000, total: 107241, usage rate: 0.932479
fhash_del x100000 spend time: 32075 usec
========= fhash testing with auto rehash =========
fhash_set x100000 spend time: 112542 usec
fhash_get x100000 spend time: 35542 usec
fhash_next x100000 spend time: 3333 usec
fhash_rehash (index double), ret: 0, spend time: 57153 usec
[index]: used: 63226, total: 100000, usage rate: 0.632260
[slots]: used: 100000, total: 107241, usage rate: 0.932479
fhash_del x100000 spend time: 37410 usec

From the above we can see the performance comparison with auto rehash disabled and enabled, and the auto-rehash run is the clear winner. That means that in the normal case (when the user cannot be sure how many items will be put into the hash table), the best solution is to enable the auto-rehash feature, because it prevents the hash table from degenerating into a list.

The End

After rewriting fhash, I realized that:

  • Creating a user-friendly, extendable program is more important than raw performance, because the most important question is: how do we get started with a new library? If the library is hard to use, users will give up and look for another one.
  • On the other hand, documentation is also a very important part of a project, since it is the easiest way to tell users what it is and how to use it.

Next Step

In the future, I have some ideas to optimize it further:

  • Add a counter to every hash node to record the get/set frequency, so that fhash can optimize the node list on every rehash (or some other trigger point) and put the hottest nodes at the front of the list; a rough sketch of this idea follows the list.
  • Use a list instead of an array, and compare the performance impact.
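
As a rough illustration of the first idea (the node layout and function names here are hypothetical, not the actual fhash internals): each node carries an access counter, and during rehash the nodes of a bucket are re-linked in descending counter order so the hottest entries sit at the head of the chain.

#include <stddef.h>

/* Hypothetical node layout: a chained hash node plus an access counter. */
typedef struct hot_node {
    struct hot_node* next;
    unsigned long    hits;   /* bumped on every get/set of this node */
    /* key/value payload omitted */
} hot_node;

/* Re-link one bucket chain so the most frequently accessed nodes come
 * first; this could run during rehash or at any other trigger point.
 * Simple insertion sort by descending hit count. */
static hot_node* reorder_by_hits(hot_node* head) {
    hot_node* sorted = NULL;
    while (head) {
        hot_node* node = head;
        head = head->next;

        hot_node** pos = &sorted;
        while (*pos && (*pos)->hits >= node->hits)
            pos = &(*pos)->next;

        node->next = *pos;
        *pos = node;
    }
    return sorted;
}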

OK, let the hacking begin!

The Steps of Creating a New Open Source Library

Recently I have been doing the refactoring of the fhash lib, which is part of flibs. During the refactoring I realised that creating a new open source project should follow some basic steps, and following them keeps us moving in the right direction without getting lost.

Steps

OK, let’s go through the core steps:

  1. API Design Lock Down
  2. Write the code
  3. Write the UT
  4. Write the benchmarking tools
  5. Create a document of these APIs
  6. Announce your library

The Benefits of Writing Documentation

Most developers don’t like writing documentation; it’s boring. But the documentation is not written only for yourself: it is a great help for everyone who will maintain the project. Think about it: if there is an open source library without any API documentation or comments in the code, would you use it in your project? The answer is obviously no. So let me briefly list the benefits of writing documentation:

  • Documentation guides people to a quick understanding of what the project is
  • Documentation helps people move in the right direction
  • Documentation helps people who want to use the project

In The End

There are many open source projects without any documentation, and almost all of them fail: no one knows what they are and no one knows how to use them. So let’s create open source projects with more documentation :D

Ftracer: The C/C++ callgraph generator

Sometimes we need a callgraph to help us read source code; especially in a big C++ project it is really hard to understand the program quickly.

So I wrote a tracer to record the whole code path and then generate the callgraph from the trace file. It helped me follow the correct code path and saved me a lot of time.

The full introduction page is at: http://finaldie.github.io/ftracer/

Have fun!

A Tip for Writing Makefiles: Using ‘@’

OK, let’s talk about one of the tips for writing a Makefile: the magic character ‘@’.

Before continuing, let’s think about one thing: when you want to build a project, would you prefer to write a shell script or to use Make?

OK, I would choose Make, for these reasons:

  1. A shell script may contain code that is specific to the target shell, for example bash. So if some users do NOT use bash, the build may stop unexpectedly. Make, on the other hand, is standalone: it runs on every UNIX-like system and does not depend on a specific shell environment.
  2. With a script you have to handle all the build actions yourself, for example entering the target folder, returning to the previous folder, and running the dependency actions before the actual build actions; Make already supports all of that.

If you agree with me, follow me to the next step: using ‘@’ in your Makefile.

Let’s take a look at an example Makefile:

all:
	echo "hello Makefile"

When we run it, the output is:

bash $ make
echo "hello Makefile"
hello Makefile

From the above we can see that the original command is echoed as well; that is noisy and not pretty, so let’s rewrite the Makefile using ‘@’:

all:
	@echo "hello Makefile"

And run it again:

bash $ make
hello Makefile

From the above we can see that the original command (echo xxx) is gone; only the output of the command is shown. So when you build a very large project that contains a lot of commands, you can use ‘@’ this way to avoid printing every command and show only its result.

Reference: https://github.com/finaldie/final_dev_env/blob/master/Makefile

Compiling Clang 3.4

As we know, LLVM is a great project, and it also includes the Clang compiler. Clang 3.4 supports dumping a format configuration file, so people can generate a self-styled format for their own project. Details: ClangFormat and ClangFormatStyleOptions.

For now, Clang 3.4 has not been released yet, but we can pull its latest code and compile it.
For RHEL 6.x I recommend compiling it yourself; for Ubuntu/Debian there are already nightly build packages (llvm apt source), so people can install it directly.

1. OK, the compile instructions are as follows; I tested them on RHEL 6.x and they worked.

git clone http://llvm.org/git/llvm.git
cd llvm/tools
git clone http://llvm.org/git/clang.git
cd ../projects
git clone http://llvm.org/git/compiler-rt.git
cd ../../
mkdir build
cd build
../llvm/configure --prefix=$HOME/bin/clang34 --enable-optimized --enable-targets=x86_64
make -j4
make install

2. Now, add the Clang 3.4 binaries to your PATH:

vim ~/.bash_profile
PATH=$HOME/bin/clang34/bin:$PATH
export PATH

3. If you are using vim, add the following lines to your .vimrc (replace $path-to-this-file with your real path):

map <C-K> :pyf $path-to-this-file/clang-format.py<CR>
imap <C-K> <ESC>:pyf $path-to-this-file/clang-format.py<CR>i

4. OK, finally, you can try it by running clang-format directly, or open a file with vim and use Ctrl+K to format your code.

[Other References]:

http://llvm.org/docs/GettingStarted.html#compiling-the-llvm-suite-source-code

http://clang.llvm.org/get_started.html

http://verboselogging.com/2009/06/24/compile-llvm-on-ubuntu

http://llvm.org/apt/

——–
Final :)

Migrating WordPress from 5.2.1 to 5.7.1

OK guys, I made a mistake and broke my blog a few days ago, but finally it is back to normal, so let me show you what happened.

It began with a mad idea: I upgraded my VPS OS to Ubuntu 12.04. After that I found the php-fpm process was broken (because PHP had been upgraded too). That was only a small problem, but I had also downloaded the latest version of WordPress, and that is where the disaster started.

While upgrading to the latest WordPress I just followed the instructions from the official website, but maybe I did it the wrong way: I deleted the wp-admin folder, so when I tried to access my blog from the browser it showed me nothing, literally nothing. I checked the access log and saw 500 errors there, but I did not know what exactly had gone wrong.

God help me, I did not have much time to figure out the root cause; I could only guess that the old version of WordPress was not compatible with the latest one. After a while I installed the latest WordPress and played with it. It is more beautiful than the old one, so I decided to use it.

OK, the next thing I needed to do was migrate the old data into the new installation. At first I dumped the whole old WordPress MySQL database and restored it into the new one; unfortunately that failed, and after restoring I could not see anything from the browser. I guess one of the metadata tables was not suitable for the new version, or some other reason I did not dig into. When I checked the wp_posts table, I did not see any field that depends on other tables; it only contains the plain post data. So I dumped just this table and restored it into the new database, and it worked. Aha, so excited about that. The exact commands are as follows:
1. Dump the wp_posts table from the old WordPress:

mysqldump --add-drop-table -u mysqlusername -p old_databasename wp_posts > old_posts.sql

2. Restore the wp_posts table into the new WordPress:

mysql -u mysqlusername -p new_databasename < old_posts.sql

Finally, as you saw, everything came back except the comments; if you also care about the comments, you should dump the wp_comments table as well.

Have fun :D

Final_Libs 0.4.3 released

Hi,

This is a huge update: I refined all the filenames and API names, and both 32-bit and 64-bit libraries are now supported. So anyone who wants to use the new version needs to migrate their code to the new APIs. If you don’t want to do that right away, I suggest using the 0.3.8 tag; for details please refer to https://github.com/finaldie/final_libs

Have fun!

Why Signal is Dangerous

It seems this title is also dangerous, aha. This is a long story: some time ago I wanted to implement some timer logic (when the timer triggers, run some logic), but I did not want to create a new thread, so I remembered signals. Then someone warned me that it might not be safe. I did not know why; he only told me that it is not recommended.

OK, this question has been on my mind for a long time. I wanted to know why, but most articles could not explain the reason clearly, so I had to figure it out by myself. That is why I am writing this post.

Some days ago I took some time to read this article, and I reached some conclusions that answer my questions:

  • A signal handler needs to be as simple as possible; do not call non-reentrant functions in it (the fprintf API, for example, can run into this issue).
  • A signal handler can also break locking: it does not care which part of the code currently holds a lock, so it may cause a deadlock.
  • Signals may be merged by the kernel, so we cannot use signals for counting or statistics.

OK, I have just listed the major points. We can still use signals for simple purposes as long as the logic in the handler stays trivial; for example, only updating a flag in the signal handler is safe (see the sketch below). :D
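
A minimal sketch of that flag-only pattern: the handler just sets a volatile sig_atomic_t flag, and the main loop checks it later. SIGUSR1 is used here only as an example.

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* The only thing the handler does is set a flag; a single write to a
 * volatile sig_atomic_t is async-signal-safe. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo) {
    (void)signo;
    got_signal = 1;          /* no printf, no malloc, no locks in here */
}

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_signal;
    sigaction(SIGUSR1, &sa, NULL);

    for (;;) {
        if (got_signal) {
            got_signal = 0;
            printf("signal received, handled in the main loop\n");
        }
        sleep(1);            /* the normal work would go here */
    }
    return 0;
}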

Everything above describes the traditional signal usage. Nowadays we also have the signalfd API, which creates an fd that can be added to a poll loop for monitoring; but you need to check whether you can actually add this fd to your polling list, because that part may live inside a framework you cannot control. This method can solve issues 1 and 2 above; a rough sketch follows.
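
For reference, here is a rough sketch of the signalfd approach on Linux: block the signal, create the fd, add it to a normal polling list, and read a struct signalfd_siginfo when the fd becomes readable (again, SIGUSR1 is just an example):

#include <poll.h>
#include <signal.h>
#include <stdio.h>
#include <sys/signalfd.h>
#include <unistd.h>

int main(void) {
    /* Block SIGUSR1 so it is delivered through the fd instead of a handler */
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGUSR1);
    sigprocmask(SIG_BLOCK, &mask, NULL);

    int sfd = signalfd(-1, &mask, 0);
    if (sfd < 0) {
        perror("signalfd");
        return 1;
    }

    /* The signal fd joins the polling list like any other fd */
    struct pollfd pfd = { .fd = sfd, .events = POLLIN };
    while (poll(&pfd, 1, -1) > 0) {
        struct signalfd_siginfo si;
        if (read(sfd, &si, sizeof(si)) == sizeof(si))
            printf("got signal %u from pid %u\n", si.ssi_signo, si.ssi_pid);
    }

    close(sfd);
    return 0;
}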

Have fun. :D

Magicnote 0.1.6 Release

This release includes big enhancements and several new features [bump here]:

0.1.6 2012-12-23
    Issue #7: rename list to ls
    Issue #6: before add/edit/rm, create a index.bak first, and can revert to last version by a new command -> revert
    Issue #3: add -p option for ls and find command, which can show all content of the line without limitation
    Issue #8: add show command for showing all tags info
    Issue #2: add multi tags support
    fix compatibility issue: use bash instead of sh
    use getopts instead of manually parsing args

Have fun :D

A new tool: magicnote, let’s rock your notes

I don’t know whether you have encountered these issues:

  1. Snippets scattered all over the place; sometimes you can’t find one when you want to use it.
  2. One huge text file holding all kinds of snippets; you can’t find what you want quickly.
  3. When you want to run a command from your notes, you have to copy it first, then switch windows, then paste it into the terminal. If you are using a Mac, you may also need to switch desktops to find the window you want to paste into.
  4. Sometimes you want to sync your notes somewhere else, share them with others, or look at other people’s notes/snippets, and that is hard to do.

OK, enough listing. If you have run into these issues and want to make a change, just follow me: I am sharing a new tool that can resolve almost all of them -> magicnote.

If you want to try it, follow the README to install it and give it a go. The commands are easy:

bash~$ magicnote
usage:
  \_ magicnote addsource source
  \_ magicnote list [tag1 [ tag2 ...]]
  \_ magicnote add [-tag tagname]
  \_ magicnote rm tag@index [tag2@index2 ...]
  \_ magicnote edit tag@index [tag2@index2 ...]
  \_ magicnote find tag1 tag2 ...
  \_ magicnote run tag@index [tag2@index2 ...]
  \_ magicnote gc
bash~$ magicnote list
ipv6
  |- @1      : # ipv6
  |-       #1: ping6 -I eth0 fe80::250:56ff:fe1d:66c0
  |-       #2: ping6 -I eth0 ff02::1
  |-       #3: telnet fe80::250:56ff:fe1d:66c0%eth0 8080

The powerful run command:

bash~$ magicnote run ping@1
ready to run:
ping 127.0.0.1
PING 127.0.0.1 (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.043 ms
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.093 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.044 ms
64 bytes from 127.0.0.1: icmp_seq=3 ttl=64 time=0.126 ms
^C
--- 127.0.0.1 ping statistics ---
4 packets transmitted, 4 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.043/0.076/0.126/0.035 ms

This tool is a newbie, so it may contain some bugs or issues. Feel free to open an issue on GitHub and I will try my best to fix or improve it, or you can send me a pull request, which would be even better.

So, in the end, have fun :D