Using the daemon Command to Restart Tasks Automatically

April 14, 2013

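I wonder whether you have ever noticed the command-line tool daemon. It is a Linux tool, and there are also BSD versions and a Mac OS X port. It can start and manage a process, including restarting that process automatically.

Many large-scale machine learning systems (and other large-scale distributed computing systems) need automatic restart. Many big companies have their own distributed operating systems that take care of restarting failed tasks. But what if you do not have such a distributed operating system? The daemon command can serve as a "poor man's" solution.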

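Start a Task

wangyi@localhost:~$daemon --name=data_server_1 --respawn -- sleep 999

This starts a task named data_server_1, which executes the command sleep 999. The name corresponds to the file /tmp/data_server_1.pid, which ensures that the same daemon command cannot be run again to start the same task. The --respawn option means that if the command (task) after -- dies, daemon should restart it automatically.

At this point a daemon process is running, and it has forked (spawned) a child process to execute the command sleep 999: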

wangyi@localhost:~$ps auxwww | grep sleep
wangyi 15129 0.0 0.0 2432748 472 ?? S 10:47AM 0:00.00 sleep 999
wangyi 15128 0.0 0.0 2432896 312 ?? Ss 10:47AM 0:00.00 daemon --name=data_server_1 --respawn -- sleep 999

In addition, /tmp/data_server_1.pid stores the process ID of the daemon process:

wangyi@localhost:~$cat /tmp/data_server_1.pid
15128
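The single-instance guarantee that a pid file provides can be sketched in a few lines of Python. This is only an illustration of the idea, not how daemon itself is implemented; the function names acquire_pid_file and release_pid_file are hypothetical.

```python
import os


def acquire_pid_file(path):
    """Atomically create the pid file; return False if it already exists."""
    try:
        # O_EXCL makes creation fail when another instance already holds the file.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as pid_file:
        # Record our own process ID, as daemon does for its own process.
        pid_file.write(f"{os.getpid()}\n")
    return True


def release_pid_file(path):
    """Remove the pid file so that the name can be used again."""
    os.remove(path)
```

A second call to acquire_pid_file with the same path returns False until release_pid_file is called, which mirrors why a second daemon with the same --name refuses to start.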

Automatic restart of a task:

If we kill a process forked by daemon:

wangyi@localhost:~$kill 15129

The daemon process that forked it restarts the task automatically:

wangyi@localhost:~$ps auxwww | grep sleep
wangyi 15147 0.0 0.0 2432748 472 ?? S 10:47AM 0:00.00 sleep 999
wangyi 15128 0.0 0.0 2432896 312 ?? Ss 10:47AM 0:00.00 daemon --name=data_server_1 --respawn -- sleep 999

As you can see, although task 15129 was killed, a new task 15147 was started to execute sleep 999.
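The respawn behavior amounts to a supervisor loop: launch the child, wait for it to exit, and launch it again. Below is a minimal Python sketch of that loop. It is an assumption about the general technique, not daemon's actual code, and the function name run_with_respawn and its max_restarts limit are made up for illustration (a real supervisor would typically loop until told to stop).

```python
import subprocess
import sys
import time


def run_with_respawn(cmd, max_restarts=3, delay_seconds=0.0):
    """Run cmd and relaunch it whenever it exits, at most max_restarts times.

    Returns the list of exit codes, one per launch.
    """
    exit_codes = []
    for attempt in range(max_restarts + 1):
        child = subprocess.Popen(cmd)
        # Block until the child dies, whether it exits normally or is killed.
        exit_codes.append(child.wait())
        if attempt < max_restarts:
            # Optional pause so a crashing task does not spin in a tight loop.
            time.sleep(delay_seconds)
    return exit_codes


if __name__ == "__main__":
    # A child that exits immediately stands in for a task that keeps failing.
    print(run_with_respawn([sys.executable, "-c", "raise SystemExit(1)"], max_restarts=2))
```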

Start Another Task

We can start another task. As long as its name differs from the previous one, it may execute the same command:

wangyi@localhost:~$daemon --name=data_server_2 --respawn -- sleep 999

wangyi@localhost:~$ps auxwww | grep sleep
wangyi 15201 0.0 0.0 2432748 472 ?? S 10:52AM 0:00.00 sleep 999
wangyi 15200 0.0 0.0 2432896 312 ?? Ss 10:52AM 0:00.00 daemon --name=data_server_2 --respawn -- sleep 999
wangyi 15129 0.0 0.0 2432748 472 ?? S 10:47AM 0:00.00 sleep 999
wangyi 15128 0.0 0.0 2432896 312 ?? Ss 10:47AM 0:00.00 daemon --name=data_server_1 --respawn -- sleep 999

Now we have two pid files:

wangyi@localhost:~$ls /tmp/data_server*
/tmp/data_server_1.pid /tmp/data_server_2.pid

wangyi@localhost:~$cat /tmp/data_server*
15128
15200

Stop tasks:

Use the command-line options --stop and --name to kill a task:

wangyi@localhost:~$daemon --stop --name=data_server_1
wangyi@localhost:~$daemon --stop --name=data_server_2
wangyi@localhost:~$ps auxwww | grep sleep


Why Does Asynchronous SGD Work Better than Its Synchronous Counterpart?

April 9, 2013

In this NIPS 2012 paper, Large Scale Distributed Deep Networks, researchers at Google presented their work on distributed learning of deep neural networks.

One of the most interesting points in this paper is the asynchronous SGD algorithm, which enables a parallel (distributed) software architecture that is scalable and can make use of thousands of CPUs.

To apply SGD to large data sets, we introduce Downpour SGD, a variant of asynchronous stochastic gradient descent that uses multiple replicas of a single DistBelief model. The basic approach is as follows: We divide the training data into a number of subsets and run a copy of the model on each of these subsets. The models communicate updates through a centralized parameter server, which keeps the current state of all parameters for the model, sharded across many machines (e.g., if we have 10 parameter server shards, each shard is responsible for storing and applying updates to 1/10th of the model parameters) (Figure 2). This approach is asynchronous in two distinct aspects: the model replicas run independently of each other, and the parameter server shards also run independently of one another.
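To make the description concrete, here is a toy single-process simulation in Python of the pattern described above: each replica works on its own data subset, fetches the shared parameter only occasionally, and pushes gradients computed from its possibly stale copy. It is a sketch under strong simplifying assumptions (one scalar parameter, a noiseless linear model, a random interleaving instead of real machines), not the DistBelief implementation. The names simulate_async_sgd and fetch_every are hypothetical.

```python
import random


def simulate_async_sgd(num_replicas=4, num_updates=2000, learning_rate=0.05,
                       fetch_every=3, seed=0):
    """Fit y = w * x with gradients computed from stale copies of w."""
    rng = random.Random(seed)
    true_w = 3.0

    # Divide the training data into one subset per model replica.
    subsets = []
    for _ in range(num_replicas):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(50)]
        subsets.append([(x, true_w * x) for x in xs])

    server_w = 0.0                        # state held by the parameter server
    local_w = [server_w] * num_replicas   # each replica's copy, possibly stale
    local_steps = [0] * num_replicas

    for _ in range(num_updates):
        # Replicas run independently: pick which one finishes a step next.
        r = rng.randrange(num_replicas)
        if local_steps[r] % fetch_every == 0:
            local_w[r] = server_w         # occasional fetch from the server
        x, y = rng.choice(subsets[r])
        # Gradient of the squared error, evaluated at the replica's copy of w.
        gradient = 2.0 * (local_w[r] * x - y) * x
        server_w -= learning_rate * gradient   # push the update to the server
        local_steps[r] += 1
    return server_w


if __name__ == "__main__":
    print(simulate_async_sgd())
```

Even though most gradients are computed from out-of-date parameters, the server's value still approaches the true slope in this toy setting.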

Intuitively, the asynchronous algorithm looks like a hack, or a compromise between the effectiveness of the mathematical algorithm and the scalability of the distributed system. But to my surprise, the authors claim that the asynchronous algorithm works more effectively than synchronous SGD.

Why??

My understanding is that traditional gradient-based optimization is like a bee flying along the direction defined by the current gradient. In batch learning, the direction is computed using the whole training data set. In SGD, the direction is computed using a randomly selected mini-batch of the data.

In contrast, asynchronous parallel SGD works like a swarm of bees, each flying along a distinct direction. These directions vary because they are computed from the asynchronously updated parameters fetched at the beginning of each mini-batch. For the same reason, the bees never stray far from one another.

The swarm of bees optimizes collaboratively and covers a region, much like region-based optimization, where the region is composed of a set of points. This, I think, is the reason that parallel asynchronous SGD works better than traditional gradient-based optimization algorithms.


Specify the local directory for HDFS storage

March 25, 2013

You need to add a hadoop.tmp.dir property to $HADOOP/conf/core-site.xml. For example:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/Users/${user.name}/hdfs-${user.name}</value>
    <final>true</final>
  </property>
</configuration>

Link All Symbols in a Library into Binaries

March 24, 2013

Modern linkers are smart, but sometimes smarter than we need, so we have to tell them exactly what we want.

Say a binary A uses some, but not all, of the symbols in a static library B. The linker extracts only the symbols in use from B and puts them into A.

But sometimes we do not want the linker to be this smart; instead, we want all symbols in B to be put into A.

To do so using GNU ld (which usually comes with Linux), we say

g++ -o A -Wl,--whole-archive libB.a -Wl,--no-whole-archive A.cc

To do exactly the same thing using ld from Mac OS X, we say

g++ -o A A.cc -all_load libB.a

A typical situation that depends on the above tricks is linking a self-registering class factory: nothing in A refers to the registered classes by name, so the linker would otherwise drop them. Here is an example.


Use Git and Subversion Together

March 17, 2013

I like this short explanation from Learn.Github:

Cloning a Subversion Repository

The first step is to actually clone a Subversion repository using Git. This is accomplished by executing the ‘git svn clone’:

$ git svn clone <subversion-url> ?optional-directory-name?

For example,

$ git svn clone file:///Users/blair/git-svn/SVNROOT git_repo
Initialized empty Git repository in /Users/blair/git-svn/git_repo/.git/
r1 = e1e2a18c6537614f2bebc5c0dfb4ac6f8c829a3b (git-svn)
	A	file1.c
	A	Makefile
r2 = 58e044c16678b0a8b92e4dbed8f0a69d48ba9ec1 (git-svn)
Checked out HEAD:
  file:///Users/blair/git-svn/SVNROOT/trunk r2

As you can see, this creates a new Git repository in the directory named ‘git_repo’, populates the Git history with all Subversion commits to the trunk, and checks out the latest HEAD.

Note that this operation will generally take longer than a corresponding ‘git clone’ command, as the entire history for the Subversion repository in question must be downloaded commit by commit. For large projects, this process can take hours. Thus it’s best to execute this command only when you have some time to kill.

Synchronizing With Upstream

When using Subversion, you synchronize with upstream changes by running ‘svn update’, which pulls down all of the changes made to your Subversion branch and merges your working copy with those changes.

With git-svn, you accomplish a similar task via the ‘git svn rebase’ command.

bucky:git_repo blair$ git svn rebase
	M	Makefile
r3 = ae409c2f5fe0831f22d6dc891652b5f9159f35de (git-svn)
First, rewinding head to replay your work on top of it...
Applying: Local git commit.

So what does this tell us? Well, first, you see a list of modified files. The next line informs us that revision 3 from Subversion is being stored as commit ae409c2 in our local Git repository. No problems thus far. But, what about that line informing us Git is rewinding and replaying our work? Well, this is where ‘svn update’ and ‘git svn rebase’ differ slightly.

Remember that, with a Subversion repository, you don’t have the concept of offline commits. So when ‘svn update’ executes, all it does is pull down all of the commits from the Subversion repository, and replay the changes in your working copy.

Under Git, you’ve likely been making a series of changes and committing them in nice small pieces (you are making regular commits, aren’t you?). So, when ‘git svn rebase’ is executed, it rolls back all of your local commits, pulls down all of the commits from Subversion, and then reapplies your local commits as if you had made them on the current HEAD of the Subversion branch. This has much the same logical effect as ‘svn update’ (all of your local changes are applied on top of the current Subversion HEAD), but may appear a little odd at first glance.

Just like ‘svn update’, it’s a good idea to execute ‘git svn rebase’ periodically in order to limit the number of conflicts you encounter when it comes time to integrate your changes back into the main repository.

Pushing Changes Upstream

So you’ve been hacking away in your Git repository and now have a series of local commits that need to be pushed back to the main Subversion repository. The first step is to execute ‘git svn rebase’ and make sure your local Git repository is up-to-date with the Subversion repository. Next, use the command ‘git svn dcommit’ to push all of your local Git commits back to the Subversion repository.

bucky:git_repo blair$ git svn dcommit
Committing to file:///Users/blair/git-svn/SVNROOT/trunk ...
	M	file1.c
Committed r4
	M	file1.c
r4 = 31a4b40b05e1b42f34dd22c34936f43dd5be90ec (git-svn)
No changes between current HEAD and refs/remotes/git-svn
Resetting to the latest refs/remotes/git-svn

Now all of your local Git commits are present in the Subversion repository and you can continue hacking away locally.

Rules and Guidelines

There are a few rules you should follow when using Git as a Subversion client. Many of these are due to the fact that Git is a much more capable system than Subversion and some of the Git features simply do not work in Subversion.

  • Do not dcommit Git merge commits to the Subversion repository. Subversion doesn’t handle merges in the same way as Git, and this will cause problems. This means you should keep your Git development history linear (i.e., no merging from other branches, just rebasing).
  • Do not amend, reorder, or otherwise change commits that have been dcommitted to Subversion. This is essentially the same rule as not changing Git commits that have been pushed to public repositories. Subversion cannot handle modifying or reordering commits.

Compare git merge and git rebase

March 17, 2013

I got this concise and precise explanation from StackOverflow:

Short version:

  • Merge takes all the changes in one branch and merges them into another branch in one commit.
  • Rebase says: I want the point at which I branched to move to a new starting point.

So when do you use either one?

Merge

  • Let’s say you have created a branch for the purpose of developing a single feature. When you want to bring those changes back to master, you probably want merge (you don’t care about maintaining all of the interim commits).

Rebase

  • A second scenario would be if you started doing some development and then another developer made an unrelated change. You probably want to pull and then rebase to base your changes from the current version from the repo.

XeTeX is Great!

March 17, 2013

XeTeX is a new TeX implementation supporting Unicode and system fonts. I will never use plain LaTeX again. Here is how to configure XeTeX and Chinese fonts under Mac OS X: http://blog.jqian.net/post/14213106049/xelatex

