Speeding up AOSP Builds – System Temp Directory

The Android Development Tools (ADT) is a massive project and is very well done. Getting your hands on and compiling the Android Open Source Project (AOSP) is very easy too. So easy, you’ll quickly want to improve your compile performance. An initial compile of AOSP can take about ~46 minutes on an Intel i7 @ 3.4 GHz. Using the compile cache will definitely speed up sequential builds but more can still be done. Sacrificing just a few MB of RAM (~60 MB) for a system temp directory (/tmp) ramdisk can reduce compile time anywhere from ~2%-10% depending on total system throughput (other hardware specs). The C/C++ compiler creates, writes and deletes a temporary file for each source file crunched. Keeping this I/O traffic in RAM offers efficiency gains.


PC Desktop i7 @ 3.4 GHz x 8 w/16 GB RAM and 7200 RPM Drive
Without Ramdisk: ~44 minutes
With Ramdisk: ~41 minutes
Reduction in time playing swords: ~7%

PC Laptop Intel Core 2 Duo CPU T8300 @ 2.4 Gz x 2 w/4 GB RAM and 5600 RPM Drive
Without Ramdisk: 705 minutes
With Ramdisk: 650 minutes
Reduction in time playing swords: ~8%

To setup your /tmp directory to be a ramdisk on Ubuntu systems add the following to your /etc/fstab file and reboot

ramdisk /tmp tmpfs mode=1777,size=2g

There are two quick ways to make sure the ramdisk is working.
1) run the ‘df’ command and confirm /tmp is mounted with the correct size
2) copy a large file to your home directory and again to the /tmp directory and compare speeds

chris@chis-devpc:~/ServerBackups$ time cp 2012. ~/

real	0m2.657s
user	0m0.000s
sys	0m0.240s
chris@chis-devpc:~/ServerBackups$ time cp 2012. /tmp

real	0m0.224s
user	0m0.000s
sys	0m0.140s

The file copied above is 208MB. As you can see, the ramdisk was an order of magnitude faster.

Git – Command Prompt Extension

If you use Git from a command line interface you probably find yourself typing ‘git status’ and ‘git branch -a’ way too many times a day. The public Git repository has a great set of tools that can be enabled post installation. I’ve found the most useful to be the Bash shell extensions used for displaying the current branch and repository status directly in the PS1 prompt. These three environment variables save me hundreds of typed characters a day:


With these set in your .bash_profile script anytime you change into a Git repository directory, additional information about the state of that repository is automatically appended to the prompt. If simply setting these variables doesn’t work copy the /git/contrib/completion/git-completion.bash file to your home directory and source it in your .bash_profile.

After entering a Git repository directory, what is the command prompt telling me?

The branch name should be obvious. If you see a ‘*’ character it means you have an unstaged changed in the repo (not committed). If you see a ‘+’ character your changes have been staged but not pushed to a remote branch. If you see a ‘%’ character it means you have created new file(s) but haven’t started tracking them yet (haven’t run ‘git add filename’). If you see a ‘>’ character it means what is checked out is ahead of the remote server. A ‘<‘ and ‘=’ character means your checked out repository is behind or equal to the remote server.

WiX Custom Action Sequencing

WiX is an open source project sponsored by Microsoft that exposes its operating system installer functionality via XML elements. The nuances of this declarative technology and inconsistent syntax have given birth to an entire classification of engineers in the software industry called deployment engineers.  These engineers are frequently tasked with packaging and distributing executable binaries for various versions of the Windows platform. All things considered WiX has matured nicely and is becoming quite powerful. That said, it can be a real pain to start using if you’re a novice due to the lack of syntax consistency and plethora of contradictory information on the topic. Not to mention ensuring platform compatibility is nearly impossible given that most people don’t have a copy of every version of Windows ever released… and yes, the installer functionality changed with every version too :-). That said, this post is intended to add to the developer WiX ‘entropy’.

Every book I’ve skimmed through and blog post I’ve read on WiX custom action sequencing is right in ‘theory’ but not always correct in practice… Through trial and error, along with the help of my Chief Architect, we determined that WiX custom action sequencing is fairly arbitrary. To ensure consistent functionality, the action should be schedule after what appears to be an undocumented sequence.

What books and other blogs say to do:

<CustomAction Id="RegisterSomething"
              ExeCommand='regsvr32.exe /s "[INSTALLDIR]something.dll"'

            <Custom Action="RegisterSomething"
                    After="InstallFinalize">NOT Installed</Custom>

What you should consider doing instead is:

            <Custom Action="RegisterSomething"
                    After="RemoveExistingProducts">NOT Installed</Custom>

Why? Most operations deployment engineers will want to perform during install time will require administrator (“elevated”) privileges. Which means if the Windows user is using UAC it will only run elevated if executed prior to the ‘InstallFinalize’ sequence, not after. Simply changing the original line above from ‘After’ to ‘Before’ will not work consistently either. This is because the sequencing is arbitrary. Meaning that it’s possible for your custom actions to be schedule to run before the ‘InstallExecute’ sequence… because it’s technically ‘Before’ ‘InstallFinalize’ :-). In this case, you’ll be running custom actions on binaries and run times that haven’t been committed to the system yet.

Easy Python JSON Client & Server

The jsocket package is for use during the development of distributed systems. There are two ways to use the package. The first and simplest way is to create a custom single threaded server by overloading the the jsocket.ThreadedServer class. The second, is to use the server factory functionality by overloading the jsocket.ServerFactoryThread class and passing this declaration to the jsocket.ServerFactory(FactoryThread) object. This creates a multithreaded custom JSON server for any number of simultaneous clients.

All of the code can be forked on GitHub here, and the latest release can be installed off PyPi using either “pip install jsocket” or “easy_install jsocket” depending on which python package manager you use.

Using some UML notation you can decipher the inheritance of the jsocket package to be six new module classes and two existing Python classes:

Regardless of which way you choose to use the package, class inheritance is mandatory. As stated above, the simplest way to leverage functionality is to create a custom single threaded server by overloading the ThreadedServer class (just like the ServerFactory class). This is a great way to prototype simple applications in which you want to transfer data in JSON format between a client.

     import jsocket
     import logging
     class MyServer(jsocket.ThreadedServer):
          # This is a basic example of a custom ThreadedServer.
          def __init__(self):
               super(MyServer, self).__init__()
               self.timeout = 2.0

          def _process_message(self, obj):
               # pure virtual method from base class
               if obj != '':
                    if obj['message'] == "new connection":
                         logging.info("new connection")

     if __name__ == &quot;__main__&quot;:
          client = jsocket.JsonClient()
          client.send_obj({"message": "new connection"})

Albeit brief, the above example creates a server that is capable of acting on JSON data of type “{‘message’: ‘new connection’}”. There are no restrictions on what you declare in the _process_message virtual method. You can think of this as defining your ‘protocol’ for the server.

The second approach is to use the ServerFactory class to create a multithreaded custom JSON server for any number of simultaneous clients. To do this you must inherit from the ServerFactoryThread class. Just like above you need to implement the virtual _process_message method to support your ‘protocol’ (that is, what kind of data you want the server to respond to). After this is done, simply pass the class declaration to the ServerFactory constructor and it does the rest. Whenever the server sees a new client connection it will fork a separate thread of control to manage the interaction with that client (the thread is automatically terminated when the client disconnects).

     import jsocket
     import logging
     class MyFactoryThread(jsocket.ServerFactoryThread):
          # This is an example factory thread, which the server factory will
          # instantiate for each new connection.
          def __init__(self):
               super(MyFactoryThread, self).__init__()
               self.timeout = 2.0

          def _process_message(self, obj):
               # virtual method - Implementer must define protocol
               if obj != '':
                    if obj['message'] == "new connection":
                         logging.info("new connection")

     if __name__ == &quot;__main__&quot;:
          server = jsocket.ServerFactory(MyFactoryThread)
          server.timeout = 2.0

          #create and connect as many clients as you like here
          client = jsocket.JsonClient()
          client.send_obj({"message": "new connection"})


Windows Resource – Access Denied

In the Windows PC industry it is very common for applications to attached resources to an executable. Resources can be anything from bitmaps, UI components, string translations and copyright information. These resources are then used by the executable at runtime and also by Windows Explorer (for icons and properties information).

Resources can be modified post compilation by using three methods provided by MSDN: BeginUpdateResource, UpdateResource and EndUpdateResource. I recently had to use these methods and painfully discovered a nasty race condition between my resource updater application and Windows Explorer. I found information on these random failures to be scarce, hence the blog post…

When you are modifying the resource of an executable or DLL and Windows Explorer is open and showing the directory of the file being updated (regardless of visibility), a file access race condition exists that produces one of the following two errors:

The system cannot open the device or file specified.

Access denied.

The numbers I collected show an average file access failure rate of 2.3% when updating a resource that Explorer is also showing. If you are experiencing these errors simply close Windows Explorer and your resource updater application will function as expected every time. Given that Windows is closed source I can’t confirm exactly what Explorer is doing but my hypotheses is that Explorer temporarily opens and reads the resources of each file in the directory (most likely so it can determine which icon to show) and refreshes that resource when it notices a change to the file. If your updater application requests a file handle during this time you will see one of the above errors.

Colorado Chicago Basin

This past August I had the pleasure of exploring an area of Colorado called the Chicago Basin with my friends Tim & Catone. The area is very remote, requires at least two nights and includes lots of hoofing it to access the Basin. The main back country camping area provides access to three of the states most remote 14ers (Windom, Sunlight and Eolus). The Basin is absolutely stunning and I can’t wait to visit again.
The best way to experience the Basin is to utilize the Durango Narrow Gauge Railroad. The train will drop you and your pack off at the start of the main trailhead, cutting the hike to the main camping area down to ~8 miles. The train is a true piece of Colorado history operating just like it did back in 1880. The train conductor operates in a very casual manner, as long as you paid for a full round trip ticket you can pretty much take it out of the Basin any day you want. So, once you get there and decide to spend another day enjoying the area, don’t worry about the date on your ticket :). As an added bonus, if you end up taking a different route out of the Basin simply flag the train down at any point in the valley and they’ll stop and pick you up.

The Basin supports a very healthy goat population that has been desensitized to humans. Make sure you goat proof your camp before heading out to bag the 14ers. The goats are very aggressive at sniffing out your scraps of dropped food and love the salt in your urine. Our camp was overrun by goats multiple times. The goats are harmless, but you can help minimize the blending of the worlds by hanging left over food instead of burying it and urinate on large rocks instead of the dirt.

If you haven’t already bagged these peaks, make sure you put them on your list for next year. Exploring the area is definitely worth 14 hours in a car, 5 hours on a train, 16 miles by foot (not including the miles required to summit the peaks) and three nights in a tent at 10,000 feet. Enjoy!

Python – Recursive Glob & Line Counter

The other day I needed a recursive glob to find all the *.py files in my home directory. Much to my amazement Python doesn’t have one built into the glob module. So, I built my own and decided to share it for anyone else who might need it. Fork it here and install it with easy_install off PyPi.

Recursive Glob Module
	rglob(base, pattern)
	lcount(base, pattern, func=lambda x : True)
import glob
import os

def _getDirs(base):
	return [x for x in glob.iglob(os.path.join( base, '*')) if os.path.isdir(x) ]

def _count(files, func):
	lines = 0
	for f in files:
		lines += sum([1 for l in open(f) if func(l)])
	return lines

def rglob(base, pattern):
	""" Recursive glob starting in specified directory """
	flist = []
	dirs = _getDirs(base)
	if len(dirs):
		for d in dirs:
			flist.extend(rglob(os.path.join(base,d), pattern))
	return flist

def rglob_(pattern):
	""" Performs a recursive glob in the current working directory """
	return rglob(os.getcwd(), pattern)

def lcount(base, pattern, func = lambda x : True):
	""" Counts the number of lines in each file found matching pattern.
			base - root directory to start the search
			pattern - pattern for glob to match (i.e '*.py')
			func - boolean filter function
				example: lambda x : True if len(x.strip()) else False #filter empty lines
				default: lambda x : True
	allFiles = rglob(base, pattern)
	return _count(allFiles, func)

if __name__ == "__main__":
	#filter out empty lines and comments
	filterFunc = lambda x : True if (len(x.strip()) and x.strip()[0] != '#') else False

	pyFiles = rglob(os.path.dirname(__file__), "*.py")
	print " {} total lines".format(_count(pyFiles, filterFunc))

	pyFiles_ = rglob_("*.py")
	print " {} total lines".format(_count(pyFiles_, filterFunc))
	print " {} total lines".format(lcount(os.path.dirname(__file__), "*.py", filterFunc))

It works by starting off in a base directory where it uses a generator expression to iterate through a glob iterator that matches any subdirectory. It then goes through these subdirectories one at a time and performs a glob pattern match on the contents of said directory. This is repeated recursively until all subdirectories off the base are inspected.

Back To The Basics – Sorting

Every couple of years it can be fun to revisit the basics. Given that I’m back in interview mode, I figured now was a good time to rewrite some of those sorting algorithms from sophomore year… in Go of course. You can find all the code in this post on GitHub under the sort directory here. The benchmark results can be reproduced by running the “gotest -v -bench=.” command from within the sort package directory. All of the results stem from the same input data and each algorithm is run with both sorted and unsorted data.

The easiest to implement and the most widely used sorting algorithm for small sets of data is Selection sort. It uses two loops and checks each entry for a suitable replacement.

package sort

func swap(a []int, x int, y int){
	var temp = a[x]
	a[x] = a[y]
	a[y] = temp

func SelectionSort(a []int) {
	for i, _ := range a {
		for j := i + 1; j < len(a); j++ {
			if a[j] < a[i] {
				swap(a, i, j)

The results show that Selection sort performs substantially faster if the array is already sorted. The worst-case, average-case, and best-case running time are all quadratic which makes it a bad choice for large quantities of data.

Selection Sort: O(n^2)
sort.BenchmarkSelectionSort_Unsorted      500      4792526 ns/op
sort.BenchmarkSelectionSort_Sorted        2000      1026659 ns/op

Another simple and natural sorting algorithm is Insertion sort. Similar to Selection sort it uses two loops to sort but offers a much better best-case running time. When the data set is nearly sorted it can execute in linear time as the second loop only executes if the value to the left is greater then itself.

func InsertionSort(a []int) {
	for j := 1; j < len(a); j++ {
		var key = a[j]
		var i = j-1
		for i >= 0 && a[i] > key {
			a[i+1] = a[i]
			i = i -1
		a[i+1] = key

The results show a substantial increase over Selection sort above. Both the worst-case and average-case running times are quadratic, but the best-case is linear. Notice the benchmark time for an already sorted array.

Insertion Sort: O(n^2)
sort.BenchmarkInsertionSort_Unsorted      1000       1835507 ns/op
sort.BenchmarkInsertionSort_Sorted      200000       8698 ns/op

Quadratic sorting time just isn’t acceptable for large sets of data. To help achieve a linear sorting time several divide-and-conquer recursive algorithms exists. Quicksort is the most common, although slightly more complicated than the above quadratic algorithms. It offers a substantial performance increase when dealing with large unsorted data sets. A complete breakdown of Quicksort is outside the scope of this post, if necessary you can study it more in-depth here. Basically, Quicksort recursively sorts the data set by continually dividing the set into subsets centered around a pivot point (in this case a[r] in partition() below).

func partition(a []int, p int, r int) int {
	var x = a[r]
	var i = p - 1
	for j := p; j <= r - 1; j++ {
		if a[j] <= x {
			i = i + 1
	return i + 1

func quickSort(a []int, p int, r int) {
	if p < r {
		var q = partition(a, p, r)

func QuickSort(a []int) {
	quickSort(a, 0, len(a)-1)

The worst-case running time is quadratic and happens when the array is already sorted. The results show a slower worst-case sorting time then Selection sort, making Quicksort a bad choice if the data set is already or nearly sorted. However, the average-case and best-case running times are linear making it a good all-around sorting algorithm for large data sets.

Quicksort: O(n log n)
sort.BenchmarkQuickSort_Unsorted      500      2836278 ns/op
sort.BenchmarkQuickSort_Sorted          500      6091908 ns/op

Despite the obvious benefits of Quicksort, it’s often a poor choice for applications requiring a consistent predictable sort time (such as in real-time embedded systems). This is because it is often not possible to know how sorted the data set is at any given time, meaning that it’s difficult to predict if the sort operation will be O(n^2) or O(n log n). In these scenarios using the Heap sort algorithm is a better choice. Although at first, it appears more complicated then Quicksort, it offers both a worst-case and average-case O(n log n) sort time.

A heap is a binary tree where the entries of the nodes can be compared with the less-than operator, meaning that the value of a node is never less than its children and each level of the tree is complete with non-complete leaves. The deepest level is as far left as possible. For a complete description of Heap sort look here.

// in Python these would be a lambda and C++ would be an inline
func parent(i int) int {
	return ((i-1)/2)

func left(i int) int {
	return ((2*i)+1)

func right(i int) int {
	return ((2*i)+2)

func MaxHeapify(a []int, i int, size int) {
	var left = left(i)
	var right = right(i)
	var largest = i

	if left  a[i] {
		largest = left

	if right  a[largest] {
		largest = right

	if largest != i {
		MaxHeapify(a, largest, size)

func BuildMaxHeap(a []int) {
	for i:=(len(a)/2); i >= 0; i-- {

func HeapSort(a []int) {
	var size = len(a)
	for i:=len(a)-1; i >=1; i-- {
		size = size - 1
		MaxHeapify(a,0, size)

Heap Sort: O(n log n)
sort.BenchmarkHeapSort_Unsorted      10000       233454 ns/op
sort.BenchmarkHeapSort_Sorted          10000       262941 ns/op

Unit Testing & Benchmarking – Google Go

Writing unit level tests and benchmarks are a critical aspect to ensuring software remaining reliable and performing as expected as a project grows. With many languages this requires the implementer to go out of their way to create a lot of special code, code that rarely gets updated as the unit properties change. Go makes this super easy and I am very impressed with the built in support for both of these operations. As an example, let’s look at a unit level test and benchmark for two different rot13 methods. Below are two methods for calculating a rot13 translation, one uses a look up table and the other calculates each individual value using the mod operator.


// rot13(rot13(x)) = x
// ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
// NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm

var table = map[byte]byte{'A':'N','B':'O','C':'P','D':'Q','E':'R','F':'S','G':'T','H':'U','I':'V','J':'W',

func Rot13Table(input []byte) {
	for i, c := range input {
		input[i] = table[c]

func Rot13Mod(input []byte) {
        for i, c := range input {
                base := byte(0)
                if c &gt;= 'a' &amp;&amp; c &lt;= 'z' {
                        base = 'a'
                }else if c &gt;= 'A' &amp;&amp; c &lt;= 'Z' {
                        base = 'A'
                if base != 0 {
                        input[i] = byte((c - base + 13) % 26 + base)

Using Go’s built in approach for writing tests and benchmarks we can write a *_test.go file for ensuring our methods are producing the correct result and for profiling their performance. In Go, any method that starts with ‘Test’ or ‘Benchmark’ in a _test.go file are automatically compiled and executed when the ‘gotest’ command is run.


import (

func BenchmarkRot13Mod(b *testing.B){
	translate := &quot;ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz&quot;

	temp := []byte(translate)
	for i:=0; i &lt; b.N; i++ {

func BenchmarkRot13Table(b *testing.B){
	translate := &quot;ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz&quot;

	temp := []byte(translate)
	for i:=0; i &lt; b.N; i++ {

The above (partial) snippet contains two benchmark tests, one for testing the table version and one for the mod operator approach. To run the test, simply execute “gotest -v -bench=.” within the unit and wait for the result. Each benchmark is looped b.N times and is repeated until Go acquires a high enough confidence in the value.

myHttp.BenchmarkRot13Mod 5000000 397 ns/op
myHttp.BenchmarkRot13Table 500000 3761 ns/op

Based on the above results, we can clearly see that the look up table approach is about 10x slower than the mod operator. Having this information available to future developers is highly desirable as it allows them to measure the impact of a unit level change and make an informed decision on which algorithm to use for a solution.

Expanding The Toolbox – Google Go

I’ve decided to experiment with 12 new programming languages or frameworks over the next year. Just like spoken languages, mastery of a coding language can take years even with daily use. With just one month per language I hope to develop a basic understanding of the languages strengths and weaknesses. Languages are tools that help us solve problems and having a broad perspective of what’s in the toolbox can simplify our solutions. Lots of problems can be solved with C, but why use it for the subset that Python can solve faster and more reliably?

All programming languages have a unique feel. Python fells concise and lucid whereas C++ feels bulky and ridged. For August’s language I picked Google’s Go. I selected Go because it supposedly has the feel of a dynamic interpreted language with the speed and type safety of a compiled language.

To get a feel for the syntax I wrote a simple closure that returns two functions, one for adding a list of integers to the sum and one for calculating the average. As you can see, Go certainly has the feel of a scripting language but requires variable type declaration. Like Python it also allows for returning multiple results from a method.

// Note: WordPress doesn't have a 'Go' syntax highlight yet so I'm using javascript
package main

import (

//Closure that returns a sum and average method
func stats() (func(a []int) int, func() float64){
	sum := 0
	entries := 0
	log := func(){
		fmt.Printf("sum: %d, entries: %d\n",sum,entries)
	return func(l []int) int {
		defer log()
		for i:=0;i<len(l);i++ {
			sum += l[i]
			entries += 1
		return sum
	func() float64 {
		return float64(sum)/float64(entries)

func main() {
	sum,avg := stats()
	s := sum([]int{1,2,3})
	s = sum([]int{1,1,1,4,1})
	fmt.Printf("sum: %d avg: %f\n", s, avg())

As a long time C programmer I initially found the reversed type and declaration syntax to be awkward, but after writing a few hundred lines it actually flows nicely. Rob Pike put together a great three day tutorial that you can access online or in the /go/doc directory. For more information on the syntax check out the day one slides. By far the most compelling feature of Go is the built in support for concurrency. As someone who has done lots of multi-threaded and multi-process development I believe the Go development team did a fantastic job at making it a fundamentally ingrained principle. To illustrate this feature I wrote a concurrent Pell Number generator below.

package pell

// Pell Numbers
// 0, 1, 2, 5, 12, 29, 70, 169, 408, 985, 2378
// P0 = 0, P1 = 1, PN = 2PN-1 + PN-2

// Day 3: Do not communicate by sharing memory.
// Instead, share memory by communicating.

func Pell(ch chan<- uint64) {
	var fn_1, fn_2 uint64 = 1, 0
	ch <- 0
	ch <- 1
	for {
		fn_2, fn_1 = fn_1, ((2*fn_1) + fn_2)
		ch <- fn_1

There is nothing special about the above Pell routine. Its parameter is a communication pipe (called a channel in Go) that guarantees synchronization because it is a blocking write. To start the thread you use the “go” keyword and read data from the same channel passed to the routine. Upon a read it unblocks the thread which then calculates the next number in the sequence.

package main

import ( "fmt"

func main() {
	c := make(chan uint64)
	var iterations *int = flag.Int("i", 10, "number of iterations to run")

	go pell.Pell(c)
	for i:=0;i < *iterations; i++ {
		var num uint64 = <-c


From my perspective the one major red flag is the lack of support for object oriented design. Although the ability to add a method to a struct or any data type is very cool I don’t yet see how it can fully replace the concept of a class. I added a small miscellaneous Go repository on gitHub that anyone interested in some social coding can fork and modify at will.