Test Load Balancer (TLB)

This documentation directory covers the core concepts and all the configuration options for TLB.

Enhancing and Contributing to TLB

TLB was written from the ground up to be highly customizable and tweakable. That means you can add new algorithms, add support for new frameworks and so on with relatively little work. The idea is that TLB takes care of all the boring plumbing and TLB-specific work, and lets end users write only what matters to them.

To understand how TLB works, please refer to the TLB concepts. We strongly recommend going through the whole page so that you get a good understanding of TLB's internals if you intend to customize it.

You may want to peek into the algorithms we are working on. You can find all the spike work under "set-part" in the Projects listing. That page also lists all the TLB projects. If any of them interest you, you can choose to send us patch-hungry developers some tasty bug fixes or features :)

Adding new Splitters and Orderers

Most of the algorithmic enhancements are in the splitting and reordering area. This section talks about how to go about attacking these areas.

Splitter

You can extend either the tlb.splitter.JobFamilyAwareSplitter or the tlb.splitter.TestSplitter class and implement the relevant method(s) using whatever algorithm you want. Once you have implemented a splitter, point TLB_SPLITTER to it, or add it to the chain of splitters in the TLB_PREFERRED_SPLITTERS variable (so you can fall back to the built-in algorithms in case your algorithm fails) to have it used. Do check out the Configuration Variables to understand how splitters can be configured.

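NOTE: Remember that splitters need to guarantee the contract of mutual exclusion and collective exhaustion: no test may be assigned to more than one partition, and every test must be assigned to some partition.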
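
A quick way to sanity-check a custom splitter is to collect what it returns for every partition and verify the contract over the combined output. The following is a minimal Python sketch and not part of TLB: the function name contract_violations and the representation of partitions as plain lists of test names are assumptions made for illustration.

```python
def contract_violations(all_tests, partitions):
    """Return a list of contract violations (empty when the contract holds)."""
    violations = []
    owner = {}  # test name -> index of the first partition that claimed it

    # Mutual exclusion: a test may be claimed by one partition only.
    for index, partition in enumerate(partitions):
        for test in partition:
            if test in owner:
                violations.append(
                    f"{test} assigned to partitions {owner[test]} and {index}"
                )
            else:
                owner[test] = index

    # Collective exhaustion: every test must be claimed by some partition.
    for test in all_tests:
        if test not in owner:
            violations.append(f"{test} not assigned to any partition")

    # A splitter should prune the list, never invent new members.
    known = set(all_tests)
    for test in owner:
        if test not in known:
            violations.append(f"{test} is not in the original list")

    return violations
```

An empty return value means the partitions, taken together, cover every test exactly once.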

Orderer

You can extend the tlb.orderer.TestOrderer class and override the relevant method(s). This is used for sorting the test suites. Point TLB_ORDERER to the newly implemented orderer to order the way you want.

NOTE: Remember, the contract of the orderer is to just reorder the list passed in, and not to change its size or replace its members. If you do, the mutual exclusion and collective exhaustion rule may be broken.
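
The orderer contract amounts to saying that the output must be a permutation of the input. A minimal Python sketch of such a check follows; it is not part of TLB, and the names is_pure_reordering and checked_order, as well as the idea of an orderer as a plain callable, are assumptions made for illustration.

```python
from collections import Counter


def is_pure_reordering(original, reordered):
    """True when `reordered` holds exactly the members of `original`, in any order."""
    # Counter compares multiplicities too, so dropped, added, replaced or
    # duplicated entries are all detected.
    return Counter(original) == Counter(reordered)


def checked_order(order, tests):
    """Apply the hypothetical `order` callable and reject contract violations."""
    reordered = list(order(list(tests)))  # pass a copy so the input stays untouched
    if not is_pure_reordering(tests, reordered):
        raise ValueError("orderer changed the members of the list")
    return reordered
```

For example, checked_order(sorted, tests) passes, while an orderer that drops or substitutes an entry raises ValueError.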

Any implementation that you write needs to be dropped in the classpath of the TLB Java runtime, and TLB will take care of loading it.

Adding support for new frameworks

To load-balance tests, TLB needs to hook up with 2 kinds of frameworks.

Testing framework: This is the framework that actually runs the tests (for instance JUnit or RSpec etc)
Build framework: This is the framework that calls out to the tests (Ant or Rake etc)

Supporting a new Testing framework

TLB assumes that a test framework provides an option to specify the list of test resources (for example: files, test-case names, shared libraries such as .so files or DLLs, etc.) that get executed. The list is passed to the Splitter, which prunes the resource list (refer to the next section on how to do this). After pruning, the list of resources is passed through the orderer, where it is reordered. The orderer's contract is to change only the order and not the resources themselves (no adding, removing or replacing).
The final list of test resources is what is fed into the test framework for execution.
Once the tests are executed, TLB needs a way to capture the test result and the time taken for execution of each resource in order to report back to the TLB Server. This feedback is used for partitioning/re-ordering tests accurately and sensibly in future invocations. Refer to the next section to find out the details.
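
The flow described above (prune, reorder, execute, collect feedback) can be sketched as follows. This is an illustrative Python sketch and not TLB code: run_balanced and its split, order and run_one parameters are hypothetical callables standing in for the Splitter, the Orderer and the test framework. Recording the time in whole milliseconds is also an assumption, as is leaving the meaning of the result value to the caller.

```python
import time


def run_balanced(resources, split, order, run_one):
    """Prune, reorder and run test resources, collecting feedback for each one."""
    subset = split(list(resources))   # the splitter prunes the full list
    ordered = order(list(subset))     # the orderer only rearranges the pruned list

    feedback = []
    for name in ordered:
        started = time.monotonic()
        result = run_one(name)        # outcome as reported by the test framework
        elapsed_ms = int((time.monotonic() - started) * 1000)
        # One record per resource: identifier, time taken, result.
        feedback.append((name, elapsed_ms, result))
    return feedback
```

The returned records carry exactly the two pieces of feedback mentioned above, the time taken and the result, keyed by the same identifier that was used for splitting.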

Supporting a new Build tool

Supporting build tools is a matter of implementing the end-user interface which delegates to the Splitter and Orderer. It goes a little further than that and does the plumbing work of attaching listeners to the test running framework so the feedback can be posted.

For instance, Ant support hooks up at the fileset-filter level for partitioning and re-ordering and attaches a JUnitResultFormatter to the test task, so the feedback gets posted.

Interacting with TLB in an alien environment

TLB is written in Java, so Java is considered to be TLB's native environment. However, it can be used on any platform and with any language. We call a non-Java environment an "Alien Environment".

TLB's job involves applying the balancing and reordering algorithms and posting test data to the TLB server. Since this is common to every platform/language, each one need not reinvent it. We provide a simple way of reusing this functionality, which is implemented in Java.

TLB has the concept of a Balancing Server. This is a really lightweight HTTP RESTlet server that knows how to do the following:

Pruning a list of tests (Splitting)
Reordering the pruned list of tests (Ordering)
Posting test information to the TLB server
Fetching test information from the TLB server for splitting and ordering

Given that all of this is taken care of, a developer implementing Test Load Balancing for a given platform or language just needs to hook up with the subject test framework(s) and build framework(s) as mentioned in the previous section, and use the Balancing Server to do the actual load balancing.

Using the Balancing Server

The life-cycle of the balancer process is:

Start the balancing server. This is done using:
java -jar tlb-alien-gXXX.jar
A few environment variables must be set before this invocation is made. Please refer to the "Balancer against TLB Server" column in Configuration Variables to get the list of the variables that it needs.
Obtain the list of tests that need to be run from the test framework
Pass this list along to the Balancing server. It needs to be an HTTP POST to URL: http://localhost:${TLB_BALANCER_PORT}/balance
The post body should be of the format:
com/foo/project/test_file1\ncom/bar/project/test_file2\n...com/bar/project/test_filex
i.e. the body should have the identifier of each test resource, with the resource names separated by "\n" (the new-line character). The format of the names is not important: it can be anything, and is usually either a relative path or a logical name (a class name, etc.). Note that the separator needs to be "\n" irrespective of the platform you are on.
The body of the response to the aforementioned HTTP POST request will contain the pruned list of tests in the same format (separated by new-line characters).
After these tests are finished, you need to pass along the test run time to the balancing server. You can do this by performing an HTTP POST to the URL: http://localhost:${TLB_BALANCER_PORT}/suite_time
The post body should be of the format:
com/foo/project/test_file1:123\ncom/bar/project/test_file2:20\n...com/bar/project/test_filex:344
i.e. the body should have the name (identifier) of the test resource and the time taken (a valid integer) separated by a colon (":"), and each entry must be separated by the "\n" character. The name (identifier) should be exactly the same as the one you sent across for pruning/reordering.
The results of the tests must also be posted if the re-order functionality is used. This can be done by performing an HTTP POST to the URL: http://localhost:${TLB_BALANCER_PORT}/suite_result
The post body should be of the format:
com/foo/project/test_file1:false\ncom/bar/project/test_file2:false\n...com/bar/project/test_filex:true
i.e. the body should have the name (identifier) of the test resource and the result of the test separated by a colon (":"), with each entry separated by the "\n" character. The value of the result must be either "false" or "true".
Once all the tests are executed, terminate the balancing server. This can be done by sending an HTTP GET to the URL: http://localhost:${TLB_BALANCER_PORT}/control/suicide
Doing so will terminate the Balancing server gracefully.
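
The HTTP steps of the life-cycle above can be sketched in a few lines of Python using only the standard library. This is a hedged illustration rather than an official client. It assumes the balancing server has already been started with the java -jar command shown above and that TLB_BALANCER_PORT is set in the environment. The function names are hypothetical, and the UTF-8 encoding and text/plain content type are assumptions, since this page does not specify either.

```python
import os
import urllib.request


def balancer_url(path, port=None):
    """Build a Balancing Server URL; the port defaults to TLB_BALANCER_PORT."""
    port = port or os.environ["TLB_BALANCER_PORT"]
    return f"http://localhost:{port}/{path}"


def encode_names(names):
    # Always join with "\n", whatever the platform's native line ending is.
    return "\n".join(names)


def decode_names(body):
    # Split on "\n" only; blank entries (e.g. from a trailing newline) are dropped.
    return [name for name in body.split("\n") if name]


def encode_times(times):
    # times: mapping of resource name -> integer time taken
    return "\n".join(f"{name}:{int(taken)}" for name, taken in times.items())


def encode_results(results):
    # results: mapping of resource name -> bool, written as "true"/"false"
    return "\n".join(
        f"{name}:{'true' if flag else 'false'}" for name, flag in results.items()
    )


def post(path, body, port=None):
    """POST a text body to the Balancing Server and return the response body."""
    request = urllib.request.Request(
        balancer_url(path, port),
        data=body.encode("utf-8"),               # assumption: UTF-8 body
        headers={"Content-Type": "text/plain"},  # assumption: plain-text body
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def balance(names, port=None):
    """Send the full test list and return the pruned list for this partition."""
    return decode_names(post("balance", encode_names(names), port))


def report(times, results, port=None):
    """Post run times and results, using the same names that were sent to balance."""
    post("suite_time", encode_times(times), port)
    post("suite_result", encode_results(results), port)


def stop_balancer(port=None):
    """Ask the Balancing Server to terminate gracefully."""
    with urllib.request.urlopen(balancer_url("control/suicide", port)):
        pass
```

A typical run would call balance(...) with the names obtained from the test framework, execute the returned subset, call report(...) with the collected times and results, and finish with stop_balancer().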

As one can see, the balancing server can be used on any platform and with any language, as long as you can talk over HTTP to get the splitting and reordering done. This is exactly how 'tlb.rb', the Ruby support, is implemented. 'tlb.net', which is currently under development, also uses the same life-cycle.