Testing Bisq #105

freimair · 2019-08-06T10:15:33Z

This is a Bisq Network proposal. Please familiarize yourself with the submission and review process.

The Bisq code base grew rather big, the main features are in place, the DAO is alive and growing and there are no major issues anymore (and with #74, we finally settled on a toolbox for testing). I believe that now is the time to shift the development priorities slightly towards upping the code quality and streamlining the review/merging process. I propose doing that by upping the test coverage of Bisq as the status of the software indicates a clear lack of testing efforts.

Goals

By testing, we can produce benefits in multiple different areas of the Bisq project, functional and non-functional. First and foremost, testing itself has the inherent advantage of testing the code and providing feedback. An non-exhaustive list of advantages is:

reduce risk of introducing breaking changes into the Bisq code
reduce risk of corner cases not being covered in day-to-day manual tests
reduce efforts of release-testing and therefore, prevent skipping tests

Secondary advantages arise from having a basic testing framework in the means of resources. A prominent example is the workload of a reviewer. Whenever a reviewer approves a PR, she should run a manual release test in order to feel confident on a PR not breaking Bisq. First, running a full manual release test is no feasible for all PRs and second, it takes a lot of resources. Especially with Bisq's (currently) very limited dev resources, PRs are either tested insufficiently or kept in the waiting line for a long time. An non-exhaustive list of advantages is:

shorten PR integration time
free up reviewer resources to be used elsewhere
spare new developers the hurdle of setting up Alice, Bob, an arbitrator, a seed node, a bitcoin node, ... just to fix some small issue and therefore keep their efforts low and efficient

Tertiary advantages might pave the way for bigger refactoring and code cleanup attempts (which generally lead to a more stable product). Again, a successful test ensures that a PR does not break Bisq. First, the dev doing the PR has better certainty whether the stuff she delivered is good and second, again, responsibility is taken off the reviewers (of which there are currently three people) which eventually will speeds up PR review and integration. A non-exhaustive list of advantages is:

attempt bigger refactorings and code cleanup
to simplify the code base step by step
eventually improve code quality and thus, product stability

All in all, upping the test coverage is going to streamline the Bisq development cycle and quality and the resources needed to get the testing framework up and running will eventually pay off.

Two approaches

I propose to follow two approaches simultaneously. First, create something that has a profound short-term (positive) effect on developer resources and second, start integrating low-level tests for new functionality.

Short-term

I propose creating an automated test framework and tests that spin up a small Bisq P2P network with Alice (Device under Test), Bob, an Arbitrator, a seed node and a bitcoin node. Use this network to perform day-to-day trading procedures to eventually replace the kind of manual release testing as it is done now.

The obvious pros of such an approach are

quickly reduce manual (release-) testing of day-to-day operations
provide developers a way to quickly check whether their PR breaks anything
remove the need for new developers to setup a full development environment for small fixes
help the reviewer decide on a PR

and with that

use the freed-up resources elsewhere

The obvious con is the sudden need for more computing resources (but I am quite sure that our current Travis-setup can handle it).

Long-term

For a long-term effect on Bisq's code quality and stability, we need to significantly increase the testing coverage of Bisq. Such a task cannot be done over night, especially as testing has not been a priority in the past. In order to up the testing coverage without major delay of functional development, I propose these measures to be enforced:

for each PR that fixes a bug, a set of effective tests has to be provided
for each PR that introduces a new feature, a set of effective tests has to be provided
for each PR that refactors existing code, a set of effective tests has to be provided

A set of effective tests has to meet the following criteria:

the efforts required for getting the test up and running has to be reasonable
the efforts required for keeping the test maintained has to be reasonable

Whether efforts are reasonable or not depend on the situation. Reviewers might demand further tests if they deem it worthy (as agreed on source).

The pros of such an approach are:

test coverage increases over time without having a huge impact on feature development
testing overhead increases linearly, thus, computing resources can be adjusted with time

The obvious con of such an approach is that it takes a long time for effects to be noticeable on a large scale.

Implementation

Implementing the short-term part of the proposal poses quite a challenge. First and foremost, the environment of Bisq contains highly dynamic parts such as the P2P network, the Bitcoin network and thus, network latency and timing issues. Furthermore, the Bisq code is not ready to test as is, since business logic can be found in the GUI parts of the code (source).

For a "real" short-term effect, @blabno's API is the way to go as he already has an extensive set of the kind of tests mentioned before.

Implementing the long-term part of the proposal poses challenges as well. Due to the dynamic environment, bugs cannot always be reproduced by an effective static test.

For a long-term effect, start creating tests.

TL;DR;

In order to push the quality of Bisq's code and its development process, creating a testing culture has become absolutely necessary. Small issues happen on every end of Bisq and patching those issues often leads to more complicated code and leaves the core issue untouched. Furthermore, approving changes (as a reviewer) became a gamble, as things are getting so complex that big changes (necessary ones that actually touch the core issues) are almost impossible to assess.

For a short-term strategy, I propose to get a green/red end-to-end test up and running. If this test succeeds, the proposed change does not break Bisq in its roots. A red light on the test might uncover issues that might no even be related to the proposed change. For reasons of efforts, I propose to use some sort of API.

For a long-term strategy, I propose to just get started and create actual tests following reason in terms of efforts and effect.

blabno · 2019-08-12T15:37:01Z

Regarding static code analysis I usually used several tools together as each had some checks that others did not. Those were :

For the long run I think we should unit test everything at least in the core. With the UI will have to see how time consuming it is to test view classes thoroughly.
In the beginning the main obstacle, for writing good unit tests, will be the tangles between different components and layers of code. But I think we should not get discouraged by that, because well tested code will pay off in the long run.

mpolavieja · 2019-09-16T15:24:46Z

Approved on Cycle 5 DAO voting

oneingan · 2019-12-04T10:59:16Z

Please, take a look to proposal #146. I think would be very convenient for the short-term approach.

mpolavieja · 2019-12-12T12:03:53Z

Closed as approved

This was referenced Aug 13, 2019

Dev Call "Seed nodes: Goals and strategies on how to get seed nodes more stable and reliable" bisq-network/events#27

Closed

Dev-Call "Testing Roadmap" bisq-network/events#28

Closed

Cycle 5 bisq-network/compensation#336

Closed

freimair mentioned this issue Aug 21, 2019

Dev Call Topic Collector bisq-network/events#31

Open

17 tasks

freimair mentioned this issue Sep 8, 2019

Dev Call "How to proceed with the API" bisq-network/events#30

Closed

9 tasks

freimair mentioned this issue Sep 16, 2019

Cycle 6 bisq-network/compensation#363

Closed

mpolavieja added the was:approved label Sep 16, 2019

oneingan mentioned this issue Dec 3, 2019

Nix package #146

Closed

mpolavieja closed this as completed Dec 12, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Testing Bisq #105

Testing Bisq #105

freimair commented Aug 6, 2019 •

edited

blabno commented Aug 12, 2019

mpolavieja commented Sep 16, 2019

oneingan commented Dec 4, 2019

mpolavieja commented Dec 12, 2019

Testing Bisq #105

Testing Bisq #105

Comments

freimair commented Aug 6, 2019 • edited

Goals

Two approaches

Short-term

Long-term

Implementation

TL;DR;

blabno commented Aug 12, 2019

mpolavieja commented Sep 16, 2019

oneingan commented Dec 4, 2019

mpolavieja commented Dec 12, 2019

freimair commented Aug 6, 2019 •

edited