Testing the untestable
Admit it: how many times have you seen “software from this branch is completely untested, use it at your own risk” when checking out the latest code of a FOSS project? I bet you have, many times. For any reasonably modern project, this is not entirely true: Continuous Integration and automated testing are a huge help in ensuring that the code builds and at least does what it is supposed to do. KDE is no exception, thanks to build.kde.org and a growing number of unit tests.
Is it enough?
This, however, does not cover functional testing, i.e. checking whether the software actually does what it should. You wouldn’t want KMail to send kitten pictures as a reply to a meeting invitation from your boss, for example, or you might want to check that your office suite starts and is actually able to save documents without crashing. This is something you can’t test with traditional unit testing frameworks.
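To make the distinction concrete, here is a minimal sketch in Python (pytest style). Everything in it is invented for illustration: format_reply_subject is a made-up helper, and officesuite is a made-up binary standing in for a real office application.

```python
import subprocess

def format_reply_subject(subject: str) -> str:
    """A pure helper function: the kind of code unit tests handle well."""
    return subject if subject.startswith("Re: ") else "Re: " + subject

def test_format_reply_subject():
    # Unit test: exercises one function in isolation, no application needed.
    assert format_reply_subject("Meeting at 10") == "Re: Meeting at 10"

def test_suite_saves_document(tmp_path):
    # Functional test: runs the whole (hypothetical) application and checks
    # its externally visible behaviour, not any single function.
    out = tmp_path / "test.odt"
    subprocess.run(["officesuite", "--headless", "--save-as", str(out)],
                   check=True, timeout=60)
    assert out.exists(), "the application ran but saved no document"
```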
Why does this matter to KDE? Nowadays, the dream of “always summer in trunk”, as proposed eight years ago, is getting closer, and there are several ways to run KDE software directly from git. However, beyond the CI and unit tests mentioned above, there is no additional testing.
Or, should I rather say, there wasn’t.
Our savior, openQA
Those who use openSUSE Tumbleweed know that even though it is technically a “rolling release” distribution, it is extensively tested. That is made possible by openQA, which runs a full series of automated functional tests, from installation to actual use of the desktops shipped by the distribution. The recently released openSUSE Leap also benefited from this testing during its development phase.
“But, Luca,” you would say, “we already know about all this stuff.”
Indeed, this is not news. But the big news is that, thanks mainly to the efforts of Fabian Vogt and Oliver Kurz, openQA is now also testing KDE software from git! This works by feeding the Argon (Leap based) and Krypton (Tumbleweed based) live media, which are built roughly daily, to openQA, and running a series of specific tests on them.
You can see here an example for Argon and an example for Krypton (note: some links may become dead as tests are cleaned up, and will be adjusted accordingly). openQA exercises both distro-level functionality (the console test) and KDE-specific operations (the X11 test). The latter checks that a terminal can be launched, runs a number of programs (Kate, Kontact, and a few others), and performs some very basic tests with Plasma as well.
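For the curious: real openQA test modules are written in Perl against os-autoinst’s testapi, and work by sending keyboard and mouse input to a virtual machine and matching its screen against reference images called “needles”. The following Python-flavoured sketch only illustrates that pattern; the VM class and matches() are invented stand-ins, and while assert_screen, send_key, and type_string mirror the names of real testapi calls, none of this is the actual API.

```python
import time

class VM:
    """Invented stand-in for a QEMU instance driven by the test runner."""
    def screenshot(self) -> bytes: raise NotImplementedError
    def send_key(self, key: str) -> None: raise NotImplementedError
    def type_string(self, text: str) -> None: raise NotImplementedError

def matches(shot: bytes, needle: str) -> bool:
    """Invented stand-in for comparing a screenshot against a needle
    (a reference image plus the areas of it that must match)."""
    raise NotImplementedError

def assert_screen(vm: VM, needle: str, timeout: float = 30.0) -> None:
    # The core primitive: poll the screen until it matches the needle,
    # failing the test if it never does within the timeout.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if matches(vm.screenshot(), needle):
            return
        time.sleep(1)
    raise AssertionError(f"screen never matched needle {needle!r}")

def test_kate_starts(vm: VM) -> None:
    # Launch Kate via KRunner (Alt+F2 is Plasma's default shortcut)
    # and wait until its window shows up on screen.
    vm.send_key("alt-f2")
    vm.type_string("kate\n")
    assert_screen(vm, "kate-launched")
    vm.send_key("alt-f4")  # close the window again
```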
Is it enough to test the full experience of KDE software? No, but it is a solid foundation for more automated testing to catch functional regressions: during the openSUSE Leap 42.2 development cycle, openQA found several upstream issues in Plasma, which were reported to the developers and promptly fixed.
Is this enough for everything?
Of course not. Automated testing only gets you so far, so this is not an excuse to be lazy and stop filing bug reports. Also, since the tests run in a VM, they can’t catch issues that only occur on real hardware (multiscreen setups, compositing). But it is surely a good start in ensuring that at least obvious regressions are found before the code is shipped to distributions and then to end users.
What needs to be done? More tests, of course. In particular, Plasma regression tests (handling applets, etc.) would likely be needed; a rough idea of what one might look like is sketched below. But as they say, every journey starts with the first step.
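As a thought experiment, a Plasma applet regression test in the same illustrative style (reusing the hypothetical VM and assert_screen sketched earlier) could look roughly like this; every shortcut and needle name here is invented:

```python
def test_add_and_remove_applet(vm: VM) -> None:
    # Hypothetical regression test: add a clock widget to the desktop,
    # verify it appears, then remove it and verify the desktop is clean.
    vm.send_key("alt-d")               # invented "add widgets" shortcut
    vm.type_string("Digital Clock\n")
    assert_screen(vm, "plasma-clock-applet-added")
    vm.send_key("alt-d")               # invented "remove applet" shortcut
    assert_screen(vm, "plasma-desktop-clean")
```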