Code lines wrongly detected as executable in parallel jobs for a Julia project

Description

We use Travis to test our Julia project in parallel. That is, different parts of the code are tested in separate jobs to reduce the build time. Merging of the reports generally works fine. However, since we changed from one job per build to multiple parallel jobs per build, the Codecov.io coverage reports indicate a significantly lower coverage than before. We tracked the issue down to Codecov erroneously reporting method declarations as executable lines (which it did not do before we enabled parallel builds). To make sure that it’s not just a simple snafu, we compared it to the report by Coveralls, which does not seem to have this issue. Below is an example for a file that shows the weird behavior in action (line 9).

Example from Codecov.io (Codecov)

Example from Coveralls.io (trixi-framework/Trixi.jl | Build 108 | src/amr/amr.jl | Coveralls - Test Coverage History & Statistics)


Repository

CI/CD

Travis

Here’s the missing example from Coveralls.io:

Hi @sloede, thanks for this. Could you provide a build before you switched to multiple parallel jobs?

As for why you are seeing that line being uncovered, builds 133.3 and 133.7 show explicitly that that line is uncovered. How are you collecting coverage metrics?

Hi! I’m the one who deactivated the Codecov integration after enabling parallel tests in the pull request “run tests on Travis CI in parallel” by ranocha (Pull Request #140, trixi-framework/Trixi.jl), which is linked to a Codecov build. I was asked to remove the Codecov integration completely, so we don’t have any old reports, and the one linked above will not work anymore. I just activated Codecov again after some time (without adding a badge to the README.md).

We are collecting coverage metrics in Julia using Coverage.jl (JuliaCI/Coverage.jl).

@tom Any idea what could cause the problem or how we could fix it?

Hi all, this one was tricky, and I still don’t have a great answer.

I’m focusing on this build and the file src/amr/amr.jl. What you’ll notice is that the Codecov step happens before the Coveralls step, and something else runs in between the two. The log between those two steps states:

[ Info: CoverageTools.process_file: Detecting coverage for src/amr/amr.jl
┌ Info: CoverageTools.process_cov: Coverage file(s) for src/amr/amr.jl do not exist.
└ Assuming file has no coverage.

My guess is for this build, Coveralls is not receiving line coverage for that file, but we still are. I can’t figure out how to view the raw report sent to Coveralls, but you can find the build here.

We, however, do get coverage for that file here, which shows those two lines as uncovered.

My guess as to how to fix this would be to run the two commands at the same time and you should expect the same coverage (dependent on partial coverage). I was unaware of being able to do this in Travis, but we always recommend the bash uploader. In this case, I’m not sure how you would control for timing.
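To make the guess above more concrete, here is a minimal Python sketch of how a function definition line could end up marked as uncovered after per-job reports are merged. Everything in it is an assumption for illustration: the per-line encoding (`None` for not executable, `0` for executable but never hit), the helper names `merge_line` and `merge_reports`, and the merge rule itself. It is not a description of how Codecov, Coveralls, or Coverage.jl actually work.

```python
from typing import List, Optional, Sequence

# Hypothetical per-line encoding: None = not executable,
# 0 = executable but never hit, n > 0 = hit n times.
LineCov = Optional[int]


def merge_line(a: LineCov, b: LineCov) -> LineCov:
    """Merge one line from two jobs: executable if either job says so."""
    if a is None:
        return b
    if b is None:
        return a
    return a + b


def merge_reports(reports: Sequence[Sequence[LineCov]]) -> List[LineCov]:
    """Merge per-job reports for the same file, line by line."""
    merged: List[LineCov] = list(reports[0])
    for report in reports[1:]:
        merged = [merge_line(a, b) for a, b in zip(merged, report)]
    return merged


# Hypothetical three-line file: `function f(x)`, its body, `end`.
job_with_tests = [None, 3, None]   # job that ran the code: definition line not executable
job_without_data = [0, 0, None]    # job with no coverage data: lines assumed uncovered

merged = merge_reports([job_with_tests, job_without_data])
print(merged)  # [0, 3, None]
```

Under this assumed rule, a single job that reports the definition line as executable with zero hits is enough to turn it red in the merged report, even though the job that actually ran the code treated it as non-executable.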

Hi @tom,

Sorry for not replying to you earlier. First of all, thank you very much for taking the time to investigate. We have since switched to GitHub Actions and the problem persists. We also found at least one other Julia project with similar issues, so we don’t believe it is something specific to our CI setup.

To recap, we still see the same incorrect coverage analysis: the first line of a Julia function definition (starting with function ...) is counted as “executable” with zero executions, which artificially lowers the overall coverage percentage. Examples are in

However, I am not 100% convinced that it is related to the parallel builds (or at least not in a simply reproducible way). For example, in these projects, the function definition line function ... is correctly counted as “not executable” and thus does not affect the overall coverage, even though they are tested in parallel:

As you can see with the example for OrdinaryDiffEq, even within the same repository sometimes the lines are correctly counted, sometimes they are not. Any idea what might be causing this and how to fix it?
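As a rough illustration of how much a miscounted definition line can lower the reported percentage, here is a small Python sketch. The per-line encoding (`None` for not executable, `0` for executable but never hit), the helper name `coverage_percent`, and the ten-line example file are all hypothetical and not taken from any of the repositories discussed here.

```python
from typing import Optional, Sequence


def coverage_percent(lines: Sequence[Optional[int]]) -> float:
    """Percentage of executable lines that were hit at least once."""
    executable = [hits for hits in lines if hits is not None]
    if not executable:
        # Convention chosen for this sketch: no executable lines counts as 100%.
        return 100.0
    covered = sum(1 for hits in executable if hits > 0)
    return 100.0 * covered / len(executable)


# Hypothetical file: two function definition lines and eight body lines, all hit.
definitions_not_executable = [None, 1, 1, 1, 1, None, 1, 1, 1, 1]
definitions_counted_as_missed = [0, 1, 1, 1, 1, 0, 1, 1, 1, 1]

print(coverage_percent(definitions_not_executable))     # 100.0
print(coverage_percent(definitions_counted_as_missed))  # 80.0
```

In this made-up case every line of actual code is executed, yet the reported coverage drops from 100% to 80% solely because the two definition lines are counted as executable with zero hits.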

Hi @sloede, unfortunately, this one got away from me last month as I was recovering from COVID. What I’m getting is that the function lines are still being shown as green (covered) when they shouldn’t be. Is that accurate? If so, are the examples you pasted above still valid?

I have the same problem with a parallel build in Python.
Compare:

Correct:

Setting: single build with single version of Python
Observation of correct output: the function definitions (131, 180, 495, 600, 661, 684, 735) are white, i.e., not executable.

Incorrect:

Setting: parallel build, combining two kinds of unit testing approaches with two different versions of Python.
Observation of incorrect results: the function definition lines (131, 180, 495, 600, 661, 684, 735) are shown in red, i.e., as an executable line that was not executed.

Desired behaviour: I need my parallel build to test everything, but I would like the parallel build to continue showing the function definitions either as non-executable lines, or as executable lines that did get executed (like the “coverage html” does on my PC).

@joanise please open a new topic for this

Done, see Parallel coverage in Python handles @click function definitions differently than single run coverage