Codecov not correctly merging reports

Description

We currently have a large number of separate tests run per pull request, and in the past Codecov’s ability to automatically merge coverage reports has worked perfectly.

However, one PR in our repository (#721) had an issue with coverage report merging. Everything worked as each test job finished (Codecov showed a slowly increasing coverage percentage) until our last test job, qchem-tests, finished. That last job appeared to ‘overwrite’ the coverage report, suddenly lowering coverage from 98% to 95%.

Due to urgency this PR was merged, and this issue now affects the master branch and all other PRs.

Repository

CI/CD

We are using GitHub Actions to generate the coverage report.

Uploader

We are currently using the Codecov action.

Commit SHAs

Codecov YAML

Codecov Output

Run codecov/codecov-action@v1.0.7
/bin/bash codecov.sh -f ./qchem/coverage.xml -n  -F 

  _____          _
 / ____|        | |
| |     ___   __| | ___  ___ _____   __
| |    / _ \ / _` |/ _ \/ __/ _ \ \ / /
| |___| (_) | (_| |  __/ (_| (_) \ V /
 \_____\___/ \__,_|\___|\___\___/ \_/
                              Bash-20200728-9fb7d93

 ==> GitHub Actions detected.
    project root: .
    Yaml found at: codecov.yml
    -> Found 1 reports
==> Detecting git/mercurial file structure
==> Reading reports
    + ./qchem/coverage.xml bytes=289148
==> Appending adjustments
    https://docs.codecov.io/docs/fixing-reports
    -> No adjustments found
==> Gzipping contents
==> Uploading reports
    url: https://codecov.io
    query: branch=passthru-qubit&commit=3ec905c3951ec4a7b8de80c49a51a27345a7d80b&build=188437950&build_url=http%3A%2F%2Fgithub.com%2FPennyLaneAI%2Fpennylane%2Factions%2Fruns%2F188437950&name=&tag=&slug=PennyLaneAI%2Fpennylane&service=github-actions&flags=&pr=721&job=&cmd_args=f,n,F
->  Pinging Codecov
https://codecov.io/upload/v4?package=bash-20200728-9fb7d93&token=secret&branch=passthru-qubit&commit=3ec905c3951ec4a7b8de80c49a51a27345a7d80b&build=188437950&build_url=http%3A%2F%2Fgithub.com%2FPennyLaneAI%2Fpennylane%2Factions%2Fruns%2F188437950&name=&tag=&slug=PennyLaneAI%2Fpennylane&service=github-actions&flags=&pr=721&job=&cmd_args=f,n,F
->  Uploading to
https://storage.googleapis.com/codecov/v4/raw/2020-07-30/CD4F3E97F843D9EE025F0DF6F9A7D2C3/a2ff0216ad2ee2a01ea29f012a4bed61e727b7ad/fee961d3-7e54-4f9b-8e8c-69ed6860e3c1.txt?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=GOOG1EQX6OZVJGHKK3633AAFGLBUCOOATRACRQRQF6HMSMLYUP6EAD6XSWAAY%2F20200730%2FUS%2Fs3%2Faws4_request&X-Amz-Date=20200730T105126Z&X-Amz-Expires=10&X-Amz-SignedHeaders=host&X-Amz-Signature=ea4ededdafe90bf60910ca538719b9fc44d47d026aa17acdcd03a6eff1944cdc
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 21153    0     0  100 21153      0   112k --:--:-- --:--:-- --:--:--  112k
    -> View reports at https://codecov.io/github/PennyLaneAI/pennylane/commit/a2ff0216ad2ee2a01ea29f012a4bed61e727b7ad

Hi @josh146, I’d love a little more information here. When you say last test job, is this a coverage report that is generated and uploaded to Codecov? If so, can you supply a build URL? What are you expecting to be different in these changes? And finally, I noticed you are using an old version of the Actions uploader. This shouldn’t cause the issue above, but it’s worth noting.

Hi @tom!

Here is a link to the builds: https://github.com/PennyLaneAI/pennylane/runs/927217388

For each PR, we run multiple jobs on GitHub Actions. Each one tests a different part of the project, so the coverage reports from all jobs need to be merged to get an accurate measurement of the coverage.

  • the jobs labelled core-tests include unit tests of the pennylane directory
  • the job labelled qchem-tests includes unit tests of the qchem/pennylane_qchem directory

The qchem-tests job takes the longest to run and always finishes last. In this issue, once the qchem-tests coverage report is uploaded, the coverage of the pennylane directory instantaneously decreases as you note in the link to the ‘Coverage Change’.

Hi @tom, just wondering if there is any update, or diagnostics we can do?

Hi @josh146, I’m guessing this might be because of the base commit that we are comparing to here. Do you have a PR that I can take a look at with this issue? Possibly with only one or two commits?

I have a similar issue. Reports coming from different tests overwrite each other, so my overall coverage is reported as lower than it actually is! This happens on pushes to master!

@aminya can you please provide a commit SHA where you see coverage being reported less and a line that should be covered that isn’t?

Here is the commit: 3e99b7f906fe310a6493e302939a577217164b48


Hi @aminya, could you provide a file or line that is showing incorrect coverage and what you expect it to be?

If I skip uploading coverage here by reverting the commit, the coverage will be 74% but it is 63% by submitting this extra report!

Just compare the last two commits here https://codecov.io/gh/aminya/SnoopCompileBot.jl/commits

Hi @aminya, I think the coverage dropped because you have more lines with explicit coverage.

In the 63% case, you’ll see there are more lines with explicit coverage like in src/SnoopCompileBot.jl

63%
image

73%
image

Notice line 122 in particular. This means that you probably have no explicit coverage for it (the line doesn’t show up in your coverage reports) until the last report. In fact, you can see this in the first build.

image

The other coverage reports do not have an entry for line 122.

Hi @tom, I made a PR explicitly to test this here: https://github.com/PennyLaneAI/pennylane/pull/763

Here is the corresponding latest commit from that PR: https://codecov.io/gh/PennyLaneAI/pennylane/commit/9375633e4b973b3eb0de702251ff816b626dea8c/build

I’ll take the pennylane/__init__.py file for a small example. In the final merged Codecov report, it is showing 29 lines missing:

However, most of our CI tests cover this file, as can be seen from the coverage logs:

----------- coverage: platform linux, python 3.8.5-final-0 -----------
Name                                    Stmts   Miss  Cover   Missing
----------------------------------------------------------------------
pennylane/__init__.py                      77      1    99%   90

The last CI test to complete is this one (qchem-tests):

----------- coverage: platform linux, python 3.7.7-final-0 ---------------------------------------------
Name                                                                       Stmts   Miss  Cover   Missing
--------------------------------------------------------------------------------------------------------
/usr/share/miniconda/lib/python3.7/site-packages/pennylane/__init__.py        81     29    64%   78, 81, 176-178, 187, 197, 224-247, 271-302

The coverage from this last, final CI check overrides the Codecov merging of reports, and the coverage reduces across the board.

Is this an issue with file paths not being consistent between the tests?

@josh146, not quite. What is happening is that the first 4 uploads are not collecting coverage information on every line. As an example, you can view all the builds and check line 78. In the first 4 that show 99% coverage, line 78 is missing, meaning that there is no information (covered or not). With the latest, it explicitly says line 78 is not covered. So unless we have another report stating that 78 is covered, we will use that information.
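The merge rule described above can be sketched roughly as follows. This is only an illustration of the behaviour discussed in this thread, not Codecov's actual implementation; the function names and the hit counts in the example are made up.

```python
from typing import Dict, Iterable

# A report maps line number -> hit count.
# A line that is absent from the dict carries no information at all.
LineReport = Dict[int, int]


def merge_line_reports(reports: Iterable[LineReport]) -> Dict[int, int]:
    """Sum hit counts per line; a line is a miss only if no report hit it."""
    merged: Dict[int, int] = {}
    for report in reports:
        for line, hits in report.items():
            merged[line] = merged.get(line, 0) + hits
    return merged


def coverage_percent(merged: Dict[int, int]) -> float:
    """Percentage of tracked lines with at least one hit."""
    if not merged:
        return 100.0
    covered = sum(1 for hits in merged.values() if hits > 0)
    return 100.0 * covered / len(merged)


# Hypothetical data: the early uploads never mention line 78,
# while the last upload explicitly marks it (and its neighbours) as missed.
early_upload = {77: 1, 79: 1}
last_upload = {77: 0, 78: 0, 79: 0}

before = merge_line_reports([early_upload])
after = merge_line_reports([early_upload, last_upload])
print(coverage_percent(before), coverage_percent(after))
```

Under this rule, lines 77 and 79 stay covered after the last upload, and only line 78 (which no other report mentions) becomes a miss.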

Hi @tom, thanks, that’s a good catch!

However, it doesn’t seem to be the full issue. In __init__.py, only 6 lines in the file are being excluded from the coverage in the first 4 uploads. On the other hand, the last report shows 29 missing lines of coverage, and I can see that some of these lines are explicitly covered in the first 4 (example: lines 224-247).
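To make that comparison concrete, here is a minimal sketch that expands the `Missing` column from the two coverage logs quoted earlier and lists the lines that only the qchem-tests report marks as missed. The helper name is hypothetical, and it assumes the column contains only plain line numbers and ranges (no branch arcs such as `12->14`). Note that the two logs report different statement counts (77 vs 81), so line numbers may not correspond exactly between them, and an expanded range can span non-statement lines, so the set size need not equal the `Miss` count.

```python
from typing import Set


def parse_missing(spec: str) -> Set[int]:
    """Expand a coverage 'Missing' column such as '78, 176-178' into line numbers."""
    lines: Set[int] = set()
    for chunk in spec.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue
        if "-" in chunk:
            # Inclusive range, e.g. '224-247'
            start, end = chunk.split("-", 1)
            lines.update(range(int(start), int(end) + 1))
        else:
            lines.add(int(chunk))
    return lines


# Values copied from the two logs quoted earlier in this thread.
core_missing = parse_missing("90")
qchem_missing = parse_missing("78, 81, 176-178, 187, 197, 224-247, 271-302")

# Lines that only the qchem-tests report flags as missed.
only_in_qchem = sorted(qchem_missing - core_missing)
print(only_in_qchem)
```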

Another thing I noticed that might be important: this issue only showed up when we switched from CircleCI to GitHub Actions. Previously, when we used a CircleCI test matrix, the final report correctly showed coverage of the PennyLane folder.

Oh this is really interesting @josh146. It looks like this is an issue with the paths that are reported. If you use the report selector

image

you’ll see that Codecov is unable to find coverage for this file for the first 4 coverage reports. I’ll dig in here to see what a fix might be. Would you be able to provide an output from the CircleCI run?
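To illustrate the kind of path mismatch being discussed: the qchem-tests log above reports the file under an installed `site-packages` path, while the core-tests log reports it as `pennylane/__init__.py`. The sketch below maps both forms onto one repository-relative path. The helper is hypothetical and is not something Codecov or coverage.py provides; it only shows why the two reports may not line up without some form of path fixing.

```python
from pathlib import PurePosixPath


def normalize_report_path(path: str) -> str:
    """Strip everything up to and including the last 'site-packages' component."""
    parts = PurePosixPath(path).parts
    if "site-packages" in parts:
        # Index of the last 'site-packages' component in the path.
        idx = len(parts) - 1 - parts[::-1].index("site-packages")
        return "/".join(parts[idx + 1:])
    # Already relative to the repository root (assumption): leave unchanged.
    return path


# Paths copied from the two coverage logs quoted earlier in this thread.
installed = "/usr/share/miniconda/lib/python3.7/site-packages/pennylane/__init__.py"
in_repo = "pennylane/__init__.py"

print(normalize_report_path(installed) == normalize_report_path(in_repo))
```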

Apologies @tom, it was TravisCI, not Circle!

I’ve attached a link to one of the Travis builds before we ported to GitHub Actions: https://travis-ci.com/github/XanaduAI/pennylane/jobs/348090680

Hi @tom, just wondering if there is an update on your side :slight_smile: The coverage reporting is causing a few issues for our developers, so anything to help alleviate it will be greatly appreciated!

Hi @josh146, sorry for the delay here. It looks like the Travis build is for a different organization XanaduAI. Do you have a build under PennyLaneAI? I want to be able to compare builds between two commits from each CI.

Hi @tom, unfortunately not. The repository was transferred from XanaduAI (where we were using Travis) to PennyLaneAI (where GitHub Actions is being used). After the transfer, we noticed the Codecov issues.

Note that although the org name changed, it is the same (unmodified) repository.

Hi @tom, I’m still trying to fix this without much luck.

Here is what the in-progress files dashboard looks like during the first 5 CI jobs:

And here is what it looks like after the final CI job (the one titled qchem-tests) is complete:

As you can see, the first 5 CI jobs are not including certain files (such as __init__.py) that are definitely in the coverage report for the CI build:

https://codecov.io/api/gh/PennyLaneAI/pennylane/download/build?path=v4/raw/2020-09-14/CD4F3E97F843D9EE025F0DF6F9A7D2C3/15381bb5b4ebd8db24152a4a40e89d95c3503c91/dc891c7a-d30b-4f45-a840-a9258f86afc6.txt

These do appear on Codecov once the qchem-tests build is uploaded. However, this CI job doesn’t specifically test these files, hence the low coverage!
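One diagnostic that could be run locally is to list which file paths each job's `coverage.xml` actually records, and how many lines each has hits for. The sketch below assumes the Cobertura-style layout that coverage.py's XML report uses (`<class filename="...">` elements containing `<line number="..." hits="..."/>`); the sample document is made up for illustration and is not taken from any of the uploads above.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple


def summarize_coverage_xml(xml_text: str) -> Dict[str, Tuple[int, int]]:
    """Map each recorded filename to (lines with hits, lines tracked)."""
    root = ET.fromstring(xml_text)
    summary: Dict[str, Tuple[int, int]] = {}
    for cls in root.iter("class"):
        filename = cls.get("filename", "")
        lines = cls.findall("./lines/line")
        hit = sum(1 for line in lines if int(line.get("hits", "0")) > 0)
        summary[filename] = (hit, len(lines))
    return summary


# Made-up sample in the assumed layout; in practice, read the real file instead,
# e.g. open("./qchem/coverage.xml").read()
SAMPLE_XML = """<coverage>
  <packages><package name="pennylane"><classes>
    <class filename="pennylane/__init__.py" name="__init__.py">
      <lines>
        <line number="77" hits="1"/>
        <line number="78" hits="0"/>
      </lines>
    </class>
  </classes></package></packages>
</coverage>"""

for name, (hit, total) in summarize_coverage_xml(SAMPLE_XML).items():
    print(f"{name}: {hit}/{total} lines hit")
```

Comparing the filenames printed for a core-tests report with those printed for the qchem-tests report should show whether the two jobs refer to the same file under different paths.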

Any ideas on how we can fix this?