Long upload times and timeouts for GHA codecov/codecov-action@v1

Description

We’re experiencing frequent, though irregular, long upload times or timeouts for MacOS code coverage uploads (see https://github.com/uutils/coreutils/runs/769224500?check_suite_focus=true). On the same build, using the same methods, linux and windows upload coverage in seconds.

Repository

CI/CD

Github Actions

Uploader

codecov/codecov-action@v1

Commit SHAs

9e65498

Codecov YAML

comment: false

coverage:
  status:
    project:
      default:
        informational: true
    changes:
      default:
        informational: true
    patch:
      default:
        informational: true

Codecov Output

2020-06-14T03:03:27.3071670Z ##[group]Run codecov/codecov-action@v1
2020-06-14T03:03:27.3071870Z with:
2020-06-14T03:03:27.3072000Z   file: target/debug/lcov.info
2020-06-14T03:03:27.3072120Z   flags: macos_latest
2020-06-14T03:03:27.3072240Z   name: codecov-umbrella
2020-06-14T03:03:27.3072330Z   fail_ci_if_error: false
2020-06-14T03:03:27.3072440Z env:
2020-06-14T03:03:27.3072560Z   PROJECT_NAME: coreutils
2020-06-14T03:03:27.3072680Z   PROJECT_DESC: Core universal (cross-platform) utilities
2020-06-14T03:03:27.3072800Z   PROJECT_AUTH: uutils
2020-06-14T03:03:27.3072920Z   RUST_MIN_SRV: 1.32.0
2020-06-14T03:03:27.3073040Z   RUST_COV_SRV: 2020-04-29
2020-06-14T03:03:27.3073120Z ##[endgroup]
2020-06-14T03:03:27.6053960Z [command]bash codecov.sh -f target/debug/lcov.info -n codecov-umbrella -F macos_latest
2020-06-14T03:03:27.6420850Z 
2020-06-14T03:03:27.6421380Z   _____          _
2020-06-14T03:03:27.6421530Z  / ____|        | |
2020-06-14T03:03:27.6421670Z | |     ___   __| | ___  ___ _____   __
2020-06-14T03:03:27.6421980Z | |    / _ \ / _` |/ _ \/ __/ _ \ \ / /
2020-06-14T03:03:27.6422170Z | |___| (_) | (_| |  __/ (_| (_) \ V /
2020-06-14T03:03:27.6422320Z  \_____\___/ \__,_|\___|\___\___/ \_/
2020-06-14T03:03:27.6423230Z                               Bash-20200602-f809a24
2020-06-14T03:03:27.6423310Z 
2020-06-14T03:03:27.6423410Z 
2020-06-14T03:03:27.6430530Z e[0;90m==>e[0m GitHub Actions detected.
2020-06-14T03:03:27.6590690Z     e[0;90mproject root:e[0m .
2020-06-14T03:03:27.6795140Z     e[0;90mYaml found at:e[0m .codecov.yml
2020-06-14T03:03:27.7335470Z     e[0;90m->e[0m Found 1 reports
2020-06-14T03:03:27.7336960Z e[0;90m==>e[0m Detecting git/mercurial file structure
2020-06-14T03:03:27.7772100Z e[0;90m==>e[0m Reading reports
2020-06-14T03:03:27.7962640Z     e[0;32m+e[0m target/debug/lcov.info e[0;90mbytes=616226e[0m
2020-06-14T03:03:27.8101070Z e[0;90m==>e[0m Appending adjustments
2020-06-14T03:03:27.8102280Z     e[0;36mhttps://docs.codecov.io/docs/fixing-reportse[0m
2020-06-14T03:03:27.9972430Z     e[0;32m+e[0m Found adjustments
2020-06-14T03:03:28.0027790Z e[0;90m==>e[0m Gzipping contents
2020-06-14T03:03:28.0904070Z e[0;90m==>e[0m Uploading reports
2020-06-14T03:03:28.0904730Z     e[0;90murl:e[0m https://codecov.io
2020-06-14T03:03:28.0905440Z     e[0;90mquery:e[0m branch=t&commit=9e65498530720ea64bf0a3c19d3fec76fc53a792&build=134714735&build_url=http%3A%2F%2Fgithub.com%2Fuutils%2Fcoreutils%2Factions%2Fruns%2F134714735&name=codecov-umbrella&tag=&slug=uutils%2Fcoreutils&service=github-actions&flags=macos_latest&pr=&job=
2020-06-14T03:03:28.0966580Z     e[0;90m->e[0m Pinging Codecov
2020-06-14T03:03:28.0967520Z https://codecov.io/upload/v4?package=bash-20200602-f809a24&token=secret&branch=t&commit=9e65498530720ea64bf0a3c19d3fec76fc53a792&build=134714735&build_url=http%3A%2F%2Fgithub.com%2Fuutils%2Fcoreutils%2Factions%2Fruns%2F134714735&name=codecov-umbrella&tag=&slug=uutils%2Fcoreutils&service=github-actions&flags=macos_latest&pr=&job=
2020-06-14T03:03:28.6303510Z     e[0;90m->e[0m Uploading
2020-06-14T03:09:28.7152230Z     e[0;31mX>e[0m Failed to upload
2020-06-14T03:09:28.7153900Z     e[0;90m->e[0m Sleeping for 30s and trying again...
2020-06-14T03:09:58.8482860Z     e[0;90m->e[0m Pinging Codecov
2020-06-14T03:09:58.8483910Z https://codecov.io/upload/v4?package=bash-20200602-f809a24&token=secret&branch=t&commit=9e65498530720ea64bf0a3c19d3fec76fc53a792&build=134714735&build_url=http%3A%2F%2Fgithub.com%2Fuutils%2Fcoreutils%2Factions%2Fruns%2F134714735&name=codecov-umbrella&tag=&slug=uutils%2Fcoreutils&service=github-actions&flags=macos_latest&pr=&job=
2020-06-14T03:09:59.2535600Z     e[0;90m->e[0m Uploading
2020-06-14T03:15:59.2694110Z     e[0;31mX>e[0m Failed to upload
2020-06-14T03:15:59.2694810Z     e[0;90m->e[0m Sleeping for 30s and trying again...
2020-06-14T03:16:29.3908960Z     e[0;90m->e[0m Pinging Codecov
2020-06-14T03:16:29.3911910Z https://codecov.io/upload/v4?package=bash-20200602-f809a24&token=secret&branch=t&commit=9e65498530720ea64bf0a3c19d3fec76fc53a792&build=134714735&build_url=http%3A%2F%2Fgithub.com%2Fuutils%2Fcoreutils%2Factions%2Fruns%2F134714735&name=codecov-umbrella&tag=&slug=uutils%2Fcoreutils&service=github-actions&flags=macos_latest&pr=&job=
2020-06-14T03:16:29.8620610Z     e[0;90m->e[0m Uploading
2020-06-14T03:22:29.9225550Z     e[0;31mX>e[0m Failed to upload
2020-06-14T03:22:29.9226820Z     e[0;90m->e[0m Sleeping for 30s and trying again...
2020-06-14T03:23:00.0453360Z     e[0;90m->e[0m Pinging Codecov
2020-06-14T03:23:00.0454410Z https://codecov.io/upload/v4?package=bash-20200602-f809a24&token=secret&branch=t&commit=9e65498530720ea64bf0a3c19d3fec76fc53a792&build=134714735&build_url=http%3A%2F%2Fgithub.com%2Fuutils%2Fcoreutils%2Factions%2Fruns%2F134714735&name=codecov-umbrella&tag=&slug=uutils%2Fcoreutils&service=github-actions&flags=macos_latest&pr=&job=
2020-06-14T03:23:00.5653810Z     e[0;90m->e[0m Uploading
2020-06-14T03:29:00.6543550Z     e[0;31mX>e[0m Failed to upload
2020-06-14T03:29:00.6544350Z     e[0;90m->e[0m Sleeping for 30s and trying again...
2020-06-14T03:29:30.7744090Z     e[0;90m->e[0m Uploading to Codecov
2020-06-14T03:30:14.1577700Z ##[error]The operation was canceled.
2020-06-14T03:30:14.1595320Z Cleaning up orphan processes

Hi @rivy, thanks for this feedback. I’m not exactly sure why this is happening, but this is mostly likely due to timeouts. I’m making some changes to the bash uploader to help making debugging this easier. Thanks for your patience here.

Hi @rivy, sorry for the delay here. I added some more logging to see why this might be happening. Are you able to pull up another recent occurrence?

Hi @tom Here is a super recent occurrence as of few hours ago:
https://github.com/mmoayyed/cas/runs/821405506?check_suite_focus=true

If you’d like further help with troubleshooting this, please let me know. Happy to help.

Hi @mmoayyed, I just released 1.0.8 of the GitHub action. Would you mind trying it and seeing if you still run into issues?

Certainly. Will do that now. Thank you.

I have made the upgrade and pushed the change to run the action and test again. I will post back once the results are out.

Here is another timeout error after having switched to 1.0.8:
https://github.com/mmoayyed/cas/runs/827167930?check_suite_focus=true

2020-07-01T15:29:38.5846998Z ##[group]Run codecov/codecov-action@v1.0.8
2020-07-01T15:29:38.5847269Z with:
2020-07-01T15:29:38.5847412Z   flags: ehcache
2020-07-01T15:29:38.5847595Z   fail_***_if_error: true
2020-07-01T15:29:38.5847729Z env:
2020-07-01T15:29:38.5847886Z   JAVA_OPTS: -Xms512m -Xmx6048m -Xss128m -XX:ReservedCodeCacheSize=512m -server -XX:+UseG1GC
2020-07-01T15:29:38.5848181Z   GRADLE_OPTS: -Xms512m -Xmx6048m -Xss128m -XX:ReservedCodeCacheSize=512m -server -XX:+UseG1GC
2020-07-01T15:29:38.5848456Z   TERM: xterm-256color
2020-07-01T15:29:38.5848571Z   SONATYPE_USER: 
2020-07-01T15:29:38.5848686Z   SONATYPE_PWD: 
2020-07-01T15:29:38.5848798Z   GH_PAGES_TOKEN: 
2020-07-01T15:29:38.5848926Z   RENOVATE_TOKEN: 
2020-07-01T15:29:38.5849050Z   GRADLE_BUILDCACHE_USER: ***
2020-07-01T15:29:38.5849176Z   GRADLE_BUILDCACHE_PSW: ***
2020-07-01T15:29:38.5849296Z   PYTHON_VERSION: 3.8.2
2020-07-01T15:29:38.5849415Z   JDK_CURRENT: 11.0.7
2020-07-01T15:29:38.5849531Z   JDK_LATEST: 14
2020-07-01T15:29:38.5849723Z   JAVA_HOME: /opt/hostedtoolcache/jdk/11.0.7/x64
2020-07-01T15:29:38.5849857Z   JAVA_HOME_11.0.7_x64: /opt/hostedtoolcache/jdk/11.0.7/x64
2020-07-01T15:29:38.5849981Z ##[endgroup]
2020-07-01T15:31:48.7677876Z /home/runner/work/_actions/codecov/codecov-action/v1.0.8/dist/index.js:2362
2020-07-01T15:31:48.7678146Z       throw error;
2020-07-01T15:31:48.7678302Z       ^
2020-07-01T15:31:48.7678426Z 
2020-07-01T15:31:48.7678582Z Error: connect ETIMEDOUT 35.199.43.247:443
2020-07-01T15:31:48.7678760Z     at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1129:14) {
2020-07-01T15:31:48.7679080Z   errno: 'ETIMEDOUT',
2020-07-01T15:31:48.7679353Z   code: 'ETIMEDOUT',
2020-07-01T15:31:48.7679625Z   syscall: 'connect',
2020-07-01T15:31:48.7679911Z   address: '35.199.43.247',
2020-07-01T15:31:48.7680058Z   port: 443
2020-07-01T15:31:48.7680198Z }

Hi @mmoayyed, I made a mistake and have published 1.0.9 . Let me know if that works for you.

Thanks again. Updated to 1.0.9 and pushed. Will post back when the results are out.

After 1.0.9, almost all jobs are failing with this error:

2020-07-01T17:13:38.9172336Z /home/runner/work/_actions/codecov/codecov-action/v1.0.9/dist/index.js:16575
2020-07-01T17:13:38.9172601Z 	throw e;
2020-07-01T17:13:38.9172758Z 	^
2020-07-01T17:13:38.9172856Z 
2020-07-01T17:13:38.9173208Z Error: Cannot find module 'retry-request'
2020-07-01T17:13:38.9173664Z     at webpackEmptyContext (/home/runner/work/_actions/codecov/codecov-action/v1.0.9/dist/index.js:16573:9)
2020-07-01T17:13:38.9174160Z     at Object.104 (/home/runner/work/_actions/codecov/codecov-action/v1.0.9/dist/index.js:2027:41)
2020-07-01T17:13:38.9174640Z     at __webpack_require__ (/home/runner/work/_actions/codecov/codecov-action/v1.0.9/dist/index.js:22:30)
2020-07-01T17:13:38.9175120Z     at startup (/home/runner/work/_actions/codecov/codecov-action/v1.0.9/dist/index.js:37:19)
2020-07-01T17:13:38.9175563Z     at /home/runner/work/_actions/codecov/codecov-action/v1.0.9/dist/index.js:41:18
2020-07-01T17:13:38.9176030Z     at Object.<anonymous> (/home/runner/work/_actions/codecov/codecov-action/v1.0.9/dist/index.js:44:10)
2020-07-01T17:13:38.9176254Z     at Module._compile (internal/modules/cjs/loader.js:959:30)
2020-07-01T17:13:38.9176466Z     at Object.Module._extensions..js (internal/modules/cjs/loader.js:995:10)
2020-07-01T17:13:38.9176677Z     at Module.load (internal/modules/cjs/loader.js:815:32)
2020-07-01T17:13:38.9176881Z     at Function.Module._load (internal/modules/cjs/loader.js:727:14) {
2020-07-01T17:13:38.9177220Z   code: 'MODULE_NOT_FOUND'
2020-07-01T17:13:38.9177388Z }
2020-07-01T17:13:38.9275188Z Post job cleanup.

See https://github.com/mmoayyed/cas/runs/827445189?check_suite_focus=true

Hi @mmoayyed thanks for your help here. I’ll be publishing 1.0.10 tomorrow which I think will fix this issue.

Sounds good. Look forward to it.

Hi @mmoayyed, we were able to push out 1.0.10. Would you be able to let me know if that fixes your issue?

Thanks. I have made the upgrade and can happily report that so far no timeout issues are popping up. I’ll run the CI a few more times to stress this some more, and otherwise, I’d say this looks good.

Thanks very much for your time and patience on this!

1 Like

I’ve reviewed the last few builds and it looks like the issue is resolved.

Thanks @tom!

1 Like

Glad to hear it worked for you @rivy and @mmoayyed!

I may not be completely out of the woods yet. With the most recent version, I am seeing this:

2020-07-06T10:10:49.5044629Z Gradle build finished successfully.
2020-07-06T10:10:49.5252932Z ##[group]Run codecov/codecov-action@v1.0.10
2020-07-06T10:10:49.5253157Z with:
2020-07-06T10:10:49.5253336Z   flags: dynamodb
2020-07-06T10:10:49.5253556Z   fail_***_if_error: true
2020-07-06T10:10:49.5253745Z env:
2020-07-06T10:10:49.5253955Z   JAVA_OPTS: -Xms512m -Xmx6048m -Xss128m -XX:ReservedCodeCacheSize=512m -server -XX:+UseG1GC
2020-07-06T10:10:49.5254188Z   GRADLE_OPTS: -Xms512m -Xmx6048m -Xss128m -XX:ReservedCodeCacheSize=512m -server -XX:+UseG1GC
2020-07-06T10:10:49.5254404Z   TERM: xterm-256color
2020-07-06T10:10:49.5254573Z   SONATYPE_USER: 
2020-07-06T10:10:49.5254751Z   SONATYPE_PWD: 
2020-07-06T10:10:49.5254912Z   GH_PAGES_TOKEN: 
2020-07-06T10:10:49.5255074Z   RENOVATE_TOKEN: 
2020-07-06T10:10:49.5255268Z   GRADLE_BUILDCACHE_USER: ***
2020-07-06T10:10:49.5255443Z   GRADLE_BUILDCACHE_PSW: ***
2020-07-06T10:10:49.5255608Z   PYTHON_VERSION: 3.8.2
2020-07-06T10:10:49.5255767Z   JDK_CURRENT: 11.0.7
2020-07-06T10:10:49.5255919Z   JDK_LATEST: 14
2020-07-06T10:10:49.5256139Z   JAVA_HOME: /opt/hostedtoolcache/jdk/11.0.7/x64
2020-07-06T10:10:49.5256333Z   JAVA_HOME_11.0.7_x64: /opt/hostedtoolcache/jdk/11.0.7/x64
2020-07-06T10:10:49.5256506Z ##[endgroup]
2020-07-06T10:22:01.5344179Z (node:4977) UnhandledPromiseRejectionWarning: Error: connect ETIMEDOUT 35.199.43.247:443
2020-07-06T10:22:01.5344564Z     at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1129:14)
2020-07-06T10:22:01.5344927Z (node:4977) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
2020-07-06T10:22:01.5346719Z (node:4977) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

When I look at the full log, I do see the relevant tests were executed, yet coverage percentage remains at 0%. So issues with timeouts are solved, but it does not seem like results are in fact successfully uploaded?

If I simply restart the same job, at the next run, I see coverage improved or back to the expected value.

Have I missed something?

@mmoayyed, would you be able to link to the GH Action build?

Certainly. Please see:
https://github.com/mmoayyed/cas/runs/840947054?check_suite_focus=true

We have a Gradle build that runs all tests for a specific category (DynamoDb in this case). In summary:

  1. DynamoDb is started as a Docker container
  2. All tests taggd with DynamoDb are executed using a specific Gradle task
  3. Results in the end are collected with jacoco at the root and published via the action.

As an example, I can see that all tests for cas-server-support-audit-dynamodb were executed using the testDynamoDb task:

2020-07-06T10:01:48.4585095Z > Task :support:cas-server-support-audit-dynamodb:testDynamoDb
2020-07-06T10:01:48.4585410Z 
2020-07-06T10:01:48.4585943Z Results for test suite [Gradle Test Run :support:cas-server-support-audit-dynamodb:testDynamoDb]: SUCCESS @ Mon Jul 06 10:01:48 UTC 2020 
2020-07-06T10:01:48.4586205Z (1 tests, 1 successes, 0 failures, 0 skipped)
2020-07-06T10:01:48.4586336Z 
2020-07-06T10:01:48.4586844Z Finished generating test XML results (0.0 secs) into: /home/runner/work/cas/cas/support/cas-server-support-audit-dynamodb/build/test-results/testDynamoDb
2020-07-06T10:01:48.4587090Z Generating HTML test report...
2020-07-06T10:01:48.4587600Z Finished generating test html results (0.006 secs) into: /home/runner/work/cas/cas/support/cas-server-support-audit-dynamodb/build/reports/tests/testDynamoDb
2020-07-06T10:01:48.5586593Z Stored cache entry for task ':support:cas-server-support-audit-dynamodb:testDynamoDb' with cache key a9056cf362105eb22ac95fa7e1210d6e
2020-07-06T10:01:48.5587730Z :support:cas-server-support-audit-dynamodb:testDynamoDb (Thread[Execution worker for ':',5,main]) completed. Took 2 mins 0.146 secs.

…yet their coverage remains at 0%:

https://codecov.io/gh/mmoayyed/cas/tree/d59b0fa187dde31e869b159d69ddd785606595cd/support/cas-server-support-audit-dynamodb/src/main/java/org/apereo/cas