BugZero | Terraform BugID 37324 - Lock timeout fails when lockfile is deleted from S...

Terraform - Defect ID: 37324

Lock timeout fails when lockfile is deleted from S3 after failing to acquire lock

Last updated on July 21st, 2025

BugZero Risk Score
8.5 High

Overall: 8.5

Status: 10.0

Community: 3.7

Bug Details

Priority: Unspecified
Status: Open

Description

### Terraform Version

```shell
1.12.2
```

### Terraform Configuration Files

```terraform
terraform {
  backend "s3" {
    bucket       = "your-terraform-state-bucket"
    key          = "path/to/your/statefile.tfstate"
    region       = "us-east-1"
    encrypt      = true
    use_lockfile = true
  }
}
```

### Debug Output

I've only reproduced this in a production pipeline when triggering multiple concurrent builds, where it's not that simple to get debug logs. I can try to reproduce this in a separate environment if truly necessary to get debug logs, but I think the issue is understandable just from the info logs.

```
Initializing the backend...
Initializing modules...
Initializing provider plugins...
- Reusing previous version of hashicorp/aws from the dependency lock file
- Installing hashicorp/aws v5.80.0...
- Installed hashicorp/aws v5.80.0 (unauthenticated)

Terraform has been successfully initialized!
Acquiring state lock. This may take a few moments...

Error: Error acquiring the state lock

Error message: operation error S3: PutObject, https response error StatusCode: 409, RequestID: , HostID: , api error ConditionalRequestConflict: The conditional request cannot succeed due to a conflicting operation against this resource.
unable to retrieve file from S3 bucket 'example-terraform-state-bucket' with key 'clusters/staging/cluster-name/us1/security-groups/terraform.tfstate.tflock': operation error S3: GetObject, https response error StatusCode: 404, RequestID: , HostID: , NoSuchKey:

Terraform acquires a state lock to protect the state from being written by multiple users at the same time. Please resolve the issue above and try again. For most commands, you can disable locking with the "-lock=false" flag, but this is not recommended.
```

### Expected Behavior

After failing to acquire the lock, Terraform should retry acquiring the lock until the lock timeout.

### Actual Behavior

When the lockfile cannot be found, the command fails immediately without any retries.

### Steps to Reproduce

Run `terraform plan -lock-timeout=30m -input=false` several times concurrently. Higher concurrency increases the chance of it happening. Likely the plan also needs to complete very quickly.

### Additional Context

From reading the code that implements the state locking for S3, I believe this is what is happening:

- Invocation A successfully acquires the lock
- Invocation B fails to acquire the lock
- Before invocation B does the GetObject to get information about the lock, invocation A releases the lock, [deleting the lockfile](https://github.com/hashicorp/terraform/blob/v1.12.2/internal/backend/remote-state/s3/client.go#L543-L546).
- Invocation B now [fails to get information about the lock](https://github.com/hashicorp/terraform/blob/v1.12.2/internal/backend/remote-state/s3/client.go#L680), which [causes the retrier to bail out without performing any retries](https://github.com/hashicorp/terraform/blob/main/internal/states/statemgr/locker.go#L90-L94).

In our case, this fails pretty consistently in CI when several pull requests are opened concurrently (typically via automation) and several plans are done within the pipeline.

### References

_No response_

### Generative AI / LLM assisted development?

_No response_
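The race described under Additional Context can be modeled with a small, self-contained Go sketch. It is not Terraform's source: `LockInfo`, `LockError`, `acquire`, and `racyLock` are hypothetical names chosen for illustration, and the retry loop is deliberately simplified (no backoff, no timeout). It only assumes what the report states, namely that a lock failure without holder information makes the retrier give up, and it contrasts that with the behavior the reporter expects.

```go
package main

import (
	"errors"
	"fmt"
)

// LockInfo and LockError are simplified, hypothetical stand-ins for the
// values a state locker returns. This is a model, not Terraform's code.
type LockInfo struct{ ID string }

type LockError struct {
	Info *LockInfo // nil when the follow-up GetObject found no lockfile
	Err  error
}

func (e *LockError) Error() string { return e.Err.Error() }

// lockFunc is a single attempt to take the lock.
type lockFunc func() (string, error)

// acquire calls lock up to maxAttempts times. With retryOnMissingInfo set to
// false it models the reported behavior: a LockError without holder info
// aborts at once. With true it models the behavior the reporter expects.
func acquire(lock lockFunc, maxAttempts int, retryOnMissingInfo bool) (id string, attempts int, err error) {
	for attempts = 1; attempts <= maxAttempts; attempts++ {
		id, err = lock()
		if err == nil {
			return id, attempts, nil
		}
		var le *LockError
		if !errors.As(err, &le) {
			return "", attempts, err // not a lock error: never retried
		}
		if (le.Info == nil || le.Info.ID == "") && !retryOnMissingInfo {
			return "", attempts, err // incomplete lock error: bail out
		}
		// A real loop would sleep with backoff and honor the lock timeout here.
	}
	return "", maxAttempts, err
}

// racyLock simulates invocation B: the first write conflicts, the lockfile is
// already gone when B tries to read it (Info is nil), and later attempts
// succeed because invocation A has released the lock.
func racyLock() lockFunc {
	calls := 0
	return func() (string, error) {
		calls++
		if calls == 1 {
			return "", &LockError{Err: errors.New("409 conflict; lockfile not found (404)")}
		}
		return "lock-b", nil
	}
}

func main() {
	_, n, err := acquire(racyLock(), 5, false)
	fmt.Println("reported behavior:", n, "attempt(s), err =", err)

	id, n, err := acquire(racyLock(), 5, true)
	fmt.Println("expected behavior:", n, "attempt(s), id =", id, "err =", err)
}
```

In this model the only difference between the two runs is whether a lock error with no holder information counts as retryable. Whether that is the right fix for Terraform itself is a design question for the maintainers; the sketch only shows why a single lost race is enough to defeat `-lock-timeout`.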

Relevant Products

Click on a version to see all relevant bugs

Affected versions: 1.12.2

Fixed versions: No known fixed versions
