feat: auto-downgrading revision #4678

Open: wants to merge 123 commits into main

Conversation

korolenkowork (Contributor) commented May 1, 2025

Closes #4665

📑 Description

This PR introduces automatic database downgrading support when rolling back KeepHQ to a previous version.

✅ Checks

  • Main logic completed
  • I have updated the documentation as required
  • New logic covered by test
  • E2E is written
  • All the tests have passed

ℹ️ Additional Information

  • From this commit onward, all migration files will be copied to the SECRET_MANAGER_DIRECTORY to enable safe database downgrades during rollback scenarios.
  • A new environment variable ALLOW_DB_DOWNGRADE has been added to explicitly allow or prevent database downgrades, reducing the risk of accidental downgrades.
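
As a rough illustration of the opt-in gate described above, the check could look like the sketch below. The function name `db_downgrade_allowed` and the whitespace/case normalization are assumptions made for this example; the code quoted later in this conversation compares the variable against the exact string "true".

```python
import os


def db_downgrade_allowed(environ=None):
    """Return True only when ALLOW_DB_DOWNGRADE is explicitly set to "true".

    `environ` is a hypothetical parameter that makes the check testable;
    it defaults to the real process environment.
    """
    env = os.environ if environ is None else environ
    # Assumption: surrounding whitespace and letter case are tolerated here.
    value = env.get("ALLOW_DB_DOWNGRADE", "false")
    return value.strip().lower() == "true"


print(db_downgrade_allowed({}))                               # False
print(db_downgrade_allowed({"ALLOW_DB_DOWNGRADE": "true"}))   # True
```

Defaulting to "false" means a rollback never rewrites the schema unless an operator has asked for it.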

vercel bot commented May 1, 2025

@EnotShow is attempting to deploy a commit to the KeepHQ Team on Vercel.

A member of the Team first needs to authorize it.

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. Feature A new feature labels May 1, 2025
@korolenkowork korolenkowork marked this pull request as draft May 1, 2025 21:23
shahargl (Member) commented May 2, 2025

hey @EnotShow - thanks for this PR!

as this PR introduces some major changes to how Keep works, I thought I would add some e2e tests for it (mainly install Keep, do some migration, then downgrade). wdyt?

shahargl (Member) left a comment

see comments

codecov bot commented May 3, 2025

Codecov Report

Attention: Patch coverage is 8.45070% with 65 lines in your changes missing coverage. Please review.

Project coverage is 46.03%. Comparing base (1ae49bf) to head (f1328ef).
Report is 3 commits behind head on main.

Files with missing lines        Patch %    Lines
keep/api/core/db_on_start.py    8.45%      65 Missing ⚠️

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4678      +/-   ##
==========================================
- Coverage   46.19%   46.03%   -0.16%     
==========================================
  Files         165      165              
  Lines       17495    17564      +69     
==========================================
+ Hits         8081     8086       +5     
- Misses       9414     9478      +64     

@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:XS This PR changes 0-9 lines, ignoring generated files. labels May 19, 2025
korolenkowork (Contributor, Author) commented

Hey @shahargl! Thx for your review! I made all the required changes, so it's ready for re-review :)

@korolenkowork korolenkowork requested a review from shahargl May 19, 2025 13:37
shahargl (Member) commented

@EnotShow please get the tests passing, and once they pass I'll review.

@dosubot dosubot bot added size:XS This PR changes 0-9 lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels May 26, 2025
@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:XS This PR changes 0-9 lines, ignoring generated files. labels May 29, 2025
korolenkowork (Contributor, Author) commented

Hey @shahargl, it's resolved!

I also added a separate Docker Compose file for E2E migration testing, which attaches a dedicated volume, because accessing the same volume from different workflows seemed to be causing concurrent map writes.

cursor bot commented Jun 14, 2025

🚨 BugBot couldn't run

Pull requests from forked repositories are not yet supported (requestId: serverGenReqId_48ddba26-49ed-4dcf-a98c-b6051afed68d).

cursor[bot]

This comment was marked as outdated.

tuantran0910 (Contributor) commented

Hi @shahargl, @talboren, can we get this PR merged? I think it would be a good feature, and we are planning to deploy Keep in our production environment.

cursor[bot]

This comment was marked as outdated.

korolenkowork (Contributor, Author) commented

@shahargl @talboren any updates on this?

cursor bot left a comment

Bug: Migration Function Fails with Incomplete Cleanup

The copy_migrations function has insufficient error handling and an incomplete cleanup mechanism. If os.makedirs fails, the function continues execution, causing subsequent file operations to fail. Furthermore, the cleanup logic only removes files and symlinks, leaving subdirectories. This incomplete cleanup, coupled with clearing the destination before verifying the source, risks data loss of previous backups if the subsequent copy operation fails or conflicts arise.

keep/api/core/db_on_start.py#L183-L204

# Ensure destination exists
try:
    os.makedirs(local_migrations_path, exist_ok=True)
except Exception as e:
    logger.error(f"Failed to create local migrations folder with error: {e}")
# Clear previous versioned migrations to ensure only migrations relevant to the current version are present
for filename in os.listdir(local_migrations_path):
    file_path = os.path.join(local_migrations_path, filename)
    if os.path.isfile(file_path) or os.path.islink(file_path):
        os.remove(file_path)
# Alembic needs the full migration history to safely perform a downgrade to earlier versions
# Copy new migrations
for item in os.listdir(source_versions_path):
    src = os.path.join(source_versions_path, item)
    dst = os.path.join(local_migrations_path, item)
    if os.path.isdir(src):
        shutil.copytree(src, dst, dirs_exist_ok=True)
    else:
        shutil.copy(src, dst)
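
One possible way to address the points raised above is to verify the source first, stage the copy in a temporary directory, and only then replace the old backup. The sketch below is not the PR's code; the function name `copy_migrations_safely` and the staging layout are assumptions for illustration.

```python
import os
import shutil
import tempfile


def copy_migrations_safely(source_versions_path, local_migrations_path):
    """Replace the local backup only after a full copy has succeeded."""
    # Verify the source before touching the existing backup.
    if not os.path.isdir(source_versions_path):
        raise FileNotFoundError(
            f"Source migrations not found: {source_versions_path}"
        )

    parent = os.path.dirname(os.path.abspath(local_migrations_path))
    # Let a failure here propagate instead of logging and continuing.
    os.makedirs(parent, exist_ok=True)

    # Stage next to the destination so the final move stays on one filesystem.
    staging = tempfile.mkdtemp(prefix=".migrations-staging-", dir=parent)
    try:
        staged = os.path.join(staging, "versions")
        shutil.copytree(source_versions_path, staged)
        # Remove the old backup, subdirectories included, then swap in the copy.
        # Note: there is still a short window between these two calls.
        if os.path.isdir(local_migrations_path):
            shutil.rmtree(local_migrations_path)
        os.replace(staged, local_migrations_path)
    finally:
        shutil.rmtree(staging, ignore_errors=True)
```

If the copy fails partway, the previous backup is left untouched, because the destination is only cleared after `shutil.copytree` has returned.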


Bug: Environment Variable Name Discrepancy

Environment variable name mismatch: The docker-compose files set MIGRATION_PATH (singular), but the Python code in keep/api/core/db_on_start.py expects MIGRATIONS_PATH (plural). This causes the application to ignore the intended /tmp/migrations path and fall back to the default /tmp/keep/migrations for database migrations.

tests/e2e_tests/docker-compose-e2e-postgres.yml#L41-L42

- SECRET_MANAGER_DIRECTORY=/app
- MIGRATION_PATH=/tmp/migrations

tests/e2e_tests/docker-compose-e2e-postgres-migrations.yml#L40-L41

- SECRET_MANAGER_DIRECTORY=/app
- MIGRATION_PATH=/tmp/migrations
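
The effect of the mismatch can be reproduced with a small lookup sketch. The helper `resolve_migrations_path` and its `environ` parameter are hypothetical; only the variable names and the two paths come from the report above.

```python
import os

DEFAULT_MIGRATIONS_PATH = "/tmp/keep/migrations"


def resolve_migrations_path(environ=None):
    """Look up MIGRATIONS_PATH (plural), falling back to the default."""
    env = os.environ if environ is None else environ
    return env.get("MIGRATIONS_PATH", DEFAULT_MIGRATIONS_PATH)


# Environment as set by the docker-compose files (singular name).
compose_env = {
    "SECRET_MANAGER_DIRECTORY": "/app",
    "MIGRATION_PATH": "/tmp/migrations",
}
print(resolve_migrations_path(compose_env))  # /tmp/keep/migrations

# With the plural name, the intended path is picked up.
fixed_env = {"MIGRATIONS_PATH": "/tmp/migrations"}
print(resolve_migrations_path(fixed_env))    # /tmp/migrations
```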


Bug: Database Migration Error Handling Flaws

The migrate_db function incorrectly assumes any exception during alembic.command.upgrade() indicates the database needs a downgrade, potentially triggering inappropriate downgrade attempts for unrelated issues (e.g., connectivity, permissions, syntax errors). Additionally, copy_migrations is unconditionally called at the end of migrate_db, which can corrupt the local migration backup if a downgrade operation failed, preventing future successful downgrades.

keep/api/core/db_on_start.py#L275-L289

logger.info("Running migrations...")
try:
    alembic.command.upgrade(config, "head")
except Exception as e:
    logger.error(f"{e} it's seems like Keep was rolled back to a previous version")
    if not os.getenv("ALLOW_DB_DOWNGRADE", "false") == "true":
        logger.error(f"ALLOW_DB_DOWNGRADE is not set to true, but the database schema ({current_revision}) doesn't match application version ({expected_revision})")
        raise RuntimeError("Database downgrade is not allowed")
    logger.info("Downgrading database schema...")
    downgrade_db(config, expected_revision, local_migrations_path, app_migrations_path)
# Copy migrations to local folder for safe downgrade purposes
copy_migrations(app_migrations_path, local_migrations_path)
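
A narrower control flow might treat only a "revision not found" failure as a rollback signal and refresh the backup only after the schema change has succeeded. The sketch below is library-free: `RevisionNotFound` is a hypothetical stand-in for whatever error the migration tool raises in that case (with Alembic this is presumably `alembic.util.CommandError`, which is an assumption here), and the callables are injected so the flow can be exercised without a database.

```python
class RevisionNotFound(Exception):
    """Hypothetical stand-in for 'the database revision is unknown to this
    application version'."""


def migrate(upgrade, downgrade, backup_migrations, allow_downgrade):
    """Upgrade if possible; downgrade only on a revision mismatch."""
    try:
        upgrade()
        outcome = "upgraded"
    except RevisionNotFound:
        # Only this error is read as "Keep was rolled back".
        if not allow_downgrade:
            raise RuntimeError("Database downgrade is not allowed")
        downgrade()  # if this raises, the backup below is left untouched
        outcome = "downgraded"
    # Reached only when the upgrade or the downgrade completed.
    backup_migrations()
    return outcome
```

Connectivity, permission, or syntax errors raised by `upgrade` propagate unchanged, so they can no longer trigger a downgrade attempt.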


Bug: Missing Step ID Causes Empty Revision Output

The workflow references steps.get_revision.outputs.revision to set the WORKFLOW_REVISION environment variable. However, no step in the workflow has the ID get_revision. This prevents the revision output from being captured, resulting in an empty WORKFLOW_REVISION and breaking the intended revision comparison logic.

.github/workflows/run-migrations-e2e-tests.yml#L335-L336

env:
  WORKFLOW_REVISION: ${{ steps.get_revision.outputs.revision }}

.github/workflows/run-migrations-e2e-tests.yml#L479-L480

env:
  WORKFLOW_REVISION: ${{ steps.get_revision.outputs.revision }}
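
This kind of dangling reference can be caught with a simple text check before the workflow runs. The sketch below is a heuristic, not a workflow parser: it scans the whole file with regular expressions and ignores the fact that step IDs are scoped per job. The function name, the step names, and the sample workflow fragments are made up for illustration.

```python
import re

# Matches expressions such as steps.get_revision.outputs.revision
STEP_REF = re.compile(r"steps\.([A-Za-z0-9_-]+)\.outputs\.")
# Matches "id: some_step" lines, with or without a leading list dash
STEP_ID = re.compile(r"^\s*(?:-\s*)?id:\s*([A-Za-z0-9_-]+)\s*$", re.MULTILINE)


def undefined_step_refs(workflow_text):
    """Return step IDs that are referenced but never declared."""
    defined = set(STEP_ID.findall(workflow_text))
    referenced = set(STEP_REF.findall(workflow_text))
    return sorted(referenced - defined)


broken = """
steps:
  - name: Compare revisions
    env:
      WORKFLOW_REVISION: ${{ steps.get_revision.outputs.revision }}
"""

fixed = """
steps:
  - name: Read revision
    id: get_revision
    run: echo "revision=abc" >> "$GITHUB_OUTPUT"
  - name: Compare revisions
    env:
      WORKFLOW_REVISION: ${{ steps.get_revision.outputs.revision }}
"""

print(undefined_step_refs(broken))  # ['get_revision']
print(undefined_step_refs(fixed))   # []
```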

shahargl (Member) commented Jul 3, 2025

Hey @korolenkowork, honestly it looks good, but it will take a big effort for me to review it and confirm it's ready. This one touches the main flow, so I'm a bit worried.

Labels: Documentation, Feature, size:XL

Successfully merging this pull request may close these issues:

[➕ Feature]: Implement "auto downgrades"

3 participants