lakeFS

lakefs.io

lakeFS is a data version control platform that applies Git-like operations to data lakes, enabling isolated development, parallel experimentation, and reproducibility for machine learning and AI projects. It helps teams manage data quality, streamline MLOps workflows, and ensure data consistency across various environments.

Features
8/31

Must Have

2 of 9

Cloud Storage Integration

Privacy Controls

AI File Chat

Automated Sorting Rules

Semantic Search

Automated Folder Organization

Conversational AI Interface

File Editing & Renaming

User Feedback Learning

Other

6 of 22

Manual Approval Workflow

Local File Access

Multi-User Collaboration

Enterprise SSO & Compliance

Centralized Team Billing

Data Encryption & Security

Feedback-Driven Refinement

Demo Mode

Usage Credits & Quotas

Advanced AI Model

Cloud Storage Integrations

Local File System Access

File Cleaning & Deduplication

Content-based Q&A

Security & Privacy Controls

Version History

Multi-tier Pricing Plans

User Roles & Permissions

Cross-platform Support

Bulk Operations & Batch Processing

Customizable Sorting Rules

Notifications & Reminders

Pricing
Tiered

Open Source

$0.00 one time
  • Format-Agnostic Data Version Control
  • Cloud-Agnostic
  • Zero-copy cloning for isolated environments (via branches)
  • Atomic Data Promotion (via merges)
  • Data Stays in One Place
  • Configurable Garbage Collection
  • Data CI/CD Using lakeFS Hooks
  • Integrates with Your Data Stack
  • Run locally

Enterprise

Custom
  • All Open Source features
  • Role-Based Access Control (RBAC)
  • Single Sign-On (SSO)
  • SCIM Support
  • IAM Roles
  • Mount Capability
  • Audit Logs
  • Transactional Mirroring
  • Simplified Garbage Collection (Managed or Standalone)
  • SOC 2
  • Support SLA

Rationale

lakeFS is a data version control system for data lakes that offers Git-like versioning, branching, and merging. These operations support reproducibility and collaboration in ML/AI workflows, but lakeFS does not offer AI-powered file organization features such as AI file chat, semantic search, or automated sorting rules for general file management. It targets versioning of large datasets for ML and data engineering rather than intelligent file organization for individuals or small teams.
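
To make the distinction concrete, here is a minimal sketch of what Git-like branching and merging over data can look like. It is a toy in-memory model, not the lakeFS API: the `Commit` and `Repo` classes, their methods, and the object addresses are all hypothetical, and the merge is simplified to a fast-forward. The idea it illustrates is that a branch is only a pointer to a commit, so creating one copies no data, and a merge promotes changes by moving a single pointer.

```python
# Toy model of Git-like data versioning. NOT the lakeFS API;
# every class, method, and address below is hypothetical.
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Commit:
    snapshot: dict  # logical path -> address of an immutable stored object
    message: str
    parent: "Commit | None" = None


@dataclass
class Repo:
    branches: dict = field(default_factory=dict)  # branch name -> Commit

    def create_branch(self, name, source):
        # Zero-copy: the new branch points at the same commit as the source.
        self.branches[name] = self.branches[source]

    def commit(self, branch, changes, message):
        # Record a new path -> address mapping; an address of None deletes the path.
        head = self.branches[branch]
        snapshot = dict(head.snapshot)
        for path, address in changes.items():
            if address is None:
                snapshot.pop(path, None)
            else:
                snapshot[path] = address
        self.branches[branch] = Commit(snapshot, message, parent=head)

    def merge(self, source, dest):
        # Simplified to fast-forward only: dest must be an ancestor of source.
        target = self.branches[dest]
        node = self.branches[source]
        while node is not None and node is not target:
            node = node.parent
        if node is None:
            raise ValueError("branches diverged; a real system would need a three-way merge")
        # Promotion is a single pointer update, so readers see old or new, never a mix.
        self.branches[dest] = self.branches[source]


repo = Repo({"main": Commit({"data/events.parquet": "obj-001"}, "initial import")})
repo.create_branch("experiment", "main")
repo.commit("experiment", {"data/events.parquet": "obj-002"}, "clean events")
main_before = dict(repo.branches["main"].snapshot)  # main is untouched by the experiment
repo.merge("experiment", "main")
main_after = repo.branches["main"].snapshot
```

Nothing in this workflow inspects file contents or organizes files by meaning, which is why versioning features of this kind do not map onto the AI file organization criteria listed above.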