Filevine is forging the future of legal work with cloud-based workflow tools. We have a reputation for intuitive, streamlined technology that helps professionals manage their organization and serve their clients better. We’re also known for our team of extraordinary and passionate professionals who love working together to help organizations thrive. Our success has catapulted Filevine to the forefront of our field—we are ranked as one of the most innovative and fastest-growing technology companies in the country by both Deloitte and Inc.
Our Mission
Filevine is building the seamless intersection between legal and business by creating a world-class platform to help professionals scale.
We’re a team of driven, enthusiastic problem solvers with strong backgrounds in data engineering, machine learning, product management, legal, and operations, all working to help attorneys resolve cases faster and achieve better outcomes. With one of the largest proprietary datasets in the legal industry, spanning documents, notes, communications, billing, deadlines, and calendar information, we’re now focused on building robust data infrastructure to support our growing AI, analytics, and product intelligence initiatives.
Key Responsibilities
- Develop and manage data ingestion from diverse structured and unstructured sources (documents, communications, billing, operational data, etc.)
- Ensure high data quality, integrity, and consistency across Snowflake and other data systems
- Collaborate with ML engineers and data scientists to enable efficient model training, evaluation, and monitoring workflows
- Design and optimize Snowflake data models, schemas, and transformation pipelines using modern ELT practices
- Implement robust data monitoring, logging, and alerting to ensure reliability and visibility
- Support cost optimization, security, and governance initiatives within Filevine’s cloud data infrastructure
- Contribute to internal data tooling that improves developer experience, observability, and overall data reliability
- Design, build, and maintain scalable data pipelines and ETL processes to support machine learning, analytics, and business intelligence use cases
Requirements
- 3+ years of experience in data engineering or related roles
- Strong proficiency in Python and SQL
- Experience with Snowflake or similar data warehouse platforms, including expertise in query performance tuning, data modeling, and data transformation workflows
- Familiarity with modern data streaming and event-processing services such as AWS Kinesis, Firehose, and EventBridge
- Experience with AWS services for data storage and computation
- Solid understanding of ETL/ELT best practices, data quality management, and CI/CD integration for data pipelines
- Experience working with vector databases is a plus
- Excellent communication skills and ability to collaborate across ML, engineering, and product teams
- Experience with large or sensitive datasets (legal, financial, or medical) is a plus
- Experience with CI/CD deployment processes; experience with Terraform is a plus