Managing and Monitoring our Data Pipelines
Data Operations (DataOps) at Crux
Beyond the core Crux Deliver feature of building data pipelines, a critical element of our value proposition is ensuring that those pipelines are well maintained. We sat down with Tim Marrin, Director of Data Operations, to learn how his team keeps over 1,000 dataset pipelines running smoothly.
First off, what is the scale of what Crux is doing?
Did you know that the Crux team currently processes tens of thousands of data instances (or ingestion pipelines) every 24 hours? Each data instance can comprise multiple discrete tasks, and some have over a hundred. That's a significant amount of activity running through our pipelines.
What is the role of DataOps in this?
One of the DataOps team's jobs is to monitor these tens of thousands of ingestion pipelines each day. Pipelines are composed of tasks that are orchestrated and run on cloud-based microservices, which the team also monitors to ensure consistent, reliable, and fast delivery of data.
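Conceptually, a pipeline of discrete, individually monitored tasks can be sketched like this. This is a hypothetical Python illustration, not Crux's actual orchestration code; the `Task` and `Pipeline` classes and the task names are invented for the example:

```python
# Hypothetical sketch: a pipeline as an ordered list of discrete tasks,
# where any task failure is surfaced with enough context to alert on.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Task:
    name: str
    run: Callable[[Dict], Dict]

@dataclass
class Pipeline:
    name: str
    tasks: List[Task] = field(default_factory=list)

    def execute(self, payload: Dict) -> Dict:
        """Run each task in order; report which task failed, if any."""
        for task in self.tasks:
            try:
                payload = task.run(payload)
            except Exception as exc:
                # In production, this is where an operations alert would fire.
                raise RuntimeError(
                    f"{self.name}: task '{task.name}' failed") from exc
        return payload

pipeline = Pipeline("example-feed", [
    Task("download", lambda p: {**p, "raw": "file-bytes"}),
    Task("validate", lambda p: {**p, "valid": True}),
    Task("publish",  lambda p: {**p, "published": True}),
])
result = pipeline.execute({"date": "2021-01-04"})
```

A real orchestrator would add retries, scheduling, and fan-out across microservices, but the monitoring hook sits in the same place: around each task boundary.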
What is the full range of services that Crux DataOps provides?
The easiest way is to bullet this out. We have:
- 24/7 global support that monitors all data feeds
- Cloud-based big-data stores and microservices for data processing and storage
- Dedicated telephone, email, ticket-based support based on client preference
- Incident and problem management with regular status updates to clients
- Identification and proactive planning for format and schema changes with suppliers and data consumers
- Fully managed and automated CI/CD infrastructure with canary deployments
- Automated notifications through various delivery methods to show data availability and metadata
- Full support and monitoring of supplier data feeds; we handle supplier support on behalf of data consumers
- Full transparency with reporting and analytics on production incidents and outages
What are example errors that the DataOps team monitors for?
This question comes up often with our clients and prospects; these are the most common:
- Data validation issues
- Invalid schema
- Invalid datatype
- Missing values
- File delivery timing issues
- Market and calendar holiday-related failures
- Remote source not available (supplier FTP unavailable, supplier late with files, incomplete files, etc.)
- Schema issues due to unscheduled changes
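As a rough illustration of the first few checks above (invalid schema, invalid datatype, missing values), a record-level validator might look like this. This is a hypothetical Python sketch, not Crux's internal tooling; `EXPECTED_SCHEMA` and the column names are invented for the example:

```python
# Hypothetical sketch of record-level data validation: schema, datatype,
# and missing-value checks of the kind a DataOps team monitors for.
from typing import Any, Dict, List

# Invented example schema: column name -> expected Python type.
EXPECTED_SCHEMA = {"ticker": str, "close_price": float, "trade_date": str}

def validate_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable validation errors for one record."""
    errors = []
    # Invalid schema: missing or unexpected columns.
    missing = set(EXPECTED_SCHEMA) - set(record)
    extra = set(record) - set(EXPECTED_SCHEMA)
    if missing:
        errors.append(f"missing columns: {sorted(missing)}")
    if extra:
        errors.append(f"unexpected columns: {sorted(extra)}")
    for column, expected_type in EXPECTED_SCHEMA.items():
        if column not in record:
            continue  # already reported as a missing column
        value = record[column]
        if value is None:
            # Missing value in a present column.
            errors.append(f"missing value in '{column}'")
        elif not isinstance(value, expected_type):
            # Invalid datatype.
            errors.append(f"'{column}' has type {type(value).__name__}, "
                          f"expected {expected_type.__name__}")
    return errors

bad = {"ticker": "ABC", "close_price": "12.5", "extra_col": 1}
errs = validate_record(bad)
```

In practice these checks run over whole files rather than single records, and failures feed the same alerting path as delivery-timing and remote-source errors.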
What are some of the challenges that our DataOps team is working on?
The big challenge is supporting our clients, platform, and data simultaneously while we scale the number of data instances we monitor. As we grow, complexity increases with the different instances, consumption methods, and SLAs we work with. These challenges can literally keep me up at night, but they also keep us at the top of our game.
How does this compare to what you were doing previously?
I spent 15 years working on trading floors, most recently with the Electronic Trading SRE team at Goldman Sachs. We faced similar issues of scale, managing complex systems, real-time processing, and high throughput. I honestly find myself drawing upon that experience on a daily basis as we try to solve for even greater challenges here at Crux.
With this complexity, why should a firm work with us to outsource their data pipeline operations?
We’re combining performant big-data and cloud solutions with best practices in ITSM (IT Service Management) to provide a uniquely high level of service and technical solutions. The challenge of scaling to manage the volume of datasets is what’s driving industry interest in Crux, and the DataOps team is meeting that challenge through innovative engineering. I’m really proud of our ability to solve these technical challenges while providing a white-glove client experience in a startup.
To learn more about Crux DataOps and get started, fill out this form: