We’re delighted to welcome SafeGraph’s geolocation data products to the Crux platform! SafeGraph is a geolocation data provider that maintains ground truth datasets for human movement, physical places, and visits to places of interest. Connect to SafeGraph or learn more here.
In the world of Data Engineering, the acronym “ETL” is standard shorthand for the process of ingesting data from some source, structuring it, and loading it into a database where you can then use it. ETL stands for “Extract” (ingest, such as picking up files from an FTP site or hitting a data vendor’s API), “Transform” (change the structure of the data into a form that is useful for you), and “Load” (store the transformed data in a database for your own use).
At Crux, where we serve the needs of many clients, our approach, which we call “EVLS TTT”, is slightly different. It is designed both to gain the economies of scale available to an industry utility serving multiple clients across multiple sources of data, and to give individual clients the customized experience they’d expect from a managed service built around their specific use cases.
“EVLS TTT” stands for Extract, Validate, Load, Standardize, Transform, Transform, Transform… “Extract” here is the same as above (ingest data from its source). “Validate” means to check the data as it comes in to make sure that it is accurate and complete. If a data update fails any of the Validation tests we run on it, our Data Operators are notified and corrective action is taken immediately. Examples of validations include ensuring that an incoming data update matches the expected schema, ensuring the values in a column conform to expected min/max ranges (for example, should numbers in this column always be positive?), or making sure that enumerated types (such as Country) fall within the accepted set of values. Validations also test data coverage: does the date range in the update conform to what is expected? If the dataset is supposed to cover companies in the S&P 500, does it in fact cover all those companies, no more, no less? Validations also look for unlikely spikes and jumps in continuous data, identifying outliers for closer examination by our Data Operators.
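To make the kinds of validation described above concrete, here is a minimal sketch in Python. It is an illustration only, not Crux’s actual implementation: the column names, the allowed country codes, the expected ticker list, and the spike threshold are all hypothetical assumptions.

```python
from statistics import median

# Hypothetical expectations for one dataset; all names and values are illustrative.
EXPECTED_COLUMNS = {"date", "ticker", "country", "price"}
ALLOWED_COUNTRIES = {"US", "GB", "JP"}
EXPECTED_TICKERS = {"AAA", "BBB"}  # stand-in for an index constituent list


def validate_update(rows, spike_factor=5.0):
    """Return a list of human-readable validation failures (empty if clean)."""
    failures = []

    # Schema check: every row must carry exactly the expected columns.
    for i, row in enumerate(rows):
        if set(row) != EXPECTED_COLUMNS:
            failures.append(f"row {i}: unexpected schema {sorted(row)}")
    if failures:
        return failures  # the remaining checks assume a correct schema

    for i, row in enumerate(rows):
        # Range check: prices should always be positive.
        if row["price"] <= 0:
            failures.append(f"row {i}: non-positive price {row['price']}")
        # Enumerated-type check: country must be in the accepted set.
        if row["country"] not in ALLOWED_COUNTRIES:
            failures.append(f"row {i}: unknown country {row['country']}")

    # Coverage check: exactly the expected entities, no more, no less.
    seen = {row["ticker"] for row in rows}
    missing = EXPECTED_TICKERS - seen
    extra = seen - EXPECTED_TICKERS
    if missing:
        failures.append(f"missing tickers: {sorted(missing)}")
    if extra:
        failures.append(f"unexpected tickers: {sorted(extra)}")

    # Spike check: flag prices far above the per-ticker median (upward spikes only).
    by_ticker = {}
    for row in rows:
        by_ticker.setdefault(row["ticker"], []).append(row["price"])
    for ticker, prices in sorted(by_ticker.items()):
        mid = median(prices)
        for price in prices:
            if mid > 0 and price > spike_factor * mid:
                failures.append(f"{ticker}: possible spike {price} vs median {mid}")

    return failures
```

In a pipeline like the one described, a non-empty result would be the signal to notify an operator rather than load the update as-is.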
“Load” is as above (store the data into a database). “Standardize” is a set of special Transformations that get the data into an industry-standard form that makes the data easy to understand, easy to join across multiple datasets from multiple vendors, easy to compare, etc. Examples of standardization include data shape (such as unstacking data and storing it all as point-in-time), entity mappings (such as security identifiers), and data formats (such as using ISO datetimes). We do of course store the raw data exactly as it comes from the data supplier, which customers can access as easily as the cleaned/standardized version.
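The three standardization examples above (data shape, entity mappings, and date formats) can be sketched in a few lines of Python. This is a hedged illustration under stated assumptions: the wide input layout, the `ID_MAP` identifier table, the vendor date format, and the output field names are all hypothetical, and "unstacking" is interpreted here as reshaping a wide table into one record per entity and date.

```python
from datetime import datetime

# Hypothetical mapping from vendor identifiers to a standard identifier.
ID_MAP = {"AAA US": "ID-0001", "BBB LN": "ID-0002"}


def standardize(wide_rows, date_format="%m/%d/%Y"):
    """Turn wide vendor rows into long records, one per entity and date."""
    records = []
    for row in wide_rows:
        # Data format: convert the vendor's date string to an ISO 8601 date.
        iso_date = datetime.strptime(row["date"], date_format).date().isoformat()
        for vendor_id, value in row.items():
            if vendor_id == "date":
                continue
            records.append({
                "as_of": iso_date,               # date the value applies to
                "entity_id": ID_MAP[vendor_id],  # entity mapping
                "value": value,
            })
    return records
```

The raw input is left untouched, which mirrors the point above that the original vendor data stays available alongside the standardized version.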
We leave a gap between the “S” and the “TTT…” because at that point the data has been loaded, checked, and put into a form that should suit the needs of most customers as is. We call this spot “the Plane of Standards”. The “TTT…” are any number of specific Transformations that are required by individual clients to suit their own use cases (mapping to an internal security master, joining several sets of data together, enriching data with in-house metadata, etc). Those Ts give clients the ability to have a customized experience when using Crux.
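One way to picture the “TTT…” stage is as an ordered list of client-specific functions applied to standardized records. The sketch below assumes that model; the function names, the security master, and the metadata table are hypothetical and only illustrate the idea of chaining any number of Transformations.

```python
from functools import reduce


def map_to_internal_ids(security_master):
    """Build a transform that attaches a client's internal identifier."""
    def transform(records):
        return [{**r, "internal_id": security_master.get(r["entity_id"])} for r in records]
    return transform


def enrich(metadata):
    """Build a transform that merges in-house metadata into each record."""
    def transform(records):
        return [{**r, **metadata.get(r["entity_id"], {})} for r in records]
    return transform


def apply_transforms(records, transforms):
    """Apply each transform in order, feeding the output of one into the next."""
    return reduce(lambda data, t: t(data), transforms, records)
```

Because each transform returns new records rather than modifying its input, the standardized data stays unchanged and different clients can run different chains over the same starting point.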
The goals of all this are to gain economies of scale and mutualize the effort and cost of the commoditized parts of the process, making it cheaper and faster for everyone, while still giving clients the ability to get a bespoke output to serve their individual use cases. Crux does not license, sell, or re-sell any data itself: Data Suppliers license their products directly to customers, maintain direct relationships with those clients, and control who has access to their data on the Crux platform. We work in partnership with Data Suppliers to help make their data delightful for our mutual clients.
Hello from the CEO
The Crux team has ramped up our rhythm, and our efforts are reverberating in the marketplace. As you may have heard, Citi invested $5 million in Crux earlier this year. This support from Citi affirms the resonance of our services for leading companies. Helping people work in harmony with data creates meaningful movement forward.
My musical metaphors here are intentional. A little-known fact about me: In addition to several decades working in finance and tech, I’ve learned a lot about business from many years of composing music.
A composer has to keep their vision clear in their head as they work, and make sure that each element they develop truly supports it. While building a rhythm that flows and uncovering the notes that resonate, a composer relates these technical processes back to the larger purpose of the work.
I founded Crux with a meaningful mission: to create harmony between people and data. Now my role is to make sure that every element of our business works in harmony to achieve that goal. Every day, I conduct our talent, skills, and resources to flow together towards our purpose. When I face a challenge, I remind myself how it fits into the big picture and consider how my decision today can affect our company’s core goal.
Crux can orchestrate your data supply chain to deliver more value to you. We help data flow in a way that enriches everyone. That’s music to our ears, and we hope to yours too.
Crux Insights Blog: Core Access Service
Are you curious? Do you love solving complex problems and delivering white-glove service to clients? If so, let’s talk. The Crux team is growing and seeking highly skilled talent that believes data can be delightful. Check out our job postings here.
Have data to share? Our data supplier community is growing by leaps and bounds. Our diverse datasets range from stock quotes to corporate trends to transportation data and more. No data is irrelevant. Check out our network and create a profile of your own.
Out and About
STONE POINT CAPITAL FINTECH SYMPOSIUM – April 5 | The Park Hyatt, NYC
BENZINGA GLOBAL FINTECH AWARDS – May 15-16 | New World Stages, NYC
MBA’S NATIONAL SECONDARY MARKET CONFERENCE & EXPO 2018 – May 20-23 | New York Marriott Marquis, NYC
DATADISRUPT – May 22-24 | Lerner Hall, NYC
Hello from the CEO
While Data Science gets the headlines, Data Engineering is working hard behind the scenes to make the Data Science magic possible. And by working hard, I mean that Data Engineering typically accounts for 70-80% of the total effort a firm spends on making use of data. Data Science and the unique insights it delivers are business differentiators, but most firms spend a minority of the time on them.
That’s why forward-looking companies increasingly turn to a partner like Crux. By offloading their Data Engineering work, these companies give more time and energy to Data Science and move much more quickly to produce valuable new insights that power their businesses.
Crux brings laser focus, deep expertise, operational oversight, and a valuable network of data suppliers to help you orchestrate, implement, and operate your information supply chain.
At Crux, we make data delightful.
Crux Insights Blog
Five in 5 with Head of Data Engineering Andrew Clark
At Crux, being a data engineer means handling the tough work that makes data more actionable for our clients, and designing the tools that make our clients’ lives easier over time. Data engineers sit on the “data wrangling” side of the pipeline, meaning we are the folks who handle the hard work of figuring out where certain elements of the dataset live, slicing and dicing data, and repackaging it for distribution.
Today, the folks managing information supply chains are embracing the fact that the whole process does not need to exist on-premises anymore. While firms used to believe their data engineering was their “secret sauce”, today they realize it’s the insights they can glean that are more important. Using experts like Crux to remove as much of the tedious, upfront work as possible is now the preferred model.
Is it difficult to get access to usable data? Let Crux experts engineer your data to make it ready to use. Our data engineers take on your data challenges so that you can spend your time finding signals. Click HERE to chat with our team of experts.
Have data to share? Our data supplier community is growing by leaps and bounds. Our diverse datasets range from stock quotes to corporate trends to transportation data and more. No data is irrelevant. Create a Crux login HERE to browse our network and become a supplier.
Out and About
We’ve been building our community. In the past month, we’ve met with hundreds of suppliers and buyers of alternative data.
Quandl Alternative Data Conference | January 18, 2018
New York, NY
Battlefin Discovery Day Miami | January 30-31, 2018
Outsell Data Money | February 1, 2018
New York, NY
AI in Fintech Forum | February 8, 2018
Stanford University, Stanford, CA
Originally published November 8, 2017 at Inside Market Data
Author: Max Bowie
Originally published November 8, 2017 at Reuters
Author: Anna Irrera
Crux’s platform processes the data for financial firms, including banks, hedge funds, private equity groups and insurers, so they can focus resources on carrying out more differentiating tasks such as building artificial intelligence algorithms to extract value from the information.
This removes the biggest pain point, or “crux” of data analytics in finance, said Philip Brittan, chief executive of Crux.
“Everyone is looking at how to get more data and how to get more value out of the data that they have,” Brittan said in an interview. But “firms spend the majority of their data time on stuff that is not differentiated,” he added.
Crux does not sell or resell the data, but has established a network of information suppliers to help clients discover new sources.
Originally published November 8, 2017 at Silicon Angle
Author: Eric David
The San Francisco-based company calls itself a “data engineering concierge service,” allowing businesses to extract value from their unstructured data quickly and efficiently. Rather than doing the data analysis itself, which is generally left up to specialized artificial intelligence programs, Crux instead extracts and organizes companies’ data to make it more digestible. The company specializes in financial data for banks, hedge funds, financial firms and so on, which is what drew Goldman Sachs to the startup.
Originally published November 8, 2017 at American Banker
Author: Penny Crosman
It’s a given that banks, hedge funds, insurance companies, research firms and others have an insatiable need for data to make decisions – where to place bets, what companies to buy or fund, to whom to extend credit, and so forth.
Finding the right data from new sources, including data aggregators, alternative credit bureaus and satellite imagery, and making it readable to existing programs is a huge chore.
Originally published November 8, 2017 at Techcrunch
Author: Jonathan Shieber
“Think of Crux as a Switzerland for data storage and services. The company won’t reveal any information or resell to anyone else the proprietary information it processes and holds for its clients. It’s merely a processing engine for taking the data that big banks and businesses that depend on big data sets need, and crunches that data — reducing it to the metrics that matter most for the clients it serves.”