Serving Point-of-Sale Data with Additional Ingestion and Delivery Mechanisms

Supplier Spotlight


Crux’s mission is to help data flow efficiently between data suppliers and data consumers, and we look to highlight major trends and developments impacting both parties. Today’s spotlight features GfK. Our Q&A was conducted with Cedric Mertes, Commercial Director of GfK Boutique.

GfK Technology and Consumer Durables Data

What is GfK?

GfK stands for “Growth from Knowledge,” and this credo exemplifies how the company has served its clients over the past 85 years. GfK, a leading market research firm, tracks point-of-sale data at the most granular, SKU level from retailers, resellers, carriers, value-added resellers and distributors in over 75 countries on a weekly and monthly basis, giving clients the information to grow their businesses.

What type of clients do you work with?

GfK works with a range of clients, from manufacturers and retailers to hedge funds and investment managers. GfK Boutique works directly with investment managers, primarily in the Tech and Consumer Durables segments, including:

  • Component/Semiconductor Suppliers
  • Handsets
  • PCs/Tablets
  • GPUs
  • CPUs
  • Home Appliances
  • Home Audio
  • Action Cameras
  • Enterprise Software
  • IT Security
  • Enterprise Storage
  • Networking Equipment
  • Servers
  • Digital Cameras & Lenses
  • Navigation Devices
  • IT Peripherals
  • Contact Lenses
  • Printers & Cartridges
  • Tires
  • TVs
  • Gaming Consoles & Software
  • Watches
  • Wearables

What are some examples across sectors of how GfK data is used?

  • Analyzing 5G penetration and content wins/losses in terms of end-customer adoption of 5G phones and which component suppliers are gaining dollar content in those devices.
  • Looking at DRAM and NAND memory content growth from an end-demand standpoint and its impact on supply and prices.
  • Identifying the end-demand success or failure of recent smartphone launches from Apple, Huawei, Xiaomi and Samsung, as well as their subsequent impact on the component suppliers of these devices across the Handset, TV and Wearable categories.
  • Identifying how the Gaming PC, Console and Server markets drive overall CPU and GPU demand and the impact on Intel, NVIDIA and AMD; analyzing the adoption of new gaming consoles from Sony and Nintendo; and researching the success or failure of new product launches from Activision and Electronic Arts.
  • Monitoring share shifts at Sonos, Garmin and Logitech based on promotional activity ahead of key shopping events such as Singles’ Day, Black Friday, the Lunar New Year, etc.
  • Quantifying the success of Alcon’s new Daily Contact Lens and its impact on CooperVision’s market share.

What trends are you seeing in the industry?

Many of our clients have new data analyst teams that prefer an FTP feed that allows them to manipulate the data themselves. Given the number of alternative data sets that have come to market over the years, we recognize many of our clients are looking to ingest as much raw data as possible.

However, given the continued demand for GfK’s traditional fundamental products and analyst support, GfK has partnered with Crux to expand its resources and support additional delivery mechanisms rather than shift resources away from its traditional products. Crux has been instrumental in allowing GfK to better serve its clients.

How has the partnership with Crux affected the user experience?

By partnering with Crux, GfK is able to work with additional clients on a platform that is already built out, which eliminates the onboarding process and all the accompanying frustrations.

What has GfK’s experience with Crux been?

Working with Crux has been an excellent experience thus far! The team has been extremely diligent, responsive and timely throughout the data onboarding process, and all communication has been clear and concise. GfK has an extremely granular data set and a large quantity of files, which can be challenging for traditional clients. Crux has been able to eliminate that challenge and streamline the process. GfK will be launching new products outside of the consumer/tech industry and looks forward to working on additional datasets and clients with Crux in 2020!

To receive these updates, join our community.

Developing Proprietary Models using Natural Language Processing and Machine Learning Strategies

Supplier Spotlight

Brain proprietary models

Crux’s mission is to help data flow efficiently between data suppliers and data consumers, and we look to highlight major trends and developments impacting both parties. Today’s spotlight features Brain, which just released a new alternative dataset, “Brain Language Metrics on Company Filings,” based on Natural Language Processing analysis of 10-K and 10-Q reports for the largest US stocks. Our Q&A was conducted with Francesco Cricchio, PhD, and Matteo Campellone, PhD.

What is Brain?

Brain is a research company that develops proprietary signals and algorithms for investment strategies. Brain also supports clients in developing, optimizing and validating their own proprietary models.

The Brain platform includes Natural Language Processing (NLP) and Machine Learning (ML) infrastructures which enable clients to integrate state-of-the-art approaches into their strategies. All of our software is highly customizable to support the investment approach of our clients.

Our system incorporates alternative data and evaluates its relevance to financial models. 

What are some examples of signals that investors should know about?

Two good examples are the Brain Sentiment Indicator (BSI) and the Brain Machine Learning Stock Ranking (BSR).

The BSI is a sentiment indicator of global stocks produced by an automated engine that scans the financial news flow to gain a deeper understanding of the dynamic factors driving investor sentiment. This indicator relies on various NLP techniques to score financial news by company and extracts aggregated metrics on financial sentiment.
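
For illustration only, here is a minimal sketch of the general idea of turning per-article sentiment scores into a company-level daily metric. This is not Brain’s actual engine; the article data, scores, and column names are invented.

```python
# Illustrative only: aggregate hypothetical per-article sentiment scores
# into a company-level daily indicator. Data and column names are made up.
import pandas as pd

articles = pd.DataFrame({
    "date":      ["2020-01-02", "2020-01-02", "2020-01-02", "2020-01-03"],
    "ticker":    ["AAPL", "AAPL", "MSFT", "AAPL"],
    "sentiment": [0.6, -0.2, 0.1, 0.4],   # scores in [-1, 1] from some NLP scorer
})
articles["date"] = pd.to_datetime(articles["date"])

# Company-level daily metrics: average sentiment and news volume.
daily = (
    articles.groupby(["ticker", "date"])["sentiment"]
    .agg(mean_sentiment="mean", n_articles="count")
    .reset_index()
)

# Smooth per ticker with a short rolling window to reduce day-to-day noise.
daily["sentiment_7d"] = (
    daily.sort_values("date")
    .groupby("ticker")["mean_sentiment"]
    .transform(lambda s: s.rolling(7, min_periods=1).mean())
)
print(daily)
```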

The incorporation of BSI rankings helps clients build quantitative strategies that include both sentiment and short-term momentum indicators. On a longer time horizon, the application of BSI adds value to strategies that seek companies that are under- or over-priced due to very low or very high sentiment. 

The BSR is used to generate a daily stock ranking based on the predicted future returns of a universe of stocks for various time horizons. BSR relies on machine learning classifiers that non-linearly combine a variety of features with a series of techniques aimed at mitigating the well-known overfitting problem for financial data with a low signal to noise ratio. This model uses a dynamic universe that is updated each year to avoid survivorship bias.
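
As a heavily simplified, hypothetical sketch of this kind of setup (not the actual BSR model), a classifier trained with walk-forward validation is one common way to keep overfitting in check on low signal-to-noise data; the features and labels below are synthetic stand-ins.

```python
# Illustrative only: rank stocks by predicted probability of outperforming,
# using time-ordered (walk-forward) validation to limit look-ahead bias and
# overfitting. Features and labels are synthetic stand-ins.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n_samples, n_features = 2000, 20                  # e.g. stock-days x engineered features
X = rng.normal(size=(n_samples, n_features))
y = (X[:, 0] + 0.1 * rng.normal(size=n_samples) > 0).astype(int)  # 1 = outperforms

model = GradientBoostingClassifier(max_depth=2, n_estimators=100, subsample=0.7)

# Walk-forward splits: always train on the past, validate on the future.
for train_idx, test_idx in TimeSeriesSplit(n_splits=4).split(X):
    model.fit(X[train_idx], y[train_idx])
    proba = model.predict_proba(X[test_idx])[:, 1]
    print("out-of-sample AUC:", round(roc_auc_score(y[test_idx], proba), 3))

# Sorting the predicted probabilities across the universe yields a daily ranking.
```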

The incorporation of BSR enhances quantitative models and long/short strategies by adding a stock ranking that non-linearly combines stock specific market data with market regime indicators and calendar anomalies using advanced ML techniques.

How are you different than other firms? 

We’ve developed a scientific and rigorous approach based on our years of research and our experience implementing statistical models in state-of-the-art software.

We try to be as rigorous as possible in our models, which is especially important when extracting information from financial time-series data, where the signal-to-noise ratio is very low and the risk of overfitting is a serious concern when validating meaningful signals.

What are some use cases for your data?

We offer a two-fold solution for clients. Financial firms can combine our systematic signals (BSI) with their own proprietary signals or algorithms to create a more complete model and perform back-testing validation. 

Alternatively, clients can come to us as a consultancy to support or validate a specific methodology or create a signal they can backtest for their hypotheses using ML or other advanced statistical techniques.

We also develop proprietary signals based on market and economic factors. One example of this is asset allocation models that try to capture risk-on and risk-off phases in the market.

What trends are you seeing in the market?

There are a number of providers of similar signals today, and we see NLP-based sentiment signals being increasingly adopted in the market. Some providers are moving towards offering integrated platforms based on their technology, often with graphical interfaces. Our differentiator is that we are focused on continuously enhancing our algorithms. When deploying our integrated solutions, there is a lot of value in customizing the product for each client.

Brain’s proprietary NLP algorithm uses semantic rules and dictionary-based approaches, analyzing financial news to calculate sentiment on stocks. Beyond traditional sentiment data, we have also developed other language metrics, such as language complexity in earnings calls or similarity of language in regulatory filings, to investigate how these metrics correlate with a company’s financial performance.
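
To give a flavor of the “similarity of language” idea, one common approach is TF-IDF cosine similarity between consecutive filings. This is a hedged sketch, not Brain’s actual metric, and the filing texts below are placeholders.

```python
# Illustrative only: measure how much the language of a filing changed
# versus the prior one using TF-IDF cosine similarity. Texts are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

previous_filing = "The company expects stable demand and moderate cost inflation."
current_filing  = "The company expects weakening demand and significant cost inflation."

tfidf = TfidfVectorizer(stop_words="english")
vectors = tfidf.fit_transform([previous_filing, current_filing])

similarity = cosine_similarity(vectors[0], vectors[1])[0, 0]
print(f"language similarity between filings: {similarity:.2f}")  # closer to 1 = less change
```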

Great, so who are your target clientele?

We have two main client groups. Large, global quant hedge funds look to us for our raw datasets, while other investment companies look to us for customized solutions, which we create and integrate for them on top of our platform.

What are your backgrounds? 

As the Co-Founders of Brain, we share a common background in Physics and research. We focused on nurturing this as a common thread throughout the team. 

Matteo, Executive Chairman and Head of Research, worked as a theoretical physicist in the field of statistical mechanics of complex systems and non-linear stochastic equations. After receiving his Ph.D. in Physics and dedicating years to research, he obtained an MBA at IMD Business School. He then went on to work in various areas of finance, from financial engineering to risk management and investing.

Francesco, CEO and Chief Technology Officer, obtained a Ph.D. in Computational Physics with a focus on solving complex computational problems with a wide range of techniques. He then focused on using ML methods and advanced statistics in the industrial space. Francesco’s technological know-how underpins the industrial machine learning solutions we deploy in our robust production environment.

What led you to work with Crux?

We believe that the partnership with Crux is a particularly good fit since Brain develops alternative datasets based on NLP and ML techniques while Crux builds and manages pipelines from data suppliers to its cloud platform. Thus, we are excited that our datasets will be delivered effectively to clients without performing different types of integration procedures for each new client we onboard. We rely on the Crux platform to help us scale our products more efficiently.

Thanks so much! Really enjoyed chatting with you both.

About Supplier Spotlight Series


In our mission to help data flow efficiently between data suppliers and data consumers, we look to highlight major trends and developments impacting both parties. The ‘Supplier Spotlight’ series shares the latest developments from suppliers and the datasets they deliver through Crux.



To receive these updates, join our community.

Delivering Technology Industry Data Without Friction

Supplier Spotlight

International Data Corporation (IDC)

What is the background on IDC? 

IDC has been in business for over 50 years. We do comprehensive and very deep research on the Technology industry. 

The bulk of IDC’s business is with large technology vendors who rely on IDC’s industry data and research to make strategic decisions around product development, competitive positioning, new investment opportunities, market entry, product positioning, etc. IDC has also been working with financial clients for many, many years. On the buy-side, that tends to be investors doing deep fundamental work in the tech space, namely long/short discretionary hedge funds, activists, and private equity clients.

What are IDC’s key benefits for clients? 

IDC has unique and highly structured data that cannot be found in SEC filings, company disclosures, on the internet, or elsewhere in the public domain. IDC data sets provide an independent, comprehensive, and coherent picture of technology markets worldwide, which is understood by our clients to be the best proxy for ground truth available.

This is a good point to speak to the key categories of data that IDC offers. What schemas does IDC utilize (frequency, type, etc.)?

IDC has 25 distinct data sets (called Trackers) oriented around various worldwide technology markets (e.g. Mobile Phones, PCs, Servers, Storage, Switches & Routers, Public Cloud Services, Cloud Infrastructure, Software, etc.). Collectively, IDC covers nearly 3,000 technology firms globally, of which over 600 are publicly traded. The data offers comprehensive and very granular insight into Tech-industry fundamentals (e.g. revenue, unit shipments & capacity shipments by vendor, segment, country, price band, channel, form factor, etc.). 

From a geographic standpoint, IDC has country-level data on up to 110 countries. The frequency of the data is monthly, quarterly, or semiannual, depending on the data set. Historical data varies by technology market depending on the maturity of the technology and typically extends back close to the inception of the market. For instance, PC data goes back to 1995, while Mobile Phone data extends back to 2004. IDC’s tracker data is supported by detailed industry taxonomies and underpinned by a rigorous data collection methodology combining top-down and bottom-up approaches, in which data is reconciled through direct contact with leading technology vendors and cross-checked against information gathered from our extensive regional and local relationships, resources, and data sources.

IDC has more than 1,100 technology analysts and research offices in over 50 countries. We leverage a multitude of data sources, including published financial statements and public data, import records, contract details, 3rd-party data from OEMs, component vendors, platform suppliers, and other channel and supply-chain constituents. IDC also curates information from distributor data feeds, import/export records, and pricing data scraped from the web. IDC analysts have over 85,000 vendor client interactions every year and leverage extensive consumer and B2B surveys, with over 350,000 respondents annually.

Can you provide examples of questions that a data consumer might be interested in answering with IDC data?

IDC helps investors understand the size, structure, and competitive and growth dynamics of various tech markets. The questions can include: Who is winning in particular market segments and geographies? Who is gaining traction in various channels? What are the key form factor trends, and pricing dynamics? How many units or how much capacity is being shipped? What is the size and age of the installed base and who/what is vulnerable for displacement by new technologies and competing vendors? Which workloads are moving to the cloud and which still have longevity with on-premise deployments? What architectures and components are being used in new data center construction? Which vendors sold what devices at what prices through which channels last month in China?

Historically, our discretionary clients have used IDC’s tracker data to build sophisticated market models that identify market share gainers and donors and surface long/short ideas and paired trades. The newer quant-oriented use cases might focus on sector rotation (e.g. overweight/underweight Tech), segment rotation within Tech (e.g. Semis vs Software), using market-share data to enhance “quality” factors in factor models, or even using tracker data as inputs for global macro insights.
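
As a minimal, hypothetical illustration of the market-share use case (the schema and figures below are invented, not IDC’s actual tracker format), computing vendor share and quarter-over-quarter share change might look like this:

```python
# Illustrative only: compute vendor market share and quarter-over-quarter
# share change from hypothetical tracker-style revenue data, to flag
# share gainers and donors. Numbers and schema are invented.
import pandas as pd

tracker = pd.DataFrame({
    "quarter": ["2019Q3"] * 3 + ["2019Q4"] * 3,
    "vendor":  ["VendorA", "VendorB", "VendorC"] * 2,
    "revenue": [120.0, 80.0, 50.0, 150.0, 75.0, 55.0],   # $M, made up
})

# Share of each vendor within its quarter.
tracker["share"] = tracker["revenue"] / tracker.groupby("quarter")["revenue"].transform("sum")

# Quarter-over-quarter change in share per vendor.
tracker = tracker.sort_values(["vendor", "quarter"])
tracker["share_chg"] = tracker.groupby("vendor")["share"].diff()

gainers = tracker[tracker["share_chg"] > 0]
print(gainers[["quarter", "vendor", "share", "share_chg"]])
```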

So you’ve traditionally worked with technology-focused investors. What are some trends you’re seeing with them?

Over the past couple years, we’ve seen our traditional clients becoming increasingly focused on extracting insights from data in more efficient and often more automated ways. On an individual level, investment analysts are becoming more adept with programming languages, statistics packages, and analytical tools, so they are doing heavier lifting with large amounts of raw data. Many discretionary firms are creating centralized functions with data science teams to help fundamental analysts with screening, idea generation, and/or new quant-based insights into existing holdings. Some firms are embedding quants with their fundamental teams. Overall, fundamental analysis is becoming more data intensive and automated. 

At the same time, IDC is working with clients that have full-blown quantitative and systematic mandates. Interestingly, these firms are trying to find ways to acquire more domain expertise, and we can help them do that. So, these previously distinct skill sets (quant/systematic and fundamental/discretionary) are converging along a spectrum of capabilities and mandates. These evolving discretionary use cases and new systematic clients all require more automated, precise and timely delivery of data in various formats. 

We spoke about duplication of work in the past, can you tell me a little more about this?

For IDC customers and prospective customers, their technical teams may face a learning curve and upfront work to understand how to ingest our data into their processes. This requires a lot of back-and-forth with IDC’s technical staff, and it happens across all of IDC’s clients for all the data vendors they work with. This multiplicative effect is time-consuming and strains their internal technical staff, which ultimately inhibits IDC’s clients’ ability to scale their data onboarding, ingestion, and production operations.

When it comes to ingesting data, can you elaborate on that?

Each of these new and evolving use cases requires a different way of delivering raw data to our clients: flat files via FTP, cloud buckets, or APIs. In addition to delivery, on IDC’s end there is a lot of work that goes into the preparation of data, taking it from raw form to a form that is actionable. This is where Crux comes into play.
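
As a hedged sketch of what the cloud-bucket delivery pattern can look like on the consumer side (the bucket name, key, and file format are hypothetical placeholders, not an actual IDC or Crux endpoint):

```python
# Illustrative only: read a delivered flat file from a cloud object store
# into a DataFrame. Bucket, key, and schema are hypothetical placeholders.
import io

import boto3
import pandas as pd

s3 = boto3.client("s3")
obj = s3.get_object(Bucket="example-delivery-bucket", Key="tracker/2019Q4/data.csv")
df = pd.read_csv(io.BytesIO(obj["Body"].read()))
print(df.head())
```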

Let’s talk quickly about your experience working with Crux and delivering data through the platform.

The two pieces that Crux is helping to address for IDC and IDC’s clients are:

  • The back-and-forth friction required to get a new data set stood up, and
  • The ability for our clients to ingest the data into their systems in the format(s) they want

We’re still ramping up our relationship with Crux and just finished the first part, but we’re excited to now be able to deliver our data sets through a variety of file formats quickly to our clients. Onboarding our datasets with the Crux team was relatively painless.

What does the future hold for IDC?

Overall, we think IDC data provides a perfect framework and foundation for creating complex data ensembles or unlocking the value of higher-frequency alternative data in the Tech space. In fact, Yin Luo’s Quantitative Research group at Wolfe Research back-tested our data and wrote a very interesting report last year on some potential applications of IDC data for systematic investors. In our view, IDC data is still tremendously underutilized by buy-side clients, particularly for these new types of use cases. Now that we’re working with Crux, our existing and potential clients have a streamlined way to explore, test, ingest, and ultimately exploit IDC data, and we’re very excited about that.

Q&A conducted with Brian Murphy, Director, Financial Sales at IDC.




To receive these updates, join our community.

Welcome MT Newswires to Crux!

MT Newswires Datasets on Crux

MT Newswires datasets now delivered by Crux! MT Newswires is a recognized leader in original, unbiased business and financial news. Offering over 130 unique categories of noise-free coverage, MT Newswires’ multi-asset-class global news powers many of the industry’s most recognized platforms. Get datasets delivered here.

Crux Informatics Expands Data Delivery Reach to AWS Data Exchange

Crux on AWS Data Exchange

NEW YORK, Nov. 14, 2019 /PRNewswire/ — Crux Informatics (“Crux”), a leading data delivery and operations company, today announced it has joined the newly launched AWS Data Exchange service from Amazon Web Services (AWS). AWS Data Exchange is a service that makes it easy for millions of AWS customers to securely find, subscribe to, and use third-party data in the cloud. Availability on AWS Data Exchange further delivers on Crux’s mission to make reliable data from suppliers accessible for consumers around the globe.

Companies today are faced with the challenge of ingesting complex data from a wide variety of suppliers at scale, resulting in significant time and resource investments. As the industry utility that simplifies the data ingestion, operations, and delivery process, Crux ensures that data flows seamlessly between suppliers and financial companies, so they can focus their resources on extracting value from the data. Customers using AWS Data Exchange and the AWS Marketplace are now able to easily access validated data from Crux, and data suppliers can leverage Crux’s capabilities to distribute on AWS Data Exchange.

“Data suppliers and their customers across financial services have told us they want to work with Crux to connect and operate with data sets at scale, so we are delighted to welcome Crux to AWS Data Exchange to feed data into customers’ applications, analytics, and machine learning models running on AWS,” said Samantha Gibson, Head of Financial Services Business Development, AWS Data Exchange, Amazon Web Services, Inc. “As our customers continue to move to the cloud, they’re looking for new ways to simplify how they find and consume data, and we are excited to see the innovation made possible by having diverse, high-quality, and well-curated data from Crux available on AWS Data Exchange.”

“We are proud to collaborate with AWS to deliver reliable, timely and digestible data into their innovative AWS Data Exchange,” said Philip Brittan, CEO at Crux Informatics. “With availability in AWS Data Exchange, we’re taking another step forward in giving our clients the ability to find and access the data they need with the high degree of operational excellence they’ve come to expect from Crux.”

As of today, AWS Data Exchange customers can access Crux-delivered data directly. See here for a selection of over 70 free datasets available on AWS Data Exchange. At the same time, data suppliers can now work with Crux to connect and deliver their data on AWS Data Exchange. Crux takes care of onboarding, validating, and reliably operating your data feeds on AWS Data Exchange, enabling you to accelerate distribution and scale operational efficiency. Connect with us to get your data delivered today.

Welcome Vertical Knowledge to Crux!

Vertical Knowledge Datasets delivered by Crux

Vertical Knowledge datasets now delivered by Crux! Vertical Knowledge provides rich historical libraries of auto, retail, real estate, travel, business intelligence, and other open source collected web data. Combining the power of the Vertical Knowledge web collection engine with the flexibility and depth of the Crux data management platform transforms the way public sector and commercial institutions identify, reference, and analyze open source data to solve their most difficult business problems. Get datasets delivered here.

Welcome IDC to Crux!


IDC’s global technology research data can now be delivered by Crux. IDC is the premier global provider of market intelligence, advisory services, & events for the information technology, telecommunications, and consumer technology markets. Learn more here.

Welcome GTCOM to Crux!

GTCOM Sentiment Data delivered by Crux

GTCOM-US and their JoveBird Sentiment Data now delivered by Crux! GTCOM’s advanced NLP & semantic computing technologies analyze global news & social streams, providing corporate users with comprehensive scenario-based solutions. Get datasets delivered from Crux.

Welcome 2iQ to Crux!

Very excited to welcome @2iqresearch and their insider transactions data onto the Crux Supplier Network! Their datasets cover over 8.1M transactions from over 200k insiders across a universe of 60k stocks in 50 countries. Learn more here.