DSC Weekly Digest 14 September 2021

Author: Kurt A Cagle



Announcements
  • Marketing Analytics and Data Science: Join us on October 25-26, 2021, at the Eau Palm Beach Resort & Spa in Palm Beach, FL, to meet with MADS speakers and like-minded peers for face-to-face discussions, sessions, and 1-on-1 expert consultations on overcoming your challenges. You will discover the infinite possibilities when marketing analytics and data science align to form a revenue-driving powerhouse.

  • On September 27th, you are invited to the AWS Data Exchange Webinar: How to use external consumer insights and marketing data to build a customer-centric business. External consumer and marketing insights can result in higher customer satisfaction, better retention, and a stronger overall bottom line.

Making Do With Small Data

The 2010s could, arguably, be described as the era of Big Data, where all of a sudden it seemed like businesses were being deluged by huge amounts of data that had to be processed immediately. Part of this was an amplification of the IT hype mills, as Big Data required Big Servers (or lots of little ones), faster processors, and more programmers to do the heavy lifting of creating the Data Lakes and Enterprise Warehouses that were so integral to the zeitgeist, and part of it was the impact of mobile computing as it suddenly expanded the number of sensors in play dramatically.

Yet the reality on the ground was a bit different for most companies, even many in the IT space itself. Most of the really big data was coming from a few focused social media companies, not from businesses dramatically increasing their data streams elsewhere, and much of that (most of that) was noise outside of the context it came from. Social media is actually a poor place to pick up on covert terrorist activities (high noise, subtle signals), though it’s great at identifying domestic terrorists who want to publicly high-five themselves with their buddies over their latest hijinks.

Most data is, at the end of the day, the trail that transactions leave over time. This information can be valuable, but from the perspective of a business, the metadata at the other end of the transaction is usually fragmentary and hard to quantify. This is one of the reasons that any comprehensive AI solution has to incorporate both algorithmic processes (machine learning) and annotational processes (semantics). Most analytics tools, even neural networks, tend to concentrate on data from the perspective of the transaction, while annotational processes are often far more useful to a company, as they are a critical source for what is colloquially called “labeling”.

Labeling is often considered bothersome by analysts because it is time-consuming and requires the collection of metadata rather than the analysis of data. This metadata also requires developing a conceptual model and distilling relationships, work that usually does require human intervention. It is possible to infer this metadata using statistical techniques, but doing so requires a huge amount of data while providing at best only a hint of the underlying structure.

The next generation of neural networks is beginning to take this small data into account, in essence focusing increasingly on not just the statistics of the data but also its shape. Known as labeled neural networks (LNN) or graph neural networks (GNN), these various convolutional neural nets replace brute force analysis with what amounts to Bayesian networks. These use probabilistic models to identify the schema (or model) implicit in the data. With that information (especially when combined with the contextual streaming that provides the working memory for these processes), GNNs can then become self-labeling, determining not only the value but also the structure of the resulting function.
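
To make the idea of self-labeling from the shape of the data a little more concrete, here is a minimal sketch in Python. It is not a graph neural network; it uses plain label propagation, a much simpler technique swapped in for illustration, in which each unlabeled node takes the most common label among its already-labeled neighbors. The function name, the node names, and the "customer" and "supplier" labels are all hypothetical.

```python
from collections import defaultdict


def propagate_labels(edges, seed_labels, rounds=10):
    """Spread a few known labels across a graph by repeated neighbor voting."""
    # Build an undirected adjacency list from the edge pairs.
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    labels = dict(seed_labels)  # start from the hand-labeled nodes only
    for _ in range(rounds):
        updates = {}
        for node in sorted(neighbors):
            if node in labels:
                continue  # labels already assigned are never overwritten
            votes = defaultdict(int)
            for other in neighbors[node]:
                if other in labels:
                    votes[labels[other]] += 1
            if votes:
                # Most common neighbor label wins; ties break alphabetically.
                updates[node] = min(votes, key=lambda lab: (-votes[lab], lab))
        if not updates:
            break  # nothing new could be labeled, so stop early
        labels.update(updates)
    return labels


# Hypothetical toy data: two tight groups joined by a single edge.
EDGES = [
    ("ann", "bob"), ("bob", "cat"), ("ann", "cat"),  # group one
    ("dan", "eve"), ("eve", "fay"), ("dan", "fay"),  # group two
    ("cat", "dan"),                                  # bridge between groups
]
SEEDS = {"ann": "customer", "fay": "supplier"}  # only two hand-made labels

result = propagate_labels(EDGES, SEEDS)
for name in sorted(result):
    print(name, result[name])
```

With only two hand-made labels, the structure of the graph is enough to label the other four nodes in this toy case: ann, bob, and cat come out as "customer", while dan, eve, and fay come out as "supplier". Real GNNs learn far richer functions, but the underlying bet is the same: relationships carry information that raw volume alone does not.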

The biggest benefit of this technology will be making it possible to get the benefits of big data systems without requiring big data. Put another way, artificial intelligence is becoming more intuitive, able to parse out valid patterns with far less raw input. By being able to make do with such small data, all users should be able to benefit from this technology, not simply the ones with the deepest pockets.

In medias res,

Kurt Cagle
Community Editor,
Data Science Central

To subscribe to the DSC Newsletter, go to Data Science Central and become a member today. It’s free! 


Data Science Central Editorial Calendar

DSC is looking for editorial content specifically in these areas for September, with these topics having higher priority than other incoming articles.

  • Machine Learning and IoT
  • Data Modeling and Graphs
  • AI-Enabled Hardware (GPUs and similar tools)
  • JavaScript and AI
  • GANs and Simulations
  • ML in Weather Forecasting
  • UI, UX and AI
  • Jupyter Notebooks
  • No-Code Development
  • Metaverse
  • GNNs and LNNs

DSC Featured Articles


Picture of the Week
Hiring difficulties jump from last year

 


To make sure you keep getting these emails, please add mail@newsletter.datasciencecentral.com to your browser’s address book.

This email, and all related content, is published by Data Science Central, a division of TechTarget, Inc.

275 Grove Street, Newton, Massachusetts, 02466 US


You are receiving this email because you are a member of TechTarget. When you access content from this email, your information may be shared with the sponsors or future sponsors of that content and with our Partners (see the up-to-date Partners List below), as described in our Privacy Policy. For additional assistance, please contact: webmaster@techtarget.com


Copyright 2021 TechTarget, Inc. All rights reserved. Designated trademarks, brands, logos and service marks are the property of their respective owners.

Privacy Policy  |  Partners List
