Sensor data, reimagined: When 90% less data can fuel 100x gains in efficiency in AI projects

By Brian Buntz | January 9, 2025

For decades, the Nyquist-Shannon sampling theorem, a foundational principle of signal processing, held that a signal must be sampled at a rate of at least twice its highest frequency to capture all of its information. Now, a Pennsylvania startup called Lightscline suggests we may be entering a “post-Nyquist era.” According to a recent paper in Scientific Reports, the company’s neural-network-based software, inspired by selective attention in human vision, can discard up to 90% of raw sensor data while preserving the details that matter for tasks like anomaly detection and classification. In one benchmark using the Case Western Reserve University bearing-fault dataset, Lightscline achieved 96% accuracy using only 30% of the data, compared with 99.77% accuracy for a widely used CNN model based on LeNet-5 trained on the entire dataset.

Lightscline co-founders Ankur Verma, Ayush Goyal, and Soundar Kumara, Ph.D., are co-authors of the paper in Scientific Reports.

Lightscline co-founder and CEO Ankur Verma, Ph.D., claims this approach can “reduce your AI infrastructure and human capital costs by 100×.” While real-world results will vary, Verma clarifies that the 100× figure is an order-of-magnitude estimate of savings across edge computing, bandwidth, cloud resources, and processing time. According to Verma, these savings arise in four main areas: first, edge compute power at the source of data collection; second, transmission bandwidth; third, cloud computing resources (FLOPS); and fourth, overall cloud processing time. The paper also demonstrates significant computational savings: Lightscline’s Shift-Invariant and spectrally stable Undersampled Network (SIUN) can cut FLOPS by a factor of 435 compared to a conventional CNN, a reduction that translates directly into lower compute costs.
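As a rough illustration of how a FLOPS reduction of that magnitude compounds across a fleet of sensors, consider the back-of-envelope arithmetic below. The per-inference cost and daily workload are hypothetical placeholders, not figures from Lightscline or the paper:

        # Back-of-envelope compute-savings estimate. All inputs are
        # illustrative placeholders, not figures from Lightscline.
        FLOPS_PER_INFERENCE_CNN = 1.2e9   # hypothetical conventional CNN cost
        FLOPS_REDUCTION = 435             # SIUN-vs-CNN ratio reported in the paper
        INFERENCES_PER_DAY = 1_000_000    # hypothetical fleet workload
        DATA_KEPT = 0.10                  # fraction of raw samples retained

        flops_cnn = FLOPS_PER_INFERENCE_CNN * INFERENCES_PER_DAY
        flops_siun = flops_cnn / FLOPS_REDUCTION

        print(f"Daily FLOPs, conventional CNN: {flops_cnn:.2e}")
        print(f"Daily FLOPs, SIUN-style model: {flops_siun:.2e}")
        print(f"Raw data transmitted: {DATA_KEPT:.0%} of the full stream")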

A “selective attention” approach

Undersampled sine waves (A, B, C, D) illustrating how a periodic signal can appear random when sampled sparsely. Inspired by Figure 1 in Verma et al., Sci Rep 14, 32041 (2024), https://doi.org/10.1038/s41598-024-83706-8.
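A few lines of NumPy reproduce the effect the figure illustrates: sampled well below its Nyquist rate, a clean sine wave yields points with no obvious periodic structure. This is a minimal sketch; the frequencies and rates are arbitrary choices:

        import numpy as np

        f_signal = 50.0                # signal frequency (Hz)
        nyquist_rate = 2 * f_signal    # Nyquist-Shannon minimum: 100 Hz

        # Sample the same 50 Hz sine densely and then far below Nyquist
        t_dense = np.arange(0, 0.2, 1 / 5000)
        t_sparse = np.arange(0, 0.2, 1 / 37)   # 37 Hz, well under 100 Hz

        dense = np.sin(2 * np.pi * f_signal * t_dense)
        sparse = np.sin(2 * np.pi * f_signal * t_sparse)

        # The sparse samples alias: they no longer trace a 50 Hz sine
        # and can look random, even though the signal is periodic.
        print(sparse.round(2))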

This efficiency is achieved through an approach that mimics the “selective attention” mechanism of the human brain. “Selective attention means we can’t focus on more than three to four things at any point in time,” Verma explains, referring to inattentional blindness, the phenomenon that helps us filter out irrelevant stimuli. Lightscline’s SIUN algorithm mirrors this by learning to identify and retain only the most informative data slices for a given task, letting the network discard superfluous details.
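The article describes SIUN only at this high level, but the gist (score candidate slices of a signal and pass only the highest-scoring ones to the downstream model) can be sketched in a few lines of PyTorch. This is a toy illustration under our own assumptions, not Lightscline's architecture; in practice the scoring layer would be trained end-to-end against the task loss:

        import torch

        def select_informative_slices(signal, slice_len=64, keep=3):
            """Toy 'selective attention': score fixed-length slices of a
            1-D signal with a small learnable layer and keep the top-k.
            Illustrative only -- not Lightscline's SIUN architecture."""
            slices = signal.unfold(0, slice_len, slice_len)  # (n_slices, slice_len)
            scorer = torch.nn.Linear(slice_len, 1)           # learnable relevance score
            scores = scorer(slices).squeeze(-1)
            top = torch.topk(scores, k=keep).indices
            return slices[top]   # the downstream model sees only these slices

        x = torch.randn(1024)                 # stand-in for a raw sensor trace
        kept = select_informative_slices(x)
        print(kept.shape)                     # torch.Size([3, 64]): ~81% discarded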

This “selective attention” approach is already being put to the test in real-world environments. In pilot tests with a Fortune 150 company, Lightscline demonstrated strong accuracy while training on only a fraction of raw sensor readings. According to Verma, the company’s data was similar to the bearing-fault signals from the CWRU dataset used in the paper, so the estimated performance benefits were on the order of 100–400×. The company is also exploring wearable sensor applications with NASA HumanWorks, aiming to enable AI inference on low-SWaP-C (size, weight, power, and cost) devices.

When less is more

In deep learning, models focusing on complex problems with high-dimensional inputs, such as image recognition and natural language processing, often benefit from larger datasets. This allows them to learn complex patterns, avoid overfitting, and generalize better to unseen data. Reinforcement learning (RL) shares some similarities in that more experience can lead to improved performance.

Ankur Verma, Ph.D.

“Lightscline’s neural-network approach discards most of the raw sensor data while preserving critical information. The company reports 96% accuracy using only 30% of one benchmark dataset, and a 435× reduction in FLOPS compared to standard CNNs. Real-world trials have shown 100–400× FLOPS reductions on just 10% of raw data, demonstrating a potential to reduce AI infrastructure costs by an order of magnitude.”

But not all data challenges are created equal. When it comes to sensor data in industrial or edge computing scenarios, collecting every bit of information at all times may be unnecessary, and potentially wasteful. That’s where Lightscline’s approach diverges. “We’re not arguing that you don’t need more data,” Verma said. “We’re saying that for sensor data, you don’t need to collect more than a certain fraction—you don’t need to collect at a certain rate. What we’re saying is that augmenting a small dataset is more beneficial than operating on 100% of a dataset.”
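For 1-D sensor traces, that augmentation can be as simple as adding jitter, rescaling amplitude, and shifting in time. Below is a generic sketch of common time-series augmentations, not Lightscline's pipeline:

        import numpy as np

        rng = np.random.default_rng(0)

        def augment(signal, n_copies=5):
            """Expand a small sensor dataset with standard time-series
            transforms: additive noise, amplitude scaling, and time shifts."""
            variants = []
            for _ in range(n_copies):
                x = signal + rng.normal(0, 0.01 * signal.std(), signal.shape)  # jitter
                x = x * rng.uniform(0.9, 1.1)                                  # rescale
                x = np.roll(x, rng.integers(0, len(x)))                        # shift
                variants.append(x)
            return np.stack(variants)

        small_sample = rng.standard_normal(256)   # stand-in for the retained fraction
        print(augment(small_sample).shape)        # (5, 256): five variants per trace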

Consider the example of underwater mapping for oil and gas exploration. “Every hour of raw data might require 40 hours of processing,” Verma says. Lightscline’s method frees data scientists from exhaustive manual feature extraction. Verma adds that with Lightscline’s end-to-end technique, the total time can be reduced from 40+ hours of manual effort to about 1 hour, enabling data teams to focus on scaling the number of models rather than labor-intensive data pre-processing. “We’re moving into this post-Nyquist era—how do we find the important information for different tasks? That’s problem-dependent sampling.”

“The fundamental question is: How can we find the important information for different tasks?” Verma said. “How can we generalize that? This architecture generalizes the sampling rate according to the information in your signal.”

Hardware-agnostic strategy

The versatility of Lightscline’s approach extends beyond software to encompass a wide range of hardware. The company has deployed its models on everything from a $5 Raspberry Pi Pico microcontroller to an NVIDIA Jetson Nano with its 128-core GPU, and its software also runs on Arduino boards and on Intel and AMD Ryzen processors.

        from lightscline.lightscline import LightsclineEdge

        # Load data into Lightscline; `data` is the raw sensor array and
        # SAMPLING_FREQUENCY its sample rate in Hz (both supplied by the user)
        ls = LightsclineEdge(data=data, fs=SAMPLING_FREQUENCY)

        # Discard 90% of the raw data and preprocess the remainder
        ls.reduce_and_preprocess_data(per_reduction=90)

        # Train the model on the reduced data
        ls.train_model(verbose=True, n_iters=1000)

        # Evaluate the trained model
        ls.test_model()

Above is Lightscline’s core Python workflow, requiring just four calls: loading data, reducing it by 90%, training the model, and testing the results.

Lightscline’s work adds to a broader movement questioning whether Nyquist–Shannon is still the final word on sampling. The company demonstrates that it is possible to move beyond Nyquist–Shannon constraints and achieve high accuracy with less data, a key point in sensor data analysis for applications ranging from industrial machinery and wearable devices to underwater exploration and satellite operations.

Even Meta has investigated similar undersampling approaches with its Byte Latent Transformer (BLT), a framework that processes information at the raw byte level and allocates compute only where data complexity requires it. “Even in language models, there’s this general notion of downsampling, adaptive sampling (like this [Meta] paper is showing), and sparsifying the network using low-rank matrices,” Verma said. Meta’s and Lightscline’s work underscores a larger trend toward compressing or undersampling data without undermining performance. “Nyquist defines how fast you sample,” Verma notes, “but we’re redefining what we save and discard. In many ways, less really can be more.”
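The idea behind BLT can be sketched with a simple heuristic: cut the byte stream into patches wherever local entropy spikes, so that compute concentrates on hard-to-predict regions. The sketch below uses a fixed window and threshold for illustration; the real BLT relies on a learned entropy model:

        import math
        from collections import Counter

        def local_entropy(window: bytes) -> float:
            """Shannon entropy (bits per byte) of a small byte window."""
            counts = Counter(window)
            n = len(window)
            return -sum(c / n * math.log2(c / n) for c in counts.values())

        def patch_boundaries(stream: bytes, win=8, threshold=2.5):
            """Start a new patch where local entropy exceeds a threshold,
            a crude stand-in for BLT's learned entropy model."""
            cuts = [0]
            for i in range(win, len(stream), win):
                if local_entropy(stream[i - win:i]) > threshold:
                    cuts.append(i)
            return cuts

        data = b"a" * 16 + bytes(range(64)) + b"b" * 16
        print(patch_boundaries(data))   # cuts cluster in the high-entropy middle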
