📡 Survey Methods — A Primer

Why this topic? Survey papers are the map before the expedition — they tell you what's already been explored, where the gaps are, and which paths are worth taking. If you're starting research in WiFi sensing or crowd monitoring, reading (and eventually writing) a good survey is one of the most valuable skills you can develop.

Background

Imagine you've just joined a lab studying how WiFi signals can detect people moving through a building. You search for papers and find hundreds of results — some from 2015, some from last month, covering everything from gesture recognition to building evacuation modeling. Where do you even begin?

This is exactly the problem that survey papers (also called review papers or literature reviews) solve. A survey is a structured synthesis of existing research: instead of presenting new experimental results, the authors read dozens or hundreds of papers in a field, organise them into a coherent taxonomy (a classification scheme), identify common methods and datasets, and point out what remains unsolved. Think of it like a Wikipedia article written by domain experts, with citations to back every claim.

In fast-moving fields like wireless sensing, surveys serve a critical orienting function. A PhD student in 2020 reading ma2020_4782 would immediately understand that CSI (Channel State Information, the fine-grained per-frequency amplitude and phase measurements extracted from WiFi packets) was the dominant signal modality, and that activity recognition and indoor localization were the dominant application targets. Without that survey, they might spend months rediscovering the same landscape.

Survey methodology matters because a poorly conducted survey can mislead readers rather than orient them. Researchers have developed systematic approaches, including structured search protocols, inclusion/exclusion criteria, and reproducibility checklists, to ensure that a survey's conclusions are trustworthy and replicable. As you read the papers below, pay attention not just to what they cover, but to how they organise and evaluate the literature they review.

Key Methods

Surveys in this domain typically combine several methodological approaches:

Systematic literature review (SLR): A formal protocol where search terms, databases, and inclusion criteria are declared upfront, making the search reproducible. Papers are screened by title, abstract, and full text in stages.
Taxonomy construction: Grouping papers by dimensions such as application domain, signal modality, machine learning method, or hardware platform. For example, wang2026_2758 organises the field along a four-stage sensing pipeline.
Benchmark and reproducibility auditing: Checking whether experimental claims can be verified — e.g., is the dataset public? Is code available? guarino2026_e72c formalises this as a four-condition checklist.
Cross-domain synthesis: Surveying a technique (e.g., Transfer Learning) as it is applied across multiple problem types, and identifying the contexts in which it generalises well.
PDF information extraction (for automated survey construction): Using rule-based or deep learning tools to automatically harvest metadata, tables, and citations from academic PDFs — a prerequisite for large-scale literature analysis, addressed by meuschke2023_e726.
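
The staged screening used in a systematic literature review can be sketched in a few lines of Python. This is a minimal illustration, not the protocol of any paper cited here: the Paper record, the keyword lists, and the year window are all hypothetical choices made for the example. Only the title and abstract stages are automated, since full-text screening normally needs a human reader.

```python
from dataclasses import dataclass

# Protocol declared up front (all values are illustrative assumptions).
INCLUDE_TERMS = ("csi", "channel state information")
EXCLUDE_TERMS = ("camera",)
YEAR_RANGE = (2015, 2026)


@dataclass
class Paper:
    key: str
    year: int
    title: str
    abstract: str


def passes_title_stage(p: Paper) -> bool:
    """Stage 1: publication year in range and no excluded term in the title."""
    in_range = YEAR_RANGE[0] <= p.year <= YEAR_RANGE[1]
    return in_range and not any(t in p.title.lower() for t in EXCLUDE_TERMS)


def passes_abstract_stage(p: Paper) -> bool:
    """Stage 2: at least one inclusion term appears in the title or abstract."""
    text = f"{p.title} {p.abstract}".lower()
    return any(t in text for t in INCLUDE_TERMS)


def screen(papers):
    """Run both stages in order and log how many papers survive each one."""
    log = {"retrieved": len(papers)}
    stage1 = [p for p in papers if passes_title_stage(p)]
    log["after_title"] = len(stage1)
    stage2 = [p for p in stage1 if passes_abstract_stage(p)]
    log["after_abstract"] = len(stage2)
    return stage2, log


# Made-up candidate records, purely to exercise the pipeline.
candidates = [
    Paper("a", 2019, "WiFi CSI activity recognition", "We use CSI amplitude."),
    Paper("b", 2012, "Early RSSI localisation", "Channel state information is discussed."),
    Paper("c", 2021, "Camera-based crowd counting", "A CNN counts people."),
    Paper("d", 2022, "Occupancy sensing with radar", "No WiFi involved."),
]
kept, log = screen(candidates)
print([p.key for p in kept], log)
```

Keeping the criteria as module-level constants and returning the per-stage counts is what makes the search repeatable: another reviewer can rerun the same filter and compare the numbers at each stage.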

Active Research in the Vault

WiFi CSI sensing surveys form the largest cluster. ma2020_4782 is the foundational reference, providing a comprehensive taxonomy of CSI-based sensing applications up to 2020. Building on this, wang2026_2758 extends the scope to over 200 papers (2015–2026), specifically addressing the critical but underexplored question of whether systems trained in one environment work in another — a property called generalizability. chen2023_5cbd narrows this further to cross-domain transfer, cataloguing the impact factors that cause performance to degrade when users, locations, or hardware change. ahmad2024_8639 surveys the deep learning methods — CNNs, LSTMs, and transformer architectures — applied to WiFi human sensing, noting the lack of standardised open datasets as a persistent barrier. koo2026_a08d takes an energy-focused lens, surveying how lightweight models can reduce the GPU dependency that makes current CSI systems impractical for edge deployment. yang2023_a34a goes a step further by not just surveying but benchmarking: it re-implements and compares multiple models on shared datasets, providing an empirical basis for claims that are often made only theoretically.

Reproducibility and tooling surveys address a growing crisis in the field. guarino2026_e72c finds that the majority of published CSI sensing papers cannot be reproduced due to missing datasets, undescribed algorithms, or absent code — a stark finding for students who may assume published results are verifiable. meuschke2023_e726 surveys the tools used to automatically extract structured information from academic PDFs, which underpins any large-scale automated literature review pipeline.
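
A reproducibility audit of this kind can be recorded as a simple per-paper checklist. The sketch below is an assumption-laden illustration: its three checks mirror the three failure causes named above (missing datasets, undescribed algorithms, absent code) and are not the four-condition checklist from guarino2026_e72c, which is not reproduced here. The field names and the sample entries are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class AuditRecord:
    """Illustrative audit fields; not the checklist from any cited paper."""
    dataset_public: bool
    algorithm_described: bool
    code_available: bool


def failed_checks(record: AuditRecord) -> list[str]:
    """Return the names of every check the paper fails."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


def reproducible_share(audits: dict[str, AuditRecord]) -> float:
    """Fraction of audited papers that pass every check."""
    if not audits:
        return 0.0
    passing = sum(1 for r in audits.values() if not failed_checks(r))
    return passing / len(audits)


# Made-up entries, purely to exercise the functions.
audits = {
    "paper_x": AuditRecord(True, True, True),
    "paper_y": AuditRecord(False, True, False),
    "paper_z": AuditRecord(True, False, True),
    "paper_w": AuditRecord(False, False, False),
}
for key, record in audits.items():
    print(key, failed_checks(record) or "passes all checks")
print(f"reproducible share: {reproducible_share(audits):.2f}")
```

Recording which check failed, rather than a single pass/fail flag, lets a survey report why papers are not reproducible as well as how many.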

Crowd monitoring and safety surveys connect WiFi sensing to broader application contexts. bendalibraham2021_476e surveys computer-vision-based crowd analysis methods, covering pedestrian detection, group behaviour recognition, and surveillance systems. haghani2023_5c35 takes a safety-science perspective, adapting the Swiss Cheese Model (a framework borrowed from aviation safety) to show that crowd disasters result from failures lining up across several layers of defence rather than from a single cause. darsena2023_50b7 surveys sensor modalities (cameras, WiFi, ticketing systems) specifically in public transport contexts. zhong2022_7cb2 surveys data-driven simulation approaches, while yang2020_e295 covers the full spectrum of crowd models, from microscopic (individual agent) to macroscopic (fluid dynamics).

Radio-frequency sensing surveys broaden beyond WiFi. shahbazian2023_1172 surveys the full RF landscape — including radar, UWB, and cellular signals — for occupancy and activity detection, situating WiFi CSI as one option among several. soumya2023_9b9f surveys millimetre-wave radar — a modality with finer range resolution than WiFi but requiring dedicated hardware. zhou2022_09d5 and yang2026_6c4f survey the signal processing foundations underpinning all RF sensing systems.

IoT and smart building surveys provide the deployment context for sensing systems. ray2018_a47e is a foundational reference surveying IoT hardware platforms and communication standards. chaudhari2024_6efc and khan2024_43e8 specifically survey occupancy sensing — estimating how many people are in a space — which is the core application motivating much of the WiFi CSI crowd counting literature. chaudhari2026_85b1 takes an economic angle, surveying how CSI sensing infrastructure affects property valuation. gubbi2013_fab5 and zanella2014_d33c provide essential historical context for understanding how IoT frameworks evolved to support large-scale urban sensing. fang2026_0f1d surveys the growing security landscape around IoT deployments.

Indoor spatial modelling surveys address how the physical environment is represented. kang2017_8400 surveys the OGC IndoorGML standard for encoding indoor maps, which is used to contextualise sensor placements. diakit2020_1b54 and sarmiento2020_8095 extend this to IoT sensor integration. ficara2024_f89b surveys MAC address randomisation — a WiFi privacy mechanism that directly disrupts device-counting approaches to crowd monitoring.
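
To see why MAC address randomisation disrupts device counting, it helps to look at how such addresses can be recognised. Randomised addresses are generally drawn from the locally administered range, which is marked by the 0x02 bit of the first octet. The sketch below is a rough illustration under that assumption; the probe log is invented, and the function names are hypothetical rather than taken from ficara2024_f89b.

```python
def is_locally_administered(mac: str) -> bool:
    """True when the U/L bit (0x02 of the first octet) is set."""
    first_octet = int(mac.replace("-", ":").split(":")[0], 16)
    return bool(first_octet & 0x02)


def count_devices(observed_macs):
    """Split unique MACs into globally unique and locally administered."""
    unique = {m.lower() for m in observed_macs}
    local = {m for m in unique if is_locally_administered(m)}
    return {"global": len(unique - local), "local": len(local)}


# Hypothetical probe-request log: the three locally administered addresses
# could all come from one phone rotating its address.
log = [
    "3c:5a:b4:12:34:56",
    "3C:5A:B4:12:34:56",  # same address, different letter case
    "da:a1:19:00:00:01",
    "fe:12:9c:00:00:02",
    "06:4f:77:00:00:03",
]
print(count_devices(log))
```

A naive count of unique addresses treats every locally administered address as a separate person, so a single device that rotates its address inflates the crowd estimate. Flagging those addresses does not recover the true count, but it shows how much of the log is unreliable.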

Knowledge graph and AI-infrastructure surveys are relevant to the research knowledge base layer of this project. zhong2024_0185 surveys how structured knowledge graphs are built from unstructured text — directly applicable to building a research knowledge base from papers. fan2024_8c1b, peng2026_4bdc, and klesel2025_0638 survey Retrieval-Augmented Generation (RAG) — a technique where a language model's responses are grounded by retrieving relevant documents from an external knowledge base, reducing hallucination. atagong2025_0071 surveys the full pipeline from PDF ingestion to structured knowledge storage. alharthi2026_b1a0 surveys recommender systems in smart city contexts, bridging sensing infrastructure to citizen-facing applications.
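
The retrieval step of RAG can be illustrated with a toy example. This sketch swaps in plain bag-of-words cosine similarity for the dense embeddings that real systems typically use, and it stops at building the grounded prompt instead of calling a language model. The corpus keys, passages, and function names are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, corpus: dict[str, str], k: int = 2) -> list[str]:
    """Return the keys of the k passages most similar to the query."""
    q = tokenize(query)
    ranked = sorted(
        corpus, key=lambda key: cosine(q, tokenize(corpus[key])), reverse=True
    )
    return ranked[:k]


def build_prompt(query: str, corpus: dict[str, str], k: int = 2) -> str:
    """Assemble a grounded prompt; sending it to a model is left out."""
    context = "\n".join(
        f"[{key}] {corpus[key]}" for key in retrieve(query, corpus, k)
    )
    return f"Answer using only the sources below.\n{context}\nQuestion: {query}"


# Toy knowledge base; the passages paraphrase this primer, not the papers.
corpus = {
    "note_csi": "CSI is per-frequency amplitude and phase measured from WiFi packets.",
    "note_mac": "MAC address randomisation disrupts device counting for crowd monitoring.",
    "note_rag": "Retrieval grounds a language model in documents from a knowledge base.",
}
print(build_prompt("What does CSI measure in WiFi packets?", corpus, k=1))
```

Tagging each retrieved passage with its key is what allows the final answer to cite its sources, which is the property that makes RAG attractive for a research knowledge base.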

Physics and data assimilation surveys round out the methodology landscape. ghorbani2023_c065 surveys how real sensor data can be fused with agent-based crowd simulations to improve fidelity. di2023_285b surveys hybrid physics + neural network models for traffic, with direct methodological parallels to crowd flow estimation. chen2018_894a surveys the foundational social force model — a physics-inspired model where pedestrians are treated as particles subject to attractive and repulsive forces. zhang2026_956c surveys wireless sensing modalities in the context of metaverse applications, including non-contact human body digitisation. zhang2022_5822 and ropitault2024_d49d survey the standardisation landscape for CSI sensing, with the 802.11bf amendment representing the field's move toward interoperable, vendor-neutral implementations. wu2022_75d3 surveys the physical propagation models — specifically the Fresnel zone model that describes which spatial regions a WiFi link is most sensitive to — that underpin model-based (rather than purely data-driven) sensing approaches.
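
As a small worked example of the Fresnel zone idea, the radius of the n-th zone boundary at a point along the link can be estimated with the standard approximation r_n = sqrt(n * wavelength * d1 * d2 / (d1 + d2)), which holds when the radius is small compared with the distances d1 and d2 to the transmitter and receiver. The 5 GHz carrier and the 4 m link below are assumed values chosen for illustration, not parameters from wu2022_75d3.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def fresnel_radius(n: int, freq_hz: float, d1: float, d2: float) -> float:
    """Approximate radius (m) of the n-th Fresnel zone boundary at a point
    d1 metres from the transmitter and d2 metres from the receiver."""
    wavelength = SPEED_OF_LIGHT / freq_hz
    return math.sqrt(n * wavelength * d1 * d2 / (d1 + d2))


# Assumed setup: 5 GHz WiFi link, 4 m long, evaluated at the midpoint.
for n in (1, 2, 3):
    radius = fresnel_radius(n, 5.0e9, 2.0, 2.0)
    print(f"zone {n}: radius {radius:.3f} m")
```

Because the radius grows only with the square root of n, successive zones become thinner further from the direct path, which is one way to see why a link is most sensitive to movement close to the line between transmitter and receiver.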

Open Problems & Gaps

How do you write a reproducible survey in a field where most primary papers are not reproducible? guarino2026_e72c finds that most CSI papers fail basic reproducibility checks, so any survey of this literature must decide how to assess and compare results that cannot be independently verified. What methodological standards should such a survey impose?
Can automated knowledge graph construction replace manual literature review? zhong2024_0185 and meuschke2023_e726 suggest the tools exist, but accuracy remains domain-dependent — what would a validated, automated survey pipeline for the WiFi sensing domain look like, and how would its coverage and correctness be evaluated?
How should a survey handle rapidly moving targets? wang2026_2758 covers 2015–2026 with 200+ papers, but the field is evolving monthly; can living survey formats (continuously updated, versioned documents) replace static publications, and what review processes would ensure their quality?
Where is the survey connecting WiFi CSI crowd counting to real-world deployment at scale? Most surveys focus on controlled lab experiments; a survey synthesising evidence from real deployments — stadiums, transit hubs, public events — and honestly characterising the gap between lab and field performance would have immediate practical value and is conspicuously absent from the current literature.
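
One possible way to evaluate the coverage and correctness of an automated survey pipeline, offered here as an assumption and not as a method from the cited papers, is to compare its output against a hand-curated reference set: recall measures coverage and precision measures correctness. The paper keys below are placeholders.

```python
def coverage_and_correctness(
    auto_keys: set[str], manual_keys: set[str]
) -> dict[str, float]:
    """Compare an automated pipeline's paper set with a hand-curated one."""
    hits = auto_keys & manual_keys
    recall = len(hits) / len(manual_keys) if manual_keys else 0.0
    precision = len(hits) / len(auto_keys) if auto_keys else 0.0
    return {"coverage_recall": recall, "correctness_precision": precision}


# Placeholder keys, not real citations.
manual = {"p1", "p2", "p3", "p4"}
automated = {"p2", "p3", "p4", "p9", "p10"}
print(coverage_and_correctness(automated, manual))
```

The comparison is only as good as the reference set, so the manual list would itself need a documented search protocol before these numbers mean much.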

History

Date	Δ Papers	Action	Notes
2026-05-12	—	create	Initial primer.