The ZMap Project is a collection of open-source tools for performing large-scale studies of hosts and services on the Internet. The project was started in 2013 with the release of ZMap, a fast single-packet network scanner that enabled scanning the entire public IPv4 address space on a single port in under 45 minutes. A year later, we released ZGrab, a Go application-layer scanner that works in tandem with ZMap. Since then, the team has expanded, and we have built nearly a dozen open-source tools and libraries for performing large-scale Internet measurements. Continued development is supported by the National Science Foundation (NSF). The core team can be reached at team@zmap.io.
We have published several papers that describe how the suite of ZMap tools is architected:
ZMap: Fast Internet-wide Scanning and Its Security Applications
Zakir Durumeric, Eric Wustrow, and J. Alex Halderman
22nd USENIX Security Symposium, August 2013
Internet-wide network scanning has numerous security applications, including exposing new vulnerabilities and tracking the adoption of defensive mechanisms, but probing the entire public address space with existing tools is both difficult and slow. We introduce ZMap, a modular, open-source network scanner specifically architected to perform Internet-wide scans and capable of surveying the entire IPv4 address space in under 45 minutes from user space on a single machine, approaching the theoretical maximum speed of gigabit Ethernet. We present the scanner architecture, experimentally characterize its performance and accuracy, and explore the security implications of high speed Internet-scale network surveys, both offensive and defensive. We also discuss best practices for good Internet citizenship when performing Internet-wide surveys, informed by our own experiences conducting a long-term research survey over the past year.
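To make the address-generation idea above concrete, here is a minimal Python sketch (not ZMap's actual implementation) of iterating a cyclic multiplicative group so the whole space is covered exactly once, in pseudorandom order, with constant scanner state. ZMap works modulo p = 2^32 + 15, the smallest prime above the IPv4 space; the demo below uses a small prime so full coverage is easy to verify.

```python
# Minimal sketch of ZMap-style address iteration: walk a cyclic
# multiplicative group modulo a prime p so that every address is
# visited exactly once, in pseudorandom order, with O(1) scanner state.
# ZMap itself uses p = 2^32 + 15; the demo uses p = 257 for clarity.

def iterate_addresses(p, g, start, space_size):
    """Yield every integer in [0, space_size) exactly once.

    Assumes p is prime, g is a primitive root mod p, and 1 <= start < p.
    Group elements are 1..p-1; element x maps to address x - 1, and
    elements that fall outside the target space are silently skipped.
    """
    x = start
    while True:
        if x - 1 < space_size:
            yield x - 1
        x = (x * g) % p
        if x == start:  # back where we began: full cycle completed
            return

# Demo: 3 is a primitive root mod 257, so the walk covers all of 1..256.
# For real scans, ZMap picks a fresh random generator and start per scan.
addrs = list(iterate_addresses(p=257, g=3, start=42, space_size=256))
assert sorted(addrs) == list(range(256))  # every "address" hit exactly once
print(addrs[:8])
```

The appeal of this construction is that the entire iteration state is a single group element, so the scanner can cover the address space at line rate without tracking which addresses remain.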
Zippier ZMap: Internet-Wide Scanning at 10 Gbps
David Adrian, Zakir Durumeric, Gulshan Singh, and J. Alex Halderman
USENIX Workshop on Offensive Technologies (WOOT), August 2014
We introduce optimizations to the ZMap network scanner that achieve a 10-fold increase in maximum scan rate. By parallelizing address generation, introducing an improved blacklisting algorithm, and using zero-copy NIC access, we drive ZMap to nearly the maximum throughput of 10 gigabit Ethernet, almost 15 million probes per second. With these changes, ZMap can comprehensively scan for a single TCP port across the entire public IPv4 address space in 4.5 minutes given adequate upstream bandwidth. We consider the implications of such rapid scanning for both defenders and attackers, and we briefly discuss a range of potential applications.
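The parallelized address generation mentioned above can be sketched by sharding the same cyclic walk: split the cycle into contiguous arcs and give each thread or machine one arc, so shards partition the address space with no coordination and no shared state. The arc-splitting below is a hedged illustration of that idea, reusing the small demo group from the previous sketch, rather than ZMap's exact code:

```python
# Sketch of sharded address generation: split the cycle generated by g
# into n contiguous arcs. The arcs partition the cycle, so n shards
# jointly cover every address exactly once, each one independently.

def shard(p, g, space_size, i, n):
    """Walk the i-th of n contiguous arcs of the cycle <g> mod p."""
    lo = i * (p - 1) // n        # first exponent owned by this shard
    hi = (i + 1) * (p - 1) // n  # one past the last exponent it owns
    x = pow(g, lo, p)            # jump directly to the arc's start
    for _ in range(hi - lo):
        if x - 1 < space_size:
            yield x - 1
        x = (x * g) % p

# Three shards over the demo group together cover the full space.
covered = sorted(a for i in range(3) for a in shard(257, 3, 256, i, 3))
assert covered == list(range(256))
```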
Ten Years of ZMap
Zakir Durumeric, David Adrian, Phillip Stephens, Eric Wustrow, and J. Alex Halderman
ACM Internet Measurement Conference (IMC), November 2024
Since ZMap’s debut in 2013, networking and security researchers have used the open-source scanner to write hundreds of research papers that study Internet behavior. In addition, ZMap powers much of the attack-surface management and security ratings industries, and more than a dozen security companies have built products on top of ZMap. Behind the scenes, much of ZMap’s behavior—ranging from its pseudorandom IP generation to its packet construction—has quietly evolved as we have learned more about how to scan the Internet. In this work, we quantify ZMap’s adoption over the ten years since its release, describe its modern behavior (and the measurements that motivated those changes), and offer lessons from releasing and maintaining ZMap.
A Search Engine Backed by Internet-Wide Scanning
Zakir Durumeric, David Adrian, Ariana Mirian, Michael Bailey, and J. Alex Halderman
22nd ACM Conference on Computer and Communications Security (CCS), October 2015
Fast Internet-wide scanning has opened new avenues for security research, ranging from uncovering widespread vulnerabilities in random number generators to tracking the evolving impact of Heartbleed. However, this technique still requires significant effort: even simple questions, such as, "What models of embedded devices prefer CBC ciphers?", require developing an application scanner, manually identifying and tagging devices, negotiating with network administrators, and responding to abuse complaints. In this paper, we introduce Censys, a public search engine and data processing facility backed by data collected from ongoing Internet-wide scans. Designed to help researchers answer security-related questions, Censys supports full-text searches on protocol banners and querying a wide range of derived fields (e.g., 443.https.cipher). It can identify specific vulnerable devices and networks and generate statistical reports on broad usage patterns and trends. Censys returns these results in sub-second time, dramatically reducing the effort of understanding the hosts that comprise the Internet. We present the search engine architecture and experimentally evaluate its performance. We also explore Censys's applications and show how recent questions become simple to answer.
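As a rough illustration of the derived-field query syntax the abstract mentions, the sketch below posts a search to the v1 HTTP API that accompanied Censys around the paper's publication. The endpoint, request-body shape, auth scheme, and response keys are all assumptions about that era's interface (the current Censys API differs), so treat it as illustrative only:

```python
# Hedged sketch of a derived-field query against the historical Censys
# v1 HTTP API. Endpoint, body shape, auth, and response keys below are
# assumptions from that era; verify against current Censys documentation.
import requests

API_ID, API_SECRET = "YOUR_API_ID", "YOUR_API_SECRET"  # account credentials

resp = requests.post(
    "https://censys.io/api/v1/search/ipv4",  # assumed v1 endpoint
    auth=(API_ID, API_SECRET),               # HTTP basic auth
    json={
        "query": "443.https.cipher: *RC4*",  # derived-field syntax from the abstract
        "fields": ["ip", "443.https.cipher"],
        "page": 1,
    },
    timeout=30,
)
resp.raise_for_status()
for hit in resp.json().get("results", []):
    print(hit.get("ip"), hit.get("443.https.cipher"))
```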
ZDNS: A Fast DNS Toolkit for Internet Measurement
Liz Izhikevich, Gautam Akiwate, Briana Berger, Spencer Drakontaidis, Anna Ascheman, Paul Pearce, David Adrian, and Zakir Durumeric
ACM Internet Measurement Conference (IMC), October 2022
Active DNS measurement is fundamental to understanding and improving the DNS ecosystem. However, the absence of an extensible, high-performance, and easy-to-use DNS toolkit has limited both the reproducibility and coverage of DNS research. In this paper, we introduce ZDNS, a modular and open-source active DNS measurement framework optimized for large-scale research studies of DNS on the public Internet. We describe ZDNS’s architecture, evaluate its performance, and present two case studies that highlight how the tool can be used to shed light on the operational complexities of DNS. We hope that ZDNS will enable researchers to better—and in a more reproducible manner—understand Internet behavior.
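In practice, ZDNS is driven as a command-line tool that reads names on stdin and writes one JSON result per line. Here is a minimal sketch of batching lookups through it from Python, assuming the zdns binary is on PATH and the A-lookup invocation shown in the project's README (exact output keys may vary across versions):

```python
# Minimal sketch: batch A-record lookups through the zdns CLI.
# Assumes zdns is on PATH and follows its documented interface:
# names on stdin, one JSON result per line on stdout.
import json
import subprocess

names = ["example.com", "example.org"]
proc = subprocess.run(
    ["zdns", "A"],           # query type given as the subcommand
    input="\n".join(names),  # one name per line on stdin
    capture_output=True,
    text=True,
    check=True,
)
for line in proc.stdout.splitlines():
    result = json.loads(line)
    # Exact keys ("name", "status") are assumptions to verify
    # against your zdns version.
    print(result.get("name"), result.get("status"))
```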
LZR: Identifying Unexpected Internet Services
Liz Izhikevich, Renata Teixeira, and Zakir Durumeric
30th USENIX Security Symposium, August 2021
Internet-wide scanning is a commonly used research technique that has helped uncover real-world attacks, find cryptographic weaknesses, and understand both operator and miscreant behavior. Studies that employ scanning have largely assumed that services are hosted on their IANA-assigned ports, overlooking the study of services on unusual ports. In this work, we investigate where Internet services are deployed in practice and evaluate the security posture of services on unexpected ports. We show protocol deployment is more diffuse than previously believed and that protocols run on many additional ports beyond their primary IANA-assigned port. For example, only 3% of HTTP and 6% of TLS services run on ports 80 and 443, respectively. Services on non-standard ports are more likely to be insecure, which results in studies dramatically underestimating the security posture of Internet hosts. Building on our observations, we introduce LZR (“Laser”), a system that identifies 99% of identifiable unexpected services in five handshakes and dramatically reduces the time needed to perform application-layer scans on ports with few responsive expected services (e.g., 5500% speedup on 27017/MongoDB). We conclude with recommendations for future studies.
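To make the handshake-budget idea concrete, here is a toy, LZR-flavored identifier in Python: wait first for server-initiated banners, then spend one more connection on an HTTP probe, stopping at the first recognizable response. The probe bytes and match rules are illustrative stand-ins, not LZR's actual fingerprint tables or handshake-selection policy:

```python
# Toy LZR-flavored service identification with a two-handshake budget.
import socket

def identify(host, port, timeout=3.0):
    # Handshake 1: connect and listen. Server-first protocols
    # (SSH, SMTP, FTP, ...) send a banner without any client bytes.
    with socket.create_connection((host, port), timeout=timeout) as s:
        s.settimeout(timeout)
        try:
            banner = s.recv(256)
            if banner.startswith(b"SSH-"):
                return "ssh"
            if banner[:3].isdigit():  # "220 ..." style SMTP/FTP greeting
                return "smtp/ftp-like"
        except socket.timeout:
            pass  # silent server: spend another handshake on a probe
    # Handshake 2: reconnect and send a minimal HTTP request.
    with socket.create_connection((host, port), timeout=timeout) as s:
        s.settimeout(timeout)
        s.sendall(b"GET / HTTP/1.1\r\nHost: scan\r\n\r\n")
        try:
            reply = s.recv(256)
        except socket.timeout:
            return "unidentified"
        return "http" if reply.startswith(b"HTTP/") else "unidentified"

# Example usage (hypothetical target): print(identify("192.0.2.1", 8080))
```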
Tracking Certificate Misissuance in the Wild
Deepak Kumar, Zhengping Wang, Matthew Hyder, Joseph Dickinson, Gabrielle Beck, David Adrian, Joshua Mason, Zakir Durumeric, J. Alex Halderman, and Michael Bailey
IEEE Symposium on Security and Privacy ("Oakland"), May 2018
Security Challenges in an Increasingly Tangled Web
Deepak Kumar, Zane Ma, Zakir Durumeric, Ariana Mirian, Joshua Mason, J. Alex Halderman, and Michael Bailey
26th International World Wide Web Conference (WWW), April 2017
Over the past 20 years, websites have grown increasingly complex and interconnected. In 2016, only a negligible number of sites were dependency free, and over 90% of sites relied on external content. In this paper, we investigate the current state of web dependencies and explore two security challenges associated with the increasing reliance on external services: (1) the expanded attack surface associated with serving unknown, implicitly trusted third-party content, and (2) how the increased set of external dependencies impacts HTTPS adoption. We hope that by shedding light on these issues, we can encourage developers to consider the security risks associated with serving third-party content and prompt service providers to more widely deploy HTTPS.