The NDN for Data Intensive Science Experiments (N-DISE) project aims to accelerate the pace of breakthroughs and innovations in data-intensive science fields such as the Large Hadron Collider (LHC) high energy physics program and the BioGenome and human genome projects. Based on Named Data Networking (NDN), a data-centric future Internet architecture, N-DISE will deploy and commission a highly efficient and field-tested petascale data distribution, caching, access and analysis system serving major science programs.
The N-DISE project will design and develop high-throughput caching and forwarding methods, containerization techniques, hierarchical memory management subsystems, and congestion control mechanisms, integrated with Field Programmable Gate Array (FPGA) acceleration subsystems, to produce a system capable of delivering LHC and genomic data over a wide area network at throughputs approaching 100 Gbits per second, while dramatically decreasing download times. In addition, N-DISE will utilize NDN’s built-in data security support to ensure data integrity and provenance tracing. N-DISE will leverage existing infrastructure and build an enhanced testbed with four additional high performance NDN data cache servers at participating institutions.
N-DISE will provide a field-tested working prototype of a multi-domain data distribution and access system offering fast access and low cost, as well as data integrity and provenance, to many data-intensive science and engineering fields. The project plans to hold annual workshops and hackathons to train students, postdocs, and other researchers on NDN architectural design, algorithms, and implementation methodologies for specific data-intensive science environments. The N-DISE participating institutions will incorporate concepts from NDN and big data ecosystems in networking, computer science, and data-intensive science courses. The project will undertake initiatives to actively involve under-represented groups and to provide educational outreach to K-12 students.
N-DISE will maintain a repository on GitHub, accessible via the URL https://github.com/neu-yehlab/n-dise. The repository will hold up-to-date publications, code, data, results, and simulators. The repository will be maintained by the team for at least three years beyond the duration of the project.