May 21, 2022

ACM Software System Award recognizes the Jupyter project team

By Linda Vu

The Jupyter project team received a Software System Award from the Association for Computing Machinery (ACM) for developing a tool that has had a lasting influence on computing. The Jupyter project evolved from IPython, an effort started by Fernando Pérez, an assistant professor of statistics at UC Berkeley and a researcher in the Usable Software Systems Group in the Computational Research Division of Lawrence Berkeley National Laboratory (Berkeley Lab).

The award and a $35,000 prize will be presented to the team at the ACM Awards banquet in San Francisco on June 23, 2018.

The Jupyter project is an open international collaboration that develops tools for interactive computing: a process of human-machine interaction for scientific exploration and data analysis. The collaboration develops applications such as the ever-popular Jupyter Notebook, an open-source web application that allows users to create and share documents containing live code, equations, visualizations, and narrative text.

Today, over 2 million Jupyter notebooks are hosted on the popular GitHub service, ranging from technical documentation to course materials, books, and academic publications. Jupyter has transformed scientific collaboration and reproducibility, as evidenced by its use at the LIGO Observatory, whose discovery of gravitational waves was recognized with the 2017 Nobel Prize in Physics. The LIGO Open Science Center publishes Jupyter Notebooks that allow anyone to reproduce the original analyses. Jupyter Notebooks also serve as the basic infrastructure for research efforts like the Department of Energy (DOE) funded KBase platform for predictive biology, the GenePattern Notebook project of the Broad Institute and UC San Diego, and the European Union-funded OpenDreamKit project, which builds virtual research environments for mathematics.

JupyterHub supports the deployment of Jupyter tools in multi-user environments, from small research groups to universities, enterprises, and other organizations. JupyterHub is used in many commercial enterprises, for research at facilities such as CERN, and at high-performance computing centers like the DOE's National Energy Research Scientific Computing Center (NERSC) and the San Diego Supercomputer Center (SDSC).

“The flexibility of the Jupyter architecture facilitates deployment in a variety of scenarios: while individual users can run the tools on a laptop or personal workstation, the same tools can be deployed on remote resources,” explains Shane Canon, a project engineer at NERSC. “In fact, NERSC offers Jupyter as an interactive tool for remote access to its high-performance computing resources.”

At UC Berkeley, two new courses, Foundations of Data Science and Principles and Techniques of Data Science, will be supported by Jupyter Notebooks deployed in the cloud and integrated with campus authentication. The classes are offered as part of UC Berkeley’s new Data Science major. Pérez will teach the upper-division course, Principles and Techniques of Data Science.

In industry, Jupyter Notebook is widely used as an everyday tool for computation and data analysis, and large companies have created hosted services based on Jupyter. Google’s Cloud Datalab, Microsoft Notebooks on Azure, and IBM’s Data Science Experience all offer Jupyter Notebooks on their respective cloud infrastructure.

In education, at least 45 different courses use Jupyter Notebooks to teach a wide variety of subjects, including computer science, aerodynamics, numerical methods, statistics, computational physics, cognitive science, and data science. These courses have been deployed at leading universities in the United States and abroad, including UC Berkeley, Cal Poly, MIT, Harvard, Columbia, and Imperial College.

As a physics graduate student at the University of Colorado in the early 2000s, Pérez recalls using a mishmash of software systems to combine code, equations, visualizations, and text in his scientific computing papers. This inspired him to create a unified environment for scientific computing. He found researchers around the world who had all independently started building scientific computing tools in Python and combined these disparate efforts into an open-source platform called IPython, with the “I” standing for interactive. The program was free, and anyone could inspect its code, modify it, and make the result available under liberal licensing terms.

Over the years, IPython evolved to meet the needs of various communities, and in 2014 the project rebranded itself “Jupyter” to recognize the fact that it was no longer just for Python. In 2015, Pérez and Brian Granger of California Polytechnic State University, San Luis Obispo (Cal Poly) received $6 million from the Leona M. and Harry B. Helmsley Charitable Trust, the Alfred P. Sloan Foundation, and the Gordon and Betty Moore Foundation to expand and improve the capabilities of the Jupyter Notebook.

Since then, Pérez and Granger have secured additional funding from other sources like the DOE and industry partners like Google, Microsoft and Anaconda Inc. Companies such as Bloomberg, IBM, Microsoft, Netflix, Rackspace and Anaconda also support the project, either with services or with the time of engineers who actively contribute to the development of Jupyter. The next-generation user interface for Jupyter Notebook, known as JupyterLab, is currently being developed in open collaboration with team members and engineers at Bloomberg and Anaconda.

“One afternoon in late 2001, I was a physics graduate student at the University of Colorado working on my thesis and decided to spend an afternoon writing the original, tiny version of IPython,” explains Pérez. “I could not have imagined that it would become a global platform almost two decades later. For me, it was a mad rush, made possible by going from a personal exploration to an open collaboration with an incredible team.”

“This is a project that has demonstrated 20 years of intellectual contributions with major impact in research, education and industry, and it continues to make its advances available to the world as an open platform,” said Kathy Yelick, Berkeley Lab’s associate laboratory director for computing sciences. “The ACM Software System Award is an incredible honor, and this team fully deserves this recognition.”

In addition to Pérez, other members of the Jupyter Project collaboration include Brian E. Granger and Carol Willing (Cal Poly San Luis Obispo), Matthias Bussonnier (UC Berkeley BIDS), Paul Ivanov and Jason Grout (Bloomberg), Thomas Kluyver (European XFEL), Damián Avila (Anaconda, Inc.), Steven Silvester (JP Morgan Chase), Jonathan Frederic (Google), Kyle Kelley (Netflix), Jessica Hamrick (DeepMind), Sylvain Corlay (QuantStack), and Peter Parente (Valassis Digital).

NERSC is a DOE Office of Science user facility.

###

Lawrence Berkeley National Laboratory tackles the world’s most pressing scientific challenges by advancing sustainable energy, protecting human health, creating new materials, and revealing the origin and fate of the universe. Founded in 1931, Berkeley Lab’s scientific expertise has been recognized with 13 Nobel Prizes. The University of California manages Berkeley Lab for the US Department of Energy’s Office of Science. To learn more, visit www.lbl.gov.

The DOE’s Office of Science is the largest supporter of basic research in the physical sciences in the United States and works to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.