Data mining the stars: The virtualized telescope that transformed astronomy

In the 1990s, astrophysicist Dr. Alex Szalay and computer scientist Dr. Jim Gray had a brainstorm: What if a database could be turned into a virtual telescope that could then be data-mined? Open access to such data could revolutionize the field of astronomy.

In time, the idea became the Sloan Digital Sky Survey (SDSS), an international collaboration of hundreds of scientists at dozens of institutions.

The goal was to index the sky using a dedicated 2.5-meter telescope at Apache Point Observatory in New Mexico. Equipped with a 120-megapixel camera, the telescope would image more than one-quarter of the night sky, 1.5 square degrees at a time. The project used Microsoft SQL Server as the back-end database.
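To give a rough sense of the scale those figures imply, the arithmetic can be sketched in a few lines of Python. This is a back-of-the-envelope estimate, not a description of how the survey was actually scheduled: it assumes each 1.5-square-degree field is a separate, non-overlapping footprint and ignores repeat visits. The function name `min_pointings` is made up for this illustration.

```python
import math

# Area of the whole celestial sphere in square degrees:
# 4*pi steradians times (180/pi)^2 square degrees per steradian.
FULL_SKY_SQ_DEG = 4 * math.pi * (180 / math.pi) ** 2

def min_pointings(sky_fraction: float, field_sq_deg: float) -> int:
    """Lower bound on the number of fields needed to tile a fraction of the sky.

    Assumes perfectly non-overlapping fields and no repeat visits, which is
    a simplification made only for this estimate.
    """
    return math.ceil(sky_fraction * FULL_SKY_SQ_DEG / field_sq_deg)

quarter_sky = 0.25 * FULL_SKY_SQ_DEG
print(round(quarter_sky))        # 10313 square degrees in a quarter of the sky
print(min_pointings(0.25, 1.5))  # 6876 fields at 1.5 square degrees each
```

Under these assumptions, covering even one-quarter of the sky takes close to 7,000 separate fields, and the survey's stated goal was more than that.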

From 1998 to 2009, the telescope operated in both imaging and spectroscopic modes. SDSS retired the imaging camera in 2009, but the telescope continues to observe in spectroscopic mode. The data is openly available through the SkyServer database, an online portal. Today, the database holds a 15TB queryable public dataset and about 150TB of additional raw and calibrated files.
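The "virtual telescope" idea comes down to asking a database for a patch of sky instead of pointing an instrument at it. The sketch below illustrates that idea with a toy in-memory SQLite catalog. Everything in it is an assumption made for illustration: the table name `objects`, the column names, the four rows, and the helper `objects_in_box` are invented, and none of it reflects the actual SkyServer schema, SQL Server, or real SDSS measurements.

```python
import sqlite3

# Toy in-memory catalog. The table, columns, and rows are hypothetical
# stand-ins, not the SkyServer schema or real survey data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objects ("
    "obj_id INTEGER PRIMARY KEY, ra_deg REAL, dec_deg REAL, r_mag REAL)"
)
conn.executemany(
    "INSERT INTO objects VALUES (?, ?, ?, ?)",
    [
        (1, 150.10, 2.20, 17.4),
        (2, 150.35, 2.05, 21.9),
        (3, 185.00, 15.80, 16.2),
        (4, 150.48, 2.41, 19.0),
    ],
)

def objects_in_box(ra_min, ra_max, dec_min, dec_max, max_mag):
    """Return catalog rows inside a rectangular patch of sky.

    Keeps only objects at least as bright as max_mag (a smaller magnitude
    means a brighter object). Does not handle a box that wraps across
    right ascension 0/360 degrees.
    """
    query = """
        SELECT obj_id, ra_deg, dec_deg, r_mag
        FROM objects
        WHERE ra_deg BETWEEN ? AND ?
          AND dec_deg BETWEEN ? AND ?
          AND r_mag <= ?
        ORDER BY r_mag
    """
    params = (ra_min, ra_max, dec_min, dec_max, max_mag)
    return conn.execute(query, params).fetchall()

# "Point" the virtual telescope at a half-degree box and list bright objects.
rows = objects_in_box(150.0, 150.5, 2.0, 2.5, 20.0)
for row in rows:
    print(row)
```

With the made-up rows above, the query returns objects 1 and 4: object 2 falls inside the box but is too faint, and object 3 lies in a different part of the sky.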

