12 May 2016 Mining the SDSS SkyServer SQL queries log
Vitor Makiyama Hirota, Rafael Santos, Jordan Raddick, Ani Thakar
Abstract
SkyServer, the Internet portal for the Sloan Digital Sky Survey (SDSS) astronomical catalog, provides a set of tools that give astronomers and science educators access to the data. One of SkyServer's data access interfaces allows users to enter ad hoc SQL statements to query the catalog. SkyServer also presents template queries that can be used as a basis for more complex queries. This interface has logged over 330 million queries submitted since 2001. Analysis of this log is expected to help investigate usage patterns, identify potential new classes of queries, and find similar queries. It should also shed light on how users interact with the SDSS data and how scientists have adopted the new paradigm of e-Science, which could in turn lead to improvements in the user interfaces and in the user experience in general. In this paper we review some approaches to SQL query mining, apply the traditional techniques used in the literature, and present lessons learned: the general text mining approach to feature extraction and clustering does not seem adequate for this type of data and, most importantly, this type of analysis can result in very different queries being clustered together.
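The concern about very different queries being clustered together can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a plain bag-of-tokens representation and cosine similarity as a stand-in for the "general text mining approach", and the two SQL strings, the tokenizer, and all function names are hypothetical choices made for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(sql):
    """Lowercase a SQL string and keep only identifier and number tokens.

    Operators such as > and < are discarded, as a naive text tokenizer
    would do (an assumption of this sketch).
    """
    return re.findall(r"[a-z_][a-z0-9_]*|\d+(?:\.\d+)?", sql.lower())


def bag_of_tokens(sql):
    """Represent a query as token counts (a bag-of-words feature vector)."""
    return Counter(tokenize(sql))


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two illustrative queries (not taken from the SkyServer log). Their WHERE
# clauses select essentially complementary regions of the sky.
q1 = "SELECT objID, ra, dec FROM PhotoObj WHERE ra > 180 AND dec < 0"
q2 = "SELECT objID, ra, dec FROM PhotoObj WHERE ra < 180 OR dec > 0"

similarity = cosine_similarity(bag_of_tokens(q1), bag_of_tokens(q2))
print(f"cosine similarity: {similarity:.4f}")
```

Under these assumptions the two queries differ in a single token (`and` versus `or`), so their feature vectors are nearly identical even though the queries return almost disjoint sets of rows. A distance-based clustering step would therefore tend to group them together.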
© (2016) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Vitor Makiyama Hirota, Rafael Santos, Jordan Raddick, and Ani Thakar "Mining the SDSS SkyServer SQL queries log", Proc. SPIE 9851, Next-Generation Analyst IV, 98510S (12 May 2016); https://doi.org/10.1117/12.2224237
KEYWORDS
Databases, Mining, Feature extraction, Neodymium, Distance measurement, Galactic astronomy, Visualization
