Issue | A&A, Volume 649, May 2021
---|---
Article Number | A53
Number of page(s) | 24
Section | Extragalactic astronomy
DOI | https://doi.org/10.1051/0004-6361/202040046
Published online | 11 May 2021
Unsupervised classification of SDSS galaxy spectra⋆
1 Univ. Grenoble Alpes, CNRS, IPAG, Grenoble, France
e-mail: [email protected]
2 Université Côte d’Azur, Inria, CNRS, LJAD, Maasai, Nice, France
e-mail: [email protected]
3 Université de Toulouse, CNRS, CNES, UPS, 14 avenue Edouard Belin, 31400 Toulouse, France
e-mail: [email protected]
Received: 2 December 2020
Accepted: 16 February 2021
Context. Defining templates of galaxy spectra is useful to quickly characterise new observations and organise databases from surveys. These templates are usually built from a pre-defined classification based on other criteria.
Aims. We present an unsupervised classification of 702 248 spectra of galaxies and quasars with redshifts smaller than 0.25 that were retrieved from the Sloan Digital Sky Survey (SDSS) database, release 7.
Methods. The spectra were first corrected for redshift, then wavelet-filtered to reduce the noise, and finally binned to obtain about 1437 wavelengths per spectrum. The unsupervised clustering algorithm Fisher-EM, which relies on a discriminative latent mixture model, was applied to these corrected spectra. The full set and several subsets of 100 000 and 300 000 spectra were analysed.
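As an illustration of the first and last preprocessing steps named above, the following minimal Python sketch shifts observed wavelengths to the rest frame and averages the flux on a coarser wavelength grid. It is a sketch under stated assumptions, not the authors' pipeline: the function names `to_rest_frame` and `bin_spectrum`, the bin edges, and the toy values are hypothetical, and the wavelet filtering step and the Fisher-EM clustering are not shown.

```python
import bisect


def to_rest_frame(wavelengths, z):
    """Shift observed wavelengths to the rest frame: lambda_rest = lambda_obs / (1 + z)."""
    return [w / (1.0 + z) for w in wavelengths]


def bin_spectrum(wavelengths, fluxes, edges):
    """Average the flux in each half-open bin [edges[i], edges[i+1]).

    `edges` must be sorted in increasing order. Samples falling outside
    the grid are ignored; bins that receive no sample are returned as None.
    """
    n_bins = len(edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for w, f in zip(wavelengths, fluxes):
        # Index of the bin whose lower edge is the last edge <= w.
        idx = bisect.bisect_right(edges, w) - 1
        if 0 <= idx < n_bins:
            sums[idx] += f
            counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Toy spectrum (hypothetical values) observed at redshift z = 0.125.
observed = [4500.0, 4950.0, 5400.0, 5850.0]
flux = [1.0, 3.0, 5.0, 7.0]
rest = to_rest_frame(observed, 0.125)
binned = bin_spectrum(rest, flux, [4000.0, 4500.0, 5000.0, 5500.0])
```

Placing every spectrum on one common rest-frame grid is what allows all spectra to be compared wavelength by wavelength, with each bin acting as one variable for the clustering.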
Results. A penalised likelihood criterion gives an optimum of 86 classes, of which the 37 most populated gather 99% of the sample. These classes are established from a subset of 302 214 spectra. Using several cross-validation techniques, we find that this classification agrees with the results obtained on the other subsets, with an average misclassification error of about 15%. The large number of very small classes tends to increase this error rate. In this paper, we make an initial, brief comparison of our classes with literature templates.
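To make the notion of a misclassification error between two classifications concrete, here is a minimal Python sketch of one common way to compare two partitions of the same spectra when the class labels themselves are arbitrary. The abstract does not state how the 15% figure was computed, so this is an assumption for illustration only: the function name `misclassification_rate` and the majority-vote label mapping are hypothetical choices, not the authors' cross-validation procedure.

```python
from collections import Counter, defaultdict


def misclassification_rate(reference, other):
    """Fraction of objects that disagree after mapping each class of `other`
    to the class of `reference` it overlaps most (majority-vote mapping)."""
    if len(reference) != len(other):
        raise ValueError("both labelings must cover the same objects")
    if not reference:
        raise ValueError("labelings must not be empty")
    # For every class of `other`, count how its members are labelled in `reference`.
    votes = defaultdict(Counter)
    for ref_label, other_label in zip(reference, other):
        votes[other_label][ref_label] += 1
    # Objects carrying the majority reference label of their class count as agreeing.
    matched = sum(counter.most_common(1)[0][1] for counter in votes.values())
    return 1.0 - matched / len(reference)


# Toy labelings (hypothetical): six spectra classified in two separate runs.
run_a = [0, 0, 0, 1, 1, 2]
run_b = [5, 5, 7, 7, 7, 9]
rate = misclassification_rate(run_a, run_b)
```

With this mapping, a class of one run that is split across several classes of the other is penalised only for its minority members. Very small classes can therefore shift the rate noticeably, since a few objects decide their mapping.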
Conclusions. This is the first time that an automatic, objective and robust unsupervised classification is established on such a large number of galaxy spectra. The mean spectra of the classes can be used as templates for a large majority of galaxies in our Universe.
Key words: methods: data analysis / methods: statistical / galaxies: statistics / galaxies: general / techniques: spectroscopic
The mean spectra of the 86 classes and the class of each of the 702 248 spectra are only available at the CDS via anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/cat/J/A+A/649/A53
© D. Fraix-Burnet et al. 2021
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.