This function, part of the Scanpy preprocessing module, computes quality control (QC) metrics for single-cell data. The metrics include the number of genes detected per cell, the total number of transcripts (counts) per cell, and the percentage of counts that map to mitochondrial genes. For example, a cell in which only a small number of genes is detected may be of poor quality and a candidate for removal from subsequent analysis.
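To make the three metrics concrete, here is a minimal sketch that computes them on a toy count matrix. It is not Scanpy's implementation: the helper `qc_metrics`, the gene names, the count values, and the `MT-` prefix used to flag mitochondrial genes are all assumptions made for illustration, and plain Python lists stand in for the matrix types Scanpy works with so that the example has no dependencies.

```python
# Toy count matrix: rows are cells, columns are genes (hypothetical values).
gene_names = ["ACTB", "GAPDH", "MT-CO1", "MT-ND1", "CD3E"]
counts = [
    [10, 5, 2, 1, 4],  # cell_0: all five genes detected
    [0, 1, 6, 3, 0],   # cell_1: few genes, mostly mitochondrial counts
    [0, 0, 0, 0, 0],   # cell_2: no counts at all
]


def qc_metrics(counts, gene_names, mito_prefix="MT-"):
    """Compute per-cell QC metrics from a cells-by-genes count matrix.

    Hypothetical helper; the "MT-" prefix is an assumed naming convention
    for mitochondrial genes.
    """
    is_mito = [name.startswith(mito_prefix) for name in gene_names]
    metrics = []
    for row in counts:
        total = sum(row)                        # total counts in the cell
        n_genes = sum(1 for c in row if c > 0)  # genes with at least one count
        mito = sum(c for c, m in zip(row, is_mito) if m)
        # Guard against division by zero for cells with no counts.
        pct_mito = 100.0 * mito / total if total > 0 else 0.0
        metrics.append(
            {"n_genes": n_genes, "total_counts": total, "pct_mito": pct_mito}
        )
    return metrics


cell_metrics = qc_metrics(counts, gene_names)
for i, m in enumerate(cell_metrics):
    print(f"cell_{i}: {m}")
```

In this toy data, the second cell has only three detected genes and most of its counts come from the two mitochondrial genes, which is the kind of pattern that would flag a cell for closer inspection.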
These metrics are used to identify and filter out low-quality cells and genes, a necessary step before downstream analyses such as clustering, differential expression, and trajectory inference. Retaining low-quality data can introduce bias and lead to inaccurate biological interpretations. Historically, these metrics were often inspected and thresholded by hand; this function automates the calculation and provides a structured basis for quality control.
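The filtering step that follows metric calculation can be sketched as below. This is an illustration only: the metric values, the cell names, the helper `passes_qc`, and both thresholds are hypothetical, and appropriate cutoffs depend on the tissue, protocol, and dataset.

```python
# Hypothetical per-cell metrics, in the shape a QC step might produce.
cell_metrics = {
    "cell_0": {"n_genes": 2100, "total_counts": 8500, "pct_mito": 4.2},
    "cell_1": {"n_genes": 150, "total_counts": 400, "pct_mito": 3.0},
    "cell_2": {"n_genes": 1800, "total_counts": 6000, "pct_mito": 35.0},
}

# Illustrative thresholds only; these are assumptions, not recommended values.
MIN_GENES = 200
MAX_PCT_MITO = 20.0


def passes_qc(m, min_genes=MIN_GENES, max_pct_mito=MAX_PCT_MITO):
    """Return True if a cell has enough detected genes and a low enough
    mitochondrial fraction to be kept."""
    return m["n_genes"] >= min_genes and m["pct_mito"] <= max_pct_mito


kept = [name for name, m in cell_metrics.items() if passes_qc(m)]
removed = [name for name in cell_metrics if name not in kept]

print("kept:", kept)
print("removed:", removed)
```

Keeping the thresholds as named constants makes the filtering decision explicit and repeatable, in contrast to ad hoc manual inspection.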