
Document the GTF attributes #1300

Open
@brianraymor


Context

Required Gene Annotations should include more details about which GTF attributes are used by CELLxGENE.

It appears that the most descriptive reference for Ensembl GTF is a README for a GTF download such as https://ftp.ensembl.org/pub/release-113/gtf/drosophila_melanogaster/README:

Attributes

The following attributes are available. All attributes are semi-colon
separated pairs of keys and values.

  • gene_id: The stable identifier for the gene
  • gene_version: The stable identifier version for the gene
  • gene_name: The official symbol of this gene
  • gene_source: The annotation source for this gene
  • gene_biotype: The biotype of this gene
  • transcript_id: The stable identifier for this transcript
  • transcript_version: The stable identifier version for this transcript
  • transcript_name: The symbol for this transcript derived from the gene name
  • transcript_source: The annotation source for this transcript
  • transcript_biotype: The biotype for this transcript
  • exon_id: The stable identifier for this exon
  • exon_version: The stable identifier version for this exon
  • exon_number: Position of this exon in the transcript
  • ccds_id: CCDS identifier linked to this transcript
  • protein_id: Stable identifier for this transcript's protein
  • protein_version: Stable identifier version for this transcript's protein
  • tag: A collection of additional key value tags
  • transcript_support_level: Ranking to assess how well a transcript is supported (from 1 to 5)

Otherwise, there's the GENCODE page "Format description of GENCODE GTF".

Also see the history.
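
To make the attribute list above concrete, here is a minimal Python sketch of how the attributes column of a GTF record could be split into keys and values. It assumes the usual Ensembl layout of `key "value";` pairs and keeps every value in a list, since keys such as `tag` can appear more than once. The function name `parse_gtf_attributes` and the sample identifier are hypothetical and are not taken from the CELLxGENE codebase.

```python
import re
from collections import defaultdict

# One `key "value"` pair (or an unquoted `key value` pair), terminated by a
# semicolon or by the end of the string.
_ATTR_RE = re.compile(r'\s*(\S+)\s+(?:"([^"]*)"|([^";\s]+))\s*(?:;|$)')


def parse_gtf_attributes(column):
    """Parse a GTF attributes column into {key: [value, ...]}.

    Values are collected in lists because some keys (for example `tag`)
    may be repeated within a single record.
    """
    attrs = defaultdict(list)
    for match in _ATTR_RE.finditer(column):
        key, quoted, bare = match.groups()
        attrs[key].append(quoted if quoted is not None else bare)
    return dict(attrs)


# Hypothetical attributes string; the identifier and tags are made up.
example = (
    'gene_id "FBgn0000001"; gene_version "1"; gene_name "example"; '
    'gene_biotype "protein_coding"; tag "a"; tag "b";'
)
parsed = parse_gtf_attributes(example)
print(parsed["gene_id"], parsed["gene_biotype"], parsed["tag"])
```

A documentation update could then state, per attribute, whether it is read (for instance `gene_id`, `gene_name`, `gene_biotype`) or ignored; which ones CELLxGENE actually uses is the open question of this issue and is not asserted here.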

Labels

drafting (drafting schema requirements), editorial, schema (CELLxGENE Discover dataset schema)
