Releases: pachterlab/gget

v0.29.1 - mutate and cosmic overhaul

21 Apr 23:19
0af4d0c
  • gget mutate:
    • gget mutate has been simplified: it now takes as input a list of mutations and the associated reference genome with its annotation information, and returns the sequences with each mutation incorporated plus a short region of surrounding context. For the full functionality of the previous version, and for how it fits into a novel variant screening pipeline, visit the varseek repository being developed by members of the gget team at https://github.com/pachterlab/varseek.git.
    • Added additional information to returned data frames as described here: #169
  • gget cosmic:
    • Major restructuring of the gget cosmic module to adhere to new login requirements set by COSMIC
    • New arguments email and password were added so the user can pass their login credentials directly, without being prompted for input during data download
    • Default changed: gget_mutate=False
    • Deprecated argument: entity
    • Argument mutation_class is now cosmic_project
  • gget bgee:
    • type="orthologs" is now the default, removing the need to specify the type argument when querying orthologs
    • Allow querying multiple genes at once.
  • gget diamond:
    • Now supports translated alignment of nucleotide sequences to amino acid reference sequences using the --translated flag.
  • gget elm:
    • Improved server error handling.
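
As a rough illustration of the simplified gget mutate behavior described above (a mutation incorporated into the reference, returned with a short region of surrounding context), here is a minimal pure-Python sketch. It does not call gget and is not its implementation; the function name `substitute_with_context`, its parameters, and the toy sequence are all hypothetical, and only single-base substitutions are handled.

```python
def substitute_with_context(sequence, position, ref, alt, flank):
    """Apply a single-base substitution and return it with flanking context.

    Hypothetical helper for illustration only (not part of gget).
    `position` is 1-based; `flank` is the number of bases kept on each side.
    """
    index = position - 1
    if sequence[index] != ref:
        raise ValueError(
            f"expected {ref} at position {position}, found {sequence[index]}"
        )
    # Clamp the window so it never runs past either end of the sequence.
    start = max(0, index - flank)
    end = min(len(sequence), index + flank + 1)
    return sequence[start:index] + alt + sequence[index + 1:end]


reference = "ATGCGTACCTGA"  # toy reference sequence (assumption)
window = substitute_with_context(reference, position=5, ref="G", alt="A", flank=3)
print(window)  # TGCATAC
```

The reference-base check is a design choice in this sketch: failing early when the annotation and the sequence disagree is usually safer than silently writing a wrong mutant sequence.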

v0.29.0 - cbio, opentargets, bgee and more

26 Sep 02:07
5589dee
  • New modules:
  • gget enrichr now also supports species other than human and mouse (fly, yeast, worm, and fish) via modEnrichR
  • gget mutate:
    gget mutate will now merge identical sequences in the final file by default. Mutation creation was vectorized to decrease runtime. Improved flanking sequence check for non-substitution mutations to make sure no wildtype kmer is retained in the mutation-containing sequence. Addition of several new arguments to customize sequence generation and output.
  • gget cosmic:
    Added support for targeted as well as gene screens. The CSV file created for gget mutate now also contains protein mutation info.
  • gget ref:
    Added out file option.
  • gget info and gget seq:
    Switched to Ensembl POST API to increase speed (nothing changes in front end).
  • Other "behind the scenes" changes:

fixes #157
fixes #121
fixes #144
fixes #140
fixes #103
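
The note above that gget mutate now merges identical sequences in the final file can be pictured with the following sketch. It is an assumption-laden illustration, not gget's code: the function name `merge_identical`, the `(header, sequence)` record format, and the `;` header separator are all hypothetical.

```python
from collections import defaultdict


def merge_identical(records, separator=";"):
    """Collapse records that share a sequence into one entry.

    Hypothetical helper for illustration only (not part of gget).
    `records` is a list of (header, sequence) tuples; headers of identical
    sequences are joined with `separator`, in order of first appearance.
    """
    headers_by_sequence = defaultdict(list)
    for header, sequence in records:
        headers_by_sequence[sequence].append(header)
    return [
        (separator.join(headers), sequence)
        for sequence, headers in headers_by_sequence.items()
    ]


records = [
    ("mut1", "TGCATAC"),
    ("mut2", "GGGTTTA"),
    ("mut3", "TGCATAC"),  # same sequence as mut1
]
merged = merge_identical(records)
print(merged)  # [('mut1;mut3', 'TGCATAC'), ('mut2', 'GGGTTTA')]
```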

v0.28.6 - gget mutate, download_cosmic, fixes for Ensembl v112

03 Jun 06:05
4664916
  • New module: gget mutate
  • gget cosmic: You can now download entire COSMIC databases using the download_cosmic argument
  • gget ref: Can now fetch the GRCh37 genome assembly using species='human_grch37'
  • gget search: Adjust access of human data to the structure of Ensembl release 112 (fixes issue 129)

v0.28.4 - Fix Windows bug in gget elm setup

01 Feb 01:02
061fbdd

Fix Windows bug in gget elm setup

v0.28.3 - cosmic, invertebrates for ref and search, elm improvements

22 Jan 21:42
5cad2f3
  • gget search and gget ref now also support fungi 🍄, protists 🌝, and invertebrate metazoa 🐝 🐜 🐌 🐙 (in addition to vertebrates and plants)
  • New module: gget cosmic
  • gget enrichr: Fix duplicate scatter dots in plot when pathway names are duplicated
  • gget elm:
    • Changed ortho results column name 'Ortholog_UniProt_ID' to 'Ortholog_UniProt_Acc' to correctly reflect the column contents, which are UniProt Accessions. 'UniProt ID' was changed to 'UniProt Acc' in the documentation for all gget modules.
    • Changed ortho results column name 'motif_in_query' to 'motif_inside_subject_query_overlap'.
    • Added interaction domain information to results (new columns: "InteractionDomainId", "InteractionDomainDescription", "InteractionDomainName").
    • The regex string for regular expression matches was encapsulated as follows: "(?=(regex))" (instead of directly passing the regex string "regex") to enable capturing all occurrences of a motif when the motif length is variable and there are repeats in the sequence (https://regex101.com/r/HUWLlZ/1).
  • gget setup: Use the out argument to specify a directory the ELM database will be downloaded into. Completes this feature request.
  • gget diamond: The DIAMOND command is now run with --ignore-warnings flag, allowing niche sequences such as amino acid sequences that only contain nucleotide characters and repeated sequences. This is also true for DIAMOND alignments performed within gget elm.
  • gget ref and gget search back-end change: the current Ensembl release is fetched from the new release file on the Ensembl FTP site to avoid errors during uploads of new releases.
  • gget search:
    • FTP link results (--ftp) are saved in txt file format instead of json.
    • Fix URL links to Ensembl gene summary for species with a subspecies name and invertebrates.
  • gget ref:
    • Back-end changes to increase speed
    • New argument: list_iv_species to list all available invertebrate species (can be combined with the release argument to fetch all species available from a specific Ensembl release)
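
The gget elm change above, wrapping the regex as "(?=(regex))", relies on a zero-width lookahead so that overlapping occurrences of a variable-length motif are all reported. The sketch below shows the effect with Python's standard `re` module; the motif `A{2,3}` and the toy sequence are made up for illustration and are not ELM patterns.

```python
import re

sequence = "AAAA"   # toy sequence with a repeat (assumption)
motif = "A{2,3}"    # hypothetical variable-length motif

# Plain matching consumes characters, so overlapping hits are skipped.
plain = [(m.start(), m.group()) for m in re.finditer(motif, sequence)]

# Wrapping the motif in a lookahead makes each match zero-width, so the
# scan advances one position at a time; group 1 holds the matched text.
wrapped = [
    (m.start(), m.group(1))
    for m in re.finditer(f"(?=({motif}))", sequence)
]

print(plain)    # [(0, 'AAA')]
print(wrapped)  # [(0, 'AAA'), (1, 'AAA'), (2, 'AA')]
```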

v0.28.2 - NCBI server issues and gget elm expand

16 Nov 21:31
a5477e4
  • gget info: Return a logging error message when the NCBI server fails for a reason other than a fetch failure (this is an error on the server side rather than an error with gget)
  • Replace deprecated 'text' argument to find()-type methods whenever used with dependency BeautifulSoup
  • gget elm: Remove false positive and true negative instances from returned results
  • gget elm: Add expand argument

v0.28.0 - gget elm + gget diamond

12 Nov 20:48
32c7b10
  • Updated documentation of gget muscle to add a tutorial on how to visualize sequences with differing sequence name lengths, plus a slight change to the returned visualization to make it more robust to varying sequence names
  • gget muscle now also allows a list of sequences as input (as an alternative to providing the path to a FASTA file)
  • Allow missing gene filter for gget cellxgene (fixes bug)
  • gget seq: Allow missing gene names (fixes #107)
  • New arguments for gget enrichr: Use arguments kegg_out and kegg_rank to create an image of the KEGG pathway with the genes from the enrichment analysis highlighted (thanks to this PR by Noriaki Sato)
  • New modules: gget elm and gget diamond

Co-authored-by: @anhchi172

v0.27.9 - gget enrichr background genes

07 Aug 18:30
9f476af
  • gget enrichr background genes
  • expand gget search results to include synonym hits

Resolves #90 , resolves #9

Co-authored-by: @anhchi172

v0.27.8 - Fixed bug in gget pdb; add release argument to gget search

12 Jul 21:30
da3e566
  • Fixed bug in gget pdb
  • Added new release argument to gget search

Also see: https://pachterlab.github.io/gget/updates.html

Co-contributor: @anhchi172

v0.27.7 - Cleaned up requirements; gget alphafold compatibility with Python>=3.10

16 May 00:43
8e19699

Moved dependencies for modules gget gpt and gget cellxgene from automatically installed requirements to gget setup.
Updated gget alphafold dependencies for compatibility with Python >= 3.10.
Added census_version argument to gget cellxgene.