filter: Prefer "output sequences" over "output"? #1607

Open
victorlin opened this issue Aug 27, 2024 · 2 comments · May be fixed by #1622
Labels
proposal Proposals that warrant further discussion

Comments

@victorlin
Member

victorlin commented Aug 27, 2024

augur filter allows --output, --output-sequences, and -o to be used interchangeably:

output_group.add_argument('--output', '--output-sequences', '-o', help="filtered sequences in FASTA format")

Because --output is listed first, argparse derives the default dest from it, so the value must be referenced internally as args.output.
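To illustrate the point about dest, here is a minimal standalone sketch. The parser name `filter-sketch`, the group title, and the file name are made up for the example; this is not Augur's actual parser setup, only the same `add_argument` call in isolation.

```python
import argparse

# Hypothetical stand-in for the augur filter parser (names are assumptions).
parser = argparse.ArgumentParser(prog="filter-sketch")
output_group = parser.add_argument_group("outputs")

# Same option order as in the issue: '--output' is the first long option,
# so argparse uses it to derive the default dest ('output').
output_group.add_argument('--output', '--output-sequences', '-o',
                          help="filtered sequences in FASTA format")

# Any of the three spellings fills the same attribute.
args = parser.parse_args(['--output-sequences', 'filtered.fasta'])

print(args.output)
# No attribute is created for the other long option name.
print(hasattr(args, 'output_sequences'))
```

Whichever flag the user types, the value lands on `args.output`, which is why the internal name follows the first long option rather than the more specific one.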

"output" is ambiguous since this is just one of many output options. I would prefer the more specific name to align with other options and subcommands.

Two layers to this proposal:

  1. Prefer "output sequences" over "output" internally.

    • Use dest='output_sequences' and args.output_sequences.
  2. Prefer "output sequences" over "output" for users.

    • Reorder the options to '--output-sequences', '--output', '-o' so that the preferred name is displayed first. This would remove the need for an explicit dest.
    • A bigger change would be deprecating the --output/-o flags and removing them in a major release, but maybe that's not necessary and would just be extra churn.
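A rough sketch of the two layers, assuming plain argparse behavior. The helper `build_parser` and the file names are hypothetical and exist only for this example; the real change would be made in the augur filter argument registration.

```python
import argparse

def build_parser(option_strings, **kwargs):
    """Hypothetical helper: build a tiny parser with one output option."""
    parser = argparse.ArgumentParser(prog="filter-sketch")
    group = parser.add_argument_group("outputs")
    action = group.add_argument(*option_strings,
                                help="filtered sequences in FASTA format",
                                **kwargs)
    return parser, action

# Layer 1: keep the current option order, but set dest explicitly.
explicit_parser, explicit_action = build_parser(
    ['--output', '--output-sequences', '-o'], dest='output_sequences')

# Layer 2: list the preferred name first; dest is then derived from it,
# so no explicit dest is needed.
reordered_parser, reordered_action = build_parser(
    ['--output-sequences', '--output', '-o'])

# The old spellings keep working in both variants.
args_explicit = explicit_parser.parse_args(['-o', 'a.fasta'])
args_reordered = reordered_parser.parse_args(['--output', 'b.fasta'])

print(args_explicit.output_sequences)
print(args_reordered.output_sequences)
print(reordered_action.option_strings[0])
```

Both variants expose `args.output_sequences`; only the reordered one also changes which name is shown first in the help output.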
@victorlin victorlin added the proposal Proposals that warrant further discussion label Aug 27, 2024
@huddlej
Contributor

huddlej commented Aug 29, 2024

Thanks for documenting this so clearly, @victorlin. I'm definitely in favor of preferring --output-sequences for users and eventually deprecating --output.

@victorlin victorlin linked a pull request Sep 4, 2024 that will close this issue
@victorlin
Member Author

It might be worth considering doing the same in augur index. Current usage:

usage: augur index [-h] --sequences SEQUENCES --output OUTPUT [--verbose]

Count occurrence of bases in a set of sequences.

options:
  -h, --help            show this help message and exit
  --sequences SEQUENCES, -s SEQUENCES
                        sequences in FASTA or VCF formats. Augur will summarize the content of FASTA sequences and only report the names of strains found in a given VCF. (default: None)
  --output OUTPUT, -o OUTPUT
                        tab-delimited file containing the number of bases per sequence in the given file. Output columns include strain, length, and counts for A, C, G, T, N, other valid IUPAC characters, ambiguous characters ('?' and '-'), and other invalid
                        characters. (default: None)
  --verbose, -v         print index statistics to stdout (default: False)

There, --output-sequences needs to be added first.
