p3-generate-close-roles
Find Roles That Occur Close Together
p3-generate-close-roles.pl [options] <roles.tbl >pairs.tbl
This script is part of a pipeline to compute functionally-coupled roles. It takes a file of locations and roles, then outputs a file of pairs of roles with the number of times features containing those two roles occur close together on the chromosome. Such roles typically have related functions in a genome.
The input file must contain the following four fields.
1
genome ID
2
contig (sequence) ID
3
location in the sequence
4
functional role
By default, the script assumes the four columns are in that order. Each column position can be overridden with command-line options.
The input file must be sorted by genome ID and then by sequence ID within genome ID. Otherwise, the results will be incorrect. Use p3-sort to sort the file.
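Because unsorted input silently produces wrong results, it can be worth checking a file before running the script. The following is a minimal sketch, not part of the P3 tools: the function name `first_unsorted_row` is hypothetical, and it assumes that what matters is contiguity (all rows for a genome are adjacent, and within a genome all rows for a sequence are adjacent) rather than any particular collation order.

```python
def first_unsorted_row(rows, genome_col=0, sequence_col=1):
    """Return the 1-based number of the first row that reopens an earlier
    genome or (genome, sequence) group, or None if the rows are grouped.

    Hypothetical helper; column indexes here are 0-based Python indexes.
    """
    seen_genomes, seen_sequences = set(), set()
    prev_genome = prev_key = None
    for number, row in enumerate(rows, start=1):
        genome = row[genome_col]
        key = (genome, row[sequence_col])
        if genome != prev_genome:
            if genome in seen_genomes:
                return number  # this genome already appeared in an earlier block
            seen_genomes.add(genome)
        if key != prev_key:
            if key in seen_sequences:
                return number  # this sequence already appeared in an earlier block
            seen_sequences.add(key)
        prev_genome, prev_key = genome, key
    return None
```

Rows can be supplied by splitting each input line on tabs; header lines, if present, should be skipped first.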
The location is a PATRIC location string, either of the form start..end or complement(left..right).
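To make the two location forms concrete, here is a minimal parsing sketch. It is an illustration only, not the script's own parser: the name `parse_location` is hypothetical, and it handles just the two forms described above, rejecting anything else.

```python
import re

# Assumed grammar: "start..end" for the forward strand,
# "complement(left..right)" for the reverse strand.
_LOCATION = re.compile(r"^(complement\()?(\d+)\.\.(\d+)(\))?$")


def parse_location(text):
    """Return (left, right, strand) for a simple location string.

    Hypothetical helper; raises ValueError for any other form.
    """
    match = _LOCATION.match(text.strip())
    # The opening "complement(" and the closing ")" must appear together.
    if match is None or bool(match.group(1)) != bool(match.group(4)):
        raise ValueError(f"unrecognized location: {text!r}")
    first, second = int(match.group(2)), int(match.group(3))
    strand = "-" if match.group(1) else "+"
    return min(first, second), max(first, second), strand
```

For example, `parse_location("complement(300..450)")` returns `(300, 450, "-")`.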
Given a set of genome IDs in the file genomes.tbl, you can generate the proper file using the following pipe.
p3-get-genome-features --attr sequence_id --attr location --attr product <genomes.tbl | p3-function-to-role
(If PATRIC does not yet have roles defined, you will need to use an additional command-line option on p3-function-to-role.)
Parameters
There are no positional parameters.
The standard input can be overridden using the options in Input Options.
Additional command-line options are
genome
The index (1-based) or name of the column containing the genome ID. The default is 1.
sequence
The index (1-based) or name of the column containing the sequence ID. The default is 2.
location
The index (1-based) or name of the column containing the location string. The default is 3.
role
The index (1-based) or name of the column containing the role description. The default is 4.
maxGap
The maximum space between two features considered close. The default is 2000.
minOcc
The minimum number of occurrences for a pair to be considered significant. The default is 4.
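The way maxGap and minOcc interact can be sketched as follows. This is an illustration under stated assumptions, not the script's actual algorithm: the function name `close_role_pairs` is hypothetical, the gap is assumed to be measured from the right edge of one feature to the left edge of the next, pairs are treated as unordered, and a pair of features with the same role is ignored.

```python
from collections import Counter
from itertools import groupby


def close_role_pairs(features, max_gap=2000, min_occ=4):
    """Count role pairs whose features lie within max_gap of each other.

    features: iterable of (genome, sequence, left, right, role) tuples,
    already grouped by genome and sequence (as the sorted input would be).
    Returns {(role_a, role_b): count}, keeping only pairs seen at least
    min_occ times. Hypothetical helper for illustration.
    """
    counts = Counter()
    # groupby only merges adjacent rows, which is why the input must be sorted.
    for _, group in groupby(features, key=lambda f: (f[0], f[1])):
        ordered = sorted(group, key=lambda f: f[2])  # order by left edge
        for i, (_, _, _, right_a, role_a) in enumerate(ordered):
            for _, _, left_b, _, role_b in ordered[i + 1:]:
                if left_b - right_a > max_gap:
                    break  # every later feature starts even further away
                if role_a != role_b:
                    counts[tuple(sorted((role_a, role_b)))] += 1
    return {pair: n for pair, n in counts.items() if n >= min_occ}
```

In this sketch, raising max_gap admits more distant neighbors and so increases the counts, while raising min_occ discards pairs that co-occur only rarely.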