showalign

Wiki

The master copies of EMBOSS documentation are available on the EMBOSS Wiki.

Please help by correcting and extending the Wiki pages.

Function

Display a multiple sequence alignment in pretty format

Description

showalign reads a set of aligned protein or nucleic acid sequences and writes them to a file (or the screen) in a style suitable for publication. Similarities and differences of each sequence to a reference sequence are highlighted for specified types of matches. The reference sequence can be the calculated consensus sequence (the default) or one of the input set (specified by name or by the ordinal number of that sequence in the file). The output sequences can be displayed in the input order (the default), sorted by their similarity to the reference sequence, or sorted alphabetically by name. Many other options control the content and format of the output.

Usage

Here is a sample session with showalign

% showalign
Display a multiple sequence alignment in pretty format
Input (aligned) sequence set: globins.msf
Output file [globins.showalign]:

Example 2

Display the sequences in order of similarity to the reference sequence

% showalign -order=s
Display a multiple sequence alignment in pretty format
Input (aligned) sequence set: globins.msf
Output file [globins.showalign]:

Example 3

Format for HTML and highlight some interesting regions in different colours:

% showalign -html -high "4-13 green 43-43 red 51-56 blue"
Display a multiple sequence alignment in pretty format
Input (aligned) sequence set: globins.msf
Output file [globins.showalign]:

Example 4

No consensus line at the bottom
No ruler line
No numbers line
Don't repeat the reference sequence at the bottom of the sequences
Use sequence 1 as the reference sequence
Display residues from position 10 to 30 only

% showalign -nocon -norule -nonum -nobot -ref=1 -sb=10 -send=30
Display a multiple sequence alignment in pretty format
Input (aligned) sequence set: globins.msf
Output file [globins.showalign]:

Example 5

Show non-identities between the sequences

% showalign -show=n
Display a multiple sequence alignment in pretty format
Input (aligned) sequence set: globins.msf
Output file [globins.showalign]:

Example 6

Show all of the sequences

% showalign -show=a
Display a multiple sequence alignment in pretty format
Input (aligned) sequence set: globins.msf
Output file [globins.showalign]:

Example 7

Show identities between the sequences

% showalign -show=i
Display a multiple sequence alignment in pretty format
Input (aligned) sequence set: globins.msf
Output file [globins.showalign]:

Example 8

Show similarities between the sequences

% showalign -show=s
Display a multiple sequence alignment in pretty format
Input (aligned) sequence set: globins.msf
Output file [globins.showalign]:

Example 9

Show dissimilarities between the sequences

% showalign -show=d
Display a multiple sequence alignment in pretty format
Input (aligned) sequence set: globins.msf
Output file [globins.showalign]:

Example 10

Use the first sequence as the reference to compare to:

% showalign -ref=1
Display a multiple sequence alignment in pretty format
Input (aligned) sequence set: globins.msf
Output file [globins.showalign]:

Example 11

Show a range of sequences in uppercase, everything else in lowercase

% showalign -nocon -ref=1 -sl -upper 9-15 -nosimilarcase
Display a multiple sequence alignment in pretty format
Input (aligned) sequence set: globins.msf
Output file [globins.showalign]:

Example 12

Display the sequences in alphabetic order:

% showalign -order=a
Display a multiple sequence alignment in pretty format
Input (aligned) sequence set: globins.msf
Output file [globins.showalign]:

Command line arguments

Display a multiple sequence alignment in pretty format
Version: EMBOSS:

Standard (Mandatory) qualifiers:

  [-sequence]  seqset
      The sequence alignment to be displayed.
  [-outfile]  outfile  [*.showalign]
      Output file name

Additional (Optional) qualifiers:

  -matrix  matrix  [EBLOSUM62 for protein, EDNAFULL for DNA]
      This is the scoring matrix file used when comparing sequences. By default it is the file 'EBLOSUM62' (for proteins) or the file 'EDNAFULL' (for nucleic sequences). These files are found in the 'data' directory of the EMBOSS installation.
  -refseq  string  [0]
      If you give the number in the alignment or the name of a sequence, it will be taken to be the reference sequence. The reference sequence is always shown in full and is the one against which all the other sequences are compared. If this is set to 0 then the consensus sequence will be used as the reference sequence. By default the consensus sequence is used as the reference sequence. (Any string)
  -[no]bottom  boolean  [Y]
      If this is true then the reference sequence is displayed at the bottom of the alignment as well as at the top.
  -show  menu  [N]
      What to show. Values:
        A (All of the sequences)
        I (Identities between the sequences)
        N (Non-identities between the sequences)
        S (Similarities between the sequences)
        D (Dissimilarities between the sequences)
  -order  menu  [I]
      Output order of the sequences. Values:
        I (Input order - no change)
        A (Alphabetical order of the names)
        S (Similarity to the reference sequence)
  -[no]similarcase  boolean  [Y]
      If this is set True, then when -show is set to 'Similarities' or 'Non-identities' and a residue is similar but not identical to the reference sequence residue, it will be changed to lower-case.
      If -show is set to 'All' then non-identical, non-similar residues will be changed to lower-case. If this is False then no change to the case of the residues is made on the basis of their similarity to the reference sequence.
  -[no]consensus  boolean  [Y]
      If this is true then the consensus line is displayed.

Advanced (Unprompted) qualifiers:

  -uppercase  range  [If this is left blank, then the sequence case is left alone.]
      Regions to put in uppercase. If this is left blank, then the sequence case is left alone. A set of regions is specified by a set of pairs of positions. The positions are integers. They are separated by any non-digit, non-alpha character. Examples of region specifications are:
        24-45, 56-78
        1:45, 67=99;765..888
        1,5,8,10,23,45,57,99
  -[no]number  boolean  [Y]
      If this option is true then a line giving the positions in the alignment is displayed every 10 characters above the alignment.
  -[no]ruler  boolean  [Y]
      If this option is true then a ruler line marking every 5th and 10th character in the alignment is displayed.
  -width  integer  [60]
      Width of sequence to display (Integer 1 or more)
  -margin  integer  [-1]
      This sets the length of the left-hand margin for sequence names. If the margin is set at 0 then no margin and no names are displayed. If the margin is set to a value that is less than the length of a sequence name then the sequence name is displayed truncated to the length of the margin. If the margin is set to -1 then the minimum margin width that will allow all the sequence names to be displayed in full plus a space at the end of the name will automatically be selected. (Integer -1 or more)
  -html  boolean  [N]
      Use HTML formatting
  -highlight  range  [(full sequence)]
      Regions to colour if formatting for HTML. If this is left blank, then the sequence is left alone. A set of regions is specified by a set of pairs of positions. The positions are integers. They are followed by any valid HTML font colour. Examples of region specifications are:
        24-45 blue 56-78 orange
        1-100 green 120-156 red
      A file of ranges to colour (one range per line) can be specified as '@filename'.
  -plurality  float  [50.0]
      Set a cut-off for the % of positive scoring matches below which there is no consensus. The default plurality is taken as 50% of the total weight of all the sequences in the alignment. (Number from 0.000 to 100.000)
  -setcase  float  [@( $(sequence.totweight) / 2)]
      Sets the threshold for the scores of the positive matches above which the consensus is in upper-case and below which the consensus is in lower-case. By default this is set to be half of the (weight-adjusted) number of sequences in the alignment. (Any numeric value)
  -identity  float  [0.0]
      Provides the facility of setting the required number of identities at a position for it to give a consensus. Therefore, if this is set to 100% only columns of identities contribute to the consensus. (Number from 0.000 to 100.000)
  -[no]gaps  boolean  [Y]
      If this option is true then gap characters can appear in the consensus. The alternative is 'N' for nucleotide, or 'X' for protein

Associated qualifiers:

  "-sequence" associated qualifiers
  -sbegin1      integer  Start of each sequence to be used
  -send1        integer  End of each sequence to be used
  -sreverse1    boolean  Reverse (if DNA)
  -sask1        boolean  Ask for begin/end/reverse
  -snucleotide1 boolean  Sequence is nucleotide
  -sprotein1    boolean  Sequence is protein
  -slower1      boolean  Make lower case
  -supper1      boolean  Make upper case
  -scircular1   boolean  Sequence is circular
  -sformat1     string   Input sequence format
  -iquery1      string   Input query fields or ID list
  -ioffset1     integer  Input start position offset
  -sdbname1     string   Database name
  -sid1         string   Entryname
  -ufo1         string   UFO features
  -fformat1     string   Features format
  -fopenfile1   string   Features file name

  "-outfile" associated qualifiers
  -odirectory2  string   Output directory

General qualifiers:

  -auto     boolean  Turn off prompts
  -stdout   boolean  Write first file to standard output
  -filter   boolean  Read first file from standard input, write first file to standard output
  -options  boolean  Prompt for standard and additional values
  -debug    boolean  Write debug output to program.dbg
  -verbose  boolean  Report some/full command line options
  -help     boolean  Report command line options and exit. More information on associated and general qualifiers can be found with -help -verbose
  -warning  boolean  Report warnings
  -error    boolean  Report errors
  -fatal    boolean  Report fatal errors
  -die      boolean  Report dying program messages
  -version  boolean  Report version number and exit

Input file format

showalign reads in a set of aligned protein or nucleic sequences.

Input files for usage example

File: globins.msf

!!AA_MULTIPLE_ALIGNMENT 1.0
  ../data/globins.msf MSF: 164 Type: P 25/06/01 CompCheck: 4278 ..
  Name: HBB_HUMAN   Len: 164  Check: 6914  Weight: 0.61
  Name: HBB_HORSE   Len: 164  Check: 6007  Weight: 0.65
  Name: HBA_HUMAN   Len: 164  Check: 3921  Weight: 0.65
  Name: HBA_HORSE   Len: 164  Check: 4770  Weight: 0.83
  Name: MYG_PHYCA   Len: 164  Check: 7930  Weight: 1.00
  Name: GLB5_PETMA  Len: 164  Check: 1857  Weight: 0.91
  Name: LGB2_LUPLU  Len: 164  Check: 2879  Weight: 0.43

//

            1                                               50
HBB_HUMAN   ~~~~~~~~VHLTPEEKSAVTALWGKVN.VDEVGGEALGR.LLVVYPWTQR
HBB_HORSE   ~~~~~~~~VQLSGEEKAAVLALWDKVN.EEEVGGEALGR.LLVVYPWTQR
HBA_HUMAN   ~~~~~~~~~~~~~~VLSPADKTNVKAA.WGKVGAHAGEYGAEALERMFLS
HBA_HORSE   ~~~~~~~~~~~~~~VLSAADKTNVKAA.WSKVGGHAGEYGAEALERMFLG
MYG_PHYCA   ~~~~~~~VLSEGEWQLVLHVWAKVEAD.VAGHGQDILIR.LFKSHPETLE
GLB5_PETMA  PIVDTGSVAPLSAAEKTKIRSAWAPVYSTYETSGVDILVKFFTSTPAAQE
LGB2_LUPLU  ~~~~~~~~GALTESQAALVKSSWEEFNANIPKHTHRFFILVLEIAPAAKD

            51                                             100
HBB_HUMAN   FFESFGDLSTPDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSE
HBB_HORSE   FFDSFGDLSNPGAVMGNPKVKAHGKKVLHSFGEGVHHLDNLKGTFAALSE
HBA_HUMAN   FPTTKTYFPHFDLSHGSAQVKGHGKKVADALTNAVAHVDDMPNALSALSD
HBA_HORSE   FPTTKTYFPHFDLSHGSAQVKAHGKKVGDALTLAVGHLDDLPGALSNLSD
MYG_PHYCA   KFDRFKHLKTEAEMKASEDLKKHGVTVLTALGAILKKKGHHEAELKPLAQ
GLB5_PETMA  FFPKFKGLTTADQLKKSADVRWHAERIINAVNDAVASMDDTEKMSMKLRD
LGB2_LUPLU  LFSFLKGTSEVPQNNPELQAHAGKVFKLVYEAAIQLQVTGVVVTDATLKN

            101                                            150
HBB_HUMAN   LHCDKLH..VDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVA
HBB_HORSE   LHCDKLH..VDPENFRLLGNVLVVVLARHFGKDFTPELQASYQKVVAGVA
HBA_HUMAN   LHAHKLR..VDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVS
HBA_HORSE   LHAHKLR..VDPVNFKLLSHCLLSTLAVHLPNDFTPAVHASLDKFLSSVS
MYG_PHYCA   SHATKHK..IPIKYLEFISEAIIHVLHSRHPGDFGADAQGAMNKALELFR
GLB5_PETMA  LSGKHAK..SFQVDPQYFKVLAAVIADTVAAGDAGFEKLMSMICILLRSA
LGB2_LUPLU  LGSVHVSKGVADAHFPVVKEAILKTIKEVVGAKWSEELNSAWTIAYDELA

            151        164
HBB_HUMAN   NALAHKYH~~~~~~
HBB_HORSE   NALAHKYH~~~~~~
HBA_HUMAN   TVLTSKYR~~~~~~
HBA_HORSE   TVLTSKYR~~~~~~
MYG_PHYCA   KDIAAKYKELGYQG
GLB5_PETMA  Y~~~~~~~~~~~~~
LGB2_LUPLU  IVIKKEMNDAA~~~

You can specify a file of ranges to display in uppercase by giving the '-uppercase' qualifier the value '@' followed by the name of the file containing the ranges (e.g. '-upper @myfile').

The format of the range file is:

* Comment lines start with '#' in the first column.
* Comment lines and blank lines are ignored.
* The line may start with white-space.
* There are two positive (integer) numbers per line separated by one or more space or TAB characters.
* The second number must be greater than or equal to the first number.
* There can be optional text after the two numbers to annotate the line.
* White-space before or after the text is removed.

An example range file is:

# this is my set of ranges
12 23
4 5 this is like 12-23, but smaller
67 10348 interesting region

You can specify a file of ranges to highlight in a different colour when outputting in HTML format (using the '-html' qualifier) by giving the '-highlight' qualifier the value '@' followed by the name of the file containing the ranges (e.g. '-highlight @myfile').

The format of this file is very similar to the format of the above uppercase range file, except that the text after the start and end positions is used as the HTML colour name. This colour name is used 'as is' when specifying the colour in HTML in a '<FONT COLOR=xxx>' construct (where 'xxx' is the name of the colour).

The standard names of HTML font colours are given in http://

An example highlight range file is:

# this is my set of ranges
12 23 red
4 5 darkturquoise
67 10348 #FFE4E1

Output file format

showalign writes out a text file, optionally formatted for HTML.
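The range-file rules listed under "Input file format" (comment lines, two positive integers per line, optional trailing text) can be sketched as a small parser. This is an illustrative Python sketch, not showalign's own code: the function name parse_range_file and the decision to raise ValueError on a malformed line are assumptions made for the example.

```python
def parse_range_file(text):
    """Parse range-file text into (start, end, annotation) tuples.

    Hypothetical helper that follows the rules stated in this document;
    it is not part of EMBOSS.
    """
    ranges = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Comment lines start with '#' in the first column;
        # comment lines and blank lines are ignored.
        if line.startswith("#") or not line.strip():
            continue
        # split(None, 2) skips leading white-space, splits on runs of
        # spaces or TABs, and leaves any trailing text as one field.
        fields = line.split(None, 2)
        if len(fields) < 2:
            raise ValueError(f"line {lineno}: expected two numbers")
        start, end = int(fields[0]), int(fields[1])
        if start < 1 or end < start:
            raise ValueError(f"line {lineno}: invalid range {start}-{end}")
        # White-space before or after the annotation text is removed.
        note = fields[2].strip() if len(fields) == 3 else ""
        ranges.append((start, end, note))
    return ranges


example = """# this is my set of ranges
12 23
 4 5 this is like 12-23, but smaller
67 10348 interesting region
"""

for start, end, note in parse_range_file(example):
    print(start, end, note)
```

For a highlight range file the same sketch applies unchanged; the third field then carries the HTML colour name (for example 'red' or '#FFE4E1') instead of a free-text annotation.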
Output files for usage example

File: globins.showalign

                   10        20        30        40        50        60
           ----:----|----:----|----:----|----:----|----:----|----:----|
HBB_HUMAN  --------VH.tPE.K.A.TaL.G.V.-VD....E..GR-.LVvY.WT.R..Es.GD..T
HBB_HORSE  --------VQ..GE.KaA.LaL.D.V.-EE....E..GR-.LVvY.WT.R..Ds.GD..N
HBA_HUMAN  --------------VL.PADKTNV.AA-WGk..AH.GEYGAEAlERMFLS.PT.KTYFPH
HBA_HORSE  --------------VL.AADKTNV.AA-WSk...H.GEYGAEAlERMFLG.PT.KTYFPH
MYG_PHYCA  -------VLSEGEWqLVLHVWAKVeAd-VAGH.QDI.IR-.FKSH.ETLEK.DR.KH.KT
GLB5_PETMA PIVDTGSVAP..AA.KtKiR.A.APVYSTY.TS.VDiLVKFFTST.AA.E..PK.KG.tT
LGB2_LUPLU --------GA.tESqAaL.K.S.EeF.ANIPKHTHRFFILvLE.A.AAkDL.SFLKGT.E
Consensus  xxxxxxxxxxLSxxExSxVxSxWxKxNxxxEVGGxALxxxLxxIxPxxQxFFxTFxxLSx

                   70        80        90       100       110       120
           ----:----|----:----|----:----|----:----|----:----|----:----|
HBB_HUMAN  P.AvM.nPk..A......G.FS.GlA...n.kGTF.T..e..CD..H--...E..r..Gn
HBB_HORSE  PGAvM.nPk..A......HsFGeG.H...n.kGTF.A..e..CD..H--...E..r..Gn
HBA_HUMAN  F.LSH..A...G.....AD..TnA.A.v..mPNALsA......H...--...V.....S.
HBA_HORSE  F.LSH..A...A.....GD..TLA.G.....PGALsN......H...--...V.....S.
MYG_PHYCA  EAE.KA.EDl.K..VT..T..GAIlKKKGHH.AELKP.aQS..T.Hk--iPIKYLeFiSE
GLB5_PETMA A.QlKK.AD.rW.AeriiN.vN.A.ASm..T.KMSMK.R..SGKHAk--SFQVdPqYFKV
LGB2_LUPLU VPQNNPEL.AHAGKVFK.VYEAAIQLQvTGvVVTD.T.Kn.GsVHvSKG.ADAh.PvvKE
Consensus  xDxMxGSxQVKxHGKKVLxALxDxVxHLDDLExxxAxLSDLHAxKLRxxVDPxNFKLLxH

                  130       140       150       160
           ----:----|----:----|----:----|----:----|----
HBB_HUMAN  V.vC...H.FGKe...P.Q.aYQ.Vv.G..NA.AH..H------
HBB_HORSE  V.vV...R.FGK.....lQ..YQ.Vv.G..NA.AH..H------
HBA_HUMAN  C..VT..A.LPAe...A.H..lD.F..S.sTV.TS..R------
HBA_HORSE  C..ST..V.LPN....A.H..lD.F.sS.sTV.TS..R------
MYG_PHYCA  AiiH..HSRHPG..GAdAQGa.N.A.ELFRKDiAA..KELGYQG
GLB5_PETMA LAAViADTVAAG.AGF.KLM..ICI.LRS.Y-------------
LGB2_LUPLU Ai.KTiKEVVGAKwsE.lNsaWTIAYDEl.IViKKeMNDAA---
Consensus  xLLxVLAxHxxxDFTPEVxASMxKxLAxVAxxLxxKYxx-xxxx

Output files for usage example 2

File: globins.showalign

                   10        20        30        40        50        60
           ----:----|----:----|----:----|----:----|----:----|----:----|
HBB_HUMAN  --------VH.tPE.K.A.TaL.G.V.-VD....E..GR-.LVvY.WT.R..Es.GD..T
HBB_HORSE  --------VQ..GE.KaA.LaL.D.V.-EE....E..GR-.LVvY.WT.R..Ds.GD..N
HBA_HORSE  --------------VL.AADKTNV.AA-WSk...H.GEYGAEAlERMFLG.PT.KTYFPH
HBA_HUMAN  --------------VL.PADKTNV.AA-WGk..AH.GEYGAEAlERMFLS.PT.KTYFPH
GLB5_PETMA PIVDTGSVAP..AA.KtKiR.A.APVYSTY.TS.VDiLVKFFTST.AA.E..PK.KG.tT
MYG_PHYCA  -------VLSEGEWqLVLHVWAKVeAd-VAGH.QDI.IR-.FKSH.ETLEK.DR.KH.KT
LGB2_LUPLU --------GA.tESqAaL.K.S.EeF.ANIPKHTHRFFILvLE.A.AAkDL.SFLKGT.E
Consensus  xxxxxxxxxxLSxxExSxVxSxWxKxNxxxEVGGxALxxxLxxIxPxxQxFFxTFxxLSx

                   70        80        90       100       110       120
           ----:----|----:----|----:----|----:----|----:----|----:----|
HBB_HUMAN  P.AvM.nPk..A......G.FS.GlA...n.kGTF.T..e..CD..H--...E..r..Gn
HBB_HORSE  PGAvM.nPk..A......HsFGeG.H...n.kGTF.A..e..CD..H--...E..r..Gn
HBA_HORSE  F.LSH..A...A.....GD..TLA.G.....PGALsN......H...--...V.....S.
HBA_HUMAN  F.LSH..A...G.....AD..TnA.A.v..mPNALsA......H...--...V.....S.
GLB5_PETMA A.QlKK.AD.rW.AeriiN.vN.A.ASm..T.KMSMK.R..SGKHAk--SFQVdPqYFKV
MYG_PHYCA  EAE.KA.EDl.K..VT..T..GAIlKKKGHH.AELKP.aQS..T.Hk--iPIKYLeFiSE
LGB2_LUPLU VPQNNPEL.AHAGKVFK.VYEAAIQLQvTGvVVTD.T.Kn.GsVHvSKG.ADAh.PvvKE
Consensus  xDxMxGSxQVKxHGKKVLxALxDxVxHLDDLExxxAxLSDLHAxKLRxxVDPxNFKLLxH

                  130       140       150       160
           ----:----|----:----|----:----|----:----|----
HBB_HUMAN  V.vC...H.FGKe...P.Q.aYQ.Vv.G..NA.AH..H------
HBB_HORSE  V.vV...R.FGK.....lQ..YQ.Vv.G..NA.AH..H------
HBA_HORSE  C..ST..V.LPN....A.H..lD.F.sS.sTV.TS..R------
HBA_HUMAN  C..VT..A.LPAe...A.H..lD.F..S.sTV.TS..R------
GLB5_PETMA LAAViADTVAAG.AGF.KLM..ICI.LRS.Y-------------
MYG_PHYCA  AiiH..HSRHPG..GAdAQGa.N.A.ELFRKDiAA..KELGYQG
LGB2_LUPLU Ai.KTiKEVVGAKwsE.lNsaWTIAYDEl.IViKKeMNDAA---
Consensus  xLLxVLAxHxxxDFTPEVxASMxKxLAxVAxxLxxKYxx-xxxx

Output files for usage example 3

File: globins.showalign

                   10        20        30        40        50        60
           ----:----|----:----|----:----|----:----|----:----|----:----|
HBB_HUMAN  --------VH.tPE.K.A.TaL.G.V.-VD....E..GR-.L

Output files for usage example 4

File: globins.showalign

HBB_HUMAN  HLTPEEKSAVTALWGKVN-VD
HBB_HORSE  Q.sG...a..L...D...-Ee
HBA_HUMAN  -----VL.PADKTNV.AA-WG
HBA_HORSE  -----VL..ADKTNV.AA-WS
MYG_PHYCA  SEGEWqLVLHVWAKVeAd-.A
GLB5_PETMA P.sAA..tKiRsA.AP.YSTY
LGB2_LUPLU A..ESqAaL.KsS.EeF.ANI

Output files for usage example 6

File: globins.showalign

                   10        20        30        40        50        60
           ----:----|----:----|----:----|----:----|----:----|----:----|
HBB_HUMAN  --------vhLTpeEkSaVtAlWgKvN-vdEVGGeALgr-LlvVyPwtQrFFeSFgdLSt
HBB_HORSE  --------vqLSgeEkAaVlAlWdKvN-eeEVGGeALgr-LlvVyPwtQrFFdSFgdLSn
HBA_HUMAN  --------------vlSpadktnvKaa-wgKVGahAgeygaeaLermflsFptTktyfph
HBA_HORSE  --------------vlSaadktnvKaa-wsKVGGhAgeygaeaLermflgFptTktyfph
MYG_PHYCA  -------vlsegewQlvlhvwakvEaD-vaghGqdiLir-LfkshPetlekFdrFkhLkt
GLB5_PETMA pivdtgsvapLSaaEkTkIrSaWapvystyEtsGvdIlvkfftstPaaQeFFpkFkgLTt
LGB2_LUPLU --------gaLTesQaAlVkSsWeEfNanipkhthrffilVleIaPaaKdlFsflkgtSe
Consensus  xxxxxxxxxxLSxxExSxVxSxWxKxNxxxEVGGxALxxxLxxIxPxxQxFFxTFxxLSx

                   70        80        90       100       110       120
           ----:----|----:----|----:----|----:----|----:----|----:----|
HBB_HUMAN  pDaVmGNpKVKaHGKKVLgAfsDgLaHLDNLKgtfAtLSELHcdKLh--VDPeNFRLLgN
HBB_HORSE  pgaVmGNpKVKaHGKKVLhSfgEgVhHLDNLKgtfAaLSELHcdKLh--VDPeNFRLLgN
HBA_HUMAN  fDlshGSaQVKgHGKKVadALtNaVaHVDDMpnalSaLSDLHAhKLR--VDPvNFKLLsH
HBA_HORSE  fDlshGSaQVKaHGKKVgdALtlaVgHLDDLpgalSnLSDLHAhKLR--VDPvNFKLLsH
MYG_PHYCA  eaeMkaSedLKkHGvtVLtALgaiLkkkghhEaelkpLAqsHAtKhK--IpikylEfIse
GLB5_PETMA aDqLkkSadVRwHaERIInAVnDaVasMDDtEkmsmkLrDLsgkhaK--sfqvDpQyfkv
LGB2_LUPLU vpqnnpelQahagkvfkLvyeaaiqlqVtgVvvtdAtLkNLgSvhVskgVadaHFpVVke
Consensus  xDxMxGSxQVKxHGKKVLxALxDxVxHLDDLExxxAxLSDLHAxKLRxxVDPxNFKLLxH

                  130       140       150       160
           ----:----|----:----|----:----|----:----|----
HBB_HUMAN  vLVcVLAhHfgkEFTPpVqAAyqKvVAgVAnaLahKYh------
HBB_HORSE  vLVvVLArHfgkDFTPELqASyqKvVAgVAnaLahKYh------
HBA_HUMAN  cLLvtLAaHlpaEFTPaVhASLdKfLAsVStvLtsKYr------
HBA_HORSE  cLLstLAvHlpnDFTPaVhASLdKfLSsVStvLtsKYr------
MYG_PHYCA  aIIhVLhsrhpgDFgaDaqgAMnKaLelfrkdIaaKYkelgyqg
GLB5_PETMA laavIadtvaagDagfEklmSMiciLlrsAy-------------
LGB2_LUPLU aILktIkevvgakWSeELnSAwtiaydeLAivIkkEmndaa---
Consensus  xLLxVLAxHxxxDFTPEVxASMxKxLAxVAxxLxxKYxx-xxxx

Output files for usage example 7

File: globins.showalign

                   10        20        30        40        50        60
           ----:----|----:----|----:----|----:----|----:----|----:----|
HBB_HUMAN  --------..L...E.S.V...W.K.N-..EVGG.AL..-L....P..Q.FF..F..LS.
HBB_HORSE  --------..LS..E...V...W.K.N-..EVGG.AL..-L....P..Q.FF..F..LS.
HBA_HUMAN  --------------..S.......K..-...VG..A..............F..T......
HBA_HORSE  --------------..S.......K..-...VGG.A..............F..T......
MYG_PHYCA  -------....................-....G...L..-L....P.....F..F..L..
GLB5_PETMA ..........LS..E.....S.W.......E..G...........P..Q.FF..F..L..
LGB2_LUPLU --------..L.......V.S.W...N................I.P.....F......S.
Consensus  xxxxxxxxxxLSxxExSxVxSxWxKxNxxxEVGGxALxxxLxxIxPxxQxFFxTFxxLSx

                   70        80        90       100       110       120
           ----:----|----:----|----:----|----:----|----:----|----:----|
HBB_HUMAN  .D...G...VK.HGKKVL.A..D...HLD.L....A.LS.LH..KL.--VDP.NF.LL..
HBB_HORSE  .....G...VK.HGKKVL......V.HLD.L....A.LS.LH..KL.--VDP.NF.LL..
HBA_HUMAN  .D...GS.QVK.HGKKV..AL...V.H.DD.......LSDLHA.KLR--VDP.NFKLL.H
HBA_HORSE  .D...GS.QVK.HGKKV..AL...V.HLDDL......LSDLHA.KLR--VDP.NFKLL.H
MYG_PHYCA  ...M..S...K.HG..VL.AL..........E.....L...HA.K..--...........
GLB5_PETMA .D....S..V..H......A..D.V...DD.E.....L.DL......--...........
LGB2_LUPLU ........Q........L.................A.L..L........V....F.....
Consensus  xDxMxGSxQVKxHGKKVLxALxDxVxHLDDLExxxAxLSDLHAxKLRxxVDPxNFKLLxH

                  130       140       150       160
           ----:----|----:----|----:----|----:----|----
HBB_HUMAN  .L..VLA.H....FTP.V.A...K..A.VA..L..KY.------
HBB_HORSE  .L..VLA.H...DFTPE..AS..K..A.VA..L..KY.------
HBA_HUMAN  .LL..LA.H....FTP.V.AS..K.LA.V...L..KY.------
HBA_HORSE  .LL..LA.H...DFTP.V.AS..K.L..V...L..KY.------
MYG_PHYCA  ....VL......DF.......M.K.L.........KY.......
GLB5_PETMA ............D...E...SM...L...A.-------------
LGB2_LUPLU ..L.............E............A...........---
Consensus  xLLxVLAxHxxxDFTPEVxASMxKxLAxVAxxLxxKYxx-xxxx

Output files for usage example 8

File: globins.showalign

                   10        20        30        40        50        60
           ----:----|----:----|----:----|----:----|----:----|----:----|
HBB_HUMAN  --------..Lt..E.S.V.a.W.K.N-..EVGG.AL..-L..v.P..Q.FF.sF..LS.
HBB_HORSE  --------..LS..E.a.V.a.W.K.N-..EVGG.AL..-L..v.P..Q.FF.sF..LS.
HBA_HUMAN  --------------..S.......K..-..kVG..A.......l......F..T......
HBA_HORSE  --------------..S.......K..-..kVGG.A.......l......F..T......
MYG_PHYCA  -------.......q.........e.d-....G...L..-L....P.....F..F..L..
GLB5_PETMA ..........LS..E.t.i.S.W.......E..G..i........P..Q.FF..F..Lt.
LGB2_LUPLU --------..Lt..q.a.V.S.W.e.N.............v..I.P..k..F......S.
Consensus  xxxxxxxxxxLSxxExSxVxSxWxKxNxxxEVGGxALxxxLxxIxPxxQxFFxTFxxLSx

                   70        80        90       100       110       120
           ----:----|----:----|----:----|----:----|----:----|----:----|
HBB_HUMAN  .D.v.Gn.kVK.HGKKVL.A..D.l.HLDnLk...A.LSeLH..KL.--VDP.NFrLL.n
HBB_HORSE  ...v.Gn.kVK.HGKKVL.s..e.V.HLDnLk...A.LSeLH..KL.--VDP.NFrLL.n
HBA_HUMAN  .D...GS.QVK.HGKKV..AL.n.V.HvDDm....s.LSDLHA.KLR--VDP.NFKLL.H
HBA_HORSE  .D...GS.QVK.HGKKV..AL...V.HLDDL....s.LSDLHA.KLR--VDP.NFKLL.H
MYG_PHYCA  ...M..S..lK.HG..VL.AL...l......E.....La..HA.K.k--i.....e.i..
GLB5_PETMA .D.l..S..Vr.H.erii.Av.D.V..mDD.E.....L.DL.....k--....d.q....
LGB2_LUPLU ........Q........L.........v..v....A.L.nL.s..v...V...hF.vv..
Consensus  xDxMxGSxQVKxHGKKVLxALxDxVxHLDDLExxxAxLSDLHAxKLRxxVDPxNFKLLxH

                  130       140       150       160
           ----:----|----:----|----:----|----:----|----
HBB_HUMAN  .Lv.VLA.H...eFTP.V.Aa..K.vA.VA..L..KY.------
HBB_HORSE  .Lv.VLA.H...DFTPEl.AS..K.vA.VA..L..KY.------
HBA_HUMAN  .LL..LA.H...eFTP.V.ASl.K.LA.Vs..L..KY.------
HBA_HORSE  .LL..LA.H...DFTP.V.ASl.K.Ls.Vs..L..KY.------
MYG_PHYCA  .ii.VL......DF..d...aM.K.L......i..KY.......
GLB5_PETMA ....i.......D...E...SM...L...A.-------------
LGB2_LUPLU
Consensus  xLLxVLAxHxxxDFTPEVxASMxKxLAxVAxxLxxKYxx-xxxx

Output files for usage example 9

File: globins.showalign

                   10        20        30        40        50        60
           ----:----|----:----|----:----|----:----|----:----|----:----|
HBB_HUMAN  --------VH..PE.K.A.T.L.G.V.-VD....E..GR-.LV.Y.WT.R..E..GD..T
HBB_HORSE  --------VQ..GE.K.A.L.L.D.V.-EE....E..GR-.LV.Y.WT.R..D..GD..N
HBA_HUMAN  --------------VL.PADKTNV.AA-WG...AH.GEYGAEA.ERMFLS.PT.KTYFPH
HBA_HORSE  --------------VL.AADKTNV.AA-WS....H.GEYGAEA.ERMFLG.PT.KTYFPH
MYG_PHYCA  -------VLSEGEW.LVLHVWAKV.A.-VAGH.QDI.IR-.FKSH.ETLEK.DR.KH.KT
GLB5_PETMA PIVDTGSVAP..AA.K.K.R.A.APVYSTY.TS.VD.LVKFFTST.AA.E..PK.KG..T
LGB2_LUPLU --------GA..ES.A.L.K.S.E.F.ANIPKHTHRFFIL.LE.A.AA.DL.SFLKGT.E
Consensus  xxxxxxxxxxLSxxExSxVxSxWxKxNxxxEVGGxALxxxLxxIxPxxQxFFxTFxxLSx

                   70        80        90       100       110       120
           ----:----|----:----|----:----|----:----|----:----|----:----|
HBB_HUMAN  P.A.M..P...A......G.FS.G.A......GTF.T.....CD..H--...E.....G.
HBB_HORSE  PGA.M..P...A......H.FG.G.H......GTF.A.....CD..H--...E.....G.
HBA_HUMAN  F.LSH..A...G.....AD..T.A.A.....PNAL.A......H...--...V.....S.
HBA_HORSE  F.LSH..A...A.....GD..TLA.G.....PGAL.N......H...--...V.....S.
MYG_PHYCA  EAE.KA.ED..K..VT..T..GAI.KKKGHH.AELKP..QS..T.H.--.PIKYL.F.SE
GLB5_PETMA A.Q.KK.AD..W.A....N..N.A.AS...T.KMSMK.R..SGKHA.--SFQV.P.YFKV
LGB2_LUPLU VPQNNPEL.AHAGKVFK.VYEAAIQLQ.TG.VVTD.T.K..G.VH.SKG.ADA..P..KE
Consensus  xDxMxGSxQVKxHGKKVLxALxDxVxHLDDLExxxAxLSDLHAxKLRxxVDPxNFKLLxH

                  130       140       150       160
           ----:----|----:----|----:----|----:----|----
HBB_HUMAN  V..C...H.FGK....P.Q..YQ.V..G..NA.AH..H------
HBB_HORSE  V..V...R.FGK......Q..YQ.V..G..NA.AH..H------
HBA_HUMAN  C..VT..A.LPA....A.H...D.F..S..TV.TS..R------
HBA_HORSE  C..ST..V.LPN....A.H...D.F..S..TV.TS..R------
MYG_PHYCA  A..H..HSRHPG..GA.AQG..N.A.ELFRKD.AA..KELGYQG
GLB5_PETMA LAAV.ADTVAAG.AGF.KLM..ICI.LRS.Y-------------
LGB2_LUPLU A..KT.KEVVGAK..E..N..WTIAYDE..IV.KK.MNDAA---
Consensus  xLLxVLAxHxxxDFTPEVxASMxKxLAxVAxxLxxKYxx-xxxx

Output files for usage example 10

File: globins.showalign

                           10         20        30         40        50
           ...v....----:----|----:----.|----:----|.----:----|----:----|
HBB_HUMAN  --------VHLTPEEKSAVTALWGKVN-VDEVGGEALGR-LLVVYPWTQRFFESFGDLST
HBB_HORSE  --------.Q.sG...a..L...D...-Ee.........-............d......N
HBA_HUMAN  --------------VL.PADKTNV.AA-WGk..AH.GEYGAEAlERMFLS.PTtKTYFPH
HBA_HORSE  --------------VL..ADKTNV.AA-WSk...H.GEYGAEAlERMFLG.PTtKTYFPH
MYG_PHYCA  -------VlSEGEWqLVLHVWAKVeAd-.AGH.QdI.I.-.FKSh.E.LEK.dR.KH.K.
GLB5_PETMA PIVDTGSVAP.sAA..tKiRsA.AP.YSTY.TS.VDiLVKFFTST.AA.E..PK.KG.t.
LGB2_LUPLU --------GA..ESqAaL.KsS.EeF.ANIPKHTHRFFILv.EiA.AAkDL.SFLKGT.E
Consensus  xxxxxxxxxxLSxxExSxVxSxWxKxNxxxEVGGxALxxxLxxIxPxxQxFFxTFxxLSx

                   60        70        80        90         100     110
           ----:----|----:----|----:----|----:----|----:--..--|----:---
HBB_HUMAN  PDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLH--VDPENFRLLGN
HBB_HORSE  .G................Hs.Ge.vH..........A..........--...........
HBA_HUMAN  F.LSH.sAq..G.....AD.LtnAv..v.dmPNALsA..d..AH..R--...V..k..Sh
HBA_HORSE  F.LSH.sAq........GD.LtLAvG...d.P.ALsN..d..AH..R--...V..k..Sh
MYG_PHYCA  EAEmKAsEDl.K..VT..T.LGAI.KKKGhHeAELKP.aqS.AT.HK--iPIkYLEFiSE
GLB5_PETMA A.QlKKsAD.rW.AeriiN.Vn.Av.Sm.dTeKMSMK.Rd.SGKHAK--SFQVdPqYFKV
LGB2_LUPLU VPQNNPELqAH.GKVFK.VYEaAIQLQvTGvVV.D...KN.GSVHvSKG.ADAh.PvvKE
Consensus  xDxMxGSxQVKxHGKKVLxALxDxVxHLDDLExxxAxLSDLHAxKLRxxVDPxNFKLLxH

           10       120       130       140
           -|----:----|----:----|----:----|----:-....v.
HBB_HUMAN  VLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH------
HBB_HORSE  ...V...R....d...El..s.................------
HBA_HUMAN  C.lVT..A.LPA....A.H.sLD.Fl.S.sTV.TS..R------
HBA_HORSE  C.lST..V.LPNd...A.H.sLD.FlsS.sTV.TS..R------
MYG_PHYCA  AiiH..HSRHPGd.GADA.G.MN.AlELFRKDi.A..KELGYQG
GLB5_PETMA lAAViADTVAAGdAGFEKLMsMICilLRS.Y-------------
LGB2_LUPLU AilKTiKEVV.AkwsEElNs.wTIAYDEl.IViKKeMnDAA---
Consensus  xLLxVLAxHxxxDFTPEVxASMxKxLAxVAxxLxxKYxx-xxxx

Output files for usage example 11

File: globins.showalign

                           10         20        30         40        50
           ...v....----:----|----:----.|----:----|.----:----|----:----|
HBB_HUMAN  --------VHLTPEEksavtalwgkvn-vdevggealgr-llvvypwtqrffesfgdlst
HBB_HORSE  --------.Q.SG...a..l...d...-ee.........-............d......n
HBA_HUMAN  --------------Vl.padktnv.aa-wgk..ah.geygaealermfls.pttktyfph
HBA_HORSE  --------------Vl..adktnv.aa-wsk...h.geygaealermflg.pttktyfph
MYG_PHYCA
GLB5_PETMA
LGB2_LUPLU --------GA..ESQaal.kss.eef.anipkhthrffilv.eia.aakdl.sflkgt.e

                   60        70        80        90         100     110
           ----:----|----:----|----:----|----:----|----:--..--|----:---
HBB_HUMAN  pdavmgnpkvkahgkkvlgafsdglahldnlkgtfatlselhcdklh--vdpenfrllgn
HBB_HORSE
HBA_HUMAN
HBA_HORSE
MYG_PHYCA
GLB5_PETMA
LGB2_LUPLU

           10       120       130       140
           -|----:----|----:----|----:----|----:-....v.
HBB_HUMAN  vlvcvlahhfgkeftppvqaayqkvvagvanalahkyh------
HBB_HORSE  ...v...r....d...el..s.................------
HBA_HUMAN  c.lvt..a.lpa....a.h.sld.fl.s.stv.ts..r------
HBA_HORSE  c.lst..v.lpnd...a.h.sld.flss.stv.ts..r------
MYG_PHYCA
GLB5_PETMA laaviadtvaagdagfeklmsmicillrs.y-------------
LGB2_LUPLU ailktikevv.akwseelns.wtiaydel.ivikkemndaa---

Output files for usage example 12

File: globins.showalign

                   10        20        30        40        50        60
           ----:----|----:----|----:----|----:----|----:----|----:----|
GLB5_PETMA PIVDTGSVAP..AA.KtKiR.A.APVYSTY.TS.VDiLVKFFTST.AA.E..PK.KG.tT
HBA_HORSE  --------------VL.AADKTNV.AA-WSk...H.GEYGAEAlERMFLG.PT.KTYFPH
HBA_HUMAN  --------------VL.PADKTNV.AA-WGk..AH.GEYGAEAlERMFLS.PT.KTYFPH
HBB_HORSE  --------VQ..GE.KaA.LaL.D.V.-EE....E..GR-.LVvY.WT.R..Ds.GD..N
HBB_HUMAN  --------VH.tPE.K.A.TaL.G.V.-VD....E..GR-.LVvY.WT.R..Es.GD..T
LGB2_LUPLU --------GA.tESqAaL.K.S.EeF.ANIPKHTHRFFILvLE.A.AAkDL.SFLKGT.E
MYG_PHYCA  -------VLSEGEWqLVLHVWAKVeAd-VAGH.QDI.IR-.FKSH.ETLEK.DR.KH.KT
Consensus  xxxxxxxxxxLSxxExSxVxSxWxKxNxxxEVGGxALxxxLxxIxPxxQxFFxTFxxLSx

                   70        80        90       100       110       120
           ----:----|----:----|----:----|----:----|----:----|----:----|
GLB5_PETMA A.QlKK.AD.rW.AeriiN.vN.A.ASm..T.KMSMK.R..SGKHAk--SFQVdPqYFKV
HBA_HORSE  F.LSH..A...A.....GD..TLA.G.....PGALsN......H...--...V.....S.
HBA_HUMAN  F.LSH..A...G.....AD..TnA.A.v..mPNALsA......H...--...V.....S.
HBB_HORSE PGAvM.nPk..A......HsFGeG.H...n.kGTF.A..e..CD..H--...E..r..Gn HBB_HUMAN P.AvM.nPk..A......G.FS.GlA...n.kGTF.T..e..CD..H--...E..r..Gn LGB2_LUPLU VPQNNPEL.AHAGKVFK.VYEAAIQLQvTGvVVTD.T.Kn.GsVHvSKG.ADAh.PvvKE MYG_PHYCA EAE.KA.EDl.K..VT..T..GAIlKKKGHH.AELKP.aQS..T.Hk--iPIKYLeFiSE Consensus xDxMxGSxQVKxHGKKVLxALxDxVxHLDDLExxxAxLSDLHAxKLRxxVDPxNFKLLxH 130 140 150 160 ----:----|----:----|----:----|----:----|---- GLB5_PETMA LAAViADTVAAG.AGF.KLM..ICI.LRS.Y------------- HBA_HORSE C..ST..V.LPN....A.H..lD.F.sS.sTV.TS..R------ HBA_HUMAN C..VT..A.LPAe...A.H..lD.F..S.sTV.TS..R------ HBB_HORSE V.vV...R.FGK.....lQ..YQ.Vv.G..NA.AH..H------ HBB_HUMAN V.vC...H.FGKe...P.Q.aYQ.Vv.G..NA.AH..H------ LGB2_LUPLU Ai.KTiKEVVGAKwsE.lNsaWTIAYDEl.IViKKeMNDAA--- MYG_PHYCA AiiH..HSRHPG..GAdAQGa.N.A.ELFRKDiAA..KELGYQG Consensus xLLxVLAxHxxxDFTPEVxASMxKxLAxVAxxLxxKYxx-xxxx Data files showalign reads in scoring matrices to determine the consensus sequence and to determine which matches are similar or not. EMBOSS data files are distributed with the application and stored in the standard EMBOSS data directory, which is defined by the EMBOSS environment variable EMBOSS_DATA. To see the available EMBOSS data files, run: % embossdata -showall To fetch one of the data files (for example 'Exxx.dat') into your current directory for you to inspect or modify, run: % embossdata -fetch -file Exxx.dat Users can provide their own data files in their own directories. Project specific files can be put in the current directory, or for tidier directory listings in a subdirectory called ".embossdata". Files for all EMBOSS runs can be put in the user's home directory, or again in a subdirectory called ".embossdata". The directories are searched in the following order: * . (your current directory) * .embossdata (under your current directory) * ~/ (your home directory) * ~/.embossdata Notes showalign reads in a scoring matrix to determine the consensus sequence and to determine which matches are similar or not. 
By using the -show option, the displayed sequences can be shown as: * complete (-show=All), * only identical matches between the sequence and the reference sequence, all other positions being replaced by '.' characters (-show=Identities) * non-identical matches, with identical matches being replaced by '.' characters, similar matches are shown in lower case (-show=Non-identities) * similar matches, with non-similar matches being replaced by '.' characters, similar matches are shown in lower case (-show=Similarities) * dissimilar matches, with identical or similar matches being replaced by '.' characters (-show=Dissimilarities) Changing the similar matches to lowercase can optionally be disabled by using the option -nosimilarcase. A small table of the way these alignments are displayed illustrates this. If we have a reference protein sequence of "III" and a sequence aligned to this of "ILW", then we have an identical matching residue, then a similar one, then a dissimilar one. The different methods of display would give the following: Reference III All ILw Identical I.. Non-id .lW Similar Il. Dissimilar ..W The consensus line can be displayed in a mixture of uppercase and lowercase symbols. Uppercase indicates a strong consensus and lowercase a weak one. The cutoff for setting the consensus case is set by the qualifier -setcase. If the number of residues at that position that match the consensus value is greater than this, then the symbol is in uppercase, otherwise the symbol is in lowercase. By default, the value of -setcase is set so that if there are more than 50% of residues identical to the consensus at that position, then the consensus is in uppercase. To put all of the consensus symbols into uppercase or lowercase, make -setcase zero or very large (try 100000 ?). Other display options include Sequence numbering ruler with ticks above the sequence. The width of a line can be set. 
* The width of a margin to the left of the sequences that shows the sequence names can be set.
* Specified regions of the sequence can be displayed in uppercase to highlight them.
* The output can be formatted for HTML.
* If the output is being formatted for HTML, then specified regions of the sequence can be displayed in any valid HTML colours.

References

None.

Warnings

None.

Diagnostic Error Messages

None.

Exit status

It always exits with status 0.

Known bugs

None.

See also

Program name  Description
abiview       Display the trace in an ABI sequencer file
coderet       Extract CDS, mRNA and translations from feature tables
edialign      Local multiple alignment of sequences
emma          Multiple sequence alignment (ClustalW wrapper)
entret        Retrieve sequence entries from flatfile databases and files
extractalign  Extract regions from a sequence alignment
infoalign     Display basic information about a multiple sequence alignment
infoseq       Display basic information about sequences
plotcon       Plot conservation of a sequence alignment
prettyplot    Draw a sequence alignment with pretty formatting
refseqget     Get reference sequence
seqxref       Retrieve all database cross-references for a sequence entry
seqxrefget    Retrieve all cross-referenced data for a sequence entry
tranalign     Generate an alignment of nucleic coding regions from aligned proteins
variationget  Get sequence variations
whichdb       Search all sequence databases for an entry and retrieve it

Author(s)

Gary Williams
formerly at:
MRC Rosalind Franklin Centre for Genomics Research
Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SB, UK

Please report all bugs to the EMBOSS bug team (emboss-bug), not to the original author.

History

Written (23 May 2001) - Gary Williams

Target users

This program is intended to be used by everyone and everything, from naive users to embedded scripts.

Comments

None

HBA_HORSE  F.LSH..A...A.....GD..TLA.G.....PGALsN......H...--...V.....S.
MYG_PHYCA  EAE.KA.EDl.K..VT..T..GAIlKKKGHH.AELKP.aQS..T.Hk--iPIKYLeFiSE
GLB5_PETMA A.QlKK.AD.rW.AeriiN.vN.A.ASm..T.KMSMK.R..SGKHAk--SFQVdPqYFKV
LGB2_LUPLU VPQNNPEL.AHAGKVFK.VYEAAIQLQvTGvVVTD.T.Kn.GsVHvSKG.ADAh.PvvKE
Consensus  xDxMxGSxQVKxHGKKVLxALxDxVxHLDDLExxxAxLSDLHAxKLRxxVDPxNFKLLxH

                  130       140       150       160
           ----:----|----:----|----:----|----:----|----
HBB_HUMAN  V.vC...H.FGKe...P.Q.aYQ.Vv.G..NA.AH..H------
HBB_HORSE  V.vV...R.FGK.....lQ..YQ.Vv.G..NA.AH..H------
HBA_HUMAN  C..VT..A.LPAe...A.H..lD.F..S.sTV.TS..R------
HBA_HORSE  C..ST..V.LPN....A.H..lD.F.sS.sTV.TS..R------
MYG_PHYCA  AiiH..HSRHPG..GAdAQGa.N.A.ELFRKDiAA..KELGYQG
GLB5_PETMA LAAViADTVAAG.AGF.KLM..ICI.LRS.Y-------------
LGB2_LUPLU Ai.KTiKEVVGAKwsE.lNsaWTIAYDEl.IViKKeMNDAA---
Consensus  xLLxVLAxHxxxDFTPEVxASMxKxLAxVAxxLxxKYxx-xxxx