How to get sequence homology percent from a multiple sequence alignment in DECIPHER?

reubenmcgregor88 • 0

I recently carried out a multiple sequence alignment using the DECIPHER package in R. The code was fairly simple:

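library(DECIPHER)

bod_aa <- readAAStringSet("BOD_aa.fasta")  # unaligned protein sequences

aligned_aa <- AlignSeqs(bod_aa)  # multiple sequence alignment

BrowseSeqs(aligned_aa)  # view the alignment in a web browser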

This worked nicely, but I would like some information on the alignment, specifically the percent homology between the different proteins. If possible, I would like the percent homology to one sequence in particular, i.e. I want to align all the sequences to one "query sequence" and get a percent homology to that query sequence.

Thanks

decipher r protein alignment • 2.3k views

reubenmcgregor88, 6.9 years ago

Just to clarify, do you mean that you would like to get a distance matrix?

d <- DistanceMatrix(aligned_aa)  # DistanceMatrix expects aligned sequences

And then look at the distances to a specific sequence (e.g., #1)?

d[1,]
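A minimal sketch of that approach, assuming three short made-up protein sequences in place of BOD_aa.fasta (the names query, seqA and seqB and the sequences themselves are hypothetical). DistanceMatrix is documented as taking aligned sequences, so the sketch passes it the output of AlignSeqs, and it selects the row for the query by name rather than by position.

```r
library(DECIPHER)

# Hypothetical toy sequences standing in for the real FASTA file
seqs <- AAStringSet(c(
  query = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
  seqA  = "MKTAYIAKQRNISFVKSHFSRQLEERLGLIEVQ",
  seqB  = "MKTAYLAKQRQLSFVKAHFSRQEERLGLVEVQ"
))

# Align first, then compute pairwise distances on the alignment
aligned <- AlignSeqs(seqs, verbose = FALSE)
d <- DistanceMatrix(aligned, verbose = FALSE)

# Distances from every sequence to the query, looked up by name
d["query", ]
```

Indexing by name avoids relying on the order of sequences in the input file.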

Erik Wright, 6.9 years ago

If the distance matrix looks like this:

   BOD1L2    BOD1L1      BOD1
0.8895349 0.9459459 0.0000000

Does that mean BOD1L1 is 94% homologous to BOD1 and BOD1L2 is 88% homologous to BOD1, or something similar? Sorry for the simplicity of the questions.

reubenmcgregor88, 6.9 years ago

The outputs of DistanceMatrix, including examples, are described in:

?DistanceMatrix

Generally, it is a matrix containing the distance between each pair of input sequences. The distance is 1 - the similarity (fraction of sites that are identical).

Homology is not well defined in this context, but I assume you mean similarity.
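As a rough illustration of that relationship, a distance can be turned into a percent identity with 100 * (1 - d). The sketch below reuses the three values posted earlier in this thread; the variable names are assumptions made for illustration only.

```r
# Distances to BOD1, copied from the row posted earlier in this thread
d_to_bod1 <- c(BOD1L2 = 0.8895349, BOD1L1 = 0.9459459, BOD1 = 0.0000000)

# Similarity is 1 - distance; multiply by 100 to express it as a percentage
pct_identity <- 100 * (1 - d_to_bod1)

round(pct_identity, 1)
```

Under this definition, a distance of 0.89 corresponds to roughly 11% identical sites rather than 89%, and a distance of 0.95 to roughly 5%.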

Erik Wright, 6.9 years ago

Ah OK, I will go back to the reference material and try to understand more.

Thanks

reubenmcgregor88, 6.9 years ago
