How to map character state changes from a presence-absence matrix onto a phylogeny

Gloom

2018-10-08 16:13:07 UTC


I want to map character state changes from a presence-absence matrix onto a phylogeny. That is, I want to identify the most parsimonious hypothesis for the node and branch on which a given "mutation" or "change" occurred.

I have tried assigning each character to its leaf node and then, if the leaf's sister has the same character, reassigning the character to the parent node (working upward until all nodes are assigned). I'm using a dummy dataset to try to achieve this:

  Matrix
  >Dme_001	1110000000000111
  >Dme_002	1110000000000011
  >Cfa_001	0110000000000011
  >Mms_001	0110000000000011
  >Hsa_001	0110000000000010
  >Ptr_002	0110000000000011
  >Mmu_002	0110000000000011
  >Hsa_002	0110000000000011
  >Ptr_001	0110000000000011
  >Mmu_001	0110000000000011

  Phylogeny
  ((Dme_001, Dme_002), (((Cfa_001, Mms_001), ((Hsa_001, Ptr_001), Mmu_001)), (Ptr_002, (Hsa_002, Mmu_002))));
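For experimenting with the dataset above, it can help to load the matrix into plain Python structures first. The following is only a minimal sketch, not the asker's code: it assumes a tab between each taxon name and its state string (as the `split('\t')` call in the code further down suggests), strips the leading `>` so that names match the leaf labels in the tree, and uses the hypothetical helper names `parse_matrix` and `columns`.

```python
# Matrix rows copied from the question. The tab separator is an assumption
# taken from the split('\t') call in the question's own code.
PAM_TEXT = """\
>Dme_001\t1110000000000111
>Dme_002\t1110000000000011
>Cfa_001\t0110000000000011
>Mms_001\t0110000000000011
>Hsa_001\t0110000000000010
>Ptr_002\t0110000000000011
>Mmu_002\t0110000000000011
>Hsa_002\t0110000000000011
>Ptr_001\t0110000000000011
>Mmu_001\t0110000000000011
"""


def parse_matrix(text):
    """Return {taxon: state string}, dropping the leading '>' from names."""
    states = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, chars = line.split("\t")
        states[name.lstrip(">")] = chars
    return states


def columns(states):
    """Return one {taxon: state} dict per character (matrix column)."""
    n_chars = len(next(iter(states.values())))
    return [dict((taxon, row[i]) for taxon, row in states.items())
            for i in range(n_chars)]


states = parse_matrix(PAM_TEXT)
chars = columns(states)
# Taxa in which character 0 is present
print(sorted(t for t, s in chars[0].items() if s == "1"))
# -> ['Dme_001', 'Dme_002']
```

Working column by column keeps each character independent, which is how parsimony reconstruction treats a presence-absence matrix anyway.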

I label the internal nodes using ete3, so my output should be:

  BranchID CharacterState Change
  Node_1: 0 0->1
  Hsa_001: 15 1->0

Because my code assigns character states based on sister nodes, a loss messes up the output, giving:

  BranchID CharacterState Change
  Node_1: 0 0->1
  Node_3 15 0->1
  Node_5 15 0->1
  Node_8 15 0->1
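One standard way to place both gains and losses on branches is Fitch parsimony, which the question does not use. What follows is a hedged sketch of it under several assumptions: the tree is strictly binary and written by hand as nested tuples (so nothing here depends on ete3), internal nodes are numbered breadth-first so that the labels line up with the `Node_N` names above, and an ambiguous root is resolved in favour of absence (`"0"`), which is a polarity choice made purely for illustration. All function names are hypothetical.

```python
from collections import deque

# Tree from the question as nested tuples; leaves are plain strings.
TREE = (("Dme_001", "Dme_002"),
        ((("Cfa_001", "Mms_001"), (("Hsa_001", "Ptr_001"), "Mmu_001")),
         ("Ptr_002", ("Hsa_002", "Mmu_002"))))


def name_nodes(tree):
    """Number internal nodes breadth-first; Node_0 is the root."""
    names, queue, i = {}, deque([tree]), 0
    while queue:
        node = queue.popleft()
        if isinstance(node, tuple):
            names[node] = "Node_%d" % i
            i += 1
            queue.extend(node)
    return names


def leaves_of(node):
    """Return the leaf names below a node, left to right."""
    if not isinstance(node, tuple):
        return [node]
    return [leaf for child in node for leaf in leaves_of(child)]


def fitch_sets(node, leaf_state, sets):
    """Down-pass: intersection of the child sets, or their union if empty."""
    if not isinstance(node, tuple):
        sets[node] = set([leaf_state[node]])
        return sets[node]
    left, right = [fitch_sets(child, leaf_state, sets) for child in node]
    sets[node] = (left & right) or (left | right)
    return sets[node]


def fitch_changes(tree, leaf_state, names, root_prefers="0"):
    """Up-pass: return (branch, change) pairs for one character."""
    sets = {}
    fitch_sets(tree, leaf_state, sets)
    root_set = sets[tree]
    # Assumption: an ambiguous root is treated as "absent".
    root_state = root_prefers if root_prefers in root_set else min(root_set)
    changes, stack = [], [(tree, root_state)]
    while stack:
        node, state = stack.pop()
        if not isinstance(node, tuple):
            continue
        for child in node:
            # Keep the parent's state when the child's set allows it.
            child_state = state if state in sets[child] else min(sets[child])
            if child_state != state:
                changes.append((names.get(child, child),
                                "%s->%s" % (state, child_state)))
            stack.append((child, child_state))
    return changes


names = name_nodes(TREE)
leaves = leaves_of(TREE)
# Character 0: present only in the two Dme leaves.
char0 = dict((leaf, "1" if leaf.startswith("Dme") else "0") for leaf in leaves)
# Character 15: absent only in Hsa_001.
char15 = dict((leaf, "0" if leaf == "Hsa_001" else "1") for leaf in leaves)
print(fitch_changes(TREE, char0, names))   # [('Node_1', '0->1')]
print(fitch_changes(TREE, char15, names))  # [('Hsa_001', '1->0')]
```

Because the down-pass records sets of possible states rather than merging identical sisters, the single loss in `Hsa_001` stays a single `1->0` change on that leaf instead of being reported as several independent gains. When a character has more than one equally parsimonious reconstruction, this sketch returns only one of them.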

I'm coding in Python 2.7, and any help is welcome.

My code:

  from ete3 import PhyloTree
  from collections import Counter
  import itertools

  PAM = open('PAM', 'r')
  gene_tree = '((Dme_001, Dme_002), (((Cfa_001, Mms_001), ((Hsa_001, Ptr_001), Mmu_001)), (Ptr_002, (Hsa_002, Mmu_002))));'
  NodeIDs = []
  tree = PhyloTree(gene_tree)

  # Name the internal nodes Node_0, Node_1, ... in traversal order
  edge = 0
  for node in tree.traverse():
      if not node.is_leaf():
          node.name = "Node_%d" % edge
          edge += 1
          NodeIDs.append(node.name)
      if node.is_leaf():
          NodeIDs.append(node.name)

  # Read the presence-absence matrix: taxon name, tab, state string
  f = open('PAM', 'r')
  taxa = []
  pap = []
  for line in f:
      term = line.strip().split('\t')
      taxa.append(term[0])
      p = [p for p in term[1]]
      pap.append(p)
  statesD = dict(zip(taxa, pap))

  def PlotCharacterStates():
      Plots = []
      events = []
      # Collect (taxon, character index) pairs that carry the current state
      for key, value in statesD.iteritems():
          count = -1
          for s in value:
              count += 1
              if s == CharacterState:
                  a = key, count
                  events.append(a)
      Round3_events = []
      while len(events) > 0:
          for rel in Relationships:
              node_store = []
              sis_store = []
              for event in events:
                  if rel[0] == event[0]:
                      node_store.append(event[1])
                  if rel[1] == event[0]:
                      sis_store.append(event[1])
              if (len(node_store) > 0) and (len(sis_store) > 0):
                  place = rel, node_store, sis_store
                  Round3_events.append(place)
          moved = []
          for placement in Round3_events:
              # Characters shared by both sisters move up to the parent
              intercept = (set(placement[1]) & set(placement[2]))
              node_plot = (set(placement[1]) - set(placement[2]))
              sis_plot = (set(placement[2]) - set(placement[1]))
              if len(node_plot) > 0:
                  for x in node_plot:
                      y = placement[0][0], x
                      Plots.append(y)
                      moved.append(y)
              if len(sis_plot) > 0:
                  for x in sis_plot:
                      y = placement[0][1], x
                      Plots.append(y)
                      moved.append(y)
              if len(intercept) > 0:
                  for x in intercept:
                      y = placement[0][2], x
                      y1 = placement[0][0], x
                      y2 = placement[0][1], x
                      moved.append(y1)
                      moved.append(y2)
                      events.append(y)
          for event in events:
              if event[0] == "Node_0":
                  Plots.append(event)
                  moved.append(event)
          events2 = (set(events) - set(moved))
          events = []
          for event in events2:
              events.append(event)
      pl = set(Plots)
      Plots = []
      for p in pl:
          Plots.append(p)
      print CharacterState, Plots

  '''assign sisters to leaves, internals'''
  e = []
  round1b_e = []
  round2a_e = []
  placements = []
  Relationships = []
  Rounds = []
  for node in tree.traverse():
      sisters = node.get_sisters()
      parent = node.up
      cycle1 = []
      if node.is_leaf():
          for sister in sisters:
              if sister.is_leaf():
                  round1a = ["Round1a", node.name, sister.name, parent.name]
                  node_names = node.name, sister.name
                  Rounds.append(round1a)
                  e.append(node_names)
                  x = node.name, sister.name, parent.name, "leaf-leaf"
                  Relationships.append(x)
              if not sister.is_leaf():
                  round1b = ["Round1b", node.name, sister.name, parent.name]
                  node_names = node.name, sister.name
                  Rounds.append(round1b)
                  round1b_e.append(node_names)
                  x = node.name, sister.name, parent.name, "node-leaf"
                  Relationships.append(x)
      elif not node.is_leaf():
          if not node.is_root():
              for sister in sisters:
                  if not sister.is_leaf():
                      node_names = node.name, sister.name
                      round2a_e.append(node_names)
                      x = node.name, sister.name, parent.name, "node-node"
                      Relationships.append(x)

  # Run the placement once for every distinct character state in the matrix
  x = []
  CharacterStates = []
  for key, value in statesD.iteritems():
      for value in value:
          x.append(value)
  y = sorted(set(x))
  for x in y:
      CharacterStates.append(x)
  for CharacterState in CharacterStates:
      PlotCharacterStates()

I've added my code, thanks for the suggestion. I'm using Python 2.7 and am not using Biopython yet, but I'm happy to use it if there is a solution. I'm trying to replicate the -apo function from TNT (Goloboff 2008) if possible (if you are familiar with it).

Please [edit] the question to include this information; it could be useful to others (I'm not familiar with this area, sorry).