💻 Task 6: Finding all CDS
Task description
Using BioPython, find the sequences of the different coding regions annotated as CDS. A CDS is the coding region that remains after the introns have been cut out.
from Bio import SeqIO
gene_record = SeqIO.read("yersinia-pestis-fasta/NC_005816.gb", "genbank")
The total length of the DNA sequence is:
print(len(gene_record.seq))
9609
What interests us here is gene_record.features, a list of features that carries much of the description of the sequence itself. Once you start working with such sequences, this organization makes it easy to get at the more "abstract" information that is known about the sequence.
The total number of these features can be obtained with:
print(len(gene_record.features))
41
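One possible way to see what kinds of features make up that total is to tally the .type attribute with collections.Counter. The sketch below runs on a small hypothetical stand-in list (the names toy_features and count_feature_types and the list contents are made up for illustration and do not reproduce NC_005816); with the real record, gene_record.features could be passed in instead.

```python
from collections import Counter
from types import SimpleNamespace

# Hypothetical stand-ins that only mimic the .type attribute of real features.
toy_features = [
    SimpleNamespace(type="source"),
    SimpleNamespace(type="gene"),
    SimpleNamespace(type="CDS"),
    SimpleNamespace(type="gene"),
    SimpleNamespace(type="CDS"),
]


def count_feature_types(features):
    """Return a Counter mapping each feature type to how often it occurs."""
    return Counter(feature.type for feature in features)


type_counts = count_feature_types(toy_features)
for feature_type, count in type_counts.most_common():
    print(feature_type, count)
```

Because the function only reads .type, it should work unchanged on the SeqFeature objects in gene_record.features.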
Each feature has several attributes. Inspecting the first feature in the list shows many of them; the most important are:
.type: the kind of feature ("CDS", "gene", …)
.location: where the feature lies on the sequence, as a start:end mapping
dir(gene_record.features[0])
['__bool__',
'__class__',
'__contains__',
'__delattr__',
'__dict__',
'__dir__',
'__doc__',
'__eq__',
'__format__',
'__ge__',
'__getattribute__',
'__gt__',
'__hash__',
'__init__',
'__init_subclass__',
'__iter__',
'__le__',
'__len__',
'__lt__',
'__module__',
'__ne__',
'__new__',
'__reduce__',
'__reduce_ex__',
'__repr__',
'__setattr__',
'__sizeof__',
'__str__',
'__subclasshook__',
'__weakref__',
'_flip',
'_get_location_operator',
'_get_ref',
'_get_ref_db',
'_get_strand',
'_set_location_operator',
'_set_ref',
'_set_ref_db',
'_set_strand',
'_shift',
'extract',
'id',
'location',
'location_operator',
'qualifiers',
'ref',
'ref_db',
'strand',
'translate',
'type']
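To make the start:end convention of .location concrete, here is a minimal pure-Python sketch of how such a location maps onto a sequence. Biopython locations use Python-style coordinates (start is 0-based and inclusive, end is exclusive), so a location such as [4342:4780] covers 4780 - 4342 = 438 bases. The helper slice_location and the toy sequence are assumptions for illustration only; with Biopython the equivalent work is done by the feature's extract method listed above.

```python
# Toy stand-in for a DNA sequence; NOT taken from NC_005816.
toy_seq = "AAATGGCCTGATTT"

# Translation table that swaps each base for its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def slice_location(seq, start, end, strand=1):
    """Return the bases covered by a location written as [start:end](strand).

    start is 0-based and inclusive, end is exclusive, as in Python slicing.
    For strand -1 the slice is reverse-complemented.
    """
    part = seq[start:end]
    if strand == -1:
        part = part.translate(COMPLEMENT)[::-1]
    return part


plus = slice_location(toy_seq, 2, 11, +1)
minus = slice_location(toy_seq, 2, 11, -1)
print(plus, len(plus))
print(minus, len(minus))
```

The (+) and (-) suffixes in the printed locations correspond to the strand argument here.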
An interesting CDS, used as an example in the book, is the "pim" gene (YP_pPCP05), which is located in the sequence between base pairs [4342:4780]:
print(gene_record.features[21])
type: CDS
location: [4342:4780](+)
qualifiers:
Key: codon_start, Value: ['1']
Key: db_xref, Value: ['GI:45478716', 'GeneID:2767712']
Key: gene, Value: ['pim']
Key: locus_tag, Value: ['YP_pPCP05']
Key: note, Value: ['similar to many previously sequenced pesticin immunity protein entries of Yersinia pestis plasmid pPCP, e.g. gi| 16082683|,ref|NP_395230.1| (NC_003132) , gi|1200166|emb|CAA90861.1| (Z54145 ) , gi|1488655| emb|CAA63439.1| (X92856) , gi|2996219|gb|AAC62543.1| (AF053945) , and gi|5763814|emb|CAB531 67.1| (AL109969)']
Key: product, Value: ['pesticin immunity protein']
Key: protein_id, Value: ['NP_995571.1']
Key: transl_table, Value: ['11']
Key: translation, Value: ['MGGGMISKLFCLALIFLSSSGLAEKNTYTAKDILQNLELNTFGNSLSHGIYGKQTTFKQTEFTNIKSNTKKHIALINKDNSWMISLKILGIKRDEYTVCFEDFSLIRPPTYVAIHPLLIKKVKSGNFIVVKEIKKSIPGCTVYYH']
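Looking a feature up by a hard-coded index such as 21 is fragile. As a hedged alternative, the sketch below searches by the "gene" qualifier instead. The helper find_by_gene is hypothetical, and the toy list uses stand-in objects that mimic only the .type and .qualifiers attributes; the gene names and locus tags are copied from the outputs in this document, while the "gene"-typed entry is an assumption added for illustration.

```python
from types import SimpleNamespace

# Stand-ins that mimic only .type and .qualifiers of real SeqFeature objects.
toy_features = [
    SimpleNamespace(type="gene", qualifiers={"gene": ["pim"], "locus_tag": ["YP_pPCP05"]}),
    SimpleNamespace(type="CDS", qualifiers={"gene": ["rop"], "locus_tag": ["YP_pPCP03"]}),
    SimpleNamespace(type="CDS", qualifiers={"locus_tag": ["YP_pPCP04"]}),
    SimpleNamespace(type="CDS", qualifiers={"gene": ["pim"], "locus_tag": ["YP_pPCP05"]}),
]


def find_by_gene(features, gene_name, feature_type="CDS"):
    """Return the indices of features of the given type whose 'gene' qualifier matches."""
    matches = []
    for index, feature in enumerate(features):
        if feature.type != feature_type:
            continue
        # Qualifier values are lists of strings; not every CDS has a 'gene' entry.
        if gene_name in feature.qualifiers.get("gene", []):
            matches.append(index)
    return matches


pim_indices = find_by_gene(toy_features, "pim")
print(pim_indices)
```

With the real record, find_by_gene(gene_record.features, "pim") should return the index of the CDS shown above.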
However, we need to find all such CDS features. We can do this by scanning the whole list of 41 features and keeping those whose type is "CDS":
# Collect the indices of all features whose type is "CDS"
CDS_list = []
for i in range(len(gene_record.features)):
    if gene_record.features[i].type == "CDS":
        CDS_list.append(i)
print(f"Number of CDS found: {len(CDS_list)}")
Number of CDS found: 10
There are 10 of them in total, and we can print each one:
for i in CDS_list:
    print(gene_record.features[i])
type: CDS
location: [86:1109](+)
qualifiers:
Key: codon_start, Value: ['1']
Key: db_xref, Value: ['GI:45478712', 'GeneID:2767718']
Key: locus_tag, Value: ['YP_pPCP01']
Key: note, Value: ['similar to corresponding CDS from previously sequenced pPCP plasmid of Yersinia pestis KIM (AF053945) and CO92 (AL109969), also many transposase entries for insertion sequence IS100 of Yersinia pestis. Contains IS21-like element transposase, HTH domain (Interpro|IPR007101)']
Key: product, Value: ['putative transposase']
Key: protein_id, Value: ['NP_995567.1']
Key: transl_table, Value: ['11']
Key: translation, Value: ['MVTFETVMEIKILHKQGMSSRAIARELGISRNTVKRYLQAKSEPPKYTPRPAVASLLDEYRDYIRQRIADAHPYKIPATVIAREIRDQGYRGGMTILRAFIRSLSVPQEQEPAVRFETEPGRQMQVDWGTMRNGRSPLHVFVAVLGYSRMLYIEFTDNMRYDTLETCHRNAFRFFGGVPREVLYDNMKTVVLQRDAYQTGQHRFHPSLWQFGKEMGFSPRLCRPFRAQTKGKVERMVQYTRNSFYIPLMTRLRPMGITVDVETANRHGLRWLHDVANQRKHETIQARPCDRWLEEQQSMLALPPEKKEYDVHLDENLVNFDKHPLHHPLSIYDSFCRGVA']
type: CDS
location: [1105:1888](+)
qualifiers:
Key: codon_start, Value: ['1']
Key: db_xref, Value: ['GI:45478713', 'GeneID:2767716']
Key: locus_tag, Value: ['YP_pPCP02']
Key: note, Value: ['similar to corresponding CDS form previously sequenced pPCP plasmid of Yersinia pestis KIM (AF053945) and CO92 (AL109969), also many ATP-binding protein entries for insertion sequence IS100 of Yersinia pestis. Contains Chaperonin clpA/B (Interpro|IPR001270). Contains ATP/GTP-binding site motif A (P-loop) (Interpro|IPR001687, Molecular Function: ATP binding (GO:0005524)). Contains Bacterial chromosomal replication initiator protein, DnaA (Interpro|IPR001957, Molecular Function: DNA binding (GO:0003677), Molecular Function: DNA replication origin binding (GO:0003688), Molecular Function: ATP binding (GO:0005524), Biological Process: DNA replication initiation (GO:0006270), Biological Process: regulation of DNA replication (GO:0006275)). Contains AAA ATPase (Interpro|IPR003593, Molecular Function: nucleotide binding (GO:0000166))']
Key: product, Value: ['transposase/IS protein']
Key: protein_id, Value: ['NP_995568.1']
Key: transl_table, Value: ['11']
Key: translation, Value: ['MMMELQHQRLMALAGQLQLESLISAAPALSQQAVDQEWSYMDFLEHLLHEEKLARHQRKQAMYTRMAAFPAVKTFEEYDFTFATGAPQKQLQSLRSLSFIERNENIVLLGPSGVGKTHLAIAMGYEAVRAGIKVRFTTAADLLLQLSTAQRQGRYKTTLQRGVMAPRLLIIDEIGYLPFSQEEAKLFFQVIAKRYEKSAMILTSNLPFGQWDQTFAGDAALTSAMLDRILHHSHVVQIKGESYRLRQKRKAGVIAEANPE']
type: CDS
location: [2924:3119](+)
qualifiers:
Key: codon_start, Value: ['1']
Key: db_xref, Value: ['GI:45478714', 'GeneID:2767717']
Key: gene, Value: ['rop']
Key: gene_synonym, Value: ['rom']
Key: locus_tag, Value: ['YP_pPCP03']
Key: note, Value: ['Best Blastp hit =gi|16082682|ref|NP_395229.1| (NC_003132) putative replication regulatory protein [Yersinia pestis], gi|5763813|emb|CAB531 66.1| (AL109969) putative replication regulatory protein [Yersinia pestis]; similar to gb|AAK91579.1| (AY048853), RNAI modulator protein Rom [Salmonella choleraesuis], Contains Regulatory protein Rop (Interpro|IPR000769)']
Key: product, Value: ['putative replication regulatory protein']
Key: protein_id, Value: ['NP_995569.1']
Key: transl_table, Value: ['11']
Key: translation, Value: ['MNKQQQTALNMARFIRSQSLILLEKLDALDADEQAAMCERLHELAEELQNSIQARFEAESETGT']
type: CDS
location: [3485:3857](+)
qualifiers:
Key: codon_start, Value: ['1']
Key: db_xref, Value: ['GI:45478715', 'GeneID:2767720']
Key: locus_tag, Value: ['YP_pPCP04']
Key: note, Value: ['Best Blastp hit = gi|321919|pir||JQ1541 hypothetical 16.9K protein - Salmonella typhi murium plasmid NTP16.']
Key: product, Value: ['hypothetical protein']
Key: protein_id, Value: ['NP_995570.1']
Key: transl_table, Value: ['11']
Key: translation, Value: ['MSKKRRPQKRPRRRRFFHRLRPPDEHHKNRRSSQRWRNPTGLKDTRRFPPEAPSCALLFRPCRLPDTSPPFSLREAWRFLIAHAVGISVRCRSFAPSWAVCTNPPFSPTTAPYPVTIVLSPTR']
type: CDS
location: [4342:4780](+)
qualifiers:
Key: codon_start, Value: ['1']
Key: db_xref, Value: ['GI:45478716', 'GeneID:2767712']
Key: gene, Value: ['pim']
Key: locus_tag, Value: ['YP_pPCP05']
Key: note, Value: ['similar to many previously sequenced pesticin immunity protein entries of Yersinia pestis plasmid pPCP, e.g. gi| 16082683|,ref|NP_395230.1| (NC_003132) , gi|1200166|emb|CAA90861.1| (Z54145 ) , gi|1488655| emb|CAA63439.1| (X92856) , gi|2996219|gb|AAC62543.1| (AF053945) , and gi|5763814|emb|CAB531 67.1| (AL109969)']
Key: product, Value: ['pesticin immunity protein']
Key: protein_id, Value: ['NP_995571.1']
Key: transl_table, Value: ['11']
Key: translation, Value: ['MGGGMISKLFCLALIFLSSSGLAEKNTYTAKDILQNLELNTFGNSLSHGIYGKQTTFKQTEFTNIKSNTKKHIALINKDNSWMISLKILGIKRDEYTVCFEDFSLIRPPTYVAIHPLLIKKVKSGNFIVVKEIKKSIPGCTVYYH']
type: CDS
location: [4814:5888](-)
qualifiers:
Key: codon_start, Value: ['1']
Key: db_xref, Value: ['GI:45478717', 'GeneID:2767721']
Key: gene, Value: ['pst']
Key: locus_tag, Value: ['YP_pPCP06']
Key: note, Value: ['Best Blastp hit =|16082684|ref|NP_395231.1| (NC_003132) pesticin [Yersinia pestis], gi|984824|gb|AAA75369.1| (U31974) pesticin [Yersinia pestis], gi|1488654|emb|CAA63438.1| (X92856) pesticin [Yersinia pestis], gi|2996220|gb|AAC62544.1| (AF053945) pesticin [Yersinia pestis], gi|5763815|emb|CAB53168.1| (AL1099 69) pesticin [Yersinia pestis]']
Key: product, Value: ['pesticin']
Key: protein_id, Value: ['NP_995572.1']
Key: transl_table, Value: ['11']
Key: translation, Value: ['MSDTMVVNGSGGVPAFLFSGSTLSSYRPNFEANSITIALPHYVDLPGRSNFKLMYIMGFPIDTEMEKDSEYSNKIRQESKISKTEGTVSYEQKITVETGQEKDGVKVYRVMVLEGTIAESIEHLDKKENEDILNNNRNRIVLADNTVINFDNISQLKEFLRRSVNIVDHDIFSSNGFEGFNPTSHFPSNPSSDYFNSTGVTFGSGVDLGQRSKQDLLNDGVPQYIADRLDGYYMLRGKEAYDKVRTAPLTLSDNEAHLLSNIYIDKFSHKIEGLFNDANIGLRFSDLPLRTRTALVSIGYQKGFKLSRTAPTVWNKVIAKDWNGLVNAFNNIVDGMSDRRKREGALVQKDIDSGLLK']
type: CDS
location: [6004:6421](+)
qualifiers:
Key: codon_start, Value: ['1']
Key: db_xref, Value: ['GI:45478718', 'GeneID:2767719']
Key: locus_tag, Value: ['YP_pPCP07']
Key: note, Value: ['Best Blastp hit = gi|16082685|ref|NP_395232.1| (NC_003132) hypothetical protein [Yersinia pestis], gi|5763816|emb|CAB53169.1| (AL109969) hypothetical protein [Yersinia pestis]']
Key: product, Value: ['hypothetical protein']
Key: protein_id, Value: ['NP_995573.1']
Key: transl_table, Value: ['11']
Key: translation, Value: ['MKFHFCDLNHSYKNQEGKIRSRKTAPGNIRKKQKGDNVSKTKSGRHRLSKTDKRLLAALVVAGYEERTARDLIQKHVYTLTQADLRHLVSEISNGVGQSQAYDAIYQARRIRLARKYLSGKKPEGVEPREGQEREDLP']
type: CDS
location: [6663:7602](+)
qualifiers:
Key: EC_number, Value: ['3.4.23.48']
Key: codon_start, Value: ['1']
Key: db_xref, Value: ['GI:45478719', 'GeneID:2767715']
Key: gene, Value: ['pla']
Key: locus_tag, Value: ['YP_pPCP08']
Key: note, Value: ['outer membrane protease; involved in virulence in many organisms; OmpT; IcsP; SopA; Pla; PgtE; omptin; in Escherichia coli OmpT can degrade antimicrobial peptides; in Yersinia Pla activates plasminogen during infection; in Shigella flexneria SopA cleaves the autotransporter IcsA']
Key: product, Value: ['outer membrane protease']
Key: protein_id, Value: ['NP_995574.1']
Key: transl_table, Value: ['11']
Key: translation, Value: ['MKKSSIVATIITILSGSANAASSQLIPNISPDSFTVAASTGMLSGKSHEMLYDAETGRKISQLDWKIKNVAILKGDISWDPYSFLTLNARGWTSLASGSGNMDDYDWMNENQSEWTDHSSHPATNVNHANEYDLNVKGWLLQDENYKAGITAGYQETRFSWTATGGSYSYNNGAYTGNFPKGVRVIGYNQRFSMPYIGLAGQYRINDFELNALFKFSDWVRAHDNDEHYMRDLTFREKTSGSRYYGTVINAGYYVTPNAKVFAEFTYSKYDEGKGGTQIIDKNSGDSVSIGGDAAGISNKNYTVTAGLQYRF']
type: CDS
location: [7788:8088](-)
qualifiers:
Key: codon_start, Value: ['1']
Key: db_xref, Value: ['GI:45478720', 'GeneID:2767713']
Key: locus_tag, Value: ['YP_pPCP09']
Key: note, Value: ['Best Blastp hit = gi|16082687|ref|NP_395234.1| (NC_003132) putative transcriptional regulator [Yersinia pestis], gi|5763818|emb|CAB53171.1| (AL109969) putative transcriptional regulator [Yersinia pestis].']
Key: product, Value: ['putative transcriptional regulator']
Key: protein_id, Value: ['NP_995575.1']
Key: transl_table, Value: ['11']
Key: translation, Value: ['MRTLDEVIASRSPESQTRIKEMADEMILEVGLQMMREELQLSQKQVAEAMGISQPAVTKLEQRGNDLKLATLKRYVEAMGGKLSLDVELPTGRRVAFHV']
type: CDS
location: [8087:8360](-)
qualifiers:
Key: codon_start, Value: ['1']
Key: db_xref, Value: ['GI:45478721', 'GeneID:2767714']
Key: locus_tag, Value: ['YP_pPCP10']
Key: note, Value: ['Best Blastp hit = gi|16082688|ref|NP_395235.1| (NC_003132) hypothetical protein [ Yersinia pestis], gi|5763819|emb|CAB53172.1| (AL109969) hypothetical protein [Yersinia pestis]']
Key: product, Value: ['hypothetical protein']
Key: protein_id, Value: ['NP_995576.1']
Key: transl_table, Value: ['11']
Key: translation, Value: ['MADLKKLQVYGPELPRPYADTVKGSRYKNMKELRVQFSGRPIRAFYAFDPIRRAIVLCAGDKSNDKRFYEKLVRIAEDEFTAHLNTLESK']
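The listing above shows the annotations but not the nucleotide sequences themselves, which is what the task asks for. A minimal sketch of one way to get them is to call each feature's extract method on the record's sequence and, optionally, translate the result. The example builds a tiny in-memory record so that it runs without the GenBank file; toy_record, its sequence and the locus tag toy_01 are made up for illustration, and cds_sequences is a hypothetical helper name.

```python
from Bio.Seq import Seq
from Bio.SeqFeature import FeatureLocation, SeqFeature
from Bio.SeqRecord import SeqRecord

# Toy record standing in for gene_record; the sequence is NOT from NC_005816.
toy_record = SeqRecord(Seq("CCATGGCCATTGTAATGGGCCGCTGAGG"), id="toy")
toy_record.features.append(
    SeqFeature(
        FeatureLocation(2, 26, strand=+1),
        type="CDS",
        qualifiers={"locus_tag": ["toy_01"], "transl_table": ["11"]},
    )
)


def cds_sequences(record):
    """Map each CDS locus_tag to (nucleotide sequence, translated protein)."""
    result = {}
    for feature in record.features:
        if feature.type != "CDS":
            continue
        tag = feature.qualifiers.get("locus_tag", ["unknown"])[0]
        # Fall back to the standard table when transl_table is absent.
        table = int(feature.qualifiers.get("transl_table", ["1"])[0])
        # extract() slices the parent sequence and handles the strand.
        dna = feature.extract(record.seq)
        protein = dna.translate(table=table, to_stop=True)
        result[tag] = (str(dna), str(protein))
    return result


sequences = cds_sequences(toy_record)
for tag, (dna, protein) in sequences.items():
    print(tag, dna, protein)
```

With the real record, cds_sequences(gene_record) should yield one entry per CDS, and each translated protein could be compared against the corresponding translation qualifier printed above.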