[python] A method to reduce ID length using ASCII values
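Bioinformatics/Biological data analysis 2012. 12. 24. 17:29

def asciiID(i, one, two, three, four, five):
    # i is the running integer ID; the digits are tracked by the counters instead.
    # one..five are base-94 digit counters (one is the least significant digit).
    # Carry an overflowing digit into the next position.
    if one >= len(chrtb): one = 0; two += 1
    if two >= len(chrtb): two = 0; three += 1
    if three >= len(chrtb): three = 0; four += 1
    if four >= len(chrtb): four = 0; five += 1
    # Build the 5-character ID, most significant digit first.
    pk = chrtb[five] + chrtb[four] + chrtb[three] + chrtb[two] + chrtb[one]
    one += 1  # advance the counter for the next call
    return pk, one, two, three, four, five

# Printable ASCII characters 33..126 ('!' through '~'): 94 symbols
chrtb = [chr(ch) for ch in range(33, 127)]

one = 0; two = 0; three = 0; four = 0; five = 0
# Python 3: range() is lazy, so this does not build a 3-billion-element list
for inputInteger in range(1, 3000000000):
    pk, one, two, three, four, five = asciiID(inputInteger, one, two, three, four, five)
    print(inputInteger, pk)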
When IDs are sequential integers, each one can be converted into a short string of printable ASCII characters. Just 5 characters can represent about 7.3 billion IDs: codes 33 to 126 give 94 characters, and 94*94*94*94*94 = 7,339,040,224.
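The same mapping can also be computed directly from the integer, without carrying five counters between calls. The following is only a minimal sketch of that idea, not the code from this post: the names `ALPHABET`, `encode_id`, and `decode_id` are hypothetical, and the index is assumed to start at 0.

```python
# Hypothetical helpers (not from the original post): stateless base-94 conversion.
ALPHABET = [chr(code) for code in range(33, 127)]  # 94 printable ASCII characters
BASE = len(ALPHABET)   # 94
WIDTH = 5              # fixed ID length, so BASE ** WIDTH IDs are available


def encode_id(index):
    """Convert a zero-based integer into a fixed-width base-94 string."""
    if not 0 <= index < BASE ** WIDTH:
        raise ValueError("index does not fit in a 5-character ID")
    chars = []
    for _ in range(WIDTH):
        # Peel off the least significant base-94 digit each round.
        index, digit = divmod(index, BASE)
        chars.append(ALPHABET[digit])
    # Digits were collected least significant first, so reverse them.
    return "".join(reversed(chars))


def decode_id(text):
    """Convert a base-94 string produced by encode_id back into its integer."""
    value = 0
    for ch in text:
        value = value * BASE + (ord(ch) - 33)
    return value
```

Because the loop in this post pairs the integer 1 with the ID `!!!!!`, calling `encode_id(n - 1)` should give the same string that the loop prints for integer `n`.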