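When Java has to run a system command or another external program, the child process may keep writing to standard out or standard error, fail to exit normally, and end up hanging like a zombie.

This happens because Java captures the child's standard out/error itself through pipes. If nothing reads those pipes, the buffer fills up once a certain amount of data has accumulated, and the child blocks on its next write and never finishes.

In that case, send standard out/error to files instead. Add the redirection directly to the command that is run on the system. Because > and 2> are shell syntax, the command has to be passed through a shell (for example sh -c) for the redirection to take effect.

Example) bcftools -bcvg test.bam > standard.out 2> standard.error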
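
The same pipe-buffer problem applies to any parent process, not only Java. Below is a minimal sketch of the "send output to files" idea in Python (used here because the other notes in this blog are in Python, not because the original setup used it). The function name run_to_files is made up for illustration.

```python
import subprocess
import sys


def run_to_files(cmd, out_path, err_path):
    """Run cmd with stdout/stderr written straight to files; return the exit code."""
    with open(out_path, "wb") as out, open(err_path, "wb") as err:
        # The child writes to the files itself, so there is no pipe buffer
        # that could fill up and block it, however much it prints.
        return subprocess.call(cmd, stdout=out, stderr=err)
```

Calling run_to_files(["some_command", "arg"], "standard.out", "standard.error") plays the role of the shell redirection in the example above; the command list is passed without a shell, so no > or 2> is needed.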

 
Posted by 옥탑방람보
The order of chromosomes in the reference must match the order of chromosomes inside the BAM.
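
A rough way to check this, assuming you have the reference's FASTA index (.fai) and the BAM header as text (for example from samtools view -H). The helper names below are made up for this sketch; it only compares sequence names in order.

```python
def fai_names(fai_text):
    """Sequence names from a .fai index: the first tab-separated column."""
    return [line.split("\t")[0] for line in fai_text.splitlines() if line.strip()]


def sq_names(sam_header_text):
    """Sequence names from the @SQ lines of a SAM/BAM header, in header order."""
    names = []
    for line in sam_header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def same_order(fai_text, sam_header_text):
    """True when both list the same sequence names in the same order."""
    return fai_names(fai_text) == sq_names(sam_header_text)
```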
Posted by 옥탑방람보

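1. Download (the build that matches your installed Python version)
   http://code.google.com/p/pygr/downloads/list
2. Extract the egg file (with ALZip or any other zip tool)
3. Put it under the Python library directory (somewhere it can be imported from)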
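
Step 2 can also be done from Python itself, assuming the downloaded egg is the zipped kind (a plain zip archive). This is only a sketch; extract_egg and its arguments are names invented here, and the target directory is whatever import location you chose in step 3.

```python
import os
import zipfile


def extract_egg(egg_path, target_dir):
    """Unpack a zipped .egg into target_dir and return what target_dir now contains."""
    with zipfile.ZipFile(egg_path) as egg:
        egg.extractall(target_dir)
    return sorted(os.listdir(target_dir))
```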

 
Posted by 옥탑방람보
1. Download
2. Extract the archive
3. mv it to the install directory
4. ./configure
5. make
6. Add the following to ~/.bashrc

#python
export PATH=/.../install/Python-2.6.7/:$PATH
export PYTHONPATH=/.../kimps/lib:/.../install/Python-2.6.7/Lib:/.../install/Python-2.6.7/Lib/site-packages 
Posted by 옥탑방람보
MD tag and cigar
10A5^AC6

REF:         ATCGTAGCTAATTTGGACATCGGT
READ:        ATCGTAGCTATTTTGG--ATCGGT
MD TAG:      10        A5   ^AC6
CIGAR:       16M             2D6M
READ:        atcGTAGCTATTTTGGATA..GGT (ATCGTAGCTATTTTGGATAAAGGT)
MD TAG:      17               C1TC3
CIGAR:       3S 16M             2N3M
READ:        ATCGTAGCTAATTTGGACATCGGT (ATCGTGGAGCTAATTTGGACATCGGT)
CIGAR:       5M   2I19M


MD TAG
The MD field aims to achieve SNP/indel calling without looking at the reference. For example, a string '10A5^AC6' means from the leftmost reference base in the alignment, there are 10 matches followed by an A on the reference which is different from the aligned read base; the next 5 reference bases are matches followed by a 2bp deletion from the reference; the deleted sequence is AC; the last 6 bases are matches. The MD field ought to match the CIGAR string.
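
The reading of '10A5^AC6' described above can be sketched in Python as follows. parse_md and reference_span are names made up for this note, and the tokenizer only covers the three token kinds mentioned in the paragraph (match counts, single mismatched reference bases, and ^-prefixed deletions).

```python
import re

# A token is a run of matches (digits), a deletion (^ plus the deleted bases),
# or a single mismatched reference base.
MD_TOKEN = re.compile(r"(\d+)|\^([A-Za-z]+)|([A-Za-z])")


def parse_md(md):
    """Split an MD string into (kind, value) pairs."""
    ops = []
    for count, deleted, mismatch in MD_TOKEN.findall(md):
        if count:
            ops.append(("match", int(count)))
        elif deleted:
            ops.append(("deletion", deleted))
        else:
            ops.append(("mismatch", mismatch))
    return ops


def reference_span(md):
    """Number of reference bases the MD string accounts for."""
    total = 0
    for kind, value in parse_md(md):
        total += value if kind == "match" else len(value)
    return total
```

For '10A5^AC6', parse_md gives 10 matches, a mismatch at reference base A, 5 matches, a deletion of AC, and 6 matches; reference_span gives 24, which is the length of REF in the example at the top of this post.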

CIGAR

M     alignment match (can be a sequence match or mismatch)
I     insertion to the reference
D     deletion from the reference
N     skipped region from the reference
S     soft clipping (clipped sequences present in SEQ)
H     hard clipping (clipped sequences NOT present in SEQ)
P     padding (silent deletion from padded reference)
=     sequence match
X     sequence mismatch

H can only be present as the first and/or last operation.
S may only have H operations between them and the ends of the CIGAR string.
For mRNA-to-genome alignment, an N operation represents an intron. For other types of alignments, the interpretation of N is not defined.
Sum of lengths of the M/I/S/=/X operations ought to equal the length of SEQ.
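
The last rule is easy to check in code. A small sketch, with function names invented for this note: it splits a CIGAR string into (length, operation) pairs and sums the operations that consume bases of SEQ.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
QUERY_OPS = set("MIS=X")  # operations whose lengths count toward len(SEQ)


def parse_cigar(cigar):
    """Return (length, op) pairs, e.g. '16M2D6M' -> [(16, 'M'), (2, 'D'), (6, 'M')]."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Rebuild the string to make sure nothing was silently skipped.
    if "".join("%d%s" % pair for pair in ops) != cigar:
        raise ValueError("malformed CIGAR: %r" % cigar)
    return ops


def query_length(cigar):
    """Sum of the M/I/S/=/X lengths."""
    return sum(n for n, op in parse_cigar(cigar) if op in QUERY_OPS)


def matches_seq(cigar, seq):
    """True when the CIGAR accounts for exactly len(seq) read bases."""
    return query_length(cigar) == len(seq)
```

With the examples from the top of this post, query_length("16M2D6M") is 22 and query_length("5M2I19M") is 26, matching the 22-base and 26-base reads shown there.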
Posted by 옥탑방람보
import os

# Relative path, resolved against the current working directory
a = '../work/'
print(os.path.abspath(a))
Posted by 옥탑방람보


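Drawing an ERD in ERwin from SQL statements
1. Prepare the SQL script containing the create table statements
2. Tool > Reverse Engineer
3. Logical/Physical > select the Database (Oracle)
4. Reverse Engineer From > select Script File (choose the SQL file)
5. Check the elements to model (tables, indexes, procedures, tablespaces, etc.)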

Posted by 옥탑방람보
Exception in thread "main" java.lang.RuntimeException: SAM validation error: ERROR: Record 62094965, Read name ILLUMINA-A16956_100211:4:14:19403:10471#0, MAPQ should be 0 for unmapped read

When this error occurs, add the following option:

VALIDATION_STRINGENCY=LENIENT


This is a common problem, and you'll run into it with all the Picard suite of tools.

There's a setting that goes something like VALIDATION_STRINGENCY, and if you set it to LENIENT, it will complain about those reads, but it won't stop on them.

This happens with bwa, because it concatenates reference sequence, which leads to slightly odd things happening when a read aligns over the overlap. So this might be the source of your problem. Regardless of what's causing it, you can examine the problem reads, and cut them out, or change the stringency to let them go through.
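
To "examine the problem reads" before deciding, something like the following sketch could be run over SAM text (for example the output of samtools view -h). It relies only on the standard SAM columns (FLAG is column 2, MAPQ is column 5, and FLAG bit 0x4 means unmapped); the function name is made up here.

```python
UNMAPPED = 0x4  # SAM FLAG bit: segment unmapped


def suspicious_records(sam_lines):
    """Yield (read_name, mapq) for unmapped records whose MAPQ is not 0."""
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        name, flag, mapq = fields[0], int(fields[1]), int(fields[4])
        if flag & UNMAPPED and mapq != 0:
            yield name, mapq
```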

 
Posted by 옥탑방람보
java -jar MarkDuplicates.jar INPUT=test.bam OUTPUT=test.marked.bam METRICS_FILE=test.txt TMP_DIR=. ASSUME_SORTED=true
Posted by 옥탑방람보
# Three ways to get the path of the currently running script

import inspect
print(inspect.getfile(inspect.currentframe()))

import os
print(os.path.abspath(__file__))

import sys
print(sys._getframe().f_code.co_filename)
Posted by 옥탑방람보