|
7th Iranian Conference on Bioinformatics
|
|
|
Persian title |
|
|
Persian abstract |
|
|
Persian keywords |
|
|
English title |
Evaluation of quality-related parameters in raw NGS data and implementing tools to obtain them |
|
English abstract |
Because Next Generation Sequencing (NGS) is less accurate than Sanger sequencing, the data are likely to be misinterpreted if no primary quality control is performed. Quality control (QC) of raw data is considered an important initial step for overcoming instrumental artifacts. However, the main problem is that there is neither a specific guideline nor a set of gold-standard parameters. This study aims to re-introduce QC-related parameters and suggests a combination of existing tools for efficient quality checking. The suggested pre-processing parameters to focus on, namely Quality Score, Read Complexity, and Duplicate Reads, were extracted for the data. The first parameter investigated, the quality score, is a measure of the uncertainty of a base call and depended on instrumental variables. Because the length and arrangement of bases vary from read to read, it is necessary to inspect base composition visually before further decisions such as adapter trimming or defining a quality score cutoff. Another important parameter was read complexity, which could cause mistaken alignment. The third parameter investigated was duplicate reads. Removing duplicate reads, which are believed to result from experimental errors, may cause loss of unique biological information. In addition, the efficiency of the sequencer, bases of high quality, primer/adapter contamination, and N base count were helpful for decision making. The tools used to obtain these factors and implement an effective pipeline were a suggested combination of PPR Plot program [1], FaQCs [2], AfterQC [3], and NGS QC Toolkit [4]. QC tools generate a large amount of information that can help in deciding on the properties of the secondary step of NGS analysis. Using our implemented combination of the aforementioned tools, data-specific features such as Quality Score, Read Complexity, and Duplicate Reads could be quantified to simplify quality control for an expert. |
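The parameters named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline and does not reproduce the interface of PPR Plot, FaQCs, AfterQC, or NGS QC Toolkit. All function names are hypothetical, the Phred+33 quality encoding is assumed, Shannon entropy of base composition is used only as a stand-in for read complexity (the abstract does not define the measure), the entropy cutoff of 1.0 bit is an arbitrary illustrative value, and duplicates are counted by exact sequence match.

```python
import math
from collections import Counter


def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 is an assumption)."""
    return [ord(ch) - offset for ch in quality_line]


def error_probability(q):
    """Standard Phred relation: base-call error probability P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def sequence_entropy(seq):
    """Shannon entropy (bits) of base composition; a stand-in for read complexity."""
    if not seq:
        return 0.0
    total = len(seq)
    return -sum((n / total) * math.log2(n / total) for n in Counter(seq).values())


def mean(values):
    """Arithmetic mean that returns 0.0 for an empty input."""
    return sum(values) / len(values) if values else 0.0


def qc_summary(records, entropy_cutoff=1.0):
    """Summarise (sequence, quality_line) pairs; entropy_cutoff is hypothetical."""
    records = list(records)
    per_read_quality = [mean(phred_scores(qual)) for _, qual in records]
    seq_counts = Counter(seq for seq, _ in records)
    # Every copy beyond the first occurrence of a sequence counts as a duplicate.
    duplicates = sum(count - 1 for count in seq_counts.values())
    return {
        "reads": len(records),
        "mean_read_quality": mean(per_read_quality),
        "n_bases": sum(seq.upper().count("N") for seq, _ in records),
        "low_complexity_reads": sum(
            1 for seq, _ in records if sequence_entropy(seq) < entropy_cutoff
        ),
        "duplicate_fraction": duplicates / len(records) if records else 0.0,
    }


if __name__ == "__main__":
    # Toy reads for illustration only; not data from the study.
    toy_reads = [("ACGT", "IIII"), ("ACGT", "IIII"), ("AAAA", "!!!!"), ("ANNT", "5555")]
    print(qc_summary(toy_reads))
```

A summary like this only quantifies the features. Whether to trim, filter, or keep duplicates remains a judgment for the analyst, as the abstract notes.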
|
English keywords |
NGS, Quality Control, DNA-Seq, Pre-processing |
|
Authors |
H. Mohammadi - Isfahan University of Medical Sciences, Isfahan
M. Sehhati - Isfahan University of Medical Sciences, Isfahan
A. Vaez - Isfahan University of Medical Sciences, Isfahan
|
|
Web address |
http://www.icb7.ir |
Article code (DOI) |
|
Language of published article |
en |
Topics of published article |
|
Type of published article |
|
|
|
|