Publications

Department of Medicine faculty members published more than 3,600 peer-reviewed articles in 2024.

GigaScience

Optimized distributed systems achieve significant performance improvement on sorted merging of massive VCF files.

2018

Authors: Sun X, Gao J, Jin P, Eng C, Burchard EG, Beaty TH, Ruczinski I, Mathias RA, Barnes KC, Wang F, Qin Z

Background

Sorted merging of genomic data is a common data operation necessary in many sequencing-based studies. It involves sorting and merging genomic data from different subjects by their genomic locations. In particular, merging a large number of Variant Call Format (VCF) files is frequently required in large scale whole genome sequencing or whole exome sequencing projects. Traditional single machine based methods become increasingly inefficient when processing large numbers of VCF files due to the excessive computation time and I/O bottleneck. Distributed systems and more recent cloud-based systems offer an attractive solution. However, carefully designed and optimized workflow patterns and execution plans (schemas) are required to take full advantage of the increased computing power while overcoming bottlenecks to achieve high performance.

Findings

In this study, we custom design optimized schemas for three Apache big data platforms, Hadoop (MapReduce), HBase and Spark, to perform sorted merging of a large number of VCF files. These schemas all adopt the divide-and-conquer strategy to split the merging job into sequential phases/stages consisting of subtasks which are conquered in an ordered, parallel and bottleneck-free way. In two illustrating examples, we test the performance of our schemas on merging multiple VCF files into either a single TPED or VCF file, which are benchmarked with the traditional single/parallel multiway-merge methods, message passing interface (MPI) based high performance computing (HPC) implementation and the popular VCFTools.

Conclusions

Our experiments suggest all three schemas either deliver a significant improvement in efficiency or render much better strong and weak scalabilities over traditional methods. Our findings provide generalized scalable schemas for performing sorted merging on genetics and genomics data using these Apache distributed systems.
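
For readers unfamiliar with the operation, the following is a minimal single-machine sketch of sorted merging by multiway merge, the kind of traditional baseline the abstract benchmarks against. It is not the authors' Hadoop, HBase, or Spark schema. The three-column record layout (chromosome, position, genotype), the function names, and the `./.` placeholder for a missing call are assumptions made for illustration, and real VCF files carry many more fields. Each input is assumed to be already sorted by chromosome and position under the same ordering.

```python
import heapq
from itertools import groupby


def parse_records(lines):
    """Yield (chrom, pos, genotype) from tab-separated lines, skipping '#' headers."""
    for line in lines:
        if line.startswith("#"):
            continue
        chrom, pos, genotype = line.rstrip("\n").split("\t")
        yield chrom, int(pos), genotype


def tagged(sample_index, lines):
    """Attach the sample index so merged records can be traced back to their source."""
    for chrom, pos, genotype in parse_records(lines):
        yield chrom, pos, sample_index, genotype


def sorted_merge(sample_files, missing="./."):
    """Merge per-sample sorted records into one row per genomic location.

    Hypothetical helper: `sample_files` is a list of iterables of lines,
    one iterable per sample, each sorted by (chrom, pos).
    """
    n_samples = len(sample_files)
    streams = [tagged(i, lines) for i, lines in enumerate(sample_files)]
    # heapq.merge performs a lazy k-way merge of already-sorted inputs.
    merged = heapq.merge(*streams)
    for (chrom, pos), group in groupby(merged, key=lambda r: (r[0], r[1])):
        row = [missing] * n_samples
        for _, _, sample_index, genotype in group:
            row[sample_index] = genotype
        yield chrom, pos, row


sample_a = ["#header", "1\t100\t0/1", "1\t300\t1/1"]
sample_b = ["1\t100\t0/0", "1\t200\t0/1"]
merged_rows = list(sorted_merge([sample_a, sample_b]))
```

Because the merge is lazy, memory use stays proportional to the number of input files rather than their total size. Every record still passes through one process, which is consistent with the computation-time and I/O bottleneck the abstract attributes to single-machine methods.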

Annals of emergency medicine

Liberal Versus Restrictive Intravenous Fluid Therapy for Early Septic Shock: Rationale for a Randomized Trial.

2018

Authors: Self WH, Semler MW, Bellomo R, Brown SM, deBoisblanc BP, Exline MC, Ginde AA, Grissom CK, Janz DR, Jones AE, Liu KD, Macdonald SPJ, Miller CD, Park PK, Reineck LA, Rice TW, Steingrub JS, Talmor D, Yealy DM, Douglas IS, Shapiro NI, CLOVERS Protocol Committee and NHLBI Prevention and Early Treatment of Acute Lung Injury (PETAL) Net

In Silico Pharmacoepidemiologic Evaluation of Drug-Induced Cardiovascular Complications Using Combined Classifiers.

2018

Authors: Cai C, Fang J, Guo P, Wang Q, Hong H, Moslehi J, Cheng F

Volume 99, Issue 1 | The American journal of tropical medicine and hygiene

Staphylococcus aureus Bacteremia Incidence and Methicillin Resistance in Rural Thailand, 2006-2014.

2018

Authors: Jaganath D, Jorakate P, Makprasert S, Sangwichian O, Akarachotpong T, Thamthitiwat S, Khemla S, DeFries T, Baggett HC, Whistler T, Gregory CJ, Rhodes J

Staphylococcus aureus is a common cause of bloodstream infection, and methicillin-resistant S. aureus (MRSA) is a growing threat worldwide. We evaluated the incidence rate of S. aureus bacteremia (SAB) and MRSA from population-based surveillance in all hospitals from two Thai provinces. Infections were classified as community-onset (CO) when blood cultures were obtained ≤ 2 days after hospital admission and as hospital-onset (HO) thereafter. The incidence rate of HO-SAB could only be calculated for 2009-2014 when hospitalization denominator data were available. Among 147,524 blood cultures, 919 SAB cases were identified. Community-onset S. aureus bacteremia incidence rate doubled from 4.4 (95% confidence interval [CI]: 3.3-5.8) in 2006 to 9.3 per 100,000 persons per year (95% CI: 7.6-11.2) in 2014. The highest CO-SAB incidence rate was among adults aged 50 years and older. Children less than 5 years old had the next highest incidence rate, with most cases occurring among neonates. During 2009-2014, there were 89 HO-SAB cases at a rate of 0.13 per 1,000 hospitalizations per year (95% CI: 0.10-0.16). Overall, MRSA prevalence among SAB cases was 10% (90/911) and constituted 7% (55/736) of CO-SAB and 20% (22/111) of HO-SAB without a clear temporal trend in incidence rate. In conclusion, CO-SAB incidence rate has increased, whereas MRSA incidence rate remained stable. The increasing CO-SAB incidence rate, especially the burden on older adults and neonates, underscores the importance of strong SAB surveillance to identify and respond to changes in bacteremia trends and antimicrobial resistance.

Nature communications

Genome-wide and high-density CRISPR-Cas9 screens identify point mutations in PARP1 causing PARP inhibitor resistance.

2018

Authors: Pettitt SJ, Krastev DB, Brandsma I, Dréan A, Song F, Aleksandrov R, Harrell MI, Menon M, Brough R, Campbell J, Frankum J, Ranes M, Pemberton HN, Rafiq R, Fenwick K, Swain A, Guettler S, Lee JM, Swisher EM, Stoynov S, Yusa K, Ashworth A, Lord CJ

Nephrology, dialysis, transplantation : official publication of the European Dialysis and Transplant Association - European Renal Association

Chronic kidney disease and peripheral nerve function in the Health, Aging and Body Composition Study.

2018

Authors: Moorthi RN, Doshi S, Fried LF, Moe SM, Sarnak MJ, Satterfield S, Schwartz AV, Shlipak M, Lange-Maia BS, Harris TB, Newman AB, Strotmeyer ES

PloS one

Human Herpes Virus 8 in HIV-1 infected individuals receiving cancer chemotherapy and stem cell transplantation.

2018

Authors: Hogan LE, Hanhauser E, Hobbs KS, Palmer CD, Robles Y, Jost S, LaCasce AS, Abramson J, Hamdan A, Marty FM, Kuritzkes DR, Henrich TJ

Journal of interprofessional care

Interprofessional population health advocacy: Developing and implementing a panel management curriculum in five Veterans Administration primary care practices.

2018

Authors: Dulay M, Bowen JL, Weppner WG, Eastburn A, Poppe AP, Spanos P, Wojtaszek D, Printz D, Kaminetzky CP

Volume 179, Issue 1 | The British journal of dermatology

KIR3DL2 expression in patients with adult T-cell lymphoma/leukaemia.

2018

Authors: Hurabielle C, Leboeuf C, Ram-Wolff C, Meignin V, Rivet J, Vignon-Pennamen MD, Bonnafous C, Sicard H, Fite C, Raffoux E, Arnulf B, Oksenhendler E, Sicre de Fontbrune F, Peffault de Latour R, Socié G, Bouaziz JD, Lebbé C, Bensussan A, Janin A, Bagot M, Battistella M

Journal of general internal medicine

Palatal Mucormycosis.

2018

Authors: Brondfield S, Kaplan L, Dhaliwal G