The Weber Lab in the Center for Biomedical Informatics (CBMI) at Harvard Medical School is seeking a Postdoctoral Research Associate to help develop probabilistic algorithms and software for biomedical "Big Data". Despite all the recent attention to Big Data in both the scientific literature and the popular press, it has yet to have the same impact on medicine as it has had in other areas, such as finance, retail business, political campaigns, and national security. We believe the key to clinical applications of Big Data is the ability to create a holistic view of a patient’s health history by linking disparate data sources, such as electronic health records, administrative claims, genomic data, and data outside the healthcare system, including social media and information about the environment. Because of legal and ethical concerns about sharing health data, and because there is no universal patient identifier, our approach is to use probabilistic algorithms that can protect patient privacy and scale efficiently while searching for information about patients across datasets with billions of records.
The Postdoctoral Research Associate will be responsible for developing computer software to link patient records probabilistically across large biomedical datasets. Although traditional linkage variables such as names or zip codes will be available in some datasets, the software must also handle data that contain only genomic sequences, encounter dates, billing codes, or other de-identified information. Scalability, encryption, compression, performance, and the accuracy of the probabilistic algorithms are all essential to the software. The Postdoctoral Research Associate will need to evaluate existing programs in this domain to determine whether they can be used for our application. The patient linkage tool is one part of an open source distributed query tool being built by a team of scientists and software developers. As a member of a research lab, the Postdoctoral Research Associate will gain experience writing scientific papers and presenting at scientific seminars and conferences.
The Weber Lab is funded by NIH and NSF grants to develop algorithms and open source software for analyzing biomedical "Big Data". We created a social networking website for scientists called Profiles RNS (http://profiles.catalyst.harvard.edu) and contributed to a program for querying clinical data about patients called i2b2 (http://www.i2b2.org). Both of these systems are used in dozens of institutions worldwide.
In addition to this project on patient linkage, there is a separate project in the Weber Lab to model the scientific workforce in order to understand how new collaborations form and how this influences both the effectiveness of teams and the career trajectories of individual scientists. That project, which is also seeking a Postdoctoral Research Associate, involves probabilistic linkage of large datasets related to the scholarly activities of researchers (e.g., 50 million publications, 4 million patents, 2 million grants). Thus, there will be synergies between the two projects.
Candidates must have a PhD or other advanced degree in computer science, biomedical informatics, computational biology, artificial intelligence, biostatistics, or a related field. A strong background in collaborative software development, and in particular in probabilistic algorithms and/or statistical modeling, is required. Experience with large relational databases and/or other Big Data tools such as Hadoop is also important. Software components in the distributed query tool, including the patient linkage module, will be hosted in AWS and communicate using RESTful APIs.
Candidates should be highly motivated, creative, and interested in learning new skills. They must enjoy solving complex and challenging problems and being part of a multidisciplinary research team. Excellent written and verbal communication skills are essential.
Experience with version control software and project management tools like JIRA is desirable. We primarily use Microsoft SQL Server and write software in C# for .NET; however, this is not a requirement, and other development groups that are part of this project use different programming languages, including Java, Python, Perl, and R.
The position is available immediately and can be renewed annually.
How to apply:
Email applications, including a curriculum vitae, a summary statement of personal objectives and research interests, PDFs of your two best papers, and the names and email addresses of three references, to: Griffin M Weber, MD, PhD, firstname.lastname@example.org
Harvard Medical School is an Equal Opportunity/Affirmative Action Employer. Women and minorities are especially encouraged to apply.