Data Scientist - Portfolio Optimization
Company: Formation Bio
Location: New York City
Posted on: March 28, 2026
Job Description:
About Formation Bio
Formation Bio is a tech- and AI-driven pharma
company differentiated by radically more efficient drug
development. Advancements in AI and drug discovery are creating
more candidate drugs than the industry can progress because of the
high cost and time of clinical trials. Recognizing that this
development bottleneck may ultimately limit the number of new
medicines that can reach patients, Formation Bio, founded in 2016
as TrialSpark Inc., has built technology platforms, processes, and
capabilities to accelerate all aspects of drug development and
clinical trials. Formation Bio partners on, acquires, or in-licenses
drugs from pharma companies, research organizations, and biotechs
to develop programs through clinical proof of concept and beyond,
ultimately helping to bring new medicines to patients. The company
is backed by investors across pharma and tech, including a16z,
Sequoia, Sanofi, Thrive Capital, Sam Altman, John Doerr, Spark
Capital, SV Angel Growth, and others.

You can read more at the following links: Our Vision for AI in
Pharma; Our Current Drug Portfolio; Our Technology & Platform.

At Formation Bio, our values are the driving force behind our
mission to revolutionize the pharma industry. Every team and
individual at the company shares these same values, and every team
and individual plays a key part in our mission to bring new
treatments to patients faster and more efficiently.

About the Position
As a Data Scientist on the platform
prediction team, you'll translate our probability of success
predictions into measurable portfolio-level outcomes. You'll
architect core systems — order management, execution simulation,
portfolio construction, risk monitoring, and performance
attribution — that let us rigorously evaluate signals from our
AI-driven predictions in public and private equities and our
internal portfolio. This role sits at the intersection of
quantitative finance, healthcare data, and AI-driven drug
development. If you're excited about applying portfolio
construction and risk management fundamentals to one of the most
consequential prediction problems in healthcare, this is the
role. No other company, hedge fund or pharma, has a technical data
science position translating drug development experience into
durable AI-native portfolio strategies. The skills you develop here
— portfolio construction over assets with radically asymmetric risk
profiles, clinical trial analytics, AI/ML in production, and risk
management across multi-year horizons — can directly impact the
delivery of new and effective therapeutics to patients by best
aligning impactful medicines with economic incentives.

Responsibilities
- Work with the team to implement and maintain the core portfolio
  engine: order management system, execution simulation layer,
  portfolio construction service, and performance tracking
- Design risk frameworks that quantify exposure across a portfolio
  of drug development bets with radically different risk profiles,
  timelines, and failure modes
- Run rigorous backtesting experiments with strict temporal
  constraints to evaluate Formation strategies against baseline
  approaches and measure marginal signal from new evidence sources
- Coordinate across the organization to integrate internal Formation
  data sources (clinical trial data, genomic evidence, real-world
  data) and proprietary tooling into portfolio analytics pipelines
- Work with product and engineering teams to build dashboards and
  reporting that communicate portfolio performance, risk metrics,
  and strategy comparisons to both technical and executive
  stakeholders
- Collaborate with the broader data science team to ensure
  portfolio-level evaluation feeds back into model improvement and
  evidence prioritization

About You
Required Qualifications
- MS or PhD in a quantitative field (statistics, finance, physics,
  computational science, engineering, or related)
- 1-3 years in a quantitative research, data science, or analytics
  role (finance, healthcare, academic research, or consulting all
  count; substantive internships qualify)
- Strong Python programming skills with experience in
  data-intensive workflows (pandas, numpy, scipy)
- Solid grasp of core portfolio construction and risk concepts:
  position sizing, rebalancing, Sharpe ratio, drawdown, volatility,
  benchmark comparison
- Demonstrated ability to work with messy, real-world datasets,
  including data wrangling, deduplication, and quality assessment
- Clear communicator who can present quantitative results to both
  technical peers and business stakeholders

Preferred Qualifications
- Experience with backtesting frameworks or portfolio simulation
  (vectorbt, Backtrader, or custom implementations)
- Exposure to healthcare, pharma, or biotech data (clinical trials,
  claims data, -omics, real-world evidence)
- Familiarity with alternative data in a research or investment
  context
- Experience with probability-of-success modeling, drug development
  decision analysis, or health economics
- Comfort with LLMs or AI/ML pipelines in a production or research
  setting
- Familiarity with dashboard/visualization tools (Streamlit, Plotly,
  Dash) and pipeline orchestration (Dagster, Airflow)

Domain knowledge in healthcare or finance is valued; you do not need
both.

Formation Bio is
prioritizing hiring in key hubs, primarily the New York City and
Boston metro areas. These positions will follow a hybrid work model
with 1-3 days required at the office. Please only apply if you
reside in these locations or are willing to relocate. Salary ranges
are informed by a number of factors including geographic location.
The range provided includes base salary only. In addition to base
salary, we offer equity, comprehensive benefits, generous perks,
hybrid flexibility, and more. If this range doesn't match your
expectations, please still apply because we may have something else
for you.

Compensation Range: $154,500 - $202,000

You will receive
consideration for employment without regard to race, color,
religion, gender, gender identity or expression, sexual
orientation, national origin, genetics, disability, age, or veteran
status.